GDAL style/coding guide - int foo[123] -> vector<int>(246, 0)
Use vector<T>(length, initial_value) for local blocks of storage.

Project: GDAL

Author: Kurt Schwehr schwehr@google.com / schwehr@gmail.com (goatbar)

Started: 2016-May-04

Status: OBSOLETE, but still a usable concept. (Previously: Discussion underway.)

Short link: goo.gl/vuA3D6

Document license: Apache 2.0

Copyright 2016 Google Inc. All Rights Reserved.

Note 2021-May-11:

RFC68 made C++11 the minimum supported C++ language version. While the vector solution proposed here is still viable, that change makes it less important. std::unique_ptr and other C++11 language features are now often the better choice. I changed text that is no longer valid to strikethrough.

Note: This document does not propose to use any new features of C++ and does not propose any build changes to GDAL.

Consider:

   int anVals[256];

   memset(anVals, 0, 256*sizeof(int));

This is less than optimal because:

In GDAL, there are >700 stack allocations of more than 100 items and >150 of 1000 or more items. There are 877 calls to memset in the C++ code.

Proposed solution:

https://trac.osgeo.org/gdal/changeset/34177 

  std::vector<int> oVals(256, 0);
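As a minimal sketch of the replacement in context, the snippet below uses the vector form inside a small function. CountByteValues is a hypothetical helper invented for illustration; it is not taken from GDAL or from the changeset above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from GDAL: builds a histogram of byte values.
std::vector<int> CountByteValues(const unsigned char *pabyData,
                                 std::size_t nLen)
{
    assert(pabyData != nullptr || nLen == 0);

    // 256 counters stored on the heap, all set to 0 by the constructor.
    // This stands in for "int anVals[256];" followed by a memset call.
    std::vector<int> oVals(256, 0);

    for (std::size_t i = 0; i < nLen; ++i)
        ++oVals[pabyData[i]];

    return oVals;
}
```

The length and the initial value are stated once, in the constructor, so there is no separate size expression that has to be kept in sync with the declaration.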

Benefits:

Drawbacks:

Alternatives / improvements

Might be better or worse in different ways (regardless of whether we have C++11/14 support):

Key GDAL facts to keep in mind for this proposal:

What if ...:

Note: C++11/14 is not really a big part of this particular proposal.  Just trying to be complete.

GDAL goes with C++11 or 14?  It would probably be best to stick with the vector solution: it is simpler than unique_ptr, and std::array still puts a large amount of data on the stack.  vector is still the simplest option that also handles initialization.
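To illustrate where the storage lives in each case, here is a small sketch that compares the size of the local object itself. The function names are hypothetical. The exact size of std::vector is implementation-defined; the sketch only assumes it is a handful of pointers, which holds on the common standard libraries.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// std::array keeps its elements inside the object itself, so a local
// std::array<int, 256> takes at least 256 * sizeof(int) bytes of stack.
static_assert(sizeof(std::array<int, 256>) >= 256 * sizeof(int),
              "std::array stores its elements inline");

// Bytes occupied by the local object itself in each case.
std::size_t ArrayObjectBytes() { return sizeof(std::array<int, 256>); }
std::size_t VectorObjectBytes() { return sizeof(std::vector<int>); }

// Both forms yield 256 zero-initialized ints.
int SumOfFreshBuffers()
{
    std::array<int, 256> anVals{};   // elements live on the stack
    std::vector<int> oVals(256, 0);  // elements live on the heap
    assert(oVals.size() == anVals.size());

    int nSum = 0;
    for (std::size_t i = 0; i < anVals.size(); ++i)
        nSum += anVals[i] + oVals[i];
    return nSum;
}
```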

GDAL does not care about stack size and C++14?  You could use a unique_ptr with CPLCalloc and a deleter of CPLFree.
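A rough sketch of the unique_ptr variant follows. To keep it self-contained, std::calloc and std::free stand in for CPLCalloc and CPLFree; FreeDeleter, IntBuffer and MakeZeroedBuffer are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Stand-in deleter: inside GDAL this would call CPLFree instead of std::free.
struct FreeDeleter
{
    void operator()(void *p) const { std::free(p); }
};

typedef std::unique_ptr<int[], FreeDeleter> IntBuffer;

// Stand-in for CPLCalloc(nCount, sizeof(int)): returns zeroed storage that
// is released automatically when the IntBuffer goes out of scope.
// The result is null if the allocation fails.
IntBuffer MakeZeroedBuffer(std::size_t nCount)
{
    assert(nCount > 0);
    return IntBuffer(static_cast<int *>(std::calloc(nCount, sizeof(int))));
}
```

Compared with the vector form, this needs a deleter type and a cast, and the buffer does not carry its own length.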

Open questions:

See also:

https://gdal.org/development/rfc/rfc68_cplusplus11.html RFC 68: C++11 Compilation Mode

https://trac.osgeo.org/gdal/ticket/5748 Reduce GDAL stack usage

https://google.github.io/styleguide/cppguide.html#Local_Variables 

http://stackoverflow.com/questions/22571052/replace-fixed-size-arrays-with-stdarray 

http://stackoverflow.com/questions/15294129/overhead-to-using-stdvector 

http://www.cplusplus.com/reference/memory/unique_ptr/operator[]/

http://stackoverflow.com/questions/16711697/is-there-any-use-for-unique-ptr-with-array

https://isocpp.org/files/papers/n4028.pdf Defining a Portable C++ ABI, Herb Sutter 

http://eli.thegreenplace.net/2012/06/20/c11-using-unique_ptr-with-standard-library-containers 

http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h for vector on the stack

http://stackoverflow.com/questions/10057443/explain-the-concept-of-a-stack-frame-in-a-nutshell

https://en.wikipedia.org/wiki/Call_stack 

http://www.cs.uwm.edu/classes/cs315/Bacon/Lecture/HTML/ch10s07.html 

Acknowledgements:

Several Google engineers commented on an initial draft of this doc.

Many people from the GDAL community have commented on this via the gdal-dev mailing list.

Impact of the proposed change:

Scale of change:

Small.  The impacts are localized to each scope where this change is made.  The replacement vectors behave pretty much the same in these use cases.

Performance:

Impact the C API/ABI:

None & None.  This change is internal to the C++ code.  Any data that is exposed to external callers will remain as C style arrays.
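As a sketch of why the C side is unaffected: a vector's storage is contiguous, so it can still be handed to a function that expects a plain pointer and a count. SumInts and SumOneToN below are hypothetical stand-ins, not GDAL functions. Code that must build without C++11 could pass &oVals[0] on a non-empty vector instead of calling data().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style function standing in for any API that takes a raw
// pointer and a count. Not a real GDAL function.
extern "C" int SumInts(const int *panVals, int nCount)
{
    int nSum = 0;
    for (int i = 0; i < nCount; ++i)
        nSum += panVals[i];
    return nSum;
}

// Fills a local vector with 1..nCount and passes it to the C-style function.
int SumOneToN(int nCount)
{
    assert(nCount >= 0);
    std::vector<int> oVals(static_cast<std::size_t>(nCount), 0);
    for (int i = 0; i < nCount; ++i)
        oVals[static_cast<std::size_t>(i)] = i + 1;

    // data() exposes the contiguous storage as a plain int pointer.
    return SumInts(oVals.data(), nCount);
}
```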

Impact on C++ API/ABI:

None & none.  These are internal local variables.

Impact on SWIG binding (Python/Perl/Ruby/Java/PHP):

None.

Which versions/branches of GDAL is the change required for:

GDAL >= 2.2.

Which versions/branches of GDAL may not use this change:

This change is safe for any branch or version of GDAL.  There is no real driving force to backport this change, but it is fine to do so, especially if it helps with backporting other fixes.

Use cases:

Some users of GDAL have large jobs with thousands of cores and >20 threads per process that run for a long time (e.g. days).  Having to raise the stack size adds substantial overhead to all those jobs.  GDAL is sometimes the only thing driving the requirement for the larger stack.  For people using GDAL on a desktop machine, the larger stack is unlikely to be a real issue.  For anyone running large multithreaded jobs in the cloud, the large stack adds real cost.  In all the cases I've looked at in GDAL so far, the overhead of std::vector and heap allocation is far smaller than the cost of the larger stack.