OTB build performance story
I have been looking at various refactors to OTB with the goal of improving build time. I am currently working on a branch for this, but I am reporting my work in progress here because it might be useful to get some community feedback.
include-what-you-use. I investigated using include-what-you-use to remove unnecessary includes. I documented this in issue #1635 (closed). I do not plan on working on this further (see the issue for details).
Code in headers. I think this significantly increases build times; for example, there is a lot of code in otbWrapperApplication.h which should go in a cxx file. This file is included 99 times in OTB. This is also related to the heavy usage of ITK macros like itkSetStringMacro, which each add ~15 lines of code in headers. Used 20 times per class, this can lead to 99 * 15 * 20 = 29700 useless lines of code that the compiler has to process during a full OTB build. It might not seem like much, but this is only for one file. I'm not sure of the exact impact, but I think it's worth trying to remove the biggest offenders and seeing what happens.
The following command helps to know where to start looking. These are the most included headers:
    $ ag --nofilename "#include" | sort | uniq -c | sort -n
    [...]
        191 #include "itkUnaryFunctorImageFilter.h"
        230 #include <fstream>
        264 #include "otbMacro.h"
        323 /*#include "f2c.h"*/
        345 #include "otbVectorImage.h"
        440 #include <iostream>
        626 #include "itkMacro.h"
        626 #include "otbImageFileWriter.h"
        735 #include "otbImageFileReader.h"
        770 #include "otbImage.h"
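To illustrate the "code in headers" point, here is a minimal sketch of moving a setter body out of a header. The class ExampleApplication and its members are hypothetical and are not taken from OTB or ITK; the setter only loosely imitates the kind of code a set-string macro tends to generate (update the value, record a modification). Both parts are shown in one listing for brevity; in practice the second part would live in a cxx file.

```cpp
#include <cassert>
#include <string>

// --- Part that would stay in the header (hypothetical ExampleApplication.h) ---
class ExampleApplication
{
public:
  // Declarations only: each file that includes the header parses these
  // three lines instead of the full function bodies.
  void               SetName(const std::string& name);
  const std::string& GetName() const;
  unsigned long      GetModifiedCount() const;

private:
  std::string   m_Name;
  unsigned long m_ModifiedCount = 0;
};

// --- Part that would move to the source file (hypothetical ExampleApplication.cxx) ---
void ExampleApplication::SetName(const std::string& name)
{
  // Only record a modification when the value actually changes.
  if (m_Name != name)
  {
    m_Name = name;
    ++m_ModifiedCount;
  }
}

const std::string& ExampleApplication::GetName() const
{
  return m_Name;
}

unsigned long ExampleApplication::GetModifiedCount() const
{
  return m_ModifiedCount;
}
```

With this layout the bodies are compiled once, in the source file, rather than once per including translation unit. The trade-off is that the compiler can no longer inline these calls without link-time optimization, which probably does not matter for setters of this kind.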
extern templates. Extern templates are a C++11 feature which prevents implicit template instantiation. If we provide explicit instantiations in cxx files (that client code then needs to link to), we can reduce the work the compiler needs to do by a huge amount. For example, on my develop build, otb::Image<double, 2u>::GetUpperRightCorner() is instantiated by the compiler 319 times. Once would be enough, but each translation unit includes the txx code independently. I did a quick prototype for only otbImage, readers and writers, and the build time is already significantly reduced (about 15% less in my tests).
A useful command to diagnose this is:
    $ nm -g --demangle $(find . -name "*.o") | sort | uniq -c | sort -n | grep " W otb::"
    [...]
        957 0000000000000000 W otb::Image<double, 2u>::~Image()
       1380 0000000000000000 W otb::ImageFileReaderException::ImageFileReaderException(char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
       1923 0000000000000000 W otb::ImageRegionAdaptativeSplitter<2u>::~ImageRegionAdaptativeSplitter()
       1926 0000000000000000 W otb::ImageRegionSquareTileSplitter<2u>::~ImageRegionSquareTileSplitter()
       2070 0000000000000000 W otb::ImageFileReaderException::~ImageFileReaderException()
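For readers unfamiliar with the extern template mechanism, here is a minimal sketch of the pattern. ExampleImage is a hypothetical stand-in, not the real otb::Image, and the split into header and source parts is only indicated by comments; this is not the code from my branch. Member functions are defined outside the class body, as they would be in a txx file, because an explicit instantiation declaration does not stop the compiler from instantiating inline functions.

```cpp
#include <cassert>

// --- Header part (hypothetical exampleImage.h plus its txx) ---
template <typename TPixel, unsigned int VDimension>
class ExampleImage
{
public:
  unsigned int GetDimension() const;
  TPixel       GetDefaultPixel() const;
};

template <typename TPixel, unsigned int VDimension>
unsigned int ExampleImage<TPixel, VDimension>::GetDimension() const
{
  return VDimension;
}

template <typename TPixel, unsigned int VDimension>
TPixel ExampleImage<TPixel, VDimension>::GetDefaultPixel() const
{
  return TPixel(); // value-initialized pixel, e.g. 0.0 for double
}

// Explicit instantiation declaration (C++11): translation units that see
// this line do not implicitly instantiate the members of this specialization.
extern template class ExampleImage<double, 2u>;

// --- Source part (hypothetical exampleImage.cxx) ---
// Explicit instantiation definition: the one place where the members of
// ExampleImage<double, 2u> are actually compiled. Client code must link
// against the object file or library that contains it.
template class ExampleImage<double, 2u>;
```

Other specializations, for example ExampleImage<float, 3u>, are still instantiated implicitly wherever they are used, so only the most common combinations need to be listed explicitly.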
I am working on fixing points 2 and 3 (code in headers and extern templates) in a branch. If you would like to help, let me know.
Any other ideas to improve build time?
(related to !143 (closed))