Poor performance with VRT datasets
For some reason, VRTs are much slower in OTB than in GDAL. To measure this, I concatenated the same 10980x10980 file 100 times into a single output, then did the equivalent with a single VRT as the input:
gdal_merge.py tiff -> tiff
18.41user 69.03system 2:32.27elapsed 57%CPU (0avgtext+0avgdata 4650728maxresident)k
gdal_translate vrt/ComplexSource -> tiff
120.56user 11.24system 2:13.08elapsed 99%CPU (0avgtext+0avgdata 3342704maxresident)k
gdal_translate vrt/SimpleSource -> tiff (!)
15.50user 9.70system 0:25.28elapsed 99%CPU (0avgtext+0avgdata 3342984maxresident)k
otbConcatenateImages tiff -> tiff
224.53user 16.34system 1:39.64elapsed 241%CPU (0avgtext+0avgdata 3966920maxresident)k
otbConcatenateImages vrt/ComplexSource -> tiff
643.76user 325.83system 17:02.93elapsed 94%CPU (0avgtext+0avgdata 2179636maxresident)k
otbConcatenateImages vrt/SimpleSource -> tiff
807.17user 224.81system 13:10.33elapsed 130%CPU (0avgtext+0avgdata 1449820maxresident)k
otbConcatenateImages vrt/SimpleSource -> tiff, streaming:type=stripped
208.19user 27.56system 1:39.00elapsed 238%CPU (0avgtext+0avgdata 612944maxresident)k
plain gdal tiff -> tiff
35.26user 27.61system 0:30.50elapsed 206%CPU (0avgtext+0avgdata 3386168maxresident)k
plain gdal vrt/ComplexSource -> tiff
326.27user 20.42system 0:40.57elapsed 854%CPU (0avgtext+0avgdata 3415156maxresident)k
plain gdal vrt/SimpleSource -> tiff
35.78user 22.72system 0:30.32elapsed 192%CPU (0avgtext+0avgdata 3400296maxresident)k
The last implementation is a very simple one, intended to work as a baseline. It reads input blocks (regions, not TIFF blocks) in parallel, writes an output block, then repeats.
The numbers above should all be warm-cache numbers. The inputs and output are striped with pixel interleaving. Band interleaving is even faster, but I only have numbers for the last implementation.
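For reference, the baseline loop described above can be sketched roughly as follows. This is not the actual benchmark code: `concatenate`, `strips`, the `readers` callables and the `write` callable are hypothetical names made up for illustration, and the real I/O (GDAL reads and writes) is left out.

```python
from concurrent.futures import ThreadPoolExecutor


def strips(width, height, rows):
    """Yield (xoff, yoff, xsize, ysize) full-width regions covering the image."""
    for yoff in range(0, height, rows):
        yield (0, yoff, width, min(rows, height - yoff))


def concatenate(readers, write, regions, max_workers=4):
    """For each region: read it from every input in parallel, then write once.

    readers: hypothetical callables, region -> list of bands for that input
    write:   hypothetical callable, (region, bands) -> None
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for region in regions:
            # pool.map preserves input order, so bands stay in input order.
            parts = list(pool.map(lambda read: read(region), readers))
            bands = [band for part in parts for band in part]
            # The write happens only after all reads for this region finish.
            write(region, bands)
```

Reads for one region overlap with each other, but reading and writing do not overlap, which is what makes this a simple baseline rather than a tuned pipeline.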
The slowdown is caused by OTB picking a poor tile size when the input is a VRT:
tiff:
(INFO): File merged.tif will be written in 999 blocks of 10980x11 pixels
vrt:
(INFO): File merged.tif will be written in 841 blocks of 384x384 pixels
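The two block counts in these log lines follow directly from the 10980x10980 image size. A quick check, assuming partial blocks at the right and bottom edges are counted as whole blocks (`block_count` is just a helper name for this example):

```python
def block_count(width, height, block_w, block_h):
    """Number of blocks needed to cover the image, rounding up at the edges."""
    across = -(-width // block_w)   # integer ceiling division
    down = -(-height // block_h)
    return across * down


size = 10980
print(block_count(size, size, 10980, 11))  # 1 x 999 strips = 999
print(block_count(size, size, 384, 384))   # 29 x 29 tiles = 841
```

Note that 384 is 3 x 128, so each tile lines up with a 3x3 group of the 128x128 blocks mentioned below.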
Actually, the underlying problem is that GDAL reports a wrong block size (128x128) for the VRT. GDAL 3.3 supports setting the block size on the VRT bands, which can be another workaround (besides forcing streaming:type=stripped).
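A minimal sketch of that workaround, assuming the block size is set through the `blockXSize` and `blockYSize` attributes on each `VRTRasterBand` element (my reading of the GDAL 3.3+ VRT documentation; please double-check against your GDAL version). The helper name, the example VRT and the chosen block size are for illustration only.

```python
import xml.etree.ElementTree as ET


def set_vrt_block_size(vrt_xml, block_x, block_y):
    """Return the VRT XML with a block size set on every VRTRasterBand."""
    root = ET.fromstring(vrt_xml)
    for band in root.iter("VRTRasterBand"):
        # Attribute names assumed from the GDAL >= 3.3 VRT format docs.
        band.set("blockXSize", str(block_x))
        band.set("blockYSize", str(block_y))
    return ET.tostring(root, encoding="unicode")


# Hypothetical single-band VRT; a real one would list all 100 inputs.
example = """<VRTDataset rasterXSize="10980" rasterYSize="10980">
  <VRTRasterBand dataType="UInt16" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="1">input.tif</SourceFilename>
      <SourceBand>1</SourceBand>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>"""

# Full-width strips, picked arbitrarily to mirror the striped layout.
patched = set_vrt_block_size(example, 10980, 11)
print(patched)
```

Running `gdalinfo` on the patched VRT should then show the new block size for each band, which is an easy way to confirm the attributes were picked up.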
Edited by Laurențiu Nicola