Draft: Reduce memory footprint when ImageFileWriter can't stream its input pipeline to the output file

Closes #2310

This MR reduces the memory footprint when writing output images to file formats that do not support streaming writes.

Benchmarks

All benchmarks were performed on a laptop with 32 GB of RAM, an SSD, and Ubuntu 20.04, with many web browser tabs open (leaving roughly 20 GB of RAM available). We left OTB with the default OTB_MAX_RAM_HINT, which I believe is 256 MB. Our goal is to pansharpen a Spot-7 (or Pléiades, or PNeo) image into an output image format whose GDAL driver only supports GDALDriver::CreateCopy(), which is intended to create a copy of an existing dataset (hence the whole dataset must already exist, either in memory or in another raster file). In the following, we use the GDAL Cloud Optimized GeoTIFF (COG) driver to write the output image.

We use the following command to perform the processing:

otbcli_BundleToPerfectSensor -inp $dim_pan -inxs $dim_xs -out "/data/pxs.tif" int16

We used a Spot-7 image, but we expect the same kind of behavior for Pléiades and PNeo.

Measurements

We measured the processing time of BundleToPerfectSensor.

10k x 10k subset:

When everything is fine and the memory budget is sufficient.

  • BundleToPerfectSensor (cog): 52s
  • BundleToPerfectSensor (cog, OTB_FORCE_STREAMING=1): 53s

20k x 20k subset:

When the image is large and processing without streaming requires extra memory.

Original approach (trigger the entire pipeline, everything is in-memory)

[Image: Fail]

Proposed approach (when OTB_FORCE_STREAMING is set to 1)

[Image: Force_streaming]
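For reference, a minimal sketch of how the proposed approach is invoked, assuming a build of OTB that includes this MR (OTB_FORCE_STREAMING does not exist otherwise). The function name run_forced_streaming is only illustrative; the command line is the one used above.

```shell
# Sketch only: run_forced_streaming is a hypothetical helper name.
# OTB_FORCE_STREAMING is the variable introduced by this MR.
run_forced_streaming() {
  # Set the variable for this single invocation only,
  # then run the same command as in the benchmarks.
  OTB_FORCE_STREAMING=1 otbcli_BundleToPerfectSensor \
    -inp "$dim_pan" -inxs "$dim_xs" -out "/data/pxs.tif" int16
}
```

With $dim_pan and $dim_xs pointing to the panchromatic and multispectral products, calling run_forced_streaming runs the benchmark command with forced streaming enabled.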

Comparison with original approach + gdal_translate

  • BundleToPerfectSensor (cog, OTB_FORCE_STREAMING=1): 3m37s
  • BundleToPerfectSensor (gtiff) + gdal_translate (cog): 1m24s + 2m09s = 3m33s
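The two totals above can be checked with a few lines of shell arithmetic. This is just a sanity check on the reported timings; to_seconds is a helper defined here for illustration.

```shell
# Convert "minutes seconds" to seconds (illustrative helper).
to_seconds() { echo $(( $1 * 60 + $2 )); }

# Timings reported above.
forced=$(to_seconds 3 37)                               # cog, OTB_FORCE_STREAMING=1
two_step=$(( $(to_seconds 1 24) + $(to_seconds 2 9) ))  # gtiff + gdal_translate

echo "forced streaming: ${forced}s, two-step: ${two_step}s, difference: $(( forced - two_step ))s"
# prints: forced streaming: 217s, two-step: 213s, difference: 4s
```

The two approaches are within 4 seconds of each other on this subset.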

Conclusion

PROS

  • The MR saves memory when writing to an output image format whose GDAL driver only supports GDALDriver::CreateCopy().

CONS

  • Saving the output image in a streaming-capable raster format, then using gdal_translate for the final conversion, is as fast and uses a far smaller memory footprint (because the whole output image is never stored in memory).
  • Ultimately, there will always be an image size for which the memory budget won't be enough, even with the OTB_FORCE_STREAMING approach. In this case, the user will fall back to the otb + gdal_translate approach.
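The otb + gdal_translate fallback mentioned above could be sketched as follows. The function name and the intermediate file argument are hypothetical; `gdal_translate -of COG` relies on the COG driver being available in the installed GDAL.

```shell
# Sketch only: pansharpen_then_cog is a hypothetical helper.
#   $1: final COG path
#   $2: intermediate GeoTIFF path (any scratch location with enough disk space)
pansharpen_then_cog() {
  final_cog="$1"
  tmp_tif="$2"
  # Step 1: OTB streams the pansharpened image to a plain GeoTIFF.
  otbcli_BundleToPerfectSensor -inp "$dim_pan" -inxs "$dim_xs" -out "$tmp_tif" int16 &&
  # Step 2: GDAL converts the GeoTIFF to COG from disk, not from memory.
  gdal_translate -of COG "$tmp_tif" "$final_cog" &&
  # Step 3: remove the intermediate file once the conversion succeeded.
  rm -f "$tmp_tif"
}
```

For example, `pansharpen_then_cog /data/pxs.tif /data/pxs_tmp.tif` (the second path is an assumption) would produce the same output as the benchmark command, at the cost of temporary disk space rather than RAM.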

Discussion

Maybe this MR should be generalized to the Python API, in order to get the output as a NumPy array with OTB_FORCE_STREAMING (for now, the entire pipeline is triggered over the largest possible image region).

Edited by Rémi Cresson