OTB must be able to write Cloud Optimized GeoTIFF (COG)
Cloud Optimized GeoTIFF (COG) has become increasingly popular in recent years because clients can fetch just the parts of a file they need through HTTP range requests, which greatly improves raster interoperability. OTB is able to read COG. However, it is not able to write COG with overviews, a crucial feature. IMO implementing COG writing in OTB would be huge.
GDAL does everything
A lot of work has gone into recent GDAL versions to make COG easier to read and write. For instance, since GDAL 3.1 we can simply convert any raster into a COG using [1]:
gdal_translate input.tif output_cog.tif -of COG -co COMPRESS=LZW
A lot of work has also been done in GDAL to read COG files from most kinds of online providers through the /vsicurl provider.
Thanks to that, OTB is able to read most online COGs smoothly (there are still some small problems when the input URI contains ? characters and OTB believes it's an extended filename. However, that's not the topic of this issue, and I'll submit another one soon, or propose an MR for it since it's not a big issue).
OTB is not able to write COG with overviews
But right now, OTB (8.1.0) is not able to write COG with overviews (the issue has been described on the forum here). The following trick makes it possible to write a COG; however, the output raster doesn't include overviews (which are a crucial feature of COG).
otbcli_SomeApplication ... -out "/path/to/output.tif?&gdal:co:TILED=YES&gdal:co:COPY_SRC_OVERVIEWS=YES"
gdalinfo /path/to/output.tif | grep Overviews # Nothing here...
And of course, running gdaladdo on the output after its creation doesn't lead to a valid COG (we can use dist-packages/osgeo_utils/samples/validate_cloud_optimized_geotiff.py to check whether a file is a valid COG).
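Until OTB can do this natively, one workaround is a second pass through the GDAL COG driver, reusing the gdal_translate call shown earlier. Here is a minimal sketch in Python, assuming gdal_translate (GDAL 3.1 or later) is on the PATH; the function names and the paths in the comment are illustrative only, not part of OTB or GDAL:

```python
import subprocess


def build_cog_command(src, dst, compress="LZW"):
    """Return the gdal_translate argv that rewrites `src` as a COG at `dst`."""
    return ["gdal_translate", src, dst, "-of", "COG", "-co", f"COMPRESS={compress}"]


def convert_to_cog(src, dst, compress="LZW"):
    """Run the conversion; raises CalledProcessError if gdal_translate fails."""
    subprocess.run(build_cog_command(src, dst, compress), check=True)


# Example call (hypothetical paths; the first one is whatever OTB wrote):
# convert_to_cog("/path/to/output.tif", "/path/to/output_cog.tif")
```

This costs a full extra read and write of the raster, which is exactly what a native COG writer in OTB would avoid.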
What could be done
First we should investigate if the creation of a COG is a streamable process. Given how overviews are computed, I am tempted to answer "yes".
However, a quick hack in otbGDALImageIO.cxx (replacing the 'GTiff' driver with 'COG') gave me this message: (WARNING): The file format of /tmp/cogtest.tif does not support streaming. All data will be loaded to memory.
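To illustrate why I think the overview computation itself could be streamed, here is a toy sketch in pure Python. It assumes overviews are built by averaging non-overlapping 2x2 blocks (GDAL offers several resampling methods, so this is only one possible choice) and that image and strip dimensions are even. It only covers the pixel computation, not how the COG driver lays the file out on disk.

```python
def downsample_2x2(rows):
    """Average non-overlapping 2x2 blocks of a list of equal-length rows.

    Assumes an even number of rows and columns, to keep the sketch short.
    """
    out = []
    for r in range(0, len(rows), 2):
        top, bottom = rows[r], rows[r + 1]
        out.append([
            (top[c] + top[c + 1] + bottom[c] + bottom[c + 1]) / 4
            for c in range(0, len(top), 2)
        ])
    return out


def streamed_overview(rows, strip_height):
    """Build the same overview strip by strip (strip_height must be even)."""
    out = []
    for start in range(0, len(rows), strip_height):
        out.extend(downsample_2x2(rows[start:start + strip_height]))
    return out


# 8x8 toy image: each strip only needs its own rows to produce overview rows,
# so the streamed result matches the whole-image result.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
assert streamed_overview(image, strip_height=2) == downsample_2x2(image)
```

Each overview row depends only on the two source rows above it, so a strip-based writer would never need the whole image in memory for this step.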
I don't know how this should be implemented in terms of user interface, but I was thinking of a COG-specific extended filename, and/or an environment variable that makes all output rasters be written as COG by default.
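To make the idea concrete, here is a rough sketch of the decision logic in Python. Both the `cog` extended filename key and the `OTB_WRITE_COG` environment variable are hypothetical names invented for this example; neither exists in OTB today.

```python
import os
from urllib.parse import parse_qs

TRUTHY = ("1", "true", "yes", "on")


def wants_cog(extended_filename, environ=os.environ):
    """Decide whether an output should be written as COG.

    Both the `cog` key and OTB_WRITE_COG are hypothetical names.
    """
    # Everything after the first '?' is treated as "&key=value" pairs.
    _, _, query = extended_filename.partition("?")
    options = parse_qs(query.lstrip("&"))
    if "cog" in options:  # an explicit per-file choice wins
        return options["cog"][-1].lower() in TRUTHY
    # Otherwise fall back to the (hypothetical) global default.
    return environ.get("OTB_WRITE_COG", "").lower() in TRUTHY
```

With this logic, "/path/to/output.tif?&cog=true" would request a COG for one output, while setting the environment variable would change the default for every output unless a filename says otherwise.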