Update Remove OSSIM with a beginning of the description of the architecture, authored by Julien Osman
# Remove OSSIM
[[_TOC_]]
## Context
OSSIM is used for geometric sensor modelling and metadata parsing, and
has been since the beginning of OTB. Adapter classes were later
added to hide OSSIM headers from the OTB public
API. It is now time to plan the removal of this dependency, whose
development cycle is difficult to follow. In the current state, only a
small portion of OSSIM is used anyway.
See https://gitlab.orfeo-toolbox.org/orfeotoolbox/otb/-/issues/1506
for more details.
This page compiles the description of the technical choices made
during the process of removing OSSIM.
## Risks and benefits
Risks:
- Large refactoring: a substantial amount of work is needed,
mostly for porting the geometric models.
- Impacts on the validation tests, and baselines.
- Several components are complex to migrate (DEMHandler,
SarSensorModel,...)
Benefits:
- 4 mandatory dependencies will be removed from OTB: Ossim,
OssimPlugins, GeoTiff, OpenThreads
- 2 modules will be removed: OTBOssimAdapters and
OTBOpenThreadsAdapters
- Sanitize the code base
- Easier support of new sensor models
- Better metadata architecture (which allows better metadata
processing in our pipelines)
- Separation between metadata parsing, and geometric modelling.
## Presentation of the new architecture
### Metadata parsing
Here is a new workflow to replace the current function
``ReadGeometryFromImage()``:
![image](uploads/1c06ec00864af716582714cfe7563345/image.png)
1. Reading the metadata files. The purpose of this step is to parse
   each metadata file associated with the image file and supply it as
   an (in-memory) XML tree. This tree is given to an
   ImageMetadataInterface (IMI) that will look for the needed
   information. The parsing is done by different classes,
   depending on the file format. They can all derive from the base
   class MetadataSupplierInterface. The three suppliers are:
   * GDALImageIO will use GDALDataset::GetMetadata() to extract
     'key=value' pairs and format them into an XML tree.
   * XMLMetadataSupplier uses GDAL's XML parsing mechanism
     ("ReadXMLToList" method from the "GDALMDReaderBase" class) to
     convert the XML file into a GDAL ListString, which is a
     succession of 'key=value' pairs.
   * TextMetadataSupplier tries to parse 'key=value' pairs.

   Those classes all implement the method *GetMetadataValue*, which
   returns the value of the metadata for a given key. The base class
   also implements the methods *GetAs* and *GetAsVector*, which are
   used by the IMI.
1. The GDAL input/output capabilities are encapsulated in the
   GDALImageIO class, which derives from ImageIO. This class is in
   charge of fetching the metadata from the product (supplier
   capabilities inherited from the class MetadataSupplierInterface),
   and of writing the metadata to the product (storage capabilities
   inherited from the class MetadataStorageInterface).
1. We use a classic IMIFactory to find out whether a given IMI (associated with a
   given sensor) can parse the metadata of a product. The IMI's
   *parse* method will pick the metadata from the ImageIO and fill an
   *ImageMetadata* object. This step consists in finding the relevant
   metadata in the different Metadata Suppliers and using the *Add()*
   method of the *ImageMetadata* object to store the metadata. If the
   parsing returns successfully, the generated ImageMetadata is given
   to the *ImageCommon*, which propagates it through the pipeline.
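
The supplier side of this workflow can be pictured with a minimal sketch. It is not the OTB implementation: the class names `MetadataSupplierSketch` and `TextSupplierSketch`, the exact signature of `GetMetadataValue`, the helper `ReadSunElevation` and the key `SUN_ELEVATION` are all assumptions made for illustration. It only shows how a typed `GetAs` accessor could be layered on top of a string-based `GetMetadataValue`, and how an IMI might query a supplier.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Simplified stand-in for MetadataSupplierInterface (not the real OTB header).
class MetadataSupplierSketch
{
public:
  virtual ~MetadataSupplierSketch() = default;

  // Returns the raw value for `key`; hasValue is set to false if the key is missing.
  virtual std::string GetMetadataValue(const std::string& key, bool& hasValue) const = 0;

  // Typed accessor built on top of GetMetadataValue, as an IMI would use it.
  template <typename T>
  T GetAs(const std::string& key) const
  {
    bool              hasValue = false;
    const std::string raw      = GetMetadataValue(key, hasValue);
    if (!hasValue)
      throw std::runtime_error("Missing metadata: " + key);
    std::istringstream iss(raw);
    T                  value;
    if (!(iss >> value))
      throw std::runtime_error("Cannot convert metadata: " + key);
    return value;
  }
};

// Stand-in for TextMetadataSupplier: parses 'key=value' lines from a text buffer.
class TextSupplierSketch : public MetadataSupplierSketch
{
public:
  explicit TextSupplierSketch(const std::string& content)
  {
    std::istringstream lines(content);
    std::string        line;
    while (std::getline(lines, line))
    {
      const std::string::size_type pos = line.find('=');
      if (pos != std::string::npos) // lines without '=' are ignored
        m_Dict[line.substr(0, pos)] = line.substr(pos + 1);
    }
  }

  std::string GetMetadataValue(const std::string& key, bool& hasValue) const override
  {
    const auto it = m_Dict.find(key);
    hasValue      = (it != m_Dict.end());
    return hasValue ? it->second : std::string();
  }

private:
  std::map<std::string, std::string> m_Dict;
};

// What the parse step of an IMI might do with a supplier (hypothetical key name).
inline double ReadSunElevation(const MetadataSupplierSketch& supplier)
{
  const double elevation = supplier.GetAs<double>("SUN_ELEVATION");
  assert(elevation >= -90.0 && elevation <= 90.0); // sanity check on the parsed angle
  return elevation;
}
```

In this sketch the IMI only depends on the supplier interface, so the same parsing code could be fed by a GDAL-based, XML-based or text-based supplier.
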
## Justifications for the technical choices
### Refactor OTB Metadata
#### Band-dependent metadata
The new otb::ImageMetadata object, which will supersede the
KeywordDictionary, distinguishes between common metadata and
band-specific metadata. This means that filters that alter the number of
bands have to update the ImageMetadata in order to reflect this
alteration.
For example, a filter that takes a multi-band image as input and
generates a 1-band composition as output should process the
`ImageMetadata` object and take into account the band-specific metadata
from the input to generate a 1-band `ImageMetadata` as output.
This should be done in the `GenerateOutputInformation()` method. For example:
```cpp
void MyFilter::GenerateOutputInformation()
{
  // Let the superclass set up the default output information first.
  Superclass::GenerateOutputInformation();

  // Set the number of output bands.
  this->GetOutput()->SetNumberOfComponentsPerPixel(2);

  // Override the default metadata copying behavior: copy the metadata of
  // bands 1 and 2 of the filter input.
  this->GetOutput()->SetImageMetadata(this->GetInput()->GetImageMetadata().Slice(1, 2));
}
```
The default behavior is the suppression of band-specific metadata when
the filter changes the number of bands. Note that in a lot of cases this is the expected behavior, because the output band information is not related to the input metadata.
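
To make the split between common and band-specific metadata concrete, here is a toy model. It is only a sketch: `ImageMetadataSketch`, its members, the 0-based inclusive index convention of `Slice` and the helper `WithBandCount` are assumptions for illustration and do not describe the real `otb::ImageMetadata` API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model of an image metadata object: common keys plus one dictionary per band.
struct ImageMetadataSketch
{
  using Dict = std::map<std::string, std::string>;

  Dict              Common; // metadata shared by all bands
  std::vector<Dict> Bands;  // band-specific metadata, one entry per band

  // Keeps the common metadata and the bands in the inclusive range [first, last]
  // (0-based indices; this convention is an assumption of the sketch).
  ImageMetadataSketch Slice(std::size_t first, std::size_t last) const
  {
    assert(first <= last && last < Bands.size());
    ImageMetadataSketch out;
    out.Common = Common;
    out.Bands.assign(Bands.begin() + first, Bands.begin() + last + 1);
    return out;
  }

  // Default behavior described above: if the band count changes, the
  // band-specific metadata is dropped and only the common part survives.
  ImageMetadataSketch WithBandCount(std::size_t nbBands) const
  {
    ImageMetadataSketch out;
    out.Common = Common;
    if (nbBands == Bands.size())
      out.Bands = Bands;
    else
      out.Bands.assign(nbBands, Dict());
    return out;
  }
};
```

A filter that selects a subset of bands would use something like `Slice`, while a filter that computes unrelated output bands would rely on the default behavior.
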
#### ImageMetadataInterface
The ```ImageMetadataInterface``` classes now implement a ```parse``` method that is in charge of filling the ```ImageMetadata``` object.
The unit tests for the ```ImageMetadata``` and ```ImageMetadataInterface``` classes currently work by writing the content of the ```ImageMetadata``` to a file and comparing it to a file in the baseline. This is not optimal because the ```ImageMetadata``` has to be read many times in order to be sure the metadata are printed in the same order for the comparison. Later, we will implement a comparison and a "Read_From_File" method for the ```ImageMetadata``` in order to avoid having to write the metadata to a file.
#### Sentinel-1
GDAL's driver for Sentinel-1 can be found
[here](https://gdal.org/drivers/raster/safe.html) and the source code
[here](https://github.com/OSGeo/gdal/tree/ef49c00611235df0c1ce4f51344f00567a668661/gdal/frmts/safe). This
driver doesn't read all the metadata. For instance, the calibration
metadata are missing.
Options are:
- Contributing to the driver
- Implement our own metadata parser
##### Contributing to the driver
Contributing to GDAL's driver would present multiple
advantages. First, it would benefit other GDAL users, who would be able
to access those missing metadata. Then, since the driver already
exists, the quantity of work to read the missing metadata should be
moderate. Moreover, using GDAL's driver means OTB will use GDALImageIO for
Sentinel-1 products, which is already implemented and simplifies the
processes. One drawback is that we would need to wait until the next
GDAL release to benefit from our contribution.
##### Implement our own metadata parser
Implementing our own parser would provide OTB with a generic
metadata parser, not linked to a specific sensor. It would be
particularly useful for geom files. The implementation would be
available immediately, without waiting for the next GDAL release. The
problem with this approach is that it would involve using a new
supplier (different from GDALImageIO), which would be complicated to
make compatible with the current pipeline.
##### Summary
| | Contributing | Implement |
|:-----------------------------|:--------------------------------|:-----------------------|
| Profits the community | :white_check_mark: | :red_circle: |
| Can be used for other sensors | :red_circle: | :white_check_mark: |
| Availability | :red_circle: wait for release | :white_check_mark: now |
| Quantity of work | :large_orange_diamond: moderate | :red_circle: large |
| Use GDALImageIO | :white_check_mark: | :red_circle: |
A discussion was opened on GDAL's mailing list about the contribution to
the driver. It did not seem to attract much interest, since calibration
metadata represent a lot of data to read and are not used by many
people.
The selected solution was to implement our own metadata parser, but
only for the metadata not read by GDAL. It is based on the
"ReadXMLToList" method from the "GDALMDReaderBase" class. Since this
method is not part of the public API, a similar one was implemented in
the OTB, based on GDAL's work.
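
The idea of turning an XML tree into a list of 'key=value' pairs can be sketched as follows. This is a simplified illustration, not the OTB or GDAL code: `XmlNodeSketch` and `FlattenToList` are hypothetical names, keys are built by joining element names with dots (an assumption of this sketch), and details such as attributes or repeated sibling elements are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal in-memory XML element (hypothetical stand-in for a parsed XML node).
struct XmlNodeSketch
{
  std::string                name;
  std::string                text; // text content, empty for pure containers
  std::vector<XmlNodeSketch> children;
};

// Recursively appends "path.to.element=value" for every element holding text.
inline void FlattenToList(const XmlNodeSketch& node, const std::string& prefix,
                          std::vector<std::string>& out)
{
  assert(!node.name.empty());
  const std::string path = prefix.empty() ? node.name : prefix + "." + node.name;
  if (!node.text.empty())
    out.push_back(path + "=" + node.text);
  for (const XmlNodeSketch& child : node.children)
    FlattenToList(child, path, out);
}
```

Once flattened this way, the metadata can be looked up by key exactly like the pairs returned by the other suppliers.
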
### Re-implement DEMHandler
The current `otb::DEMHandler` class is an adapter class for
OSSIM DEMs. The objective is to refactor this class to use GDAL
instead.
The RPC transform class in GDAL accepts a path to a DEM file, which is
then opened and used internally by the RPC class. The first idea was
to encapsulate this class in OTB. But after investigation, it appears
that the DEM management functions are not part of GDAL's
API. Therefore, it is not possible to access the DEM interpolation as
expected from OTB.
The new approach is a suggestion from Even Rouault on GDAL's mailing
list (see [this
message](https://lists.osgeo.org/pipermail/gdal-dev/2020-May/052225.html)
and [that
one](https://lists.osgeo.org/pipermail/gdal-dev/2020-May/052227.html)). The
idea is to use the GDALRasterIOExtraArg argument of the RasterIO
function, by setting bFloatingPointWindowValidity to TRUE and setting
dfXOff, dfYOff, dfXSize, dfYSize.
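
With this approach the interpolation itself is delegated to GDAL. As a rough picture of what an interpolated lookup at a fractional pixel position computes, here is a self-contained sketch of bilinear interpolation on a small grid. It does not call GDAL: `GridSketch` and `InterpolateBilinear` are hypothetical names, and the choice of bilinear resampling is an assumption, since the resampling algorithm is not stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major elevation grid (hypothetical stand-in for a DEM raster band).
struct GridSketch
{
  std::size_t         width;
  std::size_t         height;
  std::vector<double> values;

  double At(std::size_t x, std::size_t y) const
  {
    return values[y * width + x];
  }
};

// Bilinear interpolation at fractional grid coordinates (x, y), where integer
// coordinates fall exactly on grid samples.
inline double InterpolateBilinear(const GridSketch& g, double x, double y)
{
  assert(g.width >= 2 && g.height >= 2);
  assert(x >= 0.0 && y >= 0.0 && x <= g.width - 1.0 && y <= g.height - 1.0);

  // Clamp the top-left corner so that the 2x2 neighbourhood stays inside the grid.
  const std::size_t x0 = std::min(static_cast<std::size_t>(std::floor(x)), g.width - 2);
  const std::size_t y0 = std::min(static_cast<std::size_t>(std::floor(y)), g.height - 2);
  const double      tx = x - static_cast<double>(x0);
  const double      ty = y - static_cast<double>(y0);

  // Interpolate along x on the two rows, then along y between the rows.
  const double top    = (1.0 - tx) * g.At(x0, y0) + tx * g.At(x0 + 1, y0);
  const double bottom = (1.0 - tx) * g.At(x0, y0 + 1) + tx * g.At(x0 + 1, y0 + 1);
  return (1.0 - ty) * top + ty * bottom;
}
```
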
Geoids are managed with the `GDALOpenVerticalShiftGrid` and `GDALApplyVerticalShiftGrid` functions from the GDAL API. The former opens a 1D raster grid as a GDAL Datasource, and the latter creates a new datasource from the raster grid (geoid) and a raster (the DEM). Vertical datums (shifts from the reference ellipsoid) are applied on the fly.
The new DEMHandler has the same API as its OSSIM counterpart. In particular, the following methods are provided:
* GetHeightAboveEllipsoid():
* SRTM and geoid both available: dem_value + geoid_offset
* No SRTM but geoid available: default height above ellipsoid + geoid_offset
* SRTM available, but no geoid: dem_value
* No SRTM and no geoid available: default height above ellipsoid
* GetHeightAboveMSL():
* SRTM and geoid both available: dem_value
* No SRTM but geoid available: 0
* SRTM available, but no geoid: dem_value
* No SRTM and no geoid available: 0
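
The eight cases listed above can be condensed into a small decision function. This is a sketch of the rules only, not the DEMHandler code: the `ElevationSample` struct, the use of NaN to mark a missing source and the free-function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Inputs for one position; NaN marks a source that is not available.
struct ElevationSample
{
  double demValue;    // DEM (SRTM) value, NaN if no DEM covers the point
  double geoidOffset; // geoid offset, NaN if no geoid is loaded
};

// Mirrors the four GetHeightAboveEllipsoid() cases.
inline double HeightAboveEllipsoid(const ElevationSample& s, double defaultHeight)
{
  assert(!std::isnan(defaultHeight));
  const bool hasDem   = !std::isnan(s.demValue);
  const bool hasGeoid = !std::isnan(s.geoidOffset);
  if (hasDem && hasGeoid)
    return s.demValue + s.geoidOffset;
  if (hasGeoid)
    return defaultHeight + s.geoidOffset;
  if (hasDem)
    return s.demValue;
  return defaultHeight;
}

// Mirrors the four GetHeightAboveMSL() cases: the geoid never changes the result.
inline double HeightAboveMSL(const ElevationSample& s)
{
  return std::isnan(s.demValue) ? 0.0 : s.demValue;
}
```

Written this way, it is visible that the height above mean sea level depends only on the DEM, while the height above the ellipsoid adds the geoid offset whenever a geoid is available.
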
### Re-implement RPC model
### Re-implement generic SAR model
### Implement sensor factory for external models