Refactor regression applications
The TrainRegression and PredictRegression applications do not follow the same rationale as their classification counterparts. The new sampling framework in OTB made it possible to refactor the classification applications so that the steps of the machine learning pipeline (sampling, feature extraction, training and prediction) are split into separate applications. It also made it possible to provide an API that separates raster and vector inputs into different applications (TrainVectorClassifier vs TrainImagesClassifier, for instance).
However, regression in OTB still relies on just two applications: one for training, which also performs the sample selection, and one for prediction, which does not work with vector inputs. Furthermore, the training application only accepts raster data or point sets in a CSV file and cannot use the vector files produced by the sampling framework.
The goal of this FR is to propose the steps needed to provide an API for regression equivalent to the classification API. These would include:
- Split the train application in 2, one for raster input and another one for vector input (including CSV for backwards compatibility)
- Split the predict application in 2, one for raster input and another for vector input
- Remove all sampling operations from the training applications and delegate them to the sampling framework.
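The split proposed above can be summarized as a small lookup table. This is only a sketch: the classification application names are the existing OTB ones, while the regression names (TrainImagesRegression, TrainVectorRegression, ImageRegression, VectorRegression) are hypothetical placeholders chosen by analogy and are not decided in this FR. The helper `regression_app_for` is likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    stage: str               # "train" or "predict"
    input_kind: str          # "raster" or "vector"
    classification_app: str  # existing OTB classification application
    regression_app: str      # HYPOTHETICAL regression counterpart (assumption)


# Existing classification applications and assumed regression counterparts.
PROPOSED_API = [
    Step("train", "raster", "TrainImagesClassifier", "TrainImagesRegression"),
    Step("train", "vector", "TrainVectorClassifier", "TrainVectorRegression"),
    Step("predict", "raster", "ImageClassifier", "ImageRegression"),
    Step("predict", "vector", "VectorClassifier", "VectorRegression"),
]

# Sampling is no longer done by the training applications; it is delegated
# to the existing sampling framework applications.
SAMPLING_APPS = ["SampleSelection", "SampleExtraction"]


def regression_app_for(stage: str, input_kind: str) -> str:
    """Return the (hypothetical) regression application for a stage and input kind."""
    for step in PROPOSED_API:
        if step.stage == stage and step.input_kind == input_kind:
            return step.regression_app
    raise KeyError(f"no application for stage={stage!r}, input={input_kind!r}")
```

With this layout, a CSV point set would go through the vector training application, which is how backwards compatibility with the current TrainRegression input is kept.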
EDIT
Work has been organized in:
- Remove all sampling operations from the training applications and delegate them to the sampling framework (5pts)
- Split the train application in 2, one for raster input and another one for vector input (including CSV for backwards compatibility) (8pts)
- Split the predict application in 2, one for raster input and another for vector input (8pts)
- A cookbook recipe for this new framework (5pts)