Regression refactoring: TrainVectorRegression

Cédric Traizet requested to merge regression_refactoring into develop


This MR introduces a new application, TrainVectorRegression, for training a regression machine learning model from vector data, in the same fashion as TrainVectorClassifier.


See #1799 (closed)

This MR includes two major changes:

  • The application TrainVectorBase (the base class for TrainVectorClassifier) is now templated on TInputValue (features) and TOutputValues (class). Before this MR it only worked with float as the feature type and int as the class type (classification case), but it can now also be used with other types, such as float, float (regression case).

  • A new application, TrainVectorRegression, deriving from TrainVectorBase.
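To illustrate the two changes above, here is a minimal, self-contained sketch of the pattern. It is not the code of this MR: the class names ending in `Sketch`, the methods `AddSample`, `GetNumberOfSamples`, `GetTargets` and `GetMode`, and the storage members are all hypothetical stand-ins, and nothing here depends on OTB. It only shows how a base class templated on the feature and target types can serve both the classification case (float, int) and the regression case (float, float).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the templated base application.
// It only stores samples so that the two type parameters are visible.
template <class TInputValue, class TOutputValue>
class TrainVectorBaseSketch
{
public:
  using InputValueType  = TInputValue;
  using OutputValueType = TOutputValue;
  using SampleType      = std::vector<InputValueType>;

  virtual ~TrainVectorBaseSketch() = default;

  // Store one sample: a feature vector and its target (class label or value).
  void AddSample(const SampleType& features, OutputValueType target)
  {
    // All samples are expected to have the same number of features.
    assert(m_Features.empty() || features.size() == m_Features.front().size());
    m_Features.push_back(features);
    m_Targets.push_back(target);
  }

  std::size_t GetNumberOfSamples() const { return m_Targets.size(); }

  const std::vector<OutputValueType>& GetTargets() const { return m_Targets; }

  // Each derived application reports which learning mode it implements.
  virtual std::string GetMode() const = 0;

private:
  std::vector<SampleType>      m_Features;
  std::vector<OutputValueType> m_Targets;
};

// Classification case: float features, int class labels.
class TrainVectorClassifierSketch : public TrainVectorBaseSketch<float, int>
{
public:
  std::string GetMode() const override { return "classification"; }
};

// Regression case: float features, float target values.
class TrainVectorRegressionSketch : public TrainVectorBaseSketch<float, float>
{
public:
  std::string GetMode() const override { return "regression"; }
};

// The target type is the only difference between the two instantiations.
static_assert(std::is_same<TrainVectorClassifierSketch::OutputValueType, int>::value,
              "classification targets are int");
static_assert(std::is_same<TrainVectorRegressionSketch::OutputValueType, float>::value,
              "regression targets are float");
```

With this layout, the code shared by both applications lives once in the base class, and each derived application only fixes the two type parameters and adds what is specific to its mode.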


A test has been added for the new application, using the rf (random forest) classifier as the regression algorithm. In the end, all regression algorithms should be tested, but I think we can do that in the (future) TrainImagesRegression, to keep the same testing strategy as for classification.

Additional notes

This is not exactly the workflow described in the issue, because I don't think the first step (removing sampling from TrainRegression) is relevant, as TrainRegression will be deprecated at the end of the refactoring.

The next step of the refactoring is to create a TrainImagesRegression application. It could be a composite application that chains ImageEnvelope to create a polygon over the extent of the image, SampleSelection to select random points within this polygon, SampleExtraction to extract feature and predictor values from two input images, and finally TrainVectorRegression to train the model (this is the workflow used in the KMeansClassification composite application). What do you think? In any case, I think this is out of the scope of this MR.

In the issue we talked about CSV input compatibility. It is hard to add to TrainVectorBase because of the design of the application. The best way (given the design of the learning applications) would probably be to create a new application, TrainCSVBase, inheriting from LearningApplicationBase and handling the CSV input reading, and then derive a TrainCSVRegression from it, and maybe also a TrainCSVClassifier. But is there really a need for such functionality?


