Add the option to provide user-defined centroids as initialization of the KMeans algorithm in KMeansClassification and TrainVectorClassifier.
See feature request #1820 (closed)
The result of the KMeans algorithm depends on the input centroids, but it is currently not possible to set them (the first k points of the training sample are used as initialization). This MR adds the possibility to provide the centroids in a text file.
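The dependence on the initial centroids can be illustrated with a minimal standalone sketch. It uses only the C++ standard library and is not the Shark or OTB implementation; the names `Point`, `SquaredDistance`, `NearestCentroid` and `RunKMeans` are hypothetical and exist only for this example. It runs plain Lloyd iterations from whatever centroids the caller provides.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical types and helpers for illustration only (not OTB or Shark API).
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const Point& a, const Point& b)
{
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const double d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Index of the centroid closest to p (the first one wins on ties).
std::size_t NearestCentroid(const Point& p, const std::vector<Point>& centroids)
{
  std::size_t best = 0;
  double bestDist = std::numeric_limits<double>::max();
  for (std::size_t c = 0; c < centroids.size(); ++c)
  {
    const double d = SquaredDistance(p, centroids[c]);
    if (d < bestDist)
    {
      bestDist = d;
      best = c;
    }
  }
  return best;
}

// Lloyd iterations starting from the caller-provided centroids.
// A cluster that receives no sample keeps its previous centroid.
std::vector<Point> RunKMeans(const std::vector<Point>& samples,
                             std::vector<Point> centroids,
                             unsigned int maxIterations)
{
  assert(!samples.empty() && !centroids.empty());
  const std::size_t dim = samples.front().size();
  for (unsigned int it = 0; it < maxIterations; ++it)
  {
    std::vector<Point> sums(centroids.size(), Point(dim, 0.0));
    std::vector<std::size_t> counts(centroids.size(), 0);
    for (const Point& p : samples)
    {
      const std::size_t c = NearestCentroid(p, centroids);
      for (std::size_t i = 0; i < dim; ++i)
        sums[c][i] += p[i];
      ++counts[c];
    }
    bool changed = false;
    for (std::size_t c = 0; c < centroids.size(); ++c)
    {
      if (counts[c] == 0)
        continue;
      for (std::size_t i = 0; i < dim; ++i)
      {
        const double updated = sums[c][i] / static_cast<double>(counts[c]);
        if (updated != centroids[c][i])
          changed = true;
        centroids[c][i] = updated;
      }
    }
    if (!changed)
      break; // converged: no centroid moved
  }
  return centroids;
}
```

On the one-dimensional samples 0, 1, 10, 11, 20, 21 with k = 2, seeding with the first two points and seeding with (10.5, 20.5) converge to different centroid pairs, which is why being able to choose the seeds matters.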
In the TrainVectorClassifier application, the following parameters have been added:
In the KMeansClassification composite application, the following parameters have been added:
Input centroid file reading is done using the Shark API (importCSV).
In SharkKMeansMachineLearningModel, the normalization option has been removed. Normalization was possible during training (Train()), using the Shark API to train a normalizer on the input list sample. This option was not used anywhere in OTB, and I removed it because the normalizer cannot be used afterward during classification. Instead, data normalization should be done prior to training (as is done in the applications).
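The reason for normalizing before training can be sketched as follows: the statistics are computed once, kept outside the model, and applied identically to the training samples and to the samples classified later. This is a standalone illustration with hypothetical names (`FeatureStats`, `ComputeStats`, `Normalize`), assuming a simple mean and standard deviation normalization; it does not reproduce how the OTB applications implement it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only (not OTB or Shark API).
using Sample = std::vector<double>;

// Per-feature statistics, computed once and reused at classification time.
struct FeatureStats
{
  std::vector<double> mean;
  std::vector<double> stddev;
};

// Computes the per-feature mean and population standard deviation.
FeatureStats ComputeStats(const std::vector<Sample>& samples)
{
  assert(!samples.empty());
  const std::size_t dim = samples.front().size();
  const double n = static_cast<double>(samples.size());
  FeatureStats stats;
  stats.mean.assign(dim, 0.0);
  stats.stddev.assign(dim, 0.0);
  for (const Sample& s : samples)
  {
    assert(s.size() == dim);
    for (std::size_t i = 0; i < dim; ++i)
      stats.mean[i] += s[i];
  }
  for (std::size_t i = 0; i < dim; ++i)
    stats.mean[i] /= n;
  for (const Sample& s : samples)
  {
    for (std::size_t i = 0; i < dim; ++i)
    {
      const double d = s[i] - stats.mean[i];
      stats.stddev[i] += d * d;
    }
  }
  for (std::size_t i = 0; i < dim; ++i)
    stats.stddev[i] = std::sqrt(stats.stddev[i] / n);
  return stats;
}

// Centers and scales one sample; a constant feature (stddev == 0) is only centered.
Sample Normalize(const Sample& s, const FeatureStats& stats)
{
  assert(s.size() == stats.mean.size());
  Sample out(s.size());
  for (std::size_t i = 0; i < s.size(); ++i)
  {
    const double scale = stats.stddev[i] > 0.0 ? stats.stddev[i] : 1.0;
    out[i] = (s[i] - stats.mean[i]) / scale;
  }
  return out;
}
```

Because `FeatureStats` lives outside the model, the same values can be applied to any sample at classification time, which a normalizer trained inside Train() did not allow.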
In SharkKMeansMachineLearningModel, I added a method to export the centroids as a text file (using Shark's exportCSV method). This can be used to obtain a human-readable version of the centroids (the serialized model file can be hard to read). The centroids can now be exported in the TrainVectorClassifier application, and KMeansClassification uses this method instead of creating the output centroid file from the serialized file (this was not working anyway: the output centroids were wrong).
This means that the output centroid text file from a KMeans application can be used as input of another KMeans application.
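The round trip can be sketched with the standard library alone. This is not the Shark importCSV/exportCSV code path used by the MR; `WriteCentroids` and `ReadCentroids` are hypothetical helpers, and the layout (one centroid per line, values separated by a single character chosen by the caller) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type for illustration only (not OTB or Shark API).
using Centroid = std::vector<double>;

// Writes one centroid per line, values separated by `separator`.
// The stream precision is raised so that doubles survive the round trip.
void WriteCentroids(std::ostream& out, const std::vector<Centroid>& centroids, char separator)
{
  out << std::setprecision(std::numeric_limits<double>::max_digits10);
  for (const Centroid& c : centroids)
  {
    for (std::size_t i = 0; i < c.size(); ++i)
    {
      if (i > 0)
        out << separator;
      out << c[i];
    }
    out << '\n';
  }
}

// Reads centroids written by WriteCentroids; empty lines are skipped.
std::vector<Centroid> ReadCentroids(std::istream& in, char separator)
{
  std::vector<Centroid> centroids;
  std::string line;
  while (std::getline(in, line))
  {
    Centroid c;
    std::istringstream fields(line);
    std::string field;
    while (std::getline(fields, field, separator))
    {
      if (!field.empty())
        c.push_back(std::stod(field));
    }
    if (c.empty())
      continue;
    // All centroids must have the same number of components.
    assert(centroids.empty() || c.size() == centroids.front().size());
    centroids.push_back(c);
  }
  return centroids;
}
```

Writing the centroids of one run to a stream and reading them back gives the same values, so they can seed a second run unchanged.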
The copyright owner is CNES and has signed the ORFEO ToolBox Contributor License Agreement.
Check before merging: