Test refactoring
What changes will be made and why they would make a better Orfeo ToolBox?
Refactor the tests in order to have a faster validation procedure. Ideally, this refactoring should also simplify the testing strategy, and result in a better balance between over-tested and under-tested classes.
High level description
3 steps are proposed for this refactoring:
- Refactor the longest tests. 41 tests have been identified that take more than 10 s to run.
- Design a test framework without IO, using only in-memory images.
- Refactor the filter Tv tests to use the new test framework.
Prior to these steps, 2 other tasks should simplify this work:
- #1765 (closed): the modules to be moved out of OTB should reduce the number of tests
- !299 (merged): refactor all functor-based filters, their tests will be much simpler
Risks and benefits
The main risk is the refactoring time, which could be significant as there are about 2300 tests.
On the bright side, it will be possible to greatly reduce the total testing time, as well as the need for a large test data repository.
Alternatives for implementations
Step 1
The list of 41 pre-selected tests to refactor is here (with associated timings in seconds):
10.1 ioTvDEMToImageGeneratorFromImageTest_SensorModel
10.38 bfTuMeanShiftSmoothingImageFilterROIQBMul4
10.49 coTvCoordinateToNameMultithreadTest
10.93 obTuMeanShiftStreamingConnectedComponentSegmentationOBIAToVectorDataFilter
11.1 ioTvKmzProductWriterWithGCP
11.27 apTvSeLargeScaleMeanShiftTest
11.6 leTvSVMMachineLearningModelReg
13.18 bfTvMeanShiftSmoothingImageFilterThreadingNonOpt
13.28 apTvSeSegmentationWatershedVector
14.37 dmTvDisparityMapEstimationMethod
14.72 leTeSEMModelEstimatorExampleTest
14.92 apTvSeSegmentationCCVector_ULOVW
15.2 leTeSOMClassifierExampleTest
15.29 apTvSeSegmentationMeanshiftVector
15.84 obTuMeanShiftConnectedComponentSegmentationFilter
16.47 feTvLocalHoughDraw2
16.63 apTvRaOpticalCalibration_UnknownSensor
16.79 apTvSeSegmentationCCVector_ULU
18.25 apTvClKMeansImageClassification_composite
20.58 apTvSeSegmentationCCVector
22.13 raTvReflectanceToSurfaceReflectanceImageFilter2
23.11 bfTuMeanShiftSmoothingImageFilterQBRoad
23.75 apTvCdbDSFuzzyModelEstimation_LI
26.45 apTvCdbDSFuzzyModelEstimation_LI_autoInit
27.77 apTvClSampleAugmentationReplicate
28.12 apTvFeLineSegmentDetectionNoRescale
29.79 apTvClSampleAugmentationSmote
29.96 leTvGradientBoostedTreeMachineLearningModel
31.1 maTeMarkovClassification3ExampleTest
32.73 leTeSVMImageEstimatorClassificationMultiExampleTest
33.31 apTvClSampleAugmentationJitter
34.57 hyTvMDMDNMFImageFilterTest2
36.32 apTvFeLineSegmentDetection
38.8 leTeSOMExampleTest
51.22 ioTeTileMapImageIOExampleTest
52.66 reTeImageRegistration2ExampleTest
60.84 ioTvSHPVectorDataFileReader3
60.98 ioTvTileMapImageIOWeb
88.13 dmTvFineRegistrationImageFilterTestWithMeanSquare
112.69 owTvQtWidgetShow
135.51 apTvClMethodGBTImageClassifierQB1
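To decide where to start, it may help to see how the slow-test time is spread across test groups. The sketch below is only an illustration: it assumes that the two-letter prefix of each test name (ap, le, io, ...) identifies a test group, and the names `TimedTest` and `SlowTimePerPrefix` are hypothetical helpers, not part of OTB.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// (seconds, test name) pairs, e.g. copied from the list above.
using TimedTest = std::pair<double, std::string>;

// Sum the run time per two-letter test-name prefix, keeping only the tests
// that are slower than thresholdSeconds.
// Assumption: the first two characters of a test name identify its group.
std::map<std::string, double> SlowTimePerPrefix(const std::vector<TimedTest>& tests,
                                                double thresholdSeconds)
{
  std::map<std::string, double> total;
  for (const auto& test : tests)
  {
    if (test.first > thresholdSeconds && test.second.size() >= 2)
    {
      total[test.second.substr(0, 2)] += test.first;
    }
  }
  return total;
}
```

Calling `SlowTimePerPrefix(tests, 10.0)` on the list above would give one accumulated time per prefix, which can then be sorted to pick the groups to refactor first.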
Step 2
In module OTBTestKernel:
- a set of functions to produce hardcoded images:
  - template functions working on both otb::Image and otb::VectorImage
  - start with image allocation (buffered region equal to the largest possible region)
  - fills the image with a formula, an array, ...
- a set of functions to set hardcoded image metadata:
  - fills any image metadata (projection ref, origin, spacing, OSSIM keywordlist, ...)
- a set of functions to produce hardcoded vector data:
  - two families of functions: for otb::VectorData or otb::OGRDataSource
  - fills the dataset with hardcoded geometries and fields
  - hardcoded metadata (projection ref, ...)
- new generic functions to check the result of a boolean condition (similar to an assert statement). There are blocking and non-blocking flavours (a blocking check will stop the test execution if it fails).
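The image generators and the two flavours of checks listed above could look roughly like the following. This is a minimal sketch that deliberately avoids the real OTB types: `FakeImage` is a stand-in for otb::Image, and `MakeImageFromFormula`, `CheckContext`, `Expect` and `Require` are hypothetical names chosen for illustration. It also assumes that a blocking check stops the test by throwing an exception, which is only one possible design.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for an in-memory image (NOT otb::Image): the whole image is
// allocated at once, stored row by row.
template <typename TPixel>
struct FakeImage
{
  std::size_t width = 0;
  std::size_t height = 0;
  std::vector<TPixel> buffer;

  TPixel& at(std::size_t x, std::size_t y) { return buffer[y * width + x]; }
  const TPixel& at(std::size_t x, std::size_t y) const { return buffer[y * width + x]; }
};

// Allocate the full image, then fill every pixel with formula(x, y).
template <typename TPixel, typename TFormula>
FakeImage<TPixel> MakeImageFromFormula(std::size_t width, std::size_t height, TFormula formula)
{
  FakeImage<TPixel> image;
  image.width = width;
  image.height = height;
  image.buffer.resize(width * height);
  for (std::size_t y = 0; y < height; ++y)
  {
    for (std::size_t x = 0; x < width; ++x)
    {
      image.at(x, y) = static_cast<TPixel>(formula(x, y));
    }
  }
  return image;
}

// Records check results for one test.
class CheckContext
{
public:
  // Non-blocking check: a failure is logged and counted, the test goes on.
  bool Expect(bool condition, const std::string& message)
  {
    if (!condition)
    {
      ++m_Failures;
      std::cerr << "Check failed: " << message << '\n';
    }
    return condition;
  }

  // Blocking check: a failure is logged and counted, then stops the test.
  void Require(bool condition, const std::string& message)
  {
    if (!Expect(condition, message))
    {
      throw std::runtime_error("Blocking check failed: " + message);
    }
  }

  int Failures() const { return m_Failures; }

private:
  int m_Failures = 0;
};
```

A test would typically use `Require` for preconditions (for example, the output has the expected size) and `Expect` for the individual pixel or metadata comparisons, then return a status based on `Failures()`.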
Step 3
Tv tests can be re-written to use the in-memory test data. They are composed of the following steps:
- Instantiate the filter to be tested
- Instantiate a given fake image
- Set the fake image as input of the filter
- Set the filter parameters
- Call Update() on the filter
- Check the differences with the baseline (if any)
- Check other outputs of the filter (if any)
- If all checks passed: return EXIT_SUCCESS
- If one check failed:
  - write a PNG file for the difference image (if any)
  - return EXIT_FAILURE
With this strategy, the comparison of test and baseline images in CDash is preserved. The baselines don't have to be images on disk; they can also be hardcoded in the test (suitable for tiny images, such as 10 x 10 pixels).
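The steps above, with a baseline hardcoded in the test, can be sketched as follows. Everything here is a stand-in written for illustration: `TinyImage`, `ScaleFilter`, `MatchesBaseline` and `scaleFilterInMemoryTest` are hypothetical and only mimic the SetInput / Update / GetOutput pattern; a real test would use the actual OTB filter and the helpers of the new framework, and would write the difference PNG on failure.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <vector>

// Stand-in for a tiny in-memory image (NOT an OTB type).
struct TinyImage
{
  std::size_t width;
  std::size_t height;
  std::vector<float> pixels;
};

// Hypothetical filter: multiplies every pixel by a scale factor.
class ScaleFilter
{
public:
  void SetInput(const TinyImage& input) { m_Input = input; }
  void SetScale(float scale) { m_Scale = scale; }
  void Update()
  {
    m_Output = m_Input;
    for (float& p : m_Output.pixels)
    {
      p *= m_Scale;
    }
  }
  const TinyImage& GetOutput() const { return m_Output; }

private:
  TinyImage m_Input{};
  TinyImage m_Output{};
  float m_Scale = 1.0f;
};

// True when both images have the same size and all pixels agree within tol.
bool MatchesBaseline(const TinyImage& output, const TinyImage& baseline, float tol)
{
  if (output.width != baseline.width || output.height != baseline.height ||
      output.pixels.size() != baseline.pixels.size())
  {
    return false;
  }
  for (std::size_t i = 0; i < output.pixels.size(); ++i)
  {
    if (std::fabs(output.pixels[i] - baseline.pixels[i]) > tol)
    {
      return false;
    }
  }
  return true;
}

int scaleFilterInMemoryTest(int, char*[])
{
  ScaleFilter filter;                                   // instantiate the filter
  const TinyImage input{2, 2, {1.f, 2.f, 3.f, 4.f}};    // instantiate a fake image
  filter.SetInput(input);                               // set it as filter input
  filter.SetScale(2.f);                                 // set the filter parameters
  filter.Update();                                      // run the filter

  const TinyImage baseline{2, 2, {2.f, 4.f, 6.f, 8.f}}; // hardcoded baseline
  if (MatchesBaseline(filter.GetOutput(), baseline, 1e-6f))
  {
    return EXIT_SUCCESS;
  }
  // A real test would write the difference image as a PNG file here.
  std::cerr << "Output differs from the hardcoded baseline\n";
  return EXIT_FAILURE;
}
```

No file is read or written on the success path, which is where the expected gain in testing time would come from.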
Who will be developing the proposed changes?
@salbert & OTB Dev Team