PERF: Use Boost.SmallVec in BCO interpolator

added feature label

added 1 commit

fa393363 - PERF: Use Boost.SmallVec in BCO interpolator

changed the description

added breaking label

This looks great. What config is used in your benchmark?

After making EvaluateCoef non-virtual, the CPU times are longer, but the wall times are shorter:

264.82user 5.31system 0:30.86elapsed 875%CPU (0avgtext+0avgdata 1016596maxresident)k
0inputs+15071432outputs (0major+1443220minor)pagefaults 0swaps
267.91user 4.96system 0:31.26elapsed 872%CPU (0avgtext+0avgdata 1016548maxresident)k
0inputs+15071432outputs (0major+1443217minor)pagefaults 0swaps
264.21user 5.65system 0:36.12elapsed 746%CPU (0avgtext+0avgdata 1016396maxresident)k
0inputs+15071432outputs (0major+1443217minor)pagefaults 0swaps

I feel these numbers are too noisy to be reliable, but it's probably fine to keep that change.

And directly returning the vector:

247.04user 5.30system 0:31.45elapsed 802%CPU (0avgtext+0avgdata 1016364maxresident)k
0inputs+15071432outputs (0major+1443212minor)pagefaults 0swaps
247.62user 5.44system 0:33.31elapsed 759%CPU (0avgtext+0avgdata 1016368maxresident)k
0inputs+15071432outputs (0major+1443205minor)pagefaults 0swaps
252.58user 5.18system 0:30.52elapsed 844%CPU (0avgtext+0avgdata 1016552maxresident)k
0inputs+15071432outputs (0major+1443215minor)pagefaults 0swaps
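
For readers who have not seen the diff, the "directly returning the vector" variant can be sketched roughly as follows. This is only an illustration under stated assumptions: `InlineCoefs` and `LinearWeights` are hypothetical names, and the fixed-capacity struct is a standard-library stand-in for Boost's small vector (unlike the Boost type, it never falls back to the heap).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity buffer standing in for a Boost small vector.
// Storage lives inside the object, so creating one does not allocate.
template <std::size_t Capacity>
struct InlineCoefs
{
  std::array<double, Capacity> data{};
  std::size_t size = 0;

  void push_back(double v)
  {
    assert(size < Capacity); // no heap fallback in this stand-in
    data[size++] = v;
  }
};

// Hypothetical example function: returns two linear-interpolation weights
// by value instead of writing them through an out parameter.
InlineCoefs<8> LinearWeights(double offset)
{
  assert(offset >= 0.0 && offset <= 1.0);
  InlineCoefs<8> coefs; // local object, cannot alias any caller state
  coefs.push_back(1.0 - offset);
  coefs.push_back(offset);
  return coefs; // eligible for copy elision
}
```

Because the result is a fresh local object, the compiler knows that stores into it cannot modify anything else the function reads.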

added 1 commit

1b99e1f2 - PERF: Use Boost.SmallVec in BCO interpolator

changed the description

With out parameters, the compiler has to assume the vector may be shared (aliased) with other data, so the accumulation is best done into a local variable instead of vect[i].

Also, copying m_Alpha into a local variable, to tell the compiler that it cannot change while the function is being executed, should slightly improve performance.
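
A minimal sketch of both suggestions, assuming a cubic-convolution kernel with parameter alpha. `CubicKernel` is a hypothetical class and the `EvaluateCoef` signature shown here is an assumption for illustration, not the code from this MR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the interpolator; only the pattern matters.
class CubicKernel
{
public:
  explicit CubicKernel(double alpha) : m_Alpha(alpha) {}

  // Fills `coef` with normalized cubic-convolution weights for the
  // 2 * radius + 1 taps around a pixel, seen from fractional `offset`.
  void EvaluateCoef(double offset, int radius, std::vector<double>& coef) const
  {
    assert(radius >= 0);
    // Local copy: a store to coef[i] can no longer force a reload of m_Alpha.
    const double alpha = m_Alpha;
    const std::size_t winSize = static_cast<std::size_t>(2 * radius + 1);
    coef.resize(winSize);

    double sum = 0.0; // accumulate in a local, not through the out parameter
    for (std::size_t i = 0; i < winSize; ++i)
    {
      const double d = std::abs(static_cast<double>(i) - radius - offset);
      double c = 0.0;
      if (d <= 1.0)
        c = (alpha + 2.0) * d * d * d - (alpha + 3.0) * d * d + 1.0;
      else if (d < 2.0)
        c = alpha * (d * d * d - 5.0 * d * d + 8.0 * d - 4.0);
      coef[i] = c; // single store per element
      sum += c;
    }
    assert(sum != 0.0);
    for (std::size_t i = 0; i < winSize; ++i)
      coef[i] /= sum;
  }

private:
  double m_Alpha;
};
```

Whether the compiler already performs these transformations on its own depends on what it can prove about aliasing, so the effect is best checked by measurement.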

added 1 commit

e20c2bb2 - PERF: Use Boost.SmallVec in BCO interpolator

Updated. I'm pretty sure the compiler can see through that, but it does make the code nicer. I also removed some abs calls that weren't doing anything.

Actually, this seems to have regressed the performance back to the earlier level:

268.92user 5.08system 0:31.17elapsed 879%CPU (0avgtext+0avgdata 1016668maxresident)k
112inputs+15071432outputs (1major+1443221minor)pagefaults 0swaps
270.39user 5.11system 0:33.26elapsed 828%CPU (0avgtext+0avgdata 1016628maxresident)k
0inputs+15071432outputs (0major+1443209minor)pagefaults 0swaps
269.68user 5.32system 0:34.94elapsed 787%CPU (0avgtext+0avgdata 1016480maxresident)k
0inputs+15071432outputs (0major+1443214minor)pagefaults 0swaps

Or maybe there's no actual difference.

Do you think the user/system/elapsed metrics are reliable? To be sure, you can use time probes, average the runtime over something like 10 runs, and measure the mean and standard deviation.
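
The repeated-run measurement suggested above could look roughly like this. It is a sketch that uses `std::chrono` instead of the time probes mentioned in the comment; `TimingStats`, `Summarize`, and `TimeRuns` are hypothetical helper names.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

struct TimingStats
{
  double mean;
  double stdev;
};

// Sample mean and sample standard deviation (n - 1 denominator).
TimingStats Summarize(const std::vector<double>& samples)
{
  assert(samples.size() >= 2);
  const double n = static_cast<double>(samples.size());
  double sum = 0.0;
  for (double s : samples)
    sum += s;
  const double mean = sum / n;

  double squares = 0.0;
  for (double s : samples)
    squares += (s - mean) * (s - mean);
  return {mean, std::sqrt(squares / (n - 1.0))};
}

// Calls `work` `runs` times and returns the wall-clock seconds of each call.
template <typename Work>
std::vector<double> TimeRuns(Work&& work, std::size_t runs)
{
  std::vector<double> seconds;
  seconds.reserve(runs);
  for (std::size_t i = 0; i < runs; ++i)
  {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    seconds.push_back(std::chrono::duration<double>(stop - start).count());
  }
  return seconds;
}
```

For example, `Summarize(TimeRuns(work, 10))` gives a mean and a spread for one variant. Two variants can then be considered different only if their means are further apart than the spread.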

I don't think they're particularly reliable. It's a desktop system, but I don't seem to be able to control the frequency scaling governor:

cat: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor: Invalid argument

and judging by how easily the CPU reaches 80-90 °C and then drops back, I don't think it's adequately cooled.

I assume the owTvQtWidgetShow failure on centos-xdk-build has nothing to do with these changes.

@lnicola can you confirm that there is an actual speedup? (I had no time to benchmark, but I can give it a try in the coming days.)

On my side I measured 12% gain in processing time. Great job @lnicola

@remicress Thanks. I wanted to test on a different computer (a laptop, though), but didn't get a chance. 12% sounds like my best results here, in !697 (comment 84492).

mentioned in issue #1999 (closed)

This has votes, but I would postpone merging it until at least after the release.

Will the next version be 7.2 or 8.0? This is technically semver-breaking, so it should go in 8.0.

Why did you add the breaking label? I don't see

@remicress see the "additional notes" section in the MR description.

resolved all threads

merged

mentioned in commit 33480ce3

Note: the next version will probably be 8.0!

changed milestone to %8.0.0

mentioned in issue #2075 (closed)

mentioned in commit 54d2985b

mentioned in issue #1666 (closed)

PERF: Use Boost.SmallVec in BCO interpolator

Summary

Rationale

Implementation Details

Additional notes

Copyright

Merged by Laurențiu Nicola on Apr 6, 2020 10:26am UTC
