Provide a lock-free API in ContinuousTrajectory #3346

Merged (3 commits) on Apr 22, 2022

pleroy (Member) commented Apr 20, 2022

This API is for the exclusive use of Ephemeris, which must provide proper locking itself. On the journal provided in #3230, this saves about 4% of the CPU time. On the benchmarks, the improvement may be as high as 25%, but is typically more modest:

Before:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                                                                 Time             CPU   Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_EphemerisMultithreadingBenchmark/3/1                                                                                              207773 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/2                                                                                              147364 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/3                                                                                               88606 us          156 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/4                                                                                               86628 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/5                                                                                               84283 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisKSPSystem/-3                                                                                                               46.7 s          46.2 s             1 +5.19502291148908779e-01 ua
BM_EphemerisSolarSystem<SolarSystemFactory::Accuracy::MajorBodiesOnly>/-3                                                              47.5 s          47.1 s             1 +9.92179030813603147e-01 ua
BM_EphemerisSolarSystem<SolarSystemFactory::Accuracy::MinorAndMajorBodies>/-3                                                           121 s           114 s             1 +9.92179031500272979e-01 ua
BM_EphemerisSolarSystem<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness>/-3                                                  187 s           185 s             1 +9.92178865558925205e-01 ua
BM_EphemerisL4Probe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithAdaptiveStep>/-3                                  1.22 s          1.20 s             1 154375 steps, +9.91602312582897105e-01 ua, +1.07738936581449729e+00 ua, degree 78.839016
BM_EphemerisL4Probe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithAdaptiveStep>/-3                              3.20 s          3.04 s             1 154375 steps, +9.91602312429100796e-01 ua, +1.07738938749615909e+00 ua, degree 177.897618
BM_EphemerisL4Probe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithAdaptiveStep>/-3                     3.24 s          3.21 s             1 154375 steps, +9.91602312423095378e-01 ua, +1.07738926998526341e+00 ua, degree 177.854602
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithAdaptiveStep>/-3                                 2.43 s          2.37 s             1 750001 steps, +9.91871697508377559e-01 au, +9.99471255698674810e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSLMS>/-3                                5.04 s          5.02 s             1 3155761 steps, +9.91888191584786028e-01 au, +9.99957028991727839e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSRKN>/-3                                19.6 s          19.0 s             1 3155761 steps, +9.91888190770475076e-01 au, +9.99957014032675744e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithAdaptiveStep>/-3                             4.48 s          4.43 s             1 750001 steps, +9.91871699265742257e-01 au, +9.99472218270988861e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSLMS>/-3                            8.26 s          8.22 s             1 3155761 steps, +9.91888192034501626e-01 au, +9.99957081922148916e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSRKN>/-3                            35.3 s          35.0 s             1 3155761 steps, +9.91888191491747340e-01 au, +9.99957068526476718e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithAdaptiveStep>/-3                    9.93 s          9.80 s             1 752034 steps, +9.91875956379899337e-01 au, +8.97636132412922052e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSLMS>/-3                   16.1 s          16.1 s             1 3155761 steps, +9.91859373938503874e-01 au, +8.90992222327369916e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSRKN>/-3                   80.5 s          77.6 s             1 3155761 steps, +9.91859374524636350e-01 au, +8.90992332849360906e+01 nmi
BM_EphemerisTranslunarSpaceProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSLMS>/-3                    5.43 s          5.21 s             1 3155761 steps, +9.25442266022708515e-01 au, +4.57823104198816195e+07 km
BM_EphemerisTranslunarSpaceProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSLMS>/-3                8.55 s          8.53 s             1 3155761 steps, +9.25442265959181887e-01 au, +4.57823103886758387e+07 km
BM_EphemerisTranslunarSpaceProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSLMS>/-3       9.73 s          9.67 s             1 3155761 steps, +9.25443278335496000e-01 au, +4.57832120426791981e+07 km
BM_EphemerisL4Probe1Year<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSLMS>/-3                            5.19 s          5.13 s             1 3155761 steps, +9.91749510274702040e-01 ua, +9.96217241632995520e-01 ua, degree 78.637871
BM_EphemerisL4Probe1Year<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSLMS>/-3                        8.52 s          8.44 s             1 3155761 steps, +9.91749510363307052e-01 ua, +9.96217242169266215e-01 ua, degree 177.615513
BM_EphemerisL4Probe1Year<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSLMS>/-3               9.90 s          9.47 s             1 3155761 steps, +9.91749510363307052e-01 ua, +9.96217243495228888e-01 ua, degree 177.619011
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-4                                                                        1163 ms         1154 ms            1 154375 steps, +9.91602312582897105e-01 ua, +1.07738936581449729e+00 ua, degree 82.853624
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-3                                                                        1168 ms         1170 ms            1 154375 steps, +9.91602312582897105e-01 ua, +1.07738936581449729e+00 ua, degree 78.839016
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-2                                                                        1131 ms         1123 ms            1 154375 steps, +9.91602312582897549e-01 ua, +1.07738936581448441e+00 ua, degree 75.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-1                                                                        1137 ms         1092 ms            1 154375 steps, +9.91602312582897549e-01 ua, +1.07738936581448441e+00 ua, degree 70.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/0                                                                         1087 ms         1092 ms            1 154375 steps, +9.91602312582898548e-01 ua, +1.07738936581444511e+00 ua, degree 63.205200
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/1                                                                         1100 ms         1045 ms            1 154375 steps, +9.91602312582899548e-01 ua, +1.07738936581440425e+00 ua, degree 61.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/2                                                                         1045 ms         1045 ms            1 154375 steps, +9.91602312582899992e-01 ua, +1.07738936581436162e+00 ua, degree 57.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/3                                                                         1020 ms         1014 ms            1 154375 steps, +9.91602312582898771e-01 ua, +1.07738936581446598e+00 ua, degree 55.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/4                                                                         1000 ms          983 ms            1 154375 steps, +9.91602312582899770e-01 ua, +1.07738936581435807e+00 ua, degree 54.000000
BM_EphemerisStartup<&FlowEphemerisWithFixedStepSLMS>/3                                                                                 1504 us         1459 us          449 11 steps, +9.91835233804689742e-01 ua, +9.91835231548724217e-01 ua, degree 55.000000
BM_EphemerisStartup<&FlowEphemerisWithFixedStepSRKN>/3                                                                                 62.1 us         59.8 us        11218 11 steps, +9.91835233804689742e-01 ua, +9.91835231548724217e-01 ua, degree 55.000000

After:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                                                                 Time             CPU   Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_EphemerisMultithreadingBenchmark/3/1                                                                                              185634 us          156 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/2                                                                                              129402 us          156 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/3                                                                                               77352 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/4                                                                                               70758 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisMultithreadingBenchmark/3/5                                                                                               71718 us        0.000 us          100 +9.98770532350627147e+06 m +1.99941583244033158e+07 m +2.99993771561068073e+07 m 
BM_EphemerisKSPSystem/-3                                                                                                               46.5 s          46.2 s             1 +5.19502291148908779e-01 ua
BM_EphemerisSolarSystem<SolarSystemFactory::Accuracy::MajorBodiesOnly>/-3                                                              47.9 s          47.6 s             1 +9.92179030813603147e-01 ua
BM_EphemerisSolarSystem<SolarSystemFactory::Accuracy::MinorAndMajorBodies>/-3                                                           117 s           115 s             1 +9.92179031500272979e-01 ua
BM_EphemerisSolarSystem<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness>/-3                                                  191 s           188 s             1 +9.92178865558925205e-01 ua
BM_EphemerisL4Probe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithAdaptiveStep>/-3                                  1.03 s          1.03 s             1 154375 steps, +9.91602312582897105e-01 ua, +1.07738936581449729e+00 ua, degree 78.839016
BM_EphemerisL4Probe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithAdaptiveStep>/-3                              2.92 s          2.92 s             1 154375 steps, +9.91602312429100796e-01 ua, +1.07738938749615909e+00 ua, degree 177.897618
BM_EphemerisL4Probe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithAdaptiveStep>/-3                     3.08 s          3.07 s             1 154375 steps, +9.91602312423095378e-01 ua, +1.07738926998526341e+00 ua, degree 177.854602
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithAdaptiveStep>/-3                                 1.97 s          1.97 s             1 750001 steps, +9.91871697508377559e-01 au, +9.99471255698674810e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSLMS>/-3                                4.56 s          4.56 s             1 3155761 steps, +9.91888191584786028e-01 au, +9.99957028991727839e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSRKN>/-3                                14.9 s          14.9 s             1 3155761 steps, +9.91888190770475076e-01 au, +9.99957014032675744e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithAdaptiveStep>/-3                             3.63 s          3.63 s             1 750001 steps, +9.91871699265742257e-01 au, +9.99472218270988861e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSLMS>/-3                            7.16 s          7.11 s             1 3155761 steps, +9.91888192034501626e-01 au, +9.99957081922148916e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSRKN>/-3                            29.1 s          29.0 s             1 3155761 steps, +9.91888191491747340e-01 au, +9.99957068526476718e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithAdaptiveStep>/-3                    8.70 s          8.61 s             1 752034 steps, +9.91875956379899337e-01 au, +8.97636132412922052e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSLMS>/-3                   15.3 s          15.1 s             1 3155761 steps, +9.91859373938503874e-01 au, +8.90992222327369916e+01 nmi
BM_EphemerisLEOProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSRKN>/-3                   68.4 s          68.1 s             1 3155761 steps, +9.91859374524636350e-01 au, +8.90992332849360906e+01 nmi
BM_EphemerisTranslunarSpaceProbe<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSLMS>/-3                    4.51 s          4.51 s             1 3155761 steps, +9.25442266022708515e-01 au, +4.57823104198816195e+07 km
BM_EphemerisTranslunarSpaceProbe<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSLMS>/-3                6.97 s          6.93 s             1 3155761 steps, +9.25442265959181887e-01 au, +4.57823103886758387e+07 km
BM_EphemerisTranslunarSpaceProbe<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSLMS>/-3       8.37 s          8.35 s             1 3155761 steps, +9.25443278335496000e-01 au, +4.57832120426791981e+07 km
BM_EphemerisL4Probe1Year<SolarSystemFactory::Accuracy::MajorBodiesOnly, &FlowEphemerisWithFixedStepSLMS>/-3                            4.45 s          4.46 s             1 3155761 steps, +9.91749510274702040e-01 ua, +9.96217241632995520e-01 ua, degree 78.637871
BM_EphemerisL4Probe1Year<SolarSystemFactory::Accuracy::MinorAndMajorBodies, &FlowEphemerisWithFixedStepSLMS>/-3                        7.04 s          7.02 s             1 3155761 steps, +9.91749510363307052e-01 ua, +9.96217242169266215e-01 ua, degree 177.615513
BM_EphemerisL4Probe1Year<SolarSystemFactory::Accuracy::AllBodiesAndDampedOblateness, &FlowEphemerisWithFixedStepSLMS>/-3               7.93 s          7.89 s             1 3155761 steps, +9.91749510363307052e-01 ua, +9.96217243495228888e-01 ua, degree 177.619011
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-4                                                                        1027 ms         1030 ms            1 154375 steps, +9.91602312582897105e-01 ua, +1.07738936581449729e+00 ua, degree 82.853624
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-3                                                                        1196 ms          998 ms            1 154375 steps, +9.91602312582897105e-01 ua, +1.07738936581449729e+00 ua, degree 78.839016
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-2                                                                         968 ms          952 ms            1 154375 steps, +9.91602312582897549e-01 ua, +1.07738936581448441e+00 ua, degree 75.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/-1                                                                         969 ms          967 ms            1 154375 steps, +9.91602312582897549e-01 ua, +1.07738936581448441e+00 ua, degree 70.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/0                                                                          932 ms          936 ms            1 154375 steps, +9.91602312582898548e-01 ua, +1.07738936581444511e+00 ua, degree 63.205200
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/1                                                                          916 ms          889 ms            1 154375 steps, +9.91602312582899548e-01 ua, +1.07738936581440425e+00 ua, degree 61.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/2                                                                          870 ms          874 ms            1 154375 steps, +9.91602312582899992e-01 ua, +1.07738936581436162e+00 ua, degree 57.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/3                                                                          871 ms          858 ms            1 154375 steps, +9.91602312582898771e-01 ua, +1.07738936581446598e+00 ua, degree 55.000000
BM_EphemerisFittingTolerance<&FlowEphemerisWithAdaptiveStep>/4                                                                          834 ms          827 ms            1 154375 steps, +9.91602312582899770e-01 ua, +1.07738936581435807e+00 ua, degree 54.000000
BM_EphemerisStartup<&FlowEphemerisWithFixedStepSLMS>/3                                                                                 1009 us          998 us          641 11 steps, +9.91835233804689742e-01 ua, +9.91835231548724217e-01 ua, degree 55.000000
BM_EphemerisStartup<&FlowEphemerisWithFixedStepSRKN>/3                                                                                 46.0 us         46.4 us        15473 11 steps, +9.91835233804689742e-01 ua, +9.91835231548724217e-01 ua, degree 55.000000
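
To read the two tables side by side, it can help to turn a pair of timings into a relative reduction. The helper below is only an illustration: `RelativeReduction` is a hypothetical name, not something from this pull request, and it assumes both timings are given in the same unit.

```cpp
#include <cassert>

// Hypothetical helper: fraction of the "before" time that was saved,
// i.e. (before - after) / before. Both arguments must use the same unit.
double RelativeReduction(double before, double after) {
  assert(before > 0);
  return (before - after) / before;
}
```

Applied to the Time column, this gives about 24% for `BM_EphemerisLEOProbe<MajorBodiesOnly, &FlowEphemerisWithFixedStepSRKN>/-3` (19.6 s to 14.9 s), about 11% for `BM_EphemerisMultithreadingBenchmark/3/1` (207773 us to 185634 us), and under 1% for `BM_EphemerisKSPSystem/-3` (46.7 s to 46.5 s).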

Fix #3345.
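
For readers unfamiliar with the pattern, here is a minimal sketch of an unlocked API whose single client does all the locking. It is not the code of this pull request: `Series`, `Owner`, and every method name are hypothetical stand-ins, and it uses `std::shared_mutex` purely to stay self-contained. The assumption it illustrates is that the client can take one lock around a whole batch of calls instead of paying for one acquisition per call.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for a trajectory: it does no locking of its own.
class Series {
 public:
  // Precondition (not checked): the caller holds the owner's lock exclusively.
  void AppendUnlocked(double value) { values_.push_back(value); }

  // Precondition (not checked): the caller holds the owner's lock,
  // at least in shared mode.
  double SumUnlocked() const {
    double sum = 0;
    for (double v : values_) sum += v;
    return sum;
  }

 private:
  std::vector<double> values_;
};

// Hypothetical stand-in for the single client that owns the lock.
class Owner {
 public:
  explicit Owner(std::size_t n) : series_(n) { assert(n > 0); }

  // Writer: one exclusive acquisition covers the update of every series.
  void AppendToAll(double value) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    for (Series& s : series_) s.AppendUnlocked(value);
  }

  // Reader: one shared acquisition covers all the reads.
  double Total() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    double total = 0;
    for (const Series& s : series_) total += s.SumUnlocked();
    return total;
  }

 private:
  mutable std::shared_mutex mutex_;
  std::vector<Series> series_;
};
```

The trade-off in such a design is that the unlocked methods are only safe when every caller goes through the owner, which is why an API of this kind is best restricted to one client.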

pleroy (Member, Author) commented Apr 20, 2022

retest this please

eggrobin added the LGTM label Apr 22, 2022
pleroy merged commit 0983a2b into mockingbirdnest:master Apr 22, 2022
Successfully merging this pull request may close these issues.

Consider making ContinuousTrajectory non-thread safe