APEX Release v2.4.0

@khuck released this 28 May 22:53 · 728 commits to develop since this release

This is an update to APEX, with several new features including:

  • New simulated annealing search for policies
  • New Kokkos kernel autotuning support
  • Memory leak detection (experimental)
  • Updated scatterplot support, including counter scatterplots, with the Python scripts updated to use python3
  • HIP/ROCm Roctracer support
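
For readers unfamiliar with the technique behind the new policy search, the following is a minimal sketch of simulated annealing over a discrete, multidimensional parameter space. It is not APEX's implementation: the function name `anneal`, the linear cooling schedule, the neighbor proposal, and the default parameters are all assumptions chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical helper, not part of the APEX API. Minimizes `cost` over a
// discrete grid where dimension d takes integer values in [0, sizes[d]).
// Assumes `sizes` is non-empty and every entry is at least 1.
std::vector<std::size_t> anneal(
    const std::vector<std::size_t>& sizes,
    const std::function<double(const std::vector<std::size_t>&)>& cost,
    std::size_t iterations = 2000,
    double start_temp = 1.0,
    unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick_dim(0, sizes.size() - 1);

    std::vector<std::size_t> current(sizes.size(), 0);
    double current_cost = cost(current);
    std::vector<std::size_t> best = current;
    double best_cost = current_cost;

    for (std::size_t i = 0; i < iterations; ++i) {
        // Assumed linear cooling: temperature falls from start_temp toward 0.
        double temp = start_temp * (1.0 - static_cast<double>(i) / iterations);

        // Propose a neighbor by re-drawing one randomly chosen dimension.
        std::vector<std::size_t> candidate = current;
        std::size_t d = pick_dim(rng);
        candidate[d] =
            std::uniform_int_distribution<std::size_t>(0, sizes[d] - 1)(rng);
        double candidate_cost = cost(candidate);

        // Always accept improvements. Accept regressions with probability
        // exp(-delta / temp), which lets the search leave local minima.
        double delta = candidate_cost - current_cost;
        if (delta <= 0.0 || unit(rng) < std::exp(-delta / temp)) {
            current = candidate;
            current_cost = candidate_cost;
        }
        // Remember the best point seen, since the walk may move away from it.
        if (current_cost < best_cost) {
            best = current;
            best_cost = current_cost;
        }
    }
    return best;
}
```

A caller would pass the number of choices per tunable parameter and a cost function (for example, a measured kernel time), and receive the index of the best setting found in each dimension.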

Full list of commits:

  • Don't enable examples by default
  • Kokkos doesn't like it if you replace the OpenMP library at runtime. So OMPT support now has to be explicitly enabled by --apex:ompt to preload the OpenMP runtime library (if desired).
  • Adding kokkos tuning support. Needs work.
  • Kokkos tuning working, but AH not getting right answer.
  • Working, but AH still stuck in local minima.
  • Adding/fixing PBS and SLURM variables
  • Fixing build error without kokkos autotuning
  • Trying to improve convergence for kokkos autotuning
  • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • Debugging Kokkos tuning issues
  • Adding Kokkos tooling header, eliminates need to require Kokkos as a dependency
  • Adding quotes around path to harmony home
  • Working Kokkos autotuner. This uses a Nelder Mead search, with an initial radius of 0.5 centered on the initial point requested by Kokkos (if specified). Future work includes caching results and trying other search strategies like simulated annealing.
  • Refactoring kokkos tuning away from profiling, making it possible to disable it
  • Updating to python3
  • Writing a memory wrapper report. There's a huge amount of CUPTI memory leaks, and they happen when the first real call to CUDA happens. I can't force that call, or ignore memory during the first "real" call, yet.
  • Cleaner way of preventing "false"(?) CUPTI memory leaks.
  • Fixing memory leaks and instability during shutdown. When using the memory tracker, make sure that the reporting is done before the BFD address resolution infrastructure is destroyed.
  • Adding task tree ASCII output, for issue #150
  • Merge branch 'develop' of git.nic.uoregon.edu:/gitroot/xpress-apex into develop
  • Adding "Remainder" to tree ASCII output.
  • Adding support for ratio and ordinal values
  • Fixing tree ASCII output and memory leak reporting.
  • Tasktree human readable is now in a file, and hierarchically sorted by time.
  • Making --apex:quiet truly quiet
  • Adding direct multidimensional simulated annealing search.
  • GCC 9.3.0 has an internal pedantic compiler error. So turning off pedantic.
  • Updating subproject build of LLVM OpenMP runtime for GCC
  • Fixing race condition in startup of memory wrapper, I hope...
  • Updating scripts to python3
  • Adding counter scatterplot support, too
  • Allowing for custom scatterplot fractions. To change from the default of 1% (0.01), set APEX_SCATTERPLOT_FRACTION equal to some value between 0.0 and 1.0.
  • Adding counter scatterplot script
  • Updating scatterplot scripts to handle larger scales
  • Do lazy opening of sample files so that the correct Node ID is used
  • improving colors
  • Merge branch 'develop' of github.com:khuck/xpress-apex into develop
  • More scatterplot cleanup
  • Updating escape sequence for new python
  • Fixing x axis to make all subgraphs uniform
  • Fixing dlsym() wrapper function to use templates for the function types, it's better than just blindly casting. Better to let the type system help us.
  • Added HIP to the configure and added a test case. It seems to work. Now have to add the actual roctracer support.
  • ROCTX support added.
  • Working callback support for HIP. Next step is to add activity support, and link the correlation IDs. That should be modeled after the CUPTI support.
  • Updating scatterplot scripts to add mean values
  • Working HIP with actions
  • Merge branch 'develop' into hip
  • Testing HIP code with CUDA config
  • Working HIP memory tracking, too
  • Adding CMake support for HPX with HIP
  • Updating CMake settings for HPX
  • Don't add line number to resolved address if just zero
  • Cleaning up hip trace support after testing with some test programs
  • Updating version number for 2.4.0 release
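
The APEX_SCATTERPLOT_FRACTION setting described in the commit list can be illustrated with a short sketch of how a sampling fraction might be read from the environment and applied. Only the variable name, the 0.01 default, and the 0.0 to 1.0 range come from the release notes. The helper names, the treatment of the range as inclusive, and the fallback to the default on invalid input are assumptions, and APEX's own handling may differ.

```cpp
#include <cstdlib>
#include <random>

// Hypothetical helper, not an APEX function: parse a fraction from text.
// Assumption: missing, malformed, or out-of-range input falls back to 0.01.
double parse_fraction(const char* text, double fallback = 0.01) {
    if (text == nullptr) return fallback;
    char* end = nullptr;
    double value = std::strtod(text, &end);
    // Reject empty input and trailing garbage such as "0.5x".
    if (end == text || *end != '\0') return fallback;
    // Written this way so that NaN is rejected along with out-of-range values.
    if (!(value >= 0.0 && value <= 1.0)) return fallback;
    return value;
}

// Reads the environment variable named in the release notes.
double scatterplot_fraction() {
    return parse_fraction(std::getenv("APEX_SCATTERPLOT_FRACTION"));
}

// Decides whether to record one sample: keeps roughly `fraction` of all
// samples by comparing a uniform draw in [0, 1) against the fraction.
bool keep_sample(double fraction, std::mt19937& rng) {
    return std::uniform_real_distribution<double>(0.0, 1.0)(rng) < fraction;
}
```

Under these assumptions, exporting APEX_SCATTERPLOT_FRACTION=0.1 before a run would make `scatterplot_fraction()` return 0.1, so about one sample in ten would be kept.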