
Incorrect data for Apple Silicon, needs bump to 2.8.0 #58

Closed
plessl opened this issue Aug 16, 2022 · 5 comments
@plessl

plessl commented Aug 16, 2022

I was trying to query the cache sizes using hwloc on a Mac with an Apple Silicon (M1) processor. The currently bundled hwloc version 2.7.1 returns faulty data or throws errors. For example, cachesize() fails:

julia> cachesize()
ERROR: BoundsError: attempt to access 0-element Vector{Any} at index [1]
Stacktrace:
 [1] getindex
   @ ./array.jl:924 [inlined]
 [2] first(a::Vector{Any})
   @ Base ./abstractarray.jl:404
 [3] cachesize()
   @ Hwloc ~/.julia/packages/Hwloc/HorQV/src/highlevel_api.jl:205
 [4] top-level scope
   @ REPL[5]:1
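
For reference, the immediate cause in frame [2] is reproducible in isolation: calling first on an empty Vector throws exactly this BoundsError. A minimal sketch, independent of Hwloc:

```julia
# Minimal sketch: `first` on a 0-element Vector{Any} raises the same
# BoundsError shown in the stack trace above.
v = Any[]
err = try
    first(v)   # attempt to access 0-element Vector{Any} at index [1]
catch e
    e
end
@assert err isa BoundsError
println("caught: ", err)
```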

and topology() yields incorrect data (performance and efficiency cores appear to have the same cache sizes, which cannot be correct):

julia> topology()
Machine (3.26 GB)
    Package L#0 P#0 (3.26 GB)
        NUMANode (3.26 GB)
        L2 (4.0 MB)
            L1 (64.0 kB) + Core L#0 P#0 
                PU L#0 P#0 
            L1 (64.0 kB) + Core L#1 P#1 
                PU L#1 P#1 
            L1 (64.0 kB) + Core L#2 P#2 
                PU L#2 P#2 
            L1 (64.0 kB) + Core L#3 P#3 
                PU L#3 P#3 
        L2 (4.0 MB)
            L1 (64.0 kB) + Core L#4 P#4 
                PU L#4 P#4 
            L1 (64.0 kB) + Core L#5 P#5 
                PU L#5 P#5 
            L1 (64.0 kB) + Core L#6 P#6 
                PU L#6 P#6 
            L1 (64.0 kB) + Core L#7 P#7 
                PU L#7 P#7 

The most recent hwloc release, 2.8.0, now officially supports M1 processors (https://www.mail-archive.com/[email protected]/msg00151.html), and the output of lstopo looks sane:

lstopo
Machine (3337MB total)
  Package L#0
    NUMANode L#0 (P#0 3337MB)
    L2 L#0 (4096KB)
      L1d L#0 (64KB) + L1i L#0 (128KB) + Core L#0 + PU L#0 (P#0)
      L1d L#1 (64KB) + L1i L#1 (128KB) + Core L#1 + PU L#1 (P#1)
      L1d L#2 (64KB) + L1i L#2 (128KB) + Core L#2 + PU L#2 (P#2)
      L1d L#3 (64KB) + L1i L#3 (128KB) + Core L#3 + PU L#3 (P#3)
    L2 L#1 (12MB)
      L1d L#4 (128KB) + L1i L#4 (192KB) + Core L#4 + PU L#4 (P#4)
      L1d L#5 (128KB) + L1i L#5 (192KB) + Core L#5 + PU L#5 (P#5)
      L1d L#6 (128KB) + L1i L#6 (192KB) + Core L#6 + PU L#6 (P#6)
      L1d L#7 (128KB) + L1i L#7 (192KB) + Core L#7 + PU L#7 (P#7)
  CoProc(OpenCL) "opencl0d0"

Since Hwloc.jl is the most convenient tool to access CPU architecture information in a cross-CPU manner, it would be great if you could bump the bundled hwloc to the latest version.

Cheers
Christian

@eschnett
Contributor

@plessl The respective update is waiting for review: JuliaPackaging/Yggdrasil#5300

@carstenbauer
Member

There will soon be a new version of Hwloc_jll. The corresponding pipelines are running: https://dev.azure.com/JuliaPackaging/Yggdrasil/_build/results?buildId=21337&view=results

@carstenbauer
Member

carstenbauer commented Aug 16, 2022

If you run ] up Hwloc in the Pkg REPL, you should get the new version of Hwloc_jll (2.8.0). On an M1, this will give you:

julia> topology()
Machine (3.49 GB)
    Package L#0 P#0 (3.49 GB)
        NUMANode (3.49 GB)
        L2 (4.0 MB)
            L1 (64.0 kB) + I1Cache Cache{size=131072,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#0 P#0
                    PU L#0 P#0
            L1 (64.0 kB) + I1Cache Cache{size=131072,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#1 P#1
                    PU L#1 P#1
            L1 (64.0 kB) + I1Cache Cache{size=131072,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#2 P#2
                    PU L#2 P#2
            L1 (64.0 kB) + I1Cache Cache{size=131072,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#3 P#3
                    PU L#3 P#3
        L2 (12.0 MB)
            L1 (128.0 kB) + I1Cache Cache{size=196608,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#4 P#4
                    PU L#4 P#4
            L1 (128.0 kB) + I1Cache Cache{size=196608,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#5 P#5
                    PU L#5 P#5
            L1 (128.0 kB) + I1Cache Cache{size=196608,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#6 P#6
                    PU L#6 P#6
            L1 (128.0 kB) + I1Cache Cache{size=196608,depth=1,linesize=128,associativity=0,type=Instruction}
                Core L#7 P#7
                    PU L#7 P#7
julia> Hwloc.l1cache_sizes()
8-element Vector{Int64}:
  65536
  65536
  65536
  65536
 131072
 131072
 131072
 131072

julia> Hwloc.l2cache_sizes()
2-element Vector{Int64}:
  4194304
 12582912

The printing of the I1Cache elements in the topology() output could be improved but the information should be correct now. Note that you can't use cachesize() because not all caches of the same kind have the same size.
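
Since caches of the same level can differ in size on heterogeneous CPUs, a per-level summary works where cachesize() does not. A minimal sketch using the sizes reported above (hard-coded here so it runs without Hwloc; with the package installed, the vectors would come from Hwloc.l1cache_sizes() and Hwloc.l2cache_sizes()):

```julia
# Sketch: summarize heterogeneous cache sizes per level instead of
# assuming one size per level. Values are the M1 numbers reported above.
l1 = [65536, 65536, 65536, 65536, 131072, 131072, 131072, 131072]
l2 = [4194304, 12582912]

distinct_l1 = sort(unique(l1))   # [65536, 131072]
distinct_l2 = sort(unique(l2))   # [4194304, 12582912]

println("distinct L1 sizes (bytes): ", distinct_l1)
println("distinct L2 sizes (bytes): ", distinct_l2)
```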

(Closing this for now. Feel free to reopen if necessary.)

@plessl
Author

plessl commented Aug 17, 2022

Using Hwloc.l1cache_sizes() and Hwloc.l2cache_sizes() works for my current use case.

However, cachesize() and cachelinesize() now fail with a BoundsError. I think this should be considered a bug.

Shall I go ahead and open another issue?

@carstenbauer
Member

I took care of it and created #59 (and #60 for the formatting improvements). I think I'll be able to fix it later today.
