[LLVM] Expose Host CPU Feature Detection #14946

Merged

@junrushao junrushao commented May 24, 2023

A small script that exposes host CPU name, target triple and features:

import tvm

def main():
    # Look up the packed functions this PR registers on the C++ side.
    get_default_target_triple = tvm._ffi.get_global_func("tvm.codegen.llvm.GetDefaultTargetTriple")
    get_process_triple = tvm._ffi.get_global_func("tvm.codegen.llvm.GetProcessTriple")
    get_host_cpu_name = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUName")
    get_host_cpu_features = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUFeatures")

    # Query the host once; the features come back as a name -> flag map.
    target_triple = get_default_target_triple()
    process_triple = get_process_triple()
    host_cpu_name = get_host_cpu_name()
    host_cpu_features = get_host_cpu_features()

    print("target_triple: {}".format(target_triple))
    print("process_triple: {}".format(process_triple))
    print("host_cpu_name: {}".format(host_cpu_name))
    print("host_cpu_features:")
    # An empty map means detection failed (see the note after the output).
    for name, value in host_cpu_features.items():
        print("  {}: {}".format(name, bool(value)))


if __name__ == "__main__":
    main()

Output (AMD CPU):

target_triple: x86_64-unknown-linux-gnu
process_triple: x86_64-unknown-linux-gnu
host_cpu_name: znver2
host_cpu_features:
  xsaveopt: True
  tsxldtrk: False
  sse: True
  movdiri: False
  mmx: True
  pku: False
  amx-int8: False
  amx-tile: False
  rdpid: True
  avx512vbmi2: False
  cmov: True
  widekl: False
  f16c: True
  bmi: True
  gfni: False
  avx512cd: False
  movdir64b: False
  rdseed: True
  clwb: True
  avx512er: False
  avx512f: False
  sse4.2: True
  avxifma: False
  sse2: True
  avx512vp2intersect: False
  prfchw: True
  avx512pf: False
  vaes: False
  waitpkg: False
  amx-bf16: False
  prefetchi: False
  uintr: False
  fxsr: True
  bmi2: True
  lzcnt: True
  avx512vbmi: False
  avx512bf16: False
  prefetchwt1: False
  xsaves: True
  movbe: True
  rtm: False
  pclmul: True
  hreset: False
  sahf: True
  fma4: False
  xop: False
  vpclmulqdq: False
  sgx: False
  avx512vnni: False
  popcnt: True
  xsavec: True
  aes: True
  avx512vpopcntdq: False
  kl: False
  avx512bitalg: False
  xsave: True
  avxvnni: False
  raoint: False
  clflushopt: True
  sse4a: True
  avx512bw: False
  cx16: True
  avxvnniint8: False
  amx-fp16: False
  cldemote: False
  rdrnd: True
  ptwrite: False
  rdpru: True
  avx: True
  adx: True
  avx512vl: False
  pconfig: False
  shstk: False
  64bit: True
  crc32: True
  sha: True
  cmpccxadd: False
  tbm: False
  serialize: False
  mwaitx: True
  avx512ifma: False
  avx512fp16: False
  clzero: True
  avx2: True
  cx8: True
  fma: True
  lwp: False
  enqcmd: False
  wbnoinvd: True
  sse4.1: True
  avx512dq: False
  ssse3: True
  fsgsbase: True
  invpcid: False
  sse3: True
  avxneconvert: False

Note that LLVM doesn't guarantee that automatic feature detection always succeeds. For newer CPU models and older LLVM builds in particular (e.g. M2 CPU + LLVM 16), the result is usually inaccurate. When CPU feature detection fails, we print a warning message and return an empty dict instead.
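
Since a failed detection returns an empty dict, callers may want to tell "detection failed" apart from "feature absent". A minimal sketch of one way to do that follows. The helper names `enabled_features` and `detect_with_tvm` are made up for illustration and are not part of this PR; only the `tvm.codegen.llvm.GetHostCPUFeatures` lookup and the `.items()` iteration come from the script above.

```python
from typing import Mapping, Optional, Set


def enabled_features(host_cpu_features: Mapping) -> Optional[Set[str]]:
    """Return the names of enabled features, or None if detection failed.

    Hypothetical helper: an empty mapping is treated as "detection failed",
    matching the behaviour described in the note above.
    """
    items = list(host_cpu_features.items())
    if not items:
        return None
    return {str(name) for name, value in items if bool(value)}


def detect_with_tvm() -> Optional[Set[str]]:
    """Query the host through TVM (assumes a TVM build with LLVM enabled)."""
    import tvm  # imported lazily so enabled_features() works without TVM

    get_host_cpu_features = tvm._ffi.get_global_func(
        "tvm.codegen.llvm.GetHostCPUFeatures"
    )
    return enabled_features(get_host_cpu_features())
```

A caller can then treat `None` as "unknown" and fall back to the system commands below, rather than assuming a feature such as `avx2` is missing.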

To properly detect CPU features on a MacBook, the system-provided commands below are the most accurate:

sysctl -a machdep.cpu
sysctl -a hw.optional
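
If the `sysctl` output is needed from Python, it can be captured and split into key/value pairs. This is only a sketch: `parse_sysctl` and `read_sysctl` are hypothetical helpers, and it assumes the usual `key: value` line layout that `sysctl` prints on macOS.

```python
import subprocess
from typing import Dict


def parse_sysctl(output: str) -> Dict[str, str]:
    """Split `key: value` lines into a dict; lines without a colon are skipped."""
    result: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def read_sysctl(name: str) -> Dict[str, str]:
    """Run `sysctl -a <name>` (macOS only) and parse what it prints."""
    completed = subprocess.run(
        ["sysctl", "-a", name], capture_output=True, text=True, check=True
    )
    return parse_sysctl(completed.stdout)
```

For example, `read_sysctl("hw.optional")` would return the optional hardware flags as strings. Their names are macOS-specific and do not match the LLVM feature names printed by the script.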

On Linux, it is usually recommended to query directly via:

cat /proc/cpuinfo
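
The same information can be read programmatically. The sketch below pulls the flag list out of `/proc/cpuinfo`; `parse_cpuinfo_flags` and `read_host_flags` are hypothetical names. It assumes the flags sit on a `flags` line (x86) or a `Features` line (ARM), and it only looks at the first such line, on the assumption that all cores report the same set.

```python
from typing import Set


def parse_cpuinfo_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flags from the first `flags`/`Features` line, else an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def read_host_flags(path: str = "/proc/cpuinfo") -> Set[str]:
    """Read and parse the flags of the current Linux host."""
    with open(path, encoding="utf-8") as f:
        return parse_cpuinfo_flags(f.read())
```

Kernel flag names do not always match LLVM's. For instance, the kernel reports `sse4_2` where the output above shows `sse4.2`, so comparing the two lists needs a name mapping.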

tvm-bot commented May 24, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: llvm See #10317 for details

Generated by tvm-bot

@junrushao junrushao force-pushed the feature/2023-05-24/llvm-default-target-triple branch from 5c7365f to 1edaac4 Compare May 24, 2023 22:22
@junrushao junrushao marked this pull request as ready for review May 25, 2023 06:13
@junrushao junrushao merged commit 1c39613 into apache:main May 25, 2023
mei-ye pushed a commit to mei-ye/tvm that referenced this pull request Jun 1, 2023