macOS arm64 support #903

Closed
wants to merge 11 commits into from

Conversation

@FancyFurret FancyFurret commented Feb 3, 2023

Linked issues: #876 #652

Hi! This PR adds proper arm64 build support for M1 Macs. Here are some notes:

  • pytorch.org doesn't provide arm64 builds of libtorch, so instead they are extracted from the anaconda python package. These are still released/maintained by pytorch, and have the same license.
  • While MPS support is totally possible now, I've left it out of this PR since that would be out of scope. Locally though I did get it working by adding MPS to Torch/Device (id 13), and adding relevant checks in THSModule/THSJIT. Though we would also want to add torch.mps.is_available() and the rest of the mps specific functions, tests, etc.
  • crc32c is supported since it falls back to a software implementation on arm, so all the tests pass on my machine
    • EDIT: Looking again, it seems like HW support was removed entirely back in September, so there's no point in including cpuid/intrin
  • Cross compilation is supported (on macOS) by changing the TargetArchitecture property to x64 or arm64
  • Platform now defaults to AnyCPU instead of x64, so that the TorchSharp dlls can be used on arm.
  • I added the build to the azure pipeline, and created nupkgproj files for a new libtorch-cpu-osx-arm64 package
    • I added that package to the libtorch-cpu package
    • It seems like the arm64 vms are still in preview, so unfortunately the tests cannot be run from azure, unless there is some way of running arm64 code that I didn't find.
    • Building works great in azure, I downloaded the artifacts and they work on my M1 Mac. Though I wasn't able to test the nuget packaging/publishing parts since I wasn't able to sign them of course. In theory though they should work, I think I made all the necessary updates 😅

I think that's about it! Please let me know how it looks and if anything needs to be changed/if I missed anything. I'm using TorchSharp in a project I'm working on and needed M1 support, and would love to be able to pull the arm64 backend from nuget instead of having to manually load it.

@dnfadmin

dnfadmin commented Feb 3, 2023

CLA assistant check
All CLA requirements met.

@NiklasGustafsson
Contributor

@osum4est,

First of all, thank you for contributing to TorchSharp! Please add an introduction in the #333 discussion thread -- it is always nice to know who the contributors are, not just for myself, but for everyone using TorchSharp.

Since I cannot test this myself, I wonder if there's someone out there with an M1/M2 who could help validate the PR?

Also, I don't think MPS support is out of scope, to be honest.

@FancyFurret
Author

FancyFurret commented Feb 3, 2023

Done!

I may look at adding full MPS support, but to be honest I wouldn't be using it, so I don't have much motivation to do so. I'm just relying on CPU for quick development/debugging of models, then pushing them over to a linux system with cuda for longer term training. I'd like to use MPS for development, but currently it doesn't support things like Conv3D, which I'm using in my models.

Yeah hopefully someone with an M1 can take a look at this, it would be much nicer to be able to use the nuget package in my project instead of having to manually reference my custom built dlls 😅

@nhirschey

I have an m1 Mac that I can use to help test. @osum4est, thanks for this amazing work. How do I build and test this? I am not familiar with building native libraries and I'm getting errors.

Following devguide.md, I build with this on an m1 Mac using dotnet build /p:SkipNative=true.
That builds, but then if I run dotnet test I get errors about dotnet x64 not being installed.

Then if I run dotnet pack I see these errors.

  -- Up-to-date: /Users/user/Documents/GitHub/TorchSharp/src/Native/../../bin/arm64.Debug/Native/./libLibTorchSharp.dylib
EXEC : error : /Library/Developer/CommandLineTools/usr/bin/install_name_tool: for: /Users/user/Documents/GitHub/TorchSharp/src/Native/../../bin/arm64.Debug/Native/./libLibTorchSharp.dylib (for architecture arm64) option "-add_rpath @loader_path" would duplicate path, file already has LC_RPATH for: @loader_path [/Users/user/Documents/GitHub/TorchSharp/src/Native/build.proj]
EXEC : error : /Library/Developer/CommandLineTools/usr/bin/install_name_tool: for: /Users/user/Documents/GitHub/TorchSharp/src/Native/../../bin/arm64.Debug/Native/./libLibTorchSharp.dylib (for architecture arm64) option "-add_rpath @executable_path" would duplicate path, file already has LC_RPATH for: @executable_path [/Users/user/Documents/GitHub/TorchSharp/src/Native/build.proj]
EXEC : error : /Library/Developer/CommandLineTools/usr/bin/install_name_tool: no LC_RPATH load command with path: /Users/user/Documents/GitHub/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/pytorch-1.13.0cpu/libtorch/lib found in: /Users/user/Documents/GitHub/TorchSharp/src/Native/../../bin/arm64.Debug/Native/./libLibTorchSharp.dylib (for architecture arm64), required for specified option "-delete_rpath /Users/user/Documents/GitHub/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/pytorch-1.13.0cpu/libtorch/lib" [/Users/user/Documents/GitHub/TorchSharp/src/Native/build.proj]
  Successfully created package '/Users/user/Documents/GitHub/TorchSharp/bin/packages/Debug/TorchAudio.0.99.3.nupkg'.
/Users/user/Documents/GitHub/TorchSharp/src/Native/build.proj(40,5): error MSB3073: The command ""/Users/user/Documents/GitHub/TorchSharp/src/Native/build.sh" --configuration Debug --arch arm64  --libtorchpath /Users/user/Documents/GitHub/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/pytorch-1.13.0cpu/libtorch/share/cmake/Torch" exited with code -1.
  Successfully created package '/Users/user/Documents/GitHub/TorchSharp/bin/packages/Debug/TorchSharp.0.99.3.nupkg'.
  Successfully created package '/Users/user/Documents/GitHub/TorchSharp/bin/packages/Debug/TorchVision.0.99.3.nupkg'.
  Done packing!

Finally, I get these errors in an F# script trying to load the nuget

#i """nuget: /Users/user/Documents/GitHub/TorchSharp/bin/packages/Debug"""
#r "nuget: TorchSharp-cpu, 0.99.3"

open TorchSharp

let lin1 = torch.nn.Linear(1000,100)

> 
  Binding session to '/Users/user/.nuget/packages/torchsharp/0.99.3/lib/netcoreapp3.1/TorchSharp.dll'...
System.DllNotFoundException: Unable to load shared library 'LibTorchSharp' or one of its dependencies. In order to help diagnose loading problems, consider setting the DYLD_PRINT_LIBRARIES environment variable: dlopen(libLibTorchSharp, 0x0001): tried: 'libLibTorchSharp' (no such file), '/usr/local/lib/libLibTorchSharp' (no such file), '/usr/lib/libLibTorchSharp' (no such file), '/Users/user/Documents/GitHub/TorchSharp/libLibTorchSharp' (no such file), '/usr/local/lib/libLibTorchSharp' (no such file), '/usr/lib/libLibTorchSharp' (no such file)
   at TorchSharp.PInvoke.LibTorchSharp.THSNN_Linear_ctor(Int64 input_size, Int64 output_size, Boolean bias, IntPtr& pBoxedModule)
   at TorchSharp.torch.nn.Linear(Int64 inputSize, Int64 outputSize, Boolean hasBias, Device device, Nullable`1 dtype)
   at <StartupCode$FSI_0005>.$FSI_0005.main@()
Stopped due to error

@nhirschey

This below works:

#i """nuget: /Users/user/Documents/GitHub/TorchSharp/bin/packages/Debug"""
#r "nuget: TorchSharp, 0.99.3"

open System.Runtime.InteropServices

NativeLibrary.Load("bin/arm64.Debug/Native/libLibTorchSharp.dylib")

open TorchSharp

let t = torch.tensor([|1.0 .. 10.0 |])
> 
Binding session to '/Users/user/.nuget/packages/torchsharp/0.99.3/lib/netcoreapp3.1/TorchSharp.dll'...
val t: torch.Tensor = [10], type = Float64, device = cpu

> t.std() |> float;;
val it: float = 3.027650354

@FancyFurret
Author

@nhirschey Thank you for helping test!!

A couple notes:

  • TorchSharp consists of both a C# library and a native C++ library (LibTorchSharp)
  • Don't use /p:SkipNative=true. This will prevent the LibTorchSharp native library from being built, which you will need for testing/running the examples. The build docs are a bit confusing in that sense.
  • I was getting those add_rpath errors as well. This is an issue that also existed for building x64 Mac builds before I touched anything, so I didn't get around to fixing it. There's something wrong with the rpath/cmake configuration. When LibTorchSharp gets re-built, it tries to make install a second time, and that is what produces the error. To get around this you can comment out the make install on line 121 of build.sh after LibTorchSharp has been successfully built once.
  • I'm not 100% sure why LibTorchSharp isn't being found from the nuget package in F#, that's something that I didn't test. I'll see if I have some time to look into this. (Possibly it's because of the /p:SkipNative, but I'm not 100% sure)

Hopefully this clears up a few things. If I have some time this weekend I'll see if I can fix the rpath issue, though I don't know a ton about how that works.

@dayo05
Contributor

dayo05 commented Feb 20, 2023

In my case, all tests passed without any extra steps such as building a custom PyTorch; I just ran the tests in JetBrains Rider.
When I create a new project, it needs to load the native library explicitly, but that is the expected situation.
I found no issues in my testing, particularly with the platform architecture.

As far as I remember, crc32c is used for TensorBoard, so an extra test of TensorBoard may be needed.

Comment on lines +38 to +41
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <x86intrin.h>
#endif
Contributor

Could this affect the TensorBoard feature?

Contributor

I have no idea how crc32c.c works, so @NiklasGustafsson, can you look at this?

Author

@FancyFurret FancyFurret Feb 20, 2023

As far as I can tell, HW support was removed from crc32c in #2877164, so this shouldn't affect anything. I believe cpuid/x86intrin are only needed for the hardware crc instruction, which is no longer being used.
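
As a rough illustration of what a software-only crc32c amounts to, here is a minimal bitwise sketch of CRC-32C (Castagnoli). It is not the code in crc32c.c: the function name crc32c_sw is hypothetical, and a real implementation would normally use a lookup table for speed. The point is only that nothing in it needs cpuid.h or x86intrin.h, so it should compile unchanged on arm64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from crc32c.c.
   Bitwise CRC-32C (Castagnoli) using the reflected polynomial 0x82F63B78.
   Pure portable C: no hardware CRC instruction, no x86-only headers. */
uint32_t crc32c_sw(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint32_t crc = 0xFFFFFFFFu;              /* standard initial value */

    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];                         /* fold in the next byte */
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right one bit; XOR in the polynomial if the bit
               shifted out was set. */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0x82F63B78u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* final inversion */
}
```

Calling crc32c_sw("123456789", 9) should return 0xE3069283, the standard check value for CRC-32C, which makes for an easy sanity test on any new platform.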

Contributor

Then how about just removing those include lines?

@smoothdeveloper

It would be great to get this merged, even if it is not bulletproof, so long as it doesn't break the x64 backend and it helps people who want to contribute to this stack but are using macos-arm64.

I gave a shot at it in DiffSharp: https://github.com/DiffSharp/DiffSharp/blob/f018184043fefae382bd04cba1756a3c348b6930/DEVGUIDE.md#torchsharp-backend-on-macos-arm64

There is only one failing test to investigate in their repository, using the backend I compiled out of this PR.

@NiklasGustafsson
Contributor

I agree. The CLA is apparently not agreed, though. Since you first posted this, we have moved to PyTorch 2.0.1. Before merging, we would have to do the same here. Since the strategy to extract the native binaries is different, that may require some finagling.

@FancyFurret
Author

@dotnet-policy-service agree

@NiklasGustafsson
Contributor

> Before merging, we would have to do the same here.

And by "we," I mean "you." :-)

I say that, since this needs to be tested by someone with the right HW...

@dayo05
Contributor

dayo05 commented Jul 23, 2023

I think this PR is broken now...

0>------- Started building project: TorchSharp
Using VersionSuffix = 
Using Version = 0.99.3
Using VersionSuffix = 
Using Version = 0.99.3
Downloading from "https://raw.githubusercontent.com/pytorch/pytorch/master/LICENSE" to "/Users/dayo/RiderProjects/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/LICENSE-LIBTORCH" (3,293 bytes).
GetFileHash /Users/dayo/RiderProjects/TorchSharp/../libtorch-cpu/libtorch-macos-1.13.0.zip
Unzip /Users/dayo/RiderProjects/TorchSharp/../libtorch-cpu/libtorch-macos-1.13.0.zip --> /Users/dayo/RiderProjects/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/libtorch-macos-1.13.0cpu
Copy libtorch/lib/libbackend_with_compiler.dylib;libtorch/lib/libc10.dylib;libtorch/lib/libfbjni.dylib;libtorch/lib/libiomp5.dylib;libtorch/lib/libjitbackend_test.dylib;libtorch/lib/libpytorch_jni.dylib;libtorch/lib/libshm.dylib;libtorch/lib/libtorch.dylib;libtorch/lib/libtorch_cpu.dylib;libtorch/lib/libtorch_global_deps.dylib;libtorch/lib/libtorch_python.dylib;libtorch/lib/libtorchbind_test.dylib -> /Users/dayo/RiderProjects/TorchSharp/bin/obj/packprep/Debug/libtorch-cpu-osx-x64\runtimes\osx-x64\native\
/Users/dayo/RiderProjects/TorchSharp/src/Native/build.sh --configuration Debug --arch x64  --libtorchpath /Users/dayo/RiderProjects/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/libtorch-macos-1.13.0cpu/libtorch/share/cmake/Torch
"/Users/dayo/RiderProjects/TorchSharp/src/Native/build.sh" --configuration Debug --arch x64  --libtorchpath /Users/dayo/RiderProjects/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/libtorch-macos-1.13.0cpu/libtorch/share/cmake/Torch
+ cmake /Users/dayo/RiderProjects/TorchSharp/src/Native -G 'Unix Makefiles' -DCMAKE_BUILD_TYPE=Debug -DLIBTORCH_PATH=/Users/dayo/RiderProjects/TorchSharp/bin/obj/AnyCPU.Debug/libtorch-cpu/libtorch-macos-1.13.0cpu/libtorch/share/cmake/Torch
Building Machine Learning native components from /Users/dayo/RiderProjects/TorchSharp/src/Native to /Users/dayo/RiderProjects/TorchSharp/bin/obj/x64.Debug/Native

This is part of my log from just cloning the PR and then trying to run the Examples project. It looks like x64 is being passed as the architecture argument instead of arm64, right?

Maybe a recent version of macOS broke things... because when I tested it in the past, it looked fine.

> In my case, all tests passed without any extra steps such as building a custom PyTorch; I just ran the tests in JetBrains Rider. When I create a new project, it needs to load the native library explicitly, but that is the expected situation. I found no issues in my testing, particularly with the platform architecture.
>
> As far as I remember, crc32c is used for TensorBoard, so an extra test of TensorBoard may be needed.

Also, this PR appears to use a custom-built version of libtorch rather than an official one, because an official arm64 build has not been uploaded to the official website. It looks like we would need to provide an officially built version of libtorch (including LibTorchSharp) ourselves. This PR seems to use a version that someone uploaded somewhere, but I can't tell because I'm not familiar with .NET's XML build scripts...
Based on this, my suggestion is: because there is no official libtorch for this platform, change this project's default build to build libtorch from source. Based on my previous test (this was also a few years ago...), that worked fine, and it would not depend on someone else who may not be focused on our project.

@dayo05
Contributor

dayo05 commented Jul 23, 2023

One more thing: crc32c.c and crc32c.h contain HW-dependent code that we no longer use. How about just removing these lines:

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <x86intrin.h>
#endif
#else
#include <intrin.h>
#endif

This would make it easier to support new platforms in the future.
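
To illustrate the guard pattern under discussion, here is a small sketch of compile-time architecture checks built on common predefined compiler macros. The helper names target_arch, has_x86_intrinsics and check_guards are hypothetical and are not part of crc32c.c. The sketch only shows that anything placed behind `defined(__x86_64__) || defined(__i386__)` drops out of an arm64 build.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: name the architecture this file was compiled for,
   using macros predefined by GCC/Clang and by MSVC. */
const char *target_arch(void)
{
#if defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#else
    return "unknown";
#endif
}

/* Hypothetical helper: nonzero only when the same guard that wraps
   cpuid.h / x86intrin.h in the snippet above is true. */
int has_x86_intrinsics(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return 1;
#else
    return 0;
#endif
}

/* Sanity check: an arm64 build must never take the x86 branch. */
void check_guards(void)
{
    assert(!(strcmp(target_arch(), "arm64") == 0 && has_x86_intrinsics()));
}
```

If the guarded headers are no longer used by any code, deleting the includes outright, as suggested here, is simpler than maintaining the guard.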

@NiklasGustafsson
Contributor

@osum4est @dayo05 -- do we want to keep this PR open? It sounds like it needs some updating if it's going to work. I do think it's very valuable, but I don't have the HW to test it myself.

@dayo05
Contributor

dayo05 commented Oct 19, 2023

Recently, GitHub seems to have enabled Actions runners that target macOS arm64 (github/roadmap#528).
I think this could resolve some of the testing issues related to it. It could perhaps also be used for publishing native libtorch binaries.

Because I have native HW for it, I can do some validation of major backend changes if you mention me, @NiklasGustafsson. But this PR seems broken now (maybe a version issue? I'm not sure).

The PR author doesn't seem to have responded for a long time, so if he/she continues not to reply, may I reimplement it later?

@NiklasGustafsson NiklasGustafsson marked this pull request as draft October 20, 2023 03:10
@NiklasGustafsson
Contributor

@osum4est -- I have it merged, and after installing the right version of 'cmake' on my Mac, it builds and tests run. Adding MPS support also doesn't seem too hard.

It seems like version 2.2.0 has added the ARM64 binaries to the C++ download zip file, so we can rely on that instead of grabbing the conda package.

It's been a while since you submitted this -- do you want to be involved in taking it all the way?

@FancyFurret
Author

> @osum4est -- I have it merged, and after installing the right version of 'cmake' on my Mac, it builds and tests run. Adding MPS support also doesn't seem too hard.
>
> It seems like version 2.2.0 has added the ARM64 binaries to the C++ download zip file, so we can rely on that instead of grabbing the conda package.
>
> It's been a while since you submitted this -- do you want to be involved in taking it all the way?

Awesome, glad you got it working. Sorry, I'm busy with other things at the moment and am not currently using this library, so I don't have time to wrap this up.

@NiklasGustafsson
Contributor

> Awesome, glad you got it working. Sorry, I'm busy with other things at the moment and am not currently using this library, so I don't have time to wrap this up.

Okay, then I'll finish it up. I'll update TorchSharp to v2.2.0 of libtorch, first, since the config has changed. Your work was very much appreciated -- it's a solid foundation for what needs to be done.

@NiklasGustafsson
Contributor

Let's keep this PR open for reference.

@NiklasGustafsson
Contributor

Closing this PR as I'm about to post another one. So many thanks, @osum4est, for getting it started!
