Conversation
Hey @samskalicky, thanks for submitting the PR.
CI supported jobs: [clang, website, sanity, unix-gpu, windows-gpu, centos-gpu, windows-cpu, edge, unix-cpu, miscellaneous, centos-cpu]
@samskalicky thanks for starting on this! Does it supersede the custom C++ ops?
No, I think C++ custom ops are still useful for those not familiar with the MXNet backend or who don't want to build MXNet from source. C++ custom ops are the improvement on Python custom ops: easy to write, easy to use, almost as fast as built-in ops. External ops will be for those willing to build MXNet from source, or who need the absolute highest performance for the op. In terms of performance we're talking about ops with a short compute time (ie. less than 10ms) where the overhead of the C++ custom op is not tolerable. For ops with lots of computation, the perf will be bounded by the compute and the C++ custom op overhead won't be the bottleneck. I'm working on a project (at some point soon I'll be able to share more details) using C++ custom ops for subgraphs, and I'm able to achieve performance improvements over built-in MXNet ops. So there's still motivation to keep them for their simplicity.
Hi MXNet community, I would like to propose another feature in MXNet Extensions for "external ops". Please check out the detailed summary [1]. Thanks! [1] #18904
What sort of binary compatibility guarantees does MXNet make for this interface? If the safe option is always to compile at the same time, I don't see much difference from including it in the build?
This interface makes no binary compatibility guarantees. You would need to compile and link your operators against the version of MXNet you intend to load your library into. Further, you would need to build it in the same environment as MXNet (gcc, glibc, etc.). The main benefits of external operators over built-in operators are:
Of course, you can still always build and distribute your own fork of MXNet with your custom operators in it. This style of external operator is similar to how other DL frameworks support custom operators. So if you're familiar with writing custom operators for TF, PT, etc. this is exactly the same.
I think this would be much cleaner if it was a separate directory because:
So my ideal instructions look more like
Agreed, if only the MXNet codebase were better organized. We have headers/includes scattered throughout the codebase, not just in include/mxnet, for example. If others have ideas that they wanna give a whirl, feel free to check out this branch and try building. I added a
I'm happy to be wrong, but I think this is an indication of more to come in terms of managing a complex set of includes/dependencies that will constantly be changing. And we haven't even gotten into the cmake options mapping to defines/compileOptions/etc.
@kpuatamazon Here's another idea. In order to build files that used to be in src/operator, for example, we need to have all the defines, includes, etc. that are normally set in the MXNet build:
After all, the use case for this feature is to be able to use custom components with a publicly released build of MXNet (without those components compiled in statically). So initially, those components were built as part of an MXNet build at some point, and they will need to continue being built the same way. With all of MXNet's complicated layout (includes all over the place, not just in the include/mxnet directory) and 3rd-party submodules it's not currently possible to just build MXNet and link it against a custom library. The hardest part is figuring out all the defines/includes/etc. to compile a library with. In this new idea, there's a build target called "external_lib" in MXNet's CMakeLists.txt (just like we already have for other extensions examples like "pass_lib") that compiles all *.cc and *.cu files in the example/extensions/lib_external_ops directory. After copying your custom files into the lib_external_ops directory you can just go and run. I've updated the README for the example as:
@kpuatamazon I had a followup idea: I moved all the library-specific build stuff into a CMakeLists.txt in the example/extensions/lib_external_ops directory. So now all of your library-specific build "stuff" is there and not in the main MXNet CMakeLists.txt. This way, building a custom library does not require changing the main MXNet CMakeLists.txt, only the small, specific one for your custom library. The main MXNet CMakeLists.txt just refers to the one in the lib_external_ops directory. You still have to build MXNet, and then use MXNet's CMakeLists.txt to generate the Makefile for your custom library. But at least now the control is completely in the custom library. You can clone MXNet, build it, delete all the example files in example/extensions/lib_external_ops, drop in all of your files in the same directory, and build just the "external_lib" target.
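To make the delegation concrete, here is a minimal sketch of what the two CMakeLists.txt files could contain. The target name "external_lib" and the directory path come from this thread; the exact contents below are an assumption for illustration, not the actual diff in this PR:

```cmake
# --- main MXNet CMakeLists.txt: just delegate to the example directory ---
add_subdirectory(example/extensions/lib_external_ops)

# --- example/extensions/lib_external_ops/CMakeLists.txt ---
# Compile every operator source dropped into this directory into one
# loadable shared library, and inherit the main build's defines/includes
# by linking against the mxnet target.
file(GLOB EXTERNAL_SRCS ${CMAKE_CURRENT_SOURCE_DIR}/*.cc)
add_library(external_lib SHARED ${EXTERNAL_SRCS})
target_link_libraries(external_lib PRIVATE mxnet)
```

With a layout like this, something along the lines of `cmake --build . --target external_lib` would rebuild only the custom library after swapping in new operator sources.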
So that does mean a workflow where people pip install mxnet and then build their own operator on top will not be possible? They will still have to maintain a fork, although the separation is a bit clearer now. I think we could draw a lot of benefits if one did not have to compile mxnet itself, as that would allow us to start working with optional features as separately distributed and maintained components. Right now there seems to be a very tight coupling, and I understand where it comes from, but the question is whether we see the vision as valuable and whether we can work out a plan to enable it. I understand that the current build system is not built with that in mind, but mxnet 2.0 would give us the opportunity to break some things to pave the path.
Yes, external operators will not be possible that way. If you don't want to build MXNet you can use C++ custom ops (but not external ops).
How about components in general? Let's say we would like to externalize the onnx support or some other component? We would still always have to build these things together and specify the feature set during compile time, right?
It depends on what "externalize the onnx support" entails. Today the only components we register are operators, subgraph properties, and graph passes (anything else?). The current onnx support is implemented entirely in Python. Maybe we could externalize TensorRT as a custom subgraph property & custom op though. You don't need to build them together so much as you need to reproduce the same build settings (defines, includes, etc.). The big reason you can't just compile your code and link against libmxnet.so is the same as why you can't build a custom application using the C++ API without compiling it with MXNet: it depends on the entire codebase (+ 3rd-party submodules).
@@ -24,7 +24,8 @@
 #
 # $ cp config/linux_gpu.cmake config.cmake
 #
-# Next modify the according entries, and then compile by
+# Next modify the entries in the config.cmake like MXNET_CUDA_ARCH to set the specific
+# GPU architecture, and then compile by
Thanks @leezu for that feedback. How about this, where we point out that users might want to set MXNET_CUDA_ARCH when using the linux_gpu.cmake file, and then let them refer below for the specific details? This at least points out that they might need to do something with MXNET_CUDA_ARCH in order to build for GPU.
# - "All" for all available GPU architectures supported by the version of CUDA installed
# - "specific GPU architectures" by giving the compute capability number such as
#   "7.0" or "7.0;7.5" (ie. sm_70 or sm_75) or you can specify the name like:
#   "Volta" or "Volta;Turing".
@leezu I tried to merge my version with what was there before. How's this?
Thanks @Zha0q1, @szha. For windows the default is not to export any symbols (according to this: https://stackoverflow.com/q/29038914). So currently I'm getting this error:
This seems to stem from missing exported symbols for nnvm stuff. Do we want to expose the identical set of symbols between windows/linux? If not, then we should have an identical library loading procedure (like C++ custom ops) where we re-register components from the library into MXNet's internal registry. But this will limit the types of components a user can put in their library by explicitly requiring MXNet to implement support for each (instead of just exposing the symbols and allowing users to load libraries that hook into them).
I think we should expose the MX/NN symbols, unless there's a downside to it.
The complication is how we go about doing the exporting. @yajiedesign maybe can add some more details, but a cursory search shows that you can use the cmake property WINDOWS_EXPORT_ALL_SYMBOLS to export everything and make it in line with how gcc works. But this will bloat the already-bloated windows dlls. The other way to do it would be to go to the source files and annotate each exported symbol individually. So really the only option is to use the cmake property WINDOWS_EXPORT_ALL_SYMBOLS and accept the bloat. Unless we drop windows and just say this "external components" feature is linux-only (C++ custom operators can still be used on windows).
Being explicit about the exported/supported symbols and thus components is preferable, as otherwise it becomes unclear which symbols are tracked as part of semantic versioning and which aren't. In fact, instead of setting WINDOWS_EXPORT_ALL_SYMBOLS, the best practice is actually to make gcc follow the Windows default of hiding symbols by default and only exposing whitelisted ones. You can refer to the CppCon talk: https://crascit.com/wp-content/uploads/2019/09/Deep-CMake-For-Library-Authors-Craig-Scott-CppCon-2019.pdf https://www.youtube.com/watch?v=m0DwB4OvDXk Before proceeding here, I think we need to answer the question "What symbols do we need to make available in libmxnet.so for external operators?" @samskalicky, can you clarify which symbols are needed for the external operators you are interested in?
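The explicit-whitelist approach described above can be sketched in CMake. This is a hedged illustration of the technique from the linked talk, not MXNet's actual build code; the `mxnet` target name matches the build, but the generated macro name is whatever `generate_export_header` produces for that base name:

```cmake
# Hide all symbols by default on gcc/clang, matching the Windows default.
set_target_properties(mxnet PROPERTIES
  CXX_VISIBILITY_PRESET hidden
  VISIBILITY_INLINES_HIDDEN ON)

# Generate a header defining MXNET_EXPORT as __declspec(dllexport) on
# Windows and __attribute__((visibility("default"))) on gcc/clang.
include(GenerateExportHeader)
generate_export_header(mxnet BASE_NAME MXNET)
```

Whitelisted symbols are then annotated explicitly in the source, e.g. `class MXNET_EXPORT SomePublicClass { ... };`, and every unannotated symbol stays hidden on both platforms.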
Not sufficiently; anything that anybody uses to write an internal/backend operator could be anything in MXNet/NNVM/TVM/mshadow/etc., so if the goal of the feature is to enable any backend operator to be dynamically loaded, we have to export all symbols. The biggest problem with individual symbol exposure is that many of the symbols needed aren't entirely defined in MXNet source code. One example is that the operator registration macro
Dear community members,

After further analysis and help from many people, I have decided that the best way forward is to put this work on hold while we spend some more time improving the C++ APIs of both MXNet and our dependencies (nnvm, dmlc-core). Once we have a stable, maintainable, versioned set of C++ APIs and a consistent process to build and link libmxnet.so with external libraries, we'll revisit this proposal.

The major problem we ran into was that we couldn't decide to expose all symbols in libmxnet.so, so to be able to link a custom library with external operators in it we need to export only the specific symbols needed to build an operator (or other external components). And some of these symbols are coming from our dependencies like NNVM (ie.

The other big problem was that we found that in order to compile an external operator you really had to use the main MXNet CMakeLists.txt to generate all the defines/includes/etc. so that you can compile your library correctly. This was a huge pain; ideally we would have a small set of includes that are opaque (ie. don't further include all of MXNet's includes) to use instead.

So we'll work towards both of these goals (symbol handling and includes) to reconfigure MXNet and its dependencies to have a cleaner C++ API. Once we have that, it will be easier to dynamically load libraries with external ops. In the meantime, some of the refactoring work will continue in another PR #19016. I will close this PR for now. Thanks for all of your input on this proposal! Sam
Description
Hi MXNet community,
I would like to propose another feature in MXNet Extensions for "external ops". Currently we have the following categories of operators in MXNet:
External operators will be builtin/backend operators but not compiled into libmxnet.so. Instead, they will be compiled into a separate shared object and dynamically loaded at runtime similar to C++ custom operators.
Motivation
Current C++ custom operators were designed to be simple to write, and easy to add to MXNet. They were not intended to be feature-equivalent to builtin/backend operators. There have been many users who start with a fork of MXNet, add some custom operators, and then try and use the C++ custom operator to load them. But C++ custom operators were not designed to be used this way.
Current examples
Current prototype flow
mx.library.load()
The example in example/extensions/lib_external_ops shows a minimum useful example. In the min_ex-inl.h file we define the minimum required components of a backend operator in MXNet:
Then in the min_ex.cc file we register the operator:
While this operator doesn't actually do anything, it has enough parts to register successfully and execute. Putting these two files in src/operator and building MXNet will produce a min_ex.cc.o file somewhere. After copying this file back into the example/extensions/lib_external_ops directory we can build a shared object with the operator implementation in it. The compilation command in this example is:
Notice that we also compile lib_api.cc and init_lib.cc into the library; these provide the APIs necessary to use the `MXLoadLib` API. We also link it against libmxnet.so, since the library depends on symbols in MXNet (after all, that's the whole point: for this operator to have access to all the internal goodies that regular operators have). After building `libmin_ex.so` we can load it into MXNet dynamically at runtime by using the existing `library.load()` API. During loading of the library (ie. in `dlopen`) the operator will be registered directly into the MXNet operator registry. There are no additional overheads or lambda functions for these operators.

Other Considerations
`MXLoadLib` API: we cannot call `dlclose` on libraries loaded inside libmxnet.so. Since the operators exist in the library loaded by MXNet, MXNet needs to destruct and free all allocated memory before exiting, but the objects/functions in the library still need to be available during that destruction. So the custom library must outlive libmxnet.so. For Python, we can move the `dlclose` from libmxnet.so to Python with no problem, but C/C++ users will need to call `dlclose` on their own. So we had to change the signature of `MXLoadLib` to:

`MXNET_DLL int MXLoadLib(const char *path, unsigned verbose, void** lib);`

and return the `void*` pointer to the library.

Open Questions