-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DYNAMIC_ARCH support on Android #2931
Comments
The problem goes a little deeper, see #2715 |
Looks like there is some overlap, but I think the ability to set the core type externally via env variable is currently broken. I'll provide some more comments in #2715. |
Why do you think it is broken ? "gotoblas" points to the cpu model in use - if we find this already set, the initialization routine must already have been run (in another thread). If gotoblas is still undefined, we look in the environment if the user wants a particular model, and if that is not the case the cpu detection is begun. This works fine for the other architectures, the only thing it does not let you do is change your mind and try to switch to another cpu model when the library is already running. |
In my case -- and I admit this may not be how most people use openblas, especially on other platforms -- I have a product that's an SDK that runs on Android. Environment variable for core type can only be set in my code (i.e. it cannot be set externally), and by the time I can do it, openblas had already been initialized (and due to failing to properly detect core type, it it's not setup in an optimal way). Perhaps I am missing a part about initialization... I don't think I am doing anything explicitly to initialize openblas. If I had control over initialization, I could then set environment variable prior to init. |
Initialization happens the moment you load the library, if that is before your own code gets a chance then I'm not sure what to do except try to find a fallback cpu detection method that works on your devices. Would /proc/cpuinfo be available ? (This is what gets parsed by the build-time check when not building DYNAMIC_ARCH, hwcap was just supposed to be more elegant and reliable) |
I agree, but the problem is that at that point I cannot call gotoblas_dynamic_init(), i.e. it won't do anything. Maybe to ask a different question: what is the use case for calling gotoblas_dynamic_init() with the current setup? Unless I am missing something, there isn't really a way to use it properly right now. Btw, I can make the changes to my fork to address my use case; I am just trying to understand if I am doing something wrong and/or if the API could be improved. |
To answer your other question, I assume /proc/cpuinfo would be available (I don't know for sure as my SDK is deployed on variety of Android devices)... Based on the fact that Google is using it in https://github.com/google/cpu_features I would assume so, although @yuyichao suggests alternative approach in #2715. From my comment in #2715 it might be better not to force this to happen under the hood, or at least to allow developers to change it. With the current setup, autodetect could silently fail (and default to ARMV8), and it's not possible to change the coretype after that happens. |
AFAIK gotoblas_dynamic_init was never meant to be exposed, as I mentioned it is called by the constructor code when the library is loaded and so far an environment variable that gets read at that point was all that was needed (and usually only for debugging). |
OK, thanks. The first sentence explains what I needed to know. I am seeing quite a bit of difference... I think I'll just determine the core type in my code, for the relevant cores (to avoid the issue mentioned in #2715, where the type could be different among subsets of cores), and then use modified gotoblas_dynamic_init to set it via env variable (in conjunction with setting thread affinity). Thanks! |
Works as is on android 10 (e.g termux), is there any problem on Android 11? |
Maybe HWCAP access is restricted on some embedded devices, or termux is emulating priviledged syscalls under the hood ? Anyway a more intelligent fallback than just assuming ARMV8 should be doable, I just do not want to experiment before 0.3.12 is out. |
|
I've been experimenting with using DYNAMIC_ARCH support on Android (aarch64) and had some questions about this. I am using 0.3.10 release (commit 63b03ef).
On couple of devices I've been using for testing, upon program startup it seems that get_coretype method gets called and it fails to obtain core type info because
if (!(getauxval(AT_HWCAP) & HWCAP_CPUID)) {
fails, so core type defaults to ARMV8; this seems to happen when the program starts (without any openblas methods being called). One of these devices is Samsung S7, another is custom hardware.
Does this match other people experience? I saw some mentions on https://developer.android.com/ndk/guides/cpu-features#the_google_cpu_features_library that using AT_HWCAP may not be accurate on old devices ("for example, claim to have integer division instructions but do not"), but no mention of not being able to access this info at all.
Alternative would be to use Google's cpu_features library (https://github.com/google/cpu_features) and dynamically set core type via OPENBLAS_CORETYPE env variable. But, because gotoblas is already set, calling gotoblas_dynamic_init() doesn't do anything because of this line: https://github.com/xianyi/OpenBLAS/blob/f6e4cf2f9dac6f90b36f49659d883767d63791f5/driver/others/dynamic_arm64.c#L232
Am I missing something obvious here?
Thank you,
Ogi
The text was updated successfully, but these errors were encountered: