Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to new Android DNS functions android_res_nquery android_res_nresult (API 29+) #4333

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

micha102
Copy link

@micha102 micha102 commented Mar 3, 2025

Add support to new Android DNS functions android_res_nquery android_res_nresult (API 29+) in pjlib/src/pj/addr_resolv_sock.c
New macro
#define PJ_GETADDRINFO_USE_ANDROID 1

I looked at the Android documentation to enhance default getaddrinfo (very slow in recent Android APIs), and found that a native method for DNS resolution is available starting Android API 29, based on android_res_nquery and android_res_nresult functions.

I implemented it and it's definitely faster than default getaddrinfo, the delay between clicking on "Start Library" and actual TLS socket establishment is next to nothing (110 ms)

Add support android_res_nquery for DNS (Android API 29+)
@CLAassistant
Copy link

CLAassistant commented Mar 3, 2025

CLA assistant check
All committers have signed the CLA.

@micha102
Copy link
Author

micha102 commented Mar 4, 2025

Need to implement the check on Android API Level too
If >= 29 then run the DNS lookup with android_res_nquery/android_res_nresult
if < 29 then run the DNS lookup with getaddrinfo

Use dlopen libandroid.so then dlsym to load functions android_res_nquery, android_res_nresult and android_res_cancel.
This avoids devices with Android API < 29 to fail to load the library due to absence of the 29+ functions, and let them fallback directly to call the default getaddrinfo()
@micha102
Copy link
Author

micha102 commented Mar 5, 2025

I found it's trickier than I thought to "hide" the API 29+ functions from devices with API < 29.
If I just do a check on API Level, then older devices will throw "dlopen failed: cannot locate symbol "android_res_nquery" referenced by libpjsua2" even though it shouldn't see it (because of the if statement).

So, I dynamically loaded the functions android_res_nquery, android_res_nresult and android_res_cancel and did this only in case of API >= 29
Otherwise for older APIs, I cloned the same default getaddrinfo() to be used.

I tested it on :
Xiaomi Mi 9T with LineageOS API 35 arm64-v8a : works good with API 29+ functions
Emulated Pixel 6 Pro with API 35 x86_64 : works good with API 29+ functions
Emulated Pixel 6 Pro with API 32 x86_64 : works good with API 29+ functions
Emulated Pixel 6 Pro with API 28 x86 : works good with fallback getaddrinfo()

@micha102 micha102 marked this pull request as ready for review March 5, 2025 01:50
@sauwming sauwming added this to the release-2.16 milestone Mar 5, 2025
*/
res->ai_socktype != 0)
{
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

@sauwming sauwming requested review from bennylp and nanangizz March 5, 2025 05:14
char sdk_value[8] = "";
__system_property_get("ro.build.version.sdk", sdk_value);
if (atoi(sdk_value) >= 29) {
void *libandroid_handle = dlopen("libandroid.so", RTLD_NOW);
Copy link
Member

@nanangizz nanangizz Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it is loaded manually? If modifying build config, e.g: adding library or LD_FLAGS, is needed, perhaps we can help.

Update: ah sorry, just read the discussion above, I guess it is better to also put comments about this in the code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. I corrected it in the last commit and added a comment over that block to explain the reason why I used function pointers to load API 29+ functions only when Android API Level verification is done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if you link -landroid from the configure script instead?
Something like this:

pjproject/aconfigure.ac

Lines 1211 to 1213 in 6dabbd5

if test "$ac_pjmedia_video_has_android" = "yes"; then
ac_android_cflags="-DPJMEDIA_VIDEO_DEV_HAS_ANDROID_OPENGL=1"
LIBS="$LIBS -lGLESv2 -lEGL -landroid"

Copy link
Author

@micha102 micha102 Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the output of TARGET_ABI=xxxxx ./configure-android --use-ndk-cflags $OTHER_OPTIONS for my build and I saw that :

checking for linux/soundcard.h... yes
checking for machine/soundcard.h... no
Checking sound device backend... Android
**Checking if OpenGL ES 2 is available... yes**
Checking if small filter is disabled... no
Checking if large filter is disabled... no
...

So, the statement is actually matched for my build.

Do you mean I should test the verification of the function similar to this ?

pjproject/aconfigure.ac

Lines 391 to 396 in 6dabbd5

AC_LANG_PROGRAM([
[#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>]],
[getaddrinfo(0, 0, 0, 0);])
],

This would be the same issue as the NDK 27c with Target API 35 will successfully detect the functions.
But an older Android device will not find reference to those functions and thus fails to load the pjsua2.so library.

It's a known problem for NDK as compared to iOS to ensure backward compatiblity.
android/ndk#837

The Android NDK documentation mentions :

Only symbols are affected, not libraries. The only way to conditionally depend on a library that is not available in the app's minSdkVersion is with dlopen. We do not know how to solve this in a backwards compatible manner.

So I'm not sure if we can integrate this without dynamically loading functions (dlopen and dlsym)
This is discussed in android/ndk#837 (previously neither symbols nor libraries were possible to be "catched" with an API check.
But now symbols (headers) are possible but not library.
I've checked the header files for NDK and I can't find any headers file for the DNS functions android_res_*, it's only in the library libandroid.so itself.

Edit: another way is to enable "weak symbols" -D__ANDROID_UNAVAILABLE_SYMBOLS_ARE_WEAK__ allowing APIs to be safely called as long as they are only called when the API is available/is double-triple checked. But since, that weak symbols is not activated in configure-android I prefer not to be the one to disabke strong symbols. Or I don't know...

Edit2: I found the functions inside android/multinetwork.h. the function __builtin_available must be used to safeguard calls. But it still needs to activate weak symbols according to the same NDK document

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my PR is stuffed with standard functions and I need to revamp it (like I should use optimized PJ types instead of regular C ones or functions like poll where pj_ioqueue_poll exists)
Your help is welcome either in the form or in the substance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple of minors from me (pj_memcpy and indentation).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to compile with weak symbols but so far I'm failing (still having the dlopen error when running on older Android).
I need to tweak configure-android script.

Weak symbols would make the code "prettier" but I don't really see it to be used elsewhere apart from system features. (Like DNS which is subject of the PR)
I would say if things can be done in Android code then it must be done inside the Android Code and not in pj library.

Copy link
Author

@micha102 micha102 Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I succeeded enabling weak symbols in PJ build.
In fact, I needed to add in configure-android file ENABLE_WEAK_SYMBOLS="-D__ANDROID_UNAVAILABLE_SYMBOLS_ARE_WEAK__ -Werror=unguarded-availability"
and then
export CFLAGS="${NDK_CFLAGS} ${CFLAGS} ${CPPFLAGS} ${ENABLE_WEAK_SYMBOLS}"

However, I realized there're some issues with the present configure-android build file :

  1. I saw in the comments of configure-android, that the APP_PLATFORM will select the latest available Android API Version.
    This is a contradiction with Android recommendation which states :

APP_PLATFORM declares the Android API level this application is built against and corresponds to the application's minSdkVersion.
If not specified, ndk-build will target the minimum API level supported by the NDK. The minimum API level supported by the latest NDK will always be low enough to support nearly all active devices.

Yet, in PJ Android build, it's the latest API level used (not the earliest as the default option from NDK)
Using the latest API Level will, of course, contains the latest APIs (functions) provided by Android, but older devices will not understand functions if they are introduced in a recent API version.

  1. When I forced APP_PLATFORM to be the minimum (in current LTS Android NDK 27c), the minimum is Android API 21), I can see errors that android_res_nquery is only available in API Level 29 and I should surround newer functions with an
    if (__builtin_available(android 29, *)) {
    ...
    }

But I also found usage of functions from API Level 28 and above :

../src/pjmedia-codec/and_aud_mediacodec.cpp:1096:5: error: 'AMediaCodec_setAsyncNotifyCallback' is only available on Android 28 or newer [-Werror,-Wunguarded-availability]
 1096 |     AMediaCodec_setAsyncNotifyCallback(codec_data->enc, async_cb, codec_data);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/xxxx/pjsip-android/pjproject-2.15.1/android-ndk-r27c/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include/media/NdkMediaCodec.h:499:16: note: 'AMediaCodec_setAsyncNotifyCallback' has been marked as being introduced in Android 28 here, but the deployment target is Android 21
  499 | media_status_t AMediaCodec_setAsyncNotifyCallback(
      |                ^
../src/pjmedia-codec/and_aud_mediacodec.cpp:1096:5: note: enclose 'AMediaCodec_setAsyncNotifyCallback' in a __builtin_available check to silence this warning
 1096 |     AMediaCodec_setAsyncNotifyCallback(codec_data->enc, async_cb, codec_data);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/pjmedia-codec/and_aud_mediacodec.cpp:1097:5: error: 'AMediaCodec_setAsyncNotifyCallback' is only available on Android 28 or newer [-Werror,-Wunguarded-availability]
 1097 |     AMediaCodec_setAsyncNotifyCallback(codec_data->dec, async_cb, codec_data);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/xxxx/pjsip-android/pjproject-2.15.1/android-ndk-r27c/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include/media/NdkMediaCodec.h:499:16: note: 'AMediaCodec_setAsyncNotifyCallback' has been marked as being introduced in Android 28 here, but the deployment target is Android 21
  499 | media_status_t AMediaCodec_setAsyncNotifyCallback(

This is caused by PR #4105

So, in my opinion :

  1. We can keep strong symbols, and implement functions with dlopen/dlsym (as it is now in the PR), and implement the same in and_aud_mediacodec.cpp and and_vid_mediacodec.cpp
  2. We enable weak symbols, and instead of the code
    /* Initialize function pointers to NULL */
    android_res_nquery_fn android_res_nquery_ptr = NULL;
    android_res_nresult_fn android_res_nresult_ptr = NULL;
    android_res_cancel_fn android_res_cancel_ptr = NULL;
    /* Buffer to store the SDK version as a string */
    char sdk_value[8] = "";
    /* Get the Android SDK version */
    __system_property_get("ro.build.version.sdk", sdk_value);

    if (atoi(sdk_value) >= 29) {
        /* Load the libandroid.so library containing network functions */
        void *libandroid_handle = dlopen("libandroid.so", RTLD_NOW);
        if (libandroid_handle) {
            /* Get the nquery, nresult and cancel functions from library */
            android_res_nquery_ptr = (android_res_nquery_fn)
                                     dlsym(libandroid_handle,
                                     "android_res_nquery");
            android_res_nresult_ptr = (android_res_nresult_fn)
                                     dlsym(libandroid_handle,
                                     "android_res_nresult");
            android_res_cancel_ptr = (android_res_cancel_fn)
                                     dlsym(libandroid_handle,
                                     "android_res_cancel");

            /* Non-null pointers mean DNS functions are properly loaded */
            if (android_res_nquery_ptr &&
                android_res_nresult_ptr &&
                android_res_cancel_ptr) {
                                [...]
            }}} else {
                do the getaddrinfo stuff }

which are 11 lines of code, we will simply have :

if (__builtin_available(android 29, *)) {
[...]
} else {
do the getaddrinfo stuff }

which is cleaner and prettier.
Should I go with 1 or 2 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for 2.
That's what we usually do in other platforms (such as Mac and iOS). It will be a hassle to do approach no 1 for every new-ish API that we're going to use.

Format for max 80 cols
transform the PJ_GETADDRINFO_USE_ANDROID to elif
indentation correction
add comments for dynamic function loading
@micha102 micha102 requested review from sauwming and nanangizz March 5, 2025 21:28
struct sockaddr_in addr;
pj_bzero(&addr, sizeof(addr));
addr.sin_family = PJ_AF_INET;
memcpy(&addr.sin_addr, rdata, 4);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pj_memcpy

Update APP_PLATFORM value for NDK >= r21
Look for the correct folder containing the minimum SDK
Switch the default selection to the minimum SDK version if not explicitly given, as per document https://developer.android.com/ndk/guides/application_mk
Surround AMediaCodec_setAsyncNotifyCallback call with Android API check
@micha102 micha102 marked this pull request as draft March 11, 2025 02:26
Use __builtin_available to check Android API
Dynamic loading of ns_initparse and ns_parserr to let pjlib compile with Android 21 (the 2 functions only exist in API 23+)
@micha102
Copy link
Author

micha102 commented Mar 12, 2025

I did commits for :

configure-android

  1. Add the SDK level detection for latest NDK (starting NDK 22 the folder "platforms" disappeared)

The paths being removed are:
platforms

  1. Set the platform to minimum (which is android-21 in latest LTS NDK 27c)
  2. Enabled weak symbols inside configure-android

pjmedia/src/pjmedia-codec/and_vid_mediacodec.cpp and and_aud_mediacodec.cpp

  1. Surround the AMediaCodec_setAsyncNotifyCallback (API 28+) with availability check

addr_pjlib/src/pj/addr_resolv_sock.c

  1. Surround the new Android DNS resolution with availability check
  2. I noticed that the functions ns_initparse and ns_parserr are only available starting API 23+

New libc functions in M (API level 23):
...
re-introduced various <resolv.h> functions: ..., ns_initparse, ..., ns_parserr

And with this, I got a problem because, unlike Android API functions which are handled by the __builtin_available, the functions from libc cannot be handled, even never called (it's already inside an if statement requiring API 29!)

Known issues and limitations:
APIs in bionic (libc, libm, libdl) are not currently supported. See the bug for more information. If the source compatibility issues can be resolved, that will change in a future NDK release.

So I had two choices :

  • Make the code only buildable for Android API 23 and above (APP_PLATFORM=23)
  • Dynamically load ns_initparse and ns_parserr which are anyway loaded and called only when API is 29+ (a constraint even harder than the API 23)

I preferred to keep Android compatibility large (compiles successfully for android-21 which is the minimum for NDK 27c)
I tested it on :
Xiaomi Mi 9T LineageOS (Android API 35) ARM64-V8A successfully (new DNS)
Xiaomi Mi 9T with LineageOS API 35 arm64-v8a : works good with API 29+ functions
Emulated Pixel 6 Pro with API 35 x86_64 : works good with API 29+ functions
Emulated Pixel 6 Pro with API 32 x86_64 : works good with API 29+ functions
Emulated Pixel 6 Pro with API 28 x86 : works good with fallback getaddrinfo()
Emulated LineageOS 7.1 on VirtualBox with API 25 x86 : works good with fallback getaddrinfo()
Emulated LineageOS 5.1 on VirtualBox with API 22 x86 : works good with fallback getaddrinfo()
I'm ready for code review.

@micha102 micha102 marked this pull request as ready for review March 12, 2025 01:04
@micha102 micha102 requested a review from sauwming March 12, 2025 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Slow destination DNS resolution in Android API 32+ (fail for Emulated API 32)
4 participants