Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASYNCIFY impossible to use WHITELIST with multiple arg functions #9128

Closed
caiiiycuk opened this issue Jul 31, 2019 · 15 comments
Closed

ASYNCIFY impossible to use WHITELIST with multiple arg functions #9128

caiiiycuk opened this issue Jul 31, 2019 · 15 comments

Comments

@caiiiycuk
Copy link
Contributor

caiiiycuk commented Jul 31, 2019

Related to #9094.

Hi, still can't use ASYNCIFY_WHITELIST because functions name from it is not mapped correctly to real function names from binary.

Testcase:

#include <emscripten.h>

void DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool) {
    emscripten_sleep(1);
}

int main(int argc, char** argv) {
    DOS_ReadFile(0, 0, 0, 0);
    return 0;
}

Next, build it with whitelist:

emcc -s ASYNCIFY=1 -s ASYNCIFY_WHITELIST=["DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)"] asyncify.cpp -o asyncify.wasm --profiling

warning: Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short ($DOS_ReadFile\28unsigned\20short)
warning: Asyncify whitelist contained a non-existing function name: unsigned char* ($unsigned\20char*)
warning: Asyncify whitelist contained a non-existing function name: unsigned short* ($unsigned\20short*)
warning: Asyncify whitelist contained a non-existing function name: bool) ($bool\29)

The problem is in , because it's treated as whitelist array delimiter not as function arguments delimiter. And escape didn't help:

emcc -s ASYNCIFY=1 -s ASYNCIFY_WHITELIST=["DOS_ReadFile(unsigned short\, unsigned char*\, unsigned short*\, bool)"] asyncify.cpp -o asyncify.wasm --profiling
shared:WARNING: LLVM version appears incorrect (seeing "9.0", expected "10.0")
warning: Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short\\ ($DOS_ReadFile\28unsigned\20short\5c\5c)
warning: Asyncify whitelist contained a non-existing function name: unsigned char*\\ ($unsigned\20char*\5c\5c)
warning: Asyncify whitelist contained a non-existing function name: unsigned short*\\ ($unsigned\20short*\5c\5c)
warning: Asyncify whitelist contained a non-existing function name: bool) ($bool\29)

I tried also:

-s ASYNCIFY_WHITELIST=['DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)']
-s ASYNCIFY_WHITELIST="['DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)']"
-s ASYNCIFY_WHITELIST='["DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)"]'
-s ASYNCIFY_WHITELIST=@file

Results are same.

Actual function name in wasm binary:

~/emscripten/sdk/binaryen/bin/wasm-opt --print-function-map asyncify.wasm|grep DOS_Read
(no output file specified, not emitting output)
5:DOS_ReadFile\28unsigned\20short\2c\20unsigned\20char*\2c\20unsigned\20short*\2c\20bool\29

I think DOS_ReadFile\28unsigned\20short\2c\20unsigned\20char*\2c\20unsigned\20short*\2c\20bool\29 is exact DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool) if you decode it. Btw, passing \encoded names also not supported:

emcc -s ASYNCIFY=1 -s ASYNCIFY_WHITELIST=["DOS_ReadFile\28unsigned\20short\2c\20unsigned\20char*\2c\20unsigned\20short*\2c\20bool\29"] asyncify.cpp -o asyncify.wasm --profiling

warning: Asyncify whitelist contained a non-existing function name: DOS_ReadFile\\28unsigned\\20short\\2c\\20unsigned\\20char*\\2c\\20unsigned\\20short*\\2c\\20bool\\29 ($DOS_ReadFile\5c\5c28unsigned\5c\5c20short\5c\5c2c\5c\5c20unsigned\5c\5c20char*\5c\5c2c\5c\5c20unsigned\5c\5c20short*\5c\5c2c\5c\5c20bool\5c\5c29)

Function name now is double escaped.

@cwoffenden
Copy link
Contributor

cwoffenden commented Jul 31, 2019

Does it work if you double- and single-quote them? We do this, not for ASYNCIFY_WHITELIST but it takes a list the same:

-s EXTRA_EXPORTED_RUNTIME_METHODS="['UTF8ToString', 'stringToUTF8']"

@caiiiycuk
Copy link
Contributor Author

Seems no

@kripken
Copy link
Member

kripken commented Jul 31, 2019

Thanks @caiiiycuk , yes, this looks like something I missed here. In fact we need fixes in two stages of parsing in order to handle it, I opened

With those two this should work properly, let me know if you test it.

@caiiiycuk
Copy link
Contributor Author

@kripken Still does not work. I am tested on emscripten/parse and binaryen/asyncify-names. The comma is handled properly, but looks like there is another problems with spaces after commas.

emcc -s ASYNCIFY=1 -s ASYNCIFY_WHITELIST=["DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)"] asyncify.cpp -o asyncify.wasm --profiling

warning: Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short,unsigned char*,unsigned short*,bool) ($DOS_ReadFile\28unsigned\20short\2cunsigned\20char*\2cunsigned\20short*\2cbool\29)
expected: DOS_ReadFile\28unsigned\20short\2c\20unsigned\20char*\2c\20unsigned\20short*\2c\20bool\29
actual:  $DOS_ReadFile\28unsigned\20short\2cunsigned\20char*\2cunsigned\20short*\2cbool\29

Original function name coded with spaces after comma \2c\20 where emscripten uses just \2c.

kripken added a commit to WebAssembly/binaryen that referenced this issue Aug 1, 2019
The lists are comma separated, but the names can have internal commas since they are human-readable. This adds awareness of bracketing things, so void foo(int, double) is parsed as a single function name, properly.

Helps emscripten-core/emscripten#9128
@kripken
Copy link
Member

kripken commented Aug 1, 2019

@caiiiycuk looks like an escaping issue, need extra quotes around the whole thing, the first command here fails while the second works:

$ ./emcc -s ASYNCIFY=1 -s ASYNCIFY_WHITELIST=["DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)"] --profiling a.cpp
warning: Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short,unsigned char*,unsigned short*,bool) ($DOS_ReadFile\28unsigned\20short\2cunsigned\20char*\2cunsigned\20short*\2cbool\29)
$ ./emcc -s ASYNCIFY=1 -s 'ASYNCIFY_WHITELIST=["DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)"]' --profiling a.cpp
$

It seems like the shell just removes the " without the extra quoting, and then the parser can't tell it should all be a single string.

I'll add a clarification to the docs when I open the emscripten PR to add a test for this.

@kripken
Copy link
Member

kripken commented Aug 1, 2019

#9132 now has extra warnings for escaping errors, and a test for that.

@caiiiycuk
Copy link
Contributor Author

Yes you right this was an shell escape error. Sorry again, but it didn't work when I pass file to whitelist.

-s 'ASYNCIFY_WHITELIST=@${CMAKE_CURRENT_LIST_DIR}/js-dos-cpp/asyncify.txt'

error is:

Fatal: Asyncify: failed to parse lists
shared:ERROR: '/home/caiiiycuk/emscripten/sdk/binaryen/bin/wasm-opt wdosbox.wasm -o wdosbox.wasm --no-exit-runtime -O3 --post-emscripten --low-memory-unused --strip-debug --strip-producers --asyncify [email protected]_sleep,env.emscripten_wget,env.emscripten_wget_data,env.emscripten_idb_load,env.emscripten_idb_store,env.emscripten_idb_delete,env.emscripten_idb_exists,env.emscripten_idb_load_blob,env.emscripten_idb_store_blob,env.SDL_Delay,env.__syscall118,env.invoke_* --pass-arg=asyncify-whitelist@main,SHELL_Init(),DOS_ReadFile(unsigned short,
"DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool),DOSBOX_RunMachine(),CALLBACK_RunRealInt(unsigned char),CALLBACK_RunRealFar(unsigned short,
"CALLBACK_RunRealFar(unsigned short, unsigned short),CALLBACK_Idle(),Normal_Loop(),DOS_21Handler(),PROGRAMS_Handler(),device_CON::Read(unsigned char*,
"device_CON::Read(unsigned char*, unsigned short*),device_CON::device_CON(),DOS_Device::Read(unsigned char*,
"DOS_Device::Read(unsigned char*, unsigned short*),MEM::Run(),INTRO::Run(),BOOT::Run(),Config::StartUp(),Config::Init(),Section::ExecuteInit(bool),GUI_StartUp(Section*),DOS_Shell::InputCommand(char*),DOS_Shell::Run(),DOS_Shell::Execute(char*,
"DOS_Shell::Execute(char*, char*),DOS_Shell::DoCommand(char*),DOS_Shell::ParseLine(char*),DOS_Shell::CMD_HELP(char*),DOS_Shell::CMD_PAUSE(char*),DOS_Shell::CMD_CHOICE(char*),DOS_Shell::CMD_DIR(char*),DOS_Shell::CMD_TIME(char*),DOS_Shell::CMD_CLS(char*),DOS_Shell::CMD_DATE(char*),DOS_Shell::CMD_COPY(char*),Reboot_Handler(),DOS_LoadKeyboardLayout(char const*,
"DOS_LoadKeyboardLayout(char const*, int, char const*),DOS_KeyboardLayout_Init(Section*),DOS_SwitchKeyboardLayout(char const*,
"DOS_SwitchKeyboardLayout(char const*, int&),keyboard_layout::read_codepage_file(char const*,
"keyboard_layout::read_codepage_file(char const*, int),keyboard_layout::switch_keyboard_layout(char const*,
"keyboard_layout::switch_keyboard_layout(char const*, keyboard_layout*&, int&),DOS_KeyboardLayout::DOS_KeyboardLayout(Section*),KEYB::Run(),PIC_RunQueue(),MAPPER_RunEvent(unsigned int),MAPPER_RunInternal(),Pause_Loop(),WGET::Run() --mvp-features -g' failed (1)
ninja: build stopped: subcommand failed.

@caiiiycuk
Copy link
Contributor Author

asyncify.txt

@kripken
Copy link
Member

kripken commented Aug 1, 2019

Thanks @caiiiycuk , we didn't handle response files with newlines. Fix in that PR.

@kripken kripken closed this as completed in f0de64c Aug 2, 2019
@gabrielcuvillier
Copy link
Contributor

gabrielcuvillier commented Aug 9, 2019

I have similar issues with the D3 port (multiple args functions not working). Tried various extra quotes tricks, but no luck.

in the --pass-arg=asyncify-whitelist@<...> of wasm-opt, I can see some weird string concatenation such as:
idCommonLocal::GUIFrame(bool,"idCommonLocal::GUIFrame(bool,bool)

I am using the 1.38.41-upstream, maybe some PR have not landed though. I'll wait a bit

@kripken
Copy link
Member

kripken commented Aug 9, 2019

@gabrielcuvillier 1.38.41 should have all those fixes, so you may be seeing another bug. Can you provide your full emcc link command, and I'll try to reproduce?

@gabrielcuvillier
Copy link
Contributor

gabrielcuvillier commented Aug 10, 2019

@kripken Yes. Reproducing the issue is very simple:

in test.cpp (test.cpp.zip):

void f(bool,bool) {
  return;
}

int main() {
  f(true,true);
}

command:
emcc test.cpp --profiling -s ASYNCIFY=1 -s 'ASYNCIFY_WHITELIST=["f(bool,bool)"]' -o test.html
will fail during wasm-opt, with weird string concatenation in the arguments

command (same as previous, without quotes):
emcc test.cpp --profiling -s ASYNCIFY=1 -s ASYNCIFY_WHITELIST=["f(bool,bool)"] -o test.html
will succeed, but mentioning f(bool,bool) is not found

Either this is another bug, or something is not good on my system (?)

@kripken
Copy link
Member

kripken commented Aug 10, 2019

The human-readable name is f(bool, bool) - with a space after the comma. This can be confusing, sadly... in the end it's often easiest to just view the wasm binary (with -g so there are names). With that name, I don't get a warning about it not being found.

@gabrielcuvillier
Copy link
Contributor

Great, that works!

@gabrielcuvillier
Copy link
Contributor

Huh... not so fast (still fails on windows): #9206

belraquib pushed a commit to belraquib/emscripten that referenced this issue Dec 23, 2020
…-readable C++ names in -s lists, which is important for Asyncify (emscripten-core#9132)

We have nice syntax for lists, -s LIST=[foo,bar] which works
well in many cases. But sometimes the list contains items
that contain ",", for example -s LIST=["foo(x, y)", "bar"]. To handle
that properly, try to parse things that look like json using the json
parser; if it fails go back to the old code.

Together with WebAssembly/binaryen#2275 
this fixes emscripten-core#9128 where the lists there actually do have ","s in them.

Add testing for Asyncify lists with such names, and also added
extra warnings for when escaping is confusing, and a test for that
as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants