error open urandom #3445
Comments
There is still this problem after I restored aws-c-io to 0.10.7.
Hey Roy, were there any other changes to your system that may have affected /dev/urandom? Also, if you're using s2n-tls 0.10.7, that's a pre-release version before our 1.0.0 release. Are you using an old version of aws-c-io? Is it possible to upgrade to a version that uses a more recent s2n-tls version (see https://github.com/aws/s2n-tls/releases)?
Hi Frank, thanks for your reply! I tried this example code on the GitHub host and /dev/urandom could be accessed without a problem. We are developing application-level code and do not touch anything low-level like /dev/urandom, so we are currently unsure how this was introduced. It seems like an environment problem, but the urandom test code runs successfully. The aws-c-io version is 0.6.20; we also restored it to 0.6.9, but that still did not solve the problem. We have not changed anything. This error just emerged 3 days ago and became a blocking issue, even after we restored aws-c-io and s2n-tls to the previously working versions.

--test code

Currently the error S2N_ERR_OPEN_RANDOM can be thrown from two functions: s2n_rand_init_impl() and s2n_rand_init(). As the above test code succeeds, we suspect the error comes from s2n_rand_init(). Do you have a way to identify which line generates this error?
I also tried the newest s2n 1.3.19 and still see the same issue. It seems like an environment problem, but I don't understand how the test code can succeed, which means /dev/urandom can be accessed and works normally.
We have an API that can report the line of code that generated the error: https://github.com/aws/s2n-tls/blob/main/api/s2n.h#L384-L394 Are you able to call that? And are you calling s2n_init() directly in your code? If so, is it possible you're calling s2n_init() more than once?
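(For illustration, a minimal sketch of calling that error API, assuming your wrapper can reach s2n's C API directly; s2n_errno, s2n_strerror(), and s2n_strerror_debug() are the names declared in api/s2n.h:)

```cpp
// Sketch only: after an s2n call fails, s2n_errno holds the error code;
// s2n_strerror_debug() additionally reports where in the s2n-tls source
// the error was raised.
#include <cstdio>
#include <s2n.h>

static void report_s2n_error()
{
    std::fprintf(stderr, "s2n error: %s\n", s2n_strerror(s2n_errno, "EN"));
    std::fprintf(stderr, "s2n debug: %s\n", s2n_strerror_debug(s2n_errno, "EN"));
}
```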
We are not calling s2n_init() directly. We just call the AWS SDK InitAPI as below; all of this is done implicitly.
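(The original snippet is not preserved in this thread; a minimal sketch of the usual AWS SDK for C++ pattern, as an assumption about what the driver does rather than its actual code, would be:)

```cpp
// Hypothetical sketch of typical AWS SDK for C++ initialization; the driver's
// real code is not shown in this thread.
#include <aws/core/Aws.h>

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);   // internally brings up aws-c-io, which calls s2n_init()
    {
        // ... use SDK clients here ...
    }
    Aws::ShutdownAPI(options);
    return 0;
}
```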
We have two executables which call Aws::InitAPI(), and they are executed sequentially. We call Aws::ShutdownAPI() before each executable exits. The problem is always seen for the 2nd executable, even if I disable the first executable, and I cannot understand why. But today I saw it run successfully, for the first time in the last 3 days, though only once.
Interesting. That could suggest an issue with cleanup. Just looking at our code, we close urandom in s2n_rand_cleanup, which should get triggered by s2n_cleanup unless s2n_disable_atexit is called. We should probably check whether the SDK calls s2n_disable_atexit. We probably also need a new s2n-tls test: it looks like the current "repeatedly call s2n_init and s2n_cleanup" test calls s2n_disable_atexit.
The SDK does call s2n_disable_atexit. In fact, that call actually was added to support the SDK: #3011.
I think it is not related to closing urandom, as the problem can be seen even when we run the 2nd executable alone, so in that case there is no urandom use before s2n_init() is called. I think the error may come from s2n_rand_init(), which uses OpenSSL's ENGINE_new().

static int s2n_rand_init_impl(void)
Hi, I have identified that the error is generated by ENGINE_add(e) in s2n_rand_init(). Is there a way to get the details from ENGINE_add(e)?

S2N_RESULT s2n_rand_init(void)
Looking at the OpenSSL ENGINE_add() code, the error can only come from CRYPTO_THREAD_write_lock(global_engine_lock), since e is not NULL, it has its id and name set, and it is not added more than once. The CRYPTO_THREAD_write_lock(global_engine_lock) failure seems to be caused by global_engine_lock being 0.
If the error is coming from openssl, can you use the openssl error methods to examine it? Something like:
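(A sketch of one way to do that with libcrypto's error queue; the helper name is illustrative:)

```cpp
// Sketch: drain and print whatever libcrypto has queued right after the
// failing ENGINE_add() call.
#include <cstdio>
#include <openssl/err.h>

static void dump_openssl_errors()
{
    unsigned long err;
    while ((err = ERR_get_error()) != 0) {
        char buf[256];
        ERR_error_string_n(err, buf, sizeof(buf));
        std::fprintf(stderr, "openssl: %s\n", buf);
    }
}
```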
I got this error: error:13000067:engine routines::conflicting engine id
Is it possible s2n_rand_init() is executed in multiple threads?
Are you calling InitAPI in multiple threads? s2n_init can only be called once across all threads, so that would be an issue if the SDK doesn't have extra protection. It looks like they have something, but I'm not sure to what extent.
It may be in multiple threads, but we have protection to run InitAPI only once.
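(As described later in this thread, that protection is a mutex plus a static counter around Aws::InitAPI()/Aws::ShutdownAPI(). A hypothetical sketch of that shape, with illustrative names rather than the driver's actual code:)

```cpp
// Hypothetical sketch of a mutex- and refcount-guarded SDK lifetime, roughly
// as described in this thread. Names are illustrative.
#include <mutex>
#include <aws/core/Aws.h>

class SdkHandle {
public:
    SdkHandle()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (refCount_++ == 0) {
            Aws::InitAPI(options_);       // only the first object initializes
        }
    }

    ~SdkHandle()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--refCount_ == 0) {
            Aws::ShutdownAPI(options_);   // only the last object shuts down
        }
    }

private:
    static std::mutex mutex_;
    static int refCount_;
    static Aws::SDKOptions options_;
};

std::mutex SdkHandle::mutex_;
int SdkHandle::refCount_ = 0;
Aws::SDKOptions SdkHandle::options_;
```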
Do you know what version of OpenSSL you're using as libcrypto? If not, can you add a log of the libcrypto version string? Edit: Although that actually shouldn't affect s2n_rand_init, because we initialize the locking callbacks after we call s2n_rand_init. That's probably a potential bug we need to fix, even if it isn't related to this issue.
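(For example, a sketch assuming OpenSSL 1.1.0 or later, where OpenSSL_version() reports the runtime libcrypto:)

```cpp
// Sketch: print the libcrypto version linked at runtime next to the version
// the headers were compiled against, to spot a mismatch.
#include <cstdio>
#include <openssl/crypto.h>
#include <openssl/opensslv.h>

static void log_openssl_version()
{
    std::fprintf(stderr, "runtime libcrypto: %s\n", OpenSSL_version(OPENSSL_VERSION));
    std::fprintf(stderr, "compile-time:      %s\n", OPENSSL_VERSION_TEXT);
}
```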
I added the below code. I see ENGINE_add() is called multiple times. The OpenSSL version is 3.0.5.

fprintf(stderr, "#### ENGINE_add() will be called\n");

ENGINE_add() will be called
OpenSSL 3.0.5 5 Jul 2022
ENGINE_add() will be called
OpenSSL 3.0.5 5 Jul 2022
Ah but that's probably to do with how you're building s2n-tls vs how you're building your code and where you're calling the method to get the version. There is this issue with the SDK: awslabs/aws-c-io#508
We do not use s2n-tls or aws-c-io directly. We just use the AWS SDK and download all dependencies via vcpkg, so we are not sure why OpenSSL 3.0 is used. The fprintf debug info was added to s2n_rand_init().
All your debug info points to s2n_init being called twice, which we know (and document) is illegal and will fail. It's likely that something in your application code or one of the libraries consuming s2n-tls is leading to multiple calls to s2n_init. At this point I'm not sure I can offer any advice specific to s2n-tls, but if I were debugging your code I would:
You could also try reaching out to aws-c-io / the SDK.
Thanks for your suggestion. Let me try.
I think I have identified the root cause. We did call Aws::InitAPI multiple times, but sequentially, not in parallel. We call Aws::InitAPI in our constructor with mutex protection for multiple threads, and Aws::ShutdownAPI in our destructor when no such objects are in use, using a static int as a counter. But our tests are executed one by one, and each one creates an object and destructs it when the test is finished, so we end up with many Aws::InitAPI -> Aws::ShutdownAPI cycles. It seems the OpenSSL ENGINE is not released synchronously but asynchronously: ShutdownAPI returns before the resource is actually released, so we see this problem. I think you could write a test that runs s2n_init and cleanup in a loop to reproduce the problem. Sorry, I could not build s2n-tls successfully on my side because the gnutls configure always fails with a "libnettle 3.6 not found" error no matter what I do.
We have a test like that: https://github.com/aws/s2n-tls/blob/main/tests/unit/s2n_init_test.c |
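(For reference, the general shape of such a loop, as a sketch rather than the linked test file itself:)

```cpp
// Sketch: repeatedly initialize and clean up s2n, roughly the scenario the
// driver's test suite produces through many InitAPI/ShutdownAPI cycles.
#include <cstdio>
#include <s2n.h>

int main()
{
    for (int i = 0; i < 100; i++) {
        if (s2n_init() != S2N_SUCCESS) {
            std::fprintf(stderr, "iteration %d: %s\n", i,
                         s2n_strerror_debug(s2n_errno, "EN"));
            return 1;
        }
        if (s2n_cleanup() != S2N_SUCCESS) {
            std::fprintf(stderr, "iteration %d: cleanup failed\n", i);
            return 1;
        }
    }
    return 0;
}
```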
By adding debug info to the function that calls ENGINE_remove(), I see that one ENGINE_remove is not executed, which causes this problem. It is related to our multi-threaded execution, even though we already have protection to execute Aws::InitAPI() and Aws::ShutdownAPI() only once. We need to investigate the root cause behind that, but at least we have a workaround.
Security issue notifications
If you discover a potential security issue in s2n we ask that you notify
AWS Security via our vulnerability reporting page. Please do not create a public github issue.
Problem:
I am using the AWS SDK to develop an ODBC driver. Our code runs in CI on GitHub. The test works fine locally on Ubuntu 20.04, and it also ran successfully on GitHub until Aug 9, but since yesterday the test on GitHub has started to fail as below. The failure comes from Aws::InitAPI(); as that is AWS SDK code, we have no idea what is wrong. From the error info it seems /dev/urandom could not be accessed.
One more thing we noticed: aws-c-io, which calls s2n, was updated from 0.10.7 to 0.10.22. Maybe that caused this problem.
--error info
2022-08-12T01:51:39.3781517Z s2n_init() failed: 402653198 (error opening urandom)
2022-08-12T01:51:39.3782986Z Fatal error condition occurred in /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/vcpkg/buildtrees/aws-c-io/src/f2a46b436e-18f356d7b4.clean/source/s2n/s2n_tls_channel_handler.c:187: 0 && "s2n_init() failed"
--call stack
2022-08-12T01:51:39.3864476Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(aws_backtrace_print+0xba) [0x7f0212002fb8]
2022-08-12T01:51:39.3865476Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(aws_fatal_assert+0x5f) [0x7f0211feea7e]
2022-08-12T01:51:39.3866464Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(aws_tls_init_static_state+0x115) [0x7f0211e4d981]
2022-08-12T01:51:39.3867438Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(aws_io_library_init+0x61) [0x7f0211e40f1a]
2022-08-12T01:51:39.3868405Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(aws_mqtt_library_init+0x31) [0x7f0211dc91e3]
2022-08-12T01:51:39.3869332Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(+0xdbce06) [0x7f0211dabe06]
2022-08-12T01:51:39.3870707Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(_ZN3Aws3Crt9ApiHandleC1EP13aws_allocator+0x4e) [0x7f0211dabea6]
2022-08-12T01:51:39.3871805Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(ZN3Aws3NewINS_3Crt9ApiHandleEJP13aws_allocatorEEEPT_PKcDpOT0+0x5a) [0x7f021198d787]
2022-08-12T01:51:39.3872982Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(_ZN3Aws13InitializeCrtEv+0x37) [0x7f021198d3df]
2022-08-12T01:51:39.3873996Z /home/runner/work/amazon-timestream-odbc-driver/amazon-timestream-odbc-driver/src/../build/odbc/lib/libtimestream-odbc.so.2.13.0.58917(_ZN3Aws7InitAPIERKNS_10SDKOptionsE+0x2b) [0x7f0211989470]
This error comes from aws-c-io and s2n and is not controlled from outside, so it seems to be a bug here. Please check.