PHP-FPM segfaults with Opcache enabled with Late Static Binding #9396

Open
vityank opened this issue Aug 22, 2022 · 12 comments

@vityank
Contributor

vityank commented Aug 22, 2022

Description

PHP-FPM crashes when OPcache is enabled (even if I disable all of the low 16 bits of the optimization flags in opcache.optimization_level) and a certain late static binding pattern is involved.

  • Only PHP-FPM is affected. The main PHP binary works fine (with opcache.enable_cli=1).
  • PHP-FPM works fine if OPcache is completely disabled (opcache.enable=0).
  • PHP-FPM 7.3 with OPcache works fine; 8.0 crashes.
  • The specific case in dev crashes in ZEND_FETCH_CLASS_CONSTANT_SPEC_UNUSED_CONST_HANDLER, but core dumps from our production show other handlers from Zend/zend_vm_execute.h crashing the same way.

Unfortunately, I can't create a minimal working test case (I tried my best).
The only thing I know is that a pattern like this eventually causes it (note the constant that gets its initial value via 'self' and is later used with LSB via 'static'):

class TestClass
{
    const PHOENIX = '/usr/local/bin/grep';
    const CON1 = 'propose';
    const CON2 = self::CON1;
    const CON3 = 'r2';

    static public function crash($cmd, $params)
    {
        $paramsCmd = '';
        $fullCmd = static::CON3." '{$cmd}' {$paramsCmd}";
        $escalateOptimizer = ' '.static::CON2.' / '.static::CON3;
        return 13;
    }
}

Segmentation fault info:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000665bc6 in ZEND_FETCH_CLASS_CONSTANT_SPEC_UNUSED_CONST_HANDLER ()
    at PHP_BUILD_ROOT/php-8.0.22/Zend/zend_vm_execute.h:32987
32987                           if (EXPECTED(CACHED_PTR(opline->extended_value) == ce)) {
(gdb) bt
#0  0x0000000000665bc6 in ZEND_FETCH_CLASS_CONSTANT_SPEC_UNUSED_CONST_HANDLER ()
    at PHP_BUILD_ROOT/php-8.0.22/Zend/zend_vm_execute.h:32987
#1  0x00000000006971ac in execute_ex (ex=0x7feb82c141d0)
    at PHP_BUILD_ROOT/php-8.0.22/Zend/zend_vm_execute.h:58077
#2  0x000000000069b692 in zend_execute (op_array=0x7feb82c02000, return_value=<optimized out>)
    at PHP_BUILD_ROOT/php-8.0.22/Zend/zend_vm_execute.h:59499
#3  0x0000000000635e2b in zend_execute_scripts (type=-2101263920, type@entry=8, retval=retval@entry=0x0,
    file_count=file_count@entry=3) at PHP_BUILD_ROOT/php-8.0.22/Zend/zend.c:1694
#4  0x00000000005d53a8 in php_execute_script (primary_file=primary_file@entry=0x7ffeda60fae0)
    at PHP_BUILD_ROOT/php-8.0.22/main/main.c:2543
#5  0x00000000004408e3 in main (argc=<optimized out>, argv=<optimized out>)
    at PHP_BUILD_ROOT/php-8.0.22/sapi/fpm/fpm/fpm_main.c:1914
(gdb) print ce
$1 = (zend_class_entry *) 0x42d0e058
(gdb) print opline->extended_value
$2 = 0
(gdb) print *opline
$3 = {handler = 0x6971a7 <execute_ex+21591>, op1 = {constant = 515, var = 515, num = 515, opline_num = 515,
    jmp_offset = 515}, op2 = {constant = 4294967152, var = 4294967152, num = 4294967152,
    opline_num = 4294967152, jmp_offset = 4294967152}, result = {constant = 128, var = 128, num = 128,
    opline_num = 128, jmp_offset = 128}, extended_value = 0, lineno = 27, opcode = 181 '\265',
  op1_type = 0 '\000', op2_type = 1 '\001', result_type = 2 '\002'}
(gdb)

PHP Version

PHP 8.0.22/8.0.23/8.0.24

The 8.1 tree does not seem to be affected (tested on 8.1.11).

Operating System

CentOS 7

@vityank
Contributor Author

vityank commented Oct 11, 2022

My further investigation so far (after expanding the EX and CACHED_PTR macros) points to a use-after-free-like issue, with (execute_data)->run_time_cache holding an address that is no longer valid:

(gdb) print ((execute_data)->run_time_cache)
$30 = (void **) 0x7f5ba20237c0
(gdb) print ((void**)((char*)((execute_data)->run_time_cache) + (opline->extended_value)))
$31 = (void **) 0x7f5ba20237c0
(gdb) print ((void**)((char*)((execute_data)->run_time_cache) + (opline->extended_value)))[0]
Cannot access memory at address 0x7f5ba20237c0

Other findings:
The problem does not occur if opcache.consistency_checks is set to any value other than 0.
The opcache.preferred_memory_model setting has no effect.
Setting opcache.protect_memory to 1 makes the crash reproducible on the CLI as well, though in a different place: ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER.
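
To make the expanded macro easier to follow, here is a minimal C sketch of what the handler does with the cache pointer. It is a simplified model, not the php-src code: `cache_slot_addr` and `cached_ptr` are hypothetical names, and the only thing taken from the gdb session above is the shape of the expression (base pointer plus a byte offset, then a read of element 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of the cache slot: base pointer plus a byte offset
 * (the role opline->extended_value plays in the gdb expression above). */
void **cache_slot_addr(void **run_time_cache, uint32_t byte_offset)
{
    return (void **)((char *)run_time_cache + byte_offset);
}

/* Read the cached pointer stored in that slot. If run_time_cache itself
 * is stale, this read is the first access that faults. */
void *cached_ptr(void **run_time_cache, uint32_t byte_offset)
{
    return cache_slot_addr(run_time_cache, byte_offset)[0];
}
```

With a byte offset of 0 the slot address equals the base pointer, which would be consistent with $30 and $31 showing the same address and the read failing at exactly that address.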

@vityank
Contributor Author

vityank commented Oct 12, 2022

testConstOptimizerBug.zip

I managed to put together a small test case that reproduces part of it with opcache.protect_memory=1.
Maybe it will reveal the cause of the instability in the optimizer-generated code.

@cmb69, I hope you'll be able to reproduce the crash with it without extra php.ini changes (except for the one mentioned).
It looks like only 8.0.x is affected (8.1 is not).

@vityank
Contributor Author

vityank commented Oct 28, 2022

The just-released 8.0.25 is still affected (as expected from the changelog and the silence here).
This seems to have fallen under the radar, so @cmb69 or maybe @nikic, I'd be glad if you could find some time to look at this one (the reproduction script is in the previous message).

@TysonAndre
Contributor

TysonAndre commented Nov 8, 2022

I can reproduce with https://github.com/php/php-src/files/9763190/testConstOptimizerBug.zip and USE_ZEND_ALLOC=1 gdb -args php -d zend_extension=opcache -d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.enable=1 testConstOptimizerBug.php in the latest php 8.0.

The commit bd98d84 "Reorder conditions and always mark methods in SHM as ZEND_ACC_IMMUTABLE" may be relevant to why this is fixed in php 8.1, but that's a big guess; there are many other things it could be, and I'm only somewhat familiar with this code.

  1. I'm not familiar with the policy here - but backporting the patch might affect extensions (performance monitoring tools, debuggers/zend_extensions replacing the interpreter such as xdebug) that are affected by internal implementation details of the php compiler and op arrays

    If backporting the patch causes (or exposes) new bugs, there won't be another bug fix release to fix those.

  2. If it's a race condition, adding write locking here before initializing the run time cache pointer in php 8.0 might help? Though again, I haven't actually confirmed what the issue is

  3. Is it something opcache is failing to initialize for the run_time_cache__ptr? If so, that may be more straightforward to fix (again, not sure what the issue is yet, and brainstorming ideas)

  4. This could be a red herring if there are multiple bugs

(gdb) run
Starting program: /path/to/php-8.0.26-debug-install/bin/php -d zend_extension=opcache -d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.enable=1 testConstOptimizerBug.php
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x0000555555974338 in init_func_run_time_cache_i (op_array=0x408bf010) at /path/to/php-src/Zend/zend_execute.c:3678
3678            ZEND_MAP_PTR_SET(op_array->run_time_cache, run_time_cache);
(gdb) bt
#0  0x0000555555974338 in init_func_run_time_cache_i (op_array=0x408bf010) at /path/to/php-src/Zend/zend_execute.c:3678
#1  0x000055555597435a in init_func_run_time_cache (op_array=0x408bf010) at /path/to/php-src/Zend/zend_execute.c:3684
#2  0x0000555555987328 in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER () at /path/to/php-src/Zend/zend_vm_execute.h:6671
#3  0x00005555559ed18c in execute_ex (ex=0x7ffff7a14020) at /path/to/php-src/Zend/zend_vm_execute.h:55756
#4  0x00005555559f1b7b in zend_execute (op_array=0x7ffff7a5d3c0, return_value=0x0) at /path/to/php-src/Zend/zend_vm_execute.h:59523
#5  0x0000555555940502 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /path/to/php-src/Zend/zend.c:1694
#6  0x00005555558a13d0 in php_execute_script (primary_file=0x7fffffffc790) at /path/to/php-src/main/main.c:2545
#7  0x0000555555a32cb0 in do_cli (argc=10, argv=0x555556221b20) at /path/to/php-src/sapi/cli/php_cli.c:949
#8  0x0000555555a33d0c in main (argc=10, argv=0x555556221b20) at /path/to/php-src/sapi/cli/php_cli.c:1337


(gdb) print op_array->run_time_cache__ptr
$5 = (void ***) 0x408bf0f8
(gdb) set {int}0x408bf0f8=0
Cannot access memory at address 0x408bf0f8
(gdb) print *(op_array->run_time_cache__ptr)
$6 = (void **) 0x0

https://www.php.net/supported-versions.php

php 8.0 has active bug fix support until 26 Nov 2022 | in 17 days - and I'd personally consider this a bug fix rather than a security fix (the end user wrote the code that used late static binding, not an attacker)

@cmb69
Member

cmb69 commented Nov 8, 2022

php 8.0 has active bug fix support until 26 Nov 2022 | in 17 days

Right. While PHP 8.0.26 will be a regular bug fix release, 8.0.26RC1 is supposed to be tagged today, so this is likely too late for this issue to be fixed. I suggest to close the ticket as WONTFIX; those who are affected by this issue should better update to PHP 8.1.

@TysonAndre
Contributor

TysonAndre commented Nov 8, 2022

https://www.npopov.com/2021/10/13/How-opcache-works.html#map-pointers

I also see that in php 8.1, this example is pointing into immutable memory through an offset (map_ptr & 1) == 1, and in php 8.0, it was a pointer into read-only memory (hadn't checked whether that is shared memory or a corrupted pointer, though I suspect shared memory with it only crashing with opcache.protect_memory=1)

  • Why would a run_time_cache pointer even get mutated in shared memory in php 8.0? Wouldn't that cause problems with multiple processes (pcntl_fork, apache worker mpm, etc) in shared memory (e.g. with race conditions)?
For mutable memory: map_ptr & 1 == 0

map pointer ----> indirection pointer -----> static variables
                  (arena allocated)


For immutable memory: map_ptr & 1 == 1

map base pointer: slot 0
                  slot 1
    + map offset: slot 2 -----> static variables
                  slot 3
(gdb) print op_array->run_time_cache__ptr
$1 = (void ***) 0x791
(gdb) bt
#0  init_func_run_time_cache_i (op_array=0x408be830) at /path/to/php-src/Zend/zend_execute.c:3948
#1  0x0000555555d264c1 in init_func_run_time_cache (op_array=0x408be830) at /path/to/php-src/Zend/zend_execute.c:3956
#2  0x0000555555d39946 in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER () at /path/to/php-src/Zend/zend_vm_execute.h:6846
#3  0x0000555555da1cca in execute_ex (ex=0x5555571353c0) at /path/to/php-src/Zend/zend_vm_execute.h:56351
#4  0x0000555555da66db in zend_execute (op_array=0x5555570d9270, return_value=0x0) at /path/to/php-src/Zend/zend_vm_execute.h:60123
#5  0x0000555555cef0c3 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /path/to/php-src/Zend/zend.c:1813
#6  0x0000555555c4bda1 in php_execute_script (primary_file=0x7fffffffc780) at /path/to/php-src/main/main.c:2539
#7  0x0000555555e63f34 in do_cli (argc=10, argv=0x555556e69e60) at /path/to/php-src/sapi/cli/php_cli.c:965
#8  0x0000555555e6503c in main (argc=10, argv=0x555556e69e60) at /path/to/php-src/sapi/cli/php_cli.c:1367
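
The two layouts in the diagram can be sketched in C as follows. This is a toy model under stated assumptions: `process_state`, `map_ptr_is_offset` and `map_ptr_slot` are made-up names rather than the real ZEND_MAP_PTR macros, and the tag bit is simply subtracted before the offset is applied to a per-process base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-process state: each process owns its own table of slots. */
typedef struct {
    void **map_base;
} process_state;

/* Low bit set: the value is a tagged offset, not an address. */
int map_ptr_is_offset(uintptr_t map_ptr)
{
    return (map_ptr & 1) != 0;
}

/* Resolve a map pointer to the slot that holds the real pointer. */
void **map_ptr_slot(const process_state *ps, uintptr_t map_ptr)
{
    if (map_ptr_is_offset(map_ptr)) {
        /* Immutable case: strip the tag bit (a simplification of this
         * sketch) and index into this process's own table. */
        return (void **)((char *)ps->map_base + (map_ptr - 1));
    }
    /* Mutable case: the value is the address of an indirection pointer. */
    return (void **)map_ptr;
}
```

Because the offset form is resolved against a per-process base, the same value (such as the 0x791 above) can sit in shared memory without any process having to write to it. The pointer form is only safe if the indirection pointer it targets is private to the process.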

Right. While PHP 8.0.26 will be a regular bug fix release, 8.0.26RC1 is supposed to be tagged today, so this is likely too late for this issue to be fixed. I suggest to close the ticket as WONTFIX; those who are affected by this issue should better update to PHP 8.1.

I forgot about the RC builds. Agreed, even if a fix was ready by then I don't expect that reviewers would be confident enough to approve it


I wasn't sure of this from the original ticket - was that a segfault that happened some of the time or all of the time?

@TysonAndre
Contributor

TysonAndre commented Nov 8, 2022

I checked and found that the crash (with opcache.protect_memory=1) also occurs in php 7.4 (tested with 7.4.31-dev), which stopped receiving bug fix support 11 months ago - https://www.php.net/supported-versions.php (same stack trace of init_func_run_time_cache and ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER in gdb)

gdb -args php --no-php-ini -d zend_extension=opcache.so -d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.enable=1 ~/Downloads/php-issue-9396-segfault/testConstOptimizerBug.php

(the crash with protect_memory=1 is something I suspect is a symptom of a race condition that would possibly cause memory corruption in php 7.4 and 8.0 for late static binding)

@TysonAndre
Contributor

Looking at zend_persist_class_method, it just seems wrong. For the class in php 8.0, it has ZCG(is_immutable_class) == 0 (from late static binding?)

I added debugging code, and it's initializing run_time_cache__ptr to a pointer within the shared memory arena, rather than to the offset corresponding to that pointer (ZEND_MAP_PTR_INIT(op_array->run_time_cache, ZCG(arena_mem));)

Changing to an offset might fix that, but I'm not clear on why the non-immutable class case was using arena_mem in the first place

	if (ZCG(is_immutable_class)) {
		op_array->fn_flags |= ZEND_ACC_IMMUTABLE;
		ZEND_MAP_PTR_NEW(op_array->run_time_cache);
		fprintf(stderr, "imm run_time_cache=%p\n", op_array->run_time_cache__ptr);
		if (op_array->static_variables) {
			ZEND_MAP_PTR_NEW(op_array->static_variables_ptr);
		}
	} else {
		ZEND_MAP_PTR_INIT(op_array->run_time_cache, ZCG(arena_mem));
		fprintf(stderr, "arena_mem=%p run_time_cache=%p\n", ZCG(arena_mem), op_array->run_time_cache__ptr);
		ZCG(arena_mem) = (void*)(((char*)ZCG(arena_mem)) + ZEND_ALIGNED_SIZE(sizeof(void*)));
		ZEND_MAP_PTR_SET(op_array->run_time_cache, NULL);
	}

@TysonAndre
Contributor

I have to wonder if something is causing the arena_mem to point into shared memory (zend_shared_alloc) instead of per-request (emalloced) memory in php 8.0. E.g. per-request would be the emalloced arena from zend_arena_alloc: ZCG(arena_mem) = zend_arena_alloc(&CG(arena), persistent_script->arena_size);

Possibly something to do with inheritance, since zend_accel_inheritance_cache_add changes ZCG(mem) to point into shared memory rather than per-request memory - the inheritance cache was changed in php 8.1, so I don't know if it no longer happens for all cases or just in the specific case

ext/opcache/ZendAccelerator.c

ZCG(mem) = zend_shared_alloc(memory_used + 64);

ext/opcache/zend_persist.c

script->arena_mem = ZCG(arena_mem) = ZCG(mem);
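
To illustrate why it matters where arena_mem points, here is a small C sketch of a bump allocation like the one in the else-branch quoted earlier. The names (`carve_slot`, `in_region`, `TOY_ALIGNED_SIZE`) are hypothetical and the 8-byte alignment is an assumption; the point is only that the slot ends up in whatever buffer the cursor was pointing into.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a size up to a multiple of 8 (assumed alignment for this sketch). */
#define TOY_ALIGNED_SIZE(n) (((size_t)(n) + 7u) & ~(size_t)7u)

/* Carve one pointer-sized slot out of the region the cursor points into,
 * advance the cursor, and clear the slot. */
void **carve_slot(char **arena_mem)
{
    void **slot = (void **)*arena_mem;
    *arena_mem += TOY_ALIGNED_SIZE(sizeof(void *));
    *slot = NULL;
    return slot;
}

/* True if p lies inside the buffer [base, base + size). */
int in_region(const void *p, const char *base, size_t size)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)base;
    return a >= b && a < b + size;
}
```

If the cursor was seeded from a shared-memory allocation, every slot carved this way is shared between processes too, regardless of what the slot is later used for.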

@TysonAndre
Contributor

Anyway, my best guess as to why this crashes (in 7.4 and 8.0) is that the inheritance code causes the memory to be shared memory instead of per-request memory when this is compiled:

  • op_array is in shared memory
  • op_array->run_time_cache__ptr is a pointer into shared memory (because of the inheritance and late static binding in the test case), when it should instead be an offset or maybe(???) null
    • I'm still confused about the difference in handling for immutable classes and why there's different handling. It may or may not be safe to use ZEND_MAP_PTR_NEW but I'm not familiar with the code
  • When process A of fpm uses the run_time_cache__ptr, it writes *(op_array->run_time_cache__ptr) = ADDRESS_IN_LOCAL_MEMORY_OF_PROCESS_A - this is what opcache.protect_memory=1 is properly catching
  • When process B of fpm uses the run_time_cache__ptr, it appears as if op_array->run_time_cache__ptr was initialized because it's non-null. But it's improperly set up, because op_array->run_time_cache__ptr was an address in shared memory, which shouldn't happen but did due to this bug. So it tries to access memory that would be valid in process A's address space but, in process B, either points to the wrong data (undefined behavior) or, preferably, crashes quickly before it can misbehave
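
The process A / process B scenario can be reproduced in miniature with plain POSIX calls. This is only a sketch of the mechanism guessed at above, not opcache code: `shared_slot` and `slot_leaks_across_processes` are made-up names, an anonymous MAP_SHARED mapping stands in for opcache's shared memory, and the parent plays the role of process B.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a cache slot that should be per-process
 * but was placed in memory shared by all workers. */
typedef struct {
    void *run_time_cache;
} shared_slot;

/* Returns 1 if process B sees a slot already set by process A,
 * 0 if the slot is still empty, -1 on setup failure. */
int slot_leaks_across_processes(void)
{
    shared_slot *slot = mmap(NULL, sizeof(*slot), PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED) {
        return -1;
    }
    slot->run_time_cache = NULL;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(slot, sizeof(*slot));
        return -1;
    }
    if (pid == 0) {
        /* Process A: lazily "initializes" the cache with a pointer into
         * its own private heap and stores it in the shared slot. */
        if (slot->run_time_cache == NULL) {
            slot->run_time_cache = malloc(64);
        }
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(slot, sizeof(*slot));
        return -1;
    }

    /* Process B (the parent): it never allocated a cache, yet the slot is
     * non-NULL. Dereferencing it would be undefined behavior, so the
     * sketch only inspects the pointer value. */
    int leaked = slot->run_time_cache != NULL;
    munmap(slot, sizeof(*slot));
    return leaked;
}
```

The function only reports whether the stale value is visible; in a real worker the next step would be to use that pointer, which is where a crash or silent corruption would come from.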

@vityank
Contributor Author

vityank commented Nov 8, 2022

TysonAndre
Oh, great breakdown of the issue internals and possible causes.

I wasn't sure of this from the original ticket - was that a segfault that happened some of the time or all of the time?

When running with our standard production configuration with OPcache (of course, without opcache.protect_memory=1) and PHP-FPM, the crashes start by the n-th request at the latest, where n is the number of PHP-FPM child processes, and then it crashes constantly on every following request, which probably matches your conclusion here:

When process A of fpm uses the run_time_cache__ptr, it writes *(op_array->run_time_cache__ptr) = ADDRESS_IN_LOCAL_MEMORY_OF_PROCESS_A - this is what opcache.protect_memory=1 is properly catching

I checked and found that the crash (with opcache.protect_memory=1) also occurs in php 7.4 (tested with 7.4.31-dev)

This is indeed a very interesting find. I didn't test it with our PHP 7.4 binaries (which I stopped updating since we chose 8.0 as the upgrade target from our mainline 7.3), and was almost sure it was a PHP 8.0 regression... Maybe analyzing the optimizer differences between 7.3 and 7.4 will shed some light on it and find the original change that broke it, and possibly allow fixing it without changing allocation targets from SHM to heap or making similar large and undesirable changes to the engine.

Thanks.

cmb69

those who are affected by this issue should better update to PHP 8.1.

If only it were that easy... PHP version upgrades are a long, complicated process in a business. We have been migrating from 7.3 to 8.0 for a whole year now, with several types of servers running just fine on it. However, during the migration of one of the servers, which uses a wider part of our codebase (including inherited classes with late static binding), we started to receive lots of failures and saw tons of php-fpm process segfaults in dmesg, resulting, of course, in an immediate revert to the 'stable-and-proven' PHP 7.3 on this machine and a halt of the migration project until further notice. It got through internal dev QA because most parts there are pretested from the CLI, which has OPcache deactivated...

Anyway, I quite understand the release process and that PHP 8.0 reaches the end of its bug fix support quite soon.
As this issue is deep in the Zend engine, optimizer and FPM internals, I would not be able to fix it myself or backport a fix from newer versions (as PHP 8.1 is already unaffected).

@mophilly

I am still having WordPress sites hang on a segfault in PHP-FPM. This occurs after the site has been running for a while. I have run the sites on PHP 8.0, 8.1, 8.2 and 8.3. None stay up, but 8.2 throws the error sooner than 8.3.
