Downstream all aarch64-related patches from vanilla LuaJIT repo #5629
igormunkin
added a commit
that referenced
this issue
Apr 27, 2021
This patch fixes an inaccuracy in the Tarantool build configuration introduced by commit 07c83aa ('build: adjust LuaJIT build system'). The macOS-related tweaks for the __PAGEZERO size and the preferred load address of the bundle are necessary only for builds with a 32-bit GC area on a 64-bit host. The only case fitting these conditions is x86_64 without LUAJIT_ENABLE_GC64. All other 64-bit builds use a 64-bit GC area unconditionally. Part of #5983 Needed for #5629 Follows up #4862 Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
that referenced
this issue
Apr 28, 2021
This patch fixes an inaccuracy in the Tarantool build configuration introduced by commit 07c83aa ('build: adjust LuaJIT build system'). The macOS-related tweaks for the __PAGEZERO size and the preferred load address of the bundle are necessary only for builds with a 32-bit GC area on a 64-bit host. The only case fitting these conditions is x86_64 without LUAJIT_ENABLE_GC64. All other 64-bit builds use a 64-bit GC area unconditionally. Part of #5983 Needed for #5629 Follows up #4862 Reviewed-by: Sergey Kaplun <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
that referenced
this issue
Apr 28, 2021
This patch fixes an inaccuracy in the Tarantool build configuration introduced by commit 07c83aa ('build: adjust LuaJIT build system'). The macOS-related tweaks for the __PAGEZERO size and the preferred load address of the bundle are necessary only for builds with a 32-bit GC area on a 64-bit host. The only case fitting these conditions is x86_64 without LUAJIT_ENABLE_GC64. All other 64-bit builds use a 64-bit GC area unconditionally. Part of #5983 Needed for #5629 Follows up #4862 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Nikita Pettik <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
to tarantool/luajit
that referenced
this issue
Apr 28, 2021
Relates to tarantool/tarantool#5629 Needed for tarantool/tarantool#5983 Follows up tarantool/tarantool#4862 Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Apr 28, 2021
Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
that referenced
this issue
Apr 29, 2021
This patch fixes an inaccuracy in the Tarantool build configuration introduced by commit 07c83aa ('build: adjust LuaJIT build system'). The macOS-related tweaks for the __PAGEZERO size and the preferred load address of the bundle are necessary only for builds with a 32-bit GC area on a 64-bit host. The only case fitting these conditions is x86_64 without LUAJIT_ENABLE_GC64. All other 64-bit builds use a 64-bit GC area unconditionally. Part of #5983 Needed for #5629 Follows up #4862 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Nikita Pettik <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
that referenced
this issue
Apr 30, 2021
This patch fixes an inaccuracy in the Tarantool build configuration introduced by commit 07c83aa ('build: adjust LuaJIT build system'). The macOS-related tweaks for the __PAGEZERO size and the preferred load address of the bundle are necessary only for builds with a 32-bit GC area on a 64-bit host. The only case fitting these conditions is x86_64 without LUAJIT_ENABLE_GC64. All other 64-bit builds use a 64-bit GC area unconditionally. Part of #5983 Needed for #5629 Follows up #4862 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Nikita Pettik <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
that referenced
this issue
Apr 30, 2021
This patch fixes an inaccuracy in the Tarantool build configuration introduced by commit 07c83aa ('build: adjust LuaJIT build system'). The macOS-related tweaks for the __PAGEZERO size and the preferred load address of the bundle are necessary only for builds with a 32-bit GC area on a 64-bit host. The only case fitting these conditions is x86_64 without LUAJIT_ENABLE_GC64. All other 64-bit builds use a 64-bit GC area unconditionally. Part of #5983 Needed for #5629 Follows up #4862 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Nikita Pettik <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]> (cherry picked from commit e50a6d9)
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
May 3, 2021
(cherry picked from commit 2e2fb8f) Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
May 3, 2021
Thanks to Igor Munkin. (cherry picked from commit 521b367) Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Signed-off-by: Igor Munkin <[email protected]>
igormunkin
added a commit
that referenced
this issue
May 3, 2021
Since commit c9d88d5 ('Fix #984: add jit.* library to the binary') all required modules implemented in Lua are bundled (i.e. compiled to the binary as a C literal) into Tarantool executable. To save the memory footprint (this is the only reason I can imagine as a rationale) Lua sources related to unsupported platforms are not bundled. While making Tarantool work on ARM64 hosts, it turned out the module specific for this arch (i.e. jit/dis_arm64.lua) is missing. As a result of this patch, <jit.dump> loads fine on ARM64 platforms. Part of #5983 Relates to #5629 Follows up #984 Signed-off-by: Igor Munkin <[email protected]>
Buristan
pushed a commit
to tarantool/luajit
that referenced
this issue
Sep 8, 2021
Reported by XmiliaH.

(cherry picked from commit 16d38a4)

This patch fixes the regression introduced by commit fa8e7ffefb715abf55dc5b0c708c63251868 ('Add support for full-range 64 bit lightuserdata.'). The maximum lightuserdata segment number is 255, so the high bits of a lightuserdata TValue in that segment are 0xfffe7fff. The same high bits are set for the special control variable that the ISNEXT bytecode places on the stack for the ITERN/ITERC bytecodes. When the ITERN bytecode is despecialized to ITERC and a table has a lightuserdata with the maximum segment number as a key, the special control variable is mistaken for this key and the iteration breaks. This patch forbids using more than 254 lightuserdata segments to avoid the clash with the aforementioned control variable. When a user tries to create a lightuserdata with the 255th segment number, the error "bad light userdata pointer" is raised.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#5629
Buristan
pushed a commit
to tarantool/luajit
that referenced
this issue
Sep 20, 2021
This patch only performs a code movement of lightuserdata interning to <lj_udata.c> file and does nothing else. This patch is backported to simplify syncing with the upstream. Sergey Kaplun: * added the description for the patch Needed for tarantool/tarantool#5629
Buristan
pushed a commit
to tarantool/luajit
that referenced
this issue
Sep 20, 2021
Reported by XmiliaH.

(cherry picked from commit 16d38a4)

This patch fixes the regression introduced by commit fa8e7ffefb715abf55dc5b0c708c63251868 ('Add support for full-range 64 bit lightuserdata.'). The maximum lightuserdata segment number is 255, so the high bits of a lightuserdata TValue in that segment are 0xfffe7fff. The same high bits are set for the special control variable that the ISNEXT bytecode places on the stack for the ITERN/ITERC bytecodes. When the ITERN bytecode is despecialized to ITERC and a table has a lightuserdata with the maximum segment number as a key, the special control variable is mistaken for this key and the iteration breaks. This patch forbids using more than 254 lightuserdata segments to avoid the clash with the aforementioned control variable. When a user tries to create a lightuserdata with the 255th segment number, the error "bad light userdata pointer" is raised.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#5629
igormunkin
added a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
There were issues with configuring LuaJIT on Apple machines, since <LuaJITTestArch> CMake auxiliary routine fails to locate system headers (e.g. assert.h in case when LUA_USE_ASSERT is enabled). As a result platform detection fails and LuaJIT configuration ends with the fatal error. This patch adds the necessary flags to help the routine to find the required system headers. Needed for tarantool/tarantool#6065 Relates to tarantool/tarantool#5629 Follows up tarantool/tarantool#4862 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
(cherry picked from commit 2e2fb8f) After Apple released Macs working on ARM64, the previous recipe in lj_arch.h for detecting various Apple platforms is not valid anymore. Fortunately, there is a system header (namely, TargetConditionals.h), provided by SDK with the proper defines to be set. Starting from this patch, LuaJIT identifies Apple hosts via this header. Since testing machinery assumes that LuaJIT is built with JIT support being enabled unconditionally, a smoke test for it is also added alongside with this patch. Igor Munkin: * added the description and the test for the problem * backported the original patch to tarantool/luajit repo Resolves tarantool/tarantool#6065 Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
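The header-based detection described above can be sketched roughly as follows. This is a minimal illustration, not the actual lj_arch.h code: the helper `apple_target_name` and its return strings are hypothetical, and only the `TARGET_OS_IPHONE` macro from `TargetConditionals.h` is assumed.

```c
#include <stddef.h>

#if defined(__APPLE__) && defined(__MACH__)
#include <TargetConditionals.h>  /* provided by the Apple SDK */
#endif

/* Hypothetical helper: names the Apple target family, or NULL on
 * non-Apple hosts. Not part of LuaJIT. */
static const char *apple_target_name(void)
{
#if defined(__APPLE__) && defined(__MACH__)
#if defined(TARGET_OS_IPHONE) && TARGET_OS_IPHONE
  return "ios";  /* iOS family, as reported by the SDK header */
#else
  return "osx";  /* any other Apple host, e.g. an ARM64 Mac */
#endif
#else
  return NULL;   /* not an Apple platform */
#endif
}
```

The point of the design is that the SDK header, rather than the CPU architecture, decides which Apple platform is being targeted, so an ARM64 Mac is no longer mistaken for an iOS device.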
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Thanks to Igor Munkin. (cherry picked from commit 521b367) This patch fixes the issue introduced by commit 2e2fb8f ('OSX/iOS: Handle iOS simulator and ARM64 Macs.'). Within the mentioned commit LJ_TARGET_IOS define is set via Apple system header to enable several features (e.g. JIT and external unwinder) on ARM64 Macs, but its usage was not adjusted source-wide. This is done for FFI machinery within this commit. All LJ_TARGET_IOS uses in FFI sources are done with LJ_TARGET_ARM64 define being set, so we can simply replace these occurrences with LJ_TARGET_OSX. Igor Munkin: * added the description and the test for the problem Resolves tarantool/tarantool#6066 Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Reported-by: Nikita Pettik <[email protected]> Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Thanks to Javier Guerra Giraldez.

(cherry picked from commit ae20998)

This patch fixes the issue introduced by commits f307d0a ('ARM64: Add build infrastructure and initial port of interpreter.') for arm64 and 73ef845 ('Add special bytecodes for builtins.') for arm and ppc. These commits introduce the new bytecode TSETR for the corresponding architectures. When the table index processed by this bytecode is an integer greater than the asize of the table, the VM falls back to vmeta_tsetr, which calls lj_tab_setinth(lua_State *L, GCtab *t, int32_t key). The VM does not set the first argument, CARG1, to the Lua thread being executed, so it contains an invalid value and the call leads to a crash. This patch adds the missing assignment of CARG1.

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#6084
Part of tarantool/tarantool#5629

Reviewed-by: Sergey Ostanevich <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
(cherry picked from commit e9af1ab)

LuaJIT uses a special NaN-tagging technique to store the internal type on the Lua stack. In case of LJ_GC64, the first 13 bits are set in the special NaN type (0xfff8...). The next 4 bits are used for the internal LuaJIT type of the object on the stack. The next 47 bits are used for storing this object's content. For userdata, it is its address. For arm64, a pointer can have more than 47 significant bits [1]. In this case the BADLU error is raised.

To support full 64-bit range lightuserdata pointers, two new fields are added to GCState:

* `lightudseg` - vector of lightuserdata segments. Each element keeps a 32-bit value: its 25 MSB equal the MSB of the 64-bit lightuserdata address, the rest are filled with zeros. The length of the vector is a power of 2.
* `lightudnum` - the length of the aforementioned vector minus 1 (up to 255).

When a lightuserdata is pushed on the stack and its segment is not stored in the vector yet, a new value is appended to this vector. The maximum number of segments is 256. The BADLU error is raised when a user tries to add a userdata with a new 257th segment, so the whole VA space isn't covered by this patch.

Also, in this patch all internal usage of lightuserdata (for hooks, profilers, built-in package, IR and so on) is changed to special values on the Lua stack. Also, conversion of TValue to FFI C type with store is no longer compiled for lightuserdata.

[1]: https://www.kernel.org/doc/html/latest/arm64/memory.html

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#2712
Needed for tarantool/tarantool#6154
Part of tarantool/tarantool#5629

Reviewed-by: Igor Munkin <[email protected]>
Reviewed-by: Sergey Ostanevich <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
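The bit layout described in this message (13 NaN bits, 4 type bits, 47 payload bits) can be illustrated with a small sketch. All names here are hypothetical and are not LuaJIT identifiers. The segment/offset split assumes a 39-bit offset, which follows from the 25 MSB kept per segment; it is an inference from the message, not code taken from the patch.

```c
#include <stdint.h>

#define PAYLOAD_BITS 47
#define PAYLOAD_MASK ((UINT64_C(1) << PAYLOAD_BITS) - 1)
#define NAN_PREFIX   UINT64_C(0x1fff)               /* 13 set bits: 0xfff8... */
#define OFFSET_BITS  39                             /* 64 - 25 (assumption) */
#define OFFSET_MASK  ((UINT64_C(1) << OFFSET_BITS) - 1)

/* 1 if the pointer fits into the 47-bit payload, 0 otherwise. */
static int fits_payload(uint64_t ptr)
{
  return (ptr & ~PAYLOAD_MASK) == 0;
}

/* Pack a 4-bit type tag and a 47-bit payload under the NaN prefix. */
static uint64_t pack_tagged(unsigned tag4, uint64_t payload)
{
  return (NAN_PREFIX << 51) |
         ((uint64_t)(tag4 & 0xfu) << PAYLOAD_BITS) |
         (payload & PAYLOAD_MASK);
}

static unsigned unpack_tag(uint64_t tv)
{
  return (unsigned)((tv >> PAYLOAD_BITS) & 0xfu);
}

static uint64_t unpack_payload(uint64_t tv)
{
  return tv & PAYLOAD_MASK;
}

/* Segment value as described: a 32-bit word whose 25 MSB are the
 * 25 MSB of the address, with the remaining 7 bits zeroed. */
static uint32_t lightud_segment(uint64_t addr)
{
  return (uint32_t)(addr >> 32) & UINT32_C(0xffffff80);
}

/* Low 39 bits of the address that stay in the payload. */
static uint64_t lightud_offset(uint64_t addr)
{
  return addr & OFFSET_MASK;
}
```

With this split, an 8-bit index into the segment vector plus the 39-bit offset fills the 47-bit payload exactly, which is consistent with the limit of 256 segments stated above.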
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
This reduces overall performance on ARM64, but we have no choice. Linux kernel default userspace VA is 48 bit, but we'd need 47 bit. mremap() ignores address hints due to a kernel API issue. The mapping may move to an undesired address which will cause an assert or crash.

Reported by Raymond W. Ko.

(cherry picked from commit 67dbec8)

LuaJIT requires a 47-bit VA space to keep a GC object pointer in a TValue. For huge blobs that are mapped directly, `mremap()` may move the chunk out of the 47-bit range of the VA space on ARM64. `mremap()` accepts the fifth argument (the new address hint) only with the MREMAP_FIXED flag, and in that case it unmaps any other mapping at the specified address. To avoid this behaviour, this patch prevents `mremap()` from relocating the mapping to a new virtual address by using the CALL_MREMAP_NOMOVE flag instead of CALL_MREMAP_MAYMOVE on the arm64 architecture.

Sergey Kaplun:
* added the description and the test for the problem

Needed for tarantool/tarantool#6154
Part of tarantool/tarantool#5629

Reviewed-by: Igor Munkin <[email protected]>
Reviewed-by: Sergey Ostanevich <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Contributed by Javier Guerra Giraldez.

(cherry picked from commit c785131)

Closed upvalues are never gray. Hence, when a closed upvalue is marked, it is marked as black. Black objects can't refer to white objects, so for storing a white value in a closed upvalue, we need to move the barrier forward and color our value gray by using `lj_gc_barrieruv()`. This function must not be called on closed upvalues with non-white values since there is no need to mark them again.

The USETS bytecode for the arm64 architecture has an incorrect NZCV condition flag value in the instruction that checks whether the upvalue is closed:

  | tst TMP1w, #LJ_GC_WHITES
  | ccmp TMP0w, #0, #0, ne
  | beq <1 // branch out from barrier movement

`TMP0w` contains the `upvalue->closed` field, so the upvalue is open if this field equals zero (the first zero in `ccmp`). The second zero is the value of the NZCV condition flags[1] yielded if the specified condition (`ne`) is not met for the current values of the condition flags[2]. Hence, if the value to be stored is not white (`TMP1w` holds its color), the condition is FALSE and all flag bits are set to zero, so the branch is not taken (the Zero flag is not set). If this happens at the propagate or atomic GC phase, the `lj_gc_barrieruv()` function is called and the gray value to be set is marked as if it were white. That leads to the assertion failure in the `gc_mark()` function.

This patch changes the NZCV condition flag value to 4 (the Zero flag is set) to take the correct branch after the `ccmp` instruction.

Sergey Kaplun:
* added the description and the test for the problem

[1]: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/condition-codes-1-condition-flags-and-codes
[2]: https://developer.arm.com/documentation/dui0801/g/pge1427897656225

Part of tarantool/tarantool#5629

Reviewed-by: Igor Munkin <[email protected]>
Reviewed-by: Sergey Ostanevich <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
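The effect of the NZCV immediate in the `tst`/`ccmp`/`beq` sequence can be checked with a small C model of the flag logic. This is a sketch under assumptions: `skips_barrier` is a hypothetical helper, only the Zero flag is modelled, and the `GC_WHITES` mask value is assumed for illustration.

```c
#include <stdint.h>

#define GC_WHITES 0x03u  /* assumed mask covering both white bits */

/* Models: tst marked, #GC_WHITES; ccmp closed, #0, #nzcv_imm, ne; beq.
 * Returns 1 if `beq` is taken, i.e. the barrier call is skipped. */
static int skips_barrier(uint32_t marked, uint32_t closed, unsigned nzcv_imm)
{
  /* tst: Z is set when the AND result is zero (value is not white). */
  int z = (marked & GC_WHITES) == 0;
  if (!z) {
    /* Condition `ne` holds: ccmp performs the compare closed vs 0. */
    z = (closed == 0);
  } else {
    /* Condition `ne` fails: flags are loaded from the immediate.
     * NZCV is a 4-bit value; Z is bit 2 (value 4). */
    z = (int)((nzcv_imm >> 2) & 1u);
  }
  return z;  /* beq branches when Z is set */
}
```

For a non-white value, `skips_barrier(0x04, 1, 0)` models the behaviour before the fix (the branch is not taken, so the barrier runs), while `skips_barrier(0x04, 1, 4)` models the fixed immediate (the branch is taken). A white value stored into a closed upvalue still reaches the barrier with either immediate.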
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
(cherry picked from commit 2f3f078)

When LuaJIT is built with LJ_FR2 (e.g. with GC64 mode enabled), the information about a frame takes two slots: the first holds the TValue with the function to be called, the second holds the framelink. The JIT recording machinery does much the same: the function IR_KGC is loaded in the first slot, and the second is set to the TREF_FRAME value. This value should be rewritten after the return from a callee. This slot is cleared either by return values or manually (set to zero) when there are no values to return. The latter case is handled by the next bytecode with RA dst mode. This requires the RA destination to take the slot right after TREF_FRAME. Hence, an earlier instruction must use the smallest possible destination register (see `lj_record_ins()` for the details).

The bytecode emitter swaps operands for ISGT and ISGE comparisons. As a result, the aforementioned rule for register allocation may be violated. When this happens for a chunk being recorded, the slot with TREF_FRAME is not rewritten (but the next empty slot after TREF_FRAME is). This leads to JIT slots inconsistency and an assertion failure in `rec_check_slots()` during recording of the next bytecode instruction.

This patch fixes bytecode register allocation by changing the VM register allocation order for the ISGT and ISGE bytecodes.

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#6227
Part of tarantool/tarantool#5629

Reviewed-by: Sergey Ostanevich <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Contributed by Javier Guerra Giraldez.

(cherry picked from commit 9da0653)

When a side trace is assembled, it is linked to its parent trace. For this purpose, the JIT runs through the parent trace mcode and updates the jump instructions targeting the corresponding exitno. Prior to this patch, these instructions were patched unconditionally, which leads to errors if the jump target address is out of the value ranges specified in the ARM64 references[1][2][3][4][5][6].

As a result of the patch, <lj_asm_patchexit> considers the value ranges of the jump targets and directly updates only those instructions that fit the particular jump range. Moreover, the corresponding jump in the pad leading to <lj_vm_exit_handler> is also patched, so the instructions that are not updated directly reach the linked side trace too. Additionally, there is some refactoring of jump target assembling in scope of this patch.

Igor Munkin:
* added the description and the test for the problem

[1]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/B
[2]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/B-cond
[3]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/CBZ
[4]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/CBNZ
[5]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/TBZ
[6]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/TBNZ

Resolves tarantool/tarantool#6098
Part of tarantool/tarantool#5629

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Kirill Yukhin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
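The range check that decides whether a jump can be patched directly can be sketched as below. This is an illustration only, not the code from <lj_asm_patchexit>: the enum and function names are hypothetical. The immediate widths are those of the A64 encodings: 26 bits for B, 19 bits for B.cond, CBZ and CBNZ, and 14 bits for TBZ and TBNZ, each a signed offset counted in 4-byte instructions.

```c
#include <stdint.h>

/* Hypothetical classification of A64 branch instructions. */
typedef enum {
  A64_B,      /* B: imm26 */
  A64_BCOND,  /* B.cond: imm19 */
  A64_CBZ,    /* CBZ/CBNZ: imm19 */
  A64_TBZ     /* TBZ/TBNZ: imm14 */
} A64Branch;

static int a64_imm_bits(A64Branch kind)
{
  switch (kind) {
  case A64_B:     return 26;
  case A64_BCOND: return 19;
  case A64_CBZ:   return 19;
  default:        return 14;
  }
}

/* 1 if a branch of the given kind located at `from` can encode the
 * target `to` (both are byte addresses), 0 otherwise. */
static int a64_branch_in_range(A64Branch kind, int64_t from, int64_t to)
{
  int64_t delta = to - from;
  /* A signed N-bit word offset covers [-2^(N+1), 2^(N+1) - 4] bytes. */
  int64_t lim = (int64_t)1 << (a64_imm_bits(kind) + 1);
  if (delta & 3)
    return 0;  /* targets must be 4-byte aligned */
  return delta >= -lim && delta < lim;
}
```

This gives roughly +/-128 MB for B, +/-1 MB for B.cond, CBZ and CBNZ, and +/-32 KB for TBZ and TBNZ. A jump that fails the check would have to go through the exit pad, as the message describes.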
igormunkin
added a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
There were issues with configuring LuaJIT on Apple machines, since <LuaJITTestArch> CMake auxiliary routine fails to locate system headers (e.g. assert.h in case when LUA_USE_ASSERT is enabled). As a result platform detection fails and LuaJIT configuration ends with the fatal error. This patch adds the necessary flags to help the routine to find the required system headers. Needed for tarantool/tarantool#6065 Relates to tarantool/tarantool#5629 Follows up tarantool/tarantool#4862 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
(cherry picked from commit 2e2fb8f) After Apple released Macs working on ARM64, the previous recipe in lj_arch.h for detecting various Apple platforms is not valid anymore. Fortunately, there is a system header (namely, TargetConditionals.h), provided by SDK with the proper defines to be set. Starting from this patch, LuaJIT identifies Apple hosts via this header. Since testing machinery assumes that LuaJIT is built with JIT support being enabled unconditionally, a smoke test for it is also added alongside with this patch. Igor Munkin: * added the description and the test for the problem * backported the original patch to tarantool/luajit repo Resolves tarantool/tarantool#6065 Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Thanks to Igor Munkin. (cherry picked from commit 521b367) This patch fixes the issue introduced by commit 2e2fb8f ('OSX/iOS: Handle iOS simulator and ARM64 Macs.'). Within the mentioned commit LJ_TARGET_IOS define is set via Apple system header to enable several features (e.g. JIT and external unwinder) on ARM64 Macs, but its usage was not adjusted source-wide. This is done for FFI machinery within this commit. All LJ_TARGET_IOS uses in FFI sources are done with LJ_TARGET_ARM64 define being set, so we can simply replace these occurrences with LJ_TARGET_OSX. Igor Munkin: * added the description and the test for the problem Resolves tarantool/tarantool#6066 Part of tarantool/tarantool#5629 Relates to tarantool/tarantool#5983 Reported-by: Nikita Pettik <[email protected]> Reviewed-by: Sergey Kaplun <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Thanks to Javier Guerra Giraldez. (cherry picked from commit ae20998) This patch fixes the issue introduced by commits f307d0a ('ARM64: Add build infrastructure and initial port of interpreter.') for arm64 and 73ef845 ('Add special bytecodes for builtins.') for arm and ppc. Within the mentioned commits the new bytecode TSETR is introduced for the corresponding architectures. When the new index of the table processed during this bytecode is the integer, that is greater than asize of the table, the VM fallbacks to vmeta_tsetr, for calling lj_tab_setinth(lua_State *L, GCtab *t, int32_t key). The first argument CARG1 is not set in VM to the Lua thread being executed and contains an invalid value, so the mentioned call leads to crash. This patch adds the missed set of CARG1 to the right value. Sergey Kaplun: * added the description and the test for the problem Resolves tarantool/tarantool#6084 Part of tarantool/tarantool#5629 Reviewed-by: Sergey Ostanevich <[email protected]> Reviewed-by: Igor Munkin <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
(cherry picked from commit e9af1ab) LuaJIT uses special NaN-tagging technique to store internal type on the Lua stack. In case of LJ_GC64 the first 13 bits are set in special NaN type (0xfff8...). The next 4 bits are used for an internal LuaJIT type of object on stack. The next 47 bits are used for storing this object's content. For userdata, it is its address. For arm64 a pointer can have more than 47 significant bits [1]. In this case the error BADLU error is raised. For the support of full 64-bit range lightuserdata pointers two new fields in GCState are added: `lightudseg` - vector of segments of lightuserdata. Each element keeps 32-bit value. 25 MSB equal to MSB of lightuserdata 64-bit address, the rest are filled with zeros. The length of the vector is power of 2. `lightudnum` - the length - 1 of aforementioned vector (up to 255). When lightuserdata is pushed on the stack, if its segment is not stored in vector new value is appended to of this vector. The maximum amount of segments is 256. BADLU error is raised in case when user tries to add userdata with the new 257-th segment, so the whole VA-space isn't covered by this patch. Also, in this patch all internal usage of lightuserdata (for hooks, profilers, built-in package, IR and so on) is changed to special values on Lua Stack. Also, conversion of TValue to FFI C type with store is no longer compiled for lightuserdata. [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html Sergey Kaplun: * added the description and the test for the problem Resolves tarantool/tarantool#2712 Needed for tarantool/tarantool#6154 Part of tarantool/tarantool#5629 Reviewed-by: Igor Munkin <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
This reduces overall performance on ARM64, but we have no choice. Linux kernel default userspace VA is 48 bit, but we'd need 47 bit. mremap() ignores address hints due to a kernel API issue. The mapping may move to an undesired address which will cause an assert or crash. Reported by Raymond W. Ko. (cherry picked from commit 67dbec8) 47-bit VA space is required by LuaJIT for keeping a GC object pointer in TValue. In case of huge blobs that are mapped directly, `mremap()` may move the chunk out of 47-bit range of VA space on ARM64. `mremap()` accepts the fifth argument (new address hint) only with MREMAP_FIXED flag. In that case it unmaps any other mapping to specified address. To avoid this behaviour this patch restricts `mremap()` to relocate the mapping to a new virtual address by setting CALL_MREMAP_NOMOVE flag instead of CALL_MREMAP_MAYMOVE for arm64 architecture. Sergey Kaplun: * added the description and the test for the problem Needed for tarantool/tarantool#6154 Part of tarantool/tarantool#5629 Reviewed-by: Igor Munkin <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Contributed by Javier Guerra Giraldez. (cherry picked from commit c785131) Closed upvalues are never gray. Hence, when closed upvalue is marked, it is marked as black. Black objects can't refer white objects, so for storing a white value in a closed upvalue, we need to move the barrier forward and color our value to gray by using `lj_gc_barrieruv()`. This function can't be called on closed upvalues with non-white values since there is no need to mark it again. USETS bytecode for arm64 architecture has the incorrect NZCV condition flag value in the instruction that checks the upvalue is closed: | tst TMP1w, #LJ_GC_WHITES | ccmp TMP0w, #0, #0, ne | beq <1 // branch out from barrier movement `TMP0w` contains `upvalue->closed` field, so the upvalue is open if this field equals to zero (the first one in `ccmp`). The second zero is the value of NZCV condition flags[1] yielded if the specified condition (`ne`) is met for the current values of the condition flags[2]. Hence, if the value to be stored is not white (`TMP1w` holds its color), then the condition is FALSE and all flags bits are set to zero so the branch is not taken (Zero flag is not set). If this happens at propagate or atomic GC phase, the `lj_gc_barrieruv()` function is called and the gray value to be set is marked like if it is white. That leads to the assertion failure in the `gc_mark()` function. This patch changes NZCV condition flag to 4 (Zero flag is set) to take the correct branch after `ccmp` instruction. Sergey Kaplun: * added the description and the test for the problem [1]: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/condition-codes-1-condition-flags-and-codes [2]: https://developer.arm.com/documentation/dui0801/g/pge1427897656225 Part of tarantool/tarantool#5629 Reviewed-by: Igor Munkin <[email protected]> Reviewed-by: Sergey Ostanevich <[email protected]> Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
(cherry picked from commit 2f3f078)

When LuaJIT is built with LJ_FR2 (e.g. with GC64 mode enabled), information about a frame takes two slots -- the first takes the TValue with the function to be called, the second takes the framelink. The JIT recording machinery does pretty much the same -- the function IR_KGC is loaded in the first slot, and the second is set to the TREF_FRAME value. This value should be rewritten after return from a callee. This slot is cleared either by return values or manually (set to zero) when there are no values to return. The latter case is done by the next bytecode with RA dst mode. This requires the destination of RA to take the next slot after TREF_FRAME. Hence, an earlier instruction must use the smallest possible destination register (see `lj_record_ins()` for the details).

The bytecode emitter swaps operands for ISGT and ISGE comparisons. As a result, the aforementioned rule for register allocation may be violated. When this happens for a chunk being recorded, the slot with TREF_FRAME is not rewritten (but the next empty slot after TREF_FRAME is). This leads to JIT slots inconsistency and an assertion failure in `rec_check_slots()` during recording of the next bytecode instruction.

This patch fixes bytecode register allocation by changing the VM register allocation order in case of ISGT and ISGE bytecodes.

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#6227
Part of tarantool/tarantool#5629

Reviewed-by: Sergey Ostanevich <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
igormunkin
pushed a commit
to tarantool/luajit
that referenced
this issue
Jun 16, 2022
Contributed by Javier Guerra Giraldez.

(cherry picked from commit 9da0653)

When a side trace is assembled, it is linked to its parent trace. For this purpose, the JIT runs through the parent trace mcode and updates the jump instructions targeting the corresponding exitno. Prior to this patch, these instructions were patched unconditionally, which leads to errors if the jump target address is outside the value ranges specified in the ARM64 references[1][2][3][4][5][6].

As a result of the patch, <lj_asm_patchexit> considers the value ranges of the jump targets and directly updates only those instructions that fit the particular jump range. Moreover, the corresponding jump in the pad leading to <lj_vm_exit_handler> is also patched, so the instructions that were not updated directly lead to the linked side trace too. Additionally, there is some refactoring of jump target assembling in the scope of this patch.

Igor Munkin:
* added the description and the test for the problem

[1]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/B
[2]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/B-cond
[3]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/CBZ
[4]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/CBNZ
[5]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/TBZ
[6]: https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/TBNZ

Resolves tarantool/tarantool#6098
Part of tarantool/tarantool#5629

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Kirill Yukhin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
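The range check mentioned in this commit message can be illustrated with a short sketch. This is not the code of <lj_asm_patchexit>; the names `BRANCH_IMM_BITS` and `in_branch_range` are hypothetical. It only encodes the A64 immediate widths from the references listed in the commit message (a signed word offset of 26 bits for B, 19 bits for B.cond, CBZ and CBNZ, and 14 bits for TBZ and TBNZ) and tests whether a target is reachable from the branch instruction.

```python
# Width of the signed, word-scaled offset field for each A64 branch kind.
BRANCH_IMM_BITS = {
    "b": 26,       # +/-128 MB
    "b.cond": 19,  # +/-1 MB
    "cbz": 19,
    "cbnz": 19,
    "tbz": 14,     # +/-32 KB
    "tbnz": 14,
}


def in_branch_range(kind, src, target):
    """Check whether a branch of the given kind at `src` can reach `target`.

    Both addresses are byte addresses; A64 instructions are 4 bytes wide,
    so the encoded offset is the byte distance divided by 4.
    """
    delta = target - src
    if delta % 4 != 0:
        return False  # branch targets must be instruction-aligned
    bits = BRANCH_IMM_BITS[kind]
    words = delta // 4
    return -(1 << (bits - 1)) <= words < (1 << (bits - 1))


# A target 1 MB ahead is out of reach for B.cond but reachable with B.
print(in_branch_range("b.cond", 0, 1 << 20), in_branch_range("b", 0, 1 << 20))
```

A patcher along these lines would rewrite an instruction in place only when the check succeeds, and otherwise rely on the patched jump in the exit pad, as the commit message describes.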
There are several issues blocking Tarantool from being used on arm64. One of the showstoppers is #2712. A first step toward arm64 stability is syncing all aarch64-related patches from the vanilla LuaJIT repo, following the procedure developed in the scope of #5534.
Here is the list of related issues: