PANDA 1.0 record cannot handle a record file that is more than 2GB #1

nsapountzis opened this issue Mar 21, 2018 · 3 comments

nsapountzis commented Mar 21, 2018

I am experiencing the following problem with PANDA recording. We use PANDA 1.0.

I record back-to-back record files that each cover 2 minutes of guest execution. It seems that when a record file grows beyond 2GB, there is a casting overflow problem: the host cannot handle it, PANDA record crashes, and of course the guest stops. Specifically, I think that the record size just over 2GB gets "translated" into some thousands of terabytes (due to the casting error), and I get the error:
GLib-ERROR **: build/buildd/glib2.40.2/./glib/gmem.c:103: failed to allocate 18446744071595337090 bytes.

Overall, it seems that PANDA cannot handle a record file larger than 2GB (more precisely, it cannot handle a guest workload whose recording grows past 2GB). Has anyone run into this issue before?

It's really annoying not to be able to record a heavy workload because the record file might exceed 2GB and PANDA crashes.
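
As a minimal sketch of the kind of truncation I suspect (not PANDA code, just an illustration of the arithmetic): a size a bit over 2GB stored in a 32-bit int wraps to a negative value, and converting that negative value back to an unsigned 64-bit size produces a huge number of the same order as the one in the error above.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical record size a bit over 2GB. */
    int64_t record_size = 2418107160LL;

    /* Truncated into a 32-bit int, as a narrow length parameter would do
     * (wraps to a negative value on typical two's-complement platforms). */
    int len = (int)record_size;

    /* Converted back to an unsigned 64-bit size, e.g. for an allocation:
     * the negative value sign-extends to roughly 1.8e19 bytes. */
    size_t alloc_size = (size_t)len;

    printf("len = %d, alloc_size = %zu bytes\n", len, alloc_size);
    return 0;
}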

nsapountzis changed the title from "PANDA record cannot handle a record file that is more than 2GB" to "PANDA 1.0 record cannot handle a record file that is more than 2GB" on Mar 22, 2018

moyix commented Mar 22, 2018

Could you provide a backtrace for this? (e.g. by running under gdb and using bt). That will help narrow down where the int that is too small is.

nsapountzis commented

Hi Moyix,

Thanks a lot for your quick reply. Here is the backtrace of the qemu process when the crash happened:

Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0 0x00007f2c5e4bac13 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
(gdb) bt
#0 0x00007f2c5e4bac13 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1 0x00007f2c5e4bad72 in g_log () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x00007f2c5e4b9644 in g_malloc () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3 0x0000000000609d46 in qemu_sendfile (offset=0, len=-1876860136, src=0x30c8880, dst=0x31e97c0) at savevm.c:750
#4 qemu_concat_section (dst=dst@entry=0x31e97c0, src=src@entry=0x30c8880) at savevm.c:1656
#5 0x000000000060b60b in qemu_savevm_state_begin (mon=mon@entry=0x1de4a20, f=f@entry=0x31e97c0, blk_enable=blk_enable@entry=0, shared=shared@entry=0) at savevm.c:1707
#6 0x000000000060b807 in qemu_savevm_state (mon=mon@entry=0x1de4a20, f=f@entry=0x31e97c0) at savevm.c:1846
#7 0x000000000060c16d in do_savevm_rr (mon=0x1de4a20, name=name@entry=0x7ffeefa4b350 "/home/hari/ReplayServer/records/12681149-rr-snp") at savevm.c:2283
#8 0x00000000006e3fa3 in rr_do_begin_record (file_name_full=<optimized out>, cpu_state=0x1de7e60) at /home/hari/temp/faros/faros/panda/qemu/rr_log.c:1492
#9 0x0000000000536fb8 in main_loop () at /home/hari/temp/faros/faros/panda/qemu/vl.c:1563
#10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/hari/temp/faros/faros/panda/qemu/vl.c:3827

Please do let me know if there is anything more I can provide. We are looking forward to your reply.


moyix commented Mar 24, 2018

Hmm, it looks like the culprit may be this bit of QEMU code:

panda/qemu/savevm.c

Lines 731 to 737 in f758bee

/*
* Sendfile takes an 64-bit len, but qemu_[get|set]_buffer only takes
* 32-bits.
* TODO: handle 64-bits, loop over get and set buffers. Unclear how to fail in the middle,
* since the dst may be write-only, so we can't roll back
*/
static int qemu_sendfile(QEMUFile *dst, QEMUFile *src, int64_t offset, int len)

I'll have to think about how to fix this. Possibly we could detect size > 2GB and split up the section into smaller chunks...
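
For what it's worth, a rough, untested sketch of that chunking idea (qemu_sendfile_chunked is a hypothetical name; it assumes the qemu_fseek/qemu_get_buffer/qemu_put_buffer helpers behave as they do in this tree's savevm.c):

/*
 * Untested sketch, not an actual patch: like qemu_sendfile(), but copies
 * the section in 64MB chunks so every qemu_get_buffer()/qemu_put_buffer()
 * call gets a length that fits comfortably in a 32-bit int.
 */
static int qemu_sendfile_chunked(QEMUFile *dst, QEMUFile *src,
                                 int64_t offset, int64_t len)
{
    const int chunk = 64 * 1024 * 1024;   /* 64MB per pass */
    uint8_t *buf = g_malloc(chunk);
    int64_t remaining = len;

    /* Assumption: qemu_fseek() positions src the same way the original
     * qemu_sendfile() does before copying. */
    qemu_fseek(src, offset, SEEK_SET);

    while (remaining > 0) {
        int want = remaining > chunk ? chunk : (int)remaining;
        int got = qemu_get_buffer(src, buf, want);
        if (got <= 0) {
            /* As the original comment notes, dst may be write-only, so a
             * partial copy cannot be rolled back; just report the error. */
            g_free(buf);
            return -1;
        }
        qemu_put_buffer(dst, buf, got);
        remaining -= got;
    }

    g_free(buf);
    return 0;
}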
