SIGSEGV or SIGBUS when validating #15

Open
AlwxSin opened this issue Nov 23, 2023 · 13 comments

Comments

@AlwxSin

AlwxSin commented Nov 23, 2023

I've run into a strange issue when validating a large number (more than 100) of large XML files (20-130 MB).
It looks like this:

unexpected fault address 0xc0068c3000
fatal error: fault
[signal SIGBUS: bus error code=0x4 addr=0xc0068c3000 pc=0x55dffd]

or

SIGSEGV: segmentation violation
PC=0x7f19813d22222e3 m=43 sigcode=1
signal arrived during cgo execution

The stack trace is always the same:

runtime.cgocall(0x868de0, 0xc000739698)
  /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc000739670 sp=0xc000739638 pc=0x40a68b
github.com/terminalstatic/go-xsd-validate._C2func_cParseDoc(0x7fe986193010, 0x5e6c74f, 0x1)
  _cgo_gotypes.go:254 +0x57 fp=0xc000739698 sp=0xc000739670 pc=0x538d57
github.com/terminalstatic/go-xsd-validate.parseXmlMem.func3(0x7fe986193010?, {0xc020aa0000?, 0x5e6c74f, 0x757e000?}, 0x1?)
  /builds/app/.go/pkg/mod/github.com/terminalstatic/[email protected]/libxml2.go:433 +0x5a fp=0xc0007396e8 sp=0xc000739698 pc=0x5397fa
github.com/terminalstatic/go-xsd-validate.parseXmlMem({0xc020aa0000, 0x5e6c74f, 0x757e000}, 0xfe?)
  /builds/app/.go/pkg/mod/github.com/terminalstatic/[email protected]/libxml2.go:433 +0xa5 fp=0xc000739780 sp=0xc0007396e8 pc=0x5395e5
github.com/terminalstatic/go-xsd-validate.NewXmlHandlerMem({0xc020aa0000?, 0xc00f542010?, 0x0?}, 0x1d?)
  /builds/app/.go/pkg/mod/github.com/terminalstatic/[email protected]/validate_xsd.go:94 +0x29 fp=0xc0007397d8 sp=0xc000739780 pc=0x53a3e9

The problem is that I can't reproduce it on my machine with the same files, and I don't have access to the server where the error occurs.

The error can happen at any time, on any file. I can't reproduce it with one specific file or set of files, only with a large batch of XML files. It can happen on the second file or on the 29th; there is no pattern.

How can I debug or reproduce the error? Maybe there is a bug in the C code?

@terminalstatic
Owner

I've been away from Go and C for quite some time, so YMMV.
If I understand correctly, it only happens on a particular machine? Does the architecture of that machine differ from your dev machine? What OS are that machine and your dev machine running? Do you cross-compile? Which version of Go are you using? Did you try a different (possibly older) one?

@AlwxSin
Author

AlwxSin commented Nov 23, 2023

I think it depends on the data rather than the hardware, because we have tested several setups (Go 1.21 everywhere):

  • my local macOS machine (happened once)
  • dev servers, CentOS (never happened)
  • production servers, CentOS

We do not cross-compile; the binary is built on the same OS where it runs. And we can't try older versions, because we need the updates from #12.

@AlwxSin
Author

AlwxSin commented Nov 23, 2023

Ah, I forgot to mention that we never run production data on local or dev machines.

@terminalstatic
Owner

Understandable, but really hard to debug then.
Maybe far-fetched, but when you handle heaps of data concurrently (do you?), did you play around with go's and memory settings?

@AlwxSin
Author

AlwxSin commented Nov 23, 2023

Yeah, we process XML files concurrently in workers. Could that affect it?

play around with go's and memory settings?

No, default everywhere.

@terminalstatic
Owner

Just to make sure: you are freeing correctly and calling Init and Cleanup only once?

@AlwxSin
Author

AlwxSin commented Nov 23, 2023

I should be. Init runs on startup, and then each worker handles files, validation and cleanup.

@terminalstatic
Owner

Cleanup or Free? Cleanup should only be called when program or part of program exits ...

@AlwxSin
Author

AlwxSin commented Nov 23, 2023

Cleanup on program exit, and Free when each worker's job is done.
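
For readers following along, the lifecycle described above (Init once at startup, Free per job inside each worker, Cleanup once at exit) can be sketched roughly as follows. This is only an illustration of the ordering: `fakeLib`, `fakeHandler`, and `runJobs` are hypothetical stand-ins that count calls, not the go-xsd-validate API. The point it tries to show is that Cleanup should run only after every worker has returned.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeLib stands in for the library's global state. It only counts calls so
// the ordering can be checked; none of this is the real go-xsd-validate API.
type fakeLib struct {
	inits, frees, cleanups atomic.Int64
}

func (l *fakeLib) Init()    { l.inits.Add(1) }
func (l *fakeLib) Cleanup() { l.cleanups.Add(1) }

// fakeHandler stands in for a per-document handler that must be freed.
type fakeHandler struct{ lib *fakeLib }

func (l *fakeLib) NewHandler(doc []byte) *fakeHandler { return &fakeHandler{lib: l} }
func (h *fakeHandler) Free()                          { h.lib.frees.Add(1) }

// runJobs initializes once, frees each handler as soon as its job is done,
// and cleans up only after all workers have finished.
func runJobs(docs [][]byte, workers int) (inits, frees, cleanups int64) {
	lib := &fakeLib{}
	lib.Init() // once, at startup

	jobs := make(chan []byte)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for doc := range jobs {
				h := lib.NewHandler(doc)
				// Validation against a schema handler would happen here.
				h.Free() // per job, inside the worker
			}
		}()
	}
	for _, d := range docs {
		jobs <- d
	}
	close(jobs)

	wg.Wait()     // no handler is in use past this point
	lib.Cleanup() // once, after every worker has returned
	return lib.inits.Load(), lib.frees.Load(), lib.cleanups.Load()
}

func main() {
	i, f, c := runJobs(make([][]byte, 10), 4)
	fmt.Println(i, f, c) // prints: 1 10 1
}
```

If Cleanup were called while a worker still held a handler, the handler would refer to state that has already been torn down, which is why the sketch waits on the `WaitGroup` first.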

@terminalstatic
Owner

Can't really help immediately then, I guess. The only thing I can think of is to fake huge XML requests and concurrency using Go 1.21, but unfortunately I don't have the spare time right now to set this up.
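
One way to fake large XML inputs without touching production data is to generate them. The sketch below is a rough starting point using only the standard library; the element names (`records`, `record`, `value`) and the helpers `writeSyntheticXML`, `generate`, and `wellFormed` are made up for illustration, and a real reproduction would need documents that match the schema actually being validated.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
)

// writeSyntheticXML writes a well-formed document of at least minBytes to w
// and returns the number of bytes written. The element names are arbitrary.
func writeSyntheticXML(w io.Writer, minBytes int) (int, error) {
	written := 0
	emit := func(s string) error {
		n, err := io.WriteString(w, s)
		written += n
		return err
	}
	if err := emit("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<records>\n"); err != nil {
		return written, err
	}
	for i := 0; written < minBytes; i++ {
		rec := fmt.Sprintf("  <record id=\"%d\"><value>payload-%d</value></record>\n", i, i)
		if err := emit(rec); err != nil {
			return written, err
		}
	}
	err := emit("</records>\n")
	return written, err
}

// generate builds a synthetic document in memory.
func generate(minBytes int) ([]byte, error) {
	var buf bytes.Buffer
	if _, err := writeSyntheticXML(&buf, minBytes); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// wellFormed reports whether data parses as XML from start to finish.
func wellFormed(data []byte) bool {
	dec := xml.NewDecoder(bytes.NewReader(data))
	for {
		_, err := dec.Token()
		if err == io.EOF {
			return true
		}
		if err != nil {
			return false
		}
	}
}

func main() {
	// 1 MiB here; raise toward the 20-130 MB range for a realistic test.
	data, err := generate(1 << 20)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data) >= 1<<20, wellFormed(data))
}
```

Writing to an `io.Writer` means the same function can stream straight into a file when the target size is too large to hold comfortably in memory.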

@terminalstatic
Owner

Just a short note: I tested this a little, and I still suspect that the cause could be Go's memory management. You could try playing around with the time parameter of the InitWithGc function and the GOMEMLIMIT environment variable. You could also check your system's ulimit settings.
You could probably also try delaying worker execution and/or turning the concurrency down.
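
As a rough illustration of the GOMEMLIMIT suggestion: the same soft limit can also be set from inside the program with `runtime/debug.SetMemoryLimit` (available since Go 1.19). The helper names and the 4 GiB value below are arbitrary choices for the sketch, not recommendations. Note that, per the `runtime/debug` documentation, this limit does not account for memory managed by non-Go code, so allocations made by the C library are outside its reach.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// applyMemoryLimit sets the Go runtime's soft memory limit in bytes and
// returns the previous limit. This is the programmatic counterpart of the
// GOMEMLIMIT environment variable.
func applyMemoryLimit(limitBytes int64) int64 {
	return debug.SetMemoryLimit(limitBytes)
}

// currentMemoryLimit reports the limit without changing it: a negative
// argument to SetMemoryLimit leaves the limit untouched.
func currentMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	const limit = 4 << 30 // 4 GiB, an arbitrary example value
	prev := applyMemoryLimit(limit)
	fmt.Println("previous limit:", prev)
	fmt.Println("current limit: ", currentMemoryLimit())
}
```

Setting the environment variable instead (for example `GOMEMLIMIT=4GiB`) has the same effect without a rebuild, which may be easier to experiment with on a server.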

The funny thing is that my Mac with Go 1.21 fails pretty soon when testing with a concurrency of 100 and a 100 MB XML file ... my Linux machine with Go 1.17 just chugs along, except that with a concurrency of 100 it sometimes stalls because of CPU load. I'd personally use a lower concurrency setting; at least on my hardware, around 20 performs rather well.
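
A minimal sketch of capping concurrency at a fixed number such as 20, using a buffered channel as a semaphore. `processBounded` and its `work` callback are hypothetical names; in a real program the callback would parse and validate one file. The function also tracks the peak number of jobs in flight so the cap can be verified.

```go
package main

import (
	"fmt"
	"sync"
)

// processBounded calls work for every item with at most limit calls running
// at the same time, and returns the highest concurrency it observed.
func processBounded(items []string, limit int, work func(string)) int {
	sem := make(chan struct{}, limit) // one slot per allowed in-flight job
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		inFlight int
		peak     int
	)
	for _, it := range items {
		sem <- struct{}{} // blocks while limit jobs are already running
		wg.Add(1)
		go func(it string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the job ends

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			work(it)

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(it)
	}
	wg.Wait()
	return peak
}

func main() {
	files := make([]string, 100)
	peak := processBounded(files, 20, func(string) {
		// Parsing and validating one file would happen here.
	})
	fmt.Println(peak <= 20) // prints: true
}
```

With 100 MB documents, the cap also bounds how many documents are held in memory at once, which is one reason a lower setting might behave better than 100.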

I will give this a spin a little later on my Linux machine with Go 1.21 and check whether it makes a difference.

@AlwxSin
Author

AlwxSin commented Nov 24, 2023

Does it fail with the same stack trace?

@terminalstatic
Owner

On the Mac it just fails with a killed signal and no trace; on Linux it never fails.
