java.lang.OutOfMemoryError: Java heap space #64
When executing
https://gist.github.com/raananraz/5c466d6208a0f1429f20
at around 130K objects I am getting
Caused by: java.lang.OutOfMemoryError: Java heap space
With config.setShareReferences(true) previously set, it crashed at around 60K.
any idea?
thanks!
Comments
on a first sight: flush after each object. The internal buffer grows until the first flush, so flush after each large object written. Shared refs are always used (needed for cycles!) [this is the default]. |
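A minimal sketch of that advice, for illustration (the file name and produceLargeObjects() are hypothetical, and this assumes the FSTObjectOutput(OutputStream, FSTConfiguration) constructor):

// sketch: write to a stream and flush after each large object,
// so the internal buffer is drained instead of growing until the first flush
FSTConfiguration conf = FSTConfiguration.createDefaultConfiguration();
try (FileOutputStream fos = new FileOutputStream("big.dat")) { // hypothetical target file
    FSTObjectOutput out = new FSTObjectOutput(fos, conf);
    for (Object obj : produceLargeObjects()) { // hypothetical producer of large objects
        out.writeObject(obj);
        out.flush(); // drain the internal buffer after each object
    }
}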
the problem is the read, not the write |
forgot to remove the |
Currently, during read, the complete file is mirrored in an internal buffer, which makes deserialization of a very large stream impossible. Supporting this would require some internal offsetting work; however, as fst comes from a remoting/network background, that trade-off seemed valid to me at the time :-). The good news is that you can easily work around it by reading/writing a series of "independent" objects. The way your program is built, the object stream builds up a huge object map internally (ref sharing) plus large buffers.

// workaround for fst's eager input-buffer reading issue
// imports: java.io.*, org.nustaq.serialization.FSTConfiguration, org.nustaq.serialization.FSTObjectOutput
public static void main(String[] args) throws Exception {
    File temp = File.createTempFile("test", "dat");
    final int BUFFER_SIZE_IN_BYTES = 10 * 1024 * 1024;
    final int MAX_ITEMS_BEFORE_FLUSH = 10000;
    final int NUMBER_OF_ITEMS = 1000000;
    try {
        FSTConfiguration config = FSTConfiguration.getDefaultConfiguration();
        int numberOfObjects = 0;
        try (FileOutputStream fileOutputStream = new FileOutputStream(temp);
             BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, BUFFER_SIZE_IN_BYTES)) {
            for (int i = 0; i < NUMBER_OF_ITEMS; i++) {
                Object[] arr = new Object[100];
                for (int objIdx = 0; objIdx < arr.length; objIdx++) {
                    arr[objIdx] = "row " + i + " - " + "my object" + objIdx;
                }
                // same as in TCPObjectSocket ..
                FSTObjectOutput objectOutput = config.getObjectOutput(); // could also create a new one, with minor perf impact
                objectOutput.writeObject(arr);
                int written = objectOutput.getWritten();
                // write a 4-byte little-endian length prefix
                bufferedOutputStream.write((written >>> 0) & 0xFF);
                bufferedOutputStream.write((written >>> 8) & 0xFF);
                bufferedOutputStream.write((written >>> 16) & 0xFF);
                bufferedOutputStream.write((written >>> 24) & 0xFF);
                // copy fst's internal buffer to the buffered output
                bufferedOutputStream.write(objectOutput.getBuffer(), 0, written);
                objectOutput.flush();
                numberOfObjects++;
                if (i % MAX_ITEMS_BEFORE_FLUSH == 0) {
                    System.out.println("writing " + i);
                }
            }
        }
        System.out.println("done with write");
        try (FileInputStream fileInputStream = new FileInputStream(temp);
             BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream, BUFFER_SIZE_IN_BYTES)) {
            for (int idx = 0; idx < numberOfObjects; idx++) {
                // read the little-endian length prefix back
                int ch1 = bufferedInputStream.read() & 0xFF;
                int ch2 = bufferedInputStream.read() & 0xFF;
                int ch3 = bufferedInputStream.read() & 0xFF;
                int ch4 = bufferedInputStream.read() & 0xFF;
                int len = (ch4 << 24) + (ch3 << 16) + (ch2 << 8) + (ch1 << 0);
                if (len <= 0)
                    throw new EOFException("stream closed");
                byte[] buffer = new byte[len]; // this could be reused!
                while (len > 0) {
                    int read = bufferedInputStream.read(buffer, buffer.length - len, len);
                    if (read < 0)
                        throw new EOFException("unexpected end of stream");
                    len -= read;
                }
                Object[] row = (Object[]) config.getObjectInput(buffer).readObject();
                if (idx % MAX_ITEMS_BEFORE_FLUSH == 0) {
                    System.out.println("reading " + idx);
                }
            }
        }
        System.out.println("done with read");
    } finally {
        temp.delete();
    }
} |
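(Design note: the 4-byte little-endian length prefix is what makes this streamable; the reader can allocate an exact per-object buffer and hand fst one object's bytes at a time, so fst never has to buffer more than a single object.)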
The code above should solve your issue; however, mid-term I'll improve the handling of streaming serialization (though in general, shared refs require the writer to keep all objects in a map until the stream is closed, so this won't work for huge data structures even once the buffer issue is solved). action items:
|
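A minimal sketch of the corresponding mitigation, assuming the per-object graphs contain no cycles (setShareReferences is the same setter the original report calls):

// disable ref sharing so the writer keeps no identity map across the stream;
// do NOT do this if the serialized object graphs can contain cycles
FSTConfiguration conf = FSTConfiguration.createDefaultConfiguration();
conf.setShareReferences(false);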
further investigation shows you hit Integer.MAX_VALUE on the stream position, so resizing of the internal buffer fails.
A single serialized object stream with size > 1.4 GB hits the integer range wall: the next buffer resize would overshoot Integer.MAX_VALUE (about 2.1 GB), and this is not solvable since all stream positions are int in the JDK as well. So the only improvement will be to ease the mechanics shown above. |
fixed with 2.26 (will be released within a few days)
working (and faster) code (requires 2.26):

public static void main(String[] args) throws Exception {
    File temp = File.createTempFile("test", "dat");
    final int BUFFER_SIZE_IN_BYTES = 10 * 1024 * 1024;
    final int MAX_ITEMS_BEFORE_FLUSH = 10000;
    final int NUMBER_OF_ITEMS = 1000000;
    try {
        FSTConfiguration config = FSTConfiguration.createDefaultConfiguration();
        int numberOfObjects = 0;
        try (FileOutputStream fileOutputStream = new FileOutputStream(temp);
             BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, BUFFER_SIZE_IN_BYTES)) {
            for (int i = 0; i < NUMBER_OF_ITEMS; i++) {
                Object[] arr = new Object[100];
                for (int objIdx = 0; objIdx < arr.length; objIdx++) {
                    arr[objIdx] = "row " + i + " - " + "my object" + objIdx;
                }
                // 2.26+: writes one self-contained block per object
                config.encodeToStream(bufferedOutputStream, arr);
                numberOfObjects++;
                if (i % MAX_ITEMS_BEFORE_FLUSH == 0) {
                    System.out.println("writing " + i);
                }
            }
        }
        System.out.println("done with write");
        try (FileInputStream fileInputStream = new FileInputStream(temp);
             BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream, BUFFER_SIZE_IN_BYTES)) {
            for (int idx = 0; idx < numberOfObjects; idx++) {
                // reads back exactly one object's block per call
                Object[] row = (Object[]) config.decodeFromStream(bufferedInputStream);
                if (idx % MAX_ITEMS_BEFORE_FLUSH == 0) {
                    System.out.println("reading " + idx);
                }
            }
        }
        System.out.println("done with read");
    } finally {
        temp.delete();
    }
} |
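(If I read the 2.26 API right, encodeToStream/decodeFromStream take over the length-prefix framing that the workaround above did by hand, which is why the byte-shuffling code disappears.)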
thanks! |