java.lang.OutOfMemoryError: Java heap space #64

Closed
raananraz opened this issue Apr 29, 2015 · 8 comments

Comments

@raananraz

When executing
https://gist.github.com/raananraz/5c466d6208a0f1429f20

at around 130K objects I get
Caused by: java.lang.OutOfMemoryError: Java heap space

Previously, with
config.setShareReferences(true);
it crashed at around 60K objects.

Any idea?
Thanks!

@RuedigerMoeller
Owner

At first sight: flush after each object. An internal buffer keeps growing until the first flush, so flush after each large object written. Always use shared refs (cycles!); this is the default.

@raananraz
Author

The problem is the read, not the write.

@raananraz
Author

I forgot to remove the
fstObjectInput.reset(); (I was testing things out).
Please try it without that call.

@RuedigerMoeller
Owner

Currently, during read the complete file is mirrored in an internal buffer, which makes serialization of a very large object impossible. Fixing this would require some internal offsetting work; however, as fst comes from a remoting/network background, this design seemed valid to me at the time :-).

The good news is that you can easily work around this by reading and writing a series of "independent" objects. The way your program is built, the object stream builds up a huge internal object map (ref sharing) and large buffers.
The solution is similar to the streaming serialization in TCPObjectSocket (see the fst source code):

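    // workaround for fst's eager input-buffer reading
    // imports needed: java.io.*, org.nustaq.serialization.FSTConfiguration,
    //                 org.nustaq.serialization.FSTObjectOutput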
    public static void main(String[] args) throws Exception {

        File temp = File.createTempFile("test", "dat");

        final int BUFFER_SIZE_IN_BYTES = 10 * 1024 * 1024;
        final int MAX_ITEMS_BEFORE_FLUSH = 10000;
        final int NUMBER_OF_ITEMS = 1000000;

        try {

            FSTConfiguration config = FSTConfiguration.getDefaultConfiguration();
            int numberOfObjects = 0;

            try (FileOutputStream fileOutputStream = new FileOutputStream(temp)) {

                try (BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, BUFFER_SIZE_IN_BYTES)) {
                    for (int i = 0; i < NUMBER_OF_ITEMS; i++) {
                        Object[] arr = new Object[100];

                        for (int objIdx = 0; objIdx < arr.length; objIdx++) {
                            arr[objIdx] = "row " + i + " - " + "my object" + objIdx;
                        }

                        // same as in TCPObjectSocket ..
                        FSTObjectOutput objectOutput = config.getObjectOutput(); // could also do new with minor perf impact
                        objectOutput.writeObject(arr);
                        int written = objectOutput.getWritten();
                        bufferedOutputStream.write((written >>> 0) & 0xFF);
                        bufferedOutputStream.write((written >>> 8) & 0xFF);
                        bufferedOutputStream.write((written >>> 16) & 0xFF);
                        bufferedOutputStream.write((written >>> 24) & 0xFF);

                        // copy internal buffer to bufferedoutput
                        bufferedOutputStream.write(objectOutput.getBuffer(), 0, written);
                        objectOutput.flush();

                        numberOfObjects++;

                        if (i % MAX_ITEMS_BEFORE_FLUSH == 0) {
                            System.out.println("writing " + i);
                        }
                    }
                }
            }

            System.out.println("done with write");

            try (FileInputStream fileInputStream = new FileInputStream(temp)) {

                try (BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream, BUFFER_SIZE_IN_BYTES)) {
                    for (int idx = 0; idx < numberOfObjects; idx++) {
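                        // read the 4-byte little-endian length prefix written above
                        int ch1 = bufferedInputStream.read();
                        int ch2 = bufferedInputStream.read();
                        int ch3 = bufferedInputStream.read();
                        int ch4 = bufferedInputStream.read();
                        if ((ch1 | ch2 | ch3 | ch4) < 0) // read() returns -1 at end of stream
                            throw new EOFException("unexpected end of file in length prefix");
                        int len = (ch4 << 24) + (ch3 << 16) + (ch2 << 8) + (ch1 << 0);
                        if ( len <= 0 )
                            throw new EOFException("client closed");
                        byte buffer[] = new byte[len]; // this could be reused !
                        while (len > 0) {
                            int read = bufferedInputStream.read(buffer, buffer.length - len, len);
                            if (read < 0) // avoid looping forever on a truncated file
                                throw new EOFException("unexpected end of file in object body");
                            len -= read;
                        }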
                        Object[] row = (Object[]) config.getObjectInput(buffer).readObject();
                        if (idx % MAX_ITEMS_BEFORE_FLUSH == 0) {
                            System.out.println("reading " + idx);
                        }
                    }
                }
            }

            System.out.println("done with read");
        }
        finally {
            temp.delete();
        }

    }
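
The framing used in the workaround above (a 4-byte length followed by the serialized bytes of one independent object) does not depend on fst. As a minimal sketch, assuming plain JDK serialization as a stand-in for fst, the same idea can be written with `DataOutputStream` and `DataInputStream`. The class and method names (`LengthPrefixedFrames`, `writeFrame`, `readFrame`) are made up for illustration and are not part of fst.

```java
import java.io.*;
import java.util.Arrays;

class LengthPrefixedFrames {
    /** Serializes one object on its own and writes it as [4-byte length][payload]. */
    static void writeFrame(DataOutputStream out, Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // A fresh ObjectOutputStream per record: no reference map survives between records.
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        }
        out.writeInt(bytes.size()); // big-endian, unlike the hand-rolled little-endian prefix above
        bytes.writeTo(out);
    }

    /** Reads one frame; returns null when the stream ends before a length prefix can be read. */
    static Object readFrame(DataInputStream in) throws IOException, ClassNotFoundException {
        int len;
        try {
            len = in.readInt();
        } catch (EOFException endOfStream) {
            return null;
        }
        if (len <= 0) {
            throw new IOException("invalid frame length " + len);
        }
        byte[] payload = new byte[len];
        in.readFully(payload); // loops internally and throws EOFException if the payload is truncated
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream file = new ByteArrayOutputStream(); // in-memory stand-in for the temp file
        try (DataOutputStream out = new DataOutputStream(file)) {
            for (int i = 0; i < 3; i++) {
                writeFrame(out, new Object[] {"row " + i, i});
            }
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file.toByteArray()));
        Object row;
        while ((row = readFrame(in)) != null) {
            System.out.println(Arrays.toString((Object[]) row));
        }
    }
}
```

Because every record is decoded from its own small byte array, memory use on the read side is bounded by the largest single record rather than by the size of the file.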

@RuedigerMoeller
Owner

The code above should solve your issue. However, midterm I'll improve the handling of streaming serialization (though in general, shared refs require the writer to keep all objects in a map until the stream is closed, so this won't work for huge data structures even if the buffer issue is solved).
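
To illustrate why shared references cost memory on the write side, here is a toy sketch of identity-based reference tracking. It is not fst's implementation; `SharedRefTracker` and its methods are hypothetical names, and the "output" is just a string describing what a writer would emit.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class SharedRefTracker {
    // Object identity -> handle assigned at first write. Entries are only dropped on reset().
    private final Map<Object, Integer> handles = new IdentityHashMap<>();

    /** Returns a description of what a writer would emit for this object. */
    String write(Object value) {
        Integer known = handles.get(value);
        if (known != null) {
            return "ref#" + known; // already written: emit a back-reference only
        }
        int handle = handles.size();
        handles.put(value, handle);
        return "obj#" + handle; // first occurrence: emit the full object
    }

    int trackedObjects() {
        return handles.size();
    }

    /** Starting a new independent record forgets all earlier objects. */
    void reset() {
        handles.clear();
    }

    public static void main(String[] args) {
        SharedRefTracker tracker = new SharedRefTracker();
        String shared = new String("shared");
        System.out.println(tracker.write(shared)); // obj#0
        System.out.println(tracker.write(shared)); // ref#0
        for (int i = 0; i < 1000; i++) {
            tracker.write(new Object());
        }
        System.out.println(tracker.trackedObjects()); // 1001
        tracker.reset();
        System.out.println(tracker.trackedObjects()); // 0
    }
}
```

The map holds a strong reference to every object written since the last reset, which is why writing each row as an independent record keeps it small.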

Action items:

  1. As the workaround is somewhat bulky, add a convenience method to FSTConfiguration (encodeToStream or similar).
  2. Enhance read buffer handling in fst.

@RuedigerMoeller
Owner

Further investigation shows that you hit Integer.MAX_VALUE on the stream position, so resizing the buffer fails because buf.length*2 < 0:

ensureCapacity(Math.max(buf.length * 2, count + chunk_size)); // chunk_size is 5kb

A single serialized object stream larger than 1.4GB hits the integer range wall (not solvable, as all stream positions are also int in the JDK).

So the only improvement will be to ease the mechanics of the workaround code posted earlier.
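
The overflow itself is easy to reproduce in isolation. The sketch below is an illustration only, not fst code: `BufferGrowth`, `naiveDouble`, `grownCapacity` and the `MAX_ARRAY_SIZE` limit are assumptions chosen for the example. It shows doubling an `int` length wrapping to a negative value, and one overflow-aware alternative that computes in `long` and clamps.

```java
class BufferGrowth {
    // Assumed upper bound for array lengths, slightly below Integer.MAX_VALUE.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Naive doubling: wraps to a negative int once length exceeds 2^30 - 1. */
    static int naiveDouble(int length) {
        return length * 2;
    }

    /** Doubles in long arithmetic and clamps; rejects requests that cannot fit in an array. */
    static int grownCapacity(int currentLength, int required) {
        if (required < 0 || required > MAX_ARRAY_SIZE) {
            throw new IllegalArgumentException("array size too large: " + required);
        }
        long doubled = (long) currentLength * 2L;
        long target = Math.max(doubled, (long) required);
        return (int) Math.min(target, (long) MAX_ARRAY_SIZE);
    }

    public static void main(String[] args) {
        int length = 1_500_000_000; // roughly a 1.4 GiB buffer
        System.out.println(naiveDouble(length));                        // -1294967296
        System.out.println(grownCapacity(length, length + 5 * 1024));   // 2147483639
    }
}
```

Clamping only postpones the limit: a single buffer still cannot exceed the `int` range, which is why splitting the data into independent records remains the practical fix.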

@RuedigerMoeller
Owner

Fixed with 2.26 (to be released within a few days):

  • throws a proper exception ("array size too large") when hitting the Integer.MAX_VALUE wall
  • added utility methods encodeToStream and decodeFromStream to FSTConfiguration

Working (and faster) code (requires 2.26):

    public static void main(String[] args) throws Exception {

        File temp = File.createTempFile("test", "dat");

        final int BUFFER_SIZE_IN_BYTES = 10 * 1024 * 1024;
        final int MAX_ITEMS_BEFORE_FLUSH = 10000;
        final int NUMBER_OF_ITEMS = 1000000;

        try {

            FSTConfiguration config = FSTConfiguration.createDefaultConfiguration();
            int numberOfObjects = 0;

            try (FileOutputStream fileOutputStream = new FileOutputStream(temp)) {

                try (BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, BUFFER_SIZE_IN_BYTES)) {
                    for (int i = 0; i < NUMBER_OF_ITEMS; i++) {

                        Object[] arr = new Object[100];
                        for (int objIdx = 0; objIdx < arr.length; objIdx++) {
                            arr[objIdx] = "row " + i + " - " + "my object" + objIdx;
                        }

                        config.encodeToStream(bufferedOutputStream,arr);
                        numberOfObjects++;

                        if (i % MAX_ITEMS_BEFORE_FLUSH == 0) {
                            System.out.println("writing " + i);
                        }
                    }
                }
            }

            System.out.println("done with write");

            try (FileInputStream fileInputStream = new FileInputStream(temp)) {
                try (BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream, BUFFER_SIZE_IN_BYTES)) {
                    for (int idx = 0; idx < numberOfObjects; idx++) {
                        Object[] row = (Object[]) config.decodeFromStream(bufferedInputStream);
                        if (idx % MAX_ITEMS_BEFORE_FLUSH == 0) {
                            System.out.println("reading " + idx);
                        }
                    }
                }
            }

            System.out.println("done with read");
        }
        finally {
            temp.delete();
        }

    }

@raananraz
Author

thanks!
