Skip to content

Commit

Permalink
Fix MpscLinkedQueue GC issues (#7799)
Browse files Browse the repository at this point in the history
* Refactor MpscLinkedQueue.poll

Handle empty queue first, then share most of the
implementation for non-empty scenarios (spin and
non-spin).

* Unlink dead nodes in MpscLinkedQueue

Similar to
https://github.com/JCTools/JCTools/blob/master/jctools-core/src/main/java/org/jctools/queues/MpscLinkedQueue.java#L120,
null out the next pointer in the discarded consumer node
when polling from the queue. If not, we leave behind a (potentially long)
chain of connected garbage nodes. If we're unlucky (for example one of
the early nodes is promoted to old generation, triggering nepotism),
this can cause GC issues as now we have a long linked list which must be
marked by young collections.

Reproducer:

```
import io.reactivex.rxjava3.internal.queue.MpscLinkedQueue;

public class MpscLinkedQueueGC {
    public static void main(String[] args) {
        MpscLinkedQueue<Integer> queue = new MpscLinkedQueue<>();
        for (int i = 0; i < 10; i++) System.gc(); // tenure consumer node
        while (true) {
            queue.offer(123);
            queue.poll();
        }
    }
}
```

```
Before fix:

$ java -Xlog:gc -Xmx1G -cp build/classes/java/main MpscLinkedQueueGC.java
...
[1.261s] GC(20) Pause Young (Normal) (G1 Preventive Collection) 115M->115M(204M) 209.335ms
[1.385s] GC(23) Pause Young (Normal) (G1 Evacuation Pause) 148M->149M(204M) 31.491ms
[1.417s] GC(24) Pause Young (Normal) (G1 Evacuation Pause) 157M->158M(204M) 19.333ms
[1.453s] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 166M->167M(599M) 22.678ms
[1.966s] GC(26) Pause Young (Normal) (G1 Evacuation Pause) 249M->249M(497M) 305.238ms
...

After fix:
$ java -Xlog:gc -Xmx1G -cp build/classes/java/main MpscLinkedQueueGC.java
...
[1.169s] GC(14) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.755ms
[1.558s] GC(15) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.689ms
[1.948s] GC(16) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.800ms
[2.337s] GC(17) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.714ms
...
```

* Revert "Unlink dead nodes in MpscLinkedQueue"

This reverts commit 4c6c6fa.

* Revert "Refactor MpscLinkedQueue.poll"

This reverts commit b8582b9.

* Unlink dead nodes in MpscLinkedQueue

Similar to
https://github.com/JCTools/JCTools/blob/master/jctools-core/src/main/java/org/jctools/queues/MpscLinkedQueue.java#L120,
null out the next pointer in the discarded consumer node
when polling from the queue. If not, we leave behind a (potentially long)
chain of connected garbage nodes. If we're unlucky (for example one of
the early nodes is promoted to old generation, triggering nepotism),
this can cause GC issues as now we have a long linked list which must be
marked by young collections.

Reproducer:

```
import io.reactivex.rxjava3.internal.queue.MpscLinkedQueue;

public class MpscLinkedQueueGC {
    public static void main(String[] args) {
        MpscLinkedQueue<Integer> queue = new MpscLinkedQueue<>();
        for (int i = 0; i < 10; i++) System.gc(); // tenure consumer node
        while (true) {
            queue.offer(123);
            queue.poll();
        }
    }
}
```

```
Before fix:

$ java -Xlog:gc -Xmx1G -cp build/classes/java/main MpscLinkedQueueGC.java
...
[1.261s] GC(20) Pause Young (Normal) (G1 Preventive Collection) 115M->115M(204M) 209.335ms
[1.385s] GC(23) Pause Young (Normal) (G1 Evacuation Pause) 148M->149M(204M) 31.491ms
[1.417s] GC(24) Pause Young (Normal) (G1 Evacuation Pause) 157M->158M(204M) 19.333ms
[1.453s] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 166M->167M(599M) 22.678ms
[1.966s] GC(26) Pause Young (Normal) (G1 Evacuation Pause) 249M->249M(497M) 305.238ms
...

After fix:
$ java -Xlog:gc -Xmx1G -cp build/classes/java/main MpscLinkedQueueGC.java
...
[1.169s] GC(14) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.755ms
[1.558s] GC(15) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.689ms
[1.948s] GC(16) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.800ms
[2.337s] GC(17) Pause Young (Normal) (G1 Evacuation Pause) 304M->2M(506M) 0.714ms
...
```
  • Loading branch information
olivergillespie authored Nov 22, 2024
1 parent 338f2a1 commit ede5cfc
Showing 1 changed file with 4 additions and 0 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,8 @@ public T poll() {
// we have to null out the value because we are going to hang on to the node
final T nextValue = nextNode.getAndNullValue();
spConsumerNode(nextNode);
// unlink previous consumer to help gc
currConsumerNode.soNext(null);
return nextValue;
}
else if (currConsumerNode != lvProducerNode()) {
Expand All @@ -101,6 +103,8 @@ else if (currConsumerNode != lvProducerNode()) {
// we have to null out the value because we are going to hang on to the node
final T nextValue = nextNode.getAndNullValue();
spConsumerNode(nextNode);
// unlink previous consumer to help gc
currConsumerNode.soNext(null);
return nextValue;
}
return null;
Expand Down

0 comments on commit ede5cfc

Please sign in to comment.