The current _bulk indexing API places a high configuration burden on users who want to avoid RejectedExecutionException due to TOO_MANY_REQUESTS. This forces users to "experiment" with bulk block sizes, multi-threading, refresh intervals, etc.
Using HTTP streaming for _bulk indexing would:
improve API usability: streams for request and response
improve resource utilization: the coordinators may funnel the streams from multiple clients
improve overall stability: the coordinators may use backpressure to slow down the clients and apply the optimal batching strategy taking into account resource availability (heap / CPU / ...)
improve durability: the coordinators may start processing as soon as the first bulk item is received (using translog / other means to deal with crashes / restarts / disconnects)
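The backpressure and batching idea from the list above can be illustrated with a small, self-contained sketch. It is not OpenSearch code: the names `client`, `coordinator` and `run`, the queue capacity, and the batch size are all hypothetical, and a bounded `asyncio.Queue` merely stands in for whatever flow-control mechanism the transport would provide. The point it shows is that a producer writing into a bounded buffer is suspended whenever the consumer falls behind, instead of being rejected.

```python
import asyncio


async def client(queue, items):
    """Hypothetical client: streams bulk items one at a time."""
    for item in items:
        # put() suspends while the queue is full, which is the backpressure:
        # the client slows down instead of receiving TOO_MANY_REQUESTS.
        await queue.put(item)
    await queue.put(None)  # sentinel marking the end of the stream


async def coordinator(queue, batch_size, observed_depths):
    """Hypothetical coordinator: drains the stream and groups items into batches."""
    batches, batch = [], []
    while True:
        observed_depths.append(queue.qsize())
        item = await queue.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            batches.append(batch)
            batch = []
            # Stand-in for the actual indexing work; yields control to the client.
            await asyncio.sleep(0)
    if batch:
        batches.append(batch)  # flush the final, possibly smaller batch
    return batches


async def run(items, capacity=4, batch_size=3):
    queue = asyncio.Queue(maxsize=capacity)  # bounded buffer between the two sides
    depths = []
    producer = asyncio.create_task(client(queue, items))
    batches = await coordinator(queue, batch_size, depths)
    await producer
    return batches, max(depths)
```

Calling `asyncio.run(run(list(range(10))))` returns the batches in arrival order together with the deepest queue depth observed, which never exceeds the configured capacity. In a real implementation the batch size would presumably be derived from resource availability rather than fixed.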
Among all the options available, _bulk should continue to use the HTTP protocol; however, there are a few options to consider.
Chunked Transfer Encoding
More details here: #3000 (comment). This is more or less the only option available for HTTP/1.1. The benefit of this implementation is that it would work for both the 2.x and 3.x releases.
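For reference, the wire format behind this option is simple: each chunk is its size in hexadecimal, CRLF, the chunk data, CRLF, and the body ends with a zero-length chunk. The following is a minimal sketch of how a client could frame newline-delimited _bulk lines this way. The helper names `bulk_lines` and `chunked` and the index name `books` are made up for illustration; this is not an existing OpenSearch client API.

```python
import json


def bulk_lines(actions):
    """Yield NDJSON lines (action metadata, then document source) as bytes."""
    for meta, source in actions:
        yield (json.dumps(meta) + "\n").encode("utf-8")
        yield (json.dumps(source) + "\n").encode("utf-8")


def chunked(parts):
    """Frame byte strings using HTTP/1.1 chunked transfer encoding."""
    for part in parts:
        if part:  # an empty chunk would terminate the body early, so skip it
            yield b"%x\r\n" % len(part) + part + b"\r\n"
    yield b"0\r\n\r\n"  # last chunk: marks the end of the message body


# Hypothetical usage: one index action framed as a chunked request body.
actions = [({"index": {"_index": "books"}}, {"title": "Dune"})]
body = b"".join(chunked(bulk_lines(actions)))
```

Because every chunk is self-delimiting, the receiving side could in principle start parsing bulk items as the chunks arrive rather than waiting for the complete request.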
HTTP/2
HTTP/2 offers an optimized transport for HTTP semantics, including superior streaming capabilities; please see Streams and Multiplexing for more details.
HTTP/2 uses DATA frames to carry message payloads. The "chunked"
transfer encoding defined in Section 4.1 of [RFC7230] MUST NOT be
used in HTTP/2.
HTTP/2 is only supported by the 3.x release line (both for clients and servers).
Websockets
WebSockets would offer a bidirectional stream, similar to HTTP/2, but from an implementation perspective they would (in theory) be easier to integrate: this is a new protocol that would not touch the existing OpenSearch HTTP layer.
Please see [RFC] Streaming Index API.
Implementation Notes
OpenSearch supports both HTTP/1.1 and HTTP/2 (including H2C). However, the OpenSearch HTTP server model neither supports chunked transfer encoding nor exposes HTTP/2 streams (especially data frames):
Http2StreamFrameToHttpObjectCodec, ...) to convert to HTTP/1.1 abstractions
The suggested direction to proceed towards POC:
At the moment, the POC focuses only on the first step: understanding the scope of changes needed to support HTTP streaming on both the OpenSearch server and client sides.