
[POC] Streaming Indexing API #7273

Closed · wants to merge 1 commit

Conversation

@reta (Collaborator) commented Apr 21, 2023

Description

Streaming Indexing API (work in progress)

Issues Resolved

Closes #5001

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on the Developer Certificate of Origin and signing off your commits, please check here.


@peternied (Member) left a comment

Nice! Looking forward to seeing more progress in this space; it seems like there could be some great performance boosts.

private class StreamingRequestConsumer<T extends HttpContent> implements Consumer<T>, Publisher<HttpContent> {
@peternied (Member) commented:

The security plugin works by intercepting requests through the transport layer; bulk requests are skipped since their individual index requests are fanned out [1]. I suspect this will work correctly for streaming requests, assuming the requests are still fanned out via transport actions - if they aren't, a new hook will be needed. If a new hook is needed, it could be added by using the IdentityService [2] to get the subject and then perform the permissions check.

After this is merged into main, but before it's backported, I would highly recommend adding a new test within the security plugin to exercise the Streaming API and ensure permissions conventions are not side-stepped.
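For illustration only, a purely hypothetical sketch of what such a permissions hook could look like - the method names (getCurrentSubject, isPermitted) are assumptions made for this example, not the actual IdentityService or security plugin API:

    // Hypothetical hook: getCurrentSubject() and isPermitted() are illustrative
    // assumptions, not the actual IdentityService / security plugin API.
    void checkStreamingChunkPermissions(IdentityService identityService, String indexName) {
        Subject subject = identityService.getCurrentSubject();
        if (!subject.isPermitted("indices:data/write/bulk", indexName)) {
            throw new SecurityException("Subject is not permitted to stream into " + indexName);
        }
    }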

@reta (Collaborator, Author) commented:

Thanks a lot for the early feedback, @peternied, highly appreciated!


@Bukhtawar (Collaborator) left a comment

Thanks @reta for starting this initiative.
Do we plan to support CompressionStream for compressing/decompressing streams of data using the gzip/deflate formats?
There might be other follow-up changes needed in the engine, both to forward the stream directly to a single shard on the data node (an optimisation) and to refresh the engine on stream close. Further, we can evaluate how indexing performance can be improved by getting rid of translogs altogether.

@reta (Collaborator, Author) commented May 23, 2023

Thanks @Bukhtawar

Thanks @reta for starting this initiative.

This is just a POC to understand the scope of changes needed to support the streaming part for request and response; the POC won't go beyond that (but the issue it references describes it all)

Do we plan to support CompressionStream for compressing/decompressing streams of data using the gzip/deflate formats?

The Transfer-Encoding header [1] supports multiple encoding schemes (e.g. Transfer-Encoding: gzip, chunked); this should not be an issue to add if there is a need (see the illustration after the link below).

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding
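For illustration, a streamed bulk request using this header could look like the following on the wire (the endpoint name is the POC's /_sbulk; gzip is applied as a transfer encoding over the chunked stream, so the server can decompress and process the body incrementally):

    POST /_sbulk HTTP/1.1
    Host: localhost:9200
    Content-Type: application/x-ndjson
    Transfer-Encoding: gzip, chunked

    <chunked, gzip-encoded NDJSON bulk operations>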


@reta force-pushed the issue-5001 branch 2 times, most recently from 53d3a77 to e803333 on May 24, 2023
@dblock (Member) commented Jun 20, 2023

@dblock @nknize @andrross @Bukhtawar @kotwanikunal @owaiskazi19 opening this one up for discussion: the POC is not targeted for a merge (a few test cases were left broken on purpose) but to collect feedback on the technical decisions to be made (and whether there are any objections to those). The POC successfully implements an end-to-end streaming flow for bulk indexing (with a new /_sbulk endpoint) while the existing HTTP APIs work as expected.

Nice. We should talk about whether _sbulk is the right name, or whether there's a content-type or other header that can determine whether data is being streamed and keep things under _bulk. A related problem is that today _bulk takes line-delimited JSON, and it would be nice to have other options (e.g. #8169 and #8170).

  1. Move off our custom Netty 4 transport to Reactor Netty.
    The Netty 4 transport we have now does not support request / response streaming (though it has some limited support for response chunking). We could implement that, but it would require a significant amount of effort, and we would still run into the issue of dealing with streams in the core server (from/to the HTTP stack). Reactor Netty (Project Reactor + Netty, https://github.com/reactor/reactor-netty) already has all the features we need, plus it works over HTTP/2 streams and HTTP/1.1 chunking. Some plugins (like repository-azure) already use Reactor Netty. (A minimal sketch of the Reactor Netty programming model follows.)
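As an illustration only (this is not code from the POC), a minimal sketch of the Reactor Netty programming model the item above refers to - a server that consumes the request body as a stream of chunks and streams the response back:

    import reactor.core.publisher.Flux;
    import reactor.netty.DisposableServer;
    import reactor.netty.http.server.HttpServer;

    public class StreamingServerSketch {
        public static void main(String[] args) {
            DisposableServer server = HttpServer.create()
                .port(9200)
                .route(routes -> routes.post("/_sbulk", (request, response) -> {
                    // request.receive() exposes the body as a stream of chunks; each
                    // chunk is processed as it arrives and each result is written
                    // back to the response stream immediately.
                    Flux<String> results = request.receive()
                        .asString()
                        .map(chunk -> "{\"processed\":" + chunk.length() + "}\n");
                    return response.sendString(results);
                }))
                .bindNow();
            server.onDispose().block();
        }
    }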

I think this is a good project-wide upgrade. I would open an issue to track.

The new engine could be backported to 2.x as well, since it would work over HTTP/1.1 if HTTP/2 is not supported. It could also be hidden behind a feature flag if needed. The security plugin will be impacted by this change (since it configures HTTPS/SSL) and would need to be modified.

As always, I'd start with a feature flag / experimental, but that's more of a timeline concern.

  2. Bring Project Reactor to the core server.
    We would need to bring Project Reactor in to deal with the streams all the way down. The risk here is that it may cause JAR hell issues in some of the community plugins (the ones we manage could be easily fixed).

We break downstream plugins all the time, so while this is painful, I think it's something we can manage.

  3. The RestActions we have now should work the same way with the new engine as they were working before.
    A new property will be introduced to hint that a RestAction supports request / response streaming (the Streaming Indexing endpoint will be the first one).

👍

  4. RestClient and OpenSearch Java Client would have new streaming API support; a rough example of how it is implemented in the POC is below.
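The original snippet did not survive this transcript; the following is a hypothetical sketch of what such a client call could look like, assuming a Reactor-based streamingBulk method - the interface and method names are illustrative, not the POC's actual API:

    import reactor.core.publisher.Flux;

    // Hypothetical sketch: "StreamingClient" and "streamingBulk" are illustrative
    // names. The shape of the interaction is the point - the request body is a
    // Publisher of operations and the response is a Publisher of per-item results,
    // both consumed incrementally.
    public class StreamingClientSketch {
        interface StreamingClient {
            Flux<String> streamingBulk(Flux<String> operations);
        }

        static void run(StreamingClient client) {
            Flux<String> operations = Flux.just(
                "{\"index\":{\"_index\":\"test\"}}\n{\"field\":\"value1\"}\n",
                "{\"index\":{\"_index\":\"test\"}}\n{\"field\":\"value2\"}\n");

            client.streamingBulk(operations)                   // operations are sent as they are produced
                .doOnNext(System.out::println)                 // per-item responses arrive as a stream
                .doOnError(t -> System.err.println("stream failed: " + t))
                .blockLast();                                  // wait for the stream to complete
        }
    }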

👍

  5. For streaming request / response communication, the compression would be applied at the chunk level.
    Please see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding; since the server would need to process each chunk (e.g. a bulk request) separately, we could handle that only when compression is applied at the chunk level. That is going to be less efficient than compressing the complete request and response. (A sketch of per-chunk compression follows.)
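For illustration only (not code from the POC), per-chunk compression with the JDK's GZIPOutputStream - each chunk is gzipped independently so the receiver can decompress and process it without waiting for the rest of the stream, at the cost of the worse compression ratio mentioned above:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    public final class ChunkCompression {
        // Each chunk carries its own gzip header and dictionary, which is why this
        // is less efficient than compressing the complete request or response.
        static byte[] compressChunk(String chunk) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
                gzip.write(chunk.getBytes(StandardCharsets.UTF_8));
            }
            return out.toByteArray();
        }
    }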

This may be a dumb question as I don't know much about streaming APIs, but would supporting other streaming implementations/protocols like grpc mitigate this kind of problem?

If no objections are raised to the decisions outlined above, the POC will be scrapped and rebranded as a meta issue with small incremental steps. Please feel free to comment, raise concerns and share feedback.

Love it. ❤️

@andrross (Member) commented:

  1. Move off our custom Netty 4 transport to Reactor Netty.
  2. Bring Project Reactor to the core server.
  3. The RestActions we have now should work the same way with the new engine as they were working before.

@reta I think these points all fall under the broader issue of the project-wide upgrade to Reactor Netty. I agree with @dblock that these are things we can manage, and provided the REST API is functionally identical, we should be able to work through dependency issues and get this released in 2.x. A couple of questions though:

  • Are there alternatives, or is Project Reactor the obvious choice?
  • If there is a critical CVE fix that needs an updated Netty, do we have to wait on a Reactor Netty upgrade?

Overall this looks great!

@reta (Collaborator, Author) commented Jun 20, 2023

Thanks a lot @dblock and @andrross !

This may be a dumb question as I don't know much about streaming APIs, but would supporting other streaming implementations/protocols like grpc mitigate this kind of problem?

@dblock with grpc we would need HTTP/2 and would basically have to move away from REST actions, requiring support for a whole new client ecosystem (to deal with grpc streams).

Are there alternatives, or is Project Reactor the obvious choice?

@andrross there used to be https://github.com/ReactiveX/RxNetty, which sadly is dead, and there is https://github.com/playframework/netty-reactive-streams, which is alive. The good thing about Reactor Netty is that it has very rich streaming capabilities (based on Project Reactor), and we also already bundle it in repository-azure.

If there is a critical CVE fix that needs an updated Netty, do we have to wait on a Reactor Netty upgrade?

No, Netty could be updated separately at any time (repository-azure is good proof of that).

@andrross (Member) commented:

Thanks @reta! I think I'm on board with the upgrade to Reactor Netty.

A couple of questions about the API:

  • What does it look like on the client when the server cannot keep up with the stream of requests I'm sending? In other words, how does backpressure work?
  • What about failures during the stream? Will we go with the approach suggested by @mikemccand here that nothing is guaranteed to be durable until the stream is successfully closed by the client?

@reta (Collaborator, Author) commented Jun 21, 2023

Thanks @andrross !

* What does it look like on the client when the server cannot keep up with the stream of requests I'm sending? In other words, how does backpressure work?

I believe it relies on standard TCP flow control: Netty / Reactor Netty buffers data, but if the consumer (the server in this case) is not able to consume it fast enough, at some point Netty / Reactor Netty stops reading data from the socket, which means the data stops being acknowledged and the client's writes stall. (A small illustration of the reactive demand side follows.)
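For illustration only: on top of TCP-level flow control, Reactor also propagates demand explicitly through the Reactive Streams request(n) protocol, so a slow consumer limits how much the pipeline pulls from upstream:

    import java.time.Duration;
    import reactor.core.publisher.Flux;

    public class BackpressureSketch {
        public static void main(String[] args) {
            Flux.range(1, 1_000)
                .limitRate(64)                        // request at most 64 items at a time from upstream
                .delayElements(Duration.ofMillis(10)) // simulate a slow consumer
                .doOnNext(i -> System.out.println("processed " + i))
                .blockLast();
        }
    }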

* What about failures during the stream? Will we go with the approach suggested by @mikemccand here (https://github.com/opensearch-project/OpenSearch/issues/3000#issuecomment-1499000417) that nothing is guaranteed to be durable until the stream is successfully closed by the client?

This was not explored in the scope of the POC, as it is more of an "implementation detail". At the moment the client receives failures as it consumes the response stream and could stop at any time.

@andrross (Member) commented:

This was not explored in the scope of the POC, as it is more of an "implementation detail".

With the proposed approach I think we're committing to the Reactive Streams specification for the API, as that is what Project Reactor is built on. This makes sense to me, as Reactive Streams is meant to be a standard for asynchronous stream processing, though I am not an expert in this space. Is there any more due diligence we need to do to ensure we can get the semantics we want for this API? (The core interfaces of the specification are sketched below.)
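For reference, the entire Reactive Streams contract (the org.reactivestreams package that Project Reactor implements) consists of four small interfaces; demand is signalled by the subscriber via Subscription.request(n):

    // The four interfaces that make up the Reactive Streams specification
    // (org.reactivestreams), reproduced here for reference.
    interface Publisher<T> {
        void subscribe(Subscriber<? super T> subscriber);
    }

    interface Subscriber<T> {
        void onSubscribe(Subscription subscription);
        void onNext(T item);       // called at most as many times as requested
        void onError(Throwable t); // terminal failure signal
        void onComplete();         // terminal success signal
    }

    interface Subscription {
        void request(long n);      // backpressure: ask for up to n more items
        void cancel();
    }

    interface Processor<T, R> extends Subscriber<T>, Publisher<R> {}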

@reta (Collaborator, Author) commented Jun 22, 2023

Thanks @andrross

This makes sense to me, as Reactive Streams is meant to be a standard for asynchronous stream processing, though I am not an expert in this space

👍, I agree with your statement here.

Is there any more due diligence we need to do to ensure we can get the semantics we want for this API?

At this point I don't see any issues (semantics-wise or feature-wise) we may run into with this API specifically; the most complicated part of this POC was to explore how we could get it in (streaming, essentially, from network to core) without disrupting everything else.

@andrross (Member) commented:

Thanks @reta, I'm on board with creating the meta issue and starting the incremental steps!

@nknize @dblock @Bukhtawar Any additional thoughts or concerns?


@reta requested a review from sohami as a code owner on July 27, 2023

Signed-off-by: Andriy Redko <[email protected]>

@reta (Collaborator, Author) commented Aug 2, 2023

Closing the POC and moving towards implementation #9065
