Asynchronous HTTP Client Library #72

Closed
wants to merge 26 commits into from

@adamziel adamziel commented Mar 13, 2024

Asynchronous HTTP Client Library

An asynchronous HTTP client library designed for WordPress. Main features:

Streaming support

Enqueuing a request returns a PHP stream resource that can be read with functions like fread()
and stream_get_contents():

$client = new AsyncHttpClient();
$fp = $client->enqueue(
     new Request( "https://downloads.wordpress.org/plugin/gutenberg.17.7.0.zip" )
);
// Read some data
$first_4_kilobytes = fread($fp, 4096);
// We've only waited for the first four kilobytes. The download
// is still in progress at this point, and yet we're free to do
// other work.

Delayed execution and concurrent downloads

The actual sockets are not opened until the first time a stream is read from:

$client = new AsyncHttpClient();
// Enqueuing the requests does not start the data transmission yet.
$batch = $client->enqueue( [
    new Request( "https://downloads.wordpress.org/plugin/gutenberg.17.7.0.zip" ),
    new Request( "https://downloads.wordpress.org/theme/pendant.zip" ),
] );
// Even though stream_get_contents() will return just the response body for
// one request, it also opens the network sockets and starts streaming
// both enqueued requests. The response data for $batch[1] is buffered.
$gutenberg_zip = stream_get_contents( $batch[0] );

// At least a chunk of pendant.zip has already been downloaded, let's
// wait for the rest of the data:
$pendant_zip = stream_get_contents( $batch[1] );

Concurrency limits

The AsyncHttpClient will only keep up to $concurrency connections open. When one of the
requests finishes, it will automatically start the next one.

For example:

$client = new AsyncHttpClient();
// Process at most 10 concurrent requests at a time.
$client->set_concurrency_limit( 10 );
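
The "start the next request when one finishes" behavior can be pictured with a small queue. This is a minimal sketch, not the AsyncHttpClient implementation: the BoundedQueue class and its methods are hypothetical names, and plain strings stand in for requests.

```php
<?php
// Hypothetical illustration only: not the AsyncHttpClient implementation.
final class BoundedQueue {
    private $limit;
    private $pending = array();
    private $active  = array();

    public function __construct( $limit ) {
        $this->limit = max( 1, (int) $limit );
    }

    // Add a job and start it right away if a slot is free.
    public function enqueue( $job ) {
        $this->pending[] = $job;
        $this->fill_slots();
    }

    // Mark a job as finished; the freed slot goes to the next pending job.
    public function finish( $job ) {
        $index = array_search( $job, $this->active, true );
        if ( false !== $index ) {
            array_splice( $this->active, $index, 1 );
        }
        $this->fill_slots();
    }

    public function active() {
        return $this->active;
    }

    public function pending() {
        return $this->pending;
    }

    // Move pending jobs into the active set until the limit is reached.
    private function fill_slots() {
        while ( count( $this->active ) < $this->limit && count( $this->pending ) > 0 ) {
            $this->active[] = array_shift( $this->pending );
        }
    }
}

$queue = new BoundedQueue( 2 );
foreach ( array( 'a.zip', 'b.zip', 'c.zip' ) as $job ) {
    $queue->enqueue( $job );
}
// 'a.zip' and 'b.zip' are active, 'c.zip' is waiting for a slot.
$queue->finish( 'a.zip' );
// 'c.zip' has taken the slot freed by 'a.zip'.
```

In the real client a "job" would presumably be a connection rather than a string, and finish() would be triggered when a response has been fully read.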

Progress monitoring

A developer-provided callback (AsyncHttpClient->set_progress_callback()) receives progress
information about every HTTP request.

$client = new AsyncHttpClient();
$client->set_progress_callback( function ( Request $request, $downloaded, $total ) {
     // $total is computed based on the Content-Length response header and
     // null if it's missing.
     echo "$request->url – Downloaded: $downloaded / $total\n";
} );
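
To illustrate how $total could be derived, here is a rough sketch that extracts Content-Length from a raw header block and falls back to null when the header is absent. The function names total_bytes_from_headers() and format_progress() are assumptions made for this example; the library may compute the value differently.

```php
<?php
// Hypothetical helper: returns the expected body size in bytes, or null when
// the response carries no usable Content-Length header.
function total_bytes_from_headers( $raw_headers ) {
    foreach ( explode( "\r\n", $raw_headers ) as $line ) {
        $parts = explode( ':', $line, 2 );
        // Header names are case-insensitive, so compare without regard to case.
        if ( 2 === count( $parts ) && 0 === strcasecmp( trim( $parts[0] ), 'content-length' ) ) {
            $value = trim( $parts[1] );
            return 1 === preg_match( '/^\d+$/', $value ) ? (int) $value : null;
        }
    }
    return null;
}

// Hypothetical formatter that copes with an unknown total.
function format_progress( $url, $downloaded, $total ) {
    $total_label = null === $total ? 'unknown' : (string) $total;
    return sprintf( '%s: %d / %s bytes', $url, $downloaded, $total_label );
}
```

A progress callback written against the API above should be prepared for $total being null, for instance when the server uses chunked transfer encoding.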

HTTPS support

TLS connections work out of the box.

Non-blocking sockets

Opening each socket connection is non-blocking and returns nearly instantly.
The streams themselves are also switched to non-blocking mode via stream_set_blocking($fp, false).
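
A minimal sketch of a non-blocking connect, run against a throwaway loopback server so that it needs no network access. The helper name open_nonblocking() is hypothetical, and the STREAM_CLIENT_ASYNC_CONNECT flag is an assumption about how a connect can be made non-blocking in plain PHP, not something the description above confirms.

```php
<?php
// Hypothetical helper: starts a TCP connection without waiting for it to
// complete, then switches the stream to non-blocking mode.
function open_nonblocking( $host, $port ) {
    $stream = stream_socket_client(
        "tcp://{$host}:{$port}",
        $errno,
        $errstr,
        30,
        STREAM_CLIENT_CONNECT | STREAM_CLIENT_ASYNC_CONNECT
    );
    if ( false === $stream ) {
        throw new RuntimeException( "Could not start connecting: $errstr ($errno)" );
    }
    stream_set_blocking( $stream, false );
    return $stream;
}

// Local listener on a random free port, used only for this demonstration.
$server = stream_socket_server( 'tcp://127.0.0.1:0', $errno, $errstr );
list( $host, $port ) = explode( ':', stream_socket_get_name( $server, false ) );

$client = open_nonblocking( $host, (int) $port );

// The stream becomes writable once the connection is established.
$read   = null;
$write  = array( $client );
$except = null;
$ready  = stream_select( $read, $write, $except, 5 );
```

Between open_nonblocking() and stream_select() the caller is free to do other work, which is the point of the non-blocking mode.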

Asynchronous downloads

Start downloading now, do other work in your code, only block once you need the data.

PHP 7.0 support and no dependencies

AsyncHttpClient works on any WordPress installation with vanilla PHP only.
It does not require any PHP extensions, cURL, or any external PHP libraries.

Supports custom request headers and body
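
As a rough illustration of what custom headers and a body mean on the wire, the sketch below serializes a request into the bytes that would be written to the socket. build_http_request() and the example.com URL are hypothetical and only serve this example; the library's own Request class may expose headers and body differently.

```php
<?php
// Hypothetical helper: serializes a request into raw HTTP/1.1 bytes.
function build_http_request( $method, $url, array $headers = array(), $body = '' ) {
    $parts = parse_url( $url );
    $path  = isset( $parts['path'] ) ? $parts['path'] : '/';
    if ( isset( $parts['query'] ) ) {
        $path .= '?' . $parts['query'];
    }
    $host = $parts['host'] . ( isset( $parts['port'] ) ? ':' . $parts['port'] : '' );

    // Caller-provided headers override the defaults.
    $headers = array_merge(
        array(
            'Host'       => $host,
            'Connection' => 'close',
        ),
        $headers
    );
    if ( '' !== $body ) {
        $headers['Content-Length'] = strlen( $body );
    }

    $lines = array( "$method $path HTTP/1.1" );
    foreach ( $headers as $name => $value ) {
        $lines[] = "$name: $value";
    }
    // A blank line separates the headers from the body.
    return implode( "\r\n", $lines ) . "\r\n\r\n" . $body;
}

$raw = build_http_request(
    'POST',
    'https://example.com/api?x=1',
    array( 'Content-Type' => 'application/json' ),
    '{"a":1}'
);
```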

Implementation details

  • Non-blocking stream opening:
    • streams_http_open_nonblocking calls stream_http_open_nonblocking to open a stream for each of the provided URLs.
    • stream_http_open_nonblocking first validates the URL scheme (only http and https are supported).
    • It then creates a stream context and opens the connection over the tcp:// transport rather than https:// or ssl://, which would block until the TLS handshake is complete.
    • After opening the connection using stream_socket_client, it switches the stream to non-blocking mode using stream_set_blocking.
  • Asynchronous HTTP request sending:
    • streams_http_requests_send iterates over the provided streams and enables encryption (crypto) on each one using stream_socket_enable_crypto.
    • It then uses stream_select to wait for the streams to become writable and sends the HTTP request headers using fwrite.
  • Reading the response:
    • streams_http_response_await_bytes utilizes stream_select to wait for a specified number of bytes to become available on any of the streams.
    • streams_http_response_await_headers retrieves the full HTTP response headers iteratively. It reads bytes from the streams until the end-of-headers marker (\r\n\r\n). The rest of the response stream, which is the response body, is available for the consumer code to read.
    • Reading from each async stream triggers stream_select to buffer any data available on other concurrent connections. This is implemented via a stream_read method of a custom stream wrapper.
  • Progress monitoring:
    • stream_monitor_progress taps into the stream_read operation using a stream wrapper and reports the number of read bytes to a callback function.
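
The header-reading step described above can be sketched as follows. This is a simplified, single-stream illustration under stated assumptions: await_headers() is a hypothetical name, and a local socket pair stands in for a real HTTP connection (STREAM_PF_UNIX is used here, so the demonstration targets Unix-like systems).

```php
<?php
// Hypothetical sketch: read from a non-blocking stream until the
// end-of-headers marker arrives. Returns the headers and any body bytes that
// happened to arrive in the same chunks.
function await_headers( $stream, $timeout_seconds = 5 ) {
    $buffer = '';
    while ( false === ( $end = strpos( $buffer, "\r\n\r\n" ) ) ) {
        $read   = array( $stream );
        $write  = null;
        $except = null;
        // Sleep until the stream has data instead of busy-looping on fread().
        $ready = stream_select( $read, $write, $except, $timeout_seconds );
        if ( false === $ready || 0 === $ready ) {
            throw new RuntimeException( 'Timed out or failed while waiting for headers' );
        }
        $chunk = fread( $stream, 8192 );
        if ( false === $chunk || ( '' === $chunk && feof( $stream ) ) ) {
            throw new RuntimeException( 'Connection closed before the headers were complete' );
        }
        $buffer .= $chunk;
    }
    return array(
        'headers'     => substr( $buffer, 0, $end ),
        'body_prefix' => (string) substr( $buffer, $end + 4 ),
    );
}

// Demonstration over a local socket pair instead of a real HTTP connection.
list( $server_side, $client_side ) = stream_socket_pair(
    STREAM_PF_UNIX,
    STREAM_SOCK_STREAM,
    STREAM_IPPROTO_IP
);
stream_set_blocking( $client_side, false );
fwrite( $server_side, "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n" );
fwrite( $server_side, "\r\nhello" );

$response = await_headers( $client_side );
```

Passing several streams to stream_select() at once, as the functions above are described to do, follows the same pattern with one buffer per stream.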

Related issues

  • #71

@adamziel adamziel changed the title Explore non-blocking HTTP downloads Explore non-blocking HTTPS downloads Mar 13, 2024
@adamziel adamziel changed the title Explore non-blocking HTTPS downloads A non-blocking HTTPS downloads Mar 17, 2024
@adamziel adamziel changed the title A non-blocking HTTPS downloads Asynchronous HTTP Client Library Mar 17, 2024
@adamziel adamziel force-pushed the http-api-explorations branch from 3daae45 to e75c4b6 Compare March 18, 2024 17:02
adamziel added a commit that referenced this pull request Mar 18, 2024
@adamziel

Merged in a9f10c1
