Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issuing requests in close concurrency can unneccessarily result in multiple HTTP/2 connections. #514

Closed
PrimozGodec opened this issue Nov 7, 2019 · 10 comments
Labels
bug Something isn't working http/2 Issues and PRs related to HTTP/2

Comments

@PrimozGodec
Copy link
Contributor

PrimozGodec commented Nov 7, 2019

When I do several requests with the AsyncClient it seems that it opens multiple connections to the server. My code looks like this:

async def _send_request(self, client, image, url):
    headers = {
        'Content-Type': 'image/jpeg',
        'Content-Length': str(len(image))
    }
    return await client.post(
        url,
        headers=headers,
        data=image
    )

async with httpx.AsyncClient(timeout=timeout, base_url=self.server_url) as client:
    requests = []
    for image in images:
        requests.append(self._send_request(client, image, url))
    responses = await asyncio.gather(*requests)

As far as I understand it should use a connection pooling, store the connection and reuse it. In my case I get the following debug message for every image- post request I send with a loop:

DEBUG:httpx.dispatch.connection:start_connect host='api.garaza.io' port=443 timeout=TimeoutConfig(connect_timeout=None, read_timeout=60, write_timeout=5)

Am I doing anything wrong or is it a bug?

@PrimozGodec
Copy link
Contributor Author

PrimozGodec commented Nov 8, 2019

The issue seems to appear here:

connection = self.pop_connection(origin)
if connection is None:
await self.max_connections.acquire()
connection = HTTPConnection(
origin,
verify=self.verify,
cert=self.cert,
timeout=self.timeout,
http_versions=self.http_versions,
backend=self.backend,
release_func=self.release_connection,
trust_env=self.trust_env,
)
logger.trace(f"new_connection connection={connection!r}")
else:
logger.trace(f"reuse_connection connection={connection!r}")
self.active_connections.add(connection)

First self.pop_connection(origin) is called then there is await call in the line 135 and at the end there is self.active_connections.add(connection). Since there is await in between, first pops for all requests are called (there is no connection in the pool at that time) and the connection is added to the pool at the end when all pop already happens. It would be required to get rid of await in between but I do not know how since I do not understand what exactly await self.max_connections.acquire() does.

@yeraydiazdiaz
Copy link
Contributor

I noticed this happening on a similar situation as well and it also confused me. I fully expected connections to only be created once per origin, but it seems the current behaviour is by design according to this test?

https://github.com/encode/httpx/blob/master/tests/dispatch/test_connection_pools.py#L69-L80

@PrimozGodec, max_connections is a BoundedSemaphore which HTTPX uses to limit the number of connections of the pool. The default is 100.

@PrimozGodec
Copy link
Contributor Author

@yeraydiazdiaz thank you for the response. So it looks OK then. Can establishing more connections instead of one at the same time decrease the speed of getting the responses?

@tomchristie
Copy link
Member

Can establishing more connections instead of one at the same time decrease the speed of getting the responses?

Yes.

The behaviour you want here is...

  • If the connections are HTTP/1.1 then you want to have multiple concurrent connections.
  • If the connections are HTTP/2 then you don't want to have multiple concurrent connections, since you can have multiple concurrent streams on a single connection.

That test case happens to in the context of an HTTP/1.1 case, although it doesn't make that clear.

But, we don't know ahead of time if a connection will be HTTP/1.1 or HTTP/2 until it's been established.

It's also worth noting that you don't actually know ahead of time if a connection will end up being HTTP/1.1 or HTTP/2 until it's been established, so it is possible that making several requests to an HTTP/2 server could end up initially opening multiple connections. There's potentially some finessing to be done there, to make sure that we always only start by issuing a single connection, until we've determined if that connection is HTTP/1.1 or HTTP/2.

@tomchristie
Copy link
Member

@PrimozGodec What is the response.http_version in your case here?

@PrimozGodec
Copy link
Contributor Author

There's potentially some finessing to be done there, to make sure that we always only start by issuing a single connection, until we've determined if that connection is HTTP/1.1 or HTTP/2.

I think it is actually a good solution.

What is the response.http_version in your case here?

It is HTTP/2

@tomchristie
Copy link
Member

It is HTTP/2

Right, then we've got some slightly complicated work to do.

If you'd issued the connections at very slightly different time intervals, you'd (presumably) find that they'd have ended up on the same single connection. As it is, each has started establishing a connection independently, which all turned out to be HTTP/2 connections, that could have just been a single shared connection.

@PrimozGodec
Copy link
Contributor Author

Also, I see another potential issue. pop_connection is called first and in the case that there is no connection it will call await self.max_connections.acquire() which will prevent making more than 100 connections. Then the issue is if I want to make 1000 requests all will pass pop_connection first (when no connections in the pool) and will wait at await self.max_connections.acquire() for connection pool to get released even some of them could reuse the old connection. It means that after 100 requests everything gets blocked since connections are not released and all waiting requests will want to have a new connection. @tomchristie suggestion about making one connection to the same server and then wait would solve the issue.

@ojii
Copy link

ojii commented Feb 18, 2020

But, we don't know ahead of time if a connection will be HTTP/1.1 or HTTP/2 until it's been established.

is there a way around this issue if we know beforehand that it's http2 (as in, when talking to a server that is http2 only, such as APNS)?

@tomchristie
Copy link
Member

Believe this to now be resolved in the 0.13.dev0 pre-release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working http/2 Issues and PRs related to HTTP/2
Projects
None yet
Development

No branches or pull requests

4 participants