
Unstable API #13

Closed
jrtberlin opened this issue Jan 30, 2020 · 10 comments

Comments

@jrtberlin
Contributor

Sometimes departures cannot query the departures and responds with:

Could not query departures
Error: json: cannot unmarshal object into Go value of type []main.result

I looked up with curl what the actual error is:

{"error":true,"msg":"HAFAS error: Service Unavailable"}

In my opinion, departures should be clearer about what the issue is, e.g. report HAFAS error: Service Unavailable, and/or use a more stable API (I'm unsure how feasible switching the API is).

@noxer
Owner

noxer commented Jan 30, 2020

What would be your way of dealing with it? Should we wait a little longer and retry, or give up?

@jrtberlin
Contributor Author

I would probably ask the API maintainer whether that is a rate limit, a server capacity issue, or HAFAS being HAFAS.
If the issue is the software quality of HAFAS, we should increase the default retries. In my opinion, waiting two seconds longer is a better default config than getting an error.
If it's a rate limit or similar the project could look into hosting a dedicated API server or accessing the source directly.

@noxer
Owner

noxer commented Jan 30, 2020

Okay, so this issue is twofold: detect the HAFAS errors, and find a better strategy for dealing with them.

@noxer
Owner

noxer commented Jan 30, 2020

Does tagging @derhuerst work?

@derhuerst

Does tagging @derhuerst work?

It does! 🎉

There are three different aspects here:

  • The *.transport.rest APIs return any error as {"error": true, "msg": "…"}. You would have to handle this in your client.
  • In the case described above ("HAFAS error: Service Unavailable"), the upstream API that 2.bvg.transport.rest wraps returns errors. In most cases, this is not caused by an invalid request to it, but simply because it's down; we can't really do anything about it.
  • That aside, it would be nice to let *.transport.rest expose more properties of the error via JSON, e.g. isHafasError. The current error-handling code is very bare-bones.
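The first bullet can be handled client-side by probing for the error envelope before decoding the result array. A minimal Go sketch (the struct names are illustrative; `result` stands in for departures' real departure type):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError mirrors the {"error": true, "msg": "…"} envelope that the
// *.transport.rest APIs use for all failures.
type apiError struct {
	Error bool   `json:"error"`
	Msg   string `json:"msg"`
}

// result stands in for the struct departures decodes departures into;
// the real type has more fields.
type result struct {
	When string `json:"when"`
}

// parseDepartures probes for the error envelope first, then decodes the
// expected array. This replaces the opaque "cannot unmarshal object into
// Go value of type []main.result" with the actual upstream message.
func parseDepartures(body []byte) ([]result, error) {
	var apiErr apiError
	if err := json.Unmarshal(body, &apiErr); err == nil && apiErr.Error {
		return nil, fmt.Errorf("API error: %s", apiErr.Msg)
	}
	var results []result
	if err := json.Unmarshal(body, &results); err != nil {
		return nil, fmt.Errorf("unexpected response: %w", err)
	}
	return results, nil
}

func main() {
	_, err := parseDepartures([]byte(`{"error":true,"msg":"HAFAS error: Service Unavailable"}`))
	fmt.Println(err) // API error: HAFAS error: Service Unavailable
}
```

The probe works because unmarshaling a JSON array into the envelope struct fails, so normal responses fall through to the array decode.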

If the issue is the software quality of HAFAS we should increase the default retries.

You could do that. On the other hand, I've wanted to add this to hafas-rest-api itself for a while now. Retrying on the server side yields more appropriate retrying schemes (e.g. when many people send requests and all of HAFAS is down) and better latencies.

If it's a rate limit or similar the project could look into hosting a dedicated API server [...].

That is yet another improvement on my todo list: Access HAFAS via a range of IP addresses, to prevent rate-limiting. Let me know if you can help with this.

If it's a rate limit or similar the project could look into [...] accessing the source directly.

Unfortunately that's almost impossible for now:

  • HAFAS is a huge piece of software that does many things; it's being sold as a tightly integrated all-in-one solution. AFAIK, both legally & technically, it's not meant to be extended or partially replaced by other systems.
  • The underlying public transportation data is often copyright-protected and exclusively handed over to HaCon (they write & run HAFAS) for the purpose of running the API.

@noxer
Owner

noxer commented Feb 4, 2020

It does! 🎉

🎉

The *.transport.rest APIs return any error as {"error": true, "msg": "…"}. You would have to handle this in your client.

We should implement this either way. Even if transport.rest handles retries, it may still encounter a situation where HAFAS just doesn't want to respond.

In the case described above ("HAFAS error: Service Unavailable"), the upstream API that 2.bvg.transport.rest wraps returns errors. In most cases, this is not caused by an invalid request to it, but simply because it's down; we can't really do anything about it.

Do you log those errors? Maybe we can find a pattern which allows us to better time the retries.

That aside, it would be nice to let *.transport.rest expose more properties of the error via JSON, e.g. isHafasError. The current error-handling code is very bare-bones.

I'd like that :)

You could do that. On the other hand, I've wanted to add this to hafas-rest-api itself for a while now. Retrying on the server side yields more appropriate retrying schemes (e.g. when many people send requests and all of HAFAS is down) and better latencies.

This makes a lot of sense. It could also prevent departures from DDoSing your API when the requests fail. To prevent a DDoS, you could also add a field "retry" to the error response which indicates how many seconds to wait before a retry should be attempted (or a negative number/zero to block retries entirely).

That is yet another improvement on my todo list: Access HAFAS via a range of IP addresses, to prevent rate-limiting. Let me know if you can help with this.

Sure, we could set up a number of relays for the API, we just need to create a protocol for that.

Unfortunately that's almost impossible for now:

I think what he meant is to circumvent transport.rest and access HAFAS directly. That would be a lot of work but may prevent rate limiting issues. It doesn't solve the underlying issue, though. I would push that faaaaar back on the project plan.

Thank you for your answers.

  • Tim

@derhuerst

In the case described above ("HAFAS error: Service Unavailable"), the upstream API that 2.bvg.transport.rest wraps returns errors. In most cases, this is not caused by an invalid request to it, but simply because it's down; we can't really do anything about it.

Do you log those errors? Maybe we can find a pattern which allows us to better time the retries.

I do keep those logs. I've been planning to publish them somewhere for a while now, but haven't managed to so far.

Not sure how well you can time them. The downtimes of HAFAS APIs seem pretty much random.

That aside, it would be nice to let *.transport.rest expose more properties of the error via JSON, e.g. isHafasError.

I'd like that :)

PR welcome!

To prevent DDOS you could also add a field "retry" to the error response which indicates how many seconds to wait before a retry should be attempted (or a negative number/zero to block retries entirely).

I could, but this just has an advisory function, and most people won't obey it. I like the idea of the API being able to handle this much more. Let's see.

Sure, we could set up a number of relays for the API, we just need to create a protocol for that.

I already did (read from derhuerst/vbb-rest#29 (comment) onwards), but the setup was overly complex and annoying.

Now it's using Caddy, with a.2.bvg.transport.rest as the main upstream and b.2.bvg.transport.rest as the fallback. But people often forget to renew their SSL certs & update bvg-rest from Git.

There are two more options I want to evaluate though: making use of the IPv6 address pool of the VPS it's currently running on, and SOCKS proxies. Both have the advantage that they don't increase the maintenance effort much, in contrast to the HTTP-based load balancing setup.

Let me know if you want to help, via doing PRs or by running the VPS.

I think what he meant is to circumvent transport.rest and access HAFAS directly. That would be a lot of work but may prevent rate limiting issues. It doesn't solve the underlying issue, though. I would push that faaaaar back on the project plan.

You can always spawn bvg-hafas/hafas-client as a child process and call it via JSON-RPC. But the idea of *.transport.rest is that it should be convenient to access public transport data.

@derhuerst

I have opened a thread in the transport.rest meta-repo to discuss hosting further: public-transport/transport.rest#2

@derhuerst

An update:

  • 2.vbb.transport.rest has been shut off for ~6 months now.
  • I have just deprecated 3.vbb.transport.rest.
  • Please use v5.vbb.transport.rest. It has built-in caching, which might help with the underlying HAFAS endpoint being unreliable.

@noxer
Owner

noxer commented Oct 28, 2020

This should be solved now. The application uses v5 and I've seen no problems since.

@noxer noxer closed this as completed Oct 28, 2020