Provider: make Trip 'route' field optional for privacy reasons. #504
For many regulatory and privacy-oriented reasons, it's possible that cities or providers do not want external actors to access the geolocalized data of the trips throughout the city. The distance, average cost, speed, and duration of those trips are still valuable aggregate information and should thus be accessible, even if the route may not be.
About the city limit change, I think that should be removed from this and put in a different PR. Additionally there are cases where a city needs to know about routes outside of their city limits/jurisdiction. See Issue #491 for some discussion.
Done with #505 @schnuerle
We support this change. It's easy to perform basic tasks like getting trip counts via the trips endpoint, which is why many cities use it (the provider/trips endpoint was the most commonly used endpoint in the MDS Maturity survey). It would be great if we could meet such needs without requiring access to route data. Another thing we've noticed is that, compared to status events, the trips endpoint tends to be more consistent from operator to operator in terms of the data it returns. Hence, a secondary need for a routeless /trips endpoint is to help catch and diagnose issues with status feeds.
This change is complementary to #480, which makes location and telemetry data optional.
I think it should be a little clearer that optional here means it is up to the consuming city/agency. There are many consumers of Provider that continue to rely on Maybe clarity around optional fields is a broader need that should be addressed in the General Information document? Tagging @jfh01 for visibility. |
It seems the main issue this PR is trying to address is that cities may not want trip line data. I’d like some cities to chime in to see if this is just a ‘possible’ scenario, or a current need. Another way that this could be solved if needed is to create a way to make this and other fields optional via the API. An example would be the inclusion of a new parameter, say This would also solve future issues like this where cities don’t want/can’t receive certain data. It also has the benefit of returning only the data you need for the task at hand, and reducing the returned data file size and processing on the provider’s side. I propose we move this discussion over to the new #507 issue, so we can talk about solutions separate from the details of a PR.
@thekaveman I second that! @schnuerle great idea, I also answered on #507.
@thekaveman makes a good point about using routes for cap counting and such. I think those needs could be met by just having start/end location but no other route data? Arguably sharing start/end locations would be equivalent to this PR in terms of privacy impact, because start/end data is already exposed via status changes. So, perhaps this issue could be solved while retaining the use cases @thekaveman mentions via either:
In my prior experience implementing cap counting in Santa Monica, having just the start/end location would not be enough. The method we used looks at each point of the route within city limits, and uses the earliest/latest timestamps to reconstruct the window of time the vehicle spent inside the boundary.
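To make the method described above concrete, here is a minimal sketch of that windowing step. It is an illustration only, not the Santa Monica implementation: the route is assumed to be already flattened into `(lon, lat, timestamp)` tuples rather than the GeoJSON the spec uses, the boundary is assumed to be a simple polygon, and the names `point_in_polygon` and `window_inside` are hypothetical.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test. `polygon` is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def window_inside(route_points, boundary):
    """Return (earliest, latest) timestamps of route points inside `boundary`.

    `route_points` is an iterable of (lon, lat, timestamp) tuples (assumed
    shape). Returns None when no point falls inside the boundary.
    """
    stamps = [t for lon, lat, t in route_points
              if point_in_polygon(lon, lat, boundary)]
    if not stamps:
        return None
    return min(stamps), max(stamps)
```

With only the start and end points of a trip that begins and ends outside the boundary, `window_inside` would return `None` even if the vehicle crossed the city, which is the limitation raised in this comment.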
@thekaveman That is very much in line with the point I'm raising, I think. Of course, if the routes are available (because the city / agency decided it), any client app may then use those to implement accurate information such as heatmaps, etc. If they are not, advanced features may not be usable, but that would not discard all of the info provided by the Right now, apps couldn't use the endpoint at all if for any reason the routes are not made available. @quicklywilliam start/end may indeed help, but I'm not entirely sure that every city/agency would be willing to provide them, for the same privacy reasons.
@vperron are these cities using /trips without the /status endpoint, then?
@quicklywilliam I don't know, that was purely hypothetical! My point is that if privacy is the reason the routes are not provided within trips, replacing them with starting and ending points may not change that issue much. So my first thought would be that they probably would have to be optional as well, even if, I entirely agree, those start/end points could also be found in the /status endpoint, maybe just not as easily. So we see 4 options now:
Ah, thank you for clarifying! Apologies, I think I might have misunderstood your original use case. Could you clarify what kinds of external actors you have in mind for this use case? If the use case involves sharing trips data widely beyond the agency, I think even removing route data might not suffice to address potential privacy issues.
The sensitive data from the point of view of cities and agencies we've contacted is the geolocation, combined with a timestamp, of a particular vehicle, especially if it becomes easy to determine frequent routes from point A to B (commuting, for instance). If this My point is, we should consider making this basic information accessible even if the cities or agencies are not willing to expose
Got it, thank you for clarifying! For this use case I think I would advocate for sharing via a different means. I am concerned that in raw format trip data can be attacked, even without route data. For example, if I know duration along with the exact times that a trip begins and ends then it is likely I can attack a GBFS feed by comparing timestamps and durations with when various vehicles disappear at a given location. From there, it is likely I could then establish an exact trip O/D for a portion of trips.
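A rough sketch of the linkage concern above, to show why routeless trip records are not automatically anonymous. Everything here is an assumption for illustration: the `disappearances` list stands in for observations someone might derive by polling a public GBFS feed, its `last_seen`, `lat` and `lon` keys are hypothetical, and all times are taken to be in seconds.

```python
def match_trip_starts(trips, disappearances, tolerance_s=60):
    """Link routeless trips to locations where a vehicle vanished from a feed.

    `trips` holds dicts with `trip_id` and `start_time`; `disappearances`
    holds dicts with `last_seen`, `lat`, `lon` (hypothetical shapes).
    A trip is linked only when exactly one disappearance falls within
    `tolerance_s` seconds of its start time.
    """
    matches = {}
    for trip in trips:
        candidates = [
            d for d in disappearances
            if abs(d["last_seen"] - trip["start_time"]) <= tolerance_s
        ]
        if len(candidates) == 1:  # unambiguous link: an origin is recovered
            matches[trip["trip_id"]] = (candidates[0]["lat"], candidates[0]["lon"])
    return matches
```

Repeating the same join on end times and reappearances would give the destination, which is how an O/D pair could be recovered for the portion of trips that match unambiguously. Coarsening timestamps makes unambiguous matches rarer.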
Agreed.
This has been addressed even more explicitly with #646 and the ability for cities to exclude route data from provider endpoints.
Explain pull request
For many regulatory and privacy-oriented reasons, it's possible that cities or providers do not want external actors to access the `route` geolocalized data of the trips throughout the city. We propose to make this `route` field optional to better reflect that choice. The distance, cost and duration fields, all available in the general Trip payload, still contain valuable and aggregated information and should thus be accessible, even if the route is not.
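As an illustration of what a consumer could still compute when `route` is absent, here is a minimal sketch. The field names `trip_distance`, `trip_duration` and `route` follow my reading of the provider Trip payload and should be treated as assumptions, as should the units (meters and seconds); `summarize_trips` is a hypothetical helper, not part of the spec.

```python
def summarize_trips(trips):
    """Aggregate trip metrics without touching route geometry.

    `trips` is a list of dicts decoded from a trips response. The `route`
    key may be missing or None; it is only counted, never read.
    """
    n = len(trips)
    total_distance = sum(t["trip_distance"] for t in trips)
    total_duration = sum(t["trip_duration"] for t in trips)
    return {
        "trip_count": n,
        "total_distance_m": total_distance,
        "mean_duration_s": total_duration / n if n else 0.0,
        "trips_with_route": sum(1 for t in trips if t.get("route") is not None),
    }
```

Using `t.get("route")` rather than `t["route"]` is the only change a consumer of this kind would need to tolerate payloads with and without the field.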
Is this a breaking change
No, not breaking. A mandatory field is now optional.
Which spec(s) will this pull request impact?
provider