proto: annotate errors regarding invalid UTF-8 with the relevant field #1228
Annotating errors with this information seems reasonable. If we do this for marshal, we presumably want to do it for unmarshal as well. At least for unmarshal, we removed context from most errors since it was often misleading as to what the exact problem is. For example, unmarshaling random garbage would often produce an error saying that some specific field was the issue, when really the entire input was bad. This occurs because the protobuf wire format has no magic number. For invalid UTF-8, it seems unlikely to encounter this error due to random data, since the data would first have to pass the length-prefix check. Another concern, which I don't know the answer to, is the performance cost of this. There's a cost in allocating a new error value that contains the field information, and there may be a cost in passing the field information up and down the call stack. cc @neild.
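To illustrate the length-prefix point, here is a minimal sketch of decoding a single length-delimited field by hand. It is not the library's decoder: `parseStringField` and both sentinel errors are hypothetical names, and it assumes the field is a string. It only shows that the tag and length prefix have to be consistent before the UTF-8 check is ever reached.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf8"
)

// Hypothetical sentinel errors for this sketch; not the library's own values.
var (
	errMalformed   = errors.New("malformed input")
	errInvalidUTF8 = errors.New("string field contains invalid UTF-8")
)

// parseStringField decodes one length-delimited field (wire type 2) from b
// and returns its field number and string value. The payload is validated
// as UTF-8 only after the tag and length prefix have been accepted.
func parseStringField(b []byte) (num uint64, s string, err error) {
	// The tag is a varint: (field_number << 3) | wire_type.
	tag, n := binary.Uvarint(b)
	if n <= 0 || tag&7 != 2 {
		return 0, "", errMalformed
	}
	b = b[n:]

	// The length prefix must not claim more bytes than remain.
	size, n := binary.Uvarint(b)
	if n <= 0 || size > uint64(len(b)-n) {
		return 0, "", errMalformed
	}
	payload := b[n : n+int(size)]

	// Only now is the UTF-8 check reached, and the field number is known.
	if !utf8.Valid(payload) {
		return tag >> 3, "", errInvalidUTF8
	}
	return tag >> 3, string(payload), nil
}

func main() {
	// Field 1, wire type 2 gives tag 0x0a; length 2; payload "hi".
	fmt.Println(parseStringField([]byte{0x0a, 0x02, 'h', 'i'}))
	// The length prefix claims 5 bytes but only 1 follows: rejected before UTF-8 is checked.
	fmt.Println(parseStringField([]byte{0x0a, 0x05, 0xff}))
	// The length is consistent, but the payload is not valid UTF-8.
	fmt.Println(parseStringField([]byte{0x0a, 0x01, 0xff}))
}
```

In this sketch the field number is already in hand when the UTF-8 check fails, which is why attaching it to the error looks feasible for this particular error.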
I think it makes sense to include the field name for marshal/unmarshal errors caused by invalid UTF-8. It's been somewhere on my to-do list; just hasn't risen to the top of it yet. I'm not worried much about the cost of allocating a new error in this case, but we do want to be sure that there isn't any additional cost paid by other errors or the non-error path.
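One possible shape for such an error is sketched below, under the assumption that the field name is available at the point where validation fails. `fieldError`, `checkString`, and `errInvalidUTF8` are hypothetical names for illustration, not the library's API. The point is that the wrapper is allocated only after validation has already failed, so the success path pays nothing extra.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errInvalidUTF8 is a hypothetical sentinel standing in for the library's error.
var errInvalidUTF8 = errors.New("string field contains invalid UTF-8")

// fieldError is a hypothetical wrapper that records which field failed.
type fieldError struct {
	field string
	err   error
}

func (e *fieldError) Error() string { return fmt.Sprintf("field %s: %v", e.field, e.err) }

// Unwrap lets errors.Is and errors.As see the underlying sentinel.
func (e *fieldError) Unwrap() error { return e.err }

// checkString validates one string value. The success path returns nil
// without allocating; the wrapper is built only once validation has failed.
func checkString(field, value string) error {
	if utf8.ValidString(value) {
		return nil
	}
	return &fieldError{field: field, err: errInvalidUTF8}
}

func main() {
	err := checkString("example.Message.name", "ok\xff")
	fmt.Println(err)
	// Callers that match on the sentinel keep working through the wrapper.
	fmt.Println(errors.Is(err, errInvalidUTF8))
}
```

Because the wrapper implements `Unwrap`, existing callers that compare against the sentinel with `errors.Is` would not need to change in this design.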
Yeah, it seems like the cost of allocating the new error would be limited by the number of messages that have UTF-8 errors in them, which, unless you're getting hit by a bunch of bad actors, seems unlikely to be a big concern. (But it has been known to accidentally turn quadratic before… 😬 )
There are almost certainly going to be better attack vectors on making unmarshal operations expensive than triggering an error allocation, so I'm not too worried about that case. (As long as we don't make it ridiculously expensive by accident, of course.)
+1
Any chances we might see some action on this :)?
+1
+1
+1
any updates on this?
Is your feature request related to a problem? Please describe.
Marshaling with the new Go proto library google.golang.org/protobuf produces "string field contains invalid UTF-8" errors, but they are too difficult to debug since one doesn't know which field has the problem.
Describe the solution you'd like
Add the field name to the error message. Something like:
Describe alternatives you've considered
Looks like the issue is related to directly converting from a slice of bytes to a string.
If that's the case, one could potentially triage that, but the field name would be much more helpful.
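Until the error carries the field name, one way to triage is to scan the message yourself before marshaling. The sketch below assumes a plain struct with top-level string fields; `user` and `invalidStringFields` are hypothetical names, and the helper does not descend into nested messages, slices, maps, or pointer fields. It also shows how a direct `string(raw)` conversion carries invalid bytes through unchanged.

```go
package main

import (
	"fmt"
	"reflect"
	"unicode/utf8"
)

// invalidStringFields returns the names of the top-level string fields of
// the struct (or pointer to struct) msg whose values are not valid UTF-8.
// It returns nil for anything that is not a struct.
func invalidStringFields(msg interface{}) []string {
	v := reflect.ValueOf(msg)
	if v.Kind() == reflect.Ptr {
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return nil
	}
	var bad []string
	for i := 0; i < v.NumField(); i++ {
		f := v.Field(i)
		if f.Kind() == reflect.String && !utf8.ValidString(f.String()) {
			bad = append(bad, v.Type().Field(i).Name)
		}
	}
	return bad
}

// user is a hypothetical stand-in for a generated message struct.
type user struct {
	Name  string
	Email string
}

func main() {
	// Bytes from a non-UTF-8 source; string(raw) copies them verbatim
	// and performs no validation.
	raw := []byte{'a', 0xff, 'b'}
	u := &user{Name: string(raw), Email: "a@example.com"}
	fmt.Println(invalidStringFields(u))
}
```

Once the offending field is known, the standard library's `strings.ToValidUTF8` can replace the invalid bytes at the conversion site, or the field can be declared as `bytes` if the data is not text.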
Additional context
This is using the new Go google.golang.org/protobuf module, at v1.25.0 at least.