fix: make Block().* return correct ABI based ipld.ErrNotFound errors #156

Jorropo · 2022-03-27T12:17:35Z

This fixes the tests for ipfs/kubo#8815

There are probably other APIs that need the same treatment; however, we do not have failing tests for them yet.

Blocking PR and replace rules to be removed:

guseggert · 2022-03-28T12:47:23Z

abyfy_errors.go

+
+// This file handle parsing and returning the correct ABI based errors from error messages
+//lint:ignore ST1008 this function is not using the error as a mean to return failure but it massages it to return the correct type
+func abyfyIpldNotFound(msg string) (error, bool) {


This seems pretty complicated and brittle. Does every impl of this HTTP client have to do this? Should we consider changing the API to return structured errors, or do this server side, so clients don't have to resort to these contortions?

Some summarizing from the issue as discussed in https://discord.com/channels/806902334369824788/956547968385822730/957037148853395496.

IIUC from the interface tests, we previously had a weak contract on the CoreAPI that functions like Block().Get() and Block().Rm() also needed to return things like blockstore.ErrNotFound (or really, that the error message would be something like blockstore: block not found); see ipfs/interface-go-ipfs-core@01ee941. Now we've broken that weak contract by changing the text and are figuring out the best way to restore it.

Does every impl of this HTTP client have to do this?

Not necessarily, it depends what they're trying to do. In this case, for an HTTP API to be compatible with interface-go-ipfs-core's CoreAPI interfaces, it needs to do some sort of error checking to figure out what type of error it received. Maybe we could drop that requirement, but idk that we have to.

Should we consider changing the API to return structured errors

You're right that it might be worth changing the HTTP API to have some contract on the types of errors it returns. For me the question boils down to the value tradeoff here, i.e. the number of server and client implementations of this API that need to care about errors vs. how much effort it is to return structured errors. Given the low number of servers and the low number of clients that care strongly about the error types, it seems like we can probably skip this for now.

or do this server side, so clients don't have to resort to these contortions?

We can make some of this easier server side, although currently there are two implementers of this API (go-ipfs and js-ipfs), with most other IPFS implementations choosing not to implement this HTTP API at all. Some contortions will still be necessary because go-ipfs and js-ipfs don't have the same error messages at the moment (especially since we just changed them in go-ipfs when resolving ipfs/kubo#7074).

OK thanks. For reference, here are some active community client impls (other than Go and JS):

https://github.com/ipfs-shipyard/py-ipfs-http-client

https://github.com/ferristseng/rust-ipfs-api

https://github.com/vasild/cpp-ipfs-http-client

https://github.com/Peergos/Peergos/blob/00e2a8d121473002fa9bdc1ca3f90ecf7f9d33e8/src/peergos/shared/storage/ContentAddressedStorage.java#L257

I don't know how these are used and whether they care about this or not.

+1 with @aschmahmann

Should we consider changing the API to return structured errors, or do this server side, so clients don't have to resort to these contortions?

I believe what surfaced from our discussion (see Adin's Discord link) is that we can do that later.

Right now it's fine because there is only one error type that needs that treatment. But if we spread that pattern in the future, I think we will use a list of error IDs (or automatically generated ones).

I don't know how these are used and whether they care about this or not.

My thought is that grep.app would've shown more code caring about the previous error text if code built around those clients cared. Since there doesn't seem to be much there it appears to be safe to change the error text and that others aren't relying on knowing the error types.

For reference, here are some active community client impls

I don't think it matters to them, as this is only required if HTTP clients want to be compatible with the CoreAPI objects (an extended weak contract; if you only use errors to show error messages to your users, for example, you don't care about it).

Those objects don't exist in other languages, and even where they might (maybe JS?), they don't know about ipld.ErrNotFound.


aschmahmann · 2022-03-28T13:36:38Z


schomatis · 2022-04-01T15:35:54Z

errors.go

+		// Assume that CIDs only contain a-zA-Z0-9 characters.
+		// This is true because go-ipld-format use go-cid#Cid.String which use base{3{2,6},58}.
+		postIndex = strings.IndexFunc(msgPostKey, notAsciiLetterOrDigits)
+		if postIndex < 0 {
+			postIndex = len(msgPostKey)
+		}
+
+		var err error


Do we really need the complexity brought by this? Can't we just assume the CID is separated by a space, instead? (if we have to assume anything at all)

Just a space seems incomplete; I would like to include " too.

And at this point, instead of searching a list of things I want to break on, I thought it was simpler to keep a list of things I do not want to break on.

This is very static code that should never need an update (unless this parsing code is removed) and has ~100% test coverage, so it's not really adding maintenance costs.

Edit: this code could be made simpler by removing the LUT; however, not breaking on a-zA-Z0-9 seems useful to me.

this code could be made simpler by removing the LUT

Yes, that.

however not breaking on a-zA-Z0-9 seems useful to me.

I'm not sure how this keeps something from not breaking, will need to take a closer look. In general, hard-coding a basic alphanumeric detector along with its tests sounds like unneeded complexity, if not in maintenance costs (though more code always seems to imply those) then at least in my selfish reviewing costs.

I'm not sure how this keeps something from not breaking

I didn't mean breaking code; I meant breaking as in control flow, i.e. breaking out of the IndexFunc loop.

Since both you and Gus raised the same point (#156 (comment)), I've replaced it with breaking on " \t\n\r\v\f;\"" (whitespace, a semicolon (produced by multierr when concatenating), or " for %q and %#v).
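The delimiter-based cut described in the last comment can be sketched as follows. This is only an illustration of the technique using the standard library; tokenAfter and the example message are hypothetical, and the actual code in errors.go may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// delimiters lists the characters that end the token: ASCII whitespace,
// a semicolon, and a double quote, as listed in the comment above.
const delimiters = " \t\n\r\v\f;\""

// tokenAfter returns the text that follows key in msg, cut at the first
// delimiter. The boolean is false when key does not occur in msg.
func tokenAfter(msg, key string) (string, bool) {
	i := strings.Index(msg, key)
	if i < 0 {
		return "", false
	}
	rest := msg[i+len(key):]
	if end := strings.IndexAny(rest, delimiters); end >= 0 {
		rest = rest[:end]
	}
	return rest, true
}

func main() {
	// The semicolon mimics what a multierr-style concatenation would produce.
	tok, ok := tokenAfter("ipld: could not find QmExample; retrying", "could not find ")
	fmt.Println(tok, ok)
}
```

Listing the characters to stop on, rather than the characters a CID may contain, keeps the parser independent of which multibase alphabet the server used.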

guseggert · 2022-03-31T23:09:13Z

go.mod

+
+replace github.com/ipfs/go-ipld-format => github.com/Jorropo/go-ipld-format v0.3.2-0.20220330014726-942265d1aca7
+
+replace github.com/ipfs/interface-go-ipfs-core => github.com/Jorropo/interface-go-ipfs-core v0.6.2-0.20220331215619-b98f8571cf6b


BTW from https://discord.com/channels/806902334369824788/956547968385822730/959649999988338762

The go-ipfs and go-ipfs-http-client PRs are circularly dependent; how do you want to handle that?
The tests work on my machine when I use both branches together (unhelpful as that is).

Do I make temporary replaces and a custom checkout that I'll remove before merging, so you can see a green CI light?
Another option: since go-ipfs's dependency is only in a testing workflow, we can merge go-ipfs with a custom checkout there, which unblocks go-ipfs-http-client; we merge that, and then do a new PR that just cleans up go-ipfs's checkout.

Ah okay that makes sense, thanks

Do we expect CI runs to fully pass at this point, or is the current test failure expected?

Do we expect CI runs to fully pass at this point, or is the current test failure expected?

It is the one expected. And it works on my machine.

Okay SGTM, the go-ipfs PR looks like it's ready to go right? So can we go ahead and remove the replace directives then so we can merge this?


guseggert · 2022-04-01T18:45:44Z

errors.go

+		// Assume that CIDs only contain a-zA-Z0-9 characters.
+		// This is true because go-ipld-format use go-cid#Cid.String which use base{3{2,6},58}.


Seems like a brittle assumption to make, as there already exist multibase encodings (and additional proposals) outside of this range. Would prefer not to make assumptions about the CID encoding.

I've replaced it with breaking on " \t\n\r\v\f;\"" (whitespace, a semicolon (produced by multierr when concatenating), or " for %q and %#v). Hopefully no CID we see will include those.

I see so this is actually just a bug on the server side, because there doesn't seem to be a way to parse all valid CIDs from these error strings, am I understanding that correctly?

We should add some documentation about the fact that this only supports a subset of valid CIDs, and also open an issue about this bug.

Anyway, this seems like the least-awful approach. Thanks for taking the time to fix.

I see so this is actually just a bug on the server side, because there doesn't seem to be a way to parse all valid CIDs from these error strings, am I understanding that correctly?

Currently, AFAIK, no multibase that we support uses those codepoints, so it's valid. If you use a CID that uses those codepoints, the client wouldn't know how to parse it anyway (without bumping its go-multibase dep).

The "proper" fix is encoding ipld.ErrNotFound.Cid in a separate field of some kind (LEB, JSON field, ...) so we can unambiguously know where it terminates and fetch it correctly, but that would be fixing an issue that doesn't exist yet, since no CID we know how to parse uses those codepoints (I would like to see a failing test before fixing it).

Extra chars the encodings have outside 0-9A-Za-z:

base32pad: =

base32padupper: =

base32hexpadupper: =

base64pad: =, +, /

ex f00003E00003F00 => MAAA+AAA/AA==

base64urlpad: =, -, _

ex f00003E00003F00 => UAAA-AAA_AA==

base64: +, /

ex f00003E00003F00 => mAAA+AAA/AA

base64url: -, _

ex f00003E00003F00 => uAAA-AAA_AA

If a new encoding is added, ideally it would just work (would be the case if the cid were properly delimited rather than guessing about delimiters, so we could just blindly pass it to go-multibase). Second-best would be that an existing test would fail when we add support for the new encoding (doesn't seem to be the case here?). Least ideal is that this parsing code gets confused and parses the CID incorrectly (how it's currently implemented). I'm not sure what the implications are of that...annoying for users at the least, potentially insecure?

Extra chars the encodings have outside 0-9A-Za-z

I meant " \t\n\r\v\f;\"".

parsing code gets confused and parses the CID incorrectly

When it parses incorrectly, it sees that the CID is invalid (either the header is truncated or the multihash length doesn't match), gives up, and uses a text error instead.

potentially insecure

@guseggert

I'll change it to return a blockstoreNotFoundMatchingIPLDErrNotFound instead of a text error when it gives up (the difference is that ipld.IsNotFound(blockstoreNotFoundMatchingIPLDErrNotFound{}) returns true). However, the CID will still not be recoverable (so if someone expects to read the CID out of that error, they can't).

But no code currently does that (because in the past there were only text errors), and I don't expect anyone to write such code.

Why are we decoding CIDs in the first place? Is that part of the spirit of what we're testing here? It seems that what we want to identify is the type of error returned (and sadly we need to do that by string manipulation), but we already know what CID we're requesting (it's our test), so we can anticipate the CID string and just match against it.

schomatis

I'm happy this no longer uses the alphanumeric detection scheme.

This release includes the new ipld.ErrNotFound error support in the block API.

github-actions · 2022-04-05T18:20:05Z

Suggested version: v0.3.0
Comparing to: v0.2.0 (diff)

Changes in go.mod file(s):

diff --git a/go.mod b/go.mod
index 3ef46c5..ce3ed95 100644
--- a/go.mod
+++ b/go.mod
@@ -5,11 +5,11 @@ require (
 	github.com/ipfs/go-cid v0.0.7
 	github.com/ipfs/go-ipfs-cmds v0.6.0
 	github.com/ipfs/go-ipfs-files v0.0.8
-	github.com/ipfs/go-ipld-format v0.2.0
-	github.com/ipfs/go-merkledag v0.4.0
+	github.com/ipfs/go-ipld-format v0.4.0
+	github.com/ipfs/go-merkledag v0.6.0
 	github.com/ipfs/go-path v0.1.1
 	github.com/ipfs/go-unixfs v0.2.5
-	github.com/ipfs/interface-go-ipfs-core v0.5.2
+	github.com/ipfs/interface-go-ipfs-core v0.6.2
 	github.com/ipfs/iptb v1.4.0
 	github.com/ipfs/iptb-plugins v0.3.0
 	github.com/libp2p/go-libp2p-core v0.8.6

gorelease says:

# summary
Suggested version: v0.3.0

gocompat says:

(empty)

schomatis

34cc489 looks good.

…rNotFound fix: make Block().* return correct ABI based ipld.ErrNotFound errors This commit was moved from ipfs/go-ipfs-http-client@fdbee7c

Jorropo force-pushed the fix/ipld-ErrNotFound branch 3 times, most recently from e5b3ccf to 507bb8c Compare March 27, 2022 12:24

fix: make Block().* return correct ABI based ipld.ErrNotFound errors

4f5f8e9

Jorropo force-pushed the fix/ipld-ErrNotFound branch from 507bb8c to c15609a Compare March 27, 2022 12:51

BigLep assigned Jorropo Mar 28, 2022

BigLep requested review from aschmahmann and guseggert March 28, 2022 04:23

BigLep added this to the go-ipfs 0.13 milestone Mar 28, 2022

guseggert reviewed Mar 28, 2022


aschmahmann suggested changes Mar 28, 2022


Jorropo force-pushed the fix/ipld-ErrNotFound branch 3 times, most recently from adc28ba to f3052c4 Compare March 31, 2022 22:24

Jorropo requested a review from guseggert March 31, 2022 22:24

Jorropo added 2 commits April 1, 2022 00:32

chore: rename abyfyIpldErrNotFound to parseIPLDErrNotFound

7aa0029

fix: parseIPLDErrNotFound test

2e09c4b

Jorropo force-pushed the fix/ipld-ErrNotFound branch from f3052c4 to 48b2867 Compare March 31, 2022 22:34

schomatis reviewed Apr 1, 2022


guseggert reviewed Apr 1, 2022


Jorropo force-pushed the fix/ipld-ErrNotFound branch from 48b2867 to 5635257 Compare April 2, 2022 01:26

schomatis approved these changes Apr 4, 2022


Jorropo added 6 commits April 5, 2022 20:14

feat: update the error parsing for go-ipld-format to v0.4.0

296534f

feat: add blockstore: block not found matching too

a3354f0

fix: change CID breaking logic when parsing ipld.ErrNotFound

a3b4935

test: add a false case test to blockstore parsing

75f597a

correctness: only match CIDs matching go-cid.Cid.String output

34cc489

chore: bump version to 0.3.0

d7208ce

This release includes the new ipld.ErrNotFound error support in the block API.

Jorropo force-pushed the fix/ipld-ErrNotFound branch from 5635257 to d7208ce Compare April 5, 2022 18:18

Jorropo merged commit fdbee7c into master Apr 5, 2022

Jorropo deleted the fix/ipld-ErrNotFound branch April 5, 2022 18:19

schomatis reviewed Apr 5, 2022

