Kernel lookup api method #3000

jaspervdm · 2019-08-25T23:25:07Z

This PR adds an API call to look up an on-chain kernel and the height of the block it is included in.
It has an optional minimum search height. If provided (and close to the actual height), the request will be really fast (<10ms). Without this parameter the request can currently take up to 2.5s. Seeing as in the worst case scenario this time will increase linearly with the chain head height (which keeps increasing forever), it is recommended to always set the minimum height parameter.

Requesting review from @antiochp since I am not very experienced with MMRs and as a result some implementation details might be sub-optimal.

Closes #2460

Use cases:

Wallets can check the status of a transaction without having to rely on the change output, which may or may not exist or might already be spent (wrong "unconfirmed" label for transactions that leave 0 change - plea for check tx status by kernel instead of change output grin-wallet#169)
The on-chain kernel signature is required in some scriptless scripts applications

Usage:
v1/chain/kernels/XXX?min_height=YYY&max_height=ZZZ where XXX is the excess commitment. The min_height and max_height parameters are both optional. If not supplied, min_height will be set to 0 and max_height will be set to the head of the chain. The method will start at the block height max_height and traverse the kernel MMR backwards, until either the kernel is found or min_height is reached.

Example:
Request:

/v1/chain/kernels/0939fe3dc6a35350da91c6288138b7a257e0c0322eae30bda3938229d649e2e642?max_height=324300

Response:

{
  "tx_kernel": {
    "features": {
      "Plain": {
        "fee": 12000000
      }
    },
    "excess": "0939fe3dc6a35350da91c6288138b7a257e0c0322eae30bda3938229d649e2e642",
    "excess_sig": "c6f269b98f611c03cde361467daa5c27109dea0e8b28a3189646a4e1e9ad96fe92ea86b65dc76151686c5a4f3922202bd298b665ec80ba2b1e43fb46df34f6a4"
  },
  "height": 324016,
  "mmr_index": 1946151
}

DavidBurkett · 2019-08-26T00:07:12Z

Why not index these in lmdb?

antiochp · 2019-08-26T09:55:36Z

Why not index these in lmdb?

We do this for outputs. There is also a related PR #2903 that proposes storing the height per output in lmdb.

We came to the conclusion in previous conversations that -

we need fast random access for all outputs (node needs to verify spent outputs for every tx/block)
we do not necessarily need fast random access for all kernels
- a node only cares about a limited subset of kernels (based on connected wallets)
- relative kernels and scriptless script use cases are assumed to be relatively rare

So given the tradeoff between storage costs for lots of kernels and slower lookups - we erred on the side of slower lookups.

We are comfortable maintaining an index of all unspent outputs in lmdb as the UTXO set will not grow indefinitely.
We are less comfortable indexing all kernels as this set will grow indefinitely (and vast majority of this index will never be accessed or used).

That said - a given node probably has a reasonable idea of which kernels it is interested in locally. so it may make sense to store/cache these in lmdb based on prior access.
Or maybe it makes more sense to cache these at the wallet level (in the wallet lmdb instance).

antiochp · 2019-08-26T09:58:13Z

Thinking a bit more about this. I'm torn about the "index or not to index" question.

yeastplume · 2019-08-26T10:10:35Z

Couple of things (that ties in with a few thoughts I've had on wallet improvements)

Caching a min height in a wallet along with each transaction should be trivial, but this will obviously only be useful where the tx log and data is intact. (Think something will have to be done here anyhow as I don't think it's currently storing kernel IDs in a handy place). When restoring, checking the chain, etc, the wallet should still be retrieving by output anyhow so shouldn't make much of a difference.
Another approach to the original issue is to simply have the wallet scan outputs as part of normal operation (i.e. perform the wallet_check command more or less continually, like most other wallets do for their chains). Wallets could store a birthdate block, keep track of the last block scanned between operations and start checks from last_block - sync horizon). I think this will be needed at some point anyhow for 100% wallet data robustness, so we will have to figure out how to keep this operation efficient.

jaspervdm · 2019-08-26T10:29:18Z

@antiochp Hmm I hadn't considered the relative kernel locks usecase. But even then increasing the storage requirement of kernels by ~40% might be a lot, especially since we can never prune them.

Do you want me to already accomodate for this usecase? That means moving some of the logic into the Chain, and probably providing a flag that allows to scan backwards in the MMR instead of forwards.

@yeastplume Right, the kernel lookup would only be used for unconfirmed transactions, not for restores as they happen on the output level rather than the tx level.

Another approach to the original issue is to simply have the wallet scan outputs as part of normal operation

Yes, I think incremental scanning of the utxo set is a good idea anyway. But I'm not sure if checking the tx confirmation status by looking at the presence of the inputs in the utxo set is ideal. If the user is using the seed on multiple different wallets at the same time, it could happen that the wrong tx gets marked as confirmed.

yeastplume · 2019-08-26T11:00:04Z

Yep, agreed that kernel lookup is the best way to confirm transactions, we can get this implemented then look at incremental scanning as a near-term improvement.

antiochp · 2019-08-26T12:14:34Z

Do you want me to already accomodate for this usecase? That means moving some of the logic into the Chain, and probably providing a flag that allows to scan backwards in the MMR instead of forwards.

Do we want to always scan backwards through the MMR? Presumably if we are confirming txs they will be younger txs, present in more recent blocks?

antiochp · 2019-08-26T12:15:36Z

But even then increasing the storage requirement of kernels by ~40% might be a lot, especially since we can never prune them.

I'm not sure I follow what you are saying here - you mean in terms of the lmdb index/storage (if we stored the index in the db)?

jaspervdm · 2019-08-26T13:19:55Z

Do we want to always scan backwards through the MMR? Presumably if we are confirming txs they will be younger txs, present in more recent blocks?

Makes sense, will change it.

jaspervdm · 2019-08-26T13:21:52Z

I'm not sure I follow what you are saying here - you mean in terms of the lmdb index/storage (if we stored the index in the db)?

Yes if we store an index we'd need 40% more storage per kernel.

Thinking about it some more, a malicious user could significantly slow down initial sync by creating lots of relative lock height kernels that rely on very old kernels. What if we only create the index during initial sync and delete it afterwards?

antiochp · 2019-08-26T13:33:20Z

Yeah I think there may be reasons why we would want to temporarily create an index during initial sync. But I don't think we want to assume this index is available for general usage after sync.

I would not be overly concerned about supporting relative kernels today - we can optimize initial fast sync later as you suggest.

We don't want to add 40% to the permanent storage costs today in anticipation of something that may be very rare.

garyyu · 2019-08-27T07:28:13Z

Thinking a bit more about this. I'm torn about the "index or not to index" question.

Same here :-) My left hand vote Yes for index but right hand vote for No index 😸

There could be a middle way on this:

Use index for recent kernels (pruning those indexes when blocks pruned, and no pruning if it's an archived mode node).
Use slow searching for far old kernels (if the estimated height is beyond the STATE_SYNC_THRESHOLD and not found in indexed kernels).

The fast indexed kernel querying will cover 99% transaction status querying requirement, since in most cases people will query status immediately (or at least in two days) after a transaction posted.

(As reference, here is my trying on a pure indexing solution: gottstech#3 and gottstech#13. if working as a wallet service backend, the config item pruning_kernel_index can be changed as false.)

antiochp · 2019-08-29T17:19:46Z

@jaspervdm Is this ready for final review?

jaspervdm · 2019-08-29T18:26:36Z

@antiochp I think it is! I just updated the description to reflect the latest changes.

antiochp · 2019-08-30T08:33:45Z

chain/src/txhashset/txhashset.rs

@@ -260,6 +260,29 @@ impl TxHashSet {
 			.elements_from_insertion_index(start_index, max_count)
 	}

+	/// Find a kernel with a given excess. Work backwards from `max_index` to `min_index`


This might make more sense in the rewindable_kernel_view (rather than the txhashset).
Maybe a separate PR to refactor this.

antiochp · 2019-08-30T08:41:55Z

Looks pretty fast to me. This is the coinbase kernel for the genesis block -

time curl "http://127.0.0.1:3413/v1/chain/kernels/096385d86c5cfda718aa0b7295be0adf7e5ac051edfe130593a2a257f09f78a3b1"
{
  "tx_kernel": {
    "features": "Coinbase",
    "excess": "096385d86c5cfda718aa0b7295be0adf7e5ac051edfe130593a2a257f09f78a3b1",
    "excess_sig": "142a6a482a0c64ba71ba216ba5361612696fc76fe8d5c03c79fae01cab29d050ed7567e9a6d4fbbc1b6c0bf4434b8450dafa3f29b31612fda815f6e4b2bcfd43"
  },
  "height": 0,
  "mmr_index": 1
}
curl   0.01s user 0.01s system 1% cpu 1.049 total

antiochp

👍

Kernel lookup api

ae58117

jaspervdm requested a review from antiochp August 25, 2019 23:25

Min and max height parameters, scan backwards

8ea65ed

jaspervdm mentioned this pull request Aug 26, 2019

Release Planning: Grin v2.1.0 mimblewimble/grin-pm#179

Closed

jaspervdm added 2 commits August 26, 2019 23:57

Return null instead of 404 if kernel not found

0fb1c4e

Return TxKernel instead of TxKernelPrintable

21fe076

Update description of KernelHandler

2e6ce43

antiochp reviewed Aug 30, 2019

View reviewed changes

antiochp approved these changes Aug 30, 2019

View reviewed changes

antiochp merged commit 30156cd into mimblewimble:master Aug 30, 2019

antiochp added the enhancement label Aug 30, 2019

lehnberg added this to the 2.1.0 milestone Aug 30, 2019

quentinlesceller mentioned this pull request Sep 25, 2019

Add missing doc and route for Node API #3059

Merged

jaspervdm deleted the api_kernel branch December 29, 2019 17:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kernel lookup api method #3000

Kernel lookup api method #3000

jaspervdm commented Aug 25, 2019 •

edited

Loading

DavidBurkett commented Aug 26, 2019

antiochp commented Aug 26, 2019

antiochp commented Aug 26, 2019

yeastplume commented Aug 26, 2019 •

edited

Loading

jaspervdm commented Aug 26, 2019

yeastplume commented Aug 26, 2019

antiochp commented Aug 26, 2019

antiochp commented Aug 26, 2019

jaspervdm commented Aug 26, 2019

jaspervdm commented Aug 26, 2019

antiochp commented Aug 26, 2019

garyyu commented Aug 27, 2019

antiochp commented Aug 29, 2019

jaspervdm commented Aug 29, 2019

antiochp Aug 30, 2019

antiochp commented Aug 30, 2019

antiochp left a comment

Kernel lookup api method #3000

Kernel lookup api method #3000

Conversation

jaspervdm commented Aug 25, 2019 • edited Loading

DavidBurkett commented Aug 26, 2019

antiochp commented Aug 26, 2019

antiochp commented Aug 26, 2019

yeastplume commented Aug 26, 2019 • edited Loading

jaspervdm commented Aug 26, 2019

yeastplume commented Aug 26, 2019

antiochp commented Aug 26, 2019

antiochp commented Aug 26, 2019

jaspervdm commented Aug 26, 2019

jaspervdm commented Aug 26, 2019

antiochp commented Aug 26, 2019

garyyu commented Aug 27, 2019

antiochp commented Aug 29, 2019

jaspervdm commented Aug 29, 2019

antiochp Aug 30, 2019

Choose a reason for hiding this comment

antiochp commented Aug 30, 2019

antiochp left a comment

Choose a reason for hiding this comment

jaspervdm commented Aug 25, 2019 •

edited

Loading

yeastplume commented Aug 26, 2019 •

edited

Loading