feat(ICP-archive): migrate icp archive to stable structures #3910
base: master
Conversation
Thanks @maciejdfinity!
const MAX_MEMORY_SIZE_BYTES_MEMORY_ID: MemoryId = MemoryId::new(0);
const BLOCK_HEIGHT_OFFSET_MEMORY_ID: MemoryId = MemoryId::new(1);
const TOTAL_BLOCK_SIZE_MEMORY_ID: MemoryId = MemoryId::new(2);
const LEDGER_CANISTER_ID_MEMORY_ID: MemoryId = MemoryId::new(3);
const BLOCK_LOG_INDEX_MEMORY_ID: MemoryId = MemoryId::new(4);
const BLOCK_LOG_DATA_MEMORY_ID: MemoryId = MemoryId::new(5);
In the ICRC archive, we have a single ArchiveConfig in a StableCell. What's the rationale for having these as separate MemoryIds, and additionally, for the thread_local! _CACHE RefCells? Is it, e.g., easier to add new fields if they are all separate cells? What about the cache: is that faster or more efficient in terms of instructions, or otherwise simplifying things?
The rationale behind separate cells was that I didn't want to copy the whole state to read one integer. I am not sure how expensive that would be, since we only have 4 fields (although the canister id is slightly larger). Maybe there is some better way, where we keep a local copy of the state and return a reference. As for the cache, I am thinking that since TOTAL_BLOCK_SIZE is modified often, the cache might not be that beneficial there. However, the other fields are almost never changed, so it should be helpful for them. I could run some simple benchmarks.
I see, thanks for the explanation! I suppose we would use a similar approach for the ledger state if we move that to stable structures as well (to eventually be able to get rid of the pre- and post-upgrade serialization and deserialization dance)? I tried to see what is done in other canisters, and I saw that the governance canister has a State struct that is used to manage the different data structures stored in individual memories in stable memory. Would it make sense to have something like that for the archive also (and later for the ledger)? It's maybe not so critical for the archive, since, as you point out, we don't have that many fields, but on the other hand, maybe here it would be easier to come up with a design that we could later reuse for the ledger. Then again, from skimming through the governance canister, it seems they do perform non-trivial initialization in the post-upgrade, so if we want to get rid of that, maybe their approach wouldn't work for us anyway?
Thanks for the pointer! I will have a closer look!
Another option is to have ARCHIVE_STATE_CACHE and access it similarly to how we do in the ICRC ledger, i.e. Access::with_archive_state(|state| {...}) and Access::with_archive_state_mut. The mut version would also need to update the underlying stable cell that contains the whole state.
Or we could skip the cache and read the state directly. I guess if the state is large, reading the whole state might be expensive, but I don't have exact numbers.
if max_memory_size_bytes < total_block_size() {
    ic_cdk::trap(&format!(
        "Cannot set max_memory_size_bytes to {}, because it is lower than total_block_size {}.",
        max_memory_size_bytes,
        total_block_size()
    ));
}
Does it make sense to have some upper limit on the max memory size? It's probably less critical for the ICP ledger archive than for the ICRC ledger archive, in the sense that whoever sets this should know what they're doing. Maybe having an alert if the size grows too big is enough.
I think we will set the limit to something more conservative (< 50 GiB?). We could have an alert when the size reaches e.g. 50% of the limit and decide what to do then.
setup.upgrade(Some(2 * encoded_block_size), None);
setup.assert_remaining_capacity(0);

setup.upgrade(Some(u64::MAX), None);
Regarding having a ceiling on the max size: it may be better to have something like 500 GB, so that the archive can return a more reasonable error to the ledger before the canister limit is reached, since hitting the canister limit may be more troublesome to handle (and test)?
The archive blocks are stored in a stable log. Since the migration requires only ~24B instructions, the whole migration is performed in post_upgrade. The archive has a new argument that can be used to specify a new archive size limit; the argument can be skipped.