Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split C code into more files #471

Merged
merged 23 commits into from
Aug 6, 2024
Merged

Split C code into more files #471

merged 23 commits into from
Aug 6, 2024

Conversation

jtraglia
Copy link
Member

@jtraglia jtraglia commented Aug 5, 2024

This PR continues to split the core C code into more manageable files.

Four new directories have been created:

  • common -- Contains general code, some of which is shared between eip4844 and eip7594.
    • For example, it contains definitions for BLS points/fields, memory allocations, and various utilities.
  • eip4844 -- As the name implies, contains the implementation for EIP-4844.
    • The core file in this directory (eip4844.c) has been mostly left as-as. Don't poke the audited dragon.
  • eip7594 -- As the name implies, contains the implementation for EIP-7594.
    • There are several files in this directory, such as fk20.c which handles FK20 proof generation, recovery.c which implements cell recovery, and fft.c which provides methods for FFT.
  • setup -- Contains code pertaining to loading the trusted setup.
    • This also contains the definition for KZGSettings, which didn't quite make sense in common. This is a bit of a weird situation. We don't want to include it in eip4844 because it's been updated for eip7594.

We needed to include the C source directory in the Rust bindings generator so the updated includes would work.

There are various other small changes, but no changes to the implementation have been made here.

Added InsertNewlineAtEOF: true to the clang formatter configuration file. Several of my new files did not contain a newline at the bottom, which showed an ugly crossed out circle on GitHub.

@asn-d6
Copy link
Contributor

asn-d6 commented Aug 5, 2024

This looks great!

Some thoughts:

  • Not sure if I prefer having separate files for constants.h and types.h over just stuffing them into the (already quite small) api.h.
  • I'm not sure if I love the eip7594/api.c entry point. I would go for eip7594/eip7594.c but I understand that might not be best either.

Both points above are just personal preference, and I hold no strong opinions.

@asn-d6
Copy link
Contributor

asn-d6 commented Aug 6, 2024

Some thoughts after reviewing this PR.

As a personal preference, I find oversplitting can sometimes be counterproductive. Having files that hold a single function or type can introduce a mental overhead when trying to locate specific functions, and it can also create a practical overhead in terms of handling numerous files.

IMO for c-kzg, file splitting should aim to highlight the responsibilities of each module, and to specify the scope of objects.
On this regard, I think this PR takes good directions by moving generic algebra functionality to common/ and keeping the Ethereum protocol functonality into eip4844/ and eip7594/. We have also agreed that it's OK for eip7594/ to import eip4844/ because that's how the spec does it, but the opposite direction is not OK.

Some notes on the PR:

  • On common/:
    • I would merge g1.c and g2.c into ec.c that contains elliptic curve operations.
    • I would remove ret.c since it's empty (as well as settings.c and any other empty file).
    • I would rename helpers.c to utils.c. (Nit)
  • Abstractions:
    • I think it's debatable whether KZGSettings should be in common/ since it contains protocol-specific information. I'm not sure if it should be in eip4844/ either though, since it contains 7594 information. Keeping it as is might be the best option now, but this abstraction leak is something to note.
    • I would move KZGCommitment and KZGProof to eip4844/ since they are protocol-specific structures and not generic.
  • On eip4844/:
    • I would rename api.c to eip4844.c. IMO it's fine to have a repeated eip4844/eip4844.c structure as it indicates that the file contains the main implementation of the 4844 module. I would consider doing the same for 7594.
  • On eip7594/:
    • I would probably split it into the following modules:
      • eip7594.c which contains all the public methods as well as central EIP7594 types (like Cell)
      • fft.c which contains field FFT, group FFT, and coset FFT
      • fk20.c which contains the FK20 codebase
      • recovery.c which contains the recovery implementation

@jtraglia jtraglia marked this pull request as ready for review August 6, 2024 14:21
Copy link
Contributor

@asn-d6 asn-d6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! Thanks a lot!

@asn-d6 asn-d6 merged commit ff4f3b7 into ethereum:main Aug 6, 2024
38 checks passed
@jtraglia jtraglia deleted the more-split branch August 6, 2024 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants