Define library extras and refactor S3 readers #71
Comments
Handling of files on the cloud using pandas storage options:
Maybe instead of adding an extra for S3, we could write installation docs that describe the S3 compatibility and only define the storage options for the engine, with the use of:
In any case, users who want S3 compatibility can always install the required extras. Resolving the dependencies for the pandas extras aws and gcp takes a very long time, but once they are pinned in the lock file it is fast again. It also downloads a lot of extra libraries that are only used for S3 and are not needed by the rest of the library.
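To make the idea above concrete, here is a minimal sketch of how the engine could keep the cloud backends optional and simply forward `storage_options` to pandas. The helper names (`required_backend`, `ensure_backend`, `load_csv`) and the scheme-to-package table are assumptions for illustration, not existing vtlengine code. The only library behaviour relied on is that `pandas.read_csv` accepts `storage_options` and that pandas uses `s3fs` for `s3://` and `gcsfs` for `gs://` paths.

```python
from __future__ import annotations

import importlib.util
from typing import Any, Optional
from urllib.parse import urlparse

# Hypothetical table: URL scheme -> fsspec backend package pandas needs for it.
_BACKENDS = {"s3": "s3fs", "gs": "gcsfs"}


def required_backend(path: str) -> Optional[str]:
    """Return the backend package needed for `path`, or None for local paths."""
    return _BACKENDS.get(urlparse(path).scheme)


def ensure_backend(path: str) -> None:
    """Raise a helpful ImportError if the optional backend is not installed."""
    backend = required_backend(path)
    if backend is not None and importlib.util.find_spec(backend) is None:
        raise ImportError(
            f"Reading '{path}' needs the optional package '{backend}'. "
            f"Install it first, for example with: pip install {backend}"
        )


def load_csv(path: str, storage_options: Optional[dict[str, Any]] = None):
    """Read a CSV from a local path or a cloud URL (sketch, not engine code)."""
    import pandas as pd  # imported lazily so the helpers above stay dependency-free

    ensure_backend(path)
    if required_backend(path) is None:
        # Local file: storage_options only applies to fsspec-style URLs.
        return pd.read_csv(path)
    return pd.read_csv(path, storage_options=storage_options)
```

With something like this, a plain install never touches the S3 libraries, and a user who does need S3 gets a clear message about what to install instead of an import failure deep inside pandas.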
Some notes:
I am attaching the full list of Ubuntu 24.04 (latest LTS Ubuntu) Python libraries so that an effort to make the VTL Engine runnable on plain Ubuntu can be done.
Overview
Currently, S3 support pulls in quite a lot of libraries (15) that are always installed regardless of the use case. Almost all functionality in the library should rely on as few dependencies as possible.
When attempting to install vtlengine as a dependency, the poetry dependency resolver takes a lot of time.
Task to perform