Add support for chunked download to handle large dataset files efficiently #38

Merged
merged 2 commits into from
Oct 30, 2024

Conversation

ubyndr
Collaborator

@ubyndr ubyndr commented Oct 30, 2024

This PR updates the download_dataset_with_url method to handle large dataset files efficiently by downloading them in 8 MB chunks, and adds a progress bar that gives visual feedback while the download runs. Together these changes improve memory management and the user experience for large files (e.g., 50 GB).

Changes

  • Chunked Download: Added stream=True with an 8 MB chunk size to manage memory usage during download.
  • Progress Bar: Integrated tqdm to display a real-time progress bar, showing download progress for large files.
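
For readers who want to see the pattern described above, here is a minimal sketch, not the code merged in this PR. It assumes the download uses the requests library (suggested by stream=True) together with tqdm. The names write_chunks, download_file, and CHUNK_SIZE are hypothetical, and the 30-second timeout is an illustrative choice.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MB, the chunk size named in this PR


def write_chunks(
    chunks: Iterable[bytes],
    dest: Path,
    on_progress: Optional[Callable[[int], object]] = None,
) -> int:
    """Write an iterable of byte chunks to dest and return the bytes written."""
    written = 0
    with open(dest, "wb") as fh:
        for chunk in chunks:
            if not chunk:  # skip empty chunks
                continue
            fh.write(chunk)
            written += len(chunk)
            if on_progress is not None:
                on_progress(len(chunk))  # e.g. advance a progress bar
    return written


def download_file(url: str, dest: Path, chunk_size: int = CHUNK_SIZE) -> int:
    """Stream url to dest in chunks while showing a tqdm progress bar."""
    # Third-party imports are local so write_chunks works without them installed.
    import requests
    from tqdm import tqdm

    # stream=True defers fetching the body until iter_content is consumed,
    # so only one chunk at a time is held in memory.
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        # Content-Length may be absent; tqdm accepts total=None in that case.
        total = int(response.headers.get("Content-Length", 0)) or None
        with tqdm(total=total, unit="B", unit_scale=True, desc=dest.name) as bar:
            return write_chunks(
                response.iter_content(chunk_size=chunk_size), dest, bar.update
            )
```

Splitting the file writing from the HTTP call keeps the chunk-handling logic testable without network access; the progress bar is simply passed in as a callback.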

Testing

  • Verified with a large dataset to confirm successful download, complete file integrity, and correct progress display.

This enhancement optimizes memory usage, provides better user feedback, and increases reliability for large file downloads.

@ubyndr ubyndr requested a review from hkir-dev October 30, 2024 11:30
@ubyndr ubyndr merged commit 75d920c into main Oct 30, 2024
@ubyndr ubyndr deleted the feature/chunked-download-large-files branch October 30, 2024 11:37
2 participants