This repository has been archived by the owner on Apr 6, 2023. It is now read-only.
Add docs #13
Closed
13 commits
- 8ab5972 Conflict (mariosasko)
- 8722860 Minor code fixes (mariosasko)
- be5003f Add docs (mariosasko)
- 1eba75b Newlines (mariosasko)
- cd49eca Apply suggestions from code review (mariosasko)
- bc10894 Minor improvements (mariosasko)
- e458d8b Mention Polars integration (mariosasko)
- 42ca1d4 Revision in example (mariosasko)
- 2fdfc42 Merge branch 'main' of github.com:huggingface/hffs into docs (mariosasko)
- 88afadc Remove duckdb hack and update duckdb example (mariosasko)
- f7d41c7 Merge branch 'main' into docs (lhoestq)
- 157966c update docs (lhoestq)
- e541f4e Update index.mdx (mariosasko)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
name: Build documentation | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
- doc-builder* | ||
- v*-release | ||
|
||
jobs: | ||
build: | ||
uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main | ||
with: | ||
commit_sha: ${{ github.sha }} | ||
package: hffs | ||
secrets: | ||
token: ${{ secrets.HUGGINGFACE_PUSH }} | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
name: Build PR Documentation | ||
|
||
on: | ||
pull_request: | ||
|
||
concurrency: | ||
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }} | ||
cancel-in-progress: true | ||
|
||
jobs: | ||
build: | ||
uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main | ||
with: | ||
commit_sha: ${{ github.event.pull_request.head.sha }} | ||
pr_number: ${{ github.event.number }} | ||
package: hffs |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
name: Delete dev documentation | ||
|
||
on: | ||
pull_request: | ||
types: [ closed ] | ||
|
||
|
||
jobs: | ||
delete: | ||
uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main | ||
with: | ||
pr_number: ${{ github.event.number }} | ||
package: hffs |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
name: Self-assign | ||
on: | ||
issue_comment: | ||
types: created | ||
jobs: | ||
one: | ||
runs-on: ubuntu-latest | ||
if: >- | ||
(github.event.comment.body == '#take' || | ||
github.event.comment.body == '#self-assign') | ||
&& !github.event.issue.assignee | ||
steps: | ||
- run: | | ||
echo "Assigning issue ${{ github.event.issue.number }} to ${{ github.event.comment.user.login }}" | ||
curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees | ||
curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -X "DELETE" https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/labels/help%20wanted |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
<!--- | ||
Copyright 2020 The HuggingFace Team. All rights reserved. | ||
|
||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
# Generating the documentation | ||
|
||
To generate the documentation, first install `doc-builder`, the tool that builds it: | ||
|
||
```bash | ||
pip install git+https://github.com/huggingface/doc-builder | ||
``` | ||
|
||
--- | ||
**NOTE** | ||
|
||
You only need to generate the documentation to inspect it locally (if you're planning changes and want to | ||
check how they look before committing for instance). You don't have to commit the built documentation. | ||
|
||
--- | ||
|
||
## Building the documentation | ||
|
||
Once you have set up `doc-builder` and any additional packages, you can generate the documentation by | ||
typing the following command: | ||
|
||
```bash | ||
doc-builder build hffs docs/source --build_dir ~/tmp/test-build | ||
``` | ||
|
||
You can adapt the `--build_dir` to set any temporary folder that you prefer. This command will create it and generate | ||
the MDX files that will be rendered as the documentation on the main website. You can inspect them in your favorite | ||
Markdown editor. | ||
|
||
## Previewing the documentation | ||
|
||
To preview the docs, first install the `watchdog` module with: | ||
|
||
```bash | ||
pip install watchdog | ||
``` | ||
|
||
Then run the following command: | ||
|
||
```bash | ||
doc-builder preview {package_name} {path_to_docs} | ||
``` | ||
|
||
For example: | ||
|
||
```bash | ||
doc-builder preview hffs docs/source/ | ||
``` | ||
|
||
The docs will be viewable at [http://localhost:3000](http://localhost:3000). You can also preview the docs once you have opened a PR: a bot will add a comment with a link to where the documentation with your changes lives. | ||
|
||
--- | ||
**NOTE** | ||
|
||
The `preview` command only works with existing doc files. When you add a completely new file, you need to update `_toctree.yml` and restart the `preview` command (press `ctrl-c` to stop it, then call `doc-builder preview ...` again). | ||
|
||
--- | ||
|
||
## Adding a new element to the navigation bar | ||
|
||
Accepted files are Markdown (.md or .mdx). | ||
|
||
Create a file with its extension and put it in the source directory. You can then link it to the toc-tree by putting | ||
the filename without the extension in the [`_toctree.yml`](https://github.com/huggingface/hffs/blob/main/docs/source/_toctree.yml) file. | ||
|
||
## Adding an image | ||
|
||
Because the repository is growing rapidly, it is important not to add files that would significantly weigh it down. This includes images, videos and other non-text files. We prefer to place these files in an hf.co-hosted `dataset`, like | ||
the ones hosted on [`hf-internal-testing`](https://huggingface.co/hf-internal-testing), and reference | ||
them by URL. We recommend putting them in the following dataset: [huggingface/documentation-images](https://huggingface.co/datasets/huggingface/documentation-images). | ||
If you are an external contributor, feel free to add the images to your PR and ask a Hugging Face member to migrate them | ||
to this dataset. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
- title: Get Started | ||
sections: | ||
- local: index | ||
title: 🤗 Filesystem | ||
- local: integration_zoo | ||
title: Integration Zoo |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
# Filesystem | ||
|
||
🤗 Filesystem (`hffs`) is a package that provides a Pythonic, [fsspec-compatible](https://filesystem-spec.readthedocs.io/en/latest/) file interface to the [Hugging Face Hub](https://huggingface.co/). It builds on top of the [Hugging Face Hub client library](https://huggingface.co/docs/huggingface_hub/index) to read and write files and inspect repositories on the Hub. | ||
|
||
## Installation | ||
|
||
```bash | ||
pip install hffs | ||
``` | ||
|
||
## Usage | ||
|
||
`HfFileSystem` is the library's main class. It holds connection information and enables typical filesystem-style operations such as `cp`, `mv`, `ls`, `du`, `glob`, `get_file` and `put_file`. | ||
|
||
```python | ||
>>> from hffs import HfFileSystem | ||
>>> fs = HfFileSystem() | ||
|
||
>>> # List files in a directory | ||
>>> fs.ls("datasets/my-username/my-dataset-repo/data", detail=False) | ||
['datasets/my-username/my-dataset-repo/data/train.csv', 'datasets/my-username/my-dataset-repo/data/test.csv'] | ||
|
||
>>> # List all ".csv" files in a repo | ||
>>> fs.glob("datasets/my-username/my-dataset-repo/**.csv") | ||
['datasets/my-username/my-dataset-repo/data/train.csv', 'datasets/my-username/my-dataset-repo/data/test.csv'] | ||
|
||
>>> # Read the contents of a remote file | ||
>>> with fs.open("datasets/my-username/my-dataset-repo/data/train.csv", "r") as f: | ||
...     train_data = f.readlines() | ||
|
||
>>> # Read all the contents of a remote file at once as a string | ||
>>> train_data = fs.read_text("datasets/my-username/my-dataset-repo/data/train.csv") | ||
|
||
>>> # Write a remote file | ||
>>> with fs.open("datasets/my-username/my-dataset-repo/data/validation.csv", "w") as f: | ||
...     f.write("text,label\n") | ||
...     f.write("Fantastic movie!,good\n") | ||
``` | ||
|
||
The prefix for datasets is "datasets/", the prefix for spaces is "spaces/" and models don't need a prefix in the URL. | ||
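The prefix rule above can be sketched as a small helper. This is only an illustration: `repo_path` and `REPO_TYPE_PREFIXES` are hypothetical names that are not part of `hffs`, and the sketch assumes the three repo types mentioned in the text.

```python
# Hypothetical helper, not part of hffs: builds the path strings used in the
# examples above from a repo type, a repo ID and a path inside the repo.
REPO_TYPE_PREFIXES = {"dataset": "datasets/", "space": "spaces/", "model": ""}


def repo_path(repo_id: str, path_in_repo: str = "", repo_type: str = "model") -> str:
    """Join a repo type prefix, a repo ID and an in-repo path."""
    try:
        prefix = REPO_TYPE_PREFIXES[repo_type]
    except KeyError:
        raise ValueError(f"unknown repo type: {repo_type!r}") from None
    full = prefix + repo_id.strip("/")
    if path_in_repo:
        full += "/" + path_in_repo.strip("/")
    return full


print(repo_path("my-username/my-dataset-repo", "data/train.csv", repo_type="dataset"))
# datasets/my-username/my-dataset-repo/data/train.csv
print(repo_path("my-username/my-model"))
# my-username/my-model
```

The resulting string is what you would pass to calls such as `fs.ls` or `fs.open`.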
|
||
Pass the optional `revision` argument to open the filesystem at a specific revision (a branch name, a tag name or a commit hash). | ||
|
||
|
||
Unlike Python's built-in `open`, `fsspec`'s `open` defaults to binary mode, `"rb"`. This means you must explicitly set the mode to `"r"` for reading and `"w"` for writing in text mode. | ||
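The difference between the two modes can be illustrated with the standard library alone. This is an analogy rather than `fsspec`'s own code: an in-memory `io.BytesIO` stands in for a file opened in binary mode, and `io.TextIOWrapper` plays the role of the decoding layer that text mode adds.

```python
import io

# Stand-in for a remote file opened in binary mode ("rb").
raw = io.BytesIO("text,label\nFantastic movie!,good\n".encode("utf-8"))

# Binary mode: reads return bytes, which you would have to decode yourself.
first = raw.readline()
print(type(first).__name__)  # bytes

# Text mode: a decoding layer on top of the binary stream returns str.
raw.seek(0)
text = io.TextIOWrapper(raw, encoding="utf-8")
lines = text.readlines()
print(type(lines[0]).__name__)  # str
```

In practice, passing `"r"` or `"w"` to `fs.open` as in the Usage examples is enough to get `str` in and out.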
|
||
## Integration | ||
Review comment: Maybe it'd make more sense to move the Integration section before the Usage section? It might be good for the user to check if they can use a URL with an integration before they start using the filesystem operations.
Reply: This order comes from the
|
||
|
||
🤗 Filesystem can be used with any library that integrates `fsspec`, and the URL has the following structure: | ||
|
||
``` | ||
hf://[<repo_type_prefix>]<repo_id>/<path/in/repo> | ||
``` | ||
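To make the URL structure concrete, here is a minimal sketch that splits such a URL into its parts. `split_hf_url` and `HfUrl` are hypothetical names for illustration and are not how `hffs` resolves paths internally; the sketch also assumes a namespaced repo ID of the form `<namespace>/<name>`.

```python
from typing import NamedTuple


class HfUrl(NamedTuple):
    repo_type: str
    repo_id: str
    path_in_repo: str


def split_hf_url(url: str) -> HfUrl:
    """Split hf://[<repo_type_prefix>]<repo_id>/<path/in/repo> into its parts.

    Assumption for this sketch: the repo ID is "<namespace>/<name>".
    """
    scheme = "hf://"
    if not url.startswith(scheme):
        raise ValueError(f"not an hf:// URL: {url!r}")
    rest = url[len(scheme):]

    # No prefix means a model repo.
    repo_type = "model"
    for prefix, name in (("datasets/", "dataset"), ("spaces/", "space")):
        if rest.startswith(prefix):
            repo_type, rest = name, rest[len(prefix):]
            break

    parts = rest.split("/", 2)
    if len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"missing repo ID in {url!r}")
    path_in_repo = parts[2] if len(parts) > 2 else ""
    return HfUrl(repo_type, "/".join(parts[:2]), path_in_repo)


print(split_hf_url("hf://datasets/my-username/my-dataset-repo/data/train.csv"))
```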
|
||
The `revision` parameter is optional. Most integrations also allow you to pass optional parameters, such as `revision`, to the filesystem's initializer as `storage_options`, a dictionary mapping parameter names to their values: | ||
|
||
```python | ||
>>> storage_options = {"revision": "main"} | ||
``` | ||
|
||
## Authentication | ||
|
||
In many cases, you must be logged in with a Hugging Face account to interact with the Hub: | ||
|
||
```bash | ||
huggingface-cli login | ||
``` | ||
|
||
Refer to the [Login](https://huggingface.co/docs/huggingface_hub/quick-start#login) section of the Hugging Face Hub client library documentation to learn more about authentication methods on the Hub. | ||
|
||
It is also possible to log in programmatically by passing your `token` as an argument to `HfFileSystem`: | ||
|
||
```python | ||
>>> import hffs | ||
>>> fs = hffs.HfFileSystem(token=token) | ||
``` | ||
|
||
If you log in this way, be careful not to accidentally leak the token when sharing your source code! | ||
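One way to lower the risk of leaking a token is to keep it out of the source file and read it from the environment at runtime. The sketch below assumes an environment variable named `HF_TOKEN`; both that name and the `load_token` helper are choices made for this illustration, not something `hffs` requires.

```python
import os


def load_token(env_var: str = "HF_TOKEN"):
    """Return the token stored in the given environment variable, or None.

    The variable name is an assumption for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    return token or None


token = load_token()
print("token found" if token else "no token set")

# The value can then be passed to the filesystem, as in the example above:
#   fs = hffs.HfFileSystem(token=token)
```

With this approach the token never appears in the code you share, only in the environment of the machine that runs it.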
|
||
## API Reference | ||
|
||
As 🤗 Filesystem is based on [fsspec](https://filesystem-spec.readthedocs.io/en/latest/), it is compatible with most of the APIs that fsspec offers. For more details, check out fsspec's [API Reference](https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem). | ||
|
||
|
||
Read the [Integration Zoo](integration_zoo) guide to learn more about libraries that integrate with `fsspec`, allowing convenient access to the Hub through 🤗 Filesystem. | ||
|
||
If you have questions about 🤗 Filesystem, feel free to join and ask the community on our [forum](https://discuss.huggingface.co/). | ||
|
@LysandreJik Can you help me set up this secret?