docs: implement docs2.0 layout (#2689)
* fix: add white version of oso emblem to dark view

* feat(docs): add new section for adding projects to ossd

* fix(docs): rework getting started pages

* refactor(docs): rework contributing data section

* refactor(docs): clean up get data section

* refactor(docs): revisions to guides

* refactor(docs): changes to order and copy for references

* feat(docs): create new guide ToC

* chore: cleanup old files

* fix(docs): copy-pasta error

* fix: broken links

* fix(docs): resolve broken links

* fix(docs): broken anchors
ccerv1 authored Jan 4, 2025
1 parent 5373b37 commit 3c31ca2
Showing 85 changed files with 1,356 additions and 1,142 deletions.
4 changes: 0 additions & 4 deletions apps/docs/docs/contribute-data/api.md
@@ -3,8 +3,6 @@ title: Crawl an API
sidebar_position: 4
---

-import NextSteps from "./dagster-config.mdx"

We expect that one of the most common forms of data connection will be connecting
a public API to OSO. We have created tooling to make this as easy as possible.

@@ -202,5 +200,3 @@ There are a few critical changes we've made in this example:
3. The dlt resource is yielded as usual but it is instead passed the
`RESTClient` instance that has been configured with authentication
credentials.

-<NextSteps components={props.components}/>
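The `RESTClient` pattern described in point 3 above is easiest to see in code. Here is a minimal sketch of a dlt resource crawling an authenticated API; the base URL, secret path, endpoint, and resource name are all hypothetical, not OSO's actual plugin code:

```python
import dlt
from dlt.sources.helpers.rest_client import RESTClient
from dlt.sources.helpers.rest_client.auth import BearerTokenAuth

# Hypothetical API and secret path -- substitute real values.
client = RESTClient(
    base_url="https://api.example.com",
    auth=BearerTokenAuth(token=dlt.secrets["sources.example.api_token"]),
)

@dlt.resource(name="example_items", write_disposition="replace")
def example_items():
    # paginate() walks the API's pages and yields each page of records.
    for page in client.paginate("/v1/items"):
        yield page
```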
@@ -1,5 +1,5 @@
---
-title: Connect via BigQuery
+title: Connect a BigQuery Public Dataset
sidebar_position: 1
---

@@ -11,7 +11,7 @@ the US multi-region.
If you want OSO to host a copy of
the dataset in the US multi-region,
see our guide on
-[BigQuery Data Transfer Service](./replication.md).
+[BigQuery Data Transfer Service](../guides/bq-data-transfer.md).

## Make the data available in the US region

@@ -29,7 +29,7 @@ you can do this directly from the

OSO will also copy certain valuable datasets into the
`opensource-observer` project via the BigQuery Data Transfer Service.
-See the guide on [BigQuery Data Transfer Service](./replication.md)
+See the guide on [BigQuery Data Transfer Service](../guides/bq-data-transfer.md)
to add dataset replication as a Dagster asset to OSO.

## Make the data accessible to our Google service account
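To make the service-account step above concrete, here is a sketch using the `google-cloud-bigquery` client; the dataset ID and service-account email below are placeholders, not OSO's real account:

```python
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset("your-project.your_dataset")  # placeholder dataset ID

# Append a read-only grant; the email is a placeholder, not OSO's actual account.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id="oso-reader@example-project.iam.gserviceaccount.com",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```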
8 changes: 2 additions & 6 deletions apps/docs/docs/contribute-data/dagster.md
@@ -1,13 +1,11 @@
---
-title: Writing Custom Dagster Assets
+title: Write a Custom Dagster Asset
sidebar_position: 6
---

-import NextSteps from "./dagster-config.mdx"

Before writing a fully custom Dagster asset,
we recommend you first see if the previous guides on
-[BigQuery datasets](./bigquery/index.md),
+[BigQuery datasets](./bigquery.md),
[database replication](./database.md),
or [API crawling](./api.md)
may be a better fit.
@@ -151,5 +149,3 @@ gitcoin_passport_scores = interval_gcs_import_asset(
),
)
```

-<NextSteps components={props.components}/>
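For orientation, a minimal sketch of what a hand-written Dagster asset looks like; the asset name, key prefix, and payload are invented, and OSO's factories such as `interval_gcs_import_asset` layer scheduling and storage on top of this:

```python
import dagster as dg

@dg.asset(key_prefix="example_source")
def example_records() -> dg.MaterializeResult:
    # Fetch or compute your data here; this stub just fabricates two rows.
    records = [{"id": 1}, {"id": 2}]
    # ... write `records` to the warehouse ...
    return dg.MaterializeResult(metadata={"row_count": len(records)})
```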
6 changes: 1 addition & 5 deletions apps/docs/docs/contribute-data/database.md
@@ -1,10 +1,8 @@
---
-title: Replicate a Database
+title: Provide Access to Your Database
sidebar_position: 3
---

-import NextSteps from "./dagster-config.mdx"

OSO's Dagster infrastructure supports database replication into our data
warehouse using Dagster's "embedded-elt", which integrates with the
[dlt](https://dlthub.com/) library.
@@ -105,5 +103,3 @@ integrated, you will want to contact the OSO team on our
credentials (we will work out a secure method of transmission) and also ensure
that you have access to update any firewall settings that may be required for us
to access your database server.

-<NextSteps components={props.components}/>
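As a rough sketch of what the dlt side of such a replication looks like, assuming the `sql_database` source that ships with dlt (it also requires `sqlalchemy` and a database driver); the connection string, table names, and destination dataset are invented, and OSO's actual Dagster integration wraps this differently:

```python
import dlt
from dlt.sources.sql_database import sql_database

# Placeholder connection string -- in practice credentials would come from
# dlt secrets, not be hard-coded.
source = sql_database(
    credentials="postgresql://reader:secret@db.example.com:5432/appdb",
    table_names=["users", "events"],
)

pipeline = dlt.pipeline(
    pipeline_name="example_db_replication",
    destination="bigquery",
    dataset_name="example_raw",
)
pipeline.run(source)
```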
40 changes: 2 additions & 38 deletions apps/docs/docs/contribute-data/funding-data.md
@@ -1,51 +1,15 @@
---
-title: Add Funding Data
-sidebar_position: 10
+title: Upload Funding Data
+sidebar_position: 7
---

:::info
We are coordinating with several efforts to collect, clean, and visualize OSS funding data, including [RegenData.xyz](https://regendata.xyz/), [Gitcoin Grants Data Portal](https://davidgasquez.github.io/gitcoin-grants-data-portal/), and [Crypto Data Bytes](https://dune.com/cryptodatabytes/crypto-grants-analysis). We maintain a [master CSV file](https://github.com/opensource-observer/oss-funding) that maps OSO project names to funding sources. It includes grants, direct donations, and other forms of financial support. We are looking for data from a variety of sources, including both crypto and non-crypto funding platforms.
:::

-## Uploading Funding Data
-
----
-
-Add or update funding data by making a pull request to [oss-funding](https://github.com/opensource-observer/oss-funding).
-
-1. Fork [oss-funding](https://github.com/opensource-observer/oss-funding/fork).
-2. Add static data in CSV (or JSON) format to `./uploads/`.
-3. Ensure the data contains links to one or more project artifacts such as GitHub repos or wallet addresses. This is necessary in order for one of the repo maintainers to link funding events to OSS projects.
-4. Submit a pull request from your fork back to [oss-funding](https://github.com/opensource-observer/oss-funding).
-
-## Contributing Clean Data
-
----
-
-Data collective members may also transform the data to meet our schema and add a CSV version to the `./clean/` directory. You can do this by following the same process shown above, except with the file destined for the `./clean/` directory.
-
-Submissions will be validated to ensure they conform to the schema and don't contain any funding events that are already in the registry.
-
-Additions to the `./clean/` directory should include as many of the following columns as possible:
-
-- `oso_slug`: The OSO project name (leave blank or null if the project doesn't exist yet).
-- `project_name`: The name of the project (according to the funder's data).
-- `project_id`: The unique identifier for the project (according to the funder's data).
-- `project_url`: The URL of the project's grant application or profile.
-- `project_address`: The address the project used to receive the grant.
-- `funder_name`: The name of the funding source.
-- `funder_round_name`: The name of the funding round or grants program.
-- `funder_round_type`: The type of funding round (e.g., retrospective, builder grant).
-- `funder_address`: The address of the funder.
-- `funding_amount`: The amount of funding.
-- `funding_currency`: The currency of the funding amount.
-- `funding_network`: The network the funding was provided on (e.g., Mainnet, Optimism, Arbitrum, fiat).
-- `funding_date`: The date of the funding event.
-
-## Exploring Funding Data
-
----
-
-You can read or copy the latest version of the funding data directly from the [oss-funding](https://github.com/opensource-observer/oss-funding) repo.
-
-If you do something cool with the data (e.g., a visualization or analysis), please share it with us!
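To make the column schema in the removed section above concrete, here is a hypothetical row written with Python's standard csv module; every value below is invented:

```python
import csv

FIELDS = [
    "oso_slug", "project_name", "project_id", "project_url",
    "project_address", "funder_name", "funder_round_name",
    "funder_round_type", "funder_address", "funding_amount",
    "funding_currency", "funding_network", "funding_date",
]

# Entirely fabricated example row for illustration only.
row = {
    "oso_slug": "example-project",
    "project_name": "Example Project",
    "project_id": "42",
    "project_url": "https://grants.example.com/projects/42",
    "project_address": "0x0000000000000000000000000000000000000000",
    "funder_name": "Example Fund",
    "funder_round_name": "Round 1",
    "funder_round_type": "retrospective",
    "funder_address": "0x0000000000000000000000000000000000000001",
    "funding_amount": "10000",
    "funding_currency": "USDC",
    "funding_network": "Optimism",
    "funding_date": "2024-01-15",
}

# Hypothetical output path mirroring the ./clean/ convention.
with open("clean/example_funder.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow(row)
```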
8 changes: 2 additions & 6 deletions apps/docs/docs/contribute-data/gcs.md
@@ -1,10 +1,8 @@
---
-title: Connect via Google Cloud Storage (GCS)
+title: Import from Google Cloud Storage (GCS)
sidebar_position: 5
---

-import NextSteps from "./dagster-config.mdx"

We strongly prefer data partners that can provide
updated, live datasets over static snapshots.
Datasets that use this method will require OSO sponsorship
@@ -16,7 +14,7 @@ by OSO, please reach out to us on
[Discord](https://www.opensource.observer/discord).

If you prefer to handle the data storage yourself, check out the
-[Connect via BigQuery guide](./bigquery/index.md).
+[Connect via BigQuery guide](../guides/bq-data-transfer.md).

## Schedule periodic dumps to GCS

@@ -84,5 +82,3 @@ you will find a few examples of using the GCS asset factory:
- [Superchain data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/__init__.py)
- [Gitcoin Passport scores](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/gitcoin.py)
- [OpenRank reputations on Farcaster](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/karma3.py)

-<NextSteps components={props.components}/>
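To illustrate the periodic-dump step this page describes, a minimal sketch using the `google-cloud-storage` client; the bucket name, object path, and local file are placeholders for whatever OSO provisions:

```python
from google.cloud import storage

# Bucket and object names are placeholders -- OSO provisions the real bucket.
client = storage.Client()
bucket = client.bucket("oso-example-incoming")
blob = bucket.blob("my_dataset/dt=2025-01-04/data.parquet")
blob.upload_from_filename("data.parquet")  # local file produced by your dump job
```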
8 changes: 4 additions & 4 deletions apps/docs/docs/contribute-data/index.md
@@ -1,17 +1,17 @@
---
-title: Connect Your Data
+title: Contribute Data
sidebar_position: 0
---

:::info
-We're always looking for new data sources to integrate with OSO and deepen our community's understanding of open source impact. If you're a developer or data engineer, please reach out to us on [Discord](https://www.opensource.observer/discord). We'd love to partner with you to connect your database (or other external data sources) to the OSO data warehouse.
+We're always looking for new data sources to integrate with OSO and deepen our community's understanding of open source impact. If you're a developer or data engineer, please reach out to us on [Discord](https://www.opensource.observer/discord). We'd love to partner with you to connect your database (or other external data sources) to the OSO data lake.
:::

There are currently the following patterns for integrating new data sources into OSO,
in order of preference:

-1. [**BigQuery public datasets**](./bigquery/index.md): If you can maintain a BigQuery public dataset, this is the preferred and easiest route.
-2. [**Database replication**](./database.md): Replicate your database into an OSO dataset (e.g. from Postgres).
+1. [**BigQuery public datasets**](./bigquery.md): If you can maintain a BigQuery public dataset, this is the preferred and easiest route.
+2. [**Database replication**](./database.md): Provide access to your database and we can replicate it as an OSO dataset (e.g. from Postgres).
3. [**API crawling**](./api.md): Crawl an API by writing a plugin.
4. [**Files into Google Cloud Storage (GCS)**](./gcs.md): You can drop Parquet/CSV files in our GCS bucket for loading into BigQuery.
5. [**Custom Dagster assets**](./dagster.md): Write a custom Dagster asset for other unique data sources.
