docs: implement docs2.0 layout (#2689)
* fix: add white version of oso emblem to dark view

* feat(docs): add new section for adding projects to ossd

* fix(docs): rework getting started pages

* refactor(docs): rework contributing data section

* refactor(docs): clean up get data section

* refactor(docs): revisions to guides

* refactor(docs): changes to order and copy for references

* feat(docs): create new guide ToC

* chore: cleanup old files

* fix(docs): copy-pasta error

* fix: broken links

* fix(docs): resolve broken links

* fix(docs): broken anchors
ccerv1 authored Jan 4, 2025
1 parent 5373b37 commit 3c31ca2
Showing 85 changed files with 1,356 additions and 1,142 deletions.
4 changes: 0 additions & 4 deletions apps/docs/docs/contribute-data/api.md
@@ -3,8 +3,6 @@ title: Crawl an API
sidebar_position: 4
---

-import NextSteps from "./dagster-config.mdx"

We expect that one of the most common forms of data connection will be connecting
a public API to OSO. We have created tooling to make this as easy as possible.

@@ -202,5 +200,3 @@ There are a few critical changes we've made in this example:
3. The dlt resource is yielded as usual but it is instead passed the
`RESTClient` instance that has been configured with authentication
credentials.

-<NextSteps components={props.components}/>
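The `RESTClient` pattern described in point 3 above is easiest to see in code. Here is a minimal sketch of a dlt resource crawling an authenticated API; the base URL, secret path, endpoint, and resource name are all hypothetical, not OSO's actual plugin code:

```python
import dlt
from dlt.sources.helpers.rest_client import RESTClient
from dlt.sources.helpers.rest_client.auth import BearerTokenAuth

# Hypothetical API and secret path -- substitute real values.
client = RESTClient(
    base_url="https://api.example.com",
    auth=BearerTokenAuth(token=dlt.secrets["sources.example.api_token"]),
)

@dlt.resource(name="example_items", write_disposition="replace")
def example_items():
    # paginate() walks the API's pages and yields each page of records.
    for page in client.paginate("/v1/items"):
        yield page
```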
@@ -1,5 +1,5 @@
---
-title: Connect via BigQuery
+title: Connect a BigQuery Public Dataset
sidebar_position: 1
---

@@ -11,7 +11,7 @@ the US multi-region.
If you want OSO to host a copy of
the dataset in the US multi-region,
see our guide on
-[BigQuery Data Transfer Service](./replication.md).
+[BigQuery Data Transfer Service](../guides/bq-data-transfer.md).

## Make the data available in the US region

@@ -29,7 +29,7 @@ you can do this directly from the

OSO will also copy certain valuable datasets into the
`opensource-observer` project via the BigQuery Data Transfer Service.
-See the guide on [BigQuery Data Transfer Service](./replication.md)
+See the guide on [BigQuery Data Transfer Service](../guides/bq-data-transfer.md)
to add dataset replication as a Dagster asset to OSO.

## Make the data accessible to our Google service account
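To make the service-account step above concrete, here is a sketch using the `google-cloud-bigquery` client; the dataset ID and service-account email below are placeholders, not OSO's real account:

```python
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset("your-project.your_dataset")  # placeholder dataset ID

# Append a read-only grant; the email is a placeholder, not OSO's actual account.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id="oso-reader@example-project.iam.gserviceaccount.com",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```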
8 changes: 2 additions & 6 deletions apps/docs/docs/contribute-data/dagster.md
@@ -1,13 +1,11 @@
---
-title: Writing Custom Dagster Assets
+title: Write a Custom Dagster Asset
sidebar_position: 6
---

-import NextSteps from "./dagster-config.mdx"

Before writing a fully custom Dagster asset,
we recommend you first see if the previous guides on
-[BigQuery datasets](./bigquery/index.md),
+[BigQuery datasets](./bigquery.md),
[database replication](./database.md),
or [API crawling](./api.md)
may be a better fit.
@@ -151,5 +149,3 @@ gitcoin_passport_scores = interval_gcs_import_asset(
),
)
```

-<NextSteps components={props.components}/>
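For orientation, a minimal sketch of what a hand-written Dagster asset looks like; the asset name, key prefix, and payload are invented, and OSO's factories such as `interval_gcs_import_asset` layer scheduling and storage on top of this:

```python
import dagster as dg

@dg.asset(key_prefix="example_source")
def example_records() -> dg.MaterializeResult:
    # Fetch or compute your data here; this stub just fabricates two rows.
    records = [{"id": 1}, {"id": 2}]
    # ... write `records` to the warehouse ...
    return dg.MaterializeResult(metadata={"row_count": len(records)})
```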
6 changes: 1 addition & 5 deletions apps/docs/docs/contribute-data/database.md
@@ -1,10 +1,8 @@
---
-title: Replicate a Database
+title: Provide Access to Your Database
sidebar_position: 3
---

-import NextSteps from "./dagster-config.mdx"

OSO's Dagster infrastructure supports database replication into our data
warehouse using Dagster's "embedded-elt", which integrates with the
[dlt](https://dlthub.com/) library.
@@ -105,5 +103,3 @@ integrated, you will want to contact the OSO team on our
credentials (we will work out a secure method of transmission) and also ensure
that you have access to update any firewall settings that may be required for us
to access your database server.

-<NextSteps components={props.components}/>
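As a rough sketch of what the dlt side of such a replication looks like, assuming the `sql_database` source that ships with dlt (it also requires `sqlalchemy` and a database driver); the connection string, table names, and destination dataset are invented, and OSO's actual Dagster integration wraps this differently:

```python
import dlt
from dlt.sources.sql_database import sql_database

# Placeholder connection string -- in practice credentials would come from
# dlt secrets, not be hard-coded.
source = sql_database(
    credentials="postgresql://reader:secret@db.example.com:5432/appdb",
    table_names=["users", "events"],
)

pipeline = dlt.pipeline(
    pipeline_name="example_db_replication",
    destination="bigquery",
    dataset_name="example_raw",
)
pipeline.run(source)
```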
40 changes: 2 additions & 38 deletions apps/docs/docs/contribute-data/funding-data.md
@@ -1,51 +1,15 @@
---
-title: Add Funding Data
-sidebar_position: 10
+title: Upload Funding Data
+sidebar_position: 7
---

:::info
We are coordinating with several efforts to collect, clean, and visualize OSS funding data, including [RegenData.xyz](https://regendata.xyz/), [Gitcoin Grants Data Portal](https://davidgasquez.github.io/gitcoin-grants-data-portal/), and [Crypto Data Bytes](https://dune.com/cryptodatabytes/crypto-grants-analysis). We maintain a [master CSV file](https://github.com/opensource-observer/oss-funding) that maps OSO project names to funding sources. It includes grants, direct donations, and other forms of financial support. We are looking for data from a variety of sources, including both crypto and non-crypto funding platforms.
:::

-## Uploading Funding Data
-
----
-
-Add or update funding data by making a pull request to [oss-funding](https://github.com/opensource-observer/oss-funding).
-
-1. Fork [oss-funding](https://github.com/opensource-observer/oss-funding/fork).
-2. Add static data in CSV (or JSON) format to `./uploads/`.
-3. Ensure the data contains links to one or more project artifacts such as GitHub repos or wallet addresses. This is necessary in order for one of the repo maintainers to link funding events to OSS projects.
-4. Submit a pull request from your fork back to [oss-funding](https://github.com/opensource-observer/oss-funding).
-
-## Contributing Clean Data
-
----
-
-Data collective members may also transform the data to meet our schema and add a CSV version to the `./clean/` directory. You can do this by following the same process shown above, except with the file destined for the `./clean/` directory.
-
-Submissions will be validated to ensure they conform to the schema and don't contain any funding events that are already in the registry.
-
-Additions to the `./clean/` directory should include as many of the following columns as possible:
-
-- `oso_slug`: The OSO project name (leave blank or null if the project doesn't exist yet).
-- `project_name`: The name of the project (according to the funder's data).
-- `project_id`: The unique identifier for the project (according to the funder's data).
-- `project_url`: The URL of the project's grant application or profile.
-- `project_address`: The address the project used to receive the grant.
-- `funder_name`: The name of the funding source.
-- `funder_round_name`: The name of the funding round or grants program.
-- `funder_round_type`: The type of funding round (e.g., retrospective, builder grant).
-- `funder_address`: The address of the funder.
-- `funding_amount`: The amount of funding.
-- `funding_currency`: The currency of the funding amount.
-- `funding_network`: The network the funding was provided on (e.g., Mainnet, Optimism, Arbitrum, fiat).
-- `funding_date`: The date of the funding event.
-
-## Exploring Funding Data
-
----
-
-You can read or copy the latest version of the funding data directly from the [oss-funding](https://github.com/opensource-observer/oss-funding) repo.
-
-If you do something cool with the data (e.g., a visualization or analysis), please share it with us!
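To make the column schema in the removed section above concrete, here is a hypothetical row written with Python's standard csv module; every value below is invented:

```python
import csv

FIELDS = [
    "oso_slug", "project_name", "project_id", "project_url",
    "project_address", "funder_name", "funder_round_name",
    "funder_round_type", "funder_address", "funding_amount",
    "funding_currency", "funding_network", "funding_date",
]

# Entirely fabricated example row for illustration only.
row = {
    "oso_slug": "example-project",
    "project_name": "Example Project",
    "project_id": "42",
    "project_url": "https://grants.example.com/projects/42",
    "project_address": "0x0000000000000000000000000000000000000000",
    "funder_name": "Example Fund",
    "funder_round_name": "Round 1",
    "funder_round_type": "retrospective",
    "funder_address": "0x0000000000000000000000000000000000000001",
    "funding_amount": "10000",
    "funding_currency": "USDC",
    "funding_network": "Optimism",
    "funding_date": "2024-01-15",
}

# Hypothetical output path mirroring the ./clean/ convention.
with open("clean/example_funder.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow(row)
```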
8 changes: 2 additions & 6 deletions apps/docs/docs/contribute-data/gcs.md
@@ -1,10 +1,8 @@
---
-title: Connect via Google Cloud Storage (GCS)
+title: Import from Google Cloud Storage (GCS)
sidebar_position: 5
---

-import NextSteps from "./dagster-config.mdx"

We strongly prefer data partners that can provide
updated, live datasets over static snapshots.
Datasets that use this method will require OSO sponsorship
@@ -16,7 +14,7 @@ by OSO, please reach out to us on
[Discord](https://www.opensource.observer/discord).

If you prefer to handle the data storage yourself, check out the
-[Connect via BigQuery guide](./bigquery/index.md).
+[Connect via BigQuery guide](../guides/bq-data-transfer.md).

## Schedule periodic dumps to GCS

@@ -84,5 +82,3 @@ you will find a few examples of using the GCS asset factory:
- [Superchain data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/__init__.py)
- [Gitcoin Passport scores](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/gitcoin.py)
- [OpenRank reputations on Farcaster](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/karma3.py)

-<NextSteps components={props.components}/>
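To illustrate the periodic-dump step this page describes, a minimal sketch using the `google-cloud-storage` client; the bucket name, object path, and local file are placeholders for whatever OSO provisions:

```python
from google.cloud import storage

# Bucket and object names are placeholders -- OSO provisions the real bucket.
client = storage.Client()
bucket = client.bucket("oso-example-incoming")
blob = bucket.blob("my_dataset/dt=2025-01-04/data.parquet")
blob.upload_from_filename("data.parquet")  # local file produced by your dump job
```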
8 changes: 4 additions & 4 deletions apps/docs/docs/contribute-data/index.md
@@ -1,17 +1,17 @@
---
-title: Connect Your Data
+title: Contribute Data
sidebar_position: 0
---

:::info
-We're always looking for new data sources to integrate with OSO and deepen our community's understanding of open source impact. If you're a developer or data engineer, please reach out to us on [Discord](https://www.opensource.observer/discord). We'd love to partner with you to connect your database (or other external data sources) to the OSO data warehouse.
+We're always looking for new data sources to integrate with OSO and deepen our community's understanding of open source impact. If you're a developer or data engineer, please reach out to us on [Discord](https://www.opensource.observer/discord). We'd love to partner with you to connect your database (or other external data sources) to the OSO data lake.
:::

There are currently the following patterns for integrating new data sources into OSO,
in order of preference:

-1. [**BigQuery public datasets**](./bigquery/index.md): If you can maintain a BigQuery public dataset, this is the preferred and easiest route.
-2. [**Database replication**](./database.md): Replicate your database into an OSO dataset (e.g. from Postgres).
+1. [**BigQuery public datasets**](./bigquery.md): If you can maintain a BigQuery public dataset, this is the preferred and easiest route.
+2. [**Database replication**](./database.md): Provide access to your database and we can replicate it as an OSO dataset (e.g. from Postgres).
3. [**API crawling**](./api.md): Crawl an API by writing a plugin.
4. [**Files into Google Cloud Storage (GCS)**](./gcs.md): You can drop Parquet/CSV files in our GCS bucket for loading into BigQuery.
5. [**Custom Dagster assets**](./dagster.md): Write a custom Dagster asset for other unique data sources.
