fix(docs): polishing (#2727)
* fix: update footer

* fix: cmd-k on search

* add emojis to docs index pages

* feat(docs): updated index pages for each major section
ccerv1 authored Jan 9, 2025
1 parent 03741e6 commit eec241e
Showing 7 changed files with 76 additions and 198 deletions.
32 changes: 11 additions & 21 deletions apps/docs/docs/contribute-data/index.md
@@ -3,27 +3,17 @@ title: Contribute Data
sidebar_position: 0
---

:::info
We're always looking for new data sources to integrate with OSO and deepen our community's understanding of open source impact. If you're a developer or data engineer, please reach out to us on [Discord](https://www.opensource.observer/discord). We'd love to partner with you to connect your database (or other external data sources) to the OSO data lake.
:::
# Contribute Data

There are currently the following patterns for integrating new data sources into OSO,
in order of preference:
We're always looking for new data sources to integrate with OSO. Here are the current patterns for integrating new data sources:

1. [**BigQuery public datasets**](./bigquery.md): If you can maintain a BigQuery public dataset, this is the preferred and easiest route.
2. [**Database replication**](./database.md): Provide access to your database and we can replicate it as an OSO dataset (e.g. from Postgres).
3. [**API crawling**](./api.md): Crawl an API by writing a plugin.
4. [**Files into Google Cloud Storage (GCS)**](./gcs.md): You can drop Parquet/CSV files in our GCS bucket for loading into BigQuery.
5. [**Custom Dagster assets**](./dagster.md): Write a custom Dagster asset for other unique data sources.
6. **Static files**: If the data is high quality and can only be imported via static files, please reach out to us on [Discord](https://www.opensource.observer/discord) to coordinate hand-off. This path is predominantly used for [grant funding data](./funding-data.md).
7. (deprecated) [Airbyte](./airbyte.md): a modern ELT tool
- 🗂️ [BigQuery Public Datasets](./bigquery.md) - Preferred and easiest route for maintaining a dataset
- 🗄️ [Database Replication](./database.md) - Provide access to your database for replication as an OSO dataset
- 🌐 [API Crawling](./api.md) - Crawl an API by writing a plugin
- 📁 [Files into Google Cloud Storage (GCS)](./gcs.md) - Drop Parquet/CSV files in our GCS bucket for loading into BigQuery
- ⚙️ [Custom Dagster Assets](./dagster.md) - Write a custom Dagster asset for unique data sources
- 📜 Static Files - Coordinate hand-off for high-quality data via static files. This path is
predominantly used for [grant funding data](./funding-data.md).
- 🚫 (deprecated) [Airbyte](./airbyte.md) - A modern ELT tool

We generally prefer to work with data partners that can help us regularly
index live data that can feed our daily data pipeline.
All data sources should be defined as
[software-defined assets](https://docs.dagster.io/concepts/assets/software-defined-assets) in our Dagster configuration.

ETL is the messiest, most high-touch part of the OSO data pipeline.
Please reach out to us for help on
[Discord](https://www.opensource.observer/discord).
We will happily work with you to get it working.
Reach out to us on [Discord](https://www.opensource.observer/discord) for help.
66 changes: 5 additions & 61 deletions apps/docs/docs/contribute-models/index.mdx
@@ -3,65 +3,9 @@ title: Explore Ways of Contributing
sidebar_position: 0
---

:::info
There are a variety of ways you can contribute to OSO. This doc features some of the most common pathways, which you can explore further via the links on the sidebar.
:::
There are a variety of ways you can contribute to OSO. Here are some common pathways:

<table>
<thead>
<tr>
<th style={{ textAlign: "left" }}>Contribution Type</th>
<th style={{ textAlign: "left" }}>GitHub Repo</th>
<th style={{ textAlign: "left" }}>Description</th>
<th style={{ textAlign: "left" }}>Type of Contributor</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<a href="./data-models">Propose an Impact Data Model</a>
</td>
<td>
<a href="https://github.com/opensource-observer/oso">oso</a>
</td>
<td>Submit a dbt data model for tracking open source impact metrics.</td>
<td>Data Scientists, Analysts</td>
</tr>
<tr>
<td>
<a href="./retrofunding">Create Metrics for Retro Funding</a>
</td>
<td>
<a href="https://github.com/opensource-observer/oso">oso</a>
</td>
<td>
Help develop impact metrics for allocating retroactive funding to open source projects.
</td>
<td>Data Scientists, Analysts</td>
</tr>
<tr>
<td>
<a href="./share-insights">Share Insights</a>
</td>
<td>
<a href="https://github.com/opensource-observer/insights">insights</a>
</td>
<td>
Contribute to our library of data visualizations and Jupyter notebooks.
</td>
<td>Data Scientists, Analysts</td>
</tr>
<tr>
<td>
<a href="./challenges">Join a Data Challenge</a>
</td>
<td>
<a href="https://github.com/opensource-observer/insights">insights</a>
</td>
<td>
Work on a specific data challenge and get paid for your contributions.
</td>
<td>Data Scientists, Analysts</td>
</tr>
</tbody>
</table>
- 🛠️ [Propose an Impact Data Model](./data-models) - Submit a dbt data model for tracking open source impact metrics
-[Create Metrics for Retro Funding](./retrofunding) - Develop impact metrics for allocating retroactive funding
- 💡 [Share Insights](./share-insights) - Contribute to our library of data visualizations and Jupyter notebooks
- 🏆 [Join a Data Challenge](./challenges) - Work on a specific data challenge and get paid for your contributions
64 changes: 10 additions & 54 deletions apps/docs/docs/guides/index.mdx
@@ -3,61 +3,17 @@ title: Guides
sidebar_position: 0
---

:::info
Our guides are designed to help engineers and data scientists perform common tasks on OSO.
:::

<table>
<thead>
<tr>
<th>Guide</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="./notebooks">Connect OSO to Notebooks</a></td>
<td>Use Python notebooks to explore OSO data and share insights</td>
</tr>
<tr>
<td><a href="./visualization-tools">Connect Visualization Tools</a></td>
<td>Connect OSO data to third-party BI and visualization tools</td>
</tr>
<tr>
<td><a href="./oss-directory">OSS Directory</a></td>
<td>Make updates to the directory that serves as the source of truth for OSO projects</td>
</tr>
<tr>
<td><a href="./oss-funding">Upload Funding Data</a></td>
<td>Make updates to OSS funding data</td>
</tr>
<tr>
<td><a href="./dbt">dbt Setup</a></td>
<td>Set up dbt locally and test your models</td>
</tr>
<tr>
<td><a href="./bq-data-transfer">BigQuery Data Transfer</a></td>
<td>Copy and schedule BigQuery dataset transfers into OSO</td>
</tr>
<tr>
<td><a href="./dagster">Dagster Local Development</a></td>
<td>Set up and run Dagster locally for data pipeline development</td>
</tr>
<tr>
<td><a href="./sqlmesh">SQLMesh</a></td>
<td>Use SQLMesh for data transformation and modeling</td>
</tr>
<tr>
<td><a href="./ops">Operations</a></td>
<td>Perform other devops tasks on OSO</td>
</tr>
<tr>
<td><a href="./fork-pipeline">Fork the Data Pipeline</a></td>
<td>Create your own instance of the OSO data pipeline</td>
</tr>

</tbody>
</table>

- 📓 [Connect OSO to Notebooks](./notebooks) - Use Python notebooks to explore OSO data and share insights
- 📊 [Connect Visualization Tools](./visualization-tools) - Connect OSO data to third-party BI and visualization tools
- 📚 [OSS Directory](./oss-directory) - Make updates to the directory that serves as the source of truth for OSO projects
- 💰 [Upload Funding Data](./oss-funding) - Make updates to OSS funding data
- 🔧 [dbt Setup](./dbt) - Set up dbt locally and test your models
- 🔄 [BigQuery Data Transfer](./bq-data-transfer) - Copy and schedule BigQuery dataset transfers into OSO
- 🚀 [Dagster Local Development](./dagster) - Set up and run Dagster locally for data pipeline development
- 🔍 [SQLMesh](./sqlmesh) - Use SQLMesh for data transformation and modeling
- ⚙️ [Operations](./ops) - Perform other devops tasks on OSO
- 🔀 [Fork the Data Pipeline](./fork-pipeline) - Create your own instance of the OSO data pipeline

This section is a work in progress. If you have any questions or need help, please reach out to us on [Discord](https://www.opensource.observer/discord).
28 changes: 12 additions & 16 deletions apps/docs/docs/index.mdx
@@ -19,21 +19,17 @@ export const paths = {
troubleshoot: './projects/troubleshoot'
};

:::tip
Open search with `/` , then `Tab` to search docs
:::

<div className="cards-container">
<a href={paths.getStarted} className="card">
<h3>Get started</h3>
<h3>🌱 Get started</h3>
<p>Make your first queries to the OSO data lake or API</p>
</a>
<a href={paths.datasets} className="card">
<h3>View datasets</h3>
<h3>View datasets</h3>
<p>Explore more than 100TB of curated public datasets</p>
</a>
<a href={paths.references} className="card">
<h3>Learn how OSO works</h3>
<h3>🔍 Learn how OSO works</h3>
<p>See how all the pieces in our open data pipeline fit together</p>
</a>
</div>
@@ -43,13 +39,13 @@ Open search with `/` , then `Tab` to search docs
<h3>For data scientists</h3>
<div className="use-case-cards">
<a href={paths.notebooks} className="mini-card">
<h3>Connect OSO to notebooks</h3>
<h3>📓 Connect OSO to notebooks</h3>
</a>
<a href={paths.tutorials} className="mini-card">
<h3>Find a tutorial</h3>
<h3>📚 Find a tutorial</h3>
</a>
<a href={paths.contributeModels} className="mini-card">
<h3>Contribute models</h3>
<h3>🤖 Contribute models</h3>
</a>
</div>
</div>
@@ -58,13 +54,13 @@ Open search with `/` , then `Tab` to search docs
<h3>For developers</h3>
<div className="use-case-cards">
<a href={paths.graphqlApi} className="mini-card">
<h3>Use the GraphQL API</h3>
<h3>Use the GraphQL API</h3>
</a>
<a href={paths.pipeline} className="mini-card">
<h3>Check pipeline status</h3>
<h3>📈 Check pipeline status</h3>
</a>
<a href={paths.connectData} className="mini-card">
<h3>Connect your data</h3>
<h3>🔌 Connect your data</h3>
</a>
</div>
</div>
@@ -73,13 +69,13 @@ Open search with `/` , then `Tab` to search docs
<h3>For OSS Projects</h3>
<div className="use-case-cards">
<a href={paths.addProject} className="mini-card">
<h3>Add your project</h3>
<h3>Add your project</h3>
</a>
<a href={paths.viewArtifacts} className="mini-card">
<h3>View your project's artifacts</h3>
<h3>📦 View your project's artifacts</h3>
</a>
<a href={paths.troubleshoot} className="mini-card">
<h3>Troubleshoot data issues</h3>
<h3>🔧 Troubleshoot data issues</h3>
</a>
</div>
</div>
13 changes: 6 additions & 7 deletions apps/docs/docs/integrate/index.md
@@ -3,12 +3,11 @@ title: Get OSO Data
sidebar_position: 0
---

Open Source Observer is a fully open data pipeline for measuring the impact of open source efforts.
That means all source code, data, and infrastructure is publicly available for use.
Open Source Observer is a fully open data pipeline for measuring the impact of open source efforts. Explore the following resources:

- [**Subscribe to Public Datasets**](./datasets/index.mdx): for an overview of all data available
- [**Explore the OSO Data Lake**](./query-data.mdx): all OSO data is available in BigQuery for you to explore and connect to your own tools
- [**Use the GraphQL API**](./api.md): integrate OSO registries and metrics into a live production application
- [**Import or clone oss-directory**](./oss-directory.md): leverage [oss-directory](https://github.com/opensource-observer/oss-directory) data separate from OSO
- 📊 [Subscribe to Public Datasets](./datasets/index.mdx) - Get free access to any of the public datasets that OSO maintains or builds on top of
- 🔍 [Explore the OSO Data Lake](./query-data.mdx) - Query the OSO data lake using BigQuery
- [Use the GraphQL API](./api.md) - Integrate OSO registries and metrics into a live production application
- 📂 [Import or Clone OSS-Directory](./oss-directory.md) - Leverage oss-directory data separate from OSO

See the [Tutorials](../tutorials/index.md) for more examples and the [Guides](../guides/index.mdx) for more detailed guides for integrating with specific tools (eg, Jupyter, Hex, etc).
See the [Tutorials](../tutorials/index.md) for more examples and the [Guides](../guides/index.mdx) for more detailed guides for integrating with specific tools.
28 changes: 4 additions & 24 deletions apps/docs/docusaurus.config.ts
@@ -150,35 +150,15 @@ const config: Config = {
title: "Docs",
items: [
{
label: "Get Started",
label: "Get started",
to: "/docs/get-started/",
},
{
label: "Add your project",
to: "/docs/projects/",
label: "View datasets",
to: "/docs/integrate/datasets/",
},
{
label: "Get Data",
to: "/docs/integrate/",
},
{
label: "Tutorials",
to: "/docs/tutorials/",
},
{
label: "Contribute data",
to: "/docs/contribute-data/",
},
{
label: "Contribute models",
to: "/docs/contribute-models/",
},
{
label: "Guides",
to: "/docs/guides/",
},
{
label: "References",
label: "Learn how OSO works",
to: "/docs/references/",
},
],
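
Because the +/- markers are not shown in this view, the net effect of the footer hunk can be hard to read. The following is a minimal sketch of what the "Docs" footer column appears to contain after this commit, assuming the three labels that match the homepage cards are the additions (which is consistent with the stated 4 additions and 24 deletions). The `FooterLink` type and the `findBadLinks` helper are hypothetical names introduced for illustration; they are not part of Docusaurus or the OSO repository.

```typescript
// Hypothetical local type for illustration only; Docusaurus ships its own
// footer types, which this sketch deliberately does not import.
interface FooterLink {
  label: string;
  to: string;
}

// The "Docs" footer column as it appears to read after this commit
// (assumption: reconstructed from the unmarked diff lines above).
const docsFooterItems: FooterLink[] = [
  { label: "Get started", to: "/docs/get-started/" },
  { label: "View datasets", to: "/docs/integrate/datasets/" },
  { label: "Learn how OSO works", to: "/docs/references/" },
];

// Hypothetical helper: returns the links whose target is not a
// "/docs/.../" path with a trailing slash.
function findBadLinks(items: FooterLink[]): FooterLink[] {
  return items.filter(
    (item) => !item.to.startsWith("/docs/") || !item.to.endsWith("/"),
  );
}

// Logs the number of malformed targets in the reconstructed list.
console.log(findBadLinks(docsFooterItems).length);
```

A check like this is only a convenience for reviewing link targets by eye; it does not verify that the routes exist in the built site.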