Add example for American Community Survey data #364

jaanli · 2024-02-14T00:45:11Z

I think it would be interesting to add an example to let folks build on the American Community Survey data.

There are a lot of variables and use cases enabled by the new data filter extension!

Full list of variables:

https://github.com/jaanli/exploring_american_community_survey_data/blob/main/american_community_survey/models/public_use_microdata_sample/generated/enum_types_mapped_renamed/housing_units_united_states_first_tranche_enum_mapped_renamed.sql

Will start this PR in case others are able to help.

Example GIF of the filter extension: https://s13.gifyu.com/images/SCGH2.gif

jaanli · 2024-02-14T00:45:24Z

Running into a bug, filed an issue: #363

jaanli · 2024-02-14T12:59:34Z

@kylebarron just added the full American Community Survey data; initial income filter example:

Any ideas for other filters or lonboard features to try, if this example seems worth including? Happy to make it more readable and user-friendly, as I think this data deserves more use.

I really like your demo of housing shapes and was hoping to do something similar. ChatGPT has tons of ideas, but they are hard to parse through.

Added an example ChatGPT prompt that describes all the variables available here: https://github.com/jaanli/exploring_american_community_survey_data/blob/main/prompts/exploring-new-york-city.md (it suggests a scavenger hunt instead of an analysis / geospatial exploration :p)

kylebarron

Thanks for the contribution!

I'm not sure where the best place is for this example notebook to live.

This is a great example because it shows the end-to-end process of accessing, reshaping and visualizing this specific dataset... but on the other hand it's not something that I know how to maintain. The majority of the code is in preparing the data (and in steps very specific to Census data at that). Ideally the example notebooks in this repo have the maximum ratio of lonboard-specific visualization to data preparation so that they're as approachable as possible for as many users as possible. (I can imagine someone unfamiliar with Census data easily getting overwhelmed.)

Perhaps the best course of action is to have a page in the docs website linking to external examples. Then there's a good distinction that the external examples are good showcases but aren't "officially maintained". What would you think about, say, linking to a notebook in one of your repos?

kylebarron · 2024-02-14T20:27:04Z

.gitignore

+.venv/
+*.venv
+.venv


I think these are redundant; line 2 is probably sufficient for all cases.
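
A rough way to see the overlap between the three patterns is sketched below. It approximates gitignore globbing with Python's `fnmatch`, which is an assumption: real gitignore matching has extra rules (for example, a trailing slash restricts a pattern to directories), and the candidate names are hypothetical.

```python
from fnmatch import fnmatchcase

# The three patterns added in the diff above.
patterns = [".venv/", "*.venv", ".venv"]


def matches(name: str, pattern: str) -> bool:
    """Approximate gitignore matching for a single path component.

    In gitignore a trailing slash limits the pattern to directories;
    here it is simply stripped, so this is only an illustration.
    """
    return fnmatchcase(name, pattern.rstrip("/"))


# Hypothetical file or directory names to test against each pattern.
candidates = [".venv", "project.venv", "venv"]

# For each pattern, the candidate names it would cover.
coverage = {p: [n for n in candidates if matches(n, p)] for p in patterns}

for pattern, names in coverage.items():
    print(f"{pattern!r} covers {names}")
```

Under this approximation, everything covered by `.venv/` or `.venv` is also covered by `*.venv`, since `*` can match an empty prefix.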

kylebarron · 2024-02-14T20:49:15Z

examples/data-filter-extension.ipynb

@@ -41,7 +41,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,


Can you revert the changes to this notebook? It looks like you ran the notebook again and saved it but didn't make changes.

jaanli · 2024-02-15T16:29:38Z

I agree! I like that solution. The hardest part is indeed the transformation of the data and orchestration of dbt + duckdb, and it is overwhelming even with GPT-4 :(

I'm down to try linking to it as an external notebook. The downside of linking externally is that it is hard to guarantee links will remain valid; perhaps a GitHub action / continuous integration test could help lint documentation for broken links?
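
As a minimal sketch of what such a link check could look like (not an existing lonboard tool; the function names here are hypothetical and only the Python standard library is used), a CI job could extract URLs from the docs pages and report the ones that fail to respond:

```python
import re
import urllib.error
import urllib.request

# Matches bare http(s) URLs; stops at whitespace and common closing delimiters.
URL_PATTERN = re.compile(r"https?://[^\s)>\"']+")


def extract_links(text: str) -> list[str]:
    """Return the unique URLs found in a documentation page, in order of appearance."""
    seen: dict[str, None] = {}
    for url in URL_PATTERN.findall(text):
        # Drop trailing sentence punctuation that is not part of the URL.
        seen.setdefault(url.rstrip(".,;:"), None)
    return list(seen)


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answered without an error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, TimeoutError, ValueError):
        return False


def broken_links(text: str) -> list[str]:
    """Return the links in `text` that could not be reached."""
    return [url for url in extract_links(text) if not is_reachable(url)]
```

Some servers reject HEAD requests, so a real check would probably need a GET fallback and retries before flagging a link as broken.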

But I'm not sure how to get the example to display, either in GitHub's web view of Jupyter notebooks or with nbviewer.org (link).

Let me know if there's a place to add an external link and I can revise the PR accordingly! 🙏

kylebarron · 2024-02-16T21:48:52Z

Absolutely. For most visualizations, 80% of the work is just in preparing the data.

I'm thinking of a page in the docs like the deck.gl "showcase page". Next week we have an onsite and I'll be really busy, but a good task for the following week is for me to create that page.

There's virtually no online notebook hosting platform where the map will also render; the map needs a running Python session. None of our notebook examples in our docs site render the map either. In the future we want to test saving the notebooks as html with the map data embedded, but that doesn't always work.

jaanli · 2024-02-17T03:17:43Z

Cool! I think there might be a solution to explore longer term, especially as Mosaic adds spatial support and Observable Framework supports Parquet: observablehq/framework#834 (comment)

Enjoy the offsite and stoked to add this to the showcase down the line!

kylebarron · 2024-02-27T16:16:43Z

> Mosaic adds spatial support

Do you have a link for this?

jaanli · 2024-02-27T16:26:38Z

Yup! Here it is, @kylebarron: https://uwdata.github.io/mosaic/examples/nyc-taxi-rides.html

I also plan on trying Observable Framework's geospatial support with Protomaps to get something similar: https://bdon.github.io/observable-framework-maps/example-map

Tested Observable Framework over the weekend on this ACS historical data; I think the lonboard filter extension would be great for seeing patterns here too: https://jaanli.github.io/american-community-survey/income

Just a start for #364. cc @jaanli. This is copied from geoarrow-rust for now. The idea is to have a grid of some sort, with a title, some text, and an image per example. <img width="1379" alt="image" src="https://github.com/developmentseed/lonboard/assets/15164633/e810d4ee-cc04-4d6d-a4aa-3d7b65f86f61">

kylebarron · 2024-03-21T19:51:48Z

In #401 I created a new top-level examples page with an image or gif per notebook. On that page I added the gif you posted here, as well as linked to your profile and notebook permalink in this PR. I think this is a better long term solution than adding those notebooks into this repo directly.

Feel free to make a PR to edit that page if you want to make a change!

kylebarron · 2024-03-30T01:37:34Z

examples/american-community-survey.ipynb

+    "\n",
+    "# If you had a direct way to map each exploded point to its PUMA, you'd fill puma_to_point_index here\n",
+    "# For demonstration, let's assume each point in points_for_people is already associated with a PUMA:\n",
+    "for i, puma in enumerate(puma_indices):  # This assumes puma_indices is aligned with points_for_people\n",


By the way I tried to run this notebook and got

NameError: name 'puma_indices' is not defined
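
One possible way to define the missing name is sketched below, assuming the notebook knows how many points were sampled for each PUMA and appended them to `points_for_people` in that order. `points_per_puma` and its values are hypothetical; the notebook's actual intent may differ.

```python
# Hypothetical input: number of sampled points generated for each PUMA,
# listed in the same order the points were appended to points_for_people.
points_per_puma = {"03701": 3, "03702": 2}

# One entry per exploded point, aligned with points_for_people by position.
puma_indices = [
    puma
    for puma, n_points in points_per_puma.items()
    for _ in range(n_points)
]

# Map each PUMA to the positions of its points, as the notebook loop intends.
puma_to_point_index: dict[str, list[int]] = {}
for i, puma in enumerate(puma_indices):
    puma_to_point_index.setdefault(puma, []).append(i)

print(puma_to_point_index)
```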

jaanli added 5 commits February 13, 2024 19:29

wip: add american community survey example

cd7c718

wip: add american community survey example

3fb8947

wip: add american community survey example

ed5285c

wip: fix paths

64040c1

wip: fix url

d5d10ae

jaanli added 2 commits February 14, 2024 06:46

fix: correct filter_size and restart kernel to fix notebook display of map and filter!

2ce342b

fix: add full american community survey income data

771cb79

jaanli added 2 commits February 14, 2024 08:03

fix: add full data from first and second tranche of census bureau split of public use microdata sample

636d7e7

wip: add sampling random anonymized locations for every person in every public use microdata area

1af815e

kylebarron reviewed Feb 14, 2024

kylebarron mentioned this pull request Mar 1, 2024

Docs showcase page #401

Merged

kylebarron closed this Mar 21, 2024

kylebarron reviewed Mar 30, 2024
