DataSourcces.txt

Data source (healthcare expenditures):

https://www.cms.gov/data-research/statistics-trends-and-reports/national-health-expenditure-data/historical

In retrospect, this is a small and quite messy dataset.  It has freestyle header and footer text and the bulk of the tabular data is not of interest.  

Downloaded:
"NHE Summary, including share of GDP, CY 1960-2022 (ZIP)"

1) "Extract al..." applied to the ZIP folder;
2) Used Google Sheets to open ("Import") "NHE_Summary.xlsx";
3) Deleted all rows except the original Row 2 and Row 31 (becoming Rows 1 & 2);
4) Deleted Column A;
6) Saved ("Download") data in CSV format as "NHE_cleaned.csv";
7) Read that into an R data.frame and reformatted in R:

nhe_wide <- read_csv("data/NHE22_cleaned.csv")
nhe_long <- pivot_longer(nhe_wide, cols=everything())
colnames(nhe_long) <- c("Year", "PCTGDP")

Data source (life expectancy [all races, both sexes]):

1) From https://catalog.data.gov/dataset/nchs-death-rates-and-life-expectancy-at-birth downloaded the CSV format;
2) Imported into GSheets;
3) Kept Rows 1 - 120 (all races, both sexes);
4) Kept only Column A & D (became A & B);
5) renamed Column B to "LifeExpectancy";
5) Saved/"Download" as "LifeExpectancy_cleaned.csv";
6) Column names: nchs_long <- read_csv("data/NCHS_cleaned.csv")
colnames(nchs_long) <- c("Year", "LifeExp")


That prepared the data for analysis and visualization.