Clean up docs for loading sample data and remove references to private data
jeancochrane authored and ddohler committed Jun 13, 2018
1 parent 7d2d3e0 commit bc4e417
Showing 2 changed files with 18 additions and 15 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -143,12 +143,14 @@ Note that the import process will take roughly two hours for the full data set;
number of records with `head` on the individual CSVs.
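For instance, trimming a CSV down with `head` while keeping its header row might look like this (a sketch; `records.csv` is a placeholder for one of the exported CSVs, and the fake rows exist only so the commands run standalone):

```shell
# Sketch: keep the header plus the first 1,000 data rows of a large CSV.
# "records.csv" is a placeholder for one of the exported record CSVs.
printf 'id,severity\n' > records.csv           # fake header, for illustration
seq 1 5000 | sed 's/$/,minor/' >> records.csv  # 5,000 fake data rows
head -n 1001 records.csv > records_sample.csv  # header + first 1,000 rows
wc -l < records_sample.csv
```

The `-n 1001` accounts for the header line, so the sample keeps exactly 1,000 records.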

To load mock black spots, run `python scripts/load_black_spots.py --authz 'Token YOUR_AUTH_TOKEN' /path/to/black_spots.json`.
+ Mock black spot data is available in `scripts/sample_data/black_spots.json`.

To load mock interventions, run `python scripts/load_interventions.py --authz 'Token YOUR_AUTH_TOKEN' /path/to/interventions_sample_pts.geojson`.
+ Mock intervention data is available in `scripts/sample_data/interventions_sample_pts.json`.

To generate black spot and load forecast training inputs, run `python scripts/generate_training_input.py /path/to/roads.shp /path/to/records.csv`.

- More documentation for loading data can be found in the [`scripts`
+ More information on the requirements for loading data can be found in the [`scripts`
  directory](./scripts/README.md).

### Costs
29 changes: 15 additions & 14 deletions scripts/README.md
@@ -4,7 +4,7 @@

## OS-level requirements

- - Python 2.x.
+ - Python 2.7
- Python bindings for GDAL (available on the [apt package index](https://packages.ubuntu.com/artful/python-gdal)
  as well as on [PyPI](https://pypi.org/project/GDAL/))

@@ -18,19 +18,20 @@ image ([code](https://github.com/azavea/docker-gdal),
- [pytz](https://pypi.org/project/pytz/)
- [requests](https://pypi.org/project/requests/)

- ## Script and schema dependencies
+ ## Loading sample data

- Certain ETL scripts and schemas in this repo depend on specific data
- sources to run properly. Retrieve these dependencies from the fileshare and
- make sure they exist in this repo before you load the data.
+ The `sample_data` directory provides two sample data files for testing purposes:

- These dependencies include:
+ - `black_spots.json`: sample black spots
+ - `interventions_sample_pts.json`: sample interventions
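Before loading, a sample file can be sanity-checked from the shell. A minimal sketch (the tiny file written to `/tmp` below is a stand-in for the real `sample_data/` contents, so the command runs standalone):

```shell
# Sketch: count Feature entries in a GeoJSON-style file. The stand-in file
# below is illustrative, not the exact DRIVER sample-data schema.
cat > /tmp/sample_check.json <<'EOF'
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature", "geometry": {"type": "Point", "coordinates": [121.0, 14.6]}},
   {"type": "Feature", "geometry": {"type": "Point", "coordinates": [120.9, 14.5]}}
 ]}
EOF
grep -c '"type": "Feature"' /tmp/sample_check.json   # prints 2, one per feature
```

The pattern includes the closing quote, so the `FeatureCollection` line itself is not counted.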

- | script/schema | depends on |
- | ------------- | --------------- |
- | `pnp_incident_schema.json` | `incidents_and_sites.csv` |
- | `pnp_incident_schema_v2.json` | `public.csv` |
- | `incident_schema_v3.json` | `data_for_v3` directory |
- | `load_black_spots.py` | `black_spots.json` |
- | `load_interventions.py` | `interventions_sample_pts.geojson` |
- | `generate_training_input.py` | `blackspot_training/roads_utm.*` and `blackspot_training/all_crashes_2008-2012.csv` |
+ Retrieve your auth token by inspecting network traffic, and then load these
+ files using the scripts in this directory:
+
+ ```
+ python load_black_spots.py --authz 'Token YOUR_AUTH_TOKEN' sample_data/black_spots.json
+ ```
+
+ ```
+ python load_interventions.py --authz 'Token YOUR_AUTH_TOKEN' sample_data/interventions_sample_pts.geojson
+ ```
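To avoid pasting the token into every command, it can be held in a shell variable. A sketch (`DRIVER_AUTH_TOKEN` is an illustrative variable name, not something the scripts read themselves):

```shell
# Sketch: store the token once and interpolate it into the --authz value.
# DRIVER_AUTH_TOKEN is an illustrative name only.
export DRIVER_AUTH_TOKEN='YOUR_AUTH_TOKEN'
# e.g.: python load_black_spots.py --authz "Token $DRIVER_AUTH_TOKEN" sample_data/black_spots.json
echo "Token $DRIVER_AUTH_TOKEN"   # prints the exact value passed to --authz
```

Note the double quotes around `"Token $DRIVER_AUTH_TOKEN"`: single quotes would suppress the variable expansion.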
