Hannah Kerner edited this page May 10, 2021 · 4 revisions

Data sets

Dark Energy Survey (DES)

Description: catalog of astrophysical objects.

| Data set | # samples | Format | Input data dimension | Location | Notes |
| --- | --- | --- | --- | --- | --- |
| DES (full) | 109,922,293 | Feature vectors | [1, 6] | /proj/des | Filtered down from 339M |
| DES (10%) | 11,000,000 | Feature vectors | [1, 6] | /proj/des | After filtering |

Feature vectors: ['color_g_minus_r', 'lup_r', 'color_i_minus_r', 'color_z_minus_r', 'T', 'snr']
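As a minimal sketch of how one catalog object maps onto the [1, 6] input dimension, the snippet below arranges a record into a single-row vector using the feature order listed above. The dict-based record, the `to_feature_vector` helper, and the example values are all assumptions made for illustration; the actual on-disk format under /proj/des is not described on this page.

```python
# Feature order as listed on this page for the DES catalog.
DES_FEATURES = ['color_g_minus_r', 'lup_r', 'color_i_minus_r',
                'color_z_minus_r', 'T', 'snr']


def to_feature_vector(record):
    """Return a [1, 6] nested list (one row, six features) for one object.

    `record` is assumed to be a dict keyed by feature name; this is a
    hypothetical in-memory representation, not the on-disk format.
    """
    missing = [name for name in DES_FEATURES if name not in record]
    if missing:
        raise KeyError(f"record is missing features: {missing}")
    return [[float(record[name]) for name in DES_FEATURES]]


# Made-up values, for illustration only.
example = {'color_g_minus_r': 0.8, 'lup_r': 22.1, 'color_i_minus_r': -0.3,
           'color_z_minus_r': -0.5, 'T': 0.02, 'snr': 14.0}
vector = to_feature_vector(example)
print(len(vector), len(vector[0]))  # prints: 1 6
```

Fixing the order in one constant keeps the columns consistent between training and evaluation data.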

Evaluation data:

  • [how many?] objects that were discarded in the Dec 2019 catalog update (

More details on data and prior results here.

Mars rover targeting

Description: candidate targets for follow-up observation in rover images.

| Data set | # samples | Format | Input data dimension | Location | Notes |
| --- | --- | --- | --- | --- | --- |
| Navcam | 6005 | Images | [64, 64, 1] | /proj/cif-novelty/data/navcam-rockster-targets-onboard-64x64/ | Resized bounding boxes found by Rockster |
| Navcam | TBD | Images | [64, 64, 1] | /proj/cif-novelty/ref/rockster_navcam_handpicked_executions.json | Need to create image dataset from JSON file |
| Mastcam RGB | 22837 | Images | [64, 64, 3] | /proj/cif-novelty/ref/rockster_mcam_executions_sol122-2259.json | Need to create image dataset from JSON file |
| Mastcam Multispectral | 6151 | Images | [64, 64, 6] | /proj/cif-novelty/ref/multi_color_122-2259.json | Need to create image dataset from JSON file |

Evaluation data:

  • 104 novel targets in Navcam images (manually identified) spanning 42 sols (1343-1703) (/proj/cif-novelty/ref/AEGIS targets survey for novel features.csv)
  • 61 novel targets in Mastcam RGB/Multispectral images (manually identified) spanning 12 sols (346-2138) (/proj/cif-novelty/ref/mastcam_targets_survey_novel_features.csv)

More details on data and prior results here.

Ground-truth field observations

Description: time series of satellite observations at locations from ground-truth data collection surveys.

| Data set | # samples | Format | Input data dimension | Location | Notes |
| --- | --- | --- | --- | --- | --- |
| Sentinel-1 VV/VH | 34,610 | Time series | [1, 12] | | Will be much smaller after filtering duplicates |
| Sentinel-2 NDVI | 2000 | Time series | [1, 12] | | |
| Landsat-8 NDVI | 2000 | Time series | [1, 12] | | Sample data also in dora/sample_data/field_samples/landsat8_ndvi_2020_sample-2k.csv |
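A rough sketch of reading such time series into 12-step samples is shown below. The column layout (an ID column followed by 12 values, no header row), the `read_time_series` helper, and the two sample rows are assumptions for illustration only; check the actual CSV files before relying on this.

```python
import csv
import io

N_STEPS = 12  # time steps per sample, per the [1, 12] dimension in the table


def read_time_series(handle, n_steps=N_STEPS):
    """Read rows of `id, v1, ..., v12` into {id: [v1, ..., v12]}.

    The column layout (one ID column followed by the values, no header)
    is an assumption for illustration.
    """
    series = {}
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        sample_id, values = row[0], [float(v) for v in row[1:]]
        if len(values) != n_steps:
            raise ValueError(
                f"{sample_id}: expected {n_steps} values, got {len(values)}")
        series[sample_id] = values
    return series


# Two made-up samples standing in for a real file.
fake_csv = io.StringIO(
    "a,0.21,0.25,0.32,0.45,0.61,0.72,0.70,0.58,0.41,0.30,0.24,0.20\n"
    "b,0.18,0.19,0.20,0.22,0.21,0.20,0.19,0.18,0.18,0.19,0.18,0.17\n"
)
samples = read_time_series(fake_csv)
print(sorted(samples), len(samples["a"]))
```

Rejecting rows with the wrong number of time steps early avoids silently feeding ragged samples into a detector.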

Evaluation data:

  • 113 (manually labeled) points in Kenya (76 inliers/true crop fields, 37 outliers/other land cover)
  • Could also use crop/non-crop datasets from other countries or tasks, since 'non-crop' samples should be treated as outliers
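As a quick sanity check on the Kenya counts above, the sketch below builds a binary label vector and computes the outlier base rate, which is roughly the precision a random ranking of these points would be expected to achieve. The 0/1 label encoding is an assumption for illustration, not a convention stated on this page.

```python
N_INLIERS = 76   # true crop fields
N_OUTLIERS = 37  # other land cover

# 0 = inlier, 1 = outlier; the ordering of points here is arbitrary.
labels = [0] * N_INLIERS + [1] * N_OUTLIERS
assert len(labels) == 113  # matches the total number of labeled points

# Fraction of outliers, i.e. the expected precision of a random ranking.
outlier_rate = sum(labels) / len(labels)
print(f"{outlier_rate:.3f}")  # prints: 0.327
```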

More details on data and prior results here.

Volcanic thermal features

Description: ASTER thermal images containing known volcanic thermal anomalies.

| Data set | # samples | Format | Input data dimension | Location | Notes |
| --- | --- | --- | --- | --- | --- |
| ASTER (AST_08) | 5 | Images | [830, 700, 1] | Hannah's computer | # samples and dimension could change depending on input data formatting decisions |

Evaluation data:

  • 5 shapefiles with (manually labeled) polygons of known volcanic thermal anomalies (Hannah's computer)

More details on data and prior results here.