Data Analysis and Map Generation

Goal

Standardize the analyses required to extract features from samples, generate rescaled maps (considering all the samples), and compute metrics of interest (misbehaviors, density, etc.).

Pipeline

We assume access to the raw data of each experiment: a timestamped folder containing all the individuals (e.g., npy images) generated by the test generators and, possibly, the outcome of each test (e.g., the misclassified digit for MNIST).

The steps to generate the maps are:

Process the output of the tools and generate an info_<ID>.json file for each sample. This file contains the features and various metadata about the sample (timestamp, sample-id, file location, tool name, run id, etc.).

Go to the root of the project (./DeepHyperion-BNG) and run the following command to process a dataset folder (folders will be recursively checked):
```
python report_generator/app.py generate-samples --force-attribute run 1 --force-attribute tool DeepHyperionBeamNG ./logs/run_XXX/simulations
```
NOTE: the set of features to be computed is predefined for each tool.
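
To check the result of this step, the generated files can be collected with a few lines of Python. The sketch below is only an illustration: it relies on the info_<ID>.json naming described above and makes no assumption about the keys inside each file. `load_sample_infos` is a hypothetical helper and is not part of report_generator.

```python
import json
from pathlib import Path


def load_sample_infos(root):
    """Recursively collect every info_<ID>.json file under root.

    Hypothetical helper for inspection only (not part of report_generator).
    Returns a dict mapping each file path to its parsed JSON content.
    """
    infos = {}
    for path in sorted(Path(root).rglob("info_*.json")):
        with path.open(encoding="utf-8") as handle:
            infos[path] = json.load(handle)
    return infos
```

For example, `len(load_sample_infos("./logs/run_XXX/simulations"))` should match the number of samples you expect for that run.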

Process all the JSON files corresponding to the samples of all the runs of all the tools to extract the map extrema (minimum and maximum values) for each feature. Go to the root of the project (./DeepHyperion-BNG) and run the following command:

python report_generator/app.py extract-stats --parsable --feature <NAME> --feature <NAME> ./logs/run_XXX/simulations

For example:

python report_generator/app.py extract-stats --parsable --feature segment_count --feature mean_lateral_position ./logs/run_XXX/simulations

NOTE: Select the features according to your configuration (run_XXX/config.json).

You should get an output similar to:

2020-12-22 22:41:02,764 INFO     Process Started
name=segment_count,min=1,max=5,missing=0
name=mean_lateral_position,min=166,max=178,missing=0

For each feature given as input, the output reports its name, its minimum and maximum values, and the number of samples in which the feature was missing (useful for debugging).
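
The parsable format is convenient for scripts. As a minimal sketch, assuming every feature line has exactly the `name=...,min=...,max=...,missing=...` shape shown above and that feature names contain neither commas nor `=`, the output could be parsed as follows. `parse_parsable_stats` is a hypothetical helper name, not part of report_generator.

```python
def parse_parsable_stats(output):
    """Parse `name=...,min=...,max=...,missing=...` lines into a dict.

    Lines that do not start with `name=` (such as the log line) are skipped.
    """
    stats = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("name="):
            continue
        fields = dict(item.split("=", 1) for item in line.split(","))
        stats[fields["name"]] = {
            "min": float(fields["min"]),
            "max": float(fields["max"]),
            "missing": int(fields["missing"]),
        }
    return stats


example = """2020-12-22 22:41:02,764 INFO     Process Started
name=segment_count,min=1,max=5,missing=0
name=mean_lateral_position,min=166,max=178,missing=0"""

print(parse_parsable_stats(example)["segment_count"])
# {'min': 1.0, 'max': 5.0, 'missing': 0}
```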

Removing the --parsable option yields a structured (JSON-like) report:

2020-12-22 22:42:54,168 INFO     Process Started
{
 "total": 13,
 "features": {
     "segment_count": {
         "min": 1,
         "max": 5,
         "missing": 0
     },
     "mean_lateral_position": {
         "min": 166,
         "max": 178,
         "missing": 0
     }
 }
}
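
If you capture this output in a script, the leading log line has to be skipped before decoding. The following is a rough sketch that assumes the report itself is valid JSON with the `total`, `features`, `min`, `max`, and `missing` keys shown above. `parse_json_report` is a hypothetical helper name.

```python
import json


def parse_json_report(output):
    """Decode the first JSON object in output, skipping any leading log lines."""
    start = output.index("{")  # raises ValueError if no report is present
    report, _ = json.JSONDecoder().raw_decode(output[start:])
    return report


example = """2020-12-22 22:42:54,168 INFO     Process Started
{
 "total": 13,
 "features": {
     "segment_count": {"min": 1, "max": 5, "missing": 0},
     "mean_lateral_position": {"min": 166, "max": 178, "missing": 0}
 }
}"""

report = parse_json_report(example)
# Features that are absent from at least one sample
incomplete = [name for name, stats in report["features"].items() if stats["missing"] > 0]
print(report["total"], incomplete)  # 13 []
```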

Build the map and visualize it. Maps can contain two or more features, but their visualization is limited to two features at a time. To generate a map and the corresponding report, run the following command (add --visualize if you want to visualize the map):
```
python report_generator/app.py generate-map --feature <NAME> <MIN> <MAX> <NUM_CELL> --feature <NAME> <MIN> <MAX> <NUM_CELL> ./logs/run_XXX/simulations
```
For example:
```
python report_generator/app.py generate-map --feature segment_count 1 5 4 --feature mean_lateral_position 166 178 25 ./logs/run_XXX/simulations
```
NOTE: Set the <MIN> and <MAX> values of each feature based on the output of the previous command (extract-stats); otherwise, you might lose individuals that fall outside the bounds you defined.
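
To see why the bounds matter, consider how a feature value could be assigned to a map cell. The sketch below assumes equal-width cells between the minimum and the maximum. This is an illustrative assumption, not necessarily how report_generator computes cells, and `cell_index` is a hypothetical function.

```python
def cell_index(value, min_value, max_value, num_cells):
    """Return the cell index for value, or None if it is out of bounds.

    Assumes equal-width cells; the maximum value is placed in the last cell.
    """
    if value < min_value or value > max_value:
        return None  # the individual cannot be placed on the map
    if max_value == min_value:
        return 0
    width = (max_value - min_value) / num_cells
    return min(int((value - min_value) / width), num_cells - 1)


print(cell_index(1, 1, 5, 4))  # 0
print(cell_index(5, 1, 5, 4))  # 3
print(cell_index(6, 1, 5, 4))  # None
```

Under this assumption, a sample with segment_count 6 has no cell in a map built with bounds 1 and 5, so it would be dropped.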

NOTE: You can add other features (assuming they are present in all the samples) by adding to the command entries like --feature <NAME> <MIN> <MAX> <NUM_CELL> (e.g., --feature min_radius 0 30 25).
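
When many features are involved, it can be easier to assemble the command programmatically. The sketch below only builds the argument list in the format shown above; `build_generate_map_command` is a hypothetical helper, and the bounds would typically come from the extract-stats output.

```python
import shlex


def build_generate_map_command(features, folder):
    """Build the generate-map command.

    features: list of (name, min, max, num_cells) tuples.
    folder: path to the simulations folder.
    """
    args = ["python", "report_generator/app.py", "generate-map"]
    for name, min_value, max_value, num_cells in features:
        args += ["--feature", str(name), str(min_value), str(max_value), str(num_cells)]
    args.append(folder)
    return args


cmd = build_generate_map_command(
    [("segment_count", 1, 5, 4), ("mean_lateral_position", 166, 178, 25)],
    "./logs/run_XXX/simulations",
)
print(shlex.join(cmd))
# python report_generator/app.py generate-map --feature segment_count 1 5 4 --feature mean_lateral_position 166 178 25 ./logs/run_XXX/simulations
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the root of the project.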

You should get something like this:
```
2020-12-22 22:47:25,853 INFO     Process Started
```
This command produces report files in the logs/run_XXX folder:
- auc-coverage-DeepHyperionBeamNG-<RUN_ID>-segment_count-mean_lateral_position.npy
- auc-misbehaviour-DeepHyperionBeamNG-<RUN_ID>-segment_count-mean_lateral_position.npy
- coverage-DeepHyperionBeamNG-<RUN_ID>-segment_count-mean_lateral_position.npy
- misbehaviour-DeepHyperionBeamNG-<RUN_ID>-segment_count-mean_lateral_position.npy
- probability-DeepHyperionBeamNG-<RUN_ID>-segment_count-mean_lateral_position.npy
- DeepHyperionBeamNG-<RUN_ID>-stats.json
- probability-DeepHyperionBeamNG-<RUN_ID>-segment_count-mean_lateral_position.pdf
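
A quick way to confirm that the command completed is to check that all of these files exist. The sketch below reproduces the naming pattern from the list above for the two-feature case only; how the names look for other feature combinations is not covered here. Both function names are hypothetical.

```python
from pathlib import Path

# Prefixes of the .npy reports listed above
NPY_PREFIXES = ["auc-coverage", "auc-misbehaviour", "coverage", "misbehaviour", "probability"]


def expected_report_files(tool, run_id, feature_x, feature_y):
    """Return the report file names expected for one tool, run, and feature pair."""
    base = f"{tool}-{run_id}"
    pair = f"{feature_x}-{feature_y}"
    names = [f"{prefix}-{base}-{pair}.npy" for prefix in NPY_PREFIXES]
    names.append(f"{base}-stats.json")
    names.append(f"probability-{base}-{pair}.pdf")
    return names


def missing_report_files(folder, tool, run_id, feature_x, feature_y):
    """Return the expected report files that are not present in folder."""
    folder = Path(folder)
    expected = expected_report_files(tool, run_id, feature_x, feature_y)
    return [name for name in expected if not (folder / name).exists()]
```

An empty list returned by `missing_report_files` means every report listed above was found in the given folder.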
