sync roost counting changes with Canadian data #2

Open · wants to merge 23 commits into base: tnmy/canadian_latest

Changes from 1 commit (of 23 commits)
bb371f7 minor updates to experiments_v3_linear_adaptor (wenlongzhao094, Aug 11, 2023)
7c6fcc1 experiment_v4_maskrcnn (wenlongzhao094, Aug 11, 2023)
002b976 count birds and bats, half done feature (wenlongzhao094, Aug 11, 2023)
f9f6744 tools: README, post_hoc_counting (wenlongzhao094, Sep 3, 2023)
3ad1457 counting animals in deployment (wenlongzhao094, Sep 23, 2023)
a8a35ee counting animals in deployment (wenlongzhao094, Sep 23, 2023)
a7f5762 delete scans if no successfully rendered arrays (wenlongzhao094, Sep 24, 2023)
0d7e644 add counting to run_day_station, yet to test (wenlongzhao094, Dec 31, 2023)
4056fd5 pilot run (wenlongzhao094, Jan 19, 2024)
604de62 debug counting in the system (wenlongzhao094, Jan 20, 2024)
2d33620 vectorize (wenlongzhao094, Jan 23, 2024)
5748db8 automate output file transfer (wenlongzhao094, Jan 26, 2024)
9533887 count bats (wenlongzhao094, Feb 11, 2024)
4f07081 config and launch (wenlongzhao094, Feb 28, 2024)
424530d no longer need publish_images.sh (wenlongzhao094, Mar 22, 2024)
d492a76 Fix track_id in sweeps to ease the merge of per-sweep counts with scr… (wenlongzhao094, Apr 22, 2024)
8e38e58 bug fix in post_hoc_counting/count_texas_bats_v3 (wenlongzhao094, Apr 22, 2024)
26b7252 post-hoc count bats w/ dualpol and dBZ filtering (wenlongzhao094, Jun 12, 2024)
07cbc72 add counting with dualpol and reflectivity thresholds to deployment (wenlongzhao094, Jun 13, 2024)
6051131 counting config (wenlongzhao094, Jun 14, 2024)
2603ae2 debugging improved rsync (wenlongzhao094, Aug 11, 2024)
e5410ad us_sunrise_v3_debug in progress (wenlongzhao094, Aug 18, 2024)
9cdfd7e ready for us deployment, counting and auto result transfer (wenlongzhao094, Aug 18, 2024)
counting animals in deployment
wenlongzhao094 committed Sep 23, 2023
commit 3ad14576fda97f93f1b6468d5e5c29a1b5410a2d
66 changes: 47 additions & 19 deletions README.md
@@ -18,15 +18,19 @@ Roost detection is based on [Detectron2](https://github.com/darkecology/detectro
- **tracking**
- **utils** contains various utils, scripts to postprocess roost tracks, and scripts to generate visualization
- **tools** is for system deployment
- **demo\*.py** downloads radar scans, renders arrays and some channels as images for visualization,
detects and tracks roosts in them, and postprocesses the results
- **launch_demo\*.py** is a modifiable template that makes multiple calls to **sbatch demo.sbatch**,
each calling **demo.py**, to submit slurm jobs.
- If we want each slurm job to include multiple calls to **demo.py** (e.g., process several time periods at
a station within one slurm job), use **gen_deploy_station_days_scripts.py** to create a **launch\*.py** file
and corresponding **\*.sbatch** files.
- **demo.ipynb** is for interactively running the system
- **publish_images.sh** sends images generated during system deployment to a server to be archived
- **demo.py** downloads radar scans, renders arrays to be processed by models and some channels as images for
visualization, detects and tracks roosts in them, and postprocesses the results.
- **demo.sbatch** defines a slurm job which calls **demo.py**.
- **launch_demo.py** makes multiple calls to **sbatch demo.sbatch** to submit slurm jobs, and
is by default for detecting swallows.
- **launch_demo_bats.py** is for bats.
- **gen_deploy_station_days_scripts.py** can create a **launch\*.py** file and corresponding **\*.sbatch** files,
when we want each slurm job to include multiple calls to **demo.py** (e.g., process several time periods at
a station within one slurm job).
- **publish_images.sh** sends images generated during system deployment to a server where we archive data
- (outdated) **demo.ipynb** is for interactively running the system and is not actively maintained
- (customization) **demo_tiff.py**, **demo_tiff.sbatch**, **launch_demo_tiff.py** are customized given
rendered arrays as tiff files.
- (deprecated) **add_local_time_to_output_files.py** takes in scans*.txt and tracks*.txt files produced by
system deployment and appends local time to each line. Now the system should handle this automatically.
- (deprecated) **post_hoc_counting** takes in tracks* files and computes estimated numbers of animals in
@@ -68,15 +72,21 @@ To run detection with GPU, check the cuda version at, for example, `/usr/local/c
- Monitor from local: `ssh -N -f -L localhost:9990:localhost:9991 username@server`
- Enter `localhost:9990` from a local browser tab

#### Developing a detection model
#### Develop a detection model
- **development** contains all training and evaluation scripts.
- To prepare a training dataset (i.e. rendering arrays from radar scans and
generating json files to define datasets with annotations), refer to
**Installation** and **Dataset Preparation** in the README of
[wsrdata](https://github.com/darkecology/wsrdata.git).
- Before training, optionally run **try_load_arrays.py** to make sure there are no broken npz files.

#### Run Inference
Latest model checkpoints are available
[here](https://drive.google.com/drive/folders/1ApVX-PFYVzRn4lgTZPJNFDHnUbhfcz6E?usp=sharing).
- v1: Beginning of Summer 2021 Zezhou model.
- v2: End of Summer 2021 Wenlong model with 48 AP. Better backbone, anchors, and other config.
- v3: End of Winter 2021 Gustavo model with 55 AP. Adapter layer and temporal features.

#### Deploy the system
A Colab notebook for running small-scale inference is
[here](https://colab.research.google.com/drive/1UD6qtDSAzFRUDttqsUGRhwNwS0O4jGaY?usp=sharing).
Large-scale deployment can be run on CPU servers as follows.
@@ -117,16 +127,34 @@ For example, DET_CFG can be changed to adopt a new detector.
EXPERIMENT_NAME output directory. This way, when we copy newly processed data to the server
that hosts the web UI, previous data won't need to be copied again.

#### Deployment Log
Model checkpoints are available [here](https://drive.google.com/drive/folders/1ApVX-PFYVzRn4lgTZPJNFDHnUbhfcz6E?usp=sharing).
- v1: Beginning of Summer 2021 Zezhou model.
- v2: End of Summer 2021 Wenlong model with 48 AP. Good backbone, anchors, etc.
- v3: End of Winter 2021 Gustavo model with 55 AP. Adapter layer and temporal features.
#### Notes about array, image, and annotation directions
- geometric direction: large y is North (row 0 is South), large x is East
- image direction: large y is South (row 0 is North), large x is East
1. Rendering
1. [Render arrays](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/data/renderer.py#L13)
for the model to process in the **geometric** direction
2. [Render png images](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/data/renderer.py#L161)
for visualization in the **image** direction
3. Generate the list of scans with successfully rendered arrays
2. Detector in the **geometric** direction
1. During training and evaluation, doesn’t use our defined
[Detector class](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/system.py#L27)
1. [dataloader](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/development/experiments_v2/train_roost_detector.py#L220):
XYXY
2. During deployment, use our defined
[Detector class](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/system.py#L27)
which wraps a Predictor. The run function of this Detector [flips the y axis](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/detection/detector.py#L115) of predicted boxes to get the **image** direction and outputs [predicted boxes](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/detection/detector.py#L118) in xyr where xy are center coordinates
3. For rain removal post-processing using dualpol arrays,
[flip the y axis](https://github.com/darkecology/roost-system/blob/b27ffd17e773dfeaedac2a79d453395614e8b679/src/roosts/utils/postprocess.py#L188)
to operate in the **image** direction
4. Generate the list of predicted tracks to accompany png images in the **image** direction
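The two conventions above differ only in the orientation of the y axis; a minimal sketch of the flip (hypothetical helper names, `dim` assumed to be the array height, as in the 600x600 arrays used elsewhere in the repo):

```python
def flip_rows(grid):
    """Convert a 2D grid between geometric (row 0 = South) and image
    (row 0 = North) direction by reversing the row order; self-inverse."""
    return grid[::-1]

def y_to_other_direction(y, dim=600):
    """Map a y coordinate between the two conventions for a dim-row grid;
    also self-inverse, mirroring the y-flip applied to predicted boxes."""
    return dim - 1 - y
```

Applying either helper twice returns the input, which is why a single flip in the Detector's run function is enough to move between conventions.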


#### Website Visualization
In the generated csv files, the following information could be used to further filter the tracks:
#### User Interface Visualization
In the generated csv files that can be imported to a user interface for visualization,
the following information can be used to further filter the tracks:
- track length
- detection scores (-1 represents the bbox is not from detector, instead, our tracking algorithm)
- detection scores (-1 indicates that the bbox comes from our tracking algorithm rather than the detector)
- bbox sizes
- the minutes from sunrise/sunset of the first bbox in a track
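Such post-hoc filtering can be sketched with the standard library (the column names follow the tracks csv header written by the system; the sample rows and the score/length cutoffs are made-up examples):

```python
import csv
import io
from collections import Counter

# a tiny made-up tracks csv in the system's output format
TRACKS_CSV = """track_id,filename,from_sunrise,det_score,x,y,r,lon,lat,radius,local_time
1,KDOX20200901_100000,5.0,0.92,300,310,12,-75.44,38.83,3000,2020-09-01 06:00
1,KDOX20200901_100500,10.0,-1,302,312,13,-75.44,38.83,3200,2020-09-01 06:05
2,KDOX20200901_100000,5.0,0.41,100,120,8,-75.60,38.90,2000,2020-09-01 06:00
"""

rows = list(csv.DictReader(io.StringIO(TRACKS_CSV)))

# track length = number of bboxes sharing a track_id
lengths = Counter(r["track_id"] for r in rows)

kept = [
    r for r in rows
    if lengths[r["track_id"]] >= 2      # drop very short tracks
    and float(r["det_score"]) >= 0.5    # -1 marks tracker-interpolated boxes
]
print([r["track_id"] for r in kept])  # → ['1']
```

Only track 1's detector-scored box survives: its -1 (tracker-produced) box fails the score cutoff, and track 2 is too short.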

5 changes: 4 additions & 1 deletion src/roosts/data/renderer.py
@@ -95,6 +95,7 @@ def render(self, keys, logger, force_rendering=False):
npz_files = [] # the list of arrays for the detector to load and process
scan_names = [] # the list of all scans for the tracker to know
img_files = [] # the list of dz05 images for visualization
success_keys = []

for key in tqdm(keys, desc="Rendering"):
key_splits = key.split("/")
@@ -119,6 +120,7 @@
npz_files.append(npz_path)
scan_names.append(scan)
img_files.append(dz05_path)
success_keys.append(key)
continue

arrays = {}
@@ -149,8 +151,9 @@
npz_files.append(npz_path)
scan_names.append(scan)
img_files.append(dz05_path)
success_keys.append(key)

return npz_files, scan_names, img_files
return npz_files, scan_names, img_files, success_keys

def render_img(self, array, utc_date_station_prefix, scan):
attributes = self.array_render_config['fields']
8 changes: 4 additions & 4 deletions src/roosts/detection/detector.py
@@ -77,11 +77,10 @@ def _preprocess_npz_file(self, npz_paths):

return np.concatenate(image_list, axis=2)

def run(self, array_files, file_type = "npz"):

def run(self, array_files, keys, file_type = "npz"):
outputs = []
count = 0
for idx, file in enumerate(tqdm(array_files, desc="Detecting")):
for idx, (file, key) in enumerate(tqdm(zip(array_files, keys), desc="Detecting")):
# extract scanname
name = os.path.splitext(os.path.basename(file))[0]
# preprocess data
@@ -122,7 +121,8 @@ def run(self, array_files, file_type = "npz"):
"scanname" : name,
"det_ID" : count,
"det_score": scores[kk],
"im_bbox" : bbox_xyr[kk]
"im_bbox" : bbox_xyr[kk],
"key" : key
}
count += 1
outputs.append(det)
43 changes: 32 additions & 11 deletions src/roosts/system.py
@@ -10,11 +10,12 @@
from roosts.utils.postprocess import Postprocess
from roosts.utils.file_util import delete_files
from roosts.utils.time_util import scan_key_to_local_time
from roosts.utils.counting_util import calc_n_animals, image2xy


class RoostSystem:

def __init__(self, args, det_cfg, pp_cfg, dirs):
def __init__(self, args, det_cfg, pp_cfg, count_cfg, dirs):
self.args = args
self.dirs = dirs
self.downloader = Downloader(
@@ -27,6 +28,7 @@ def __init__(self, args, det_cfg, pp_cfg, dirs):
self.detector = Detector(**det_cfg)
self.tracker = Tracker()
self.postprocess = Postprocess(**pp_cfg)
self.count_cfg = count_cfg
self.visualizer = Visualizer(sun_activity=self.args.sun_activity)

def run_day_station(
@@ -61,11 +63,11 @@

######################### (2) Render data #########################
(
npz_files, # the list of arrays for the detector to load and process
scan_names, # the list of all scans for the tracker to know
img_files, # the list of dz05 images for visualization
npz_files, # the list of arrays for the detector to load and process
scan_names, # the list of all scans for the tracker to know
img_files, # the list of dz05 images for visualization
success_keys, # the list of keys
) = self.renderer.render(keys, logger)
delete_files([os.path.join(self.dirs["scan_dir"], key) for key in keys])

if len(npz_files) == 0:
process_end_time = time.time()
@@ -79,6 +81,10 @@
)
return

if self.args.just_render:
return

# initialize output paths
os.makedirs(self.dirs["scan_and_track_dir"], exist_ok=True)
scans_path = os.path.join(
self.dirs["scan_and_track_dir"],
@@ -93,15 +99,17 @@
f.write("filename,local_time\n")
if not os.path.exists(tracks_path):
with open(tracks_path, 'w') as f:
f.write(f'track_id,filename,from_{self.args.sun_activity},det_score,x,y,r,lon,lat,radius,local_time\n')
f.write(
f'track_id,filename,from_{self.args.sun_activity},det_score,x,y,r,lon,lat,radius,local_time,'
f'count_scaling,n_animals,overthresh_percent\n'
)
# we may want to scale a box to be 1.2x large for counting, since
# the box annotations used to train models may trace instead of bound roosts
with open(scans_path, "a+") as f:
f.writelines([f"{scan_name},{scan_key_to_local_time(scan_name)}\n" for scan_name in scan_names])

if self.args.just_render:
return

######################### (3) Run detection models on the data #########################
detections = self.detector.run(npz_files)
detections = self.detector.run(npz_files, success_keys)
logger.info(f'[Detection Done] {len(detections)} detections')

######################### (4) Run tracking on the detections #########################
@@ -123,7 +131,20 @@
)
logger.info(f'[Postprocessing Done] {len(cleaned_detections)} cleaned detections')

######################### (6) Visualize the detection and tracking results #########################
######################### (6) Count animals #########################
for detection in cleaned_detections:
detection["count_scaling"] = self.count_cfg["count_scaling"]
detection["n_animals"], _, detection["overthresh_percent"], _ = calc_n_animals(
pyart.io.read_nexrad_archive(os.path.join(self.dirs["scan_dir"], detection["key"])),
self.count_cfg["sweep_number"],
image2xy(detection["im_bbox"][0], detection["im_bbox"][1], detection["im_bbox"][2], k=self.count_cfg["count_scaling"]),
self.count_cfg["rcs"],
self.count_cfg["threshold"],
method="polar",
)
delete_files([os.path.join(self.dirs["scan_dir"], key) for key in keys])

######################### (7) Visualize the detection and tracking results #########################
# generate gif visualization
if self.args.gif_vis:
""" visualize detections under multiple thresholds of detection score"""
4 changes: 3 additions & 1 deletion src/roosts/utils/counting_util.py
@@ -12,11 +12,13 @@
from wsrlib import *


def image2xy(x, y, r=0, dim=600, rmax=150000, k=1.0):
def image2xy(x, y, r, dim=600, rmax=150000, k=1.0):
'''
Convert from image coordinates to geometric offset from radar
'''

x, y, r = float(x), float(y), float(r)

x0 = y0 = dim / 2.0 # origin
x = (x - x0) * 2 * rmax / dim
y = -(y - y0) * 2 * rmax / dim
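The arithmetic in `image2xy` can be checked in isolation. Below is a sketch reproducing only the x/y math visible in the hunk (the handling of `r` and the scaling factor `k` is elided in the diff, so it is omitted here):

```python
def image2xy_sketch(x, y, dim=600, rmax=150000):
    """Image pixel coordinates -> geometric offset (meters) from the radar.
    The radar sits at the image center; y is negated because image row 0
    is North while geometric y grows toward North."""
    x0 = y0 = dim / 2.0
    x_m = (x - x0) * 2 * rmax / dim   # meters per pixel = 2*rmax/dim = 500
    y_m = -(y - y0) * 2 * rmax / dim
    return x_m, y_m

print(image2xy_sketch(300, 300))  # image center maps to the radar origin
print(image2xy_sketch(450, 150))  # 150 px right and up -> 75 km East, 75 km North
```

With the default 600-pixel, 150 km-radius rendering, each pixel spans 500 m, so a detection 150 pixels from center lies 75 km from the station.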
9 changes: 6 additions & 3 deletions src/roosts/utils/geo_util.py
@@ -16,7 +16,7 @@ def geo_dist_km(coor1, coor2):
return distance.distance(coor1, coor2).km

def cart2pol(x, y):
dis = np.sqrt(x**2 + y**2)
dis = np.sqrt(x ** 2 + y ** 2)
angle = np.arctan2(y, x)
return angle, dis

@@ -28,7 +28,7 @@ def pol2cmp(angle):
bearing = np.mod(bearing, 360)
return bearing

def get_roost_coor(roost_xy, station_xy, station_name, distance_per_pixel):
def get_roost_coor(roost_xy, station_xy, station_name, distance_per_pixel, y_direction="image"):
"""
Convert from image coordinates to geographic coordinates

Expand All @@ -37,12 +37,15 @@ def get_roost_coor(roost_xy, station_xy, station_name, distance_per_pixel):
station_xy: image coordinates of station
station_name: name of station, e.g., KDOX
distance_per_pixel: geographic distance per pixel, unit: meter
y_direction: image (big y means South, row 0 is North) or geographic (big y means North, row 0 is South)

Return:
longitude, latitude of roost center
"""
station_lat, station_lon = NEXRAD_LOCATIONS[station_name]["lat"], NEXRAD_LOCATIONS[station_name]["lon"]
angle, dis = cart2pol(roost_xy[0]-station_xy[0], -roost_xy[1]+station_xy[1])
x_offset = roost_xy[0] - station_xy[0]
y_offset = -(roost_xy[1] - station_xy[1]) if y_direction == "image" else roost_xy[1] - station_xy[1]
angle, dis = cart2pol(x_offset, y_offset)
bearing = pol2cmp(angle)
origin = geopy.Point(station_lat, station_lon)
des = distance.distance(kilometers=dis * distance_per_pixel/ 1000.).destination(origin, bearing)
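The direction math in `cart2pol`/`pol2cmp` can be exercised on its own. This is a sketch: `pol2cmp`'s body is mostly elided in the hunk, so the standard math-angle-to-compass-bearing formula is assumed here, and `math` replaces numpy for scalars:

```python
import math

def cart2pol(x, y):
    # math-convention angle (radians, counterclockwise from East) and distance
    return math.atan2(y, x), math.hypot(x, y)

def pol2cmp(angle):
    # assumed conversion: compass bearing, clockwise degrees from North
    return (90.0 - math.degrees(angle)) % 360.0

# an offset due East of the station -> bearing 90, distance unchanged
angle, dis = cart2pol(1000.0, 0.0)
print(pol2cmp(angle), dis)  # → 90.0 1000.0

# due North -> bearing ~0 (may land just below 360 from float rounding)
print(pol2cmp(cart2pol(0.0, 500.0)[0]))
```

This matches how `get_roost_coor` feeds the bearing and pixel distance into geopy to place the roost on the map.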
4 changes: 3 additions & 1 deletion src/roosts/utils/postprocess.py
@@ -130,7 +130,9 @@ def geo_converter(self, detections):
station_xy = (self.imsize / 2., self.imsize / 2.) # image coordinate of radar station
station_name = det["scanname"][:4]
distance_per_pixel = self.geosize / self.imsize
roost_lon, roost_lat = get_roost_coor(roost_xy, station_xy, station_name, distance_per_pixel)
roost_lon, roost_lat = get_roost_coor(
roost_xy, station_xy, station_name, distance_per_pixel, y_direction="image"
)
geo_radius = det["im_bbox"][2] * distance_per_pixel
det["geo_bbox"] = [roost_lon, roost_lat, geo_radius]
return detections
5 changes: 3 additions & 2 deletions src/roosts/utils/visualizer.py
@@ -294,11 +294,12 @@ def save_predicted_tracks(self, detections, tracks, outpath):
if idx > last_pred_idx:
break
det = det_dict[det_ID]
f.write('{:d},{:s},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:s}\n'.format(
f.write('{:d},{:s},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:.3f},{:s},{:f},{:.5f},{:.5f}\n'.format(
det["track_ID"], det["scanname"], det[f"from_{self.sun_activity}"], det["det_score"],
det["im_bbox"][0], det["im_bbox"][1], det["im_bbox"][2],
det["geo_bbox"][0], det["geo_bbox"][1], det["geo_bbox"][2],
scan_key_to_local_time(det["scanname"])
scan_key_to_local_time(det["scanname"]),
det["count_scaling"], det["n_animals"], det["overthresh_percent"]
))
saved_track = True
if saved_track:
16 changes: 14 additions & 2 deletions tools/demo.py
@@ -6,10 +6,12 @@
from roosts.system import RoostSystem
from roosts.utils.time_util import get_days_list, get_sun_activity_time
from roosts.utils.s3_util import get_station_day_scan_keys
from roosts.utils.counting_util import get_bird_rcs

here = os.path.dirname(os.path.realpath(__file__))

parser = argparse.ArgumentParser()
parser.add_argument('--species', type=str, required=True, help="swallow or bat")
parser.add_argument('--station', type=str, required=True, help="a single station name, eg. KDOX")
parser.add_argument('--start', type=str, required=True, help="the first local date to process, eg. 20101001")
parser.add_argument('--end', type=str, required=True, help="the last local date to process, eg. 20101001")
@@ -21,7 +23,7 @@
parser.add_argument('--data_root', type=str, help="directory for all outputs",
default=f"{here}/../roosts_data")
parser.add_argument('--just_render', action='store_true', help="just download and render, no detection and tracking")
parser.add_argument('--model_version', type=str, default="v2")
parser.add_argument('--model_version', type=str, default="v3")
parser.add_argument('--gif_vis', action='store_true', help="generate gif visualization")
parser.add_argument('--aws_access_key_id', type=str, default=None)
parser.add_argument('--aws_secret_access_key', type=str, default=None)
@@ -60,6 +62,16 @@
"clean_rain": True,
}

# counting config
assert args.species in ["swallow", "bat"]
CNT_CFG = {
"rcs": get_bird_rcs(54) if args.species == "swallow" else 4.519,
"sweep_number": 0, # index of the sweep where we extract counts
"threshold": 68402, # threshold above which reflectivity is considered too high, in the linear scale;
# sometimes it helps to have no threshold, sometimes to cut at 30 dBZ
"count_scaling": 1.2, # the detector model predicts boxes that "trace roosts", enlarge to get a bounding box
}

# directories
DIRS = {
"scan_dir": os.path.join(args.data_root, 'scans'), # raw scans downloaded from AWS
@@ -72,7 +84,7 @@
}

######################### Run #########################
roost_system = RoostSystem(args, DET_CFG, PP_CFG, DIRS)
roost_system = RoostSystem(args, DET_CFG, PP_CFG, CNT_CFG, DIRS)

days = get_days_list(args.start, args.end) # timestamps that indicate the beginning of dates, no time zone info
print("Total number of days: %d" % len(days), flush=True)
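The role of CNT_CFG can be illustrated with a toy version of the count. This is hypothetical: the real `calc_n_animals` works on polar sweep data via pyart/wsrlib; the sketch only shows the two ideas the config encodes, dividing integrated linear reflectivity by a per-animal radar cross section, and excluding samples above the linear threshold while reporting the excluded fraction:

```python
def toy_count(linear_eta, voxel_volumes, rcs, threshold):
    """Estimate an animal count from linear-scale reflectivity samples.

    linear_eta:    reflectivity per sample (linear scale)
    voxel_volumes: volume represented by each sample
    rcs:           radar cross section per animal (cm^2), e.g. 4.519 for bats
    threshold:     samples above this linear value are treated as non-roost
                   signal (e.g. rain) and excluded from the count
    """
    kept = [(e, v) for e, v in zip(linear_eta, voxel_volumes) if e <= threshold]
    overthresh_percent = 100.0 * (len(linear_eta) - len(kept)) / len(linear_eta)
    n_animals = sum(e * v for e, v in kept) / rcs
    return n_animals, overthresh_percent

# one sample far above the 68402 cutoff is dropped; the rest are summed
n, pct = toy_count([100.0, 200.0, 1e6], [1.0, 1.0, 1.0], rcs=4.519, threshold=68402)
print(round(n, 1), pct)  # 300/4.519 animals; 1 of 3 samples over threshold
```

The `count_scaling` entry is applied earlier, when the predicted box is enlarged (e.g. 1.2x) before the samples inside it are gathered, and is not part of this arithmetic.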