Python code to assist in familiarizing meteorologists with machine learning

alburke/WAF_ML_Tutorial_Part1

WAF Tutorial Part 1: Traditional ML

Introduction

This repository contains the code associated with the WAF manuscript "A Machine Learning Tutorial for Operational Meteorology, Part I: Traditional Machine Learning" by Chase, R. J., Harrison, D. R., Burke, A., Lackmann, G. and McGovern, A., currently under review. While the paper undergoes review, feel free to read its preprint and send any comments by email to the corresponding author. If you have any issues with the code (bugs or other questions), please open an issue on this repo.

This first paper and repo (of two) cover the traditional supervised machine learning methods (e.g., the sklearn models; if you don't know what that phrase even means, that's OK! Check out Section 2 in the paper). We decided to start with the original machine learning methods before jumping into the more advanced techniques. Part two of this paper series will dig into neural networks and deep learning. When that paper is submitted, the repo will be linked here (right now it's a dead link that takes you back here).

Motivation

The number of meteorological journal articles mentioning or using machine learning is growing rapidly (see the figure above or Figure 1 in the paper; data are derived from Clarivate Web of Science). Because of this rapid growth, and because formal instruction in machine learning catered to meteorologists is scarce, this manuscript and code repository were created. The goal is to familiarize meteorologists with the tools of machine learning and accelerate the use of machine learning in meteorological workflows. To accomplish these goals, it is imperative that code and a sandbox for readers to play around with exist.

Background on the example dataset

Rather than discussing machine learning topics only in the abstract, we decided to show an end-to-end example of the machine learning pipeline using the Storm EVent ImagRy (SEVIR) dataset.

SEVIR Sample

SEVIR consists of over 10,000 matched storm events captured in satellite (i.e., GOES-16) and radar (i.e., NEXRAD) images. The specific variables are: red channel visible reflectance, mid-tropospheric water vapor channel brightness temperatures, clean infrared channel brightness temperatures, retrieved vertically integrated liquid, and lightning flashes measured by the GOES-16 Geostationary Lightning Mapper (GLM). The SEVIR dataset GitHub repo can be found here and a helpful notebook tutorial can be found here. We thank the authors of SEVIR (Mark S. Veillette, Siddharth Samsi and Christopher J. Mattioli) for their efforts in creating a high-quality, open-source meteorological dataset primed for machine learning. This dataset will be the centerpiece of both this paper and the next paper in the series.
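
As a quick reference, the five variables listed above can be sketched as a small lookup table. This is only an illustration: the short keys (`vis`, `ir069`, `ir107`, `vil`, `lght`) are assumed to follow the image-type names used in the SEVIR catalog and should be checked against the SEVIR documentation, and `variables_from` is a hypothetical helper, not part of this repository.

```python
# Sketch of the five SEVIR variables described above.
# ASSUMPTION: the short keys mirror SEVIR's image-type names; verify them
# against the SEVIR catalog before relying on them.
SEVIR_VARIABLES = {
    "vis": ("satellite", "red channel visible reflectance"),
    "ir069": ("satellite", "mid-tropospheric water vapor brightness temperature"),
    "ir107": ("satellite", "clean infrared brightness temperature"),
    "vil": ("radar", "retrieved vertically integrated liquid"),
    "lght": ("satellite", "GLM-measured lightning flashes"),
}


def variables_from(source):
    """Return the variable keys measured by the given source ('satellite' or 'radar')."""
    return [key for key, (src, _) in SEVIR_VARIABLES.items() if src == source]


if __name__ == "__main__":
    for key, (src, description) in SEVIR_VARIABLES.items():
        print(f"{key:>6} ({src}): {description}")
```

Calling `variables_from("radar")` returns only the NEXRAD-derived field, which makes it easy to see that four of the five variables come from the satellite.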

Getting Started

There are two main ways to interact with the code here.

Use Google Colab

This is the recommended and quickest way to get started, and it only requires a (free) Google account. Google Colab is a cloud instance of Python that runs in your favorite web browser (although it works best in Chrome). If you wish to use these notebooks, see the directory named colab_notebooks. There is a button at the top of each notebook that says open in colab. It will open the notebook in a new browser tab.

Install python on your local machine and run notebooks there

This is a bit more involved, especially for people who have never installed Python on their machine. This method does ensure you always have the right packages installed, and it would let you download the entire SEVIR dataset if you want it (although it is very big: 924 GB total).
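
Before attempting a full download, it may be worth checking that the target drive has enough room. The sketch below uses only the Python standard library; the 924 figure comes from the text above, while the function names, the decimal-gigabyte interpretation, and the 10% safety margin are assumptions made for illustration.

```python
import shutil

# Total SEVIR size quoted above; ASSUMED here to mean decimal gigabytes.
SEVIR_TOTAL_GB = 924


def free_space_gb(path="."):
    """Free space on the drive that holds `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def has_room_for(required_gb, path=".", safety_margin=1.1):
    """True if the drive has at least `required_gb` plus a margin available.

    The default 10% margin is an arbitrary illustrative choice.
    """
    return free_space_gb(path) >= required_gb * safety_margin


if __name__ == "__main__":
    print(f"Free space here: {free_space_gb():.1f} GB")
    print(f"Room for all of SEVIR: {has_room_for(SEVIR_TOTAL_GB)}")
```

Pass the directory you plan to download into as `path`, since it may live on a different drive than the current working directory.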

  1. Set up a Python installation on the machine you are using. I recommend installing Miniconda since it requires less storage than the full Anaconda Python distribution. Follow the instructions on the Miniconda page to download and install Miniconda for your operating system. It is best to do these steps in a terminal (Mac/Linux) or PowerShell (Windows).

    Once you have it set up, it is a good idea to have Python and Jupyter in the base environment.

    $ conda install -c conda-forge python jupyterlab

  2. Now that conda is installed, clone this repository to your local machine with the command:

    $ git clone https://github.com/ai2es/WAF_ML_Tutorial_Part1.git

    If you don't have git, you can install it (Install Git) or choose the "Download Zip" option, unzip the archive, and then continue with these steps.

  3. Change into the newly downloaded directory

    $ cd WAF_ML_Tutorial_Part1

  4. It is good practice to create a new environment for each project you work on, so here we will create one from the environment.yml file

    $ conda env create -f environment.yml

  5. Activate the new environment

    $ conda activate waf_tutorial_part1

  6. Add this new environment as a kernel in Jupyter

    $ python -m ipykernel install --user --name waf_tutorial_part1 --display-name "waf_tutorial_part1"

  7. Go back to the base environment

    $ conda deactivate

  8. Start jupyter

    $ jupyter lab

  9. You should now be able to open the notebooks in this repository and select the kernel we just installed, named waf_tutorial_part1. To change from the default kernel, click on the Kernel menu, choose Change Kernel..., and select the waf_tutorial_part1 kernel.
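
If a notebook fails to start or an import errors out, a quick sanity check of the setup can help narrow things down. The sketch below is one possible check, not part of this repository: `check_packages` and `kernel_is_registered` are hypothetical helpers, and the package list is an assumption (consult environment.yml for what the tutorial actually needs). Package checks should be run inside the waf_tutorial_part1 environment; the kernel check needs the `jupyter_client` package and returns `None` when it is not importable.

```python
import importlib.util


def check_packages(names):
    """Map each top-level package name to True if it is importable here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def kernel_is_registered(kernel_name):
    """True/False if Jupyter knows the kernel; None if jupyter_client is missing."""
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return None
    return kernel_name in KernelSpecManager().find_kernel_specs()


if __name__ == "__main__":
    # ASSUMPTION: illustrative package list only; see environment.yml for the real one.
    for name, found in check_packages(["numpy", "sklearn", "matplotlib"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
    print("kernel registered:", kernel_is_registered("waf_tutorial_part1"))
```

A missing package usually means the environment from step 5 is not active, while an unregistered kernel suggests step 6 needs to be rerun.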
