--- title: "Data processing assignment: COVID-19" output: html_notebook --- ## The data You are provided with a CSV file, `covid-19.csv` which was downloaded from the EU's open data portal, https://data.europa.eu/euodp/en/home. This file contains data for COVID-19 cases and deaths for each country since the end of 2019. The dates given in the first column are in the format `dd/mm/yyyy`, e.g. `16/05/2020`. These can be converted to R's `date` datatype using the `dmy()` function from the tidyverse `lubridate` package: ```{r} library(tidyverse) library(lubridate) dmy("16/05/2020") ``` ## The tasks 0. Use the tidyverse functions to load this file into a dataframe and convert the dates from strings to the `date` datatype. ```{r} ``` --- Now create figures to show the following: 1. Daily cases for a single country. ```{r} ``` 2. Weekly cases for a single country. *Hint*: use `week()` to convert a date to a number representing the week of the year. ```{r} ``` 3. Cumulative daily cases for a single country. *Hint*: use `arrange()` and `cumsum()` ```{r} ``` 4. Daily cases for each continent. *Hint*: use `group_by()` and `summarise()` ```{r} ``` 5. Apparent overall mortality rate (total deaths / total cases) for each country, grouped by continent. ```{r} ``` --- 6. The UK (Referred to as "United_Kingdom" in this data set) has a strong weekly periodicity in reported COVID-19 deaths. (a) Make a plot to illustrate this effect as clearly as possible. *Hint*: `lubridate` provides a function `wday()` which returns the day of the week as an integer, starting from 1 = Sunday. ```{r} ``` (b) Show that the number of deaths reported on Mondays and Tuesdays is significantly lower than on other days. ```{r} ``` (c) Do any other countries show a similar periodicity? ```{r} ``` (d) How could you explain what is happening, and what are the implications for how we use the data? ---