---
title: "Data processing assignment: COVID-19"
output: html_notebook
---


## The data

You are provided with a CSV file, `covid-19.csv`, which was downloaded from the EU's open data portal, https://data.europa.eu/euodp/en/home. This file contains data on COVID-19 cases and deaths for each country since the end of 2019.

The dates given in the first column are in the format `dd/mm/yyyy`, e.g. `16/05/2020`.
These can be converted to R's `Date` class using the `dmy()` function from the tidyverse `lubridate` package:

```{r}
library(tidyverse)
library(lubridate)

dmy("16/05/2020")
```
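
`dmy()` is vectorised, so it can also convert a whole column at once inside `mutate()`. The following is a minimal sketch on a made-up tibble. The column names `date_string` and `date` are hypothetical and are not taken from `covid-19.csv`.

```{r}
library(tidyverse)
library(lubridate)

# Toy data; `date_string` is a hypothetical column name
toy_dates <- tibble(date_string = c("16/05/2020", "17/05/2020", "01/06/2020"))

# Parse every string in the column into a Date
toy_dates <- toy_dates %>%
  mutate(date = dmy(date_string))

class(toy_dates$date)
```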


## The tasks

0. Use tidyverse functions to load this file into a data frame and convert the dates from strings to the `Date` class.

```{r}

```

---

Now create figures to show the following:

1. Daily cases for a single country.

```{r}

```


2. Weekly cases for a single country. *Hint*: use `week()` to convert a date to a number representing the week of the year.
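
As a small illustration of the hint, the sketch below applies `week()` to a few hand-picked dates (not taken from the data file). Note that `week()` counts complete seven-day blocks from 1 January, so its weeks do not necessarily start on a Monday; `lubridate` also offers `isoweek()` if ISO 8601 weeks are preferred.

```{r}
library(lubridate)

# A few example dates, chosen by hand
example_dates <- dmy(c("01/01/2020", "07/01/2020", "08/01/2020", "16/05/2020"))

# Number of complete seven-day blocks since 1 January, plus one
example_weeks <- week(example_dates)
example_weeks
```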

```{r}

  
```


3. Cumulative daily cases for a single country. *Hint*: use `arrange()` and `cumsum()`.
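
The two functions in the hint work together: `cumsum()` accumulates values in row order, so the rows need to be sorted by date first. Below is a minimal sketch on made-up numbers; the tibble and its column names are hypothetical.

```{r}
library(tidyverse)
library(lubridate)

# Made-up counts, deliberately out of date order
toy_cases <- tibble(
  date  = dmy(c("03/03/2020", "01/03/2020", "02/03/2020")),
  cases = c(5, 1, 2)
)

# Sort by date first so the running total accumulates in time order
toy_cumulative <- toy_cases %>%
  arrange(date) %>%
  mutate(cumulative_cases = cumsum(cases))

toy_cumulative
```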


```{r}


```

4. Daily cases for each continent. *Hint*: use `group_by()` and `summarise()`.
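
To show the general pattern behind the hint, the sketch below collapses a toy tibble to one row per group. The columns `region`, `day` and `cases` are hypothetical stand-ins, not the column names of the real data.

```{r}
library(tidyverse)

# Made-up data; `region`, `day` and `cases` are hypothetical column names
toy_regions <- tibble(
  region = c("A", "A", "B", "B", "B"),
  day    = c(1, 2, 1, 1, 2),
  cases  = c(10, 20, 1, 2, 3)
)

# One row per region/day combination, with cases summed within each group
toy_totals <- toy_regions %>%
  group_by(region, day) %>%
  summarise(total_cases = sum(cases), .groups = "drop")

toy_totals
```

The `.groups = "drop"` argument returns an ungrouped tibble, which avoids surprises in later steps.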

```{r}


```


5. Apparent overall mortality rate (total deaths / total cases) for each country, grouped by continent.

```{r}


```

---

6. The UK (referred to as "United_Kingdom" in this data set) has a strong weekly periodicity in reported COVID-19 deaths.

(a) Make a plot to illustrate this effect as clearly as possible. *Hint*: `lubridate` provides a function `wday()` which returns the day of the week as an integer, starting from 1 = Sunday.
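
As a quick illustration of the hint, the sketch below calls `wday()` on three hand-picked consecutive dates (a Saturday, a Sunday and a Monday). The `label = TRUE` argument is an optional extra that returns day names rather than integers.

```{r}
library(lubridate)

# Saturday, Sunday and Monday, chosen by hand
example_days <- dmy(c("16/05/2020", "17/05/2020", "18/05/2020"))

# Integer day of the week, with 1 = Sunday by default
day_numbers <- wday(example_days)
day_numbers

# label = TRUE returns an ordered factor of day names instead,
# which is often easier to read on a plot axis
day_labels <- wday(example_days, label = TRUE)
day_labels
```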

```{r}

```

(b) Show that the number of deaths reported on Mondays and Tuesdays is significantly lower than on other days.

```{r}

```

(c) Do any other countries show a similar periodicity?

```{r}

```

(d) How could you explain what is happening, and what are the implications for how we use the data?



---