forked from RandomWalker300/edav_proj
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathedav-final-project.Rmd
1085 lines (764 loc) · 51.6 KB
/
edav-final-project.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "Hurricane Analysis and Visualization Using R"
author: "Romane Goldmuntz, Vy Tran, and Jianqiong Zhan"
date: "`r Sys.Date()`"
site: bookdown::bookdown_site
output: bookdown::gitbook
documentclass: book
bibliography: [proj.bib, packages.bib]
biblio-style: apalike
link-citations: yes
github-repo: rstudio/bookdown-demo
description: "This is an edav class final project"
---
# Introduction {#intro}
Hurricanes have caused tremendous losses on the country’s economy (@Winkle2018): they currently cost the government over \$28 billion each year. Besides the government, several industries are heavily impacted by hurricanes, including the insurance industry. For example, according to Bloomberg, hurricane Dorian caused the insurance industry losses of up \$25 billion, making it the most expensive natural disaster for the industry since 2017’s Hurricane Maria (@DSouza2019). Therefore, the United States’ Sustainable Development has identified economic losses from hurricane as a key indicator for the economic growth (@SDG2018). Disentangling the relative roles of variability and changes in Hurricane could contribute information useful to sustainable development goals.
```{r echo=FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
```
Key frequently asked questions are: how climate affects hurricane, and what of the future hurricanes are more likely to be. To gain insight on these questions, we first have to look into historical records to discern the variability and changes of hurricanes, the analysis of which will help us to build predictive understanding of the influencing of climate on hurricane. Our working hypothesis is that: if greenhouse warming has caused a substantial increase in Atlantic hurricane activity, then the century scale increase in Atlantic hurricane since 1851 should have produced a long-term rise in measures of Atlantic hurricanes activity, similar to that seen for global temperature, for example. Therefore, we aim to dig into the original storm track file with more than 150 years of records of Atlantic tropical storm activities in order to examine variability and changes of Atlantic hurricane under global warming conditions. Specifically, three important questions we would like to address in this analysis are:
_(1) What influences the change in category of hurricanes?_
_(2) Has hurricane activity shifted north-ward?_
_(3) Have humans already caused a detectable increase in Atlantic hurricane activity?_
<!--chapter:end:index.Rmd-->
# Data Source {#datasource}
The storm track data presented in the analysis is downloaded from [National Hurricane Center and Central Pacific Hurricane Center](https://www.nhc.noaa.gov/data/#hurdat), which is also known as the Atlantic hurricane database [HURDAT2: 1851-2018] (https://www.nhc.noaa.gov/data/hurdat/hurdat2-1851-2018-120319.txt) and the [HURDAT](https://www.aoml.noaa.gov/hrd/hurdat/comparison_table.html) data is downloaded from the [NOAA’s Atlantic Oceanographic and Meteorological Laboratory] (https://www.aoml.noaa.gov/).
We choose to work with these data because they provide us with updated, complete and accurate information about hurricanes over the past 150 years (1851 - 2018). Since the Atlantic hurricane activity has shown very strong year-to-year and decade-to-decade variability, longer records or hurricanes are much needed.
One important thing to note about the names of hurricanes is that there are six names lists that have been used in rotation and re-cycled every six years. This means that there can be more than one hurricanes having the same name but happened in different years. We have tackled this problem by giving each hurricane a unique `storm-id`.
<!--chapter:end:01-DataSource.Rmd-->
# Data Transformation {#datatrans}
<!--test-->
The ([HURDAT2](https://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf)) data has a comma-delimited, text format with six-hourly information on the location, maximum winds, central pressure, and (beginning in 2004) size of all known tropical cyclones and subtropical cyclones. The dataset is a combination of serveral subsets. Each subset is used for a storm track record which includes header information and values.
please refer to [this file](https://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf) for detail information.
```{r include=FALSE}
# keep this chunk in your .Rmd file
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
```
```{r echo=FALSE}
library(tidyverse)
library(stringr)
library(GGally)
# Read in data set
#dfile <- read_lines("https://www.nhc.noaa.gov/data/hurdat/hurdat2-1851-2018-120319.txt")
dfile<- "../data/raw/hurdat2-1851-2018-120319.txt"
hurdat2_in <- read_lines(dfile)
```
Firstly, we extract storm `id`, `name`, and `subtext length` from each subtext header, then read in data according to each `subtext length`, and merge data subset by indexing it with storm `id`, `name`. In the original text, you will find that the `name` is non-unique. Currently, there are six lists that are used in rotation and re-cycled every six years, i.e., the 2013 list is used again in 2019. For more information, please see tropical cyclone names. To avoid the future confusion, we create `storm-id` variable by combining `name` and `year` together (e.g., _Sandy-2012_). In the original file, there are storms labeled with NAMEs but others labelled with `UNNAME`. Here, we use `name_id` variable to indicate whether a storm has a name or not.
```{r echo=FALSE}
# read hearder information
library(stringr)
header_locations <- (1:length(hurdat2_in))[stringr::str_count(hurdat2_in, "\\,") == 3]
header_df <- readr::read_csv(hurdat2_in[header_locations],
col_names = FALSE) %>%
dplyr::select(-c("X4"))
names(header_df) <- c("id","name","n_entries")
header_df <- header_df %>% mutate(header_loc = as.numeric(header_locations))
```
```{r echo=FALSE}
#tail(header_df)
```
```{r echo=FALSE}
# read data value
hurdat2_df <- vector("list", nrow(header_df))
names(hurdat2_df) <- header_df$id
df_names <- c(
"date", "time", "record_identifier", "status", "latitude", "longitude", "max_wind", "min_pressure",
"extent_34_NE", "extent_34_SE", "extent_34_SW", "extent_34_NW",
"extent_50_NE", "extent_50_SE", "extent_50_SW", "extent_50_NW",
"extent_64_NE", "extent_64_SE", "extent_64_SW", "extent_64_NW", "nas"
)
```
```{r echo=FALSE}
#
for (i in seq_along(header_df$id)) {
hurdat2_df[[i]] <- read_csv(dfile,
skip = header_df$header_loc[i],
n_max = header_df$n_entries[i],
col_names = df_names,
na = c("", "-99", "-999"),
col_types = list(
time = col_character(),
record_identifier = col_character(),
min_pressure = col_integer(),
extent_34_NE = col_integer(),
extent_34_SE = col_integer(),
extent_34_SW = col_integer(),
extent_34_NW = col_integer(),
extent_50_NE = col_integer(),
extent_50_SE = col_integer(),
extent_50_SW = col_integer(),
extent_50_NW = col_integer(),
extent_64_NE = col_integer(),
extent_64_SE = col_integer(),
extent_64_SW = col_integer(),
extent_64_NW = col_integer()
)
)
}
```
```{r echo=FALSE}
#hurdat2_df[[1]]
```
Secondly, we estimate `category` variable from wind speed based on [Saffir-Simpson storm category](https://www.nhc.noaa.gov/aboutsshws.php), calculate the diameter of the area experiencing hurricane strength winds (64 knots or above), `_ts_diameter_` from extent of 34 kt wind radii maximum extent in northeastern quadrant (in nautical miles, `extent_34_NE `), 34 kt wind radii maximum extent in southeastern quadrant (in nautical miles, `extent_34_SW`), 34 kt wind radii maximum extent in northeastern quadrant (in nautical miles, ` extent_34_NW `), and 34 kt wind radii maximum extent in southeastern quadrant (in nautical miles, ` extent_34_SE`), `_hu_diameter_` from extent of 64 kt wind radii maximum extent in northeastern quadrant - `extent_64_NE `, southeastern quadrant - `extent_34_SW`, northeastern quadrant, ` extent_64_NW `), and southeastern quadrant - ` extent_64_SE`.
```{r echo=FALSE}
# Combine and clean the data sets
library(lubridate)
hurdat2 <-
hurdat2_df %>%
dplyr::bind_rows(.id = "id") %>%
dplyr::mutate(
date = lubridate::ymd(date),
year = lubridate::year(date),
month = lubridate::month(date),
day = lubridate::day(date),
hour = as.numeric(stringr::str_sub(time, 1, 2)),
#datetime = as.Date(ISOdate(year, month, day, hour, min = 0, sec = 0, tz = "GMT")),
datetime = lubridate::ymd_h(paste(year, month, day, hour, sep="-")),
#lat_hemisphere = stringr::str_sub(latitude, -1),
latitude = dplyr::if_else(stringr::str_sub(latitude, -1) == "N",
as.numeric(stringr::str_sub(latitude, 1, -2))*1,
as.numeric(stringr::str_sub(latitude, 1, -2))*(-1)),
longitude = dplyr::if_else(stringr::str_sub(longitude, -1) == "E",
as.numeric(stringr::str_sub(longitude, 1, -2))*1,
as.numeric(stringr::str_sub(longitude, 1, -2))*(-1)),
category = cut(max_wind, # Saffir-Simpson Hurricane Wind Scale
breaks = c(0, 34, 64, 83, 96, 113, 137, 500),
labels = c(-1, 0, 1, 2, 3, 4, 5),
include.lowest = TRUE, ordered = TRUE
),
# wind = wind * 1.15078, # transforms knots to mph,
TSradius1 = extent_34_NE + extent_34_SW,
TSradius2 = extent_34_NW + extent_34_SE,
ts_diameter = pmax(TSradius1, TSradius2) * 1.15078, # to convert from nautical miles to miles # pmax: returns the parallel maxima and minima of the input values
HUradius1 = extent_64_NE + extent_64_SW,
HUradius2 = extent_64_NW + extent_64_SE,
hu_diameter = pmax(HUradius1, HUradius2) * 1.15078, # to convert from nautical miles to miles # pmax: returns the parallel maxima and minima of the input values
status = recode(status,
"TD" = "tropical depression", # maximum sustained winds below 39 mph
"TS" = "tropical storm", # 39-73 mph
"HU" = "tropical hurricane", #74-95, 96-110, 111-130, 131-155, >156
"EX" = "Extratropical cyclone", ##
"SD" = "subtropical depression", #<18m/s or <35
"SS" = "subtropical storm",
"LO" = "a low",
"WV" = "tropical wave",
"DB" = "disturbance")
)
```
<!--test: category has been calculated based on [Saffir-Simpson Hurricane Wind Scale](https://www.nhc.noaa.gov/aboutsshws.php) to indicate "Types of Damage Due to Hurricane Winds". -->
```{r echo=FALSE}
# merge header information to data values
header_df_selected <- header_df %>% dplyr::select(c("id","name"))
# headers_df_selected
hurdat2_add_name <- left_join(header_df_selected, hurdat2, by=c("id")) %>%
dplyr::select(id, name, datetime, year, month, day, hour, record_identifier, latitude, longitude, status, category,
max_wind, min_pressure, ts_diameter, hu_diameter)
```
```{r echo=FALSE}
hurdat2_out <- hurdat2_add_name %>%
mutate(name= dplyr::if_else(grepl("UNNAMED", name), name,
stringr::str_to_title(name)))
hurdat2_out$status <- factor(hurdat2_out$status)
hurdat2_out$category <- factor(hurdat2_out$category)
```
```{r echo=FALSE}
# absorb header information to data values
header_df_selected <- header_df %>% dplyr::select(c("id","name"))
# headers_df_selected
hurdat2_add_name <- left_join(header_df_selected, hurdat2, by=c("id")) %>%
dplyr::select(id, name, datetime, year, month, day, hour, record_identifier, latitude, longitude, status, category,
max_wind, min_pressure, ts_diameter, hu_diameter)
```
```{r echo=FALSE}
hurdat2_out <- hurdat2_add_name %>%
mutate(name= dplyr::if_else(grepl("UNNAMED", name), name,
stringr::str_to_title(name)))
hurdat2_out$status <- factor(hurdat2_out$status)
hurdat2_out$category <- factor(hurdat2_out$category)
```
```{r echo=FALSE}
#levels(hurdat2_out$status)
#hurdat2_out %>% dplyr::filter(status == "ET")
```
<!--test:*Note: there is an "ET" in _Status of system_, which does not included in the description [HURDAT2](https://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf). This is a typo in the dataset, `recode` it into 'EX".*-->
```{r echo=FALSE}
hurdat2_out$status <- dplyr::recode(hurdat2_out$status, ET = "Extratropical cyclone")
```
```{r echo=FALSE}
#hurdat2_out$status %>% unique()
```
```{r echo=FALSE}
hurdat2_out$category <- factor(hurdat2_out$category, levels=c("0","1","2","3","4","5"))
```
```{r echo=FALSE}
hurdat2_out$storm_id <- paste(hurdat2_out$name, hurdat2_out$year, sep="-")
#df$date <- lubridate::ymd_h(paste(df$year, df$month, df$day, df$hour, sep="-"))
```
```{r echo=FALSE}
#names(df)
```
<!--Save transformed data** for further use.-->
Thirdly, we estimate the storm duration `tc_dur_track` to those with maximum sustained surface winds of at least 35 knot and defined storms and define `tc_dur_type` for type of the duration. Here, `S` indicates storms with duration of 2.0 days or less and will be mentioned in the following text as *short-lived* storms, and ` L` represnts storms with duration of more than 2.0 days and will be referred as “medium-to-long lived” storms.
```{r echo=FALSE}
hurdat2_out <- hurdat2_out %>%
#dplyr::filter(status == c("tropical hurricane", "tropical storm")) %>%
group_by(id) %>% mutate(dur_track=as.numeric(
max(datetime) - min(datetime), units = "days")) %>% ungroup() %>%
mutate(dur_type = dplyr::if_else(dur_track<=2, "S","L"))%>%
#if_else((dur_track<=4 & dur_track>2), "M", "S")
#)) %>%
mutate(dur_type=factor(dur_type, levels = c("S", "L")))
```
```{r echo=FALSE}
#dim(hurdat2_out)
```
```{r echo=FALSE}
tcs <- c("tropical storm", "tropical hurricane","subtropical storm")
df <- hurdat2_out %>% dplyr::filter(status %in% tcs)
#df$status <- factor(df$status, levels=tcs)
df$status <- factor(df$status, levels=tcs)
```
```{r echo=FALSE}
#dim(df)
```
```{r echo=FALSE}
df <- df %>%
dplyr::filter(status %in% tcs) %>%
dplyr::group_by(id) %>% mutate(tc_dur_track=as.numeric(
max(datetime) - min(datetime), units = "days"))%>%
ungroup() %>%
mutate(tc_dur_type = dplyr::if_else(tc_dur_track<=2, "S",
"L"
# if_else((tc_dur_track<=4 & tc_dur_track>2), "M", "L")
)) %>%
mutate(tc_dur_type=factor(tc_dur_type, levels = c("S", "M", "L")))
```
```{r echo=FALSE}
df$category <- (as.numeric(df$category)-1)
#df$category %>% unique()
```
```{r echo=FALSE}
df <- df %>%
dplyr::filter(status %in% tcs) %>%
dplyr::group_by(id) %>%
mutate(unname_label = if_else(name=="UNNAMED", "no", "yes"),
max_category = max(category),
max_status_label = (category == max(category)),
max_max_wind = max(max_wind),
min_min_pressure = min(min_pressure),
max_ts_diameter = max(ts_diameter),
max_hu_diamter = max(hu_diameter)) %>%
ungroup()
#names(df)
#df$year %>% unique()
```
```{r echo=FALSE}
dir <- '../data/clean/hurricanes.csv'
readr::write_csv(df, dir)
```
Finally, the ocean surface temperature is `.nc` formate, we use `ncdf4` to read in data. Note that there is there is an "ET" typo in _Status_ of system in the [HURDAT2]( https://www.nhc.noaa.gov/data/hurdat/hurdat2-1851-2018-120319.txt), which has been corrected to `EX` in the output `data\clean\hurricanes.csv` file.
<!--
_read in atmospheric CO2_
-->
```{r echo=FALSE}
d <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt',
na.strings = -99.99)
names(d) <- c('year', 'month', 'decimalYear', 'average', 'interpolated', 'trend', 'ndays')
d$time <- with(d, as.Date(paste(year, sprintf('%02d', month), '01', sep='-'), format='%Y-%m-%d', tz='UTC'))
time <- d$time
co2 <- d$interpolated
df_aco2 <- data.frame(time, co2)
#names(d)
```
```{r echo=FALSE}
#d$year %>% unique()
```
<!--
_read in HUDART (1970-2018)_
-->
```{r echo=FALSE}
hurdat_names <- c("year","Original_Named_Storms","Revised_Named_Storms",
"Original_Hurricanes", "Revised_Hurricanes",
"Original_Major_Hurricanes","Revised_Major_Hurricanes","Original_ACE","Revised_ACE",
"Original_US_Hurricanes","Original_US_Hurricanes_Category",
"Revised_US_Hurricanes", "Revised_US_Hurricanes_Category")
col_types = list(year = col_double(),
Original_Named_Storms = col_double(),
Revised_Named_Storms = col_double(),
Original_Hurricanes = col_double(),
Revised_Hurricanes = col_double(),
Original_Major_Hurricanes = col_double(),
Revised_Major_Hurricanes = col_double(),
Original_ACE = col_double(),
Revised_ACE = col_double(),
Original_US_Hurricanes = col_double(),
Original_US_Hurricanes_Category = col_character(),
Revised_US_Hurricanes = col_double(),
Revised_US_Hurricanes_Category = col_character())
dfhurdatin<- "../data/raw/HURDAT.csv"
df_hurdat <- read_csv(dfhurdatin, skip =7, n_max = 168, col_names = hurdat_names, na = c(""," "), col_types = col_types)
#names(df_hurdat)
#df_hurdat$year
```
<!--
_read in ocean temperature (1870-2010)_
-->
```{r echo=FALSE}
library(ncdf4)
ncfname <- "../data/raw/AMO_HADLEY.1870-2010.CLM_1901-1970.nc"
ncIn <- nc_open(ncfname)
#print(ncIn)
year <- ncvar_get(ncIn, "year")
AMO_WARM <- ncvar_get(ncIn, "AMO_WARM")
AMO_WARM_REMOVED <- ncvar_get(ncIn, "AMO_WARM_REMOVED")
AMO_WARM_REMOVED_SMTH <- ncvar_get(ncIn, "AMO_WARM_REMOVED_SMTH")
AMO_WARM_SMTH <- ncvar_get(ncIn, "AMO_WARM_SMTH")
SST_GLOBAL_MEAN <- ncvar_get(ncIn, "SST_GLOBAL_MEAN")
SST_GLOBAL_MEAN_SMTH <- ncvar_get(ncIn, "SST_GLOBAL_MEAN_SMTH")
df_amo <- data.frame(year, AMO_WARM, AMO_WARM_REMOVED, AMO_WARM_REMOVED_SMTH,AMO_WARM_SMTH,
SST_GLOBAL_MEAN, SST_GLOBAL_MEAN_SMTH)
#names(df_amo)
#df_amo$year
```
<!--_Save data for furture use_*-->
```{r echo=FALSE}
dir <- '../data/clean/'
write_csv(hurdat2_out, file.path(dir, "hurdat2_out.csv"))
```
```{r echo=FALSE}
#names(hurdat2_out)
```
```{r echo=FALSE}
#names(df)
```
```{r echo=FALSE}
#df$max_max_wind %>% unique()
```
**Meaning for each variables**
`_id_`
Storm id, which is unique. An id is a combination of 8 characters,
for example, 'AL092011',
* AL (Spaces 1 and 2) – Basin – Atlantic
* 09 (Spaces 3 and 4) – ATCF cyclone number for that year
* 2011 (Spaces 5-8, before first comma) – Year
for detail information, please see [dataformat](https://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf)
`_name_`
Storm Name, which is non-unique. There are six lists that are used in rotation and re-cycled every six years, i.e., the 2013 list is used again in 2019. For more information, please see [tropical cyclone names](https://www.nhc.noaa.gov/aboutnames.shtml).
`_storm_id_`
Storm name and id combined, i.e., Sandy-2012
`_unname_label_`
Storms have name or not (“yes”, “no”)
`_datetime, year, month, day, hour_`
Date of report (in Universal Time Coordinate)
`_record_identifier_`
C – Closest approach to a coast, not followed by a landfall
G – Genesis
I – An intensity peak in terms of both pressure and wind
L – Landfall (center of system crossing a coastline)
P – Minimum in central pressure
R – Provides additional detail on the intensity of the cyclone when rapid changes are underway
S – Change of status of the system
T – Provides additional detail on the track (position) of the cyclone
W – Maximum sustained wind speed
`_latitude,longitude_`
Location of storm center
`_status_`
Storm classification (Tropical Depression, Tropical Storm, or Hurricane)
TD – Tropical cyclone of tropical depression intensity (< 34 knots)
TS – Tropical cyclone of tropical storm intensity (34-63 knots)
HU – Tropical cyclone of hurricane intensity (> 64 knots)
EX – Extratropical cyclone (of any intensity)
SD – Subtropical cyclone of subtropical depression intensity (< 34 knots)
SS – Subtropical cyclone of subtropical storm intensity (> 34 knots)
LO – A low that is neither a tropical cyclone, a subtropical cyclone, nor an extratropical cyclone (of any intensity)
WV – Tropical Wave (of any intensity)
DB – Disturbance (of any intensity)
`_category_`
[Saffir-Simpson storm category](https://www.nhc.noaa.gov/aboutsshws.php) (estimated from wind speed. -1 = Tropical Depression, 0 = Tropical Storm)
`_max_wind_`
storm's maximum sustained wind speed (in knots)
`_min_pressure_`
Air pressure at the storm's center (in millibars)
`_ts_diameter_`
Diameter of the area experiencing tropical storm strength winds (34 knots or above)
`_hu_diameter_`
Diameter of the area experiencing hurricane strength winds (64 knots or above)
`_max_category_`
Maximum category of each storm track
`_max_status_label_`
Label (“TRUE”, “FALSE”) to indicate whether the measurement is for the maximum status of each track
`_max_max_wind_`
The maximum value of the ` max_wind ` for each track
`_min_min_pressure_`
The minimum value of the ` min_pressure ` for each track
`_max_ts_diameter_`
The maximum value of the ` ts_diameter ` for each track
`_max_hu_diamter_`
The maximum value of the ` hu_diameter ` for each track
!--test-->
```{r echo=FALSE}
d <- read.table('ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt',
na.strings = -99.99)
names(d) <- c('year', 'month', 'decimalYear', 'average', 'interpolated', 'trend', 'ndays')
d$time <- with(d, as.Date(paste(year, sprintf('%02d', month), '01', sep='-'), format='%Y-%m-%d', tz='UTC'))
time <- d$time
co2 <- d$interpolated
df_aco2 <- data.frame(time, co2)
#names(d)
```
```{r echo=FALSE}
#d$year %>% unique()
```
```{r echo=FALSE}
hurdat_names <- c("year","Original_Named_Storms","Revised_Named_Storms",
"Original_Hurricanes", "Revised_Hurricanes",
"Original_Major_Hurricanes","Revised_Major_Hurricanes","Original_ACE","Revised_ACE",
"Original_US_Hurricanes","Original_US_Hurricanes_Category",
"Revised_US_Hurricanes", "Revised_US_Hurricanes_Category")
col_types = list(year = col_double(),
Original_Named_Storms = col_double(),
Revised_Named_Storms = col_double(),
Original_Hurricanes = col_double(),
Revised_Hurricanes = col_double(),
Original_Major_Hurricanes = col_double(),
Revised_Major_Hurricanes = col_double(),
Original_ACE = col_double(),
Revised_ACE = col_double(),
Original_US_Hurricanes = col_double(),
Original_US_Hurricanes_Category = col_character(),
Revised_US_Hurricanes = col_double(),
Revised_US_Hurricanes_Category = col_character())
dfhurdatin<- "../data/raw/HURDAT.csv"
df_hurdat <- read_csv(dfhurdatin, skip =7, n_max = 168, col_names = hurdat_names, na = c(""," "), col_types = col_types)
#names(df_hurdat)
#df_hurdat$year
```
```{r echo=FALSE}
library(ncdf4)
ncfname <- "../data/raw/AMO_HADLEY.1870-2010.CLM_1901-1970.nc"
ncIn <- nc_open(ncfname)
#print(ncIn)
year <- ncvar_get(ncIn, "year")
AMO_WARM <- ncvar_get(ncIn, "AMO_WARM")
AMO_WARM_REMOVED <- ncvar_get(ncIn, "AMO_WARM_REMOVED")
AMO_WARM_REMOVED_SMTH <- ncvar_get(ncIn, "AMO_WARM_REMOVED_SMTH")
AMO_WARM_SMTH <- ncvar_get(ncIn, "AMO_WARM_SMTH")
SST_GLOBAL_MEAN <- ncvar_get(ncIn, "SST_GLOBAL_MEAN")
SST_GLOBAL_MEAN_SMTH <- ncvar_get(ncIn, "SST_GLOBAL_MEAN_SMTH")
df_amo <- data.frame(year, AMO_WARM, AMO_WARM_REMOVED, AMO_WARM_REMOVED_SMTH,AMO_WARM_SMTH,
SST_GLOBAL_MEAN, SST_GLOBAL_MEAN_SMTH)
#names(df_amo)
#df_amo$year
```
<!--_Save data for furture use_*-->
```{r echo=FALSE}
dir <- '../data/clean/'
write_csv(hurdat2_out, file.path(dir, "hurdat2_out.csv"))
```
```{r echo=FALSE}
#names(hurdat2_out)
```
<!--chapter:end:02-DataTransformation.Rmd-->
# Missing Values{#missingval}
Here, we use `visna::extract` to look at the missing value patterns. _The bars_ beneath the columns in Figure \@ref(fig:missingvalue-extracat-fig) show the proportions of missingness by variable, suggesting hurricane diameter _(`hu_diameter`)_ and storm diameter _(`ts_diameter`)_ both have the highest number of missing value and the columns suggest that they follow the same missing pattern, meaning when hurricane diameter is missing, storm diameter is also missing.
The third most missing variable is Pressure _(`min_pressure`)_ and _the columns_ show that Pressure is missing only when hurricane diameter and storm diameter are missing. The _bars_ on the right show the relative frequencies of the missing patterns, which suggest the most frequent missing patterns are in the combination of hurricane diameter, storm diameter and pressure, followed by the combination of hurricane diameter and storm diameter. Non-missing data are in the third meaning most of rows are completeness.
Finally, hurricane diameter and storm diameter are the most missing values because they are calculated from _Wind Raddii_ but _Wind Raddii_ values were not used before 2004.
```{r echo=FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
```
```{r echo=FALSE}
library(tidyverse)
library(stringr)
library(naniar)
library(mi)
#devtools::install_github("cran/extracat", force = TRUE)
#devtools::install_github("coatless/ucidata", force = TRUE)
library(extracat)
library(visdat)
library(ggthemes)
```
```{r echo=FALSE}
tcs <- c("tropical storm", "tropical hurricane","subtropical storm")
df <- df %>% dplyr::filter(status %in% tcs)
#names(df)
```
```{r missingvalue-extracat-fig, fig.cap='Missing Values Patterns', out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
# review missingness patterns
df_new_name <- df %>% dplyr::select(min_pressure, max_wind, longitude, latitude, ts_diameter, hu_diameter) %>%
dplyr::rename(Pressure = min_pressure, Wind = max_wind, Longitude = longitude, Latitude = latitude, TS_Diam= ts_diameter, HU_Diam = hu_diameter)
extracat::visna(df_new_name, sort = "b")
```
It is worth noting that almost all Pressure data are missing from 1850s to 1940s. The number of missing data is then decreasing from the 1940s, and there are no missing Pressure starting the 2000s (see Figure \@ref(fig:missingvalue-pressure-year-fig)). The reasons for this are are the following: (1) in the early years, information about pressure was recorded by ships; those were few in numbers and thus a lot of pressure data could not be recorded (2) in more recent years, it became a common habit to replace missing pressure values by an analytical product such as sattelite data; (3) improvements in modern tools and technologies have provided us with a more powerful observation network than that in the early years.
```{r missingvalue-pressure-year-fig, fig.cap="Proportion of Missing Air Pressure at the Storm\'s Center By Year", out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
df_completeish <-
df %>% group_by(year) %>% summarize(num_completeish = n(), num_na = sum(is.na(min_pressure))) %>% mutate(percent_na = num_na/num_completeish) %>% arrange(-percent_na)
ggplot(df_completeish, aes(x=year, y =percent_na))+
geom_line()+
xlab('Year')+
ylab("Proportion of Missing Air Pressure (in millibars)") + theme_grey(13)+
theme(plot.title = element_text(hjust = 0.5))
```
Another noticeable pattern in missing value relates to the name of tropical cyclones (see Figure \@ref(fig:atlatnic-storms-unnamed-year-fig)).
Before the 1950s, it was not common practice to name tropical cyclones. This explains why we cannot find any cyclone with a name before that time. A couple were still not named after that, and the reason can be linked to the category: those cyclones are of category 0 and 1 (more information about categories can be found in part 5), which means that they could have been overlooked and therefore didn't receive any name.
```{r atlatnic-storms-unnamed-year-fig, fig.cap='Cyclones received names starting 1950s', out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
df$category = factor(df$category, levels=c("0","1","2","3","4","5"))
df %>%
dplyr::filter(category==c("0","1","2","3","4","5")) %>%
dplyr::select(c("id","name","year","category")) %>% unique() %>%
mutate(name_id = if_else(name=="UNNAMED", "no", "yes")) %>%
#group_by(name_id, year, category) %>% summarise(name_id_num = n()) %>% ungroup() %>%
ggplot(aes(x=year, y=name_id, color=name_id))+
geom_point()+
facet_wrap(~category)+
#geom_smooth()+
labs(x = "Year", y = "",color = "Is the cyclone named?")+
theme_grey(13)+
theme(plot.title = element_text(hjust = 0.5))+
scale_color_colorblind()
```
<!--chapter:end:03-MissingValues.Rmd-->
# Results {#result}
```{r include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
```
## What influences the change in category of a hurricane?
Hurricanes are classified in different categories based on the [Saffir-Simpson](https://www.nhc.noaa.gov/aboutsshws.php) Hurricane Wind Scale, which classifies hurricanes with regards to their wind speeds and estimates potential property damages. The hurricanes categories come under five items:
(1) Category 1:
Very dangerous winds from 64 to 82 knot (kt) that will lead to some damage such as roof damage or snapped tree branches. A famous hurricane of category 1 is [Danny](https://en.wikipedia.org/wiki/Hurricane_Danny_(1985)), which hit Cuba and the Gulf of Mexico in 1985.
(2) Category 2:
Extremely dangerous winds from 83 to 95 kt that will produce extensive damage such as roof damage and siding, many toppled shallowly rooted trees and near-total power outage. A famous hurricane of category 2 is [Erin](https://en.wikipedia.org/wiki/Hurricane_Erin_(1995)) which hit Florida in 1995 and caused 5 people to die and cost $700 million (1995 USD).
(3) Category 3:
Extremely dangerous winds from 96 to 112 kt will cause devastating damage such as removal of roof decking, power outage and power outage for days to weeks after the storm passes. [Katrina](https://en.wikipedia.org/wiki/Hurricane_Katrina) is a famous hurricane of category 3 that lead to 1,836 deaths and $81 billion (2005 USD) in property damage in 2005. Major states impacted were Louisiana and Mississippi.
(4) Category 4:
Extremely dangerous winds from 113 to 136 kt that will cause catastrophic damage such as loss of most of the roof structure and exterior walls, fallen trees and power outage for weeks to months. A famous hurricane of category 4 is [Harvey](https://en.wikipedia.org/wiki/Hurricane_Harvey) which hit Texas and Louisiana in 2017 with damage estimated to $125 billion (2017 USD) and 107 deaths in the United States.
(5) Category 5:
Extremely dangerous winds of 137 kt or higher that will cause catastrophic damage such as total roof and wall collapse, fallen trees, power outage for weeks to months and uninhabitable regions for weeks or months. A famous hurricane of category 5 is [Andrew](https://en.wikipedia.org/wiki/Hurricane_Andrew), which hit Louisiana and Florida in 1992 and whose damage accounts for $41.5 billion (2011 USD) and 54 deaths (approximately 1 million people were evacuated before the storm).
Besides, it is also worth distinguishing hurricanes from tropical storms, whose maximum sustained surface winds range from 34 to 63 kt. They are labeled category 0 in our dataset.
The strength of a hurricane can also be assessed by its barometric pressure, that is the pressure at its center: the lower the barometric pressure, the stronger the hurricane.
Finally, the size of a hurricane is defined by its diameter.
Let's now examine those features on a parallel coordinate plot and see how they relate to each other.
```{r pcp-category-wind-pressure-fig, fig.cap="Windspeed and Pressure are indicators of cyclone category", out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
#PCP plot showing features of high category hurricane (high wind, low pressure)
data2<- df %>% dplyr::select(max_wind, min_pressure, hu_diameter,category)
colnames(data2) <- c("Windspeed", "Pressure", "Diameter", "category")
data2$category <- as.factor(data2$category)
ggparcoord(data2,columns=1:3,alphaLines=1,
groupColumn=4)+
geom_vline(xintercept=1:3,color="grey")+
labs(x = "Features", y = "Standardized scale", color = "Category")+
theme_grey(13)+
scale_color_viridis_d()+
scale_color_colorblind()
```
First of all, as showing in Figure \@ref(fig:pcp-category-wind-pressure-fig), categories are clearly distinct in terms of windspeed and increase proportionally to this feature. This is consistent with the definition of the [Saffir-Simpson Scale](https://www.nhc.noaa.gov/aboutsshws.php). The same conclusion can be drawn for the [barometric pressure measured at the center of the hurricane]( https://sciencing.com/barometric-pressure-hurricanes-22734.html): for low pressures, the hurricane will be overall of a higher category. The distinction is however not as clear as for windspeed and some overlap can be noticed.
Furthermore, no relationship can be established between the diameter of the area experiencing hurricane strength winds of 64 knots or above hurricane diameter _(`hu_diameter`)_, which would mean that the category, wind speed and min pressure will not determine the potential areas that will be affected by hurricane.
Note that all tropical storm have a diameter equal to 0: this is because the variable used for the graph is hurricane diameter, which accounts for the diameter of hurricanes. The diameter of the area experiencing tropical storm strength winds (34 knots or above) is reported in a separated variable called storm diameter _(`ts_diameter`)_. As this project focuses on hurricanes, and these two parameters are highly correlated to each other, the decision was made to not include it in the graph.
### A first look into the evolution of the category of hurricanes
Let's now check whether a hurricane sticks to one category in its lifetime or if it changes overtime. To do so, we select three hurricanes of the same year: Irma (Florida), Maria (North Carolina) and Gert (North of Atlantic Ocean), which all occurred in 2017.
```{r echo=FALSE}
data3 <- df %>% dplyr::filter(name =="Irma",year == 2017)
data4 <- df %>% dplyr::filter(name == "Maria",year == 2017)
data5 <- df %>% dplyr::filter(name == "Gert",year == 2017)
```
```{r each-track-develop-fig, fig.cap='Three Example Hurricanes in 2017', out.width='100%', fig.asp=.75,fig.align='center',echo=FALSE}
#Plots that shows the evolution of category of hurricane over time (x: longitude, y: latitude, each point represent the data and category)
data6 <- rbind(data3,data4,data5)
data6$datetime <- as.Date(data6$datetime)
data6$category <- (as.numeric(data6$category)-1)
ggplot(data6) +
geom_sf(data=spData::world, fill = "grey85", col = 1) +
coord_sf(xlim = c(-100, -20), ylim = c(0,60), expand = TRUE)+
geom_point(aes(y=latitude, x=longitude, color= hu_diameter, size=category))+
facet_wrap(~name)+
labs(x = "Longitude", y = "Latitude", color = "Diameter (miles)", size = "Category")+
theme_grey(13)+
theme(axis.text.x = element_text(size=8, angle=45))
```
From Figure \@ref(fig:each-track-develop-fig), it can be noted that both Irma and Maria grew in intensity to reach category 5, while Gert **only** reached category 2 as a maximum. A difference in lifetime can also be highlighted (almost 2 weeks vs a couple days). As both Irma and Maria hit Southern states and Gert occured at a Northern location, the following hypothesis is stated: is the maximum category reached by a hurricane dependent on its location? That is, *is there a difference between the maximum category reached by Southern and Northern hurricanes?*
```{r each-track-develop-time-fig,out.width='80%', fig.asp=.75, fig.cap='Time Series of Three Example Hurricanes in 2017', fig.align='center',echo = FALSE}
#Time series of 3 hurricanes (In 2017, Irma hits Florida (category 5), Maria hits North Carolina (category 5?), Gert hits East Coast (category 2))
#Irma <- df %>% dplyr::select(category, datetime, storm_id) %>% dplyr::filter(storm_id=="Irma-2017")
#Maria <- df %>% dplyr::select(category, datetime, storm_id) %>% dplyr::filter(storm_id=="Maria-2017")
#Gert <- df %>% dplyr::select(category, datetime, storm_id) %>% dplyr::filter(storm_id=="Gert-2017")
#MaIrGer <- rbind(Maria, Irma, Gert)
#MaIrGer$datetime <- as.Date(MaIrGer$datetime)
ggplot(data6, aes(datetime, category, color=name)) +
geom_line()+
labs(x = "Time", y = "Category", color = "Hurricanes") +
theme_grey(13)+
scale_color_viridis_d()+
scale_color_colorblind()
```
Before testing that hypothesis, it is worthy to note that the vertical jumping lines in the time series of hurricanes (Figure \@ref(fig:each-track-develop-time-fig)). These suggest that hurricanes are highly dynamic and can lose/gain energy in only a couple hours. This concept is illustrated by the following time series (Figure \@ref(fig:tc-dramatic-fig)), showing a close up of hurricane Irma evolution on September 9th 2017; in a couple hours, the hurricane when from category 5 to category 2.
```{r tc-dramatic-fig, fig.cap = 'Close Up of Time Series of Hurricane Irma in 2017',out.width='80%', fig.asp=.75,fig.align='center', echo=FALSE}
IrSep9 <- df %>% dplyr::select(category, datetime, storm_id, day, hour) %>% dplyr::filter(storm_id=="Irma-2017", day == 9)
scaleFUN <- function(x) sprintf("%.2f", x)
Ir_hour <- ggplot(IrSep9, aes(hour, category)) +
geom_line()+
labs(x = "Hours", y = "Category") +
theme_grey(13)
Ir_hour + scale_x_continuous(labels=scaleFUN)
```
### Is there a difference between the maximum category reached by Southern and Northern hurricanes?
Let's now verify whether the hypothesis holds. The Figure \@ref(fig:each-track-develop-fig) depicts the path of each hurricane across time and also colors the diameter the area experiencing hurricane strength winds of 64 knots or above and sized according their categories.
Each hurricane formed in the low latitude east with a low category, then intensified (large circle and bluer) as it extracted energy from the warm/moist air over the oceans. As a result, its category evolved with time until it reached a certain point then dissipated. The hurricanes then broke apart because they move through cold water (e.g, Maria) or move over land (e.g., Irma) which made them losing touch with the hot water that powered them.
When hurricanes start at different location point, they absorb different amount of energy throughout their progression towards the west or northwest. This will lead to different energy formation that will determine the highest wind level measured, and in turn decide what that category of that hurricane will be.
The comparison of the hurricane track and intensity indicates that the longer the hurricane stay over the southern warm water, the stronger it gets to be (see Irma), while the northern formed storms, like Gert, could not reach high category and they die out a few hours after they start because they do not have enough energy to become a hurricane.
This seems to confirm the hypothesis stated above: there seems to be an interesting relationship between location and category of a hurricane. In addition, hurricanes seem to reach higher categories when located in the South of the Atlantic Ocean than in the North.
```{r co2-year-fig, fig.cap ='Frequency of Hurricanes across the United States',out.width='80%', fig.asp=.75,fig.align='center', echo=FALSE}
ggplot(data=df)+
geom_sf(data=spData::world, fill = "grey85", col = 1) +
coord_sf(xlim = c(-100, -20), ylim = c(0,60))+
geom_hex(aes(x=longitude, y=latitude), binwidth = c(1,1))+
scale_fill_gradientn(colours=c("orange","black"),name = "Frequency",na.value=NA)+
facet_wrap(~category)+
theme_bw(12)+xlab("Longitude")+ylab("Latitude")+
theme_grey(13)+
theme(axis.text.x = element_text(size=8, angle=45))
```
```{r echo=FALSE}
df_maxcate_only <- df %>% dplyr::filter(max_status_label == "TRUE")
```
Let's now check whether the hypothesis can be extended in terms of frequency. The frequency of hurricane by location for each category. Figure \@ref(fig:co2-year-fig) shows that storms and hurricanes disappear when they reach the land or the northern water.
The low category storms (e.g., category 0 and 1) with the highest density along the coastline can reach further North, while the high category hurricanes (4 and 5) only appear in the South. The highest density along the coastline is related to the progress of the small tropical depressions. When these small depressions travel over the ocean to the US eastern coast, they gain energy and update to the category 0 storm. It is worth to note that as category increases, their north boundary seems to shift toward the south, suggesting the feature of storms and hurricanes are location-related.
We build a Shiny Web Application for the Interactive Component in order to assist with visualizing this result.
## Does hurricane activity shift north-ward?
#```{r echo=FALSE}
#df <- read.csv("../data/clean/hurricanes.csv")
#```
```{r echo=FALSE}
library(tidyverse)
library(dplyr)
library(tidyverse)
library(parcoords)
library(ggplot2)
library(GGally)
library(vcd)
library(sf)
library(maps)
#View(data)
```
Next, we want to assess whether hurricane activity is shifting North. A recent study showed that the locations of tropical cyclones is shifting poleward (i.e. north-ward) globally ([Thompson](https://www.climatecentral.org/news/warming-may-shift-hurricane-impacts-17437)). If true, this implies heavy consequences for the Northern part of the United States. Southern states such as Florida have already taken steps to improve its approach to hurricanes, such as property structure improvements, rescue boats and vehicles investments, and emergency workers trainings; if the frequency of tropical cyclones is indeed shifting North, Northern states should consider taking similar actions.
Before exploring the above hypothesis, it is worth redefining latitude and longitude:
**Latitude** is the measurement of a place on Earth with regards to its distance north or south-wise from the Earth's equator. Latitude is expressed in degrees followed by the letter N or S whether the location measured is North or South from the equator. Interesting examples for this project are Miami's latitude (~26°N) and New York City's latitude (~41°N).
**Longitude** is the measurement of a place on Earth with regards to its distance west or east-wise from the meridian at Greenwich, England. It is also measured in degrees and followed by the letter W or E depending on whether the location of interest is West or East from the meridian.
We will focus on latitude to verify whether hurricane activity seems to be changing directions.
```{r echo=FALSE}
df_maxcate_only <- df %>% dplyr::filter(max_status_label == "TRUE")
```
```{r hurricane-north-fig, fig.cap='Frequency of Hurricanes by Latitude over Years', out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
df_maxcate_only %>%
#dplyr::filter(year>1950) %>%
#dplyr::filter(year<2000) %>%
ggplot()+
geom_hex(aes(x=year, y=latitude))+#, binwidth = c(1,1))+
scale_fill_gradientn(colours=c("orange","black"),name = "Frequency",na.value=NA)+
#facet_wrap(~category)+
xlab("Year")+
ylab("Latitude")+
theme_bw(12)+
theme_grey(13)
```
Figure \@ref(fig:hurricane-north-fig) shows a heatmap of hurricane activity based on time and location.
While tropical cyclones frequency seems to be consistently higher at lower latitudes, one can observe an increase in cyclones activity at higher latitudes in the 1970s. However, this shift does not seem to be permanent as the frequency of tropical cyclones seems to decrease again in the following decades.
Besides, if we look at the tropical cyclones activity faceted on category (Figure \@ref(fig:hurricane-north-category-fig)), the shift in location seems to be focused on tropical storms (category 0) and hurricanes of category 1. Hurricanes of category 4 are also more frequent on higher latitudes in the 1970s but the frequency remains at a low level and the activity returns to lower latitudes in the following decades. These observed trends would therefore not justify investments in hurricane-proofed facilities for Northern States.
```{r hurricane-north-category-fig, fig.cap='Frequency of each Hurricane Category over Years', out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
df_maxcate_only %>%
#dplyr::filter(year>1950) %>%
#dplyr::filter(year<2000) %>%
ggplot()+
geom_hex(aes(x=year, y=latitude))+#, binwidth = c(1,1))+
scale_fill_gradientn(colours=c("orange","black"),name = "Frequency",na.value=NA)+
facet_wrap(~category)+
theme_bw(12)+
xlab("Year")+
ylab("Latitude")+
theme_bw(12)+
theme_grey(13)
```
## Have humans already caused a detectable long-term increase in Atlantic hurricane activity?
As we can see from Figure \@ref(fig:ts-co2-year-fig), the global CO2 concentration has increased, and the tropical Atlantic sea surface temperature show significant warming trends (Figure \@ref(fig:ts-sst-year-fig)).
```{r echo=FALSE}
#HM filter
library(signal)
n <- 10
hm <- fir1(n, 0.1, type = c("low"), window = hamming(n + 1), scale = TRUE)
```
<!--CO2-->
```{r ts-co2-year-fig, fig.cap='Time Series of Atmospheric CO2 (ppm)', out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
df_aco2 %>% ggplot(aes(x=time, y=co2))+
geom_line()+
geom_smooth()+
xlab("Year")+ylab("CO2 concentration (ppm)")+theme_grey(16)#ggtitle("Time Series of Atmospheric CO2 (ppm)")+
#+theme(plot.title = element_text(hjust = 0.5))
```
<!--SST-->
```{r ts-sst-year-fig, fig.cap='Time Series of Ocean Surface Temperature', out.width='80%', fig.asp=.75, fig.align='center', echo=FALSE}
df_amo$SST_GLOBAL_MEAN_lp <- signal::filtfilt(hm, df_amo$SST_GLOBAL_MEAN)
df_amo %>%
#dplyr::filter(year >1940) %>%
ggplot(aes(x=year, y=SST_GLOBAL_MEAN))+
geom_point()+
#geom_line(aes(x=year, y=SST_GLOBAL_MEAN_lp))
geom_smooth(method = "lm")+xlab("Year")+ylab("Surface Ocean Temperature (in Celsius)")+
#ggtitle("Time Series of Ocean Surface Temperature")+
theme(plot.title = element_text(hjust = 0.5))+
theme_gray(16)
```