Go to the ae-06-[GITHUB USERNAME] repo, clone it, and start a new project in RStudio. See Lab 01 for more detailed instructions on cloning a repo and starting a new project.
Run the following code to configure Git. Fill in your GitHub username and the email address associated with your GitHub account.
library(usethis)
use_git_config(user.name = "your github username", user.email = "your email")
library(tidyverse)
library(scales)
fisheries <- read_csv("data/fisheries.csv")
continents <- read_csv("data/continents.csv")
The code below fills in the gaps from joining the data sets to creating the updated visualizations.
fisheries <- fisheries %>%
  filter(total > 100000) %>%
  left_join(continents) %>%
  mutate(
    continent = case_when(
      country == "Democratic Republic of the Congo" ~ "Africa",
      country == "Hong Kong" ~ "Asia",
      country == "Myanmar" ~ "Asia",
      TRUE ~ continent
    ),
    aquaculture_perc = aquaculture / total
  )
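To see why the case_when() step follows the join, here is a minimal sketch on made-up data. The data frames catch and lookup, and the values in them, are hypothetical and are not the course files. left_join() keeps every row of the left table and leaves NA where the lookup has no match, so the missing continent is then filled in by hand.

```r
library(dplyr)

# Hypothetical toy data for illustration only (not fisheries.csv / continents.csv)
catch <- tibble(
  country = c("Country A", "Hong Kong"),
  total   = c(200000, 150000)
)
lookup <- tibble(
  country   = "Country A",
  continent = "Europe"
)

patched <- catch %>%
  left_join(lookup, by = "country") %>%  # "Hong Kong" has no match, so continent is NA
  mutate(
    continent = case_when(
      country == "Hong Kong" ~ "Asia",   # fill the known gap manually
      TRUE ~ continent                   # otherwise keep the joined value
    )
  )

patched
```

Spelling out by = "country" is optional here; without it, left_join() joins on the columns the two tables share and prints a message naming them.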
Note: In each of these exercises you will need to set eval=TRUE in the code chunk header when you’re ready to run the code for that exercise.
Calculate the mean aquaculture percentage (we’ll call it mean_ap for short) for each continent in the fisheries data using the summarise() function in dplyr. Note that the function for calculating the mean in R is mean().
fisheries %>%           # start with the fisheries data frame
  ___ %>%               # group by continent
  ___(mean_ap = ___)    # calculate mean aquaculture
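As a reminder of how grouped summaries behave, here is a small sketch on an unrelated, made-up data frame. The names scores, team, and score are hypothetical. Grouping first means summarise() returns one row per group rather than a single overall value.

```r
library(dplyr)

# Made-up example data; not related to the fisheries data
scores <- tibble(
  team  = c("a", "a", "b", "b"),
  score = c(1, 3, 10, 20)
)

score_summary <- scores %>%
  group_by(team) %>%                   # one group per team
  summarise(mean_score = mean(score))  # collapses each group to one row

score_summary
```

Additional summaries can be added as further name = expression pairs inside the same summarise() call.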
Now expand your calculations to also calculate the minimum and maximum aquaculture percentage for each continent in the fisheries data. Note that the functions for calculating the minimum and maximum in R are min() and max(), respectively.
fisheries %>%   # start with the fisheries data frame
  # and the rest of the code goes here
Create a new data frame called fisheries_summary that calculates minimum, mean, and maximum aquaculture percentage for each continent in the fisheries data.
fisheries_summary <- fisheries %>%
  # you can reuse code from Exercise 2 here
Take the fisheries_summary data frame and order the results in descending order of mean aquaculture percentage.
fisheries_summary %>%   # start with the fisheries_summary data frame
  ___                   # order in descending order of mean_ap
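For reference, sorting in dplyr is ascending by default, and wrapping a column in desc() reverses that. The sketch below uses a made-up data frame (team_summary and its columns are hypothetical) rather than fisheries_summary.

```r
library(dplyr)

# Made-up summary table for illustration
team_summary <- tibble(
  team       = c("a", "b", "c"),
  mean_score = c(2, 15, 7)
)

ranked <- team_summary %>%
  arrange(desc(mean_score))  # largest mean_score first

ranked
```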
The code below creates the graph you originally saw in the lecture slides. Change the theme to change the look of the graph. Choose one of the complete themes listed on the ggplot2 reference page.
ggplot(fisheries_summary,
       aes(y = fct_reorder(continent, mean_ap), x = mean_ap)) +
  geom_col() +
  scale_x_continuous(labels = label_percent(accuracy = 1)) +
  labs(
    x = "",
    y = "",
    title = "Average share of aquaculture by continent",
    subtitle = "out of total fisheries harvest, 2016",
    caption = "Source: bit.ly/2VrawTt"
  ) +
  theme_minimal() # change the theme!
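The axis labels in the plot above come from label_percent() in the scales package, which returns a formatting function rather than formatted text. The short sketch below calls that function directly on a couple of made-up proportions to show what accuracy = 1 does; the name fmt is just for illustration.

```r
library(scales)

# label_percent() builds a formatter; accuracy = 1 rounds to whole percentages
fmt <- label_percent(accuracy = 1)

# Made-up proportions, similar in kind to aquaculture_perc
labels <- fmt(c(0.123, 0.5))
labels
```

Because the formatter works on proportions, the values passed to the x aesthetic should stay between 0 and 1 rather than being multiplied by 100 first.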
This exercise was modified from “Fisheries” in Data Science in a Box.