Loading Libraries and Getting Data
library(httr)
library(jsonlite)
library(purrr)
##
## Attaching package: 'purrr'
## The following object is masked from 'package:jsonlite':
##
## flatten
library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0 v dplyr 0.8.5
## v tibble 3.0.0 v stringr 1.4.0
## v tidyr 1.0.2 v forcats 0.5.0
## v readr 1.3.1
## -- Conflicts ------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x purrr::flatten() masks jsonlite::flatten()
## x dplyr::lag() masks stats::lag()
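# Request the list of urban areas and parse the JSON body
res = GET("https://api.teleport.org/api/urban_areas/")
data = fromJSON(rawToChar(res$content))
# Keep href and name as character columns regardless of the R version's default
urban_areas = data.frame(data$`_links`$`ua:item`, stringsAsFactors = FALSE)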
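The backtick-quoted names above are needed because `_links` and `ua:item` are not syntactic R names. As a minimal offline sketch, the payload below is made up to mimic the shape the code assumes (it is not real API output, and `payload` and `toy_areas` are hypothetical names); it shows how `fromJSON()` turns an array of objects into a data frame:

```r
library(jsonlite)

# Made-up payload that mimics the assumed shape of the urban areas response
payload = '{"_links": {"ua:item": [
  {"href": "https://example.org/ua/a/", "name": "A"},
  {"href": "https://example.org/ua/b/", "name": "B"}
]}}'

parsed = fromJSON(payload)

# fromJSON() simplifies an array of objects into a data frame,
# so the nested element can be pulled out with backtick-quoted names
toy_areas = data.frame(parsed$`_links`$`ua:item`, stringsAsFactors = FALSE)
print(toy_areas)
```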
Cleaning Data
# Function to get the individual data set from each city's URL
get_all_scores = function(url, name){
  # Fetch the urban area's entry, then follow its "ua:scores" link
  initial = fromJSON(rawToChar(GET(as.character(url))$content))
  city_api = fromJSON(rawToChar(GET(as.character(initial$`_links`$`ua:scores`))$content))
  city_data = data.frame(city_api$categories)
  # Data frame that calculates the Quality of Life and includes the Cost of Living and Internet Access
  # 1 = Housing, 2 = Cost of Living, 8 = Safety, 10 = Education, 11 = Environmental Quality,
  # 14 = Internet Access, 15 = Leisure & Culture
  # mean() averages only its first argument, so the five scores are selected as one vector
  data.frame("Name" = name,
             "Quality of Life" = mean(city_data$score_out_of_10[c(1, 8, 10, 11, 15)]),
             "Cost of Living" = city_data$score_out_of_10[2],
             "Internet Access" = city_data$score_out_of_10[14],
             stringsAsFactors = FALSE)
}
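One detail worth illustrating when averaging several category scores in R: `mean()` treats only its first argument as data (the next positional arguments are read as `trim` and `na.rm`), so separate values have to be combined into one vector with `c()`. A minimal sketch with made-up numbers, not Teleport scores:

```r
# Three made-up scores passed as separate arguments:
# only the first one is treated as data
separate_args = mean(3, 6, 9)

# The same scores combined into a single vector are averaged as expected
one_vector = mean(c(3, 6, 9))

print(separate_args)  # 3
print(one_vector)     # 6
```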
# Uses the function to get all the scores
all_scores = bind_rows(list(url = urban_areas$href, name = urban_areas$name) %>%
                         pmap(.f = get_all_scores))
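The call above relies on `pmap()` matching the names of the list elements (`url`, `name`) to the function's arguments and calling the function once per position. A small offline sketch of the same pattern, assuming a hypothetical helper `make_row` and made-up URLs in place of real requests:

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in for get_all_scores(): builds one row per city
# without touching the network
make_row = function(url, name){
  data.frame(Name = name, UrlLength = nchar(url), stringsAsFactors = FALSE)
}

# Made-up inputs; the list names must match the function's argument names
toy_inputs = list(url = c("https://example.org/a", "https://example.org/bb"),
                  name = c("A", "B"))

# pmap() returns a list of one-row data frames; bind_rows() stacks them
toy_rows = bind_rows(pmap(toy_inputs, make_row))
print(toy_rows)
```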
Creating Graph
all_scores %>%
  ggplot(aes(x = Quality.of.Life, y = Cost.of.Living, color = Internet.Access)) +
  geom_point(size = 2) +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(size = 0.5, linetype = "solid",
                                 colour = "black")) +
  labs(x = "Positivities in Life Score", y = "Cost of Living Score",
       title = "Living in Urban Areas")
The main purpose of this plot is to show that there is a positive correlation between the positivities in life and the cost of living in urban areas. The positivities score is the mean of the Housing, Safety, Education, Environmental Quality, and Leisure & Culture scores. Although the positivities may be biased because I chose which scores count as “positivities,” I believe the majority of people would agree with me. A side factor I wanted to show is that areas with higher positivity and cost of living scores tend to have lower internet access scores. (Personal opinion: therefore, do not get rich and move to a cooler house, because you will sadly have less internet! :C )
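A correlation like the one described above can also be checked numerically rather than only by eye. The sketch below uses a made-up data frame, `toy_scores`, with the same column names as `all_scores`; the values are invented for illustration and say nothing about the real cities. Running the same two `cor()` calls on `all_scores` would give the actual coefficients.

```r
# Made-up scores with the same column names as all_scores (illustration only)
toy_scores = data.frame(
  Quality.of.Life = c(3.1, 4.5, 5.2, 6.8, 7.4),
  Cost.of.Living  = c(2.0, 3.9, 4.1, 6.5, 8.0)
)

# Pearson correlation: measures the strength of the linear relationship
pearson_r = cor(toy_scores$Quality.of.Life, toy_scores$Cost.of.Living)

# Spearman correlation: rank-based, so less sensitive to outliers
spearman_r = cor(toy_scores$Quality.of.Life, toy_scores$Cost.of.Living,
                 method = "spearman")

print(pearson_r)
print(spearman_r)
```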