R Programming: Functions

RStudio really makes the language shine – giving a user GUI access to tables and variables. The language itself is a bit odd by modern standards… and while I enjoy using R, programming in R can seem a bit daunting at first.

While running some scatter plot comparisons (pairs) within R on a dataset, I decided to write a function so I could pass in what to plot as an argument.

Function #1: Data by Country

In Python and many modern languages, writing methods or functions is pretty straightforward. In R, you create a function object and assign it to a variable name:

# Load the dataset from a local CSV file
master <- read.csv('/Users/bwarner/Downloads/master.csv')

total_suicides_by_country <- function(country_name){
  # Keep only the rows for the requested country
  subset_country <- master[master$country == country_name,]
  # Scatter plot matrix of the four variables, with a title built from the argument
  pairs(~ subset_country$suicides_no + subset_country$year +
          subset_country$population + subset_country$gdp_per_capita....,
        main = paste("Suicide Data for", country_name),
        labels = c("Suicides","Year","Population","GDP per Capita"))
}

In the code above, I have a CSV file on my local system. This is a dataset of suicide statistics across the world.

Rather than repeatedly pass the same commands into the console for any country I’m curious about, now I just do this:

total_suicides_by_country("Armenia")
total_suicides_by_country("Japan")
total_suicides_by_country("United States")

Subsetting the Data

The first line of the function makes a subset of the dataset by country name.
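The row-filtering form used in that line can be tried on a small data frame. This is a minimal sketch: the `toy` data frame and its values are invented for illustration and are not taken from the real dataset.

```r
# Toy data frame; the values are invented for illustration only.
toy <- data.frame(
  country     = c("Armenia", "Japan", "Japan", "Armenia"),
  year        = c(2000, 2000, 2001, 2001),
  suicides_no = c(1, 2, 3, 4)
)

# The comparison yields one TRUE/FALSE value per row.
keep <- toy$country == "Japan"

# Rows go before the comma; leaving the column slot empty keeps every column.
japan <- toy[keep, ]
print(japan)
```

The trailing comma matters: without it, R would treat the logical vector as a column selector instead of a row selector.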

Pairs

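Using the pairs() function from R's built-in graphics package, I plot a scatter plot matrix for the country in question. Note the “$gdp_per_capita....” syntax. That column's header in the CSV contains special characters, and read.csv() (with its default check.names = TRUE) replaces each character that is not valid in an R name with a dot. The trailing dots are therefore part of the actual column name, not a wildcard like a splat or a %LIKE% in SQL.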
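Where the dots in that column name come from can be checked directly. The sketch below assumes a hypothetical raw header of "gdp_per_capita ($)" (the real file's header text is not shown in this post) and uses a tiny inline CSV with invented values in place of the real file.

```r
# Hypothetical raw header; the real file's header text is an assumption here.
raw_header <- "gdp_per_capita ($)"

# read.csv() runs names through make.names() when check.names = TRUE (the default).
# Each character that is invalid in an R name becomes a dot.
clean <- make.names(raw_header)
print(clean)

# Tiny inline CSV standing in for the real file (values invented).
csv_text <- "country,gdp_per_capita ($)\nJapan,100\nArmenia,50"

fixed <- read.csv(text = csv_text)                       # names are sanitized
asis  <- read.csv(text = csv_text, check.names = FALSE)  # names kept verbatim

print(names(fixed))
print(names(asis))

# A column whose name has special characters is reached with [[ ]] or backticks.
print(asis[["gdp_per_capita ($)"]])
```

Keeping the sanitized names is usually the more convenient choice, since the `$` accessor then works without backticks.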

Dynamic Title (Country Name)

Using main=paste(), I was able to take the country name passed into the function and append it to the title of the plot.
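As a quick illustration of how the title string is built, here is paste() on its own. The variable names `plot_title` and `year_title` are only for this sketch.

```r
country_name <- "Japan"

# paste() joins its arguments with a single space by default.
plot_title <- paste("Suicide Data for", country_name)
print(plot_title)

# Numbers are converted to text automatically.
year_title <- paste("Global Suicides in", 2010)
print(year_title)
```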

Output of pairs()

Function #2: Data by Country & Year

suicides_by_year_country <- function(country_name, country_year){
  # First keep only the rows for the requested year
  by_year <- master[master$year == country_year,]
  # Then narrow those rows down to the requested country
  by_country_by_year <- by_year[by_year$country == country_name,]
  # Three variables are plotted, so three labels are supplied (year is fixed here)
  pairs(~ by_country_by_year$suicides_no + by_country_by_year$population +
          by_country_by_year$gdp_per_capita....,
        main = paste("Suicide Data for", country_name),
        labels = c("Suicides","Population","GDP per Capita"))
}

There might be a more elegant way… I’m not sure. This works, but it is a tad confusing. I’m subsetting the dataset twice. The first time I’m subsetting by year, then I’m subsetting that data frame by country name (as before).
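One possibly tidier option is to combine both conditions with `&` so the data frame is filtered in a single step. This is a sketch on invented toy data, and `rows_for` is a hypothetical helper name, not something from the original code.

```r
# Toy data frame; the values are invented for illustration only.
toy <- data.frame(
  country     = c("Armenia", "Japan", "Japan", "Armenia"),
  year        = c(2000, 2000, 2001, 2001),
  suicides_no = c(1, 2, 3, 4)
)

# Hypothetical helper: rows for one country in one year, filtered in one step.
rows_for <- function(data, country_name, country_year) {
  data[data$country == country_name & data$year == country_year, ]
}

# Two-step version, mirroring the function above.
two_step <- toy[toy$year == 2001, ]
two_step <- two_step[two_step$country == "Japan", ]

# One-step version.
one_step <- rows_for(toy, "Japan", 2001)
print(one_step)
```

Both approaches select the same rows; the single-step form just avoids the intermediate data frame.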

In this way I can visually home in on a country’s data for a specific year.

Function #3: Boxplotting Comparisons of all Countries

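comparison_by_year <- function(country_year){
  # Keep only the rows for the requested year
  by_year <- master[master$year == country_year,]
  # One box per country; the y-axis label is set with ylab rather than
  # passing an mtext() call as an argument to boxplot()
  boxplot(by_year$suicides_no ~ by_year$country,
          col = "red",
          main = paste("Global Suicides in", country_year),
          ylab = "Suicide Counts")
}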

This third function I wrote for the dataset takes a single parameter, the year. After subsetting the dataset by year, it plots that year's suicide counts grouped by country. The output is a box plot.
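To see what a box plot grouped by country actually summarizes, boxplot() can be asked for its statistics without drawing anything. This is a sketch on invented toy data; `box_stats` is just a name chosen for this example.

```r
# Toy data for a single year; the values are invented for illustration only.
toy <- data.frame(
  country     = c("Armenia", "Armenia", "Armenia", "Japan", "Japan", "Japan"),
  suicides_no = c(1, 2, 3, 10, 20, 30)
)

# plot = FALSE returns the per-group statistics instead of drawing the plot.
box_stats <- boxplot(suicides_no ~ country, data = toy, plot = FALSE)

# One group per country, in alphabetical order.
print(box_stats$names)

# Rows: lower whisker, lower hinge, median, upper hinge, upper whisker.
print(box_stats$stats)
```

Passing `data = toy` lets the formula use bare column names, which also gives cleaner default axis labels than the `toy$column` form.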

Using another dynamic title, I’m passing the year into the title of the plot.
