## Descriptive Statistics in R

These notes cover using R to compute measures of central tendency (mean, median, and mode) with functions such as `mean()` and `median()`, as well as measures of spread: range, interquartile range (IQR), and standard deviation. The notes finish with some useful visualizations for this work, using both base R graphics functions and ggplot graphs. …
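As a minimal sketch of the measures listed above, the block below runs them on a small made-up vector. The data and the helper name `stat_mode` are assumptions for illustration; base R's `mode()` reports an object's storage mode rather than the most frequent value, so a helper is used in its place.

```r
# Hypothetical sample data, for illustration only
x <- c(2, 4, 4, 5, 7, 9, 11)

# Central tendency
mean(x)    # arithmetic mean
median(x)  # middle value of the sorted data

# Base R's mode() returns the storage mode (e.g. "numeric"), not the
# statistical mode, so this hypothetical helper finds the most frequent value(s).
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
stat_mode(x)

# Spread
range(x)  # smallest and largest value
IQR(x)    # interquartile range (Q3 - Q1)
sd(x)     # sample standard deviation
```

If several values tie for most frequent, `stat_mode()` returns all of them.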

## Tidyverse (dplyr) Notes

Tidyverse is a collection of data science tools in R. It is not part of the standard library, so it needs to be installed and loaded: `install.packages("tidyverse")`. The same is true of dplyr ("deep ply R"), which is also installed and attached as part of the tidyverse. To use these libraries: `library(tidyverse)` and `library(dplyr)`. For …
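Once the package is loaded, a short pipeline shows what dplyr is typically used for. This is only a sketch: it uses R's built-in `mtcars` dataset, which is an assumption and not necessarily the data these notes work with, and it is guarded so that it does nothing if dplyr is not installed.

```r
# Run only if dplyr is available; install it first with
# install.packages("tidyverse") or install.packages("dplyr").
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  # Keep the four-cylinder cars, then compute their average fuel economy
  result <- mtcars %>%
    filter(cyl == 4) %>%
    summarise(mean_mpg = mean(mpg))

  print(result)
}
```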

## Using Quantiles to Find Values

Let’s say we have a dataset of people’s heights in inches. Someone asks us, “What is the cutoff height for the tallest 2.2% of men?” In other words, given a population that ranges from 59.85″ to 78.53″, where does the top 2.2% begin? By running the quantile method …
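One way to answer that kind of question is to ask `quantile()` for the 97.8th percentile, since the top 2.2% sits above it. The sketch below uses simulated heights (an assumption; the mean, standard deviation, and sample size are made up and are not the dataset described here).

```r
# Simulated heights in inches, for illustration only
set.seed(42)
heights <- rnorm(1000, mean = 69, sd = 3)

# The tallest 2.2% lie above the 97.8th percentile
cutoff <- unname(quantile(heights, probs = 1 - 0.022))
cutoff

# Sanity check: the share of observations above the cutoff should be close to 0.022
mean(heights > cutoff)
```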

## Testing for Normal Distribution

A lot of online statistics courses make use of generated data that follows a normal distribution. This is great for a basic understanding of the normal distribution, but it doesn’t help with real-world data. Even if the histogram for some data appears bell-shaped, that alone is not evidence that the data is normally distributed. …
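The excerpt does not say which test the full post uses. As one common option, the sketch below applies the Shapiro-Wilk test from base R (`shapiro.test()`) to two simulated samples; the data and variable names are assumptions for illustration. A small p-value is evidence against normality, while a large one only means the test found no evidence against it.

```r
# Simulated data, for illustration only
set.seed(1)
normal_sample <- rnorm(200)  # drawn from a normal distribution
skewed_sample <- rexp(200)   # drawn from a right-skewed exponential distribution

# Shapiro-Wilk test: the null hypothesis is that the data is normally distributed
res_normal <- shapiro.test(normal_sample)
res_skewed <- shapiro.test(skewed_sample)

res_normal$p.value
res_skewed$p.value

# A Q-Q plot is a useful visual companion to the test:
# points far from the reference line suggest non-normality
qqnorm(skewed_sample)
qqline(skewed_sample)
```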

## R Programming: Functions

RStudio really makes the language shine, giving the user GUI access to tables and variables. The language itself is a bit odd by modern standards, and while I enjoy using R, programming in R can seem a bit daunting at first. While running some scatter plot comparisons (pairs) within R on a dataset, I …

## Boxplots

Boxplots are very useful graphs. They show not only the range of the data for a variable, but also its median and interquartile range. Using a small dataset of mall customer information, I created a boxplot. On the Y axis we have the variable for Age and along the …
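As a rough sketch of what a boxplot encodes, the block below draws one with base R's `boxplot()` and inspects the numbers behind it. The ages are made up for illustration and are not the mall customer dataset mentioned here.

```r
# Hypothetical customer ages, for illustration only
ages <- c(19, 21, 23, 31, 35, 40, 45, 52, 67)

# Draw the boxplot and keep the statistics it is built from
bp <- boxplot(ages, ylab = "Age", main = "Customer age (hypothetical data)")

# Rows: lower whisker, lower hinge, median, upper hinge, upper whisker
bp$stats

# The thick line inside the box is the median, not the mean
median(ages)
mean(ages)
```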

## R: Multi-Scatter Plots

When analyzing results, it can help to compare graphs side by side. R has a few options to achieve this within base R. One is graphical parameters: using the par method, we can create a grid of graphs. The par command sets the environment to render the next graphs in a grid. In this …
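A minimal sketch of the `par()` approach, assuming a one-row, two-column layout and R's built-in `mtcars` dataset (both are illustrative choices, not taken from the original post):

```r
# Ask for a 1 x 2 grid; par() returns the previous settings so they can be restored
old_par <- par(mfrow = c(1, 2))

# The next two plots fill the grid from left to right
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "MPG vs. weight")
plot(mtcars$hp, mtcars$mpg,
     xlab = "Horsepower", ylab = "Miles per gallon",
     main = "MPG vs. horsepower")

# Restore the previous layout so later plots are not forced into the grid
par(old_par)
```

Saving and restoring the old settings avoids surprising layouts in plots drawn later in the same session.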