## Using Quantiles to Find Values

Let’s say we have a dataset of people’s heights in inches. Someone asks us, “What’s the cutoff for the tallest 2.2% of men?” In other words, given a population that ranges from 59.85″ to 78.53″, where does the top 2.2% begin? By running the quantile method …
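As a rough illustration of the question above, here is a minimal Python sketch. The heights are randomly generated stand-ins, not the post's dataset, and the `quantile` helper is a hypothetical hand-rolled function that uses linear interpolation between sorted values; the post itself may rely on a library method instead.

```python
import random


def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)          # fractional index into the sorted data
    lo = int(pos)                    # index just below the position
    hi = min(lo + 1, len(xs) - 1)    # index just above, clamped to the end
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


# Assumption: synthetic heights in inches, invented for this example.
rng = random.Random(42)
heights = [rng.gauss(69, 3) for _ in range(1000)]

# The tallest 2.2% start at the 97.8th percentile.
cutoff = quantile(heights, 1 - 0.022)
share_above = sum(h > cutoff for h in heights) / len(heights)

print(f"cutoff: {cutoff:.2f} inches")
print(f"share of heights above the cutoff: {share_above:.1%}")
```

The key step is turning "top 2.2%" into the quantile `1 - 0.022 = 0.978`; everything at or below that value makes up the other 97.8% of the population.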

## Testing for Normal Distribution

A lot of online statistics courses use generated data that follows a normal distribution. This is great for a basic understanding of the normal distribution, but it doesn’t help with real-world data. Even if the histogram for some data looks normally distributed, the histogram alone isn’t evidence that the data is normally distributed. …

## R Programming: Functions

RStudio really makes the language shine – giving a user GUI access to tables and variables. The language itself is a bit odd by modern standards… and while I enjoy using R, programming in R can seem a bit daunting at first. While running some scatter plot comparisons (pairs) within R on a dataset, I …

## Boxplots

Boxplots are very useful graphs. They show not only the range of the data for a variable, but also its median and interquartile range. Using a small dataset of mall customer information, I created the above boxplot. On the Y axis we have the variable for Age and along the …
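To make the numbers behind a boxplot concrete, the following Python sketch computes the quartiles, the interquartile range, and the usual 1.5 × IQR outlier fences. The ages are invented for illustration and are not the mall customer data; plotting libraries may also use a slightly different quartile method than the `"inclusive"` one assumed here.

```python
from statistics import quantiles

# Hypothetical ages, invented for illustration (not the mall customer data).
ages = [19, 21, 23, 30, 31, 35, 40, 49, 67]

# With n=4, quantiles() returns the three quartile cut points: Q1, median, Q3.
q1, median, q3 = quantiles(ages, n=4, method="inclusive")
iqr = q3 - q1  # height of the box

# A common convention (Tukey's rule) flags points beyond 1.5 * IQR from the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [a for a in ages if a < lower_fence or a > upper_fence]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"outliers: {outliers}")
```

The box spans Q1 to Q3, the line inside it marks the median, and any value outside the fences is typically drawn as an individual point.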

## R: Multi-Scatter Plots

When analyzing results, it can help to compare graphs side by side. R has a few options to achieve this within the base R package. Parameters Using parameters (the par function) we can create a grid of graphs like so: The command above sets the environment to render the next graphs in a grid. In this …

## Python Vs. R for Data Science

So many opinions are out there about which is better for Data Science. Python has been a strong contributor to Data Science, especially with regard to Machine Learning and AI. However, when looking up the differences between the two languages (R and Python), there’s obvious bias. Oftentimes people prefer Python and tend to …

## System Stats Dashboard: Tableau

Utilizing some manual intervention, I constructed the dashboard below. The dashboard is showing an outage in my local testing environment. Here we can see that at a specific time (around 5:30PM GMT) my local system had an outage. Trying to correlate data to the problem, I pulled data from Incoming and Outgoing streams, as well …

## Jupyter Mathematical Markdown

When you install Jupyter you may or may not have the LaTeX library installed with it. If LaTeX is installed, you will be able to generate mathematical expressions. Below are a few examples: Installing LaTeX If LaTeX isn’t installed (or you’re not sure) and you have Anaconda installed, you simply run this command from the …
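For illustration only (this is not one of the post's own examples), a Markdown cell containing something like the following would typically render as the formulas for a sample mean and sample standard deviation. The symbols $x_i$, $n$, $\bar{x}$, and $s$ are chosen here as an assumption for the example.

```latex
$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\qquad
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}
$$
```

Wrapping an expression in single dollar signs, such as `$\bar{x}$`, keeps it inline with the surrounding text, while double dollar signs put it on its own centered line.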

## Jupyter Notebook

I love Notebooks. The first time I came across one was in Mathematica, a few years ago. Mathematica is special in that you can easily write (using keyboard shortcuts) Mathematica notation (rather than code that runs the mathematical functions). Shortly after getting into Mathematica, I discovered data science and from there I …