Posts

Showing posts from February, 2024

Module 7: Mtcars Visualizations

Hi everyone! This week we learned how to use R to create new visualizations such as box plots and scatterplots. We explored the three dimensions of a distribution: shape, location, and spread. I have some experience with ggplot2, so I used this package for my visualizations. I chose to explore the spread of a car's fuel efficiency (miles per gallon) for manual transmission cars versus automatic cars. Because mtcars codes transmission as 0 = automatic and 1 = manual, the box plots show that manual cars have both a wider range of fuel efficiency and a much higher median mpg than automatic cars. I decided to explore location through a scatterplot of fuel efficiency versus horsepower for cars with different numbers of cylinders. It is clear from the scatterplot that cars with the fewest cylinders have the lowest horsepower but the highest fuel efficiency. There seems to be a larger range of efficiency for 4 cylinders unlike 8 cylinders that have a wider variety of horsepower with s...
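
For anyone who wants to try something similar, here is a minimal sketch of how the two plots described above could be built with ggplot2. It is not necessarily the exact code behind the images in this post. The helper column names `transmission` and `cylinders` are my own, and the factor labels rely on the built-in `mtcars` coding of `am` (0 = automatic, 1 = manual).

```r
library(ggplot2)

# Copy the built-in data and label the coded columns.
# In mtcars, am is coded 0 = automatic and 1 = manual.
cars <- mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("Automatic", "Manual"))
cars$cylinders <- factor(cars$cyl)

# Spread: box plot of mpg for each transmission type.
spread_plot <- ggplot(cars, aes(x = transmission, y = mpg)) +
  geom_boxplot() +
  labs(x = "Transmission", y = "Miles per gallon")

# Location: mpg versus horsepower, colored by cylinder count.
location_plot <- ggplot(cars, aes(x = hp, y = mpg, color = cylinders)) +
  geom_point() +
  labs(x = "Horsepower", y = "Miles per gallon", color = "Cylinders")

# Numeric check of the medians that the box plots display.
mpg_summary <- aggregate(mpg ~ transmission, data = cars, FUN = median)
```

Printing `spread_plot` or `location_plot` draws the chart, and `mpg_summary` gives the median mpg per transmission type as a quick check on how the boxes are labeled.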

Module 6: Basic Visualizations in R

Hi everyone! This week we learned how to create basic visualizations in R and how to decide which visualization fits one type of data versus another. Yau explains that the most important aspect of a visualization is its ability to help a viewer spot differences in the data. I chose to explore the built-in R dataset called "USArrests", which records arrests for murder, rape, and assault per 100,000 residents in each of the 50 US states in 1973. I have never created a pie chart in R before, so I used this dataset to create a basic pie chart of the country's total arrests by crime type. I chose to show these as percentages of the whole to help the viewer see that assaults make up a significantly higher percentage of crimes in the US. Within these assaults, I thought it would be interesting to look at the distribution of the number of assaults by state to see which states have the highest and lowest assau...
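
A rough sketch of the pie chart and the per-state view in base R might look like the following. One assumption to flag: `USArrests` stores rates per 100,000 residents, so summing the state rates only approximates a national total (it ignores state population), and the variable names here are my own.

```r
# Sum each crime's per-100,000 rate across the 50 states.
# Assumption: adding state rates stands in for a national total.
crime_totals <- colSums(USArrests[, c("Murder", "Assault", "Rape")])

# Convert to each crime's percentage share of the whole.
crime_pct <- round(100 * crime_totals / sum(crime_totals), 1)

# Label each slice with the crime name and its percentage.
slice_labels <- paste0(names(crime_pct), " (", crime_pct, "%)")
pie(crime_totals, labels = slice_labels,
    main = "Share of arrests by crime type, 1973")

# States ordered from highest to lowest assault rate.
assault_by_state <- sort(setNames(USArrests$Assault, rownames(USArrests)),
                         decreasing = TRUE)
barplot(assault_by_state, las = 2, cex.names = 0.6,
        ylab = "Assault arrests per 100,000")
```

`head(assault_by_state)` and `tail(assault_by_state)` then list the states at either end of the ranking.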

Module 5: Plot.ly Visualization

Hi everyone! This week we learned about part versus whole graphs and how to use the software plot.ly. The given dataset is average position versus time. Generally, time data is best visualized in a line graph. After plotting average position against time, I found the change between consecutive points interesting. To capture this, I added a bar graph of the percent change (rate of change) in position over the time intervals to gain a better understanding of when the object was moving the fastest versus the slowest. I also included units to explain the data better. Chart in Plot.ly  In this visualization, it is clear the object covers a large distance at a roughly constant rate between 1 and 2.5 seconds. The bar graph displays the percent difference in position for each second, indicating that the object's speed decreases rapidly after 2.5 seconds. This shows up as a plateau in the line graph, but the bar graph gives a better understanding of this rapid decrease in the rate of change. The...
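
The chart itself was built in plot.ly, but the percent-change calculation behind the bar graph can be sketched in a few lines of R. The numbers below are made-up placeholders, not the dataset from the assignment, and the units (seconds and meters) are assumed for illustration.

```r
# Hypothetical placeholder values, NOT the dataset from the assignment.
time_s <- c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
position_m <- c(2, 4, 8, 12, 16, 17, 17.5)

# Percent change in position between consecutive time points,
# measured relative to the earlier point of each pair.
pct_change <- 100 * diff(position_m) / head(position_m, -1)

# Line graph on top, percent-change bars underneath.
op <- par(mfrow = c(2, 1))
plot(time_s, position_m, type = "b",
     xlab = "Time (s)", ylab = "Average position (m)")
barplot(pct_change, names.arg = time_s[-1],
        xlab = "Time (s)", ylab = "Change in position (%)")
par(op)
```

One design note: percent change is relative to the previous position, so it is not quite the same thing as speed. Dividing `diff(position_m)` by `diff(time_s)` would give the average speed over each interval instead.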

Module 4: Visualizing Time Series Data

Hi everyone! This week we explored time series data and the basics of statistics. The given dataset is a Monthly Modal Time Series of public transit data for many U.S. cities, including information such as vehicle revenue hours from 2014 to 2019. I chose to keep the city, year, UZA sq miles (urban area in square miles), ridership (total number of people who rode public transit), vehicle revenue hours (travel hours), and vehicle revenue miles (travel miles) columns for my analysis. There are too many cities in this dataset to plot at once, so I ordered the cities from largest to smallest Primary UZA Sq Miles and focused my visualization on only the top 5 urban areas, which are Atlanta, Boston, Chicago, New York, and Philadelphia. I chose the top 5 because they have the most potential to use public transit, which makes for an interesting visualization. My line graph showcases each of these major cities' percent difference in average ridership every year from 2014 to 2019 relative to the ridership i...
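
The steps above (keep the largest urban areas, average ridership per year, then compare each year to a baseline) can be sketched in base R as follows. Everything here is illustrative: the toy data, the column names, and the city labels are placeholders, and using each city's earliest year as the baseline is an assumption on my part.

```r
# Hypothetical toy data; column names and values are placeholders.
transit <- data.frame(
  city = rep(c("A", "B", "C"), each = 4),
  year = rep(c(2014, 2014, 2015, 2015), times = 3),
  uza_sq_miles = rep(c(300, 200, 100), each = 4),
  ridership = c(100, 120, 110, 132, 50, 50, 45, 45, 10, 10, 12, 12)
)

# Keep the cities with the largest urban area (top 5 in the post, 2 here).
n_keep <- 2
city_area <- aggregate(uza_sq_miles ~ city, data = transit, FUN = max)
top_cities <- head(city_area$city[order(-city_area$uza_sq_miles)], n_keep)
top <- transit[transit$city %in% top_cities, ]

# Average the ridership records within each city and year.
yearly <- aggregate(ridership ~ city + year, data = top, FUN = mean)

# Percent difference from each city's baseline year.
# Assumption: the baseline is the earliest year available for that city.
yearly <- yearly[order(yearly$city, yearly$year), ]
baseline <- ave(yearly$ridership, yearly$city, FUN = function(x) x[1])
yearly$pct_diff <- 100 * (yearly$ridership - baseline) / baseline
```

Plotting `pct_diff` against `year` with one line per city gives the kind of line graph described in the post, and every city starts at 0% in its baseline year.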