Note to Self: Using the filter and select functions from the dplyr package

This is the first post in a series where I write to myself about the various data science spells I’m learning. Today’s spell: dplyr’s filter function. For some reason, when I first learned to filter data with the dplyr package, I assumed the function was designed only to remove or discard data, specifically columns. That is not the case: filter keeps the rows that match a condition, and select is the function that works on columns. I’m writing this blog post to try to correct this automatic thinking in my brain.
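To make the distinction concrete, here is a minimal sketch. The `vehicles` data frame, its column names, and its numbers are all made up for illustration and do not come from the original post.

```r
library(dplyr)

# Hypothetical toy data (made-up values, for illustration only)
vehicles <- data.frame(
  model = c("A", "B", "C", "D"),
  type  = c("electric", "hybrid", "plug-in hybrid", "electric"),
  sales = c(100, 250, 80, 120)
)

# filter() works on ROWS: it keeps the rows where the condition is TRUE
electric_only <- vehicles %>% filter(type == "electric")

# select() works on COLUMNS: it keeps only the columns you name
model_and_sales <- vehicles %>% select(model, sales)
```

Here `electric_only` still has every column but only the electric rows, while `model_and_sales` still has every row but only two columns.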


Plotting multiple lines on the y axis of a ggplot graph

I wanted to plot the yearly sales of three different types of hybrid and electric vehicles on the same graph. The dataset was originally wide, with years as columns and the types of cars as rows. After cleaning the data (making it skinnier by switching cars to columns and years to rows) and saving it under the name “ev_csv_3”, it was time to plot. To draw multiple lines against the y axis, simply leave the y argument out of the aes function in the first line of the ggplot call.
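A rough sketch of how that can look is below. It assumes the y aesthetic is then supplied separately inside each `geom_line()` layer, which is one common way to do it. The data frame is a made-up stand-in for “ev_csv_3”: the column names `year`, `hybrid`, `plug_in_hybrid`, and `electric` and all the numbers are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for ev_csv_3 (made-up column names and values)
ev_csv_3 <- data.frame(
  year           = 2011:2014,
  hybrid         = c(300, 450, 500, 470),
  plug_in_hybrid = c(10, 40, 50, 55),
  electric       = c(10, 15, 50, 65)
)

# No y in the top-level aes(); each geom_line() layer supplies its own y.
# Mapping colour to a constant string gives each line a legend entry.
p <- ggplot(ev_csv_3, aes(x = year)) +
  geom_line(aes(y = hybrid, colour = "Hybrid")) +
  geom_line(aes(y = plug_in_hybrid, colour = "Plug-in hybrid")) +
  geom_line(aes(y = electric, colour = "Electric")) +
  labs(x = "Year", y = "Sales", colour = "Vehicle type")
```

Calling `print(p)` draws the plot, with one line per vehicle type sharing the same x and y axes.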
