Workshop information

Welcome! This workshop provides an introduction to the basics of data analysis in R, including loading and writing Excel and other common file formats, basic analysis, and tidyverse methods. We will also explore data visualization using ggplot2 and task automation through the cronR (Mac) and taskscheduleR (Windows) packages. If time permits, or in a follow-up workshop, we can also explore selected topics such as mapping and geocoding in R, connecting to databases in R, using APIs to pull data automatically in R (e.g. from data.cdc.gov), and working with census data in R.

Instructors: Marisa Eisenberg, Michael Hayashi, and Julie (Jules) Gilbert (University of Michigan, Ann Arbor)

Overview of topics

Below is an overview of the topics we’ll cover (and when!) so you can join for the parts of the workshop that make sense for you. Feel free to listen in even if you’re already familiar with a topic (or are worried you may not have the skills for one). In either case, we hope it will give you some context and that you’ll find some useful info!

We’ve given estimated times for each topic below, but they are approximate! We may adjust them as needed based on how comfortable everyone is working with R.


0. Pre-workshop setup

You should have received an email already with this info! But if for some reason you haven’t already installed R, RStudio, and the packages we’ll be using today, here are the instructions and the installation check code that you can run to make sure you’re all set!

If you need help at any point during the workshop, you can join this Zoom room:

And Jules will be available to help you get sorted out!


0.1 Datasets we will use

Before we start, please download this zip file of all the datasets we will use for this workshop. Put it in whatever folder/directory you plan to keep all your workshop code in, and then unzip it. You should have a folder called “datasets” with a bunch of CSV and Excel files inside!
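
If you want to confirm that R can see the unzipped files, a minimal sketch like the one below may help. It assumes the folder is named “datasets” and sits in your current working directory; the helper name list_workshop_data is made up for illustration and is not part of the workshop materials.

```r
# Hypothetical helper: list the CSV and Excel files inside a data folder.
list_workshop_data <- function(data_dir = "datasets") {
  if (!dir.exists(data_dir)) {
    # Tell the user where R is currently looking.
    message("Folder '", data_dir, "' not found in ", getwd())
    return(character(0))
  }
  # Match file names ending in .csv, .xls, or .xlsx (case-insensitive).
  list.files(data_dir, pattern = "\\.(csv|xlsx?)$", ignore.case = TRUE)
}

list_workshop_data()
```

If the result is empty, check your working directory with getwd() and move to the right folder with setwd() or RStudio’s Session menu.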

1. Basic use of R, loading packages

1-1:10pm (10 minutes)

Covers: basic use of R, RStudio (the environment, console, etc.), loading packages (basically making sure everyone is set up and ready to go!)

2. Dataframes, loading data, and basic data operations

1:10-1:30pm (20 minutes)

Covers: dataframes, loading CSV and Excel data, basic operations with dataframes
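
As a rough preview of this section, the sketch below builds a small dataframe, writes it to a CSV file, reads it back, and does a few basic operations using only base R. The data values and column names are invented for illustration and are not from the workshop datasets.

```r
# Build a small example data frame (made-up values, not workshop data).
county_data <- data.frame(
  county = c("A", "B", "C"),
  population = c(1000, 2500, 400),
  cases = c(10, 50, 2)
)

# Write it to a temporary CSV file and read it back in.
csv_path <- tempfile(fileext = ".csv")
write.csv(county_data, csv_path, row.names = FALSE)
loaded <- read.csv(csv_path)

# Basic operations: inspect the structure, add a column, and filter rows.
str(loaded)
loaded$rate_per_1000 <- 1000 * loaded$cases / loaded$population
high_rate <- loaded[loaded$rate_per_1000 > 8, ]
print(high_rate)

# For Excel files, the readxl package provides read_excel(), e.g.
# readxl::read_excel("datasets/some_file.xlsx")  # hypothetical file name
```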

3. A bit of coding: loops, conditionals, functions

1:30-1:50pm (20 minutes)
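
To give a flavor of these three ideas together, here is a small base R sketch. The function name label_value and its default threshold are made up for this example.

```r
# Function with a conditional: label a value relative to a threshold.
label_value <- function(x, threshold = 10) {
  if (x > threshold) {
    "high"
  } else if (x == threshold) {
    "at threshold"
  } else {
    "low"
  }
}

# Loop: apply the function to each element and collect the results.
values <- c(3, 10, 27)
labels <- character(length(values))
for (i in seq_along(values)) {
  labels[i] <- label_value(values[i])
}
print(labels)

# The same result without an explicit loop.
labels_vectorized <- vapply(values, label_value, character(1))
```

In R, many loops can be replaced by functions like vapply() or by vectorized operations, though an explicit for loop is often easier to read when you are starting out.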

4. Tidyverse and dplyr: the pipe operator!

1:50-2:10pm (20 minutes)
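
As a small illustration of the pipe, the sketch below chains a few dplyr verbs on an invented dataframe. It assumes the dplyr package is installed; the data and column names are hypothetical.

```r
library(dplyr)

# Made-up example data.
visits <- data.frame(
  clinic = c("North", "North", "South", "South", "South"),
  patients = c(12, 8, 30, 5, 15)
)

# Read the pipe (%>%) as "then":
# take visits, then filter, then group, then summarize, then sort.
clinic_totals <- visits %>%
  filter(patients >= 8) %>%
  group_by(clinic) %>%
  summarize(total_patients = sum(patients), n_days = n()) %>%
  arrange(desc(total_patients))

print(clinic_totals)
```

Each step receives the output of the previous step as its first argument, which avoids nesting function calls or creating many intermediate variables.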

5. Visualization with ggplot2

2:10-2:30pm (20 minutes)
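
A ggplot2 figure is built up in layers. The sketch below is one minimal way that might look, assuming ggplot2 is installed; the weekly counts and group names are invented for illustration.

```r
library(ggplot2)

# Made-up weekly counts for two hypothetical groups.
weekly <- data.frame(
  week = rep(1:4, times = 2),
  count = c(5, 9, 14, 11, 3, 4, 8, 12),
  group = rep(c("A", "B"), each = 4)
)

# Build the plot in layers: data and aesthetic mapping, then geoms, then labels.
p <- ggplot(weekly, aes(x = week, y = count, color = group)) +
  geom_line() +
  geom_point() +
  labs(title = "Weekly counts by group", x = "Week", y = "Count") +
  theme_minimal()

print(p)
```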

6. Automation with cronR and taskscheduleR

2:30-2:50pm (20 minutes)

7. Quick intro to APIs and the RSocrata package

2:50-3:00pm (10 minutes)

8. Additional topics (if time!), topics for next time

  • Basics of building maps in R
  • Geocoding data
  • More fun with ggplot
  • More advanced coding
  • R Markdown (what we used to make this website!)
  • Do we want to do a follow-up workshop? If so, what should we cover? Potential topics besides the above:
    • Shiny apps, building dashboards (e.g. with shiny + flexdashboard), basics of building websites
    • Working with census data (requires an account, so a bit more setup than we can do this time)
    • Accessing NHANES data with the nhanes package
    • Other APIs and data scraping
    • Parallel computing
    • Doing some analyses? Correlations, cross correlations, linear models, that sort of thing?

R Resources

More to be added soon!