Bioinformatics Workshops: A workshop series on how to efficiently manage and analyze sequencing data.

Introduction to R: Basics, Plots, and RNA-seq Differential Expression Analysis

At a glance
Opportunity for
  • Participants to understand the basics of R and RStudio and their application to differential gene expression analysis on RNA-seq count data, including data visualization
  • MD, DNP, PhD or equivalent, DDS/DMD
  • Receipt of endorsement from an applicant's supervisor stating the applicant will be able to attend all days of the workshop
  • Preference given to applicants with little or no R programming experience or familiarity with RNA-seq analysis methods
Time commitment
  • Four days (split across two weeks)
Funding level
  • Tuition-free
Session dates
  • April 5-6 and April 12-13, 2018 (a single session spanning four days)
Application due
  • 5:00pm on March 1, 2018
  • Endorsements due 5:00pm March 5, 2018
  • All applicants will be notified of their status via email no later than March 14, 2018.
The application process is closed.

This four-day, hands-on workshop, taught by the teaching team at the Harvard Chan Bioinformatics Core, will introduce participants to the basics of R and RStudio and their application to differential gene expression analysis on RNA-seq count data, including data visualization.

R is a simple programming environment that enables the effective handling of data, while providing excellent graphical support. RStudio is a tool that provides a user-friendly environment for working with R. Together, R and RStudio allow participants to wrangle data, plot using ggplot2, and use DESeq2 to obtain lists of differentially expressed genes from RNA-seq count data.

This workshop is intended to provide both basic R programming knowledge AND its application. Participants should be interested in:

  • using R to increase their efficiency in data analysis
  • visualizing data using R (ggplot2)
  • using R to perform statistical analysis on RNA-seq count data to obtain differentially expressed gene lists

Workshop segments will address the following:

  • R syntax: Understanding the different 'parts of speech' in R; introducing variables and functions, demonstrating how functions work, and modifying arguments for specific use cases.
  • Data structures in R: Getting a handle on the classes of data structures and the types of data used by R.
  • Data inspection and wrangling: Reading in data from files. Using indices and various functions to subset, merge, and create datasets.
  • Visualizing data: Visualizing data using plotting functions in base R as well as from external packages such as ggplot2.
  • Exporting data and graphics: Generating new data tables and plots for use outside of the R environment.
  • Differential expression analysis for RNA-seq data:
    • QC on count data
    • Using DESeq2 to obtain a list of significantly differentially expressed genes
    • Visualizing expression patterns of differentially expressed genes
    • Performing functional analysis on gene lists with R-based tools

Please note that this workshop does not cover single-cell analysis and does not review personal data.

Harvard Catalyst Postgraduate Education Program's policy requires full attendance and the completion of all activity surveys to be eligible for CME credit; no partial credit is allowed.

The Harvard Catalyst Education Program is accredited by the Massachusetts Medical Society to provide continuing medical education for physicians.