Introduction to gene expression microarray analysis in R and Bioconductor

Microarrays were widely used in the 2000s to interrogate the gene expression of cells in a transcriptome-wide manner. Although the decreasing cost of sequencing has made RNA-seq the method of choice for genome-wide transcriptomics, microarrays are still used because their analysis is relatively simple. In addition, many existing microarray data sets are still valuable for analysis. This lesson will introduce you to analysing gene expression microarray experiments using linear models of differential expression.

Prerequisites

The first portion of the lesson assumes a basic knowledge of R and a theoretical knowledge of how gene expression microarray experiments are performed. The second portion assumes theoretical familiarity with how differentially expressed genes are identified using microarrays.

Schedule

Time Topic Learning Objectives
Before start Setup
14:00 Working with Bioconductor
  • Become familiar with the basic Bioconductor setup.
  • Be able to install the appropriate Bioconductor packages for microarray analysis.
  • Be able to use the help system and vignettes for Bioconductor packages
14:20 Importing processed microarray data into R from GEO
  • Be able to obtain data from GEO, including processed and raw data.
  • Be able to explain and use the differences between GEO data types.
  • Understand the concept of the ExpressionSet class of objects.
14:40 Importing raw (unprocessed) Affymetrix microarray data
  • Be able to obtain supplemental data.
  • Be able to explain and use the differences between GEO data types.
  • Understand the concept of the ExpressionSet class of objects.
14:55 Working with experimental metadata
  • Be able to use metadata from GEO objects to construct useful R data objects.
  • Be able to use read.celfiles in combination with your own pData object to ensure data integrity.
15:10 Microarray data processing with RMA
  • Understand and explain the background correction, normalisation and summarisation steps for microarray data.
15:50 Coffee break
16:00 Identifying differentially expressed genes using linear models (part 1)
  • Be able to use limma to identify differentially expressed genes.
  • Understand the formula class of objects in R, and use it to specify the appropriate model for linear modelling.
16:40 Identifying differentially expressed genes using linear models (part 2, factorial designs)
  • Be able to use limma to identify differentially expressed genes.
  • Understand the formula class of objects in R, and use it to specify the appropriate model for linear modelling.
17:20 From features to annotated gene lists
  • Be able to use AnnotationDb methods to associate annotations with platform data.
17:50 Basic downstream analysis of microarray data
  • Be able to draw volcano plots and heatmaps in R.
  • Be able to interpret these plots.
  • Be aware of some downstream analyses that are commonly performed to interpret the results of differential expression analysis.
18:10 Finish