Chapman & Hall/CRC – 2008, 256 pages
ISBN: 1420067273
Providing an interface between dry-bench bioinformaticians and wet-lab biologists, DNA Methylation Microarrays: Experimental Design and Statistical Analysis presents the statistical methods and tools to analyze high-throughput epigenomic data, in particular, DNA methylation microarray data. Since these microarrays share the same underlying principles as gene expression microarrays, many of the analyses in the text also apply to microarray-based gene expression and histone modification (ChIP-on-chip) studies.
After introducing basic statistics, the book describes wet-bench technologies that produce the data for analysis and explains how to preprocess the data to remove systematic artifacts resulting from measurement imperfections. It then explores differential methylation and genomic tiling arrays. Focusing on exploratory data analysis, the next several chapters show how cluster and network analyses can link the functions and roles of unannotated DNA elements with known ones. The book concludes by surveying the open source software (R and Bioconductor), public databases, and other online resources available for microarray research.
Requiring only limited knowledge of statistics and programming, this book helps readers gain a solid understanding of the methodological foundations of DNA microarray analysis.
Applied Statistics
Descriptive statistics
Inferential statistics
DNA Methylation Microarrays and Quality Control
DNA methylation microarrays
Workflow of methylome experiment
Image analysis
Visualization of raw data
Reproducibility
Experimental Design
Goals of experiment
Reference design
Balanced block design
Loop design
Factorial design
Time course experimental design
How many samples/arrays are needed?
Appendix
Data Normalization
Measure of methylation
The need for normalization
Strategy for normalization
Two-color CpG island microarray normalization
Oligonucleotide arrays normalization
Normalization using control sequences
Appendix
Significant Differential Methylation
Fold change
Linear model for log-ratios or log-intensities
t test for contrasts
F test for joint contrasts
P-value adjustment for multiple testing
Modified t and F tests
Significant variation within and between groups
Significant correlation with a covariate
Permutation test for bisulfite sequence data
Missing data values
Appendix
High-Density Genomic Tiling Arrays
Normalization
Wilcoxon test in a sliding window
Boundaries of methylation regions
Multiscale analysis by wavelets
Unsupervised segmentation by hidden Markov model
Principal component analysis and biplot
Cluster Analysis
Measure of dissimilarity
Dimensionality reduction
Hierarchical clustering
K-means clustering
Model-based clustering
Quality of clustering
Statistical significance of clusters
Reproducibility of clusters
Repeated measurements
Statistical Classification
Feature selection
Discriminant function
K-nearest neighbor
Performance assessment
Interdependency Network of DNA Methylation
Graphs and networks
Partial correlation
Dependence networks from DNA methylation microarrays
Network analysis
Time Series Experiment
Regulatory networks from microarray data
Dynamic model of regulation
A penalized likelihood score for parsimonious model
Optimization by genetic algorithms
Online Annotations
Gene centric resources
PubMeth: A cancer methylation database
Gene Ontology
Kyoto Encyclopedia of Genes and Genomes
UniProt/Swiss-Prot protein knowledgebase
The International HapMap Project
UCSC human genome browser
Public Microarray Data Repositories
Epigenetics Society
Microarray Gene Expression Data Society
Minimum Information About a Microarray Experiment
Public repositories for high-throughput arrays
Open Source Software for Microarray Data Analysis
R: A language and environment for statistical computing and graphics
Bioconductor