• Foundations of Applied Statistics
  • Preface
  • I Introduction
  • 1 Statistics
    • 1.1 History
    • 1.2 Definition
    • 1.3 Relationship to Machine Learning
    • 1.4 Relationship to Data Science
    • 1.5 Some History On Data Science
      • John Tukey
      • Jeff Wu
      • William Cleveland
      • Industry
  • 2 Components of Applied Statistics
    • 2.1 Study Design
    • 2.2 Data Wrangling
    • 2.3 Data Analysis
      • 2.3.1 Exploratory Data Analysis
      • 2.3.2 Modeling
      • 2.3.3 Inference
      • 2.3.4 Prediction
    • 2.4 Communication
  • 3 Data Sets Used in this Book
  • II Exploratory Data Analysis
  • 4 Exploratory Data Analysis
    • 4.1 What is EDA?
    • 4.2 Descriptive Statistics Examples
    • 4.3 Components of EDA
    • 4.4 Data Sets
      • 4.4.1 Data mtcars
      • 4.4.2 Data mpg
      • 4.4.3 Data diamonds
      • 4.4.4 Data gapminder
  • 5 Numerical Summaries of Data
    • 5.1 Useful Summaries
    • 5.2 Measures of Center
    • 5.3 Mean, Median, and Mode in R
    • 5.4 Quantiles and Percentiles
    • 5.5 Five Number Summary
    • 5.6 Measures of Spread
    • 5.7 Variance, SD, and IQR in R
    • 5.8 Identifying Outliers
    • 5.9 Application to mtcars Data
    • 5.10 Measuring Symmetry
    • 5.11 skewness() Function
    • 5.12 Measuring Tails
    • 5.13 Excess Kurtosis
    • 5.14 kurtosis() Function
    • 5.15 Visualizing Skewness and Kurtosis
    • 5.16 Covariance and Correlation
      • 5.16.1 Covariance
      • 5.16.2 Pearson Correlation
      • 5.16.3 Spearman Correlation
  • 6 Data Visualization Basics
    • 6.1 Plots
    • 6.2 R Base Graphics
    • 6.3 Read the Documentation
    • 6.4 Barplot
    • 6.5 Boxplot
    • 6.6 Constructing Boxplots
    • 6.7 Boxplot with Outliers
    • 6.8 Histogram
    • 6.9 Histogram with More Breaks
    • 6.10 Density Plot
    • 6.11 Boxplot (Side-By-Side)
    • 6.12 Stacked Barplot
    • 6.13 Scatterplot
    • 6.14 Quantile-Quantile Plots
    • 6.15 A Grammar of Graphics
  • 7 EDA of High-Dimensional Data
    • 7.1 Definition
    • 7.2 Examples
    • 7.3 Big Data vs HD Data
    • 7.4 Definition of HD Data
    • 7.5 Rationale
  • 8 Cluster Analysis
    • 8.1 Definition
    • 8.2 Types of Clustering
    • 8.3 Top-Down vs Bottom-Up
    • 8.4 Challenges
    • 8.5 Illustrative Data Sets
      • 8.5.1 Simulated data1
      • 8.5.2 “True” Clusters data1
      • 8.5.3 Simulated data2
      • 8.5.4 “True” Clusters data2
    • 8.6 Distance Measures
      • 8.6.1 Objects
      • 8.6.2 Euclidean
      • 8.6.3 Manhattan
      • 8.6.4 Euclidean vs Manhattan
      • 8.6.5 dist()
      • 8.6.6 Distance Matrix data1
    • 8.7 Hierarchical Clustering
      • 8.7.1 Strategy
      • 8.7.2 Example: Cancer Subtypes
      • 8.7.3 Algorithm
      • 8.7.4 Linkage Criteria
      • 8.7.5 hclust()
      • 8.7.6 Hierarchical Clustering of data1
      • 8.7.7 Standard hclust() Usage
      • 8.7.8 as.dendrogram()
      • 8.7.9 Modify the Labels
      • 8.7.10 Color the Branches
      • 8.7.11 Cluster Assignments (\(K = 3\))
      • 8.7.12 Cluster Assignments (\(K = 3\))
      • 8.7.13 Cluster Assignments (\(K = 2\))
      • 8.7.14 Cluster Assignments (\(K = 4\))
      • 8.7.15 Cluster Assignments (\(K = 6\))
      • 8.7.16 Linkage: Complete (Default)
      • 8.7.17 Linkage: Average
      • 8.7.18 Linkage: Single
      • 8.7.19 Linkage: Ward
      • 8.7.20 Hierarchical Clustering of data2
      • 8.7.21 as.dendrogram()
      • 8.7.22 Modify the Labels
      • 8.7.23 Color the Branches
      • 8.7.24 Cluster Assignments (\(K = 2\))
      • 8.7.25 Cluster Assignments (\(K = 3\))
      • 8.7.26 Cluster Assignments (\(K = 4\))
      • 8.7.27 Cluster Assignments (\(K = 5\))
    • 8.8 K-Means Clustering
      • 8.8.1 Strategy
      • 8.8.2 Centroid
      • 8.8.3 Algorithm
      • 8.8.4 Notes
      • 8.8.5 kmeans()
      • 8.8.6 fitted()
      • 8.8.7 K-Means Clustering of data1
      • 8.8.8 Centroids of data1
      • 8.8.9 Cluster Assignments (\(K = 3\))
      • 8.8.10 Cluster Assignments (\(K = 2\))
      • 8.8.11 Cluster Assignments (\(K = 6\))
      • 8.8.12 K-Means Clustering of data2
      • 8.8.13 Cluster Assignments (\(K = 2\))
      • 8.8.14 Cluster Assignments (\(K = 3\))
      • 8.8.15 Cluster Assignments (\(K = 5\))
  • 9 Principal Component Analysis
    • 9.1 Dimensionality Reduction
    • 9.2 Goal of PCA
    • 9.3 Defining the First PC
    • 9.4 Calculating All PCs
    • 9.5 Singular Value Decomposition
    • 9.6 A Simple PCA Function
    • 9.7 The Ubiquitous PCA Example
    • 9.8 PC Biplots
    • 9.9 PCA Examples
      • 9.9.1 Weather Data
      • 9.9.2 Yeast Gene Expression
      • 9.9.3 HapMap Genotypes
  • III Random Variables
  • 10 Probability and Statistics
    • 10.1 Central Dogma of Inference
    • 10.2 Data Analysis Without Probability
  • 11 Probability Theory
    • 11.1 Sample Space
    • 11.2 Measure Theoretic Probability
    • 11.3 Mathematical Probability
    • 11.4 Union of Two Events
    • 11.5 Conditional Probability
    • 11.6 Independence
    • 11.7 Bayes Theorem
    • 11.8 Law of Total Probability
  • 12 Random Variables
    • 12.1 Definition
    • 12.2 Distribution of RV
    • 12.3 Discrete Random Variables
    • 12.4 Example: Discrete PMF
    • 12.5 Example: Discrete CDF
    • 12.6 Probabilities of Events Via Discrete CDF
    • 12.7 Continuous Random Variables
    • 12.8 Example: Continuous PDF
    • 12.9 Example: Continuous CDF
    • 12.10 Probabilities of Events Via Continuous CDF
    • 12.11 Example: Continuous RV Event
    • 12.12 Note on PMFs and PDFs
    • 12.13 Note on CDFs
    • 12.14 Sample Vs Population Statistics
    • 12.15 Expected Value
    • 12.16 Variance
    • 12.17 Moment Generating Functions
    • 12.18 Random Variables in R
  • 13 Discrete RVs
    • 13.1 Uniform (Discrete)
    • 13.2 Uniform (Discrete) PMF
    • 13.3 Uniform (Discrete) in R
    • 13.4 Bernoulli
    • 13.5 Binomial
    • 13.6 Binomial PMF
    • 13.7 Binomial in R
    • 13.8 Poisson
    • 13.9 Poisson PMF
    • 13.10 Poisson in R
  • 14 Continuous RVs
    • 14.1 Uniform (Continuous)
    • 14.2 Uniform (Continuous) PDF
    • 14.3 Uniform (Continuous) in R
    • 14.4 Exponential
    • 14.5 Exponential PDF
    • 14.6 Exponential in R
    • 14.7 Beta
    • 14.8 Beta PDF
    • 14.9 Beta in R
    • 14.10 Normal
    • 14.11 Normal PDF
    • 14.12 Normal in R
  • 15 Joint Distributions
    • 15.1 Bivariate Random Variables
    • 15.2 Events for Bivariate RVs
    • 15.3 Marginal Distributions
    • 15.4 Independent Random Variables
    • 15.5 Conditional Distributions
    • 15.6 Conditional Moments
    • 15.7 Law of Total Variance
    • 15.8 Covariance and Correlation
    • 15.9 Multivariate Distributions
    • 15.10 MV Expected Value
    • 15.11 MV Variance-Covariance Matrix
  • 16 Multivariate RVs
    • 16.1 Multinomial
    • 16.2 Multivariate Normal
    • 16.3 Dirichlet
    • 16.4 In R
  • 17 Sums of Random Variables
    • 17.1 Linear Transformation of a RV
    • 17.2 Sums of Independent RVs
    • 17.3 Sums of Dependent RVs
    • 17.4 Means of Random Variables
  • 18 Convergence of Random Variables
    • 18.1 Sequence of RVs
    • 18.2 Convergence in Distribution
    • 18.3 Convergence in Probability
    • 18.4 Almost Sure Convergence
    • 18.5 Strong Law of Large Numbers
    • 18.6 Central Limit Theorem
    • 18.7 Example: Calculations
    • 18.8 Example: Plot
  • 19 Population Principal Components Analysis
  • 20 From Probability to Likelihood
    • 20.1 Likelihood Function
    • 20.2 Log-Likelihood Function
    • 20.3 Sufficient Statistics
    • 20.4 Factorization Theorem
    • 20.5 Example: Normal
    • 20.6 Likelihood Principle
    • 20.7 Maximum Likelihood
    • 20.8 Going Further
  • 21 Exponential Family Distributions
    • 21.1 Rationale
    • 21.2 Definition
    • 21.3 Example: Bernoulli
    • 21.4 Example: Normal
    • 21.5 Natural Single Parameter EFD
    • 21.6 Calculating Moments
    • 21.7 Example: Normal
    • 21.8 Maximum Likelihood
    • 21.9 Table of Common EFDs
  • IV Frequentist Inference
  • 22 Statistical Inference
    • 22.1 Data Collection as a Probability
    • 22.2 Example: Simple Random Sample
    • 22.3 Example: Randomized Controlled Trial
    • 22.4 Parameters and Statistics
    • 22.5 Sampling Distribution
    • 22.6 Central Dogma of Inference
    • 22.7 Example: Fair Coin?
  • 23 Inference Goals and Strategies
    • 23.1 Basic Idea
    • 23.2 Normal Example
    • 23.3 Point Estimate of \(\mu\)
    • 23.4 Sampling Distribution of \(\hat{\mu}\)
    • 23.5 Pivotal Statistic
  • 24 Confidence Intervals
    • 24.1 Goal
    • 24.2 Formulation
    • 24.3 Interpretation
    • 24.4 A Normal CI
    • 24.5 A Simulation
    • 24.6 Normal\((0,1)\) Percentiles
    • 24.7 Commonly Used Percentiles
    • 24.8 \((1-\alpha)\)-Level CIs
    • 24.9 One-Sided CIs
  • 25 Hypothesis Tests
    • 25.1 Example: HT on Fairness of a Coin
    • 25.2 A Caveat
    • 25.3 Definition
    • 25.4 Return to Normal Example
    • 25.5 HTs on Parameter Values
    • 25.6 Two-Sided vs. One-Sided HT
    • 25.7 Test Statistic
    • 25.8 Null Distribution (Two-Sided)
    • 25.9 Null Distribution (One-Sided)
    • 25.10 P-values
    • 25.11 Calling a Test “Significant”
    • 25.12 Types of Errors
    • 25.13 Error Rates
  • 26 Maximum Likelihood Estimation
    • 26.1 The Normal Example
    • 26.2 MLE \(\rightarrow\) Normal Pivotal Statistics
    • 26.3 Likelihood Function
    • 26.4 Log-Likelihood Function
    • 26.5 Calculating MLEs
    • 26.6 Properties
    • 26.7 Assumptions and Notation
    • 26.8 Consistency
    • 26.9 Equivariance
    • 26.10 Fisher Information
    • 26.11 Standard Error
    • 26.12 Asymptotic Normal
    • 26.13 Asymptotic Pivotal Statistic
    • 26.14 Wald Test
    • 26.15 Confidence Intervals
    • 26.16 Optimality
    • 26.17 Delta Method
    • 26.18 Delta Method Example
    • 26.19 Multiparameter Fisher Info Matrix
    • 26.20 Multiparameter Asymptotic MVN
  • 27 MLE Examples: One Sample
    • 27.1 Exponential Family Distributions
    • 27.2 Summary of MLE Statistics
    • 27.3 Notes
    • 27.4 Binomial
    • 27.5 Normal
    • 27.6 Poisson
    • 27.7 One-Sided CIs and HTs
  • 28 MLE Examples: Two Samples
    • 28.1 Comparing Two Populations
    • 28.2 Two RVs
    • 28.3 Two Sample Means
    • 28.4 Two MLEs
    • 28.5 Poisson
    • 28.6 Normal (Unequal Variances)
    • 28.7 Normal (Equal Variances)
    • 28.8 Binomial
    • 28.9 Example: Binomial CI
    • 28.10 Example: Binomial HT
  • 29 The t Distribution
    • 29.1 Normal, Unknown Variance
    • 29.2 Aside: Chi-Square Distribution
    • 29.3 Theoretical Basis of the t
    • 29.4 When Is t Utilized?
    • 29.5 t vs Normal
    • 29.6 t Percentiles
    • 29.7 Confidence Intervals
    • 29.8 Hypothesis Tests
    • 29.9 Two-Sample t-Distribution
    • 29.10 Two-Sample t-Distribution
    • 29.11 Two-Sample t-Distributions
  • 30 Inference in R
    • 30.1 BSDA Package
    • 30.2 Example: Poisson
    • 30.3 Direct Calculations
    • 30.4 Commonly Used Functions
    • 30.5 About These Functions
    • 30.6 Normal Data: “Davis” Data Set
    • 30.7 Height vs Weight
    • 30.8 An Error?
    • 30.9 Updated Height vs Weight
    • 30.10 Density Plots of Height
    • 30.11 Density Plots of Weight
    • 30.12 t.test() Function
    • 30.13 Two-Sided Test of Male Height
    • 30.14 Output of t.test()
    • 30.15 Tidying the Output
    • 30.16 Two-Sided Test of Female Height
    • 30.17 Difference of Two Means
    • 30.18 Test with Equal Variances
    • 30.19 Paired Sample Test (v. 1)
    • 30.20 Paired Sample Test (v. 2)
    • 30.21 The Coin Flip Example
    • 30.22 binom.test()
    • 30.23 alternative = "greater"
    • 30.24 alternative = "less"
    • 30.25 prop.test()
    • 30.26 An Observation
    • 30.27 Wording of Surveys
    • 30.28 The Data
    • 30.29 Inference on the Difference
    • 30.30 90% Confidence Interval
    • 30.31 Poisson Data: poisson.test()
    • 30.32 Example: RNA-Seq
    • 30.33 \(H_1: \lambda_1 \not= \lambda_2\)
    • 30.34 \(H_1: \lambda_1 < \lambda_2\)
    • 30.35 \(H_1: \lambda_1 > \lambda_2\)
    • 30.36 Question
  • 31 Likelihood Ratio Tests
    • 31.1 General Set-up
    • 31.2 Significance Regions
    • 31.3 P-values
    • 31.4 Example: Wald Test
    • 31.5 Neyman-Pearson Lemma
    • 31.6 Simple vs. Composite Hypotheses
    • 31.7 General Hypothesis Tests
    • 31.8 Composite \(H_0\)
    • 31.9 Generalized LRT
    • 31.10 Null Distribution of Gen. LRT
    • 31.11 Example: Poisson
    • 31.12 Example: Normal
  • V Bayesian Inference
  • 32 Likelihood Function
    • 32.1 Same MLE, Different \(L(\theta | \boldsymbol{x})\)
    • 32.2 Weighted Likelihood Estimate
    • 32.3 Conditional Expected Value
    • 32.4 Standard Error
  • 33 Bayesian Inference
    • 33.1 Frequentist Probability
    • 33.2 Bayesian Probability
    • 33.3 The Framework
    • 33.4 An Example
    • 33.5 Calculations
    • 33.6 In Practice
    • 33.7 Goal
    • 33.8 Advantages
    • 33.9 Computation
  • 34 Estimation
    • 34.1 Assumptions
    • 34.2 Posterior Distribution
    • 34.3 Posterior Expectation
    • 34.4 Posterior Interval
    • 34.5 Maximum A Posteriori Probability
    • 34.6 Loss Functions
    • 34.7 Bayes Risk
    • 34.8 Bayes Estimators
  • 35 Classification
    • 35.1 Assumptions
    • 35.2 Prior Probability on H
    • 35.3 Posterior Probability
    • 35.4 Loss Function
    • 35.5 Bayes Risk
    • 35.6 Bayes Rule
  • 36 Priors
    • 36.1 Conjugate Priors
    • 36.2 Example: Beta-Bernoulli
    • 36.3 Example: Normal-Normal
    • 36.4 Example: Dirichlet-Multinomial
    • 36.5 Example: Gamma-Poisson
    • 36.6 Jeffreys Prior
    • 36.7 Examples: Jeffreys Priors
    • 36.8 Improper Prior
  • 37 Theory
    • 37.1 de Finetti’s Theorem
    • 37.2 Admissibility
  • 38 Empirical Bayes
    • 38.1 Rationale
    • 38.2 Approach
    • 38.3 Example: Normal
  • VI Numerical Methods for Likelihood Functions
  • 39 Why Numerical Methods for Likelihood
    • 39.1 Challenges
    • 39.2 Approaches
  • 40 Latent Variable Models
    • 40.1 Definition
    • 40.2 Empirical Bayes Revisited
    • 40.3 Normal Mixture Model
    • 40.4 Bernoulli Mixture Model
  • 41 EM Algorithm
    • 41.1 Rationale
    • 41.2 Requirement
    • 41.3 The Algorithm
    • 41.4 \(Q({\boldsymbol{\theta}}, {\boldsymbol{\theta}}^{(t)})\)
    • 41.5 EM for MAP
  • 42 EM Examples
    • 42.1 Normal Mixture Model
    • 42.2 E-Step
    • 42.3 M-Step
    • 42.4 Caveat
    • 42.5 Yeast Gene Expression
    • 42.6 Initialize Values
    • 42.7 Run EM Algorithm
    • 42.8 Fitted Mixture Distribution
    • 42.9 Bernoulli Mixture Model
    • 42.10 Other Applications of EM
  • 43 Theory of EM
    • 43.1 Decomposition
    • 43.2 Kullback-Leibler Divergence
    • 43.3 Lower Bound
    • 43.4 EM Increases Likelihood
  • 44 Variational Inference
    • 44.1 Rationale
    • 44.2 Optimization Goal
    • 44.3 Mean Field Approximation
    • 44.4 Optimal \(q_k({\boldsymbol{z}}_k)\)
    • 44.5 Remarks
  • 45 Markov Chain Monte Carlo
    • 45.1 Motivation
    • 45.2 Note
    • 45.3 Big Picture
    • 45.4 Metropolis-Hastings Algorithm
    • 45.5 Metropolis Algorithm
    • 45.6 Utilizing MCMC Output
    • 45.7 Remarks
    • 45.8 Full Conditionals
    • 45.9 Gibbs Sampling
    • 45.10 Gibbs and MH
    • 45.11 Latent Variables
    • 45.12 Theory
    • 45.13 Software
  • 46 MCMC Example
    • 46.1 Single Nucleotide Polymorphisms
    • 46.2 PSD Admixture Model
    • 46.3 Gibbs Sampling Approach
    • 46.4 The Data
    • 46.5 Model Components
    • 46.6 The Model
    • 46.7 Conditional Independence
    • 46.8 The Posterior
    • 46.9 Full Conditional for \(\boldsymbol{Q}\)
    • 46.10 Full Conditional for \(\boldsymbol{P}\)
    • 46.11 Full Conditional \(\boldsymbol{Z}_A\) & \(\boldsymbol{Z}_B\)
    • 46.12 Gibbs Sampling Updates
    • 46.13 Implementation
    • 46.14 Matrix-wise rdirichlet Function
    • 46.15 Inspect Data
    • 46.16 Model Parameters
    • 46.17 Update \(\boldsymbol{P}\)
    • 46.18 Update \(\boldsymbol{Q}\)
    • 46.19 Update (Each) \(\boldsymbol{Z}\)
    • 46.20 Model Log-likelihood Function
    • 46.21 MCMC Configuration
    • 46.22 Run Sampler
    • 46.23 Posterior Mean of \(\boldsymbol{Q}\)
    • 46.24 Plot Log-likelihood Steps
    • 46.25 What Happens for K=4?
    • 46.26 Run Sampler Again
    • 46.27 Posterior Mean of \(\boldsymbol{Q}\)
  • 47 Further Reading
  • VII Nonparametric Statistical Inference
  • 48 Nonparametric Statistics
    • 48.1 Parametric Inference
    • 48.2 Nonparametric Inference
    • 48.3 Nonparametric Descriptive Statistics
    • 48.4 Semiparametric Inference
  • 49 Empirical Distribution Functions
    • 49.1 Definition
    • 49.2 Example: Normal
    • 49.3 Pointwise Convergence
    • 49.4 Glivenko-Cantelli Theorem
    • 49.5 Dvoretzky-Kiefer-Wolfowitz (DKW) Inequality
    • 49.6 Statistical Functionals
    • 49.7 Plug-In Estimator
    • 49.8 EDF Standard Error
    • 49.9 EDF CLT
  • 50 Bootstrap
    • 50.1 Rationale
    • 50.2 Big Picture
    • 50.3 Bootstrap Variance
    • 50.4 Caveat
    • 50.5 Bootstrap Sample
    • 50.6 Bootstrap CIs
    • 50.7 Invoking the CLT
    • 50.8 Percentile Interval
    • 50.9 Pivotal Interval
    • 50.10 Studentized Pivotal Interval
    • 50.11 Bootstrap Hypothesis Testing
    • 50.12 Example: t-test
    • 50.13 Parametric Bootstrap
    • 50.14 Example: Exponential Data
  • 51 Permutation Methods
    • 51.1 Rationale
    • 51.2 Permutation Test
    • 51.3 Wilcoxon Rank Sum Test
    • 51.4 Wilcoxon Signed Rank-Sum Test
    • 51.5 Examples
    • 51.6 Permutation t-test
  • 52 Goodness of Fit
    • 52.1 Rationale
    • 52.2 Chi-Square GoF Test
    • 52.3 Example: Hardy-Weinberg
    • 52.4 Kolmogorov–Smirnov Test
    • 52.5 One Sample KS Test
    • 52.6 Two Sample KS Test
    • 52.7 Example: Exponential vs Normal
  • 53 Method of Moments
    • 53.1 Rationale
    • 53.2 Definition
    • 53.3 Example: Normal
    • 53.4 Exploring Goodness of Fit
  • VIII Statistical Models
  • 54 Types of Models
    • 54.1 Probabilistic Models
    • 54.2 Multivariate Models
    • 54.3 Variables
    • 54.4 Statistical Model
    • 54.5 Parametric vs Nonparametric
    • 54.6 Simple Linear Regression
    • 54.7 Ordinary Least Squares
    • 54.8 Generalized Least Squares
    • 54.9 Matrix Form of Linear Models
    • 54.10 Least Squares Regression
    • 54.11 Generalized Linear Models
    • 54.12 Generalized Additive Models
    • 54.13 Some Trade-offs
    • 54.14 Bias and Variance
  • 55 Motivating Examples
    • 55.1 Sample Correlation
    • 55.2 Example: Hand Size Vs. Height
    • 55.3 Cor. of Hand Size and Height
    • 55.4 L/R Hand Sizes
    • 55.5 Correlation of Hand Sizes
    • 55.6 Davis Data
    • 55.7 Height and Weight
    • 55.8 Correlation of Height and Weight
    • 55.9 Correlation Among Females
    • 55.10 Correlation Among Males
  • 56 Simple Linear Regression
    • 56.1 Definition
    • 56.2 Rationale
    • 56.3 Setup
    • 56.4 Line Minimizing Squared Error
    • 56.5 Least Squares Solution
    • 56.6 Visualizing Least Squares Line
    • 56.7 Example: Height and Weight
    • 56.8 Calculate the Line Directly
    • 56.9 Plot the Line
    • 56.10 Observed Data, Fits, and Residuals
    • 56.11 Proportion of Variation Explained
  • 57 lm() Function in R
    • 57.1 Calculate the Line in R
    • 57.2 An lm Object is a List
    • 57.3 From the R Help
    • 57.4 Some of the List Items
    • 57.5 summary()
    • 57.6 summary() List Elements
    • 57.7 Using tidy()
    • 57.8 Proportion of Variation Explained
    • 57.9 Assumptions to Verify
    • 57.10 Residual Distribution
    • 57.11 Normal Residuals Check
    • 57.12 Fitted Values Vs. Obs. Residuals
  • 58 Ordinary Least Squares
    • 58.1 OLS Solution
    • 58.2 Sample Variance
    • 58.3 Sample Covariance
    • 58.4 Expected Values
    • 58.5 Standard Error
    • 58.6 Proportion of Variance Explained
    • 58.7 Normal Errors
    • 58.8 Sampling Distribution
    • 58.9 CLT
    • 58.10 Gauss-Markov Theorem
  • 59 Generalized Least Squares
    • 59.1 GLS Solution
    • 59.2 Other Results
  • 60 OLS in R
    • 60.1 Weight Regressed on Height + Sex
    • 60.2 One Variable, Two Scales
    • 60.3 Interactions
    • 60.4 More on Interactions
    • 60.5 Visualizing Three Different Models
  • 61 Categorical Explanatory Variables
    • 61.1 Example: Chicken Weights
    • 61.2 Factor Variables in lm()
    • 61.3 Plot the Fit
    • 61.4 ANOVA (Version 1)
    • 61.5 anova()
    • 61.6 How It Works
    • 61.7 Top of Design Matrix
    • 61.8 Bottom of Design Matrix
    • 61.9 Model Fits
  • 62 Variable Transformations
    • 62.1 Rationale
    • 62.2 Power and Log Transformations
    • 62.3 Diamonds Data
    • 62.4 Nonlinear Relationship
    • 62.5 Regression with Nonlinear Relationship
    • 62.6 Residual Distribution
    • 62.7 Normal Residuals Check
    • 62.8 Log-Transformation
    • 62.9 OLS on Log-Transformed Data
    • 62.10 Residual Distribution
    • 62.11 Normal Residuals Check
    • 62.12 Tree Pollen Study
    • 62.13 Tree Pollen Count by Week
    • 62.14 A Clever Transformation
    • 62.15 week Transformed
  • 63 OLS Goodness of Fit
    • 63.1 Pythagorean Theorem
    • 63.2 OLS Normal Model
    • 63.3 Projection Matrices
    • 63.4 Decomposition
    • 63.5 Distribution of Projection
    • 63.6 Distribution of Residuals
    • 63.7 Degrees of Freedom
    • 63.8 Submodels
    • 63.9 Hypothesis Testing
    • 63.10 Generalized LRT
    • 63.11 Nested Projections
    • 63.12 F Statistic
    • 63.13 F Distribution
    • 63.14 F Test
    • 63.15 Example: Davis Data
    • 63.16 Comparing Linear Models in R
    • 63.17 ANOVA (Version 2)
    • 63.18 Comparing Two Models with anova()
    • 63.19 When There’s a Single Variable Difference
    • 63.20 Calculating the F-statistic
    • 63.21 Calculating the Generalized LRT
    • 63.22 ANOVA on More Distant Models
    • 63.23 Compare Multiple Models at Once
  • 64 Logistic Regression
    • 64.1 Goal
    • 64.2 Bernoulli as EFD
    • 64.3 Model
    • 64.4 Maximum Likelihood Estimation
    • 64.5 Iteratively Reweighted Least Squares
    • 64.6 GLMs
  • 65 glm() Function in R
    • 65.1 Example: Grad School Admissions
    • 65.2 Explore the Data
    • 65.3 Logistic Regression in R
    • 65.4 Summary of Fit
    • 65.5 ANOVA of Fit
    • 65.6 Example: Contraceptive Use
    • 65.7 A Different Format
    • 65.8 Fitting the Model
    • 65.9 Summary of Fit
    • 65.10 ANOVA of Fit
    • 65.11 More on this Data Set
  • 66 Generalized Linear Models
    • 66.1 Definition
    • 66.2 Exponential Family Distributions
    • 66.3 Natural Single Parameter EFD
    • 66.4 Dispersion EFDs
    • 66.5 Example: Normal
    • 66.6 EFD for GLMs
    • 66.7 Components of a GLM
    • 66.8 Link Functions
    • 66.9 Calculating MLEs
    • 66.10 Newton-Raphson
    • 66.11 Fisher’s Scoring
    • 66.12 Iteratively Reweighted Least Squares
    • 66.13 Estimating Dispersion
    • 66.14 CLT Applied to the MLE
    • 66.15 Approximately Pivotal Statistics
    • 66.16 Deviance
    • 66.17 Generalized LRT
    • 66.18 Example: Grad School Admissions
    • 66.19 glm() Function
  • 67 Nonparametric Regression
    • 67.1 Simple Linear Regression
    • 67.2 Simple Nonparametric Regression
    • 67.3 Smooth Functions
    • 67.4 Smoothness Parameter \(\lambda\)
    • 67.5 The Solution
    • 67.6 Natural Cubic Splines
    • 67.7 Basis Functions
    • 67.8 Calculating the Solution
    • 67.9 Linear Operator
    • 67.10 Degrees of Freedom
    • 67.11 Bayesian Interpretation
    • 67.12 Bias and Variance Trade-off
    • 67.13 Choosing \(\lambda\)
    • 67.14 Smoothers and Spline Models
    • 67.15 Smoothers in R
    • 67.16 Example
  • 68 Generalized Additive Models
    • 68.1 Ordinary Least Squares
    • 68.2 Additive Models
    • 68.3 Backfitting
    • 68.4 GAM Definition
    • 68.5 Overview of Fitting GAMs
    • 68.6 GAMs in R
    • 68.7 Example
  • 69 Bootstrap for Statistical Models
    • 69.1 Homoskedastic Models
    • 69.2 Residuals
    • 69.3 Studentized Residuals
    • 69.4 Confidence Intervals
    • 69.5 Hypothesis Testing
    • 69.6 Parametric Bootstrap
  • IX High-Dimensional Inference
  • 70 High-Dimensional Data and Inference
    • 70.1 Definition
    • 70.2 Examples
    • 70.3 HD Gene Expression Data
    • 70.4 Many Responses Model
    • 70.5 HD SNP Data
    • 70.6 Many Regressors Model
    • 70.7 Goals
    • 70.8 Challenges
  • 71 Many Responses Model
  • 72 Shrinkage and Empirical Bayes
    • 72.1 Estimating Several Means
    • 72.2 Usual MLE
    • 72.3 Loss Function
    • 72.4 Stein’s Paradox
    • 72.5 Inverse Regression Approach
    • 72.6 Empirical Bayes Estimate
    • 72.7 EB for a Many Responses Model
  • 73 Multiple Testing
    • 73.1 Motivating Example
    • 73.2 Challenges
    • 73.3 Outcomes
    • 73.4 Error Rates
    • 73.5 Bonferroni Correction
    • 73.6 False Discovery Rate
    • 73.7 Point Estimate
    • 73.8 Adaptive Threshold
    • 73.9 Conservative Properties
    • 73.10 Q-Values
    • 73.11 Bayesian Mixture Model
    • 73.12 Bayesian-Frequentist Connection
    • 73.13 Local FDR
  • 74 Many Regressors Model
  • 75 Ridge Regression
    • 75.1 Motivation
    • 75.2 Optimization Goal
    • 75.3 Solution
    • 75.4 Preprocessing
    • 75.5 Shrinkage
    • 75.6 Example
    • 75.7 Existence of Solution
    • 75.8 Effective Degrees of Freedom
    • 75.9 Bias and Covariance
    • 75.10 Ridge vs OLS
    • 75.11 Bayesian Interpretation
    • 75.12 Example: Diabetes Data
    • 75.13 GLMs
  • 76 Lasso Regression
    • 76.1 Motivation
    • 76.2 Optimization Goal
    • 76.3 Solution
    • 76.4 Preprocessing
    • 76.5 Bayesian Interpretation
    • 76.6 Inference
    • 76.7 GLMs
  • X Latent Variable Models
  • 77 HD Latent Variable Models
    • 77.1 Definition
    • 77.2 Model
    • 77.3 Estimation
  • 78 Jackstraw
    • 78.1 Procedure
    • 78.2 Example: Yeast Cell Cycle
  • 79 Surrogate Variable Analysis
    • 79.1 Procedure
    • 79.2 Example: Kidney Expr by Age
  • Appendix
  • References

Foundations of Applied Statistics

Data Analysis, Inference, and Modeling

John D. Storey

Created 2017-02-01; Last modified 2020-02-19