Advanced Environmental Data Analysis
EAS 6490  SPRING 2022


INSTRUCTOR

Prof. Emanuele Di Lorenzo
phone 404-894-3994
office ES&T 3252
email edl@gatech.edu

TEACHING ASSISTANT

Giangiacomo Navarra 
phone 404-894-3994
office ES&T 3252
email navarra@gatech.edu

CLASS

Monday & Wednesday, 2:00 PM - 3:15 PM

L1175 ES&T

NOTICE: Class is VIRTUAL until further notice (Spring 2022). The Zoom link is available in Canvas.

TEXT

Discrete Inverse and State Estimation Problems
Carl Wunsch, Cambridge University Press.

Class notes on Objective Analysis 
Dennis Hartmann, Web Notes

 

- SYLLABUS -

COVID ANNOUNCEMENT SPRING 2022

If you do attend in person, Georgia Tech is currently recommending the following:
(1) Get a booster if you haven't already. (2) Participate in surveillance testing at least once a week. (3) Wear well-fitting face coverings with good filtration.

 

What is a "well-fitting face covering with good filtration"?  A snugly fitting (no-gaps) respirator-style mask such as a KN-95, KF-94 or N-95 provides the best protection against airborne transmission.  A second-best option is a paper surgical mask with a nose wire that is fully extended over nose and chin and tightened as necessary to minimize gaps.  Cloth masks do not provide nearly as good filtration. For more information on masks: https://dearpandemic.org/masks-and-omicron/

COURSE PHILOSOPHY AND GOALS: This course is an advanced introduction to environmental data analysis. It is intended for first-year graduate students and senior undergraduates. The goal of this class is to provide a deeper understanding of the theory underlying the statistical analysis of environmental data in the space, time, and spectral domains, and to give students hands-on experience. Ideally, by the end of this class you will have developed a set of computer programming toolboxes and theoretical skills that you can apply immediately to analyzing and modeling data in your own research.
Although some previous knowledge of probability and statistics is required, a background review will be provided, and concepts and notation will be reintroduced as needed. In this class you will learn (A) how to combine models, which quantify statistical or dynamical relationships, with observations, (B) time series analysis, (C) forecasting and extrapolation, and (D) signal decomposition. A more detailed description of these topics is given under LECTURE TOPICS below.

 

HOMEWORK: There will be a homework assignment approximately every two weeks. You will be required to learn some computer programming skills with MATLAB, R (http://www.r-project.org/), IDL, or any other language you wish. If you do not have access to a computer with this software, you will be provided with an account at the beginning of the semester. The type of data analyzed in the homework will vary depending on the interests of the attending students.

 

EXAMS: There is going to be a short MIDTERM and a FINAL PRESENTATION of the class project. 

CLASS PROJECT:  To help you put into practice your data analysis skills you will be asked to choose a data analysis project, possibly involving your own research data, that you will present at the end of the semester. The project is to be chosen based on a set of questions that you would like to answer rather than the type of data analysis technique you would like to apply, and it may require the use of one or more data analysis techniques.

GRADING: 50% Homework, 25% Midterm, 25% Class Project.

LECTURE TOPICS:

* Background Review: Matrix and Vector Algebra, Fundamental Statistical Measures, Multivariable Probability Densities, Sample Estimates, Correlation and Covariance, Functions and Sums of Random Variables, Central Limit Theorem.
 

* Combining models and observations: Interpolation and Function Fitting, Least Squares Modeling and Singular Vector Expansion, Uncertainties in Estimates, Inverse Methods, Statistical vs. Dynamical Constraints.

* Time Series Analysis: Time and Frequency Domain Models, Stationarity, Auto-Regression Models, Spectral Analysis and Coherence, Trend Analysis and Significance, Estimating errors in time series reconstruction.

* Forecasting and Extrapolation: Statistically Optimal Linear Estimators, Regression models, space and time models, objective mapping (multivariate regression), covariance modeling.

* Decomposing signals: Multivariate eigenfunction analysis, EOFs, PCA, CCA, and Wavelet analysis
 

 

- LECTURES -

ADDITIONAL RESOURCES

MATLAB Tutorials: MathWorks Tutorial 
Math Reviews on Calculus, Linear Algebra and ODEs by Paul Dawkins
Additional Copyright Material - [ web ]

* BACKGROUND REVIEW

Matrix and Vector Algebra, Fundamental Statistical Measures, Multivariable Probability Densities, Sample Estimates, Correlation and Covariance, Functions and Sums of Random Variables, Central Limit Theorem.

Topic 1:
Why is statistical analysis useful?
References: Davis-lect-1.pdf, 01-notes.pdf, 01-figues.pdf, Lorenz Attractor

 

Topic 2:
An overview of the statistical methods. How does it all fit together?
References: 02-notes.pdf, 02-figues.pdf

Topic 3:
Fundamental Statistical Measures. Univariate Statistics and PDFs.
References: Davis-lect-2.pdf, 03-notes.pdf, Wunsch Chap. 2 (pp. 27-41), Hartmann web notes Chap. 1, notes1.pdf

Topic 4:
Fundamental Statistical Measures. Multivariate Statistics and JPDFs.
References: Davis-lect-2.pdf, Wunsch Chap. 2 (pp. 27-41), 04-notes.pdf, CentralLimitTheorem.pdf (S. Gille), Hartmann web notes Chap. 1, notes1.pdf

Topic 5:
Statistically Optimal Linear Estimators: relationship between least squares and conditional joint PDF estimates. 
References: Davis-lect-3.pdf, 05-notes.pdf

* COMBINING MODELS AND OBSERVATIONS

Interpolation and Function Fitting, Least Squares Modeling and Singular Vector Expansion, Uncertainties in Estimates, Inverse Methods, Statistical vs. Dynamical Constraints.

Topic 6:
Testing a model against observations: Introduction to Least Squares (LSQ), 06-lsq-review.pdf
Linear Algebra Review: 06-linalg-notes.pdf, from Wunsch Chap. 2 (pp. 1-27).
References: Wunsch Chap. 1, Wunsch Chap. 2 (pp. 41-57)
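
As a small illustration of the least squares idea introduced in this topic, the sketch below fits a straight line y = a + b*t to a handful of points by minimizing the sum of squared residuals. It is only a sketch and is not taken from the course notes: the function names `fit_line` and `residuals` and the data values are made up for this example, and Python is used only for convenience (the homework may be done in MATLAB, R, IDL, or another language).

```python
# Ordinary least squares fit of a straight line y = a + b*t.
# Illustrative sketch only: names and data are hypothetical, not from the course files.

def fit_line(t, y):
    """Return (a, b) minimizing sum((y_i - a - b*t_i)**2)."""
    n = len(t)
    if n != len(y) or n < 2:
        raise ValueError("need two or more paired samples")
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Sum of squares of t and cross-products of t and y, both about the means.
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    if s_tt == 0:
        raise ValueError("all t values are identical")
    b = s_ty / s_tt           # slope
    a = y_mean - b * t_mean   # intercept
    return a, b

def residuals(t, y, a, b):
    """Misfit between the observations and the fitted line."""
    return [yi - (a + b * ti) for ti, yi in zip(t, y)]

# Made-up observations, roughly following y = 1 + 2*t with noise.
t_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_obs = [1.1, 2.9, 5.2, 6.8, 9.0]

a_hat, b_hat = fit_line(t_obs, y_obs)
res = residuals(t_obs, y_obs, a_hat, b_hat)
print(f"intercept = {a_hat:.3f}, slope = {b_hat:.3f}")
print("sum of squared residuals =", sum(r * r for r in res))
```

At the least squares solution the residuals sum to zero and are uncorrelated with t, which is a quick check that the fit has been computed correctly.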

 

Topic 7:
Interpolation and function fitting with LSQ: The CO2 curve and SST spatial maps
References: CO2.pdf, LSQ_SST.pdf, Wunsch Chap. 2 (pp. 41-57)

Topic 8:
LSQ and Inverse Modeling: Reconstructing the source of a pollutant with an advection diffusion model 
References: Wunsch Chap. 1, Wunsch Chap. 2 (pp. 41-57), LSQ_dispersion.pdf

Topic 9:
Lagrange Multipliers and Adjoints
References: Wunsch Chap. 2 (pp. 58-68), 09-adjoint.pdf

 

* FORECASTING AND EXTRAPOLATION

Multivariate Statistically Optimal Linear Estimators, Regression models, space and time models, objective mapping (multivariate regression), covariance modeling.

Topic 10:
Covariance Modeling, Basic Theory 
References: Hartmann Chapters 3 and 5, CovModel_Theory.pdf
Examples in the time domain and the Yule-Walker equations: CovModel_TimeEX.pdf
Example in the space domain and multivariate optimal interpolation: CovModel_SpaceEX.pdf and CovModel_SpaceEX_fig.pdf


 

* SIGNAL DECOMPOSITION

Multivariate eigenfunction analysis, EOFs, PCA, CCA, and  Wavelet analysis 

Topic 11:
Empirical Orthogonal Functions (EOFs) / Principal Component Analysis (PCA), Maximum Covariance Analysis (MCA), Combined EOFs (SVD) and Canonical Correlation Analysis (CCA)
References: EOF_notes.pdf, EOF_local_vs_global.pdf, EOF_Figs.pdf, Hartmann

Topic 12:
Space/Time filters (e.g. high-pass, low-pass, band-pass) and Wavelet analysis
References: Hartmann, WaveletClass.ppt, Wavelet_Torrence_compo1998.pdf (MATLAB programs)

* TIME SERIES ANALYSIS 

Time and Frequency Domain Models, Stationarity, Auto-Regression Models, Spectral Analysis and Coherence, Trend Analysis and Significance, Estimating errors in time series reconstruction.

Material is taken from the following references and personal notes: 

Hartmann web notes, Chapter 6
Time Series pdfbook, Chapters 1-4

Topic 13:
Understanding Time Processes in the Time Domain: TimeProcesses.pdf
White Noise, Red Noise, Auto-correlation Function, Auto-Regressive Models, Fourier Series
References: Hartmann Chapter 6, Time Series pdfbook Chapter 4
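
To make the white noise and red noise terms in this topic concrete, the sketch below generates a first-order auto-regressive, AR(1), series and compares its sample autocorrelation with that of white noise. It is a minimal illustration under stated assumptions, not material from the course notes: the function names `red_noise` and `autocorr`, the coefficient 0.7, the series length, and the seed are all chosen for this example only.

```python
import random

# AR(1) "red noise": x[k] = phi * x[k-1] + e[k], with e[k] Gaussian white noise.
# Illustrative sketch only: names and parameter values are hypothetical.

def red_noise(n, phi, seed=0):
    """Generate n samples of an AR(1) process with coefficient phi (|phi| < 1)."""
    rng = random.Random(seed)  # local generator, so results are repeatable
    x = [0.0] * n
    x[0] = rng.gauss(0.0, 1.0)
    for k in range(1, n):
        x[k] = phi * x[k - 1] + rng.gauss(0.0, 1.0)
    return x

def autocorr(x, lag):
    """Sample autocorrelation of x at a non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x)
    cov = sum((x[k] - mean) * (x[k + lag] - mean) for k in range(n - lag))
    return cov / var

phi = 0.7
series = red_noise(20000, phi)   # red noise
white = red_noise(20000, 0.0)    # phi = 0 reduces to white noise

r1_red = autocorr(series, 1)
r2_red = autocorr(series, 2)
r1_white = autocorr(white, 1)

print(f"red noise:   r(1) = {r1_red:.3f}, r(2) = {r2_red:.3f}")
print(f"white noise: r(1) = {r1_white:.3f}")
```

For an AR(1) process the theoretical autocorrelation at lag k is phi**k, so the red-noise estimates should fall near 0.7 and 0.49, while the white-noise estimate should be near zero, up to sampling error.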

Topic 14 - 15:
Frequency domain, Spectrum and Autocovariance function
References: Hartmann Chapter 6, Time Series pdfbook Chapter 1, 10-timeseries-intro.pdf
Review Convolution and Cross-correlation, Aliasing, DFT and Tapering
References: C. Hoyos PowerPoint slides [ ppt1 | ppt2 ], TimeSeriesCodes.zip

Topic 16:
Analysis of two or more signals, Cross-Spectra and Coherence 
References: Hartmann Chapter 6, Time Series pdfbook Chapter 4, class notes Coherece.pdf

 

- HOMEWORKS -

Homework 1: 

Homework 1: 01-hw.pdf , utl_contourfill.m

Homework 2: 

Homework 2: 02-hw.pdf , hw2_generateY.m , bs_rand.m , utl_H.m

Homework 3: 

Homework 3: 03-hw.pdf , 03-hw-data
Addendum: 03-hw_addendum.pdf , run_adj_sens.m

Homework 4: 

Homework 4a: EOF, 04-hw.pdf, directory

Homework 5: 

Homework: 05-hw.pdf , SST_NP.mat

Homework TBA: 

Homework: 04-hw.pdf , 04-ENSO.txt , 04-timeseries1.txt
Due date:  TBA

Homework TBA: 

Homework: 05-hw.pdf , 05-timeseries1.txt , 05-timeseries2.txt
Due date: TBA

Homework TBA: 

Homework: 07-hw.pdf , 07-data.mat , 07-movies.mov
Due date: TBA

FINAL EXAM 2022: 

Homework: Final_Exam_2022.pdf , Final_Exam.mat , Final_Exam_Video.mov
Due date: TBA

Prof. Emanuele Di Lorenzo

Program in Ocean Science & Engineering

Georgia Institute of Technology

Ford Environmental Science & Technology Building (ES&T), Office 3252

311 Ferst Drive NE, Atlanta, GA, 30332

United States of America

+1 (404) 894-3994

© Emanuele Di Lorenzo, Georgia Institute of Technology
