Introduction to Machine Learning

School of Computing, University of Nebraska-Lincoln, Fall 2021: CSCE 478/878

Synopsis: This course offers a rigorous mathematical exploration of machine learning (ML) models, encompassing both supervised and unsupervised learning. The models are presented from a probabilistic perspective, with a strong emphasis on the Bayesian view of statistics. Students will implement ML algorithms from scratch using vanilla Python and its scientific (non-ML) libraries. Prerequisites include robust Python programming skills and a solid foundation in probability & statistics, linear algebra, calculus, and algorithm complexity analysis. Expect programming-intensive and time-consuming assignments.

Instructor: Dr. M. R. Hasan
Office Hours: See the course Canvas page
Lecture Time: Tuesday and Thursday, 11:00 AM - 12:15 PM, Avery Hall 119
Assignments: See the course Canvas page
Recitations: See the course Canvas page
Syllabus: See the course Canvas page
Class Discussion: See the Piazza link on the course Canvas page
Teaching Assistant: See the course Canvas page

Explore my tutorials on Machine Learning and Deep Learning on GitHub for additional resources.

Schedule


Note: A background in Probability Theory (discrete & continuous) and Linear Algebra is assumed. Only the Information Theory section will be covered in lectures; other slides are for review.

[ML Background] Probabilistic Reasoning
  • Probabilistic Reasoning-1
    • Uncertainty & Probability
    • Probabilistic Reasoning (Frequentist & Bayesian)
    • Sample Space and Random Variable
  • Probabilistic Reasoning-2
    • Discrete Probability Theory
    • Sum & Product Rule
    • Chain Rule of Probability
    • Bayes’ Rule
    • Joint and Conditional Distribution
    • Reducing the Complexity of Joint Distribution
    • Unconditional and Conditional Independence
  • Probabilistic Reasoning-3
    • Continuous Probability Theory
    • Probability Density Function
    • Expectation
    • Variance
    • Covariance & Correlation

Readings: Bishop: 1.2.1, 1.2.2, 1.2.3, 1.6; Murphy: 2.2, 2.8
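
As a quick illustration of the sum rule, product rule, and Bayes' rule listed above, here is a minimal NumPy sketch on a small discrete joint distribution. The disease/test numbers are made up for illustration and are not from the course materials.

```python
# A minimal sketch of the sum, product, and Bayes' rules on a small
# discrete joint distribution. The joint table P(Disease, Test) is
# hypothetical; its entries must sum to 1.
import numpy as np

# Rows: Disease in {yes, no}; columns: Test in {positive, negative}.
joint = np.array([[0.009, 0.001],   # P(D=yes, T=+), P(D=yes, T=-)
                  [0.099, 0.891]])  # P(D=no,  T=+), P(D=no,  T=-)

# Sum rule (marginalization): P(D) and P(T).
p_d = joint.sum(axis=1)   # [P(D=yes), P(D=no)]
p_t = joint.sum(axis=0)   # [P(T=+),   P(T=-)]

# Product rule: P(D, T) = P(T | D) P(D), so P(T | D) = P(D, T) / P(D).
p_t_given_d = joint / p_d[:, None]

# Bayes' rule: P(D | T=+) = P(T=+ | D) P(D) / P(T=+).
posterior = p_t_given_d[:, 0] * p_d / p_t[0]
print("P(D=yes | T=+) =", posterior[0])   # ~0.083
```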

[ML Background] Gaussian Distribution
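
For quick reference, a minimal sketch of evaluating the univariate Gaussian density N(x | μ, σ²); the parameter defaults below are arbitrary choices for illustration.

```python
# Univariate Gaussian density:
# N(x | mu, sigma^2) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2).
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the Gaussian density elementwise at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-3, 3, 7)
print(gaussian_pdf(x))   # peaks at x = mu with value ~0.3989
```
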
[ML Background] Linear Algebra for Machine Learning
  • Linear Algebra for Machine Learning-1
    • What is Linear Algebra?
    • How is Linear Algebra useful in Machine Learning?
  • Linear Algebra for Machine Learning-2
    • Mathematical Objects (Scalars, Vectors, Matrices, Tensors)
    • Measuring the Size of Vectors and Matrices (various norms)
    • Some Special Matrices (Symmetric, Identity, Diagonal, Orthogonal)
    • Inverse of a Matrix
    • Orthogonal Matrix
    • Matrix & Vector Multiplication (dot, inner & Hadamard Product)
    • Orthogonal Transformation
  • Linear Algebra for Machine Learning-3
    • Motivation for solving a system of linear equations (linear systems)
    • Method of Gaussian elimination & back substitution
    • Square Matrix: Gauss-Jordan Elimination Method
    • Conditions for a unique solution of a linear system
    • Determinant
    • Singular Matrix
    • Span of columns of a matrix
    • Linear Independence of columns of a matrix
    • Basis of the columns of a matrix
    • Rank of a matrix
    • Change of basis
    • Computation of Rank: Row-echelon form
  • Linear Algebra for Machine Learning-4
    • Intuition of the Eigenvalue equation
    • Matrix eigenvalue problem
    • Computing eigenvalues & eigenvectors
    • Characteristic equation of a matrix
    • Eigenbasis
    • Matrix diagonalization
    • Eigendecomposition
  • Linear Algebra for Machine Learning-5
    • Quadratic form of a vector
    • Positive Definite & positive semi-definite matrix
    • Summary of the discussion on Linear Algebra for Machine Learning

Readings: Array Programming with NumPy; Ch 7 & 8: Advanced Engineering Mathematics (10th edition) by Erwin Kreyszig
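
The following minimal NumPy sketch ties together several of the linear-algebra topics above (linear systems, eigendecomposition, and positive definiteness); the matrix A is a made-up example, not from the slides.

```python
# Solving Ax = b and eigendecomposition A = V diag(w) V^{-1} in NumPy.
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])       # hypothetical symmetric matrix
b = np.array([1.0, 2.0])

# np.linalg.solve uses an LU factorization (Gaussian elimination with
# partial pivoting) followed by back substitution.
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))     # True

# Eigendecomposition: columns of V are eigenvectors, w the eigenvalues.
w, V = np.linalg.eig(A)
print(np.allclose(A, V @ np.diag(w) @ np.linalg.inv(V)))  # True

# A symmetric matrix is positive definite iff all eigenvalues are > 0,
# i.e. the quadratic form x^T A x > 0 for every nonzero x.
print(np.all(w > 0))             # True
```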

Information Theory
  • Information Theory-1
    • Information Theory (Message vs. Information)
    • Entropy & Cross-Entropy
    • Relative Entropy or Kullback-Leibler (KL) Divergence
  • Information Theory-2
    • KL Divergence for Maximum Likelihood Estimation
  • Information Theory-3
    • Independence of Random Variables & Information Gain
    • KL Divergence & Information Gain

Readings: Bishop: 1.2.1, 1.2.2, 1.2.3, 1.6; Murphy: 2.2, 2.8
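
As a small worked example of the quantities above, the sketch below computes entropy, cross-entropy, and KL divergence for two made-up discrete distributions, and checks the identity H(p, q) = H(p) + D_KL(p || q) that connects KL divergence to maximum likelihood estimation.

```python
# Entropy, cross-entropy, and KL divergence for discrete distributions.
import numpy as np

p = np.array([0.5, 0.25, 0.25])   # hypothetical "true" distribution
q = np.array([0.4, 0.4, 0.2])     # hypothetical model distribution

entropy       = -np.sum(p * np.log2(p))        # H(p)
cross_entropy = -np.sum(p * np.log2(q))        # H(p, q)
kl_divergence =  np.sum(p * np.log2(p / q))    # D_KL(p || q)

# H(p, q) = H(p) + D_KL(p || q): minimizing cross-entropy in q
# is equivalent to minimizing the KL divergence from p.
print(np.isclose(cross_entropy, entropy + kl_divergence))   # True
```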

Course Introduction

Jupyter Notebook Demo:

Readings: Russell & Norvig: 1; Géron: 1

Fuel Your Imagination:

Analogy-based Learning: K-Nearest Neighbors

Jupyter Notebooks:

Readings: Bishop: 2.5; Murphy: 1.4.1, 1.4.2, 1.4.3; Alpaydin: 8.1, 8.2, 8.3, 8.4; Géron: 3 (classification performance metrics)
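
In the spirit of the course's from-scratch policy, here is a minimal k-nearest-neighbors sketch in plain NumPy. The toy data, the knn_predict helper, and k = 3 are illustrative assumptions, not the course's reference implementation.

```python
# k-NN classification by majority vote among the k nearest training
# points under Euclidean distance.
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict a label for each row of X_test via k-NN majority vote."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k nearest training points for each test point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote over the neighbors' labels.
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Toy 2-D data: two Gaussian clusters with labels 0 and 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(knn_predict(X, y, np.array([[0.0, 0.0], [4.0, 4.0]])))  # [0 1]
```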

Fuel Your Imagination:

Text Resources
  • Lecture slides and Jupyter notebooks provide a detailed account of the topics.
  • Primary References:
    • Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
    • Pattern Recognition and Machine Learning by Christopher M. Bishop
    • Introduction to Machine Learning (3rd ed.) by Ethem Alpaydin
  • Practical Implementation:
    • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd Edition, 2019) by Aurélien Géron (O'Reilly)
  • Introductory Texts:
    • Machine Learning by Tom Mitchell
    • Data Science from Scratch by Joel Grus (O’Reilly)
    • Python for Data Analysis (2nd Edition) by Wes McKinney (O'Reilly)
    • Python Machine Learning by Sebastian Raschka (Packt Publishing)
    • The Hundred-Page Machine Learning Book by Andriy Burkov
    • Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig
  • Optional Texts:
    • The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, Jerome Friedman
    • Pattern Classification by Richard O. Duda, Peter E. Hart, David G. Stork
    • Bayesian Reasoning and Machine Learning by David Barber
    • Information Theory, Inference, and Learning Algorithms by David MacKay
    • An Introduction to Support Vector Machines and Other Kernel-based Learning Methods by Nello Cristianini, John Shawe-Taylor
    • Boosting: Foundations and Algorithms by Robert E. Schapire and Yoav Freund
  • Advanced Texts:
    • Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola
    • Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville
    • Deep Learning with Python by François Chollet
    • Reinforcement Learning: An Introduction by Richard S. Sutton, Andrew G. Barto
  • Statistics, Linear Algebra & Calculus Texts:
    • Advanced Engineering Mathematics (10th Ed.) by Erwin Kreyszig
    • All of Statistics: A Concise Course in Statistical Inference by Larry Wasserman
    • Convex Optimization by Stephen Boyd and Lieven Vandenberghe
  • Interesting & Enlightening Texts:
    • The Master Algorithm by Pedro Domingos
    • Artificial Intelligence: A Guide for Thinking Humans by Melanie Mitchell
    • The Deep Learning Revolution by Terrence J. Sejnowski
    • Prediction Machines by Ajay Agrawal, Joshua Gans, Avi Goldfarb
    • Thinking, Fast and Slow by Daniel Kahneman
    • The Drunkard's Walk by Leonard Mlodinow
    • The Signal and the Noise by Nate Silver
    • Calculated Risks by Gerd Gigerenzer
    • The Black Swan by Nassim Nicholas Taleb
    • Surfaces and Essences by Douglas Hofstadter, Emmanuel Sander
    • The Book of Why by Judea Pearl, Dana Mackenzie
    • Rebooting AI by Gary Marcus, Ernest Davis
    • Interpretable Machine Learning by Christoph Molnar
Additional Resources
  • Machine Learning & Related Courses/Talks
  • Collaboration Tool
  • Google Colab Tutorials
  • Python
  • Open Data Repositories
  • ML Podcasts
  • Journals
  • Conference Proceedings