Linear algebra is a field of applied mathematics that is a prerequisite to reading and understanding the formal description of deep learning methods, such as in papers and textbooks.

Generally, an understanding of linear algebra (or parts thereof) is presented as a prerequisite for machine learning. Although important, this area of mathematics is seldom covered by computer science or software engineering degree programs.

In their seminal textbook on deep learning, Ian Goodfellow and others present chapters covering the prerequisite mathematical concepts for deep learning, including a chapter on linear algebra.

In this post, you will discover the crash course in linear algebra for deep learning presented in the de facto textbook on deep learning.

After reading this post, you will know:

- The topics suggested as prerequisites for deep learning by experts in the field.
- The progression through these topics and their culmination.
- Suggestions for how to get the most out of the chapter as a crash course in linear algebra.

Let’s get started.

## Deep Learning Prerequisites

The book “Deep Learning” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville is the de facto textbook for deep learning.

In the book, the authors provide a part titled “*Applied Math and Machine Learning Basics*” intended to provide the background in applied mathematics and machine learning required to understand the deep learning material presented in the rest of the book.

This part of the book includes four chapters; they are:

- Linear Algebra
- Probability and Information Theory
- Numerical Computation
- Machine Learning Basics

Given the expertise of the authors of the book, it is fair to say that the chapter on linear algebra provides a well-reasoned set of prerequisites for deep learning, and perhaps more generally for much of machine learning.

> This part of the book introduces the basic mathematical concepts needed to understand deep learning.

— Page 30, Deep Learning, 2016.

Therefore, we can use the topics covered in the chapter on linear algebra as a guide to the topics you may be expected to be familiar with as a deep learning and machine learning practitioner.

Linear algebra is less likely to be covered in computer science courses than other types of math, such as discrete mathematics. This is specifically called out by the authors.

> Linear algebra is a branch of mathematics that is widely used throughout science and engineering. However, because linear algebra is a form of continuous rather than discrete mathematics, many computer scientists have little experience with it.

— Page 31, Deep Learning, 2016.

We can take it that the topics in this chapter are also laid out in a way tailored for computer science graduates with little to no prior exposure.

## Linear Algebra Topics

The chapter on linear algebra is divided into 12 sections.

As a first step, it is useful to treat this as a high-level road map. The complete list of sections from the chapter is given below.

- Scalars, Vectors, Matrices and Tensors
- Multiplying Matrices and Vectors
- Identity and Inverse Matrices
- Linear Dependence and Span
- Norms
- Special Kinds of Matrices and Vectors
- Eigendecomposition
- Singular Value Decomposition
- The Moore-Penrose Pseudoinverse
- The Trace Operator
- The Determinant
- Example: Principal Components Analysis

There’s not much value in enumerating the specifics covered in each section, as the topics are mostly self-explanatory if you are already familiar with them.

## Progress Through Concepts

A reading of the chapter shows a progression in concepts and methods from the most primitive (vectors and matrices) to the derivation of the principal components analysis (known as PCA), a method used in machine learning.

It is a clean progression and well designed. Topics are presented with textual descriptions and consistent notation, allowing the reader to see exactly how elements come together through matrix factorization, the pseudoinverse, and ultimately PCA.
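To make the link between matrix factorization and the pseudoinverse concrete, here is a minimal sketch of my own (it is not taken from the book). It assumes a small contrived matrix with full column rank, so every singular value is nonzero, and builds the Moore-Penrose pseudoinverse from the SVD before checking it against NumPy's built-in `pinv()`.

```python
import numpy as np

# Small contrived matrix (3 rows, 2 columns), chosen only for illustration.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Pseudoinverse from the factors: A+ = V @ diag(1/s) @ U^T.
# Taking 1/s directly is only safe because no singular value is zero here.
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

# Compare against NumPy's own implementation.
print(np.allclose(A_pinv, np.linalg.pinv(A)))
```

For a rank-deficient matrix, the reciprocal should only be taken for the nonzero singular values, which is what `np.linalg.pinv()` handles for you.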

The focus is on the application of linear algebra operations rather than theory, although no worked examples are given for any of the operations.

Finally, the derivation of PCA is perhaps a bit much. A beginner may want to skip this full derivation, or perhaps reduce it to the application of some of the elements learned throughout the chapter (e.g. eigendecomposition).
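As an illustration of reducing PCA to the application of eigendecomposition, the sketch below uses a common formulation: center the data, compute the covariance matrix, and project onto the eigenvectors with the largest eigenvalues. It is not a reproduction of the book's derivation, and the synthetic data, the mixing matrix, and the choice of `k = 2` components are assumptions made purely for the example.

```python
import numpy as np

# Synthetic data: 100 samples, 3 correlated features (illustrative only).
rng = np.random.default_rng(1)
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.0, 0.1]])
X = rng.normal(size=(100, 3)) @ mixing

# Center each column, then compute the 3x3 covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# Reorder so the directions of largest variance come first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Keep the top k components and project the data onto them.
k = 2
components = eigvecs[:, :k]
X_projected = X_centered @ components

print(X_projected.shape)
print(eigvals)
```

The variance of each projected column equals the corresponding eigenvalue, which is a quick sanity check that the projection is doing what PCA promises.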

One area I would like to have seen covered is linear least squares and the matrix algebra methods used to solve it, such as a direct solution, LU decomposition, QR decomposition, and SVD. This might be more of a general machine learning perspective and less of a deep learning perspective, and I can see why it was excluded.
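For readers curious about what these approaches look like in practice, here is a rough sketch of my own on a tiny contrived dataset where `y = 1 + 2x` holds exactly. The data and variable names are assumptions for illustration. The direct approach solves the normal equations with `np.linalg.solve()`, which relies on an LU factorization internally; the other approaches use QR and the SVD-based pseudoinverse.

```python
import numpy as np

# Design matrix with a bias column and one input; y = 1 + 2x exactly.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# 1. Direct: solve the normal equations (X^T X) b = X^T y.
b_direct = np.linalg.solve(X.T @ X, X.T @ y)

# 2. QR decomposition: X = Q R, then solve R b = Q^T y.
Q, R = np.linalg.qr(X)
b_qr = np.linalg.solve(R, Q.T @ y)

# 3. SVD, via the pseudoinverse: b = X+ y.
b_svd = np.linalg.pinv(X) @ y

# 4. NumPy's least squares solver, for comparison.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# All four should be close to the intercept 1 and slope 2.
for name, b in [("direct", b_direct), ("qr", b_qr), ("svd", b_svd), ("lstsq", b_lstsq)]:
    print(name, b)
```

On well-conditioned problems like this one the methods agree; the QR and SVD routes are generally preferred when the columns of `X` are nearly dependent, because forming `X.T @ X` can amplify numerical error.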

## Linear Algebra References

The authors also suggest two other texts to consult if further depth in linear algebra is required.

They are:

- The Matrix Cookbook, Petersen and Pedersen, 2006.
- Linear Algebra, Shilov, 1977.

The Matrix Cookbook is a free PDF filled with the notations and equations of practically any matrix operation you can conceive.

> These pages are a collection of facts (identities, approximations, inequalities, relations, …) about matrices and matters relating to them. It is collected in this form for the convenience of anyone who wants a quick desktop reference.

— page 2, The Matrix Cookbook, 2006.

Linear Algebra by Georgi Shilov is a classic and well-regarded textbook on the topic, designed for undergraduate students.

> This book is intended as a text for undergraduate students majoring in mathematics and physics.

— Page v, Linear Algebra, 1977.

## Use as a Linear Algebra Crash Course

If you are a machine learning practitioner looking to use this chapter as a linear algebra crash course, then I would make a few recommendations to make the topics more concrete:

- Implement each operation in Python using NumPy functions on small contrived data.
- Implement each operation manually in Python without NumPy functions.
- Apply key operations, such as the factorization methods (eigendecomposition and SVD) and PCA, to real but small datasets loaded from CSV.
- Create a cheat sheet of notation that you can use as a quick reference going forward.
- Research and list examples of each operation/topic used in machine learning papers or texts.
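As a sketch of what the first two suggestions might look like, the example below implements matrix multiplication manually with plain Python lists and then checks the result against NumPy. The function name `matmul_manual` and the small matrices are my own choices for illustration.

```python
import numpy as np

def matmul_manual(A, B):
    """Multiply two matrices given as lists of lists, without NumPy."""
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    if any(len(row) != n_inner for row in A):
        raise ValueError("columns of A must match rows of B")
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            # Dot product of row i of A with column j of B.
            for k in range(n_inner):
                C[i][j] += A[i][k] * B[k][j]
    return C

# Small contrived data.
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

C_manual = matmul_manual(A, B)
C_numpy = np.array(A) @ np.array(B)

print(C_manual)  # [[19.0, 22.0], [43.0, 50.0]]
print(np.allclose(C_manual, C_numpy))
```

Writing the triple loop by hand makes the row-by-column rule explicit, while the NumPy version gives you a trusted result to check your work against.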

Did you take on any of these suggestions?

List your results in the comments below.

## Further Reading

This section provides more resources on the topic if you are looking to go deeper.

- Deep Learning, 2016.
- The Matrix Cookbook, Petersen and Pedersen, 2006.
- Linear Algebra, Shilov, 1977.

## Summary

In this post, you discovered the crash course in linear algebra for deep learning presented in the de facto textbook on deep learning.

Specifically, you learned:

- The topics suggested as prerequisites for deep learning by experts in the field.
- The progression through these topics and their culmination.
- Suggestions for how to get the most out of the chapter as a crash course in linear algebra.

Did you read this chapter of the Deep Learning book? What did you think of it?

Let me know in the comments below.

The post Linear Algebra for Deep Learning appeared first on Machine Learning Mastery.