— course overview —
18.06: Linear Algebra
Prof. Steven Johnson,
MIT Applied Math, Physics
Fall 2022
all learning materials: https://github.com/stevengj/1806
pset submission/grades/announcements only: MIT Canvas/Gradescope
Textbook: Strang, Introduction to Linear Algebra, 5th edition
+ supplementary notes
Administrative Details
Lectures MWF11, 34-101 — video recording & notes links posted to github
Tuesday recitations, various office hours — use Canvas to switch sections
13 weekly psets, due Fridays 11am (except when Friday is an exam or holiday, in which case the due date is Wed or Mon). Submission via Gradescope
— lowest pset score will be dropped, anything else requires S3 + Prof. J.
— pset 1 posted today, due Friday 9/16
Grading: homework 15%, 3 in-class/in-person exams 45% (10/7, 11/14, & 12/9)
& final exam 40% — makeups require S3 + Prof. J (no class conflicts, sports ok)
Collaboration policy: talk to anyone you want, read anything you want, but:
• Make an effort on a problem before collaborating.
• Write up your solutions independently (from “blank sheet of paper”).
• List your collaborators and external sources (not course materials).
18.06 piazza.com discussion forum: https://piazza.com/class/l7g5ixuivav3rm
Problem-set partnering: psetpartners.mit.edu
Tutoring: Math Learning Center, also TSR2
Syllabus and Calendar
Before each exam, a definitive list of topics will be announced, along with a review session.
What is 18.06 about?
High school: 3 “linear” equations (only ± and × by constants) in 3 unknowns.
Method: eliminate unknowns one at a time.
Equivalent matrix problem: Ax = b,
where A = the matrix of coefficients, x = the vector of unknowns, and b = the vector of right-hand sides.
Ax is a “linear operation:”
A(x+y) = Ax + Ay
A(3x) = 3Ax
To compute Ax, take “dot products” of rows × columns.
What is 18.06 about?
Linear system of equations, in matrix form: Ax = b
Will we learn faster methods to solve this? No. (Except if A is special.) The standard “Gaussian elimination” (and “LU factorization”) matrix methods are just a slightly more organized version of the high-school algebra elimination technique.
Will we get better at doing these calculations by hand? Maybe, but who cares? Nowadays, all important matrix calculations are done by computers.
Will we learn more about the computer algorithms? A little. But mostly the techniques for “serious” numerical linear algebra are topics for another course (e.g. 18.335).
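For example, here is a minimal sketch in Julia (the software used in this class — see below) with a made-up 3×3 system, not an example from the course notes: the backslash operator solves Ax = b, and `lu` exposes the LU factorization the computer uses under the hood.

```julia
using LinearAlgebra

A = [ 2.0  1  1
      4   -6  0
     -2    7  2 ]
b = [5.0, -2, 9]

x = A \ b        # solves Ax = b (Gaussian elimination / LU under the hood)
@show A * x ≈ b  # check: A times the computed x reproduces b

F = lu(A)        # the LU factorization itself: F.L * F.U ≈ A[F.p, :]
```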
How do we think about linear systems?
(imagine someone gives you a 10⁶×10⁶ matrix)
Don’t expect a lot of “turn the crank” problems
on psets or exams of the form
“solve this system of equations.”
… we will turn it upside-down, give you the answer and ask the question, ask about properties of the solution from partial information, give you a factorization and ask how to use it … the general goal is to require you to understand the crank rather than just turn it.
Fundamentally more abstract than 18.01–18.03!
Where do big matrices come from?
Lots of examples in many fields, but here are a couple that are relatively easy to understand…
Engineering & Scientific Modeling [ 18.303, 18.330, 6.7300, 6.7330, … ]
[ images: gmsh.info, electronics-eetimes.com, comsol.com ]
Unknown functions (fluid flow, mechanical stress, electromagnetic fields, …) approximated by values on a discrete mesh/grid,
e.g. a 100×100×100 grid = 10⁶ unknowns!
Data analysis and Machine Learning
Image processing: images are just matrices of numbers (red/green/blue intensity) [ Mathworks, wikipedia ]
Regression (curve fitting) [ computationalculture.net ]
Google “page rank” problem
(also for gene networks etc.)
Determine the “most important”
web pages just from how they link.
matrix = (# web pages) × (# web pages)
(entry = 1 if they link, 0 otherwise)
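As a toy illustration (not the actual PageRank algorithm, which adds damping and other refinements), here is a hedged Julia sketch with a made-up 4-page link matrix, finding an “importance” vector by power iteration:

```julia
# L[i, j] = 1 if page j links to page i, 0 otherwise (4 hypothetical pages)
L = [0 0 1 1
     1 0 0 0
     1 1 0 1
     0 1 1 0]

M = L ./ sum(L, dims=1)   # normalize each column to sum to 1

# power iteration: repeatedly apply M to an initial guess of equal importance
function rank_pages(M; iters = 100)
    r = fill(1 / size(M, 1), size(M, 1))
    for _ in 1:iters
        r = M * r
    end
    return r
end

@show rank_pages(M)       # approximates the dominant eigenvector of M
```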
Machine Learning
(6.3900 requires 18.06!)
Not just matrices of numbers
Poisson’s equation ∇²u = f (e.g. 18.303, 18.152):
the “A” is the linear operator ∇², the “x” is an unknown function u(x,y,z), and the “b” is the right-hand side f(x,y,z).
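To see how such an operator turns into a big matrix, here is a hedged 1D sketch (made-up size, simple finite differences rather than anything from the course): discretizing −d²u/dx² = f on n grid points replaces the derivative operator with an ordinary (large, sparse) matrix.

```julia
using SparseArrays

n = 100                                   # interior grid points on [0, 1]
h = 1 / (n + 1)                           # grid spacing
# classic tridiagonal matrix approximating -d²/dx² (zero boundary values)
A = spdiagm(-1 => fill(-1.0, n - 1),
             0 => fill( 2.0, n),
             1 => fill(-1.0, n - 1)) / h^2

xgrid = (1:n) .* h
f = sin.(π .* xgrid)                      # a sample right-hand side f(x)
u = A \ f                                 # the discretized "Poisson solve"
```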
18.06 vs. 18.C06
≈ 60% overlap
(both fulfill the same course requirements and prereqs! …course 6-3 suggests 18.C06 for sufficiently advanced students)

18.06 “Linear Algebra”:
• more linear algebra (more depth in linalg., e.g. in factorizations, vector spaces, eigenstuff; …a little optimization)
• more physics (linear ODEs, PDEs, functions as “vectors”)
• some computers (depends on instructor!)

18.C06 “Linear Algebra and Optimization” (faster paced, fall only):
• more optimization (more depth in optim. algorithms, problem types, convexity, examples; less depth in matrix/vector concepts)
• more data science/ML (discrete data classification, data bias, robustness)
• computer mini-projects (some programming experience assumed!)
18.06 vs. 18.700

18.06 “applied” math:
• few proofs (deduce patterns from examples, informal arguments)
• more applications
• more concrete (but still more abstract than other 18.0x!)
• some computers (depends on instructor!)

18.700 “pure” math:
• formal proofs expected (definitions to lemmas to theorems … training in proof writing)
• more theorems
• more abstract
• only pencil-and-paper

(see also 18.701 for students with much more proof experience)
Computer software
[ image: Viral Shah ]
Lots of choices of software for linear algebra.
This semester: Julia (julialang.org), a new-ish (2012) language that scales better to real problems.
No programming required for 18.06,
just a “glorified calculator” to turn the crank.
Only on psets, not exams
Install it on your computer (preferred) or use it online — see tutorial link on github
Optional tutorial: Mon 5pm via Zoom
18.06 vs. 18.C06/18.700
furry friend vs. no furry friends?
“Cookie”
6-year-old mini labradoodle
18.06 warm-up material
We don’t want to think of matrices as just “bags of numbers” that we push around with some algorithms. We want to think algebraically and geometrically.
Many of the algebraic rules (for scalar numbers) you have used for years no longer work (for vectors and matrices)! Blindly pushing symbols around will mostly lead to nonsense.
Pro tip: always think about the shapes of matrices and vectors, and keep track of the order of operations:
— you can only add/multiply things if their shapes are compatible
— AB ≠ BA, and re-ordering is likely nonsensical for non-square objects.
Column vectors:
x ∈ ℝⁿ (“x in ℝⁿ”) = n-component column vector of real (ℝ) numbers,
or x = [x₁, x₂, …, xₙ] — note commas = column vector
basic operations on vectors:
• add x + y, subtract x – y,
• multiply by scalars αx
(lower-case Greek = scalar)
later in 18.06:
other kinds of “vectors” too!
… matrices and functions
can be “vectors”!
Dot (“inner”) products: x ⋅ y = xᵀy = x₁y₁ + x₂y₂ + ⋯ + xₙyₙ
transpose xᵀ: columns become rows,
xᵀ = [x₁ x₂ ⋯ xₙ] = row vector (spaces, not commas!)
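In Julia this notation looks like the following (a minimal sketch with made-up numbers); note how commas vs. spaces distinguish column vectors from row vectors:

```julia
using LinearAlgebra

x = [1, 2, 3]    # commas → a column vector (3-element Vector)
y = [4, 5, 6]

x'               # transpose: a 1×3 row vector
[1 2 3]          # spaces → a 1×3 row matrix directly

dot(x, y)        # dot ("inner") product = 1*4 + 2*5 + 3*6 = 32
x' * y           # the same thing, as row-times-column
```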
Linear operations on vectors
Linear operations Ax (≠ xA!): take a vector x in, give a vector Ax out, satisfying linearity (i.e. the “distributive law”):
A(αx+βy) = α(Ax) + β(Ay)
(αx+βy is a “linear combination” of x and y)
examples:
• identity operation: x → x
• scaling by α: x → αx
• “non-square” operations: e.g. 3 inputs → 2 outputs
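A quick numerical spot-check of the linearity property in Julia (A, x, y, α, β are all made up for illustration):

```julia
A = [1 2 3
     4 5 6]      # a "non-square" operation: 3 inputs → 2 outputs
x = [1.0, 0.0, 2.0]
y = [0.5, -1.0, 3.0]
α, β = 2.0, -3.0

A * (α*x + β*y) ≈ α*(A*x) + β*(A*y)   # true: matrix–vector products are linear
```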
Linear or nonlinear? Examples
• picking out a component, x → x₁: linear
• x → x₁/x₂: nonlinear
• x → 2x₁ + 1: nonlinear! (= linear + constant: “affine”)
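A small hedged sketch in Julia of why “linear + constant” fails the linearity test (the functions and inputs are made up):

```julia
f(x) = 2x[1] + 1    # affine: linear + constant
g(x) = 2x[1]        # linear

x, y = [1.0, 2.0], [3.0, 4.0]
f(x + y) == f(x) + f(y)   # false — the constant term breaks additivity
g(x + y) == g(x) + g(y)   # true
```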
Matrices: Representing linear ops
“rows times columns” rule is just a way of writing down a linear operation
… not always the most natural way!
• identity matrix I: 1’s on the diagonal (blanks = zeros), so Ix = x
• scaling by α: the matrix αI
• “non-square” operations = rectangular matrices: 3 inputs, 2 outputs = a 2×3 matrix
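In Julia, these matrices look like the following (a sketch with made-up sizes and entries):

```julia
using LinearAlgebra

Matrix(I, 3, 3)     # the 3×3 identity matrix (1's on the diagonal)

α = 2.5
α * I               # the scaling operator αI (size-agnostic in Julia)

A = [1 0 -1
     2 3  0]        # a 2×3 matrix: 3 inputs, 2 outputs
A * [1, 1, 1]       # 3-component input ↦ 2-component output
```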
Matrices: Representing linear ops
aᵢⱼ = row i, column j = output i, input j ( …or sometimes I’ll write Aᵢⱼ )
output i (yᵢ) = “dot product” of row i of A with the column vector x:
yᵢ = aᵢ₁x₁ + aᵢ₂x₂ + ⋯ + aᵢₙxₙ
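A quick Julia check of the “row i of A dotted with x” rule (A and x are made up):

```julia
using LinearAlgebra

A = [1 2 3
     4 5 6]
x = [1.0, -1.0, 2.0]

y = A * x
y ≈ [dot(A[i, :], x) for i in 1:size(A, 1)]   # each yᵢ = (row i of A) ⋅ x
```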
Matrix multiplication = Composition
A(Bx) = take input x, do operation B, then do operation A
= (AB)x where AB = operation that does B, then A
≠ (BA)x or B(Ax) in general … matrix multiplication is associative but not commutative
Matrix-multiplication “rows times columns” rule for AB is defined so that associativity works, but always remember that “AB” means “do B, then A” when acting on a vector
We will explore several ways to think about matrix multiplication AB besides mechanical “rows times columns”…
Our goal is not just to mindlessly “turn the crank.”
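For instance, a Julia sketch (the matrices are made up) showing that matrix multiplication is associative but not commutative:

```julia
A = [0 1
     1 0]     # swaps the two components
B = [1 2
     3 4]
x = [1.0, -1.0]

A * (B * x) ≈ (A * B) * x   # true: associativity, so "ABx" is unambiguous
A * B == B * A              # false: the order of operations matters
```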
Matrix-multiplication example
A = a “1×2 matrix” (really, a row vector), B = a 2×3 matrix:
AB = (1×2)·(2×3) = a 1×3 matrix — here, the sum of the 2 rows of B, a matrix that does x₃ – x₁
… whereas BA doesn’t even make sense! A has 1 output, but B has 3 inputs!
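The exact entries aren’t shown here, but one concrete choice with these properties (an assumption, for illustration only) is A = [1 1] and a 2×3 difference matrix B:

```julia
A = [1 1]                   # "1×2 matrix" (really, a row vector)
B = [-1  1  0
      0 -1  1]              # hypothetical 2×3 matrix

A * B                       # = [-1 0 1], the sum of the 2 rows of B
(A * B) * [5.0, 7.0, 9.0]   # = [4.0], i.e. x₃ - x₁ = 9 - 5
# B * A                     # error: dimension mismatch (A has 1 output, B has 3 inputs)
```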
Adding & scaling matrices
What does “A + B” mean for two m × n matrices?
— it means adding the results: (A+B)x = Ax + Bx
— but adding the entries of A and B gives the same thing: (A+B)ᵢⱼ = aᵢⱼ + bᵢⱼ
What does “αA” mean for a matrix and a scalar?
— it means scaling the results: (αA)x = α(Ax)
— but scaling the entries of A gives the same thing: (αA)ᵢⱼ = αaᵢⱼ
… so you can add/subtract matrices
& multiply by scalars … sort of like vectors…
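A final numerical spot-check in Julia (A, B, x, α are made up):

```julia
A = [1 2
     3 4]
B = [0 -1
     5  2]
x = [1.0, 2.0]
α = 3.0

(A + B) * x ≈ A * x + B * x   # true: adding entries = adding the results
(α * A) * x ≈ α * (A * x)     # true: scaling entries = scaling the results
```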