Notes for Analysis for Applications I
Vector Spaces
Vector spaces developed gradually over a very long period of time,
with roots in both pure mathematics and physics. Their axiomatic
formulation is attributed to the mathematician Peano, in 1888. The
word vector comes directly from Latin; it means "carrier." In
astronomy in the 1600's, the line joining a planet to the sun was
thought of as a kind of string that "carried" the planet around in
its orbit around the sun. Here is the familiar, modern definition:
Definition. A vector space is a set V together with
two operations, + and × . If u, v are in V, then u + v is in V;
if c is a scalar (a real or complex number for us), then c×v is
in V. The operations satisfy the following rules.
Addition:
- u + (v + w) = (u + v) + w
- Identity: u + 0 = 0 + u = u
- Inverse: u + (-u) = (-u) + u = 0
- u + v = v + u
Multiplication by a scalar:
- a(b×u) = (ab)×u
- (a + b)×u = a×u + b×u
- a×(u + v) = a×u + a×v
- 1×u = u
The mathematician Peter Lax calls the idea of a vector space "a
bare-bones concept," and, when we look at the definition, it's easy to
see why. All that's required is a set, scalars, two operations, and a
few axioms involving them. It is striking that there are many objects
that satisfy these axioms, and that the "bare-bones" vector-space
notion can yield a great deal of information concerning these objects.
The vector spaces that we will deal with here include the familiar
finite dimensional spaces $\mathbb R^n$, $\mathbb C^n$, the space of
$k$ times continuously differentiable functions $C^k$, polynomials of
degree $n$ or less $P_n$, various spaces of Lebesgue integrable functions
$L^p$, sequence spaces $\ell^p$, and other spaces that we will introduce
later. At this point, it's a good idea to review some standard definitions
and theorems that come up in linear algebra.
Subspaces
- Definition. A subset U of V is
a subspace if, under + and × from V, U is a vector
space in its own right.
- Theorem. U is a subspace of V if and only if these hold:
- 0 is in U.
- U is closed under + .
- U is closed under × .
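For example, $U = \{p \in P_n : p(0) = 0\}$ is a subspace of $P_n$: the zero polynomial vanishes at $0$; if $p(0) = q(0) = 0$, then $(p+q)(0) = 0$; and $(c\,p)(0) = c\,p(0) = 0$. Thus $U$ contains $0$ and is closed under both operations.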
Span
- Definition. Let $S=\{v_1,\dots,v_n\}$ be a subset of a vector space V. The span of S
is the set of all linear combinations of vectors in S. That is,
$\text{span}(S) := \{c_1v_1 + \cdots + c_n v_n \}$, where the $c$'s are
arbitrary scalars and the $v$'s are vectors from $S$.
- Proposition. The set span(S) is a subspace of V.
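For instance, in $\mathbb R^3$ the span of $S = \{(1,0,0)^T,\ (0,1,0)^T\}$ is the set of all vectors of the form $(c_1,\,c_2,\,0)^T$, i.e., the $xy$-plane, which is indeed a subspace of $\mathbb R^3$.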
Bases and dimension
- Definition. We say that a set of vectors $S = \{v_1, v_2, \dots, v_n\}$
is linearly independent if and only if the equation
$c_1v_1 + c_2v_2 + \cdots + c_nv_n = 0$
has only $c_1 = c_2 = \cdots = c_n = 0$ as a
solution. If it has solutions different from this one, then the set S
is said to be linearly dependent.
- Definition. A subset $B = \{v_1,\dots,v_n\}$ of V is a basis for V if B spans V and is linearly
independent. Equivalently, B is a basis if it is maximally
linearly independent; that is, B is not a proper subset of some
other linearly independent set. Unless we specifically state
otherwise, we will assume that B is ordered.
- Theorem. Every basis for V has the same number of vectors
as every other basis. This common number is defined to be the
dimension of V, dim(V).
Two remarks. If a vector space has arbitrarily large sets of linearly
independent vectors, it is infinite dimensional. The subspace
containing only $0$ has no linearly independent vectors and is
assigned $0$ as its dimension.
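For example, the monomials $\{1, x, \dots, x^n\}$ form a basis for $P_n$: they clearly span $P_n$, and a linear combination $c_0 + c_1x + \cdots + c_nx^n$ that vanishes identically must have all $c_j = 0$, since a nonzero polynomial has only finitely many roots. Hence $\dim(P_n) = n+1$. By contrast, $C^k$ contains the linearly independent set $\{1, x, \dots, x^m\}$ for every $m$, so $C^k$ is infinite dimensional.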
Linear transformations and isomorphisms
- Definition. A mapping $L$ from a vector space $V$ to a vector
space $W$ that preserves addition and scalar multiplication is called
a linear transformation. That is, $L:V\to W$ is linear if and
only if it satisfies $L(c_1v_1+\cdots +c_nv_n) =c_1L(v_1)+\cdots
+c_nL(v_n)$.
- Definition. A linear transformation that is one-to-one and onto (bijective) is said to be
an isomorphism.
- Proposition. Under appropriate conditions, linear combinations,
compositions, and inverses of linear transformations are linear.
The word "appropriate" means that, for example, the composition
$K\circ L$ has to be defined. Thus the image of $L$ must be contained
in the domain of $K$. Similar conditions need to be placed on the
other operations mentioned in the proposition.
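A standard example is differentiation: on an interval, the map $D: C^1 \to C^0$ defined by $D(f) = f'$ is linear, since $(c_1f_1 + c_2f_2)' = c_1f_1' + c_2f_2'$. It is onto (every continuous function has an antiderivative), but it is not one-to-one (all constants map to $0$), so $D$ is not an isomorphism.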
Coordinate vectors
- Coordinates are used to associate a point in some space with a
set of real or complex numbers; think of polar or Cartesian
coordinates in 2D. The association is at least locally one-to-one and
onto. For finite dimensional vector spaces, we not only want the
association to be one-to-one and onto, but we also want it to preserve
vector addition and multiplication by scalars. In other words, we want
an isomorphism between a vector space $V$ and $\mathbb R^n$ or
$\mathbb C^n$. The properties defining a basis are exactly the ones
needed to define "good" coordinates for a finite dimensional vector
space. We start by recalling the following important theorem:
- Theorem. Given an ordered basis $B = \{v_1,\dots,v_n\}$ and a vector $v$, we can uniquely write the vector as
$v = x_1v_1 + \cdots + x_nv_n$, and thus
represent it by the column vector $[v]_B = [x_1, \dots, x_n]^T$.
- As a consequence of this theorem, we can define a bijective map
$\Phi$ from an $n$-dimensional vector space $V$ to $n\times 1$ columns of
scalars. All we need is an ordered basis for $V$. Then, using the
previous result, we define the map
\[
\Phi[v] := [v]_B = \left(\begin{array}{c} x_1 \\ \vdots \\
x_n\end{array}\right).
\]
The inverse is given by $\Phi^{-1}[x_1 \ \cdots
\ x_n]^T = x_1v_1+\cdots +x_nv_n$. The map $\Phi$ is easily shown to
be linear, so $\Phi$ is an isomorphism between $V$ and $n\times 1$
columns of scalars. We will call $\Phi$ a coordinate map and
the column vector $[v]_B$ a coordinate vector.
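For example, take $V = P_2$ with the ordered basis $B = \{1, x, x^2\}$. The polynomial $p(x) = 2 - 3x + x^2$ has the unique expansion $p = 2\cdot 1 + (-3)\cdot x + 1\cdot x^2$, so
\[
\Phi(p) = [p]_B = \left(\begin{array}{r} 2 \\ -3 \\ 1\end{array}\right).
\]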
Change of coordinates
- Often we want to change coordinates. For instance, we do this when we diagonalize a matrix. To see how to do this, we start with two ordered bases $B = \{v_1,\dots,v_n\}$ and $D = \{w_1,\dots,w_n\}$ for an n-dimensional vector space V. In addition, let $\Phi$ and $\Psi$ be coordinate maps for B and D, respectively. Finally, suppose that the coordinates of $v\in V$ relative to B and D are
\[
\Phi(v) = [v]_B = \mathbf x \quad \text{and}\quad \Psi(v) = [v]_D = \mathbf y.
\]
From this, we see that $\Phi^{-1}(\mathbf x)=v$. Since $\Psi(v)=\mathbf y$, $\Psi(\Phi^{-1}(\mathbf x)) = \Psi(v) =\mathbf y$. Thus the map $\Psi\circ \Phi^{-1}$ changes $\mathbf x$-coordinates to $\mathbf y$ coordinates.
Of course, this doesn't give us an explicit formula for changing from
one system to the other. To do that, we first let $\mathbf e_j= (0
\cdots 1 \cdots 0)^T$ be the column vector with 1 in the
jth position and zeros elsewhere. Next, write the column
vector as a linear combination of $\mathbf e_j$'s, $\mathbf x = \sum_j
x_j \mathbf e_j$, and then apply $\Psi\circ \Phi^{-1}$ to get $\mathbf
y = \sum_j x_j\Psi\circ \Phi^{-1}(\mathbf e_j)$. By the definition of
$\Phi$, we see that $\Phi^{-1}(\mathbf e_j)=0\cdot v_1+\cdots +1\cdot
v_j+\cdots +0\cdot v_n=v_j.\,$ Hence, $\mathbf y =\sum_j x_j\Psi(v_j)=
\sum_j x_j[v_j]_D=S\mathbf x$. Here $S$ is called the transition
matrix and its jth column is $[v_j]_D$, the
D-coordinate vector of $v_j$; explicitly,
\[
S= \big[ [v_1]_D\ [v_2]_D \cdots [v_n]_D\big] =
\big[\text{D-coordinates of the B-basis}\big].
\]
What's left to do is to find the entries in $S$. With a little matrix
manipulation, we have that $S\mathbf e_j=[v_j]_D=\sum_k S_{kj}\mathbf
e_k$. Applying $\Psi^{-1}$ to this equation yields
\[
v_j = \Psi^{-1}[v_j]_D = \Psi^{-1}\big(\sum_k S_{kj}\mathbf e_k\big)=
\sum_k S_{kj}\underbrace{\Psi^{-1}(\mathbf e_k)}_{w_k} = \sum_k S_{kj}w_k.
\]
One final remark. If we take components in $\mathbf y = S\mathbf x$,
we get $ y_j=\sum_k S_{jk}x_k$. In the formula for $v_j$ above, the
row and column indices of $S$ are reversed. This means that the matrix
relating the two bases is the transpose of the transition
matrix, which relates the two sets of coordinates.
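As a concrete illustration, let $V = \mathbb R^2$, let B be the standard basis $\{\mathbf e_1, \mathbf e_2\}$, and let $D = \{w_1, w_2\}$, where $w_1 = (1,\,1)^T$ and $w_2 = (1,\,-1)^T$. Since $\mathbf e_1 = \frac12 w_1 + \frac12 w_2$ and $\mathbf e_2 = \frac12 w_1 - \frac12 w_2$, the D-coordinates of the B-basis give
\[
S = \big[[\mathbf e_1]_D\ [\mathbf e_2]_D\big] = \frac12\left(\begin{array}{rr} 1 & 1 \\ 1 & -1 \end{array}\right).
\]
For $v = (3,\,1)^T$ we have $\mathbf x = (3,\,1)^T$ and $\mathbf y = S\mathbf x = (2,\,1)^T$; as a check, $2w_1 + 1\cdot w_2 = (3,\,1)^T = v$.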
Matrix representations of linear transformations
- Let $L:V\to W$ be linear, and suppose that the dimension of $V$ is $n$
and that of $W$ is $m$. Furthermore, we will take $B = \{v_1,\dots,v_n\}$ and $D = \{w_1,\dots,w_m\}$ to be
bases for $V$ and $W$, respectively, and we will let $\Phi$ and $\Psi$
be the associated coordinate maps. (Keep in mind that here the
coordinate maps are for different vector spaces.) It follows that the
linear map $\Psi\circ L \circ \Phi^{-1}$ takes $n\times 1$ columns
into $m\times 1$ columns. If $w=L(v)$, $\mathbf x=\Phi(v)$, $\mathbf y
= \Psi(w)$, then $\mathbf y = \Psi\circ L \circ \Phi^{-1} (\mathbf
x)$. Next, note that $\mathbf x= \sum_k x_k\mathbf e_k$, so
\[
\mathbf y = \Psi\circ L \circ \Phi^{-1} (\mathbf x) =
\sum_k x_k \Psi\circ L \circ \Phi^{-1} (\mathbf e_k) = \sum_k x_k
\Psi(L(v_k))= \sum_k [L(v_k)]_D x_k = A_L \mathbf x,
\]
where $A_L =\big[[L(v_1)]_D \ [L(v_2)]_D \cdots
[L(v_n)]_D\big]=\big[\text{ D-coordinates of }
L(\text{B-basis})\big]$. To summarize, we have constructed a unique
matrix $A_L$ such that $w=L(v)$ if and only if $\mathbf y=A_L\mathbf
x$. Thus $A_L$ is the matrix representation of $L$ relative to the
bases involved.
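For a simple example, take $L = d/dx$ as a map from $P_2$ to $P_1$, with bases $B = \{1, x, x^2\}$ and $D = \{1, x\}$. Then $L(1) = 0$, $L(x) = 1$, and $L(x^2) = 2x$, so the columns $[L(v_k)]_D$ are $(0,\,0)^T$, $(1,\,0)^T$, and $(0,\,2)^T$, and
\[
A_L = \left(\begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 2 \end{array}\right).
\]
If $p = a + bx + cx^2$, then $\mathbf x = (a,\,b,\,c)^T$ and $A_L\mathbf x = (b,\,2c)^T$, which are exactly the D-coordinates of $L(p) = b + 2cx$.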
Dual space
- Definition. A linear functional is a linear
transformation $\varphi:V\to $ scalars.
- Definition. The set $V^\ast$ of all linear functionals is
called the (algebraic) dual of V.
- Proposition. $V^\ast$ is a vector space under the
operations of addition of functions and multiplication of a function
by a scalar.
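Familiar examples of linear functionals: on $\mathbb R^n$, $\varphi(\mathbf x) = a_1x_1 + \cdots + a_nx_n$ for fixed scalars $a_1,\dots,a_n$; on the space of continuous functions on $[a,b]$, the evaluation map $\varphi(f) = f(x_0)$ for a fixed $x_0\in[a,b]$ and the integral $\varphi(f) = \int_a^b f(x)\,dx$.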
Next: Inner product spaces
Updated 9/1/14 (fjn).