We have learned about linear algebra in a concrete way: solving linear equations, matrix operations, and so on. In this chapter, we will study the abstract concept of linear spaces, which is a generalization of vectors and matrices. First of all, you may ask: why do we need to learn such an abstract concept? Consider the following example. We have all met vectors at least in high school, as displacement, velocity, force, and so on. When we say "vector", our first impression is that it must have two or three components, representing a point in the plane or in space. But after learning linear algebra, we know that something like $(x_1, x_2, \dots, x_n)$ with $n > 3$ is also a vector, even though we cannot picture what it looks like in the real world.
So the change from two or three components to an arbitrary number of components is a generalization, or an abstraction, of the original concept: we can handle more with the new definition, at the cost of losing some physical meaning. In this lecture, we will repeat this process to abstract the concept of a vector further, so that we can identify the shared, general properties behind all these examples.
Let us recall the process of abstraction from $\mathbb{R}^2$ and $\mathbb{R}^3$ to $\mathbb{R}^n$. Here I will give a more rigorous definition, as a first step toward getting you familiar with algebraic structures.
Recall the structure $(1, 2, 3)$. Is this a vector? No, definitely not. It is just a tuple: an ordered, finite sequence (or list) of elements. (By the way, you can construct tuples out of ordered pairs.) We usually treat it as a vector by default because we can define addition and scalar multiplication on it in a very natural way. But with the structure of a tuple alone, we cannot do any operation on it, unless we define one in advance. For example, if you try to run `(1, 2, 3) + (1, 2, 3)` in Python, it will return `(1, 2, 3, 1, 2, 3)`, while if you write $(1, 2, 3) + (1, 2, 3)$ on paper, the reader will assume it means $(2, 4, 6)$. That is because in most programming languages the `+` operator is overloaded to concatenate two tuples, while in mathematics, especially in Euclidean space, `+` is defined to add the elements one by one. So it is important to declare, or at least to know, what a symbol means before using it; otherwise you will get confused. The tuple itself is not a vector, but once we define a vector structure on it, it becomes one.
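The contrast is easy to see in code. Here is a minimal sketch; the NumPy array is just one conventional way to get the element-wise addition we mean on paper:

```python
import numpy as np

t = (1, 2, 3)
print(t + t)         # (1, 2, 3, 1, 2, 3): tuple "+" means concatenation

v = np.array([1, 2, 3])
print(v + v)         # [2 4 6]: array "+" adds element by element
```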
Application of Vectors
Now let's return to the topic of vectors. We know that a force can be decomposed into two orthogonal components, and we use this to analyze mechanics problems. Why can we do that? Because force is a vector. But similarly, a sine signal $A\sin(\omega t + \varphi)$
can be decomposed into two components $A\cos\varphi \sin(\omega t)$ and $A\sin\varphi \cos(\omega t)$, and we usually use a constellation diagram to represent this decomposition. Have you ever noticed the similarity between the two cases? Our goal today is to come up with one structure that handles all such similar cases.
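Here is a quick numerical check of this decomposition (a sketch; the amplitude $A$, frequency $\omega$, and phase $\varphi$ values below are arbitrary choices):

```python
import numpy as np

A, omega, phi = 2.0, 3.0, 0.7
t = np.linspace(0.0, 2.0 * np.pi, 1000)

signal = A * np.sin(omega * t + phi)
in_phase = A * np.cos(phi) * np.sin(omega * t)    # sin(wt) component
quadrature = A * np.sin(phi) * np.cos(omega * t)  # cos(wt) component

print(np.allclose(signal, in_phase + quadrature))  # True
```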
From these two examples, looking like a list is neither a sufficient condition for being a vector (addition on lists isn't necessarily vector addition), nor a necessary condition (even if something doesn't look like a list, such as a function or a polynomial, it can still behave exactly like a vector). So, finally: what is a vector? How should we define what a vector is?
Here we will first introduce the concept of a field, one of the basic algebraic structures that we will use to define linear spaces.
Introduction to Fields
Before we start talking about vectors themselves, let's begin from a more basic concept. Our story starts with a set, the fundamental building block of mathematics. Suppose there is a set $S$. On its own it is as trivial (boring) as the tuple we mentioned above: with a set alone we can seldom do anything.
So we want to define a binary operation on the set $S$, which is a function that maps two elements of the set to the set itself.
A binary operation on a set $S$ is a function $*: S \times S \to S$.
Here are some examples of binary operations:
- $+$ defined on $\mathbb{N}$, $(a, b) \mapsto a + b$ (is $-$ a binary operation on $\mathbb{N}$?)
- $\cup$ and $\cap$ defined on the subsets of a fixed set are also binary operations (is the set difference $\setminus$ a binary operation?)
- $\times$ defined on $\mathbb{R}$, $(a, b) \mapsto ab$ (is $\div$ a binary operation on $\mathbb{R}$?)
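To make "maps back into the set" concrete, here is a small sketch (assuming the natural numbers are taken as the non-negative integers): addition keeps us inside $\mathbb{N}$, while subtraction can leave it, so subtraction is not a binary operation on $\mathbb{N}$.

```python
def in_N(x):
    """Membership test for N, taken here as the non-negative integers."""
    return isinstance(x, int) and x >= 0

a, b = 1, 2
print(a + b, in_N(a + b))   # 3 True   -> "+" stays inside N
print(a - b, in_N(a - b))   # -1 False -> "-" leaves N
```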
Then we can define¹ a field: a set with two binary operations, addition and multiplication, that satisfy certain properties. Note that "addition" and "multiplication" are just names; they do not necessarily mean the same thing as the addition and multiplication of real numbers. The important thing is that these operations satisfy the following properties.
A field is a set $F$ with two binary operations, $+$ and $\cdot$, such that:
- Associativity of addition
  - $(a + b) + c = a + (b + c)$ for all $a, b, c \in F$
- Commutativity of addition
  - $a + b = b + a$ for all $a, b \in F$
- Additive identity element²
  - There exists $0 \in F$ such that $a + 0 = a$ for all $a \in F$
- Inverse element for addition
  - For every $a \in F$ there exists $-a \in F$ such that $a + (-a) = 0$
- Associativity of multiplication
  - $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in F$
- Commutativity of multiplication
  - $a \cdot b = b \cdot a$ for all $a, b \in F$
- Multiplicative identity element
  - There exists $1 \in F$, with $1 \neq 0$, such that $a \cdot 1 = a$ for all $a \in F$
- Inverse element for multiplication
  - For every $a \in F$ with $a \neq 0$ there exists $a^{-1} \in F$ such that $a \cdot a^{-1} = 1$
- Distributivity of multiplication over addition
  - $a \cdot (b + c) = a \cdot b + a \cdot c$ for all $a, b, c \in F$
Actually, we call the structure that satisfies the first four properties above an abelian group. So, once we have defined the abelian group, a simpler definition can be given: a field is a set $F$ such that $(F, +)$ is an abelian group (with identity $0$), $(F \setminus \{0\}, \cdot)$ is an abelian group (with identity $1$), and multiplication distributes over addition.
The most common example of a field is the rational numbers; you can easily verify that they satisfy all the properties of a field. The real numbers are also a field, and so are the complex numbers.
- $\mathbb{Q}$ (rational numbers)
- $\mathbb{R}$ (real numbers)
- $\mathbb{C}$ (complex numbers)
Meanwhile, the integers $\mathbb{Z}$ are not a field, because not every non-zero element has a multiplicative inverse (for example, there is no integer $x$ with $2x = 1$). The natural numbers $\mathbb{N}$ are not even a group [Task: search for the definition of groups, and tell why.].
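A small sanity check of the multiplicative-inverse axiom, sketched with Python's `fractions` module:

```python
from fractions import Fraction

a = Fraction(2)
print(a * Fraction(1, 2) == 1)    # True: in Q, the inverse of 2 is 1/2

# In Z there is no such element: no integer x satisfies 2 * x == 1.
print(any(2 * x == 1 for x in range(-1000, 1000)))    # False
```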
Vector Space
Now we're ready to define what a vector space is.
A vector space over a field $F$ consists of a set $V$ (whose elements are called vectors) and a field $F$ (whose elements are called scalars), together with two operations:
- Vector addition $+: V \times V \to V$
- Scalar multiplication³ $\cdot: F \times V \to V$
These operations must satisfy the following axioms. For all vectors $u, v, w \in V$ and scalars $a, b \in F$:
- Associativity: $u + (v + w) = (u + v) + w$
- Commutativity: $u + v = v + u$
- Identity element: There exists $0 \in V$ such that $v + 0 = v$ for all $v \in V$
- Inverse element: For every $v \in V$, there exists $-v \in V$ such that $v + (-v) = 0$
- Associativity: $a(bv) = (ab)v$
- Identity: $1v = v$⁴
- Distributivity over vector addition: $a(u + v) = au + av$
- Distributivity over scalar addition: $(a + b)v = av + bv$
⁵ Note that these eight axioms completely characterize what we mean by a vector space. If a set $V$ with these operations satisfies the axioms over some field $F$, then we call it a vector space over $F$.
Now we can finally answer the question: what is a vector? The answer is simple but abstract: a vector is an element of a vector space. And a vector space is defined by the eight axioms above.
Now let's look at some concrete examples to see how this abstract definition applies to familiar and not-so-familiar cases.
The most familiar example is $\mathbb{R}^n$ over the field $\mathbb{R}$. Here:
- Vector addition: $(x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n)$
- Scalar multiplication: $a(x_1, \dots, x_n) = (ax_1, \dots, ax_n)$
- Additive identity: $0 = (0, \dots, 0)$
- Additive inverse: $-(x_1, \dots, x_n) = (-x_1, \dots, -x_n)$
You can verify that all eight axioms are satisfied.
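You can also let the computer spot-check a few of the axioms on random vectors; this is a sketch, not a proof (floating-point results are compared with a tolerance):

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 4))   # three random vectors in R^4
a, b = rng.standard_normal(2)           # two random scalars

print(np.allclose(u + (v + w), (u + v) + w))     # associativity
print(np.allclose(u + v, v + u))                 # commutativity
print(np.allclose(a * (u + v), a * u + a * v))   # distributivity (vector +)
print(np.allclose((a + b) * v, a * v + b * v))   # distributivity (scalar +)
```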
Let $P_n(\mathbb{R})$ be the set of all polynomials of degree at most $n$ with real coefficients: $P_n(\mathbb{R}) = \{a_0 + a_1 x + \cdots + a_n x^n : a_0, \dots, a_n \in \mathbb{R}\}$.
Here:
- Vector addition: $(p + q)(x) = p(x) + q(x)$
- Scalar multiplication: $(ap)(x) = a \cdot p(x)$
- Additive identity: the zero polynomial $0$
- Additive inverse: $(-p)(x) = -p(x)$
Notice how polynomials, which don't "look like" vectors in the geometric sense, still satisfy all the vector space axioms!
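One way to see this concretely: store a polynomial as its list of coefficients, and the two operations become component-wise, exactly as in $\mathbb{R}^{n+1}$ (a sketch, assuming both lists are padded to the same length):

```python
# p(x) = 1 + 2x + 3x^2 and q(x) = 4 - x, as coefficient lists
p = [1, 2, 3]
q = [4, -1, 0]    # padded with a trailing 0

p_plus_q = [pi + qi for pi, qi in zip(p, q)]   # 5 + x + 3x^2
five_p = [5 * pi for pi in p]                  # 5 + 10x + 15x^2
print(p_plus_q, five_p)                        # [5, 1, 3] [5, 10, 15]
```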
Let $\mathbb{R}^S$ be the set of all functions mapping a nonempty set $S$ to $\mathbb{R}$. We define
- Addition: $(f + g)(x) = f(x) + g(x)$
- Scalar multiplication: $(af)(x) = a \cdot f(x)$
- Additive identity: the constant function $0(x) = 0$ for all $x \in S$
- Additive inverse: $(-f)(x) = -f(x)$
Then $\mathbb{R}^S$ is a vector space over $\mathbb{R}$.
So functions, too, form a vector space over $\mathbb{R}$. Functions can be thought of as "vectors" in a very abstract sense, where the "components" are the function values at each point in the domain. And we can restrict this function vector space to a subset of functions while still satisfying the vector space axioms.
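In code, pointwise addition and scaling of functions mirror the definitions above. A minimal sketch (the helper names `add` and `scale` are just for illustration):

```python
import math

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(a, f):
    """Pointwise scaling: (a f)(x) = a * f(x)."""
    return lambda x: a * f(x)

h = add(math.sin, scale(2.0, math.cos))   # h(x) = sin(x) + 2 cos(x)
print(h(0.0))                             # 2.0
```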
Let $C[a, b]$ be the set of all continuous functions defined on an interval $[a, b]$. The structure is defined in basically the same way as in the previous example, with the additional requirement that the functions are continuous on $[a, b]$.
By the properties of continuous functions, the sum of two continuous functions is continuous, and a scalar multiple of a continuous function is continuous. So this set also forms a vector space over $\mathbb{R}$.
Furthermore, let $C^1[a, b]$ be the set of all continuously differentiable functions⁶. The structure is defined similarly, with the additional requirement that the functions have continuous first derivatives. Similarly, we can define $C^k[a, b]$ as the set of all functions with continuous derivatives up to order $k$.
They are all vector spaces over $\mathbb{R}$; the proofs are left to the reader as exercises.
This is the example we started with!
The set of all solutions to the differential equation
$$y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0$$
is a vector space over $\mathbb{R}$, where $a_0, \dots, a_{n-1}$ are known coefficients. Addition and scalar multiplication are defined the same way as for functions.
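As a concrete instance, take $y'' + y = 0$, which has $\sin t$ and $\cos t$ among its solutions. A sketch with SymPy confirms that every linear combination of the two is again a solution:

```python
import sympy as sp

t, a, b = sp.symbols("t a b")
y1, y2 = sp.sin(t), sp.cos(t)   # two known solutions of y'' + y = 0
y = a * y1 + b * y2             # an arbitrary linear combination

print(sp.simplify(sp.diff(y, t, 2) + y))   # 0: y is also a solution
```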
Some useful properties of a vector space can be derived directly from the definition (the axioms), and these properties should be proved before we use them. They look obviously true, but they are quite tricky to prove.
(Every vector has a unique additive inverse.) Suppose $V$ is a vector space. Let $v \in V$. Suppose $w$ and $w'$ are both additive inverses of $v$. Then
$$w = w + 0 = w + (v + w') = (w + v) + w' = 0 + w' = w'.$$
∎
By the uniqueness of the additive inverse, the notation $-v$ is well-defined as the unique additive inverse of $v$, and we can define the subtraction operation as $u - v := u + (-v)$.
(For every $v \in V$, $0v = 0$.) Let $v \in V$. By the distributivity of scalar multiplication, we have
$$0v = (0 + 0)v = 0v + 0v.$$
Suppose $w$ is the additive inverse of $0v$, so that $0v + w = 0$. Then we have
$$0 = 0v + w = (0v + 0v) + w = 0v + (0v + w) = 0v + 0 = 0v.$$
∎
(For every $a \in F$, $a \cdot 0 = 0$.) Let $a \in F$ and let $0 \in V$ be the additive identity. Then
$$a0 = a(0 + 0) = a0 + a0.$$
By the definition of the additive identity, we have $a0 + 0 = a0$. Therefore, adding the additive inverse of $a0$ to both sides of $a0 = a0 + a0$ gives $0 = a0$.
∎
(For every $v \in V$, $(-1)v = -v$.) Compute
$$v + (-1)v = 1v + (-1)v = (1 + (-1))v = 0v = 0.$$
This equation says that $(-1)v$, when added to $v$, gives $0$. Thus $(-1)v$ is the additive inverse of $v$, as desired. ∎
Subspaces
Now that we understand vector spaces, let's talk about subspaces. A subspace is essentially a "vector space within a vector space."
A subset $U \subseteq V$, over the same field $F$, is a subspace of $V$ if and only if:
- $U$ contains the additive identity: $0 \in U$
- $U$ is closed under vector addition: if $u, v \in U$, then $u + v \in U$
- $U$ is closed under scalar multiplication: if $a \in F$ and $u \in U$, then $au \in U$

Note that if these three conditions are satisfied, then $U$ automatically inherits all the vector space axioms from $V$, so $U$ is itself a vector space. Meanwhile, if $U$ is a subset of $V$ that does not satisfy these conditions, it is not a subspace.
In $\mathbb{R}^2$, any line passing through the origin forms a subspace. For instance, fix $k \in \mathbb{R}$ and let $L = \{(x, kx) : x \in \mathbb{R}\}$.
You can verify:
- $(0, 0) \in L$
- If $(x_1, kx_1), (x_2, kx_2) \in L$, then $(x_1, kx_1) + (x_2, kx_2) = (x_1 + x_2, k(x_1 + x_2)) \in L$
- If $(x, kx) \in L$ and $a \in \mathbb{R}$, then $a(x, kx) = (ax, k(ax)) \in L$
Similarly, any plane or line through the origin in $\mathbb{R}^3$ is a subspace. But note that lines and planes that do not pass through the origin are not subspaces.
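A tiny numerical illustration (a sketch using the hypothetical slope $k = 2$): points on the line $y = 2x$ stay on it under addition, while a line shifted off the origin already fails to contain the zero vector.

```python
def on_line(p, k=2.0, c=0.0):
    """Is the point p = (x, y) on the line y = k*x + c?"""
    x, y = p
    return y == k * x + c

u, v = (1.0, 2.0), (3.0, 6.0)        # both on y = 2x
s = (u[0] + v[0], u[1] + v[1])       # their sum (4.0, 8.0)
print(on_line(s))                    # True: closed under addition

print(on_line((0.0, 0.0), c=1.0))    # False: y = 2x + 1 misses the origin
```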
In the vector space of functions $\mathbb{R}^{\mathbb{R}}$, the set of even (respectively, odd) functions forms a subspace.
- It contains the zero function.
- If $f$ and $g$ are even/odd, then $f + g$ is also even/odd.
- If $f$ is even/odd and $a \in \mathbb{R}$, then $af$ is also even/odd.
The set of all continuous functions on an interval $[a, b]$ forms a subspace of the vector space of all functions $[a, b] \to \mathbb{R}$.
- The constant zero function is in $C[a, b]$.
- If $f, g \in C[a, b]$, then $f + g$ is also continuous, so $f + g \in C[a, b]$.
- If $f \in C[a, b]$ and $c \in \mathbb{R}$, then $cf$ is also continuous, so $cf \in C[a, b]$.
The set of all solutions to a linear homogeneous differential equation, as in the example above, forms a subspace of the vector space of functions.
- The zero function is a solution (the trivial solution).
- If $f$ and $g$ are solutions, then $f + g$ is also a solution.
- If $f$ is a solution and $c \in \mathbb{R}$, then $cf$ is also a solution.
Sum of Subspaces
Now let's talk about the sum of two subspaces. Given two subspaces $U$ and $W$ of a vector space $V$, their sum, denoted $U + W$, is defined as:
The sum of two subspaces $U$ and $W$ of a vector space $V$ is the set: $U + W = \{u + w : u \in U,\ w \in W\}$
But note that $U + W$ is not the same as $U \cup W$. The sum is itself a vector space and contains all possible sums of vectors from $U$ and $W$, whereas the union merely combines the elements of the two subspaces, so it need not be a vector space.
Suppose $U$ is the subspace of all vectors of the form $(x, 0)$ in $\mathbb{R}^2$, and $W$ is the subspace of all vectors of the form $(0, y)$. Then:
$U + W = \mathbb{R}^2$ forms the whole plane, while $U \cup W$ is just the union of the $x$-axis and the $y$-axis, which is not a vector space.
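The failure of the union is easy to exhibit in code (a minimal sketch): adding a vector on the $x$-axis to a vector on the $y$-axis produces a vector on neither axis.

```python
def in_union(p):
    """Is p = (x, y) on the x-axis or on the y-axis?"""
    x, y = p
    return x == 0 or y == 0

u, w = (1, 0), (0, 1)              # u on the x-axis, w on the y-axis
s = (u[0] + w[0], u[1] + w[1])     # s = (1, 1)
print(in_union(u), in_union(w))    # True True
print(in_union(s))                 # False: the union is not closed
```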
More generally, for subspaces $U_1, \dots, U_m$ of $V$, the reader can verify that the sum $U_1 + \cdots + U_m$ contains the additive identity and is closed under addition and scalar multiplication. Thus it is a subspace of $V$.
The subspaces $U_1, \dots, U_m$ are all contained in $U_1 + \cdots + U_m$ (to see this, consider sums $u_1 + \cdots + u_m$ where all except one of the $u_j$'s are $0$). Conversely, every subspace of $V$ containing $U_1, \dots, U_m$ contains $U_1 + \cdots + U_m$ (because subspaces must contain all finite sums of their elements). Thus $U_1 + \cdots + U_m$ is the smallest subspace of $V$ containing $U_1, \dots, U_m$. ∎