Functions
The tools of linear algebra are extremely useful when working in
Euclidean space (e.g. \mathbb{R}^3). Wouldn’t it be great if we
could apply these tools to additional mathematical constructs, such as
functions and sequences? Hilbert space allows us to do exactly this –
apply linear algebra to functions.
Intuition – functions as infinite-dimensional vectors
There are several ways to view vectors; a standard interpretation is an
ordered list of numbers. Let’s take a vector in \mathbb{R}^3 as
an example:
\[v = \begin{bmatrix}
1.4 \\
4.2 \\
-2.14
\end{bmatrix}\]
This is a list of three numbers, where each number has an index.
v[1] is 1.4, v[2] is 4.2 and so on. Another way to think
of a vector is a function, in the strict mathematical
sense. A
vector in \mathbb{R}^3 is a function with the domain
\{1,2,3\} (the indices) and codomain \mathbb{R}, or:
\[v:\{1,2,3\}\to\mathbb{R}\]
Now imagine that our vector is N-dimensional: \mathbb{R}^N.
Using the function notation we can write
v:\{1,2,\cdots ,N\}\to\mathbb{R}. This works for any N, and in
fact it also works for an infinite N. Our vector then simply becomes a
function from the natural numbers to the reals:
v:\mathbb{N}\to\mathbb{R}.
But we can take it even further; what if we allow any real number as an
index? Our vector is then v:\mathbb{R}\to\mathbb{R}, or we may
just change its name to be more familiar:
f:\mathbb{R}\to\mathbb{R}. This “vector” is just a function from
the reals to the reals.
While we
can’t write all the elements down explicitly (there’s an infinite number
of them, and most of the indices are irrational numbers, which don’t even
have a finite decimal expansion), we can instead come up with a rule that maps an
index to the element. For example: f(x)=x^2 is such a rule. For
any given index x, it assigns the value x^2. We’re not used to
thinking of functions as vectors, but if we carefully extend some
definitions, it’s entirely possible!
So, functions can be seen as vectors with infinite dimensions. The next
step is to see how we can define a vector space for functions.
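To make the progression above concrete, here is a minimal Python sketch of the same idea: a finite vector as a lookup from indices, a sequence as a rule on the natural numbers, and an ordinary function as a rule on the reals. The names `v_as_function` and `seq`, and the particular sequence 1/(n+1), are my own illustrative choices.

```python
import math

# A vector in R^3, viewed as a function from the index set {1, 2, 3} to the reals.
v = {1: 1.4, 2: 4.2, 3: -2.14}

def v_as_function(i: int) -> float:
    """Return the element of v at index i."""
    return v[i]

# A vector indexed by the natural numbers: a rule replaces the explicit list.
# (The specific rule 1/(n+1) is just an example.)
def seq(n: int) -> float:
    return 1.0 / (n + 1)

# Allowing any real index gives an ordinary function from the reals to the reals.
def f(x: float) -> float:
    return x ** 2

print(v_as_function(2))       # element at index 2
print(seq(3))                 # element at index 3 of the sequence
print(f(math.sqrt(2)))        # element at an irrational index
```

In all three cases the "vector" is queried the same way: give it an index, get back a number.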
Functions form a vector space
Functions, together with the standard addition and scalar multiplication
operations, form a vector
space.
For full generality, let X be any set and
\mathbb{F}\in\{\mathbb{R},\mathbb{C}\} (either reals or complex
numbers). Let V be the set of all functions mapping
X\to\mathbb{F}. For f,g\in V and a number
a \in \mathbb{F}, we define function addition and scalar
multiplication as follows:
\[[f+g](x)=f(x)+g(x)\qquad [a\cdot f](x)=a\cdot f(x)\]
Then V along with these operations forms a vector space over
\mathbb{F}. For a proof, see Appendix A.
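The pointwise definitions translate almost literally into code. The following is a small sketch, assuming functions are represented as Python callables; the helper names `add`, `scale` and `zero` are hypothetical and chosen only for this illustration.

```python
from typing import Callable

Func = Callable[[float], complex]

def add(f: Func, g: Func) -> Func:
    """Pointwise sum: [f+g](x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(a: complex, f: Func) -> Func:
    """Pointwise scalar multiple: [a.f](x) = a * f(x)."""
    return lambda x: a * f(x)

def zero(x: float) -> complex:
    """The additive identity z(x) = 0."""
    return 0

square = lambda x: x ** 2
double = lambda x: 2 * x

# h(x) = x^2 + 3 * (2x) is again a function, i.e. another element of V.
h = add(square, scale(3, double))
print(h(2))
```

The important point is that `add` and `scale` return new functions rather than numbers, which is exactly what closure under the vector space operations means.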
Square integrable functions
A vector space is useful, but to get to Hilbert space and be able to do
more interesting operations on functions, we need some additional
structure.
From here on, we’ll switch to functions with complex values (functions
with real values are just a special case). A function
f:\mathbb{R}\to\mathbb{C} is said to be square integrable if:
\[\int_{-\infty}^{\infty}\left | f(x) \right |^2 dx < \infty\]
The set of such functions is commonly denoted L^2, and it forms
a subspace of the vector space we discussed in the previous section
(for a proof, see Appendix B).
The integral of the squared magnitude of the function is analogous to
the squared Euclidean norm of a vector; intuitively, it acts as a measure of
length, which is the term used for vectors. For functions, it’s
typically referred to as energy [1].
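A rough numerical experiment can build intuition for square integrability. The sketch below approximates the energy integral on ever wider intervals with a midpoint rule; the `energy` helper, the two example functions and the interval widths are my own assumptions, and a numerical check like this is only suggestive, not a proof of convergence.

```python
import math

def energy(f, lo: float, hi: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of |f(x)|^2 over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(abs(f(lo + (k + 0.5) * dx)) ** 2 for k in range(n)) * dx

gaussian = lambda x: math.exp(-x * x)   # decays quickly
constant = lambda x: 1.0                # never decays

# Widen the integration window and watch what the energy does.
for half_width in (10, 100, 1000):
    print(half_width,
          energy(gaussian, -half_width, half_width),
          energy(constant, -half_width, half_width))
```

The energy of the Gaussian settles near \sqrt{\pi/2} no matter how wide the window gets, while the energy of the constant function grows in proportion to the window, so only the former belongs to L^2.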
Inner product and norm
To add more tools from the linear algebra toolbox, let’s define an inner
product on L^2:
\[\langle f,g \rangle=\int_{-\infty}^{\infty}f^{*}(x)g(x) dx\]
Why is it defined in this way? Here’s the definition of inner product
between two N-dimensional vectors with complex values:
\[\langle u,v \rangle=\sum_{i=1}^{N}u_{i}^{*} v_{i}\]
Looks familiar? The function version is just the generalization of this
sum over an infinite range (the entire x-axis, if you will), using an
integral.
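To see the sum-to-integral generalization in action, here is a sketch that approximates the function inner product by a finite sum over sample points, which is just the N-dimensional formula scaled by the sample spacing. The `inner` helper, the truncation of the x-axis to [-10, 10] and the example functions are assumptions made for illustration; the truncation is only reasonable for functions that decay quickly.

```python
import cmath
import math

def inner(f, g, lo: float = -10.0, hi: float = 10.0, n: int = 20_000) -> complex:
    """Approximate <f, g> = integral of conj(f(x)) * g(x) dx with a midpoint sum."""
    dx = (hi - lo) / n
    total = 0j
    for k in range(n):
        x = lo + (k + 0.5) * dx
        # Same shape as the discrete formula: conj(u_i) * v_i, summed over i.
        total += complex(f(x)).conjugate() * g(x)
    return total * dx

f = lambda x: math.exp(-x * x)                       # even, real-valued
g = lambda x: x * math.exp(-x * x)                   # odd, real-valued
h = lambda x: cmath.exp(1j * x) * math.exp(-x * x)   # complex-valued

print(abs(inner(f, g)))                    # f and g are orthogonal by symmetry
print(inner(f, h), inner(h, f))            # conjugates of each other
```

Swapping the arguments conjugates the result, mirroring the behavior of the complex inner product on finite-dimensional vectors.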
As the next step, we want to show that L^2 is an inner product
space, when taken with the inner product operation as defined above.
For this to be true, first and foremost we have to show that the inner
product is finite for every pair of functions in L^2 (if the
integral doesn’t converge, it’s not something we can work with). This
can be done using the integral form of the Cauchy-Schwarz
inequality:
\[\left|\int_{-\infty}^{\infty}f^{*}(x)g(x)\, dx\right| \leq
\sqrt{\int_{-\infty}^{\infty}|f(x)|^2\, dx}
\sqrt{\int_{-\infty}^{\infty}|g(x)|^2\, dx}\]
Since f,g\in L^2, the right hand side is finite, and therefore
the inner product is finite as well. This is where the square
integrability of functions in L^2 comes into play – without
being square integrable, the inner product would be impossible to
define.
The other properties of inner products can also be demonstrated readily,
and there are plenty of resources online that show how [2].
Therefore L^2, coupled with the inner product operation shown
here, forms an inner product space.
This inner product can be used to define a norm for our space:
\[\|f\| = \sqrt{\langle f,f\rangle}
= \sqrt{\int_{-\infty}^{\infty} |f(x)|^2 \, dx}\]
Once again, because our functions in L^2 are square integrable,
the norm exists and it’s easy to show it satisfies all the usual
requirements for a norm.
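As a sanity check, the norm, the Cauchy-Schwarz inequality and the triangle inequality can all be tried numerically. This is a sketch under the same assumptions as any numerical approximation: the `inner` and `norm` helpers, the truncated interval and the two test functions are illustrative choices of mine, not part of the theory.

```python
import math

def inner(f, g, lo: float = -10.0, hi: float = 10.0, n: int = 20_000) -> complex:
    """Midpoint-sum approximation of the integral of conj(f(x)) * g(x) dx."""
    dx = (hi - lo) / n
    xs = (lo + (k + 0.5) * dx for k in range(n))
    return sum(complex(f(x)).conjugate() * g(x) for x in xs) * dx

def norm(f) -> float:
    """||f|| = sqrt(<f, f>); <f, f> is real and non-negative."""
    return math.sqrt(inner(f, f).real)

f = lambda x: math.exp(-x * x)
g = lambda x: (1 + 2j) * math.exp(-abs(x - 1))
f_plus_g = lambda x: f(x) + g(x)

print(abs(inner(f, g)), "<=", norm(f) * norm(g))     # Cauchy-Schwarz
print(norm(f_plus_g), "<=", norm(f) + norm(g))       # triangle inequality
```

Both inequalities also hold exactly for the finite sums used in the approximation, so the check is not sensitive to the grid resolution.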
Are we Hilbert yet?
We’ve seen that the set of square integrable functions L^2 forms
a proper vector space and also an inner product space when coupled with
an inner product operation; it also has a norm. So does it have all
that’s needed for linear algebra?
Almost. This space should also be complete. The term “complete” is
severely overloaded in math, so it’s important to say what it means in
this context: put simply, it means the set has no “holes” – no sequence
of elements in the set converges to an element outside the set [3]. To
put it less simply, a space is complete if every Cauchy
sequence of elements of the space converges to an element of that same space.
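The rationals are the classic example of a space with "holes", and exact rational arithmetic makes this easy to see. The sketch below builds the partial sums of \sum 1/n! as exact fractions; the function name `partial_sum` is my own. Every term of the sequence is rational and consecutive terms get arbitrarily close, yet the limit e is not rational.

```python
import math
from fractions import Fraction

def partial_sum(n: int) -> Fraction:
    """Exact rational value of 1/0! + 1/1! + ... + 1/n!."""
    total, term = Fraction(0), Fraction(1)
    for k in range(n + 1):
        total += term          # add 1/k!
        term /= (k + 1)        # advance to 1/(k+1)!
    return total

for n in (2, 5, 10, 20):
    s = partial_sum(n)
    # Each partial sum is an exact fraction; its gap to e keeps shrinking.
    print(n, s, abs(float(s) - math.e))
```

Completeness of L^2 rules out this kind of behavior: a Cauchy sequence of square integrable functions cannot "escape" to a limit outside L^2.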
This gets us deep into the large and advanced topic of real analysis.
The Riesz-Fischer
theorem
shows that L^2 is complete.
Once we add completeness to the set of properties of L^2, it
becomes a Hilbert space.
You may also hear the term Banach
space mentioned in this
context. Banach spaces are more general than Hilbert spaces: a complete
space with a norm is a Banach space (this norm doesn’t have to come from
an inner product). A complete inner-product space is a Hilbert space –
the norm of a Hilbert space is defined using its inner product, as we’ve
seen above.
Application: generalized Fourier series
The Fourier series is one of the most brilliant and consequential ideas
in mathematics. I would really like to dive deeper into this topic, but
that would require a post (or a book) of its own.
In short, Fourier series can be defined for functions in L^2
because these form a Hilbert space. Specifically, the inner product for
functions lets us define orthogonality and the concept of basis
vectors in L^2. These are then used to express any function as
a weighted sum of a series of basis functions that span the space.
Moreover, the completeness of the space guarantees that Fourier series
actually converge to functions within the space.
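Here is a minimal sketch of how the inner product yields the weights of such a sum. It projects f(x)=x onto the functions \sin(mx) on the finite interval [-\pi,\pi] (the finite-interval setting mentioned in footnote [1]). The helpers `inner`, `coefficient` and `sine` are hypothetical names, and since everything here is real-valued the complex conjugate is omitted.

```python
import math

def inner(f, g, lo: float = -math.pi, hi: float = math.pi, n: int = 20_000) -> float:
    """Midpoint-sum approximation of the integral of f(x) * g(x) over [lo, hi]."""
    dx = (hi - lo) / n
    xs = (lo + (k + 0.5) * dx for k in range(n))
    return sum(f(x) * g(x) for x in xs) * dx

def coefficient(f, basis_fn) -> float:
    """Weight of basis_fn in the expansion of f: <basis_fn, f> / <basis_fn, basis_fn>."""
    return inner(basis_fn, f) / inner(basis_fn, basis_fn)

def sine(m: int):
    return lambda x: math.sin(m * x)

f = lambda x: x

print(inner(sine(1), sine(2)))                          # orthogonal: close to 0
coeffs = [coefficient(f, sine(m)) for m in range(1, 6)]
print(coeffs)                                           # close to 2 * (-1)**(m+1) / m
```

This is the same projection formula used to find the components of a vector along orthogonal basis vectors in \mathbb{R}^N, with the sum replaced by an integral.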
Interestingly, Fourier put forward his ideas decades before the field of
analysis matured and Hilbert space was defined. This is why many
mathematicians of the day (most notably Lagrange) objected to Fourier
theory as not sufficiently rigorous. The theory worked brilliantly for
many useful scenarios, however, and later developments in functional
analysis helped put it on a more solid theoretical footing.
Another related example which I find very cool: I’ve mentioned how this
theory helps us apply the tools of linear algebra to functions, and
generalized Fourier series provides an excellent illustration.
Most people are familiar with the trigonometric Fourier series, but the
theory is more general and applies to any set of mutually orthogonal
functions that form a basis for the vector space. Is there a polynomial
Fourier series? Yes, and it can be derived using one of the classical
tools of linear algebra – the Gram-Schmidt
process.
The result is Legendre
polynomials.
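The derivation is short enough to run. The sketch below applies Gram-Schmidt to the monomials 1, x, x^2, x^3 using the inner product \int_{-1}^{1}p(x)q(x)\,dx, with exact rational arithmetic. Polynomials are stored as coefficient lists (lowest degree first); this representation and the helper names are my own choices, and the final rescaling to value 1 at x=1 is the usual Legendre normalization convention.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Exact integral of p(x) * q(x) over [-1, 1]."""
    # x^k integrates to 2/(k+1) for even k and to 0 for odd k.
    return sum(c * Fraction(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def gram_schmidt(basis):
    """Classical Gram-Schmidt: subtract from each vector its projections on earlier ones."""
    ortho = []
    for v in basis:
        w = list(v)
        for u in ortho:
            coeff = inner(u, v) / inner(u, u)
            for k, c in enumerate(u):
                w[k] -= coeff * c
        ortho.append(w)
    return ortho

size = 4
monomials = [[Fraction(int(k == n)) for k in range(size)] for n in range(size)]
ortho = gram_schmidt(monomials)

# Rescale each polynomial so that p(1) = 1; sum(p) is the value of p at x = 1.
legendre = [[c / sum(p) for c in p] for p in ortho]
for p in legendre:
    print([str(c) for c in p])
```

The third and fourth rows correspond to (3x^2-1)/2 and (5x^3-3x)/2, the familiar second and third Legendre polynomials.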
Again, all of this is fascinating and I hope to be able to write more on
this topic in the future.
Application: Quantum mechanics
In QM, states of particles are described by wavefunctions in a Hilbert
space. The squared magnitude of the inner product of two normalized states can be interpreted as a probability. QM
operators can be seen as linear maps on that space. This lets us apply
linear algebra in infinite dimensions and unlocks a treasure chest of
useful mathematical tools.
Appendix A: proof of vector space axioms for functions
As a reminder, we’re dealing with the set V of functions
f:X\to\mathbb{F}, where X is any set and
\mathbb{F} can be either \mathbb{R} or
\mathbb{C}. This set V, along with addition between set
members and scalar multiplication, forms a vector space. To prove this, we
prove all the vector space axioms:
Associativity of vector addition:
\[\begin{aligned}
[f+[g+h]](x)&=f(x)+[g+h](x)\\
&=f(x)+g(x)+h(x)\\
&=[f+g](x)+h(x)\\
&=[[f+g]+h](x)
\end{aligned}\]
This proceeds very smoothly because addition on either reals or complex
numbers is associative, commutative, etc.
Commutativity of vector addition:
\[\begin{aligned}
[f+g](x)&=f(x)+g(x)\\
&=g(x)+f(x)\\
&=[g+f](x)
\end{aligned}\]
Identity element of vector addition:
The function z(x)=0 serves as an additive identity element:
\[\begin{aligned}
f(x)+z(x)&=f(x)\qquad \forall x \\
[f+z](x)&=f(x)
\end{aligned}\]
Inverse elements of vector addition:
We’ll define (-f)(x)=-f(x) as the additive inverse, and
z(x) as before:
\[\begin{aligned}
f(x)-f(x)&=z(x)\qquad \forall x \\
[f+(-f)](x)&=z(x)
\end{aligned}\]
Associativity of scalar multiplication:
For scalars a,b\in\mathbb{F}:
\[a(bf(x))=ab(f(x))=(ab)f(x)\]
Identity element of scalar multiplication:
We’ll use the scalar 1 as the identity element of scalar multiplication.
Since the result of f(x) is a real or complex scalar, it’s
trivially true that:
\[1\cdot f(x)=f(x)\]
Distributivity of scalar multiplication over vector addition:
\[\begin{aligned}
a\cdot [f+g](x)&=a\cdot (f(x)+g(x))\\
&=a\cdot f(x)+a\cdot g(x)\\
&=[af](x)+[ag](x)
\end{aligned}\]
Distributivity of scalar multiplication over scalar addition:
For scalars a and b:
\[(a+b)\cdot f(x)=a\cdot f(x) + b\cdot f(x)\]
Appendix B: proof that square integrable functions form a subspace
To show that L^2 is a subspace of the function vector space, we
have to prove the following properties:
Zero element
The zero element z(x)=0 is in L^2:
\[\int_{-\infty}^{\infty}\left | z(x) \right |^2 dx =\int_{-\infty}^{\infty} 0\ dx=0
< \infty\]
Closure under addition
Recall that our functions f,g\in L^2 are complex-valued. For any
two complex numbers (see this
post):
\[|u+v|^2=|u|^2+|v|^2+2 Re(uv^*)\]
It’s very easy to show that 2Re(uv^*)\leq2|u||v|, and also that
2|u||v|\leq|u|^2+|v|^2. Therefore:
\[|u+v|^2\leq2|u|^2+2|v|^2\]
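For completeness, here is one way to spell out the two inequalities used above. The first line uses the facts that the real part of a complex number never exceeds its magnitude and that |v^*|=|v|; the second expands a square that cannot be negative; the third chains them together:

\[\begin{aligned}
2Re(uv^*)&\leq 2|uv^*|=2|u||v|\\
0\leq(|u|-|v|)^2&=|u|^2-2|u||v|+|v|^2
\quad\Rightarrow\quad 2|u||v|\leq|u|^2+|v|^2\\
|u+v|^2&=|u|^2+|v|^2+2Re(uv^*)\leq|u|^2+|v|^2+2|u||v|\leq 2|u|^2+2|v|^2
\end{aligned}\]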
Armed with this, let’s check if the sum of f(x) and g(x)
is square integrable:
\[\int_{-\infty}^{\infty}\left | f(x) + g(x)\right |^2 dx\]
Since the values f(x) and g(x) are just complex numbers,
we’ll use the inequality shown above to write:
\[\int_{-\infty}^{\infty}\left | f(x) + g(x)\right |^2 dx
\leq
2\int_{-\infty}^{\infty}\left | f(x) \right |^2 dx+
2\int_{-\infty}^{\infty}\left | g(x) \right |^2 dx\]
Both integrals on the right hand side are finite, so the one on the left
is finite as well.
Closure under scalar multiplication
Given f\in L^2 and a scalar a:
\[\int_{-\infty}^{\infty}\left | a f(x) \right |^2 dx =|a|^2 \int_{-\infty}^{\infty} \left | f(x) \right |^2\ dx < \infty\]
[1] We want to work with functions that have finite total energy. Note that this is a pretty strong restriction! In Fourier analysis, we typically modify the square integrability requirement to be on a finite interval – all the tools still work – and talk about periodic functions.
[2] I’m not including the proofs here, because some of them are a bit technical and require terminology from real analysis.
[3] A classical example of a set not satisfying this condition is \mathbb{Q} – the rational numbers; an infinite sum of rational numbers can end up being irrational: \sum_{n=0}^{\infty}\frac{1}{n!}=e. Infinite sums are tricky!