We all have some notion of what it means for something to be infinite. Take the biggest number you can think of, and multiply it by 7! Ok, not quite but since we’ve already conjured up the image, let’s work with it. Call this number . Now let’s, raise it to the power of itself. Are we there yet? Not even close since we still have,

no matter how wildly imaginative you were in your choice of fixed in the first place. But can one kind of infinity be smaller/larger than another? Well, a brilliant century German mathematician thought so, and now, most people agree with him.

When Georg Cantor published his thoughts advancing this notion, it made him the laughing stock of his contemporaries. To quote this interesting paper on Cantor’s life,

Leopold Kronecker, one of Cantor’s teachers and among the most prominent members of the German mathematics establishment, even attacked Cantor personally, calling him a “scientific charlatan,” a “renegade” and a “corrupter of youth”.

The ridicule of his peers caused Cantor immense emotional distress, leading him to shy away from the public eye, and he eventually passed away secluded, and in poverty. Nevertheless, Cantor’s singular thought earned him a little piece of immortality as his theorems live on, and are part of standard curriculum in advanced mathematics. His ideas are essential in **Measure Theory**, and its application to **Quant Finance** through the **Stochastic Calculus**. In **Part I**, we provide some basic background in **Set Theory**, then state and prove what is now known as **Cantor’s Theorem**.

## Bijective Maps

Consider a set containing different elements.

These elements can be integers, tractors, hamburgers, hamburgers and tractors, etc. The crucial thing is that the elements are, as Cantor put it,

distinct of our perception (Anschauung) or of our thought.

ln other words, whenever . Suppose is any other set. We might wish to create some kind of relationship between the elements of and the elements of . The apparatus for this task is our good ol’ friend, the function.

A function or *map* is a machine that takes an element from its **domain** , and spits out an element from its **codomain **. There are 3 very general classes of functions, the third just being a combination of the first 2.

A function is an

injection, orone to one, if for every , implies .

As can be seen from the diagram, the injection maps every element to some , such that no two elements of map to the same element of . The visual test for injectivity generally encountered in pre-calculus is the so called **horizontal line test, **which states that a function is injective if and only if one can draw any horizontal line on the graph of such that the line touches at at most one point. Note that in our diagram, , leading us to…

### Definition (surjection):

A function is a

surjection, oronto, if for every , there is at least one such that .

Hence, if the **image** of under is the **entire** codomain , i.e. , then is a surjection. *sur* is the French word for *on*, which conveys the idea that maps **onto the entire codomain **.

### Definition 1 (bijection):

A function is a

bijectionif it is both an injection and a surjection.

**Equivalently**,

### Definition 2 (bijection):

is a

bijectionif and only if there exists a function , such that for all and

Hence, when is both a **left inverse** and a **right inverse**, we say that is the **inverse function** of . When such an described exists, is **invertible. **Hence bijections are always invertible functions and vice versa.** **It’s a nice exercise to demonstrate the equivalence of these 2 definitions of bijections.

Ok, let’s consider 3 functions, just to summarize.

To start, we’ll assume the 3 map from to .

Right off the bat we can see that

A so called **even function**. Thus is not an injection and therefore not a bijection.

We can make it an injection by restricting the domain of to non-negative reals on .

Done. So is

a bijection? Still not, since there’s no real number that when squared, will yield a negative real number. So while we have tweaked into an injection, it’s still not a surjection, and hence not a bijection. So we need to change the codomain too, by restricting it to all non-negative reals.

And finally, by restricting to the upper right quadrant, we have our bijection.

What about First notice that the derivative

So is **continuously differentiable**, and **strictly increasing** in over the real line (except at the origin where it switches from a concave function to a convex function). Hence for any two reals , can’t possibly be , since is strictly increasing on .

Hence** all strictly monotonic functions are injections**. Since is also continuous over all of , (else would not be differentiable over ), all reals are hit as , and we conclude is a bijection (actually a **homeomorphism**). In general, all **strictly monotonic functions **map bijectively from their domains to their images (not necessarily codomains!). Notice that exists and is also a bijection.

You can use the exact same argument to show

are bijections.

Cardinals

Suppose we wanted to get a sense of the size of a set . In other words, we’d like to **measure** . To keep things simple, we’ll just *count* the number of elements in , and call this the size of . This is what Cantor called the **cardinality**, or** cardinal measure** of , which we’ll denote by . Why choose cardinality to measure ? For one thing, its universally understandable – since everybody knows how to count. More importantly though, it’s *unitless*.

To see why, let be a set of different pigeons each capable of delivering 1 mailpiece. If is a set of different letters, then you are guaranteed that each piece of mail can be assigned to a unique pigeon. In other words, . This is known as the **Pigeonhole Principle**.

### Theorem

if and only if there is a bijection mapping

onto. We then say and areequinumerous.

In other words, for each letter in the mailbag , there is a unique pigeon in the coop such that . The result is that all pigeons and all letters will be put in a 1 to 1 correspondence.

These **finite** **cardinal measures, **or **finite** **cardinals **are represented by elements of . All the naturals represent here, are magnitudes of sets. No more, no less.

## Ordinals

Notice that each of the isn’t quite a number, and it’s not obvious that or .

Why should a particular pigeon come before any other in the set? Because it’s prettier? Because it’s faster? Actually, it’s simply because we’ve chosen to **order** it that way. The ordering information is encoded in the subscripts, called **ordinals. **Once again, the characters we use for these **finite** **ordinal numbers** are the natural numbers.

Unlike **cardinals**, which represent magnitudes of sets, **ordinals **are used to induce an **order relation** on sets. Keep in mind that notwithstanding cardinals’ and ordinals’ use of the same characters (arabic numerals), they represent different things. In fact, the pronunciation is different. “One”, “Two”, “Three”, …, for cardinals, “First”, “Second”, “Third”,…, for ordinals. Cardinals are absolute magnitudes, which have meaning on their lonesome. Ordinals, on the other hand, are relative magnitudes, which can only be understood in the context of other ordinals.

Countable sets

We mentioned the fact that the cardinal measure is useful since it relies on the universal ability to count. Cantor wanted to consider the implications of making this concept more precise.

### Definition:

A setisif and only ifcountablethere exists a bijection such that .

Here is the set of all **Natural Numbers**, (or counting numbers). i.e.,

Hence, will be countable, provided we can bijectively map it onto a set of naturals, possibly all of .

We have already seen that when a set has finitely many elements, it can be bijectively mapped to any other set with the same finite cardinality. Hence, **every finite set is countable**.

To see why, let for some finite number . Then we can just take the set as our subset of . Since both sets have cardinality , we can simply let our bijection be

What about sets with infinitely many elements? Well, countability is defined in terms of . Yet,

Does that mean we can’t count the countable numbers? Referring back to the definition of a countable set, all we need to do is demonstrate the existence of a bijective map from to , and we are done.

This is easy enough, let’s define

**(A)bort , (R)etry: (F)ail: R**

Let’s verify that so defined forms a bijection. Consider any two natural numbers . Then,

Hence *if and only if * showing that is injective. It is also a surjection, since for every natural in the codomain , there is an in the domain such that . In particular, . Hence produces unique, exhaustive pairings between and , and is therefore a bijection.

We have shown that is indeed countable, as is any infinitely numerous set for which there is a bijection onto .

Cantor called all such sets **countably infinite, at most countable, **or simply **countable**.

## The Powerset

A very important set that can be derived from any set is its **powerset** denoted by . is the set of all possible unique subsets of .

To see what this means, suppose an entrepeneur wished to capitalize on the USPS’s termination of Saturday delivery, by opening a pigeon mail delivery service. To keep pigeon morale high, s/he decides that every pigeon will have the opportunity to make a delivery with a group of its peers, where the group is responsible for delivering a single mailpiece. In order to ensure the pigeons don’t have too much fun, each pigeon will still have to make a delivery on its own from time to time. Let’s say s/he anticipates a mailbag of 3 letters every Saturday, and therefore invests in a set of 3 pigeons for now. How many different kinds of groups can be formed from these pigeons? Well,

Hence there are 7 possible groups, plus one empty set denoted by , giving 8 total.

Powersets exist for every set by the last of the **Zermelo–Fraenkel (ZF) **axioms – **Axiom of Powerset (Axiom VIII).**

## Cardinality of powersets of finite sets

Notice that

In general,

This isn’t obvious though. Just because it held in the case of pigeons, doesn’t necessarily mean it should hold in the case of say, .

There are several ways to convince yourself with minimal guilt that it **will** hold for every single finite cardinal.

The following is **proof by induction** that it will.

Please see oldrin.bataku’s intuitive explanation, that requires no combinatorial messiness.

Let’s try with pigeons. How many elements are in ? Let’s phrase this differently – how many unique groups of pigeons in are there?

Recall that there are

(read: choose ) ways of making a unique collection of items from a set of elements. Where the factorial operator, is defined by,

Thus, the total possible number of unique groups of pigeons formed from the is,

Therefore,

Ok, we’re getting pretty confident that our hunch is right. We’ve already shown shown that groups of , and pigeons have powersets with cardinality , respectively.

Now, **assume** that every set of elements has powerset cardinality . If we can show this assumption **implies every** set of elements has powerset cardinality , we will have shown this holds for all . Note that had we started by demonstrating the (trivial) case of (the **empty set** (where clearly we can only make choice – to choose **nothing**), we would then be showing this holds for all . So now that we have already shown it, we might as well just say we’re showing it for all such non-negative . This is called **proof by induction**.

Towards this end, let have elements.

Which is what we wanted to show.

Note that in equality , we used **Pascal’s identity,**

Actually, as oldrin.bataku notes, the above proof is just a special case of a more general result – the Binomial theorem, which requires just a tiny bit more work to show that,

Letting gives us .

Yet, what happens in the case where ? Well, the above formula yields,

But this doesn’t give us our answer, since and as shown above it’s still countable. Furthermore using , as our respective pigeon coop and mailbag, intuition would suggest that though there will be an infinite number of pigeon groups, if ever there is a group without a letter, we should still be able to just reach into the infinite mailbag and grab a letter, without taking one away from any other group, right? Let’s find out.

## Cantor’s Theorem: transfinite cardinals

What follows is Cantor’s brilliant argument that the determination of whether an infinitely large set is countable is dependent upon just how “infinite” it is. It starts by considering the most basic powerset of a countably infinite set – .

Let’s assume for the moment that we’ve found a bijection mapping onto its powerset.

Let’s construct a special kind of set defined by,

In words, is the set of all natural numbers , such that is not in the set to which maps it. That is, all naturals , with the property that . To illustrate, let’s say our bijection of the first 5 naturals looks like this,

After these 5 maps, which elements will we have added to ? Since isn’t in , must be in . Since **is** in , . Similarly, whereas . Hence, so far, . Continuing in this fashion, numbers will be added to , so long as they aren’t in their map under .

Let’s stop for a moment and consider what we have. is just a set of natural numbers, and therefore being a subset of , must be an element of the powerset . Since , and we have assumed that , being bijective, matches every natural number with a unique element of the powerset, there **must** be some natural number such that .

Let’s see what else we can say. One thing that we can be certain of is is either in or not in .

So is ? Well, if it were, that would mean that gets mapped to a set containing it, and by the way we defined , this is impossible. Hence is not in .

But if , then is mapped by to a set which doesn’t contain it, meaning that must be in . Clearly being both **in** and **not in** is absurd, and we can only conclude our assumption of being a bijection is clearly wrong. Hence there is no bijective function mapping into its powerset. We can only conclude that even though is itself countably infinite, its powerset just has too many elements to be countable.

Ok, so far we’ve shown that

Similarly,

for any countable set .

What about uncountable sets? Then the same argument still applies. Let be an arbitrary set, and be a purported bijection. Then we once again construct our special set

and proceed to show, as we did in the countable case, that this generates the contradiction,

Hence for arbitrary set ,

## The ב ,א, and ω numbers

Under revision: Thx Fourmisain

Cantor realized that even though both countable sets and their powersets had infinite cardinality, these cardinalities represented different magnitudes of infinity, the latter being indescribably larger than the former. For this reason, he created two new classes of numbers called the **transfinite numbers.**

**Transfinite ordinals **the class of which is denoted by , are used to encode order information upon infinitely numerous sets. The first and smallest such number is denoted by . There are a variety of characterizations of such numbers, (see Von Neumann’s construction), but we can loosely consider to be the first “spot” at the end of a countably infinite “list”.

Hence, the next few transfinite ordinals will be

until we can no longer **count** these infinite numbers, and we “*arrive”* at the first **uncountable **transfinite ordinal, . Keep in mind that transfinite numbers do not usually behave the same way finite numbers do under familiar operations of addition and multiplication. For example, for any finite ordinal number ,

See here if interested why. To give a visual, from wiki,

The **transfinite cardinals**, also known as the **aleph numbers**, are denoted by,

For every ordinal . Cantor chose from the Hebrew words **אין סוף** (Ayn Sof), which translates as “without end”, and is one of the names used to describe God in the Zohar.

Cantor discerned between these infinities by replacing infinity with for the cardinality of countably infinite sets, and for uncountably infinite sets. Hence, Cantor’s Theorem can be succinctly stated as,

In the next post, we’ll explore the nature of two extremely important infinite sets: the set of all rational numbers , and the real numbers . We’ll also take a look at another of Cantor’s brilliant constructs – the **Cantor set.**

Hey, Dovi, nice presentation. Check the section titled ‘Countable sets’. you say that you will demonstrate the existence of a bijective map from N to N. It seems to me that you postulate the existence of said bijective map in lieu and place of demonstrating it. Maybe I am wrong.

Thank you Dr. Orphee! You’re absolutely right, some handwavery going on there. Gonna fix it.

Nice post, though you made a mistake in giving the power set of F. It should read

P(F) = {{}, {p_1}, {p_2}, {p_3}, {p_1, p_2}, {p_1, p_3}, {p_2, p_3}, {p_1, p_2, p_3}}

rather than

P(F) = {{}, p_1, p_2, p_3, {p_1, p_2}, {p_1, p_3}, {p_2, p_3}, {p_1, p_2, p_3}}

by the way, there is a simpler ‘proof’ of the cardinality of the power set by considering that for each subset there are two possibilities for each element — membership or lack thereof. It follows that for a set of cardinality n we have n such possibilities per element and so 2^n possible subsets corresponding to the different ways we can include or ‘leave out’ elements per subset. I usually find this more intuitive to explain than the inductive argument. You might also wish to appeal to the binomial theorem in noting nC0 + nC1 + nC2 + … + nCn = (1 + 1)^n

Thank you! I made the changes. I was actually considering proving the binomial theorem, rather than the special case we were dealing with. Will do that when I get home later.

This is a good introductory to Cantor’s work, although I felt, that the the end of your blogpost isn’t that fleshed out and looked into.

First of all, I was wondering, why you are restricting Cantor’s Theorem only to the set of natural numbers, when it applies to every set, especially every infinite one: |X| < |P(X)|, for every set X. That alone tells you, that there has to be an infinite amount of infinite cardinals (the cardinals omega, |P(omega)|, |P(P(omega))| are sometimes called beth numbers). You don't even need to change your proof, as you didn't use any specific property of the set of natural numbers.

Then there is an error in your last paragraph.

You are stating, that Aleph_0 < 2^{Aleph_0} = Aleph_1.

Of course Aleph_0 < Aleph_1, by definition; but the statement 2^{Aleph_0} = Aleph_1 (or Beth_1 = Aleph_1) is called the continuum hypothesis and can neither be proven nor disproven (in ZFC), which in itself is one of the most interesting things about mathematics or mathematical logic, tying tightly together with Gödel's incompleteness theorems. It probably also was a driving force of Cantor's mental demise, constantly trying to prove the unprovable.

The only thing one can be sure of, is that 2^{Aleph_0} is at least as big as Aleph_1 and that 2^{Aleph_0} is the cardinality of the set of real numbers, although the latter isn't trivial.

Hmm, I actually intended for the Cantor’s Theorem section to generalize the inequality developed for finite sets in the previous section, but somehow stopped short of showing it holds beyond countable sets such as .

You’re absolutely right about unprovability/undisprovability in ZFC, and I’ve made the change. I should probably demonstrate at least why , and , in the next post.

Thank you for pointing out that mistake!