We all have some notion of what it means for something to be infinite. Take the biggest number you can think of, and multiply it by 7! Ok, not quite but since we’ve already conjured up the image, let’s work with it. Call this number . Now let’s, raise it to the power of itself. Are we there yet? Not even close since we still have,
no matter how wildly imaginative you were in your choice of fixed in the first place. But can one kind of infinity be smaller/larger than another? Well, a brilliant century German mathematician thought so, and now, most people agree with him.
When Georg Cantor published his thoughts advancing this notion, it made him the laughing stock of his contemporaries. To quote this interesting paper on Cantor’s life,
Leopold Kronecker, one of Cantor’s teachers and among the most prominent members of the German mathematics establishment, even attacked Cantor personally, calling him a “scientific charlatan,” a “renegade” and a “corrupter of youth”.
The ridicule of his peers caused Cantor immense emotional distress, leading him to shy away from the public eye, and he eventually passed away secluded, and in poverty. Nevertheless, Cantor’s singular thought earned him a little piece of immortality as his theorems live on, and are part of standard curriculum in advanced mathematics. His ideas are essential in Measure Theory, and its application to Quant Finance through the Stochastic Calculus. In Part I, we provide some basic background in Set Theory, then state and prove what is now known as Cantor’s Theorem.
Consider a set containing different elements.
These elements can be integers, tractors, hamburgers, hamburgers and tractors, etc. The crucial thing is that the elements are, as Cantor put it,
distinct of our perception (Anschauung) or of our thought.
ln other words, whenever . Suppose is any other set. We might wish to create some kind of relationship between the elements of and the elements of . The apparatus for this task is our good ol’ friend, the function.
A function or map is a machine that takes an element from its domain , and spits out an element from its codomain . There are 3 very general classes of functions, the third just being a combination of the first 2.
A function is an injection, or one to one, if for every , implies .
As can be seen from the diagram, the injection maps every element to some , such that no two elements of map to the same element of . The visual test for injectivity generally encountered in pre-calculus is the so called horizontal line test, which states that a function is injective if and only if one can draw any horizontal line on the graph of such that the line touches at at most one point. Note that in our diagram, , leading us to…
A function is a surjection, or onto, if for every , there is at least one such that .
Hence, if the image of under is the entire codomain , i.e. , then is a surjection. sur is the French word for on, which conveys the idea that maps onto the entire codomain .
Definition 1 (bijection):
A function is a bijection if it is both an injection and a surjection.
Definition 2 (bijection):
is a bijection if and only if there exists a function , such that for all and
Hence, when is both a left inverse and a right inverse, we say that is the inverse function of . When such an described exists, is invertible. Hence bijections are always invertible functions and vice versa. It’s a nice exercise to demonstrate the equivalence of these 2 definitions of bijections.
Ok, let’s consider 3 functions, just to summarize.
To start, we’ll assume the 3 map from to .
Right off the bat we can see that
A so called even function. Thus is not an injection and therefore not a bijection.
We can make it an injection by restricting the domain of to non-negative reals on .
Done. So is
a bijection? Still not, since there’s no real number that when squared, will yield a negative real number. So while we have tweaked into an injection, it’s still not a surjection, and hence not a bijection. So we need to change the codomain too, by restricting it to all non-negative reals.
And finally, by restricting to the upper right quadrant, we have our bijection.
What about First notice that the derivative
So is continuously differentiable, and strictly increasing in over the real line (except at the origin where it switches from a concave function to a convex function). Hence for any two reals , can’t possibly be , since is strictly increasing on .
Hence all strictly monotonic functions are injections. Since is also continuous over all of , (else would not be differentiable over ), all reals are hit as , and we conclude is a bijection (actually a homeomorphism). In general, all strictly monotonic functions map bijectively from their domains to their images (not necessarily codomains!). Notice that exists and is also a bijection.
You can use the exact same argument to show
Suppose we wanted to get a sense of the size of a set . In other words, we’d like to measure . To keep things simple, we’ll just count the number of elements in , and call this the size of . This is what Cantor called the cardinality, or cardinal measure of , which we’ll denote by . Why choose cardinality to measure ? For one thing, its universally understandable – since everybody knows how to count. More importantly though, it’s unitless.
To see why, let be a set of different pigeons each capable of delivering 1 mailpiece. If is a set of different letters, then you are guaranteed that each piece of mail can be assigned to a unique pigeon. In other words, . This is known as the Pigeonhole Principle.
if and only if there is a bijection mapping onto . We then say and are equinumerous.
In other words, for each letter in the mailbag , there is a unique pigeon in the coop such that . The result is that all pigeons and all letters will be put in a 1 to 1 correspondence.
These finite cardinal measures, or finite cardinals are represented by elements of . All the naturals represent here, are magnitudes of sets. No more, no less.
Notice that each of the isn’t quite a number, and it’s not obvious that or .
Why should a particular pigeon come before any other in the set? Because it’s prettier? Because it’s faster? Actually, it’s simply because we’ve chosen to order it that way. The ordering information is encoded in the subscripts, called ordinals. Once again, the characters we use for these finite ordinal numbers are the natural numbers.
Unlike cardinals, which represent magnitudes of sets, ordinals are used to induce an order relation on sets. Keep in mind that notwithstanding cardinals’ and ordinals’ use of the same characters (arabic numerals), they represent different things. In fact, the pronunciation is different. “One”, “Two”, “Three”, …, for cardinals, “First”, “Second”, “Third”,…, for ordinals. Cardinals are absolute magnitudes, which have meaning on their lonesome. Ordinals, on the other hand, are relative magnitudes, which can only be understood in the context of other ordinals.
We mentioned the fact that the cardinal measure is useful since it relies on the universal ability to count. Cantor wanted to consider the implications of making this concept more precise.
A set is countable if and only if there exists a bijection such that .
Here is the set of all Natural Numbers, (or counting numbers). i.e.,
Hence, will be countable, provided we can bijectively map it onto a set of naturals, possibly all of .
We have already seen that when a set has finitely many elements, it can be bijectively mapped to any other set with the same finite cardinality. Hence, every finite set is countable.
To see why, let for some finite number . Then we can just take the set as our subset of . Since both sets have cardinality , we can simply let our bijection be
What about sets with infinitely many elements? Well, countability is defined in terms of . Yet,
Does that mean we can’t count the countable numbers? Referring back to the definition of a countable set, all we need to do is demonstrate the existence of a bijective map from to , and we are done.
This is easy enough, let’s define
(A)bort , (R)etry: (F)ail: R
Let’s verify that so defined forms a bijection. Consider any two natural numbers . Then,
Hence if and only if showing that is injective. It is also a surjection, since for every natural in the codomain , there is an in the domain such that . In particular, . Hence produces unique, exhaustive pairings between and , and is therefore a bijection.
We have shown that is indeed countable, as is any infinitely numerous set for which there is a bijection onto .
Cantor called all such sets countably infinite, at most countable, or simply countable.
A very important set that can be derived from any set is its powerset denoted by . is the set of all possible unique subsets of .
To see what this means, suppose an entrepeneur wished to capitalize on the USPS’s termination of Saturday delivery, by opening a pigeon mail delivery service. To keep pigeon morale high, s/he decides that every pigeon will have the opportunity to make a delivery with a group of its peers, where the group is responsible for delivering a single mailpiece. In order to ensure the pigeons don’t have too much fun, each pigeon will still have to make a delivery on its own from time to time. Let’s say s/he anticipates a mailbag of 3 letters every Saturday, and therefore invests in a set of 3 pigeons for now. How many different kinds of groups can be formed from these pigeons? Well,
Hence there are 7 possible groups, plus one empty set denoted by , giving 8 total.
Powersets exist for every set by the last of the Zermelo–Fraenkel (ZF) axioms - Axiom of Powerset (Axiom VIII).
Cardinality of powersets of finite sets
This isn’t obvious though. Just because it held in the case of pigeons, doesn’t necessarily mean it should hold in the case of say, .
There are several ways to convince yourself with minimal guilt that it will hold for every single finite cardinal.
The following is proof by induction that it will.
Please see oldrin.bataku’s intuitive explanation, that requires no combinatorial messiness.
Let’s try with pigeons. How many elements are in ? Let’s phrase this differently – how many unique groups of pigeons in are there?
Recall that there are
(read: choose ) ways of making a unique collection of items from a set of elements. Where the factorial operator, is defined by,
Thus, the total possible number of unique groups of pigeons formed from the is,
Ok, we’re getting pretty confident that our hunch is right. We’ve already shown shown that groups of , and pigeons have powersets with cardinality , respectively.
Now, assume that every set of elements has powerset cardinality . If we can show this assumption implies every set of elements has powerset cardinality , we will have shown this holds for all . Note that had we started by demonstrating the (trivial) case of (the empty set (where clearly we can only make choice – to choose nothing), we would then be showing this holds for all . So now that we have already shown it, we might as well just say we’re showing it for all such non-negative . This is called proof by induction.
Towards this end, let have elements.
Which is what we wanted to show.
Note that in equality , we used Pascal’s identity,
Actually, as oldrin.bataku notes, the above proof is just a special case of a more general result – the Binomial theorem, which requires just a tiny bit more work to show that,
Letting gives us .
Yet, what happens in the case where ? Well, the above formula yields,
But this doesn’t give us our answer, since and as shown above it’s still countable. Furthermore using , as our respective pigeon coop and mailbag, intuition would suggest that though there will be an infinite number of pigeon groups, if ever there is a group without a letter, we should still be able to just reach into the infinite mailbag and grab a letter, without taking one away from any other group, right? Let’s find out.
Cantor’s Theorem: transfinite cardinals
What follows is Cantor’s brilliant argument that the determination of whether an infinitely large set is countable is dependent upon just how “infinite” it is. It starts by considering the most basic powerset of a countably infinite set - .
Let’s assume for the moment that we’ve found a bijection mapping onto its powerset.
Let’s construct a special kind of set defined by,
In words, is the set of all natural numbers , such that is not in the set to which maps it. That is, all naturals , with the property that . To illustrate, let’s say our bijection of the first 5 naturals looks like this,
After these 5 maps, which elements will we have added to ? Since isn’t in , must be in . Since is in , . Similarly, whereas . Hence, so far, . Continuing in this fashion, numbers will be added to , so long as they aren’t in their map under .
Let’s stop for a moment and consider what we have. is just a set of natural numbers, and therefore being a subset of , must be an element of the powerset . Since , and we have assumed that , being bijective, matches every natural number with a unique element of the powerset, there must be some natural number such that .
Let’s see what else we can say. One thing that we can be certain of is is either in or not in .
So is ? Well, if it were, that would mean that gets mapped to a set containing it, and by the way we defined , this is impossible. Hence is not in .
But if , then is mapped by to a set which doesn’t contain it, meaning that must be in . Clearly being both in and not in is absurd, and we can only conclude our assumption of being a bijection is clearly wrong. Hence there is no bijective function mapping into its powerset. We can only conclude that even though is itself countably infinite, its powerset just has too many elements to be countable.
Ok, so far we’ve shown that
for any countable set .
What about uncountable sets? Then the same argument still applies. Let be an arbitrary set, and be a purported bijection. Then we once again construct our special set
and proceed to show, as we did in the countable case, that this generates the contradiction,
Hence for arbitrary set ,
The ב ,א, and ω numbers
Under revision: Thx Fourmisain
Cantor realized that even though both countable sets and their powersets had infinite cardinality, these cardinalities represented different magnitudes of infinity, the latter being indescribably larger than the former. For this reason, he created two new classes of numbers called the transfinite numbers.
Transfinite ordinals the class of which is denoted by , are used to encode order information upon infinitely numerous sets. The first and smallest such number is denoted by . There are a variety of characterizations of such numbers, (see Von Neumann’s construction), but we can loosely consider to be the first “spot” at the end of a countably infinite “list”.
Hence, the next few transfinite ordinals will be
until we can no longer count these infinite numbers, and we “arrive” at the first uncountable transfinite ordinal, . Keep in mind that transfinite numbers do not usually behave the same way finite numbers do under familiar operations of addition and multiplication. For example, for any finite ordinal number ,
See here if interested why. To give a visual, from wiki,
The transfinite cardinals, also known as the aleph numbers, are denoted by,
For every ordinal . Cantor chose from the Hebrew words אין סוף (Ayn Sof), which translates as “without end”, and is one of the names used to describe God in the Zohar.
Cantor discerned between these infinities by replacing infinity with for the cardinality of countably infinite sets, and for uncountably infinite sets. Hence, Cantor’s Theorem can be succinctly stated as,
In the next post, we’ll explore the nature of two extremely important infinite sets: the set of all rational numbers , and the real numbers . We’ll also take a look at another of Cantor’s brilliant constructs – the Cantor set.