Lecture 8 - The Probabilistic Method
- Speaker: Prof. Alexander Scott
It can be very difficult to construct mathematical objects without embedding some sort of regular structure.
Random or typical objects often have desirable properties that are difficult to construct explicitly.
Example 1: Tournaments
A tournament is an orientation of a complete graph: every edge between $u$ and $v$ is assigned a direction (towards $u$ or towards $v$).
A tournament has Property $S_k$ if for every set of $k$ players, there is someone who beats all of them. For example, the cyclic tournament on three vertices has Property $S_1$. For large $k$, tournaments with Property $S_k$ are difficult to construct!
Theorem
If $n \ge k^2 2^{k+1}$, then some tournament with $n$ vertices has Property $S_k$.
Idea:
- Consider a random tournament.
- Show that with positive probability it has Property $S_k$.
- Deduce that some tournament must have this property.
Proof
Let $T$ be a random tournament on $n$ vertices, where each edge between $u$ and $v$ is directed independently with equal probability towards $u$ or towards $v$. For each set $S$ of $k$ vertices, let $A_S$ be the event that no vertex beats every vertex in $S$ (we say that $S$ is then a bad set).
Then, since each of the $n-k$ vertices outside $S$ beats everyone in $S$ independently with probability $2^{-k}$,
$$\mathbb{P}(A_S) = \left(1 - 2^{-k}\right)^{n-k} \le e^{-(n-k)2^{-k}}.$$
Here we have used the inequality $1 + x \le e^x$, which holds for all real numbers $x$ (and is extremely useful).
We deduce that the expected number of bad $k$-sets is at most
$$\binom{n}{k}\, e^{-(n-k)2^{-k}}.$$
Suppose this expectation is strictly less than $1$. If on average we have less than one bad set, then there must be some tournament with less than one bad set, and hence with no bad sets at all (the minimum is at most the average).
Thus, it is sufficient to show that
$$\binom{n}{k}\, e^{-(n-k)2^{-k}} < 1.$$
Recall that $\binom{n}{k} \le n^k$; then
$$\binom{n}{k}\, e^{-(n-k)2^{-k}} \le \exp\!\left(k \ln n - (n-k)2^{-k}\right).$$
It is straightforward to see that this is less than $1$ when $n \ge k^2 2^{k+1}$.
Let's note:
- We did not directly show that a tournament with Property $S_k$ exists: we showed that a random tournament had a strictly positive probability of having the required properties.
- We needed to perform some calculations!
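As an illustration of the argument (a minimal sketch, not part of the lecture; the function names and the specific choice $n = k^2 2^{k+1}$ are only for demonstration), one can generate a random tournament and test Property $S_k$ directly:

```python
import itertools
import random

def random_tournament(n):
    """Orient each edge of K_n independently and uniformly at random.
    beats[u][v] is True exactly when u beats v."""
    beats = [[False] * n for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        if random.random() < 0.5:
            beats[u][v] = True
        else:
            beats[v][u] = True
    return beats

def has_property_Sk(beats, k):
    """Return True if for every k-set S some vertex outside S beats all of S."""
    n = len(beats)
    for S in itertools.combinations(range(n), k):
        others = set(range(n)) - set(S)
        if not any(all(beats[w][v] for v in S) for w in others):
            return False  # S is a bad set
    return True

# The proof guarantees existence once n >= k^2 * 2^(k+1); for k = 2 this is n = 32,
# and the calculation shows a random tournament works with probability > 0.9.
k, n = 2, 32
print(has_property_Sk(random_tournament(n), k))
```

The check usually succeeds on the first attempt, exactly as the first-moment calculation predicts.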
We can consider random structures, even when our original problem does not mention randomness. All the tools of Probability Theory are now at our disposal!
Example 2: Coding Theory
Coding Theory addresses the problem of transmitting information through a binary channel: in other words, we aim to send information as a sequence of s and s.
A code is a collection of binary strings, one for each type of information we aim to transmit (for example, one string for each letter of our alphabet).
Sometimes the channel is noisy, in which case we require our strings to be highly distinct (we need an error-correcting code). Even without noise, important questions arise. For example, how quickly can we transmit information through a channel?
A prefix of a binary string is an initial segment. For example, $01$ is a prefix of $0110$ but not of $0011$.
A set $C$ of binary strings is prefix-free if no string in $C$ is a prefix of another. A fundamental theorem, the Kraft-McMillan Inequality, applies to prefix-free codes:
Theorem (Kraft-McMillan Inequality)
Let $C$ be a prefix-free set of binary strings, and suppose that $C$ contains $N_\ell$ strings of length $\ell$ for each $\ell \ge 1$. Then
$$\sum_{\ell \ge 1} N_\ell\, 2^{-\ell} \le 1.$$
Proof
Consider a random (infinite) sequence $X_1 X_2 X_3 \cdots$ of independent fair bits. For a string $s$ of length $\ell$,
$$\mathbb{P}(s \text{ is a prefix of } X_1 X_2 X_3 \cdots) = 2^{-\ell}.$$
Thus, the expected number of strings from $C$ that occur as a prefix of $X_1 X_2 X_3 \cdots$ is
$$\sum_{\ell \ge 1} N_\ell\, 2^{-\ell}.$$
On the other hand, since $C$ is prefix-free, we can never have more than one string from $C$ occurring as a prefix simultaneously, so we deduce that
$$\sum_{\ell \ge 1} N_\ell\, 2^{-\ell} \le 1$$
(the average is at most the maximum).
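As a quick sanity check (an illustrative sketch, not from the notes), the inequality is easy to verify numerically; by the theorem, a Kraft sum exceeding $1$ certifies that a set of strings cannot be prefix-free.

```python
from fractions import Fraction

def kraft_sum(code):
    """Sum of 2^(-length) over the strings in the code."""
    return sum(Fraction(1, 2 ** len(s)) for s in code)

def is_prefix_free(code):
    """Check that no string in the code is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in code for b in code)

good = ["0", "10", "110", "111"]   # prefix-free; Kraft sum is exactly 1
print(is_prefix_free(good), kraft_sum(good))   # True 1
bad = ["0", "1", "10"]             # Kraft sum 5/4 > 1, so it cannot be prefix-free
print(is_prefix_free(bad), kraft_sum(bad))     # False 5/4
```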
Example 3: Max Cut
In the first example, we used linearity of expectation, which is straightforward but extremely useful: if $X = X_1 + \cdots + X_n$, then
$$\mathbb{E}[X] = \mathbb{E}[X_1] + \cdots + \mathbb{E}[X_n],$$
even when the $X_i$ are not independent.
It also applies to variance if the random variables are independent: in that case $\mathrm{Var}(X) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)$.
In the Max Cut problem, we are given a graph and aim to divide its vertices into two classes so that as many edges as possible have ends in both classes.
The Max Cut problem is known to be a challenging algorithmic problem. (It is NP-hard; in fact, it is NP-hard even to find a good approximate solution!)
A theorem provides a simple bound for the Max Cut problem:
Theorem
For every graph $G$, there exists a partition of its vertex set into two classes such that at least half the edges of $G$ have one end in each class.
Proof
Consider a random partition: place each vertex of $G$ independently into class $A$ or class $B$ with probability $1/2$ each. For each edge $e$, we define a random variable $X_e$ by setting $X_e = 1$ if $e$ has one end in each set and $X_e = 0$ otherwise. Let
$$X = \sum_{e \in E(G)} X_e$$
be the number of edges with one end in each class.
We aim to show that there exists some partition in which $X \ge e(G)/2$, where $e(G)$ denotes the number of edges of $G$.
Observe that
$$\mathbb{P}(X_e = 1) = \frac{1}{2}, \qquad \text{so} \qquad \mathbb{E}[X_e] = \frac{1}{2}.$$
Thus, by linearity of expectation,
$$\mathbb{E}[X] = \sum_{e \in E(G)} \mathbb{E}[X_e] = \frac{e(G)}{2}.$$
It follows that there exists some partition for which at least half the edges have one end in each class.
It is possible to derandomise this argument to obtain a very fast algorithm.
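A sketch of one standard derandomisation (stated here as an assumption about which "very fast algorithm" is meant; the names below are illustrative): place the vertices one at a time, always into the class containing fewer of the already-placed neighbours, so each vertex cuts at least half of its edges to earlier vertices.

```python
import random

def greedy_cut(n, edges):
    """Place vertices one at a time into the class with fewer already-placed
    neighbours; each vertex then cuts at least half its edges to earlier vertices."""
    neighbours = {v: set() for v in range(n)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    side = {}
    for v in range(n):
        placed = [side[u] for u in neighbours[v] if u in side]
        side[v] = 0 if placed.count(0) <= placed.count(1) else 1
    return side

def cut_size(edges, side):
    return sum(1 for u, v in edges if side[u] != side[v])

# On any graph the cut found contains at least half of the edges.
n = 50
edges = [(u, v) for u in range(n) for v in range(u + 1, n) if random.random() < 0.2]
side = greedy_cut(n, edges)
print(cut_size(edges, side), "cut edges out of", len(edges))
```

Summing the per-vertex guarantee over all vertices shows the cut always contains at least half the edges, matching the bound from the random partition.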
Example 4: Independent Sets
In the alteration method, we generate a random structure and then modify it to achieve the desired properties.
For example: an independent set in a graph is a set of vertices, no two of which are joined by an edge.
Theorem
Let $G$ have $n$ vertices and average degree $d \ge 1$. Then $G$ contains an independent set of size at least $n/2d$.
Proof
We generate a set in two steps. First, let $S$ be a random subset of $V(G)$ obtained by including each vertex independently with probability $p$. Then, let $S'$ be obtained from $S$ by deleting one end of each edge contained in $S$; by construction $S'$ is independent.
The expected size of $S$ is
$$\mathbb{E}|S| = np.$$
The expected number of edges contained in $S$ is
$$\mathbb{E}\, e(S) = e(G)\, p^2 = \frac{nd}{2}\, p^2,$$
since $G$ has $nd/2$ edges and each edge lies inside $S$ with probability $p^2$.
Thus, on average, the number of vertices remaining is at least
$$\mathbb{E}\big(|S| - e(S)\big) = np - \frac{nd}{2}\, p^2.$$
Setting $p = 1/d$ yields
$$np - \frac{nd}{2}\, p^2 = \frac{n}{d} - \frac{n}{2d} = \frac{n}{2d},$$
so some outcome of the procedure leaves an independent set of size at least $n/2d$.
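A minimal sketch of the alteration step (illustrative only; the variable names are not from the notes): sample $S$ with $p = 1/d$, then delete one endpoint of every edge that survives inside $S$.

```python
import random

def independent_set_by_alteration(n, edges):
    """Sample each vertex with probability p = 1/d, then delete one endpoint of
    every edge with both ends in the sample; the remainder is independent."""
    d = 2 * len(edges) / n if n else 0
    p = 1.0 if d <= 1 else 1 / d
    S = {v for v in range(n) if random.random() < p}
    for u, v in edges:
        if u in S and v in S:
            S.discard(u)   # delete one end of each surviving edge
    return S

n = 200
edges = [(u, v) for u in range(n) for v in range(u + 1, n) if random.random() < 0.05]
d = 2 * len(edges) / n
S = independent_set_by_alteration(n, edges)
assert all(u not in S or v not in S for u, v in edges)   # S really is independent
print(len(S), "vertices found; guaranteed bound is about", n / (2 * d))
```

A single run matches the guarantee only on average; repeating the sampling a few times and keeping the best outcome recovers the bound in practice.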
Broader Horizons: Random Graphs
A highly important and fascinating example is provided by random graphs. The theory of random graphs was initially developed in the 1960s, but it has since grown into a significant and influential area of research, with connections to numerous fields.
In the model $G(n,p)$, we consider an $n$-vertex graph in which each edge is present independently with probability $p$. Thus, $G(n,0)$ has no edges, $G(n,1)$ is the complete graph, and $G(n,1/2)$ has (on average) approximately half of all possible edges.
Random graphs are used to model numerous real-world processes, from social networks to epidemics. A deep theory exists regarding the various changes that a typical random graph undergoes as $p$ increases from $0$ to $1$.
What can we say about the typical structure of a graph in $G(n,p)$, where $p = p(n)$ depends on $n$?
Let us consider triangles. How large must $p$ be for triangles to appear in $G(n,p)$? The expected number of triangles is given by
$$\binom{n}{3} p^3 \approx \frac{(np)^3}{6}.$$
Thus, if $np \to 0$, then the expected number of triangles tends to $0$. It follows (for example, by Markov's Inequality) that the probability that $G(n,p)$ contains a triangle tends to $0$.
On the other hand, if $np \to \infty$, then the expected number of triangles tends to infinity. Does this imply that the probability of obtaining a triangle tends to $1$?
To address this, we need to consider the variance.
Idea:
- Consider a random graph $G(n,p)$, and let $X$ denote the number of triangles
- Calculate the mean and variance of $X$
- Use Chebyshev's Inequality to show that it is highly unlikely that $X = 0$
For each triple of vertices $T$, we define a random variable $X_T$ as follows: $X_T = 1$ if all three edges between the vertices of $T$ are present, and $X_T = 0$ otherwise. Then $X = \sum_T X_T$.
We know that $\mathbb{E}[X] = \binom{n}{3} p^3$. Let us calculate its variance $\mathrm{Var}(X)$. We have
$$\mathrm{Var}(X) = \sum_{S,T} \mathrm{Cov}(X_S, X_T) = \sum_{S,T} \big(\mathbb{E}[X_S X_T] - \mathbb{E}[X_S]\,\mathbb{E}[X_T]\big),$$
where the sum is taken over all pairs of triples $S$ and $T$.
If $S$ and $T$ are disjoint, then $X_S$ and $X_T$ are independent. Thus, $\mathrm{Cov}(X_S, X_T) = 0$. In fact, if $|S \cap T| \le 1$, then they remain independent, since they share no potential edge.
Thus
$$\mathrm{Var}(X) = \sum_{|S \cap T| \ge 2} \mathrm{Cov}(X_S, X_T).$$
If $S = T$, then $\mathrm{Cov}(X_S, X_T) = \mathrm{Var}(X_S) = p^3(1 - p^3) \le p^3$, and there are $\binom{n}{3}$ such pairs;
and if $|S \cap T| = 2$, then $\mathbb{E}[X_S X_T] = p^5$ (the two triangles share an edge, so five edges must be present), giving $\mathrm{Cov}(X_S, X_T) \le p^5$, and there are at most $n^4$ such pairs.
Thus
$$\mathrm{Var}(X) \le \binom{n}{3} p^3 + n^4 p^5.$$
We now apply Chebyshev's Inequality:
$$\mathbb{P}(X = 0) \le \mathbb{P}\big(|X - \mathbb{E}X| \ge \mathbb{E}X\big) \le \frac{\mathrm{Var}(X)}{(\mathbb{E}X)^2}.$$
Substituting the values of $\mathbb{E}X$ and $\mathrm{Var}(X)$ calculated above, one finds that this bound tends to $0$ whenever $np \to \infty$; in other words, triangles appear as soon as the expected number of triangles becomes large.
We have demonstrated:
- If $p \ll 1/n$, then with probability tending to $1$, $G(n,p)$ contains no triangle.
- If $p \gg 1/n$, then with probability tending to $1$, $G(n,p)$ contains a triangle.
We say that $1/n$ is a threshold function for the presence of triangles in $G(n,p)$.
We observe the same pattern for other graphs $H$: there exists a threshold such that if $p$ is well below it, then it is highly unlikely that $G(n,p)$ contains a copy of $H$; but if $p$ is well above it, then $G(n,p)$ will almost certainly contain copies. This is an example of a phase transition.
For many graphs $H$, the threshold for $G(n,p)$ to contain a copy of $H$ occurs around the point where the expected number of copies of $H$ becomes large.
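The threshold at $1/n$ shows up clearly in simulation (an illustrative sketch; the parameters $n$, the trial count, and the values of $c$ are arbitrary): estimate the probability that $G(n, c/n)$ contains a triangle for a few values of $c$.

```python
import itertools
import random

def has_triangle(n, p):
    """Sample G(n, p) and report whether it contains a triangle."""
    adj = [set() for _ in range(n)]
    edges = []
    for u, v in itertools.combinations(range(n), 2):
        if random.random() < p:
            adj[u].add(v)
            adj[v].add(u)
            edges.append((u, v))
    # A triangle exists iff the endpoints of some edge have a common neighbour.
    return any(adj[u] & adj[v] for u, v in edges)

n, trials = 300, 50
for c in [0.1, 0.5, 1.0, 2.0, 5.0]:
    hits = sum(has_triangle(n, c / n) for _ in range(trials))
    print(f"p = {c}/n: triangles in {hits} of {trials} samples")
```

For $c$ well below $1$ almost no samples contain a triangle, and for $c$ well above $1$ almost all do, matching the first- and second-moment calculations.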
Finally, let us briefly discuss martingale methods. A martingale is (informally) a sequence of random variables $X_0, X_1, X_2, \ldots$ in which $\mathbb{E}[X_{i+1} \mid X_0, \ldots, X_i] = X_i$: at each step, the expected value remains unchanged. (This can be likened to making a sequence of fair bets in a casino.)
Martingales are highly useful in analysing various types of random graph processes. For example, let $\chi(G)$ denote the chromatic number of $G$ (the least number of colours in a proper colouring). Estimating the average value of $\chi(G)$ for a random graph $G \in G(n,1/2)$ is challenging, but it is known to be approximately
$$\frac{n}{2 \log_2 n}.$$
The key idea is as follows:
- Reveal the graph one vertex at a time: let $X_i$ denote the expected chromatic number of $G$ given the information provided by the first $i$ vertices.
- The sequence $X_0, X_1, \ldots, X_n$ forms a martingale!
- Now apply a martingale inequality (for example, the Hoeffding Inequality, a relative of the Chernoff Inequality for binomial random variables) to show that $\chi(G)$ is likely to be close to its mean.
The result follows elegantly!
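For small $n$ the concentration of $\chi$ can even be observed directly (an illustrative sketch only: the exact backtracking computation below is feasible only for small graphs, and it is not how the martingale argument itself works).

```python
import itertools
import random
from collections import Counter

def chromatic_number(n, adj):
    """Exact chromatic number by testing k-colourability with backtracking
    (only practical for small n)."""
    def colourable(k):
        colour = [-1] * n
        def place(v):
            if v == n:
                return True
            for c in range(k):
                if all(colour[u] != c for u in adj[v] if u < v):
                    colour[v] = c
                    if place(v + 1):
                        return True
            colour[v] = -1
            return False
        return place(0)
    return next(k for k in range(1, n + 1) if colourable(k))

def sample_chi(n, p=0.5):
    """Sample G(n, p) and return its chromatic number."""
    adj = [set() for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        if random.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return chromatic_number(n, adj)

# The empirical distribution is tightly concentrated on one or two values.
print(Counter(sample_chi(12) for _ in range(100)))
```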