CR 17: Probabilistic Methods with Application to Large Graphs
No specialized knowledge required, however basic notions in graph theory, algorithms and discrete probability will help.
Dealing with large (real life) graphs leads to several problems among which one can highlight
two major questions: How to generate them? and How to make computations on them? These two
directions have a common feature in the sense that one cannot access the full graph, but only parts
of it. This refines our approach into : How to generate a graph under some constraints (usually
local ones like degree, density of triangles, etc)? and How to compute properties of the graph
under statistical knowledge?
These question are far reaching, and the objective of this course is to introduce some
(usually probabilistic) tools to this purpose.
1) Szemeredi Regularity Lemma. Applications to graph property-testing.
The celebrated regularity lemma asserts that every dense graph can be partitioned in a constant number of parts in such
a way that (nearly) all pairs of parts form a random looking bipartite graph. The most famous
applications of it concerns Ramsey theory: when a structure is dense enough (dense set of integers, or
dense graph), one can find many regular substructures (arithmetic progressions, or certain types
of subgraphs). One very nice consequence of the Regularity Lemma is that it certifies that
some properties can be tested, with good confidence, by sampling only a constant number of vertices, hence giving
constant time algorithms.
2) Random graphs.
Erdos-Renyi model, classical thresholds (applications of the first and second moment method). Random graphs with constraints (fixed degree,
triangle-free graphs, power-law distribution on the degrees, etc).
3) Graph limits. Flag algebras.
As mentioned previously, some important parameters for large graphs are statistical, like
the density of edges, triangles, etc. Intuitively, two large graphs are "similar"
if their densities of small subgraphs are "close enough". To get some grasp on this kind
of intuition, Lovasz introduced the notion of graph limit, where a sequence of graphs
is converging if all densities of finite graphs are converging. The very nice point
of considering graph limits instead of large graphs is that low error terms are neglected
in the limit, which gives rise to some relations between densities of subgraphs.
The natural context to manipulate these relations are Razborov's Flag algebras, in which
we can write for instance:
which sense will be explained in this course (teaser).
The notion of graph limits and related concepts are clearly established for dense graphs. We will also mention attempts to define suitable convergence notions and limit objects in the sparse case.
4) Pseudo-random graphs. Expanders.
A very natural question is to ask for deterministic models of random graphs. This question naturally
arises for instance when one wants to design a network enjoying some random-like properties. Pseudo-randomness
in the dense case is easily captured with subgraphs densities. For instance if the number of edges is
roughly n^2/4 and the number of cycles of length four is roughly n^4/16 the graph behaves like
a random graph with density 1/2. In the sparse case, pseudo-random behaviour is
expressed through expansion properties and leads to the notion of expanders. We will show some important applications of expanders, including derandomization.
5) Lovasz Local Lemma and Moser's entropy compression algorithm.
The Local Lemma is a powerful tool that given a certain number of events with low probability, and with low inter-dependence, asserts that the probability that none of the events occur is non-zero. This lemma has been used extensively in combinatorics and theoretical computerscience in the past 30 years (in particular to show the existence of very large graphs with certain properties). Until recently the only known proof was non-constructive (the algorithmic versions of the lemma were significantly weaker). In a recent breakthrough, Moser (and Tardos) proved an algorithmic version of the general lemma with a very simple and elegant argument. We will show how the proof works and how the new version of the lemma can be applied to solve various problems in Graph Theory.
6) VC-dimension. Epsilon-nets and Epsilon-approximations. Sampling.
Some large real-life graphs (for instance graphs defined from physical networks) have constrained geometrical properties. The suitable notion to capture these properties is the VC-dimension of hypergraphs. We will see applications to sampling.
Validation: There will be a homework assignment and a written final exam.
Probability: Alon and Spencer's Probabilistic Method.
Graphs: Diestel's Graph Theory,
Bondy and Murty's Graph Theory, see here for the old edition.
Graph limits: Laszlo Lovasz's Large network and graph limits.