The goal of this page is to give an introduction on matching theory and short proofs of the main results. The approach is based on a lecture by András Sebő. András's inspiration came in part from Lovász' exercises on Edmonds algorithm, but mostly from his own works on matchings and -joins and other related topics. I wrote the proofs from my memory, which means that they may differ from the original, probably in the wrong direction. These notes can also be taken as a list of consecutive exercises (and that was András's original intention); the proof of each claim, lemma or theorem should not be hard to solve using the previous ones.
Given a connected graph , a matching is a subset of vertex-disjoint edges. We denote the maximum cardinality of a matching of . A matching with maximum cardinality is a maximum matching. Notice that a matching is by definition a subgraph with maximal degree . A simple combinatorial dual object to a matching is a vertex cover: a set of vertices that intersects every edge of . The size of a matching cannot exceed the size of vertex cover, as each edge of the matching must use a distinct vertex in the vertex cover. Using the notation for the cardinality of a minimum vertex cover, this translates into:
We say that a vertex is covered by a matching when contains an edge incident to . We define the two following disjoint sets:
Notice that .
We say that a path is -alternating if but no two consecutive edges of are both in or both in . For a particular matching , an -alternating path is an -alternating path. Alternating paths play a great role in matching theory. Notice that a matching is equivalently a subgraph of maximum degree . Hence, the symmetric difference (that we denote by ) of two matchings and is a disjoint union of -alternating paths (some of them can be closed, that is, cycles), because the degree of any vertex is at most . Moreover, if is a path of , then is also a matching, as can be checked at each vertex locally. From these observations follows the classical result:
Claim 2: A matching is maximal iff there is no -alternating path whose endpoints are not covered by .
Unfortunately, it does not give an efficient algorithm in the general case (but it does in the bipartie case). The reason is that finding an alternating path as stated is not easy. It is a major contribution from Jack Edmonds to have demonstrated a way to do so, and we will cover his algorithm later. But for now we take a different direction to gain some insights on the structure of the problem.
The Berge-Tutte characterization
Let's consider an edge with and in .
Claim 3: If , then for any maximal matchings and of and respectively, there is an even -alternating -path.
Proof: Consider the path with endpoint , and let be the opposite endpoint.
If has an odd number of edges, then is covered by but not by . In particular, . Hence would be a matching of of cardinality , contradiction.
Hence, has an even number of edges. If , then is a matching of size , again a contradiction. Thus, is a path with endpoints and , and is even.
An immediate consequence is for bipartite graphs: for any edge of a bipartite graph, at least one of , is in . From this we prove:
Theorem 4: (Kőnig) For any bipartite graph ,
Proof: We only need to prove . The proof is by induction on . Let be a maximal matching, and . We may assume that . Consider , this is a bipartite graph on which we can apply induction. Let a minimum vertex cover of , of cardinality . Then, is a vertex cover of . Indeed, every edge incident to is covered by , and every edge not incident to is in , and thus covered by . Hence .
Note how this theorem fails on non-bipartite graph: an odd cycle of length has but . The point is that it fails precisely because there are edges whose both endpoints are in . To get a result for general graphs, we need to treat this case in a specific way. For that, and this is the great idea of Edmonds, we introduce the contraction of a graph by an edge set , denoted , as the graph obtained by identifying the endpoints of edges of , and removing loops and multiple edges.
Claim 5: Let be the alternating path of Claim 3 , and be its length. Then, (notice that we are thus contracting vertices).
Proof:
: it suffices to find a matching of size in , which is provided by for instance.
: suppose is a maximal matching of . Let be the contracted vertex in . If does not cover , we are done by adding to a maximal matching of the cycle . Hence we assume that contains an edge . By definition of the contraction, there is a vertex such that . We can build a matching of size by starting from , and adding a maximal matching on . As is a path of length , it has a matching of size . Thus, we get a matching of with cardinality , proving the claim.
Theorem 6: (Berge , Tutte) For a graph and a vertex set , denote (or when we need the precision) the number of odd components of . Then:
That is, the minimum number of vertices not covered by a matching is equal to the maximum of over all vertex sets .
Proof: One direction is obvious: given a set with and a matching , each odd component of has either a vertex not covered by , or a vertex matched by to a vertex of . But at most odd components may be in the latter case. Hence for every possible .
We prove the equality by induction on . Consider a matching of . Let .
if one of , is in , say , then consider . By induction hypothesis, there is a set with . We pose . Then we have , as .
else, we are in the situation of Claims
3 and
5. Let be the path of Claim
5, of length . We apply the induction hypothesis on . Denote by the contraction of in . Let be a set for which . It happens that is not in : indeed, there is a maximal matching in that does not cover (for instance the matching induced by ), but the previous equality implies that every uncovered vertex must be in an odd component of (see below the proof). Thus, we pose and we have , as we replace by , the component containing is replaced by a component containing with same parity. Using claim 5 and , we get .
A set reaching the maximum in Theorem 6 is called a Tutte set. Tutte sets are natural certificates of optimality for the matching problem. As with any min-max theorem, we have some complementary slackness criterion for the optimality of a matching and a Tutte set (that we actually used in the induction):
all the vertices not covered by are in distinct odd components of ,
the vertices of must all be matched to vertices in distinct odd components of .
Note also that each odd component of either has a vertex not covered by , or (exclusively) has a vertex matched with a vertex of in .
We will refine this criterion and show that there is actually a canonical Tutte set latter, but first we use the contraction idea to get a polynomial-time algorithm for the maximum cardinality matching.
Edmonds' algorithm for perfect matching
The goal of this section is to exhibit and prove an algorithm for finding a maximum perfect matching. We do not want to tackle the details of implementation and how to get the fastest possible algorithm, instead we just want to show that is is possible to compute efficiently a maximum matching, and learn more about the structure of matchings. Hence, our algorithm will actually do only the following:
it takes a graph and a matching in ,
if is maximal, its output is a Tutte set such that , fulfil the complementary slackness criterion stated above,
if is not maximal, its output is an -alternating -path where and are not covered by .
Claim 2 ensures that exactly one of the two possible outputs is always possible. By iterating the algorithm, starting from , until the algorithm outputs a Tutte set, we get a maximal matching from any graph . Edmonds' complete algorithm would actually use ad-hoc data structures to encode , that would be updated at each step, to get a better asymptotic complexity, but our approach only keeps the heart of the procedure and is best suited for our purpose.
Given a graph and a matching , we try to find an augmenting path, that is, an -alternating -path where and are not covered by . Let's try a naive idea: start from an uncovered vertex , and look at its neighbours. If one of them is also uncovered, say , we are done, is a path as desired. Else, for any neighour of , is matched to some vertex by . Then if has an uncovered neighbour , is an -alternating path as desired, and otherwise, we check the vertices matched to neighbours of , and so on. If we are (very) lucky, we will find an augmenting path. But usually at some point we would visit a vertex already considered, which leads either to exponential complexity (if we go on from this vertex) or to unsound behaviour (if we ignore it).
An -alternating -tree (or alternating tree when there is no ambiguity) is a tree rooted at an uncovered vertex , such that any path between and a leaf is an -alternating path of even length.
What we described is a (too simple) procedure to grow an alternating tree. An -forest is a forest of -alternating trees whose roots are exactly the uncovered vertices of . Given a forest , we partition the set of vertices of in:
inner, or odd, vertices are the vertices at odd distance of the root of their tree in ,
outer, or even, vertices are the vertices at even distance of the root of their tree in . The uncovered vertices, that is the roots, are then outer. Inner-inner, inner-outer and outer-outer edges are edges whose endpoints are both inner, one inner and one outer, or both outer respectively. A forest is maximal if there is no edge between an outer vertex and a vertex outside .
We are motivated by the next claim.
Claim 7: If is a maximal -alternating forest such that does not contain an outer-outer edge, then is maximal and the set of inner vertices is a Tutte set.
Proof: Let and be the sets of inner and outer vertices respectively. Then each outer vertex induces a distinct odd component in by maximality of . Also, the vertices not covered by induces even components, as they are paired together by . We check the complementary slackness conditions:
all the uncovered vertices are outer vertices, hence in distinct odd components of ,
by definition of an -alternating forest, every inner vertex is matched to an outer vertex, in distinct odd components of .
Hence is maximal and is a Tutte set.
We face two challenges: dealing with outer-outer edges, and growing the tree until it is maximal. Let's answer the latter problem first. Suppose that is an outer vertex of , and is adjacent to but not in . Then is covered (otherwise it would be a root in ), let be the incident edge in . As is not in , is a bigger -alterning forest. Now we go back to the former problem.
Given a matching , a contractible cycle is a cycle of length containing edges in , and such that there is an even -alternating path from the vertex of not covered by to an uncovered vertex of . This definition derives naturally from Claim 5: if is a maximum matching, then the condition that there is an alternating path from to an uncovered vertex implies that . Hence if is maximal, it induces a maximal matching when contracting
Claim 8: If an -alternating forest contains an outer-outer edge , then
if and are in the same tree in , there is a contractible odd cycle containing .
if and are in distinct tree in , there is an -augmenting path.
Proof: First case, if and are in the same tree , then there is an -path in , and let be the unique vertex in closest to the root of . then can be decomposed into two -alternating even paths: from to and from to . Then is an -alternating odd cycle (where is the vertex not covered by the restriction of to ), that is is contractible.
Second case, if and are in distinct trees and . Let be the path from the root of to , and be the path from to the root of . Then is an -alternating odd -paths, and and are not covered.
Reformulation Claim 5, we have:
Claim 9: If is a maximum matching and a contractible cycle for , then is maximal in , and a Tutte set of is also a Tutte set in .
Proof: It remains to prove that a Tutte set of extends to a Tutte set in . Let be the contraction of in . Then because by definition of contractibility is not covered by at least one matching of cardinality , i.e. . It follows that is a Tutte set of as (because has an odd nomber of vertices).
It remains one case in the algorithm: when is not maximal and there is a contractible cycle. Then is not maximal and hence it has an augmenting path.
Claim 10: Let be a matching in , a contractible cycle for , the contraction of . Suppose that there is an -augmenting path , then there is an -augmenting path in with .
Proof: If does not contain , it is an alternating path in , thus in too.
Suppose is in . Let and be the endpoints of . Then there are only two possible -paths with edge sets contained in , inducing respectively a subpath and in , and one of their extremity is the vertex not covered by . Assume that is even. Then is alternating and induces an -alternating -path.
We summarize our discussion by the following algorithm. Given a graph , a matching , and an -alternating forest :
if there is an edge between an outer vertex and a vertex not in matched by to , then let (Claim
8), recurse on .
if there is an outer-outer edge between two different trees of , output an -augmenting path (Claim
8).
if there is an outer-outer edge between two vertices of a same tree in , define an odd cycle that is contractible, recurse on :
otherwise, output the Tutte set of all the inner vertices in .
We just proved:
Theorem 11: (Edmonds) Maximum Cardinality Matching is solvable in polynomial-time.
Remark that, even without considering details of implementation, our algorithm is inefficient when it finds a contractible cycle: a better algorithm would not start with an empty forest, but instead contract the current forest for into a forest for . But this algorithm is possibly the simplest to understand and prove for this problem.
Implementation
Let's apply this and write an real algorithm in a real programming language. But first, a disclaimer:
The implementation is given in OCaml, and is part of my own graph library. The interface for graphs is given here. I also use a module for paths, that implements a double-ended queue with fold and concatenation functions, complete interface here. The advantage is that when something is supposed to be a path, it is indeed a path for sure (I suppose that the implementation for path is correct, which is reasonable as it is straightforward, and it can be checked independantly). Notice that a path is defined over a fixed graph, hence a sequence of edges that is a path in two distinct graphs needs two representations.
Similarly, I define a module for matching. Again, a matching is defined for one fixed graph, and a matching is really what we meant it to be. The definition of a matching is given by:
type matched = G.edge * G.vertex
type t =
{ g : G.graph;
phi : matched G.VMap.t;
set : G.E.t;
covered : G.V.t;
uncovered : G.V.t;
}
phi is simply a partial function that gives for every vertex covered the vertex to which and the edge by which it is matched.
The interface is here. There is only one non-obvious function here:
val contract_blossom : t -> P.t -> t
This function takes a matching and a path representing the cycle of a blossom, and contracts the blossom (hence returns a matching on a smaller graph). Here is the implementation that I give:
let contract_blossom m path =
if not (check_blossom m path) then invalid_arg "contract_blossom (Matching): not a blossom"
else
let w = P.last_vertex path in
let vset = P.fold (fun vset u _ _ -> G.V.add u vset) (G.V.singleton w) path in
{ g = G.simplify (G.identify m.g ~w vset) w;
phi = G.V.fold G.VMap.remove (G.V.remove w vset) m.phi;
set = P.fold (fun eset _ e _ -> G.E.remove e eset) m.set path;
covered = G.V.add w (G.V.diff m.covered vset);
uncovered = m.uncovered
}
check_blossom checks that the operation is legal. Actually, my implementation is not correct because I don't check that the graphs on which m and path are defined are the same. It would have been cleaner to actually define a module just for alternating paths, defined for a given matching, that would have almost the same interface than a normal path (and it would have provided a nice example of recursive modules). Another possible choice would be not to include that function in the matching interface. The implementation is straightforward here: we define w the vertex at the top of the blossom, and compute the contracted graph (line 7): we identify the vertices of the blossom with w and then remove loops and multiple edges around w. vset is the set of vertices in the blossom, that we use to define the contracted matching. Notice that w is still covered by the same edge after the contraction.
Now we have all that we can desire to write our implementation of Edmonds' algorithm. The interface will be:
module type S =
sig
module M:Matching.S
type kind =
| Outer of M.P.G.edge option
| Inner of M.P.G.edge
| Away
type forest
val initial_forest : M.t -> forest
val link : forest -> M.P.G.vertex -> kind
val g : forest -> M.P.G.graph
type augment_result =
| Augmenting_Path of M.P.t
| Blossom of M.P.t
| Forest of forest
val augment_forest : forest -> M.P.G.edge -> augment_result
type edmonds_result =
| Augment of M.P.t
| Tutte of M.P.G.V.t
val edmonds : M.t -> edmonds_result
val maximum_matching: M.t -> (M.t * M.P.G.V.t)
end
module Make : functor (M:Matching.S) -> S with module M = M
We keep the structure of the proof: we give Edmonds' algorithm as an algorithm taking a matching and computing a certificate of optimality or an augmenting path. The algorithm is further decomposed into an algorithm that given a forest, computes an augmenting path, a blossom, or a larger forest. For completeness, we give a function maximum_matching that computes directly a maximum matching and a Tutte set from any matching, even empty.
The type forest is coded like this:
type forest =
{ matching : M.t;
links : kind G.VMap.t;
unscanned : G.E.t
}
links classifies the vertices as inner, outer or away from the forest. Notice that an outer vertex or an inner vertex points to the edge toward its father in the forest. We also keep a set of edges, unscanned (implemented as a set, but could have been a list for better performance), having one outer endpoint but not being yet in the forest. We must check all these unscanned edges for augmenting paths, blossoms or for a way to grow the forest.
The heart of the algorithm is the function augment_forest, implemented like this:
let augment_forest forest edge =
let (u,v) = G.extremities (g forest) edge in
let chain path1 edge path2 = P.rev_concat path1 (P.add_first edge path2) in
if equal u v then Forest forest
else match link forest u, link forest v with
| Outer _, Outer _ ->
let path1 = get_path forest u (P.empty (g forest) u) in
let path2 = get_path forest v (P.empty (g forest) v) in
if equal (P.last_vertex path1) (P.last_vertex path2)
then
let (subpath1, subpath2) = remove_common_suffix path1 path2 in
Blossom (chain subpath1 edge subpath2)
else
Augmenting_Path (chain path1 edge path2)
| Outer _, Away -> grow forest edge u v
| Away, Outer _ -> grow forest edge v u
| _ , Inner _
| Inner _, _
| Away, Away -> Forest { forest with unscanned = G.E.remove edge forest.unscanned }
This implementation emphasizes the four different cases happening when one wants to add an edge to te tree. If the two endpoints and are outer, there is a blossom or an augmenting paths. To distinguish the two cases we use the function get_path to get the paths from and to the roots of their respective trees. If the roots are equal we have a blossom, in which case we must remove the common edges of the two paths (using remove_common_suffix, otherwise we get an augmenting path by a simple concatenation.
If one endpoint is outer, and the other one is away, the forest grows. We factorize the code for growing in the function grow. In every other case nothing happens: outer--inner does nothing, and the other cases cannot happen as one of the endpoints must be outer.
The code for growing a forest is:
let grow forest edge u v =
match M.matched_to forest.matching v with
| Some (e_in_m,w) ->
Forest (
{ forest with
links = G.VMap.add v (Inner edge) (G.VMap.add w (Outer (Some e_in_m)) forest.links);
unscanned = G.E.remove edge (G.E.fold G.E.add (G.delta (g forest) w) forest.unscanned)
})
| None -> assert false
We simply add two edges to the forest, and add all the edges incident to the new outer vertex to unscanned, except the edge being scanned.
Now we can code the one-step algorithm, that augments by one the size of a matching or computes a Tutte set. This is the function edmonds
let rec edmonds_forest forest =
if G.E.is_empty forest.unscanned
then
Tutte (G.VMap.fold
(fun v k vset -> match k with Inner _ -> G.V.add v vset | _ -> vset)
forest.links G.V.empty
)
else
let edge = G.E.choose forest.unscanned in
match augment_forest forest edge with
| Augmenting_Path p -> Augment p
| Blossom c ->
begin
let m_c = M.contract_blossom forest.matching c in
match edmonds m_c with
| Augment p -> Augment (uncontract_blossom forest m_c c p)
| Tutte vset -> Tutte vset
end
| Forest f -> edmonds_forest f
and edmonds matching = edmonds_forest (initial_forest matching)
The implementation follows the proof: if all the edges incident to an outer node are scanned, the set of inner vertices is a Tutte set certifying the optimality. Otherwise, augment_forest finds either an augmenting path (we are done), or the forest grows (we simply recurse), or it finds a blossom (we recurse on the contracted graph). In the last case, if we find a Tutte set in the contracted graph, we proved that it is a Tutte set in the original graph. If we find an augmenting path, then we need to uncontract that path, this is the most technical part of the algorithm:
let uncontract_blossom forest m_c blossom path =
let w = P.first_vertex blossom in
let (p1,p2) = cut_when (fun u -> equal u w) path in
if P.is_empty p2 then
translate_f (g forest) path
else
let (path1,path2) = (translate_f (g forest) p1, translate_b (g forest) p2) in
let (v1,v2) = (P.last_vertex path1, P.first_vertex path2) in
let w' = if equal w v1 then v2 else v1 in
let (c1,c2) = cut_when (fun u -> equal u w') blossom in
match G.E.cardinal (P.edge_set c1) mod 2 = 0, equal w v1 with
| false, true -> P.concat path1 (P.concat (P.reverse c2) path2)
| false, false -> P.concat path1 (P.concat c2 path2)
| true, true -> P.concat path1 (P.concat c1 path2)
| true, false -> P.concat path1 (P.concat (P.reverse c1) path2)
This is ugly. We need to cut path into two subpaths, and insert an even-length subpath of the blossom in-between, which is done by also cutting the blossom into two pieces (at vertices and ). And because path lives in the contracted graph, we must rewrite it as a path in the original graph (lines 6 and 8).
Finally, the function to compute a maximum matching:
let rec maximum_matching matching =
match edmonds matching with
| Tutte set -> (matching,set)
| Augment p -> let matching = M.augment matching p in maximum_matching matching
The complete implementation is available here
The Edmonds-Gallai decomposition
Bibliography
1. | Jack Edmonds: Paths, trees, and flowers. Canadian Journal of mathematics, 17, 449 -- 467. (1965). | |
2. | László Lovász, Michael D. Plummer: Matching theory. AMS Chelsea Publishing, (1986). | |