Monte-Carlo algorithms in Poly testing Babai 1979.pdf

download Monte-Carlo algorithms in Poly testing Babai 1979.pdf

of 33

Transcript of Monte-Carlo algorithms in Poly testing Babai 1979.pdf

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    1/33

    Monte-Carlo algorithms in

    graph isomorphism testing

    Laszlo Babai

    Eotvos University, Budapest, HungaryUniversite de Montreal, Canada

    Abstract

    Abstract. We present an O(V4 logV) coin flipping algorithm to testvertex-colored graphs with bounded color multiplicities for color-preservingisomorphism. We are also able to generate uniformly distributed ran-

    dom automorphisms of such graphs. A more general result finds gener-ators for the intersection of cylindric subgroups of a direct product ofgroups in O(n7/2 log n) time, where n is the length of the input string.This result will be applied in another paper to find a polynomial timecoin flipping algorithm to test isomorphism of graphs with boundedeigenvalue multiplicities. The most general result says that ifAutX isaccessible by a chain G0 Gk = AutX of quickly recognizablegroups such that the indices|Gi1: Gi|are small (but unknown), theorder of|G0|is known and there is a fast way of generating uniformlydistributed random members ofG0 then a set of generators ofAutXcan be found by a fast algorithm. Applications of the main result im-

    prove the complexity of isomorphism testing for graphs with boundedvalences to exp(n1/2+o(1)) and for distributive lattices to O(n6 log logn).

    Apart from possible typos, this is a verbatim transcript of the authors 1979 technicalreport,Monte Carlo algorithms in graph isomorphism testing (Universite deMontreal, D.M.S. No. 79-10), with two footnotes added to correct a typo and aglaring omission in the original.

    1

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    2/33

    All algorithms depend on a series of independent coin flips and have asmall probability of failure (reaching no decision), but, unlike for someclassical Monte-Carlo algorithms, the correctness of the decision madecan always be checked and we are not referred to the hope that eventswith small probability are practically impossible. We suggest the termLas Vegas computation for such strong Monte-Carlo procedures.

    0 Monte Carlo or Las Vegas?

    0.1

    Fast Monte-Carlo algorithms to decide some interesting recognition problemshave been around for a while now, the most notable among them being theStrassen-Solovay primality test [16].

    One feature of this algorithm is that in case of a negative answer (theinput number is decided to be prime) there is no check on the correctnessof the answer. The situation is similar in the case when we wish to decidewhether a multivariate polynomial vanishes identically, by substituting ran-dom numbers for the variables. (R. Zippel [18]). In some cases, however,

    we can check whether the decision reached was correct (as in Zippels GCDalgorithm [17]).

    It may be worth distinguishing these two kinds of random algorithms byreserving the term Monte-Carlo for the Strassen-Solovay type algorithms. Ipropose the term Las Vegas algorithm for those stronger procedures wherethe correctness of the result can be checked. Adopting this terminology, acoin-tossing algorithm, computing a function F(x) will be called a

    Monte-Carlo algorithmif it has

    INPUT:x(a string in a finite alphabet)

    OUTPUT:y (believed to be equal to F(x))

    ERROR PROBABILITY: less than 1/3 (the error being y=F(x))1.1Typo corrected: the original had 1/2 here. (Footnote added.)

    2

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    3/33

    We shall call the computation a Las Vegas algorithmif it hasINPUT:x

    OUTPUT: either ? or F(x)

    PROBABILITY OF FAILURE: less than 1/2 (the failure meaning ? out-put).

    (Of course, repeatedly applying the algorithm t times, the probability oferror/failure is reduced to less than 2t).

    In particular, if our computation is to solve a recognition problem (i.e.

    F(x) {yes, no}) and its running time is a polynomial of the length of theinput string2, thenFbelongs to the class calledRPby Adleman and Manders[1] if the algorithm is Monte-Carlo and it belongs to R =RPcoRP if thealgorithm is Las Vegas.

    Note that every Las Vegas computation is Monte Carlo, but not con-versely.

    0.2

    In this paper we give (contrary to the title) Las Vegas algorithms to testisomorphism in certain classes of graphs. One of the applications of themain results of this paper (2.5, 5.2) will be a polynomial time Las Vegasisomorphism testing for graphs having bounded multiplicities of eigenvalues[6]. In particular, non-isomorphism is in N P for these classes of graphs (i.e.isomorphism is well characterized in the sense of Edmonds: the negativeanswer can also be verified in polynomial time) (since R N P coNP).It is not known in general, whether non-isomorphism of two graphs on nvertices can be provedin exp(o(n)) steps.

    It is the authors dream that such a proof, though difficult, isnt out of

    reach anymore, and the present paper may be a contribution to that modestgoal.

    2We also need to assume that a yes answer is always correct. (Footnote added.)

    3

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    4/33

    0.3To the authors knowledge, no coin-tossing algorithms have previously beenknown to solve truly combinatorial recognition problems, not solvable by anyknown polynomial time deterministic algorithm.

    1 Introduction

    1.1

    Graph isomorphism testing by brute force takes n! time. (n will alwaysrefer to the number of vertices.) Heuristic algorithms like the one treatedin 4.4 are often used to classify the vertices and thereby reduce the numberof trials needed (cf. [14]). If such a process splits the vertex set of a graphX into pieces of sizes k1, . . . , kr(

    ki = n) and the same happens to the

    graph Y then we are still left withr

    i=1(ki!) bijections as candidates foran isomorphism X Y. It is frustrating that this number is exponentiallylarge even if our vertex classification algorithm was as successful as to achievek1 = . . . = kr = 3, and we dont know of any deterministic algorithm that

    could test isomorphism of such pairs of graphs with so successfully classifiedvertices within subexponential time.

    1.2

    We are able, however, to give an O(n4 log n) Las Vegas algorithm (for the casewhen all classes have bounded size). The precise statement of the problemto be solved is this. Given two graphs with colored vertices, each color classhaving sizek, decide whether they admit a color preserving isomorphism.The cost of our Las Vegas computation will be O(k

    4k

    n4

    log n). (Theorems3.1, 3.2). (We remark that for k 2, there is a straightforward linear timedeterministic algorithm.)

    After the computation is done, we shall also be able to generate uniformlydistributed random members of the automorphism groupAutXof the colored

    4

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    5/33

    graph X at O(k!n2

    ) cost each. Moreover, the algorithm displays a set ofgenerators ofAutX.

    1.3

    The more general setting, covering both the vertex-colored graphs with smallcolor multiplicities and the graphs with bounded eigenvalue multiplicities isthe following.

    SupposeAutX ispolynomially accessible from a well-described group. By

    this we mean that we have a chain of groups G0 G1 Gk =AutX(k < nc) such that

    (i) G0 is well-described, i.e. we know its order|G0| and we are able togenerate uniformly distributed random members ofG0 (cf. Def. 2.1);

    (ii) the indices|Gi1:Gi|are small (uniformly bounded by a polynomialofn);

    (iii) the groups Gi are recognizable(i.e., given a member ofGi1 we candecide in polynomial time whether it belongs to Gi, i= 1, . . . , k 1).

    Under these circumstances we proceed as follows (cf. Theorems 2.5 and2.10).

    We extend this chain by adding Gk Gk+1 Gk+n where Gk+jis the stabilizer of the jth vertex of X in Gk+j1(j = 1, . . . , n). Clearly|Gk+n| = 1. By induction on i, we generate uniformly distributed randommembers ofGi1 in sufficient numbers such as to represent all left cosets ofGi1modGiwith high probability, and then select a complete set of left cosetrepresentatives. We use these coset representatives to generate uniformlydistributed random members ofGi(once such elements ofGi1are available)and continue. This way we compute the indices ri =|Gi1 : Gi with apossible error, but this will be eliminated by checking ifri=|Gi 1 :Gi| butthis can be eliminated by checking ifk+ni=1 ri=|G0|. If not, we output ?; ifyes, we have all we wanted. AutXis generated by the coset representativesbelow Gk =AutX.

    5

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    6/33

    1.4We apply this procedure to obtain improved theoretical bounds on the com-plexity of isomorphism testing for distributive lattices (nc log logn, Cor. 3.4),trivalent graphs (exp(cn1/2 log n)) and of graphs with bounded valence (exp(n1/2+o(1)),Theorem 4.1). The best previously known bound for these classes was (1+c)n,c a positive constant (G.L. Miller [14]).

    1.5

    It is the authors hope that an exp((log n)c) Las Vegas isomorphism test fortrivalent graphs may arrive in the not too distant future. Besides the Las Ve-gas algorithm presented here, results about the behavior of a simple canonicalvertex classification algorithm (4.4) combined with a depth-first search maypossibly possibly find further applications (Theorem 4.8, Algorithm 4.9).

    1.6

    Open problems, indicating the limits of (the authors) present knowledge are

    scattered throughout the paper. Some comments about the way how thepresented ideas arose are included with the acknowledgments at the end ofthe paper.

    1.7

    Let me add here some comments on theshortcomingsof the results. The factthat the algorithms are not deterministic is not a defect from either theoret-ical or practical point of view. Any attempt of practical implementation of

    the colored graph isomorphism test 3.1-3.2 should, however, be preceded bythe use of all available heuristics, since the O(n4 log n) running time is toolong. It would be very interesting to see improvements on the exponent 4.The most essential deficiency of our algorithm is, however, that it does notprovide a canonical labelingof the vertices. A canonical labeling of the class

    6

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    7/33

    Kof graphs is an assignment of a labeled graph C(X) to every graphXKsuch that(i) X=C(X); (ii) X=Y iffC(X) =C(Y).

    In all cases previously known to me, isomorphism testing algorithms ac-tually yield canonical labeling (see e.g. [11], [2], [12]). R.E. Tarjan has kindlyinformed me that the celebrated fast planar graph isomorphism test [7], [8],[9] is no exception.

    Problem 1.8. Find a (random) polynomial time canonical labeling algo-rithm for vertex-colored graphs (V , E , f ) where every color class has size

    3.(Here, (V, E) is a graph and f : V {1, . . . , n} is an arbitrary map

    called coloring, such that|f1(i)| 3(i= 1, . . . , n). )Remark 1.9. The hierarchy of the results of this paper and ref. [6] is this:

    2.5 5.23.13.23.33.4

    2.10[6] 4.1

    2.11

    In view of the technical nature of the proof of both 2.5 5.2, and5.23.1, we present a direct proof of 2.53.1 in section 3.

    2 The main result

    We are going to handle large groups, far too large to be stored by their

    multiplication tables. The group elements will be thought of as 0 1 strings,and we require that group operations be carried out by a fast algorithm.For our purposes it is not necessary to postulate that the set of those 0 1 strings representing group elements be recognizable by a fast algorithm,although this will always be the case in the subsequent applications of themain result. The information we need about our large group is its order,

    7

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    8/33

    and an algorithm generating uniformly distributed random members of thegroup (using random numbers).

    For notational convenience, we adopt the convention to identify non-negative integersMwith the sets of their predecessors: M={0, . . . , M 1}.A mapg : MG will be called uniformif|g1(x)|= M/|G|for eachxG.Definition 2.1. We say that a group G is well describedwith respect to thetime bounds (t, T) where tTif the following conditions are satisfied:(a) the order N ofG is given;

    (b) we are given an algorithm executing group operations on G in less thantsteps (in particular, the length of the 0 1 strings representing the elementsofG is less than t, hence N=|G|< 2t);(c) we are given a multiple M ofN, where M is still less than 2t, and analgorithm computing a uniform map g: MG in less than T steps.Remark 2.2. Condition (c) means that we can compute uniformly dis-tributed random members ofG, within Tsteps, employing uniformly dis-tributed random integers from M={0, . . . , M 1}.Definition 2.3. The numbers N, Mand the algorithms (b) and (c) consti-

    tute a good description of the group Gw. r. to the time bounds t, T.Definition 2.4. A chain of subgroups G0 G1 Gk is recognizablein time , if there is an algorithm with

    INPUT: a pair (i, x) where xis known to belong to Gi1 (1ik);OUTPUT: yes ifxGi, no otherwise;RUNNING TIME: not exceeding .

    (For k = 1, we shall say that G1 is a subgroup ofG0, recognizable in time.)

    Now we are able to formulate the main result of this paper.

    Let G0 be a group, well described w. r. to the time bounds t, T and letG0G1 Gm= E(|E|= 1) be a chain of subgroups, recognizable intime. LetSi |Gi1:Gi|be known upper bounds on the unknown indicesof subsequent subgroups (i= 1, . . . , m).

    8

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    9/33

    Theorem 2.5. There is a Las Vegas algorithm withINPUT: a good description ofG0 w.r. to time bounds (t, T); an algorithm,recognizing the chain of subgroups G0G1 Gm=E in steps; theintegerss1, . . . , sm (where si |Gi1 : Gi|).OUTPUT: either ? or else

    (i) Ri, a complete set of left coset representatives of each Gi1 mod Gi(i= 1, . . . , m) (|Ri|=|Gi1:Gi|);

    (ii) a good description of each Gi w. r. to the time bounds (t, T+ 2(s1+. . .+ si)) (i= 1, . . . , m).

    COST OF COMPUTATION: less thanmi=1 si(log si+log (4m))(T+2

    ij=1 sj)ms(log s+log (4m))(T+(m+1) s)

    binary operations and less than 4tms(log s+ log m+ 3) coin tosses, wheres= max{si : 1im}.Remark 2.6. Output (i) yields, in particular, the orders of every Gi, andalso a set of at most

    mj=i+1 si generators ofGi. Output (ii) enables us to

    generate uniformly distributed random members ofGi.

    Our procedure rests on the following two observations. First, we note thata recognizable subgroupHhaving small index in a well-described group G isitself well described, and all we need in order to construct a good descriptionis a complete set of left coset representatives of G mod H. The secondobservation will be that we have a good chance of obtaining a complete set ofrepresentatives simply by guessing a sufficient number of random membersof G and selecting a maximal subset of pairwise left incongruent elementsamong them (mod H).

    Lemma 2.7. Let G be a group, well described w.r. to the time bounds(T, t) and Ha subgroup ofG, recognizable in time . ThenHhas a good

    description with time bounds (t, T+ 2|G: H|). We can construct the gooddescription once a good description of G and a complete set of left cosetrepresentatives ofGmod H is given.

    Proof. Let{a1, . . . , as} be a complete set of left coset representatives ofG mod H. We check Def. 2.1. (a), (b), (c) for H.

    9

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    10/33

    (a) The order ofH is|G|/s. (b) is clearly inherited to H. (c) Let M denotethe number occurring in the good description of G, and g : M G thecorresponding uniform function. We use the same number M for H. Firstwe define a uniform map h : GHby

    h(x) =a1j x i f xaj H.

    (For any x G, h(x) can be computed by computing a11 x , . . . , a1s x anddetermining which of them belongs to H.)

    Now, let g(u) =hg(u) (uM). Clearly, g: M His uniform.Let again G be a group, well described w.r. to the time bounds (t, T);

    and H a subgroup ofG, recognizable in time . The index|G : H| is notknown to us, only an upper bound s |G: H|.Lemma 2.8. There exists a Monte Carlo algorithm with

    INPUT: a good description of the group G w.r. to the time bounds (t, T);

    an algorithm, recognizing the subgroup H in time ;

    an integer swhich is known to be |G: H|;a positive number q(the parameter of the cost-security tradeoff).

    OUTPUT: a positive integer k (believed to be equal to|G: H|);a set a1, . . . , ak of elements ofG, pairwise left incongruent mod H.

    PROBABILITY OF ERROR: less than eq.

    COST OF COMPUTATION: less than s(log s + q)(T+ 2s) elementary op-erations ands(log s+ q) independent random numbers from the intervalM={0, . . . , M 1}.(x denotes the smallest integer, not smaller than x.)

    Note that this is aMonte-Carloalgorithm,notLas Vegas. The possible errorisk

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    11/33

    To select theai we have to perform at most (r 1)sdivisions of the formb1i bj and test whether b1i bjHeach time.We estimate the error probability. For a coset bH, the probability that

    bibH is 1/|G: H| 1/s, hence the probability that none of the bi belongsto bH is at most (1 1/s)r < er/s. Finally, the probability that there is acoset not represented byb1, . . . , br(i.e.,k

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    12/33

    1

    24Q

    Q1j=0

    4Q

    j

    q, all we have to check is whether Viq is pointwise fixed under x. Hencethe chain G0

    G1

    Gm is recognizable in =O(k

    2) time.

    Finally, G0 = AutX0 = Sk1 . . .Skp (direct product of symmetricgroups) is well described w.r. to the time bounds (t, T) where t = =O(n log k. Clearly, we may chooseM = N =|G0| =pi=1(ki!) for the gooddescription ofG0.

    We have now all we need in order to apply Theorem 2.5. Our output willbe either ?, or R2q+1. . .Rm, a generating set ofG2q = AutX. Theprobability of failure is less than 1/2 by 2.5.

    Let s= k!. Clearly,m= 2q+p < n2.

    The cost of computation is less than

    ms(log s+ log (4m))(T+ (m+ 1) s) =

    =O(n2k!(k log k+ log n)(n log k+ n2k2k!)) =

    O(n4 log n(k+ 2)!2) operations and less than

    15

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    16/33

    4tms(log s+ log m+ 3) =O(n log k n2

    k! (k log k+ log n)) ==O(n3 log n(k+ 2)!) coin tosses.

    Corollary 3.2. There is a Las Vegas algorithm with

    INPUT: vertex-colored digraphs Xand Y;

    OUTPUT: either ?, or an isomorphism between Xand Y, or a proof thatthey are not isomorphic.

    COST OF COMPUTATION: O(k4kn4 log n) operations and O(k2kn3)log ncoin tosses,

    where n is the number of vertices ofX and k is the size of its largest color-class.

    Proof. We may assumeXand Yare connected and both have the same setof colors, each color occurring the same number of times in the two graphs.We apply Theorem 3.1. to the disjoint union X Y. Clearly, X and Yare isomorphic if and only if at least one of the generators ofAut(X Y)interchanges X and Y, thus providing an isomorphism. The proof of non-isomorphism consists of displaying coset representatives for the subsequentsubgroups in the chain, constructed in the proof of Theorem 3.1, and proving

    that each of them is a complete set of coset representatives (by multiplyingtheir orders), and finally observing that none of the elements thus proven togenerate Aut(X Y) interchange Xand Y.

    We have an immediate application to partially ordered sets (posets). Thewidthof a poset is the size of its largest antichain.

    Corollary 3.3. There is a Las Vegas algorithm with

    INPUT: posetsX and Y;

    OUTPUT and COST OF COMPUTATION: as in Corollary 3.2, withnbeingthe number of elements ofX, and k standing for the width ofX.

    Proof. The partial orders can be viewed as digraphs. Let us assign colori to a vertex x of X or Y if i is the length of the longest chain below x.Clearly, the color classes are antichains, hence their size does not exceed k.This coloration being invariant under isomorphisms, the result is immediatefrom 3.2.

    16

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    17/33

    The isomorphism problem for lattices is isomorphism complete (i.e., poly-nomial time equivalent to graph isomorphism) (FRUCHT [4]). This is, how-ever, very unlikely to be the case for distributive lattices, in view of thefollowing result.

    Corollary 3.4. There is a Las Vegas algorithm with

    INPUT: distributive lattices X, Y(given by their operation tables);

    OUTPUT: either ?, or an isomorphism between X and Y, or a decisionthat they are not isomorphic;

    COST OF COMPUTATION: O(n6 log logn) operations and coin tosses. (n=|X|=|Y|).Proof. Every finite distributive lattice X is isomorphic to the lattice ofideals of the poset J(X) of its join-irreducible elements. (Birkhoff, see [5, p.61]). Xand Yare isomorphic if and only ifJ(X) andJ(Y) are isomorphic.Therefore we find the posetsJ(X) andJ(Y) and apply Cor. 3.3. We have toestimatek, the width ofJ(X). IfJ(X) contains an antichainAJ(X), thenAgenerates a Boolean algebra on 2|A| elements inX. Hence, 2k |X|. Also,obviously|J(X)|

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    18/33

    We dont know any deterministic algorithm whose worst case behavioron trivalent graphs would be better than cn for every c > 1. We are, how-ever, able to replace the exponent n by essentially

    n, using our Las Vegas

    algorithm.

    Theorem 4.1. There is a Las Vegas algorithm with

    INPUT: graphs Xand Ywith maximum valence 3;

    OUTPUT: either ?, or an isomorphism ofXandY, or a decision that theyare not isomorphic;

    COST OF COMPUTATION: less than exp((4 +o(1))

    n log n) operationsand coin tosses.

    More generally, for maximum valence d3 the cost of our computationwill be less than exp((4 + o(1))

    n(log n)1+(d1))/2) where (x) denotes the

    number of primes not exceeding x. ((2) = 1, (3) = (4) = 2, (5) =(6) = 3.)

    Remark 4.2. For bounded d this is an exp(n1/2+o(1)) cost. For d < (12c)log n (c a positive constant), our cost is exp(n1 c+ o(1)), still betterthan the previously known exp(n1+o(1)) bounds.

    The next problem to be solved in this area is to extend the range ofpossible maximum degrees.

    Problem 4.3. Find a positive constant c such that if X and Y are non-isomorphic graphs on n vertices with maximum valence less than nc thentheir non-isomorphism has a proof, shorter than exp(n1c).

    Procedure 4.4. The proof of 4.1 will use a well-known naive canonicalvertex classificationwhich we have to describe here (cf. [15]).

    Let X= (V, E) be a graph with colored vertices, the function f :vnbeing the coloration. We suppose that the set of colors actually occurring

    forms an initial segment ofn={0, 1, . . . , n 1}. We refine the color classesas follows. For v V, let ki(v) denote the number of those neighbors ofvhaving color i. Let us assign to v the vector g(v) = (f(v), k1(v), . . . , kn(v)).Let us arrange these vectors lexicographically, sayw1

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    19/33

    f

    (v1) =f

    (v2) ifg(v1) =g(v2).Let now f0 = f, f1 = f

    , . . . , f i+1 = fi . Clearly, we havefi0 = fi0+1 =

    . . . = fn = . . . for some i0 n. (There are not more than n1 properrefinements possible.) Let us call fn the stable refinementoffand denote itby f. Let us call a coloration f stableiff=f (hence f = f). Clearly, thestable refinement of any coloration is stable.

    By a semiregular bipartite graph we mean a bipartite graph with colorpartition V = V1 V2 such that all vertices in one class have the samevalence. For V1, v2 disjoint subsets of the vertex set of X, the bipartitesubgraph induced by (V1, V2) means the bipartite graph on V1

    V2 whose

    edge set consists of those edges ofXconnecting a vertex ofV1 to a vertex ofV2. We denote this bipartite graph byX(V1, V2). Also, the subgraph inducedbyV1 will be denoted by X(V1). The following is straightforward.

    Proposition 4.5. The coloration f of the graph X is stable if f all in-duced subgraphs X(f1(i)) are regular and all induced bipartite subgraphsX(f1(i), f1(j)) are semiregular (i, j < n, i=j).

    We shall need to change fby assigning a new color to a vertex v, unlessits color-class was a singleton.

    Definition 4.6. For a coloration f and vV let the v-pointed recolorationf0v be defined byf0v (x) =f(x) ifx=v,f0v (v) =min(n{f(x) :x=v}).

    Observe that f0v =f iff f1(f(v)) ={v}.

    The f0v -color-classofv is necessarily a singleton. Let fv denote the stablerefinement off0v . We callfv the stabilizerofv in f. ForS = (v1, . . . , vs)an ordered s-tupleof vertices, the stabilizer ofS in f will be fS = fv1...vs,obtained by repeated application of the operation above. (The terms are

    borrowed from the theory of permutation groups. The classes of a stablecoloring imitate some properties of the orbits of the automorphism group ofa (colored) graph. We prove one of these analogies below, Lemma 4.8.).

    The following is straightforward.

    19

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    20/33

    Proposition 4.7. Letfbe a stable coloration ofXsuch that the color-classofxVis a singleton. Then for any verticesy, zV,dist(x, y)=dist(x, z)impliesf(y)=f(z).

    Now we prove that ifX is connected and a vertex of valence less thanthe maximum has a singleton for its color-class then all prime divisors of thelengths of the color-classes are less than the maximum valence.

    For m2, let pr(m) denote the largest prime divisorofm. Set pr(1) =1, pr(0) = 0.

    Now we give a bound on the prime divisors of the lengths of the color-

    classes of fx for a connected graph X with bounded valences. Note thatunder the same conditions, the same bounds are easily seen to be valid forthe prime divisors of the order of the stabilizer subgroup of the automorphismgroup and consequently for the orbit lengths of the stabilizer, cf. [3, Theorem1].

    Lemma 4.8. Suppose that all vertices of the connected graph Xhave va-lences d. Letfbe a stable coloration ofXsuch that the color-class of a ver-texx of valenced 1 is a singleton (i.e. fx=f). Thenpr(f1(i))d 1for every i < n (i.e. the lengths of the color-classes have no prime divisorexceeding d

    1).

    Proof. Let z V. We prove by induction on the distancedist(x, z) thatthe color-class of zhas no prime divisors exceeding d1. This is true ifz= x. Suppose it holds for all distances less than dist(x, z) = k > 1. Lety V be a neighbor ofz at distance k1 from x. Let A and B denotethe color-classes ofy and z, resp. (A= B by 4.7.) pr(|A|) d 1 by theinduction hypothesis. Let the vertices fromA and B have valence a and b,resp., in the bipartite graph X(A, B). Clearly,a, bd. Moreover,ad 1since ifk2 then y has a neighbor at distance k 2 from x; ifk = 1 theny=x and|B| d 1 (since now B is a subset of the neighbors ofx).

    Clearly,a|A|= b|B|,ab= 0,hence pr(|B|)max(a,pr(|A|))d 1.

    Now we are able to describe our Las Vegas algorithm to test isomorphismof graphs with maximum valence d. We may supppose both X and Y areconnected. (It suffices to test components.)

    20

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    21/33

    The idea is that we shall be able to reach a coloration with color-classesof moderate size by stabilizing an initial trivial coloration at a sequence ofvertices, the sequence having moderate length. We do this in all possibleways and decide whether the obtained colored graphs are isomorphic usingthe Las Vegas algorithm of Cor. 3.2. Moderate here means

    n(log n)c, c

    constant. By guess we shall mean an arbitrary choice.

    Algorithm 4.9. Choose a positive integer k. This will be our desired upperbound on the color multiplicities and we shall compute its optimum value atthe end of the proof (before Remark 4.13). Guess an edge ofX, and halve itby inserting a new vertexx0of valence 2. Denote the obtained graph on n +1

    vertices byX

    . Define a coloringgofX

    byg(x0) = 0 andg(x) = 1 forxV.Let fbe the stable refinement ofg. Clearlyf=fx0. If there is a color-classof size exceeding k, let i be the smallest number such that the color-classf1(i) has largest size. Guess a vertexx1 from f

    1(i). Compute the stablecoloration fx0x1 and guess a vertexx2 from the first color-class of largest sizeif it is larger than k, etc., until we arrive at a sequence S= (x0, x1, . . . , xs)of vertices such that the length of each color-class offs is at most k.

    Let us execute the same operations onY. There are|E(Y)| nd/2 waysof guessing the edge to be halved by y0, and we have at most n

    s ways ofguessing the sequencey1, . . . , ys. Clearly,XandYare isomorphic if there is

    a correct guess, i.e. for at least one of these sequences of arbitrary choice, theobtained colored graphs are isomorphic (colors are preserved under isomor-phism by definition). Hence we may apply the Las Vegas algorithm of Cor.3.2 to test whether the colored graph (X, fs) is isomorphic to any of the atmost nsnd/2 augmented and colored versions ofY. In each case when theLas Vegas algorithm fails (outputs ?), we repeat it until decision is reached,but the total number of calls on the Las Vegas algorithm should not exceedns+1d. If this number of calls is insufficient (failure occurs in more than halfof the cases), we output ?. Clearly, the probability that this happens isless than 1/2.

    The cost of the computation is ns+1d/2 applications of the stable refine-ment procedure which costs (n2d) each time, and an ns+1d-fold applicationof the Las Vegas isomorphism test for colored graphs with at most k-foldcolors. Hence the total number of operations required is

    21

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    22/33

    O(k4kns+5d log n)< O(k4kns+6).

    Clearly, there is a k stradeoff here that we have to analyze.Let

    (n, d) = max2hn

    |{x: hx |A0|/2|i 1|/2. It is a proper subset ofAi1 sinceziAi1 (zi was always selectedfrom a largest color-class) and zi Ai (since the color-class ofzi became asingleton at this step). Hence|A0|>|A1|> >|Ar1>|A0|/2.By Lemma 4.8, pr(|Ai|)d 1, hence r(n, d).

    We conclude that it took less than (n, d)n/pointed recolorations to re-duce the size of the largest color-class from 2 to . (Namely, thereare less than n/ classes of size greater than .) Let us apply this ar-gument to = k, 2k, 4k , . . . , 2mk where 2m1k n < 2mk. We obtain

    22

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    23/33

    that the total number of pointed recolorations occurring in the algorithm iss < (n, d)

    nk

    + n2k

    + . . .+ n2mk

    < (n, d)2n

    k.

    End of the proof of Theorem 4.1.

    Now we are kin the position to make our optimal choice of the parameterk. Letk =((n, d)n)1/2. Then by 4.12s log n our result doesnt makesense: even brute force is faster.

    Remark 4.14. The guesses (for pointed recoloration) we made in the al-gorithm are an example of the depth-first search. The stable refinementprocedure is a breadth-first search. The global approach of our Las Vegasalgorithm using a tower of supergroups ofAutX could, however, be hardly

    classified by these terms.

    Remark 4.16. It is the authors hope that a deeper understanding of the sta-bilizer of an edge in the automorphism group of trivalent graphs will resultin a further great improvement on the complexity given in 4.1, and hope-fully exp(c log2 n) could be reached. (For the class of arc-transitive trivalentgraphs, this goal has been achieved by R. Lipton [11].) A nice intermedi-ate problem, providing a good starting point for research, has recently beenformulated by Maria Klawe.

    Problem 4.17. (Maria Klawe [10]) By a marked graph we mean a triple

    X = (V , E , R) where (V, E) is a graph and R is an equivalence relation.Estimate the complexity of isomorphism testing for marked binary trees.(Isomorphism preserves both relationsEand R.)

    It was this problem which led the author to Theorem 4.1. It is worthnoting that the automorphism group of a binary tree (V, E) is the iterated

    23

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    24/33

    wreath product of cyclic groups of order two. It is well-described in the senseof Def. 2.1. It is a 2-group, hence there is a chain of subgroups

    Aut(V, E) =G0G1 Gr =AutXwhere|Gi : Gi+1| 2. The problem is, however, to find such a chain,recognizablein polynomial (or exp(log2 n)) time; then 2.10 could be applied.

    In contrast to 4.17, Maria Klawe observes:

    Proposition 4.18. (Maria Klawe [10]) Marked trees are isomorphism com-plete (i.e. isomorphism of marked trees is as hard as graph isomorphism).

    Proof. LetX= (V, E) be a graph onn vertices. Let (W, F) denote the treedefined by

    W={w0} V {(v, z) : (v, z)E};F ={[w0, v] :vV} {[v, (v, z)] :v, zV, [v, z]E}.

    (The vertexw0 is adjacent to nvertices. All points at distance two from w0have valence one. Two such points correspond to every edge ofx.)

    Define Ron W by

    R=

    {(x, x) :x

    w

    } {((v, z), (z, v)) : [v, z]

    E

    }.

    (Each equivalence class is either a singleton or a pair. The edges ofXare inone-to-one correspondence with the 2-element classes ofR.)

    Let T(X) = (W,F,R) denote the obtained marked tree. Clearly, X=YifT(X)=T(Y), hence graph isomorphism reduces in linear time to markedtree isomorphism.

    4.19. I propose some terminology for often occurring complexity classes.The terms exponentialand subexponentialshould be defined to be invariantunder polynomial equivalence (substituting nc for n, c > 0). Hence f(n)grows exponentially if exp(nc) < f(n) < exp(nd)(0 < c < d) for n large

    enough, and subexponential should mean exp(o(n)). Important classes ofsubexponential functions are the logonomial functions of degree c, meaningf(n) = exp((log n)c+o(1))(c > 1). (This class is invariant under polynomialequivalence for every particular value ofc. (Etymology: log f(n) is a polyno-mial of log n.) nlog logn issublogonomial(i.e. exp((log n)1+o(1))). It is worth in-

    24

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    25/33

    troducing such subclasses of the exponential functions that are invariant un-der linear substitutions only. exp(n2/3+o(1)) might be called 2/3-exponential.Theorem 4.1 says that isomorphism of graphs with bounded valence is atmost half-exponential(exp(n1/2+o(1))), and the same holds for strongly regu-lar graphs by [2]. Functions satisfying nc < f(n)< nd, 0< c < d < 1 (for nlarge enough) might be called frexponential(the exponent of the exponent isa proper fraction). (R.L. Graham warns me; this term will never catch on.)

    4.20. Using this terminology, the fundamental goals of the theory of isomor-phism testing for the not too distant future are, in my opinion, the following.

    PROBLEM A.Find a frexponential isomorphism testing algorithm for allgraphs.

    PROBLEM B.Find a logonomial isomorphism testing algorithm for triva-lent graphs.

    5 Intersection of cylindric subgroups of a di-

    rect product.

    In this section we generalize the situation of Theorem 3.1. Theorem 5.2 is theresult to be applied in [6] to find a polynomial time Las Vegas isomorphismtesting algorithm for graphs whose adjacency matrix has bounded eigenvaluemultiplicities.

    Definition 5.1. LetH0, . . . , H r1be groups and Jr a subset of the indexset. LetB be a subgroup of

    jJHj. LetD =

    j

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    26/33

    Bi jJiHjD where D=j

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    27/33

    For Jr, set HJ=jJHj.For IJ let prIJ :HJHIdenote the projection map.For each i < s choose a strictly descending chain of subsets

    Ji=J0i Jii = where i =|Ji|

    hence JPi Jp+1i ={jpi } is a singleton for each p < i.Set Bpi =pr

    JpiJi

    Bi and let Cpi be the cylinder with base B

    pi, i.e.,

    Cpi =Bpi HrJpi .

    Clearly,Ci=C0iC1i Cii =D,and|Cp+1i :Cpi | |Hjip| h.Moreover,A={Cji :i < s, j < i}.Set

    i

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    28/33

    We conclude that the chain G0 Gms is recognizable in =O(sh log h) steps.In order to be able to apply 2.5, we have to continue our chain beyond

    A. Let Ej denote the unit subgroup ofHj and set

    Kd=

    j

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    29/33

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    30/33

    6 Acknowledgements.Announced erroneous results, mathematical discussions in various corners ofthe world, and three weeks of despair led to the results of this paper.

    The first thoughts about the subject came to my mind after a half daysconversation with D. Yu. Grigoriev in Leningrad in November 1978 abouta polynomial time isomorphism testing algorithm for graphs with boundedmultiplicities of eigenvalues [6]. Shortly afterwards in Budapest I realizedthat such an algorithm would follow from a polynomial time solution ofthe cylinder intersection problem for groups, and I (mistakenly) believed I

    had found a deterministic algorithm for that. Still I dont know if such analgorithm exists.

    At the 1979 SIGACT conference (early May in Atlanta) a polynomialtime isomorphism testing for distributive lattices was announced, but theproof was incorrect and this problem is still open. At the end of May wediscussed this problem with Gary L. Miller at MIT. He conjectured thatnc loglog n could be achieved without much difficulty. I replied that this fol-lows from my cylinder intersection algorithm. The implication was correct(see 5.3, 3.1-3.4) but it turned out that I had no cylinder intersection algo-rithm. This was quite embarrassing since I had announced this non-existingresult at many universities in three countries. Gary Miller gave me a promptproof of the NP-completeness of the cylinder intersection problem for sets(5.6), considerably adding to my panic. Relief came in June at Universitede Montreal with the discovery of the simple Las Vegas algorithm presentedhere (2.4).

    My thanks are due to D. Yu. Grigoriev for starting this process, to GaryMiller for his kind hospitality and many stimulating conversations that in-fluenced my entire view of complexity theory, and to Maria Klawe for askingProblem 4.17 a partial answer to which led to the entire material of Section

    4.It is my pleasure to list those institutions whose financial assistance di-

    rectly contributed to the birth of this paper.

    I am indebted to Eotvos University (Budapest) and to the scientific ex-change program of Hungary and the U.S.S.R. for sponsoring my visit to

    30

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    31/33

    Soviet universities in Fall 1978; to Prof. B. Jonsson of Vanderbilt University(Nashville, Tennessee) for sponsoring my visit to the U.S.; to the AppliedMathematics Group at M.I.T.; to the Department of Mathematics at Uni-versite de Montreal; and to the Summer Research Workshop in AlgebraicCombinatorics of the Canadian Mathematical Society at Simon Fraser Uni-versity (Vancouver, July 1979) where Section 4 was conceived and this paperacquired its final form.

    31

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    32/33

    REFERENCES

    [ 1 ] L. Adleman and K. Manders, Reducibility, randomness and intractabil-ity, Proc. 9th Annual ACM Symp. on the Theory of Computing (1977),151-163.

    [ 2 ] L. Babai, On the complexity of canonical labeling of strongly regulargraphs, SIAM J. Computing, to appear.

    [ 3 ] L. Babai and L. Lovasz, Permutation groups and almost regular graphs,Studia Sci. Math. Hung. 8 (1973), 141-150.

    [ 4 ] R. Frucht, Lattices with given abstract group of automorphisms, Canad.J. Math. 2 (1950), 417-419.

    [ 5 ] G. Gratzer, General Lattice Theory, Birkhauser Verlag, Basel, 1978.

    [ 6 ] D. Yu. Grigoriev and and L. Babai, Isomorphism testing for graphswith bounded eigenvalue multiplicities, in preparation.

    [ 7 ] J. E. Hopcraft and R. E. Tarjan, Dividing a graph into triconnectedcomponents, SIAM J. Computing 2 (1973), 135-158.

    [ 8 ] J. E. Hopcraft and R. E. Tarjan, A V log Valgorithm for isomorphismof triconnected planar graphs, J. Comp. Syst. Sci. 7 (1973), 323-331.

    [ 9 ] J. E. Hopcraft and J. K. Wong, Linear time algorithm for isomorphismof planar graphs, Proc. Sixth Annual ACM Sump. on the Theory ofComputing (1974), 172-184.

    [10 ] Maria Klawe, Marked trees are isomorphism complete, oral communi-cation (1979).

    [11 ] R. J. Lipton, The beacon set approach to graph isomorphism, SIAMJ. Computing, to appear.

    [12 ] Gary L. Miller, On the nlogn isomorphism technique, Proc. TenthSIGACT Symp. on the Theory of Computing (1978), 51-58.

    [13 ] Gary L. Miller, The cylinder intersection problem for 3-sets is N P-complete, oral communication (1979).

    32

  • 8/10/2019 Monte-Carlo algorithms in Poly testing Babai 1979.pdf

    33/33

    [14 ] Gary L. Miller, Trivalent graph isomorphism, oral communication (1979).[15 ] R. C. Read and D. G. Corneil, The graph isomorphism disease, J.

    Graph Theory 1 (1977), 339-363.

    [16 ] R. Solovay and V. Strassen, A fast Monte-Carlo test for Primality,SIAM J. Computing 6 (1977), 84-85.

    [17 ] R. E. Zippel, Probabilistic Algorithms for Sparse Polynomials, Ph.D.Thesis, M.I.T. (1979).

    [18 ] R. E. Zippel, Probabilistic algorithms for sparse polynomials, to ap-

    pear.

    Authors address:

    Laszlo BabaiComputer Science DepartmentUniversity of Chicago1100 E. 58th St., Ry 152

    Chicago, IL 60637

    33