Automata and Formal Languages Are all Languages Regular We have seen many ways to specify Regular languages Are all languages Regular languages? The answer is No, How can we tell? A language is regular if we can describe it using any of the formalisms we have studied. If we cant describe it, does that mean it is not regular? Maybe were not clever enough. Lecture 8 Tim Sheard 1 Importance of

loops Consider this DFA. The input string 01011 gets accepted after an execution that goes through the state sequence s p q p q r. This path contains a loop (corresponding to the substring 01) that starts and ends at p. There are two simple ways of modifying this path without 1 beginning and ending states: changing its 0 s p 0 1 1

q 1 0 r (1)delete the loop from the path; (2)instead of going around the loop once, do it several times. As a consequence, we see that all strings of the form 0(10)i11 (where i 0) are accepted. 1 0 s p 0

1 1 q 1 0 r Long paths must contain a loop Suppose n is the number of states of a DFA. Then every path of length n or more makes at least n+1 visits to a state and therefore must visit some state twice. Thus, every path of length n or longer must contain a loop.

Strateg y Every long string in a regular language must have a loop. Regular Languages with loops exhibit certain kinds of patterns that are distinctly regular. Languages with long strings that do not adhere to the loop patterns for regular languages cannot be regular. Pump s Suppose L is a regular language, w is a string in L, and y is a non-empty substring of w. Thus, w=xyz, for some strings x, z . We say that y is a pump in w if all strings xyiz (that is, xz, xyz, xyyz, xyyyz, ) belong to L.

w a x b c y d e f z aef

abcdef abcdbcdef abcdbcdbcdef Pumping Lemma Let L be a regular language. Then there exists a number n such that all w L such that for all |w| n, there exists a prefix of w whose length is less than n which contains a pump. Formally: If w L and |w| n then w = xyz such that 1. y e 2. |xy| n 3. xyiz L (xy is the prefix) Definition. The number n associated to the regular language L as described in the Pumping Lemma is called the pumping

constant of L. Proo f w L, |w| xyiz L n, w = xyz such that 1. y e 2. . |xy| n 3. Let the DFA have m states. Let |w|m. Consider the path from the start state s to the (accepting) state d(s,w). Just following the first m arcs, we make m+1 total visits to states, so there must be a loop formed by some of these arcs.

We can write w=opqr, where p corresponds to that loop, and|opq| = m (the prefix of size m). Thus let n=|op|, x=o, y = p, and z = qr. 1) Since every loop has at least one arc, we know |p| >0, thus y e 2) |xy| n because xy = op and n = |op| 3) xyiz L because If p is a loop, its starts at state si and d(si,p) = si, and we know that d(si,qr) = sfinal.. Thus d(sstart,x) = si, Thus for each i d(si,yi) = si, and were done. y = q x = p start m steps qi

z = rs r | s final Proving nonregularity To prove that a given language is not regular, we use the Pumping Lemma as follows. Assuming L is regular (we are arguing by contradiction!), let n be the pumping constant of L. Making no other assumptions about n (we don't know what it is exactly), we need to produce a string wL of length n that does not contain a pump in its n-prefix. This w depends on n; we need to give w for any value of n.

There are many substrings of the n-prefix of our chosen w and we must demonstrate that none of them is a pump. Typically, we do this by writing w=xuy, a decomposition of w into three substrings about which we can only assume that u e and |xu| n. Then we must show that for some concrete i (zero or greater) the string xuiy does not belong to L. Skill required Notice the game-like structure of the proof. Somebody gives us n. Then we give w of length n. Then our opponent gives us a non-empty substring u of the n-prefix of w (and with it the factorization w =xuy of w). Finally, we choose i such that xuiy L. Our first move often requires ingenuity: We must find w so that we

can successfully respond to whatever our opponent plays next. Example 1 We show that L={0k1k | k=0,1,2, } is not regular. Assuming the Pumping Lemma constant of L is n, we take w=0n1n. We need to show that there are no pumps in the n-prefix of w, which is 0n. If u is a pump contained in 0n then 0n = xuz, and xuuz must also be in the language. But since |u| > 0, if |xuz| = n then | xuuz| = m where m > n. So we obtain a string 0m1n with m>n, which is obviously not in L, so a contradiction is obtained, and are assumption that 0K1K is regular must be false. Note. The same choice of w and i works to show that the language: L={w {0,1}* | w contains equal number of 0s and 1s}

is not regular either. Example 2 We show that L = { uu | u{a,b}* } is not regular. Let n be the pumping constant. Then we choose w=anbanb which clearly has length greater than n. The initial string an must contain the pump, u. So w = xuybanb, and xuyb = anb. But pumping u 0 times it must be the case that xybanb is in L too. But since u is not e, we see that xyb anb, since it must have fewer as. Which leads to a contradiction. Thus our original assumption that L was regular must be false. Question. If in response to the given n we play w=anan, the opponent has a chance to win. How?