Wednesday, 17 November 2010

Turning NFAs into DFAs (the actual code)

See previous post.

Right, so now, given the bfs function we can convert a given NFA into a DFA. First off, think of the types. We use integers to represent the states of the NFA. The set of states for the DFA, however is the powerset of the set of states of the NFA. Then:

toDFA :: NFA Integer ab → DFA [Integer] ab

If the NFA is M(Q,S,d,q0,F) then the equivalent DFA will be M'(Q',S,d',q0',F') . Let's take them one by one. To obtain Q' we need a powerset function:

powerSet :: [a] → [[a]]
powerSet [] = [[]]
powerSet (x:xs) = ps ++ map (x:) ps
where ps = powerSet xs

The alphabet stays the same.

The initial state is the set of all the states that can be reached by epsilon transitions from q0. We need to write a function that will return this set. This is where we need to use the bfs we designed earlier:

--needs trans function, initial state
newInitialState :: Eq st => (Maybe ab -> st -> [st]) -> st -> [st]
newInitialState f q = bfs q (f Nothing)

By passing f Nothing as an argument to bfs we are only allowing epsilon transitions.

To compute F' we need:

That is:

--needs set of states OF THE DFA, final states OF THE NFA
newFinalStates :: Eq(st) => [[st]] -> [st] -> [[st]]
newFinalStates dfaqs nfafqs = [q | q <- dfaqs, not $ null $ intersect q nfafqs]

Next up, let's compute d'. This is going to be slightly more difficult to do. According to the definition of d', given a state and a symbol we need to return the list of all the states that can be reached by any number of epsilon transitions and only one s transition. Let's define a function that, given a node and a function returns that exact list. We'll call it neighbourStates:

neighbourStates :: Eq(st) => st -> ab -> (Maybe ab -> st -> [st]) -> [st]
neighbourStates q0 s f = ana (f Nothing) (nub$concat$map (f (Just s)) (nub((nub$concat$map (f Nothing) (dfs q0 (f Nothing))) ++ f (Just s) q0)))

Wow, that's a big definition. Here's the idea:
  1. get a list of all the nodes reachable by only epsilon transitions
  2. perform ONE s transition from each of the states in the list
  3. for each state in the new list, find all the reachable nodes by epsilon transitions.
Turning into code:
  1. Perform bfs by using only epsilon transitions. The problem with bfs, however, is that it includes the original node, regardless of whether there exists a path to it, or not. To avoid this problem, simply do another epsilon transition from all of the nodes in the list. That is:

    nub((nub$concat$map (f Nothing) (dfs q0 (f Nothing))) ++ f (Just s) q0))

  2. Now, perform one s transition from all the odes in the list. Remember however, that we need to take care of q0, namely we need to take into account the s transitions that start straight from q0. Then:

    (nub$concat$map (f (Just s)) (nub((nub$concat$map (f Nothing) (dfs q0 (f Nothing))) ++ f (Just s) q0)))

  3. From the new list, just do bfs using epsilon transitions from all the states in the list. We could just map bfs on the list, then do concat and nub, but we already have a function that does this for us: bfsaux

    bfsaux (f Nothing) (nub$concat$map (f (Just s)) (nub((nub$concat$map (f Nothing) (dfs q0 (f Nothing))) ++ f (Just s) q0)))

So that's the neighbourStates function. Then, we can build the actual transformation function! Here it is:

toDFA :: NFA Integer ab -> DFA [Integer] ab
toDFA (NFA voc sA fA s0A finA) = DFA voc sB fB (newInitialState fA s0A) (newFinalStates sB finA)
where sB = powerSet s
fB sym state = nub (concat [neighbourStates q sym fA | q <- state]);

No comments:

Post a Comment