Friday 12 November 2010

Breath First Search in Haskell

Programming languages are, more or less, created equal. Despite different syntax and concepts, all decent languages are Turing complete. This means that they are part of the same class. There is no program that one can write in one language that one cannot write in another.

We know that we can perform Breath First Searches (BFS) in linear time in imperative languages. Intuitively, then, we should be able to write a linear time BFS in Haskell as well. Turns out we can, but it's too much of an effort. We'll settle for the next best thing.

Now, we are writing this bfs function for our Complementing Regular Expressions program, so we will choose a slightly odd way of representing the data. BFS works on a graph and a starting node and returns a list of all the reachable nodes from the start node. Instead of defining something like:

data Graph a = Node a [Graph a]

We will keep track of all the edges in the graph through a function f. Then, if q is a node then f q returns a list of all the reachable nodes from q.

f :: a → [a]
bfs :: a → (a → [a]) → [a]

The rather inelegant way we shall go about this is the following: At each step, we will maintain a list of all the nodes that we have visited so far called xs. At every iteration, for each visited node (so for every node in xs) we will try and add all its neighbours to xs. The iteration stops when no new nodes can be added. This is precisely what the bfsaux function does:

bfsaux :: Eq(a) => (a -> [a]) -> [a] -> [a]
bfsaux f xs = if next == [] then xs
               else bfsaux f (next ++ xs)
 where next = nub [y | x <- xs, y <- f x, (not . elem y) xs]

Note: nub ys is simply a function that deletes repeats in the list ys. Also, Eq(a) is required for elem y xs to work.

This is not efficient! One of the problems here is that, for most nodes in xs, we try adding their neighbours to xs at every iteration and we fail each time. All the checking and calling of f costs us time. One way to improve it is to split the information we carry from recursive call to recursive call into a pair (xs,ys) such that at every step we only try and add the neighbours of ys to xs and then we merge ys and xs. That is (xs , ys) → (nub(xs++ys) , next). However, our current implementation is sufficient for now. We can come back and change it if we wish so.

Then bfs is defined as:

bfs q f = bfsaux f [q]

No comments:

Post a Comment