Home > bioinformatics, C#, F#, Graphs > HashSet, Graph, Cognac

HashSet, Graph, Cognac

February 13, 2017 Leave a comment Go to comments

The post on applying GPU to finding Eulerian path mentioned a stumbling block: partitioning a very specific kind of graph.

In general, partitioning is a hard problem, NP-hard actually. The graphs we are dealing with are very specific, though, and may be partitioned in O(|E|) time (E – the set of graph edges). Our graphs are of the following form: partitions If we are “lucky” it consists of just one or very few (directed) loops, if not – we have lots of such disconnected loops. And it is this unlucky case that kept the bottle of good cognac sealed all these months (only a descent solution to the partitioning problem would break it out).

To partition such a graph, i.e., color every loop in its distinct color, all we need to do is walk the graph from any vertex and once the loop is detected, we pick a vertex at random from the set of those we haven’t visited yet and start walking again. We repeat until everything is visited:

let partitionLinear (end' : int [])=
    let allVertices = HashSet<int>(end')
    let colors = Array.create end'.Length -1
    let mutable color = 0
 
    while allVertices.Count > 0 do
        let mutable v = allVertices.First()
        while colors.[v] < 0 do
            allVertices.Remove v |> ignore
            colors.[v] <- color
            v <- end'.[v]
        color <- color + 1
    colors, color

This is great, except it doesn’t work. The problem is revealed when a graph is very large and very fragmented. This is when the code at line 7 fails us. The problem is, we expect it to work in O(1): how hard can it be to retrieve the “first” element?! Well, since we are dealing with a data structure that does not have an intrinsic notion of order, it may be quite hard. In fact, the complexity here is O(|E|) (actually O(|V|, but for our graphs |V| = |E|), thus the complexity of the code above is O(|E|^2).

The following code actually works:

let partitionLinear (end' : int [])=
    let colors = Array.create end'.Length -1
    let mutable color = 0
    let mutable num = end'.Length
    let mutable curIdx = 0
    let mutable nextIdx = num - 1

    while num > 0 && curIdx >= 0 do
        let mutable v = curIdx
        while colors.[v] < 0 do
            colors.[v] <- color
            v <- end'.[v]
        color <- color + 1
        num <- num - 1
        while nextIdx >= 0 && colors.[nextIdx] >= 0 do
            nextIdx <- nextIdx - 1
        curIdx <- nextIdx
    colors, color

In the worst case it is still O(|E|^2), however, this worst case is unlikely and we expect the performance close to O(|E|) in general.
This won’t cure the performance problems of the GPU algorithm, which still relies on the number of graph partitions to be somewhat reasonable, but it will enable it to run in some respectable time. Perhaps time to break out the cognac after all!

  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: