### Archive

Archive for the ‘C#’ Category

## HashSet, Graph, Cognac

The post on applying GPU to finding Eulerian path mentioned a stumbling block: partitioning a very specific kind of graph.

In general, partitioning is a hard problem, NP-hard actually. The graphs we are dealing with are very specific, though, and may be partitioned in $O(|E|)$ time (E – the set of graph edges). Our graphs are of the following form: If we are “lucky” it consists of just one or very few (directed) loops, if not – we have lots of such disconnected loops. And it is this unlucky case that kept the bottle of good cognac sealed all these months (only a descent solution to the partitioning problem would break it out).

To partition such a graph, i.e., color every loop in its distinct color, all we need to do is walk the graph from any vertex and once the loop is detected, we pick a vertex at random from the set of those we haven’t visited yet and start walking again. We repeat until everything is visited:

```let partitionLinear (end' : int [])=
let allVertices = HashSet<int>(end')
let colors = Array.create end'.Length -1
let mutable color = 0

while allVertices.Count > 0 do
let mutable v = allVertices.First()
while colors.[v] < 0 do
allVertices.Remove v |> ignore
colors.[v] <- color
v <- end'.[v]
color <- color + 1
colors, color
```

This is great, except it doesn’t work. The problem is revealed when a graph is very large and very fragmented. This is when the code at line 7 fails us. The problem is, we expect it to work in O(1): how hard can it be to retrieve the “first” element?! Well, since we are dealing with a data structure that does not have an intrinsic notion of order, it may be quite hard. In fact, the complexity here is $O(|E|)$ (actually $O(|V|$, but for our graphs $|V| = |E|$), thus the complexity of the code above is $O(|E|^2)$.

The following code actually works:

```let partitionLinear (end' : int [])=
let colors = Array.create end'.Length -1
let mutable color = 0
let mutable num = end'.Length
let mutable curIdx = 0
let mutable nextIdx = num - 1

while num > 0 && curIdx >= 0 do
let mutable v = curIdx
while colors.[v] < 0 do
colors.[v] <- color
v <- end'.[v]
color <- color + 1
num <- num - 1
while nextIdx >= 0 && colors.[nextIdx] >= 0 do
nextIdx <- nextIdx - 1
curIdx <- nextIdx
colors, color
```

In the worst case it is still $O(|E|^2)$, however, this worst case is unlikely and we expect the performance close to $O(|E|)$ in general.
This won’t cure the performance problems of the GPU algorithm, which still relies on the number of graph partitions to be somewhat reasonable, but it will enable it to run in some respectable time. Perhaps time to break out the cognac after all!

Categories: F#, C#, bioinformatics, Graphs

## F# vs C#. Fold and Aggregate

Suppose you need to write a script that finds n files, all called based on some pattern, say “c:\temp\my_file_x.txt”, where “x” is replaced by a range of numbers [1..30] for instance, reads the content of these files and glues them together. Suppose also that the files are very small, so you can keep them in memory all at once. Also, it should be solved in one line (except for auxilaires: defining variables, writing out the results).

One-line solutions exist both in F# and C#. Which one is prettier? I vote for F#.

Here is the C# code:

```
string templ = @"C:\temp\my_file_";

var content =
Enumerable.Range(1, 30)
.Aggregate(
new List<string>(),
(a, e) =>
{
return a;
});

File.WriteAllLines(templ + ".txt", content);

```

And here is the F# version (of just the relevant part):

```let content =
[1..30]
|> List.fold (
fun content i ->
content @
(File.ReadAllLines(fun i -> templ + i.ToString() + ".txt") |> Array.toList)
) []
```

You can accomplish almost anything with fold() and its C# Linq equivalent Aggregate().
So first we create a range, (1..30) (note here, that although [1..30] and Enumerable.Range(1, 30) generate sequences of numbers from 1 to 30, their semantics are different, so [0..30] and Enumerable.Range(0, 30) generate different sequences: the latter generates a sequence of numbers 0..29).

Then we fold the range of numbers into a list of lines (we could have just kept appending the text, not lines, but it is not all that important for this macro, and we want to make sure we start each new addition from a new line), by reading the files and gluing the results together

Categories: C#, F#, LINQ Tags: ,

## C#/F# Interop and TDD

Since this development was supposed to conform to the best methodology available, I naturally chose Test Driven Development (TDD), which, as the name suggests, presumes writing tests for each piece of the code written. Actually it suggests writing tests first, but who wants to be so dogmatic! I was writing tests at least at the same time as I was developing.

Because functions in F# are first-class values, it is easier than in any other language to split the problem into smaller pieces, implement them, and then throw them around. This is where TDD really shines, because as you write smaller pieces it is easier to write tests for them (since test paths explode exponentially relative to the paths in code being tested). After small pieces have been tested, you proceed with confidence to “lego-izing” them into larger ones.

For this project, I did not want to learn any new unit test framework and just went with the Visual Studio one. Calling F# functions and objects from C# is pretty straightforward for the most part, except when F# functions come into play.

Here is a test for Stack functionality, for example:

```using push.stack;
.....
.....
[TestMethod]
[Description("Tests pushStack/pop")]
public void PushPopTest()
{
Stack.Stack<string> stack = Stack.empty<string>();
stack = Stack.pushStack("a", stack);
stack = Stack.pushStack("b", stack);

var res = Stack.pop(stack);

Assert.AreEqual<string>("b", res.Item1);

res = Stack.pop(res.Item2);

Assert.AreEqual<string>("a", res.Item1);

}

```

here push.stack is the namespace in the Push.Core assembly that implements Push stack.

“pushStack” and “pop” are defined as follows:

```    let pushStack hd tl =
match tl with
|StackNode(x) -> StackNode(hd::x)
```

and

```    let pop = function
| StackNode([]) -> Unchecked.defaultof<'a>, StackNode([])
| StackNode(hd::tl) -> hd, StackNode(tl)
```

(pop function is defined to do nothing in case of an empty stack. It is convenient to do so for Push, which is an absolutely robust language, i.e. any syntactically correct program is guaranteed to execute).

Intellisense also does a great job of guiding you to write correct code.

It becomes slightly more complicated when F# lambda expression need to be used. (Digression. At the Seattle F# meetup recently, someone suggested syntactic improvement to F#: to make the “fun” keyword unnecessary. After all it is kind of ironic, that lambda-syntax in C#, which is an imperative language, is simpler than the one in F#!)

In any event, I found .NET Reflector to be quite useful to quickly arrive at what I needed.

For instance:

```let population = List.init config.populSize (fun i -> Code.rand (config.maxCodePoints, "INTEGER"))
```

became

```this.population =
ListModule.Initialize<Ast.Push>(config.populSize,
FSharpFunc<int, Ast.Push>.FromConverter(i => Code.rand(config.maxCodePoints, FSharpOption<string>.None)));
```

in the tests.

The Reflector shows the following signature of the F# “init” function:

```[CompilationArgumentCounts(new int[] { 1, 1 }), CompilationSourceName("init")]
public static FSharpList<T> Initialize<T>(int length, FSharpFunc<int, T> initializer)
{
return List.init<T>(length, initializer);
}
```

We then click on FSharpFunc<int, T> and get the following definition:

```[Serializable, AbstractClass, CompilationMapping(SourceConstructFlags.ObjectType)]
public abstract class FSharpFunc<T, TResult>
{
// Methods
protected FSharpFunc();
public static FSharpFunc<T, TResult> FromConverter(Converter<T, TResult> converter);
public abstract override TResult Invoke(T func);
public static V InvokeFast<V>(FSharpFunc<T, FSharpFunc<TResult, V>> func, T arg1, TResult arg2);
public static W InvokeFast<V, W>(FSharpFunc<T, FSharpFunc<TResult, FSharpFunc<V, W>>> func, T arg1, TResult arg2, V arg3);
public static X InvokeFast<V, W, X>(FSharpFunc<T, FSharpFunc<TResult, FSharpFunc<V, FSharpFunc<W, X>>>> func, T arg1, TResult arg2, V arg3, W arg4);
public static Y InvokeFast<V, W, X, Y>(FSharpFunc<T, FSharpFunc<TResult, FSharpFunc<V, FSharpFunc<W, FSharpFunc<X, Y>>>>> func, T arg1, TResult arg2, V arg3, W arg4, X arg5);
public static implicit operator Converter<T, TResult>(FSharpFunc<T, TResult> func);
public static implicit operator FSharpFunc<T, TResult>(Converter<T, TResult> converter);
public static Converter<T, TResult> ToConverter(FSharpFunc<T, TResult> func);
}

```

It is far from straightforward, but the FromConverter method is of interest. Its argument Converter is defined as:

```public delegate TOutput Converter<in TInput, out TOutput>(TInput input);
```

which is exactly what we want.

So, our F# line above becomes:

```this.population =
ListModule.Initialize<Ast.Push>(config.populSize,
FSharpFunc<int, Ast.Push>.FromConverter(i => Code.rand(config.maxCodePoints, FSharpOption<string>.Some("INTEGER"))));
```

(Code.rand is defined as:

```static member rand (maxPoints, ?bias : string)
```

“bias” argument is optional, which allows for its omission in F# code when calling the function. It compiles into the argument of option type, hence FSharpOption in the C# code.

Categories: C#, F#, TDD

## Push Operation. A Few Words on Computation Expressions/Monads

As we have seen in the previous post, it is easy to implement Push types and operations:

```  [<PushType("FLOAT")>]
type Float =
inherit PushTypeBase

new () = {inherit PushTypeBase ()}
new (f : float) = {inherit PushTypeBase(f)}

static member Me = Float()

[<PushOperation("/")>]
static member Divide() =
match processArgs2 Float.Me.MyType with
| [a1; a2] ->
if a2.Raw<float>() = 0.
then
pushResult a1
pushResult a2
else pushResult(Float(a1.Raw<float>() / a2.Raw<float>()))
| _ -> ()

```

Here is an example of an implementation of `FLOAT./` operation: a division of two integers on top of the stack.

Since in Push an operation is denoted by “OP_TYPE.OP_NAME”, it is convenient from the implementation standpoint to group operations that are unique to a specific type under the definition of this type. Operations are declared static and that declaration is enforced by the system.

Methods, implementing operations take no arguments, since all of the information is implied in implementation: we know exactly the type of stack we are working with, as well as how to obtain arguments from it.

### Minimizing Code Duplication. Generic Operations.

Some push operations are “generic” in a sense, that each type implements one. E.g.: DUP, YANK, SWAP, etc manipulate stacks of their type (i.e. INTEGER.DUP pushes a duplicate of the top of the INTEGER stack).

In order to minimize code duplication, it is possible to define “Generic” Push types. These types do not have names and no stack is created for them. They are simply “bags of operations”.

Generic stock operations are implemented as follows:

```    [<GenericPushType>]
type Ops ()=
[<GenericPushOperation("<", AppliesTo=[|"INTEGER"; "FLOAT"|])>]
static member Lt tp =
dualOp (<) tp Bool.Me.MyType

[<GenericPushOperation(">", AppliesTo=[|"INTEGER"; "FLOAT"|])>]
static member Gt tp =
dualOp (>) tp Bool.Me.MyType

[<GenericPushOperation("=")>]
static member Eq tp =
dualOp (=) tp Bool.Me.MyType

[<GenericPushOperation("DUP", Description = "Pushes the duplicate of the top of the stack")>]
static member dup tp =
if stockTypes.Stacks.[tp].length = 0 then ()
else
stockTypes.pushResult (peek stockTypes.Stacks.[tp])

[<GenericPushOperation("POP", Description = "Pops the top of the stack")>]
static member pop tp =
processArgs1 tp |> ignore

[<GenericPushOperation("ROT", Description = "Rotates 3 top stack entries")>]
static member rot tp =
pushResult (new Integer(2L))
Ops.yank tp

```

The type that bags these operations is decorated with GenericPushType attribute. No name is specified, since this is not a “type” at all, just a bag of operations. Each operation is decorated with GenericPushOperation attribute, that besides the name and the description of the operation can have an AppliesTo argument that filters out the types to which the operation applies. So, for example operations “>”, “<", "=" apply to numeric types only.

The signature of a method that implements a generic operation is slightly different than that of a regular operation. There is a string argument denoting the type of the operation. This is used to determine on which stack to perform the operation. The type argument can be any of the current types: "INTEGER", "FLOAT", etc.

All Push operations have a very similar structure. They do the following:

1. Pop one or two arguments from the stack. If the stack has fewer arguments – quit.
2. Present the top of the stack as the right-hand argument and the next value on the stack as the left-hand one.
3. Make sure the conditions for performing the operation are met. E.g., if the operation is “/”, top of the stack is not 0. If they are not met, push the arguments back and return.
4. Execute the operation.
5. Push the result onto the appropriate stack.

The devide operation on the FLOAT type goes through exactly these steps:

```        [<PushOperation("/")>]
static member Divide() =
match processArgs2 Float.Me.MyType with
| [a1; a2] ->
if a2.Raw<float>() = 0.
then
pushResult a1
pushResult a2
else pushResult(Float(a1.Raw<float>() / a2.Raw<float>()))
| _ -> ()
```

The following helper functions are almost ubiquitously used.
processArgs2 (processArg1) – pops 2 (1) arguments from the top of the stack and presents them in the right order. If it returns anything but a list of two objects of the right type, the operation does nothing.
pushResult – pushes result of the operation to the appropriate stack. Or in this case returns arguments to the stack, if one of them is 0.

Basically, code for all operations looks very similar. Which leads us to think that computation expressions (aka “Monads”) are a good fit for implementing operations. Monads allow for an easy way to factor out the common “side effects” of the code being executed, and just let us concentrate on functionality. So, for example, the computation expression-based implementation of the division operation above looks like this:

```
[<PushOperation("/")>]
static member Divide() =
let divide stack =
push {
let! right = popOne stack
let! left = popOne stack
if right <> 0. then
return! (result stack (left / right))
}
divide Float.Me.MyType
```

Note:

• in this case we do not need to special-case the fact that arguments are returned to the stack if the operation cannot be performed. We were able to include that functionality into the implementation of the “push” computation expression.
• No need to call Raw() to extract values. This is also encoded in the computation expression.
• No special provision is made for the case when there are fewer than two arguments. Again, the computation expression takes care of that.
• The code is easier to read. All of the “common” functionality is factored out of the “push{}”.

Monadic implementation will be discussed in detail in the future posts.

### Implementing a Generic Op in C#

For testing, I have implemented a TOSTRING operation for INTEGER type as a generic operation. This operation represents the top of the INTEGER stack as string literal and pushes the result to the LITERAL stack:

```   [GenericPushType]
public class IntegerOpExtension
{
[GenericPushOperation("TOSTRING",
Description="Converts top integer to a literal",
AppliesTo= new string [] {"INTEGER"})]
public static void ConvertToString()
{
if(TypeFactory.isEmptyStack("INTEGER"))
{
return;
}

long val = TypeFactory.processArgs1("INTEGER").Raw<long>();

TypeFactory.pushResult (new Literal(val.ToString()));
}
}
```
Categories: C#, computation expression, F#, monad, Push Tags:

## Implementing a Push Type

From a developer’s point of view, a Push type is implemented through a class, derived from PushTypeBase, as mentioned in the previous post.

Once the type is implemented in any .NET language the system will hook it into the currently available types, will add a parser and a stack for it.

All that needs to be done is:

1. Declare a class, deriving from PushTypeBase, and decorated with PushType attribute. Here the required parameter to the attribute is the name of the type:
```    [<PushType("FLOAT")>]
type Float =
inherit PushTypeBase
```
2. Define default constructor, and the constructor with one argument: the underlying type value, e.g., for a FLOAT type – a float type value:

in C#:

```        public Float() : base() {};
public Float(double d) : base (d) {};
```

and here is the syntax for the same in F#:

```        new () = {inherit PushTypeBase ()}
new (f : float) = {inherit PushTypeBase(f)}
```
3. For easy access to the type name of this Push type, define a static member and instantiate it with the default constructor.
```          static member Me = Float()
```
4. Implement custom parsing if required by creating a parse function and overriding the base class Parser property. Or simply override the Parser property to return Unchecked.defaultof<ExtendedTypeParser> in F#, null in C#:

```        // custom parsing
static member parse s =
let result = ref Unchecked.defaultof<float>
if not (System.Double.TryParse(s, result))
then
Unchecked.defaultof<PushTypeBase>
else
new Float(!result) :> PushTypeBase

override t.Parser
with get() =
ExtendedTypeParser(Float.parse)
```
5. Override the base class ToString() implementation, if necessary. The base implementation will simply call the ToString() of the underlying value type if it is implemented.
6. Implement type specific operations (more in the next post):
```       [<PushOperation("+")>]
match processArgs2 Float.Me.MyType with
| [a1; a2] -> pushResult(Float(a1.Raw<float>() + a2.Raw<float>()))
| _ -> ()

[<PushOperation("*")>]
static member Multiply() =
match processArgs2 Float.Me.MyType with
| [a1; a2] -> pushResult(Float(a1.Raw<float>() * a2.Raw<float>()))
| _ -> ()

```

Here is an example of extending the language with the URL type, implemented in C#:

```    [PushType("URL")]
public class UrlPushType : Type.PushTypeBase
{
public UrlPushType() : base() {}
public UrlPushType(Uri url) : base(url) {}

static UrlPushType UrlParse(string url)
{
try
{
if (!url.Trim().StartsWith("http://", true, CultureInfo.InvariantCulture))
{
return null;
}

Uri uri = new Uri(url);
if (uri.HostNameType != UriHostNameType.Dns)
{
return null;
}

return new UrlPushType(uri);

}
catch (Exception)
{
return null;
}
}

public override Type.ExtendedTypeParser Parser
{
get
{
return new Type.ExtendedTypeParser(UrlParse);
}
}

[PushOperation("DOMAIN", Description="Extract domain name from the URL")]
static void ExtractDomain()
{
// pop the URL from the URL stack
var arg = TypeFactory.processArgs1("URL");

//if there is nothing there...
if (arg == null)
{
return;
}

// extract the underlying data
var uri = arg.Raw<Uri>();

//create the new URI
var newUri = new UrlPushType(new Uri(uri.Host));

// push it back to the URL stack.
TypeFactory.pushResult(newUri);
}

public override string ToString()
{
return base.ToString();
}
```
Categories: C#, F#, Push Tags: