Computing Self-Organizing Maps in a Massively Parallel Way with CUDA. Part 2: Algorithms

In the previous post I spoke briefly about motivations for implementing self-organizing maps in F# using GPU with CUDA. I have finally been able to outperform a single threaded C++ implementation by a factor of about 1.5. This is quite modest, but on the other hand rather impressive since we started out by being 60 … Continue reading Computing Self-Organizing Maps in a Massively Parallel Way with CUDA. Part 2: Algorithms

Computing Self-Organizing Maps in a Massively Parallel Way with CUDA. Part 1: F#

By 2017, it is expected that GPUs will no longer be an external accelerator to a CPU; instead, CPUs and GPUs will be integrated on the same die with a unified memory architecture. Such a system eliminates some of accelerator architectures’ historical challenges, including requiring the programmer to manage multiple memory spaces, suffering from bandwidth … Continue reading Computing Self-Organizing Maps in a Massively Parallel Way with CUDA. Part 1: F#

Outperforming MathNet with Task Parallel Library

Math.NET is a great project that brings numerics to .NET the OS way. Perusing their blog, I found this post on the on-line algorithm for std calculation. A simple Wiki search revealed this article that describes a parallel algorithm due to Chan for calculating variance (std = sqrt(variance)). So, I set to figure out whether … Continue reading Outperforming MathNet with Task Parallel Library