Computing Self-Organizing Maps in a Massively Parallel Way with CUDA. Part 2: Algorithms

In the previous post I briefly discussed the motivation for implementing self-organizing maps in F# on the GPU with CUDA. I have finally been able to outperform a single-threaded C++ implementation by a factor of about 1.5. This is quite modest, but on the other hand rather impressive, since we started out by being 60 …

Computing Self-Organizing Maps in a Massively Parallel Way with CUDA. Part 1: F#

By 2017, it is expected that GPUs will no longer be an external accelerator to a CPU; instead, CPUs and GPUs will be integrated on the same die with a unified memory architecture. Such a system eliminates some of accelerator architectures’ historical challenges, including requiring the programmer to manage multiple memory spaces, suffering from bandwidth …