Continuing the Advent of Code theme from the previous post. Figured since this year is going to be my year of CUDA, this would be a good opportunity to take it for a ride. A good April 1st post, but why wait? So, how can we make this even faster than the already fast imperative … Continue reading Look-and-say: [Alea.]CUDA

# Category: CUDA

# Non-linear Thinking with CUDA.

I love GPU programming for precisely this: it forces and enables you to think about a solution in a non-linear fashion in more than one sense of the word. The Problem Given a set $latex A = \{a_1, a_2 \ldots a_n\}$, output a set $latex S_A = \{0,\ \sum\limits_{k=1}^{n} a_k,\ \sum\limits_{k=i}^{i + j \mod n} … Continue reading Non-linear Thinking with CUDA.

# Fun with Alea.CUDA, F# Interactive, Charts

Source code for this post can be found on my GitHub. It's great to see technologies evolving over the years. Alea.CUDA has done so in leaps and bounds since the first time I laid eyes on it a couple of years ago. At the time the name seemed unfortunate and hinted at the "aleatic" programming … Continue reading Fun with Alea.CUDA, F# Interactive, Charts

# Generating Permutations: Clojure or F#: Part 2

Marching on from the last post. Lazy Sequences This is my favorite feature ever. If I want to generate just a few of 10! (nobody even knows how much that is) permutations, I could: provided, the function is defined (as described in the first post): Here I am not sure which language I like more. … Continue reading Generating Permutations: Clojure or F#: Part 2
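For readers who want to see the "just a few of 10!" idea in runnable form, here is a minimal sketch in C++ rather than the Clojure or F# the post itself uses. It swaps true lazy sequences for `std::next_permutation`, which steps through permutations on demand. The helper name `take_permutations` is hypothetical and does not come from the post. The point is only that taking three permutations never requires materializing all 3,628,800.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the post): returns at most `count`
// permutations of 0..n-1, in lexicographic order, stopping early
// instead of generating all n! of them.
std::vector<std::vector<int>> take_permutations(int n, std::size_t count) {
    std::vector<int> current(n);
    std::iota(current.begin(), current.end(), 0);  // 0, 1, ..., n-1

    std::vector<std::vector<int>> result;
    if (count == 0) return result;

    do {
        result.push_back(current);
        // The size check comes first, so no extra permutation is
        // computed once enough have been collected.
    } while (result.size() < count &&
             std::next_permutation(current.begin(), current.end()));
    return result;
}
```

Calling `take_permutations(10, 3)` does three steps of work, not 10! of them. Asking for more permutations than exist simply returns all of them.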

# Supercharging SQL Join with GTX Titan, CUDA C++, and Thrust: Part 2

Note: All this code is now on GitHub. Compute the matches Here is a simple, purely brute-force algorithm for computing the join mentioned in Part 1. Here is the entirely "CPU" implementation of the algorithm: Loop over both datasets, compare them one-by-one, if there is a match, flag it. The only thing to note … Continue reading Supercharging SQL Join with GTX Titan, CUDA C++, and Thrust: Part 2
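The excerpt cuts off before the author's implementation, so the following is only a sketch of the loop it describes: compare every element of one dataset with every element of the other and flag the matches. The function name `brute_force_match` and the use of plain `int` keys are assumptions made for illustration. The real code is in the full post and on GitHub.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the author's code): flags[i] is true when
// left[i] equals at least one key in `right`. Every pair may be
// compared, so the cost is O(left.size() * right.size()).
std::vector<bool> brute_force_match(const std::vector<int>& left,
                                    const std::vector<int>& right) {
    std::vector<bool> flags(left.size(), false);
    for (std::size_t i = 0; i < left.size(); ++i) {
        for (std::size_t j = 0; j < right.size(); ++j) {
            if (left[i] == right[j]) {
                flags[i] = true;  // match found: flag it
                break;            // one match is enough for this row
            }
        }
    }
    return flags;
}
```

Each iteration of the outer loop is independent of the others, which is what makes this kind of brute-force comparison a natural candidate for one-thread-per-row execution on a GPU.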

# Supercharging SQL Join with GTX Titan, CUDA C++, and Thrust: Part 1

This is a post in two parts: Part 1 - The problem, solution setup, the algorithm. Part 2 - (The juicy) Implementation details, discussion. Suppose at the heart of the data layer of a web application there is a join like this: This join filters patents belonging to a set of classes from the Patents … Continue reading Supercharging SQL Join with GTX Titan, CUDA C++, and Thrust: Part 1

# Compiling CUDA Projects with Dynamic Parallelism (VS 2012/13)

Just a quick note. If you are starting from a template C++ CUDA project in VS 2012/2013, calling a kernel from a kernel (dynamic parallelism) will not compile: error : kernel launch from __device__ or __global__ functions requires separate compilation mode To fix this, first make sure your hardware supports it (compute capability 3.5 or higher) … Continue reading Compiling CUDA Projects with Dynamic Parallelism (VS 2012/13)