% Week 6 Progress Report
% Abdulmajed Dakkak
% July 18th, 2008
Monday
======
Got the FCHC program to work and explained how to split a CUDA program to
Professor Francis and Katie.
Code
----
* FCHC ([.c][fchc.m6.c], [xcode.tar.gz][fchc.m6.tar.gz])
[fchc.m6.c]: code/fchc.m6/main.c
[fchc.m6.tar.gz]: code/fchc.m6.tar.gz
Tuesday
=======
Fixed an embarrassing bug in the FCHC code (all the indecies were off by one).
Dr. Francis told me that there is no reason to pursue FCHC anymore since it
branches too much to be placed on a `SIMD` machine. It seemed like the project
has ended and I had to pick a new one, but after doing some more research I
found a different way of doing 3D fluid simulation using Cellular automata ---
the Lattice Boltzmann Method. The 3D version, namely D3Q19, is used frequently
to model free surfaces. The collision could also be written using no `if`
statements, which makes it ideal for `CUDA`.
The only disadvantage of LBM is that while it can be thought of as an
extension of Lattice Gases and uses CA as its basis, its detached for the CA.
Code
----
* [FCHC.c] (fixed)
* [D2Q9.c]
[FCHC.c]: code/fchc.t6.c
[D2Q9.c]: code/d2q9.c
Wednesday
=========
Started implementing `D2Q9`, and by the middle of the day I had something
working, but blew up to infinity after a few time steps. I spent the rest of
the day debugging the code.
Thursday
========
Read a bit about adaptive time stepping and wrote some [code][d2q9-adaptive]
to perform it, but quickly realized that the paper I was looking at did not
define what mass is, and I did not know. The paper defined mass based on
density, and density based on mass, and while density makes sense to define,
mass was not. Just before the visit to Wolfram, I did get the `d2q9` code to
work. The problem with going to infinity was because the equations will blow
up if any of the velocity vectors in the cell exceed $\frac{1}{3}$. After
setting the initial vectors to $\frac{1}{3}$ and disallowing any vectors to
grow over $\frac{1}{3}$, the system works.
The visit to Wolfram was helpful, and he gave me good advice. It seems that
only one things was of interest to him, however, and that is CUDA. I described
what CUDA is to him, and what work I have been doing. Wolfram said that the
best experiment to do is to use FHP and have a system where there are no
random choices and only a cylinder as an obstacle. It seems like he wants to
see whether turbulence occurs because of the obstacle. He said that there are
three scientific camps each believing they know the cause of the turbulence
(which is not predicted by the Navier-Stokes equations):
1. The molecular dynamics at the atomics level are noticeable at the fluids
level and that is the cause of the turbulence.
2. The initial conditions are the cause of the turbulence. This is the Chaos
camp that believes that small variations of the initial conditions have a
grave impact on the final outcome.
3. The cellular automata view, which he advocates, is that randomness can grow
out of a deterministic system (I did not understand this point as I did for
the other two).
He says that while some work has been done in the early 90s to show that his
theory works (citing an example where scientists found a periodicity of 2
hours in the turbulence of liquid nitrogen), the work of lattice gases has
faded.
While it is clear that he values lattice gases over lattice Boltzmann, he did
offer a compelling argument to use LGCA and that is that whatever outcome is
produced is due to molecular interaction --- no scientist will blame error on
numerical inaccuracies. He mentioned that turbulence was for many years
regarded as an numerical error in computation fluid dynamics, but he showed
that it wasn't.
He did mention that he worked on a 3D version of LGCA and that it is simple to
derive a boolean expression for it, but I have yet to find a paper describing
anything but FCHC for 3D LGCA.
After getting the advice from Wolfram, and two copies of the periodic table
poster, I went home to sleep. After 2 hours of attempting to sleep, I figured
out how to place obstacles in my `d2q9` program and how to perform collisions.
I quickly wrote the program down, and it worked.
Code
----
* [D2Q9.c]
[D2Q9.c]: code/d2q9.c
[d2q9-adaptive]: code/d2q9-adaptive-time.c
Friday
======
Starting placing the `d2q9` in CUDA. While this process is trivial, there are
many nuances one has to deal with. First, one has to identify which part of
the program must be parallelize --- the collision and propagation in my case.
Second, one must flatten the array --- 3D in my case. Finally, one must copy
the data from the CPU to the GPU correctly and have optimization in mind from
the beginning --- placing unchanging data in constant cached memory. I spent 2
hours to get the function parallelize, the arrays flattened, and the constant
data into cache. While the program does not work correctly, it does compile. A
few hours were then spent teaching Katie how to parallelize the Mandelbrot
set.
Code
----
* [d2q9-cuda][lbm-d2q9]([tar.gz][lbm-d2q9.tar.gz])
[lbm-d2q9]: code/lbm-d2q9
[lbm-d2q9.tar.gz]: code/lbm-d2q9.tar.gz