Another CUDA post – how to move to CUDA compute 1.3 safely

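CUDA compute capability 1.3 (and higher) adds features that you might want to use, but it also adds double-precision support. On earlier targets the compiler demoted doubles to floats, so a stray double cost nothing; from 1.3 onwards it is compiled as a real double. This can be troublesome in performance-critical applications, as double arithmetic is much slower than float. Here are some tips for making your code use only floats. Please comment with any other tips you have!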

The solution is simply to label every float literal explicitly. For example, 0.1 is interpreted by the compiler as a double. Any float it is combined with is promoted to double as well, so the double can propagate through your code, and in an inner loop it could seriously impact your performance. Whenever you write a floating point literal and don’t want a double, add an ‘f’ to the end, thus: ‘0.1f’.
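To make the promotion concrete, here is a minimal sketch. It is written as plain C++ so it compiles without a GPU; the same literal rules apply inside CUDA C++ kernels. The helper names (is_double, scale_with_double_literal, scale_with_float_literal, check_literal_types) are made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper (not from any library): reports whether the
// expression passed to it has type double.
template <typename T>
constexpr bool is_double(T) {
    return std::is_same<T, double>::value;
}

// 0.1 is a double literal, so x is promoted and the multiply is evaluated
// as a double before the result is narrowed back to float on return.
inline float scale_with_double_literal(float x) {
    return x * 0.1;
}

// 0.1f is a float literal, so the whole expression stays in single precision.
inline float scale_with_float_literal(float x) {
    return x * 0.1f;
}

// Confirms the expression types described above.
inline void check_literal_types() {
    float x = 2.0f;
    assert(is_double(x * 0.1));    // float * double literal -> double
    assert(!is_double(x * 0.1f));  // float * float literal  -> float
}
```

Both scale functions return a float, which is exactly why the problem is easy to miss: the double only exists in the middle of the expression.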

If you’ve already written a massive amount of code without this, here are some tips for finding doubles that have crept in.

  • Regex for “0\.[0-9]+” (note the escaped dot), and match the whole word (add word start and end tags, or tick match whole word in your editor). This will identify all ‘0.0’s, and ignores all ‘0.0f’s, because the trailing ‘f’ stops the whole-word match. To also catch literals such as ‘1.5’, use “[0-9]+\.[0-9]+” instead. (This tip works for non-CUDA code too.)
  • Add -keep to the nvcc compile options. You can then open the generated PTX file in a text editor and search for f64. If there are any occurrences of f64, your code is using doubles at some point. If you successfully did the above step there should be none.
  • Watch the compiler output for warnings about float and double conversion.
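The regex search from the first tip can also be scripted. The sketch below uses std::regex from the C++ standard library; the function name find_unsuffixed_literals is hypothetical, and the pattern is the generalised form that accepts any leading digits. It is deliberately naive: it does not skip comments or string literals, and it ignores exponent forms such as 1e-3, so treat the hits as candidates to review by hand.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every decimal literal in `source` that has
// no 'f' suffix. The trailing \b fails when a digit is followed by 'f',
// which is how suffixed literals such as 0.5f are skipped.
std::vector<std::string> find_unsuffixed_literals(const std::string& source) {
    static const std::regex pattern(R"(\b[0-9]+\.[0-9]+\b)");
    std::vector<std::string> hits;
    for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
         it != end; ++it) {
        hits.push_back(it->str());
    }
    return hits;
}
```

Given the line "float a = 0.1; float b = 0.5f; float c = 12.25;" this reports 0.1 and 12.25 and leaves 0.5f alone.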

California! GTC 2010



This week I am in… California! You would hardly have guessed from the title of the post, eh? I’m here for work, at the NVIDIA GTC 2010 conference. This is my second day here, and while yesterday was just a tutorial day, I’ve seen some amazing stuff today.

– NVIDIA named the next two GPUs in their roadmap, Kepler and Maxwell, and claimed an 8x improvement in performance per watt by 2013.

– Adobe showed off an amazing piece of digital photography tech that lets you refocus after taking the shot. It uses the high megapixel counts of current sensors and many lenses to capture lots of small images, then stitches them together in software. Magic.

– Stacks of stereo vision, auto stereo and surround displays, some really impressive, others blah. Auto stereo was a bit nauseating, I thought, which is a bad omen for the 3DS.

– A nice multi-touch screen extension, allowing 32 simultaneous touch points, and it did indeed seem pretty robust.

Other than that, I curse my biological rhythm: my body is sure it should be asleep despite not having been awake long enough yet!