CUDA Week in Review
Fri., Nov. 9, 2012, Issue #83
Welcome to CUDA: WEEK IN REVIEW, a news summary for the worldwide CUDA, GPGPU and parallel programming community.
NEW: NVIDIA Tesla K20 Now Available for Pre-Order. Tesla K20s for cluster nodes and workstations are available for pre-order today. With over 1 TF double precision performance, Tesla K20 is the world’s fastest and most energy-efficient GPU accelerator. See
GTC 2013 UPDATE: Call for Posters is now open. This is a great opportunity to share innovative research topics in the areas of GPU computing, computer graphics, cloud graphics and game development. Visit the Call for Posters page.
CUDA TECH TIP: When debugging code or outputting results, the first go-to for any programmer is usually C’s trusty printf() function. Did you know that CUDA allows printf() directly from a GPU kernel? Read more.


GPU Power for James Bond
This week’s Spotlight is on Mark Jaszberenyi, co-founder of Colorfront in Budapest, Hungary.

Colorfront’s CUDA-accelerated technology powered the production workflow for the new James Bond 007 movie, Skyfall, which opens today in the U.S.

Read the interview with Mark Jaszberenyi.


GPU Technology Theater at SC12
Check out the NVIDIA GPU Technology Theater in booth #2217 to hear industry experts talk about advances in scientific discovery as well as parallel programming tips and hints. Featured speakers include:
   • Bill Dally, NVIDIA
   • Jack Dongarra, University of Tennessee and ORNL
   • Felice Pantaleo, CERN
Can’t make it to SC12? Watch the live web stream on the NVIDIA SC12 event page. For more info, visit

Petascale Earthquake Simulation
ONERA, the French Aerospace Lab, will host a seminar on Nov. 19 in Chatillon, France on "Petascale Earthquake Simulations" by Prof. Yifeng Cui, San Diego Supercomputer Center. For info, contact thle@onera(dot)fr and gdenis@onera(dot)fr.

Abstract: "Earthquake system science seeks to provide society with better predictions of earthquake causes and effects. Toward this goal, we have developed a highly scalable 3D finite difference code called AWP-ODC that has achieved "M8", a full dynamical simulation of a magnitude-8 earthquake on the southern San Andreas fault. This code was implemented with CUDA-MPI to allow efficient utilization of accelerators on hybrid GPU systems."

NVIDIA Nsight 3.0 Visual Studio Edition Preview
The new NVIDIA Nsight 3.0 Visual Studio Edition preview release supports CUDA 5.0 and the Kepler architecture in Tesla K20, with full CUDA Dynamic Parallelism debugging support and tracing of parent/child kernel launches. Available to registered Nsight developers. See:

NVIDIA Developer Forums
The new NVIDIA developer forums are now live. Join the new online community to learn from other developers and share your experience. Sign up at


Title: Scalable GPU Acceleration of B-Spline Signal Processing Operations
Author: Alexander Karantza, Rochester Institute of Technology
Advisor: Dr. Sonia Lopez Alarcon


HGST, a Western Digital Company, seeks a researcher in the area of magnetic recording heads. Candidate should have expertise in analytical and numerical modeling of nano-structured devices using micromagnetic, electron and spin-transport, thermal, and/or optical modeling. Desired skills: Physics or EE; MPI, OpenMP, and/or CUDA experience. Job #6366. Contact: hgstcareers (at)


New on the NVIDIA blog:
How NVIDIA Gave Film Makers Early Look at Skyfall, by Joachim Zell, EFILM
How GPUs Accelerate Innovation, by Liza Gabrielson
Four Amazing Things Carnegie Mellon is Doing, by Chandra Cheij
Beyond Seeing Through Walls to Real-Time Body Scanning, by Bob Sherbin

New on the Parallel Forall blog:
Performance Metrics in CUDA Fortran, by Greg Ruetsch
An Easy Introduction to CUDA C and C++, by Mark Harris
An Easy Introduction to CUDA Fortran, by Greg Ruetsch
Do More, Code Less with ArrayFire GPU Matrix Library, by Chris McClanahan, AccelerEyes
(Subscribe to the Parallel Forall RSS feed)


Output from a GPU Kernel with printf(): When debugging code or outputting results, the first go-to for any programmer is usually C’s trusty printf() function.

Did you know that CUDA allows printf() directly from a GPU kernel? All the usual format specifiers are supported, and every thread can call printf() independently. If you want to output something only once, however, you must guard the call with a condition that selects a single thread. The following example shows how it works:
#include <cstdio>

__global__ void helloCUDA() {
    if (threadIdx.x == 0)
        printf("Hello CUDA from...\n"); // Only thread 0 of each block will print this
    printf("Thread %d\n", threadIdx.x); // Every thread will print this
}
Note that printf() in device code requires a GPU with Compute Capability 2.0 or later.
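
To see the output, the kernel has to be launched from host code, and the host has to wait for the kernel to finish so the device-side printf() buffer is flushed. The following is a minimal sketch of a complete program, not part of the original tip. The launch configuration (one block of four threads) and the error-handling style are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same kernel as in the tip: thread 0 of each block prints a header line,
// then every thread prints its own index within the block.
__global__ void helloCUDA() {
    if (threadIdx.x == 0)
        printf("Hello CUDA from...\n");
    printf("Thread %d\n", threadIdx.x);
}

int main() {
    // Assumed launch configuration: 1 block of 4 threads.
    helloCUDA<<<1, 4>>>();

    // Kernel launches are asynchronous, so check for launch errors first.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Device printf() output is buffered on the GPU; synchronizing with the
    // device flushes that buffer to the host's stdout.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Without the cudaDeviceSynchronize() call the program may exit before any device output appears. With more than one block, the "Hello CUDA from..." line is printed once per block, because threadIdx.x is the thread's index within its block.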
(Thank you to Mark Harris for this Tech Tip. Have one you’d like to share? Email:


Meetup Momentum: Congrats to the brand new GPU Meetup in Minneapolis, Minn., led by Kirk Dybvik. The New York GPU Meetup, led by Andrew "Shep" Sheppard, now has over 800 members. The Silicon Valley GPU Meetup, led by Jike Chong, has over 500 members. The Boston GPU Meetup, led by Eliot Eshelman, has over 250 members.

Find a GPU Meetup in your location, or start one up. Upcoming meetings include:
Paris, Nov. 13
Brisbane, Nov. 22
New York, Nov. 29
Silicon Valley, Dec. 3


SC12
Nov. 10-16, 2012, Salt Lake City, Utah
NVIDIA booth #2217

Intro to OpenACC and CUDA (FCSCL)
Nov. 12-14, 2012, Leon, Spain
In partnership with the BSC/UPC CUDA Center of Excellence

Nov. 19, 2012, Chatillon, France
Prof. Yifeng Cui, SDSC, on Petascale Earthquake Simulations
Contact: thle@onera(dot)fr and gdenis@onera(dot)fr

GPUs in the Cloud
Dec. 3-6, 2012, Taipei, Taiwan

Many-Core Developer Conference (UKMAC 2012)
Dec. 5, 2012, University of Bristol, UK

Dec. 6, 2012, Montpellier, France

Getting Started with ArrayFire: 30-Minute Jump Start (Webinar)
Dec. 13, 2012


GPU Tech Conference (GTC 2013)
March 18-21, 2013, San Jose, Calif.
See list of tutorials:

(To list an event, email:


CUDA Consulting

Training, programming, and project development services are available from CUDA consultants around the world. To be considered for inclusion on the list, email: (with CUDA Consulting in subject line).

GPU Computing on Twitter

For daily updates about GPU computing and parallel programming, follow @gpucomputing on Twitter.


CUDA 5 survey:
CARMA (pre-register):

CUDA on the Web

GPU Test Drive:
Learn about upcoming Udacity course:
Learn more about CUDA:
Network with other developers:
Stay tuned to GPGPU news and events:
Newsletter archive:
CUDA Spotlights:


CUDA® is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). NVIDIA provides a complete toolkit for programming on the CUDA architecture, supporting standard computing languages such as C, C++ and Fortran. Send comments and suggestions on the newsletter to
Copyright © 2012 NVIDIA Corporation. All rights reserved.
2701 San Tomas Expressway, Santa Clara, CA 95050.