What is CUDA?

CUDA is the name of NVIDIA’s parallel computing architecture in our GPUs. NVIDIA provides a complete toolkit for programming the CUDA architecture that includes the compiler, debugger, profiler, libraries and other information developers need to deliver production quality products that use the CUDA architecture. The CUDA architecture also supports standard languages such as C and Fortran, and APIs for GPU Computing, such as OpenCL and DirectCompute.

What’s the performance increase with CUDA?

This depends on how well the problem maps onto the architecture. For data parallel applications, we have seen speedups anywhere from 10x to 200x.

See the CUDA Zone web page for many examples of applications accelerated using CUDA.

What operating systems does CUDA support?

CUDA supports Windows XP, Windows Vista, Windows 7, Linux, and OS X. Both 32-bit and 64-bit versions are supported across these operating systems.

Which applications support CUDA?

Some examples of shipping consumer applications:

  • Badaboom – video encoding
  • MotionDSP – vReveal (video enhancement – reduce noise, increase resolution, stabilize image)
  • ArcSoft – SimHD (upscaling to HD)
  • CyberLink – PowerDirector 7 (encoding, video filters)
  • Pegasys – TMPGEnc 4.0 Xpress (video filters including noise reduction, sharpening)
  • SETI@home (analyze volumes of radio telescope signals searching for ET)

For gamers, there are titles with CUDA-accelerated PhysX effects, such as Mirror's Edge, Sacred 2, Cryostasis, and more.

Is CUDA a programming language?

CUDA is our architecture for GPU Computing and makes it possible to run standard C on our GPUs. To make this possible, NVIDIA has defined a general computing instruction set (PTX) and a small set of C language extensions that allow developers to take advantage of the massively parallel processing capabilities in our GPUs. The Portland Group is providing support for Fortran on the CUDA architecture, and others provide support for Java, Python, .NET, and other languages.

We use the term CUDA C to describe the language and the small set of extensions developers use to specify which functions will be executed on the GPU, how GPU memory will be used, and how the parallel processing capabilities of the GPU will be used by an application.
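As a rough illustration of those three kinds of extension, the sketch below marks functions for GPU execution (`__global__`, `__device__`), places a buffer in on-chip memory (`__shared__`), and specifies the parallel launch shape (`<<<blocks, threads>>>` together with the built-in `blockIdx`, `blockDim`, and `threadIdx`). The names `square`, `sum_of_squares`, `gpu_sum_of_squares`, and `BLOCK` are made up for this example, the block size of 256 is an arbitrary choice, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define BLOCK 256  // threads per block (illustrative; must be a power of two here)

// __device__: a function that runs on the GPU and is callable only from GPU code.
__device__ float square(float v) { return v * v; }

// __global__: a kernel, launched from the host and executed on the GPU.
__global__ void sum_of_squares(const float *in, float *block_sums, int n)
{
    // __shared__: on-chip memory shared by all threads of one block.
    __shared__ float partial[BLOCK];

    // Built-in variables identify this thread within the launch.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    partial[threadIdx.x] = (i < n) ? square(in[i]) : 0.0f;
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        block_sums[blockIdx.x] = partial[0];
}

// Host-side helper: copies data to the GPU, launches the kernel,
// and adds up the per-block results on the CPU.
float gpu_sum_of_squares(const float *host_in, int n)
{
    int blocks = (n + BLOCK - 1) / BLOCK;
    float *d_in, *d_sums;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_sums, blocks * sizeof(float));
    cudaMemcpy(d_in, host_in, n * sizeof(float), cudaMemcpyHostToDevice);

    // <<<blocks, threads>>>: the launch configuration extension.
    sum_of_squares<<<blocks, BLOCK>>>(d_in, d_sums, n);

    float *h_sums = (float *)malloc(blocks * sizeof(float));
    cudaMemcpy(h_sums, d_sums, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    float total = 0.0f;
    for (int b = 0; b < blocks; ++b)
        total += h_sums[b];

    free(h_sums);
    cudaFree(d_in);
    cudaFree(d_sums);
    return total;
}

int main()
{
    const int n = 1000;
    float *h = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i)
        h[i] = 1.0f;
    printf("sum of squares = %f\n", gpu_sum_of_squares(h, n));
    free(h);
    return 0;
}
```

Everything outside the marked qualifiers, the launch syntax, and the built-in index variables is ordinary C/C++ plus calls to the CUDA runtime API.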

NVIDIA’s C language compiler was built using the Edison Design Group C language parser and the Open64 compiler, and was extended to support the CUDA C extensions. Both the EDG parser and Open64 compiler are used extensively by CPU companies for their compilers.

Is it hard to program a GPU?

It’s really a question of how hard it is to parallelize your critical code. The industry challenge in parallel processing, for both CPUs and GPUs, is to determine which algorithms take up the bulk of the computation time (the critical path) and to port only those algorithms that will scale.

The CUDA architecture removes much of the burden of manually managing parallelism. An algorithm written for the CUDA architecture is actually just a serial algorithm that can be run on many different processors simultaneously, often called a kernel. The GPU takes this kernel and executes it in parallel by launching thousands of instances across many processors in the GPU. Since most algorithms start off as serial algorithms, it’s often trivial to port programs to the CUDA architecture. It can be as simple as converting a loop into a CUDA kernel using CUDA C. There’s no need to completely re-architect the entire program to be multi-threaded as you do with modern multi-core CPUs.
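To make the loop-to-kernel conversion concrete, here is a minimal sketch using SAXPY (y = a*x + y) as a stand-in workload. The function names `saxpy_cpu`, `saxpy_gpu`, and `count_mismatches` are hypothetical, the 256-thread block size is an arbitrary choice, and error checking is left out to keep the example short. The loop body becomes the kernel body, and the loop index is replaced by an index computed from the thread's position in the launch.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Serial CPU version: one thread walks the whole array.
void saxpy_cpu(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// CUDA kernel: the same loop body, with one thread per element.
__global__ void saxpy_gpu(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // the last block may have threads past the end of the array
        y[i] = a * x[i] + y[i];
}

// Runs both versions on the same input and counts differing elements.
// Inputs are small integers so the arithmetic is exact on CPU and GPU.
int count_mismatches(int n)
{
    size_t bytes = n * sizeof(float);
    float *x = (float *)malloc(bytes);
    float *y_cpu = (float *)malloc(bytes);
    float *y_gpu = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) {
        x[i] = (float)(i % 100);
        y_cpu[i] = 1.0f;
        y_gpu[i] = 1.0f;
    }

    saxpy_cpu(n, 2.0f, x, y_cpu);

    float *d_x, *d_y;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y_gpu, bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    saxpy_gpu<<<blocks, threads>>>(n, 2.0f, d_x, d_y);

    // This blocking copy waits for the kernel to finish first.
    cudaMemcpy(y_gpu, d_y, bytes, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (y_cpu[i] != y_gpu[i])
            ++mismatches;

    cudaFree(d_x);
    cudaFree(d_y);
    free(x);
    free(y_cpu);
    free(y_gpu);
    return mismatches;
}

int main()
{
    printf("mismatches: %d\n", count_mismatches(1 << 20));
    return 0;
}
```

The host code that allocates GPU memory and copies data is the main addition compared with the serial version; the algorithm itself is unchanged.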

The programming models used by the CUDA architecture, OpenCL, and DirectCompute are very similar, and many developers feel it is significantly easier to get good, scalable performance from this model than from other approaches to parallelism.

Are there GPU Computing courses that students can take today?

Within 12 months of the first course being taught at the University of Illinois, over 300 universities, colleges, and schools began teaching parallel programming using the CUDA SDK and Toolkit. In addition, more than 1,000 universities offer generic parallel programming courses that provide the fundamentals students need to start applying parallel processing to the algorithms in their domains. This is one of the largest shifts in computer science teaching in decades.

A complete map and listing of schools offering courses on programming the CUDA architecture is available on NVIDIA's website.

Does NVIDIA support OpenCL or DirectCompute?

Absolutely. We support all standard APIs. At GDC 09, NVIDIA demonstrated examples written both in our C language programming environment and in the OpenCL and DirectCompute APIs. OpenCL was developed on NVIDIA GPUs, and NVIDIA was the first to demonstrate an OpenCL application running on a GPU, at SIGGRAPH Asia in December. NVIDIA released its OpenCL driver to strategic developers in April 2009. NVIDIA was also the first to release support for DirectCompute in its GPUs.

How does CUDA relate to OpenCL?

Our CUDA C programming model was an inspiration for OpenCL and other programming interfaces. OpenCL, DirectCompute, our CUDA C and The Portland Group’s CUDA Fortran extensions all use similar concepts for porting parallel applications to the GPU.

What language does NVIDIA recommend developers use: OpenCL, CUDA C, CUDA Fortran, or DirectCompute? Why would someone continue to use CUDA C now that OpenCL is available? What is the advantage to a developer?

This boils down to personal preference. Developers will use the programming interface most comfortable to them, i.e. one that supports a development environment, libraries and OS that they are accustomed to using.

Since NVIDIA and the CUDA architecture support all of the above languages, this choice sits with the developer. How a developer chooses a programming environment depends on a classic combination of questions: when they need to start coding, what language they use today, what operating systems they need to support, what other code or libraries they need to integrate, what vendor technical support is available, what legacy code they use, and so on.

As support for other languages and APIs matures, developers can port their code as needed, since many of the programming concepts are similar. The same is true of developing for the CPU today: no one forces a developer to use C, C++, C#, or Java. Developers get to choose, and in the end they benefit from having a variety of choices.