CUDA Spotlight: Patrick Roye
By Calisa Cole, posted Aug 13, 2013
Ultra-Low Latency Shape Sensing Using CUDA
This week's Spotlight is on Patrick Roye of Luna, a technology development company based in Virginia. In the healthcare market, Luna is a pioneer in fiber-optic shape and position sensing. Its technology is being developed to integrate into systems that perform minimally invasive diagnostics, surgery and therapy, pinpointing the position and shape of an instrument inside the body.
Patrick works on accelerating Luna's processing algorithms using GPUs. He and a team of engineers and scientists are developing a prototype system that uses CUDA to calculate the shape of a fiber-optic sensor in real time. Luna's shape-sensing systems, which are currently in development, will be used to guide the next generation of medical robotic systems safely through a patient's body.
This interview is part of the CUDA Spotlight Series.
Q & A with Patrick Roye
NVIDIA: Patrick, tell us about your work at Luna.
NVIDIA: What are some applications of Luna's technology?
NVIDIA: Why did you choose to work with GPUs?
Fortunately, at the same time the door closed on our FPGAs, NVIDIA opened a window with the announcement of RDMA for GPUDirect. Since we had used CUDA a year earlier to accelerate our strain and temperature sensing calculations, we already had an idea of the advantages of GPU-accelerated processing. With RDMA for GPUDirect and CUDA-accelerated processing, we determined that we could perform data acquisition and minimal processing on an FPGA, transfer our data directly to the GPU for processing and then transfer the results back to the FPGA fast enough to meet our real-time requirements.
Get it working first: There's no point in doing something fast if you're doing it wrong.
Take time to generate comprehensive unit tests for each of the kernels: Once you begin optimizing, these unit tests will be invaluable for ensuring your optimizations don't introduce new processing bugs.
Implement each kernel a few different ways: There were a few times where an implementation I was almost certain would be slower turned out to be the fastest one. Additionally, thinking through multiple solutions to one kernel may give you an idea that helps accelerate a different kernel later on.
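The advice above on unit tests and multiple implementations can be illustrated with a minimal sketch. This is not Luna's code and it does not use CUDA; it uses plain Python only to show the pattern: keep a simple, trusted reference implementation, write an alternative implementation of the same operation, and check the alternative against the reference on many inputs. The prefix-sum operation and the names `scan_reference`, `scan_doubling`, and `check_against_reference` are assumptions chosen for illustration.

```python
import random


def scan_reference(values):
    """Inclusive prefix sum, written as simply as possible (the trusted baseline)."""
    out = []
    total = 0
    for v in values:
        total += v
        out.append(total)
    return out


def scan_doubling(values):
    """Inclusive prefix sum using the step-doubling pattern common in parallel scans.

    Each pass adds the element `offset` positions to the left. On a GPU, every
    index within a pass could be handled by its own thread.
    """
    out = list(values)
    offset = 1
    while offset < len(out):
        prev = list(out)  # snapshot: a pass reads only the previous pass's results
        for i in range(offset, len(out)):
            out[i] = prev[i] + prev[i - offset]
        offset *= 2
    return out


def check_against_reference(candidate, trials=100, seed=0):
    """Return True if `candidate` matches the reference on edge cases and random inputs."""
    rng = random.Random(seed)  # fixed seed keeps the test repeatable
    cases = [[], [5], [1, 2, 3, 4]]
    cases += [
        [rng.randint(-100, 100) for _ in range(rng.randint(0, 64))]
        for _ in range(trials)
    ]
    return all(candidate(case) == scan_reference(case) for case in cases)
```

Calling `check_against_reference(scan_doubling)` after each change to `scan_doubling` gives a quick signal that an optimization has not altered the results. In a GPU project the same idea would typically compare kernel output copied back to the host against a CPU reference, using a tolerance rather than exact equality when the data is floating point.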
NVIDIA: Tell us about some of the computations performed by the many CUDA kernels you use.
NVIDIA: What types of parallel algorithms are being implemented?
NVIDIA: In your field, what are the biggest challenges going forward?
That’s a lot of space and energy required for components that basically act as glue between the FPGA and GPU. We’re very excited about NVIDIA’s roadmap toward Parker, which we hope will allow us to shrink our design considerably by combining a next-generation Maxwell GPU with a 64-bit ARM core in a single package.
NVIDIA: How did you first learn about GPU computing?
NVIDIA: If you had more computing power, what could you and your team do?
NVIDIA: Describe your development infrastructure.
NVIDIA: What advice would you offer others who are considering GPU computing?
I found that article to be the most helpful in understanding how the CUDA architecture works and how to organize algorithms to get the most performance out of the GPU.
NVIDIA: How did you become interested in the area of sensor technology?
Bio for Patrick Roye
Patrick and his wife Joanna live in Blacksburg, Virginia. Patrick enjoys working on challenging software problems. He has developed toolkits for programming collaborative simulations in virtual environments. He also created software that used genetic algorithms to control robotic arms using shape-sensing fiber-optic sensors.