SC15 | Austin, TX | November 16-19, 2015

GPU TECHNOLOGY THEATER AT SC15

MONDAY, NOVEMBER 16 - THURSDAY, NOVEMBER 19
DURING EXHIBITION HOURS | NVIDIA BOOTH #1021

Come hear how others in your field are taking on a wide range of topics in HPC and accelerated computing. The theater will host talks every 30 minutes, every day, from an impressive lineup of industry luminaries and scientific experts—including two ACM Gordon Bell Finalists:

  • Jack Dongarra, University of Tennessee
  • John Stone, University of Illinois at Urbana-Champaign
  • James C. Phillips, University of Illinois at Urbana-Champaign
  • Jeroen Tromp, Princeton University
  • Mathieu Luisier, ETH Zurich
  • Dirk Pleiter, Jülich Supercomputing Centre
  • Paul Shapiro, The University of Texas at Austin
  • Alex Feltus, Clemson University
  • Bryan Catanzaro, Baidu Research
  • Mark Staveley, Microsoft
  • Brad Bebee, SYSTAP

The theater is open to all attendees, and seating is first come, first served. We recommend arriving early to reserve your seat.

 

MONDAY, NOVEMBER 16 | BOOTH #1021

7:30PM - 8:30PM


Accelerated Computing: The Path Forward

 

 

Jen-Hsun Huang (NVIDIA)

Jen-Hsun Huang

Jen-Hsun Huang co-founded NVIDIA in 1993 and has served since its inception as president, chief executive officer and a member of the board of directors. Prior to founding NVIDIA, Huang worked at LSI Logic and Advanced Micro Devices. He holds a BSEE degree from Oregon State University and an MSEE degree from Stanford University.

Ian Buck (NVIDIA)

Ian Buck

Ian Buck is an inventor and developer of CUDA GPU computing technology, and Vice President of NVIDIA's Accelerated Computing business unit, which includes all hardware and software product lines, third-party enablement, and inbound marketing activities for GPU computing at NVIDIA. CUDA has received multiple recognitions (HPCWire Readers' & Editors' Choice Awards, Best HPC Software, Technologies to Watch, Popular Science's "Best of What's New") and remains the established standard for GPU computing in the HPC community; it is the technology foundation for the $100M Tesla product line and runs on three of the five top supercomputers in the world.

 

TUESDAY, NOVEMBER 17 | BOOTH #1021

10:30AM - 10:50AM


The Path to Exascale Computing

 

Achieving sustained Exascale performance on demanding applications requires advances in energy efficiency and programmability. With Moore's law no longer providing historic efficiency gains from semiconductor scaling, advances in circuits and architecture are needed for energy efficiency. With deep storage hierarchies and billion-thread concurrency, target-independent programming systems with data-aware runtimes are required to simplify programming while providing performance on future hardware. This talk will review the challenges of Exascale and give examples on recent progress toward this goal.


 

Bill Dally (NVIDIA Research)

Bill Dally

Bill Dally is chief scientist at NVIDIA and senior vice president of NVIDIA Research. Dally first joined NVIDIA in 2009 after spending 12 years at Stanford University, where he was chairman of the computer science department and the Willard R. and Inez Kerr Bell Professor of Engineering. Dally has published more than 200 papers, holds more than 100 issued patents and is the author of three textbooks.

 

11:00AM - 11:20AM


Bringing NVIDIA GPUs to Azure

 

Microsoft and NVIDIA have partnered to bring both GPU compute and high-end visualization capabilities to Microsoft Azure. This talk presents an overview of the different Virtual Machines with NVIDIA GPUs that will be featured on Microsoft Azure.


 

Mark Staveley (Microsoft)

Mark Staveley

Mark Staveley is a Senior Program Manager with Azure's High Performance Computing team where he is responsible for overseeing Azure's Accelerated Computing and Visualization Program. Prior to Azure, Mark worked for Microsoft Research where he oversaw their Large-Scale Data Management and Processing Program, and architected their large-scale GPU processing systems.

Mark holds a BSc from Queen's University, a MSc from the University of Waikato, and a PhD from Memorial University.

Mark is also an Associate Member of the Knowledge Media and Design Institute at the University of Toronto, and an Adjunct Instructor with the University of Washington's Data Science and Engineering Professional Certification Program.

 

11:30AM - 11:50AM


Towards a High Performance Analytics and Computing Platform for Brain Research

 

The European Human Brain Project aims for a pre-exascale supercomputing facility with an exceptional memory capacity of up to 20 PBytes to facilitate future brain research. In this talk we introduce this fascinating area of research, discuss its challenging requirements, and examine the resulting needs for new HPC architectures and technologies.


 

Dirk Pleiter (Jülich Supercomputing Centre)

Dirk Pleiter

Prof. Dr. Dirk Pleiter is research group leader at the Jülich Supercomputing Centre (JSC) and professor of theoretical physics at the University of Regensburg. At JSC he leads a group on application oriented technology development and is principal investigator of several exascale labs including the POWER Acceleration and Design Center.

 

12:00PM - 12:20PM


An Open-Source CUDA Compiler

 

Deploying closed-source vendor binaries in datacenters is a big problem because of security concerns, binary dependencies, and missed performance opportunities. In this talk we preview our upcoming release of an open-source CUDA toolchain. It is on par with NVIDIA's toolchain for several open-source benchmark suites, winning in many cases. It supports modern language features and has faster compile times. We believe this compiler will enable reproducible compiler research and make deployment of GPUs in controlled and restricted environments dramatically simpler.


 

Robert Hundt (Google)

Robert Hundt

Robert Hundt is a Principal Engineer at Google, where he focuses on software for accelerators. He is strongly engaged in datacenter and compiler research.

 

12:30PM - 12:50PM


Visualization and Computation: Moving Forward Together

 

Whether your data is too large for post-processing or you demand a more interactive, exploratory workflow, leading edge HPC applications require tighter coupling of simulation, analysis and visualization. In this talk, I will show how NVIDIA GPUs can help you master this challenge.


 

Peter Messmer (NVIDIA)

Peter Messmer

Peter is part of NVIDIA's DevTech organization and leads the NVIDIA HPC visualization efforts. Prior to joining NVIDIA, Peter worked as a computational scientist and a developer of commercial HPC applications. He holds an MSc and a PhD in Physics from ETH Zurich.

 

1:00PM - 1:20PM


VASP on GPUs: When and How

 

We introduce VASP on the GPU, which will be included in the next VASP release. We begin by establishing robustness through the results of a multi-site beta and reviewing performance tuning techniques and associated performance expectations. We conclude with a tentative road-map and welcome feedback.


 

Max Hutchinson (University of Chicago)

Max Hutchinson

Max Hutchinson is a doctoral student in physics at the University of Chicago and a former Department of Energy Computational Science Graduate Fellow. He splits time between electronic structure and fluid dynamics, unified by an effort to use emerging architectures to do exciting science.

 

1:30PM - 1:50PM


Scaling Deep Learning

 

Training deep neural networks is computationally intensive, and speed matters. In this talk, I will detail how we scaled the training process for end-to-end speech recognition to multiple GPUs on multiple nodes. This will cover techniques including 16-bit floating-point arithmetic, optimized all-reduce algorithms, optimized GEMM kernels, and GPU implementation of the Connectionist Temporal Classification loss function. I will also discuss the use of GPUs for deploying deep neural networks to end users.


 

Bryan Catanzaro (Baidu Research)

Bryan Catanzaro

Dr. Bryan Catanzaro is a senior researcher at the Silicon Valley AI Lab of Baidu Research, where he uses clusters of GPUs to train large deep neural networks. Prior to joining Baidu, Bryan was at NVIDIA Research, where he researched tools and methodologies to make it easier to use parallel computing for machine learning. He earned his PhD from UC Berkeley, where he built the Copperhead language and compiler. He earned his MS and BS degrees from Brigham Young University.

 

2:00PM - 2:20PM


Extreme Level-of-Detail Simulation and Visualization

 

Combustion simulations are challenging because they capture features ranging in size from microns to meters. This presents problems for visualization, which typically needs to render the whole domain. SpaceX's approach uses multi-resolution data for both simulation and visualization, rendering simulation output directly at up to 20 levels of detail.


 

Stephen Jones (SpaceX)

Stephen Jones

Stephen leads the Simulation and Analytics group at SpaceX, working on large-scale simulation of rocket engines. Prior to SpaceX he worked at NVIDIA as the architect for the CUDA language. His background is in computational fluid mechanics and plasma physics, but he has worked in software for many years.

 

2:30PM - 2:50PM


HPC with the NVIDIA Accelerated Computing Toolkit

 

The NVIDIA Accelerated Computing Toolkit includes everything you need to accelerate your HPC applications. Learn about the transformative impact of GPU-accelerated computing on scientific research and commercial industries ranging from computational chemistry to artificial intelligence. See examples demonstrating how GPU-accelerated libraries like cuDNN and AmgX make it easy to accelerate existing applications; how the OpenACC SDK increases developer productivity and performance portability; and how CUDA programming tools enable high performance and flexibility in familiar HPC languages like C++ and Fortran. The NVIDIA Accelerated Computing Toolkit lets you start developing immediately with a complete set of development tools for embedded, desktop, server, and cloud datacenter environments.


 

Mark Harris (NVIDIA)

Mark Harris

Mark Harris is Chief Technologist for GPU Computing at NVIDIA, where he works as a developer advocate and helps drive NVIDIA's GPU computing software strategy. His research interests include parallel computing, general-purpose computation on GPUs, physically based simulation, and real-time rendering. Mark founded www.GPGPU.org while he was earning his PhD in computer science from the University of North Carolina at Chapel Hill. Mark lives off-grid in the beautiful MacKellar Range of northern New South Wales, Australia with his wife and daughter.

 

3:00PM - 3:20PM


Towards Exascale Seismic Imaging & Inversion

 

Imaging Earth's interior requires significant computational resources. Post-petascale supercomputers bring a cohort of concerns tied to obtaining optimum performance, including energy consumption, fault resilience, scalability of the current parallel paradigms, workflow management, I/O performance and feature extraction with large datasets. In this presentation, we focus on the last three issues.


 

Jeroen Tromp (Princeton University)

Jeroen Tromp

Jeroen received his B.Sc. from the University of Utrecht in 1988 and Ph.D. from Princeton in 1992. His awards include the Macelwane Medal of the American Geophysical Union, the Gordon Bell Award, and the Gutenberg Medal of the European Geophysical Union. He is a corresponding member of the Royal Netherlands Academy of Sciences.

 

3:30 PM - 3:50 PM


Petascale Biomolecular Simulation with NAMD on Titan, Blue Waters, and Summit

 

Hundred-million-atom simulations of viruses and small organelles on Titan and Blue Waters reveal biological mechanisms unobservable in smaller or coarser-grained simulations. NVIDIA GPUs are a critical technology for preparing, running, and analyzing these simulations today, and the upcoming Summit supercomputer is ideally suited for continued leadership-scale biomolecular science.


 

James C. Phillips (University of Illinois at Urbana-Champaign)

James C. Phillips

The lead developer of the highly scalable molecular dynamics program NAMD, for which he received a Gordon Bell Award in 2002, James has a Ph.D. in Physics. His research interests include improving the performance and accuracy of biomolecular simulations through parallelization, optimization, hardware acceleration, better algorithms, and new methods.

 

4:00PM - 4:20PM


The State of Accelerated Applications

 

OpenACC is an open standard for writing performance portable code for parallel architectures using compiler directives. With the release of the NVIDIA OpenACC Toolkit, OpenACC is seeing even wider adoption. This talk will highlight recent OpenACC successes on both GPU and multicore systems.


 

Michael Feldman (Intersect360)

 

4:30PM - 4:50PM


From "Piz Daint" to "Piz Kesch": The Making of a GPU-Based Weather Forecasting System

 

On October 1, 2015, "Piz Kesch", an appliance based on the Cray CS-Storm system architecture and loaded with NVIDIA K80 GPUs, became operational at CSCS on behalf of MeteoSwiss. We discuss the hardware-software co-design project behind this exceptionally cost- and energy-efficient system for numerical weather prediction.


 

Thomas Schulthess (ETH Zurich / CSCS)

Thomas Schulthess

Thomas Schulthess is Director of the Swiss National Supercomputing Centre (CSCS) and a professor for computational physics at ETH Zurich. He received his PhD in physics in 1994. Since 2010 he has taken interest in refactoring climate codes to take advantage of novel, energy efficient computing architectures.

 

5:00PM - 5:50PM

Accelerating Cognitive Workloads with Machine Learning

 

GPUs are being increasingly exploited as key compute engines to achieve dramatic improvements in cognitive and machine learning workload performance. This requires a broad understanding of these new workloads, system structures, and algorithms to determine what to accelerate/specialize, and how. In this talk, we will discuss this application driven approach to cognitive workload acceleration achieved through exploitation of GPUs with Power Systems.


 

Ruchir Puri (IBM Thomas J Watson Research Center)

Ruchir Puri

Ruchir Puri is an IBM Fellow at the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, where he leads research efforts in system design and acceleration. Most recently, he led the design methodology innovations for IBM's Power and zEnterprise microprocessors. Dr. Puri has received numerous accolades, including appointment to IBM's highest technical position, IBM Fellow, awarded for his transformational role in microprocessor design methodology; "Best of IBM" awards in both 2011 and 2012; an IBM Corporate Award from IBM's CEO; and several IBM Outstanding Technical Achievement awards. Dr. Puri is a Fellow of the IEEE, an ACM Distinguished Speaker, and an IEEE Distinguished Lecturer. He is also a member of the IBM Academy of Technology and was appointed an IBM Master Inventor in 2010. Dr. Puri received the Semiconductor Research Corporation (SRC) Outstanding Mentor Award, has been an adjunct professor in the Department of Electrical Engineering at Columbia University, NY, and was honored with the John von Neumann Chair at the Institute of Discrete Mathematics at Bonn University, Germany. He also received the 2014 Asian American Engineer of the Year Award. Dr. Puri has delivered numerous keynotes and invited talks at leading academic and industrial conferences and at National Science Foundation and US Department of Defense research panels, and has been an editor of IEEE Transactions on Circuits and Systems. He is an inventor of over 50 U.S. patents (both issued and pending) and has authored over 100 publications on SW/HW co-optimization and the design and synthesis of low-power and high-performance circuits.

Rajesh Bordawekar (IBM Thomas J Watson Research Center)

Rajesh Bordawekar

Rajesh Bordawekar is a Research Staff Member at the IBM T. J. Watson Research Center. His current interest is exploring software-hardware co-design of analytics workloads. He works at the intersection of the high-performance computing, analytics, and data management domains. He has been investigating how GPUs could be used to accelerate key analytics kernels in text analytics, data management, graph analytics, and deep learning. As part of this work, he collaborates closely with the IBM Power Systems team and various analytics and database product teams.

 

WEDNESDAY, NOVEMBER 18 | BOOTH #1021

 

10:00AM - 10:20AM


Performance Portability with OpenACC

 

OpenACC is an open standard for writing performance portable code for parallel architectures using compiler directives. With the release of the NVIDIA OpenACC Toolkit, OpenACC is seeing even wider adoption. This talk will highlight recent OpenACC successes on both GPU and multicore systems.


 

Jeff Larkin (NVIDIA)

Jeff Larkin

Jeff Larkin is a software engineer in NVIDIA's Developer Technology (DevTech) group where he works on porting and optimizing HPC applications. He is also closely involved with the development of both the OpenACC and OpenMP specifications. Prior to joining NVIDIA Jeff worked in Cray's Supercomputing Center of Excellence at ORNL.

 

10:30AM - 10:50AM


Legion: A Vision for Future HPC Programming Systems

 

We present Legion, a programming model and runtime system for the implementation of future exascale applications. We describe the important high-level design decisions made in Legion and their implications for application development. We'll report on our experiences using Legion to do a full-scale production run of S3D, a large combustion simulation, and why we believe systems like Legion will be crucial for the future of HPC.


 

Michael Bauer (NVIDIA)

Michael Bauer is a research scientist in NVIDIA Research. He holds a PhD in computer science from Stanford University.

Pat McCormick (LANL)

Pat McCormick is a senior scientist at Los Alamos National Laboratory, where he leads the programming models efforts within the Applied Computer Science group. He was involved in the early GPGPU programming efforts that started almost 15 years ago. Beyond programming systems, his research interests cover data analysis and visualization, domain-specific languages, and supporting compiler technologies.

 

11:00AM - 11:20AM


Exposing Mass Parallelism in C++

 

Modern high-performance programmers should be equipped with programming languages that provide mechanisms supporting massively parallel computations. In this talk, I will describe our recent research work that combines a small set of organizing concepts into a powerful programming model for exposing parallelism and managing parallel execution in the C++ language.


 

Michael Garland (NVIDIA Research)

Michael Garland is the Director of Programming Systems Research at NVIDIA, and holds a Ph.D. in Computer Science from Carnegie Mellon University. His current research interests include graph and mesh processing, parallel algorithm design, parallel programming models and languages, and parallel architecture.

 

11:30AM - 11:50AM


G3NA-V: A GPU-Enabled Human Interaction and Visualization Tool for Mining and Aligning Complex Gene Interaction Graphs

 

Life scientists peer deep into complex molecular systems, which are modelled as gene interaction biographs. Conserved subgraphs between species imply ancient paleogenetic interaction. The utility of NVIDIA GPU-optimized biograph alignment will be explored with our open-source G3NA graph aligner and the human interaction and visualization tool G3NA-V.


 

F. Alex Feltus (Clemson University)

F. Alex Feltus

Feltus received a B.Sc. in Biochemistry from Auburn University, served in the Peace Corps, and then completed advanced training in biomedical sciences at Vanderbilt and Emory. Since 2002, he has performed research in bioinformatics, high-performance computing, network biology, genome assembly, systems genetics, and paleogenomics.

Melissa Smith (Clemson University)

Melissa Smith

Dr. Melissa C. Smith joined the ECE department at Clemson in 2006; previously she was a Research Associate at ORNL. Her current research focuses on performance analysis and optimization of emerging heterogeneous computing architectures for application domains including scientific applications, high-performance or real-time embedded applications, and medical and image processing.

 

12:00PM - 12:20PM


VASP on GPUs: When and How

 

We introduce VASP on the GPU, which will be included in the next VASP release. We begin by establishing robustness through the results of a multi-site beta and reviewing performance tuning techniques and associated performance expectations. We conclude with a tentative road-map and welcome feedback.


 

Max Hutchinson (University of Chicago)

Max Hutchinson

Max Hutchinson is a doctoral student in physics at the University of Chicago and a former Department of Energy Computational Science Graduate Fellow. He splits time between electronic structure and fluid dynamics, unified by an effort to use emerging architectures to do exciting science.

 

12:30PM - 12:50PM


Embracing Heterogeneous Simulation of Complex Fluid Flows

 

Exascale systems will require software that takes advantage of massive parallelism without generating unmanageable data volumes. We address this challenge by instrumenting fluid flow simulations with in situ data analysis to non-destructively observe microscopic processes that occur within complex materials.


 

James McClure (Virginia Tech)

James McClure

James McClure is a Computational Scientist with Advanced Research Computing at Virginia Tech. His research interests include multi-scale transport phenomena and novel applications of heterogeneous and GPU-accelerated computing. His computational work is supported by a DOE INCITE award through the Oak Ridge Leadership Computing Facility. He obtained his Ph.D. from the University of North Carolina at Chapel Hill.

 

1:00PM - 1:20PM


Overcoming the Barriers of Graphs on GPUs: Delivering Graph Analytics 100X Faster and 40X Cheaper

 

Graphs are not like other big data challenges. Achieving graph analytics at scale requires the right combination of hardware and software. Blazegraph GPU delivers analytics 100X faster and 40X cheaper by exploiting the advantages of GPU hardware. High level APIs overcome adoption barriers for applications of all sizes.


 

Brad Bebee (Blazegraph by SYSTAP)

Brad Bebee

In graphs, size matters. Brad is the CEO of SYSTAP, leading efforts to deliver graphs at scale with Blazegraph products. An expert in graphs and large-scale analytics, he has a diverse background in software development, telecommunications, and information retrieval. He has served as CTO, CFO, and CEO in his career.

 

1:30PM - 1:50PM


Responsive Large Data Analysis and Visualization with the ParaView Ecosystem

 

The ParaView Ecosystem enables responsive analysis and visualization leveraging NVIDIA GPUs at extreme scale. The Ecosystem includes: ParaView for exploratory workflows; Catalyst for in situ workflows; next-generation VTK for faster rendering; VTK-m for faster algorithms; and NVIDIA's IndeX for accelerated volume visualization. We discuss details of the ParaView Ecosystem.


 

Patrick O'Leary (Kitware, Inc)

Patrick O'Leary

Dr. O'Leary has held faculty, research and leadership positions at a number of universities (Wyoming, Texas A&M, Minnesota, Alaska Anchorage and Northern Arizona University, University of Alberta), laboratories (Los Alamos National, NOAA FSL, Idaho National) and institutions (Wyoming Institute for Scientific Computation, Texas A&M Institute for Scientific Computation, Desert Research Institute and WestGrid/Compute Canada).

Currently, Dr. O'Leary is the Assistant Director of Scientific Computing for Kitware, Inc. and leads the office in Santa Fe, NM. His research interests include high performance computing (HPC), numerical analysis, finite elements and visualization.

 

2:00PM - 2:20PM


Simulating the Reionization of the Local Universe: Witnessing our Own Cosmic Dawn

 

Cosmic Dawn (CoDa), the first fully-coupled radiation-hydrodynamics simulation of cosmic reionization and galaxy formation in the local universe, will be described. Our new hybrid CPU-GPU code, RAMSES-CUDATON, achieved this milestone on the ORNL supercomputer Titan, with 8192 CPUs and 8192 GPUs, in 2 million node-hours, or 10 wall-clock days.


 

Paul Shapiro (The University of Texas at Austin)

Paul Shapiro

Paul R. Shapiro is the Frank N. Edmonds, Jr. Regents Professor in Astronomy at The University of Texas at Austin and Chair of the Division of Astrophysics of the American Physical Society. He received his A.B. (1974) and Ph.D. (1979) from Harvard, was a member of the Institute for Advanced Study (1978-1980), held an Alfred P. Sloan Fellowship in Physics, and is a Fellow of the American Physical Society.

 

2:30PM - 2:50PM


Energy-Efficient Architectures for Exascale Systems

 

Compared to today's high-performance computers, Exascale systems are expected to require 50x more energy efficiency and the ability to exploit 1000x concurrency. This talk will present concepts that are being explored within NVIDIA research that aim to help throughput processing systems achieve these efficiency and concurrency targets.


 

Stephen W. Keckler (NVIDIA Research)

Stephen W. Keckler

Dr. Stephen W. Keckler is a Senior Director of research at NVIDIA, where he leads the Architecture Research Group and conducts research in throughput processor architectures, memory systems, and resilience. He is also an Adjunct Professor of Computer Science at the University of Texas at Austin.

 

3:00PM - 3:20PM


GPU-Enabled Simulation and Visualization of Nanoelectronic Devices

 

To model nano-devices such as transistors, memristors, or solar cells, predict their performance before fabrication, and shed light on their working principles, advanced simulation and visualization approaches are needed. In this talk I will show how GPUs can be used to reduce their computational time and improve their quality.


 

Mathieu Luisier (ETH Zurich)

Mathieu Luisier

Mathieu Luisier is an assistant professor at ETH Zurich, Switzerland, where he graduated in electrical engineering in 2003 and received his Ph.D. in 2007. His research focuses on the computational modeling of next-generation transistors, memories, and photovoltaic devices. He received an honorable mention in the Gordon Bell Prize competition in 2011.

 

3:30PM - 3:50PM


VMD+OptiX: Bringing Interactive Molecular Ray Tracing from Remote GPU Clusters to your VR Headset

 

Commodity head mounted displays (HMDs) offer a tremendous opportunity to make immersive molecular visualization techniques broadly available. HMDs offer the promise of intuitive exploration of large molecular complexes and their dynamics, but their requirement for low-latency high-frame-rate display presents a formidable challenge for high quality remote ray tracing at distant HPC centers. This session will present a new interactive ray tracing system for remote visualization with HMDs, implemented within the popular molecular visualization tool VMD using a combination of interactive OptiX ray tracing, omnidirectional stereoscopic projection, H.264 video streaming, and high performance OpenGL rasterization.


 

John E. Stone (University of Illinois at Urbana-Champaign)

John E. Stone

John Stone is a Senior Research Programmer in the Theoretical and Computational Biophysics Group at the Beckman Institute for Advanced Science and Technology, and Associate Director of the NVIDIA CUDA Center of Excellence at the University of Illinois. Mr. Stone is the lead developer of VMD, a high performance molecular visualization tool used by researchers all over the world. His research interests include molecular visualization, GPU computing, parallel rendering, ray tracing, haptics, and virtual environments. Mr. Stone was awarded as an NVIDIA CUDA Fellow in 2010. In 2015 Mr. Stone joined the Khronos Group Advisory Panel for the Vulkan Graphics API. Mr. Stone also provides consulting services for projects involving computer graphics, GPU computing, and high performance computing in general.

 

4:00PM - 4:20PM


MVAPICH2-GDR: Pushing the Frontier of Designing MPI Libraries Enabling GPUDirect Technologies

 

The talk will focus on the latest developments in the MVAPICH2-GDR library, which helps MPI developers exploit maximum performance and scalability on clusters with NVIDIA GPUs. Multiple designs will be addressed, focusing on GPUDirect RDMA (GDR) Async, MPI-3 RMA using GDR, usage of fast GDRCOPY, non-blocking collectives using GDR and Core-Direct, and support for managed memory and datatype processing.


 

Dhabaleswar K. (DK) Panda (Ohio State University)

Dhabaleswar K. (DK) Panda

Dhabaleswar K. (DK) Panda is a Professor and University Distinguished Scholar of Computer Science and Engineering at the Ohio State University. The MVAPICH2 (High Performance MPI over InfiniBand, iWARP and RoCE) libraries, developed by his research group (mvapich.cse.ohio-state.edu), are currently being used by more than 2,450 organizations worldwide (in 76 countries).

 

4:30PM - 4:50PM


18,688 K20X's Running after a Tumor Cell

 

We discuss the features of uDeviceX, the petascale in-silico lab-on-a-chip. Microfluidic devices are essential for the detection of circulating tumor cells as prognostic markers of metastatic cancer. The unprecedented compute power offered by CUDA-enabled supercomputers allows us to contribute to both fundamental understanding and technological advancement in this field.


 

Diego Rossinelli (ETH Zurich)

Diego Rossinelli

I am a research associate of CSE Lab at ETH Zurich. I focus on rapid software development for petascale supercomputers. My domain interests are DNS of compressible and incompressible flows, wavelets for both simulations and data compression, as well as distributed scientific visualization.

 

5:00PM - 5:20PM


Advancing Weather Prediction at NOAA

 

Modern HPC systems provide a wide array of options for integrating NVIDIA Tesla GPUs. Dale Southard, Principal System Architect in the Office of the CTO for Tesla Computing, will provide an overview to help decision makers and system administrators make the best hardware and software selections and configurations.


 

Tom Henderson (NOAA)

Tom Henderson

Tom Henderson is a Software Engineer at the Global Systems Division of NOAA's Earth System Research Laboratory (ESRL) in Boulder, Colorado. Mr. Henderson has contributed during the past 20 years to software development for scientific and engineering applications on cutting-edge high-performance computing architectures. These developments include numerical weather prediction (NWP) and climate models at both NOAA and the National Center for Atmospheric Research. Mr. Henderson's current research interests include the application of accelerator HPC technologies (GPU, MIC) to atmospheric modeling.

 

5:30PM - 5:50PM


Visualization Capabilities on Azure's New N-Series

 

Microsoft and NVIDIA have partnered to bring both GPU compute and high-end visualization capabilities to Microsoft Azure. This talk presents an overview of the different visualization options and capabilities available on the new N-Series VMs.


 

Chris Huybregts (Microsoft)

Chris Huybregts

Chris Huybregts is a Senior Program Manager at Microsoft on the Virtual GPU team and has been driving the strategy for Microsoft's Virtual GPU technology for the last 2 years. By working with teams like HyperV and Azure, he's helping ensure the capabilities of GPUs in different virtualized environments meet the expectations of our technology partners.

 

THURSDAY, NOVEMBER 19 | BOOTH #1021

 

10:00AM - 10:20AM

VIEW NOW

VIEW PDF

MAGMA: Development of High-Performance Linear Algebra for GPUs

 

The MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current "Multicore+GPU" systems. The MAGMA research is based on the idea that, to address the complex challenges of emerging hybrid environments, optimal software solutions will themselves have to hybridize, combining the strengths of different algorithms within a single framework. Building on this idea, we aim to design linear algebra algorithms and frameworks for hybrid manycore and GPU systems that enable applications to fully exploit the power each hybrid component offers.
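The hybridization idea (small, latency-bound panel factorizations suited to the CPU; large, throughput-bound trailing updates suited to the GPU) can be illustrated with a blocked Cholesky factorization. This is a pure-Python sketch of the general pattern, not MAGMA's implementation; the comments mark which step each component would map to in a hybrid scheme.

```python
import math

def potrf(A):
    # Unblocked Cholesky of a small diagonal block (the latency-bound
    # panel step a hybrid scheme would run on the CPU).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def trsm(Lkk, B):
    # Solve X * Lkk^T = B row by row (the off-diagonal panel).
    b = len(Lkk)
    X = [row[:] for row in B]
    for i in range(len(B)):
        for j in range(b):
            s = X[i][j] - sum(X[i][k] * Lkk[j][k] for k in range(j))
            X[i][j] = s / Lkk[j][j]
    return X

def syrk(C, P):
    # C -= P * P^T: the large trailing-matrix update (the compute-heavy
    # step a hybrid scheme would offload to the GPU).
    for i in range(len(C)):
        for j in range(len(C)):
            C[i][j] -= sum(P[i][k] * P[j][k] for k in range(len(P[0])))

def blocked_cholesky(A, b=2):
    # Right-looking blocked Cholesky: factor panel, update trailing matrix.
    n = len(A)
    A = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    for k in range(0, n, b):
        e = min(k + b, n)
        Lkk = potrf([row[k:e] for row in A[k:e]])
        for i in range(k, e):
            L[i][k:e] = Lkk[i - k]
        if e < n:
            P = trsm(Lkk, [row[k:e] for row in A[e:]])
            for i in range(e, n):
                L[i][k:e] = P[i - e]
            C = [row[e:] for row in A[e:]]
            syrk(C, P)
            for i in range(e, n):
                A[i][e:] = C[i - e]
    return L
```

Because the panel factorization is on the critical path while the trailing update dominates the flop count, splitting them across CPU and GPU lets each component do what it is best at.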


 

Jack Dongarra (University of Tennessee)

Jack Dongarra holds appointments at the University of Tennessee, Oak Ridge National Laboratory, and the University of Manchester. He specializes in numerical algorithms for linear algebra, parallel computing, the use of advanced computer architectures, programming methodology, and tools for parallel computers.

 

10:30AM - 10:50AM

VIEW NOW

VIEW PDF

Bridges: A GPU-Enabled HPC System for New Communities

 

Bridges is a new, uniquely capable HPC system designed to empower new research communities, expand access, and help researchers facing challenges in Big Data to work more intuitively. Bridges will feature K80 and next-generation NVIDIA GPUs to enable deep learning, image and video analysis, materials science, and other important research.


 

Nick Nystrom (Pittsburgh Supercomputing Center)

Dr. Nick Nystrom leads the development of hardware and software architecture at the Pittsburgh Supercomputing Center to enable new, groundbreaking research throughout academia and industry. Some of his research interests include algorithms for causal discovery and data analytics, genomics, and languages, tools, and software environments for advancing HPC.

 

11:00AM - 11:20AM

VIEW NOW

High-Performance Graph Analytics on GPUs

 

Future high-performance computing systems must enable fast processing of large data sets, as highlighted by President Obama's Executive Order on the National Strategic Computing Initiative. Of significant interest is the need to analyze big graphs arising from areas ranging from social networks and biology to national security. This talk will present our ongoing efforts at GWU to accelerate big graph analytics on GPUs. We have developed a new GPU-based BFS system that delivers exceptional performance through efficient scheduling of a large number of GPU threads and effective utilization of the GPU memory hierarchy. This system ranks highly on both Graph500 and GreenGraph500.
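The pattern such systems parallelize is level-synchronous BFS, in which the whole frontier expands in one step. Below is a minimal sequential sketch of that pattern (function and variable names are illustrative, not taken from the GWU system); the comments note where GPU parallelism would apply.

```python
from collections import defaultdict

def gpu_style_bfs(num_vertices, edges, source):
    # Level-synchronous BFS on an undirected graph. On a GPU, each
    # frontier vertex (or edge) is assigned to its own thread, and the
    # visited check becomes an atomic compare-and-swap.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    depth = [-1] * num_vertices
    depth[source] = 0
    frontier, level = [source], 0
    while frontier:
        level += 1
        next_frontier = []
        for u in frontier:            # parallel across GPU threads
            for v in adj[u]:          # neighbor gathering
                if depth[v] == -1:    # atomic visit check on the GPU
                    depth[v] = level
                    next_frontier.append(v)
        frontier = next_frontier
    return depth
```

Scheduling threads to frontier work items (rather than to fixed vertices) and staging frontiers through fast memory are the kinds of optimizations the talk's system applies to this loop.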


 

Howie Huang (George Washington University)

Howie Huang

Howie Huang is an Associate Professor in Computer Engineering at George Washington University. His research interests are computer systems and architecture, big data, high-performance computing, and storage. He received the National Science Foundation CAREER Award in 2014, the Comcast Technology R&D Fund Award in 2015, the GWU SEAS Outstanding Young Researcher Award in 2014, the NVIDIA Academic Partnership Award in 2011, and the IBM Real Time Innovation Faculty Award in 2008. His research won the ACM Undergraduate Student Research Competition at SC'12, was a Best Student Paper Award Finalist at SC'11, won the Best Poster Award at PACT'11, and was a High-Performance Storage Challenge Finalist at SC'09. He received a PhD in Computer Science from the University of Virginia.

 

11:30AM - 11:50AM

VIEW NOW

VIEW PDF

Fast Execution of Simultaneous Breadth-First Searches on Sparse Graphs

 

The construction of efficient parallel graph algorithms is important for solving problems in social network analysis and hardware verification. Existing GPU algorithms are monolithic and contributions from the literature are typically rebuilt rather than reused. This talk presents multi-search, an efficient abstraction for GPU graph analysis that promotes code reuse.
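The multi-search idea can be sketched in plain Python: several breadth-first searches advance in lockstep, so each superstep presents one batched workload of (search, vertex) pairs. This is only an illustration of the abstraction, not the talk's GPU implementation; the names are hypothetical.

```python
def multi_search(adj, sources):
    # Advance every search one level per superstep. On a GPU, all
    # frontier expansions across searches form a single batched kernel
    # launch, which is the reuse-friendly abstraction multi-search offers.
    n = len(adj)
    depth = {s: [-1] * n for s in sources}
    frontiers = {s: [s] for s in sources}
    for s in sources:
        depth[s][s] = 0
    level = 0
    while any(frontiers.values()):
        level += 1
        for s in sources:             # batched together on the GPU
            nxt = []
            for u in frontiers[s]:
                for v in adj[u]:
                    if depth[s][v] == -1:
                        depth[s][v] = level
                        nxt.append(v)
            frontiers[s] = nxt
    return depth
```

Algorithms such as betweenness centrality or all-pairs reachability, which need many BFS traversals, can then be built on this one primitive instead of re-implementing the traversal machinery each time.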


 

Adam McLaughlin (Georgia Institute of Technology)

Adam McLaughlin

Adam McLaughlin is a Ph.D. candidate in Electrical and Computer Engineering at Georgia Tech, working with Professor David A. Bader. His research focuses on utilizing GPUs for fast parallel execution of algorithms that traverse unstructured network data sets. He will join D. E. Shaw Research in early 2016.

 

12:00PM - 12:20PM

VIEW NOW

VIEW PDF

AmgX 2.0 Scaling towards CORAL

 

We announce the public availability of AmgX 2.0. New features include support for the POWER8 platform, GPUDirect with CUDA-aware MPI, improved host memory management, and better scalability.


 

Joe Eaton (NVIDIA Research)

Joe Eaton leads the sparse linear algebra and graph analytics libraries at NVIDIA. His goal is to bring GPU performance to engineering users without requiring them to learn CUDA programming.

 

12:30PM - 12:50PM

VIEW NOW

VIEW PDF

Merge-based Parallel Sparse Matrix-Vector Multiplication (SpMV)

 

We present a perfectly balanced, "merge-based" parallel algorithm for computing sparse matrix-vector products (SpMV). Our method operates directly upon the CSR sparse matrix format, without preprocessing or specialized/ancillary data formats. The parallel decomposition is two-dimensional, establishing a logical merge of the entire CSR matrix dataset, the contents of which are evenly assigned among parallel threads.
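The decomposition can be modeled sequentially: treat the CSR row-end offsets and the nonzero indices as two sorted lists, split their logical merge into equal-length diagonals with a binary search, and give each "thread" one segment. This is a sketch of the idea under those assumptions, not NVIDIA's implementation.

```python
def merge_path_partition(row_end, nnz, num_threads):
    # Split the logical merge of row boundaries (row_end) and nonzero
    # indices (0..nnz-1) into equal pieces, one per thread.
    m = len(row_end)
    total = m + nnz
    chunk = (total + num_threads - 1) // num_threads
    splits = []
    for t in range(num_threads + 1):
        diag = min(t * chunk, total)
        lo, hi = max(0, diag - nnz), min(diag, m)
        while lo < hi:  # binary search along the merge diagonal
            mid = (lo + hi) // 2
            if row_end[mid] <= diag - 1 - mid:
                lo = mid + 1
            else:
                hi = mid
        splits.append((lo, diag - lo))  # (rows consumed, nnz consumed)
    return splits

def merge_spmv(row_end, col_idx, values, x, num_threads=4):
    # y = A @ x for a CSR matrix: every thread handles the same number
    # of merge items regardless of how nonzeros cluster into rows.
    m, nnz = len(row_end), len(values)
    y = [0.0] * m
    splits = merge_path_partition(row_end, nnz, num_threads)
    for t in range(num_threads):
        (i, j), (i_end, j_end) = splits[t], splits[t + 1]
        acc = 0.0
        while i < i_end or j < j_end:
            if i < i_end and row_end[i] <= j:
                y[i] += acc          # row i complete: commit its sum
                acc, i = 0.0, i + 1
            else:
                acc += values[j] * x[col_idx[j]]
                j += 1
        if i < m:
            y[i] += acc              # carry a partial row to the next thread
    return y
```

Because the split counts rows and nonzeros together, a single pathologically long row cannot overload one thread: its nonzeros are shared across segments and the partial sums are combined at the boundaries.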


 

Duane Merrill (NVIDIA Research)

Duane Merrill

Duane Merrill is a Senior Research Scientist at NVIDIA Research. His principal research interests are programming model and algorithm design for parallel computing. His work focuses on problems involving sparse, irregular, and cooperative computation. He is the author of CUB, a library of "collective" software primitives to simplify CUDA kernel construction, performance tuning, and maintenance. He received his B.S., M.C.S., and Ph.D. from the University of Virginia.

 

1:00PM – 1:20PM

VIEW NOW

VIEW PDF

CNTK: Open-Source, Distributed Deep Learning System from Microsoft

 

GPUs have played a critical role in the deep learning revolution of recent years. This talk presents CNTK, a deep learning system that combines powerful hardware (GPUs in Azure) with open-source software to provide a scalable, robust, and easy-to-use system for distributed deep learning.


 

Alexey Kamenev (Microsoft Research)

Alexey Kamenev is an engineer in the Microsoft Research Advanced Technology Group working on an open-source, distributed deep learning system that is used throughout the company for various workloads. Prior to MSR, he worked on neural network implementations in Azure Machine Learning.

 

1:30PM – 1:50PM

VIEW NOW

VIEW PDF

Gunrock: A Fast and Programmable Multi-GPU Graph Processing Library

 

Gunrock, our multi-GPU graph processing library, enables easy graph algorithm implementation and extension onto multiple GPUs for scalable performance on large graphs with billions of edges. We developed a high-level data-centric abstraction focusing on vertex or edge frontier operations, and a multi-GPU framework for both programmability and performance.
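The data-centric style can be sketched sequentially: algorithms are expressed as operations that transform one frontier into the next. "Advance" and "filter" are the names of two of Gunrock's frontier operators, but the code below is only an illustration of the style under that assumption, not Gunrock's API.

```python
def advance(adj, frontier):
    # Advance operator: map each frontier vertex to its neighbor list,
    # producing an unfiltered candidate frontier.
    out = []
    for u in frontier:
        out.extend(adj[u])
    return out

def filter_frontier(candidates, depth, level):
    # Filter operator: keep first-time visits only; marking depth as we
    # go also deduplicates the candidate list.
    out = []
    for v in candidates:
        if depth[v] == -1:
            depth[v] = level
            out.append(v)
    return out

def bfs_via_operators(adj, source):
    # BFS expressed purely as alternating advance/filter steps: the
    # algorithm is just a composition of frontier operations.
    depth = [-1] * len(adj)
    depth[source] = 0
    frontier, level = [source], 0
    while frontier:
        level += 1
        frontier = filter_frontier(advance(adj, frontier), depth, level)
    return depth
```

Because only the operators touch the graph, the framework behind them can repartition frontiers across multiple GPUs without the algorithm author rewriting the traversal logic.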


 

Yuechao Pan (University of California, Davis)

Yuechao Pan

Yuechao Pan is a PhD student at UC Davis, from Prof. John D. Owens' group, focusing on multi-GPU graph processing. He designed and implemented the multi-GPU framework of Gunrock, which brought both the performance and the flexibility of the library to a new level.

 

2:00PM – 2:20PM

VIEW NOW

VIEW PDF

NCCL: Accelerated Collective Communications for GPUs

 

We present NCCL, a library of multi-GPU communication collectives (e.g., broadcast, all-reduce, all-gather). NCCL enables applications to harness the computational throughput of multiple GPUs with minimal developer effort by providing optimized, topology-aware, asynchronous collectives with a familiar API.
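A ring all-reduce is a common algorithm for this kind of collective: P participants each split their buffer into P chunks, run P-1 reduce-scatter steps, then P-1 all-gather steps, with every link carrying one chunk per step. The simulation below sketches that algorithm with scalar chunks; it illustrates the communication pattern only and is not NCCL's implementation.

```python
def ring_allreduce(chunks):
    # Simulated ring all-reduce over P "GPUs", each holding P scalar
    # chunks of a vector. A real library moves chunk-sized buffers over
    # NVLink/PCIe; here each chunk is a single number.
    P = len(chunks)
    data = [list(v) for v in chunks]
    # Phase 1: reduce-scatter. In step s, GPU p sends chunk (p - s) % P
    # to its ring neighbor, which accumulates it. After P-1 steps,
    # GPU p owns the fully reduced chunk (p + 1) % P.
    for s in range(P - 1):
        sent = [(p, (p - s) % P, data[p][(p - s) % P]) for p in range(P)]
        for p, c, val in sent:
            data[(p + 1) % P][c] += val
    # Phase 2: all-gather. In step s, GPU p forwards the completed
    # chunk (p + 1 - s) % P around the ring, overwriting stale copies.
    for s in range(P - 1):
        sent = [(p, (p + 1 - s) % P, data[p][(p + 1 - s) % P]) for p in range(P)]
        for p, c, val in sent:
            data[(p + 1) % P][c] = val
    return data
```

Each GPU sends and receives 2(P-1)/P of the buffer in total, independent of P, which is why ring-style collectives scale well as more GPUs join.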


 

Cliff Woolley (NVIDIA)

Cliff Woolley

Cliff Woolley is a manager of developer technology engineering at NVIDIA. He received his MCS from the University of Virginia, where he was among the earliest academic researchers to explore GPGPU programming. He and his team work with HPC and Enterprise developers to maximize their application performance using NVIDIA GPUs.

 
 
