SCHEDULE

Come hear talks featuring cutting-edge technologies, groundbreaking research, and more from a broad range of disciplines such as weather forecasting, energy exploration, and molecular dynamics.

MONDAY 11/12

TUESDAY 11/13

  • 10:00 AM - 10:25 AM
    Booth #2417, Hall D
    Applications/Containers

      As deep learning becomes more prevalent, one challenging aspect is the time it takes to set up and maintain systems. Finding the best combination of framework versions, drivers, runtimes, operating system, and patches, and then testing that all of these pieces work well together, takes valuable time away from your deep learning goals. In this session, we’ll talk about a better way to get your projects up and running with GPU-accelerated deep learning containers from the NVIDIA GPU Cloud (NGC) container registry. We’ll discuss the variety of software containers available from NGC and how to use them in your on-prem, cloud, or hybrid cloud deployments.
  • 10:30 AM - 10:55 AM
    Booth #2417, Hall D
    Applications/Containers

      NAMD and VMD provide state-of-the-art molecular simulation, analysis, and visualization tools that leverage a panoply of GPU acceleration technologies to achieve performance levels that enable scientists to routinely apply research methods that were formerly too computationally demanding to be practical. To make state-of-the-art MD simulation and computational microscopy workflows available to a broader range of molecular scientists including non-traditional users of HPC systems, our center has begun producing pre-configured container images and Amazon EC2 AMIs that streamline deployment, particularly for specialized occasional-use workflows, e.g., for refinement of atomic structures obtained through cryo-electron microscopy. This talk will describe the latest technological advances in NAMD and VMD, using CUDA, OpenACC, and OptiX, including early results on ORNL Summit, state-of-the-art RTX hardware ray tracing on Turing GPUs, and easy deployment using containers and cloud computing infrastructure.
  • 11:00 AM - 11:25 AM
    Booth #2417, Hall D
    Programming Languages

      Come learn why the authors of VASP, Fluent, Gaussian, Synopsys and numerous other science and engineering applications are using OpenACC. OpenACC supports and promotes scalable parallel programming on both multicore CPUs and GPU-accelerated systems, enabling large production applications to port effectively to the newest generation of supercomputers. It has very well-supported interoperability with CUDA C++, CUDA Fortran, MPI and OpenMP, allowing you to optimize each aspect of your application with the appropriate tools. OpenACC has proven to be the ideal on-ramp to parallel and GPU computing, even for those who need to tune their most important kernels using libraries or CUDA. Come see how you can try OpenACC with the free PGI Community Edition compiler suite.
  • 11:30 AM - 11:55 AM
    NVIDIA GPUs on Google Cloud for HPC & ML
    Chris Kleban, Product Manager, Google Compute Engine, GPUs, Google [ View Recording ]
    Booth #2417, Hall D
    Cloud Computing

      Come learn about Google Cloud solutions with NVIDIA GPUs. We will show why Google Cloud is the best choice for running your NVIDIA GPU instances. You will learn how Google's fundamental principles around infrastructure, data intelligence, and openness help provide the best services for your HPC and ML deployments. In addition, we'll announce exciting details on our new NVIDIA GPU offerings. This is a talk that technical leaders, developers, data scientists, and anyone with an interest in cloud and GPU computing will not want to miss!
  • 12:00 PM - 12:25 PM
    Latest Results from the Summit Supercomputer
    Jack Wells, Director of Science, Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory (ORNL) [ View Recording ]
    Booth #2417, Hall D
    Tensor Cores for Science

      This presentation will communicate selected early results from application readiness activities at the Oak Ridge Leadership Computing Facility (OLCF) in preparation for Summit, the Department of Energy Office of Science's new supercomputer operated by Oak Ridge National Laboratory. With over 9,000 POWER9 CPUs and 27,000 V100 GPUs, high-bandwidth data movement, and large node-local memory, Summit’s architecture is proving to be effective in advancing performance across diverse applications in traditional modeling and simulation, high-performance data analytics, and artificial intelligence. These advancements in application performance are being achieved with only small increases in Summit's electricity consumption compared with previous supercomputers operated at OLCF.
  • 12:30 PM - 12:55 PM
    Booth #2417, Hall D
    HPC + AI

      Modern-day enablement of AI has been achieved through the acceleration of deep learning by GPUs. We are now entering the realm of ever-more complex deep learning tasks involving complicated algorithms, deeper and more sophisticated network layers, and rapidly growing data sets, for which a handful of GPUs is proving insufficient. By designing and building large-scale HPC machines with extensive vector/tensor processing capabilities based on GPUs, such as Tsubame3, ABCI, and Post-K, as well as designing new scalable learning algorithms, we are overcoming such challenges. In particular, the ABCI grand challenge has enabled three research groups, including ours at Tokyo Tech, to scale ImageNet training to over 4,000 GPUs, with training times measured in minutes. This paves the way for a new era in which AI is as scalable as traditional HPC has been.
  • 1:00 PM - 1:25 PM
    Optimizing Strong-Scaling HPC Apps with DGX-2 & NVSwitch
    Marc Hamilton, Vice President of Solutions Architecture and Engineering, NVIDIA [ View Recording ]
    Booth #2417, Hall D
    Accelerated Computing

      NVSwitch on the DGX-2 is a crossbar switch that greatly increases application performance in several ways. First, it increases the problem-size capacity, traditionally limited by a single GPU’s memory, to the aggregate DGX-2 GPU memory of 512 GB. Second, the NUMA effects of traditional multi-GPU servers are greatly reduced, so memory bandwidth grows with the number of GPUs. Finally, ease of use improves, as apps written for a smaller number of GPUs can now be ported more easily thanks to the large memory space.
  • 1:30 PM - 1:55 PM
    PRIONN: Predicting Runtime and IO using Neural Networks and GPUs
    Michela Taufer, Jack Dongarra Professor in High Performance Computing and SC19 Conference Chair, University of Tennessee, Knoxville [ View Recording ]
    Booth #2417, Hall D
    HPC + AI

      For job-allocation decisions, current batch schedulers have access to and use only information on the number of nodes and the runtime, because this is readily available at submission time from user job scripts. User-provided runtimes are typically inaccurate because users overestimate or lack understanding of job resource requirements. Beyond the number of nodes and runtime, other system resources, including IO and network, are not available to the scheduler but play a key role in system performance. In this talk we tackle the need for automatic, general, and scalable tools that provide accurate resource usage information to schedulers with our tool for Predicting Runtime and IO using Neural Networks and GPUs (PRIONN). PRIONN automates prediction of per-job runtime and IO resource usage, enabling IO-aware scheduling on HPC systems. The novelty of our tool is that whole job scripts are used as input to deep learning models, which allows complete automation of runtime and IO resource predictions.
  • 2:00 PM - 2:25 PM
    Booth #2417, Hall D
    Science

      Exascale-class simulations will be achieved through a combination of high concurrency and energy efficiency. Although accelerator architectures like GPUs are so equipped, the task of adapting a feature-rich legacy application to modern HPC hardware can be daunting. We present the implementation of such GPU capability in the NASA Langley FUN3D computational fluid dynamics solver. With this effort, a thousand of today's 6-GPU nodes can do the work of over a million CPU cores for a fraction of the energy cost.
  • 2:30 PM - 2:55 PM
    MVAPICH2-GDR Library: Pushing the Frontier of HPC and Deep Learning
    Dhabaleswar K. (DK) Panda, Professor and University Distinguished Scholar, The Ohio State University [ View Recording ]
    Booth #2417, Hall D
    HPC + AI

      The talk will focus on the latest developments in the MVAPICH2-GDR MPI library, which helps HPC and deep learning applications exploit maximum performance and scalability on GPU clusters. Multiple designs focusing on GPUDirect RDMA (GDR), managed and unified memory support, datatype processing, and support for OpenPOWER and NVLink will be highlighted for HPC applications. We will also present novel designs and enhancements to the MPI library to boost the performance and scalability of deep learning frameworks on GPU clusters. Container-based solutions for GPU-based cloud environments will also be highlighted.
  • 3:00 PM - 3:25 PM
    CUDA 10 - New Features
    Stephen Jones, Principal Software Engineer, NVIDIA [ View Recording ]
    Booth #2417, Hall D
    Programming Languages

      CUDA 10.0, the latest major revision of the CUDA platform, was released in September and introduces support for the new Turing GPU architecture along with a host of new features. This talk presents the details, including a new graphs programming model, more flexible system support mechanisms, and, of course, the new capabilities offered by the Turing GPU.
  • 3:30 PM - 3:55 PM
    The Convergence of HPC and AI in a post-Moore’s Law World
    Steve Oberlin, Chief Technology Officer, Accelerated Computing, NVIDIA [ View Recording ]
    Booth #2417, Hall D
    HPC + AI

      AI methods and tools are starting to be applied to HPC applications by a growing number of brave researchers in diverse scientific fields. This talk will describe an emergent workflow that uses traditional HPC numeric simulations to generate the labeled data sets required to train machine learning algorithms, then employs the resulting AI models to predict the computed results, often with dramatic gains in efficiency, performance, and even accuracy. Some compelling success stories will be shared, and the implications of this new HPC + AI workflow on HPC applications and system architecture in a post-Moore's Law world considered.
  • 4:00 PM - 4:25 PM
    Booth #2417, Hall D
    Programming Languages

      Architectures are becoming increasingly heterogeneous, offering developers a rich variety of computing resources. While these architectures benefit from customized optimization strategies, scientific developers tend to prefer 'write-once' code that is portable yet performance-efficient and migratable to rapidly changing hardware. This talk will present stories of porting scientific applications to state-of-the-art heterogeneous computing systems using OpenACC. Applications will span the molecular dynamics, nuclear physics, neutrino experiment, and climate domains.
  • 4:30 PM - 4:55 PM
    Booth #2417, Hall D
    HPC + AI

      The Perlmutter machine will be delivered to NERSC/LBNL in 2020 and will contain a mixture of CPU-only and NVIDIA Tesla GPU-accelerated nodes. In this talk we will describe the analysis we performed in order to optimize this design to meet the needs of the broad NERSC workload. We will also discuss our application readiness program, the NERSC Exascale Science Applications Program (NESAP), in which we will work with our users to optimize their applications to maximize their performance on Perlmutter's GPUs.
  • 5:00 PM - 5:25 PM
    OpenMP for GPUs on NERSC-9
    Douglas Miles, Senior Director, PGI Compilers & Tools, NVIDIA
    Christopher Daley, Performance Engineer, NERSC
    Booth #2417, Hall D
    Accelerated Computing

      OpenMP has a 20-year history in HPC and has been used by NERSC developers for node-level parallelization on several generations of NERSC flagship systems. Recent versions of the OpenMP specification include features that enable accelerator programming in general, and GPU programming in particular. Given the extensive use of OpenMP on previous NERSC systems and the GPU-based node architecture of NERSC-9, we expect OpenMP to be important in helping users migrate applications to NERSC-9. In this talk we’ll give an overview of the current usage of OpenMP at NERSC, describe some of the new features we think will be important to NERSC-9 users, and give a high-level overview of a collaboration between NERSC and NVIDIA to enable OpenMP for GPUs in the PGI Fortran, C, and C++ compilers.
  • 5:30 PM - 5:55 PM
    Combining Machine Learning and Numerical Modeling to Transform Atmospheric Science
    Dr. Richard D. Loft, Director, Technology Development Division, Computational and Information Systems Laboratory, National Center for Atmospheric Research [ View Recording ]
    Booth #2417, Hall D
    HPC + AI

      Rapid progress in atmospheric science has been fueled in part over the years by faster computers. However, progress has slowed over the last decade due to three factors: the plateauing of core speeds, the increasing complexity of atmospheric models, and the mushrooming of data volumes. Our team at the National Center for Atmospheric Research is pursuing a hybrid approach to surmounting these barriers that combines machine learning techniques and GPU-acceleration to produce, we hope, a new generation of ultra-fast models of enhanced fidelity with nature and increased value to society.

WEDNESDAY 11/14

  • 10:00 AM - 10:25 AM
    NVIDIA Deep Learning Institute: University Ambassador Program
    Joe Bungo, Deep Learning Institute Program Manager, NVIDIA [ View Recording ]
    Yu Wang, AI Scientist, Leibniz Supercomputing Centre [ View Recording ]
    Booth #2417, Hall D
    Accelerated Computing

      The NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI and accelerated computing to solve real-world problems across autonomous vehicles, digital content creation, healthcare, finance, and more. Designed for developers, data scientists, researchers, and students with a technical background, DLI training can be accessed in an instructor-led workshop or online in a self-paced course, complete with certification of competency. The DLI University Ambassador Program enables qualified educators to teach DLI workshops at university campuses and academic conferences to faculty, students, and researchers at no cost, complementing the traditional theoretical approaches to university education in machine learning, data science, AI, and parallel computing.
  • 10:30 AM - 10:55 AM
    Developer Tools in CUDA 10: New Features and Capabilities
    Rafael Campana, Software Engineering Director, Compute Developer Tools, NVIDIA [ View Recording ]
    Booth #2417, Hall D
    Programming Languages

      Come and learn about the developer tools available in CUDA 10.0 and the exciting changes in this release. We will cover IDE integrations, debuggers, the memory checker, and profilers, as well as the introduction of new products to the “Nsight” family: Nsight Systems and Nsight Compute, enabling you, the developer, to harness even more power out of NVIDIA GPUs.
  • 11:00 AM - 11:25 AM
    Democratizing HPC with Containers
    CJ Newburn, Principal Architect for HPC, NVIDIA Compute Software, NVIDIA [ View Recording ]
    Booth #2417, Hall D
    Applications/Containers

      NVIDIA offers several containerized applications in HPC, visualization, and deep learning. We have also enabled a broad array of container-related technologies for GPUs, with upstreamed improvements to community projects and with tools that are seeing broad interest and adoption. Furthermore, NVIDIA is acting as a catalyst for the broader community in enumerating key technical challenges for developers, admins, and end users, and is helping to identify gaps and drive them to closure. This talk describes NVIDIA's new developments and upcoming efforts. It outlines progress in the most important technical areas, including multi-node containers, security, and scheduling frameworks. It highlights the breadth and depth of interactions across the HPC community that are making the latest high-quality HPC applications available on platforms that include GPUs.
  • 11:30 AM - 11:55 AM
    Leveraging Big Data to do Deep Learning on Small Data
    Nathan Hodas, Senior Data Scientist, Pacific Northwest National Laboratory [ View Recording ]
    Booth #2417, Hall D
    HPC + AI

      The recent success of deep learning has been driven by the ability to combine significant GPU resources with extremely large labeled datasets. However, many labels are extremely expensive to obtain, and for some, such as a specific astronomical event or scientific experiment, it may be impossible to obtain more than one example. By combining vast amounts of labeled surrogate data with advanced few-shot learning, we have demonstrated success in leveraging small data in deep learning. In this talk, we will discuss these exciting results and explore the scientific innovations that made this possible.
  • 12:00 PM - 12:25 PM
    Advancing Scientific Frontiers Using Deep Learning
    Courtney Corley, Chief Data Scientist / Technical Group Manager, Pacific Northwest National Laboratory [ View Recording ]
    Booth #2417, Hall D
    HPC + AI

      Pacific Northwest National Laboratory’s scientific mission spans from energy and molecular science to national security. Under the Deep Learning for Scientific Discovery Initiative, PNNL has invested in integrating advanced machine learning with traditional scientific methods to push the state of the art in many disciplines. We will provide an overview of some of the thirty projects we have stewarded, demonstrating how we have leveraged computing and analytics in fields as diverse as ultrasensitive detection, metabolomics, and atmospheric science.
  • 12:30 PM - 12:55 PM
    RAPIDS, GPU accelerated Data Science
    Rollin Thomas, Data Architect and Python Data Analytics Lead, Lawrence Berkeley National Laboratory [ View Recording ]
    Joshua Patterson, Director of AI Infrastructure, NVIDIA [ View Recording ]
    Booth #2417, Hall D
    Accelerated Computing

      The next big step in data science combines the ease of use of common Python APIs with the power and scalability of GPU compute. The RAPIDS project is the first step in giving data scientists the ability to use familiar APIs and abstractions for data science while taking advantage of the GPU-accelerated hardware commonly found in HPC centers. This session discusses RAPIDS, how to get started, and our roadmap for accelerating more of the data science ecosystem.
  • 1:00 PM - 1:25 PM
    Exascale Deep Learning for Climate Analytics
    Michael Houston, Senior Distinguished Engineer, NVIDIA [ View Recording ]
    Prabhat, Data and Analytics Group Lead, NERSC [ View Recording ]
    Thorsten Kurth, Application Performance Specialist, NERSC, Lawrence Berkeley National Laboratory [ View Recording ]
    Booth #2417, Hall D
    Tensor Cores for Science

      We'll discuss how we scaled the training of a single deep learning model to 27,360 V100 Tensor Core GPUs (4,560 nodes) on the OLCF Summit HPC System using the high-productivity TensorFlow framework. We discuss how the neural network was tweaked to achieve good performance on NVIDIA Volta GPUs with Tensor Cores and what further optimizations were necessary to provide excellent scalability, including data input pipeline and communication optimizations, as well as gradient boosting for SGD-type solvers.
  • 1:30 PM - 1:55 PM
    Booth #2417, Hall D
    Tensor Cores for Science

      The use of low-precision arithmetic has been a powerful tool to accelerate numerous scientific computing applications, including artificial intelligence. We present an investigation showing that other HPC applications can harness this power too, in particular the general HPC problem of solving Ax = b, where A is a large dense matrix and the solution is needed in FP64 accuracy. Our approach is based on the mixed-precision (FP16->FP64) iterative refinement technique: we generalize and extend prior advances into a framework, for which we develop architecture-specific algorithms and highly tuned implementations. We show that the use of FP16-TC (Tensor Core) arithmetic can provide up to a 4X speedup and reduce energy consumption by a factor of 5, achieving 74 Gflops/Watt. This is due to the performance boost that the Tensor Cores provide, together with accuracy that outperforms classical FP16.
  • 2:00 PM - 2:25 PM
    Booth #2417, Hall D
    Accelerated Computing

      Exciting advances in technology have propelled AI computing to the forefront of mainstream applications. The desire to drive advanced visualization with photorealistic real-time rendering, together with efficient exascale-class high performance computing fed by huge-scale data collection, has driven development of the key elements needed to build the most advanced AI computational engines. While these engines, connected by advanced high-speed buses like NVLINK, now provide true scalable AI computation within single systems, the challenge of breaking out of the box with large-scale AI is upon us. In this talk we will discuss insights gained from creating NVIDIA's SATURNV AI supercomputer, enabling efficient use of this new class of dense AI computational engines, and keys to optimizing data centers for GPU multi-node computing specifically targeted at today's neural net and HPC computing.
  • 2:30 PM - 2:55 PM
    Booth #2417, Hall D
    Accelerated Computing

      Get an inside look at the world’s most powerful AI system, NVIDIA DGX-2. Explore the design and hardware architecture that enables sixteen Tesla Volta GPUs to operate as one giant GPU. Find out how NVIDIA DGX-2 can enable you to explore and solve the most complex AI challenges.
  • 3:00 PM - 3:25 PM
    GPU Computing on Oracle Cloud Infrastructure
    Karan Batta, Director of Product Management, Oracle Cloud Infrastructure, Oracle [ View Recording ]
    Booth #2417, Hall D
    Cloud Computing

      You’d imagine, with the growth of the public cloud, that the majority of HPC workloads and applications would have transitioned to the cloud. However, almost all enterprise HPC workloads are still running in on-premises datacenters, which means millions of mission-critical use cases, such as engineering crash simulations and cancer research, are still constrained by on-premises environments. Learn how Oracle Cloud Infrastructure is solving these problems with cutting-edge GPU and HPC infrastructure, along with datacenter-level features that make it more attractive for enterprises to migrate, and how this allows new use cases, such as running deep learning training directly on data from Oracle databases, instantly adding more value to your data and business. This is a session not to be missed!
  • 3:30 PM - 3:55 PM
    How XALT Can Help You Build The Optimal Supercomputing System That Meets Your Users’ Needs
    Robert McLay, HPC Manager of Software Tools, TACC
    Dion Harris, Sr. Manager, Product Marketing, NVIDIA Data Center Group
    Booth #2417, Hall D
    HPC + AI

      Detailed knowledge of application workload characteristics can help optimize the performance of current and future systems. This may sound daunting, with many HPC data centers hosting over 2,000 users running thousands of applications and millions of jobs per month. XALT is an open source tool developed at the Texas Advanced Computing Center (TACC) that collects system usage information to quantitatively report how users are using your system. This session will explore the benefits of detailed application workload profiling and how the XALT tool has helped leading supercomputing sites unlock the power of their application usage data.
  • 4:00 PM - 4:25 PM
    Booth #2417, Hall D
    Science

      Real-time ray tracing has finally become reality. Following individual rays through a virtual scene leads to highly accurate renderings, but the associated computational cost has so far prevented its interactive use. While mostly known in the computer graphics space, many computational science and HPC applications perform similar operations. The ray tracing capabilities available on the latest generation of GPUs, including the hardware support for ray tracing via the RT Cores on Turing GPUs, are therefore of relevance not only to applications that generate colorful pixels, but also to applications performing ray-tracing-like operations, including particle tracking in complex geometries or even spatial database searches. In this talk I will briefly summarize the features enabling real-time ray tracing and will look at the impact this technology has on scientific visualization and HPC applications in general.
  • 4:30 PM - 4:55 PM
    Booth #2417, Hall D
    HPC + AI

      PSC's "Bridges" was the first system to successfully converge HPC, AI, and Big Data. Designed for the U.S. national research community and supported by NSF, it now serves approximately 1600 projects and 7500 users at over 350 institutions. Bridges emphasizes "nontraditional" uses that span the life, physical, and social sciences, engineering, and business, many of which are based on AI or AI-enabled simulation. We describe the characteristics of Bridges that have made it a success, and we highlight several inspirational results and how they benefited from the system architecture. We then introduce "Bridges AI", a powerful new addition for balanced AI capability and capacity that includes NVIDIA's DGX-2 and HPE NVLink-connected 8-way Volta servers.
  • 5:00 PM - 5:25 PM
    Booth #2417, Hall D
    HPC + AI

      The RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA CUDA primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
  • 5:30 PM - 5:55 PM
    Booth #2417, Hall D
    Science

      Attendees will learn about the role of CGYRO in simulating the confinement of plasma in toroidal fusion devices. CGYRO solves the gyrokinetic equations which describe the nonlinear, turbulent flow of plasma in toroidal geometry. It is compute and memory intensive, with high-fidelity cases running on leadership class HPC systems, like the U.S. Department of Energy's Oak Ridge National Laboratory (ORNL) Titan and Summit GPU-heavy supercomputers. Attendees will learn about the physics motivation, the software architecture, as well as the performance on leadership systems. In particular, we will discuss how CGYRO was designed and implemented to make good use of the tens of thousands of GPUs on leadership systems working in unison to provide simulations that bring us ever closer to fusion as an abundant clean energy source.

THURSDAY 11/15

  • 10:00 AM - 10:25 AM
    Booth #2417, Hall D
    Cloud Computing

      Zenotech Ltd is a UK-based company developing the latest in computational fluid dynamics (CFD) solvers and cloud-based HPC systems. Its CFD solver, zCFD, has been engineered to take full advantage of the latest developments in GPU technology. This talk will present the performance advantages Zenotech sees when using GPUs and the impact this has on its customers. It will showcase industrial problems that have been solved quickly and cost-effectively using a combination of zCFD and the P100 and V100 GPUs available on AWS. Traditionally these cases are run on in-house parallel computing clusters, but the larger number of GPUs per node on AWS has enabled the solving of large CFD problems on a single instance. Benchmarking with zCFD demonstrates that a single P3 node provides performance equivalent to over 1,100 CPU cores. Beyond the performance benefits, the spot market and on-demand nature of AWS provide real cost savings for Zenotech's customers and open up a scale of simulation that was previously not affordable. The session will present real-world examples from Zenotech's customers in the aerospace, renewables, and automotive sectors. It will also show how Zenotech's EPIC platform makes the combination of zCFD and AWS GPUs a simple and cost-effective solution for engineers.
  • 10:30 AM - 10:55 AM
    What Accelerating Genomics Means for Your Health
    Fernanda Foertter, GPU Developer Advocate (Healthcare, HPC + AI), NVIDIA [ View Recording ]
    Booth #2417, Hall D
    Science

      Genomics is on the verge of a breakthrough. With cost decreasing and accuracy increasing, the amount of data generated will make genomics accessible for a whole new generation of population level research. This talk will highlight the challenges faced in this emerging era of genetic data abundance and explore what accelerating workflows means for health outcomes.
  • 11:00 AM - 11:25 AM
    Booth #2417, Hall D
    Programming Languages

      The C++17 and Fortran 2018 language standards include parallel programming constructs well-suited to GPU computing. The C++17 parallel STL (pSTL) was designed with the intent to support GPU parallel programming. The Fortran 2018 do concurrent construct, with its shared and private variable clauses, can be used to express loop-level parallelism across multiple array index ranges. We will share our experiences and results implementing support for these constructs in the PGI C++ and Fortran compilers for NVIDIA GPUs, and explain the capabilities and limitations they offer HPC programmers. You will learn how to use OpenACC as a bridge to GPU and parallel programming with standard C++ and Fortran, and we will present additional features we hope and expect will become part of those standards.
  • 11:30 AM - 11:55 AM
    Booth #2417, Hall D
    Accelerated Computing

      This talk will provide a peek into how NVSHMEM allows the Kokkos programming model to be extended to support PGAS semantics on GPUs. The implementation and integration strategy will be discussed, demonstrating the ease of use for multi-GPU programming afforded by NVSHMEM. Initial performance results will be presented, demonstrating that NVSHMEM on NVLINK-enabled systems provides performance characteristics much better than those experienced on multi-socket CPU nodes using SHMEM or MPI-3 one-sided communication.
  • 12:00 PM - 12:25 PM
    Scientific Application Development and Early Results on Summit
    Tjerk P. Straatsma, Distinguished Research Scientist, Oak Ridge National Laboratory [ View Recording ]
    Booth #2417, Hall D
    Tensor Cores for Science

      Summit, the world's fastest supercomputer, located in the Oak Ridge Leadership Computing Facility at the DOE Oak Ridge National Laboratory, will provide unprecedented computational resources for open science supported by the DOE user programs. The unique aspects of its GPU-accelerated architecture are reviewed in this presentation. The collaborative efforts to prepare scientific modeling and simulation as well as data-intensive computing applications to take advantage of the architectural features of Summit are highlighted, and early scientific results enabled by the porting and development work are presented.
  • 12:30 PM - 12:55 PM
    Booth #2417, Hall D
    Accelerated Computing

      In this talk, Brian will present an overview of some of the research within their cognitive simulation portfolio, highlighting how they are using the Sierra supercomputer to tackle problems of cancer biology at scale.
  • 1:00 PM - 1:25 PM
    Simulating and Visualizing Turbulent Fluid Mixing using 16,384 GPUs on LLNL's Sierra Supercomputer
    Cyrus Harrison, Computer Scientist and Associate Division Leader, Lawrence Livermore National Laboratory
    Booth #2417, Hall D
    Science

      In October 2018, LLNL ran a 97.8-billion-element hydrodynamics simulation using 16,384 GPUs on 4,096 Sierra compute nodes. This high-resolution simulation of two-fluid mixing in a spherical geometry was run to gain insight into the growth of a Rayleigh-Taylor instability. We will share details about running this simulation and visualizing the results.
  • 1:30 PM - 1:55 PM
    Booth #2417, Hall D
    HPC + AI, Cloud Computing

      We have developed an HPC ML training algorithm that can reduce training time on petabytes of data from days and weeks to minutes. Using the same research, we can now conduct inferencing on completely encrypted data. We have built a distributed ML framework on commodity Azure VMs that scales to tens of terabytes and thousands of cores, while achieving better accuracy than the state of the art.
  • 2:00 PM - 2:25 PM
    Booth #2417, Hall D
    Applications/Containers

      Containers simplify application deployment in the data center by wrapping applications in an isolated virtual environment. By including all application dependencies, such as binaries and libraries, application containers run seamlessly in any data center environment. The HPC application containers available on NVIDIA GPU Cloud (NGC) dramatically improve ease of application deployment while delivering optimized performance. However, if the desired application is not available in the NGC registry, building HPC containers from scratch trades one set of challenges for another. Parts of the software environment typically provided by the HPC data center must be redeployed inside the container. For those used to simply loading the relevant environment modules, installing a compiler, MPI library, CUDA, and other core HPC components from scratch may be daunting. HPC Container Maker (HPCCM) is an open-source project that addresses the challenges of creating HPC application containers. Scott McMillan will present how HPCCM makes it easier to create HPC application containers by separating the choice of what should go into a container image from the details of how to build it, and will cover best practices to minimize container development effort, minimize image size, and take advantage of image layering.
  • 2:30 PM - 2:55 PM
    Exascale Computing: The New Microscope for Systems Biology
    Dan Jacobson, Chief Scientist for Computational Systems Biology, ORNL
    Booth #2417, Hall D
    Tensor Cores for Science

      Integrated biological models need to capture the higher-order complexity of the interactions that occur among cellular components. A full model of all of the higher-order interactions of cellular and organismal components is one of the ultimate grand challenges of systems biology, and the ability to build such comprehensive models will usher in a new era in biology. Success in the construction and application of computational algorithms will enable new insights into the molecular mechanisms responsible for complex biological systems and their related emergent properties, using technologies not previously available and at a scale not feasible before. Such a full systems biology model would lead to breakthroughs with profound effects on the field.