Computer and Machine Vision focuses on breakthroughs in GPU technology that enable computers to see and make sense of their world. Explore the future of robotics, intelligent video analytics, real-time image indexing and autonomous navigation with leading minds in one of the fastest growing and most impactful areas in computing today. Learn from researchers, technologists, entrepreneurs and industry leaders from around the world.
3D Deep Learning
PRINCETON UNIVERSITY, Assistant Professor
We'll discuss some of our research on 3D deep learning in computer vision, including projects that use 3D convolutional neural networks on GPUs to learn 3D descriptors for point features, to model 3D shapes, and to parse 3D scenes. Finally, we'll talk about Marvin, a deep learning software framework for N-dimensional data that we developed for NVIDIA GPUs, which could impact other fields, such as neuroscience, biology, medical imaging, and healthcare.
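The abstract centers on applying 3D convolutional networks to voxelized shapes and scenes. As a rough, framework-agnostic illustration of the operation such networks repeat at every layer, the NumPy sketch below runs a single 3D convolution over a binary occupancy grid; this is not Marvin code, and the grid size and filter values are arbitrary placeholders.

```python
# Illustrative sketch only: one 3D convolution over a voxelized shape in plain NumPy.
import numpy as np

def conv3d(volume, kernel):
    """Valid-mode 3D convolution of a voxel grid with a single filter."""
    D, H, W = volume.shape
    d, h, w = kernel.shape
    out = np.zeros((D - d + 1, H - h + 1, W - w + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                out[z, y, x] = np.sum(volume[z:z+d, y:y+h, x:x+w] * kernel)
    return out

# A 30^3 binary occupancy grid standing in for a voxelized 3D shape.
voxels = (np.random.rand(30, 30, 30) > 0.9).astype(np.float32)
filt = np.random.randn(5, 5, 5).astype(np.float32)   # one learnable 3D filter
feature_map = np.maximum(conv3d(voxels, filt), 0)     # ReLU activation
print(feature_map.shape)                              # (26, 26, 26) feature volume
```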
ABOUT THE SPEAKER: Jianxiong Xiao is an assistant professor in the Department of Computer Science at Princeton University and the director of the Princeton Vision Group. He received his Ph.D. from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at Massachusetts Institute of Technology (MIT). Jianxiong's research interests are in computer vision. He has been motivated by the goal of building computer systems that automatically understand visual scenes, both inferring the semantics and extracting 3D structure. Jianxiong focuses on 3D deep learning, RGB-D recognition and reconstruction, place-centric 3D context modeling, graphics for vision (synthesis for analysis), deep learning for autonomous driving, large-scale crowd-sourcing, and petascale big data. His work has received the Best Student Paper Award at the European Conference on Computer Vision (ECCV) in 2012 and Google Research Best Papers Award for 2012. Jianxiong was awarded the Google U.S./Canada Fellowship in Computer Vision in 2012, MIT CSW Best Research Award in 2011, and two Google Research Awards in 2014 and in 2015.
Enhancing Visual Realism of Mixed Reality Applications with Stereo Vision
STEREOLABS, Chief Technical Officer
Discover how stereo vision and 3D depth sensing on GPU enable the development of mixed reality applications, which merge virtual information into a live 3D video stream of the real world. We will discuss the various stages of a real-time mixed reality processing pipeline, and how NVIDIA's GPU acceleration is integral to every step of the pipeline. We will also show demonstrations of how stereo depth sensing can be used to create 3D virtual playgrounds and real-time augmentation of the environment.
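As background on what stereo depth sensing means computationally, here is a deliberately simplified, CPU-only sketch of block-matching disparity estimation; the actual STEREOLABS pipeline and its GPU implementation are not shown or implied, and the window size and disparity range are made-up values.

```python
# Toy sketch: for each pixel in the left image, search along the same row of the
# right image for the best match; the horizontal offset (disparity) is inversely
# proportional to depth.
import numpy as np

def disparity_block_match(left, right, max_disp=32, block=5):
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y-half:y+half+1, x-half:x+half+1]
            costs = [np.abs(patch - right[y-half:y+half+1, x-d-half:x-d+half+1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = np.argmin(costs)
    return disp

# Synthetic pair with a ~4-pixel horizontal shift; recovered disparities are ~4.
left = np.random.rand(60, 80)
right = np.roll(left, -4, axis=1)
d = disparity_block_match(left, right, max_disp=8)
# depth = focal_length_px * baseline_m / disparity (for nonzero disparities)
```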
ABOUT THE SPEAKER: Edwin Azzam co-founded STEREOLABS in 2010. As STEREOLABS's Chief Technical Officer, Edwin is responsible for leading the company's product development and technology strategy in stereo vision. Prior to founding STEREOLABS, Edwin was a project manager at Astrium Space Transportation, Paris. Edwin holds a Master's degree in Optics & Image Processing from Institut d'Optique, France, as well as a Master's degree in Management from ESSEC Business School. He is a PhD supervisor and a National Technical Expert for the ANR (National Research Agency), where he uses his technical and market expertise in the assessment of national research projects in the field of computer vision and 3D image processing.
Real-Time Free Viewpoint TV System Based on a New Panorama Stitching Framework
UNIVERSITY OF ALBERTA, Professor
With the advance of GPU and vision technologies, free viewpoint TV (FTV) will become a reality in the near future. Traditional videos, such as those shown on TV or viewed on the Internet, are passive and two-dimensional in nature: viewers can only observe the events captured by a cameraman and have no ability to change their viewpoint once the video is recorded. In contrast, FTV will allow the viewer to select an arbitrary viewpoint and thus enjoy a feeling of immersion in events such as an Olympic competition or a popular theatre show. In this presentation, we'll describe an FTV system based on creating a real-time panorama from multiple pixel-synchronized cameras using GPUs, and explain how to transmit this information using standard IPTV technologies.
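To make the panorama-stitching step concrete, here is a small sketch under the assumption that each camera's frame can be related to a common panorama canvas by a planar homography; the stitching framework presented in the session may differ, and the function below is purely illustrative.

```python
# Toy sketch of one stitching step: warping a single camera's frame into a shared
# panorama canvas with a 3x3 homography. A real-time FTV pipeline would do this
# per pixel on the GPU for every synchronized camera, with blending at the seams.
import numpy as np

def warp_into_panorama(frame, H, canvas):
    """Backward-map each canvas pixel through H^-1 to sample the source frame."""
    Hinv = np.linalg.inv(H)
    ch, cw = canvas.shape[:2]
    ys, xs = np.mgrid[0:ch, 0:cw]
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # homogeneous
    src = Hinv @ pts
    sx = (src[0] / src[2]).round().astype(int).reshape(ch, cw)
    sy = (src[1] / src[2]).round().astype(int).reshape(ch, cw)
    valid = (sx >= 0) & (sx < frame.shape[1]) & (sy >= 0) & (sy < frame.shape[0])
    canvas[valid] = frame[sy[valid], sx[valid]]
    return canvas

# Example: place a 240x320 frame into a wider canvas, shifted 100 px to the right.
frame = np.random.rand(240, 320)
H = np.array([[1.0, 0.0, 100.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
pano = warp_into_panorama(frame, H, np.zeros((240, 640)))
```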
ABOUT THE SPEAKER: Pierre Boulanger has more than 30 years of experience in 3D computer vision, rapid product development, and the application of virtual reality systems to medicine and industrial manufacturing. He worked for 18 years at the National Research Council of Canada as a senior research officer, where his primary research interests were 3D computer vision, rapid product development, and virtualized reality systems. He now holds a joint appointment as a professor in the University of Alberta's Department of Computing Science and Department of Radiology and Diagnostic Imaging. He is currently the director of the Advanced Man-Machine Interface Laboratory (AMMI) as well as the scientific director of the SERVIER Virtual Cardiac Centre. In 2013, Pierre was awarded the Cisco chair in healthcare solutions, a 10-year investment by Cisco Systems in the development of new IT technologies for healthcare in Canada. His main research topics are the development of new techniques for telemedicine, patient-specific modeling using sensor fusion, and the application of tele-presence technologies to medical training, simulation, and collaborative diagnostics.
Where Tegra Meets Titan: Asymmetric Computer Vision for Smartphones and Robotics
MONASH UNIVERSITY, Professor
This presentation will argue that battery life and thermal limits will prevent small mobile devices from implementing the next generation of visual processing algorithms without external assistance from high performance computing. Several innovative methods of distributing these problems between lightweight and high-powered nodes will be explored for a number of visual processing applications relevant to smartphones and robotics. We'll illustrate how these problems can be mapped onto the thread model of GPUs and will present a couple of CUDA tricks used to maximize efficiency.
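One way to picture the kind of lightweight/high-powered split the talk describes is the toy sketch below: the device side compresses a frame into a small descriptor, and the expensive database matching runs on a remote GPU server. The split point, data sizes, and function names are assumptions made for illustration, not the speaker's system.

```python
# Hedged sketch of an asymmetric vision pipeline: cheap feature extraction on the
# device, heavy brute-force matching on the server.
import numpy as np

def device_side(image):
    """Cheap on-device step: reduce the frame to a small descriptor vector."""
    small = image[::16, ::16].astype(np.float32)      # coarse downsample
    desc = small.flatten()
    return desc / (np.linalg.norm(desc) + 1e-8)       # a few KB instead of a full frame

def server_side(descriptor, database):
    """Heavy step meant for the GPU server: match against N stored descriptors."""
    scores = database @ descriptor                     # one large matrix-vector product
    return int(np.argmax(scores))

frame = np.random.rand(480, 640)
db = np.random.rand(10_000, 30 * 40).astype(np.float32)  # stand-in descriptor database
best_match = server_side(device_side(frame), db)
```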
ABOUT THE SPEAKER: Tom Drummond has been a principal investigator on several EU Framework projects and is a chief investigator in the ARC Centre of Excellence for Robotic Vision. Tom studied mathematics for his B.A. at the University of Cambridge. In 1989, he emigrated to Australia and worked for CSIRO in Melbourne for four years before moving to Perth for his Ph.D. in computer science at Curtin University. In 1998, he returned to Cambridge as a postdoctoral research associate, was later appointed as a university lecturer, and was subsequently promoted to senior university lecturer. In 2010, he returned to Melbourne and took up a professorship at Monash University.
Autonomous Robotic 3D Printing: Real-Time Path Planning with Computer Vision
UNIVERSITY COLLEGE LONDON, Architect
Teach your 3D printing robot how to adapt to unpredictable material behavior by using deep learning algorithms. We'll introduce a path planning strategy for iteratively correcting robot target positions in a 3D printing process by using an NVIDIA Jetson board attached to an industrial robotic arm. Initial path generation, real-time visual tracking of material behavior, and evaluation and recomputation of robot trajectories will be explained through code examples and video recordings from the fabrication process.
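Since the abstract promises code examples for the correction loop, here is a hedged sketch of what iterative target correction can look like in the abstract: a camera measurement of where material actually landed feeds back into the next commanded position. The gain, the callback interface, and the simulated droop are invented for illustration and are not the presented implementation.

```python
# Illustrative closed-loop correction: after each printed segment, vision feedback
# reports the deposited position, and the next target is nudged by the observed error.
import numpy as np

def corrected_path(planned, measure_deposit, gain=0.5):
    """planned: (N, 3) target positions; measure_deposit: camera feedback callback."""
    correction = np.zeros(3)
    executed = []
    for target in planned:
        cmd = target + correction                 # send corrected target to the robot
        actual = measure_deposit(cmd)             # vision system returns deposited position
        correction += gain * (target - actual)    # accumulate correction for next segment
        executed.append(cmd)
    return np.array(executed)

# Example with a simulated material that always droops 2 mm in z:
sim = lambda cmd: cmd + np.array([0.0, 0.0, -2.0])
path = np.stack([np.linspace(0, 100, 50), np.zeros(50), np.full(50, 10.0)], axis=1)
print(corrected_path(path, sim)[-1])              # commanded z converges toward ~12 mm
```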
ABOUT THE SPEAKER: Daghan Cam is an architect and researcher based in London. He is the director of Daghan Cam Limited, which operates at the intersection of architecture, technology, and research. He runs a post-graduate research cluster at UCL's Bartlett School of Architecture with Alisa Andrasek and Andy Lomas. He also leads research on GPU computing and is a co-principal investigator of UCL's NVIDIA GPU Research Center. Previously, he worked with Zaha Hadid Architects. He has taught workshops and given lectures at AA Visiting Schools in Istanbul, Athens, and London, and at the Ecole d'architecture in Paris. His work on computational design and large-scale robotic fabrication has been widely exhibited, most recently in San Francisco and at Milan Design Week 2015.
Real-time Person Tracking on Jetson with OpenPTrack
UCLA SCHOOL OF THEATER, FILM AND TELEVISION, Assistant Dean, Technology and Innovation
We'll provide an overview of OpenPTrack, an open-source, GPU-accelerated project for real-time position tracking of many people using networked 3D imagers, which is now available for the Jetson TK1/TX1 embedded platform. OpenPTrack specifically targets innovative applications in education, arts, and culture, where it aims to meet a need for real-time person tracking that is reliably scalable over large areas, realistically deployable, and low cost. We'll cover the basic technical approach, UCLA REMAP's experience from real-world multi-imager deployments, and the technology roadmap, using Jetson, that aims to bring occlusion-resistant, real-time person tracking into the mainstream of interactive design and experimentation.
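For readers unfamiliar with multi-imager person tracking, the sketch below shows the bare-bones idea of associating world-frame detections from networked imagers to persistent track IDs by nearest neighbor; OpenPTrack's real pipeline (detection, calibration, and tracking nodes) is considerably more sophisticated, and nothing here reflects its actual API.

```python
# Simplified sketch (not OpenPTrack code): each networked 3D imager reports person
# detections in a shared world frame; a central tracker matches them to known tracks.
import numpy as np

def update_tracks(tracks, detections, max_dist=0.75):
    """tracks: dict id -> (x, y); detections: (N, 2) world-frame positions in meters."""
    unmatched = list(range(len(detections)))
    for tid, pos in tracks.items():
        if not unmatched:
            break
        dists = [np.linalg.norm(detections[i] - pos) for i in unmatched]
        j = int(np.argmin(dists))
        if dists[j] < max_dist:
            tracks[tid] = detections[unmatched.pop(j)]  # update matched track
    for i in unmatched:                                  # leftovers start new tracks
        tracks[max(tracks, default=0) + 1] = detections[i]
    return tracks

# Detections from two imagers, already calibrated into the same world frame:
dets = np.array([[1.0, 2.0], [4.2, 0.5]])
print(update_tracks({1: np.array([1.1, 1.9])}, dets))
```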
ABOUT THE SPEAKER: Jeff Burke is Assistant Dean for Technology and Innovation at the UCLA School of Theater, Film and Television (UCLA TFT). He has produced, managed, programmed and designed experimental performances, short films, new genre art installations and new facility construction internationally for more than 15 years. Jeff has been a faculty member since 2001 and today, in addition to his role developing technology and innovation strategy at TFT, is Co-PI and application team lead for the Named Data Networking project, a multi-campus effort supported by the National Science Foundation (NSF) and an international 25-member consortium to develop a future Internet architecture. In 2004, Burke co-founded UCLA TFT's Center for Research in Engineering, Media and Performance (REMAP), a collaboration with the Henry Samueli School of Engineering and Applied Science, which combines research, artistic production and community engagement. At REMAP, Burke's research has been supported by the NSF and NEA, Intel, Cisco, Trust for Mutual Understanding and the MacArthur Foundation, among others. From 2006-2012, he was area lead for participatory sensing at the NSF Center for Embedded Networked Sensing, helping to define a new application arena for mobile devices. In 2014, Jeff received a three-year Google Focused Award on the "Future of Storytelling," for work that will explore the intersection of storytelling and coding through research and production of original, interdisciplinary digital media works at UCLA TFT.