[PyTorch Conference 2022] Torch-TensorRT: A Compiler for Accelerating PyTorch Inference Using TensorRT
Torch-TensorRT is an open-source compiler targeting NVIDIA GPUs for high-performance deep-learning inference in PyTorch. It combines the usability of PyTorch with the performance of TensorRT, allowing for easy optimization of inference workloads on NVIDIA GPUs. Torch-TensorRT supports all classes of optimizations in TensorRT, including reduced and mixed precision down to INT8, through simple Python & C++ APIs designed to work directly from PyTorch. Torch-TensorRT outputs standard PyTorch modules as well as the TorchScript format to allow for a completely self-contained, portable, & static module with TensorRT engines embedded as attributes.
[PyTorch Developer Conference 2021] Torch-TensorRT: Accelerating Inference Performance Directly from PyTorch using TensorRT
Torch-TensorRT is an open-source SDK designed to target NVIDIA GPUs for high-performance deep-learning inference. It combines the usability of PyTorch with the performance optimizations of TensorRT, allowing for easy optimization of any inference workload on NVIDIA GPUs. Torch-TensorRT supports all optimizations of TensorRT, including reduced and mixed precision down to INT8, all within simple Python & C++ APIs designed to work directly from PyTorch. Torch-TensorRT outputs the standard TorchScript format to allow for a completely self-contained, portable, & static module with TensorRT engines embedded as attributes. Torch-TensorRT partitions the model into subgraphs based on the TensorRT compatibility of each node. Each compatible subgraph is replaced with a single optimized TensorRT engine; the incompatible subgraphs fall back to native PyTorch. This fallback means there is no requirement for full model support by TensorRT, expanding the set of Torch-TensorRT-compatible models to all of TorchScript.
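The partition-and-fallback idea described above can be illustrated with a minimal sketch. This is not Torch-TensorRT's implementation: it assumes a model flattened into a linear list of op names, whereas a real compiler works on a graph with data dependencies. The `SUPPORTED_OPS` set, the op names, and the `partition` helper are all hypothetical.

```python
from itertools import groupby

# Hypothetical set of ops the accelerated backend can convert (illustrative only).
SUPPORTED_OPS = {"conv2d", "relu", "linear", "add"}


def partition(ops, supported=SUPPORTED_OPS):
    """Group a linear sequence of op names into maximal runs per backend.

    Consecutive supported ops form one "tensorrt" segment; consecutive
    unsupported ops form one "pytorch" fallback segment.
    """
    segments = []
    for is_supported, run in groupby(ops, key=lambda op: op in supported):
        backend = "tensorrt" if is_supported else "pytorch"
        segments.append((backend, list(run)))
    return segments


# Toy model: "custom_nms" stands in for an op the backend cannot convert.
model_ops = ["conv2d", "relu", "custom_nms", "linear", "add"]
for backend, run in partition(model_ops):
    print(backend, run)
```

In this toy model the single unsupported op splits the sequence into three segments, so only that op runs in the fallback path while the ops on either side are grouped for the accelerated backend.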
[PTED] TRTorch: A Compiler for TorchScript Targeting NVIDIA GPUs with TensorRT
We present TRTorch, a compiler for PyTorch and TorchScript targeting NVIDIA GPUs, which combines the usability of PyTorch with the performance of TensorRT, allowing users to easily optimize inference workloads on NVIDIA GPUs. For experimentation and the development of machine learning models, few tools are as approachable as PyTorch. However, some of the features that make PyTorch great for development make it hard to deploy. With TorchScript, PyTorch now has solid tooling for addressing some of these problems. TorchScript removes the dependency on Python and produces portable, self-contained, static representations of code and weights. In addition to portability, users also look to optimize performance in deployment. On NVIDIA GPUs, TensorRT, NVIDIA’s deep learning optimizer, provides the capability to maximize performance of workloads by tuning the execution of models for specific target hardware. TRTorch merges the benefits of TorchScript and TensorRT to simplify optimization, including post-training quantization, by leveraging common PyTorch tooling. It can be used directly from PyTorch as a TorchScript backend, via a CLI, or embedded (C++/Python) in an application to easily increase inference performance.
[arXiv:1606.05002] 3DFS: deformable dense depth fusion and segmentation for object reconstruction from a handheld camera
We propose an approach for 3D reconstruction and segmentation of a single object placed on a flat surface from an input video. Our approach is to perform dense depth map estimation for multiple views using a proposed objective function that preserves detail. The resulting depth maps are then fused using a proposed implicit surface function that is robust to estimation error, producing a smooth surface reconstruction of the entire scene. Finally, the object is segmented from the remaining scene using a proposed 2D-3D segmentation that incorporates image and depth cues with priors and regularization over the 3D volume and 2D segmentations. We evaluate 3D reconstructions qualitatively on our Object-Videos dataset, comparing to fusion, multiview stereo, and segmentation baselines. We also quantitatively evaluate the dense depth estimation using the RGBD Scenes V2 dataset [Henry et al. 2013] and the segmentation using keyframe annotations of the Object-Videos dataset.
[TEI '14] Gesture based distributed user interaction system for a reconfigurable self-organizing smart wall
We describe user interactions with the self-organized amorphous wall, a modular, fully distributed system of computational building blocks that communicate locally for creating smart surfaces and functional room dividers. We describe a menu and a widget-based approach in which functions are color-coded and can be selected by dragging them from module to module on the surface of the wall. We also propose an on-off switch gesture and a dial gesture, each spanning multiple units, as canonical input mechanisms that are realized in a fully distributed way.
[GPU Technology Conference Fall 2021] Accelerate PyTorch with TensorRT
Learn how to accelerate PyTorch inference without leaving the framework with Torch-TensorRT. Torch-TensorRT makes the performance of NVIDIA’s TensorRT GPU optimizations available in PyTorch for any model. You'll learn about the key capabilities of Torch-TensorRT, how to use them, and the performance benefits you can expect. We'll walk you through how to easily transition from a trained model to an inference deployment fine-tuned for your specific hardware, all with just a few lines of familiar code. If you want more technical details, the second half of the talk will give you a chance to deep dive into how Torch-TensorRT operates, the mechanics of key features, and a few in-depth examples.
[GPU Technology Conference Spring 2021] New Features in TRTorch, a PyTorch/TorchScript Compiler Targeting NVIDIA GPUs Using TensorRT
We'll cover new features of TRTorch, a compiler for PyTorch and TorchScript that optimizes deep learning models for inference on NVIDIA GPUs. Programs are internally optimized using TensorRT but maintain full compatibility with standard PyTorch or TorchScript code. This allows users to continue to feel like they're writing PyTorch code in their inference applications while fully leveraging TensorRT. We'll discuss new capabilities enabled in recent releases of TRTorch, including direct integration into PyTorch and post-training quantization.
[GPU Technology Conference Fall - Oct, 2020] TRTorch: A PyTorch/TorchScript Compiler Targeting NVIDIA GPUs Using TensorRT
We'll dive into TRTorch, a new compiler for PyTorch and TorchScript that optimizes deep learning models for inference on NVIDIA GPUs. Programs are internally optimized using TensorRT but maintain full compatibility with standard PyTorch or TorchScript code. This allows users to continue to feel like they're writing PyTorch code in their inference applications while fully leveraging TensorRT. We'll cover how the compiler works internally and different ways to leverage the compiler.
[GPU Technology Conference - 2020] PyTorch-TensorRT: Accelerating Inference in PyTorch with TensorRT
TensorRT is a deep-learning inference optimizer and runtime to optimize networks for GPUs and the NVIDIA Deep Learning Accelerator (DLA). Typically, the procedure to optimize models with TensorRT is to first convert a trained model to an intermediary format, such as ONNX, and then parse the file with a TensorRT parser. This works well for networks using common architectures and common operators; however, with the rapid pace of model development, sometimes a DL framework like TensorFlow has ops that are not supported in TensorRT. One solution is to implement plugins for these ops. Another is to use a tool like TF-TRT, which will convert supportable subgraphs to TensorRT and use TensorFlow implementations for the rest. We'll demonstrate the same ability with PyTorch with our new tool PTH-TRT, as well as leveraging the PyTorch API's great composability features to allow users to reuse their TensorRT-compatible networks in larger, more complex ones.
[University of Illinois Urbana-Champaign - CS@Illinois SAIL] Lecture: Intro to Convolutional Neural Nets
A quick crash course in using neural nets for Computer Vision. Builds up from logistic regression to CNNs, with implementations in PyTorch.
[University of Colorado - OIT Tech Talk] Intro to Convolutional Neural Nets and Implications of Deep Learning and AI
Artificial Intelligence has entered a great age of productivity, with massive strides in Computer Vision, Natural Language Processing and Task Learning being enabled by the exponential growth in data availability and the computing power enabled by General Purpose GPU (GPGPU) computing. Developers can now create near-state-of-the-art AI applications on their laptops. This talk will cover one of the main tools in deep learning and AI: Convolutional Neural Networks (CNNs), how to build one, and how to apply it to a problem like handwriting recognition. It will then explore some of the current problems and approaches in the field of AI, such as self-driving cars, machine translation, and robotics.
[TEDxMileHigh - Emergence] Navigating Learning in the Multidisciplinary World
What skills are necessary to succeed in a world that's becoming more and more complex? Is it better to specialize, focusing on one subject area? Or is it better to have an interdisciplinary approach? This talk explains why the future of work lies not within specialization, but in the ability to draw on design thinking and immediate problem solving to tackle the world's big challenges.