Files in this item

File: TANG-THESIS-2020.pdf (3MB), application/pdf, restricted to U of Illinois
Description: (no description provided)
Format: PDF

Description

Title: TensorRT inference performance study in MLModelScope
Author(s): Tang, Jingning
Advisor(s): Hwu, Wen-Mei
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): deep learning; inference; TensorRT; MLModelScope
Abstract: As deep learning has been adopted in various domains, the inference process is of growing importance for deployment across multiple computing platforms. Among the many deep learning frameworks that support freezing and deploying well-trained models, NVIDIA TensorRT is the leading framework developed exclusively for inference. It allows the developer to optimize a model for high-performance inference. While it has been shown extensively that TensorRT can significantly boost inference capability, a quantitative study is lacking on how its assorted optimization strategies improve inference compared with other well-known deep learning frameworks such as TensorFlow. This thesis presents such a study, consisting of experiments using TensorRT on MLModelScope, a deep learning inference platform that enables standardized inference and multi-level profiling.
Issue Date: 2020-06-23
Type: Thesis
URI: http://hdl.handle.net/2142/108566
Rights Information: Copyright 2020 Jingning Tang
Date Available in IDEALS: 2020-10-07
Date Deposited: 2020-08

