Literature reading (245): Roller
2022-07-28 14:10:00 【tiaozhanzhe1900】
- Title: Roller: Fast and Efficient Tensor Compilation for Deep Learning
- Year: 2022
- Venue: OSDI
- Institution: Microsoft
The main motivation of this paper is that current DNN compilers spend a long time searching for efficient kernels, especially on hardware platforms other than NVIDIA GPUs, such as AMD GPUs and Graphcore IPUs. The paper therefore takes a different approach and generates kernels by construction. The basic concepts are:
- rTile: the most basic level of abstraction. It is simply a data tile, but its shape corresponds to the basic granularity of computation and memory access on the hardware.
- rProgram: a program built from rTiles, consisting of load, store, and compute operations, sized to fill a single SM on a GPU.
- kernel: constructed from rPrograms.

- rTile is a new tile abstraction that encapsulates tensor shapes that align with the key features of the underlying accelerator, thus achieving efficient execution by limiting the shape choices.
- rProgram: adopts a recursive rTile-based construction algorithm to gradually increase the size of the rTile shape to construct an rProgram that saturates a single execution unit of the accelerator (e.g., an SM, a streaming multi-processor in an NVIDIA GPU).
- kernel: performs the scale-out process, which simply replicates the resulting rProgram to other parallel execution units.
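
The three steps above can be illustrated with a small Python sketch. It is not Roller's implementation. The matmul tile layout, the alignment unit, the on-chip memory capacity, the compute-intensity score, and the names `RTile`, `build_rprogram`, and `scale_out` are all assumptions made for illustration. The sketch only shows the general idea: start from a hardware-aligned tile, grow it step by step until it no longer fits, then replicate the result across execution units.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class RTile:
    # Hypothetical tile for C[M, N] += A[M, K] @ B[K, N]; shape is (M, N, K).
    shape: tuple

    def footprint(self, bytes_per_elem=4):
        # Bytes needed to hold the A, B and C sub-tiles on chip.
        m, n, k = self.shape
        return (m * k + k * n + m * n) * bytes_per_elem

    def compute_intensity(self, bytes_per_elem=4):
        # FLOPs per byte held on chip; an assumed stand-in for a reuse score.
        m, n, k = self.shape
        return (2 * m * n * k) / self.footprint(bytes_per_elem)


def build_rprogram(align, mem_capacity, problem):
    """Grow an aligned tile one alignment step at a time.

    Each step tries every axis, keeps only candidates that fit in
    mem_capacity and do not exceed the problem size, and picks the one
    with the highest compute intensity. Stops when no candidate fits.
    """
    tile = RTile(tuple(align))
    while True:
        best = None
        for axis in range(3):
            grown = list(tile.shape)
            grown[axis] += align[axis]  # stay a multiple of the alignment unit
            if grown[axis] > problem[axis]:
                continue
            cand = RTile(tuple(grown))
            if cand.footprint() > mem_capacity:
                continue
            if best is None or cand.compute_intensity() > best.compute_intensity():
                best = cand
        if best is None:
            return tile
        tile = best


def scale_out(tile, problem, num_units):
    """Replicate the per-unit program over the output; return (blocks, waves)."""
    m, n, _ = tile.shape
    blocks = ceil(problem[0] / m) * ceil(problem[1] / n)
    waves = ceil(blocks / num_units)
    return blocks, waves


if __name__ == "__main__":
    # Assumed numbers: 16-element alignment, 48 KiB on-chip memory, 80 units.
    problem = (1024, 1024, 1024)
    tile = build_rprogram((16, 16, 16), 48 * 1024, problem)
    print(tile.shape, tile.footprint(), scale_out(tile, problem, 80))
```

Because candidate shapes are restricted to multiples of the alignment unit, the loop inspects only a handful of tiles per step instead of searching the full space of tile sizes, which is the intuition behind constructing a kernel rather than searching for one.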
