Literature Reading (245): Roller
2022-07-28 13:10:00 【tiaozhanzhe1900】
- Title: Roller: Fast and Efficient Tensor Compilation for Deep Learning
- Year: 2022
- Venue: OSDI
- Institution: Microsoft
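The main motivation of this paper is that search-based compilation for DNNs currently takes a long time, especially on hardware platforms other than NVIDIA, such as AMD GPUs and the Graphcore IPU. The paper therefore takes a different approach and generates kernels by construction. The basic concepts are:
- rTile: the most basic level of abstraction. It is a data tile, but one whose shape matches the basic sizes of computation and memory access on the hardware.
- rProgram: an rTile-based program made up of load, store, and compute operations, sized so that it can fill one SM on a GPU.
- kernel: the kernel is constructed from rPrograms.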

- rTile: a new tile abstraction that encapsulates tensor shapes that align with the key features of the underlying accelerator, thus achieving efficient execution by limiting the shape choices.
- rProgram: adopts a recursive rTile-based construction algorithm to gradually increase the size of the rTile shape to construct an rProgram that saturates a single execution unit of the accelerator (e.g., an SM, a streaming multiprocessor in an NVIDIA GPU).
- kernel: performs the scale-out process, which simply replicates the resulting rProgram to other parallel execution units
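
The three levels above can be illustrated with a small Python sketch for a matrix multiplication of shape (M x K) @ (K x N). This is not Roller's actual algorithm or API: the `Unit` class, the `footprint` and `intensity` helpers, and the greedy "grow the axis with the best data reuse" rule are all assumptions made here for illustration. The sketch only shows the general idea of starting from an aligned tile, enlarging it until one execution unit's local memory is full, and then replicating the result across units.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Unit:
    # Hypothetical description of one execution unit (e.g. one SM).
    align: int      # assumed alignment step for every tile axis
    mem_elems: int  # elements that fit in the unit's local memory
    count: int      # number of parallel units on the device


def footprint(m, n, k):
    # Elements held locally for one tile: A (m x k), B (k x n), C (m x n).
    return m * k + k * n + m * n


def intensity(m, n, k):
    # Multiply-accumulates per element held; a crude data-reuse score
    # (an assumption of this sketch, not the paper's cost model).
    return (m * n * k) / footprint(m, n, k)


def build_tile(unit, M, N, K):
    # Start from the smallest aligned tile (assumed to fit in memory)
    # and enlarge one axis per step, always in multiples of `align`.
    tile = (unit.align, unit.align, unit.align)
    limits = (M, N, K)
    while True:
        candidates = []
        for axis in range(3):
            grown = list(tile)
            grown[axis] += unit.align
            fits = footprint(*grown) <= unit.mem_elems
            if grown[axis] <= limits[axis] and fits:
                candidates.append(tuple(grown))
        if not candidates:
            # No aligned enlargement fits: the unit is considered saturated.
            return tile
        tile = max(candidates, key=lambda t: intensity(*t))


def scale_out(unit, tile, M, N):
    # Replicate the per-unit program over all output tiles and report
    # how many rounds ("waves") the device needs to cover them.
    m, n, _ = tile
    blocks = ceil(M / m) * ceil(N / n)
    return blocks, ceil(blocks / unit.count)
```

A possible use, with made-up hardware numbers: `unit = Unit(align=8, mem_elems=1024, count=4)`, then `tile = build_tile(unit, 64, 64, 64)` and `scale_out(unit, tile, 64, 64)`. Because every candidate shape is a multiple of `align`, the number of shapes considered stays small, which is the point the rTile description makes about limiting shape choices.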
