Aike AI frontier promotion (2.13)
2022-07-06 06:13:00 【Zhiyuan community】
LG - Machine Learning  CV - Computer Vision  CL - Computation and Language  AS - Audio and Speech  RO - Robotics
Reposted from 爱可可-爱生活
1、[LG] Diversify and Disambiguate: Learning From Underspecified Data
Y Lee, H Yao, C Finn
[Stanford University]
Diversify and disambiguate: learning from underspecified data. Many datasets are underspecified, meaning that several equally viable solutions fit the data. This is a problem for methods that learn a single hypothesis, because different functions that achieve low training loss can rely on different predictive features and therefore make very different predictions on out-of-distribution (OOD) data. The paper proposes DivDis, a simple two-stage framework that first uses unlabeled data from the test distribution to learn a diverse set of hypotheses for a task, then disambiguates by choosing one of the discovered hypotheses with minimal extra supervision, in the form of additional labels or visual inspection of the functions. Experiments show that DivDis finds hypotheses that use robust features in underspecified image classification and natural language processing problems and achieves higher performance, at the cost of unlabeled target data and a few corresponding labels.
Many datasets are underspecified, which means there are several equally viable solutions for the data. Underspecified datasets can be problematic for methods that learn a single hypothesis because different functions that achieve low training loss can focus on different predictive features and thus have widely varying predictions on out-of-distribution data. We propose DivDis, a simple two-stage framework that first learns a diverse collection of hypotheses for a task by leveraging unlabeled data from the test distribution. We then disambiguate by selecting one of the discovered hypotheses using minimal additional supervision, in the form of additional labels or inspection of function visualization. We demonstrate the ability of DivDis to find hypotheses that use robust features in image classification and natural language processing problems with underspecification.
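The disambiguation stage can be pictured with a small sketch. It assumes the first stage has already produced several candidate hypotheses (plain Python functions stand in for them here) and that a handful of labeled target examples are available. The name `select_hypothesis`, the toy features, and the accuracy-based selection rule are hypothetical illustrations, not the authors' code.

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Callable[[Sequence[float]], int]
Example = Tuple[Sequence[float], int]


def select_hypothesis(
    hypotheses: List[Hypothesis], labeled_examples: List[Example]
) -> Tuple[int, List[float]]:
    """Return the index of the most accurate hypothesis and all accuracies."""

    def accuracy(h: Hypothesis) -> float:
        correct = sum(h(x) == y for x, y in labeled_examples)
        return correct / len(labeled_examples)

    scores = [accuracy(h) for h in hypotheses]
    best = max(range(len(hypotheses)), key=scores.__getitem__)
    return best, scores


# Two toy hypotheses that could both fit the training data
# while relying on different features (assumed for illustration).
def uses_feature_0(x: Sequence[float]) -> int:
    return int(x[0] > 0.5)


def uses_feature_1(x: Sequence[float]) -> int:
    return int(x[1] > 0.5)


# A few labeled target-distribution points where the two features disagree.
target_labeled = [((0.9, 0.1), 1), ((0.2, 0.8), 0), ((0.7, 0.3), 1)]

best_index, scores = select_hypothesis(
    [uses_feature_0, uses_feature_1], target_labeled
)
```

Only three labels are needed here because the hypotheses disagree on every target point, which is the situation the diversification stage is meant to create.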
2、[LG] MuZero with Self-competition for Rate Control in VP9 Video Compression
A Mandhane, A Zhernov, M Rauh...
[DeepMind & Google]
MuZero with self-competition for rate control in VP9 video compression. As entertainment, education, and business rely increasingly on online video, video streaming usage has risen significantly. Optimizing video compression could improve users' access to content and its quality while reducing energy use and overall cost. The paper applies the MuZero algorithm to video compression, with the goal of learning a rate control policy that selects the quantization parameters (QP) in the encoding process of libvpx, an open-source VP9 video compression library widely used by popular video-on-demand (VOD) services. The task is framed as a sequential decision problem: maximize video quality under an episodic constraint imposed by the target bitrate. The paper proposes a new self-competition-based reward mechanism for constrained reinforcement learning with variable constraint-satisfaction difficulty, which is challenging for existing constrained RL methods. Experiments show that, compared with libvpx's two-pass VBR rate control policy, MuZero-based rate control reduces the size of compressed videos by 6.28% on average at the same delivered video quality (measured as PSNR BD-rate), while also showing better constraint satisfaction.
Video streaming usage has seen a significant rise as entertainment, education, and business increasingly rely on online video. Optimizing video compression has the potential to increase access and quality of content to users, and reduce energy use and costs overall. In this paper, we present an application of the MuZero algorithm to the challenge of video compression. Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services. We treat this as a sequential decision making problem to maximize the video quality with an episodic constraint imposed by the target bitrate. Notably, we introduce a novel self-competition based reward mechanism to solve constrained RL with variable constraint satisfaction difficulty, which is challenging for existing constrained RL methods. We demonstrate that the MuZero-based rate control achieves an average 6.28% reduction in size of the compressed videos for the same delivered video quality level (measured as PSNR BD-rate) compared to libvpx’s two-pass VBR rate control policy, while having better constraint satisfaction behavior.
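The abstract does not spell out the reward, so the following is only a guess at the general shape of a self-competition signal: each episode is compared with a running record of the agent's own past episodes and scored +1 or -1. The class name, the exponential-moving-average baseline, and the rule that a smaller bitrate overshoot wins are all assumptions made for illustration, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelfCompetitionReward:
    """Hypothetical +1/-1 reward against an EMA of the agent's past episodes."""

    decay: float = 0.9
    baseline_psnr: Optional[float] = None
    baseline_overshoot: float = 0.0

    def __call__(self, psnr: float, bitrate: float, target_bitrate: float) -> float:
        # Amount by which the episode exceeds the target bitrate (0 if satisfied).
        overshoot = max(0.0, bitrate - target_bitrate)

        if self.baseline_psnr is None:
            # First episode: nothing to compete against yet.
            self.baseline_psnr = psnr
            self.baseline_overshoot = overshoot
            return 0.0

        if overshoot > 0.0 or self.baseline_overshoot > 0.0:
            # Assumed rule: when the constraint is in play, less overshoot wins.
            reward = 1.0 if overshoot < self.baseline_overshoot else -1.0
        else:
            # Both satisfy the constraint: higher quality wins.
            reward = 1.0 if psnr > self.baseline_psnr else -1.0

        # Update the running baselines after scoring the episode.
        keep = self.decay
        self.baseline_psnr = keep * self.baseline_psnr + (1 - keep) * psnr
        self.baseline_overshoot = keep * self.baseline_overshoot + (1 - keep) * overshoot
        return reward


reward_fn = SelfCompetitionReward()
rewards = [
    reward_fn(psnr=40.0, bitrate=900.0, target_bitrate=1000.0),
    reward_fn(psnr=41.0, bitrate=950.0, target_bitrate=1000.0),
    reward_fn(psnr=45.0, bitrate=1100.0, target_bitrate=1000.0),
]
```

The third episode has the best PSNR but is still penalized because it overshoots the target bitrate. This shows how a binary comparison can fold the constraint into the reward without a hand-tuned penalty weight.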
3、[LG] Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative
L M. Dery, P Michel, A Talwalkar, G Neubig
[CMU & ENS PSL University]
End-task aware training as an alternative to pre-training. In most settings of practical concern, machine learning practitioners know in advance which end-task they want to improve with auxiliary tasks. However, widely used ways of exploiting auxiliary data, such as pre-training and its continued pre-training variant, are end-task agnostic: they rarely, if ever, use knowledge of the target task. The paper studies replacing end-task agnostic continued training of pre-trained language models with end-task aware training. It argues that for sufficiently important end-tasks, the benefit of using auxiliary data in a task-aware way can justify giving up the traditional approach of obtaining generic, end-task agnostic representations through (continued) pre-training. On three different low-resource NLP tasks from two domains, multi-tasking the end-task with auxiliary objectives yields significantly better downstream performance than the widely used task-agnostic continued pre-training paradigm. The paper further proposes an online meta-learning algorithm that learns a set of multi-task weights to better balance the auxiliary objectives, further improving end-task performance and data efficiency.
In most settings of practical concern, machine learning practitioners know in advance what end-task they wish to boost with auxiliary tasks. However, widely used methods for leveraging auxiliary data like pre-training and its continued pre-training variant are end-task agnostic: they rarely, if ever, exploit knowledge of the target task. We study replacing end-task agnostic continued training of pretrained language models with end-task aware training of said models. We argue that for sufficiently important end-tasks, the benefits of leveraging auxiliary data in a task-aware fashion can justify forgoing the traditional approach of obtaining generic, end-task agnostic representations as with (continued) pre-training. On three different low-resource NLP tasks from two domains, we demonstrate that multi-tasking the end-task and auxiliary objectives results in significantly better downstream task performance than the widely-used task-agnostic continued pre-training paradigm of Gururangan et al. (2020). We next introduce an online meta-learning algorithm that learns a set of multi-task weights to better balance among our multiple auxiliary objectives, achieving further improvements on end-task performance and data efficiency.
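A minimal sketch of what multi-tasking an end-task with auxiliary objectives can look like: the total loss is a weighted sum of the individual losses. The softmax parameterization of the weights and the function names are assumptions for illustration. The paper's online meta-learning update for the weights is not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(
    end_task_loss: float,
    auxiliary_losses: Sequence[float],
    weight_logits: Sequence[float],
) -> Tuple[float, List[float]]:
    """Combine end-task and auxiliary losses with softmax-normalized weights.

    The first logit belongs to the end-task, the rest to the auxiliary
    objectives in order (an assumed convention).
    """
    losses = [end_task_loss, *auxiliary_losses]
    if len(weight_logits) != len(losses):
        raise ValueError("need one weight logit per loss term")
    weights = softmax(weight_logits)
    total = sum(w * l for w, l in zip(weights, losses))
    return total, weights


# Equal logits give equal weights, so the result is the mean of the losses.
total, weights = multitask_loss(2.0, [4.0, 6.0], [0.0, 0.0, 0.0])
```

In a meta-learning setup the logits would be adjusted during training so that auxiliary objectives that help the end-task receive more weight. Here they are fixed inputs.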
4、[LG] A backbone-centred energy function of neural networks for protein design
B Huang, Y Xu, X Hu, Y Liu...
[University of Science and Technology of China]
A backbone-centred neural network energy function for protein design. A protein backbone structure is designable if a substantial number of amino acid sequences can fold into it. It has been suggested that backbone designability is governed mainly by molecular interactions that are independent of side chains or insensitive to side chain type, which points to a way of designing new backbones (ready for amino acid selection) by continuous sampling and optimization of a backbone-centred energy surface. However, no sufficiently comprehensive and precise energy function has yet been built for this purpose. The paper shows that this goal is met by a statistical model called SCUBA (Side Chain-Unknown Backbone Arrangement), which uses energy terms in neural network form. The terms are learned in two steps, kernel density estimation followed by neural network training, and can analytically represent multidimensional, high-order correlations in known protein structures. The paper reports the crystal structures of nine de novo proteins whose backbones were designed to high precision with SCUBA, four of which have novel, non-natural overall architectures. By avoiding fragments of existing protein structures, SCUBA-driven structure design enables far-reaching exploration of the designable backbone space, expanding the novelty and diversity of proteins amenable to de novo design.
A protein backbone structure is designable if a substantial number of amino acid sequences exist that autonomously fold into it. It has been suggested that the designability of backbones is governed mainly by side chain-independent or side chain type-insensitive molecular interactions, indicating an approach for designing new backbones (ready for amino acid selection) based on continuous sampling and optimization of the backbone-centred energy surface. However, a sufficiently comprehensive and precise energy function has yet to be established for this purpose. Here we show that this goal is met by a statistical model named SCUBA (for Side Chain-Unknown Backbone Arrangement) that uses neural network-form energy terms. These terms are learned with a two-step approach that comprises kernel density estimation followed by neural network training and can analytically represent multidimensional, high-order correlations in known protein structures. We report the crystal structures of nine de novo proteins whose backbones were designed to high precision using SCUBA, four of which have novel, non-natural overall architectures. By eschewing use of fragments from existing protein structures, SCUBA-driven structure design facilitates far-reaching exploration of the designable backbone space, thus extending the novelty and diversity of the proteins amenable to de novo design.
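The first of the two learning steps, kernel density estimation, can be illustrated in one dimension. The sketch below estimates a density over made-up values of a single structural variable and converts it to an energy via the negative log density, a common convention for statistical potentials that the abstract does not confirm for SCUBA. The neural network fitting step, angle periodicity, and multidimensional correlations are all left out.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a 1-D Gaussian kernel density estimate built from `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def statistical_energy(
    density: Callable[[float], float], x: float, floor: float = 1e-12
) -> float:
    """Energy as negative log density, floored to avoid log(0)."""
    return -math.log(max(density(x), floor))


# Made-up observations of one backbone variable, in degrees (illustrative only).
observed = [-63.0, -60.0, -58.0, -65.0, -61.0]
density = gaussian_kde(observed, bandwidth=5.0)

energy_near = statistical_energy(density, -60.0)  # close to the observations
energy_far = statistical_energy(density, 60.0)    # far from any observation
```

Values close to what is seen in known structures get low energy and rarely seen values get high energy. An energy surface for backbone sampling would need this property.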
5、[CL] Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET
C Amrhein, R Sennrich
[University of Zurich]
Identifying weaknesses in machine translation metrics through Minimum Bayes Risk decoding: a case study for COMET. Neural metrics correlate impressively with human judgements in the evaluation of machine translation systems, but before such metrics can be safely optimized, we should be aware of (and ideally eliminate) biases towards bad translations that receive high scores. Experiments show that sample-based Minimum Bayes Risk decoding can be used to explore and quantify such weaknesses. Applying this strategy to COMET for en→de and de→en reveals that COMET models are not sensitive enough to discrepancies in numbers and named entities. The paper further shows that these biases cannot be fully removed simply by training on additional synthetic data.
Neural metrics have achieved impressive correlation with human judgements in the evaluation of machine translation systems, but before we can safely optimise towards such metrics, we should be aware of (and ideally eliminate) biases towards bad translations that receive high scores. Our experiments show that sample-based Minimum Bayes Risk decoding can be used to explore and quantify such weaknesses. When applying this strategy to COMET for en→de and de→en, we find that COMET models are not sensitive enough to discrepancies in numbers and named entities. We further show that these biases cannot be fully removed by simply training on additional synthetic data.
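Sample-based Minimum Bayes Risk decoding picks, from a set of sampled translations, the one with the highest average utility when the other samples are treated as pseudo-references. The sketch below uses a simple token-overlap F1 as a stand-in utility. The paper uses COMET in that role, and the function names and example sentences here are hypothetical.

```python
from typing import Callable, List


def token_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 between two whitespace-tokenized strings."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    remaining = list(ref)
    overlap = 0
    for token in hyp:
        if token in remaining:
            remaining.remove(token)  # each reference token matches at most once
            overlap += 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    samples: List[str], utility: Callable[[str, str], float] = token_f1
) -> str:
    """Return the sample with the highest mean utility against the others.

    Requires at least two samples.
    """

    def expected_utility(i: int) -> float:
        others = [s for j, s in enumerate(samples) if j != i]
        return sum(utility(samples[i], o) for o in others) / len(others)

    best = max(range(len(samples)), key=expected_utility)
    return samples[best]


samples = [
    "the meeting starts at 10 am",
    "the meeting starts at 10 am today",
    "the meeting begins at 10 am",
    "a cat sat on the mat",
]
choice = mbr_decode(samples)
```

Because the selected output is whatever the utility function likes best on average, any blind spot in that function, such as insensitivity to a changed number or name, surfaces directly in the chosen translations. This is what makes MBR decoding useful as a probe of the metric.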
Several other papers worth noting:
[CV] The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.5M Screening and Diagnostic Mammograms
The EMory BrEast imaging Dataset (EMBED): a racially diverse, granular dataset of 3.5 million screening and diagnostic mammograms
J J. Jeong, B L. Vey, A Reddy...
[Emory University & Georgia Institute of Technology & Kennesaw State University]
[CV] NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN
NÜWA-LIP: language-guided image inpainting with a defect-free VQGAN
M Ni, C Wu, H Huang, D Jiang, W Zuo, N Duan
[Harbin Institute of Technology & Microsoft Research Asia]
[CL] pNLP-Mixer: an Efficient all-MLP Architecture for Language
pNLP-Mixer: an efficient all-MLP architecture for language
F Fusco, D Pascual, P Staar
[IBM Research]
[CL] Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
Exploring the limits of domain-adaptive training for detoxifying large-scale language models
B Wang, W Ping, C Xiao, P Xu, M Patwary, M Shoeybi, B Li, A Anandkumar, B Catanzaro
[University of Illinois at Urbana-Champaign & NVIDIA]