Engineering practice behind DALL·E 2: ensuring that the model's output complies with the content policy
2022-06-29 12:15:00 【Zhiyuan community】
To share the magic of DALL·E 2 with a broad audience, we needed to reduce the risks associated with powerful image generation models. To this end, we put various guardrails in place to prevent generated images from violating our content policy. This article focuses on pre-training mitigations, a subset of these guardrails that directly modify the data DALL·E 2 learns from. In particular, DALL·E 2 is trained on hundreds of millions of captioned images from the internet, and we remove and reweight some of these images to change what the model learns.

This article is organized into three sections, each describing a different pre-training mitigation. In the first section, we describe how we filtered violent and sexual images out of DALL·E 2's training dataset. Without this mitigation, the model would learn to produce graphic or explicit images when prompted for them, and might even return such images unintentionally in response to seemingly innocuous prompts. In the second section, we find that filtering the training data can amplify biases, and we describe our technique for mitigating this effect. For example, without this mitigation, we noticed that models trained on filtered data sometimes generated more images depicting men and fewer images depicting women than models trained on the original dataset. In the final section, we turn to the issue of memorization, finding that models like DALL·E 2 can sometimes reproduce images they were trained on rather than creating novel images. In practice, we found that this image regurgitation is caused by images that are replicated many times in the dataset, and we mitigate the issue by removing images that are visually similar to other images in the dataset.
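To make the last mitigation more concrete, here is a minimal sketch of near-duplicate removal. The text above only says that visually similar images are removed; it does not say how similarity is measured. The sketch therefore assumes each image already has an embedding vector, uses cosine similarity with a hypothetical threshold of 0.95, and applies a simple greedy pass. The names `cosine_similarity` and `deduplicate` are illustrative, not taken from the original work.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(embeddings, threshold=0.95):
    """Greedy pass: keep an image only if it is not too similar to any
    image that has already been kept. Returns the indices of kept images.

    The threshold is an assumed value for illustration only.
    """
    kept = []
    for i, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy embeddings (assumed; real ones would come from an image encoder).
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.01, 0.0],  # near-duplicate of image 0
    [0.0, 1.0, 0.0],
    [0.0, 0.98, 0.05],  # near-duplicate of image 2
]
kept_indices = deduplicate(embeddings)
print(kept_indices)
```

This greedy pass compares every image against every kept image, so its cost grows quadratically with dataset size. At the scale of hundreds of millions of images, some form of clustering or approximate nearest-neighbor search would likely be needed instead.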