CVPR 2022 | A creative and aesthetic text logo generation method that supports arbitrary input
2022-06-28 21:41:00 【Zhiyuan community】
This article briefly introduces the CVPR 2022 accepted paper "Aesthetic Text Logo Synthesis via Content-aware Layout Inferring". The paper explores automatic layout generation in the design of text logo images. Building on a conditional generative adversarial network (conditional GAN), it proposes a dual-discriminator structure and a differentiable composition module. The model infers the layout geometry of each glyph from the visual and semantic information of the input text, and then synthesizes the text logo image. The method can assist graphic design and other text-related visual tasks. The dataset and code for this work have been open-sourced (links below).
Paper: https://arxiv.org/abs/2204.02701
Dataset and code: https://github.com/yizhiwang96/TextLogoLayout
1. Research background
Designing a text logo depends heavily on the designer's creativity and experience, and a core problem is how to lay out each text element. Layout design has to account for many factors, such as glyph shape, text semantics, and theme. As shown in Figure 1, glyphs of different characters usually do not overlap; line or column breaks in Chinese logos usually fall after a complete word (token); words that carry emphasis are usually set at a larger size; and geometric transformations such as shearing and rotation can convey themes such as strength and joy, respectively. Most existing industry solutions rely on a set of easy-to-implement rules and lay out the text according to preset templates, so the results are often monotonous and lack creativity and aesthetic appeal. To address this, the paper proposes a content-aware text logo generation model that implicitly learns layout design rules from a large number of existing text logos, so that a new text logo can be generated for any input glyphs.

Figure 1: Common layout types in text logo images
2. Dataset
Training AI models usually requires a large amount of data, but no dataset existed for this task. To fill this gap, the paper introduces the TextLogo3K dataset. Using the Tencent Video platform, the authors collected and annotated 3,470 carefully selected text logo images taken from the posters of movies, TV series, and animations. As shown in Figures 2 and 3, the dataset annotates glyphs precisely at the pixel level, and also labels each glyph's bounding box and character category.

Figure 2: Annotation of logo images in TextLogo3K
The position and segmentation information of each logo in the original poster image is also provided:

Figure 3: Annotation of poster images in TextLogo3K
The dataset is free to use for academic research (commercial use is not permitted). Besides text logo generation, it can also be applied to text detection and recognition, artistic font generation, texture effect transfer, scene text editing, and related tasks.
3. Model design
The pipeline of the model is shown in the figure below:

Figure 4: Pipeline of the model
The model generates text logos with a conditional GAN. Its main novelty is a dual-discriminator structure (a sequence discriminator and an image discriminator) that judges the glyph trajectory sequence and the whole logo image separately. In addition, differentiable composition provides a differentiable rendering process from position coordinates to the logo image. The main steps are:
- First, the features of the input elements from two modalities (the visual features of the glyphs and the semantic features of the text) are encoded into a conditional feature.
- The coordinate generator takes the conditional feature and random noise as input and predicts position coordinates for each character, namely the center point of the glyph's bounding box together with its width and height.
- The position coordinates of the characters form a trajectory sequence, so a sequence discriminator is used to judge whether the sequence is real or fake given the condition. Note that the coordinate values in this task are continuous, which allows gradients to propagate back through the sequence discriminator.
- Differentiable composition merges the glyphs at their predicted positions to produce the logo image.
- An image discriminator is introduced as a complement to the sequence discriminator. It captures finer details of the logo image, helping to ensure that different glyphs do not overlap heavily and that the spacing between glyphs is reasonable.
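To make the composition step more concrete, here is a minimal sketch of one way to paste glyphs onto a canvas so that every pixel is a continuous function of the predicted box (center x, center y, width, height). It is not the authors' implementation: the function names `bilinear_sample` and `compose`, the use of bilinear sampling with zero padding, and the soft-union merge rule are all assumptions chosen for illustration, and the code uses plain Python lists rather than a deep learning framework.

```python
import math


def bilinear_sample(glyph, u, v):
    """Sample a glyph (rows of values in [0, 1]) at continuous pixel
    coordinates (u, v); positions outside the glyph contribute 0."""
    gh, gw = len(glyph), len(glyph[0])
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0

    def at(r, c):
        return glyph[r][c] if 0 <= r < gh and 0 <= c < gw else 0.0

    top = (1 - fx) * at(y0, x0) + fx * at(y0, x0 + 1)
    bottom = (1 - fx) * at(y0 + 1, x0) + fx * at(y0 + 1, x0 + 1)
    return (1 - fy) * top + fy * bottom


def compose(glyphs, boxes, canvas_h, canvas_w):
    """Render glyphs onto a canvas.

    boxes holds one (cx, cy, w, h) tuple per glyph, in canvas coordinates
    normalized to [0, 1] (an assumed convention for this sketch).
    """
    canvas = [[0.0] * canvas_w for _ in range(canvas_h)]
    for y in range(canvas_h):
        for x in range(canvas_w):
            # Normalized position of this canvas pixel's center.
            px, py = (x + 0.5) / canvas_w, (y + 0.5) / canvas_h
            background = 1.0
            for glyph, (cx, cy, w, h) in zip(glyphs, boxes):
                gh, gw = len(glyph), len(glyph[0])
                # Map the canvas pixel into the glyph's own pixel grid.
                u = (px - (cx - w / 2)) / w * gw - 0.5
                v = (py - (cy - h / 2)) / h * gh - 0.5
                # Soft union: ink from any glyph covers the background.
                background *= 1.0 - bilinear_sample(glyph, u, v)
            canvas[y][x] = 1.0 - background
    return canvas


if __name__ == "__main__":
    square = [[1.0, 1.0], [1.0, 1.0]]  # toy 2x2 "glyph"
    image = compose([square], [(0.5, 0.5, 0.5, 0.5)], 8, 8)
    for row in image:
        print(" ".join(f"{value:.2f}" for value in row))
```

Because the sampling is interpolated rather than rounded to integer pixel positions, shifting a box by a small amount changes the edge pixels by a small amount. That continuity is what would let an image-level loss, such as the one from an image discriminator, send gradients back to the coordinates when the same idea is written in an automatic differentiation framework.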