Continuing "The Greatest Work" with ModelArts [Play with Huawei Cloud]
2022-07-28 10:14:00 [Huawei Cloud]
Preface
Some time ago, Jay finally released his new album. Just type "sounds great" in the public chat! "The Greatest Work", Jay's first new album in 6 years, has been flooding everyone's feeds lately, and to me it is simply YYDS ("the eternal god"). So I had a bold idea: I generated a word cloud (in a special shape) from several songs on the new album, and then I wanted to use AI to generate a song in a similar style, based on Jay's style and the hot words in his lyrics. Enough talk, let's start developing.

Word cloud of "The Greatest Work"
1. Prepare the data
Here I import three songs from Jay's new album: "The Greatest Work", "Pink Ocean", and "Still Wandering". The lyrics data set is as follows:

2. Reading the data
Find ModelArts in the console: entrance

Create a new notebook. The lowest configuration, 2 cores and 4 GB, is enough, and it costs roughly 0.8 per hour. Remember to stop the instance when you are done, otherwise you will be charged a lot of money for nothing, like I was!
After the notebook is created successfully it looks like this. Start it and open it.

Create a new PyTorch-1.0.0 notebook file and start running code


The output is :

3. Data preprocessing
Load the processed data into memory and convert the text of Jay's lyrics entirely into numbers
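The original code for this step is not preserved here, so the following is only a minimal sketch of the idea: build a word-to-id lookup table and use it to turn segmented lyrics into numbers. The function name `build_lookup_tables` and the sample tokens are my own assumptions, not the code or data from the notebook.

```python
from collections import Counter


def build_lookup_tables(words):
    """Map each distinct word to an integer id and back.

    More frequent words get smaller ids; ties are broken alphabetically
    so the mapping is deterministic.
    """
    counts = Counter(words)
    vocab = sorted(counts, key=lambda w: (-counts[w], w))
    word_to_id = {word: i for i, word in enumerate(vocab)}
    id_to_word = {i: word for word, i in word_to_id.items()}
    return word_to_id, id_to_word


# Hypothetical, already-segmented sample text (not the real lyrics data set).
tokens = "the sea is pink and the sky is wide".split()

word_to_id, id_to_word = build_lookup_tables(tokens)
encoded = [word_to_id[w] for w in tokens]   # text -> numbers
decoded = [id_to_word[i] for i in encoded]  # numbers -> text

print(encoded)  # [1, 4, 0, 3, 2, 1, 5, 0, 6]
```

For Chinese lyrics the `split()` call would be replaced by a word-segmentation step, since the words are not separated by spaces.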

4. Building the neural network
Import the dependencies and check whether a GPU is used for training
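As a rough sketch of the GPU check, and assuming the PyTorch kernel chosen above (the original cell is not preserved, and it may well have used a different framework), `torch.cuda.is_available()` reports whether a GPU can be used. The helper name `detect_device` is hypothetical.

```python
def detect_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        # Imported lazily so this sketch also runs where PyTorch is missing.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = detect_device()
print("Training on:", device)
```

On the 2-core, 4 GB configuration this should report `cpu`, which is fine for a data set of three songs.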

Build the input layer

Build the stacked RNN cells

Word Embedding
Add an Embedding layer to the model to reduce the dimensionality of the input words

In plain terms: we are given a document, which is a sequence of words such as "A B A C B F G", and we want a vector representation (usually a low-dimensional one) for each distinct word in it. For the sequence "A B A C B F G", for example, we might end up with [0.1 0.6 -0.5] as the vector for A and [-0.2 0.9 0.7] as the vector for B (the values here are only for illustration).
The reason for turning every word into a vector is to make computation easy. For example, "find a synonym of word A" can be done by "finding the vector most similar to A's vector under cosine distance".
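The synonym lookup described above can be sketched in a few lines of plain Python. The vectors for A and B are the illustrative values from the text; the vector for C and the function names are made up for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word` by cosine."""
    query = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(query, embeddings[w]))


embeddings = {
    "A": [0.1, 0.6, -0.5],   # illustrative value from the text
    "B": [-0.2, 0.9, 0.7],   # illustrative value from the text
    "C": [0.2, 0.5, -0.4],   # hypothetical extra word
}

nearest = most_similar("A", embeddings)
print(nearest)  # C
```

C wins because its vector points in almost the same direction as A's, while B differs strongly in the third component.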
Build the neural network, connecting the RNN layer to the fully connected layer
Parameters:
---
cell: RNN cell
rnn_size: number of nodes in the RNN hidden layer
input_data: input tensor
vocab_size: vocabulary size
embed_dim: embedding layer size

5. Building batches
Here we build batches as follows. Given the sequence 1-20 and the parameters batch_size=3, seq_length=2, we want to get back a four-dimensional array. It is divided into three batches, and each batch contains the input and the corresponding target output. For example: get_batches([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 3, 2)
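Since the original `get_batches` code is not preserved here, below is one possible sketch of it in plain Python, returning nested lists of shape (number of batches, 2, batch_size, seq_length). Two choices are my own assumptions: leftover words that do not fill a whole batch are dropped, and the very last target wraps around to the first input.

```python
def get_batches(int_text, batch_size, seq_length):
    """Split a sequence of word ids into (input, target) batches.

    Returns nested lists shaped (n_batches, 2, batch_size, seq_length).
    """
    words_per_batch = batch_size * seq_length
    n_batches = len(int_text) // words_per_batch
    if n_batches == 0:
        raise ValueError("sequence is too short for one full batch")

    # Keep only full batches (assumption: the leftover words are dropped).
    inputs = list(int_text[:n_batches * words_per_batch])
    # Targets are the inputs shifted by one word; the last target wraps
    # around to the first input (assumption).
    targets = inputs[1:] + inputs[:1]

    # Each of the batch_size rows walks through its own contiguous slice.
    row_len = n_batches * seq_length
    batches = []
    for b in range(n_batches):
        x, y = [], []
        for r in range(batch_size):
            start = r * row_len + b * seq_length
            x.append(inputs[start:start + seq_length])
            y.append(targets[start:start + seq_length])
        batches.append([x, y])
    return batches


batches = get_batches(list(range(1, 21)), 3, 2)
for x, y in batches:
    print("input:", x, "target:", y)
# input: [[1, 2], [7, 8], [13, 14]] target: [[2, 3], [8, 9], [14, 15]]
# input: [[3, 4], [9, 10], [15, 16]] target: [[4, 5], [10, 11], [16, 17]]
# input: [[5, 6], [11, 12], [17, 18]] target: [[6, 7], [12, 13], [18, 1]]
```

With 20 words and 6 words per batch, only 18 words fit into the three batches, so 19 and 20 are not used. Each row continues where it left off in the previous batch, which lets the RNN state carry over between batches.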

6. Model training
Define the input parameters of the model

Start training

Print the results and save the model

Training output, 82 epochs in total

Get model training results

Lyrics data generation

Output results

Emmmm, these are the generated lyrics. Judged objectively, they feel so-so and the model needs optimizing, hahahaha! Finally, if you like listening to Jay's songs, we are friends!

Summary
This article trains an RNN at word granularity, using word-segmented Chinese lyrics as the text. An Embedding layer is added to the model to reduce the dimensionality of the input words. Along the way I learned how powerful embeddings are, and I looked up how they work.
First of all, computers do not understand words, so we want to express them with numbers. A common way is to use each word's key (its index in the vocabulary) as the mapped number. Suppose the sentence "I love you, China" is split into five tokens: "I", "love", "you", and the two characters of "China". Mapped to numbers they become [2, 3, 4, 5, 6], which is a one-dimensional vector. Now we want to apply word embedding and represent each token with a 3-dimensional vector. The result looks like this:
[[0.9212, 0.1181, 0.4291],  represents "I"
 [0.4388, 0.6217, 0.4416],  represents "love"
 ……]
Each word is represented by a vector, and each number in the vector is a feature that describes the word. The data changes from a one-dimensional 1*5 vector into a two-dimensional 5*3 matrix: 5 stands for the 5 words, and 3 means each word is described by 3 numeric features. This is the principle behind generating Jay's lyrics. The model still needs to be optimized, though, because the lyrics do not rhyme well enough. I will keep learning.
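To make the 1*5 to 5*3 change concrete, here is a small sketch of an embedding lookup in plain Python. The ids [2, 3, 4, 5, 6] come from the example above; the vocabulary size of 7, the random initial values, and the function name are assumptions for illustration only. In a real model the table is learned during training.

```python
import random


def make_embedding_table(vocab_size, embed_dim, seed=0):
    """Create a vocab_size x embed_dim table of random initial values."""
    rng = random.Random(seed)
    return [[round(rng.random(), 4) for _ in range(embed_dim)]
            for _ in range(vocab_size)]


word_ids = [2, 3, 4, 5, 6]  # the 1*5 vector of word ids from the example

# Hypothetical vocabulary of 7 entries so that ids 0..6 are valid rows.
table = make_embedding_table(vocab_size=7, embed_dim=3)

# Embedding lookup: each id selects one row of the table.
embedded = [table[i] for i in word_ids]

print(len(embedded), "x", len(embedded[0]))  # 5 x 3
```

The lookup is nothing more than row selection, which is why an Embedding layer is cheap even for a large vocabulary.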
Reference
pr0d1gy : Oh, good ModelArts Teach you to write songs