Writing a Sequel to 《The Greatest Work》 with ModelArts [Playing with Huawei Cloud]

2022-07-28 10:14:00 [Huawei Cloud]

Preface

A while ago, Jay finally released his new album, and fans flooded the public chat with "sounds great"! 《The Greatest Work》, his first new album in 6 years, has been all over everyone's feeds lately, and to me it is YYDS ("the eternal god"). So I had a bold idea: first generate a word cloud (in a custom shape) from several songs on the new album, then use AI to generate a song in Jay's style based on the hot words in the lyrics. Enough talk, let's get started.

Word cloud for 《The Greatest Work》

1. Prepare the data

Here I import three songs from Jay's new album: 《The Greatest Work》, 《Pink Ocean》, and 《Still Wandering》. The lyrics data set is as follows:

2. Read the data

Find the ModelArts entry in the console:

Create a new notebook and choose the lowest configuration, 2 cores and 4 GB, which is enough. The cost is roughly 0.8 per hour. Remember to stop the instance when you are done; otherwise you will be charged a lot for nothing, like I was!

Once it has been created it looks like this; then start it and open it.

Create a new Pytorch-1.0.0 notebook and start running code.

The output is ：

3. Data preprocessing

Write the processed data to memory and convert the text of Jay's lyrics entirely into numbers.
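The text-to-number step can be sketched as below. This is a minimal illustration, not the article's own code: the helper name `create_lookup_tables` is an assumption, and a toy token sequence stands in for the word-segmented lyrics.

```python
from collections import Counter


def create_lookup_tables(words):
    """Build word -> id and id -> word tables, most frequent word first."""
    counts = Counter(words)
    # sorted() is stable, so words with equal counts keep first-appearance order.
    sorted_vocab = sorted(counts, key=counts.get, reverse=True)
    vocab_to_int = {word: idx for idx, word in enumerate(sorted_vocab)}
    int_to_vocab = {idx: word for word, idx in vocab_to_int.items()}
    return vocab_to_int, int_to_vocab


# Toy tokens standing in for the segmented lyrics (assumption).
tokens = "A B A C B F G".split()
vocab_to_int, int_to_vocab = create_lookup_tables(tokens)

# The whole text as a list of integers, ready to be fed to a model.
int_text = [vocab_to_int[word] for word in tokens]
print(int_text)  # [0, 1, 0, 2, 1, 3, 4]
```

The reverse table `int_to_vocab` is kept so that generated ids can be turned back into words later.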

4. Build the neural network

Import the dependencies and check whether a GPU is used for training

Build the input layer

Build the stacked RNN cells

Word Embedding

Add an Embedding layer to the model to reduce the dimensionality of the input words.

If we treat a word as the smallest unit of text, word embedding can be understood as a mapping: take a word in text space and, by some method, map (or embed) it into another numerical vector space. That is why it is called embedding, and this kind of representation usually also implies dimensionality reduction.

Put simply: given a document, that is, a sequence of words such as "A B A C B F G", we want a corresponding vector (usually a low-dimensional one) for each distinct word in it. For the sequence "A B A C B F G", we might end up with A mapped to [0.1 0.6 -0.5] and B mapped to [-0.2 0.9 0.7] (the values here are for illustration only).
The reason for turning every word into a vector is to make computation easier. For example, "find a synonym of word A" can be done by "finding the vector most similar to A's vector under cosine distance".
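The synonym lookup described above can be sketched in a few lines. The vectors for A and B are the illustrative values from the text; the vector for C and the helper names `cosine_similarity` and `most_similar` are assumptions made up for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word` by cosine."""
    query = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(query, embeddings[w]))


embeddings = {
    "A": [0.1, 0.6, -0.5],   # illustrative value from the text
    "B": [-0.2, 0.9, 0.7],   # illustrative value from the text
    "C": [0.2, 0.5, -0.4],   # made-up vector, chosen to lie close to A
}

print(most_similar("A", embeddings))  # C
```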

Build the neural network, connecting the RNN layer to the fully connected layer

Parameters:
---
cell: RNN cell
rnn_size: number of nodes in the RNN hidden layer
input_data: input tensor
vocab_size: vocabulary size
embed_dim: embedding layer size
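To make the data flow concrete (embedding lookup, then the recurrent layer, then the fully connected layer), here is a framework-free sketch of a forward pass. It is an assumption-laden toy rather than the article's model: it uses a single plain tanh RNN layer instead of stacked cells, omits biases, uses untrained random weights, and the `build_nn` and `forward` signatures are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weights; a real framework would train these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def build_nn(vocab_size, embed_dim, rnn_size, seed=0):
    rng = random.Random(seed)
    return {
        "embedding": make_matrix(vocab_size, embed_dim, rng),  # one row per word id
        "w_xh": make_matrix(rnn_size, embed_dim, rng),         # input -> hidden
        "w_hh": make_matrix(rnn_size, rnn_size, rng),          # hidden -> hidden
        "w_hy": make_matrix(vocab_size, rnn_size, rng),        # fully connected layer
    }


def forward(model, word_ids):
    """Return one logits vector per time step plus the final hidden state."""
    hidden = [0.0] * len(model["w_hh"])
    logits_per_step = []
    for word_id in word_ids:
        embedded = model["embedding"][word_id]  # embedding lookup
        from_input = matvec(model["w_xh"], embedded)
        from_state = matvec(model["w_hh"], hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        logits_per_step.append(matvec(model["w_hy"], hidden))
    return logits_per_step, hidden


model = build_nn(vocab_size=6, embed_dim=3, rnn_size=4)
logits, final_state = forward(model, [0, 1, 0, 2])
print(len(logits), len(logits[0]), len(final_state))  # 4 6 4
```

Each step produces `vocab_size` scores, one per candidate next word; training would adjust the weights so that the correct next word scores highest.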

5. Build batches

Here we build batches as follows. Given the sequence 1-20 and the parameters batch_size=3, seq_length=2, we want a four-dimensional array back. It is divided into three batches, and each batch contains the inputs and the corresponding target outputs. For example: get_batches([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 3, 2)
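One way to write such a `get_batches` is sketched below with plain nested lists. It assumes the targets are the inputs shifted by one position, and that the last target wraps around to the first token when no next token exists; both are assumptions, since the original implementation is not shown here.

```python
def get_batches(int_text, batch_size, seq_length):
    """Return a nested list shaped [n_batches][2][batch_size][seq_length].

    Index 0 of the second axis holds inputs, index 1 holds targets.
    """
    n_batches = len(int_text) // (batch_size * seq_length)
    used = n_batches * batch_size * seq_length  # leftover tokens are dropped

    inputs = int_text[:used]
    targets = int_text[1:used + 1]  # each target is the next token
    if len(targets) < used:
        targets.append(int_text[0])  # assumption: wrap around at the end

    row_len = n_batches * seq_length  # tokens per batch row across all batches
    batches = []
    for b in range(n_batches):
        start = b * seq_length
        x = [inputs[r * row_len + start: r * row_len + start + seq_length]
             for r in range(batch_size)]
        y = [targets[r * row_len + start: r * row_len + start + seq_length]
             for r in range(batch_size)]
        batches.append([x, y])
    return batches


batches = get_batches(list(range(1, 21)), 3, 2)
print(len(batches))  # 3
print(batches[0])    # [[[1, 2], [7, 8], [13, 14]], [[2, 3], [8, 9], [14, 15]]]
```

With 20 tokens, only 18 fit into 3 batches of 3 x 2, so the last two tokens are not used as inputs. Each row of a batch continues where the same row of the previous batch stopped.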

6. Train the model

Define the input parameters of the model

Start training

Print the results and save the model

Training output, 82 epochs in total

Get model training results

Generate the lyrics
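Generation with this kind of model usually means repeatedly picking the next word from the probabilities the network predicts. The sketch below shows only that sampling step, under stated assumptions: the helper name `pick_word`, the three-word vocabulary, and the fixed probabilities are made up, whereas a real run would take fresh probabilities from the trained model at every step.

```python
import random


def pick_word(probabilities, int_to_vocab, rng=random):
    """Sample one word id according to its predicted probability."""
    ids = list(range(len(probabilities)))
    word_id = rng.choices(ids, weights=probabilities, k=1)[0]
    return int_to_vocab[word_id]


# Hypothetical vocabulary and model output for a single step.
int_to_vocab = {0: "A", 1: "B", 2: "C"}
probabilities = [0.7, 0.2, 0.1]

rng = random.Random(42)  # fixed seed so repeated runs sample the same words
line = [pick_word(probabilities, int_to_vocab, rng) for _ in range(8)]
print(" ".join(line))
```

Sampling instead of always taking the most likely word keeps the output from looping on the same few words.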

Output results

emmmm, the generated lyrics are below. Objectively speaking, they feel mediocre and need optimizing, hahaha! Finally, if you like listening to Jay's songs, we are friends!

Summary

This article trains an RNN at word granularity, using word-segmented Chinese lyrics as the text. An Embedding layer is added to the model to reduce the dimensionality of the input words. Along the way I also learned how powerful embeddings are and looked up how they work. First of all, a computer does not understand words, so we want to express them as numbers; a common approach is to use each word's KEY (index) as its mapped number. Suppose the sentence 我爱你中国 ("I love you, China") is split character by character into ["I", "love", "you", "middle", "country"]. Mapped to numbers this gives [2, 3, 4, 5, 6], which is a one-dimensional vector. Now we want to do word embedding; representing each word with a 3-dimensional vector gives the following:

[[0.9212, 0.1181, 0.4291], representing "I"

[0.4388, 0.6217, 0.4416], representing "love"

…………]

Each word is represented by a vector, and each number in the vector is a feature that describes the word. The representation goes from the original one-dimensional 1*5 vector to a two-dimensional 5*3 matrix: 5 stands for the 5 words, and 3 means each word is described by 3 numeric features. This is the principle behind generating Jay's lyrics. The model still needs optimizing, though: the lyrics do not rhyme well enough. I will keep learning.
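The change of shape from 1*5 to 5*3 can be shown with a tiny lookup table. The ids are the ones from the text; the table values are random placeholders rather than learned embeddings, and the variable names are illustrative assumptions.

```python
import random

ids = [2, 3, 4, 5, 6]  # the five word ids from the text (a 1 x 5 vector)
embed_dim = 3          # three numeric features per word

# Placeholder table: one random row per id. Training would learn these values.
rng = random.Random(0)
embedding_table = {
    word_id: [round(rng.random(), 4) for _ in range(embed_dim)]
    for word_id in ids
}

# Looking up every id turns the 1 x 5 vector into a 5 x 3 matrix.
matrix = [embedding_table[word_id] for word_id in ids]
print(len(matrix), len(matrix[0]))  # 5 3
```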

Reference

pr0d1gy: Oh, good, ModelArts teaches you to write songs
