Music generation through deep neural network
2020-11-06 01:28:00 【Artificial intelligence meets panchuang】
Author: Ramya Vidiyala | Compiled by: VK | Source: Towards Data Science
Deep learning has improved many aspects of our lives, in ways both obvious and subtle. It plays a key role in movie recommendation systems, spam detection, and computer vision.
Although debate continues about deep learning being a black box and about the difficulty of training, it holds enormous potential in many fields, including medicine, virtual assistants, and e-commerce.
Deep learning can also work at the intersection of art and technology. To explore this idea further, in this article we will look at generating music with machine learning through a deep learning process, a field many consider beyond the reach of machines (and another interesting area of intense debate!).
Contents
- Musical representation for machine learning models
- Music dataset
- Data processing
- Model selection
- Many-to-many RNN
- Time-distributed fully connected layer
- Stateful
- Dropout layer
- Softmax layer
- Optimizer
- Music generation
- Summary
Musical representation for machine learning models
We will use ABC music notation. ABC notation is a shorthand form of musical notation that uses the letters A to G to represent notes, along with other elements for added values such as note length, key, and ornamentation.
This form of notation began as an ASCII character set code to facilitate sharing music online, adding a new, simple language for software developers that is easy to use. Below is an example of ABC music notation.
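As an illustration, a short tune written in ABC notation might look like the following (this particular tune is made up for illustration, not taken from the dataset):

X:1
T:Example Tune
M:4/4
L:1/8
R:reel
K:D
DFAd dAFD | GBeg gedB | DFAd dAFA | BdAF ED D2 :|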

The lines in part 1 of the notation show a letter followed by a colon. These indicate various aspects of the tune, such as the index when there are multiple tunes in a file (X:), the title (T:), the time signature (M:), the default note length (L:), the type of tune (R:), and the key (K:). The lines following the key designation represent the tune itself.
Music dataset
In this article, we will use the open-source ABC-notation data provided by the Nottingham Music Database. It contains more than 1,000 folk tunes, most of which have been converted to ABC notation: http://abc.sourceforge.net/NMD/
Data processing
The data is currently in a character-based categorical format. In the data-processing stage, we need to convert it to an integer-based numeric format to prepare it for the neural network.

Here, each character is mapped to a unique integer. This can be done with a single line of code. The "text" variable is the input data.
char_to_idx = { ch: i for (i, ch) in enumerate(sorted(list(set(text)))) }
To train the model, we convert the entire text data to numeric format using this vocabulary mapping.
import numpy as np

T = np.asarray([char_to_idx[c] for c in text], dtype=np.int32)
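The later snippets also use idx_to_char, vocab_size, and the constants BATCH_SIZE and SEQ_LENGTH, which the article does not define explicitly. A minimal sketch of how they might be set up follows; the batch and sequence sizes shown here are assumed values, not the author's.

idx_to_char = {i: ch for (ch, i) in char_to_idx.items()}   # reverse mapping: integer -> character
vocab_size = len(char_to_idx)                              # number of unique characters in the data
BATCH_SIZE = 16                                            # assumed number of parallel sequences per batch
SEQ_LENGTH = 64                                            # assumed number of characters per training sequence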
Model selection
Traditional machine learning models cannot store the previous stages of a sequence. Recurrent neural networks (commonly called RNNs) can, however, store previous stages.
An RNN has a repeating module that takes input from the previous stage and passes its output as input to the next stage. However, RNNs can only retain information from the most recent stages, so our network needs more memory to learn long-term dependencies. This is where long short-term memory networks (LSTMs) come in.
LSTMs are a special case of RNNs: they have the same chain-like structure as RNNs, but a different repeating-module structure.

We use an RNN here because:
- The length of the data does not need to be fixed; the length can vary for each input.
- Sequences can be stored.
- Various combinations of input and output sequence lengths can be used.
Beyond a plain RNN, we will customize it for our use case with a few adjustments. We will use a "character-level RNN", in which both the input and the output are individual characters.
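As a small, made-up illustration of the character-level setup: the target at each timestep is simply the next character of the text, so the input and target sequences are the same string shifted by one character.

# Character-level data: the network predicts the next character at every step.
text_chunk = "X:1\nT:Ex"
inputs = text_chunk[:-1]    # "X:1\nT:E"  -> fed to the network one character at a time
targets = text_chunk[1:]    # ":1\nT:Ex"  -> what the network should output at each step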

Many-to-many RNN
Because we need to generate an output at every timestep, we will use a many-to-many RNN. To implement this, we set the parameter "return_sequences" to true so that a character is generated at every timestep. You can understand this better by looking at figure 5 below.

In the figure above, the blue units are the inputs, the yellow units are the hidden units, and the green units are the outputs. This is a brief overview of a many-to-many RNN. For a deeper look at RNN sequences, here is a useful resource: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Time-distributed fully connected layer
To process the output at every timestep, we create a time-distributed fully connected layer: a dense layer applied, through a TimeDistributed wrapper, on top of the output generated at each timestep.
Stateful
By setting the parameter "stateful" to true, the state at the end of one batch is passed on as the initial state for the next batch. After combining all of these features, our model looks like the overview shown in figure 6.

The code snippet for the model architecture is as follows:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, TimeDistributed, Dense, Activation

model = Sequential()
model.add(Embedding(vocab_size, 512, batch_input_shape=(BATCH_SIZE, SEQ_LENGTH)))
for i in range(3):
    model.add(LSTM(256, return_sequences=True, stateful=True))
    model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(vocab_size)))
model.add(Activation('softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
I strongly recommend experimenting with the layers to improve performance.
Dropout layer
A dropout layer is a regularization technique that sets a fraction of the input units to zero at each update during training, which helps prevent overfitting.
Softmax layer
Music generation is a multi-class classification problem, where each class is a unique character from the input data. Therefore, we use a softmax layer and categorical cross-entropy as the loss function.
This layer gives the probability of each class; from these probabilities, the next character is chosen.
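As a small illustration (assuming probs is the softmax output for a single timestep; it is not a variable from the original code), the difference between always taking the most likely character and sampling from the distribution looks like this:

# probs: softmax output of shape (vocab_size,) for one timestep (assumed given here)
greedy_idx = int(np.argmax(probs))                     # always pick the single most likely character
sampled_idx = np.random.choice(len(probs), p=probs)    # draw a character according to the predicted probabilities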

Optimizer
To optimize our model, we use adaptive moment estimation, also known as Adam, since it is a good choice for RNNs.
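The article does not show the training loop itself. Below is a minimal sketch of how training might look, assuming T is the integer-encoded text from the data-processing step and BATCH_SIZE / SEQ_LENGTH are the constants used in the model definition; the batching scheme, checkpoint directory, and number of epochs are assumptions for illustration, not the author's exact code.

import os

MODEL_DIR = 'model_checkpoints'   # assumed checkpoint directory; the article does not define it
os.makedirs(MODEL_DIR, exist_ok=True)

def make_batches(T, batch_size, seq_length, vocab_size):
    # Split T into batch_size parallel streams; each batch is (batch_size, seq_length) inputs
    # with one-hot targets shifted by one character, matching stateful=True training.
    chars_per_stream = len(T) // batch_size
    steps = (chars_per_stream - 1) // seq_length
    for step in range(steps):
        X = np.zeros((batch_size, seq_length), dtype=np.int32)
        Y = np.zeros((batch_size, seq_length, vocab_size), dtype=np.float32)
        for row in range(batch_size):
            start = row * chars_per_stream + step * seq_length
            X[row] = T[start:start + seq_length]
            for t, next_char in enumerate(T[start + 1:start + seq_length + 1]):
                Y[row, t, next_char] = 1.0    # one-hot encoding of the next character
        yield X, Y

for epoch in range(100):                      # the number of epochs is illustrative
    for X, Y in make_batches(T, BATCH_SIZE, SEQ_LENGTH, vocab_size):
        loss, acc = model.train_on_batch(X, Y)
    model.reset_states()                      # stateful=True: clear the LSTM state between epochs
    print('epoch {}: loss={:.3f}, acc={:.3f}'.format(epoch + 1, loss, acc))
    # Save a checkpoint so that a file such as 'weights.100.h5' exists for the generation step below
    model.save_weights(os.path.join(MODEL_DIR, 'weights.{}.h5'.format(epoch + 1)))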

Music generation
So far, we have created an RNN model and trained it on the input data. During training, the model learns the patterns of the input data. We call this the "trained model".
The input size used in the trained model is the batch size. For generating music, however, the input is a single character. So we create a new model that is similar to the trained model but takes a single character as input, i.e. an input shape of (1, 1). Into this new model we load the weights of the trained model so that it reproduces the trained model's behavior.
model2 = Sequential()
model2.add(Embedding(vocab_size, 512, batch_input_shape=(1, 1)))
for i in range(3):
    model2.add(LSTM(256, return_sequences=True, stateful=True))
    model2.add(Dropout(0.2))
model2.add(TimeDistributed(Dense(vocab_size)))
model2.add(Activation('softmax'))
We load the weights of the trained model into the new model. This can be done with a single line of code.
import os

# MODEL_DIR is the directory where the training checkpoints were saved
model2.load_weights(os.path.join(MODEL_DIR, 'weights.100.h5'.format(epoch)))
model2.summary()

To generate music, we randomly choose the first character from the set of unique characters, use the character just generated to produce the next one, and so on. Repeating this process, we end up with a piece of music.

Here is a code snippet that does this.
sampled = []
for i in range(1024):
    batch = np.zeros((1, 1))
    if sampled:
        batch[0, 0] = sampled[-1]                      # feed back the character generated in the previous step
    else:
        batch[0, 0] = np.random.randint(vocab_size)    # start from a random character
    result = model2.predict_on_batch(batch).ravel()
    sample = np.random.choice(range(vocab_size), p=result)   # sample the next character
    sampled.append(sample)

print("sampled")
print(sampled)
print(''.join(idx_to_char[c] for c in sampled))        # decode the integer indices back to ABC text
Here are some of the generated music clips:
We used a machine learning neural network known as an LSTM to generate these pleasant-sounding music samples. Each piece is different, yet similar to the training data. These melodies can be used in many ways:
- To enhance artists' creativity through inspiration
- As a productivity tool for developing new ideas
- As a complement to an artist's work
- To finish unfinished pieces
- As standalone pieces of music
However, this model still needs improvement. Our training data contains only one instrument, the piano. One way to enhance the training data would be to add music from a variety of instruments. Another would be to include more genres, rhythms, and time signatures.
At present, our model produces a few false notes, and the music is not exceptional. We can reduce these errors and improve the quality of the music by enlarging the training dataset.
Summary
In this article, we looked at how to process music for use with neural networks, how deep learning models such as RNNs and LSTMs work, and how a model can be tuned to generate music. We can apply these concepts to other systems that generate other forms of art, such as landscapes or portraits.
Thanks for reading! If you want to experiment with this custom dataset yourself, you can download the annotated data and check out my code on GitHub: https://github.com/RamyaVidiyala/Generate-Music-Using-Neural-Networks
Link to the original article: https://towardsdatascience.com/music-generation-through-deep-neural-networks-21d7bd81496e