Muzic: Music Understanding and Generation with Artificial Intelligence

Overview



Muzic is a research project on AI music that empowers music understanding and generation with deep learning and artificial intelligence. Muzic is pronounced as [ˈmjuːzeik] and '谬贼客' (in Chinese). Besides the image version of the logo (see above), Muzic also has a video version. Muzic was started by researchers from Microsoft Research Asia.


We summarize the scope of our Muzic project in the following figure:


The current work in Muzic includes MusicBERT, PDAugment, DeepRapper, SongMASS, and TeleMelody.

Requirements

The operating system is Linux. We have tested on Ubuntu 16.04.6 LTS with Python 3.6.12. The requirements for running Muzic are listed in requirements.txt. To install them, run:

pip install -r requirements.txt

We initially release the code of five research works: MusicBERT, PDAugment, DeepRapper, SongMASS, and TeleMelody. You can find a README in each corresponding folder with detailed instructions on how to use the code.

Reference

If you find the Muzic project useful in your work, please cite the corresponding papers:

  • MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training, Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu, ACL 2021.
  • PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription, Chen Zhang, Jiaxing Yu, Luchin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang, arXiv 2021.
  • DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling, Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu, ACL 2021.
  • SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint, Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin, AAAI 2021.
  • TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method, Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu, arXiv 2021.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Comments
  • [MusicBERT]: Could not infer model type from Namespace (eval_genre.py)


    Hello!

    I'm trying to run the evaluation script for the genre classification task using the command python -u eval_genre.py checkpoints/checkpoint_last_musicbert_small.pt topmagd_data_bin/x, and I'm getting the error below when running RobertaModel.from_pretrained:

    Traceback (most recent call last):
      File "eval_genre.py", line 39, in <module>
        user_dir='musicbert'
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/models/roberta/model.py", line 251, in from_pretrained
        **kwargs,
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/hub_utils.py", line 75, in from_pretrained
        arg_overrides=kwargs,
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/checkpoint_utils.py", line 353, in load_model_ensemble_and_task
        model = task.build_model(cfg.model)
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/tasks/fairseq_task.py", line 567, in build_model
        model = models.build_model(args, self)
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/models/__init__.py", line 93, in build_model
        + model_type
    AssertionError: Could not infer model type from Namespace(_name='roberta_small', activation_dropout=0.0, activation_fn='gelu', adam_betas='(0.9,0.98)', adam_eps=1e-06, all_gather_list_size=16384, arch='roberta_small', attention_dropout=0.1, azureml_logging=False, batch_size=8, batch_size_valid=8, best_checkpoint_metric='loss', bf16=False, bpe='gpt2', broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='_bar_roberta_small', clip_norm=0.0, cpu=False, criterion='masked_lm', curriculum=0, data='topmagd_data_bin/0/input0', data_buffer_size=10, dataset_impl=None, ddp_backend='c10d', device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=8, distributed_wrapper='DDP', dropout=0.1, empty_cache_freq=0, encoder_attention_heads=8, encoder_embed_dim=512, encoder_ffn_embed_dim=2048, encoder_layerdrop=0, encoder_layers=4, encoder_layers_to_keep=None, end_learning_rate=0.0, eos=2, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, freq_weighted_replacement=False, gen_subset='test', heartbeat_timeout=-1, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_last_epochs=-1, leave_unmasked_prob=0.1, load_checkpoint_heads=True, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_format='simple', log_interval=100, lr=[0.0005], lr_scheduler='polynomial_decay', mask_multiple_length=1, mask_prob=0.15, mask_stdev=0.0, mask_whole_words=False, max_epoch=0, max_positions=8192, max_tokens=None, max_tokens_valid=None, max_update=125000, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, model_parallel_size=1, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, nprocs_per_node=8, num_shards=1, num_workers=1, optimizer='adam', optimizer_overrides='{}', pad=1, patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, pooler_activation_fn='tanh', pooler_dropout=0.0, power=1.0, profile=False, quant_noise_pq=0, quant_noise_pq_block_size=8, quant_noise_scalar=0, quantization_config_path=None, random_token_prob=0.1, required_batch_size_multiple=8, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=True, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoints/checkpoint_last_bar_roberta_small.pt', sample_break_mode='complete', save_dir='checkpoints', save_interval=1, save_interval_updates=0, scoring='bleu', seed=1, sentence_avg=False, shard_id=0, shorten_data_split_list='', shorten_method='none', skip_invalid_size_inputs_valid_test=False, slowmo_algorithm='LocalSGD', slowmo_momentum=None, spectral_norm_classification_head=False, stop_min_lr=-1.0, stop_time_hours=0, task='masked_lm', tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, tokens_per_sample=8192, total_num_update='125000', tpu=False, train_subset='train', unk=3, untie_weights_roberta=False, update_freq=[4], use_bmuf=False, 
use_old_adam=False, user_dir='musicbert', valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, wandb_project=None, warmup_updates=25000, weight_decay=0.01, zero_sharding='none'). Available models: dict_keys(['transformer_lm', 'wav2vec', 'wav2vec2', 'wav2vec_ctc', 'wav2vec_seq2seq']) Requested model type: roberta_small
    

    Environment

    • python: Python 3.6.13 :: Anaconda, Inc.
    • fairseq: git+https://github.com/pytorch/[email protected]#egg=fairseq

    Thanks in advance!

    Edit: When running the above command with the base checkpoint, I get the following:

    Traceback (most recent call last):
      File "eval_genre.py", line 39, in <module>
        user_dir='musicbert'
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/models/roberta/model.py", line 251, in from_pretrained
        **kwargs,
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/hub_utils.py", line 75, in from_pretrained
        arg_overrides=kwargs,
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/checkpoint_utils.py", line 355, in load_model_ensemble_and_task
        model.load_state_dict(state["model"], strict=strict, model_cfg=cfg.model)
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/models/fairseq_model.py", line 115, in load_state_dict
        return super().load_state_dict(new_state_dict, strict)
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for RobertaModel:
            Unexpected key(s) in state_dict: "encoder.sentence_encoder.downsampling.0.weight", "encoder.sentence_encoder.downsampling.0.bias", "encoder.sentence_encoder.upsampling.0.weight", "encoder.sentence_encoder.upsampling.0.bias".
    

    I don't know if I messed something up; I'd appreciate any help!
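
    A hedged diagnostic sketch (not from the repo): the "Available models" list in the error contains no roberta variant at all, which suggests the architectures registered by --user-dir (and much of fairseq itself) were never imported. Replicating what --user-dir does makes any registration failure visible; the relative path 'musicbert' is an assumption and must be resolvable from the working directory:

    import argparse
    from fairseq import utils
    from fairseq.models import ARCH_MODEL_REGISTRY

    # Import the plugin package the same way fairseq's --user-dir would.
    utils.import_user_module(argparse.Namespace(user_dir='musicbert'))

    # After a successful import, the architecture named in the checkpoint
    # ('roberta_small') should appear in the registry.
    print('roberta_small' in ARCH_MODEL_REGISTRY)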

    opened by aspil 13
  • Missing argument


    When I run bash generate.sh, the call to get_sentence_pinyin_finals() passes only "raw_text", but the function is defined with two parameters:

    TypeError: get_sentence_pinyin_finals() missing 1 required positional argument: 'invalids_finals'

    opened by pikapi111 10
  • [teleMelody] How to import lyric to the generated midi sample?


    Hi @jzq2000, I have a quick question about inference. In the examples at https://ai-muzic.github.io/telemelody/, I notice that each sample contains both the media and the lyrics. From the code, we can generate the MIDI file given the lyrics. So how do we match them together? Thank you!

    opened by elricwan 9
  • MusicBERT small/base model error


    When I run $ bash train_mask.sh lmd_full small with the provided small model, fairseq raises this error:

    -- Process 3 terminated with the following error:

    Traceback (most recent call last):
      File "/usr/local/anaconda3/envs/pyg/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
        fn(i, *args)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/distributed_utils.py", line 270, in distributed_main
        main(args, **kwargs)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq_cli/train.py", line 114, in main
        disable_iterator_cache=task.has_sharded_data("train"),
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/checkpoint_utils.py", line 193, in load_checkpoint
        reset_meters=reset_meters,
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/trainer.py", line 279, in load_checkpoint
        state = checkpoint_utils.load_checkpoint_to_cpu(filename)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/checkpoint_utils.py", line 232, in load_checkpoint_to_cpu
        state = _upgrade_state_dict(state)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/checkpoint_utils.py", line 436, in _upgrade_state_dict
        registry.set_defaults(state["args"], tasks.TASK_REGISTRY[state["args"].task])
    AttributeError: 'NoneType' object has no attribute 'task'
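
    A hedged debugging sketch, grounded only in the traceback above: _upgrade_state_dict fails because state["args"] is None, which typically happens when a checkpoint was saved by a newer, config-based fairseq than the one loading it. Inspecting the checkpoint shows which format it carries (the checkpoint path here is an assumption):

    import torch

    state = torch.load('checkpoints/checkpoint_last_musicbert_small.pt', map_location='cpu')
    # Older fairseq stores a Namespace under 'args'; newer fairseq stores a dict under 'cfg'.
    print(type(state.get('args')), type(state.get('cfg')))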

    opened by jammyWolf 7
  • [MusicBERT] Restriction to 1002 octuples when using `preprocess.encoding_to_str`


    Hi once again!

    While preprocessing a MIDI file, I noticed that the MIDI_to_encoding method performs as intended and converts the sample song to 106 bars, judging from the resultant octuples (please correct me if I'm wrong).

    However, the encoding_to_str method restricts the result to just 18 bars (as can be concluded from the <0-18> bar token near the end of the encoded string).


    More generally, what I have noticed across multiple MIDI files is that only up to the first 1000 octuples (i.e., start token octuple + 1000 note octuples + end token octuple = 1002 * 8 = 8016 tokens) are kept.


    Is there any way to change encoding_to_str to get the whole song instead, up to the 256 bars the model vocabulary is restricted to? I am not familiar enough with miditoolkit or mido to understand the code properly as of now, else I would have tried to fix this myself.

    Thanks in advance!

    Edit: I am aware that the musicbert_base model can only support up to 8192 tokens as the final input to the MusicBERT encoder, but that does not seem to be the issue here.
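
    A hedged pointer rather than a confirmed fix: the 1000-note cap looks like a preprocessing window rather than a model limit, and it plausibly comes from a module-level constant in musicbert/preprocess.py (sample_len_max, assumed to default to 1000). If encoding_to_str reads that global at call time, raising it keeps more octuples, still subject to the 8192-token encoder input bound:

    import miditoolkit
    import preprocess  # muzic/musicbert/preprocess.py

    preprocess.sample_len_max = 4000  # assumed constant; default believed to be 1000

    midi_obj = miditoolkit.midi.parser.MidiFile('song.mid')
    enc = preprocess.MIDI_to_encoding(midi_obj)
    print(len(enc), 'octuples before string conversion')
    s = preprocess.encoding_to_str(enc)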

    opened by tripathiarpan20 6
  • Problems for inference stage


    I succeeded in the data and training stages but met some problems in inference. I met a similar problem in the training stage and used a single GPU to avoid it; however, that didn't work for the inference stage. My environment is Ubuntu 16.04, CUDA 10.2, and Python 3.6.12, with everything else following requirements.txt. Happy to hear your reply.

    opened by DrWelles 6
  • Cannot install dependencies: No matching distribution found for fairseq==0.10.2


    I cannot install the dependencies when I run pip install -r requirements.txt. The main error message is as follows:

    ...
    ERROR: Could not find a version that satisfies the requirement fairseq==0.10.2 (from versions: 0.6.1, 0.6.2, 0.7.1, 0.7.2, 0.8.0, 0.9.0, 0.10.0, 0.10.1, 0.10.2)
    ERROR: No matching distribution found for fairseq==0.10.2
    

    My development environment:

    • python 3.9.7
    • pip 21.2.4
    • macos 11.5.2
    opened by Bin-Huang 6
  • [MusicBERT]: How to fill masked tokens in an input sequence after training?


    Hello again,

    I have fine-tuned MusicBERT on masked language modeling using a custom dataset. I have loaded the fine-tuned checkpoint using:

    roberta = RobertaModel.from_pretrained( # MusicBERTModel.from_pretrained also works
        '.',
        checkpoint_file=sys.argv[1],
        data_name_or_path=sys.argv[2],
        user_dir='musicbert'
    )
    

    What I want to do is to give it an input sequence, mask one or more tokens before passing the input to the model and somehow predict them. Something like masked language modeling, but with control over which tokens I want to mask and predict.

    What I cannot understand is what format the input sequence should be in to be passed to the model, and how to make the model predict the masked tokens. I have tried to replicate this by looking at fairseq's training code, since I want to do something similar, but it's too complicated.

    Thanks in advance.
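
    A minimal sketch of one possible approach, not an official MusicBERT API: encode a token string with the task dictionary, overwrite chosen positions with the mask symbol, and take the argmax of the LM-head logits. The sequence placeholder, the positions, and the assumption that roberta.model returns vocabulary logits as its first output are all unverified here:

    import torch

    seq = '<s> ...'  # hypothetical encoded octuple string, e.g. from preprocess.encoding_to_str
    vocab = roberta.task.source_dictionary
    tokens = vocab.encode_line(seq, append_eos=True, add_if_not_exist=False).long()

    positions = [8, 9]                     # hypothetical indices to mask and predict
    tokens[positions] = vocab.index('<mask>')

    roberta.eval()
    with torch.no_grad():
        logits, _ = roberta.model(tokens.unsqueeze(0))  # features_only defaults to False
    predictions = logits[0, positions].argmax(dim=-1)
    print([vocab[i] for i in predictions])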

    opened by aspil 5
  • [teleMelody] How does lyric and chord corresponds with each other?


    Hi there, from the test example, how do the lyrics and chords correspond to each other? I thought every [sep] represents one chord, but apparently the numbers do not match. If we match every word to one chord, the numbers do not match either (see the quick count sketched after the examples). Here is the example: en:

    this thing called love and i just [sep] han -dle it [sep] this thing called love and i must get [sep] round to it [sep] i rea -dy [sep]
    
    Eb:maj7 Eb:maj7 C:dim C:dim C:dim C:dim C:dim C:dim
    

    ch:

    斑 驳 的 夜 色 在 说 什 么 [sep] 谁 能 告 诉 我 如 何 选 择 [sep] 每 当 我 想 起 分 离 时 刻 [sep] 悲 伤 就 逆 流 成 河 [sep] 你 给 的 温 暖 属 于 谁 呢 [sep] 谁 又 会 在 乎 我 是 谁 呢 [sep] 每 当 我 想 起 你 的 选 择 [sep] 悲 伤 就 逆 流 成 河 [sep] 失 去 了 你 也 是 种 获 得 [sep] 一 个 人 孤 单 未 尝 不 可 [sep] 每 当 我 深 夜 辗 转 反 侧 [sep] 悲 伤 就 逆 流 成 河 [sep] 离 开 你 也 是 一 种 快 乐 [sep] 没 人 说 一 定 非 爱 不 可 [sep] 想 问 你 双 手 是 否 温 热 [sep] 悲 伤 就 逆 流 成 河 [sep] 我 想 是 因 为 我 太 天 真 [sep] 难 过 是 因 为 我 太 认 真 [sep] 每 当 我 想 起 你 的 眼 神 [sep] 悲 伤 就 逆 流 成 河 [sep]
    
    C:m7 C:m7 G:m7 Bb: C:m7 F:m7 C:m7 C:m7 G:m7 Bb: C:m7 F:m7 C:m C:m G:m7 F:m7 C:m7 F:m7 C:m C:m G:m7 C:m7 C:m7 F:m7 C:m7 C:m7 G:m7 G:m7 C:m F:m7 C:m7
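
    A quick count of the English example above, runnable as-is, makes the mismatch concrete (5 [sep]-delimited segments vs. 8 chords):

    lyrics = ("this thing called love and i just [sep] han -dle it [sep] "
              "this thing called love and i must get [sep] round to it [sep] i rea -dy [sep]")
    chords = "Eb:maj7 Eb:maj7 C:dim C:dim C:dim C:dim C:dim C:dim"

    print(lyrics.count('[sep]'))   # 5 lyric segments
    print(len(chords.split()))     # 8 chords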
    
    opened by elricwan 4
  • Bump pillow from 8.3.1 to 9.0.1 in /relyme


    Bumps pillow from 8.3.1 to 9.0.1.

    Release notes

    Sourced from pillow's releases.

    9.0.1

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.1.html

    Changes

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [@​radarhere, @​hugovk]
    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    9.0.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.0.1 (2022-02-03)

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [radarhere, hugovk]

    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    9.0.0 (2022-01-02)

    • Restrict builtins for ImageMath.eval(). CVE-2022-22817 #5923 [radarhere]

    • Ensure JpegImagePlugin stops at the end of a truncated file #5921 [radarhere]

    • Fixed ImagePath.Path array handling. CVE-2022-22815, CVE-2022-22816 #5920 [radarhere]

    • Remove consecutive duplicate tiles that only differ by their offset #5919 [radarhere]

    • Improved I;16 operations on big endian #5901 [radarhere]

    • Limit quantized palette to number of colors #5879 [radarhere]

    • Fixed palette index for zeroed color in FASTOCTREE quantize #5869 [radarhere]

    • When saving RGBA to GIF, make use of first transparent palette entry #5859 [radarhere]

    • Pass SAMPLEFORMAT to libtiff #5848 [radarhere]

    • Added rounding when converting P and PA #5824 [radarhere]

    • Improved putdata() documentation and data handling #5910 [radarhere]

    • Exclude carriage return in PDF regex to help prevent ReDoS #5912 [hugovk]

    • Fixed freeing pointer in ImageDraw.Outline.transform #5909 [radarhere]

    ... (truncated)

    Commits
    • 6deac9e 9.0.1 version bump
    • c04d812 Update CHANGES.rst [ci skip]
    • 4fabec3 Added release notes for 9.0.1
    • 02affaa Added delay after opening image with xdg-open
    • ca0b585 Updated formatting
    • 427221e In show_file, use os.remove to remove temporary images
    • c930be0 Restrict builtins within lambdas for ImageMath.eval
    • 75b69dd Dont need to pin for GHA
    • cd938a7 Autolink CWE numbers with sphinx-issues
    • 2e9c461 Add CVE IDs
    • Additional commits viewable in compare view




    dependencies 
    opened by dependabot[bot] 4
  • Bump numpy from 1.21.3 to 1.22.0 in /relyme


    Bumps numpy from 1.21.3 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)





    dependencies 
    opened by dependabot[bot] 4
  • [Museformer] Could not override 'task.dataset_impl'


    Hello. Following the instructions in the README.md, I create a virtual environment in Linux on WSL 2, download the model checkpoint, run tgen/generation__mf-lmd6remi-x.sh, and get the error hydra.errors.ConfigCompositionException: Could not override 'task.dataset_impl'. To append to your config use +task.dataset_impl=null

    I'm running the model on a 3080 Ti

    opened by artyomche9 1
  • [MeloForm] Why are all token ids in the dictionary 0?


    I notice that the gen_dictionary function uses the variable 'num' to represent each token's id, but 'num' stays 0 for the whole process, so all token ids in the dictionary are 0. Is that a bug, or does it make sense?
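
    A hedged observation, not a confirmed answer: in fairseq's dictionary file format, each line is "<symbol> <count>", where the number is a frequency used for sorting and thresholding, while token ids are assigned by line order at load time. If gen_dictionary writes that format, an all-zero second column would still load into distinct ids:

    from fairseq.data import Dictionary

    d = Dictionary.load('dict.txt')            # path is an assumption
    # Hypothetical symbols; ids differ because they sit on different lines.
    print(d.index('<0-0>'), d.index('<0-1>'))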

    opened by punkcure 1
  • [Museformer] TypeError: forward() got multiple values for argument 'key_padding_mask'


    Hi, I run bash ttrain/mf-lmd6remi-1.sh and bash tval/val__mf-lmd6remi-x.sh 1 checkpoint_best.pt 10240 in museformer and get:

    2022-11-19 13:04:11 | WARNING | fairseq.tasks.fairseq_task | 844 samples have invalid sizes and will be skipped, max_positions=1024, first few sample ids=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    2022-11-19 13:04:11 | INFO | fairseq.trainer | begin training epoch 1
    Traceback (most recent call last):
      File "/home/hyc/anaconda3/envs/musefo/bin/fairseq-train", line 8, in <module>
        sys.exit(cli_main())
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq_cli/train.py", line 352, in cli_main
        distributed_utils.call_main(args, main)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/distributed_utils.py", line 283, in call_main
        torch.multiprocessing.spawn(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
        return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
        while not context.join():
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
        raise ProcessRaisedException(msg, error_index, failed_process.pid)
    torch.multiprocessing.spawn.ProcessRaisedException:

    -- Process 2 terminated with the following error:

    Traceback (most recent call last):
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
        fn(i, *args)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/distributed_utils.py", line 270, in distributed_main
        main(args, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq_cli/train.py", line 125, in main
        valid_losses, should_stop = train(args, trainer, task, epoch_itr)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq_cli/train.py", line 208, in train
        log_output = trainer.train_step(samples)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/trainer.py", line 480, in train_step
        loss, sample_size_i, logging_output = self.task.train_step(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/tasks/fairseq_task.py", line 416, in train_step
        loss, sample_size, logging_output = criterion(model, sample)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/criterions/cross_entropy.py", line 35, in forward
        net_output = model(**sample["net_input"])
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/models/fairseq_model.py", line 481, in forward
        return self.decoder(src_tokens, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder.py", line 413, in forward
        x, extra = self.extract_features(
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder.py", line 645, in extract_features
        (sum_x, reg_x), inner_states = self.run_layers(
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder.py", line 731, in run_layers
        x, _ = layer(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder_layer.py", line 413, in forward
        x, attn = self.run_self_attn(
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder_layer.py", line 486, in run_self_attn
        r, weight = self.self_attn(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
    TypeError: forward() got multiple values for argument 'key_padding_mask'

    fairseq is 0.10.2, torch is 1.8.0, Python is 3.8. Does anyone know why?

    opened by hhhyc333 1
  • Error when running ttrain/mf-lmd6remi-1.sh 


    Currently I'm trying to implement museformer.

    I don't know how to deal with errors in the model learning stage. (I fixed the errors leading up to this point.)

    ttrain/mf-lmd6remi-1.sh

    ...................

    
    2022-11-11 07:08:28 | WARNING | fairseq.tasks.fairseq_task | 290 samples have invalid sizes and will be skipped, max_positions=1024, first few sample ids=[0, 1, 2, 4, 5, 6, 7, 8, 9, 10]
    2022-11-11 07:08:28 | INFO | fairseq.trainer | begin training epoch 1
    Traceback (most recent call last):
      File "/usr/local/bin/fairseq-train", line 8, in <module>
        sys.exit(cli_main())
      File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 352, in cli_main
        distributed_utils.call_main(args, main)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/distributed_utils.py", line 301, in call_main
        main(args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 125, in main
        valid_losses, should_stop = train(args, trainer, task, epoch_itr)
      File "/usr/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 208, in train
        log_output = trainer.train_step(samples)
      File "/usr/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/trainer.py", line 480, in train_step
        loss, sample_size_i, logging_output = self.task.train_step(
      File "/usr/local/lib/python3.8/dist-packages/fairseq/tasks/fairseq_task.py", line 416, in train_step
        loss, sample_size, logging_output = criterion(model, sample)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/criterions/cross_entropy.py", line 35, in forward
        net_output = model(**sample["net_input"])
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/models/fairseq_model.py", line 481, in forward
        return self.decoder(src_tokens, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "muzic/museformer/museformer/museformer_decoder.py", line 413, in forward
        x, extra = self.extract_features(
      File "muzic/museformer/museformer/museformer_decoder.py", line 645, in extract_features
        (sum_x, reg_x), inner_states = self.run_layers(
      File "/content/muzic/museformer/museformer/museformer_decoder.py", line 731, in run_layers
        x, _ = layer(
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "muzic/museformer/museformer/museformer_decoder_layer.py", line 413, in forward
        x, attn = self.run_self_attn(
      File "muzic/museformer/museformer/museformer_decoder_layer.py", line 486, in run_self_attn
        r, weight = self.self_attn(
    TypeError: 'NotImplementedError' object is not callable
    
    

    I have confirmed that the environments match.

    • tensorboardX 2.2
    • Python 3.8.15
    • fairseq 0.10.2
    • CUDA 11.3

    opened by taktak1 5