A library for augmenting annotated audio data

Overview

muda


A library for Musical Data Augmentation.

The muda package implements annotation-aware musical data augmentation, as described in the muda paper.

The goal of this package is to make it easy for practitioners to consistently apply perturbations to annotated music data for the purpose of fitting statistical models.
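
A minimal sketch of the intended workflow, adapted from the examples that appear later in this document (file names are placeholders):

import muda

# Load an annotated clip: the jams annotations and the audio are packed together
j_orig = muda.load_jam_audio('orig.jams', 'orig.ogg')

# A deformer that shifts the audio by one semitone and adjusts pitched annotations to match
pitch = muda.deformers.PitchShift(n_semitones=1)

# Each output jams object carries the deformed annotations and its deformation history
for i, jam_out in enumerate(pitch.transform(j_orig)):
    muda.save('orig_{:02d}.ogg'.format(i),
              'orig_{:02d}.jams'.format(i),
              jam_out)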

Documentation

The documentation for the latest version of muda is available here.

Citing

If you use this software, please cite the following publication:

@inproceedings{mcfee2015_augmentation,
    author  = {McFee, B. and Humphrey, E.J. and Bello, J.P.},
    year    = {2015},
    title   = {A software framework for musical data augmentation},
    booktitle = {16th International Society for Music Information Retrieval Conference},
    series  = {ISMIR}
}
Comments
  • AttributeError: 'dict' object has no attribute '_audio'

    I tried to run the muda code example from the documentation:

    import muda
    import librosa
    clip = muda.load_jam_audio('audio/7061-6-0-0_bgnoise0.jams', 'audio/6902-2-0-7.wav')
    pitch = muda.deformers.LinearPitchShift(n_samples=5, lower=-1, upper=1)
    for i, jam_out in pitch.transform(clip):
        muda.save('output_{:02d}.wav'.format(i), 'output_{:02d}.jams'.format(i), jam_out)
    
    

    but this error occurs:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-7-e235d022c3b3> in <module>()
          1 pitch = muda.deformers.LinearPitchShift(n_samples=5,lower=-1,upper=1)
    ----> 2 for i, jam_out in pitch.transform(clip):
          3     muda.save('output_{:02d}.wav'.format(i),'output_{:02d}.jams'.format(i),jam_out)
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda/base.pyc in transform(self, jam)
        140         '''
        141 
    --> 142         for state in self.states(jam):
        143             yield self._transform(jam, state)
        144 
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda/deformers/pitch.pyc in states(self, jam)
        251                              endpoint=True)
        252 
    --> 253         for state in AbstractPitchShift.states(self, jam):
        254             for n_semitones in shifts:
        255                 state['n_semitones'] = n_semitones
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda/deformers/pitch.pyc in states(self, jam)
         67     def states(self, jam):
         68         mudabox = jam.sandbox.muda
    ---> 69         state = dict(tuning=librosa.estimate_tuning(y=mudabox._audio['y'],
         70                                                     sr=mudabox._audio['sr']))
         71         yield state
    
    AttributeError: 'dict' object has no attribute '_audio' 
    
    bug 
    opened by YazhouZhang0709 12
  • How to apply deformations from annotated jams file?

    I am trying to use these (https://github.com/justinsalamon/UrbanSound8K-JAMS) jams files to deform sound files.

    I am trying to use BaseTransformer after creating the jams object from the jams file and the corresponding sound file, like this:

    j_orig = muda.load_jam_audio('orig.jams', 'orig.ogg')
    deformer = muda.base.BaseTransformer()
    for jam_out in deformer.transform(j_orig):
         process(jam_out)
    

    But when I do this I am getting NotImplementedError.

    If I have to create deformations by instantiating objects from the muda.deformers.* classes, then what is the point of loading annotated jams files? Please help me understand the process.
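
    For reference, BaseTransformer appears to be the abstract parent class (its hooks are exactly what raises NotImplementedError), so the usual pattern is to instantiate a concrete deformer and let it rewrite the loaded annotations alongside the audio. A sketch under that assumption, with placeholder file names:

    import muda

    # Load the annotated jams file together with its audio clip
    j_orig = muda.load_jam_audio('orig.jams', 'orig.ogg')

    # Use a concrete deformer rather than the abstract BaseTransformer
    pitch = muda.deformers.PitchShift(n_semitones=2)

    for i, jam_out in enumerate(pitch.transform(j_orig)):
        # jam_out contains both the deformed audio and the deformed annotations,
        # which is why the annotated jams file is loaded up front
        muda.save('orig_shift_{:02d}.ogg'.format(i),
                  'orig_shift_{:02d}.jams'.format(i),
                  jam_out)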

    question 
    opened by Arkanayan 7
  • Use MUDA without audio file

    In my current project, my jams files completely specify my audio, and at training time the audio can be synthesized from the jams file. I'd ideally augment these jams files with muda without having to synthesize them first, and then I'd simply save the muda-augmented jams files. At training time, I'd synthesize the audio and process the muda deformations.

    Is it possible to use muda without passing the initial audio file? It seems that right now, if I don't use muda.load_jam_audio() to process my jams file (it just adds the empty history and library versions to the jam?), it errors when I call the transform method of my pipeline.

    Is there a reason muda needs the audio file before actually processing the audio?
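
    One workaround that might be worth trying (assuming muda.jam_pack is the helper that load_jam_audio uses to attach audio to the jam's sandbox) is to pack a placeholder signal yourself, run the deformations, and keep only the resulting jams files. Note that deformers which inspect the audio content would then operate on the placeholder, so this is only a sketch:

    import numpy as np
    import jams
    import muda

    jam = jams.load('synthesized_scene.jams')   # hypothetical annotation-only jams file

    # Attach a silent placeholder signal so the transform machinery has audio to work with
    sr = 44100
    duration = float(jam.file_metadata.duration or 4.0)
    jam = muda.jam_pack(jam, _audio=dict(y=np.zeros(int(sr * duration)), sr=sr))

    stretch = muda.deformers.LogspaceTimeStretch(n_samples=3, lower=-0.3, upper=0.3)
    for i, jam_out in enumerate(stretch.transform(jam)):
        jam_out.save('synthesized_scene_{:02d}.jams'.format(i))   # keep only the jams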

    enhancement functionality 
    opened by mcartwright 7
  • RNG seed [formerly Reproducibility enhancements]

    At least two ideas jump out at me re: reproducibility:

    1. RandomDoAThing deformers could optionally take seed params, but always use one internally (and serialize accordingly).
    2. It'd be great if we could reconstruct a deformation pipeline exactly from the "history" object ... which really means either (a) the serialization object should encompass state, which isn't the case for RandomDoAThing deformers, or (b) there's a higher-level object that combines state and pipeline as different objects. The difference here is small (and maybe semantic), but it's a difference between a class and an instance (the pipeline is the class, the state is the instance). This might have interesting repercussions for the design of the Pipeline, which is perhaps more aptly called a PipelineFactory.

    Please yell if any of this is unclear; I'm kind of stream-of-consciousness working through the idea.
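
    To make idea (1) concrete, here's a purely hypothetical sketch of a seedable random deformer (the class name and parameters are made up, and the deformation hooks are omitted): the seed is a constructor parameter, so it is serialized with the object, and the random draw can be reproduced exactly.

    import numpy as np
    from muda.base import BaseTransformer

    class RandomGainExample(BaseTransformer):
        '''Hypothetical random deformer whose seed is part of its serialized parameters.'''

        def __init__(self, n_samples=3, sigma=1.0, seed=None):
            super(RandomGainExample, self).__init__()
            self.n_samples = n_samples
            self.sigma = sigma
            self.seed = seed

        def states(self, jam):
            # The RNG is rebuilt from the stored seed, so replaying the serialized
            # object reproduces the same sequence of states
            rng = np.random.RandomState(self.seed)
            for _ in range(self.n_samples):
                yield dict(gain_db=rng.normal(0.0, self.sigma))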

    functionality 
    opened by ejhumphrey 7
  • [No such file or directory error] when running documentation examples

    I probably missed something very basic. So I was trying to test the muda code examples from the documentation as follows.

    jams_obj = muda.load_jam_audio(jam_file, song_file)
    pitch = muda.deformers.LinearPitchShift(n_samples=5, lower=-2, upper=2)

    for i, jam_out in pitch.transform(jams_obj):
        muda.save('output_{:02d}.wav'.format(i),
                  'output_{:02d}.jams'.format(i),
                   jam_out)
    

    and I encountered the following error message:

    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-11-589bfe4aecc5> in <module>()
    ----> 1 for i, jam_out in pitch.transform(jams_obj):
          2     muda.save('output_{:02d}.ogg'.format(i),
          3               'output_{:02d}.jams'.format(i),
          4                jam_out)
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/muda/base.pyc in transform(self, jam)
        141 
        142         for state in self.states(jam):
    --> 143             yield self._transform(jam, state)
        144 
        145     @property
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/muda/base.pyc in _transform(self, jam, state)
        109 
        110         if hasattr(self, 'audio'):
    --> 111             self.audio(jam_w.sandbox.muda, state)
        112 
        113         if hasattr(self, 'metadata'):
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/muda/deformers/pitch.pyc in audio(mudabox, state)
         75         mudabox._audio['y'] = pyrb.pitch_shift(mudabox._audio['y'],
         76                                                mudabox._audio['sr'],
    ---> 77                                                state['n_semitones'])
         78 
         79     @staticmethod
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in pitch_shift(y, sr, n_steps, rbargs)
        163     rbargs.setdefault('--pitch', n_steps)
        164 
    --> 165     return __rubberband(y, sr, **rbargs)
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in __rubberband(y, sr, **kwargs)
         64         arguments.extend([infile, outfile])
         65 
    ---> 66         subprocess.check_call(arguments)
         67 
         68         # Load the processed audio.
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
        533     check_call(["ls", "-l"])
        534     """
    --> 535     retcode = call(*popenargs, **kwargs)
        536     if retcode:
        537         cmd = kwargs.get("args")
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
        520     retcode = call(["ls", "-l"])
        521     """
    --> 522     return Popen(*popenargs, **kwargs).wait()
        523 
        524 
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
        708                                 p2cread, p2cwrite,
        709                                 c2pread, c2pwrite,
    --> 710                                 errread, errwrite)
        711         except Exception:
        712             # Preserve original exception in case os.close raises.
    
    /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
       1333                         raise
       1334                 child_exception = pickle.loads(data)
    -> 1335                 raise child_exception
       1336 
       1337 
    
    OSError: [Errno 2] No such file or directory
    

    Please give me any pointers on what I have been missing... I checked the content of jams_obj and the audio is loaded.
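
    In case it helps, OSError: [Errno 2] raised from subprocess.Popen usually means the executable it tried to launch could not be found; the traceback shows pyrubberband shelling out to the rubberband command-line tool. A quick check (find_executable is used here because it also works on Python 2):

    from distutils.spawn import find_executable

    # None means the rubberband CLI is not installed or not on the PATH,
    # which would make pyrubberband's subprocess call fail with Errno 2
    print(find_executable('rubberband'))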

    opened by wangsix 7
  • OSError: [Errno 2] No such file or directory

    Hi Brian, it's me again. I tried to use the dynamic range compression deformer on some audio and ran into an error; could you help me figure it out? Thanks!

    
    def augment_features(parent_dir,sub_dirs,file_ext="*.wav"):
        for l, sub_dir in enumerate(sub_dirs):
            for fn in glob.glob(os.path.join(parent_dir, sub_dir, file_ext)):
                name = fn.split('/')[2].split('.')[0]
                jam = jams.load('audio/7061-6-0-0_bgnoise0.jams')
                muda.load_jam_audio(jam, fn)
                
                #pitch shift1
                pitch1 = muda.deformers.LinearPitchShift(n_samples=4,lower=-2,upper=2)
                for i, jam_out in enumerate(pitch1.transform(jam)):
                    muda.save('audio1/test/'+name+'_p1_{:02d}.wav'.format(i),'audio1/test/'+name+'_p1_{:02d}.jams'.format(i), jam_out)
                
                #pitch shift2 
                pitch2 = muda.deformers.LinearPitchShift(n_samples=4,lower=-4,upper=4)
                for i, jam_out in enumerate(pitch2.transform(jam)):
                    muda.save('audio1/test/'+name+'_p2_{:02d}.wav'.format(i),'audio1/test/'+name+'_p2_{:02d}.jams'.format(i), jam_out)
                
                #time stetching
                tstretch = muda.deformers.LogspaceTimeStretch(n_samples=4,lower=-3.5,upper=3.5)
                for i, jam_out in enumerate(tstretch.transform(jam)):
                    muda.save('audio1/test/'+name+'_ts_{:02d}.wav'.format(i),'audio1/test/'+name+'_ts_{:02d}.jams'.format(i), jam_out)
                
                #DRC
                drc = muda.deformers.DynamicRangeCompression(preset=['radio','film standard', 'speech', 'radio'])
                for i, jam_out in enumerate(drc.transform(jam)):
                    muda.save('audio1/test/'+name+'_drc_{:02d}.wav'.format(0),'audio1/test/'+name+'_drc_{:02d}.jams'.format(0), jam_out)
               
                
                
    parent_dir = "audio1"      
    for k in range(1,11):
            fold_name = 'fold' + str(k)
            augment_features(parent_dir,[fold_name])
    
    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-16-9541ffff5e33> in <module>()
         36 for k in range(1,11):
         37         fold_name = 'fold' + str(k)
    ---> 38         augment_features(parent_dir,[fold_name])
    
    <ipython-input-16-9541ffff5e33> in augment_features(parent_dir, sub_dirs, file_ext)
         25             #DRC
         26             drc = muda.deformers.DynamicRangeCompression(preset=['radio','film standard', 'speech', 'radio'])
    ---> 27             for i, jam_out in enumerate(drc.transform(jam)):
         28                 muda.save('audio1/test/'+name+'_drc_{:02d}.wav'.format(0),'audio1/test/'+name+'_drc_{:02d}.jams'.format(0), jam_out)
         29             #for i, jam_out in enumerate(drc.transform(jam)):
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/base.pyc in transform(self, jam)
        142 
        143         for state in self.states(jam):
    --> 144             yield self._transform(jam, state)
        145 
        146     @property
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/base.pyc in _transform(self, jam, state)
        110 
        111         if hasattr(self, 'audio'):
    --> 112             self.audio(jam_w.sandbox.muda, state)
        113 
        114         if hasattr(self, 'metadata'):
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/deformers/sox.pyc in audio(mudabox, state)
        146         mudabox._audio['y'] = drc(mudabox._audio['y'],
        147                                   mudabox._audio['sr'],
    --> 148                                   state['preset'])
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/deformers/sox.pyc in drc(y, sr, preset)
         91     '''
         92 
    ---> 93     return __sox(y, sr, 'compand', *PRESETS[preset])
         94 
         95 
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/deformers/sox.pyc in __sox(y, sr, *args)
         57         arguments.extend(args)
         58 
    ---> 59         subprocess.check_call(arguments)
         60 
         61         y_out, sr = psf.read(outfile)
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
        534     check_call(["ls", "-l"])
        535     """
    --> 536     retcode = call(*popenargs, **kwargs)
        537     if retcode:
        538         cmd = kwargs.get("args")
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
        521     retcode = call(["ls", "-l"])
        522     """
    --> 523     return Popen(*popenargs, **kwargs).wait()
        524 
        525 
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
        709                                 p2cread, p2cwrite,
        710                                 c2pread, c2pwrite,
    --> 711                                 errread, errwrite)
        712         except Exception:
        713             # Preserve original exception in case os.close raises.
    
    /home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
       1341                         raise
       1342                 child_exception = pickle.loads(data)
    -> 1343                 raise child_exception
       1344 
       1345 
    
    OSError: [Errno 2] No such file or directory
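
    As with the rubberband case above, Errno 2 from subprocess points at a missing executable: this traceback goes through muda/deformers/sox.py, which shells out to the sox CLI for the compand effect. A quick sanity check:

    from distutils.spawn import find_executable

    # None means sox is not installed or not on the PATH, so the compand call cannot be spawned
    print(find_executable('sox'))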
    
    opened by YazhouZhang0709 6
  • BackgroundNoise fails if len(soundf)==n_target

    I have a dataset of audio clips of the same length. Half of these clips are positive (they contain a bird flight call), while the other half are negative (they contain only background noise). I want to augment the dataset by mixing clips together, without changing the label.

    But I ran into an error in 'muda.deformers.background.sample_clip_indices'. If I understand the stack trace correctly (see below my signature), the error happens when executing start = np.random.randint(0, len(soundf) - n_target) with len(soundf) - n_target equal to zero.

    I made a Gist to reproduce the bug: https://gist.github.com/lostanlen/15fe9c879fdd24fe9023fa430314cd51. The error disappears when the difference in lengths is strictly larger than zero.

    Is this expected behavior? It seems to me that my issue could be fixed with

    if len(soundf) > n_target:
        start = np.random.randint(0, len(soundf) - n_target)
    else:
        start = 0
    

    Best, Vincent.

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-43-9ecbf36219b3> in <module>()
         39 # Create short deformer
         40 short_deformer = muda.deformers.BackgroundNoise(files=[short_noise_path])
    ---> 41 short_jam_transformer = next(short_deformer.transform(jam_original)) # error
    
    /Users/vl238/miniconda3/lib/python3.5/site-packages/muda/base.py in transform(self, jam)
        142         '''
        143 
    --> 144         for state in self.states(jam):
        145             yield self._transform(jam, state)
        146 
    
    /Users/vl238/miniconda3/lib/python3.5/site-packages/muda/deformers/background.py in states(self, jam)
        154         for fname in self.files:
        155             for _ in range(self.n_samples):
    --> 156                 start, stop = sample_clip_indices(fname, len(mudabox._audio['y']), mudabox._audio['sr'])
        157                 yield dict(filename=fname,
        158                            weight=np.random.uniform(low=self.weight_min,
    
    /Users/vl238/miniconda3/lib/python3.5/site-packages/muda/deformers/background.py in sample_clip_indices(filename, n_samples, sr)
         40 
         41         # Draw a random clip
    ---> 42         start = np.random.randint(0, len(soundf) - n_target)
         43         stop = start + n_target
         44 
    
    mtrand.pyx in mtrand.RandomState.randint (numpy/random/mtrand/mtrand.c:16117)()
    
    ValueError: low >= high
    
    bug 
    opened by lostanlen 4
  • Allow PitchShift deformer to take list for n_semitones

    The PitchShift deformer only takes a single semitone parameter (n_semitones), which makes it a little awkward to work with when you want to perform multiple pitch shifts. LinearPitchShift can perform multiple shifts, but they have to be linearly spaced, which might not be the desired functionality (e.g. I may want n_semitones = [ -2, -1, 1, 2]).

    It would be nice if PitchShift would accept (in addition to a single value) a list for the n_semitones parameter, in which case it would generate an output audio/jams pair for every pitch shift value in the list (times n_samples). This would make the behavior more consistent with other deformers (e.g. DynamicRangeCompression), and would allow (what I'm really after) writing more generic code that can apply a deformer without knowing which deformer it actually is, because the deformer is fully defined at initialization.
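
    Until something like this lands, one possible workaround (assuming your muda version provides muda.Union and that it takes a steps list of (name, deformer) pairs, like Pipeline) is to wrap one PitchShift per desired value:

    import muda

    semitones = [-2, -1, 1, 2]

    # One PitchShift per desired value, bundled into a single object that can be
    # passed around without caring which deformer it actually is
    shifts = muda.Union(steps=[('ps_{}'.format(n), muda.deformers.PitchShift(n_semitones=n))
                               for n in semitones])

    j_orig = muda.load_jam_audio('orig.jams', 'orig.ogg')
    for i, jam_out in enumerate(shifts.transform(j_orig)):
        muda.save('shift_{:02d}.ogg'.format(i), 'shift_{:02d}.jams'.format(i), jam_out)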

    enhancement functionality 
    opened by justinsalamon 4
  • Design / Implement Audio and Payload containers

    Currently, JAMS objects are being used via the top-level sandbox to ferry data through deformation pipelines. This is a little clunky for a few reasons, some more obvious than others. For my part, a big one is transforming JAMS without audio / transforming audio without JAMS.

    The important thing to note though is that the JAMS object is pretty powerful, which makes it super easy to do things with and to it. We can't say the same for the audio signal, and the JAMS object doesn't (and shouldn't) offer similar functionality for wrangling muda history, for example.

    I'd be keen to encapsulate audio and annotation data as separate attributes of a Payload object (or what have you) that can pass through the deformer pipeline agnostically. Putting some smarts into the different containers will also make it easier to introduce other audio deformations later, like stereo / spatialization, and keep good records on applied deformations.

    And, as another win (in my book at least), it could allow us to leverage different audio reading/writing backends, which can be justifiable in different scenarios.

    thoughts?

    question wontfix 
    opened by ejhumphrey 4
  • Remove in-place annotation modification

    JAMS is likely to drop the pandas dataframe backing in the near future. Even in the short-term, pandas 0.20 breaks a variety of things involving in-place manipulation, so we should really just get out ahead of it and do things properly.

    enhancement functionality 
    opened by bmcfee 3
  • Pitch deformer breaks on unexpected namespace?

    I'm trying to use the LinearPitchShift deformer on a tag_open JAMS annotation. Code looks something like this:

    audiopath = '101415-3-0-2.wav'
    jamspath = '101415-3-0-2.jams'
    jorig = muda.load_jam_audio(jamspath, audiopath)
    pitch = muda.deformers.LinearPitchShift(n_samples=3, lower=-1, upper=1)
    jpitch = []
    for j in pitch.transform(jorig):
        jpitch.append(j)
    

    If I try to do the same with the DRC deformer it seems to work OK, but with the pitch deformer I get:

    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-48-1c0aa6d965af> in <module>()
          1 jpitch = []
    ----> 2 for j in pitch.transform(jorig):
          3     jpitch.append(j)
    
    /usr/local/lib/python2.7/site-packages/muda/base.pyc in transform(self, jam)
        141 
        142         for state in self.states(jam):
    --> 143             yield self._transform(jam, state)
        144 
        145     @property
    
    /usr/local/lib/python2.7/site-packages/muda/base.pyc in _transform(self, jam, state)
        109 
        110         if hasattr(self, 'audio'):
    --> 111             self.audio(jam_w.sandbox.muda, state)
        112 
        113         if hasattr(self, 'metadata'):
    
    /usr/local/lib/python2.7/site-packages/muda/deformers/pitch.pyc in audio(mudabox, state)
         75         mudabox._audio['y'] = pyrb.pitch_shift(mudabox._audio['y'],
         76                                                mudabox._audio['sr'],
    ---> 77                                                state['n_semitones'])
         78 
         79     @staticmethod
    
    /usr/local/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in pitch_shift(y, sr, n_steps, rbargs)
        163     rbargs.setdefault('--pitch', n_steps)
        164 
    --> 165     return __rubberband(y, sr, **rbargs)
    
    /usr/local/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in __rubberband(y, sr, **kwargs)
         64         arguments.extend([infile, outfile])
         65 
    ---> 66         subprocess.check_call(arguments)
         67 
         68         # Load the processed audio.
    
    /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
        533     check_call(["ls", "-l"])
        534     """
    --> 535     retcode = call(*popenargs, **kwargs)
        536     if retcode:
        537         cmd = kwargs.get("args")
    
    /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
        520     retcode = call(["ls", "-l"])
        521     """
    --> 522     return Popen(*popenargs, **kwargs).wait()
        523 
        524 
    
    /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
        708                                 p2cread, p2cwrite,
        709                                 c2pread, c2pwrite,
    --> 710                                 errread, errwrite)
        711         except Exception:
        712             # Preserve original exception in case os.close raises.
    
    /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
       1325                         raise
       1326                 child_exception = pickle.loads(data)
    -> 1327                 raise child_exception
       1328 
       1329 
    
    OSError: [Errno 2] No such file or directory
    
    opened by justinsalamon 3
  • Tapdancing (use pedalboard as backend)

    #1 lists several deformations left to implement here, and we currently rely on an ad-hoc constellation of backends to implement what we have now.

    It looks like pedalboard is on track to cover a large swath of the functionality we need from the audio side. Switching our backend processing over will simplify things, make it easier to cover more transformations, and make the whole project a bit more maintainable.

    In the short term, it ought to be an easy switch. The only thing to be careful of is the parallel implementation of annotation transformations to match the audio.

    In the longer term, it might be worth doing some kind of deferred processing of audio instead of making intermediate copies of the signal. The present implementation doesn't support this, but it could be more efficient if we provide some kind of lazy evaluation / pedalboard constructor that only generates audio when necessary.
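
    For a sense of scale, the audio side of such a switch might look roughly like this (assuming pedalboard's Pedalboard/PitchShift plugin API; the matching annotation deformation would still be handled by muda):

    import numpy as np
    from pedalboard import Pedalboard, PitchShift

    def pitch_shift_audio(y, sr, n_semitones):
        # Process entirely in memory: no external CLI, no temp files
        board = Pedalboard([PitchShift(semitones=n_semitones)])
        return board(y.astype(np.float32), sr)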

    enhancement functionality 
    opened by bmcfee 0
  • ColoredNoise deformer bug

    A dimension-mismatch error occurs when applying a pipeline that includes the ColoredNoise deformer and the IRConvolution deformer.

    (screenshot of the traceback not reproduced here)

    The original audio length is 118882 samples and the impulse response length is 192000 samples.

    bug 
    opened by lylyhan 1
  • Modernization

    Our dependencies and tests are a bit out of date at this point, and we should do the following:

    1. bump librosa up to 0.7 (at least)
    2. upgrade pytest dependencies
    3. drop support for python <3.6
    enhancement 
    opened by bmcfee 0
  • Old jams BackgroundNoise lacking start and stop

    Hello!

    I am reproducing a paper by Salamon & Bello, and need to make transformations according to the jams files in UrbanSound8K-JAMS, for which I am using muda.replay().

    It seems that they used a previous version of the BackgroundNoise deformer which didn't use start and stop parameters, probably prior to this commit. I am thinking of just setting default values for the case where there is no start and stop in the state, something like this:

    try:
        start = state['start']
        stop = state['stop']
    except KeyError:
        start = 0
        stop = len(mudabox._audio['y'])
    

    I will post later on whether this works fine. I would appreciate some feedback on this.

    opened by grudloff 3
  • External dependency I/O overhead for out-of-core pipelines

    MUDA relies heavily on external command-line tools such as rubberband and sox (lightly wrapped in pyrubberband and pysox) for core deformations such as time stretching, pitch shifting, and DRC. These wrappers work by writing the transformed signal to disk and then reading it back into memory (presumably to feed an ML algorithm).

    The external system call, and particularly the additional write/read step, introduce a large overhead in highly distributed/multithreaded out-of-core data pipelines. Would it not make sense to either (a) allow an option to perform an analogous deformation with an in-memory Python library (for example, librosa), or (b) replace the external system call altogether with an in-memory transformation?
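
    As a concrete illustration of option (a), librosa already exposes in-memory equivalents for two of these deformations (quality and parameter semantics differ from rubberband, so this is only a sketch):

    import librosa

    def pitch_shift_in_memory(y, sr, n_semitones):
        # Stays in memory: no subprocess call, no temp-file round trip
        return librosa.effects.pitch_shift(y, sr=sr, n_steps=n_semitones)

    def time_stretch_in_memory(y, rate):
        return librosa.effects.time_stretch(y, rate=rate)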

    opened by Marko-Stamenovic-Bose 5
Releases (0.4.1)
Owner
Brian McFee
Assistant Professor of Music Technology and Data Science