C++ library for audio and music analysis, description and synthesis, including Python bindings

Overview

Essentia

Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval, released under the Affero GPL (AGPL v3) license. It contains an extensive collection of reusable algorithms that implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which makes it suitable for fast prototyping and for setting up research experiments quickly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included out of the box and the signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.

Documentation online: http://essentia.upf.edu

Installation

The library is cross-platform and currently supports Linux, Mac OS X, Windows, iOS and Android systems. Read installation instructions:

You can download and use prebuilt static binaries for a number of Essentia's command-line music extractors instead of installing the complete library

Quick start

Quick start using python:

Command-line tools to compute common music descriptors:

Asking for help

Versions

Official releases:

Github branches:

  • master: the most up-to-date version of Essentia (Ubuntu 14.10 or higher, OSX); if you run into any problem, try it first.

If you use example extractors (located in src/examples), or your own code employing Essentia algorithms to compute descriptors, you should be aware of possible incompatibilities when using different versions of Essentia.
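One way to guard against mixing descriptors computed with different versions is to check the version string stored alongside the results. The sketch below assumes the results were written as JSON with the version recorded under `metadata.version.essentia`; the function names `essentia_version_of` and `check_version` and the sample string are hypothetical and only serve the illustration.

```python
import json

def essentia_version_of(result_json):
    """Return the Essentia version recorded in a JSON result, or None if absent."""
    data = json.loads(result_json)
    return data.get("metadata", {}).get("version", {}).get("essentia")

def check_version(result_json, expected):
    """Raise ValueError when the recorded version differs from the expected one."""
    found = essentia_version_of(result_json)
    if found != expected:
        raise ValueError(
            f"descriptors computed with Essentia {found!r}, expected {expected!r}"
        )
    return found

# Hypothetical sample output; a real file would also contain the descriptors.
sample = '{"metadata": {"version": {"essentia": "2.1-beta6-dev"}}}'
print(check_version(sample, "2.1-beta6-dev"))
```

Running such a check before aggregating results from several runs makes a version mismatch fail loudly instead of silently skewing the data.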

How to contribute

We are more than happy to collaborate and receive your contributions to Essentia. The best way to submit your code is to create a pull request to our GitHub repository following our contribution policy. By submitting your code you certify that it complies with the Developer's Certificate of Origin. For more details see: http://essentia.upf.edu/documentation/contribute.html

You are also more than welcome to suggest any improvements, including proposals for new algorithms, etc.

Comments
  • Remove support for libswresample as we have libavresample

    I've installed all of the dependencies that I can uncover, and when I do: $ ./waf configure --mode=release --with-python --with-examples --with-vamp --with-cpptest

    I get:

    Setting top to : /home/roger/AudioSignalProcessing/essentia-2.0.1
    Setting out to : /home/roger/AudioSignalProcessing/essentia-2.0.1/build
    → configuring the project in /home/roger/AudioSignalProcessing/essentia-2.0.1
    → Building in release mode
    Checking for 'g++' (c++ compiler) : /usr/bin/g++
    Checking for 'gcc' (c compiler) : /usr/bin/gcc
    Checking for program pkg-config : /usr/bin/pkg-config
    Checking for 'libavcodec' : yes
    Checking for 'libavformat' : yes
    Checking for 'libavutil' : yes
    Checking for 'libswresample' : yes
    Checking for 'taglib' : yes
    Checking for 'yaml-0.1' : yes
    Checking for 'fftw3f' : yes
    Checking for 'samplerate' : yes
    Checking for 'gaia2' : yes
    Checking for program python : /usr/bin/python
    Checking for python version : (2, 7, 6, 'final', 0)
    Checking for library python2.7 in LIBDIR : yes
    Checking for program /usr/bin/python-config,python2.7-config,python-config-2.7,python2.7m-config : /usr/bin/python-config
    Checking for header Python.h : yes
    ================================ CONFIGURATION SUMMARY

    • FFmpeg / libav detected! The following algorithms will be included: ['AudioLoader', 'MonoLoader', 'EqloudLoader', 'EasyLoader', 'MonoWriter', 'AudioWriter']
    • libsamplerate (SRC) detected! The following algorithms will be included: ['Resample']
    • TagLib detected! The following algorithms will be included: ['MetadataReader']
    • Gaia2 detected! The following algorithms will be included: ['GaiaTransform']

      'configure' finished successfully (1.766s)

    But when I do: $ ./waf

    I get a bunch of errors. Some are below, and all seem to be related:

    ../src/essentia/utils/audiocontext.cpp: In member function ‘int essentia::AudioContext::create(const string&, const string&, int, int, int)’:
    ../src/essentia/utils/audiocontext.cpp:107:10: error: ‘CODEC_ID_PCM_S16LE’ was not declared in this scope
         case CODEC_ID_PCM_S16LE:
              ^
    ../src/essentia/utils/audiocontext.cpp:108:10: error: ‘CODEC_ID_PCM_S16BE’ was not declared in this scope
         case CODEC_ID_PCM_S16BE:
              ^
    ../src/essentia/utils/audiocontext.cpp:109:10: error: ‘CODEC_ID_PCM_U16LE’ was not declared in this scope
         case CODEC_ID_PCM_U16LE:
              ^
    ../src/essentia/utils/audiocontext.cpp:110:10: error: ‘CODEC_ID_PCM_U16BE’ was not declared in this scope
         case CODEC_ID_PCM_U16BE:
              ^

    and I end up with:

    Build failed -> task in 'essentia' failed (exit status 1): ...

    Can anyone help? I am using Ubuntu 14.04.

    bug 
    opened by rgonnering 30
  • configuration issue on mac (Getting pyembed flags from python-config: Could not build a python embedded interpreter)

    After ./waf configure --mode=release --with-python --with-cpptests --with-examples --with-vamp

    I got this

    python executable ... differs from system...
    ...
    Checking for library python2.7 in LIBPATH_PYEMBED: not found
    Checking for library python2.7 in LIBDIR: not found
    Checking for library python2.7 in python_LIBPL: not found
    Checking for library python2.7 in $prefix/libs: not found
    ...
    Getting pyembed flags from python-config: Could not build a python embedded interpreter
    ...

    The configuration failed

    any pointer on how to resolve this? thanks.

    opened by yyf 28
  • Probabilistic Yin and CREPE

    The monophonic pitch extraction algorithms in Essentia are out of date, so it would be worthwhile to implement two state-of-the-art algorithms that offer better pitch extraction accuracy:

    • [x] Pyin: https://code.soundsoftware.ac.uk/projects/pyin
    • [ ] CREPE: https://github.com/marl/crepe
    algorithms wishlist 
    opened by ronggong 21
  • GaiaTransform not found in registry

    Hello,

    I have compiled and installed first Gaia and then the Essentia library on Ubuntu 16.04. I want to use the out-of-the-box streaming_extractor_music executable. When I run streaming_extractor_music without any profile, I get no problem and a nice output.

    However, when I create a profile file that includes:

    highlevel:
        compute: 1
        svm_models: ['svm_models/genre_tzanetakis.history', 'svm_models/mood_sad.history']

    I get a "GaiaTransform not found in the registry" error when it processes the high-level SVM models.

    Any help will be appreciated.

    opened by oak94 20
  • Allow filtering negative energy values

    The PredominantPitchMelodia algorithm can return negative confidence values if guessUnvoiced=True. This adds a new option to PitchFilterMakam to automatically take the absolute value of any negative values. It also fixes a problem where the octaveFilter parameter wasn't being loaded properly.
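The rectification step described here can be sketched in a few lines. This is only an illustration of the idea, not the code added to PitchFilterMakam; the function name `rectify_negative` and the sample values are hypothetical.

```python
def rectify_negative(values):
    """Replace every negative value with its absolute value; keep the rest as-is."""
    return [abs(v) if v < 0 else v for v in values]

# Hypothetical per-frame confidence values, with negatives marking unvoiced guesses.
confidence = [0.8, -0.3, 0.0, -0.05, 0.6]
print(rectify_negative(confidence))
```

Note that taking the absolute value discards the voiced/unvoiced marker carried by the sign, so a caller that still needs that information has to keep the original sequence.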

    opened by alastair 19
  • ./waf build fail - TagLib

    Hello, I'm trying to run ./waf, and when I use the flags --mode=release --build-static --with-python --with-cpptests --with-examples --with-vamp, the build always fails at metadatareader.cpp. Output:

    [338/374] Linking build/src/examples/essentia_standard_beatsmarker
    [339/374] Linking build/src/examples/essentia_standard_onsetrate

    src/libessentia.a(metadatareader.cpp.1.o): In function `formatString(TagLib::StringList const&)':
    metadatareader.cpp:(.text+0x14f1): undefined reference to `TagLib::String::to8Bit(bool) const'
    metadatareader.cpp:(.text+0x157d): undefined reference to `TagLib::String::to8Bit(bool) const'
    metadatareader.cpp:(.text+0x15b4): undefined reference to `TagLib::String::to8Bit(bool) const'
    src/libessentia.a(metadatareader.cpp.1.o): In function `essentia::standard::MetadataReader::compute()':
    metadatareader.cpp:(.text+0x2a85): undefined reference to `TagLib::String::to8Bit(bool) const'
    metadatareader.cpp:(.text+0x2c67): undefined reference to `TagLib::String::to8Bit(bool) const'
    collect2: error: ld returned 1 exit status
    
    Waf: Leaving directory `/home/kapi/essentia/build'
    Build failed
     -> task in 'essentia_standard_beatsmarker' failed with exit status 1 (run with -v to display more information)
     -> task in 'essentia_standard_onsetrate' failed with exit status 1 (run with -v to display more information)
    

    I tried installing both the newest (1.11.1) and one of the older (1.9) versions of TagLib. What can I do to make it work? My operating system is Ubuntu 16.04 LTS.

    builds 
    opened by katpi 16
  • ConstantQ Transform?

    My search in the algorithm reference documentation and a quick search of the repository proved fruitless. Is there an implementation of it (e.g. like this) available in Essentia?

    algorithms wishlist 
    opened by constd 16
  • cannot import the essentia.standard nor essentia.streaming

    When I import essentia, it works fine, but when I try to import essentia.standard or essentia.streaming I get "no module named '..........'". I don't know what the problem is.
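A first diagnostic step for this kind of report is to check where the package is actually loaded from and which submodules Python can locate; one possible cause (an assumption, not confirmed in this thread) is a local file or directory named `essentia` shadowing the installed package. The helper `submodule_status` below is a hypothetical sketch using only the standard library, demonstrated on `json` so that it runs anywhere.

```python
import importlib
import importlib.util

def submodule_status(package, submodules):
    """Report the file a package is loaded from and which submodules can be found."""
    try:
        pkg = importlib.import_module(package)
    except ImportError:
        return {"package_file": None, "found": {}}
    found = {}
    for name in submodules:
        try:
            spec = importlib.util.find_spec(f"{package}.{name}")
        except (ImportError, ValueError):
            spec = None
        found[name] = spec is not None
    return {"package_file": getattr(pkg, "__file__", None), "found": found}

# Demonstrated on the standard library's json package; for the case above one
# would call submodule_status("essentia", ["standard", "streaming"]).
print(submodule_status("json", ["decoder", "no_such_module"]))
```

If `package_file` points into the current working directory rather than site-packages, the import is picking up the wrong `essentia`.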

    opened by ahmed-jbeli 15
  • PitchYIN error on stationary signals

    Hello

    Lately we have used the YIN implementation in Essentia a lot. However, for many applications (speech, instruments) I found a constant error compared to other pitch estimators such as RAPT.

    I tried to produce some more systematic results by running a simple test script (https://gist.github.com/faroit/2ebcf956633f63d92ace) which generates a stationary sine wave of constant f0. The signal is then processed by the YIN algorithm and the mean of the estimate is compared to the (constant) ground truth.

    This is what I get:

    [Figure: yin_error]

    Obviously the estimation error is frequency dependent, which is expected. Above 1 kHz, however, the estimate looks unstable.

    Has anyone tested the estimate in comparison to the original C YIN implementation?
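The kind of experiment described above can be sketched without Essentia. The block below is a basic textbook-style YIN (difference function, cumulative mean normalized difference, absolute threshold, parabolic interpolation) written from scratch; it is neither Essentia's PitchYin nor the linked gist, and the function name `yin_f0`, the 16 kHz sample rate, the 1024-sample frame and the 0.1 threshold are all assumptions chosen for illustration. It analyzes a single frame rather than averaging over many.

```python
import math

def yin_f0(frame, sample_rate, threshold=0.1, tau_max=None):
    """Estimate the f0 of one frame with a basic YIN; returns 0.0 if no pitch is found."""
    n = len(frame)
    tau_max = tau_max or n // 2
    w = n - tau_max  # integration window, so that j + tau stays inside the frame
    # Step 1: difference function d(tau).
    d = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        d[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(w))
    # Step 2: cumulative mean normalized difference.
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += d[tau]
        cmnd[tau] = d[tau] * tau / running if running > 0 else 1.0
    # Step 3: first lag below the threshold, then walk down to the local minimum.
    tau = 2
    while tau < tau_max and cmnd[tau] >= threshold:
        tau += 1
    if tau >= tau_max:
        return 0.0
    while tau + 1 < tau_max and cmnd[tau + 1] < cmnd[tau]:
        tau += 1
    # Step 4: parabolic interpolation around the minimum for sub-sample precision.
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    denom = a - 2 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sample_rate / (tau + shift)

sample_rate = 16000
for f0 in (110.0, 440.0, 880.0):
    frame = [math.sin(2 * math.pi * f0 * i / sample_rate) for i in range(1024)]
    est = yin_f0(frame, sample_rate)
    print(f"true {f0:7.1f} Hz  estimated {est:8.2f} Hz  error {est - f0:+.2f} Hz")
```

Because the period shrinks to only a few samples at high frequencies, the estimate depends increasingly on the interpolation step, which is one place where implementations can diverge.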

    bug 
    opened by faroit 15
  • Experimental windows support

    Hey,

    Here are the modifications I made to get things building on Windows with MinGW. With the correct environment set up, it should just be a matter of running these three commands:

    python waf configure --prefix="C:\Program Files (x86)\CodeBlocks\MinGW"

    python waf

    python waf install


    You need to install python and MinGW with pthreads (I used Codeblocks with built in TDM-GCC). During the configure stage it copies the dependencies into bin/include/lib in the MinGW root specified by the prefix option.

    I took the built dependencies from the mingw_port and made a few changes:-

    • I removed the pthread headers from libav as TDM-GCC has them already.
    • Recompiled libsamplerate to fix def file / dll inconsistency
    • moved taglib headers down a level in /include/taglib and added missing "tnmap.tcc"
    opened by carthach 15
  • not finding actual directory of libessentia.so

    I am new to Linux, Python and Essentia. I am using Debian Jessie.

    When I call (in Python) import essentia:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python2.7/dist-packages/essentia/__init__.py", line 1, in <module>
        import _essentia
    ImportError: libessentia.so: cannot open shared object file: No such file or directory

    (I see the file in /usr/local/lib/)

    opened by ErnestoAcc 14
  • Please add FreeBSD install instructions

    The Installing Essentia page can have FreeBSD installation instructions:

    To install essentia's C++ library: pkg install essentia
    To install essentia's Python binding: pkg install py39-essentia

    The FreeBSD ports are now available:

    • https://cgit.freebsd.org/ports/tree/audio/essentia/Makefile
    • https://cgit.freebsd.org/ports/tree/audio/py-essentia/Makefile
    opened by yurivict 0
  • libessentia.so does not have a SONAME

    When I build the Python binding in the FreeBSD ports framework it complains:

    Error: /usr/local/lib/python3.9/site-packages/essentia/_essentia.cpython-39.so is linked to /usr/local/lib/libessentia.so which does not have a SONAME. audio/essentia needs to be fixed.

    libessentia.so does not have a SONAME field set.

    opened by yurivict 0
  • using ios_simulator results in an empty lib !

    Hello all

    I'm on macOS (Ventura 13.0)

    I am having difficulty building Essentia for the iOS simulator. I have written all the necessary glue code and call a simple essentia::init() to test the basics.

    But Xcode tells me it can't find any symbols, and indeed it appears that the resulting library may be defective.

    When running ranlib, I get this error message:

    /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: for architecture: i386 file: build_ios/src/libessentia.a(essentiautil.cpp.1.o) has no symbols
    /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: for architecture: x86_64 file: build_ios/src/libessentia.a(essentiautil.cpp.1.o) has no symbols
    

    I am attaching the library and the log in case they help.

    Thank you :)

    opened by simdax 3
  • Creating an example for EffnetDiscogs

    Hello all

    I have to admit I'm not very familiar with AI, so I'm struggling to create a simple example that would work as the other tensorflow examples, musicnn or vggish, in CPP.

    In python I have a result with this code:

    
    import numpy as np
    from essentia.standard import MonoLoader, TensorflowPredictEffnetDiscogs

    audio = MonoLoader(filename="../data/raw/blues/blues.00000.wav", sampleRate=16000)()
    model = TensorflowPredictEffnetDiscogs(graphFilename="../models/discogs-effnet-bs64-1.pb")
    activations = model(audio)
    
    #    [   INFO   ] TensorflowPredict: Successfully loaded graph file: `../models/discogs-effnet-bs64-1.pb`
    
    activations_mean = np.mean(activations, axis=0)
    top_n_idx = np.argsort(activations_mean)[::-1][0]
    

    I've just copied these files which are all similar, but it does not seem to work for me sadly.

    Anyone could help :) ? Thank you

    
    #include <algorithm>  // std::find
    #include <cstdlib>    // exit
    #include <iostream>
    #include <essentia/algorithmfactory.h>
    #include <essentia/streaming/algorithms/poolstorage.h>
    #include <essentia/scheduler/network.h>
    #include "credit_libav.h"
    
    using namespace std;
    using namespace essentia;
    using namespace essentia::streaming;
    using namespace essentia::scheduler;
    
    
    bool hasFlag(char** begin, char** end, const string& option) {
      return find(begin, end, option) != end;
    }
    
    string getArgument(char** begin, char** end, const string& option) {
      char** iter = find(begin, end, option);
      if (iter != end && ++iter != end) return *iter;
    
      return string();
    }
    
    void printHelp(string fileName) {
        cout << "Usage: " << fileName << " pb_graph audio_input output_json [--help|-h] [--list-nodes|-l] [--patchwise|-p] [[-output-node|-o] node_name]" << endl;
        cout << "  -h, --help: print this help" << endl;
        cout << "  -l, --list-nodes: list the nodes in the input graph (model)" << endl;
        cout << "  -p, --patchwise: write out patch-wise predictions (one per patch) instead of averaging them" << endl;
        cout << "  -o, --output-node: node (layer) name to retrieve from the graph (default: model/Sigmoid)" << endl;
        creditLibAV();
    }
    
    vector<string> flags({"-h", "--help",
                          "-l", "--list-nodes",
                          "-p", "--patchwise",
                          "-o", "--output-node"});
    
    
    int main(int argc, char* argv[]) {
      // Sanity check for the command line options.
      for (char** iter = argv; iter < argv + argc; ++iter) {
        if (**iter == '-') {
          string flag(*iter);
          if (find(flags.begin(), flags.end(), flag) == flags.end()){
            cout << argv[0] << ": invalid option '" << flag << "'" << endl;
            printHelp(argv[0]);
            exit(1);
          }
        }
      }
    
      string outputLayer = "PartitionedCall";
    
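      // Three positional arguments are required: graph, audio input, JSON output.
      if (argc < 4) {
        printHelp(argv[0]);
        exit(1);
      }

      string graphName = argv[1];
      string audioFilename = argv[2];
      string outputFilename = argv[3];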
    
      // whether to output the patch-wise predictions or to average them.
      const bool average = (hasFlag(argv, argv + argc, "--patchwise") ||
                            hasFlag(argv, argv + argc, "-p")) ? false : true;
    
      // register the algorithms in the factory(ies)
      essentia::init();
    
      Pool pool;
      Pool aggrPool;  // a pool for the aggregated predictions
      Pool* poolPtr = &pool;
    
      /////// PARAMS //////////////
      Real sampleRate = 16000.0;
    
      AlgorithmFactory& factory = streaming::AlgorithmFactory::instance();
    
      Algorithm* audio = factory.create("MonoLoader",
                                        "filename", audioFilename,
                                        "sampleRate", sampleRate);
    
      Algorithm* tfp   = factory.create("TensorflowPredictEffnetDiscogs",
                                        "graphFilename", graphName,
                                        "output", outputLayer);
      // If the output layer is empty, we have already printed the list of nodes.
      // Exit now.
      if (outputLayer.empty()){
        essentia::shutdown();
    
        return 0;
      }
    
      /////////// CONNECTING THE ALGORITHMS ////////////////
      cout << "-------- connecting algos --------" << endl;
    
      audio->output("audio")     >>  tfp->input("signal");
      tfp->output("predictions") >>  PC(pool, "predictions");
    
    
      /////////// STARTING THE ALGORITHMS //////////////////
      cout << "-------- start processing " << audioFilename << " --------" << endl;
    
      // create a network with our algorithms...
      Network n(audio);
      // ...and run it, easy as that!
      n.run();
    
      if (average) {
        // aggregate the results
        cout << "-------- averaging the predictions --------" << endl;
    
        const char* stats[] = {"mean"};
    
        standard::Algorithm* aggr = standard::AlgorithmFactory::create("PoolAggregator",
                                                                      "defaultStats", arrayToVector<string>(stats));
    
        aggr->input("input").set(pool);
        aggr->output("output").set(aggrPool);
        aggr->compute();
    
        poolPtr = &aggrPool;
    
        delete aggr;
      }
    
      // write results to file
      cout << "-------- writing results to json file " << outputFilename << " --------" << endl;
    
      standard::Algorithm* output = standard::AlgorithmFactory::create("YamlOutput",
                                                                       "format", "json",
                                                                       "filename", outputFilename);
      output->input("pool").set(*poolPtr);
      output->compute();
      n.clear();
    
      delete output;
      essentia::shutdown();
    
      return 0;
    }
    

    Compiling it and running it like this:

    ./build/src/examples/essentia_streaming_discogs test/models/effnetdiscogs/effnetdiscogs-bs64-1.pb test/audio/recorded/mozart_c_major_30sec.wav outpout.json

    the result is empty :(

    {
    "metadata": {
        "version": {
            "essentia": "2.1-beta6-dev"
        }
    }
    }
    

    Thank you very much :)

    opened by simdax 1
  • Update static builds for Qt 5.15.6

    Wishlist of TODOs before merge:

    • [ ] merge (https://github.com/MTG/gaia/pull/121) and update gaia version in build_config.sh accordingly.
    • [ ] check if full build for static examples works
    builds 
    opened by dbogdanov 0
Releases(v2.1_beta5)
  • v2.1_beta5(Sep 5, 2019)

    Essentia 2.1 beta5 is our current preliminary version of the forthcoming 2.1 release. This pre-release includes the following changes:

    • Algorithms updates and bug-fixes

      • Fix the slaneyMel scale implementation in MelBands and MFCC (#849). Introduced in 2.1-beta4, it was erroneously computing the HTK Mel scale. Set htkMel as the default scale to ensure backward compatibility with all previous versions of MelBands/MFCC.

      • New option unit_tri for triangle area normalization in MelBands, MFCC, and TriangularBands.

      • New parameter silenceThreshold in MFCC and GFCC. Set default threshold to 1e-10 (#543).

      • TriangularBands: faster unit-sum normalization and an improved check for insufficient spectrum resolution (#142).

      • ConstantQ and the related Chromagram and SpectrumCQ are reimplemented from scratch and now function correctly. The maxFrequency parameter is replaced by numberBins.

      • New negativeFrequencies parameter in FFTC to include negative frequencies in the output.

      • New normalize parameter for IFFT size normalization.

      • FFTC now supports KissFFT and Accelerate.

      • PoolAggregator: new aggregation method last to get the last value. Fix possible nan/inf values in kurtosis and skewness (#689). Apply aggregation for pool values that contain only one vector too.

      • New checkRange parameter in Trimmer and StereoTrimmer.

      • PitchFilter: improve consistency between input and output stream types (#674).

      • PitchMelodia: fix missing output pitchConfidence in streaming mode.

      • MultiPitchMelodia: peakFrameThreshold and peakDistributionThreshold parameters now work correctly (they were overridden by hardcoded values).

      • New tolerance parameter in PitchYinFFT. When the pitch confidence is lower than the tolerance value the output pitch is set to 0. A tolerance of 1 disables this feature.

      • Fix occasional negative values output by Danceability (#483).

      • LoudnessEBUR128:

        • Fix memory leaks and warnings on empty input. Set a larger internal buffer size to avoid buffer resizes.
        • New parameter startFromZero to zero-center the first window for loudness estimation.
      • Fix a memory leak in AudioLoader.

      • BeatTrackerDegara output is now deterministic (#860).

      • ChordDetectionBeats: add new parameter chromaPick and fix a beat segment indexing bug in the case of very close consecutive beats.

      • New minPeakDistance parameter in PeakDetection.

      • Fix invalid memory access in PCA (#727).

      • Update Key and KeyExtractor algorithms with new pitch class profiles and new parameters for detuning correction and low-energy HPCP bin thresholding. Use the new bgate profile by default. Add spectral whitening step to KeyExtractor. Change output key naming. Add a new function equivalentKey to match between equivalent names.

      • Proper mutex implementation for all FFT* algorithms.

    • New algorithms

      • Invertible Constant-Q based on Non-Stationary Gabor frames: NSGConstantQ, NSGIConstantQ, NSGConstantQStreaming.
      • Chromaprinter (fingerprinting) wrapper for the Chromaprint library.
      • NNLSChroma and LogSpectrum (derived from the original NNLS Chroma code).
      • TriangularBarkBands (more configurable than BarkBands) and BFCC (bark-frequency cepstrum coefficients).
      • New algorithms for audio problems detection: ClickDetector, DiscontinuityDetector, FalseStereoDetector, GapsDetector, HumDetector, NoiseBurstDetector, SNR, SaturationDetector, StartStopCut, TruePeakDetector.
      • New algorithms for probabilistic Yin (pYIN) pitch estimation: PitchYinProbabilistic, PitchYinProbabilities, PitchYinProbabilitiesHMM.
      • StereoTrimmer and StereoMuxer.
      • Welch (power spectral density estimation).
      • New algorithm IFFTC for inverse complex STFT.
      • Histogram.
    • Updated music and sound feature extractors streaming_extractor_music and streaming_extractor_freesound. Both extractors are now also available as algorithms: MusicExtractor and FreesoundExtractor. New MusicExtractorSVM algorithm allows applying SVM models to the output of MusicExtractor.

      • Fix possible memory leaks in MusicExtractor

      • Proper logging for "out of memory" errors

      • Skip aggregation for some descriptors

      • Add audio length to metadata and remove end_time

      • Add number of audio channels to metadata (number_channels)

      • Better grouping of metadata related to audio analysis

      • Updated key/chords estimation parameters

      • Estimate key using three different key profiles (temperley, krumhansl, edma)

      • Updated descriptors in MusicExtractor:

        • New LoudnessEBUR128 loudness descriptors
        • Add melbands128 high-resolution melbands
        • Compute hpcp_crest
        • Compute bpm_histogram
        • New stdev aggregate statistics in addition to var
      • Updated descriptors in FreesoundExtractor

        • Add melbands96 high-resolution melbands
        • Add stdev statistic
        • Remove frequency_bands
        • Do not output bpm_confidence when configured to use 'degara' for beat tracking
        • spectral_contrast and scvalleys are now called spectral_contrast_coeffs and spectral_contrast_valleys for consistency with MusicExtractor
        • startFrame and stopFrame are now called sound_start_frame and sound_stop_frame
    • New extractors

      • Add a new extractor for spectrograms and log-energy Mel-spectrograms (streaming_spectrogram).
    • Python bindings updates

      • Add support for Python 3.
      • Update all tutorials and code examples to Python 3.
      • New essentia.pyutils submodule provides useful functions for a number of use-cases (spectrograms, CQ-grams, batch processing with extractors, etc.)
      • Fix a memory bug in Pool on a isSingleValue check in Python.
      • Faster VECTOR_VECTOR_REAL conversion from Python types.
    • Build scripts updates

      • Add script for Python packaging (python.py) and wheels.
      • Travis CI and build scripts for manylinux wheels.
      • Update Waf to 2.0.10.
      • The code is now partly C++11.
      • Build flags for MSVC.
      • Fixes for cross-compilation with Mingw-w64.
      • Default --prefix=$VIRTUAL_ENV when inside a virtualenv.
      • Read PKG_CONFIG_PATH and add new flag --pkg-config-path for custom lib paths.
      • New flag --only-python to build Python extension separately from libessentia.
      • Link only to libessentia when building examples.
      • Generate a proper essentia.pc pkg-config file.
      • Static builds updates.
        • Replace LibAv with FFmpeg, build with muxers.
        • Update Taglib version to 1.11.1, build with zlib.
        • Update Gaia to 2.4.5.
    • Miscellaneous

      • Fix segfault in the Vamp plugin (#635, #371).
      • Add support for SingleVectorString to Pool.
      • Added support for Cephes Bessel functions via a 3rdparty library Cephes.
    • Updated documentation, tutorials, and examples including a significant web redesign.

      • Improve build scripts for documentation.
      • Every algorithm page now has links to related algorithms.
      • An updated list of research works using Essentia.
      • New python examples.
      • New QA scripts for audio problems detection and HPCPs.
    • A usual assortment of code cleanup, updated and expanded unit tests, and better logging (more informative log and exception messages).
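The MelBands/MFCC note in the list above distinguishes the htkMel and slaneyMel scales. As a rough illustration of how the two differ, the sketch below implements the commonly cited formulas: the HTK scale is logarithmic everywhere, while the Slaney (Auditory Toolbox) scale is linear below 1 kHz and logarithmic above. The constants are the widely used ones and are an assumption here; they were not copied from Essentia's source, and the function names are hypothetical.

```python
import math

def hz_to_mel_htk(hz):
    """HTK-style mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def hz_to_mel_slaney(hz):
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above."""
    lin_step = 200.0 / 3.0                # Hz per mel in the linear region
    min_log_hz = 1000.0                   # start of the logarithmic region
    min_log_mel = min_log_hz / lin_step   # mel value at 1 kHz
    log_step = math.log(6.4) / 27.0       # log spacing above 1 kHz
    if hz < min_log_hz:
        return hz / lin_step
    return min_log_mel + math.log(hz / min_log_hz) / log_step

for hz in (100.0, 1000.0, 4000.0):
    print(f"{hz:7.1f} Hz  htk={hz_to_mel_htk(hz):8.2f}  slaney={hz_to_mel_slaney(hz):6.2f}")
```

The two scales use different units, so filterbanks built from them place their band edges at different frequencies even for the same number of bands; this is why a mix-up between the two changes MelBands and MFCC output.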

  • v2.1_beta4(May 23, 2018)

    This pre-release includes the following changes:

    • Improved algorithms

      • AudioLoader now supports audio sources with multiple audio streams (new parameter 'audioStream')
      • PoolAggregator now outputs stdev in addition to var (#342)
      • SpectralContrast: Improve precision for computation of subband bin intervals
      • Danceability now also outputs a DFA exponent vector
      • HPCP can now optionally apply unit sum normalization (#348)
      • HPCP: 'splitFrequency' parameter is now called 'bandSplitFrequency'
      • LoudnessEBUR128: Warn on empty input in the streaming mode
    • Updates to Mel and ERB energy band algorithms

      • Add support for extracting MelBands and MFCCs 'the htk way'
      • Add support for DCT type III in DCT algorithm
        • New parameter 'dctType' in DCT, MFCC and GFCC
        • New 'liftering' parameter in DCT and MFCC
      • New parameters 'normalize', 'type', 'scale' and 'weighting' in MelBands and MFCC
      • New 'type' parameter in GFCC
      • New 'logType' parameter in MFCC, GFCC
      • New 'log' parameter in TriangularBands and MelBands
      • ERBBands: 'type' parameter value "energy" is now called "power"
      • TriangularBands is now faster
    • New algorithms

      • SpectrumToCent for computing cent scale from frequency bins
      • New algorithm IDCT for inverse DCT
      • New algorithm SpectrumCQ
    • Bug-fixes in algorithms:

      • MelBands and TriangularBands: Add checks for insufficient spectrum resolution (#142)
      • Fix PitchYin out of range error (#376)
      • Fix Inf values in OddToEvenHarmonicEnergyRatio
      • Fix reset() in LowLevelSpectralExtractor and LowLevelSpectralEqloudExtractor
      • Fix occasional exception in BeatsLoudness (#199)
      • Danceability: Fix NaN danceability value occurring on very short input signals
      • Fix memory leak in MelBands
      • Fix memory bug in Vibrato
      • SpectralContrast: Force non-zero 'lowFrequencyBound' parameter to avoid division by zero (#568)
      • AudioLoader: Fix memory bug on exceptions while opening an audio file in AudioLoader
    • Updates to Python wrapper:

      • FrameGenerator now inherits the default parameters from FrameCutter
      • FrameGenerator now has a new method frame_times() to compute frame positions in time
      • Fix array memory corruption when passing NumPy array views to Essentia algorithms (#240)
      • Fix memory deallocation for streaming algorithms to avoid a memory leak
    • Extractors:

      • Freesound extractor now stores all results in json
    • Logging:

      • Remove colors in log messages when piped to file; do not print colors on Windows
    • Build scripts updates:

      • Update waf to 1.9.5
      • Update script for computing algorithm dependencies
    • Code cleanup and unit tests updates

    • Re-designed and expanded documentation:

      • Updated installation instructions
      • Reorganized and improved Python tutorials. Notebook tutorials are now also rendered as HTML
      • Updated algorithm descriptions
      • Added examples of industrial applications and academic studies using Essentia
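
The 'dctType' and 'liftering' parameters listed above for DCT and MFCC can be illustrated with a minimal pure-Python sketch. It uses the textbook unnormalized DCT-II and DCT-III definitions and the HTK-style sinusoidal lifter; the function names are hypothetical, and the exact scaling Essentia applies is an assumption not covered here.

```python
import math

def dct_ii(x):
    """Unnormalized DCT type II (textbook definition, not Essentia's scaling)."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                for j in range(n))
            for k in range(n)]

def dct_iii(c):
    """Unnormalized DCT type III; dct_iii(dct_ii(x)) equals (n / 2) * x."""
    n = len(c)
    return [0.5 * c[0] + sum(c[k] * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                             for k in range(1, n))
            for j in range(n)]

def lifter(coeffs, L):
    """HTK-style liftering: scale coefficient n by 1 + (L / 2) * sin(pi * n / L)."""
    if L <= 0:
        return list(coeffs)  # treat a non-positive L as "no liftering"
    return [(1.0 + (L / 2.0) * math.sin(math.pi * n / L)) * c
            for n, c in enumerate(coeffs)]

x = [1.0, 2.0, 3.0, 4.0]
# DCT-III undoes DCT-II up to a factor of n / 2.
reconstructed = [2.0 / len(x) * v for v in dct_iii(dct_ii(x))]
liftered = lifter([1.0] * 13, 22)
```

Liftering leaves coefficient 0 unchanged and boosts the higher-order cepstral coefficients, which is why it is commonly paired with HTK-compatible MFCC settings.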
  • v2.1_beta3 (Sep 29, 2016)

    This pre-release includes the following changes:

    • Build script updates:
      • Cross-compilation for iOS and Android
      • Support for JavaScript using Emscripten
      • Updated dependencies in static extractors (LibAv 11.2, Taglib 1.10)
      • Fixed cross-compilation for Windows
      • Homebrew formula for easy installation on OSX
      • Updated Debian packaging
      • All dependencies are now optional. Algorithms and examples relying on missing dependencies will be ignored.
      • New flags for building lightweight versions of Essentia
        • --lightweight=LIBS to specify dependencies to be included
        • --include-algos=ALGOS and --ignore-algos=ALGOS to specify algorithms to be included
    • New algorithms:
      • SuperFlux algorithm for real-time onset detection (SuperFluxExtractor, SuperFluxNovelty)
      • Algorithms for sound modeling
        • Overlap-add (OverlapAdd)
        • Sine model analysis/synthesis (SineModelAnal, SineModelSynth)
        • Sine subtraction (SineSubtraction)
        • Sinusoidal plus Residual model analysis/synthesis (SprModelAnal, SprModelSynth)
        • Melody Analysis (monophonic/predominant)
        • HarmonicMask
        • Signal resampling (ResampleFFT)
      • New pitch-related algorithms
        • Multi-pitch estimation in polyphonic music (MultiPitchKlapuri, MultiPitchMelodia)
        • Adaptation of Melodia algorithm for monophonic signals (PitchMelodia)
        • Yin pitch detection algorithm (PitchYin)
        • Pitch contour segmentation into notes (PitchContourSegmentation)
        • Vibrato detection (Vibrato)
      • BPM estimation on loops (PercivalEnhanceHarmonics, PercivalEvaluatePulseTrains, LoopBpmConfidence, LoopBpmEstimator, PercivalBpmEstimator)
      • STFT on complex inputs (FFTC)
      • ConstantQ and Chromagram (still in experimental stage)
      • TriangularBands
      • Lightweight spectral centroid implementation (SpectralCentroidTime)
      • Chords detection on beat segments (ChordsDetectionBeats)
      • VectorRealAccumulator
    • Improved algorithms:
      • LoudnessEBUR128 algorithms are now finalized (includes bug-fixes)
      • FFT now supports KissFFT and Accelerate FFT libraries as an alternative to FFTW
      • New profiles for Key estimation (including profiles for electronic music)
      • New 'generalized' parameter in Autocorrelation algorithm
      • New 'scale' and 'shift' parameters in UnaryOperator algorithm
      • New 'normalized' parameter in Windowing algorithm
      • New 'inputSize' parameter in GFCC algorithm
      • Added support for 8kHz for EqualLoudness algorithm
      • LogAttackTime now outputs attack times
      • BpmHistogramDescriptors now outputs a complete histogram
      • ChordsDescriptors now throws exception on incorrect chords
      • Refactored AudioLoader and AudioWriter algorithms. Use libavresample, remove support for libswresample
      • Rename PitchFilterMakam to PitchFilter. Allow filtering negative energy values. Remove optional 'octaveFilter' parameter
      • Rename PredominantMelody algorithm to PredominantPitchMelodia
    • Bug-fixes:
      • Fix wrong behavior of HarmonicPeaks that was indirectly affecting results in HPCP, Key, Tristimulus and OddToEvenHarmonicEnergyRatio
      • Fixed filter coefficients in BandReject and BandPass
      • Fixed weightings in NoveltyCurve
      • Different key profiles in Key streaming algorithm now work correctly
      • Bug fixes in Envelope, TonicIndianArtMusic, RhythmExtractor2013, PitchYinFFT, BpmHistogramDescriptors, ReplayGain streaming
    • Updated extractors (including Freesound extractor)
    • Improved documentation
      • Fresh new design
      • Algorithms are now organized by categories.
      • Improved and rewritten algorithm descriptions
      • New Python examples and tutorials
    • More minor fixes, improvements and code cleanup
    • Updated unit tests. Audio files for tests are now hosted in a separate repository
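
As an illustration of the overlap-add step that underlies the new sound-modeling algorithms listed above, here is a minimal pure-Python sketch. The function name and signature are hypothetical and do not reflect the OverlapAdd algorithm's actual interface; it only shows the summation of hop-shifted frames.

```python
def overlap_add(frames, hop_size):
    """Sum equally sized frames into one signal, each shifted by hop_size samples."""
    if not frames:
        return []
    frame_size = len(frames[0])
    out = [0.0] * (hop_size * (len(frames) - 1) + frame_size)
    for i, frame in enumerate(frames):
        start = i * hop_size
        for j, sample in enumerate(frame):
            out[start + j] += sample
    return out

# Three rectangular frames of 4 samples with 50% overlap (hop of 2 samples).
frames = [[1.0, 1.0, 1.0, 1.0]] * 3
result = overlap_add(frames, hop_size=2)
```

With 50% overlap every interior sample is covered by two frames, so in practice the frames are windowed (or the output rescaled) to keep the gain constant.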

    Known issues:

    • Some unit tests fail (#316)
  • v2.1_beta2 (Mar 26, 2015)

    Changes:

    • Build scripts updates:
      • New scripts for static builds on Linux, OSX and (cross-compilation) Windows
      • New flag --with-example to build only specific examples
      • The git commit SHA hash is now accessible via the Essentia library API for better versioning
    • Algorithm updates:
      • AudioLoader now outputs codec and bitrate, and computes md5 hash values over undecoded audio
      • MetadataReader now uses new TagLib 1.9 API and is able to read any tags
      • YamlInput now supports JSON
      • New Entropy algorithm
      • EffectiveDuration now accepts a threshold parameter
      • Fixed incorrect computation of onset rate in OnsetRate
      • New algorithm LoudnessEBUR128 for measuring loudness according to the EBU R128 standard (still in experimental stage)
      • New BinaryOperator algorithm
      • PitchYinFFT algorithm now includes peak interpolation
    • Revised and updated extractors:
      • Revised, refactored and expanded music extractor (streaming_extractor_music) including new functionality and descriptors
      • Updated Freesound extractor, including new descriptors
    • Some updates in core Essentia code
    • Updated documentation and examples
    • Bugfixes and unit tests updates

    Dependencies: Libav 9, Taglib 1.9

    Ubuntu/Debian Libav/Taglib compatibility:

    • Debian Jessie - the required package versions are already in the repository
    • Debian Wheezy - install libav/libtag1-dev packages from wheezy-backports repository
      • libav 6:10.1
      • libtag1-dev 1.9.1
    • Ubuntu Trusty (14.04 LTS), Utopic (14.10) and Vivid (15.04) - the required package versions are already in the repository
  • v2.0.1 (Feb 11, 2014)

    Essentia 2.0.1:

    • Added pre-trained high-level classifier models for genres, moods, rhythm and instrumentation (to be used with streaming_extractor_archivemusic extractor, see accuracies here)
    • Fixed scheduler in streaming mode
    • Fixed compilation with clang/libc++/c++11
    • PitchYinFFT now supports parabolic interpolation
    • Updated Vamp plugin
    • Updated documentation and tutorials
    • Minor bugfixes, more unit tests, etc.
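
Parabolic interpolation, mentioned above for PitchYinFFT, refines a peak location by fitting a parabola through the peak bin and its two neighbours. The sketch below shows the standard three-point formula; the function name is hypothetical, and whether Essentia uses exactly this variant is an assumption.

```python
def parabolic_peak(left, center, right):
    """Fit a parabola through three equally spaced samples around a peak.

    Returns (offset, value): offset is the peak position relative to the
    center sample, in bins; value is the interpolated peak height.
    """
    denom = left - 2.0 * center + right
    if denom == 0.0:
        return 0.0, center  # flat neighbourhood: nothing to refine
    offset = 0.5 * (left - right) / denom
    value = center - 0.25 * (left - right) * offset
    return offset, value

# Samples of y = 5 - (x - 0.3)^2 taken at x = -1, 0, 1.
samples = [5.0 - (x - 0.3) ** 2 for x in (-1.0, 0.0, 1.0)]
offset, value = parabolic_peak(*samples)
```

For an exact parabola the formula recovers the true maximum; for spectral peaks it yields a sub-bin estimate that is then converted to frequency.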

    For post-release bugfixes (including Ubuntu 14.04 compatibility) use the 2.0.1 branch.

    Ubuntu/Debian Libav compatibility:

    • Debian Wheezy - libav 6:0.8.17
    • Ubuntu Precise (12.04 LTS) - libav 4:0.8.17
    • Ubuntu Trusty (14.04 LTS) - libav 6:9.18
  • v2.0 (Mar 31, 2015)

    • First release to be publicly available as free software released under AGPLv3
    • Refactoring of the core API
      • Fixed small API annoyances in the standard mode
      • Refactored the streaming mode. It is now much better defined and rests on sound computer science techniques: the visible network is a directed acyclic graph, the composites have better-defined semantics, and the algorithms execute in the order given by a topological sort of the transitive reduction of the visible network after the composites have been expanded. In particular, the scheduler that runs the algorithms in streaming mode is now much more correct, which made it possible to remove the small hacks that had accumulated in the algorithms themselves during the 1.x releases to compensate for the deficiencies of the initial scheduler.
    • New algorithms for onset detection, beat tracking and melody extraction
    • New and updated feature extractors
    • Updated Vamp plugin
    • Much better documentation, more Python examples
    • Bugfixes, more unit tests, etc.
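
The execution order described for the streaming mode above (a topological sort of a directed acyclic network) can be sketched with Kahn's algorithm. This is an illustration only, not Essentia's scheduler: the node names and the dictionary representation of the network are hypothetical.

```python
from collections import deque

def topological_order(network):
    """Kahn's algorithm. `network` maps each node to the nodes it feeds."""
    indegree = {node: 0 for node in network}
    for targets in network.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1
    # Start from the nodes that have no incoming connections (sources).
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in network.get(node, []):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    if len(order) != len(indegree):
        raise ValueError("network contains a cycle")
    return order

# Hypothetical analysis chain; the names are placeholders, not real instances.
network = {
    "loader": ["framecutter"],
    "framecutter": ["windowing"],
    "windowing": ["spectrum"],
    "spectrum": ["mfcc", "centroid"],
    "mfcc": ["pool"],
    "centroid": ["pool"],
    "pool": [],
}
order = topological_order(network)
```

The sketch skips the transitive reduction step: a graph and its transitive reduction have the same reachability relation, so any valid topological order of one is also valid for the other.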

    For post-release bugfixes use the 2.0 branch.

    Ubuntu/Debian Libav compatibility:

    • Debian Wheezy - libav 6:0.8.17
    • Ubuntu Precise (12.04 LTS) - libav 4:0.8.17
    • Ubuntu Trusty (14.04 LTS) - libav 6:9.18
Owner
Music Technology Group - Universitat Pompeu Fabra
Software tools developed by the MTG