Manipulate audio with a simple and easy high level interface

Pydub

Pydub lets you do stuff to audio in a way that isn't stupid.

Quickstart

Open a WAV file

from pydub import AudioSegment

song = AudioSegment.from_wav("never_gonna_give_you_up.wav")

...or an mp3

song = AudioSegment.from_mp3("never_gonna_give_you_up.mp3")

... or an ogg, or flv, or anything else ffmpeg supports

ogg_version = AudioSegment.from_ogg("never_gonna_give_you_up.ogg")
flv_version = AudioSegment.from_flv("never_gonna_give_you_up.flv")

mp4_version = AudioSegment.from_file("never_gonna_give_you_up.mp4", "mp4")
wma_version = AudioSegment.from_file("never_gonna_give_you_up.wma", "wma")
aac_version = AudioSegment.from_file("never_gonna_give_you_up.aac", "aac")

Slice audio:

# pydub does things in milliseconds
ten_seconds = 10 * 1000

first_10_seconds = song[:ten_seconds]

last_5_seconds = song[-5000:]
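
Because slice indices are plain millisecond counts, human-readable timestamps need converting first. A minimal sketch, assuming a hypothetical helper named to_ms (it is not part of pydub) that accepts "ss", "mm:ss" or "hh:mm:ss" strings:

```python
def to_ms(timestamp):
    """Convert 'ss', 'mm:ss' or 'hh:mm:ss' to whole milliseconds (hypothetical helper)."""
    seconds = 0.0
    for part in timestamp.split(":"):
        # each colon-separated field is worth 60x the one to its right
        seconds = seconds * 60 + float(part)
    return int(round(seconds * 1000))

start, end = to_ms("0:10"), to_ms("1:30.5")
# With a loaded AudioSegment this would be used as: clip = song[start:end]
```

The resulting integers can be used directly as slice bounds.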

Make the beginning louder and the end quieter

# boost volume by 6dB
beginning = first_10_seconds + 6

# reduce volume by 3dB
end = last_5_seconds - 3
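
The numbers added or subtracted here are gains in decibels, not linear multipliers. As a rough illustration of the scale, using standard decibel arithmetic for amplitude (the helper name db_to_ratio is made up for this sketch):

```python
def db_to_ratio(db):
    """Amplitude ratio corresponding to a gain in dB (20 * log10 convention)."""
    return 10 ** (db / 20)

print(round(db_to_ratio(6), 3))   # roughly double the amplitude
print(round(db_to_ratio(-3), 3))  # roughly 0.7x the amplitude
```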

Concatenate audio (add one file to the end of another)

without_the_middle = beginning + end

How long is it?

without_the_middle.duration_seconds == 15.0

AudioSegments are immutable

# song is not modified
backwards = song.reverse()

Crossfade (again, beginning and end are not modified)

# 1.5 second crossfade
with_style = beginning.append(end, crossfade=1500)
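
A crossfade overlaps the two clips, so the result is shorter than a plain concatenation by the length of the crossfade. A small sketch of that arithmetic, using the 10 second and 5 second clip lengths from the slicing example (the function name is illustrative only):

```python
def appended_length_ms(first_ms, second_ms, crossfade_ms=0):
    """Length of two clips joined with an overlapping crossfade."""
    if crossfade_ms > min(first_ms, second_ms):
        raise ValueError("crossfade is longer than one of the clips")
    return first_ms + second_ms - crossfade_ms

# 10 s beginning + 5 s end, joined with a 1.5 s crossfade
print(appended_length_ms(10_000, 5_000, 1_500))  # 13500
```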

Repeat

# repeat the clip twice
do_it_over = with_style * 2

Fade (note that you can chain operations because everything returns an AudioSegment)

# 2 sec fade in, 3 sec fade out
awesome = do_it_over.fade_in(2000).fade_out(3000)

Save the results (again whatever ffmpeg supports)

awesome.export("mashup.mp3", format="mp3")

Save the results with tags (metadata)

awesome.export("mashup.mp3", format="mp3", tags={'artist': 'Various artists', 'album': 'Best of 2011', 'comments': 'This album is awesome!'})

You can pass an optional bitrate argument to export using any syntax ffmpeg supports.

awesome.export("mashup.mp3", format="mp3", bitrate="192k")

Any further arguments supported by ffmpeg can be passed as a list via the parameters argument, with each switch followed by its value. Note that no validation takes place on these parameters, and you may be limited by what your particular build of ffmpeg/libav supports.

# Use preset mp3 quality 0 (equivalent to lame V0)
awesome.export("mashup.mp3", format="mp3", parameters=["-q:a", "0"])

# Mix down to two channels and set hard output volume
awesome.export("mashup.mp3", format="mp3", parameters=["-ac", "2", "-vol", "150"])
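
To make the ordering of switches and their values concrete, here is a rough sketch of how such a list could slot into an ffmpeg command line. The build_export_command helper and the exact argument order are assumptions for illustration, not pydub's actual implementation:

```python
import shlex

def build_export_command(src, dst, fmt, bitrate=None, parameters=None):
    """Assemble an ffmpeg-style command list (illustrative, not pydub's code)."""
    cmd = ["ffmpeg", "-y", "-f", "wav", "-i", src]
    if bitrate is not None:
        cmd += ["-b:a", bitrate]
    if parameters:
        cmd += list(parameters)  # passed through untouched, no validation
    cmd += ["-f", fmt, dst]
    return cmd

cmd = build_export_command("in.wav", "mashup.mp3", "mp3", parameters=["-q:a", "0"])
print(shlex.join(cmd))
```

Each switch and its value stay adjacent in the list, which is why they are given as separate strings rather than one combined string.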

Debugging

Most issues people run into are related to converting between formats using ffmpeg/libav. Pydub provides a logger that outputs the subprocess calls to help you track down issues:

>>> import logging

>>> l = logging.getLogger("pydub.converter")
>>> l.setLevel(logging.DEBUG)
>>> l.addHandler(logging.StreamHandler())

>>> AudioSegment.from_file("./test/data/test1.mp3")
subprocess.call(['ffmpeg', '-y', '-i', '/var/folders/71/42k8g72x4pq09tfp920d033r0000gn/T/tmpeZTgMy', '-vn', '-f', 'wav', '/var/folders/71/42k8g72x4pq09tfp920d033r0000gn/T/tmpK5aLcZ'])
<pydub.audio_segment.AudioSegment object at 0x101b43e10>

Don't worry about the temporary files used in the conversion. They're cleaned up automatically.

Bugs & Questions

You can file bugs in our GitHub issue tracker, and ask any technical questions on Stack Overflow using the pydub tag. We keep an eye on both.

Installation

Installing pydub is easy, but don't forget to install ffmpeg/libav (see "Getting ffmpeg set up" below)

pip install pydub

Or install the latest dev version from github (or replace @master with a release version like @v0.12.0)…

pip install git+https://github.com/jiaaro/pydub.git@master

-OR-

git clone https://github.com/jiaaro/pydub.git

-OR-

Copy the pydub directory into your python path. Zip here

Dependencies

You can open and save WAV files with pure python. For opening and saving non-wav files – like mp3 – you'll need ffmpeg or libav.
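
For plain WAV data the standard library's wave module is sufficient, which is why no external converter is needed in that case. A minimal sketch that writes and reads a WAV file without pydub or ffmpeg (the write_tone helper and the file name are invented for the example):

```python
import math
import os
import struct
import tempfile
import wave

def write_tone(path, freq=440.0, seconds=1.0, rate=44100):
    """Write a mono 16-bit sine tone to a WAV file using only the stdlib."""
    n_frames = int(seconds * rate)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)      # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))

path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_tone(path)
with wave.open(path, "rb") as wav:
    n_read, rate_read = wav.getnframes(), wav.getframerate()
print(n_read / rate_read)  # duration in seconds
```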

Playback

You can play audio if you have one of these installed (simpleaudio strongly recommended, even if you are installing ffmpeg/libav):

  • simpleaudio
  • pyaudio
  • ffplay (usually bundled with ffmpeg, see the next section)
  • avplay (usually bundled with libav, see the next section)

from pydub import AudioSegment
from pydub.playback import play

sound = AudioSegment.from_file("mysound.wav", format="wav")
play(sound)

Getting ffmpeg set up

You may use libav or ffmpeg.

Mac (using homebrew):

# libav
brew install libav --with-libvorbis --with-sdl --with-theora

####    OR    #####

# ffmpeg
brew install ffmpeg --with-libvorbis --with-sdl2 --with-theora

Linux (using apt):

# libav
apt-get install libav-tools libavcodec-extra

####    OR    #####

# ffmpeg
apt-get install ffmpeg libavcodec-extra

Windows:

  1. Download and extract libav from Windows binaries provided here.
  2. Add the libav /bin folder to your PATH envvar
  3. pip install pydub
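
Whichever platform you are on, it is worth confirming that a converter is actually reachable on your PATH before loading non-WAV files. A quick check using only the standard library (the find_converter helper is hypothetical; ffmpeg and avconv are the executable names mentioned in pydub's own warning message):

```python
import shutil

def find_converter(candidates=("ffmpeg", "avconv")):
    """Return the full path of the first converter found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

converter = find_converter()
print(converter or "no ffmpeg/avconv on PATH; non-WAV formats will not work")
```

If this prints no path, revisit the PATH step for your platform before debugging anything else.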

Important Notes

AudioSegment objects are immutable

Ogg exporting and default codecs

The Ogg specification (http://tools.ietf.org/html/rfc5334) does not specify the codec to use; this choice is left up to the user. Vorbis and Theora are just some of a number of potential codecs (see page 3 of the RFC) that can be used for the encapsulated data.

When no codec is specified, exporting to ogg will default to using vorbis as a convenience. That is:

from pydub import AudioSegment
song = AudioSegment.from_mp3("test/data/test1.mp3")
song.export("out.ogg", format="ogg")  # Is the same as:
song.export("out.ogg", format="ogg", codec="libvorbis")

Example Use

Suppose you have a directory filled with mp4 and flv videos and you want to convert all of them to mp3 so you can listen to them on your mp3 player.

import os
import glob
from pydub import AudioSegment

video_dir = '/home/johndoe/downloaded_videos/'  # Path where the videos are located
extension_list = ('*.mp4', '*.flv')

os.chdir(video_dir)
for extension in extension_list:
    for video in glob.glob(extension):
        mp3_filename = os.path.splitext(os.path.basename(video))[0] + '.mp3'
        AudioSegment.from_file(video).export(mp3_filename, format='mp3')

How about another example?

from glob import glob
from pydub import AudioSegment

playlist_songs = [AudioSegment.from_mp3(mp3_file) for mp3_file in glob("*.mp3")]

first_song = playlist_songs.pop(0)

# let's just include the first 30 seconds of the first song (slicing
# is done by milliseconds)
beginning_of_song = first_song[:30*1000]

playlist = beginning_of_song
for song in playlist_songs:

    # We don't want an abrupt stop at the end, so let's do a 10 second crossfade
    playlist = playlist.append(song, crossfade=(10 * 1000))

# let's fade out the end of the last song
playlist = playlist.fade_out(30)

# hmm I wonder how long it is... ( len(audio_segment) returns milliseconds )
# integer division gives whole minutes, which keeps the file name tidy
playlist_length = len(playlist) // (1000*60)

# let's save it!
with open("%s_minute_playlist.mp3" % playlist_length, 'wb') as out_f:
    playlist.export(out_f, format='mp3')

License (MIT License)

Copyright © 2011 James Robert, http://jiaaro.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Comments
  • Problems with AudioSegment.from_mp3

    Hello James, I have Python2.7 (Win) and got latest pydub. When trying to open mp3 file using: song = AudioSegment.from_mp3("b.mp3") (while double checking with os.listdir that I do have the file in current folder), I'm getting:

    song = AudioSegment.from_mp3("b.mp3")

    Traceback (most recent call last):
      File "<pyshell#5>", line 1, in <module>
        song = AudioSegment.from_mp3("b.mp3")
      File "build\bdist.win32\egg\pydub\audio_segment.py", line 318, in from_mp3
        return cls.from_file(file, 'mp3')
      File "build\bdist.win32\egg\pydub\audio_segment.py", line 302, in from_file
        retcode = subprocess.call(convertion_command, stderr=open(os.devnull))
      File "C:\Python27\lib\subprocess.py", line 486, in call
        return Popen(*popenargs, **kwargs).wait()
      File "C:\Python27\lib\subprocess.py", line 672, in __init__
        errread, errwrite)
      File "C:\Python27\lib\subprocess.py", line 882, in _execute_child
        startupinfo)
    WindowsError: [Error 2] The system cannot find the file specified

    What's the problem? Additionally, the first time I run from pydub import AudioSegment I get: Warning (from warnings module): File "C:\Python27\lib\site-packages\pydub-0.9.2-py2.7.egg\pydub\utils.py", line 122 RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work

    Repeating the import produces no warning.

    So what could be preventing me from opening a plain mp3 file?

    opened by sopekmir 50
  •  Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work   warn(

    Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)

    Steps to reproduce

    from pydub import AudioSegment

    Expected behavior

    load the library

    Actual behavior

    C:\Users\DELL\Anaconda3\lib\site-packages\pydub\utils.py:165: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)

    Your System configuration

    • Python version: 3.6
    • Pydub version: 0.23.0
    • ffmpeg or avlib?:
    • ffmpeg/avlib version:
    opened by saurabhbidwai 40
  • pydub operating direct to disk vs in memory?

    EDIT: All tickets related to large audio files and out-of-memory errors are being merged into #135 -- jiaaro

    Original issue after the break


    Hey,

    Hopefully this is an okay forum to ask about this, but I was wondering if you had any thoughts on adapting pydub to operate direct to disk rather than in memory. From what I can tell, an operation that is overlaying, say, multiple hour-long audio files is going to have quite a massive memory footprint.

    If I wanted to adapt pydub to operate directly on-disk, do you have any thoughts on how involved that would be? Is that something that could play nice with audioop?

    Just looking for some thoughts here...

    Julian

    opened by lepinsk 19
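
For a sense of scale, the size of uncompressed PCM audio can be estimated from duration, frame rate, channel count and sample width. A back-of-the-envelope sketch, assuming the whole decoded file is held in memory as raw samples (the function name and default values are illustrative):

```python
def pcm_bytes(seconds, frame_rate=44100, channels=2, sample_width=2):
    """Bytes needed to hold uncompressed PCM audio of the given length."""
    return int(seconds * frame_rate) * channels * sample_width

one_hour = pcm_bytes(60 * 60)
print(f"{one_hour / 2**20:.0f} MiB per hour of 16-bit stereo at 44.1 kHz")
```

Overlaying several hour-long files multiplies this figure by the number of segments held at once, plus any intermediate results.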
  • CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

    I am testing some code on a VM, and all of a sudden pydub started raising an error when loading wav files. Why?

    Python 3.5.5 |Anaconda custom (64-bit)| (default, May 13 2018, 21:12:35) Type 'copyright', 'credits' or 'license' for more information IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

    In [1]: from pydub import AudioSegment

    In [2]: file_path = "/data/SPEECH_SEGMENTS/T/Ses05M_script03_2_201.85128559266252-202.72063311702868.wav"

    In [3]: audio = AudioSegment.from_wav(file_path)

    CouldntDecodeError                        Traceback (most recent call last)
    in ()
    ----> 1 audio = AudioSegment.from_wav(file_path)

    /anaconda/envs/py35/lib/python3.5/site-packages/pydub/audio_segment.py in from_wav(cls, file, parameters)
        726     @classmethod
        727     def from_wav(cls, file, parameters=None):
    --> 728         return cls.from_file(file, 'wav', parameters)
        729
        730     @classmethod

    /anaconda/envs/py35/lib/python3.5/site-packages/pydub/audio_segment.py in from_file(cls, file, format, codec, parameters, **kwargs)
        702             raise CouldntDecodeError(
        703                 "Decoding failed. ffmpeg returned error code: {0}\n\nOutput from ffmpeg/avlib:\n\n{1}".format(
    --> 704                     p.returncode, p_err))
        705
        706         p_out = bytearray(p_out)

    CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

    Output from ffmpeg/avlib:

    b'ffmpeg version 2.8.15-0ubuntu0.16.04.1 Copyright (c) 2000-2018 the FFmpeg developers\n built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10) 20160609\n configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-opengl --enable-x11grab --enable-libdc1394 --enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 --enable-libopencv\n WARNING: library configuration mismatch\n avcodec configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus 
--enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-opengl --enable-x11grab --enable-libdc1394 --enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 --enable-libopencv --enable-version3 --disable-doc --disable-programs --disable-avdevice --disable-avfilter --disable-avformat --disable-avresample --disable-postproc --disable-swscale --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libvo_aacenc --enable-libvo_amrwbenc\n libavutil 54. 31.100 / 54. 31.100\n libavcodec 56. 60.100 / 56. 60.100\n libavformat 56. 40.101 / 56. 40.101\n libavdevice 56. 4.100 / 56. 4.100\n libavfilter 5. 40.101 / 5. 40.101\n libavresample 2. 1. 0 / 2. 1. 0\n libswscale 3. 1.101 / 3. 1.101\n libswresample 1. 2.101 / 1. 2.101\n libpostproc 53. 3.100 / 53. 3.100\n[wav @ 0x2174360] invalid start code [0][0][0][0] in RIFF header\n/data/SPEECH_SEGMENTS/T/Ses05M_script03_2_201.85128559266252-202.72063311702868.wav: Invalid data found when processing input\n'

    Your System configuration

    • Python version: 3.5.5
    • Pydub version: latest version
    opened by bwang482 18
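
The ffmpeg message about an invalid start code in the RIFF header suggests the file does not begin with valid WAV data. One way to check that independently of pydub is to inspect the first 12 bytes, which in a standard WAV file are RIFF, a 4-byte size, then WAVE. A sketch (the looks_like_wav helper is hypothetical and does not cover variants such as RF64):

```python
import os
import tempfile

def looks_like_wav(path):
    """True if the file starts with a RIFF....WAVE header."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"

# Demonstrate on a file that begins with zero bytes, like the one in the error
bad = os.path.join(tempfile.mkdtemp(), "bad.wav")
with open(bad, "wb") as f:
    f.write(b"\x00" * 44)
print(looks_like_wav(bad))  # False
```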
  • Mp3 Export output

    The file produced when I use:

    sound.export("mashup.mp3", format="mp3")

    is unplayable. Here is the output from $ avplay

    avplay mashup.mp3 
    avplay version 0.8.3-4:0.8.3-0ubuntu0.12.04.1, Copyright (c) 2003-2012 the Libav developers
      built on Jun 12 2012 16:37:58 with gcc 4.6.3
    [mp3 @ 0xb2300480] Format detected only with low score of 25, misdetection possible!
    [mp3 @ 0xb2300480] Could not find codec parameters (Audio: mp3, 0 channels, s16)
    [mp3 @ 0xb2300480] Estimating duration from bitrate, this may be inaccurate
    mashup.mp3: could not find codec parameters
    

    Exporting to wav however does work.

    opened by asmedrano 18
  • Adding audio files (different media info)

    Amazing library Pydub!

    I'm getting some audio cut off at the end when I try to join these 3 files into one. There are obvious differences between them. What would be your recommended approach to achieve this? The audio is cut short by approx. 1 sec. every time. I tried different files and the same thing happens. Audio can come in any variation of these examples and in any order. Example:

    ====== NEW AUDIO SOURCE =======
    DEBUG:pydub.converter:subprocess.call(['avconv', '-y', '-f', 'mp3', '-i', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmp_Aald3', '-vn', '-f', 'wav', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmpUwSy3h'])
    ||||||||
    Sample Width: 2
    Frame Rate: 22050
    Frame Width: 4
    Channels: 2
    Duration in Seconds: 5.48571428571
    ||||||||
    
    ====== NEW AUDIO SOURCE =======
    DEBUG:pydub.converter:subprocess.call(['avconv', '-y', '-f', 'mp3', '-i', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmpRSXPFC', '-vn', '-f', 'wav', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmpyuxKub'])
    ||||||||
    Sample Width: 2
    Frame Rate: 22050
    Frame Width: 4
    Channels: 2
    Duration in Seconds: 30.8506122449
    ||||||||
    
    ====== NEW AUDIO SOURCE =======
    DEBUG:pydub.converter:subprocess.call(['avconv', '-y', '-f', 'mp3', '-i', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmpUUIaqx', '-vn', '-f', 'wav', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmp0QC4l8'])
    ||||||||
    Sample Width: 2
    Frame Rate: 24000
    Frame Width: 2
    Channels: 1
    Duration in Seconds: 5.448
    ||||||||
    
    ====== FINAL MIXED AUDIO PARTS =======
    ||||||||
    Sample Width: 2
    Frame Rate: 24000
    Frame Width: 4
    Channels: 2
    Duration in Seconds: 41.7842916667
    ||||||||
    DEBUG:pydub.converter:subprocess.call(['avconv', '-y', '-f', 'wav', '-i', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmpN4XkDz', '-f', 'mp3', '/var/folders/1_/551bnby96jb5yy_w__0b9_1r0000gn/T/tmpfmKnzd'])
    
    
    opened by lithiumlab 17
  • local variable 'start_ii' referenced before assignment

    Steps to reproduce

    Expected behavior

    I am running the code given in https://www.geeksforgeeks.org/python-speech-recognition-on-large-audio-files/ to transcribe large wav files

    Actual behavior

    But i am getting this error

    UnboundLocalError                         Traceback (most recent call last)
    in
        100
        101
    --> 102 silence_based_conversion("creedoscar.wav")
        103

    in silence_based_conversion(path)
         29         # consider it silent if quieter than -16 dBFS
         30         # adjust this per requirement
    ---> 31         silence_thresh = -16
         32     )
         33

    ~/.local/lib/python3.6/site-packages/pydub/silence.py in split_on_silence(audio_segment, min_silence_len, silence_thresh, keep_silence, seek_step)
        132         start_min = end_max
        133
    --> 134     chunks.append(audio_segment[max(start_min, start_ii - keep_silence):
        135                                 min(len(audio_segment), end_ii + keep_silence)])
        136

    UnboundLocalError: local variable 'start_ii' referenced before assignment

    Your System configuration

    • Python version: Python 3.6.9
    • Pydub version: 0.24.0
    • ffmpeg or avlib?: ffmpeg
    • ffmpeg/avlib version: ffmpeg version 3.4.6-0ubuntu0.18.04.1

    Is there an audio file you can include to help us reproduce?

    You can include the audio file in this issue - just put it in a zip file and drag/drop the zip file into the github issue.

    opened by dumbcoder2399 16
  • Added ability to invert the phase of only one channel

    What it says on the tin. Since reversing the phase of only one channel requires use of audioop, I figured it would be better to extend the invert_phase function within pydub rather than add that functionality in an external program.

    This shouldn't break any existing programs using Pydub since it will behave in exactly the same way as it did before if no arguments are passed.

    opened by cruxicheiros 16
  • Support files with frame rates over 48KHz and don't use temporary filenames

    This allows pydub to open files having more than 48KHz and/or 32-bit data if scipy is available.

    If scipy is not available, it falls back to using the standard wave module as before.

    Also, add a new from_file function that does all the reading on memory with pipes, not using any temporary file, which is faster and doesn't wear down disks for heavy usages.

    The new from_file function reads the input file and passes it to ffmpeg using a pipe and then reads ffmpeg output using another pipe directly to memory.

    Since wav files have the file length in the header and ffmpeg can't write it since it's working on a stream, we modify the resulting raw data from ffmpeg before reading it using the standard method.

    Also, rename from_file to from_file_using_temporary_files just in case there's any case in which the new from_file function doesn't work (I couldn't find any, but just in case, I guess it would be nice to keep it maybe as deprecated).

    Fixes #134 Fixes #237 Might also fix #209

    opened by antlarr 15
  • Permission denied

    I want to use pydub's playback module to play some music, but it always tells me that I don't have permission. By the way, I can play the music from cmd.exe.

    The code looks like this:

    In [9]: sou1=AudioSegment.from_mp3(r'd:\02.mp3')

    In [10]: play(sou1)

    IOError                                   Traceback (most recent call last)
    in ()
    ----> 1 play(sou1)

    c:\python27\lib\site-packages\pydub\playback.pyc in play(audio_segment)
         44         _play_with_pyaudio(audio_segment)
         45     except ImportError:
    ---> 46         _play_with_ffplay(audio_segment)

    c:\python27\lib\site-packages\pydub\playback.pyc in _play_with_ffplay(seg)
         16 def _play_with_ffplay(seg):
         17     with NamedTemporaryFile("w+b", suffix=".wav") as f:
    ---> 18         seg.export(f.name, "wav")
         19         subprocess.call([PLAYER, "-nodisp", "-autoexit", "-hide_banner", f.name])
         20

    c:\python27\lib\site-packages\pydub\audio_segment.pyc in export(self, out_f, format, codec, bitrate, parameters, tags, id3v2_version, cover)
        579         id3v2_allowed_versions = ['3', '4']
        580
    --> 581         out_f = _fd_or_path_or_tempfile(out_f, 'wb+')
        582         out_f.seek(0)
        583

    c:\python27\lib\site-packages\pydub\utils.pyc in _fd_or_path_or_tempfile(fd, mode, tempfile)
         57
         58     if isinstance(fd, basestring):
    ---> 59         fd = open(fd, mode=mode)
         60
         61     return fd

    IOError: [Errno 13] Permission denied: 'c:\users\admini~1\appdata\local\temp\tmpuzjjby.wav'

    System configuration

    • Python version: 2.7
    • Pydub version: 0.20.0
    • ffmpeg or avlib?: ffmpeg
    • ffmpeg/avlib version: 3.3.3
    opened by anmaz 14
  • Couldn't install ffmpeg or libavcodec-extra-53. Only libav-tools installation works.

    I have Ubuntu 14.04 set to Spanish, so the error is printed in Spanish. This is a translation of what appears in the terminal when I run: sudo apt-get install ffmpeg libavcodec-extra-53


    Package ffmpeg is not available, but is referred to by another package. This may mean that the package is missing, has been obsoleted, or is only available from another source.

    Package libavcodec-extra-53 is not available, but is referred to by another package. This may mean that the package is missing, has been obsoleted, or is only available from another source. However, the following packages replace it: libav-tools:i386 libav-tools


    So I installed libav-tools as the terminal suggested, but when I run this in a Python file:

    from pydub import AudioSegment
    song = AudioSegment.from_file("piano.wav")

    # MP3
    song.export("piano.mp3", format="mp3")

    # Ogg Vorbis
    song.export("piano.ogg", format="ogg")

    # WMA (windows media)
    song.export("piano.wma", format="wma")

    # AAC (advanced audio coding)
    song.export("piano.m4a", format="aac")


    it only exports real, listenable audio files for the mp3 and ogg formats; for wma and aac it only exports 0-byte files that I can't listen to.

    I think this happens because I don't have ffmpeg or libavcodec-extra-53 installed, only libav-tools.

    Could you help with this? I have been looking for a way to fix this error for a while, but I couldn't find one.

    P.S.: Sorry for my awful English.

    opened by sergiofv 13
  • Fix a bug in `parameters` variable

    I was trying to read a very large MP4 file and convert it to WAV, so I passed some options to the from_file function through its parameters argument. My parameters were:

    parameters = ['-ac', '1', '-ar', '16000']
    

    which means that the resulting audio file should have a 16 kHz sample rate and a single channel (mono), but that did not work! After investigating, I found that the parameters are appended at the end of the command, so the command ends up looking something like:

    ffmpeg -y -i my_video.mp4 -acodec pcm_s16le -vn -f wav - -ac 1 -ar 16000
    

    I changed it to add parameters in the middle where the command should look like this:

    ffmpeg -y -i my_video.mp4 -ac 1 -ar 16000 -acodec pcm_s16le -vn -f wav -
    

    and it worked!!

    opened by farisalasmary 0
  • AudioSegment from torch tensor

    Steps to reproduce

    Expected behavior

    Tell us what should happen

    Actual behavior

    I want to convert my torch tensor to AudioSegment but i dont see you have that function already

    Your System configuration

    • Python version: 3.8
    • Pydub version:
    • ffmpeg or avlib?:
    • ffmpeg/avlib version:

    Is there an audio file you can include to help us reproduce?

    You can include the audio file in this issue - just put it in a zip file and drag/drop the zip file into the github issue.

    opened by GoldDRoge 0
  • Tests fail without libopenh264 installed (which happens on Linux distros)

    Steps to reproduce

    Run the test suite.

    Expected behavior

    Testsuite should pass.

    Actual behavior

    It doesn’t:

    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp3_with_tags (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 879, in test_export_mp3_with_tags
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export(tmp_mp3_file, format="mp3", tags=tags)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp4_as_mp3 (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 814, in test_export_mp4_as_mp3
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export(tmp_mp3_file,
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp4_as_mp3_with_tags (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 837, in test_export_mp4_as_mp3_with_tags
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export(tmp_mp3_file,
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp4_as_mp3_with_tags_raises_exception_when_id3version_is_wrong (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 861, in test_export_mp4_as_mp3_with_tags_raises_exception_when_id3version_is_wrong
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export,
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp4_as_mp3_with_tags_raises_exception_when_tags_are_not_a_dictionary (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 849, in test_export_mp4_as_mp3_with_tags_raises_exception_when_tags_are_not_a_dictionary
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export, tmp_mp3_file,
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp4_as_ogg (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 805, in test_export_mp4_as_ogg
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export(tmp_ogg_file,
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_export_mp4_as_wav (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 823, in test_export_mp4_as_wav
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export(tmp_wav_file,
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ======================================================================
    [   37s] ERROR: test_exporting_to_ogg_uses_default_codec_when_codec_param_is_none (test.test.AudioSegmentTests)
    [   37s] ----------------------------------------------------------------------
    [   37s] Traceback (most recent call last):
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/test/test.py", line 998, in test_exporting_to_ogg_uses_default_codec_when_codec_param_is_none
    [   37s]     AudioSegment.from_file(self.mp4_file_path).export(tmp_ogg_file, format="ogg")
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/audio_segment.py", line 728, in from_file
    [   37s]     info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
    [   37s]   File "/home/abuild/rpmbuild/BUILD/pydub-0.25.1/pydub/utils.py", line 279, in mediainfo_json
    [   37s]     info = json.loads(output)
    [   37s]   File "/usr/lib64/python3.8/json/__init__.py", line 357, in loads
    [   37s]     return _default_decoder.decode(s)
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
    [   37s]     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    [   37s]   File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
    [   37s]     obj, end = self.scan_once(s, idx)
    [   37s] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 2)
    [   37s] 
    [   37s] ----------------------------------------------------------------------
    [   37s] Ran 113 tests in 28.835s
    [   37s] 
    [   37s] FAILED (errors=8)
    
    

    Your System configuration

    • Python version: Python 3.8.15, 3.9.15, 3.10.8
    • Pydub version: 0.25.1
    • ffmpeg or avlib?: ffmpeg-5-5.1.2 from openSUSE/Factory (i.e., there is no libopenh264-7 available)
    • ffmpeg/avlib version: 5.1.2

    Is there an audio file you can include to help us reproduce?

    It’s all about test/data/creative_common.mp4. Most Linux distributions cannot legally carry the libopenh264 library, so ffmpeg is patched to mostly work without it (unless the library is explicitly dlopened). Unfortunately, the utils.mediainfo_json function does not check the exit code returned by ffprobe, so the failure is ignored and the resulting invalid string is passed to json.loads. The resulting error message is completely confusing.

    This code (line 273 in pydub/utils.py):

    command = [prober, '-of', 'json'] + command_args
    res = Popen(command, stdin=stdin_parameter, stdout=PIPE, stderr=PIPE)
    output, stderr = res.communicate(input=stdin_data)
    output = output.decode("utf-8", 'ignore')
    stderr = stderr.decode("utf-8", 'ignore')
    

    badly needs some error handling.
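
A minimal sketch of the kind of check being asked for, assuming a hypothetical helper `run_probe_json` and a hypothetical `ProbeError` exception (neither exists in pydub). It runs the prober, inspects the exit code, and only then hands the output to json.loads:

```python
import json
import subprocess
import sys


class ProbeError(RuntimeError):
    """Hypothetical exception for a failed probe run; not part of pydub."""


def run_probe_json(command, stdin_data=None):
    """Run a probe command and parse its stdout as JSON.

    Hypothetical helper illustrating the missing check. `command` stands in
    for the [prober, '-of', 'json'] + command_args list from the snippet above.
    """
    stdin_parameter = subprocess.PIPE if stdin_data is not None else None
    res = subprocess.Popen(
        command,
        stdin=stdin_parameter,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    output, stderr = res.communicate(input=stdin_data)
    output = output.decode("utf-8", "ignore")
    stderr = stderr.decode("utf-8", "ignore")

    # Fail loudly when the prober itself reports an error.
    if res.returncode != 0:
        raise ProbeError(
            "probe exited with code {}: {}".format(res.returncode, stderr.strip())
        )

    # Even with exit code 0 the output may not be valid JSON.
    try:
        return json.loads(output)
    except json.JSONDecodeError as exc:
        raise ProbeError("probe returned invalid JSON: {}".format(exc)) from exc


if __name__ == "__main__":
    # Stand-in command (the Python interpreter) so the sketch runs without ffprobe.
    print(run_probe_json([sys.executable, "-c", "print('{\"streams\": []}')"]))
```

With a check along these lines, a failing ffprobe run would presumably surface as an error carrying the prober's own stderr rather than as a JSONDecodeError.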

    I have included just this workaround, but it is a pretty bad hack. Some testing for actual libopenh264 availability should be included.

    ---
     test/test.py |   16 ++++++++++++++++
     1 file changed, 16 insertions(+)
    
    --- a/test/test.py
    +++ b/test/test.py
    @@ -796,6 +796,8 @@ class AudioSegmentTests(unittest.TestCas
                 AudioSegment.from_file(self.mp3_file_path).export(tmp_webm_file,
                                                                   format="webm")
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp4_as_ogg(self):
    @@ -803,6 +805,8 @@ class AudioSegmentTests(unittest.TestCas
                 AudioSegment.from_file(self.mp4_file_path).export(tmp_ogg_file,
                                                                   format="ogg")
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp4_as_mp3(self):
    @@ -810,6 +814,8 @@ class AudioSegmentTests(unittest.TestCas
                 AudioSegment.from_file(self.mp4_file_path).export(tmp_mp3_file,
                                                                   format="mp3")
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp4_as_wav(self):
    @@ -817,6 +823,8 @@ class AudioSegmentTests(unittest.TestCas
                 AudioSegment.from_file(self.mp4_file_path).export(tmp_wav_file,
                                                                   format="mp3")
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp4_as_mp3_with_tags(self):
    @@ -830,6 +838,8 @@ class AudioSegmentTests(unittest.TestCas
                                                                   format="mp3",
                                                                   tags=tags_dict)
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp4_as_mp3_with_tags_raises_exception_when_tags_are_not_a_dictionary(self):
    @@ -840,6 +850,8 @@ class AudioSegmentTests(unittest.TestCas
                     format="mp3", tags=json)
                 self.assertRaises(InvalidTag, func)
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp4_as_mp3_with_tags_raises_exception_when_id3version_is_wrong(self):
    @@ -854,6 +866,8 @@ class AudioSegmentTests(unittest.TestCas
                 )
                 self.assertRaises(InvalidID3TagVersion, func)
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_export_mp3_with_tags(self):
    @@ -973,6 +987,8 @@ class AudioSegmentTests(unittest.TestCas
             # average volume should be reduced
             self.assertTrue(compressed.rms < self.seg1.rms)
     
    +    @unittest.skipIf('NO_OPENH264' in os.environ,
    +                         "libopenh264 not available")
         @unittest.skipUnless('aac' in get_supported_decoders(),
                              "Unsupported codecs")
         def test_exporting_to_ogg_uses_default_codec_when_codec_param_is_none(self):
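
The decorator repeated throughout the patch could also be defined once and reused. A minimal sketch, assuming the same NO_OPENH264 environment variable as the patch; the name `requires_openh264`, the example test class, and `run_example` are hypothetical. It still relies on the environment variable and does not detect the library itself:

```python
import os
import unittest

# Evaluated once at import time, like the decorators in the patch.
requires_openh264 = unittest.skipIf(
    "NO_OPENH264" in os.environ, "libopenh264 not available"
)


class ExampleTests(unittest.TestCase):
    @requires_openh264
    def test_needs_openh264(self):
        # Placeholder body; a real test would decode the mp4 fixture here.
        self.assertTrue(True)


def run_example():
    """Run the example test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```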
    
    opened by mcepl 0
  • Android OGG audio files fail to decode

    Android OGG audio files fail to decode

    Steps to reproduce

    We have a Speech-To-Text API that uses a websocket connection to build a file from the audio coming in from the mobile device's microphone (iOS and Android).

    from pydub import AudioSegment
    import io
    
    if audio_format == AudioFormat.OGG:
        extension = "ogg"
        container = io.BytesIO(file_buffer)
        AudioSegment.from_file(container).export(name, format="ogg")
        container.seek(0)
    

    We also tried codec="libvorbis" and codec="opus", with the same result.

    Expected behavior

    Should create the file properly. This code does not fail when the audio bytes are coming from iOS, only from Android.

    Actual behavior

    Decoding throws an error. The output is below:

    E1129 13:09:05.272731793   70743 fork_posix.cc:76]           Other threads are currently calling into gRPC, skipping fork() handlers
    ERROR:root:Trying to create audio file with type: audio/ogg;codecs=opus - Decoding failed. ffmpeg returned error code: 1
    
    Output from ffmpeg/avlib:
    
    ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
      built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
      configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
      libavutil      56. 70.100 / 56. 70.100
      libavcodec     58.134.100 / 58.134.100
      libavformat    58. 76.100 / 58. 76.100
      libavdevice    58. 13.100 / 58. 13.100
      libavfilter     7.110.100 /  7.110.100
      libswscale      5.  9.100 /  5.  9.100
      libswresample   3.  9.100 /  3.  9.100
      libpostproc    55.  9.100 / 55.  9.100
    [ogg @ 0x55751faa5280] CRC mismatch!
        Last message repeated 2 times
    [ogg @ 0x55751faa5280] Header processing failed: Invalid data found when processing input
    [cache @ 0x55751faa5f00] Statistics, cache hits:0 cache misses:1
    cache:pipe:0: Invalid data found when processing input
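
The log above reports `CRC mismatch!` from the ogg demuxer. One way to inspect the incoming buffer before handing it to pydub is to verify the first Ogg page by hand. This is a diagnostic sketch, not a confirmed explanation of the Android failure; it follows the page layout and checksum described in RFC 3533, and the function names are hypothetical:

```python
import struct


def ogg_crc(data: bytes) -> int:
    """CRC-32 as used by Ogg: polynomial 0x04C11DB7, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def check_first_page(data: bytes) -> str:
    """Classify the first Ogg page as 'ok', 'bad-crc', 'truncated' or 'not-ogg'."""
    if data[:4] != b"OggS":
        return "not-ogg"
    if len(data) < 27:
        return "truncated"
    n_segments = data[26]
    header_len = 27 + n_segments
    if len(data) < header_len:
        return "truncated"
    # The segment table lists the payload size of each segment in bytes.
    page_len = header_len + sum(data[27:header_len])
    if len(data) < page_len:
        return "truncated"
    stored_crc = struct.unpack_from("<I", data, 22)[0]
    # The checksum is computed with the CRC field itself set to zero.
    page = data[:22] + b"\x00\x00\x00\x00" + data[26:page_len]
    return "ok" if ogg_crc(page) == stored_crc else "bad-crc"
```

Calling `check_first_page(file_buffer)` on the Android and iOS payloads may show whether the bytes are already damaged or incomplete when they arrive over the websocket.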
    

    Your System configuration

    • Python version: 3.8
    • Pydub version: 0.25.1
    • ffmpeg or avlib?: ffmpeg
    • ffmpeg/avlib version: 4.4.2-0ubuntu0.22.04.1

    Is there an audio file you can include to help us reproduce?

    You can include the audio file in this issue - just put it in a zip file and drag/drop the zip file into the github issue.

    opened by npattarone 0
Releases(v0.25.1)
  • v0.25.1(Mar 10, 2021)

  • v0.25.0(Mar 6, 2021)

    • Don't show a runtime warning about the optional ffplay dependency being missing until someone tries to use it
    • Documentation improvements
    • Python 3.9 support
    • Improved efficiency of loading wave files with pydub.AudioSegment.from_file()
    • Ensure pydub.AudioSegment().export() always returns files with a seek position at the beginning of the file
    • Added more EQ effects to pydub.scipy_effects (requires scipy to be installed)
    • Fix a packaging bug where the LICENSE file was not included in the source distribution
    • Add a way to instantiate a pydub.AudioSegment() with a portion of an audio file via pydub.AudioSegment().from_file()
  • v0.24.1(Jun 3, 2020)

    • Fix bug where ffmpeg errors in Python 3 are illegible
    • Fix bug where split_on_silence fails when there are one or fewer nonsilent segments
    • Fix bug in fallback audioop implementation
  • v0.24.0(May 12, 2020)

    • Fix inconsistent handling of 8-bit audio
    • Fix bug where certain files will fail to parse
    • Fix bug where pyaudio stream is not closed on error
    • Allow codecs and parameters in wav and raw export
    • Fix bug in pydub.AudioSegment.from_file where supplied codec is ignored
    • Allow pydub.silence.split_on_silence to take a boolean for keep_silence
    • Fix bug where pydub.silence.split_on_silence sometimes adds non-silence from adjacent segments
    • Fix bug where pydub.AudioSegment.extract_wav_headers fails on empty wav files
    • Add new function pydub.silence.detect_leading_silence
    • Support conversion between an arbitrary number of channels and mono in pydub.AudioSegment.set_channels
    • Fix several issues related to reading from filelike objects
  • v0.23.1(Jan 22, 2019)

    • Fix bug in passing ffmpeg/avconv parameters for pydub.AudioSegment.from_mp3(), pydub.AudioSegment.from_flv(), pydub.AudioSegment.from_ogg(), and pydub.AudioSegment.from_wav()
    • Fix logic bug in pydub.effects.strip_silence()
  • v0.23.0(Sep 17, 2018)

    • Add support for playback via simpleaudio
    • Allow users to override the type in pydub.AudioSegment().get_array_of_samples() (PR #313)
    • Fix a bug where the wrong codec was used for 8-bit audio (PR #309 - issue #308)
  • v0.22.1(Jun 15, 2018)

  • v0.22.0(May 24, 2018)

    • Adds support for audio with frame rates (sample rates) of 48k and higher (requires scipy) (PR #262, fixes #134, #237, #209)
    • Adds support for PEP 519 File Path protocol (PR #252)
    • Fixes a few places where handles to temporary files are kept open (PR #280)
    • Add the license file to the python package to aid other packaging projects (PR #279, fixes #274)
    • Big fix for pydub.silence.detect_silence() (PR #263)
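
For readers unfamiliar with PEP 519: it defines the `__fspath__` method and `os.fspath()`, so that path-like objects such as `pathlib.Path` can be passed where a string path is expected. A small standard-library illustration of the protocol itself follows; the `AudioPath` class and `to_path_string` are hypothetical, and this is not pydub's internal code:

```python
import os


class AudioPath:
    """Hypothetical path-like object implementing the PEP 519 protocol."""

    def __init__(self, directory, name):
        self.directory = directory
        self.name = name

    def __fspath__(self):
        # os.fspath() calls this to obtain the str (or bytes) path.
        return self.directory + "/" + self.name


def to_path_string(path):
    """Accept str, bytes, or any path-like object and return its plain path."""
    return os.fspath(path)
```
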
  • v0.21.0(Feb 22, 2018)

    • NOTE: Semi-counterintuitive change: using a stride when slicing AudioSegment instances (for example, sound[::5000]) will return chunks of 5000ms (not 1ms chunks every 5000ms) (#222)
    • Debug output from ffmpeg/avlib is no longer printed to the console unless you set up logging (see README for how to set up logging for your converter) (#223)
    • All pydub exceptions are now subclasses of pydub.exceptions.PydubException
    • The utilities in pydub.silence now accept a seek_step argument which can optionally be passed to improve the performance of silence detection (#211)
    • Fix to pydub.silence utilities which allow you to detect perfect silence (#233)
    • Fix a bug where threaded code screws up your terminal session due to ffmpeg inheriting the stdin from the parent process. (#231)
    • Fix a bug where crashing programs using pydub would leave behind their temporary files (#206)
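
The logging setup referred to in the v0.21.0 notes looks roughly like this. The logger name `pydub.converter` is the one the pydub README uses; the variable name is illustrative, and the `StreamHandler` simply writes to stderr:

```python
import logging

# "pydub.converter" is the logger name the pydub README uses for ffmpeg/avlib output.
converter_logger = logging.getLogger("pydub.converter")
converter_logger.setLevel(logging.DEBUG)
converter_logger.addHandler(logging.StreamHandler())
```
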
  • v0.20.0(Aug 5, 2017)

    • Add new parameter gain_during_overlay to pydub.AudioSegment.overlay which allows users to adjust the volume of the target AudioSegment during the portion of the segment which is overlaid with the additional AudioSegment.
    • pydub.playback.play() no longer displays the (very verbose) playback "banner" when using ffplay
    • Fix a confusing error message when using invalid crossfade durations (issue #193)
  • v0.19.0(May 9, 2017)

    • Allow codec and ffmpeg/avconv parameters to be set in the pydub.AudioSegment.from_file() for more control while decoding audio files
    • Allow AudioSegment objects with more than two channels to be split using pydub.AudioSegment().split_to_mono()
    • Add support for inverting the phase of only one channel in a multi-channel pydub.AudioSegment object
    • Fix a bug with the latest avprobe that broke pydub.utils.mediainfo()
    • Add tests for webm encoding/decoding
  • v0.18.0(Feb 10, 2017)

    • Add a new constructor: pydub.AudioSegment.from_mono_audiosegments() which allows users to create a multi-channel audiosegment out of multiple mono ones.
    • Refactor pydub.AudioSegment._sync() to support an arbitrary number of audiosegment arguments.
  • v0.17.0(Feb 4, 2017)

    • Add the ability to add a cover image to MP3 exports via the cover keyword argument to pydub.AudioSegment().export()
    • Add pydub.AudioSegment().get_dc_offset() and pydub.AudioSegment().remove_dc_offset() which allow detection and removal of DC offset in audio files.
    • Minor fixes for windows users
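
DC offset is the mean displacement of a waveform from zero. The plain-Python sketch below shows the idea on a list of integer samples. It illustrates the concept only and is not how pydub implements get_dc_offset() or remove_dc_offset(); the function names are hypothetical:

```python
def dc_offset(samples):
    """Return the mean sample value, i.e. the constant (DC) component."""
    if not samples:
        return 0.0
    return sum(samples) / len(samples)


def remove_dc(samples):
    """Return a copy of `samples` re-centred around zero (rounded to integers)."""
    offset = dc_offset(samples)
    return [int(round(s - offset)) for s in samples]
```
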
  • v0.16.7(Jan 6, 2017)

  • v0.16.6(Oct 12, 2016)

  • v0.16.3(Jan 11, 2016)

  • v0.16.2(Jan 11, 2016)

  • v0.16.1(Jan 11, 2016)

  • v0.16.0(Nov 5, 2015)

    Added an official way to access raw audio data (audio_segment.raw_data) and a helper method for getting an array of the samples (audio_segment.get_array_of_samples())

  • v0.15.0(Aug 18, 2015)

  • v0.14.2(Jul 14, 2015)

  • v0.14.0(Jun 24, 2015)

  • v0.12.0(Jun 23, 2015)

  • v0.11.0(Apr 17, 2015)

  • v0.10.0(Feb 23, 2015)

Synthesia but open source, made in python and free

PyPiano Synthesia but open source, made in python and free Requirements are in requirements.txt If you struggle with installation of pyaudio, run : pi

DaCapo 11 Nov 06, 2022
User-friendly Voice Cloning Application

Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transferring speaker-specific audio featur

Sven Eschlbeck 19 Dec 30, 2022
Praat in Python, the Pythonic way

Parselmouth - Praat in Python, the Pythonic way Parselmouth is a Python library for the Praat software. Though other attempts have been made at portin

Yannick Jadoul 786 Jan 09, 2023
This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks

This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks ...

Mohan Ram S 1 Dec 30, 2021
Vixtify - Python Controlled Music Player

Strumm Sound Playlist : Click me to listen Welcome to GitHub Pages You can use the editor on GitHub to maintain and preview the content for your websi

Vicky Kumar 2 Feb 03, 2022
Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files according to their common names

Batch Sorting Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files accord

David Mainoo 1 Oct 29, 2021
A voice control utility for Spotify

Spotify Voice Control A voice control utility for Spotify · Report Bug · Request

Shoubhit Dash 27 Jan 01, 2023
Voice to Text using Raspberry Pi

This module will help to convert your voice (speech) into text using Speech Recognition Library. You can control the devices or you can perform the desired tasks by the word recognition

Raspberry_Pi Pakistan 2 Dec 15, 2021
A voice assistant which can be used to interact with your computer and controls your pc operations

Introduction 👨‍💻 It is a voice assistant which can be used to interact with your computer and also you have been seeing it in Iron man movies, but t

Sujith 84 Dec 22, 2022
Audio augmentations library for PyTorch for audio in the time-domain

Audio augmentations library for PyTorch for audio in the time-domain, with support for stochastic data augmentations as used often in self-supervised / contrastive learning.

Janne 166 Jan 08, 2023
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Auditory Slow-Fast This repository implements the model proposed in the paper: Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen, Slow-Fa

Evangelos Kazakos 57 Dec 07, 2022
LibXtract is a simple, portable, lightweight library of audio feature extraction functions.

LibXtract LibXtract is a simple, portable, lightweight library of audio feature extraction functions. The purpose of the library is to provide a relat

Jamie Bullock 215 Nov 16, 2022
Powerful, simple, audio tag editor for GNU/Linux

puddletag puddletag is an audio tag editor (primarily created) for GNU/Linux similar to the Windows program, Mp3tag. Unlike most taggers for GNU/Linux

341 Dec 26, 2022
Analysis of voices based on the Mel-frequency band

Speaker_partition_module Analysis of voices based on the Mel-frequency band. Goal: Identification of voices speaking (diarization) and calculation of

1 Feb 06, 2022
Library for Python 3 to communicate with the Google Chromecast.

pychromecast Library for Python 3.6+ to communicate with the Google Chromecast. It currently supports: Auto discovering connected Chromecasts on the n

Home Assistant Libraries 2.4k Jan 02, 2023
Python library for audio and music analysis

librosa A python package for music and audio analysis. Documentation See https://librosa.org/doc/ for a complete reference manual and introductory tut

librosa 5.6k Jan 06, 2023
A python wrapper for REAPER

pyreaper A python wrapper for REAPER (Robust Epoch And Pitch EstimatoR) Installation pip install pyreaper Demonstration notebook http://nbviewer.jupy

Ryuichi Yamamoto 56 Dec 27, 2022
In this project we can see how we can generate automatic music using character RNN.

Automatic Music Genaration Table of Contents Project Description Approach towards the problem Limitations Libraries Used Summary Applications Referenc

Pronay Ghosh 2 May 27, 2022
MelGAN test on audio decoding

Official repository for the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis The original work URL: https://github.com

Jurio 1 Apr 29, 2022
A simple voice detection system which can be applied practically for designing a device with capability to detect a baby’s cry and automatically turning on music

Auto-Baby-Cry-Detection-with-Music-Player A simple voice detection system which can be applied practically for designing a device with capability to d

2 Dec 15, 2021