
Comments (19)

marcwebbie commented on July 17, 2024

Hello Julian

I think it's fine to operate on it from disk; you'd just interface it using StringIO/BytesIO. Some operations would be quite a bit slower, though.

from pydub.

lepinsk commented on July 17, 2024

Hey Marc,

Thanks for the reply – that's my expectation as well. I appreciate that disk IO is going to be a whole lot slower, but in situations where the operations are unable to complete in RAM I'd like to have that in place as a failover.

Julian


jiaaro commented on July 17, 2024

The AudioSegment class loads the full audio data into RAM, but it's possible to do most operations in chunks.

Probably the best option would be to create a new class LargeAudioSegment or similar which handles the chunking and defers actual processing to the main AudioSegment class.


tthienpont commented on July 17, 2024

Hi Julian,
I'm running into the same memory issue. As a true Python newbie who probably won't succeed in adapting this myself, I was wondering whether you made any progress with using chunks to prevent out-of-memory issues?

My script, which opens MP3s of 3 to 4 hours, frequently runs out of memory. I can see it happening when watching "top".

Any help is appreciated.

Cheers,
Tim


marcwebbie commented on July 17, 2024

Hello @tthienpont

Can you show me a code snippet of what you're doing with pydub? You could change the StringIO/BytesIO interface to use a file instead; with a couple of lines you could easily monkey-patch pydub.


lepinsk commented on July 17, 2024

Hey @marcwebbie ,

I can't speak for @tthienpont, but in my case I'm using pydub to overlay a series of audio files, each of which can be up to several hours long. Not sure how informative this is, but here's a snippet:

for layer in layers:
    print("overlaying '%s' at t=%sms" % (layer["name"], layer["offset"]))
    output = output.overlay(AudioSegment.from_file(layer["audio-path"]),
                            position=int(layer["offset"]))
    output_mixing_progress(increment=True)


jiaaro commented on July 17, 2024

Unfortunately there is no quick solution to this, but it's worth pointing out that most audioop operations can be performed in chunks, so while pydub operates entirely in RAM at present, fixing that is not an insurmountable task.

If it is critical for you to get this working, I would advise you to take a look at the following pydub methods for reference while implementing a custom, lower-level solution:

  • AudioSegment._sync() method: ensures that the audio segments being combined have the same number of channels, bits per sample, sample rate, etc.
  • AudioSegment.overlay() method: makes the audioop calls necessary to perform the overlay operation. Note: AudioSegment()._data (self._data in the method) is the actual audio data in the form of a bytestring.

One last bit of info that may be helpful: there is a pure-Python (re-)implementation of audioop included in pydub in order to support PyPy (in fact, the forthcoming PyPy support for audioop comes from our implementation 😄). It might help you understand what the audioop functions are actually doing, though I admit it is a bit opaque even in Python.


tthienpont commented on July 17, 2024

Hi Guys,
Sorry for the late answer.
The snippet (watch in awe...):

main_song = AudioSegment.from_mp3(selected_mix)

Not much to see, I'm afraid. If I open a file with, say, 3 hours' worth of music, I see a drastic drop in available memory. When it hits 0, game over. Isn't that why swapping was invented?

Cheers,
Tim


jiaaro commented on July 17, 2024

@tthienpont pydub began its life as a light wrapper around Python's built-in audioop module, which takes audio data as arguments in the form of byte strings.

As a result, it is unfortunately not as simple as making the audio data iterable rather than storing it all in RAM at once. When you pass it to the audioop functions (which are implemented in C), the data will be converted into an array of bytes regardless of what we do on the Python side.

The solution is to break the audio data into chunks and call the audioop functions on each chunk (or use other strategies depending on the operation; RMS, for instance, would have to be done differently).
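For instance, RMS can be computed chunkwise by accumulating the sum of squares and the sample count, so only one chunk need be resident at a time. A stdlib sketch (not pydub API; the tiny array stands in for samples streamed from disk):

```python
import math
from array import array

def chunked_rms(samples, chunk_size=1024):
    """RMS via per-chunk sum-of-squares accumulation: only one
    chunk of samples has to be in memory at a time."""
    total_sq = 0
    count = 0
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        total_sq += sum(s * s for s in chunk)
        count += len(chunk)
    return math.sqrt(total_sq / count) if count else 0.0

# Tiny fake signal; real code would read each chunk from disk.
samples = array("h", [0, 3, -4, 3, -4, 0])
rms = chunked_rms(samples, chunk_size=4)
```

Because addition is associative, the chunked result is identical to a single-pass RMS regardless of chunk size.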

Anyway, I really wish it were as easy as using a better data structure or telling the interpreter to use swap space (and I'm still holding out hope that there's a way I'm overlooking =D), but as far as I know we'll need to implement more granular strategies on a per-method basis.


marcwebbie commented on July 17, 2024

From what I understand, what they're actually looking for is a way to store those strings on disk and load them into memory only when using the audioop wrapper side of pydub.

I suppose that tweaking pydub to dump the wave data to disk instead of loading it into StringIO/BytesIO, and loading it into RAM only when needed, would do it.

Of course, what looks simple at first might become a pain to implement.


jiaaro commented on July 17, 2024

@marcwebbie the problem is that once you call audioop, the whole thing is going to be loaded into RAM anyway, since the C code works on the audio data as an array (and, as a result, a single contiguous block of memory).


lepinsk commented on July 17, 2024

Hey James,

I know this is quite an old issue, but I wanted to ask whether you have any idea how much memory pydub is able to address. I'm loading some significant amounts of audio into RAM, and in some cases pydub seems to fail at around 4 GB (with the following message: MemoryError: not enough memory for output buffer).

I'm seeing this both on my local machine (with 8 GB of memory) and in production, on a server with 14 GB of RAM. Both machines are running a 64-bit version of the Python interpreter, which I'd expect to support addressing past 4 GB.


jiaaro commented on July 17, 2024

@lepinsk I'm not 100% sure on this, but I see in the C source of audioop.tostereo an error that looks like what you're reporting.

I suspect that you have a large mono audio segment and pydub is trying to combine it with a stereo audio segment.

When pydub combines two audio segments (in any way), it first calls AudioSegment._sync to make sure the sample rate, number of channels (mono/stereo), bit depth, etc. all match. If they don't, it chooses the higher-fidelity (i.e., higher-RAM) option for the output (so mono is converted to stereo, doubling the RAM) in order to avoid losing fidelity.

Hope this helps

Edit: worth mentioning it could also happen when combining mixed sample rates, where AudioSegment._sync calls audioop.ratecv. If you combine a segment with an 11025 Hz sample rate and one at 44100 Hz, the 11025 Hz one will be upsampled and use 4x as much RAM.

Possible workarounds:

You can manually convert a stereo audio segment to mono:

new_sound = sound.set_channels(1)

And you can manually downsample a sound whose sample rate is too high:

new_sound = sound.set_frame_rate(11025)
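A back-of-envelope estimate makes the cost of these fidelity upgrades concrete (this helper is illustrative, not pydub API): an AudioSegment holds raw PCM, so its size is duration × frame rate × channels × sample width.

```python
def pcm_bytes(seconds, frame_rate, channels, sample_width):
    """Raw PCM size in bytes: frame_rate frames per second,
    one sample_width-byte sample per channel per frame."""
    return seconds * frame_rate * channels * sample_width

three_hours = 3 * 60 * 60

# 3 hours at 44.1 kHz, 16-bit:
mono = pcm_bytes(three_hours, 44100, 1, 2)    # 952,560,000 bytes (~0.9 GB)
stereo = pcm_bytes(three_hours, 44100, 2, 2)  # 1,905,120,000 bytes (~1.9 GB)
```

Upmixing mono to stereo doubles the footprint, and upsampling 11025 Hz to 44100 Hz quadruples it, which is why long files trip memory limits so easily.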


lepinsk commented on July 17, 2024

Thanks for the reply!

That's definitely where it's happening. I arrived at the same conclusion, eliminated some situations where I was inadvertently combining stereo and mono files, and am now converting my stereo files down to mono before the operation, which has helped with the issue.

That said, it does seem like something along the chain is running into a memory limitation well below the system memory (which I'm theorizing is perhaps a 32-bit 4 GB limit?). Prior to rewriting my code to avoid upmixing to stereo, I ran the same set of files through my software on two systems, one with 8 GB of system memory and one with 14 GB, and it crashed at the exact same spot on both.

I'm putting together a few tests to see if I can see this happening explicitly at the 4gb memory line. (Though where we can go from there, I'm not sure...)


lepinsk commented on July 17, 2024

Update: it seems I can cram as much as I want into memory using pydub without running into the issue here (eventually my system just starts swapping to disk and slows down, as you'd expect). This works fine with stereo or mono files.

The problem only crops up with audioop.tostereo, which makes me suspect an issue in that function in particular. It looks like it throws this error when it sees a value exceed PY_SSIZE_T_MAX/2. I'm wondering whether PY_SSIZE_T_MAX isn't changing on 64-bit systems.


jiaaro commented on July 17, 2024

@lepinsk verrrry interesting - I wonder if you've found a bug in audioop?


lepinsk commented on July 17, 2024

It seems like it. For the time being it's sufficient for me to simply avoid upmixing to stereo.

Long term, I don't even know where I'd start if I wanted to fix this. :/


thesunlover commented on July 17, 2024

+1 for this suggestion.
Here is the problem we tried to solve:
worldveil/dejavu#18

A good related comment is this one:
worldveil/dejavu#18 (comment)

Here is the code we are using to chunk the audio:

import numpy as np
from pydub import AudioSegment

def read(filename, limit=None):
    """
    Reads any file supported by pydub (ffmpeg) and returns the data contained
    within.

    Can optionally be limited to a number of seconds from the start of the
    file via the `limit` parameter.

    returns: (channels, frame_rate)
    """
    audiofile = AudioSegment.from_file(filename)

    if limit:
        audiofile = audiofile[:limit * 1000]

    # np.frombuffer replaces the deprecated np.fromstring
    data = np.frombuffer(audiofile._data, np.int16)

    channels = []
    for chn in range(audiofile.channels):  # Python 3: range, not xrange
        channels.append(data[chn::audiofile.channels])

    return channels, audiofile.frame_rate

Here is the current version of the file containing the code above:
https://github.com/worldveil/dejavu/blob/master/dejavu/decoder.py

Edit:
this is the version of the previous file without the wavio-related stuff:
https://github.com/worldveil/dejavu/blob/9eca3cc05a258cd5585ff9d291c20448b112790f/dejavu/decoder.py


jiaaro commented on July 17, 2024

Closing because I'm merging all related tickets into #135


