Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to read an MP3 audio file into a numpy array / save a numpy array to MP3?

Is there a way to read/write an MP3 audio file into/from a numpy array with an API similar to scipy.io.wavfile.read and scipy.io.wavfile.write?

from scipy.io import wavfile
sr, x = wavfile.read('test.wav')
wavfile.write('test2.wav', sr, x)

Note: pydub's AudioSegment object doesn't give direct access to a numpy array.

PS: I have already read "Importing sound files into Python as NumPy arrays (alternatives to audiolab)" and tried all the answers, including those that require calling ffmpeg via Popen and reading its output from the stdout pipe. I have also read "Trying to convert an mp3 file to a Numpy Array, and ffmpeg just hangs" and tried the main answers, but found no simple solution. After spending hours on this, I'm posting it here under "Answer your own question – share your knowledge, Q&A-style". I have also read "How to create a numpy array from a pydub AudioSegment?", but it does not easily cover the multi-channel case.


1 Reply


Calling ffmpeg and manually parsing its stdout, as suggested in many posts about reading an MP3, is tedious (there are many corner cases, such as differing numbers of channels), so here is a working solution using pydub (run pip install pydub first; pydub in turn needs ffmpeg or libav available for MP3 support).

This code reads an MP3 into a numpy array and writes a numpy array to an MP3 file, with an API similar to scipy.io.wavfile.read/write:

import pydub
import numpy as np

def read(f, normalized=False):
    """MP3 to numpy array"""
    a = pydub.AudioSegment.from_mp3(f)
    # Samples come back as one flat, interleaved array (L, R, L, R, ...)
    y = np.array(a.get_array_of_samples())
    if a.channels == 2:
        # One row per frame, one column per channel
        y = y.reshape((-1, 2))
    if normalized:
        # Scale 16-bit integers to floats in [-1, 1)
        return a.frame_rate, np.float32(y) / 2**15
    else:
        return a.frame_rate, y

def write(f, sr, x, normalized=False):
    """numpy array to MP3"""
    channels = 2 if (x.ndim == 2 and x.shape[1] == 2) else 1
    if normalized:  # normalized array - each item should be a float in [-1, 1)
        y = np.int16(x * 2 ** 15)
    else:
        y = np.int16(x)
    # sample_width=2 means 2 bytes (16 bits) per sample
    song = pydub.AudioSegment(y.tobytes(), frame_rate=sr, sample_width=2, channels=channels)
    song.export(f, format="mp3", bitrate="320k")

Notes:

  • It only works for 16-bit files for now (24-bit WAV files are fairly common, but I've rarely seen 24-bit MP3 files. Do they exist?)
  • normalized=True lets you work with a float array (each item in [-1, 1))
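
To make the normalized=True scaling concrete, here is a minimal numpy-only sketch of the two conversions. The helper names to_float and to_int16 are hypothetical, and the clipping step is an added safeguard that is not part of the code above; it is there because a float of exactly 1.0 scales to 32768, which does not fit in int16.

```python
import numpy as np

def to_float(y):
    """int16 samples -> float32 in [-1, 1), the same scaling read() applies."""
    return np.float32(y) / 2**15

def to_int16(x):
    """float samples -> int16. Clipping is an assumption added here, not in the original."""
    scaled = np.asarray(x, dtype=np.float64) * 2**15
    return np.clip(scaled, -2**15, 2**15 - 1).astype(np.int16)

# Made-up sample values covering the int16 range
y = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)
x = to_float(y)                          # floats in [-1, 1)
back = to_int16(x)                       # round trip back to int16
full_scale = to_int16(np.array([1.0]))   # 1.0 is clipped to the int16 maximum
```

Whether to clip or to rescale out-of-range input is a design choice; clipping keeps the round trip exact for values that were already in range.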

Usage example:

sr, x = read('test.mp3')
print(x)

#[[-225  707]
# [-234  782]
# [-205  755]
# ..., 
# [ 303   89]
# [ 337   69]
# [ 274   89]]

write('out2.mp3', sr, x)
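
The two columns in the printed array are the left and right channels. As a rough illustration of why the reshape in read() and the tobytes() call in write() undo each other, here is a numpy-only sketch. It assumes stereo samples are interleaved (left, right, left, right, ...), and the sample values are made up.

```python
import numpy as np

# Made-up interleaved stereo samples: L0, R0, L1, R1, L2, R2
interleaved = np.array([10, -10, 20, -20, 30, -30], dtype=np.int16)

# Same reshape as in read(): one row per frame, one column per channel
frames = interleaved.reshape((-1, 2))
left, right = frames[:, 0], frames[:, 1]

# Same serialization as in write(): tobytes() emits the array row by row
# (C order), which reproduces the interleaved byte stream
raw = frames.tobytes()
roundtrip = np.frombuffer(raw, dtype=np.int16)
```

An array shaped (2, n_frames) would not be recognized as stereo by write() as written; transposing it to (n_frames, 2) first should give the expected layout.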
