Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c++ - Deinterleave audio data in varied bitrates

I'm trying to write one function that can deinterleave 8/16/24/32-bit audio data, given that the audio data always arrives in a buffer of 8-bit bytes.

I have this working for 8 bit, and it works for 16/24/32, but only for the first channel (channel 0). I have tried so many + and * and other operators that I'm just guessing at this point. I cannot find the magic formula. I am using C++ but would also accept a memcpy into the vector if that's easiest.

Check out the code. If you change the Demux call at the end of main to another bitrate, you will see the problem. I am sure there is an easy math solution here; I just cannot find it.

#include <vector>
#include <map>
#include <iostream>
#include <iomanip>
#include <string>
#include <string.h>
#include <cstdint>  // uint8_t

const int bitrate = 8;
const int channel_count = 5;
const int audio_size = bitrate * channel_count * 4;
uint8_t audio_ptr[audio_size];
const int bytes_per_channel = audio_size / channel_count;

void Demux(int bitrate){
    int byterate = bitrate/8;
    std::map<int, std::vector<uint8_t> > channel_audio;
    for(int i = 0; i < channel_count; i++){
        std::vector<uint8_t> audio;
        audio.reserve(bytes_per_channel);
        for(int x = 0; x < bytes_per_channel; x += byterate){
            for(int z = 0; z < byterate; z++){
                // What is the magic formula!
                audio.push_back(audio_ptr[(x * channel_count) + i + z]);
            }
        }
        channel_audio.insert(std::make_pair(i, audio));
    }
    
    int remapsize = 0;
    std::cout << "\nRemapped  Audio";
    std::map<int, std::vector<uint8_t> >::iterator it;
    for(it = channel_audio.begin(); it != channel_audio.end(); ++it){
        std::cout << "\nChannel" << it->first << " ";
        std::vector<uint8_t> v = it->second;
        remapsize += v.size();
        for(size_t i = 0; i < v.size(); i++){
            std::cout << "0x" << std::hex << std::setfill('0') << std::setw(2) << +v[i] << " ";
            if(i && (i + 1) % 32 == 0){
                std::cout << std::endl;   
            }
        }
    }
    std::cout << "Total remapped audio size is " << std::dec << remapsize << std::endl;
}

int main()
{
    
    // External data
    std::cout << "Raw Audio\n";
    for(int i = 0; i < audio_size; i++){
        audio_ptr[i] = i;   
        std::cout << "0x" << std::hex << std::setfill('0') << std::setw(2) << +audio_ptr[i] << " ";
        if(i && (i + 1) % 32 == 0){
            std::cout << std::endl;   
        }
    }
    std::cout << "Total raw audio size is " << std::dec << audio_size << std::endl;
    
    Demux(8);
    //Demux(16);
    //Demux(24);
    //Demux(32);
}
question from:https://stackoverflow.com/questions/65913008/deinterleave-audio-data-in-varied-bitrates


1 Reply


You're actually pretty close. But the code is confusing: specifically the variable names and what actual values they represent. As a result, you appear to be just guessing the math. So let's go back to square one and determine what exactly it is we need to do, and the math will very easily fall out of it.

First, just imagine we have one sample for each of the five channels. Taken together, those samples are called an audio frame. The frame looks like this:

[channel0][channel1][channel2][channel3][channel4]

The width of a sample in one channel is called byterate in your code, but I don't like that name. I'm going to call it bytes_per_sample instead. You can easily see the width of the entire frame is this:

int bytes_per_frame = bytes_per_sample * channel_count;

It should be equally obvious that to find the starting offset for channel c within a single frame, you multiply as follows:

int sample_offset_in_frame = bytes_per_sample * c;

That's just about all you need! The last bit is your z loop which covers each byte in a single sample for one channel. I don't know what z is supposed to represent, apart from being a random single-letter identifier you chose, but hey let's just keep it.

Putting all this together, you get the absolute offset of sample s in channel c and then you copy individual bytes out of it:

int sample_offset = bytes_per_frame * s + bytes_per_sample * c;
for (int z = 0; z < bytes_per_sample; ++z) {
    audio.push_back(audio_ptr[sample_offset + z]);
}
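
To make the arithmetic concrete, here is a minimal sketch of that offset calculation as a standalone function. The name sample_byte_offset and its signature are my own, for illustration only. In terms of the question's original expression, (x * channel_count) + i + z, the only missing piece is that the channel index i needs to be scaled by the sample width, i.e. (x * channel_count) + (i * byterate) + z. That is why channel 0 (where i is 0) already worked.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the original post): byte offset of
// sample s in channel c within an interleaved buffer.
std::size_t sample_byte_offset(std::size_t s, std::size_t c,
                               std::size_t bytes_per_sample,
                               std::size_t channel_count)
{
    assert(c < channel_count);
    const std::size_t bytes_per_frame = bytes_per_sample * channel_count;
    // Skip s whole frames, then c whole samples inside that frame.
    return bytes_per_frame * s + bytes_per_sample * c;
}
```

For 24-bit audio with five channels, bytes_per_sample is 3 and bytes_per_frame is 15, so sample 2 of channel 4 starts at byte 15 * 2 + 3 * 4 = 42.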

This assumes you're looping over the number of samples, not the number of bytes in your channel. So let's show all the loops for completeness' sake:

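const int bytes_per_sample = bitrate / 8;
const int bytes_per_frame = bytes_per_sample * channel_count;
const int num_samples = audio_size / bytes_per_frame;  // whole frames only

for (int c = 0; c < channel_count; ++c)
{
    // One output buffer per channel, as in your Demux function
    std::vector<uint8_t> audio;
    audio.reserve(num_samples * bytes_per_sample);

    // Offset of channel c's sample within the first frame
    int sample_offset = bytes_per_sample * c;

    for (int s = 0; s < num_samples; ++s)
    {
        // Copy every byte of this sample
        for (int z = 0; z < bytes_per_sample; ++z)
        {
            audio.push_back(audio_ptr[sample_offset + z]);
        }

        // Skip to next frame
        sample_offset += bytes_per_frame;
    }

    channel_audio.insert(std::make_pair(c, audio));
}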
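
If it helps to see those loops in a form you can compile and call on their own, here is a sketch that wraps them in a function. The name deinterleave, its parameters, and the choice to return a vector of vectors (rather than filling your std::map) are assumptions of mine, not something from your code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wrapper around the loops above; name and signature are
// illustrative assumptions.
std::vector<std::vector<std::uint8_t>> deinterleave(
    const std::uint8_t* data, std::size_t size_bytes,
    int bits_per_sample, int channel_count)
{
    assert(bits_per_sample > 0 && bits_per_sample % 8 == 0);
    assert(channel_count > 0);

    const std::size_t bytes_per_sample = bits_per_sample / 8;
    const std::size_t bytes_per_frame = bytes_per_sample * channel_count;
    // Integer division: a trailing partial frame is ignored.
    const std::size_t num_samples = size_bytes / bytes_per_frame;

    std::vector<std::vector<std::uint8_t>> channels(channel_count);
    for (int c = 0; c < channel_count; ++c) {
        std::vector<std::uint8_t>& audio = channels[c];
        audio.reserve(num_samples * bytes_per_sample);

        // Start at channel c's sample in the first frame.
        std::size_t sample_offset = bytes_per_sample * c;
        for (std::size_t s = 0; s < num_samples; ++s) {
            for (std::size_t z = 0; z < bytes_per_sample; ++z) {
                audio.push_back(data[sample_offset + z]);
            }
            sample_offset += bytes_per_frame;  // skip to next frame
        }
    }
    return channels;
}
```

With the buffer from the question, a call would look like deinterleave(audio_ptr, audio_size, 24, channel_count). Note that the 160-byte test buffer holds only 10 whole 15-byte frames at 24 bits, so the last 10 bytes are left out.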

You'll see that I split the math up so that it does fewer multiplications inside the loops. This is mostly for readability, but it might also help the compiler understand what's happening when it tries to optimize. Concerns over optimization are secondary (and in your case, there are much more expensive things going on with those vectors and the map).
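
Since you said you'd accept a memcpy, here is one possible variant, offered as a sketch rather than a tuned implementation. It sizes each channel's vector up front and copies one sample at a time with std::memcpy, walking the input frame by frame. The function name deinterleave_memcpy and its parameters are hypothetical, and I haven't measured whether it is actually faster for your data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical memcpy-based variant; names and signature are assumptions.
std::vector<std::vector<std::uint8_t>> deinterleave_memcpy(
    const std::uint8_t* data, std::size_t size_bytes,
    std::size_t bytes_per_sample, std::size_t channel_count)
{
    assert(bytes_per_sample > 0 && channel_count > 0);

    const std::size_t bytes_per_frame = bytes_per_sample * channel_count;
    const std::size_t num_samples = size_bytes / bytes_per_frame;

    // Allocate every channel at its final size so nothing grows later.
    std::vector<std::vector<std::uint8_t>> channels(
        channel_count,
        std::vector<std::uint8_t>(num_samples * bytes_per_sample));

    for (std::size_t s = 0; s < num_samples; ++s) {
        const std::uint8_t* frame = data + s * bytes_per_frame;
        for (std::size_t c = 0; c < channel_count; ++c) {
            // Copy one whole sample of channel c out of this frame.
            std::memcpy(channels[c].data() + s * bytes_per_sample,
                        frame + c * bytes_per_sample,
                        bytes_per_sample);
        }
    }
    return channels;
}
```

The offsets are the same ones derived above (frame start plus bytes_per_sample * c); only the copy mechanism and the loop order differ.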

The most important thing is you have readable code with reasonable variable names that makes logical sense.

