Hi,
bewing wrote: You could also try (as Brendan seems to be suggesting) to combine both of these aspects into just one big driver -- but I'd say that keeping them separate adds a bit of flexibility. You will need to be creating "stacks" of drivers anyway, for other devices (USB, TCP/IP).
Keeping the mixing and the low level device driver/s separate would be more flexible, if only to support unusual situations (e.g. one stereo sound card for front_left and front_right speakers, and another stereo sound card for back_left and back_right speakers).
However, applications can't do more than the interfaces they use allow, so interfaces should be designed to handle the most complex situations you can imagine. This basically means that the audio interface should be designed for high quality 3D games - if it can handle everything a 3D game wants, then it can handle anything any application will throw at it.
So, imagine you're writing a 3D game, and the player is facing north standing next to a railway track. The railway track runs from north-east to south-west. From overhead, it looks like this (where FL, FR, BL and BR are speakers):
Your game knows that in 2 seconds' time a train will pass by.
Your game knows that it'll need to play "train_engine1.wav" (in a loop) in 2 seconds' time, and that the sound needs to fade in on the front_right speaker, then shift from the front_right speaker to the front_left speaker to the back_left speaker, then fade out on the back_left speaker. Your game also knows that it'll need to play "train_wheel_click_clack.wav" many times, from various points along the train as it passes by (with correct sound positioning used each time the sound is played).
In addition, the sound files use 22 kHz sampling with 8-bit data, but the sound card/s want 44 kHz with 16-bit data; and the output needs to give the best possible results regardless of how many speakers there are, where the speakers are in relation to the player/user, and the characteristics of each speaker (e.g. frequency response).
Also, you don't want to load the sound data from disk every time the sounds are played - the data should be cached somewhere, and (for performance reasons) you don't want to convert the sound data into a different format each time it's used (it's better to cache pre-converted data so that it's already in the correct format when it's needed).
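To make the "cache pre-converted data" idea concrete, here's a rough sketch in Python (purely for illustration). Everything named here (`convert_u8_to_s16_2x`, `SoundCache`, the `loader` callback) is made up for the example, and it assumes mono data, an exact 2x rate ratio, and plain linear interpolation, which is cheap but far from the best resampler you could use:

```python
from array import array


def convert_u8_to_s16_2x(samples_u8: bytes) -> array:
    """Convert unsigned 8-bit mono PCM to signed 16-bit PCM at twice the rate."""
    # 8-bit PCM is unsigned (128 = silence); centre it on zero and
    # scale it up to the signed 16-bit range.
    wide = [(s - 128) * 256 for s in samples_u8]
    out = array("h")
    for i, cur in enumerate(wide):
        # The last sample has no successor, so it is simply repeated.
        nxt = wide[i + 1] if i + 1 < len(wide) else cur
        out.append(cur)               # original sample
        out.append((cur + nxt) // 2)  # interpolated midpoint (linear)
    return out


class SoundCache:
    """Keeps converted sound data so each file is loaded and converted once."""

    def __init__(self, loader):
        self._loader = loader   # hypothetical callable: file name -> raw bytes
        self._converted = {}    # file name -> array of 16-bit samples
        self.loads = 0          # how many times the loader was actually called

    def get(self, file_name):
        if file_name not in self._converted:
            self.loads += 1
            raw = self._loader(file_name)
            self._converted[file_name] = convert_u8_to_s16_2x(raw)
        return self._converted[file_name]
```

The second and later calls to `get()` for the same file return the already converted samples without touching the disk or redoing the conversion.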
Now, how much of this should the game itself have to do, how much should the game offload onto the mixer, and how much should the mixer offload onto the driver/s?
Myself, I'd probably want a mixer interface (that the game uses) that has something like the following:
- soundRef = loadSoundFile(fileName);
- soundRef = loadSoundData(data, format);
- soundInstance = playSoundOnce(soundRef, starting_time, 3D_starting_position, 3D_displacement_vector, volume);
- soundInstance = playSoundLoop(soundRef, starting_time, 3D_starting_position, 3D_displacement_vector, volume);
- stopSound(soundInstance, ending_time);
- moveSound(soundInstance, change_time, 3D_displacement_vector);
- changeVolume(soundInstance, change_time, new_volume);
That way the game doesn't need to touch the sound data itself, and can tell the mixer where sounds start from and the direction and speed the sounds are moving. The "3D_displacement_vector" means that the game doesn't need to constantly update the sound's position, and the game only needs to tell the mixer if the sound changes direction.
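As a rough sketch of what the mixer could do internally with the starting position and displacement vector (Python purely for illustration): the position at any time is just the reference position plus the vector multiplied by the elapsed time. The `SoundInstance` class, the `speaker_gains()` helper, the coordinates (x = east, y = north, listener at the origin) and the inverse-distance weighting are all assumptions made up for this example; a real mixer would want a proper panning law and would use the speaker details reported by the driver.

```python
import math
from dataclasses import dataclass


@dataclass
class SoundInstance:
    ref_time: float      # time (seconds) at which the sound is at ref_pos
    ref_pos: tuple       # (x, y, z) position at ref_time
    velocity: tuple      # displacement vector, in units per second
    volume: float = 1.0

    def position_at(self, t):
        """Position at time t; before ref_time the sound stays at ref_pos."""
        dt = max(0.0, t - self.ref_time)
        return tuple(p + v * dt for p, v in zip(self.ref_pos, self.velocity))

    def move(self, change_time, new_velocity):
        """Rough equivalent of moveSound(): change direction/speed at change_time."""
        self.ref_pos = self.position_at(change_time)
        self.ref_time = change_time
        self.velocity = new_velocity


def speaker_gains(sound_pos, speakers, volume=1.0):
    """Split the volume across speakers, weighted by 1/distance to the sound."""
    weights = {}
    for name, speaker_pos in speakers.items():
        distance = math.dist(sound_pos, speaker_pos)
        weights[name] = 1.0 / max(distance, 1e-6)  # avoid division by zero
    total = sum(weights.values())
    return {name: volume * w / total for name, w in weights.items()}


# Four speakers around the listener (assumed layout).
SPEAKERS = {
    "FL": (-1.0, 1.0, 0.0), "FR": (1.0, 1.0, 0.0),
    "BL": (-1.0, -1.0, 0.0), "BR": (1.0, -1.0, 0.0),
}

# A train starting in the north-east at t = 2 and heading south-west.
train = SoundInstance(ref_time=2.0, ref_pos=(20.0, 24.0, 0.0),
                      velocity=(-10.0, -10.0, 0.0))
```

With those made-up numbers the loudest speaker goes from FR (t = 2.0) to FL (t = 4.25) to BL (t = 6.0), and the game never has to send a position update unless the train changes direction or speed.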
For the sound drivers, the interface would end up something like:
- channels = getChannels();
- channelDetails = getChannelDetails(channel);
- setOutputData(channel, change_time, soundData);
In this case, there's one channel per speaker, and the sound driver maintains a buffer containing "N seconds" of data (to be played in future) for each channel. The "setOutputData()" function is used by the mixer to replace/overwrite the sound data starting at a specific time. The "channelDetails" would be a structure of information about the channel (speaker's position relative to the user, dynamic range, frequency response, etc).
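A minimal sketch of the per-channel buffer on the driver side might look something like this (Python purely for illustration). The `ChannelBuffer` class, its parameters, and the decision to silently drop samples that fall outside the buffered window are assumptions for the example, not a description of any real driver:

```python
class ChannelBuffer:
    """One output channel holding the next `seconds` of queued samples."""

    def __init__(self, sample_rate=44100, seconds=2.0, now=0.0):
        self.sample_rate = sample_rate
        self.start_time = now   # time (seconds) of samples[0]
        self.samples = [0] * int(sample_rate * seconds)

    def set_output_data(self, change_time, data):
        """Overwrite queued samples starting at change_time.

        Samples that land before the start or past the end of the buffered
        window are dropped. Returns the number of samples actually written.
        """
        offset = round((change_time - self.start_time) * self.sample_rate)
        written = 0
        for i, sample in enumerate(data):
            index = offset + i
            if 0 <= index < len(self.samples):
                self.samples[index] = sample  # replace, do not mix
                written += 1
        return written
```

Because the data is replaced rather than mixed, all of the mixing stays in the mixer; the driver only has to play whatever is in the buffer when its time comes.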
Lastly, I guess you could slap a "cat sound.dat > /dev/sound" interface on top of this for legacy purposes...
Cheers,
Brendan