Simple audio output

onlyonemac
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Simple audio output

Post by onlyonemac »

The diagram sounds interesting - so essentially you're comparing the frequencies present in different vowels? It would also be interesting to compare the waveforms. Something makes me think that these harmonic patterns are related to what I heard when I was playing around with music synthesisers and the output suddenly sounded like someone going "ooh" or "aah" (depending on the setting). If we could figure out a consistent pattern, I guess it would be quite easy to harmonically synthesise the English vowels, and maybe even to apply a similar process to the consonants (although I imagine that synthesising the consonants would be somewhat harder)?

It's also interesting that espeak mixes recorded sounds with synthesised samples, which I was not aware of. I wonder why they thought it too hard to synthesise the sampled sounds, and whether they're thinking of replacing them to get a fully synthesised voice. That would be both cool and useful for the endless customisation of the voice that it would permit: I can just imagine setting things like exactly how much separation I want between phonemes, how long I want particular phonemes to sound for, and many other things that would be great for fine-tuning the speech synth for high-speed listening by a blind person.
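To make the harmonic idea a bit more concrete, here is a rough Python sketch of additive vowel synthesis: harmonics of one fundamental are summed, and each harmonic is made louder the closer it sits to a "formant" frequency. Everything named here is an assumption for illustration: the formant values are only approximate textbook figures, and `SAMPLE_RATE`, `bandwidth`, `harmonic_weights` and `synth_vowel` are made-up names and parameters, not anything espeak does.

```python
import math

SAMPLE_RATE = 8000  # Hz (assumed; low, but enough for a rough vowel)

# Approximate first and second formant frequencies in Hz.
# Illustrative assumptions only, to be tuned by ear.
FORMANTS = {
    "aah": (730.0, 1090.0),
    "ooh": (300.0, 870.0),
}

def harmonic_weights(f0, formants, bandwidth=100.0, max_freq=3500.0):
    """Return [(frequency, amplitude)] for each harmonic of f0 up to max_freq.

    A harmonic gets a larger amplitude the closer it is to a formant.
    """
    weights = []
    n = 1
    while n * f0 <= max_freq:
        freq = n * f0
        amp = sum(math.exp(-((freq - f) / bandwidth) ** 2) for f in formants)
        weights.append((freq, amp))
        n += 1
    return weights

def synth_vowel(name, f0=120.0, seconds=0.5):
    """Sum the weighted harmonics into samples scaled to the range [-1.0, 1.0]."""
    weights = harmonic_weights(f0, FORMANTS[name])
    count = int(SAMPLE_RATE * seconds)
    samples = []
    for i in range(count):
        t = i / SAMPLE_RATE
        samples.append(sum(a * math.sin(2 * math.pi * f * t) for f, a in weights))
    peak = max(abs(s) for s in samples) or 1.0  # avoid dividing by zero
    return [s / peak for s in samples]
```

Calling `synth_vowel("ooh")` or `synth_vowel("aah")` gives a list of floats that could be scaled to 16-bit integers and written out with the standard `wave` module. Changing only the formant table while keeping the same fundamental is what moves the result between the two vowel colours.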

Heck, now we need to start a "SpeechSynthDev" forum!
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing
DavidCooper
Member
Posts: 1150
Joined: Wed Oct 27, 2010 4:53 pm
Location: Scotland

Re: Simple audio output

Post by DavidCooper »

Consonants are generally made from white noise: hiss covering a range of frequencies, with different ranges sounding different. What makes it noise rather than a tone is that the cycle length varies randomly, to the point that there is no constant frequency in there. Some of the consonants involve a blockage of the flow, so that's a momentary lack of sound, and when the sound starts up again you get a very short blast of white noise. With the sound "t", for example, there's a silence followed by the same white noise as the sound "s", but it's so short that you don't normally recognise it as "s". With "p", the silence ends with white noise similar to that of the sound "f" (though it's actually a bilabial "ph" kind of "f" that uses both lips instead of involving the teeth). With "k", the white noise after the silence is a short burst of the "ch" in the Scottish word "loch" (which many people will know from German [acht], Spanish [junto], Arabic [khubz]).
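As a rough illustration of the description above, this Python sketch builds a hiss from a square wave whose cycle length is re-rolled at random on every flip, then makes a "t"-like sound from a silence followed by a very short burst of the same hiss. The sample rate, the half-period range of 1 to 3 samples, the durations, and the names `hiss` and `stop_consonant` are all assumptions picked for the example and would need tuning by ear.

```python
import random

SAMPLE_RATE = 22050  # Hz (assumed)

def hiss(seconds, min_half_period, max_half_period, rng):
    """Square wave whose half-cycle length (in samples) is chosen at random
    on every flip, so no constant frequency survives."""
    count = int(SAMPLE_RATE * seconds)
    samples = []
    level = 1.0
    while len(samples) < count:
        run = rng.randint(min_half_period, max_half_period)
        samples.extend([level] * run)
        level = -level
    return samples[:count]

def stop_consonant(burst, silence_seconds=0.06):
    """A blockage of the flow: a moment of silence, then a very short burst."""
    return [0.0] * int(SAMPLE_RATE * silence_seconds) + burst

rng = random.Random(1)  # fixed seed so runs are repeatable

# Long hiss with short, random half-cycles: a stand-in for "s".
s_sound = hiss(0.20, 1, 3, rng)

# The same kind of hiss, cut down to a brief burst after a silence: a stand-in for "t".
t_sound = stop_consonant(hiss(0.015, 1, 3, rng))
```

Widening or shifting the half-period range changes the band the hiss occupies, which is the knob that would separate one consonant's hiss from another in this sketch.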

I'd have thought it would be easy enough to generate all these hisses without needing to store recorded samples - they could then be generated when the program first runs, but it wouldn't save a vast amount of storage space so there's little need to bother unless you want the whole OS to fit on a floppy disk (which I ideally would want to do).

MSB-OS: http://www.magicschoolbook.com/computing/os-project - direct machine code programming