[arch-projects] projects

Wed Mar 30 16:30:15 EST 2005

> Would be very neat indeed. Voice is a tricky thing though. Not sure about
> Sphinx. One of my ex-professors was a digital signal guy at Westinghouse
> for many, many years. I could ask him about voice recognition stuff. He was
> helping a group of undergrads with a project at one time, something about
> people singing into their computer and having the notes and pitch they
> were singing show up on the screen in real time. They did it all in
> Java, using some open-source FFT libs. Pretty cool stuff actually.

I did a project with a prof back in the day on encoding messages into
recorded voice. It worked almost in real time, but the DSP board we
used was too crappy for the amount of calcs and transforms required,
so the output was delayed by about 2 seconds.
The setup: get a recorded audio track, then talk into a microphone
while the track plays. The voice is encoded into the stream, which is
picked up by a computer, and a program runs through it to remove the
audio track (we had to "parse" that one first). It was pretty cool.

Anyway, if you want to do voice recognition yourself, you need to look
at fuzzy logic (yes, that's a real thing) and similar techniques.
Basically, an "M" sound has a different waveform shape from other
sounds, but voices differ in tone, pitch, inflection and accent. You
need to match the waveforms the same way your mind recognizes a car,
even one it has never seen before (4 wheels... vaguely *this*
shape... steering wheel... etc).
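
To make the fuzzy part a bit more concrete, here is a rough sketch in
Python. It is only an illustration of the idea, not how any real
recognizer works. Everything in it is an assumption: the feature names
(voicing, nasality, noisiness), the 0..1 scale, and the prototype
numbers are made up, and getting such features out of a real waveform
is a separate problem this sketch skips. Each sound gets a "vaguely
this shape" range per feature, and the match is the weakest of the
per-feature fits (the usual fuzzy AND).

```python
def triangular(x, low, peak, high):
    """Fuzzy membership: 0 outside (low, high), 1 at peak, linear in between."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


# Hypothetical prototypes: (low, peak, high) per feature, on a made-up
# 0..1 scale. These numbers are illustrative, not measured data.
PROTOTYPES = {
    "m": {
        "voicing": (0.5, 0.9, 1.3),
        "nasality": (0.4, 0.8, 1.2),
        "noisiness": (-0.3, 0.1, 0.5),
    },
    "s": {
        "voicing": (-0.4, 0.0, 0.4),
        "nasality": (-0.4, 0.0, 0.4),
        "noisiness": (0.5, 0.9, 1.3),
    },
}


def match(features, prototype):
    """Degree (0..1) to which the features fit the prototype.

    Fuzzy AND: the overall fit is the minimum per-feature membership.
    """
    return min(
        triangular(features[name], *bounds)
        for name, bounds in prototype.items()
    )


def classify(features):
    """Return (best_label, scores) over all prototypes."""
    scores = {label: match(features, proto)
              for label, proto in PROTOTYPES.items()}
    best = max(scores, key=scores.get)
    return best, scores


# A speaker whose "M" is a bit off the prototype still matches "m",
# just with a degree below 1 instead of a hard yes/no.
label, scores = classify(
    {"voicing": 0.8, "nasality": 0.7, "noisiness": 0.2}
)
print(label, scores)
```

The point of the graded score is that a different voice or accent
lowers the match a little instead of breaking it outright, which is
the "never seen this exact car before" situation.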