Speech processing

1940

Pioneering work in speech recognition based on spectral analysis was reported in the 1940s.

1952

In 1952, three researchers at Bell Labs, Stephen.

1966

Linear predictive coding (LPC), a speech processing algorithm, was first proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone (NTT) in 1966.

1970

Schroeder at Bell Labs during the 1970s.

1978

LPC was the basis for voice-over-IP (VoIP) technology, as well as speech synthesizer chips such as the Texas Instruments LPC Speech Chips used in the Speak & Spell toys from 1978.

1990

One of the first commercially available speech recognition products was Dragon Dictate, released in 1990.

1992

In 1992, technology developed by Lawrence Rabiner and others at Bell Labs was used by AT&T in their Voice Recognition Call Processing service to route calls without a human operator.

2000

By this point, the vocabulary of these systems was larger than the average human vocabulary. By the early 2000s, the dominant speech processing strategy started to shift away from Hidden Markov Models towards more modern neural networks and deep learning.

Techniques

Dynamic time warping

Dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed.
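
As an illustration of dynamic time warping, the following is a minimal sketch of the classic dynamic-programming formulation, not a production implementation. It assumes one-dimensional numeric sequences and absolute difference as the local cost; the function name dtw_distance is hypothetical and chosen only for this example. Real speech systems would typically compare sequences of feature vectors rather than single numbers.

```python
def dtw_distance(a, b):
    """Return the DTW alignment cost between two numeric sequences.

    Assumption for this sketch: the local cost between two samples is
    their absolute difference, and no warping-window constraint is used.
    """
    n, m = len(a), len(b)
    inf = float("inf")

    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )

    return cost[n][m]


if __name__ == "__main__":
    slow = [1, 1, 2, 2, 3, 3]   # the same shape as `fast`, at half the speed
    fast = [1, 2, 3]
    print(dtw_distance(fast, slow))
    print(dtw_distance(fast, [2, 3, 4]))
```

Because the alignment may repeat samples from either sequence, a signal and a slowed-down copy of it align at zero cost, which is the property that makes DTW suitable for comparing utterances spoken at different rates. The table costs O(n·m) time and memory for sequences of length n and m.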

All text is taken from Wikipedia. Text is available under the Creative Commons Attribution-ShareAlike License.

Page generated on 2021-08-05