New techniques for sinusoidal coding of speech at 2400 bps

Sassan Ahmadi, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Sinusoidal transform coding (STC) is a frequency-domain speech compression technique in which finite-duration segments of the speech signal are represented by a linear combination of sinusoids with time-varying amplitudes, phases, and frequencies. The STC is known to produce reconstructed speech of high quality at data rates below 10 kbps. It can be shown that if the measured sine wave frequencies are replaced by a harmonic set, then reconstructed speech of good quality can still be obtained. The methods discussed in this paper have been exploited in the development of STC coders at data rates from 9.6 to 2.4 kbps and have resulted in reconstructed speech of high quality and intelligibility. An accurate pitch detection algorithm, perception-based split vector quantization, improved overlap/add and frame interpolation algorithms, minimum variance phase estimation, and computational efficiency are the basic features that distinguish our implementations from other implementations of sinusoidal coders. This paper focuses on the development of a fully quantized sinusoidal coder at 2.4 kbps.
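
The harmonic representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' coder: the function names, the 8 kHz sampling rate, and the plain linear cross-fade used for overlap/add are assumptions made for illustration only, and the paper's improved overlap/add, pitch detection, quantization, and phase estimation steps are not reproduced here.

```python
import math


def synthesize_harmonic_frame(f0_hz, amplitudes, phases, num_samples,
                              sample_rate_hz=8000.0):
    """Sum harmonically related sinusoids for one frame.

    s[n] = sum_k A_k * cos(2*pi*k*f0*n/fs + phi_k), k = 1..K.
    The 8 kHz default sampling rate is an assumption, not a value
    taken from the paper.
    """
    if len(amplitudes) != len(phases):
        raise ValueError("amplitudes and phases must have the same length")
    frame = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        sample = 0.0
        # Harmonic k sits at k * f0, replacing the measured sine wave frequency.
        for k, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(2.0 * math.pi * k * f0_hz * t + phi)
        frame.append(sample)
    return frame


def overlap_add(prev_frame, next_frame):
    """Cross-fade two equal-length frames with complementary linear weights.

    This is a generic overlap/add stand-in (hypothetical), not the
    improved algorithm mentioned in the abstract.
    """
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same length")
    n = len(prev_frame)
    out = []
    for i in range(n):
        w = i / n  # weight of the next frame rises from 0 toward 1
        out.append((1.0 - w) * prev_frame[i] + w * next_frame[i])
    return out
```

For example, `synthesize_harmonic_frame(100.0, [1.0, 0.5], [0.0, 0.0], 80)` produces one 10 ms pitch period of a two-harmonic signal at the assumed 8 kHz rate, and `overlap_add` can then blend it with the frame synthesized from the next set of parameters.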

Original language: English (US)
Title of host publication: Conference Record of the Asilomar Conference on Signals, Systems and Computers
Publisher: IEEE
Pages: 770-774
Number of pages: 5
Volume: 1
State: Published - 1997
Event: Proceedings of the 1996 30th Asilomar Conference on Signals, Systems & Computers. Part 2 (of 2) - Pacific Grove, CA, USA
Duration: Nov 3 1996 → Nov 6 1996

Other

Other: Proceedings of the 1996 30th Asilomar Conference on Signals, Systems & Computers. Part 2 (of 2)
City: Pacific Grove, CA, USA
Period: 11/3/96 → 11/6/96

ASJC Scopus subject areas

  • Hardware and Architecture
  • Signal Processing
  • Electrical and Electronic Engineering
