Guest Blogger

Importance of clear audio for VoIP

May 30, 2006 04:30 PM


by Vikas Rangarajan, senior software engineer, UmeVoice.

Rise in mobile VoIP use

As VoIP applications such as Skype become ubiquitous, there is a growing need to ensure that VoIP communication works well in a wide range of environments. Skype is available for a variety of mobile platforms, a new breed of converged wireless and Wi-Fi devices is in the news, and VoIP service providers have begun to target mobile platforms. All indicators point to a rise in mobile use of VoIP technology. It is becoming common for users to be in a noisy environment such as an airport, a conference or a café while using a laptop or PDA for VoIP communication. One critical factor in making these applications truly valuable will be clear audio transmission irrespective of the surrounding environment. Adopting and recommending effective noise cancelling technology can help realize the full potential of internet voice technologies, giving people more freedom to communicate globally.

How important is "Clear Audio" for VoIP?

The most popular use of VoIP technology is in speech communication applications such as Skype, Google Talk and standard SIP phones.

For the purposes of speech communication, clear audio can be defined as an audio stream that has a low noise component (ideally no noise), and a high speech component (the audio content that we are trying to transmit). However, this specification alone is not sufficient to describe clear audio.
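
The balance between the speech component and the noise component is commonly expressed as a signal-to-noise ratio in decibels. The sketch below is only an illustration of that measure: the sine waves stand in for speech and noise, and the names `mean_power`, `snr_db` and `tone` are hypothetical helpers, not part of any VoIP product.

```python
import math

def mean_power(samples):
    """Average power of a sampled signal (mean of the squared samples)."""
    return sum(s * s for s in samples) / len(samples)

def snr_db(speech, noise):
    """Speech-to-noise ratio in decibels: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))

def tone(cycles, n, amplitude):
    """n samples of a sine wave that completes `cycles` full periods."""
    return [amplitude * math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

N = 100
speech = tone(5, N, amplitude=1.0)        # stand-in for the speech component
quiet_noise = tone(17, N, amplitude=0.1)  # assumed quiet background
loud_noise = tone(17, N, amplitude=1.0)   # assumed noisy background

quiet_room_snr = snr_db(speech, quiet_noise)  # noise at one tenth the amplitude
noisy_room_snr = snr_db(speech, loud_noise)   # noise as strong as the speech
```

With these made-up signals, noise at one tenth of the speech amplitude gives 20 dB, while noise as strong as the speech gives 0 dB.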

We need the speech signal to be transmitted to the listener at the other end of the communication system with the least possible modification along the way (ideally with no modification at all: the listener should hear the audio exactly as the speaker spoke it).

These requirements become even more critical when the "listener" is not a human, but a far less capable recognizer of human speech, such as an ASR (automatic speech recognition) computer program. ASR programs can be found in today's dictation software, telephone IVR (interactive voice response) systems and command and control applications.

VoIP systems employ several technologies to optimize the efficiency of audio transmission. These technologies make a trade-off between the quality of the audio transmitted and the cost of transmitting it. To minimize the degradation of audio quality, it is important to ensure that clear audio is sent into the system regardless of the environment in which the end users are located.

How to provide clear audio for VoIP in noisy environments

To satisfy both requirements for clear audio (a high speech-to-noise ratio and high speech fidelity), we need to filter the noise out of the signal, leaving only the speech. This can be accomplished in two broad ways:

  1. Remove as much of the noise as possible from the signal after it has been mixed in
  2. Prevent as much of the noise as possible from entering the signal in the first place

Technique 1 is commonly employed by DSP-based noise cancelling systems that use frequency-based algorithms to remove noise. This is an inherently difficult problem, since speech and noise invariably overlap at several frequencies. While this approach often meets the first requirement of a high speech-to-noise ratio, it fails the second to varying degrees: because speech and noise overlap at many frequencies, removing the "noisy" frequencies also removes (often critical) speech frequencies, leading to distorted speech. This is especially prominent at high noise levels. Some DSP-based systems, such as the Jawbone, use adaptive techniques to reduce the misidentification of speech as noise.
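
The overlap problem with technique 1 can be seen in a toy example. This is a deliberately crude sketch, not how any particular product works: the "speech" and "noise" are pure tones, the noisy frequency bins are assumed to be known in advance, and the filter simply zeroes those bins. All function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def tone(bin_index, n, amplitude=1.0):
    """Sine wave that falls exactly on one frequency bin."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]

def mix(a, b):
    return [x + y for x, y in zip(a, b)]

def remove_bins(signal, noisy_bins):
    """Zero the given frequency bins (and their mirror bins) and resynthesize."""
    n = len(signal)
    spectrum = dft(signal)
    for k in noisy_bins:
        spectrum[k] = 0
        spectrum[(n - k) % n] = 0  # mirror bin of a real-valued signal
    return idft(spectrum)

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

N = 64
speech = mix(tone(3, N), tone(10, N))            # speech occupies bins 3 and 10
noise = mix(tone(10, N, 0.8), tone(20, N, 0.8))  # noise occupies bins 10 and 20
cleaned = remove_bins(mix(speech, noise), noisy_bins=[10, 20])

# The noise is gone, but so is the speech energy that shared bin 10.
error_vs_full_speech = rms_error(cleaned, speech)
error_vs_low_tone_only = rms_error(cleaned, tone(3, N))
```

After filtering, the output matches only the speech tone that did not share a bin with the noise; the speech tone in the overlapping bin is removed along with the noise, which is the distortion described above.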

Technique 2 is also a tough problem, since both speech and noise travel through the same medium. A patented technology developed at UmeVoice exploits the noise cancelling properties of a standard dual-port noise cancelling microphone, making use of distinguishing characteristics of noise versus speech, to prevent the noise from entering the signal, thus accomplishing both requirements for clear audio. This makes such a solution ideal not only for VoIP communication but also for high quality speech applications like speech recognition. UmeVoice makes the headsets theBoom, theBoom O and theBoom Quiet, which offer the ability to communicate effectively even in the noisiest of environments.
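
One distinguishing characteristic that a dual-port microphone can exploit in general is distance: the mouth is centimetres away, while most noise sources are much farther off. The sketch below is a textbook idealization of a two-port (differential) microphone and makes no claim about UmeVoice's patented design. It assumes free-field point sources on the microphone axis, and the port spacing, distances and frequency are invented values for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature

def port_pressure(distance_m, freq_hz):
    """Complex pressure from an idealized point source: the amplitude
    falls off as 1/distance and the phase lags with distance."""
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber in rad/m
    return cmath.exp(-1j * k * distance_m) / distance_m

def fraction_kept(source_distance_m, port_spacing_m, freq_hz):
    """Differential output (front port minus rear port), relative to what
    the front port alone would pick up."""
    front = port_pressure(source_distance_m, freq_hz)
    rear = port_pressure(source_distance_m + port_spacing_m, freq_hz)
    return abs(front - rear) / abs(front)

PORT_SPACING = 0.01  # assumed 1 cm between the two ports
FREQ = 500.0         # assumed test frequency in Hz

speech_kept = fraction_kept(0.02, PORT_SPACING, FREQ)  # mouth about 2 cm away
noise_kept = fraction_kept(2.0, PORT_SPACING, FREQ)    # noise source about 2 m away

# How much more of the near speech survives than of the distant noise, in dB.
advantage_db = 20 * math.log10(speech_kept / noise_kept)
```

Under these assumptions the nearby source loses far less of its level in the subtraction than the distant one, giving an advantage of around 11 dB before any signal processing is applied. Real microphones also depend on frequency and on the direction the noise arrives from, which this sketch ignores.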

Freedom to communicate from anywhere?

In an increasingly global society, technology is making it possible to work productively and stay connected while being mobile. Voice is one of the most natural human modes of communication. Technologies that facilitate clear audio capture and transmission will be crucial in ensuring that people can have true freedom to communicate clearly and effectively.



