Skype Journal: Wishlist: Inline access to data streams through the Skype API

October 27, 2005 03:15 PM

I want to call again for the designers of the Skype API to provide a mechanism for developers to read from and write to the audio and, someday soon, the video data streams. This would let programmers create real-time apps that augment conversation. For example:

  • Apply noise reduction algorithms to quiet background sounds
  • Mix in audio to create a background ambience
  • Look up product profiles from recognized barcodes
  • Recognize spoken language and pipe a slightly lagged transcript to a chat session
  • Recognize spoken keywords, like company or contact names, and show data on those topics (stock prices, recent email, blog posts, etc.)
  • Detect stress and other indications of falsehood, to better detect lies
  • Replace my background video with a more posh background
  • Supertitle agenda items in our video
  • Add closed captioning to video
  • Overlay time zones on each speaker's window
  • Change my voice to sound like another sex, age, or regional accent
  • Identify non-verbal sounds (clock ticking, car passing, music in the background, laughter) for closed captioning
  • Hide my eyeglasses
  • Quiet my voice volume and play the voice of a simultaneous translator
  • Save recognized text with audio or video file
  • Make me better looking by at least two beers
  • Insert television advertising in the background
  • "Sharpen" my speech to improve intelligibility
I'm bringing this up now because TechCrunch just previewed Riya. Riya is a service that recognizes faces in your photo album and helps you tag your album automatically, so all the photos of Uncle Joseph are properly labelled without your reviewing each one. They do this in a batch process on a server, or will when they come out of testing. This is exactly the kind of functionality we need, just in real time and inline.
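
To make the request concrete, here is a rough sketch of what "inline" access might look like from a developer's side. None of this is the real Skype API: `StreamTap`, `add_filter`, `process`, `gain`, and `mix_in` are names I made up for illustration, and I'm assuming audio arrives as frames of signed 16-bit PCM samples. The idea is just that each outgoing or incoming frame passes through a chain of small filters before it is played or sent.

```python
from itertools import cycle
from typing import Callable, List

# Assumption: a frame is a list of signed 16-bit PCM samples.
Frame = List[int]
Filter = Callable[[Frame], Frame]


def clamp16(x: int) -> int:
    """Keep a sample inside the signed 16-bit range."""
    return max(-32768, min(32767, x))


def gain(factor: float) -> Filter:
    """Scale the volume, e.g. to quiet my voice under a translator."""
    def apply(frame: Frame) -> Frame:
        return [clamp16(round(s * factor)) for s in frame]
    return apply


def mix_in(background: Frame, level: float = 0.25) -> Filter:
    """Loop a background clip underneath the call audio."""
    loop = cycle(background)  # position persists from frame to frame
    def apply(frame: Frame) -> Frame:
        return [clamp16(s + round(next(loop) * level)) for s in frame]
    return apply


class StreamTap:
    """Hypothetical hook: the client would hand each frame to process()."""

    def __init__(self) -> None:
        self._filters: List[Filter] = []

    def add_filter(self, f: Filter) -> "StreamTap":
        self._filters.append(f)
        return self

    def process(self, frame: Frame) -> Frame:
        # Run the frame through every filter, in the order they were added.
        for f in self._filters:
            frame = f(frame)
        return frame


tap = StreamTap().add_filter(gain(0.5)).add_filter(mix_in([100], level=0.5))
print(tap.process([1000, -2000]))  # prints [550, -950]
```

Most of the wishlist items above (noise reduction, voice changing, speech recognition feeding a chat transcript) would be just another filter in a chain like this, provided the API lets the filter see every frame in real time.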