We describe the use of non-verbal features in voice for direct control of interactive applications. Traditional speech recognition interfaces are based on an indirect, conversational model: the user first gives a direction, and the system then performs the corresponding operation. Our goal is to achieve more direct, immediate interaction, similar to using a button or joystick, by using lower-level features of voice such as pitch and volume. We are developing several prototype interaction techniques based on this idea, such as "control by continuous voice," "rate-based parameter control by pitch," and "discrete parameter control by tonguing." We have implemented several prototype systems, and they suggest that voice-as-sound techniques can enhance traditional voice recognition approaches.
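The three techniques named above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and this page gives no implementation details. It assumes that volume and pitch have already been extracted for each audio frame. The `Frame` type, the `VOICE_THRESHOLD` value, the `gain` parameter, and the rule that the rate follows the pitch change relative to the start of an utterance are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One analysis frame of audio (hypothetical representation)."""
    volume: float  # normalized loudness in [0, 1]
    pitch: float   # estimated fundamental frequency in Hz (ignored when silent)


# Assumed loudness level above which a frame counts as "voice".
VOICE_THRESHOLD = 0.2


def is_voiced(frame: Frame, threshold: float = VOICE_THRESHOLD) -> bool:
    return frame.volume >= threshold


def continuous_control(frames: List[Frame]) -> List[bool]:
    """Control by continuous voice: the control is on while the voice continues."""
    return [is_voiced(f) for f in frames]


def rate_control_by_pitch(frames: List[Frame], start: float = 0.0,
                          gain: float = 0.5) -> float:
    """Rate-based control: on each voiced frame the parameter changes at a
    rate proportional to the pitch offset from the start of the utterance
    (an assumed mapping)."""
    value = start
    reference: Optional[float] = None
    for f in frames:
        if not is_voiced(f):
            reference = None  # silence ends the utterance
            continue
        if reference is None:
            reference = f.pitch  # pitch at utterance onset
        value += gain * (f.pitch - reference)
    return value


def count_tonguing(frames: List[Frame]) -> int:
    """Discrete control by tonguing: count short bursts ("ta ta ta") as
    transitions from silence to voice."""
    count = 0
    previous = False
    for f in frames:
        voiced = is_voiced(f)
        if voiced and not previous:
            count += 1
        previous = voiced
    return count
```

In this sketch, a sustained sound keeps `continuous_control` active, raising the pitch during a sustained sound makes `rate_control_by_pitch` increase the parameter faster, and three separated bursts yield a count of 3 from `count_tonguing`.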
Publications
Takeo Igarashi, John F. Hughes
"Voice as Sound: Using Non-verbal Voice Input for Interactive Control"
14th Annual Symposium on User Interface Software and Technology, ACM UIST'01, Orlando, FL, November 11-14, 2001 (to appear). PDF