Hume AI

🎙️Voice & Audio

The AI toolkit for emotionally intelligent voice and speech generation

About

Hume AI builds AI that understands and responds to human emotions. Its flagship products include Octave, a frontier text-to-speech model that lets you design voices from natural-language descriptions or clone a voice from a few seconds of audio, and EVI (Empathic Voice Interface), a speech-to-speech conversational AI that perceives and responds to emotional cues in real time. Octave supports 100+ languages with native-level pronunciation, accepts acting instructions for directing vocal performance, and has a speech LLM latency of 250 ms. Hume AI's technology is ranked #1 in naturalness and expressivity and detects 600+ emotion and voice characteristic tags. It is built on decades of scientific research in human expression, within a strong ethical framework that prioritizes beneficence and emotional well-being.

Features

  • Text-to-speech with natural language voice design
  • Instant voice cloning from seconds of audio
  • 100+ language support with native pronunciation
  • Acting instructions for directing vocal delivery
  • Empathic Voice Interface (EVI) for conversational AI
  • 600+ emotion and voice characteristic detection tags
  • 250 ms speech LLM latency
  • Voice conversion and cross-lingual synthesis
  • Developer API with TypeScript, Python, React, and Swift SDKs
  • Expression measurement for video, audio, images, and text

Use Cases

  • Building emotionally intelligent voice agents and chatbots
  • Creating natural-sounding voiceovers and audio content
  • Cloning voices for consistent brand narration across languages
  • Measuring emotional response in user research and content testing
  • Adding expressive speech to gaming, entertainment, and creative projects

Pricing

  • Free tier
  • Starter: $3/mo
  • Creator: $14/mo
  • Pro: $70/mo
  • Scale: $200/mo
  • Business: $500/mo
  • Enterprise: custom pricing

Added on March 4, 2026