Jin Xu

Director: Roman Holowinsky, PhD

Date: June 07, 2023

Team:

Building a voice assistant interface for audio-based LLMs

Jim Schwoebel, Jin Xu, Nathan Schley

Project title: Building a voice assistant interface for audio-based LLMs

Team members:
Jin Xu, Nathan Schley
Mentor:
Jim Schwoebel

Problem
The emergence of large language models such as OpenAI's GPT-3 has transformed natural language processing, enabling a range of applications in text generation and understanding. One area that has drawn significant attention is text-to-audio conversion, where these models serve as interfaces for converting written text into high-quality synthesized speech. However, this technology also brings its own set of challenges, including:

Text-to-audio interfaces often struggle to capture subtle vocal cues, intonations, and emotions present in the original text, resulting in monotonous or robotic-sounding output that lacks the desired level of authenticity.
Large language models can occasionally introduce errors or inaccuracies when transforming text into speech, leading to misp

THE ERDŐS INSTITUTE

Helping PhDs get and create jobs they love at every stage of their career.
