Aayush Sharma
ML Researcher at Sarvam AI · audio foundation models, voice-to-voice & music generation
Machine Learning Researcher
Sarvam AI · India
aayushsharmajohn [at] gmail.com
I’m Aayush Sharma, a machine learning researcher at Sarvam AI, where I work on audio foundation models, voice-to-voice (speech) systems, and music generation. I’m drawn to the problem of getting machines to listen and speak naturally — and to the generative and representation-learning ideas that make that possible.
My path into ML ran through a B.Tech in Biotechnology at IIT Guwahati (2019–2023), after which I built Indic speech and language systems in industry: Indic ASR and NER models, semantic voice-activity detection, and large language models. Along the way I picked up a soft spot for knowledge-enhanced ML — work that earned a Best Paper Award at DeeLIO @ ACL 2022. This site is my public research notebook: paper notes, technical deep-dives, and project write-ups, gathered here as I prepare for PhD applications.
Interests: audio & speech foundation models · voice-to-voice systems · music generation · NLP & knowledge-enhanced ML.
news
| Jun 01, 2025 | Our study on the structural transitions of the dehydrin protein appeared in Biochemistry. |
|---|---|
| Sep 01, 2023 | Joined Sarvam AI as a Machine Learning Researcher, working on audio foundation models and voice systems. |
| May 27, 2022 | Trans-KBLSTM received the Best Paper Award at the DeeLIO workshop, ACL 2022 (Dublin). |
latest posts
| Jul 25, 2025 | Stable Audio - Fast Timing-Conditioned Latent Audio Diffusion |
|---|---|
| Jul 25, 2025 | Text-to-Audio-Models |
| Jul 25, 2025 | Noise2Music: Text-conditioned Music Generation with Diffusion Models |
selected publications
- ACL