My Portfolio Website

Shehzeen Samarah Hussain

Email: shehzeensh@gmail.com

Research Scientist in Conversational AI at NVIDIA

I am currently investigating robust LLMs for speech generation, developing efficient learning algorithms that scale up to diverse speakers and languages.

I received my Ph.D. in Computer Engineering from UC San Diego, at the Adaptive Computing and Embedded Systems Lab advised by Professor Farinaz Koushanfar. During the Ph.D, my research centered on machine learning, speech and natural language processing, generative AI, adversarial learning, systems security and vulnerability analysis of deep learning models. I have been awarded the William S.C. Chang Best Ph.D. Dissertation Award in Electrical and Computer Engineering (ECE) Department at UC San Diego for my doctoral dissertation on Robust and Efficient Deep Learning for Multimedia Generation and Recognition.

I completed my B.Sc. in Electrical Engineering from University of Massachusetts, Amherst and B.A in Physics from Mount Holyoke College in 2015 and 2014, respectively. Prior to pursuing my research interests, I worked as an Applications Engineer at Global Foundries New York. I’m a recipient of Charles Lee Powell Foundation Fellowship.

Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance

Shehzeen Hussain, Paarth Neekhara, Xuesong Yang, Edresson Casanova, Subhankar Ghosh, Mikyas T Desta, Roy Fejgin, Rafael Valle, Jason Li
Conference on Empirical Methods in Natural Language Processing EMNLP 2025, ICML 2025 Workshop on Machine Learning for Audio AI Heard That!
[ paper, ICML Workshop, audio examples ]
NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference

Edresson Casanova, Paarth Neekhara, Ryan Langman, Shehzeen Hussain, Subhankar Ghosh, Xuesong Yang, Ante Jukić, Jason Li, Boris Ginsburg
Interspeech 2025
[ paper]
HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset

Ryan Langman, Xuesong Yang, Paarth Neekhara, Shehzeen Hussain, Edresson Casanova, Evelina Bakhturina, Jason Li
Interspeech 2025
[ paper]
Low frame-rate speech codec: a codec designed for fast high-quality speech LLM training and inference

Edresson Casanova, Ryan Langman, Paarth Neekhara, Shehzeen Hussain, Jason Li, Subhankar Ghosh, Ante Jukić, Sang-gil Lee
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025
[ paper]
Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment

Paarth Neekhara*, Shehzeen Hussain*, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg
Interspeech 2024
[ * Equal Contribution ]
[ paper, blog, audio examples ]
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

Paarth Neekhara*, Shehzeen Hussain*, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
International Conference on Machine Learning (ICML) 2024
[ * Equal Contribution ]
[ paper, blog ]
REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

Ruisi Zhang, Shehzeen Hussain, Paarth Neekhara, Farinaz Koushanfar
USENIX Security Symposium (USENIX) 2024
[ paper ]
FaceSigns: Semi-Fragile Watermarks for Media Authentication

Paarth Neekhara*, Shehzeen Hussain*, Xinqiao Zhang, Ke Huang, Julian McAuley, Farinaz Koushanfar
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 2024
[ * Equal Contribution ]
[ paper ]
NetFlick: Adversarial Flickering Attacks on Deep Learning Based Video Compression

Jung-Woo Chang, Nojan Sheybani, Shehzeen Hussain, Mojan Javaheripi, Seira Hidano, Farinaz Koushanfar IEEE International Conference on Learning Representations (ICLR) Workshop on ML4IoT 2023
[ paper, poster ]
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations

Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
[ paper, audio examples ]
ReFace: Adversarial Transformation Networks for Real-time Attacks on Face Recognition Systems

Shehzeen Hussain, Todd Huster, Chris Mesterharm, Paarth Neekhara, Farinaz Koushanfar
IEEE International Conference on Dependable Systems and Networks (DSN) 2023
[ paper ]
FastStamp: Accelerating Neural Steganography and Digital Watermarking of Images on FPGAs

Shehzeen Hussain, Nojan Sheybani, Paarth Neekhara, Xinqiao Zhang, Javier Duarte, Farinaz Koushanfar
International Conference On Computer Aided Design (ICCAD) 2022
[ paper ]
Multi-task Voice Activated Framework using Self-supervised Learning

Shehzeen Hussain, Van Nguyen, Shuhua Zhang, Erik Visser
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
[ paper ]
Cross-modal Adversarial Reprogramming

Paarth Neekhara*, Shehzeen Hussain*, Jinglong Du, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
IEEE Winter Conference on Applications in Computer Vision (WACV) 2022
[ * Equal Contribution ]
[ paper ]
Expressive Neural Voice Cloning

Paarth Neekhara*, Shehzeen Hussain*, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
Asian Conference on Machine Learning (ACML) 2021
[ * Equal Contribution ]
[ paper, audio examples ]
Waveguard: Understanding and Mitigating Audio Adversarial Attacks

Shehzeen Hussain*, Paarth Neekhara*, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
USENIX Security Symposium (USENIX) 2021
[ * Equal Contribution ]
[ paper,code, audio examples ]
Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples

Shehzeen Hussain*, Paarth Neekhara*, Malhar Jere, Farinaz Koushanfar, Julian McAuley
IEEE Winter Conference on Applications in Computer Vision (WACV) 2021
[ * Equal Contribution ]
[ paper, video examples ]
Universal Adversarial Perturbations for Speech Recognition Systems

Paarth Neekhara*, Shehzeen Hussain*, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
Interspeech 2019
[ * Equal Contribution ]
[ paper, audio examples ]
Adversarial Reprogramming of Text Classification Neural Networks

Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koshanfar
Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing 2019 (EMNLP)
[ paper, code ]
FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Shehzeen Hussain, Mojan Javaheripi, Paarth Neekhara, Ryan Kastner, Farinaz Koushanfar
International Conference On Computer Aided Design 2019 (ICCAD)
[ paper]
Overlay optimization for 1x node technology and beyond via rule based sparse sampling

Nyan Lynn Aung, Woong Jae Chung, Lokesh Subramany, Shehzeen Hussain, Pavan Samudrala, Haiyong Gao, Xueli Hao, Yen-Jen Chen, Juan-Manuel Gomez
Metrology, Inspection, and Process Control for Microlithography 2016 SPIE
[ paper ]

NVIDIA Corporation

July 2023 - Present
Santa Clara, California

Senior Research Scientist - Conversational AI
Investigating Large Language Models for multi-lingual voice conversion and robust text-to-speech synthesis.
NVIDIA Corporation

June 2022 - September 2022
Santa Clara, California

Research Intern - Deep Learning
Designed deep neural networks for multi-lingual voice conversion and text-to-speech synthesis. Developed state-of-the-art zero-shot voice conversion framework with speaker-adaptive pitch and duration controllability [ paper ]
Qualcomm Technologies Research

June 2021 - September 2021
San Diego, California

Deep Learning R&D Intern
I was a part of the Qualcomm Audio R&D team working on self-supervised learning, using Transformer+CNN architectures for effective keyword spotting and speaker verification. Our work achieved new state-of-the-art. [ paper ]
Facebook AI Research

June 2020 - September 2020
Menlo Park, California

Research Intern
I was a member of the Text-to-Speech Synthesis team within Applied AI Speech. I developed deep neural network models for multi-speaker and multi-style controllable speech synthesis. I also designed end-to-end pipeline for joint training of speaker encoder model with text-to-speech synthesis (Tacotron2) model. Additionally, I developed a voice cloning toolkit for synthesizing speech of unseen speakers from a few reference audio samples.
Intel Corporation

July 2019 - October 2019
Santa Clara, California

Graduate Machine Learning Intern
I was a part of the Non-Volatile Memory Systems Group, working on reinforcement learning for memory and SSD applications.
Qualcomm Technologies Research

June 2018 - September 2018
San Diego, California

Deep Learning R&D Intern
I was a part of Qualcomm Research optimizing power and performance management on Qualcomm chipsets using reinforcement learning with deep neural networks. I was advised by Shankar Sadasivam, Manu Rastogi, Guillaume Sautière and Rajeev Jain.
GlobalFoundries

July 2015 - August 2017
Malta, New York

Process Engineer
Among other responsibilities, I was a member of the applications team and assisted in design of algorithms to model advanced wafer level corrections.
[ paper ]

Shehzeen Samarah Hussain

News!

August 2025: Our paper Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance got accepted at EMNLP 2025!

August 2025: Our papers HiFiTTS-2 and NanoCodec got accepted at Interspeech 2025!

July 2024: Our paper Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment got accepted at Interspeech 2024!

June 2024: I received the William S.C. Chang Best Ph.D. Dissertation Award in Electrical and Computer Engineering (ECE) Department at UC San Diego!

May 2024: Our paper SelfVC got accepted at ICML 2024!

March 2024: Our paper REMARK-LLM got accepted at USENIX Security 2024!

January 2024: Our paper FaceSigns: Semi-Fragile Watermarks for Media Authentication got accepted at ACM Transactions on Multimedia Computing, Communications, and Applications!

July 2023: I joined Nvidia as a Research Scientist after successfully defending my Ph.D!

Publications

Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance

NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference

HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset

Low frame-rate speech codec: a codec designed for fast high-quality speech LLM training and inference

Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

FaceSigns: Semi-Fragile Watermarks for Media Authentication

NetFlick: Adversarial Flickering Attacks on Deep Learning Based Video Compression

ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations

ReFace: Adversarial Transformation Networks for Real-time Attacks on Face Recognition Systems

FastStamp: Accelerating Neural Steganography and Digital Watermarking of Images on FPGAs

Multi-task Voice Activated Framework using Self-supervised Learning

Expressive Neural Voice Cloning

Waveguard: Understanding and Mitigating Audio Adversarial Attacks

Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples

Universal Adversarial Perturbations for Speech Recognition Systems

Adversarial Reprogramming of Text Classification Neural Networks

FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Overlay optimization for 1x node technology and beyond via rule based sparse sampling

Relevant Experience

NVIDIA Corporation

NVIDIA Corporation

Qualcomm Technologies Research

Facebook AI Research

Intel Corporation

Qualcomm Technologies Research

GlobalFoundries

Shehzeen Samarah Hussain

News!

August 2025: Our paper Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance got accepted at EMNLP 2025!

August 2025: Our papers HiFiTTS-2 and NanoCodec got accepted at Interspeech 2025!

July 2024: Our paper Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment got accepted at Interspeech 2024!

June 2024: I received the William S.C. Chang Best Ph.D. Dissertation Award in Electrical and Computer Engineering (ECE) Department at UC San Diego!

May 2024: Our paper SelfVC got accepted at ICML 2024!

March 2024: Our paper REMARK-LLM got accepted at USENIX Security 2024!

January 2024: Our paper FaceSigns: Semi-Fragile Watermarks for Media Authentication got accepted at ACM Transactions on Multimedia Computing, Communications, and Applications!

July 2023: I joined Nvidia as a Research Scientist after successfully defending my Ph.D!

Publications

Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance

NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference

HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset

Low frame-rate speech codec: a codec designed for fast high-quality speech LLM training and inference

Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

FaceSigns: Semi-Fragile Watermarks for Media Authentication

NetFlick: Adversarial Flickering Attacks on Deep Learning Based Video Compression

ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations

ReFace: Adversarial Transformation Networks for Real-time Attacks on Face Recognition Systems

FastStamp: Accelerating Neural Steganography and Digital Watermarking of Images on FPGAs

Multi-task Voice Activated Framework using Self-supervised Learning

Cross-modal Adversarial Reprogramming

Expressive Neural Voice Cloning

Waveguard: Understanding and Mitigating Audio Adversarial Attacks

Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples

Universal Adversarial Perturbations for Speech Recognition Systems

Adversarial Reprogramming of Text Classification Neural Networks

FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Overlay optimization for 1x node technology and beyond via rule based sparse sampling

Relevant Experience

NVIDIA Corporation

NVIDIA Corporation

Qualcomm Technologies Research

Facebook AI Research

Intel Corporation

Qualcomm Technologies Research

GlobalFoundries