Research Scientist in Conversational AI at NVIDIA

I am currently investigating robust LLMs for speech generation, developing efficient learning algorithms that scale to diverse speakers and languages.

I received my Ph.D. in Computer Engineering from UC San Diego, where I was a member of the Adaptive Computing and Embedded Systems Lab advised by Professor Farinaz Koushanfar. During my Ph.D., my research centered on machine learning, speech and natural language processing, generative AI, adversarial learning, systems security, and vulnerability analysis of deep learning models. I was awarded the William S.C. Chang Best Ph.D. Dissertation Award in the Electrical and Computer Engineering (ECE) Department at UC San Diego for my doctoral dissertation, Robust and Efficient Deep Learning for Multimedia Generation and Recognition.

I completed my B.Sc. in Electrical Engineering at the University of Massachusetts Amherst and my B.A. in Physics at Mount Holyoke College in 2015 and 2014, respectively. Prior to pursuing my research interests, I worked as an Applications Engineer at GlobalFoundries in New York. I’m a recipient of the Charles Lee Powell Foundation Fellowship.

News!

  • July 2024: Our paper Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment got accepted at Interspeech 2024!

  • June 2024: I received the William S.C. Chang Best Ph.D. Dissertation Award in the Electrical and Computer Engineering (ECE) Department at UC San Diego!

  • May 2024: Our paper SelfVC got accepted at ICML 2024!

  • March 2024: Our paper REMARK-LLM got accepted at USENIX Security 2024!

  • January 2024: Our paper FaceSigns: Semi-Fragile Watermarks for Media Authentication got accepted at ACM Transactions on Multimedia Computing, Communications, and Applications!

  • July 2023: I joined NVIDIA as a Research Scientist after successfully defending my Ph.D.!

Publications

  • Improving Robustness of LLM-based Speech Synthesis Models by Learning Monotonic Alignment

    Paarth Neekhara*, Shehzeen Hussain*, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg
    Interspeech 2024
    [ * Equal Contribution ]
    [ paper, blog, audio examples ]

  • SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

    Paarth Neekhara*, Shehzeen Hussain*, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
    International Conference on Machine Learning (ICML) 2024
    [ * Equal Contribution ]
    [ paper, blog ]

  • REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

    Ruisi Zhang, Shehzeen Hussain, Paarth Neekhara, Farinaz Koushanfar
    USENIX Security Symposium (USENIX) 2024
    [ paper ]

  • FaceSigns: Semi-Fragile Watermarks for Media Authentication

    Paarth Neekhara*, Shehzeen Hussain*, Xinqiao Zhang, Ke Huang, Julian McAuley, Farinaz Koushanfar
    ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 2024
    [ * Equal Contribution ]
    [ paper ]

  • NetFlick: Adversarial Flickering Attacks on Deep Learning Based Video Compression

    Jung-Woo Chang, Nojan Sheybani, Shehzeen Hussain, Mojan Javaheripi, Seira Hidano, Farinaz Koushanfar
    International Conference on Learning Representations (ICLR) Workshop on ML4IoT 2023
    [ paper, poster ]

  • ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations

    Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
    [ paper, audio examples ]

  • ReFace: Adversarial Transformation Networks for Real-time Attacks on Face Recognition Systems

    Shehzeen Hussain, Todd Huster, Chris Mesterharm, Paarth Neekhara, Farinaz Koushanfar
    IEEE International Conference on Dependable Systems and Networks (DSN) 2023
    [ paper ]

  • FastStamp: Accelerating Neural Steganography and Digital Watermarking of Images on FPGAs

    Shehzeen Hussain, Nojan Sheybani, Paarth Neekhara, Xinqiao Zhang, Javier Duarte, Farinaz Koushanfar
    International Conference On Computer Aided Design (ICCAD) 2022
    [ paper ]

  • Multi-task Voice Activated Framework using Self-supervised Learning

    Shehzeen Hussain, Van Nguyen, Shuhua Zhang, Erik Visser
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
    [ paper ]

  • Cross-modal Adversarial Reprogramming

    Paarth Neekhara*, Shehzeen Hussain*, Jinglong Du, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
    IEEE Winter Conference on Applications of Computer Vision (WACV) 2022
    [ * Equal Contribution ]
    [ paper ]

  • Expressive Neural Voice Cloning

    Paarth Neekhara*, Shehzeen Hussain*, Shlomo Dubnov, Farinaz Koushanfar, Julian McAuley
    Asian Conference on Machine Learning (ACML) 2021
    [ * Equal Contribution ]
    [ paper, audio examples ]

  • Waveguard: Understanding and Mitigating Audio Adversarial Attacks

    Shehzeen Hussain*, Paarth Neekhara*, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
    USENIX Security Symposium (USENIX) 2021
    [ * Equal Contribution ]
    [ paper, code, audio examples ]

  • Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples

    Shehzeen Hussain*, Paarth Neekhara*, Malhar Jere, Farinaz Koushanfar, Julian McAuley
    IEEE Winter Conference on Applications of Computer Vision (WACV) 2021
    [ * Equal Contribution ]
    [ paper, video examples ]

  • Universal Adversarial Perturbations for Speech Recognition Systems

    Paarth Neekhara*, Shehzeen Hussain*, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
    Interspeech 2019
    [ * Equal Contribution ]
    [ paper, audio examples ]

  • Adversarial Reprogramming of Text Classification Neural Networks

    Paarth Neekhara, Shehzeen Hussain, Shlomo Dubnov, Farinaz Koushanfar
    Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 2019
    [ paper, code ]

  • FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

    Shehzeen Hussain, Mojan Javaheripi, Paarth Neekhara, Ryan Kastner, Farinaz Koushanfar
    International Conference On Computer Aided Design (ICCAD) 2019
    [ paper ]

  • Overlay optimization for 1x node technology and beyond via rule based sparse sampling

    Nyan Lynn Aung, Woong Jae Chung, Lokesh Subramany, Shehzeen Hussain, Pavan Samudrala, Haiyong Gao, Xueli Hao, Yen-Jen Chen, Juan-Manuel Gomez
    SPIE Metrology, Inspection, and Process Control for Microlithography 2016
    [ paper ]

Relevant Experience

  • NVIDIA Corporation

    July 2023 - Present
    Santa Clara, California

    Research Scientist - Conversational AI
    Investigating Large Language Models for multi-lingual voice conversion and robust text-to-speech synthesis.

  • NVIDIA Corporation

    June 2022 - September 2022
    Santa Clara, California

    Research Intern - Deep Learning
    Designed deep neural networks for multi-lingual voice conversion and text-to-speech synthesis. Developed state-of-the-art zero-shot voice conversion framework with speaker-adaptive pitch and duration controllability [ paper ]

  • Qualcomm Technologies Research

    June 2021 - September 2021
    San Diego, California

    Deep Learning R&D Intern
    I was a part of the Qualcomm Audio R&D team working on self-supervised learning, using Transformer+CNN architectures for effective keyword spotting and speaker verification. Our work achieved new state-of-the-art. [ paper ]

  • Facebook AI Research

    June 2020 - September 2020
    Menlo Park, California

    Research Intern
    I was a member of the Text-to-Speech Synthesis team within Applied AI Speech. I developed deep neural network models for multi-speaker and multi-style controllable speech synthesis. I also designed end-to-end pipeline for joint training of speaker encoder model with text-to-speech synthesis (Tacotron2) model. Additionally, I developed a voice cloning toolkit for synthesizing speech of unseen speakers from a few reference audio samples.

  • Intel Corporation

    July 2019 - October 2019
    Santa Clara, California

    Graduate Machine Learning Intern
    I was a part of the Non-Volatile Memory Systems Group, working on reinforcement learning for memory and SSD applications.

  • Qualcomm Technologies Research

    June 2018 - September 2018
    San Diego, California

    Deep Learning R&D Intern
    I was a part of Qualcomm Research optimizing power and performance management on Qualcomm chipsets using reinforcement learning with deep neural networks. I was advised by Shankar Sadasivam, Manu Rastogi, Guillaume Sautière and Rajeev Jain.

  • GlobalFoundries

    July 2015 - August 2017
    Malta, New York

    Process Engineer
    Among other responsibilities, I was a member of the applications team and assisted in the design of algorithms to model advanced wafer-level corrections.
    [ paper ]