Hello there! Welcome to my personal portfolio!
I work as a research scientist at Spotify Research. My primary role there is to develop techniques that help computers understand the conversations happening in podcasts!
Podcasts add up to billions of hours of speech audio, with people talking in more than 100 different languages. My job is to teach computers to make sense of all those conversations. I speak only two languages (English and Bengali), and I can focus on listening to only about an hour of audio per day. Do you think my job is feasible? Let me know what you think through the Contact Me page.
I did my Ph.D. at the University of Rochester. My academic advisor was Professor M. Ehsan Hoque. Besides my extraordinary advisor, I was humbled by the opportunity to work with several other amazing researchers and reputed professors, including Professor Henry Kautz, Professor Daniel Gildea, Professor Ji Liu, and Professor Gonzalo Mateos. I am originally from Bangladesh.
Unsupervised Speaker Diarization that is Agnostic to Language, Overlap-Aware, and Tuning Free, Interspeech’22, Incheon, Korea
FairyTED: A Fair Rating Predictor for TED Talk Data, AAAI’20, NY, USA
UR-FUNNY: A Multimodal Language Dataset for Understanding Humor, EMNLP’19, Hong Kong, China
A Causality-Guided Prediction of the TED Talk Ratings from the Speech-Transcripts using Neural Networks, arXiv preprint arXiv:1905.08392
SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology, EMNLP’18, Brussels, Belgium, 2018
Awe the Audience: How the Narrative Trajectories Affect Audience Perception in Public Speaking, CHI’18, Montreal, Canada, 2018
AutoManner: An Automated Interface for Making Public Speakers Aware of Their Mannerisms, IUI’16, Sonoma, CA, USA, 2016
Unsupervised Extraction of Human-Interpretable Nonverbal Behavioral Cues in a Public Speaking Scenario, ACMMM’15, Brisbane, Australia, 2015
Automated Prediction of Job Interview Performance: The Role of What You Say and How You Say It, FG’15, Ljubljana, Slovenia, 2015
News and Presentations
|June 15th, 2022||My first paper after joining Spotify has been accepted for presentation at Interspeech 2022. It is a paper on speaker diarization, which is the problem of identifying “who spoke when” in long audio. In this paper, we propose a speaker diarization technique that is agnostic to language, overlap-aware, and tuning-free.||paper|
|Aug 3rd, 2020||I joined Spotify as a research scientist|
|Nov 10th, 2019||Our FairyTED paper was accepted to AAAI 2020! After reading Judea Pearl’s “The Book of Why”, I realized that neural networks alone can’t make bias-free, fair predictions. It is important to model the data-generating process (possibly using causal models). I convinced Rupam, Ankani, and Soumen of the importance of the problem and, together, we pulled the paper off. I’m very proud of this contribution.||paper|
|Aug 13th, 2019||Our UR-FUNNY paper was accepted to EMNLP 2019||paper|
|Oct 27th, 2018||I gave a presentation in the lab on Judea Pearl’s new book, “The Book of Why”. My advisor, Ehsan Hoque, tweeted about it, and Judea Pearl himself commented on the tweet!||Tweet|
|Oct 22nd, 2018||I’ve successfully defended my PhD thesis||Pictures|
|Aug 6th, 2018||One of my projects from my internship at Comcast Lab was accepted to EMNLP 2018||Paper|
|April 25, 2018||I presented our TED Talk work at CHI 2018||Slides|
|Feb 16, 2018||I presented Chapter 5 of the Deep Learning book in the lab||Slides|
|Jan 2018||Our paper was accepted to CHI 2018|