Nicholas J. Bryan

Head of Music AI
Adobe Research
San Francisco, CA

email: njb at ieee dot org

Research

Generative AI for music and audio, focused on human–AI co-creation: controllable, fast models that let people control and edit media, rather than relying on one-shot text prompts.

About

I am Head of Music AI and Principal Scientist at Adobe Research. I received my PhD and MA from CCRMA, Stanford University, and an MS in Electrical Eng., also from Stanford. Before that, I received my Bachelor of Music and BS in Electrical Eng., summa cum laude with general honors and departmental honors, at the U. of Miami-FL. I have received 2 best paper awards, 1 AES Graduate Design Gold award, 1 best reviewer award, and 1 best paper finalist acknowledgement. I was general co-chair of WASPAA 2023, a 2x elected member of the IEEE AASP TC, an IEEE Senior Member, and am an Adobe distinguished inventor. I've also been a musician since childhood, performed at Carnegie Hall, and have been on the front page of the New York Times.


News

[05/2026] Invited Talk at the Conversational AI Reading Group at Mila: "The Grand Design Challenge of Music GenAI"
[05/2026] "V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation" arXiv
[05/2026] "Rethinking Music Captioning with Music Metadata LLMs" ICASSP 2026 arXiv
[05/2026] "A Generative-First Neural Audio Autoencoder" ICASSP 2026 arXiv
[05/2026] "Stemphonic: All-at-once Flexible Multi-stem Music Generation" ICASSP 2026 arXiv
[04/2026] Invited Talk at Design@Large, UCSD, with Orly Lobel and Shahrokh Yadegari (Eventbrite link)
[02/2026] "TAC: Timestamped Audio Captioning" arXiv
[10/2025] Adobe Firefly Generate Soundtrack Released! web
[10/2025] TMLR journal paper! "DRAGON: Distributional Rewards Optimize Diffusion Generative Models" arXiv, web, video
[04/2025] "Beyond Text to Music" Keynote Talk at ICASSP 2025 GenDA Workshop. slides


Adobe Internships

For current students, I offer research internships at Adobe Research in San Francisco, CA. Internships typically take place during the summer months and are for graduate students studying audio/music signal processing, machine learning, or a related field. If you are interested, please send me an email with your CV and research interests in October-December before the given summer. My Adobe webpage is here.


Publications


"V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation"
Y-B. Lin, J. Casebeer, L. Mai, A. Mahapatra, G. Bertasius, N. J. Bryan
arXiv, March, 2026.
(arXiv, web)
"Rethinking Music Captioning with Music Metadata LLMs"
I. Bukey, Z. Wang, C. Donahue, N. J. Bryan
IEEE ICASSP, May, 2026.
(arXiv)

"A Generative-First Neural Audio Autoencoder"
J. Casebeer, G. Zhu, Z. Wang, N. J. Bryan
IEEE ICASSP, May, 2026.
(arXiv, video)

"Stemphonic: All-at-once Flexible Multi-stem Music Generation"
S-L. Wu, G. Zhu, J-P. Caceres, C-Z. Huang, N. J. Bryan
IEEE ICASSP, May, 2026.
(arXiv, web, video)
"TAC: Timestamped Audio Captioning"
S. Kumar, P. Seetharaman, K. Chen, O. Nieto, J. Su, Z. Wang, R. Kumar, D. Manocha, N. J. Bryan, Z. Jin, J. Salamon
arXiv, Feb, 2026.
(arXiv)

"DRAGON: Distributional Rewards Optimize Diffusion Generative Models"
Y. Bai, J. Casebeer, S. Sojoudi, N. J. Bryan
TMLR, April, 2025.
(arXiv, web, video)

"Presto! Distilling Steps and Layers for Accelerating Music Generation"
Z. Novack, G. Zhu, J. Casebeer, J. McAuley, T. Berg-Kirkpatrick, N. J. Bryan
ICLR (Spotlight, top 5%), April, 2025.
(arXiv, web, video)

"MusicHiFi: Fast High-Fidelity Stereo Vocoding"
G. Zhu, J.-P. Caceres, Z. Duan, N. J. Bryan
IEEE SPL, March, 2024.
(arXiv, web, video)

"DITTO: Diffusion Inference-time T-Optimization for Music Generation"
Z. Novack, J. McAuley, T. Berg-Kirkpatrick, N. J. Bryan
ICML (oral, top 1.5%), January, 2024.
(arXiv, web, video)
"DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation"
Z. Novack, J. McAuley, T. Berg-Kirkpatrick, N. J. Bryan
arXiv, May, 2024.
International Society for Music Information Retrieval Conference (ISMIR), November, 2024.
(arXiv, web)

"Music ControlNet: Multiple Time-varying Controls for Music Generation"
S-L. Wu, C. Donahue, S. Watanabe, N. J. Bryan
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), November, 2023.
(arXiv, web, video)
"Meta-AF: Meta-learning for Adaptive Filters."
J. Casebeer, N. J. Bryan, P. Smaragdis
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), January, 2022.
Presented at IEEE International Conf. on Acoustics, Speech, and Signal Processing (ICASSP), June, 2023.
(TASLP, arXiv, web, code, video)
"Style Transfer of Audio Effects with Differentiable Signal Processing."
C. J. Steinmetz, N. J. Bryan, J. D. Reiss
Journal of the Audio Engineering Society (JAES), September, 2022.
Presented at 154th AES Europe Convention, May, 2023.
(arXiv, code, demo)
"Meta-learning for Adaptive Filters with Higher-order Frequency Dependencies."
J. Wu, J. Casebeer, N. J. Bryan, P. Smaragdis
IEEE Workshop on Acoustic Signal Enhancement (IWAENC), September, 2022.
(arXiv, code, demo)
"Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization."
H. Yang, S. Firodiya, N. J. Bryan, M. Kim
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022.
(arXiv, paper, code)
"Emotion Embedding Spaces for Matching Music to Stories."
M. Won, J. Salamon, N. J. Bryan, G. J. Mysore, X. Serra
International Society for Music Information Retrieval Conference (ISMIR), 2021.
(code, paper)
ISMIR Best Student Paper Award Winner
"Deep Embeddings and Section Fusion Improve Music Segmentation."
J. Salamon, O. Nieto, N. J. Bryan
International Society for Music Information Retrieval Conference (ISMIR), 2021.
(code, paper)
"Who Calls the Shots? Rethinking Few-Shot Learning for Audio."
Y. Wang, N. J. Bryan, J. Salamon, M. Cartwright, J. P. Bello
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021.
(arXiv, data, code, paper, full talk)
IEEE WASPAA Special Best Paper Award Winner
"Auto-DSP: Learning to Optimize Acoustic Echo Cancellers."
J. Casebeer, N. J. Bryan, P. Smaragdis
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021.
(arXiv, web, code, paper, short talk, full talk)
"Differentiable Signal Processing with Black-Box Audio Effects."
M. A. Martínez Ramírez, O. Wang, P. Smaragdis, N. J. Bryan
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021.
(arXiv, web, code, ieee explore, paper, long-talk)
"Few-shot Continual Learning for Audio Classification."
Y. Wang, N. J. Bryan, M. Cartwright, J.-P. Bello, J. Salamon
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021.
(paper, ieee explore)
"Context-aware Prosody Correction for Text-based Speech Editing."
M. Morrison, L. Rencker, Z. Jin, N. J. Bryan, J.P. Caceres, B. Pardo
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021.
(arXiv, web, paper)
"A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences."
P. Manocha, A. Finkelstein, R. Zhang, N. J. Bryan, G. J. Mysore, Z. Jin
Interspeech, 2020.
(arXiv, web, code)
Interspeech Best Student Paper Finalist
"Controllable Neural Prosody Synthesis."
M. Morrison, Z. Jin, J. Salamon, N. J. Bryan, G. J. Mysore
Interspeech, 2020.
(paper | arXiv | web)
"Metric Learning vs Classification for Disentangled Music Representation Learning."
J. Lee, N. J. Bryan, J. Salamon, Z. Jin, J. Nam
International Society for Music Information Retrieval (ISMIR), 2020.
(paper | arXiv | web)
"Few-shot Drum Transcription in Polyphonic Music."
Y. Wang, J. Salamon, M. Cartwright, N. J. Bryan, J.-P. Bello
International Society for Music Information Retrieval (ISMIR), 2020.
(paper | arXiv)
"Disentangled Multidimensional Metric Learning For Music Similarity."
J. Lee, N. J. Bryan, J. Salamon, Z. Jin, J. Nam
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.
(project page | paper | ieee link | arXiv | talk | data set)
"One-Shot Parametric Audio Production Style Transfer With Application to Frequency Equalization."
S. I. Mimilakis, N. J. Bryan, P. Smaragdis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.
(project page | paper | ieee link | talk)
"Few-Shot Sound Event Detection."
Y. Wang, J. Salamon, N. J. Bryan, J.-P. Bello
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.
(paper | ieee link | talk)
"Impulse Response Data Augmentation and Deep Neural Networks For Blind Room Acoustic Parameter Estimation."
N. J. Bryan
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.
(paper | ieee link | talk)
"Scene-Aware Audio Rendering via Deep Acoustic Analysis."
Z. Tang, N. J. Bryan, D. Li, T. Langlois, D. Manocha
IEEE VR Journal Track (TVCG), 2020.
(paper | ieee | arXiv | project page | video)
"ISSE: An Interactive Source Separation Editor."
N. J. Bryan, G. J. Mysore, G. Wang
Human Factors in Computing Systems (CHI), 2014.
(project page | paper | video)
"Interactive Sound Source Separation."
N. J. Bryan.
Stanford University, Stanford, CA, USA. March, 2014.
(PhD Thesis)

* ISSE: An Interactive Source Separation Editor (talk 1) (talk 2)
* Software (link)
* C++ code (link)
* Matlab code (link)
* Demos (link)
* SiSEC results (link)
* Publications (link)
AES Graduate Student Design Gold Award
"Source Separation of Polyphonic Music With Interactive User-Feedback on a Piano Roll Display."
N. J. Bryan, G. J. Mysore, G. Wang
International Society for Music Information Retrieval Conference (ISMIR), 2013.
(web | paper)
"An Efficient Posterior Regularized Latent Variable Model for Interactive Sound Source Separation."
N. J. Bryan, G. J. Mysore
International Conference on Machine Learning (ICML), 2013.
(web | paper | sisec | Adobe MAX | poster | slides)
"Interactive Refinement of Supervised and Semi-Supervised Sound Source Separation Estimates."
N. J. Bryan, G. J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013.
(web | pre-print | poster)
"Interactive User-Feedback for Sound Source Separation."
N. J. Bryan, G. J. Mysore
International Conf. on Intelligent User-Interfaces, Workshop on Interactive Machine Learning, 2013.
(web | abstract)
"User-Guided Variable-Rate Time-Stretching Via Stiffness Control."
N. J. Bryan, J. Herrera, G. Wang
International Conference on Digital Audio Effects (DAFX), 2012.
(web | paper | slides (long) | slides (short))
"Clustering and Synchronizing Multi-Camera Video via Landmark Cross-Correlation"
N. J. Bryan, P. Smaragdis, G. J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012.
(paper | poster | pre-print | slides, Adobe MAX)
"Musical Influence Network Analysis and Rank of Sampled-Based Music."
N. J. Bryan, G. Wang
International Society for Music Information Retrieval Conference (ISMIR), 2011.
(paper | whosampled | slides)
"Two Turntables and a Mobile Phone."
N. J. Bryan, G. Wang
International Conference on New Interfaces for Musical Expression (NIME), 2011.
(paper | slides | web)
"Instinct-Based Mating in Genetic Algorithms Applied to the Tuning of 1-NN Classifiers."
T. Quirino, M. Kubat, N. J. Bryan
IEEE Transactions on Knowledge and Data Engineering (TKDE), December, 2010.
(paper)
"Methods For Extending Room Impulse Responses Beyond Their Noise Floor."
N. J. Bryan, J. S. Abel
Audio Engineering Society Convention (AES), 2010.
(paper | slides | web)
"Impulse Response Measurements in the Presence of Clock Drift."
N. J. Bryan, M. A. Kolar, J. S. Abel
Audio Engineering Society Convention (AES), 2010.
(paper | slides | web)
"Estimating Room Impulse Responses from Recorded Balloon Pops."
J. S. Abel, N. J. Bryan, P. Huang, M. A. Kolar, B. Pentcheva
Audio Engineering Society Convention (AES), 2010.
(paper | slides | web)
"Approximating Measured Reverberation Using A Hybrid Fixed/Switched Convolution Structure."
K. Lee, N. J. Bryan, J. S. Abel
International Conference on Digital Audio Effects (DAFX), 2010.
(paper | web)
"MoMu: A Mobile Music Toolkit."
N. J. Bryan, J. Herrera, J. Oh, G. Wang
International Conference on New Interfaces for Musical Expression (NIME), 2010.
(paper | web | speaker hands | slides)
"Evolving The Mobile Phone Orchestra."
J. Oh, J. Herrera, N. J. Bryan, L. Dahl, G. Wang
International Conference on New Interfaces for Musical Expression (NIME), 2010.
(paper | speaker hands)
"A Configurable Microphone Array with Acoustically Transparent Omnidirectional Elements."
J. S. Abel, N. J. Bryan, T. Skare, M. A. Kolar, P. Huang, D. Mostowfi, J. O. Smith III
Audio Engineering Society Convention (AES), 2009.
(paper | poster | web)
"Stanford Laptop Orchestra (SLOrk)."
G. Wang, N. J. Bryan, J. Oh, R. Hamilton
International Computer Music Conference (ICMC), 2009.
(pdf: paper | web | speakers)