Volume 312: Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), 26 January 2026, Singapore EXPO, Singapore, Singapore
Editors: Tatsuya Komatsu, Keisuke Imoto, Xiaoxue Gao, Nobutaka Ono, Nancy F. Chen
Lina-Speech: Gated Linear Attention and Initial-State Tuning for Multi-Sample Prompting Text-To-Speech Synthesis
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:1-20
AudioBERTScore: Objective Evaluation of Environmental Sound Synthesis Based on Similarity of Audio Embedding Sequences
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:21-37
Semi-supervised Acoustic Scene Classification under Spatial-Temporal Variability with a CRNN-based Model
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:38-47
Online Independent Low-Rank Matrix Analysis as a Lightweight and Trainable Model for Real-Time Multichannel Music Source Separation
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:48-60
Train multi-modal LLM to understand diverse speech paralinguistics by distilling from teacher with meta-information prompt
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:61-77
Latent-RQ: Enhancing Speech Pre-training with Latent Representations and Random Quantization
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:78-93
Can You Hear Naples? Building and Benchmarking a Neapolitan Speech Corpus
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:94-112
AudioRAG: A Challenging Benchmark for Audio Reasoning and Information Retrieval
Proceedings of the AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), PMLR 312:113-125