Volume 262: NeurIPS Efficient Natural Language and Speech Processing Workshop, 14 December 2024, Vancouver, British Columbia, Canada
Editors: Mehdi Rezagholizadeh, Peyman Passban, Soheila Samiee, Vahid Partovi Nia, Yu Cheng, Yue Deng, Qun Liu, Boxing Chen
Training
Scaling Smart: Accelerating Large Language Model Pre-Training with Small Model Initialization
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:1-13
Computational Bottlenecks of Training Small-scale Large Language Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:14-21
QuAILoRA: Quantization-Aware Initialization for LoRA
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:22-33
SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:34-46
RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:47-54
Efficient Alignment of Large Language Models via Data Sampling
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:55-72
KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:73-80
Model Design & Architecture
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:81-101
VL-Mamba: Exploring State Space Models for Multimodal Learning
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:102-113
MisD-MoE: A Multimodal Misinformation Detection Framework with Adaptive Feature Selection
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:114-122
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:123-135
Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:136-144
Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:145-164
Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:165-181
StructMoE: Structured Mixture of Experts Using Low Rank Experts
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:182-193
Sparse Upcycling: Inference Inefficient Finetuning
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:194-205
Model Efficiency & Compression
Post-Training Statistical Calibration for Higher Activation Sparsity
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:206-221
Accelerating the Low-Rank Decomposed Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:222-231
The EarlyBird Gets the WORM: Heuristically Accelerating EarlyBird Convergence
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:232-240
Post Training Quantization of Large Language Models with Microscaling Formats
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:241-258
EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:259-269
Scaling laws for post-training quantized large language models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:270-285
Partially Shared Query-Key for Lightweight Language Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:286-291
Inference
Snakes and Ladders: Accelerating SSM Inference with Speculative Decoding
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:292-304
GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:305-321
The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:322-335
Distributed Speculative Inference of Large Language Models is Provably Faster
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:336-354
AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:355-369
Inference-Friendly Models With MixAttention
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:370-381
Improving Multi-candidate Speculative Decoding
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:382-394
Speculative Streaming: Fast LLM Inference without Auxiliary Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:395-413
Hysteresis Activation Function for Efficient Inference
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:414-422
Efficiently Dispatching Flash Attention For Partially Filled Attention Masks
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:423-442
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:443-455
Dynamic Speculation Lookahead Accelerates Speculative Decoding of Large Language Models
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:456-467
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:468-484
Residual vector quantization for KV cache compression in large language model
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:485-490
Benchmark & Evaluation
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:491-511
ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:512-531
On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:532-539
Applications
Text Summarization With Graph Attention Networks
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:540-553
Less is Enough: Adapting Pre-trained Vision Transformers for Audio-Visual Speaker Verification
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:554-563
Enhanced label noise robustness through early adaptive filtering for the self-supervised speaker verification task
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:564-575
Mai Ho‘omāuna i ka ‘Ai: Language Models Improve Automatic Speech Recognition in Hawaiian
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:576-583
Lightweight Neural Networks for Speech Emotion Recognition using Layer-wise Adaptive Quantization
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:584-595
OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:596-610