<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Proceedings of Machine Learning Research</title>
    <description>Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)
  Held at UiT The Arctic University, Tromsø, Norway on 06-08 January 2026

Published as Volume 307 by the Proceedings of Machine Learning Research on 19 January 2026.

Volume Edited by:
  Hyeongji Kim
  Adín Ramírez Rivera
  Benjamin Ricaud

Series Editors:
  Neil D. Lawrence
</description>
    <link>https://proceedings.mlr.press/v307/</link>
    <atom:link href="https://proceedings.mlr.press/v307/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Sat, 14 Mar 2026 22:51:34 +0000</pubDate>
    <lastBuildDate>Sat, 14 Mar 2026 22:51:34 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>SimGroupAttn: Similarity-Guided Group Attention for Vision Transformer to Incorporate Population Information in Plant Disease Detection</title>
        <description>In this paper, we address the problem of Vision Transformer (ViT) models being limited to intra-image attention, which prevents them from leveraging cross-sample information. This is highly relevant for agricultural applications such as plant disease detection, an important challenge where early and reliable diagnosis helps protect yields and food security. Yet existing methods often fail to capture subtle or overlapping symptoms that only become evident when considered in a population context. Our approach $\textit{SimGroupAttn}$ extends masked image modeling by enabling image patches to attend not only within their own image but also to similar regions across other images in the same batch. Guided by a cosine similarity score that is trained jointly with the model weights, $\textit{SimGroupAttn}$ incorporates population-level context into the learned representations, making them more robust and discriminative. Extensive experiments on the PlantPathology dataset demonstrate that our approach outperforms Simple Masked Image Modeling (SimMIM) and Masked Autoencoders (MAE) in linear probing and classification tasks. It improves top-1 accuracy by up to 6.5% in linear probing for complex classes and 3.5% in classification compared with the best baseline model performance under the same settings.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/wu26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/wu26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Predictive and Explanatory Uncertainties in Graph Neural Networks: A Case Study in Molecular Property Prediction</title>
        <description>Accurate molecular property prediction is a key challenge in fields such as drug discovery and materials science, where deep learning models offer promising solutions. However, the widespread use of these models is hindered by their lack of transparency and the difficulty in assessing the reliability of their predictions. In this study, we address these issues by integrating uncertainty quantification and explainable AI techniques to enhance the trustworthiness of graph neural networks for molecular property prediction. We focus on predicting two distinct properties: aqueous solubility and mutagenicity. By deriving explanations in the form of substructure attribution scores, we obtain interpretable explanations that signify which chemically meaningful substructures influence the model’s predictions. We incorporate uncertainty quantification to evaluate the confidence of both the predictions and their explanations. Our results demonstrate that predictive uncertainty scores correlate with the accuracy of the predictions for both tasks. Uncertainties in the explanations also correlate with prediction correctness, and there is a weak to moderate correlation between the uncertainties in the predictions and those in the explanations. These findings highlight the potential of combining uncertainty quantification and explainability to improve the trustworthiness of molecular property prediction models.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/wodrich26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/wodrich26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Using Ensemble Diffusion to Estimate Uncertainty for End-to-End Autonomous Driving</title>
        <description>End-to-end planning systems for autonomous driving are rapidly improving, especially in closed-loop simulation environments like CARLA. Many such driving systems either do not consider uncertainty as part of the plan itself or obtain it by using specialized representations that do not generalize. In this paper, we propose EnDfuser, an end-to-end driving system that uses a diffusion model as the trajectory planner. EnDfuser effectively leverages complex perception information, such as fused camera and LiDAR features, by combining attention pooling and trajectory planning into a single diffusion transformer module. Instead of committing to a single plan, EnDfuser produces a distribution of candidate trajectories (128 in our case) from a single perception frame through ensemble diffusion. By observing the full set of candidate trajectories, EnDfuser provides interpretability for uncertain, multimodal future trajectory spaces. Using this information, we design a simple safety rule that improves the system’s driving score by 1.7% on the LAV benchmark. Our findings suggest that ensemble diffusion, used as a drop-in replacement for traditional point-estimate trajectory planning modules, can contribute to an uncertainty-aware decision-making process in end-to-end driving policies by modeling the uncertainty of the posterior trajectory distribution.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/wintel26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/wintel26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Explaining Latent Representations of Neural Networks with Archetypal Analysis</title>
        <description>We apply Archetypal Analysis to the latent spaces of trained neural networks, offering interpretable explanations of feature representations of neural networks without relying on user-defined corpora. Through layer-wise analyses of convolutional networks and vision transformers across multiple classification tasks, we demonstrate that archetypes are robust, dataset-independent, and provide intuitive insights into how models encode and transform information from layer to layer. Our approach enables global insights by characterizing the unique structure of the latent representation space of each layer, while also offering localized explanations of individual decisions as convex combinations of extreme points (i.e., archetypes).</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/wedenborg26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/wedenborg26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Comparing Foundation Models for Medical Images: A Study on Limited Data and Generalization</title>
        <description>In this study, we investigate how vision foundation models pretrained on different domains compete with a specialized model for classification as a function of the size of the labeled training set of medical images. Furthermore, we examine the different models’ ability to generalize to difficult cases. Our experiments are conducted on cardiac ultrasound images with the downstream task of view recognition. This classification task serves as a demonstrative example, and we expect the findings to transfer to other classification tasks and domains. Through these experiments, we found that the foundation models were able to beat the performance of our task-specific supervised model when labeled training data were limited. This was true even for models trained on natural images and when using the simple linear probing method to create a classifier. We observed that more domain-specific foundation models achieved an even higher performance with limited data. On the other hand, the more general models showed a greater ability to generalize and perform well on difficult, out-of-distribution cases. Still, for typical in-domain cases with sufficient labeled data, a task-specific ResNet model was competitive with the foundation models, while also being both smaller and faster.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/utseth26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/utseth26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries</title>
        <description>Accurate fisheries data are crucial for effective and sustainable marine resource management. With the recent adoption of Electronic Monitoring (EM) systems, more video data is now being collected than can be feasibly reviewed manually. This paper addresses this challenge by developing an optimized deep learning pipeline for automated fish re-identification (Re-ID) using the novel AutoFish dataset, which simulates conveyor-belt EM systems with six similar-looking fish species. We demonstrate that key Re-ID metrics (R1 and mAP@k) are substantially improved by using hard triplet mining in conjunction with a custom image transformation pipeline that includes dataset-specific normalization. By employing these strategies, we demonstrate that the Vision Transformer-based Swin-T architecture consistently outperforms the Convolutional Neural Network-based ResNet-50, achieving peak performance of 41.65% mAP@k and 90.43% Rank-1 accuracy. An in-depth analysis reveals that the primary challenge is distinguishing visually similar individuals of the same species (intra-species errors), where viewpoint inconsistency proves significantly more detrimental than partial occlusion. The source code and documentation are available at: \url{https://github.com/msamdk/Fish_Re_Identification.git}</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/thilakarathna26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/thilakarathna26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Preserving Ordinality in Diabetic Retinopathy Grading through a Distribution-Based Loss Function</title>
        <description>Diabetic Retinopathy (DR) is a neurovascular complication of diabetes and the leading cause of blindness in adults in developed countries. Because DR progresses through ordered severity levels, its grading is naturally an ordinal classification problem. Yet, most deep learning methods treat it as a categorical task, disregarding the inherent class order and worsening performance under class imbalance. In this work, we introduce a novel ordinal loss function that emphasizes the predictive tendencies of the whole model output rather than the class output probabilities individually. This design promotes unimodal predictions aligned with the underlying severity scale and is particularly robust to class imbalance. To place our method in context, we also evaluate a range of existing ordinal approaches on five publicly available DR datasets, with cross-entropy serving as a nominal baseline. Extensive experiments demonstrate that our proposed loss function consistently preserves the ordinal structure of DR grades, even under severe imbalance, outperforming both nominal and alternative ordinal formulations. Code will be released upon acceptance.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/stelter26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/stelter26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Investigating the relationship between diversity and generalization in deep neural networks</title>
        <description>In ensembles, improved generalization is frequently attributed to \emph{diversity} among members of the ensemble. By viewing a single neural network as an \emph{implicit ensemble}, we perform an exploratory investigation that applies well-known ensemble diversity measures to a neural network in order to study the relationship between diversity and generalization in the over-parameterized regime. Our results show that i) deeper layers of the network generally have higher levels of diversity—particularly for MLPs—and ii) layer-wise accuracy positively correlates with diversity. Additionally, we study the effects of well-known regularizers such as Dropout, DropConnect and batch size, on diversity and generalization. We generally find that increasing the strength of the regularizer increases the diversity in the neural network and this increase in diversity is positively correlated with model accuracy. We show that these results hold for several benchmark datasets (such as Fashion-MNIST and CIFAR-10) and architectures (MLPs and CNNs). Our findings suggest new avenues of research into the generalization ability of deep neural networks.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/spoel26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/spoel26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Reducing Manual Workload in SAR-Based Oil Spill Detection Through Uncertainty-Aware Deep Learning</title>
        <description>Constant monitoring of the oceans is required to detect oil spills and reduce environmental damage associated with spills. Synthetic Aperture Radar (SAR) imaging is a critical tool for oil spill detection, but is complex and requires time-consuming manual labor for analysis. Deep learning has shown encouraging performance in automatic classification of oil spills on these images, but the performance is still not sufficient for a deep learning classifier to act autonomously, making manual assessment essential. However, if only a reduced subset of uncertain samples had to be analyzed by human experts while the remaining samples could be automatically classified, it could greatly reduce the manual workload. In this study, we investigate if uncertainty estimates can identify which samples should be prioritized for manual inspection. Specifically, we propose a pipeline that defines a user-specified error tolerance and identifies an uncertainty threshold separating samples that can be classified automatically from those requiring manual inspection. We evaluate the proposed pipeline on challenging real-world data. The results show that our proposed uncertainty-based ranking technique can reduce the manual workload by 41%, paving the way for new and more efficient ways to detect marine oil spills.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/solskinnsbakk26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/solskinnsbakk26a.html</guid>
        
        
      </item>
    
      <item>
        <title>EEG Guided Token Selection in VQ for Visual Brain Decoding</title>
        <description>Reconstructing visual stimuli from non-invasive Electroencephalography (EEG) is an interesting but challenging task in brain decoding that involves translating noisy neural signals into images via fine-grained generative control. In this work, we introduce a novel and efficient framework that guides a visual token generator by conditioning the generation process on a high-level semantic understanding of the EEG signal. Our method leverages a pre-trained LaBraM-based architecture to derive a robust class prediction from the neural data. In comparison to recent works that involve diffusion models, which require high computational resources and long inference times, our approach utilizes a lightweight and efficient token generator by building upon the bidirectional, parallel decoding capabilities of MaskGIT. This choice of components avoids the high computational requirements typical of large-scale diffusion processes. This focus on efficiency makes our approach not only easier to train but also more viable for potential BCI applications where real-time feedback is crucial. The core of our method is a straightforward yet powerful two-stage process. First, the EEG classifier distills the complex input signal into a class label. In the second stage, this label serves as a direct condition for the pre-trained token generator. The generator, guided by this class information, then produces a sequence of discrete latent codes that are semantically consistent with the original stimulus. This neurally-guided token sequence is finally rendered into a high-fidelity image by a pretrained decoder, completing an efficient pathway from brain activity to visual representation.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/rathore26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/rathore26a.html</guid>
        
        
      </item>
    
      <item>
        <title>RAG in the Aerospace Domain: A Comprehensive Retrieval, Generation, and User Evaluation for NASA Documentation</title>
        <description>Large Language Models (LLMs) have demonstrated remarkable capabilities in Natural Language Understanding and text generation, but their application is often limited by hallucinations, outdated knowledge, and lack of evidence. Retrieval-Augmented Generation (RAG) addresses these fundamental LLM limitations by integrating external knowledge sources, thereby improving factual accuracy and traceability while maintaining text generation capabilities. This work presents the design and implementation of a web-based RAG system for the aerospace domain, leveraging more than 10,000 NASA technical documents and lessons-learned mission reports. The system integrates open-source LLaMA and closed-source OpenAI models and performs an extensive comparative analysis of their performance within the RAG framework. Evaluation through both automated metrics and user studies demonstrates the effectiveness of the RAG approach for both technical and non-technical users. The findings provide insights and establish a foundation for future advancements in AI-driven knowledge management for specialized fields.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/petniunas26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/petniunas26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Design and Evaluation of a Geometric Algebra-Based Graph Neural Network for Molecular Property Prediction</title>
        <description>Geometric Algebra (GA) provides a unified framework for representing scalars, vectors, and higher-dimensional geometric elements, along with the geometric product, an operation that mixes information across these components in an equivariant manner. While GA has recently attracted attention in deep learning, its potential for molecular property prediction remains underexplored. We introduce GA-GNN, a novel equivariant graph neural network that extends message passing architectures from separate scalar and vector features to multivector representations, and employs sequences of geometric product layers as the core update mechanism. Evaluated on the QM9 benchmark, GA-GNN achieves competitive performance with the recent state-of-the-art while demonstrating that GA-based representations can simplify architecture design. These results highlight the potential of GA for building expressive equivariant message passing networks for molecular property prediction.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/petersen26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/petersen26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Incorporating the Cycle Inductive Bias in Masked Autoencoders</title>
        <description>Many time series exhibit cyclic structure, for example physiological signals such as ECG or EEG, yet most representation learning methods treat them as generic sequences. We propose a masked autoencoder (MAE) framework that explicitly leverages cycles as an inductive bias for more efficient and effective time-series modelling. Our method decomposes sequences into cycles and trains the model to reconstruct masked segments at both the cycle and sequence level. This cycle-based decomposition shortens the effective sequence length processed by the encoder by up to a factor of ten in our experiments, yielding substantial computational savings without loss in reconstruction quality. At the same time, the approach exposes the encoder to a greater diversity of temporal patterns, as each cycle forms an additional training instance, which enhances the ability to capture subtle intra-cycle variations. Empirically, our framework outperforms three competitive baselines across four cyclic datasets, while also reducing training time on larger datasets.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/ottersen26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/ottersen26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Liver, vessel, and tumor segmentation from partially labeled CT and multi-label masked learning</title>
        <description>Accurate delineation of liver parenchyma, intrahepatic vessels, and tumors (LVT) may aid earlier tumor detection, consistent response assessment, and surgical planning for patients with liver cancer. Deep learning (DL) may enable such automated delineation, but available CT datasets are inconsistent and partially labeled, making them unsuited for end-to-end training. We investigate a single-head, 3D segmentation framework that learns from partially labeled data by: (i) loss masking per class or voxel to ignore missing annotations, (ii) using multi-hot targets and the anatomical hierarchy inherent to liver, vessels, and tumors, to handle overlapping structures without class competition. In controlled ablations that simulate partial-label training, this multi-label masked strategy reliably outperforms masked multi-class baselines, avoids precision collapse, and improves tumor overlap and lesion detection sensitivity. Scaling training to multiple partially labeled datasets, the model surpasses full-resolution nnU-Net on an external clinical cohort, with higher tumor and vessel segmentation performance. We conduct a retrospective feasibility analysis on clinical data to illustrate the clinical potential of the LVT application. We find that LVT models may facilitate earlier detection of metastasis, longitudinal size tracking aligned with radiologist measurements, 3D tumor–vessel visualization for surgical planning, and stable inter-phase liver volumetry ($\approx$ 5% deviation). These results show that multi-label masked learning enables robust, clinically relevant LVT segmentation from partially labeled datasets.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/ostmo26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/ostmo26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Hybrid Concept-based Models: Using Concepts to Improve Neural Networks’ Accuracy</title>
        <description>Most datasets used for supervised machine learning consist of a single label per data point. However, in cases where more information than just the class label is available, would it be possible to train models more efficiently? We introduce two novel model architectures, which we call \emph{hybrid concept-based models}, that train using both class labels and additional information in the dataset referred to as \emph{concepts}. In order to thoroughly assess their performance, we introduce \emph{ConceptShapes}, an open and flexible class of datasets with concept labels. We show that the hybrid concept-based models can outperform standard computer vision models and previously proposed concept-based models with respect to accuracy. We also introduce an algorithm for performing \emph{adversarial concept attacks}, where an image is perturbed in a way that does not change a concept-based model’s concept predictions, but changes the class prediction. The existence of such adversarial examples raises questions about the interpretable qualities promised by concept-based models.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/opsahl26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/opsahl26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Kolmogorov–Arnold Networks for Cross-Domain Time-Series Modeling in Health and Activity Monitoring</title>
        <description>Time-series data from wearable sensors and clinical assessments provide complementary perspectives on human health, yet they often remain siloed across domains. This work presents a framework for harmonizing heterogeneous time-series sources at both minute and daily resolutions, extracting interpretable temporal features through techniques such as frequency-domain analysis and automated feature engineering. On top of this feature space, we benchmark conventional machine learning methods (Random Forest, Logistic Regression, and Gradient Boosting) and a Transformer baseline against a proposed Kolmogorov–Arnold Network (KAN) model, which adaptively learns functional transformations tailored to complex temporal patterns. We evaluate models on tasks including activity index prediction and disorder-related classification, with a focus on transfer learning across lifestyle and clinical domains. Results indicate that KANs achieve competitive performance and offer greater interpretability of temporal dynamics than black-box architectures. The proposed framework demonstrates how modern time-series models can enable cross-domain learning and improve the understanding of physiological and behavioral health patterns.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/mohammed26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/mohammed26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Improving Vision Model Robustness against Misclassification and Uncertainty Attacks via Underconfidence Adversarial Training</title>
        <description>Adversarial robustness research has focused on defending against misclassification attacks. However, such adversarially trained models remain vulnerable to underconfidence adversarial attacks, which reduce the model’s confidence without changing the predicted class. Decreased confidence can result in unnecessary interventions, delayed diagnoses, and a weakening of trust in automated systems. In this work, we introduce two novel underconfidence attacks: one that induces ambiguity between a class pair, and ConfSmooth, which spreads uncertainty across all classes. For defense, we propose Underconfidence Adversarial Training (UAT), which embeds our underconfidence attacks in an adversarial training framework. We extensively benchmark our underconfidence attacks and defense strategies across six model architectures (both CNN and ViT-based) and seven datasets (MNIST, CIFAR, ImageNet, MSTAR, and medical imaging). In 14 of the 15 data-architecture combinations, our attack outperforms the state-of-the-art, often substantially. Our UAT defense maintains the highest robustness against all underconfidence attacks on CIFAR-10, and achieves comparable or better robustness than adversarial training against misclassification attacks while taking half the gradient steps. By broadening the scope of adversarial robustness to include uncertainty-aware threats and defenses, UAT enables more robust computer vision systems.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/marti-nez-marti-nez26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/marti-nez-marti-nez26a.html</guid>
        
        
      </item>
    
      <item>
        <title>AI-Enabled Vessels Segmentation Model for Real-Time Laparoscopic Ultrasound Imaging</title>
        <description>Laparoscopic ultrasound (LUS) is essential for assessing the liver during laparoscopic liver resections. However, the interpretation of LUS images presents significant challenges due to the steep learning curve and image noise. In this study, we propose an enhanced U-Net-based neural network with a ResNet18 backbone specifically designed for real-time liver vessel segmentation of 2D LUS images. Our approach incorporates five preprocessing steps aimed at maximizing the training information extracted from the ultrasound sonogram region. The modified U-Net model achieved a Dice coefficient of 0.879, demonstrating real-time performance at 40 frames per second and enabling the development of advanced ultrasound-based surgical navigation solutions.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/kupcikevicius26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/kupcikevicius26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Analyzing Fairness of Neural Network Prediction via Counterfactual Dataset Generation</title>
        <description>Interpreting the inference-time behavior of deep neural networks remains a challenging problem.  Existing approaches to counterfactual explanation typically ask: What is the closest alternative $\textit{input}$ that would alter the model’s prediction in a desired way? In contrast, we explore $\textbf{counterfactual datasets}$.  Rather than perturbing the input, our method efficiently finds the closest alternative $\textit{training dataset}$, one that differs from the original dataset by changing a few labels.  Training a new model on this altered dataset can then lead to a different prediction of a given test instance. This perspective provides a new way to assess fairness by directly analyzing the influence of label bias on training and inference.  Our approach can be characterized as probing whether a given prediction depends on biased labels. Since exhaustively enumerating all possible alternate datasets is infeasible, we develop analysis techniques that trace how bias in the training data may propagate through the learning algorithm to the trained network. Our method heuristically ranks and modifies the labels of a bounded number of training examples to construct a counterfactual dataset, retrains the model, and checks whether its prediction on a chosen test case changes. We evaluate our approach on feedforward neural networks across over 1100 test cases from 7 widely-used fairness datasets.  Results show that it modifies only a small subset of training labels, highlighting its ability to pinpoint the critical training examples that drive prediction changes. Finally, we demonstrate how counterfactual training datasets reveal connections between training examples and test cases, offering an interpretable way to probe dataset bias.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/kim26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/kim26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Counterfactual generation for Out-of-Distribution data</title>
        <description>Deploying machine learning models in safety-critical applications necessitates both reliable out-of-distribution (OOD) detection and interpretable model behavior. While substantial progress has been made in OOD detection and explainable AI (XAI), the question of why a model classifies a data point as OOD remains underexplored. Counterfactual explanations are a widely used XAI approach, yet they often fail in OOD contexts, as the generated examples may themselves be OOD. To address this limitation, we introduce the concept of OOD counterfactuals: perturbed inputs that transition between distinct OOD categories to provide insight into the model's OOD classification decisions. We propose a novel method for generating OOD counterfactuals and evaluate it on synthetic, tabular, and image datasets. Empirical results demonstrate that our approach offers both quantitatively and qualitatively improved explanations compared to existing baselines.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/keshtmand26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/keshtmand26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Assessing the Fragility of SHAP-Based Model Explanations Using Counterfactuals</title>
        <description>Post-hoc explanations such as SHAP are increasingly used to justify machine learning predictions. Yet, these explanations can be fragile: small, realistic input perturbations can cause large shifts in the importance of attributed features. We present a multi-seed, distance-controlled *stability assessment* for SHAP-based model explanations. For each data instance, we use DiCE to generate plausible counterfactuals, pool across random seeds, deduplicate, and retain the $K$ nearest counterfactuals. Using a shared independent masker and the model's logit (raw margin), we measure per-feature attribution shifts and summarise instance-level instability. On four tabular fairness benchmark datasets, we apply our protocol to a logistic regression, a multilayer perceptron, and decision trees, including boosted and bagged versions. We report within-model group-wise explanation stability and examine which features most often drive the observed shifts. To contextualise our findings, we additionally report coverage, effective-$K$, distance-to-boundary, and outlier diagnostics. The protocol is model-agnostic yet practical for deep networks (batched inference, shared background), turning explanation variability into an actionable fairness assessment without altering trained models.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/kasbohrer26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/kasbohrer26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Extremal Contours: Gradient-driven contours for compact visual attribution</title>
        <description>Faithful yet compact explanations for vision models remain a challenge, as commonly used dense perturbation masks are often fragmented and overfitted, needing careful post-processing. Here, we present a training-free explanation method that replaces dense masks with smooth tunable contours. A star-convex region is parameterized by a truncated Fourier series and optimized under an extremal preserve/delete objective using the classifier gradients. The approach guarantees a single, simply connected mask, cuts the number of free parameters by orders of magnitude, and yields stable boundary updates without cleanup. Restricting solutions to low-dimensional, smooth contours makes the method robust to adversarial masking artifacts. On ImageNet classifiers, it matches the extremal fidelity of dense masks while producing compact, interpretable regions with improved run-to-run consistency. Explicit area control also enables importance contour maps, yielding transparent fidelity-area profiles. Finally, we extend the approach to multiple contours and show how it can localize multiple objects within the same framework. Across benchmarks, the method achieves higher relevance mass and lower complexity than gradient and perturbation based baselines, with especially strong gains on self-supervised DINO models where it improves relevance mass by over 15% and maintains positive faithfulness correlations.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/karimzadeh26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/karimzadeh26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Structured Covariance Modeling Using Learned Mixture-of-Bases for Uncertainty in 3D Segmentation</title>
        <description>Accurate segmentation is essential in error-critical domains such as medical imaging, where outputs support clinical decisions. Probabilistic models like the Stochastic Segmentation Network (SSN) enable uncertainty quantification, but existing methods typically use low-rank plus diagonal covariance structures that struggle to capture both global and local spatial correlations, limiting performance gains over deterministic models. We revisit low-rank formulations and introduce two approaches - Single-Basis and Mixture-of-Bases decompositions - that project predicted noise onto learned covariance bases, either globally or within partitioned volume blocks. This yields richer, more flexible uncertainty modeling with minimal parameter overhead. On the most challenging organs in the 3D TotalSegmentator CT dataset, our methods significantly improve Dice scores over deterministic and baseline stochastic models while preserving strong calibration, with the Mixture-of-Bases performing best. These findings show that basis-driven covariance modeling can enhance segmentation accuracy and uncertainty estimation in 3D medical imaging.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/kampen26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/kampen26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Staying on the Manifold: Geometry-Aware Noise Injection</title>
        <description>It has been shown that perturbing the input during training implicitly regularises the gradient of the learnt function, leading to smoother models and enhancing generalisation. However, previous research mostly considered the addition of ambient noise in the input space, without considering the underlying structure of the data. In this work, we propose several strategies of adding geometry-aware input noise that accounts for the lower dimensional manifold the input space inhabits. We start by projecting ambient Gaussian noise onto the tangent space of the manifold. In a second step, the noise sample is mapped onto the manifold via the associated geodesic curve. We also consider Brownian motion noise, which moves in random steps along the manifold. We show that geometry-aware noise leads to improved generalisation and robustness to hyperparameter selection on highly curved manifolds, while performing at least as well as training without noise on simpler manifolds. Our proposed framework extends to data manifolds approximated by generative models and we observe similar trends on the MNIST digits dataset.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/jacobsen26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/jacobsen26a.html</guid>
        
        
      </item>
    
      <item>
        <title>MTVNet: Multi-Contextual Transformers for Volumes – Network for Super-Resolution with Long-Range Interactions</title>
        <description>Recent advances in transformer-based models have led to significant improvements in 2D image super-resolution. However, leveraging these advances for volumetric super-resolution remains challenging due to the high memory demands of self-attention mechanisms in 3D volumes, which severely limit the receptive field. As a result, long-range interactions, one of the key strengths of transformers, are underutilized in 3D super-resolution. To investigate this, we propose MTVNet, a volumetric transformer model that leverages information from expanded contextual regions at multiple resolution scales. Here, coarse resolution information from broader context regions is carried over to inform the super-resolution prediction of a smaller area. Using transformer layers at each resolution, our coarse-to-fine modeling limits the number of tokens at each scale and enables attention over larger regions than previously possible. We compare our method, MTVNet, against state-of-the-art models on five 3D datasets. Our results show that expanding the receptive field of transformer-based methods yields significant performance gains on high-resolution 3D data. While CNNs outperform transformers on low-resolution data, transformer-based methods excel on high-resolution volumes with exploitable long-range dependencies, with our MTVNet achieving state-of-the-art performance. Our code is available at https://github.com/AugustHoeg/MTVNet.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/hoeg26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/hoeg26a.html</guid>
        
        
      </item>
    
      <item>
        <title>On the Generalisation of Koopman Representations for Chaotic System Control</title>
        <description>This paper investigates the generalisability of Koopman-based representations for chaotic dynamical systems, focusing on their transferability across prediction and control tasks. Using the Lorenz system as a testbed, we propose a three-stage methodology: learning Koopman embeddings through autoencoding, pre-training a transformer on next-state prediction, and fine-tuning for safety-critical control. Our results show that Koopman embeddings outperform both standard and physics-informed PCA baselines, achieving accurate and data-efficient performance. Notably, fixing the pre-trained transformer weights during fine-tuning leads to no performance degradation, indicating that the learned representations capture reusable dynamical structure rather than task-specific patterns. These findings support the use of Koopman embeddings as a foundation for multi-task learning in physics-informed machine learning.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/hjikakou26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/hjikakou26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Predicting Calving Events in Antarctica using Machine Learning</title>
        <description>Monitoring the calving dynamics of the Antarctic ice shelves is central to understanding a major driver of changes to ocean levels on our planet. Several physical models have been proposed as calving laws, with varying predictive power. We propose an approach using Machine Learning (ML) to identify key variables and parameters that may be used in future models of the ice shelf calving dynamics. As part of an ongoing project, we have trained a U-Net on samples from a set of Gaussian Random Field-represented Essential Climate Variables (ECV). Ablation studies establish a few of the selected variables as having high correlation with calving events, with an F1 score above 0.9. Our first study site was the Larsen C Ice Shelf, on the northwest part of the Weddell Sea, where in 2017 there was a massive calving event. We have found strong correlations between the calving and the ice velocity leading up to this event, which may be further improved when accounting for basal melt rates in the area.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/hay26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/hay26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Wildfire Spread Scenarios: Increasing Sample Diversity of Segmentation Diffusion Models with Training-Free Methods</title>
        <description>Predicting future states in uncertain environments, such as wildfire spread, medical diagnosis, or autonomous driving, requires models that can consider multiple plausible outcomes. While diffusion models can effectively learn such multi-modal distributions, naively sampling from these models is computationally inefficient, potentially requiring hundreds of samples to find low-probability modes that may still be operationally relevant. In this work, we address the challenge of sample-efficient ambiguous segmentation by evaluating several training-free sampling methods that encourage diverse predictions. We adapt two techniques, particle guidance and SPELL, originally designed for the generation of diverse natural images, to discrete segmentation tasks, and additionally propose a simple clustering-based technique. We validate these approaches on the LIDC medical dataset, a modified version of the Cityscapes dataset, and MMFire, a new simulation-based wildfire spread dataset introduced in this paper. Compared to naive sampling, these approaches increase the HM IoU* metric by up to 7.5% on MMFire and 16.4% on Cityscapes, demonstrating that training-free methods can be used to efficiently increase the sample diversity of segmentation diffusion models with little cost to image quality and runtime.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/gerard26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/gerard26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Unreliable Monte Carlo Dropout Uncertainty Estimation</title>
        <description>Reliable uncertainty estimation is crucial for machine learning models, especially in safety-critical domains. While exact Bayesian inference offers a principled approach, it is often computationally infeasible for deep neural networks. Monte Carlo dropout (MCD) was proposed as an efficient approximation to Bayesian inference in deep learning by applying dropout at inference time. Hence, the method generates multiple sub-models yielding a distribution of predictions to estimate uncertainty. We investigate its ability to capture true uncertainty and compare it to Gaussian Processes (GP) and Bayesian Neural Networks (BNN). We find that MCD struggles to accurately reflect the underlying true uncertainty, particularly failing to capture increased uncertainty in extrapolation and interpolation regions observed in Bayesian models. The findings suggest that uncertainty estimates from MCD, as implemented and evaluated in these experiments, may not be as reliable as those from traditional Bayesian approaches for capturing epistemic and aleatoric uncertainty.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/djupskas26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/djupskas26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Learning Normal Patterns in Musical Loops</title>
        <description>We propose an unsupervised framework for analyzing audio patterns in musical loops using deep feature extraction and anomaly detection. Unlike prior methods limited by fixed input lengths, handcrafted features, or domain constraints, our approach combines a pre-trained Hierarchical Token-semantic Audio Transformer (HTS-AT) and Feature Fusion Mechanism (FFM) to generate representations from variable-length audio. These embeddings are analyzed by Deep Support Vector Data Description (Deep SVDD), which models normative patterns in a compact latent space. Experiments on bass and guitar datasets show our Deep SVDD models—especially with residual autoencoders—outperform baselines like Isolation Forest and PCA, achieving better anomaly separation. Our work provides a flexible, unsupervised method for effective pattern discovery in diverse audio samples.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/dadman26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/dadman26a.html</guid>
        
        
      </item>
    
      <item>
        <title>CID: Measuring Feature Importance Through Counterfactual Distributions</title>
        <description>Assessing the importance of individual features in Machine Learning is critical to understand the model’s decision-making process. While numerous methods exist, the lack of a definitive ground truth for comparison highlights the need for alternative, well-founded measures. This paper introduces a novel post-hoc local feature importance method called Counterfactual Importance Distribution (CID). We generate two sets of positive and negative counterfactuals, model their distributions using Kernel Density Estimation, and rank features based on a distributional dissimilarity measure. This measure, grounded in a rigorous mathematical framework, satisfies key properties required to function as a valid metric. We showcase the effectiveness of our method by comparing with well-established local feature importance explainers. Our method not only offers complementary perspectives to existing approaches, but also improves performance on faithfulness metrics (both for comprehensiveness and sufficiency), resulting in more faithful explanations of the system. These results highlight its potential as a valuable tool for model analysis.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/conti26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/conti26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Reflective Agents for Knowledge Graph Traversal</title>
        <description>Current research on Retrieval Augmented Generation (RAG) for Knowledge Graphs often relies on graph pruning to manage the scale of the data. This approach is not feasible for dense, highly structured environments like rigid ontologies, where every node has significant interconnected value. The sheer size of these graphs inhibits the effectiveness of standard semantic retrieval methods. To overcome this limitation, we introduce a novel approach using an autonomous agent that dynamically traverses the graph to retrieve information. A key contribution of our work is the integration of a feedback mechanism that informs the agent about its general performance and specific tool utilization, thereby enhancing its traversal efficiency. We validate our method through a systematic study on ontologies of varying sizes, employing a user simulator to generate realistic tasks for knowledge graph construction and querying. Our findings demonstrate the current problems with information retrieval in large, non-prunable knowledge structures.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/chudoba26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/chudoba26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Towards Agnostic and Holistic Universal Image Segmentation with Bit Diffusion</title>
        <description>This paper introduces a diffusion-based framework for universal image segmentation, making agnostic segmentation possible without depending on mask-based frameworks and instead predicting the full segmentation in a holistic manner. We present several key adaptations to diffusion models, which are important in this discrete setting. Notably, we show that a location-aware palette with our 2D gray code ordering improves performance. Adding a final tanh activation function is crucial for discrete data. On optimizing diffusion parameters, the sigmoid loss weighting consistently outperforms alternatives, regardless of the prediction type used, and we settle on x-prediction. While our current model does not yet surpass leading mask-based architectures, it narrows the performance gap and introduces unique capabilities, such as principled ambiguity modeling, that these models lack. All models were trained from scratch, and we believe that combining our proposed improvements with large-scale pretraining or promptable conditioning could lead to competitive models.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/christensen26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/christensen26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Spatio-Temporal Landmark Detection via Selective Fine-Tuning of Echocardiography Foundation Models</title>
        <description>Foundation models (FMs) have shown remarkable capabilities across computer vision tasks, yet their effectiveness for complex medical downstream tasks remains underexplored. This work investigates whether state-of-the-art video-based FMs for echocardiography can perform precise spatio-temporal landmark detection without extensive fine-tuning. We evaluate two recent powerful FMs, namely EchoPrime and PanEcho, pre-trained on a few million echocardiographic video-text pairs, for left-ventricular contour detection on EchoNet-Dynamic. We compare encoder regimes (frozen, partially frozen, fully trainable) and decoder heads (MLP vs. GCN), and benchmark against strong non-FM backbones (ResNet-18 2D/3D, ViT-Base, MViTv2-Small). Frozen encoders perform poorly and variably ($\approx$78.00 Dice, ED), whereas selectively unfreezing two blocks with GCN+augmentation yields a large jump ($91.71\pm3.49$ Dice, ED), recovering most of the improvement. Fully trainable EchoPrime (GCN+augmentation) achieves $93.13\pm3.11/90.95\pm3.71$ Dice (ED/ES), which is SOTA for regression-based models on EchoNet. Deploying separate, fully fine-tuned models for each task quickly becomes impractical in resource-constrained settings. Our results suggest that partially fine-tuning the FM is a resource-efficient strategy that recovers most of the performance benefits of end-to-end training, while avoiding the overhead of maintaining a separate model for each task. The code is available at \href{https://github.com/preetrajb/EchoVLMLandmarks}{https://github.com/preetrajb/EchoVLMLandmarks}.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/bhoodoo26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/bhoodoo26a.html</guid>
        
        
      </item>
    
      <item>
        <title>Self-Supervised and Unsupervised Multispectral Anomaly Detection for Unknown Substance and Surface Defect Identification</title>
        <description>Autonomous systems and environmental monitoring require reliable detection of unknown hazardous materials to prevent traffic accidents and ecological damage resulting from chemical spills, fuel leaks, and agricultural runoff. Traditional detection methods, such as gas chromatography, pose contamination risks and cause delays, while laser-based techniques rely on prior localization of potential hotspots. This paper addresses the automatic detection of unknown materials (e.g., fertilizer, sand, soil) and surface anomalies (e.g., cracks, holes) without requiring labeled anomaly examples during training. We employ unsupervised and self-supervised deep learning methods to learn normal patterns and identify deviations. Our approach evaluates four models: convolutional and vision transformer-based autoencoders, and two self-supervised methods, SimCLR and Barlow Twins. Experiments conducted on multispectral road images from the German Aerospace Center and the MVTec hazelnut dataset demonstrate that the ViT-based autoencoder outperforms its convolutional counterpart, while Barlow Twins achieves superior anomaly localization compared to SimCLR. These results highlight the potential of efficient deep learning models for enhancing road safety and environmental protection through early detection of potentially hazardous substances before they cause harm.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/beyaz26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/beyaz26a.html</guid>
        
        
      </item>
    
      <item>
        <title>How PARTs assemble into wholes: Learning the relative composition of images</title>
        <description>The composition of objects and their parts, along with object-object positional relationships, provides a rich source of information for representation learning. Hence, spatial-aware pretext tasks have been actively explored in self-supervised learning. Existing works commonly start from a grid structure, where the goal of the pretext task involves predicting the absolute position index of patches within a fixed grid. However, grid-based approaches fall short of capturing the fluid and continuous nature of real-world object compositions. We introduce PART, a self-supervised learning approach that leverages continuous relative transformations between off-grid patches to overcome these limitations. By modeling how parts relate to each other in a continuous space, PART learns the relative composition of images: an off-grid structural relative positioning that is less tied to absolute appearance and can remain coherent under variations such as partial visibility or stylistic changes. In tasks requiring precise spatial understanding such as object detection and time series prediction, PART outperforms grid-based methods like MAE and DropPos, while maintaining competitive performance on global classification tasks. By breaking free from grid constraints, PART opens up a new trajectory for universal self-supervised pretraining across diverse datatypes, from images to EEG signals, with potential in medical imaging, video, and audio.</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/ayoughi26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/ayoughi26a.html</guid>
        
        
      </item>
    
      <item>
        <title>HetGSMOTE: Oversampling for Heterogeneous Graphs</title>
        <description>Graph Neural Networks (GNNs) have proven effective for learning from graph structured data, with heterogeneous graphs (HetGs) gaining particular prominence for their ability to model diverse real world systems through multiple node and edge types. However, class imbalance, where certain node classes are significantly underrepresented, presents a critical challenge for node classification tasks on HetGs, as traditional learning approaches fail to adequately handle minority classes. This work introduces HetGSMOTE, a novel oversampling framework that extends SMOTE-based techniques to heterogeneous graph settings by systematically incorporating node-type, edge-type, and metapath information into the synthetic sample generation process. HetGSMOTE operates by constructing a content-aggregated and neighbor-type-aggregated embedding space through a base model, then generating synthetic minority nodes while training specialized edge generators for each node type to preserve essential relational structures. Through comprehensive experiments across multiple benchmark datasets and base models, we demonstrate that HetGSMOTE consistently outperforms existing baseline methods, achieving substantial improvements in classification performance under various imbalance scenarios, particularly in extreme imbalance cases, while maintaining broad compatibility across different heterogeneous graph neural network architectures. We release our code and data preparations at [github.com/smlab-niser/hetgsmote](https://github.com/smlab-niser/hetgsmote).</description>
        <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://proceedings.mlr.press/v307/ansad26a.html</link>
        <guid isPermaLink="true">https://proceedings.mlr.press/v307/ansad26a.html</guid>
        
        
      </item>
    
  </channel>
</rss>
