SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors

Diogo Lavado; Cláudia Soares; Alessandra Micheletti

Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), PMLR 251:222-232, 2024.

Abstract

In this paper, we present SCENE-Net V2, a new resource-efficient, gray-box model for multiclass 3D scene understanding. SCENE-Net V2 leverages Group Equivariant Non-Expansive Operators (GENEOs) to incorporate fundamental geometric priors as inductive biases, offering a more transparent alternative to the prevalent black-box models in the domain. This model addresses the limitations of its white-box predecessor, SCENE-Net, by expanding its applicability from pole-like structures to a wider range of datasets with detailed 3D elements. Our model achieves the sweet spot between application and transparency: SCENE-Net V2 is a general method for object identification with interpretability guarantees. Our experimental results demonstrate that SCENE-Net V2 achieves competitive performance with a significantly lower parameter count. Furthermore, we propose the use of GENEO-based architectures as a feature extraction tool for black-box models, enabling an increase in performance by adding a minimal number of meaningful parameters. Our code is available at https://github.com/dlavado/SCENE-Net-V2.

Cite this Paper

BibTeX

@InProceedings{pmlr-v251-lavado24a,
  title = 	 {SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors},
  author =       {Lavado, Diogo and Soares, Cl\'audia and Micheletti, Alessandra},
  booktitle = 	 {Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM)},
  pages = 	 {222--232},
  year = 	 {2024},
  editor = 	 {Vadgama, Sharvaree and Bekkers, Erik and Pouplin, Alison and Kaba, Sekou-Oumar and Walters, Robin and Lawrence, Hannah and Emerson, Tegan and Kvinge, Henry and Tomczak, Jakub and Jegelka, Stephanie},
  volume = 	 {251},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v251/main/assets/lavado24a/lavado24a.pdf},
  url = 	 {https://proceedings.mlr.press/v251/lavado24a.html},
  abstract = 	 {In this paper, we present SCENE-Net V2, a new resource-efficient, gray-box model for multiclass 3D scene understanding. SCENE-Net V2 leverages Group Equivariant Non-Expansive Operators (GENEOs) to incorporate fundamental geometric priors as inductive biases, offering a more transparent alternative to the prevalent black-box models in the domain. This model addresses the limitations of its white-box predecessor, SCENE-Net, by expanding its applicability from pole-like structures to a wider range of datasets with detailed 3D elements. Our model achieves the sweet-spot between application and transparency: SCENE-Net V2 is a general method for object identification with interpretability guarantees. Our experimental results demonstrate that SCENE-Net V2 achieves competitive performance with a significantly lower parameter count. Furthermore, we propose the use of GENEO-based architectures as a feature extraction tool for black-box models, enabling an increase in performance by adding a minimal number of meaningful parameters. Our code is available in: https://github.com/dlavado/SCENE-Net-V2.}
}

Endnote

%0 Conference Paper
%T SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors
%A Diogo Lavado
%A Cláudia Soares
%A Alessandra Micheletti
%B Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM)
%C Proceedings of Machine Learning Research
%D 2024
%E Sharvaree Vadgama
%E Erik Bekkers
%E Alison Pouplin
%E Sekou-Oumar Kaba
%E Robin Walters
%E Hannah Lawrence
%E Tegan Emerson
%E Henry Kvinge
%E Jakub Tomczak
%E Stephanie Jegelka
%F pmlr-v251-lavado24a
%I PMLR
%P 222--232
%U https://proceedings.mlr.press/v251/lavado24a.html
%V 251
%X In this paper, we present SCENE-Net V2, a new resource-efficient, gray-box model for multiclass 3D scene understanding. SCENE-Net V2 leverages Group Equivariant Non-Expansive Operators (GENEOs) to incorporate fundamental geometric priors as inductive biases, offering a more transparent alternative to the prevalent black-box models in the domain. This model addresses the limitations of its white-box predecessor, SCENE-Net, by expanding its applicability from pole-like structures to a wider range of datasets with detailed 3D elements. Our model achieves the sweet-spot between application and transparency: SCENE-Net V2 is a general method for object identification with interpretability guarantees. Our experimental results demonstrate that SCENE-Net V2 achieves competitive performance with a significantly lower parameter count. Furthermore, we propose the use of GENEO-based architectures as a feature extraction tool for black-box models, enabling an increase in performance by adding a minimal number of meaningful parameters. Our code is available in: https://github.com/dlavado/SCENE-Net-V2.

APA

Lavado, D., Soares, C. & Micheletti, A. (2024). SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors. Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), in Proceedings of Machine Learning Research 251:222-232. Available from https://proceedings.mlr.press/v251/lavado24a.html.
