SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors

Diogo Lavado, Cláudia Soares, Alessandra Micheletti
Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), PMLR 251:222-232, 2024.

Abstract

In this paper, we present SCENE-Net V2, a new resource-efficient, gray-box model for multiclass 3D scene understanding. SCENE-Net V2 leverages Group Equivariant Non-Expansive Operators (GENEOs) to incorporate fundamental geometric priors as inductive biases, offering a more transparent alternative to the prevalent black-box models in the domain. This model addresses the limitations of its white-box predecessor, SCENE-Net, by expanding its applicability from pole-like structures to a wider range of datasets with detailed 3D elements. Our model achieves the sweet spot between application and transparency: SCENE-Net V2 is a general method for object identification with interpretability guarantees. Our experimental results demonstrate that SCENE-Net V2 achieves competitive performance with a significantly lower parameter count. Furthermore, we propose the use of GENEO-based architectures as a feature extraction tool for black-box models, enabling an increase in performance by adding a minimal number of meaningful parameters. Our code is available at https://github.com/dlavado/SCENE-Net-V2.

Cite this Paper


BibTeX
@InProceedings{pmlr-v251-lavado24a,
  title = {SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors},
  author = {Lavado, Diogo and Soares, Cl\'audia and Micheletti, Alessandra},
  booktitle = {Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM)},
  pages = {222--232},
  year = {2024},
  editor = {Vadgama, Sharvaree and Bekkers, Erik and Pouplin, Alison and Kaba, Sekou-Oumar and Walters, Robin and Lawrence, Hannah and Emerson, Tegan and Kvinge, Henry and Tomczak, Jakub and Jegelka, Stephanie},
  volume = {251},
  series = {Proceedings of Machine Learning Research},
  month = {29 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v251/main/assets/lavado24a/lavado24a.pdf},
  url = {https://proceedings.mlr.press/v251/lavado24a.html},
  abstract = {In this paper, we present SCENE-Net V2, a new resource-efficient, gray-box model for multiclass 3D scene understanding. SCENE-Net V2 leverages Group Equivariant Non-Expansive Operators (GENEOs) to incorporate fundamental geometric priors as inductive biases, offering a more transparent alternative to the prevalent black-box models in the domain. This model addresses the limitations of its white-box predecessor, SCENE-Net, by expanding its applicability from pole-like structures to a wider range of datasets with detailed 3D elements. Our model achieves the sweet-spot between application and transparency: SCENE-Net V2 is a general method for object identification with interpretability guarantees. Our experimental results demonstrate that SCENE-Net V2 achieves competitive performance with a significantly lower parameter count. Furthermore, we propose the use of GENEO-based architectures as a feature extraction tool for black-box models, enabling an increase in performance by adding a minimal number of meaningful parameters. Our code is available in: https://github.com/dlavado/SCENE-Net-V2.}
}
Endnote
%0 Conference Paper
%T SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors
%A Diogo Lavado
%A Cláudia Soares
%A Alessandra Micheletti
%B Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM)
%C Proceedings of Machine Learning Research
%D 2024
%E Sharvaree Vadgama
%E Erik Bekkers
%E Alison Pouplin
%E Sekou-Oumar Kaba
%E Robin Walters
%E Hannah Lawrence
%E Tegan Emerson
%E Henry Kvinge
%E Jakub Tomczak
%E Stephanie Jegelka
%F pmlr-v251-lavado24a
%I PMLR
%P 222--232
%U https://proceedings.mlr.press/v251/lavado24a.html
%V 251
%X In this paper, we present SCENE-Net V2, a new resource-efficient, gray-box model for multiclass 3D scene understanding. SCENE-Net V2 leverages Group Equivariant Non-Expansive Operators (GENEOs) to incorporate fundamental geometric priors as inductive biases, offering a more transparent alternative to the prevalent black-box models in the domain. This model addresses the limitations of its white-box predecessor, SCENE-Net, by expanding its applicability from pole-like structures to a wider range of datasets with detailed 3D elements. Our model achieves the sweet-spot between application and transparency: SCENE-Net V2 is a general method for object identification with interpretability guarantees. Our experimental results demonstrate that SCENE-Net V2 achieves competitive performance with a significantly lower parameter count. Furthermore, we propose the use of GENEO-based architectures as a feature extraction tool for black-box models, enabling an increase in performance by adding a minimal number of meaningful parameters. Our code is available in: https://github.com/dlavado/SCENE-Net-V2.
APA
Lavado, D., Soares, C., & Micheletti, A. (2024). SCENE-Net V2: Interpretable Multiclass 3D Scene Understanding with Geometric Priors. Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), in Proceedings of Machine Learning Research 251:222-232. Available from https://proceedings.mlr.press/v251/lavado24a.html.
