Defining Neural Network Architecture through Polytope Structures of Datasets

Sangmin Lee, Abbas Mammadov, Jong Chul Ye
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:26789-26836, 2024.

Abstract

Current theoretical and empirical research in neural networks suggests that complex datasets require large network architectures for thorough classification, yet the precise nature of this relationship remains unclear. This paper tackles this issue by defining upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question. We also delve into the application of these principles to simplicial complexes and specific manifold shapes, explaining how the requirement for network width varies in accordance with the geometric complexity of the dataset. Moreover, we develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks. Through our algorithm, it is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
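The link between polytope faces and network width can be illustrated with a standard construction (not the paper's own bound): a convex polytope $\{x : Ax \le b\}$ with $k$ faces can be carved out by a single ReLU layer of width $k$, one unit per face, since the summed face violations vanish exactly on the polytope and are positive outside it. A minimal sketch, with a hypothetical triangle as the polytope:

```python
import numpy as np

# Faces of a triangle polytope {x : A x <= b} in R^2 (illustrative example):
# the triangle with vertices (0,0), (1,0), (0,1) is the intersection of
# three half-planes: -x <= 0, -y <= 0, x + y <= 1.
A = np.array([[-1.0, 0.0],
              [0.0, -1.0],
              [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])

def relu(z):
    return np.maximum(z, 0.0)

def polytope_net(x):
    # One hidden ReLU layer whose width equals the number of faces (3 here).
    # Each hidden unit measures the violation of one face inequality;
    # the sum is 0 exactly on the polytope and strictly positive outside.
    return relu(A @ x - b).sum()

print(polytope_net(np.array([0.2, 0.2])))  # 0.0 (inside the triangle)
print(polytope_net(np.array([1.0, 1.0])))  # 1.0 (outside)
```

Thresholding `polytope_net` at zero classifies points against the polytope, so the width needed scales with the number of faces, which is the kind of geometric quantity the paper's bounds are phrased in.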

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-lee24q,
  title =     {Defining Neural Network Architecture through Polytope Structures of Datasets},
  author =    {Lee, Sangmin and Mammadov, Abbas and Ye, Jong Chul},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages =     {26789--26836},
  year =      {2024},
  editor =    {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume =    {235},
  series =    {Proceedings of Machine Learning Research},
  month =     {21--27 Jul},
  publisher = {PMLR},
  pdf =       {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lee24q/lee24q.pdf},
  url =       {https://proceedings.mlr.press/v235/lee24q.html},
  abstract =  {Current theoretical and empirical research in neural networks suggests that complex datasets require large network architectures for thorough classification, yet the precise nature of this relationship remains unclear. This paper tackles this issue by defining upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question. We also delve into the application of these principles to simplicial complexes and specific manifold shapes, explaining how the requirement for network width varies in accordance with the geometric complexity of the dataset. Moreover, we develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks. Through our algorithm, it is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.}
}
Endnote
%0 Conference Paper
%T Defining Neural Network Architecture through Polytope Structures of Datasets
%A Sangmin Lee
%A Abbas Mammadov
%A Jong Chul Ye
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-lee24q
%I PMLR
%P 26789--26836
%U https://proceedings.mlr.press/v235/lee24q.html
%V 235
%X Current theoretical and empirical research in neural networks suggests that complex datasets require large network architectures for thorough classification, yet the precise nature of this relationship remains unclear. This paper tackles this issue by defining upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question. We also delve into the application of these principles to simplicial complexes and specific manifold shapes, explaining how the requirement for network width varies in accordance with the geometric complexity of the dataset. Moreover, we develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks. Through our algorithm, it is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
APA
Lee, S., Mammadov, A. &amp; Ye, J.C. (2024). Defining Neural Network Architecture through Polytope Structures of Datasets. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:26789-26836. Available from https://proceedings.mlr.press/v235/lee24q.html.