To go deep or wide in learning?

Gaurav Pandey; Ambedkar Dukkipati

Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:724-732, 2014.

Abstract

To achieve acceptable performance on AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second incurs computational overhead in learning multiple layers and fine-tuning the model. In this paper, we propose an approach called wide learning, based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with a single layer outperforms both single-layer and deep architectures of finite width on some benchmark datasets.
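
The abstract names arc-cosine kernels without defining them. As a rough illustration only, the sketch below computes the standard degree-0 and degree-1 arc-cosine kernels in their usual closed form from the kernel literature (Cho and Saul), which correspond to an infinitely wide single layer of step or rectified-linear units with Gaussian random weights. The function name `arc_cosine_kernel` and the restriction to degrees 0 and 1 are assumptions made for this example; it is not the paper's learning procedure.

```python
import math


def arc_cosine_kernel(x, y, degree=1):
    """Arc-cosine kernel of degree 0 or 1 between two nonzero vectors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp the cosine to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    theta = math.acos(cos_theta)
    if degree == 0:
        # Infinite layer of step units: depends only on the angle.
        return 1.0 - theta / math.pi
    if degree == 1:
        # Infinite layer of rectified-linear units.
        angular = math.sin(theta) + (math.pi - theta) * cos_theta
        return norm_x * norm_y * angular / math.pi
    raise ValueError("only degrees 0 and 1 are sketched here")
```

With this normalization, the degree-1 kernel of a vector with itself equals its squared norm, and the resulting Gram matrix could be passed to any kernel method that accepts a precomputed kernel.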

Cite this Paper

BibTeX


@InProceedings{pmlr-v33-pandey14,
  title = 	 {{To go deep or wide in learning?}},
  author = 	 {Pandey, Gaurav and Dukkipati, Ambedkar},
  booktitle = 	 {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {724--732},
  year = 	 {2014},
  editor = 	 {Kaski, Samuel and Corander, Jukka},
  volume = 	 {33},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Reykjavik, Iceland},
  month = 	 {22--25 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v33/pandey14.pdf},
  url = 	 {https://proceedings.mlr.press/v33/pandey14.html},
  abstract = 	 {To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning of the model. In this paper, we propose an approach called wide learning based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with single layer outperforms single layer as well as deep architectures of finite width for some benchmark datasets.}
}

Endnote

%0 Conference Paper
%T To go deep or wide in learning?
%A Gaurav Pandey
%A Ambedkar Dukkipati
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-pandey14
%I PMLR
%P 724--732
%U https://proceedings.mlr.press/v33/pandey14.html
%V 33
%X To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning of the model. In this paper, we propose an approach called wide learning based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with single layer outperforms single layer as well as deep architectures of finite width for some benchmark datasets.

RIS


TY  - CPAPER
TI  - To go deep or wide in learning?
AU  - Gaurav Pandey
AU  - Ambedkar Dukkipati
BT  - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA  - 2014/04/02
ED  - Samuel Kaski
ED  - Jukka Corander
ID  - pmlr-v33-pandey14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 33
SP  - 724
EP  - 732
L1  - http://proceedings.mlr.press/v33/pandey14.pdf
UR  - https://proceedings.mlr.press/v33/pandey14.html
AB  - To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning of the model. In this paper, we propose an approach called wide learning based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with single layer outperforms single layer as well as deep architectures of finite width for some benchmark datasets.
ER  -

APA


Pandey, G. & Dukkipati, A. (2014). To go deep or wide in learning? Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:724-732. Available from https://proceedings.mlr.press/v33/pandey14.html.
