- title: 'Preface'
  abstract: 'Preface to the Proceedings of the First Workshop on Applications of Pattern Analysis September 1-3, 2010, Cumberland Lodge, Windsor, UK'
  volume: 11
  URL: https://proceedings.mlr.press/v11/diethe10a.html
  PDF: http://proceedings.mlr.press/v11/diethe10a/diethe10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-diethe10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Nello
    family: Cristianini
  - given: Tom
    family: Diethe
  - given: John
    family: Shawe-Taylor
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 1-3
  id: diethe10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 1
  lastpage: 3
  published: 2010-09-30 00:00:00 +0000
- title: 'O-IPCAC and its Application to EEG Classification'
  abstract: 'In this paper we describe an online/incremental linear binary classifier based on an interesting approach to estimating the Fisher subspace. The proposed method can handle datasets of high cardinality that are supplied dynamically, and it copes efficiently with high-dimensional data without employing any dimensionality reduction technique. Moreover, this approach obtains promising classification performance even when the cardinality of the training set is comparable to the data dimensionality. We demonstrate the efficacy of our algorithm by testing it on EEG data. This classification problem is particularly hard since the data are high dimensional, the cardinality of the data is lower than the space dimensionality, and the classes are strongly unbalanced. The promising results obtained in the MLSP competition, without employing any feature extraction/selection step, have demonstrated that our method is effective; this is further supported both by our tests and by the comparison with other well-known classifiers.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/rozza10a.html
  PDF: http://proceedings.mlr.press/v11/rozza10a/rozza10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-rozza10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Alessandro
    family: Rozza
  - given: Gabriele
    family: Lombardi
  - given: Marco
    family: Rosa
  - given: Elena
    family: Casiraghi
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 4-11
  id: rozza10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 4
  lastpage: 11
  published: 2010-09-30 00:00:00 +0000
- title: '$\mu$TOSS - Multiple hypothesis testing in an open software system'
  abstract: '$\mu$TOSS is an R package providing an open source, easy-to-extend platform for multiple hypothesis testing (MHT), one of the most active research fields in statistics over the last 10-15 years. Its first motivation is to establish a common platform and standardization for MHT procedures at large. The $\mu$TOSS software has been designed and written in the framework of a “Harvest Programme” call of the PASCAL2 European research network. It consists of the two R packages mutoss and mutossGUI. For researchers, it features a convenient unification of interfaces for MHT procedures (including standardized functions to access existing specific MHT R packages such as multtest and multcomp, as well as recent MHT procedures that are not available elsewhere) and helper functions facilitating the setup of benchmark simulations for comparison of competing methods. For end users, a graphical user interface and an online user''s guide for finding appropriate methods for a given specification of the multiple testing problem are included. Ongoing maintenance and subsequent extensions will aim at establishing $\mu$TOSS as the state of the art in statistical computing for MHT.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/blanchard10a.html
  PDF: http://proceedings.mlr.press/v11/blanchard10a/blanchard10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-blanchard10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Gilles
    family: Blanchard
  - given: Thorsten
    family: Dickhaus
  - given: Niklas
    family: Hack
  - given: Frank
    family: Konietschke
  - given: Kornelius
    family: Rohmeyer
  - given: Jonathan
    family: Rosenblatt
  - given: Marsel
    family: Scheer
  - given: Wiebke
    family: Werft
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 12-19
  id: blanchard10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 12
  lastpage: 19
  published: 2010-09-30 00:00:00 +0000
- title: 'SubSift: a novel application of the vector space model to support the academic research process'
  abstract: 'SubSift matches submitted conference or journal papers to potential peer reviewers based on the similarity between the paper''s abstract and the reviewer''s publications as found in online bibliographic databases such as Google Scholar. Using concepts from information retrieval including a bag-of-words representation and cosine similarity, the SubSift tools were originally created to streamline the peer review process for the ACM SIGKDD''09 data mining conference. This paper describes how these tools were subsequently developed and deployed in the form of web services designed to support not only peer review but also personalised data discovery and mashups. SubSift has already been used by several major data mining conferences and interesting applications in other fields are now emerging.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/price10a.html
  PDF: http://proceedings.mlr.press/v11/price10a/price10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-price10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Simon
    family: Price
  - given: Peter A.
    family: Flach
  - given: Sebastian
    family: Spiegler
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 20-27
  id: price10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 20
  lastpage: 27
  published: 2010-09-30 00:00:00 +0000
- title: 'Cross-associating unlabelled timbre distributions to create expressive musical mappings'
  abstract: 'In timbre remapping applications such as concatenative synthesis, an audio signal is used as a template, and a mapping process derives control data for some audio synthesis algorithm such that it produces a new audio signal approximating the perceived trajectory of the original sound. Timbre is a multidimensional attribute with interactions between dimensions, and the control and synthesised signals typically represent sounds with different timbral ranges, so it is non-trivial to design a search process which makes best use of the timbral variety available in the synthesiser. We first discuss our preliminary work applying standard machine-learning techniques for this purpose (PCA, self-organising maps), and the reasons they were not satisfactory. We then describe a novel regression-tree technique which learns associations between unlabelled multidimensional timbre distributions.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/stowell10a.html
  PDF: http://proceedings.mlr.press/v11/stowell10a/stowell10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-stowell10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Dan
    family: Stowell
  - given: Mark D.
    family: Plumbley
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 28-35
  id: stowell10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 28
  lastpage: 35
  published: 2010-09-30 00:00:00 +0000
- title: 'Automating News Content Analysis: An Application to Gender Bias and Readability'
  abstract: 'In this article we present an application of text-analysis technologies to support social science research, in particular the analysis of patterns in news content. We describe a system that gathers and annotates large volumes of textual data in order to extract patterns and trends. We have examined 3.5 million news articles and show that their topic is related to the gender bias and readability of their content. This study is intended to illustrate how pattern analysis technology can be deployed to automate tasks commonly performed by humans in the social sciences, in order to enable large scale studies that would otherwise be impossible.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/ali10a.html
  PDF: http://proceedings.mlr.press/v11/ali10a/ali10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-ali10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Omar
    family: Ali
  - given: Ilias
    family: Flaounas
  - given: Tijl
    family: De Bie
  - given: Nick
    family: Mosdell
  - given: Justin
    family: Lewis
  - given: Nello
    family: Cristianini
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 36-43
  id: ali10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 36
  lastpage: 43
  published: 2010-09-30 00:00:00 +0000
- title: 'MOA: Massive Online Analysis, a Framework for Stream Classification and Clustering'
  abstract: 'Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA is designed to deal with the challenging problem of scaling up the implementation of state-of-the-art algorithms to real-world dataset sizes. It contains a collection of offline and online methods for both classification and clustering, as well as tools for evaluation. In particular, for classification it implements boosting, bagging, and Hoeffding Trees, all with and without Naive Bayes classifiers at the leaves. For clustering, it implements StreamKM++, CluStream, ClusTree, Den-Stream, D-Stream and CobWeb. Researchers benefit from MOA by gaining insight into the workings and problems of different approaches, while practitioners can easily apply and compare several algorithms on real-world data sets and settings. MOA supports bi-directional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and is released under the GNU GPL license.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/bifet10a.html
  PDF: http://proceedings.mlr.press/v11/bifet10a/bifet10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-bifet10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Albert
    family: Bifet
  - given: Geoff
    family: Holmes
  - given: Bernhard
    family: Pfahringer
  - given: Philipp
    family: Kranen
  - given: Hardy
    family: Kremer
  - given: Timm
    family: Jansen
  - given: Thomas
    family: Seidl
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 44-50
  id: bifet10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 44
  lastpage: 50
  published: 2010-09-30 00:00:00 +0000
- title: 'Pinview: Implicit Feedback in Content-Based Image Retrieval'
  abstract: 'This paper describes Pinview, a content-based image retrieval system that exploits implicit relevance feedback during a search session. Pinview contains several novel methods that infer the intent of the user. From relevance feedback, such as eye movements or clicks, and visual features of images, Pinview learns a similarity metric between images which depends on the current interests of the user. It then retrieves images with a specialized reinforcement learning algorithm that balances the tradeoff between exploring new images and exploiting the already inferred interests of the user. In practice, we have integrated Pinview into the content-based image retrieval system PicSOM, in order to apply it to real-world image databases. Preliminary experiments show that eye movements provide a rich input modality from which it is possible to learn the interests of the user.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/auer10a.html
  PDF: http://proceedings.mlr.press/v11/auer10a/auer10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-auer10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Peter
    family: Auer
  - given: Zakria
    family: Hussain
  - given: Samuel
    family: Kaski
  - given: Arto
    family: Klami
  - given: Jussi
    family: Kujala
  - given: Jorma
    family: Laaksonen
  - given: Alex P.
    family: Leung
  - given: Kitsuchart
    family: Pasupa
  - given: John
    family: Shawe-Taylor
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 51-57
  id: auer10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 51
  lastpage: 57
  published: 2010-09-30 00:00:00 +0000
- title: 'Handwritten Text Recognition for Ancient Documents'
  abstract: 'Huge amounts of legacy documents are being published by on-line digital libraries worldwide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems are presented to deal with these sorts of documents. The first two address two different approaches for semi-automatic transcription of document images. The third system implements an alignment method to find mappings between word images of a handwritten document and their respective words in its given transcription.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/juan10a.html
  PDF: http://proceedings.mlr.press/v11/juan10a/juan10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-juan10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Alfons
    family: Juan
  - given: Verónica
    family: Romero
  - given: Joan Andreu
    family: Sánchez
  - given: Nicolás
    family: Serrano
  - given: Alejandro H.
    family: Toselli
  - given: Enrique
    family: Vidal
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 58-65
  id: juan10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 58
  lastpage: 65
  published: 2010-09-30 00:00:00 +0000
- title: 'Assessment of Cow’s Body Condition Score Through Statistical Shape Analysis and Regression Machines'
  abstract: 'This study explores the feasibility of estimating the Body Condition Score (BCS) of cows from digital images by employing statistical shape analysis and regression machines. The body shapes of cows are described through a number of variations from a unique average shape. Specifically, Kernel Principal Component Analysis is used to determine the components describing the many ways in which the body shape of different cows tends to deform from the average shape. This description is used for automatic estimation of BCS through a regression approach. The proposed method has been tested on a new benchmark dataset available through the Internet. Experimental results confirm the effectiveness of the proposed technique, which outperforms the state-of-the-art approaches proposed in the context of dairy cattle research.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/battiato10a.html
  PDF: http://proceedings.mlr.press/v11/battiato10a/battiato10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-battiato10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Sebastiano
    family: Battiato
  - given: Giovanni Maria
    family: Farinella
  - given: Giuseppe Claudio
    family: Guarnera
  - given: Giovanni
    family: Puglisi
  - given: Giuseppe
    family: Azzaro
  - given: Margherita
    family: Caccamo
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 66-73
  id: battiato10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 66
  lastpage: 73
  published: 2010-09-30 00:00:00 +0000
- title: 'Closure-Based Confidence Boost in Association Rules'
  abstract: 'We focus on association rule mining. It is well-known that naive miners often end up providing far too many mined associations to be actually useful in practice. Many proposals exist for selecting appropriate association rules, trying to measure their interest in various ways; most of these approaches are statistical in nature, or share their main traits with statistical notions. Alternatively, some existing notions of redundancy among association rules allow for a logical-style characterization and lead to irredundant bases (axiomatizations) of absolutely minimum size. Here we follow up on a study of closure-based redundancy, which, in practice, leads to smaller bases than simpler alternative forms of redundancy, with the proviso that, in principle, they need to be complemented with an implicational basis. One can push the intuition of redundancy further and gain a perspective of the interest of association rules in terms of their “novelty” with respect to other rules. An irredundant rule is so because its confidence is higher than what the rest of the rules would suggest; then, one can ask: how much higher? Among several variants, a recently proposed parameter, the confidence boost, succeeds in measuring a notion of novelty along these lines so that it better fits the needs of practical applications. However, that notion is based on plain redundancy, of relatively limited practical usefulness. Here we extend the confidence boost to closure-based redundancy, paying a small theoretical price to obtain several advantages in practical applications. We describe a rule-mining system implementing this contribution.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/balcazar10a.html
  PDF: http://proceedings.mlr.press/v11/balcazar10a/balcazar10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-balcazar10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: José L.
    family: Balcázar
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 74-80
  id: balcazar10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 74
  lastpage: 80
  published: 2010-09-30 00:00:00 +0000
- title: 'HMMPayl: an application of HMM to the analysis of the HTTP Payload'
  abstract: 'Zero-day attacks are one of the most dangerous threats against computer networks. These, by definition, are attacks never seen before. Thus, defense tools based on a database of rules (usually referred to as “signatures”) that describe known attacks cannot do anything against them. Recently, defense tools based on machine learning algorithms have gained increasing popularity, as they offer the possibility of fighting off zero-day attacks as well. In this paper we propose HMMPayl, an anomaly-based Intrusion Detection System for the protection of a web server and of the applications the server hosts. HMMPayl analyzes the network traffic toward the web server and is based on Hidden Markov Models. This paper makes several contributions. First, the algorithm implemented by HMMPayl allows the payload to be modelled carefully, increasing the classification accuracy with respect to previously proposed solutions. Second, we show that an approach based on multiple classifiers leads to an increased classification accuracy with respect to the case where a single classifier is used. Third, exploiting the redundancy within the information extracted from the payload, we propose a solution to reduce the computational cost of the algorithm.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/ariu10a.html
  PDF: http://proceedings.mlr.press/v11/ariu10a/ariu10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-ariu10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Davide
    family: Ariu
  - given: Giorgio
    family: Giacinto
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 81-87
  id: ariu10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 81
  lastpage: 87
  published: 2010-09-30 00:00:00 +0000
- title: 'Maximum Margin Learning with Incomplete Data: Learning Networks instead of Tables'
  abstract: 'In this paper we address the problem of predicting when the available data is incomplete. We show that changing the generally accepted table-wise view of the sample items into a graph-representable one allows us to solve these kinds of problems in a very concise way by using the well-known convex, one-class classification based, optimisation framework. The use of the one-class formulation in both the learning phase and the prediction makes the entire procedure highly consistent. The graph representation can express the complex interdependencies among the data sources. The underlying optimisation problem can be transformed into an on-line algorithm, e.g. a perceptron-type one, and in this way it can deal with data sets of millions of items. This framework covers and encompasses supervised, semi-supervised and some unsupervised learning problems. Furthermore, the data sources can be chosen as not only simple binary variables or vectors but also text documents, images or even graphs with complex internal structures.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/szedmak10a.html
  PDF: http://proceedings.mlr.press/v11/szedmak10a/szedmak10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-szedmak10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Sandor
    family: Szedmak
  - given: Yizhao
    family: Ni
  - given: Steve R.
    family: Gunn
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 96-102
  id: szedmak10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 96
  lastpage: 102
  published: 2010-09-30 00:00:00 +0000
- title: 'Interactive Pattern Recognition and Human Language Technology for Digital Audiovisual Content Processing'
  abstract: 'This paper describes ongoing research work by the Pattern Recognition and Human Language Technology (PRHLT) group (UPV PASCAL2 node) in two important technology transfer projects: i3media and erudito.com. On the one hand, i3media (2007-2010) is a 35M€ “tractor” technology project within the Spanish Programa CENIT-Ingenio 2010, run through a consortium of 12 main enterprises of the media sector, which also involves 19 research groups, including PRHLT. i3media focuses on the creation and automated management of intelligent audiovisual content, so as to facilitate both content personalisation and interaction with users (i3media.barcelonamedia.org). Our participation in i3media is centred on interactive machine translation, to transfer and adapt our experience with this technology to i3media-specific needs. On the other hand, erudito.com (2010-2012) is a 1.4M€ experimental design project, supported by the Spanish Ministry of Industry, Tourism and Trade under the Avanza I+D program, aimed at developing a tool to encapsulate, distribute and intelligently use digital content such as that shown on thematic TV channels. In this project, PRHLT contributes to the development of interactive closed captioning (speech transcription) and machine translation tools.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/lagarda10a.html
  PDF: http://proceedings.mlr.press/v11/lagarda10a/lagarda10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-lagarda10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Antonio
    family: Lagarda
  - given: Jorge
    family: Civera
  - given: Alfons
    family: Juan
  - given: Francisco
    family: Casacuberta
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 103-110
  id: lagarda10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 103
  lastpage: 110
  published: 2010-09-30 00:00:00 +0000
- title: 'Facial Expression Detection using Filtered Local Binary Pattern Features with ECOC Classifiers and Platt Scaling'
  abstract: 'We outline a design for a FACS-based facial expression recognition system and describe in more detail the implementation of two of its main components. Firstly, we look at how features that are useful from a pattern analysis point of view can be extracted from a raw input image. We show that good results can be obtained by using the method of local binary patterns (LBP) to generate a large number of candidate features and then selecting from them using fast correlation-based filtering (FCBF). Secondly, we show how Platt scaling can be used to improve the performance of an error-correcting output code (ECOC) classifier.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/smith10a.html
  PDF: http://proceedings.mlr.press/v11/smith10a/smith10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-smith10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Raymond S.
    family: Smith
  - given: Terry
    family: Windeatt
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 111-118
  id: smith10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 111
  lastpage: 118
  published: 2010-09-30 00:00:00 +0000
- title: 'Gap Between Theory and Practice: Noise Sensitive Word Alignment in Machine Translation'
  abstract: 'Word alignment is to estimate a lexical translation probability \emph{p}(\emph{e}|\emph{f}), or to estimate the correspondence \emph{g}(\emph{e},\emph{f}) where a function \emph{g} outputs either 0 or 1, between a source word \emph{f} and a target word \emph{e} for given bilingual sentences. In practice, this formulation does not consider the existence of ’noise’ (or outlier) which may cause problems depending on the corpus. \emph{N}-to-\emph{m} mapping objects, such as paraphrases, non-literal translations, and multi-word expressions, may appear as both noise and also as valid training data. From this perspective, this paper tries to answer the following two questions: 1) how to detect stable patterns where noise seems legitimate, and 2) how to reduce such noise, where applicable, by supplying extra information as prior knowledge to a word aligner.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/okita10a.html
  PDF: http://proceedings.mlr.press/v11/okita10a/okita10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-okita10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Tsuyoshi
    family: Okita
  - given: Yvette
    family: Graham
  - given: Andy
    family: Way
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 119-126
  id: okita10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 119
  lastpage: 126
  published: 2010-09-30 00:00:00 +0000
- title: 'Modeling Knowledge Worker Activity'
  abstract: 'This paper describes an approach to constructing a probabilistic process model representing knowledge worker activity out of a log of primitive events, such as e-mails, web page visits and document accesses. Firstly, we present the process of enriching the primitive events into abstract actions, executed in different contexts. We explain the process of obtaining both context and action for each event by clustering the events via two different views. Secondly, we present an application of probabilistic deterministic finite automata to model the transitions between consecutive actions within the same context and demonstrate the approach on real-world knowledge worker data for the purpose of understanding knowledge processes and demonstrating the feasibility of the proposed approach, where a process model is constructed out of low-level events.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/stajner10a.html
  PDF: http://proceedings.mlr.press/v11/stajner10a/stajner10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-stajner10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Tadej
    family: Štajner
  - given: Dunja
    family: Mladenić
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 127-133
  id: stajner10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 127
  lastpage: 133
  published: 2010-09-30 00:00:00 +0000
- title: 'Visualization of Online Discussion Forums'
  abstract: 'This paper describes a set of visualization tools which aid the understanding of discussion topics and trends in online discussion forums. The tools integrate into the forum’s web page, allowing for easy exploration of its contents. Three visualizations are presented: a visual browsing suggestions mechanism, a semantic “atlas” providing a thematic overview of larger forum segments, and a timeline displaying temporal evolution of forum topics. The underlying algorithms have very few language-dependent components. The software is operational and can be tested live on Slovene, Slovak and Hungarian pilot sites, containing up to 5 million forum posts.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/trampus10a.html
  PDF: http://proceedings.mlr.press/v11/trampus10a/trampus10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-trampus10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Mitja
    family: Trampuš
  - given: Marko
    family: Grobelnik
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 134-141
  id: trampus10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 134
  lastpage: 141
  published: 2010-09-30 00:00:00 +0000
- title: 'A Novel Hybrid Feature Selection Method Based on IFSFFS and SVM for the Diagnosis of Erythemato-Squamous Diseases'
  abstract: 'This paper develops a diagnosis model based on Support Vector Machines (SVM) with a novel hybrid feature selection method to diagnose erythemato-squamous diseases. Our hybrid feature selection method, named IFSFFS (Improved F-score and Sequential Forward Floating Search), combines the advantages of filters and wrappers to select the optimal feature subset from the original feature set. In IFSFFS, we first generalize the original F-score to an improved F-score that measures the discrimination between more than two sets of real numbers. We then combine Sequential Forward Floating Search (SFFS) with the improved F-score to select the optimal feature subset. Here, the improved F-score serves as the evaluation criterion for the filter, while SFFS and SVM together form the evaluation system of the wrapper. The best parameters of the SVM kernel function are found by grid search with ten-fold cross-validation. Experiments were conducted on five random training-test partitions of the erythemato-squamous diseases dataset from the UCI machine learning database. The experimental results show that our SVM-based model with IFSFFS achieves the optimal classification accuracy with no more than 14 features.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/xie10a.html
  PDF: http://proceedings.mlr.press/v11/xie10a/xie10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-xie10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Juanying
    family: Xie
  - given: Weixin
    family: Xie
  - given: Chunxia
    family: Wang
  - given: Xinbo
    family: Gao
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 142-151
  id: xie10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 142
  lastpage: 151
  published: 2010-09-30 00:00:00 +0000
- title: 'Learning to Rank for Personalized News Article Retrieval'
  abstract: 'This paper tackles the problem of user-personalized ranking of search results. The focus is on news retrieval, and the data from which the ranking model is learned was provided by a large online newspaper. The personalized news search ranking model that we have developed takes into account not only document content and metadata, but also user-specific data such as age, gender, job, income, city and country. All user-specific data is provided by users themselves when registering on the news site.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/dali10a.html
  PDF: http://proceedings.mlr.press/v11/dali10a/dali10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-dali10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Lorand
    family: Dali
  - given: Blaž
    family: Fortuna
  - given: Jan
    family: Rupnik
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 152-159
  id: dali10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 152
  lastpage: 159
  published: 2010-09-30 00:00:00 +0000
- title: 'Detection of Server-side Web Attacks'
  abstract: 'Web servers and server-side applications constitute the key components of modern Internet services. We present a pattern recognition system for the detection of intrusion attempts that target such components. Our system is anomaly-based, i.e., we model the normal (legitimate) traffic and identify intrusion attempts as anomalous traffic. To address the presence of attacks (noise) inside the training set, we employ an ad-hoc outlier detection technique. This approach does not require supervision and allows us to accurately detect both known and unknown attacks against web services.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/corona10a.html
  PDF: http://proceedings.mlr.press/v11/corona10a/corona10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-corona10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Igino
    family: Corona
  - given: Giorgio
    family: Giacinto
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 160-166
  id: corona10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 160
  lastpage: 166
  published: 2010-09-30 00:00:00 +0000
- title: 'Multiple Kernel Learning on the Limit Order Book'
  abstract: 'Simple features constructed from order book data for the EURUSD currency pair were used to construct a set of kernels. These kernels were used both individually and simultaneously, through the Multiple Kernel Learning (MKL) methods SimpleMKL and the more novel LPBoostMKL, to train multiclass Support Vector Machines to predict the direction of future price movements. The kernel methods outperformed a trend-following benchmark both in their predictive ability and when used in a simple trading rule. Furthermore, the kernel weightings selected by the MKL techniques highlight which features of the EURUSD order book are the most informative for predictive tasks.'
  volume: 11
  URL: https://proceedings.mlr.press/v11/fletcher10a.html
  PDF: http://proceedings.mlr.press/v11/fletcher10a/fletcher10a.pdf
  edit: https://github.com/mlresearch//v11/edit/gh-pages/_posts/2010-09-30-fletcher10a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the First Workshop on Applications of Pattern Analysis'
  publisher: 'PMLR'
  author: 
  - given: Tristan
    family: Fletcher
  - given: Zakria
    family: Hussain
  - given: John
    family: Shawe-Taylor
  editor: 
  - given: Tom
    family: Diethe
  - given: Nello
    family: Cristianini
  - given: John
    family: Shawe-Taylor
  address: Cumberland Lodge, Windsor, UK
  page: 167-174
  id: fletcher10a
  issued:
    date-parts: 
      - 2010
      - 9
      - 30
  firstpage: 167
  lastpage: 174
  published: 2010-09-30 00:00:00 +0000