ESANN2009

17th European Symposium on Artificial Neural Networks
Bruges, Belgium, April 22-23-24


Content of the proceedings

WARNING: you need Adobe Acrobat Reader 7.0 or later to view the PDF files below



Semi-supervised learning


ES2009-3

Machine Learning with Labeled and Unlabeled Data

Tijl De Bie, Thiago Turchetti Maia, Antônio Braga

Abstract
The field of semi-supervised learning has been expanding rapidly in the past few years, with a sharp increase in the number of related publications. In this paper we present the SSL problem in contrast with supervised and unsupervised learning. In addition, we propose a taxonomy with which we categorize many existing approaches described in the literature based on their underlying framework, data representation, and algorithmic class.

Manuscript from author [PDF]

ES2009-9

A Variational Approach to Semi-Supervised Clustering

Peng Li, Yiming Ying, Colin Campbell

Abstract
We present a Bayesian variational inference scheme for semi-supervised clustering in which data is supplemented with side information in the form of common labels. No mutual exclusion of classes is assumed, and samples are represented as a combinatorial mixture over multiple clusters. We illustrate performance on six datasets and find a favourable comparison against constrained K-means clustering.

Manuscript from author [PDF]

ES2009-75

A self-training method for learning to rank with unlabeled data

Tuong Vinh Truong, Massih-Reza Amini, Patrick Gallinari

Abstract
This paper presents a new algorithm for learning bipartite ranking functions from partially labeled data. The algorithm extends the self-training paradigm developed within the classification framework. We further propose an efficient and scalable optimization method for training linear models, though the approach is general in the sense that it can be applied to any class of scoring functions. Empirical results on several common image and text corpora, measured by the Area Under the ROC Curve (AUC) and Average Precision, show that the use of unlabeled data in the training process improves the performance of baseline supervised ranking functions.

Manuscript from author [PDF]

ES2009-80

Transductively Learning from Positive Examples Only

Kristiaan Pelckmans, Johan Suykens

Abstract
This paper considers the task of learning a binary labeling of the vertices of a graph, given only a small set of positive examples and knowledge of the desired number of positives. A learning machine is described that maximizes the precision of the prediction, a combinatorial optimization problem which can be rephrased as an S-T mincut problem. For validation, we consider the MOVIELENS movie recommendation dataset. For each user we are given a collection of (ratings of) movies which the user likes, and the task is to recommend a disjoint set of movies which are most probably of interest to that user.

Manuscript from author [PDF]

ES2009-104

Supervised classification of categorical data with uncertain labels for DNA barcoding

Charles Bouveyron, Stephane Girard, Madalina Olteanu

Abstract
In the supervised classification framework, human supervision is required to label a set of learning data which is then used to build the classifier. However, in many applications, human supervision is imprecise, difficult, or expensive, and this gives rise to non-robust classifiers. An interesting application where this situation occurs is DNA barcoding, which aims to develop a standard tool to identify species with no or limited recourse to taxonomic expertise. In some cases, the morphological features describing the reference sample may be misleading and the taxonomists attribute labels incorrectly. This work presents a robust supervised classification method for categorical data based on a multivariate multinomial mixture model. The proposed method is applied to DNA barcoding and compared to classical methods on a real dataset.

Manuscript from author [PDF]

ES2009-81

A semi-supervised approach to question classification

David Tomás, Claudio Giuliano

Abstract
This paper presents a machine learning approach to question classification. We have defined a kernel function based on latent semantic information acquired from unlabeled data. This kernel allows external semantic knowledge to be included in the supervised learning process. We have combined this knowledge with a bag-of-words approach by means of composite kernels to obtain state-of-the-art results. As the semantic information is acquired from unlabeled text, our system can be easily adapted to different languages and domains.

Manuscript from author [PDF]

ES2009-56

Improving BAS committee performance with a semi-supervised approach

Ruy Luiz Milidiú, Julio Cesar Duarte

Abstract
Semi-supervised learning is a machine learning approach that, by making use of both labeled and unlabeled data for training, can significantly improve learning accuracy. Boosting is a machine learning technique that combines several weak classifiers to improve the overall accuracy. At each iteration, the algorithm changes the weights of the examples and builds an additional classifier. A well-known algorithm based on boosting is AdaBoost, which uses an initial uniform distribution. Boosting At Start (BAS) is a boosting framework that generalizes AdaBoost by allowing any initial weight distribution and a cost function. Here, we present a scheme that allows the use of unlabeled data in the BAS framework. We examine the performance of the proposed scheme on some datasets commonly used in semi-supervised approaches. Our empirical findings indicate that BAS can improve the accuracy of the generated classifiers by taking advantage of unlabeled data.

Manuscript from author [PDF]

ES2009-136

Semi-supervised bipartite ranking with the normalized Rayleigh coefficient

Liva Ralaivola

Abstract
We propose a new algorithm for semi-supervised learning in the bipartite ranking framework. It is based on the maximization of a so-called normalized Rayleigh coefficient, which differs from the usual Rayleigh coefficient of Fisher's linear discriminant in that the actual covariance matrices are used instead of the scatter matrices. We show that if the class conditional distributions are Gaussian, then the ranking function produced by our algorithm is the optimal linear ranking function. A kernelized version of the proposed algorithm and a semi-supervised formulation are provided. Preliminary numerical results are promising.

Manuscript from author [PDF]
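For context on the entry above: the normalized Rayleigh coefficient modifies Fisher's classical criterion by using the actual covariance matrices instead of the scatter matrices. The sketch below (synthetic data, illustrative names, not the paper's code) computes the classical Fisher direction as a linear bipartite ranker and estimates its AUC via the Mann-Whitney statistic.

```python
# Classical Fisher direction used as a linear bipartite ranking function.
# Illustrative sketch only; the paper's normalized variant weights the
# per-class covariance matrices differently.
import numpy as np

rng = np.random.default_rng(0)
pos = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(200, 2))  # positive class
neg = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))  # negative class

# Pooled within-class covariance; Fisher's direction solves Sw w = (m+ - m-).
Sw = np.cov(pos.T) + np.cov(neg.T)
w = np.linalg.solve(Sw, pos.mean(axis=0) - neg.mean(axis=0))

# Rank all samples by the linear score w.x and estimate the AUC
# (Mann-Whitney U normalized by the number of positive-negative pairs).
scores = np.r_[pos @ w, neg @ w]
labels = np.r_[np.ones(len(pos)), np.zeros(len(neg))]
order = np.argsort(scores)
ranks = np.empty(len(scores))
ranks[order] = np.arange(1, len(scores) + 1)
auc = (ranks[labels == 1].sum() - len(pos) * (len(pos) + 1) / 2) / (len(pos) * len(neg))
print(round(auc, 2))
```

With unit-variance Gaussian classes whose means are two standard deviations apart, the optimal linear ranker reaches an AUC around 0.9, which the sketch reproduces.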

ES2009-61

Partially-supervised learning in Independent Factor Analysis

Etienne Côme, Latifa Oukhellou, Patrice Aknin, Thierry Denoeux

Abstract
Independent Factor Analysis (IFA) is used to recover latent components (or sources) from their linear observed mixtures within an unsupervised learning framework. Both the mixing process and the source densities are learned from the observed data. The sources are assumed to be mutually independent and distributed according to a mixture of Gaussians. This paper investigates the possibility of incorporating partial knowledge of the cluster membership of some samples when estimating the IFA model. Semi-supervised and partially supervised learning cases can thus be handled. Experimental results demonstrate the ability of this approach to enhance estimation accuracy and to remove indeterminacies commonly encountered in unsupervised IFA, such as the permutation of the sources.

Manuscript from author [PDF]



Dimensionality reduction



ES2009-133

The Exploration Machine - a novel method for structure-preserving dimensionality reduction

Axel Wismueller

Abstract
We present a novel method for structure-preserving dimensionality reduction. The Exploration Machine (Exploratory Observation Machine, XOM) computes graphical representations of high-dimensional observations by a strategy of self-organized model adaptation. Although simple and computationally efficient, XOM enjoys a surprising flexibility to simultaneously contribute to several different domains of advanced machine learning, scientific data analysis, and visualization, such as structure-preserving dimensionality reduction and data clustering.

Manuscript from author [PDF]

ES2009-65

Nonlinear Discriminative Data Visualization

Kerstin Bunte, Barbara Hammer, Petra Schneider, Michael Biehl

Abstract
Due to the tremendous increase of electronic information with respect to the size of the data sets as well as its dimensionality, visualization of high dimensional data constitutes one of the key problems of data mining. Since embedding in lower dimensions necessarily includes a loss of information, methods to explicitly control the information kept by a specific visualization technique are highly desirable. The incorporation of supervised class information constitutes an important specific case. In this contribution we propose an extension of prototype-based local matrix learning by a charting technique which results in an efficient nonlinear discriminative visualization of a given labelled data manifold.

Manuscript from author [PDF]

ES2009-114

Does dimensionality reduction improve the quality of motion interpolation?

Sebastian Bitzer, Stefan Klanke, Sethu Vijayakumar

Abstract
In recent years nonlinear dimensionality reduction has frequently been suggested for the modelling of high-dimensional motion data. While it is intuitively plausible to use dimensionality reduction to recover low dimensional manifolds which compactly represent a given set of movements, there is a lack of critical investigation into the quality of resulting representations, in particular with respect to generalisability. Furthermore it is unclear how consistently particular methods can achieve good results. Here we use a set of robotic motion data of which we know ground truth to evaluate a range of nonlinear dimensionality reduction methods with respect to the quality of motion interpolation. We show that results are sensitive to parameter settings and data set used and that no dimensionality reduction method significantly outperforms naive interpolation.

Manuscript from author [PDF]

ES2009-113

Transformations for variational factor analysis to speed up learning

Jaakko Luttinen, Alexander Ilin, Tapani Raiko

Abstract
We present a way to speed up learning of variational Bayesian factor analysis models by performing simple transformations during the learning process. These transformations are motivated by representational ambiguities in the model and are given a theoretical justification from the Bayesian framework. We derive the formulae for variational Bayesian PCA and show experimentally that the transformations may improve the rate of convergence by orders of magnitude. The result can be applied to several factor analysis models that use EM or gradient-based algorithms for learning.

Manuscript from author [PDF]

ES2009-117

X-SOM and L-SOM: a nested approach for missing value imputation

Paul Merlin, Antti Sorjamaa, Bertrand Maillet, Amaury Lendasse

Abstract
In this paper, a new method for the determination of missing values in temporal databases is presented. The method is based on a robust version of a nonlinear classification algorithm, the Self-Organizing Map, and combines two classifications in order to exploit both the spatial and the temporal dependencies of the dataset. This nested approach leads to a significant improvement in the estimation of missing values. An application to the determination of missing values in a hedge fund return database is presented.

Manuscript from author [PDF]



Signal and image processing


ES2009-70

Sparse differential connectivity graph of scalp EEG for epileptic patients

Ladan Amini, Sophie Achard, Christian Jutten, Hamid Soltanian-Zadeh, Gholam Ali Hossein-Zadeh, Olivier David, Laurent Vercueil

Abstract
The aim of this work is to capture, in a connectivity graph, the modulation of the inter-relations between scalp EEG measurements across two brain states. We present a sparse differential connectivity graph (SDCG) to distinguish the effectively modulated connections between epileptiform and non-epileptiform states of the brain from the common connections created by noise, artifacts, unwanted background activities, and their related volume conduction effects. The proposed method is applied to real epileptic EEG data. Clustering the features extracted from the SDCG may provide valuable information about epileptic foci and their relations.

Manuscript from author [PDF]

ES2009-83

Patch-based bilateral filter and local m-smoother for image denoising

Arnaud de Decker, John Lee, Michel Verleysen

Abstract
In the field of image analysis, denoising is an important preprocessing task. The design of an efficient, robust, and computationally effective edge-preserving denoising algorithm is a widely studied and yet unsolved problem. One of the most efficient edge-preserving denoising algorithms is the bilateral filter, which is an intuitive generalization of the local M-smoother. In this paper, we propose to modify both the bilateral filter and the local M-smoother to use patches of the image instead of single pixels in the denoising process. With this modification, the filtering effect becomes more sensitive to the different areas of the image and the filtering results improve. The denoising quality of these patch-based filters is evaluated on test images and compared to classical bilateral filtering and the local M-smoother.

Manuscript from author [PDF]
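To make the filters discussed above concrete, here is a minimal single-pixel bilateral filter on a 1-D signal; the paper's contribution replaces the single-pixel range comparison with patch comparisons. Everything below (function name, bandwidths, test signal) is illustrative, not from the paper.

```python
# Single-pixel bilateral filter on a 1-D signal: each output sample is a
# weighted average of its neighbours, with a spatial Gaussian weight and a
# range (intensity-difference) Gaussian weight.
import numpy as np

def bilateral_1d(x, radius=3, sigma_s=2.0, sigma_r=0.5):
    n = len(x)
    out = np.empty(n)
    offsets = np.arange(-radius, radius + 1)
    spatial = np.exp(-offsets**2 / (2 * sigma_s**2))  # fixed spatial kernel
    for i in range(n):
        idx = np.clip(i + offsets, 0, n - 1)          # replicate borders
        range_w = np.exp(-(x[idx] - x[i])**2 / (2 * sigma_r**2))
        w = spatial * range_w
        out[i] = np.sum(w * x[idx]) / np.sum(w)
    return out

# Noisy step edge: the filter attenuates noise while keeping the edge,
# because cross-edge neighbours get small range weights.
rng = np.random.default_rng(1)
clean = np.r_[np.zeros(50), np.ones(50)]
noisy = clean + rng.normal(scale=0.1, size=100)
den = bilateral_1d(noisy)
print(np.mean((den - clean)**2) < np.mean((noisy - clean)**2))
```

The patch-based variant of the paper would compare neighbourhoods `x[idx-k:idx+k]` rather than single values `x[idx]` when computing the range weights.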

ES2009-94

Adaptive anisotropic denoising: a bootstrapped procedure

John Lee, Arnaud de Decker, Michel Verleysen

Abstract
Signal denoising proves to be important in many domains such as pattern recognition and image analysis. This paper investigates several refinements of adaptive local filters that rely on local mode finding. These spatial filters are anisotropic and offer the advantage of attenuating noise without smoothing salient signal features such as discontinuities or other sharp transitions. In particular, a bootstrapped procedure is developed and leads to an improvement of the denoising quality without increasing the computational complexity. Experiments with an artificial benchmark allow the quantification of the performance gain.

Manuscript from author [PDF]



Learning (with) preferences


ES2009-4

Supervised learning as preference optimization

Fabio Aiolli, Alessandro Sperduti

Abstract
Learning with preferences has received increasing attention in the last few years. The goal in this setting is to learn from qualitative or quantitative declared preferences between objects of a domain. In this paper we survey a recent framework for supervised learning based on preference optimization. Many supervised tasks can be seen as particular instances of this preference-based framework, including binary classification, (single- or multi-label) multiclass classification, ranking problems, and (ordinal) regression, to name a few. We show that the proposed general preference learning model (GPLM), which is based on a large-margin principled approach, gives a flexible way to codify cost functions for all the above problems as sets of linear preferences. Examples of how the framework has been effectively used to address a variety of real-world applications are reported, clearly showing the flexibility and effectiveness of the approach.

Manuscript from author [PDF]

ES2009-112

Efficient voting prediction for pairwise multilabel classification

Eneldo Loza Mencía, Sang-Hyeun Park, Johannes Fürnkranz

Abstract
The pairwise approach to multilabel classification reduces the problem to learning and aggregating preference predictions among the possible labels. A key problem is the need to query a quadratic number of preferences for making a prediction. To solve this problem, we extend the recently proposed QWeighted algorithm for efficient pairwise multiclass voting to the multilabel setting, and evaluate the adapted algorithm on several real-world datasets. We achieve an average-case reduction of classifier evaluations from n^2 to n + dn log n, where n is the total number of labels and d is the average number of labels, which is typically quite small in real-world datasets.

Manuscript from author [PDF]
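The complexity figures quoted in the abstract above can be illustrated numerically. The sketch below (hypothetical helper names; natural logarithm assumed, since the abstract does not fix the base) compares full pairwise voting, which needs all n(n-1)/2 evaluations (the "n^2" of the abstract), with the reported average-case cost n + dn log n.

```python
# Rough numerical comparison of pairwise-voting evaluation counts.
# Not the authors' code; the QWeighted cost is the average-case figure
# reported in the abstract, evaluated with the natural logarithm.
import math

def full_voting_cost(n):
    """Evaluations for full pairwise voting over n labels."""
    return n * (n - 1) // 2

def qweighted_cost(n, d):
    """Reported average-case cost: n + d*n*log(n), d = avg. labels per example."""
    return n + d * n * math.log(n)

for n, d in [(100, 3), (1000, 4)]:
    print(n, d, full_voting_cost(n), round(qweighted_cost(n, d)))
```

For n = 100 labels and d = 3, full voting needs 4950 evaluations while the reported average-case cost is under 1500; the gap widens quickly with n.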

ES2009-122

Multi-task Preference learning with Gaussian Processes

Adriana Birlutiu, Perry Groot, Tom Heskes

Abstract
We present an EM algorithm for the problem of learning user preferences with Gaussian Processes in the context of multi-task learning. We validate our approach on an audiological dataset and show that predictive results for sound quality perception of normal-hearing and hearing-impaired subjects, in the context of pairwise comparison experiments, can be significantly improved using the hierarchical model.

Manuscript from author [PDF]



Learning I


ES2009-62

Adaptive Metrics for Content Based Image Retrieval in Dermatology

Kerstin Bunte, Michael Biehl, Nicolai Petkov, Marcel F. Jonkman

Abstract
We apply distance-based classifiers in the context of a content-based image retrieval task in dermatology. In the present project, only RGB color information is used. We employ two different methods to obtain a discriminative distance measure for classification and retrieval: Generalized Matrix LVQ and the Large Margin Nearest Neighbor approach. Both methods provide a linear transformation of the original features to lower dimensions. We demonstrate that the two methods lead to very similar discriminative transformations and significantly improve classification and retrieval performance.

Manuscript from author [PDF]

ES2009-67

Bayesian periodogram smoothing for speech enhancement

Xueru Zhang, Alexander Ypma, Bert de Vries

Abstract
Periodogram smoothing of the received noisy signal is a challenging problem in speech enhancement. We present a Bayesian approach, where the instantaneous periodogram is smoothed through an adaptive smoothing parameter. By updating sufficient statistics using new samples of the noisy signal, the smoothing parameter is adjusted on-line. The performance of the novel smoothing algorithm is studied in a speech enhancement context. It is demonstrated that with respect to Mean Square Error, the proposed Bayesian smoothing algorithm performs better than the other non-Bayesian smoothing algorithms in higher signal-to-noise ratio environments.

Manuscript from author [PDF]

ES2009-78

Improving the transition modelling in hidden Markov models for ECG segmentation

Benoît Frénay, Gaël de Lannoy, Michel Verleysen

Abstract
The segmentation of the ECG signal is a useful tool for the diagnosis of cardiac diseases. However, state-of-the-art methods use hidden Markov models which do not adequately model the transitions between successive waves. This paper proposes two methods which attempt to overcome this limitation: an HMM state-scission scheme which prevents ingoing and outgoing transitions in the middle of the waves, and a Bayesian network in which the transitions are emission-dependent. Experiments show that both methods improve the results on pathological ECG signals.

Manuscript from author [PDF]

ES2009-74

A robust biologically plausible implementation of ICA-like learning

Felipe Gerhard, Cristina Savin, Jochen Triesch

Abstract
We present a model that can perform ICA-like computation through simple, local, biologically plausible rules. By combining synaptic learning with homeostatic regulation of neuronal properties and adaptive lateral inhibition, the neural network can robustly learn Gabor-like receptive fields from natural images. With spatially localized inhibitory connections, a topographic map can be achieved. Additionally, the network can solve Földiák's bars problem, a classical nonlinear ICA task.

Manuscript from author [PDF]

ES2009-137

Spline-based neuro-fuzzy Kolmogorov’s network for time series prediction

Vitaliy Kolodyazhniy

Abstract
A spline-based modification of the previously developed Neuro-Fuzzy Kolmogorov's Network (NFKN) is proposed. In order to improve the approximation accuracy, cubic B-splines are substituted for triangular membership functions. The network is trained with a hybrid learning rule combining least-squares estimation for the output layer and gradient descent for the hidden layer. The initialization of the NFKN is deterministic and is based on the PCA procedure. The advantages of the modified NFKN are confirmed by long-range iterated predictions of two chaotic time series: artificial data generated by the Mackey-Glass equation and real data of laser intensity oscillations.

Manuscript from author [PDF]

ES2009-110

Gene expression data analysis using spatiotemporal blind source separation

Matthieu Sainlez, Pierre-Antoine Absil, Andrew Teschendorff

Abstract
We propose a “time-biased” and a “space-biased” method for spatiotemporal independent component analysis (ICA). The methods rely on computing an orthogonal approximate joint diagonalizer of a collection of covariance-like matrices. In the time-biased version, the time signatures of the ICA modes are imposed to be white, whereas the space-biased version imposes the same condition on the space signatures. We apply the two methods to the analysis of gene expression data, where the genes play the role of the space and the cell samples stand for the time. This study is a step towards addressing a question first raised by Liebermeister, on whether ICA methods for gene expression analysis should impose independence across genes or across cell samples. Our preliminary experiment indicates that both approaches have value, and that exploring the continuum between these two extremes can provide useful information about the interactions between genes and their impact on the phenotype.

Manuscript from author [PDF]

ES2009-127

A wavelet-heterogeneous index of market shocks for assessing the magnitude of financial crises

Christophe Boucher, Patrick Kouontchou, Bertrand Maillet, Raymond Hélène

Abstract
An accurate quantitative definition of financial crisis requires a universal and robust scale for measuring market shocks. Following Zumbach et al. (2000) and Maillet and Michel (2003), we propose a new quantitative measure of financial disturbances, which captures the heterogeneity of investor horizons – from day traders to pension funds. The indicator relies on a multi-resolution analysis of market volatility, each scale corresponding to a different investment horizon and data frequency. This new risk measure, called the “Wavelet-heterogeneous Index of Market Shocks” (WhIMS), combines two methods: Wavelet Packet Sub-band Decomposition and constrained Independent Component Analysis (see Kopriva and Sersic, 2007, and Lu and Rajapakse, 2005). We apply this measure to the French stock market (high-frequency CAC40) to date and gauge the severity of financial crises.

Manuscript from author [PDF]

ES2009-128

A robust hybrid DHMM-MLP modelling of financial crises measured by the WhIMS

Christophe Boucher, Bertrand Maillet, Paul Merlin

Abstract
This paper develops a hybrid model combining a Hidden Markov Chain (HMC) and Multilayer Perceptrons (MLP) on the Wavelet-heterogeneous Index of Market Shocks (WhIMS) to dynamically identify regimes of financial turbulence. The WhIMS is an aggregate measure of volatility computed at different frequencies. We estimate the model on a French stock market index (the CAC40) and compare the prediction performance of the HMC-MLP model to classical linear and nonlinear models. A state separation of financial disturbances, based on the WhIMS and the conditional probabilities of the HMC-MLP model, is then performed using a robust SOM.

Manuscript from author [PDF]

ES2009-105

A faster model selection criterion for OP-ELM and OP-KNN: Hannan-Quinn criterion

Yoan Miché, Amaury Lendasse

Abstract
The OP-ELM and OP-KNN algorithms share the same methodological structure, based on a random initialization of a feedforward neural network followed by a ranking of the neurons; the final step uses this ranking to determine the best combination of neurons to retain. This is usually achieved by Leave-One-Out (LOO) cross-validation. This article proposes using the Hannan-Quinn information criterion as the model selection criterion instead of the classical LOO. The criterion proved to be efficient and just as good as (or slightly better than) the previously used LOO criterion for both OP-ELM and OP-KNN, while decreasing computational times by factors of four to five.

Manuscript from author [PDF]
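The Hannan-Quinn criterion mentioned above has the standard form HQC = -2 log L + 2k log(log n), where L is the model likelihood, k the number of parameters, and n the sample size; the model with the lowest HQC is retained. A minimal helper (illustrative, not the authors' implementation):

```python
# Hannan-Quinn information criterion in its standard form.
# The numbers in the comparison below are made up for illustration.
import math

def hqc(log_likelihood, k, n):
    """HQC = -2*logL + 2*k*log(log(n)); lower is better."""
    return -2.0 * log_likelihood + 2.0 * k * math.log(math.log(n))

# A slightly worse fit with far fewer parameters wins under HQC.
print(hqc(-120.0, 5, 200) < hqc(-118.0, 12, 200))
```

Compared with LOO cross-validation, the criterion needs only one fit per candidate model, which is consistent with the four- to five-fold speed-up reported in the abstract.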

ES2009-123

Rosen's projection method for SVM training

Jorge López, José Dorronsoro

Abstract
In this work we give explicit formulae for the application of Rosen's gradient projection method to SVM training, leading to a very simple implementation. We show experimentally that the method provides good descent directions that result in fewer training iterations, particularly when high precision is wanted. However, a naive kernelization may end up in a procedure requiring more kernel operations (KOs) than SMO, and further work is needed to arrive at an efficient implementation.

Manuscript from author [PDF]

ES2009-124

On the huge benefit of quasi-random mutations for multimodal optimization with application to grid-based tuning of neurocontrollers

Guillaume Chaslot, Jean-Baptiste Hoock, Fabien Teytaud, Olivier Teytaud

Abstract
In this paper, we study the optimization of a neural network used for controlling a Monte-Carlo Tree Search (MCTS/UCT) algorithm. The main results are: (i) the specification of a new multimodal benchmark function, defined in agreement with \cite{multimodalppsn}, which pointed out that most multimodal functions are not satisfactory for some real-world multimodal scenarios (section \ref{sota}); (ii) experiments with Evolution Strategies on this new multimodal benchmark function, showing the great efficiency of quasi-random mutations in this framework (section \ref{artif}); (iii) a proof-of-concept application of ES to grid-based tuning of neural networks for controlling MCTS/UCT (section \ref{rw}).

Manuscript from author [PDF]

ES2009-32

Support vector machines regression for estimation of Mars surface physical properties

Caroline Bernard Michel, Sylvain Douté, Mathieu Fauvel, Laurent Gardes, Stephane Girard

Abstract
In this paper, the estimation of physical properties from hyperspectral data with support vector machines is addressed. Several kernel functions are used, from classical to advanced ones. The results are compared with Gaussian Regularized Sliced Inverse Regression and Partial Least Squares, both in terms of accuracy and complexity. Experiments on simulated data show that SVMs produce highly accurate results for some kernels, but at the cost of increased processing time. Inversion of real images shows that SVMs are robust and generalize well. In addition, the analysis of the support vectors allows the detection of saturation in the physical model used to generate the simulated data.

Manuscript from author [PDF]

ES2009-98

Self-organising map for large scale processes monitoring

Edouard Leclercq, Fabrice Druaux, Dimitri Lefebvre

Abstract
A feed-forward neural network is proposed for monitoring the operating modes of large-scale processes. A Gaussian hidden layer associated with a Kohonen output layer maps the principal features of the measured state variables. Subsets of selective neurons are generated in the hidden layer by self-adaptation of the centers and dispersion parameters of the Gaussian functions. The output layer acts as a data fusion operator by adapting the hidden-to-output weight matrix through a winner-takes-all strategy. The algorithm is tested on the Tennessee Eastman Challenge Process. The results show that the proposed neural network clearly maps the different operating modes.

Manuscript from author [PDF]

ES2009-100

The Use of ANN for Turbo Engine Applications

René Meier, Lars Frank Große, Franz Joos

Abstract
To reduce environmental pollution and increase the efficiency of commercially available turbo engines, optimisation is essential. The suggestion made in this paper is to use evolution strategies and artificial neural networks (ANN) for turbo engine applications. Optimisation of the impeller and of the combustion process are only two applications in the wide range of possible improvements.

Manuscript from author [PDF]

[Back to Top]


Efficient learning in recurrent networks


ES2009-7

Recent advances in efficient learning of recurrent networks

Barbara Hammer, Benjamin Schrauwen, Jochen J. Steil

Abstract
Recurrent neural networks (RNNs) carry the promise of implementing efficient and biologically plausible signal processing. They are optimally suited for a wide range of applications dealing with spatiotemporal data or causalities, and they provide explanations of cognitive phenomena in the human brain. Recently, a few new fundamental paradigms connected to RNNs have been developed which allow insights into their potential for information processing. They also pave the way towards new efficient training algorithms which overcome the well-known problem of long-term dependencies. This tutorial gives an overview of these recent developments in efficient, biologically plausible recurrent information processing.

Manuscript from author [PDF]

ES2009-132

Studies on reservoir initialization and dynamics shaping in echo state networks

Joschka Boedecker, Oliver Obst, N. Michael Mayer, Minoru Asada

Abstract
The fixed random connectivity of networks in reservoir computing leads to significant variation in performance. Only a few problem-specific optimization procedures are known to date. We study a general initialization method using permutation matrices and derive a new unsupervised learning rule based on intrinsic plasticity (IP) for echo state networks. Using three different benchmarks, we show that networks with permutation matrices for the reservoir connectivity have much longer memory than with the other methods, while still being able to perform highly non-linear mappings. We also show that IP based on sigmoid transfer functions is limited with regard to the output distributions that can be achieved.

Manuscript from author [PDF]

ES2009-63

Non-Markovian process modelling with Echo State Networks

Xavier Dutoit, Benjamin Schrauwen, Hendrik Van Brussel

Abstract
Reservoir Computing (RC) is a relatively recent technique for training recurrent neural networks. It has shown interesting performance in a wide range of tasks despite its simple training rules. We use it here in a logistic regression framework. Considering non-Markovian time series with a hidden variable, we show that RC can be used to estimate the transition probabilities at each time step, as well as the hidden variable itself. We also show that it outperforms classic logistic regression on this task. Finally, it can be used to extract invariants from a stochastic series.

Manuscript from author [PDF]

ES2009-17

Stimulus processing and unsupervised learning in autonomously active recurrent networks

Claudius Gros, Gregor Kaczor

Abstract
Strongly recurrent neural nets may show continuously ongoing self-sustained activity, as is the case for the brain. A new paradigm for learning is needed for such autonomously active neural nets, since standard Hebbian-style online learning would result in uncontrolled reinforcement of accidental activity patterns. Here we propose that autonomously active neural networks processing a time series of stimuli adapt whenever a stimulus successfully influences the ongoing internal dynamics. In this case the incoming stimulus corresponds to a novel signal. We then show that the network performs an unsupervised non-linear independent component analysis of the input data stream. We propose this paradigm to be of relevance for stimulus processing in both natural and artificial neural nets.

Manuscript from author [PDF]

ES2009-135

Reservoir computing for static pattern recognition

Mark Embrechts, Luis Alexandre, Jonathan Linton

Abstract
This paper introduces reservoir computing for static pattern recognition. Reservoir computing networks are neural networks with a sparsely connected recurrent hidden layer (or reservoir) of neurons. The weights from the inputs to the reservoir and the reservoir weights are randomly selected. The weights of the second layer are determined with a linear partial least squares solver. The outputs of the reservoir layer can be considered as an unsupervised data transformation, and this stage has a brain-like plausibility. This paper shows that by letting the dynamics of the reservoir evolve to a stable solution, and then applying a sigmoid transfer function, reservoir computing can be applied as a robust and highly accurate pattern classifier. Reservoir computing is applied to 16 difficult multi-class classification benchmark cases, and compared with the best results of state-of-the-art neural network classification methods with entropic error criteria.

Manuscript from author [PDF]
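The static-pattern pipeline the abstract above describes (random sparse reservoir, iteration to a stable state, sigmoid transfer, linear readout) can be illustrated with a minimal generic sketch. This is not the authors' implementation: the dimensions and data are arbitrary, and plain least squares stands in for the partial least squares solver they use.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, density=0.1, spectral_radius=0.9):
    """Random input weights and a sparse recurrent matrix, rescaled for stability."""
    W_in = rng.uniform(-1, 1, (n_res, n_in))
    W = rng.uniform(-1, 1, (n_res, n_res)) * (rng.random((n_res, n_res)) < density)
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def settle(x, W_in, W, n_iter=50):
    """Iterate the reservoir on a static input until its state stabilises,
    then apply a sigmoid transfer function to the settled state."""
    s = np.zeros(W.shape[0])
    for _ in range(n_iter):
        s = np.tanh(W_in @ x + W @ s)
    return 1.0 / (1.0 + np.exp(-s))

# Toy two-class problem; the linear readout is solved by ordinary least squares
# (the paper uses a partial least squares solver instead).
X = rng.normal(size=(40, 5))
y = (X[:, 0] > 0).astype(float)
W_in, W = make_reservoir(5, 30)
H = np.array([settle(x, W_in, W) for x in X])   # settled reservoir states
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)   # linear readout
acc = np.mean((H @ w_out > 0.5) == y)           # training accuracy
```

Since the input is static, the reservoir acts as a fixed nonlinear feature map, and only the readout is trained.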

ES2009-88

Generalisation of action sequences in RNNPB networks with mirror properties

Raymond Cuijpers, Floran Stuijt, Ida Sprinkhuizen-Kuyper

Abstract
The human mirror neuron system (MNS) is supposed to be involved in recognition of observed action sequences. However, it remains unclear how such a system could learn to recognise a large variety of action sequences. Here we investigated a neural network with mirror properties, the Recurrent Neural Network with Parametric Bias (RNNPB). We show that the network is capable of recognising noisy action sequences and that it is capable of generalising from a few learnt examples. Such a mechanism may explain how the human brain is capable of dealing with an infinite variety of action sequences.

Manuscript from author [PDF]

ES2009-54

Attractor-based computation with reservoirs for online learning of inverse kinematics

R. Felix Reinhart, Jochen J. Steil

Abstract
We implement completely data-driven and efficient online learning from temporally correlated data in a reservoir network setup. We show that attractor states rather than transients are used for computation when learning inverse kinematics for the redundant robot arm PA-10. Our findings also shed light on the role of output feedback.

Manuscript from author [PDF]

[Back to Top]


Classification and fuzzy logic


ES2009-59

Supervised variable clustering for classification of NIR spectra

Catherine Krier, Damien Francois, Fabrice Rossi, Michel Verleysen

Abstract
Spectrometric data involve very high-dimensional observations representing sampled spectra. The correlation of the resulting spectral variables and their high number are two sources of difficulties in modeling. This paper proposes a supervised feature clustering algorithm that provides dimension reduction for this type of data in a classification context. The new features designed by this method are means of the original spectral variables computed over specific ranges of wavelengths and are therefore easy to interpret. Experiments on real-world data show that the reduction in redundancy and in the number of features leads to better performance with a very low number of spectral ranges.

Manuscript from author [PDF]
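The kind of features the abstract above describes, means of the original spectral variables over contiguous wavelength ranges, can be sketched in a few lines. The ranges below are arbitrary illustrations, not ranges selected by the authors' supervised clustering algorithm:

```python
import numpy as np

def band_means(spectra, ranges):
    """Replace each spectrum by the means of its values over given wavelength ranges.

    spectra: (n_samples, n_wavelengths) array.
    ranges: list of (start, stop) channel-index pairs, one new feature per range.
    """
    return np.column_stack([spectra[:, a:b].mean(axis=1) for a, b in ranges])

# 10 synthetic spectra with 200 channels, reduced to 3 interpretable features.
spectra = np.random.default_rng(1).normal(size=(10, 200))
ranges = [(0, 50), (50, 120), (120, 200)]       # hypothetical wavelength ranges
features = band_means(spectra, ranges)          # shape (10, 3)
```

Each reduced feature stays directly interpretable as an average reflectance over a named wavelength band.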

ES2009-58

Fuzzy Fleiss-kappa for Comparison of Fuzzy Classifiers

Dietlind Zühlke, Tina Geweniger, Ulrich Heimann, Thomas Villmann

Abstract
In this paper we show a straightforward extension of the fuzzy Cohen's κ to a fuzzy Fleiss' κ for the determination of classification agreement between fuzzy classifiers. In addition we investigate the influence of different interpretations of fuzzy intersection in terms of t-norms. These considerations are applied to exemplary artificial data as well as to classification in image recognition for counting pollen grains.

Manuscript from author [PDF]

ES2009-23

Lukasiewicz fuzzy logic networks and their ultra low power hardware implementation

Rafal Dlugosz, Witold Pedrycz

Abstract
In this paper, we propose a new category of current-mode Lukasiewicz OR and AND logic neurons and logic networks and show their ultra low power realization. The introduced circuits can operate with very low input signals that set up the operating point of transistors in the subthreshold region. In this region, the mismatch between transistors has much stronger impact on the current mirror gain than in the strong inversion region. The proposed solution minimizes this problem by reducing the number of current mirrors between the input and output of the neuron to only one.

Manuscript from author [PDF]

ES2009-97

Simultaneous Clustering and Segmentation for Functional Data

Bernard Hugueney, Georges Hébrail, Yves Lechevallier, Fabrice Rossi

Abstract
We propose in this paper an exploratory analysis algorithm for functional data. The method partitions a set of functions into K clusters and represents each cluster by a piecewise constant prototype. The total number of segments in the prototypes, P, is chosen by the user and optimally distributed into the clusters via two dynamic programming algorithms.

Manuscript from author [PDF]
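For a single function, fitting a fixed budget of P contiguous segments, each represented by its mean, is the classic piecewise-constant segmentation problem that dynamic programming solves exactly. The sketch below illustrates only that single-function recursion; the authors' method additionally distributes the P segments across K cluster prototypes, which this sketch does not do:

```python
import numpy as np

def best_segmentation(y, P):
    """Optimal split of series y into P contiguous segments, each approximated
    by its mean, minimising total squared error (O(P * n^2) dynamic program)."""
    n = len(y)
    c1 = np.concatenate(([0.0], np.cumsum(y)))            # prefix sums
    c2 = np.concatenate(([0.0], np.cumsum(np.square(y))))

    def sse(i, j):  # squared error of y[i:j] around its own mean
        s, s2, m = c1[j] - c1[i], c2[j] - c2[i], j - i
        return s2 - s * s / m

    dp = np.full((P + 1, n + 1), np.inf)   # dp[k][j]: best cost of y[:j] with k segments
    cut = np.zeros((P + 1, n + 1), dtype=int)
    dp[0][0] = 0.0
    for k in range(1, P + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):      # last segment is y[i:j]
                cost = dp[k - 1][i] + sse(i, j)
                if cost < dp[k][j]:
                    dp[k][j], cut[k][j] = cost, i
    bounds, j = [n], n                     # backtrack the segment boundaries
    for k in range(P, 0, -1):
        j = cut[k][j]
        bounds.append(j)
    return dp[P][n], bounds[::-1]

y = np.array([0., 0., 0., 5., 5., 5., 2., 2.])
err, bounds = best_segmentation(y, 3)
```

On this toy series the three constant plateaus are recovered exactly, with zero approximation error.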

[Back to Top]


Neurosciences


ES2009-50

Cerebellum and spatial cognition: A connectionist approach

Jean-Baptiste Passot, Laure Rondi-Reig, Angelo Arleo

Abstract
A large body of experimental and theoretical work has investigated the role of the cerebellum in adaptive motor control, movement coordination, and Pavlovian conditioning. Recent experimental findings have also begun to unravel the implication of the cerebellum in high-level functions such as spatial cognition. We focus on behavioural genetic data suggesting that cerebellar long-term plasticity may mediate the procedural component of spatial learning. We present a spiking neural network model of the cerebellar microcomplex that reproduces these experimental findings. The model brings forth a testable prediction about the interaction between the neural substrates subserving procedural and declarative spatial learning.

Manuscript from author [PDF]

ES2009-125

A neural model for binocular vergence control without explicit calculation of disparity

Agostino Gibaldi, Manuela Chessa, Andrea Canessa, Silvio P. Sabatini, Fabio Solari

Abstract
A computational model for the control of horizontal vergence, based on a population of disparity-tuned complex cells, is presented. The model directly extracts the disparity-vergence response by combining the outputs of the disparity detectors without explicit calculation of the disparity map. The resulting vergence control yields stable fixation and has a short response time over a wide range of disparities. Experimental simulations with synthetic stimuli in depth validate the approach.

Manuscript from author [PDF]

[Back to Top]


Weightless neural systems


ES2009-6

A brief introduction to Weightless Neural Systems

Igor Aleksander, Massimo De Gregorio, Felipe França, Priscila Lima, Helen Morton

Abstract
Mimicking biological neurons by focusing on the excitatory/inhibitory decoding performed by the dendritic trees is a different and attractive alternative to the integrate-and-fire McCulloch-Pitts neuron stylisation. In this alternative analogy, neurons can be seen as a set of RAM nodes addressed by Boolean inputs and producing Boolean outputs. The shortening of the semantic gap between the synaptic-centric model introduced by the McCulloch-Pitts neuron and the dominating, binary digital, computational environment is among the interesting benefits of the weightless neural approach. This paper presents an overview of the most representative paradigms of weightless neural systems and corresponding applications, at abstraction levels ranging from pattern recognition to artificial consciousness.

Manuscript from author [PDF]

ES2009-28

Phenomenal weightless machines

Igor Aleksander, Helen Morton

Abstract
This paper describes how early designs of dynamic weightless neural systems were developed to enable some of the states of a state structure to have a phenomenal character. Such states reflect the features of a sensory reality and allow the storage of aspects of sensory experience and access to it. The ‘machine consciousness’ paradigm is summarised in this paper. The paper concludes with a description of the current state-of-the-art of a phenomenal approach to a model of consciousness which is based on the first of a set of introspective axioms.

Manuscript from author [PDF]

ES2009-116

Extracting fuzzy rules from “mental” images generated by a modified WISARD perceptron

Bruno Grieco, Priscila Lima, Massimo De Gregorio, Felipe França

Abstract
The pioneering WISARD weightless neural classifier is based on the collective response of RAM-based neurons. The ability to produce prototypes, analogous to “mental images”, from learned categories was first introduced in the DRASIW model. By counting the writing accesses to each RAM neuron content during the training phase, it is possible to associate the most frequently accessed contents with the corresponding input field addresses that defined them. This work is about extracting information from such frequency/location counting in the form of fuzzy rules, as an alternative way to describe the same mental images produced by DRASIW as logical prototypes.

Manuscript from author [PDF]

ES2009-10

FPGA-based enhanced probabilistic convergent weightless Network for human Iris recognition

Pierre Lorrentz, Gareth Howells, Klaus McDonald-Maier

Abstract
This paper investigates how human identification and identity verification can be performed by applying an FPGA-based weightless neural network, the Enhanced Probabilistic Convergent Neural Network (EPCN), to the iris biometric modality. The human iris is processed for feature vectors which are employed for the formation of connectivity during learning and subsequent recognition. The pre-processing of the iris, prior to EPCN training, is very minimal. Structural modifications were also made to the Random Access Memory (RAM) based neural network which enhance its robustness when applied in real time.

Manuscript from author [PDF]

ES2009-138

Novel Modular Weightless Neural Architectures for Biometrics-based Recognition

Konstantinos Sirlantzis, Gareth Howells, Bogdan Gherman

Abstract
We introduce a novel weightless artificial neural architecture based on multiple classifier systems. In this, different modules of a network specialise in recognising specific classes of a multiclass recognition task. Each of these modules comprises individual RAM addresses which store frequency-based probabilistic estimates of how likely it is to observe this pattern as a feature of the training examples available from a particular class. The class-wise likelihood of observing a combination of addresses for each class is calculated as a sum-based scheme (one of the most commonly used multi-classifier fusion methods). The classification decision is finally obtained by choosing the class with the highest pseudo-posterior probability for an address combination. Tests of our system on a face recognition problem using Minchinton cell encoding for mapping regions of interest (ROIs) to the network’s input layer showed very encouraging results.

Manuscript from author [PDF]
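The general idea in the abstract above, RAM nodes addressed by n-tuples of a binary input, storing training-frequency estimates that are fused by a sum rule and decided by arg-max, can be sketched generically. This is a WiSARD-style illustration with invented toy data, not the authors' architecture (no Minchinton-cell encoding, no face ROIs):

```python
import numpy as np
from collections import defaultdict

class FrequencyDiscriminator:
    """One discriminator per class: RAM nodes addressed by n-tuples of the
    binary input, counting how often each address was seen during training."""
    def __init__(self, tuples):
        self.tuples = tuples                        # fixed index groups of the input
        self.rams = [defaultdict(int) for _ in tuples]
        self.seen = 0

    def train(self, bits):
        self.seen += 1
        for ram, idx in zip(self.rams, self.tuples):
            ram[tuple(bits[idx])] += 1

    def score(self, bits):
        # Sum of per-node relative frequencies (sum-rule fusion).
        return sum(ram[tuple(bits[idx])] / self.seen
                   for ram, idx in zip(self.rams, self.tuples))

rng = np.random.default_rng(2)
n_bits, tuple_size = 16, 4
order = rng.permutation(n_bits)                    # random but fixed input mapping
tuples = [order[i:i + tuple_size] for i in range(0, n_bits, tuple_size)]

def sample(p):                                     # toy binary patterns
    return (rng.random(n_bits) < p).astype(int)

# Two toy classes: mostly-zeros (p=0.2) vs mostly-ones (p=0.8) bit patterns.
discs = {0: FrequencyDiscriminator(tuples), 1: FrequencyDiscriminator(tuples)}
for _ in range(200):
    discs[0].train(sample(0.2))
    discs[1].train(sample(0.8))

def classify(bits):
    # Pick the class whose summed address frequencies are highest.
    return max(discs, key=lambda c: discs[c].score(bits))
```

Normalising the counts by the number of training examples turns each RAM content into a rough per-node probability estimate before the sum-rule fusion.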

ES2009-101

Quantum RAM Based Neural Networks

Wilson de Oliveira

Abstract
A mathematical quantisation of a Random Access Memory (RAM) is proposed, starting from its matrix representation. This quantum RAM (q-RAM) is employed as the neural unit of q-RAM-based Neural Networks (q-RbNN), which can be seen as the quantisation of the corresponding RAM-based ones. The models proposed here are directly realisable in quantum circuits and have a natural adaptation of the classical learning algorithms, offering physical feasibility of quantum learning in contrast to what has been proposed in the literature.

Manuscript from author [PDF]

[Back to Top]


Learning II


ES2009-24

Comparison between linear discrimination analysis and support vector machine for detection of pesticide on spinach leaf by hyperspectral imaging with excitation-emission matrix

Mizuki Tsuta, Gamal El Masry, Takehiro Sugiyama, Kaori Fujita, Junichi Sugiyama

Abstract
The performances of a support vector machine (SVM) and linear discriminant analysis (LDA) for the detection of pesticide on spinach leaves were investigated. Fluorescence images of spinach leaves without any treatment, treated with pure water, and treated with methamidophos solution were taken under 561 different wavelength conditions to acquire hyperspectral excitation-emission matrix (EEM) data. LDA and SVM were then applied to the EEMs of pixels randomly sampled from the data to classify the treatment. Misclassification rates for LDA and SVM were 18.8% and 9.9%, respectively. It was also found that methamidophos-treated leaves could be distinguished visually from the others after SVM was applied to each pixel of the hyperspectral EEM data.

Manuscript from author [PDF]

ES2009-40

SVM-based learning method for improving colour adjustment in automotive basecoat manufacturing

Francisco J. Ruiz, Nuria Agell, Cecilio Angulo

Abstract
A new iterative method based on Support Vector Machines to perform automated colour adjustment in the automotive industry is proposed in this paper. The iterative methodology relies on an SVM trained with patterns provided by expert colourists and on an action-generator module. The SVM enables selecting the most adequate action in each step of an iterated feed-forward loop until the final state satisfies colourimetric bounding conditions. Both the encouraging results obtained and the significant reduction of non-conformance costs justify further industrial efforts to develop an automated software tool for this and similar industrial processes.

Manuscript from author [PDF]

ES2009-60

Application of SVM for cell recognition in BCC skin pathology

Tomasz Markiewicz, Stanislaw Osowski, Cezary Jochymski, Joanna Narbutt, Wojciech Kozlowski

Abstract
The paper presents the application of Support Vector Machines (SVM) to the recognition of immunopositive and immunonegative cells in basal cell carcinoma. The developed algorithm applies two kinds of SVM: a Gaussian kernel SVM for direct cell recognition and a linear kernel SVM as a preprocessing stage for sequential thresholding of the image. The developed computer program was tested on 528 images of carcinoma, and the obtained results are in good agreement with the human expert score.

Manuscript from author [PDF]

ES2009-87

A neural network model of landmark recognition in the fiddler crab, Uca lactea

Hyunggi Cho, DaeEun Kim

Abstract
Fiddler crabs, Uca lactea, which live on intertidal mudflats, exhibit a remarkable ability to return to their burrows. It has been reported that the species usually uses path integration, an ideothetic mechanism, for short-range homing. During the mating season, however, the accumulated error of this process increases due to vigorous courtship movements. To compensate, most courting males construct vertical mud structures, called semidomes, at the entrance of their burrows and use them as landmarks. Here, we suggest a possible neural model that demonstrates how visual landmark navigation could be implemented in the fiddler crab's central nervous system. The model, consisting of two levels of neuron populations, is based on the snapshot hypothesis; a simplified version of Franz's algorithm is used for the computation of the home vector.

Manuscript from author [PDF]

ES2009-82

Classification of high-dimensional data for cervical cancer detection

Charles Bouveyron, Camille Brunet, Vincent Vigneron

Abstract
In this paper, the performance of different generative methods for the classification of cervical nuclei is compared in order to detect cancer of the cervix. These methods include classical approaches, such as Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Mixture Discriminant Analysis (MDA), and a recently developed high-dimensional approach (HDDA). The classification of cervical nuclei presents two main statistical issues, a scarce population and high-dimensional data, which impact the ability to successfully discriminate the different classes. This paper presents an approach to face these problems of unbalanced data and high dimensionality.

Manuscript from author [PDF]

ES2009-36

Sparse support vector machines by kernel discriminant analysis

Kazuki Iwamura, Shigeo Abe

Abstract
We discuss sparse support vector machines (SVMs) obtained by selecting linearly independent data in the empirical feature space. First we select training data that maximally separate the two classes in the empirical feature space. As a selection criterion we use linear discriminant analysis in the empirical feature space and select training data by forward selection. The SVM is then trained in the empirical feature space spanned by the selected training data. We evaluate our method by computer experiments and show that it can realize sparse SVMs with generalization performance comparable to that of regular SVMs.

Manuscript from author [PDF]

ES2009-111

Embedding Proximal Support Vectors into Randomized Trees

Cedric Simon, Christophe De Vleeschouwer, Jérôme Meessen

Abstract
By embedding multiple proximal SVM classifiers into a binary tree architecture, it is possible to turn an arbitrary multi-class problem into a hierarchy of binary classifications. The critical issue then consists in determining, in each node of the tree, how to aggregate the multiple classes into a pair of so-called overlay classes to discriminate. As a fundamental contribution, our paper proposes to deploy an ensemble of randomized trees, instead of a single optimized decision tree, to bypass the question of overlay class definition. Empirical results on various datasets demonstrate a significant gain in accuracy compared both to 'one versus one' SVM solutions and to conventional ensembles of decision tree classifiers.

Manuscript from author [PDF]

ES2009-99

Echo State networks and Neural network Ensembles to predict Sunspots activity

Friedhelm Schwenker, Amr Labib

Abstract
Echo state networks (ESN) and ensembles of neural networks are developed for the prediction of the monthly sunspot series. An echo state network and a multilayer perceptron approach were used within the neural network ensembles. Through numerical evaluation on this data it is shown that ESN outperform feedforward MLP. Furthermore, it is shown that median fusion leads to robust predictors and can even improve the prediction accuracy of the best individual predictors.

Manuscript from author [PDF]

ES2009-96

Monotonic Recurrent Bounded Derivative Neural Network

Alexey Minin, Bernhard Lang

Abstract
Neural networks applied in control loops and safety-critical domains have to meet hard requirements: first of all, a small approximation error; then, the smoothness and monotonicity of selected input-output relations; and finally, time dependencies in time series should be captured by the model. Otherwise the stability of the control laws can be lost. A new Monotonic Recurrent Bounded Derivative Network (RBDN), built on the basis of the Bounded Derivative Network (BDN), is considered. The authors compare the two networks and investigate the influence of the feedback connection in the recurrent network, as well as the stability and monotonicity of the new network.

Manuscript from author [PDF]

ES2009-73

Modeling pigeon behavior using a Conditional Restricted Boltzmann Machine

Matthew Zeiler, Graham Taylor, Nikolaus Troje, Geoffrey Hinton

Abstract
In an effort to better understand the complex courtship behaviour of pigeons, we have built a model learned from motion capture data. We employ a Conditional Restricted Boltzmann Machine with binary latent features and real-valued visible units. The units are conditioned on information from previous time steps to capture dynamics. We validate a trained model by quantifying the characteristic "head-bobbing" present in pigeons. We also show how to predict missing data by marginalizing out the hidden variables and minimizing free energy.

Manuscript from author [PDF]

ES2009-22

Connection strategy and performance in sparsely connected 2D associative memory models with non-random images

Lee Calcraft, Rod Adams, Neil Davey

Abstract
A sparsely connected associative memory model is tested with different pattern sets, and it is found that pattern recall is highly dependent on the type of patterns used. Performance is also found to depend critically on the connection strategy used to build the networks. Comparisons of topology reveal that connectivity matrices based on Gaussian distributions perform well for all pattern types tested, and that, for the best pattern recall at low wiring costs, the optimal width of the Gaussian used in creating the connection matrix depends on properties of the pattern set.

Manuscript from author [PDF]

ES2009-44

Zero phase-lag synchronization through short-term modulations

Thomas Burwick

Abstract
Considering coupled phase-model oscillator systems with non-identical time delays, we study the possibility of close-to-zero phase-lag synchronization (ZPS) without frequency depression (FD). FD refers to nearly vanishing frequencies of the synchronized oscillators (in comparison to the intrinsic frequencies); its absence is crucial for interpretations related to brain dynamics. Discussing an extension of the Kuramoto model, it is demonstrated that ZPS without FD may arise by allowing for dynamical parameters. Two models are presented: one is based on short-term modulation of the delays, while the other assumes static delays but short-term modulation of coupling strengths. We also speculate on the possible relevance of such mechanisms with respect to assembly formation by relating the frequency of the synchronized oscillation to recently proposed pattern frequency bands.

Manuscript from author [PDF]

ES2009-48

Learning reconstruction and prediction of natural stimuli by a population of spiking neurons

Michael Gutmann, Aapo Hyvärinen

Abstract
We propose a model for learning representations of time dependent data with a population of spiking neurons. Encoding is based on a standard spiking neuron model, and the spike timings of the neurons represent the stimulus. Learning is based on the sole principle of maximization of representation accuracy: the stimulus can be decoded from the spike timings with minimum error. Since the encoding is causal, we propose two different representation strategies: The spike timings represent the stimulus either in a predictive manner or by reconstructing past input. We apply the model to speech data and discuss differences between the emergent representations.

Manuscript from author [PDF]

[Back to Top]


Brain Computer Interfaces: from theory to practice


ES2009-5

Brain-Computer Interfaces: from theory to practice

Dieter Devlaminck, Bart Wyns, Luc Boullart, Patrick Santens, Georges Otte

Abstract
Brain-Computer Interfaces (BCI) are a new kind of human-machine interface emerging on the horizon. They form a communication pathway between the brain and a machine, achieved by measuring brain signals and translating them directly into control commands. Such a system allows people with severe motor disabilities to manipulate their environment in an alternative way. However, there is still a lot of work to be done to make BCIs usable in daily life. In this contribution we give a tutorial overview of existing methods and possible applications.

Manuscript from author [PDF]

ES2009-64

Oscillation in a network model of neocortex

Wim van Drongelen, Hyong Lee, Amber Martell, Jennifer Dwyer, Rick Stevens, Mark Hereld

Abstract
A basic understanding of the relationships between the activity of individual neurons and macroscopic electrical activity of local field potentials or electroencephalogram (EEG) may provide guidance for experimental design in neuroscience, improve the development of therapeutic approaches in neurology, and offer opportunities for computer aided design of brain-computer interfaces. Here, we study the relationship between subthreshold resonant properties of cortical neurons and the onset and offset of network oscillations in a computational model of neocortex. This model includes two types of pyramidal cells and four types of inhibitory interneurons and is capable of generating network oscillations and bursting activity. Our findings suggest that neuronal resonance is associated with subthreshold oscillation of neurons. This subthreshold behavior affects spike timing and therefore plays a significant role in the generation of the network’s extracellular currents as reflected in the EEG. In addition, we find that electrical stimulation to stop bursting in a network is most effective around the resonant frequency of the neurons.

Manuscript from author [PDF]

ES2009-51

Sensors selection for P300 speller brain computer interface

Bertrand Rivet, Antoine Souloumiac, Guillaume Gibert, Virginie Attina, Olivier Bertrand

Abstract
Brain-computer interfaces (BCI) are communication systems that use brain activity to control a device. The BCI studied here is based on the P300 speller [1]. A new algorithm to select relevant sensors is proposed: it is based on a previously proposed algorithm [2] used to enhance P300 potentials by spatial filters. Data recorded from three subjects were used to evaluate the proposed selection method: it is shown to be efficient and to compare favourably with a reference method [3].

Manuscript from author [PDF]

ES2009-139

Multiclass brain computer interface based on visual attention

Rolando Grave de Peralta Menendez, Jorge Dias, José Augusto Soares Prado, Hadi Aliakbarpour, Sara Gonzalez Andino

Abstract
Recent public demonstrations showed that a system based on imagination does not always work [1]. On the other hand, predicting limb movement from scalp activity has proved to be hazardous [2], and thus other alternatives are needed. This paper describes the asynchronous Geneva BCI, based on EEG and visual attention to external stimuli, which is able to send commands every 0.5 (or 0.25) seconds with very high (98.88%) correct classification rates and an optimal (178 bits/min) theoretical bit rate. This high performance allows for the distant real-time control of robots using four commands.

Manuscript from author [PDF]

ES2009-102

Brain Computer Interface for Virtual Reality Control

Christoph Guger, Clemens Holzner, Christoph Groenegress, Günter Edlinger, Mel Slater

Abstract
An electroencephalogram (EEG) based brain-computer interface (BCI) was connected to a Virtual Reality system in order to control a smart home application. For this purpose, special control masks were developed which allow using the P300 component of the EEG as the input signal for the BCI system. Control commands for switching TV channels, opening and closing doors and windows, navigation and conversation were realized. Experiments with 12 subjects were conducted to investigate the speed and accuracy that can be achieved if several hundred commands are used to control the smart home environment. The study clearly shows that such a BCI system can be used for smart home control. The Virtual Reality approach is a very cost-effective way of testing the smart home environment together with the BCI system.

Manuscript from author [PDF]

ES2009-20

The Possibility of Single-trial Classification of Viewed Characters using EEG Waveforms

Minoru Nakayama, Hiroshi Abe

Abstract
Electroencephalograms (EEGs) contain responses to visual stimuli; however, signal noise often prevents these from being easily obtained. To classify EEG waveforms, a signal-processing procedure exploiting the relationship between the EEG and the ERP, which is the summation of EEG waveforms, was developed. The processing technique involves predicting signals using Support Vector Regression. The procedure was developed and applied to a Kanji recognition task, classifying viewed characters as symbols or Kanji. Classification accuracy using EEG waveforms with and without ERP references was compared: accuracy with references was significantly above chance and higher than that obtained without references.

Manuscript from author [PDF]

ES2009-52

Exploring the impact of alternative feature representations on BCI classification

Ali Bahramisharif, Marcel van Gerven, Tom Heskes

Abstract
Classification performance in BCIs depends heavily on the features that are used as input to the employed classifier. If the BCI signal is extended in time, we may use as input to the classifier either a representation of the signal at multiple time segments, with a high risk of overfitting, or a representation averaged over time, with a high risk of underfitting. In this paper we present an empirical study which allows us to determine the right balance between these two representations. Using two BCI data sets, we show that our method can significantly improve classification performance.
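The two representations contrasted in this abstract can be illustrated with a minimal sketch (not from the paper; the array sizes and the equal-width segmenting scheme are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical trial: 32 channels x 200 time samples (sizes are illustrative).
trial = rng.standard_normal((32, 200))

# Representation 1: split the window into segments and concatenate the
# per-segment channel means. Richer temporal detail, higher dimension,
# hence a higher risk of overfitting.
n_segments = 5
segments = np.array_split(trial, n_segments, axis=1)
segmented_features = np.concatenate([s.mean(axis=1) for s in segments])

# Representation 2: average each channel over the whole window.
# Compact, but discards temporal structure, hence a risk of underfitting.
averaged_features = trial.mean(axis=1)

print(segmented_features.shape)  # (160,) = 32 channels x 5 segments
print(averaged_features.shape)   # (32,)
```

The paper's contribution is an empirical way to pick a point between these two extremes; the sketch only shows the extremes themselves.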

Manuscript from author [PDF]

ES2009-77

Uncued brain-computer interfaces: a variational hidden markov model of mental state dynamics

Cédric Gouy-Pailler, Jérémie Mattout, Marco Congedo, Christian Jutten

Abstract
This paper describes a method to improve uncued Brain-Computer Interfaces based on motor imagery. Our algorithm aims at filtering the continuous classifier output by incorporating prior knowledge about the mental state dynamics. On dataset IVb of BCI competition III, we compare the performance of four methods, combining either the smoothed probabilities filtered by our algorithm or the direct classifier output with either a static or a dynamic classifier. We demonstrate that the combination of our algorithm with a dynamic classifier yields the best results.

Manuscript from author [PDF]

ES2009-93

Decoding finger flexion using amplitude modulation from band-specific ECoG

Nanying Liang, Laurent Bougrain

Abstract
EEG-BCIs have been well studied over the past decades and implemented in several well-known applications, such as the P300 speller and wheelchair control. However, these interfaces are indirect owing to the low spatial resolution of EEG. Recently, direct ECoG-BCIs have attracted intensive attention because ECoG provides higher spatial resolution and signal quality, making it possible to localize the sources of neural signals associated with particular brain functions. In this article, we present a realization of an ECoG-BCI for finger flexion prediction on the data provided by BCI competition IV. Methods for finger flexion prediction, including feature extraction and selection, are described. Results show that the predicted finger movement is highly correlated with the true movement when band-specific amplitude modulation is used.

Manuscript from author [PDF]

ES2009-103

Neural network pruning for feature selection - Application to a P300 Brain-Computer Interface

Hubert Cecotti, Axel Graeser

Abstract
A Brain-Computer Interface (BCI) is an interface that enables direct communication between humans and machines by analyzing brain measurements. A P300 speller is based on the oddball paradigm, which generates an event-related potential (ERP), such as the P300 wave, on targets selected by the user. The detection of these P300 waves allows characters to be selected visually on the screen. We present a new model for the detection of P300 waves. The technique is based on a neural network that uses convolution layers to create channels. One challenge in improving BCIs pragmatically is to reduce the number of electrodes and to select the best electrodes with respect to the subject's particularities. We propose a feature selection strategy based on salient connections in the first hidden layer of a neural network trained with all electrodes as input. A new classifier is created from the remaining topology and the desired number of electrodes. The recognition rate of the P300 speller over 2 subjects is 87% when only 8 electrodes are considered.

Manuscript from author [PDF]

ES2009-115

Augmenting Information from Brain-Computer Interfaces through Bayesian Plan Recognition

Eric Demeester, Alexander Huntemann, Jose del R. Millan, Hendrik Van Brussel

Abstract
For severely disabled people, Brain-Computer Interfaces (BCIs) may provide the means to regain mobility and manipulation capabilities. However, information obtained from current BCIs is uncertain and of limited bandwidth and resolution. This paper presents a Bayesian framework that estimates from uncertain BCI signals a richer representation of the task a robotic mobility or manipulation device should execute, such that these devices can be operated more safely, accurately and efficiently. The framework has been evaluated on a simulated robotic wheelchair.

Manuscript from author [PDF]

[Back to Top]


Generative and bayesian models


ES2009-57

Heterogeneous mixture-of-experts for fusion of locally valid knowledge-based submodels

Jörg Beyer, Kai Heesche, Werner Hauptmann, Clemens Otte

Abstract
Real-world applications often require the joint use of data-driven and knowledge-based models. While data-driven models are learned from available process data, knowledge-based models are able to provide additional information not contained in the data. In this contribution, we propose a method to divide the input space on the basis of the validity ranges of the knowledge-based models. In this way, the knowledge-based models are active only in the domains they were designed for, while the data-driven models complete the coverage of the input space. We demonstrate the benefits of our approach on a real-world application for the energy management of a hybrid electric vehicle.

Manuscript from author [PDF]

ES2009-107

Dirichlet process-based component detection in state-space models

Botond Bocsi, Lehel Csato

Abstract
An extension of switching-state models that allows an arbitrary number of components is presented. We introduce a Dirichlet process prior over the mixture components of the linear models. This prior allows inference on the number of linear models to be included in the mixture. We develop a distance measure in the space of linear Kalman filters based on the Kullback-Leibler divergence between the conditional probabilities induced by the individual Kalman filters. The introduced distance measure allows components that are no longer relevant to be removed, making the algorithm more effective. We test the proposed algorithm on both artificial and real-world data.

Manuscript from author [PDF]

ES2009-29

A variational radial basis function approximation for diffusion processes

Michail Vrettas, Dan Cornford, Yuan Shen

Abstract
In this paper we present a radial basis function based extension to a recently proposed variational algorithm for approximate inference in diffusion processes. Inference for the state and, in particular, the (hyper-)parameters of diffusion processes is a challenging and crucial task. We show that the new radial basis function approximation based algorithm converges to the original algorithm and has beneficial characteristics when estimating (hyper-)parameters. We validate our new approach on a non-linear double well potential dynamical system.

Manuscript from author [PDF]

ES2009-106

A regression model with a hidden logistic process for signal parametrization

Faicel Chamroukhi, Allou Samé, Gérard Govaert, Patrice Aknin

Abstract
A new approach for signal parametrization, which consists of a specific regression model incorporating a discrete hidden logistic process, is proposed. The model parameters are estimated by the maximum likelihood method, performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. An experimental study using simulated and real data reveals the good performance of the proposed approach.

Manuscript from author [PDF]

[Back to Top]


Neural maps and learning vector quantization - theory and applications


ES2009-8

Neural Maps and Learning Vector Quantization - Theory and Applications

Frank-Michael Schleif, Thomas Villmann

Abstract
Neural maps and Learning Vector Quantization are fundamental paradigms in neural vector quantization based on Hebbian learning. The field dates back over twenty years, with strong progress in theory and outstanding applications. Their success lies in their robustness and simplicity of application, whereas the underlying mathematics is rather difficult. We provide an overview of recent achievements and current trends in ongoing research.

Manuscript from author [PDF]

ES2009-85

Hyperparameter Learning in Robust Soft LVQ

Petra Schneider, Michael Biehl, Barbara Hammer

Abstract
We present a technique to extend Robust Soft Learning Vector Quantization (RSLVQ). This algorithm is derived from an explicit cost function and follows the dynamics of a stochastic gradient ascent. The RSLVQ cost function involves a hyperparameter which is kept fixed during training. We propose to adapt the hyperparameter based on the gradient information. Experiments on artificial and real life data show that the hyperparameter crucially influences the performance of RSLVQ. However, it is not possible to estimate the best value from the data prior to learning. We show that the proposed variant of RSLVQ is very robust with respect to the choice of the hyperparameter.

Manuscript from author [PDF]

ES2009-66

Median Variant of Fuzzy c-Means

Tina Geweniger, Dietlind Zühlke, Barbara Hammer, Thomas Villmann

Abstract
In this paper we introduce Median Fuzzy C-Means (M-FCM). This algorithm extends the Median C-Means (MCM) algorithm by allowing fuzzy values for the cluster assignments. To evaluate the performance of M-FCM we compare the results with the clustering obtained by employing MCM and Median Neural Gas (MNG).

Manuscript from author [PDF]

ES2009-108

Topologically Ordered Graph Clustering via Deterministic Annealing

Fabrice Rossi, Nathalie Villa

Abstract
This paper proposes an organized generalization of Newman and Girvan's modularity measure for graph clustering. Optimized via a deterministic annealing scheme, this measure produces topologically ordered graph partitions that lead to faithful and readable graph representations.

Manuscript from author [PDF]

ES2009-39

Equilibrium properties of off-line LVQ

Aree Witoelar, Michael Biehl, Barbara Hammer

Abstract
The statistical physics analysis of off-line learning is applied to cost function based learning vector quantization (LVQ) schemes. Typical learning behavior is obtained from a model with data drawn from high-dimensional Gaussian mixtures and a system of two or three prototypes. The analytic approach becomes exact in the limit of high training temperature. We study two cost function related LVQ algorithms and the influence of an appropriate weight decay. In our findings, learning from mistakes (LFM) achieves poor generalization ability, while a limiting case of generalized LVQ (GLVQ), termed LVQ+/-, displays much better performance with a properly chosen weight decay.

Manuscript from author [PDF]

ES2009-49

Kernelizing Vector Quantization Algorithms

Matthieu Geist, Olivier Pietquin, Gabriel Fricout

Abstract
The kernel trick is a well-known approach that implicitly casts a linear method into a nonlinear one by replacing every dot product with a kernel function. However, few vector quantization algorithms have been kernelized: they usually require computing linear transformations (e.g. moving prototypes), which are not easily kernelizable. This paper introduces the Kernel-based Vector Quantization (KVQ) method, which allows working in an approximation of the feature space and thus kernelizing any Vector Quantization (VQ) algorithm.
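For reference, the identity that makes distance-based steps of vector quantization kernelizable: squared distances in feature space can be computed from kernel evaluations alone. A generic sketch of winner selection using this identity (not the paper's KVQ method; the RBF kernel and its parameter are illustrative choices):

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))

def feature_space_sq_dist(x, y, kernel=rbf_kernel):
    """Squared distance in feature space via kernel evaluations only:
    ||phi(x) - phi(y)||^2 = k(x,x) - 2 k(x,y) + k(y,y)."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

# Winner selection in a VQ scheme then needs no explicit feature map:
x = [0.2, 1.0]
prototypes = [[0.0, 0.0], [0.5, 1.0], [2.0, 2.0]]
winner = min(range(len(prototypes)),
             key=lambda i: feature_space_sq_dist(x, prototypes[i]))
print(winner)  # → 1
```

The difficulty the abstract points to is the update step: moving a prototype in feature space has no pre-image in input space in general, which is what KVQ addresses.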

Manuscript from author [PDF]

ES2009-134

A computational framework for exploratory data analysis

Axel Wismueller

Abstract
We introduce the Exploration Machine (Exploratory Observation Machine, XOM) as a novel, versatile method for the analysis of multidimensional data. XOM systematically inverts structural and functional components of so-called topology-preserving mappings. It offers surprising flexibility, contributing simultaneously to complementary domains of unsupervised learning for exploratory pattern analysis, namely structure-preserving dimensionality reduction and data clustering. We demonstrate XOM's applicability to synthetic and real-world data.

Manuscript from author [PDF]

[Back to Top]


Learning III


ES2009-92

SOM based methods in early fault detection of nuclear industry

Miki Sirola, Jaakko Talonen, Golan Lampi

Abstract
Early fault detection in the nuclear industry is studied. Tools have been developed for control room operators and experts in an industrial project. The Self-Organizing Map (SOM) method has been used in combination with other methods, and decision support visualizations are introduced. The usability of the methods has been tested and verified by constructing prototype systems. The use of the SOM method in dynamic systems is discussed and applications for the industrial domain are presented. Data sets from a Finnish nuclear power plant have been analyzed, and promising results in failure management are achieved.

Manuscript from author [PDF]

ES2009-131

Projection of undirected and non-positional graphs using Self Organizing Maps

Markus Hagenbuchner, ShuJia Zhang, Ah Chung Tsoi, Alessandro Sperduti

Abstract
Kohonen's Self-Organizing Map is a popular method which allows the projection of high dimensional data onto a low dimensional display space. Models of Self-Organizing Maps for the treatment of graphs have also been defined and studied. This paper proposes an extension to the GraphSOM model which substantially improves the stability of the model, and, as a side effect, allows for an acceleration of training. The proposed extension is based on a soft encoding of the information needed to represent the vertices of an input graph. Experimental results versus the original GraphSOM model demonstrate the advantages of the proposed extension.

Manuscript from author [PDF]

ES2009-30

Hardware Implementation Issues of the Neighborhood Mechanism in Kohonen Self Organized Feature Maps

Marta Kolasa, Rafal Dlugosz

Abstract
In this paper, we discuss an important problem: the selection of the neighborhood radius in the learning schemes of the Winner Takes Most Kohonen neural network. Optimizing this parameter is essential for hardware realizations of the network, since lower values of the radius can significantly reduce both the power dissipation and the chip area, even by 40-60%, which is important when such networks are applied in low-power devices. The simulation studies reveal that large initial values of the neighborhood radius are usually not optimal. For a wide range of training parameters, optimal values of the neighborhood radius, usually small ones, can be identified that minimize the quantization error.
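The effect of the neighborhood radius can be sketched with a standard Gaussian neighborhood function, a generic SOM ingredient rather than the paper's hardware model (the 1-D grid size and the significance threshold are illustrative assumptions):

```python
import numpy as np

def gaussian_neighborhood(winner_idx, grid, radius):
    """Gaussian neighborhood h(i) = exp(-d(i, winner)^2 / (2 * radius^2))
    over a 1-D grid of neuron positions."""
    d = np.abs(grid - grid[winner_idx])
    return np.exp(-d**2 / (2.0 * radius**2))

grid = np.arange(10)
wide = gaussian_neighborhood(4, grid, radius=3.0)
narrow = gaussian_neighborhood(4, grid, radius=0.5)

# With a small radius, far fewer neurons receive a significant update,
# which is what reduces power dissipation and chip area in hardware.
print(np.sum(wide > 0.01), np.sum(narrow > 0.01))  # → 10 3
```

In a Winner Takes Most realization the radius directly sets how many neighbor circuits must be activated per input, hence its impact on the hardware budget.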

Manuscript from author [PDF]

ES2009-31

Reconciling neural fields to self-organization

Lucian Alecu, Hervé Frezza-Buet

Abstract
Despite being successfully used in the design of various biologically-inspired applications, the paradigm of dynamic neural fields (DNF) does not seem to have been exploited to its full potential yet. Partly because of the difficulties of a comprehensive theoretical study, essential aspects such as learning mechanisms have rarely been addressed in the literature. In this paper, we first show that classical DNF equations fail to offer reliable support for self-organization, unveiling behavioural issues that prevent the fields from achieving this goal. As an alternative, we then propose a new DNF equation capable of deploying a genuinely self-organizing mechanism based on neural fields.

Manuscript from author [PDF]

ES2009-126

Applying Mutual Information for Prototype or Instance Selection in Regression Problems

Alberto Guillen, Luis Javier Herrera, Gines Rubio, Héctor Pomares, Amaury Lendasse, Ignacio Rojas

Abstract
The problem of selecting the patterns to be learned by a model is usually treated as a preprocessing step rather than considered when designing the model itself. Information theory provides a robust theoretical framework for input variable selection thanks to the concept of mutual information. This paper presents a new application of mutual information: not to select the variables, but to decide which prototypes should belong to the training data set in regression problems. The proposed methodology decides whether a prototype should belong to the training set using the estimated mutual information between the variables as the criterion. The novelty of the approach is its focus on prototype selection for regression problems instead of classification, since the majority of the literature deals only with the latter. Another element that distinguishes this work is that it is proposed not as an outlier identifier but as an algorithm that determines the best subset of input vectors for building a model. As the experimental section shows, the new method is able to identify a high percentage of the real data set when applied to highly distorted data sets.

Manuscript from author [PDF]

ES2009-43

Forward feature selection using Residual Mutual Information

Erik Schaffernicht, Christoph Möller, Klaus Debes, Horst-Michael Gross

Abstract
In this paper, we propose a hybrid filter/wrapper approach for fast feature selection using the Residual Mutual Information (RMI) between a classifier output and the remaining features as selection criterion. This approach can handle redundancies in the data as well as the bias of the employed learning machine while keeping the number of required evaluation cycles low. In classification experiments, we compare the Residual Mutual Information algorithm with other basic approaches for feature subset selection that use similar selection criteria. The efficiency and effectiveness of our method are demonstrated by the obtained results on UCI datasets.
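As context, the basic filter baseline that such hybrid approaches are compared against, greedy forward selection ranking features by their mutual information with the target, can be sketched as follows (this is the generic criterion, not the paper's Residual Mutual Information; the toy data and the discrete plug-in estimator are illustrative):

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Plug-in estimate of MI between two discrete sequences, in nats."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum((c / n) * np.log((c * n) / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def forward_select(features, target, k):
    """Greedy forward selection: repeatedly add the not-yet-chosen feature
    with the highest MI with the target (plain filter criterion)."""
    chosen = []
    while len(chosen) < k:
        best = max((i for i in range(len(features)) if i not in chosen),
                   key=lambda i: mutual_information(features[i], target))
        chosen.append(best)
    return chosen

y  = [0, 0, 1, 1, 0, 1, 0, 1]
f0 = [0, 0, 0, 1, 1, 0, 1, 1]  # constructed to be independent of y
f1 = list(y)                   # identical to y: maximal MI
f2 = [0, 0, 1, 1, 0, 1, 1, 0]  # y with the last two labels flipped
print(forward_select([f0, f1, f2], y, 2))  # → [1, 2]
```

The Residual Mutual Information approach of the paper differs in that it scores remaining features against a trained classifier's output rather than against the raw target, so redundancy and learner bias are taken into account.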

Manuscript from author [PDF]

ES2009-68

Gaussian Mixture Models for multiclass problems with performance constraints

Nisrine Jrad, Edith Grall-Maes, Pierre Beauseroy

Abstract
This paper proposes a method that uses labelled data to learn a decision rule for multiclass problems with class-selective rejection and performance constraints. The method is based on class-conditional density estimates obtained using Gaussian Mixture Models (GMM). The rule is determined by plugging these estimates into the statistical hypothesis framework and solving an optimization problem. Two simulations are carried out to corroborate the efficiency of the proposed method. Experimental results show that it compares well with a non-parametric solution using the Parzen estimator.

Manuscript from author [PDF]

ES2009-25

On the routing complexity of neural network models - Rent's Rule revisited

Johannes Partzsch, Rene Schüffny

Abstract
In most models of spiking neural networks, routing complexity and scalability have not been taken into account. In this paper, we analyse recent neural network models on their routing complexity, using a method from circuit design known as Rent's Rule. We find a high complexity in most of the models for a wide range of connectivity levels. As a consequence, these models do not scale well in a two- or three-dimensional substrate, such as neuromorphic hardware or the brain.

Manuscript from author [PDF]

[Back to Top]