ESANN2006

14th European Symposium on Artificial Neural Networks
Bruges, Belgium, April 26-27-28


Content of the proceedings

WARNING: you need Adobe Acrobat Reader 7.0 or later to view the PDF files below



Self-organization, vector quantization and clustering


ES2006-142

Unsupervised clustering of continuous trajectories of kinematic trees with SOM-SD

Jochen Steil, Risto Koiva, Alessandro Sperduti

Abstract
We explore the capability of SOM-SD to compress continuous time data recorded from a kinematic tree, which can represent a robot or an artificial stick figure. We compare different encodings of this data as tree or sequence, which preserve to different degrees the structural dependencies introduced by the physical constraints in the model. Besides computing a standard quantization error, we propose a new measure to account for the amount of compression in the temporal domain, based on the correlation between the degree of locality of the tree and the number of winners in the map for this tree. The approach is demonstrated for a stick figure moving in a physics-based simulation world. It turns out that SOM-SD is able to achieve a very exact representation of the data together with reasonable compression if tree encodings rather than sequence encodings are used.

Manuscript from author [PDF]

ES2006-83

Magnification control for batch neural gas

Barbara Hammer, Alexander Hasenfuss, Thomas Villmann

Abstract
It is well known that online neural gas (NG) possesses a magnification exponent different from the information-theoretically optimal one in adaptive map formation. The exponent can be explicitly controlled by a small change of the learning algorithm. Batch NG constitutes a fast alternative optimization scheme for NG vector quantizers which possesses the same magnification factor as standard online NG. In this paper, we propose a method to integrate magnification control by local learning into batch NG by linking magnification control to an underlying cost function. We validate the learning rule in an experimental setting.

Manuscript from author [PDF]
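The batch NG scheme the abstract builds on can be sketched compactly. Below is a minimal NumPy sketch of plain batch neural gas (rank-based neighborhood weights, annealed width), not the magnification-controlled variant the paper proposes; all function and parameter names are illustrative.

```python
import numpy as np

def batch_neural_gas(X, n_prototypes=4, n_epochs=30,
                     lambda0=3.0, lambda_final=0.1, seed=0):
    """Plain batch neural gas: each epoch, rank all prototypes for every
    sample and recompute each prototype as a rank-weighted mean of the data."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_prototypes, replace=False)].astype(float)
    for epoch in range(n_epochs):
        # anneal the neighborhood width from lambda0 down to lambda_final
        lam = lambda0 * (lambda_final / lambda0) ** (epoch / max(1, n_epochs - 1))
        d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)  # (N, P) distances
        ranks = np.argsort(np.argsort(d, axis=1), axis=1)          # 0 = closest prototype
        h = np.exp(-ranks / lam)                                   # neighborhood weights
        W = (h.T @ X) / h.sum(axis=0)[:, None]                     # closed-form batch step
    return W
```

Magnification control as proposed in the paper would additionally reweight this batch mean with a local, density-dependent factor derived from the underlying cost function.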

ES2006-90

Weighted differential topographic function: a refinement of topographic function

Lili Zhang, Erzsebet Merenyi

Abstract
Topology preservation in Self-Organizing Maps (SOMs) is an advantageous property for correct clustering. Among several existing measures of topology violation, this paper studies the Topographic Function (TF) [1]. We find that this measure, demonstrated for low-dimensional data in [1], has a reliable foundation in its distance metric for interpreting the neighborhood relationship in the input space, even for high-dimensional data. Based on the TF, we present a Differential Topographic Function (DTF) to reveal topology violations more clearly and informatively. In addition, a Weighted Differential Topographic Function (WDTF) has been developed. For real-world data, the DTF and WDTF reveal more details than the original TF and help us estimate the quality of topology preservation more accurately.

Manuscript from author [PDF]

ES2006-121

Cluster detection algorithm in neural networks

David Meunier, Hélène Paugam-Moisy

Abstract
Complex networks have received much attention in the last few years; they reveal global properties of interacting systems in domains like biology, social sciences and technology. One of the key features of complex networks is their clustered structure. Most methods applied to study complex networks are based on undirected graphs. However, when considering neural networks, the directionality of links is fundamental. In this article, a cluster detection method is extended to directed graphs. We show that the extended method detects a clustered structure in neural networks more efficiently, without a significant increase in computational cost.

Manuscript from author [PDF]
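The central quantity when extending cluster (community) detection to directed graphs is a directed modularity, in which each node's out-degree and in-degree play separate roles. The sketch below implements the standard directed generalization of Newman's Q; it is illustrative and not necessarily the exact measure used in the paper.

```python
import numpy as np

def directed_modularity(A, communities):
    """Modularity Q for a directed graph with adjacency matrix A
    (A[i, j] = 1 for an edge i -> j) and one community label per node:
    Q = (1/m) * sum_ij (A_ij - kout_i * kin_j / m) * delta(c_i, c_j)."""
    A = np.asarray(A, dtype=float)
    m = A.sum()                       # total number of directed edges
    kout = A.sum(axis=1)              # out-degrees
    kin = A.sum(axis=0)               # in-degrees
    c = np.asarray(communities)
    same = (c[:, None] == c[None, :])  # delta(c_i, c_j) as a boolean mask
    return ((A - np.outer(kout, kin) / m) * same).sum() / m
```

A good partition gives a markedly higher Q than a partition that cuts across the densely connected groups.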

ES2006-86

Enhanced maxcut clustering with multivalued neural networks and functional annealing

Enrique Mérida-Casermeiro, Domingo López-Rodríguez, Juan Miguel Ortiz-de-Lazcano-Lobato

Abstract
In this work, a new algorithm to improve the performance of optimization methods by avoiding certain local optima is described. Its theoretical bases are presented in a rigorous but intuitive way. It has been applied concretely to recurrent neural networks, in particular to MREM, a multivalued recurrent model that has proved to obtain very good results when dealing with NP-complete combinatorial optimization problems. In order to show its efficiency, the well-known MaxCut problem for graphs has been selected as a benchmark. Our proposal outperforms other specialized and powerful techniques, as shown by simulations.

Manuscript from author [PDF]



Man-Machine-Interfaces - Processing of nervous signals


ES2006-5

Artificial neural networks and machine learning for man-machine-interfaces - processing of nervous signals

Martin Bogdan, Michael Bensch

Abstract
Recently, man-machine interfaces that contact the nervous system in order to extract or to introduce information have gained more and more importance. In order to establish systems such as neural prostheses or brain-computer interfaces, powerful (real-time) algorithms for processing nerve signals or their field potentials are required. Another important point is the introduction of information into nervous systems by means such as functional neuroelectrical stimulation (FNS). This paper gives a short introduction and reviews different approaches towards the development of man-machine interfaces using artificial neural networks and machine learning algorithms for signal processing.

Manuscript from author [PDF]

ES2006-45

Linking non-binned spike train kernels to several existing spike train metrics

Benjamin Schrauwen, Jan Van Campenhout

Abstract
This work presents two kernels which can be applied to sets of spike times. This allows the use of state-of-the-art classification techniques on spike trains. The presented kernels are closely related to several recent and often-used spike train metrics. One of their main advantages is that they do not require the spike trains to be binned. A high temporal resolution is thus preserved, which is needed when temporal coding is used. As a test of the classification possibilities, a jittered spike train template classification problem is solved.

Manuscript from author [PDF]
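One common family of non-binned spike train kernels smears every spike with an exponential and sums all pairwise interactions, so no binning of the spike times is needed. The sketch below shows this generic form; it is illustrative and not necessarily one of the paper's exact kernels.

```python
import numpy as np

def spike_train_kernel(s, t, tau=0.005):
    """Generic non-binned spike train kernel: each spike is smeared with an
    exponential of width tau, and the kernel sums all pairwise interactions,
    k(s, t) = sum_i sum_j exp(-|s_i - t_j| / tau).  Spike times in seconds."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    if s.size == 0 or t.size == 0:
        return 0.0
    # pairwise |s_i - t_j| via broadcasting, then the exponential sum
    return float(np.exp(-np.abs(s[:, None] - t[None, :]) / tau).sum())
```

Because the underlying Laplacian kernel is positive semi-definite, this sum is a valid Mercer kernel and can be plugged directly into an SVM; normalizing by sqrt(k(s,s) * k(t,t)) yields a bounded similarity.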

ES2006-111

Spatial filters for the classification of event-related potentials

Ulrich Hoffmann, Jean-Marc Vesin, Touradj Ebrahimi

Abstract
Spatial filtering is a widely used dimension reduction method in electroencephalogram-based brain-computer interface systems. In this paper a new algorithm is proposed which learns spatial filters from a training dataset. In contrast to existing approaches, the proposed method yields spatial filters that are explicitly designed for the classification of event-related potentials, such as the P300 or movement-related potentials. The algorithm is tested, in combination with support vector machines, on several benchmark datasets from past BCI competitions and achieves state-of-the-art results.

Manuscript from author [PDF]

ES2006-51

On-line adaptation of neuro-prostheses with neuronal evaluation signals

Klaus R. Pawelzik, Udo A. Ernst, David Rotermund

Abstract
Experiments have demonstrated that prosthetic devices can in principle be controlled by brain signals. However, in stable long-term applications neuroprostheses may suffer substantially from non-stationarities of the recorded signals. Such changes currently require supervised re-learning procedures which must be conducted under laboratory conditions, hampering the envisioned everyday use of such devices. As an alternative we here propose an on-line adaptation scheme that exploits a secondary signal source from brain regions reflecting the user's affective evaluation of the neuro-prosthetic's performance. Using realistic assumptions about recordable signals and their noise levels, our simulations show that prosthetic devices can be adapted successfully during normal, everyday usage.

Manuscript from author [PDF]

ES2006-44

Using distributed genetic programming to evolve classifiers for a brain computer interface

Eva Alfaro-Cid, Anna I. Esparcia-Alcázar, Ken Sharman

Abstract
The objective of this paper is to illustrate the application of genetic programming to evolve classifiers for multi-channel time series data. The paper shows how high performance distributed genetic programming has been implemented for evolving classifiers. The particular application discussed herein is the classification of human electroencephalographic signals for a brain-computer interface. The resulting classifying structures provide classification rates comparable to those obtained using traditional, human-designed, classification methods.

Manuscript from author [PDF]



Vision and applications


ES2006-39

A Cyclostationary Neural Network model for the prediction of the NO2 concentration

Monica Bianchini, Ernesto Di Iorio, Marco Maggini, Chiara Mocenni, Augusto Pucci

Abstract
Air pollution control is a major environmental concern. The quality of air is an important factor for everyday life in cities, since it affects the health of the community and directly influences the sustainability of our lifestyles and production methods. In this paper we propose a cyclostationary neural network (CNN) model for the prediction of the NO2 concentration. The cyclostationary nature of the problem guides the construction of the CNN architecture, which is composed of a number of MLP blocks equal to the cyclostationary period of the analyzed phenomenon, and is independent of exogenous inputs. Some preliminary experimentation shows that the CNN model significantly outperforms standard statistical tools usually employed for this task.

Manuscript from author [PDF]

ES2006-89

Learning Visual Invariance

Alessio Plebe

Abstract
Invariance is a necessary feature of a visual system able to recognize real objects in all their possible appearances. It is also the processing step most problematic to understand in biological systems, and the most difficult to simulate in computational models. This work investigates the possibility of achieving viewpoint invariance without adopting any explicit theoretical solution to the problem, but simply by exposing a hierarchical architecture of self-organizing artificial cortical maps to series of images under various viewpoints.

Manuscript from author [PDF]



Online Learning in Cognitive Robotics


ES2006-4

Recent trends in online learning for cognitive robots

Jochen Steil, Heiko Wersing

Abstract
We present a review of recent trends in cognitive robotics that deal with online learning approaches to the acquisition of knowledge, control strategies and behaviors of a cognitive robot or agent. Along this line we focus on the topics of object recognition in cognitive vision, trajectory learning and adaptive control of multi-DOF robots, task learning from demonstration, and general developmental approaches in robotics. We argue for the relevance of online learning as a key ability for future intelligent robotic systems to allow flexible and adaptive behavior within a changing and unpredictable environment.

Manuscript from author [PDF]

ES2006-73

Extended model of conditioned learning within latent inhibition

Nicolas Gomond, Jean-Marc Salotti

Abstract
Due to the varied and dynamic nature of stimuli, decisions of intelligent agents must rely on the coordination of complex cognitive systems. This paper focuses precisely on a general learning architecture for autonomous agents. It is based on a neural network model that enables the specific behaviours of classical conditioning and a biologically inspired attentional phenomenon called latent inhibition. We propose a neural network implementation of an extended model of classical conditioning and present some results.

Manuscript from author [PDF]

ES2006-118

Construction of a memory management system in an on-line learning mechanism

Francisco Bellas, Jose Antonio Becerra, Richard Duro

Abstract
This paper is the first of a two-paper series that deals with an important problem in on-line learning mechanisms for autonomous agents that must perform non-trivial tasks and operate over extended periods of time. The problem has to do with memory and, in particular, with what is to be stored in what representation, and with the need for a memory management system to control the interplay between different types of memory. To study the problem, a two-level memory structure consisting of a short-term and a long-term memory is introduced in an evolution-based cognitive mechanism called the Multilevel Darwinist Brain. A management system for their operation and interaction is proposed that benefits from the evolutionary nature of the mechanism. Some results obtained during operation with real robots are presented in the second paper of the series.

Manuscript from author [PDF]

ES2006-92

Adaptive scene-dependent filters in online learning environments

Michael Götting, Jochen Steil, Heiko Wersing, Edgar Körner, Helge Ritter

Abstract
In this paper we propose Adaptive Scene Dependent Filters (ASDF) to enhance the online learning capabilities of an object recognition system in real-world scenes. The proposed ASDF method extends the idea of unsupervised segmentation to a flexible, highly dynamic image segmentation architecture. We combine unsupervised segmentation, which defines coherent groups of pixels, with a recombination step that uses top-down information to determine which segments together belong to the object. We show the successful application of this approach to online learning in cluttered environments.

Manuscript from author [PDF]

ES2006-19

A multiagent architecture for concurrent reinforcement learning

Victor Uc Cetina

Abstract
In this paper we propose a multiagent architecture for implementing concurrent reinforcement learning, an approach where several agents sharing the same environment, perceptions and actions work towards a single objective: learning one value function. We present encouraging experimental results from the initial phase of our research on combining concurrent reinforcement learning with learning from demonstration.

Manuscript from author [PDF]

ES2006-120

Some experimental results with a two-level memory management system in the Multilevel Darwinist Brain

Francisco Bellas, Jose Antonio Becerra, Richard Duro

Abstract
This paper provides a description and discussion of several experiments carried out with simulated and real agents that operated with the Multilevel Darwinist Brain cognitive mechanism, including a two-level memory management system. The agents interacted with real environments, including teachers, and the results show the interplay between the parameters that regulate replacement strategies in both the short-term and long-term memories. This type of structure allows the agents to learn autonomously, paying attention to the relevant information, and to transform data into knowledge, creating subjective internal representations that can be easily reused or modified to adapt them to new situations.

Manuscript from author [PDF]



Learning I


ES2006-21

Robust Local Cluster Neural Networks

Ralf Eickhoff, Joaquin Sitte, Ulrich Rückert

Abstract
Artificial neural networks are intended to be used in future nanoelectronics, since their biological counterparts seem to be robust to noise. In this paper, we analyze the robustness of Local Cluster Neural Networks and determine upper bounds on the mean square error for noise-contaminated weights and inputs.

Manuscript from author [PDF]

ES2006-24

Topological Correlation

Kevin Doherty, Rod Adams, Neil Davey

Abstract
Quantifying the success of the topographic preservation achieved with a neural map is difficult. In this paper we present Topological Correlation, Tc, a method that assesses the degree of topographic preservation achieved, based on the linear correlation between the topological distances in the neural map and the topological distances in the induced Delaunay triangulation of the network nodes. In contrast to previous indices, Tc has been explicitly devised to assess the topographic success of neural maps composed of many sub-graph structures. The Tc index is bounded and unequivocally identifies a perfect mapping, but more importantly, it provides the ability to quantitatively compare less successful mappings. The Tc index has also been successfully used to determine the maximum size of the network.

Manuscript from author [PDF]

ES2006-53

An algorithm for fast and reliable ESOM learning

Mario Nöcker, Fabian Mörchen, Alfred Ultsch

Abstract
The training of Emergent Self-organizing Maps (ESOM) with large datasets can be a computationally demanding task. Batch learning may be used to speed up training. It is demonstrated here, however, that the representation of clusters in the data space on maps trained with batch learning is poor compared to sequential training. This effect occurs even for very clear cluster structures. The k-batch learning algorithm is preferable, because it creates the same quality of representation as sequential learning but maintains important properties of batch learning that can be exploited for speedup.

Manuscript from author [PDF]

ES2006-16

EM-algorithm for training of state-space models with application to time series prediction

Elia Liitiäinen, Nima Reyhani, Amaury Lendasse

Abstract
In this paper, an improvement to the E-step of the EM-algorithm for nonlinear state-space models is presented. We also propose strategies for model structure selection when the EM-algorithm and state-space models are used for time series prediction. Experiments on the Poland electricity load benchmark show that the method gives good short-term predictions and can also be used for long-term prediction.

Manuscript from author [PDF]

ES2006-77

Time series prediction using DirRec strategy

Antti Sorjamaa, Amaury Lendasse

Abstract
This paper demonstrates how important the selection of the prediction strategy is in the long-term prediction of time series. Two strategies, called Recursive and Direct, are already used for prediction. This paper presents a third one, DirRec, which combines the advantages of the two existing ones. A simple k-NN approximation method is used, and all three strategies are applied to two benchmarks: the Santa Fe and Poland Electricity Load time series.

Manuscript from author [PDF]
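DirRec trains a separate model per prediction horizon, as in the Direct strategy, but also feeds each new model the predictions already made for earlier horizons, as in the Recursive strategy, so the regressor grows by one value per step. A minimal sketch using a plain k-NN approximator; function names and defaults are illustrative, not the paper's exact setup.

```python
import numpy as np

def knn_predict(Xtr, ytr, x, k=3):
    """Plain k-NN regression: average the targets of the k nearest rows."""
    idx = np.argsort(np.linalg.norm(Xtr - x, axis=1))[:k]
    return float(ytr[idx].mean())

def dirrec_forecast(series, order=4, horizon=3, k=3):
    """DirRec sketch: one model per horizon (Direct), where each new model
    also receives the earlier predictions as inputs (Recursive).  The input
    window therefore grows by one value per horizon step."""
    series = np.asarray(series, dtype=float)
    window = list(series[-order:])        # known values, then own predictions
    preds = []
    for h in range(1, horizon + 1):
        d = order + h - 1                 # regressor length at horizon h
        # training pairs: a window of d consecutive values -> the next value
        Xtr = np.array([series[i:i + d] for i in range(len(series) - d)])
        ytr = np.array([series[i + d] for i in range(len(series) - d)])
        yhat = knn_predict(Xtr, ytr, np.array(window), k)
        preds.append(yhat)
        window.append(yhat)               # feed the prediction to the next model
    return np.array(preds)
```

The pure Recursive strategy would reuse one fixed model for all horizons, while the pure Direct strategy would train per-horizon models without the fed-back predictions.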

ES2006-30

Consistent estimation of the architecture of multilayer perceptrons

Joseph Rynkiewicz

Abstract
We consider regression models involving multilayer perceptrons (MLP) with one hidden layer and Gaussian noise. The estimation of the parameters of the MLP can be done by maximizing the likelihood of the model. In this framework, it is difficult to determine the true number of hidden units using an information criterion like the BIC, because the Fisher information matrix is not invertible if the number of hidden units is overestimated. Indeed, the classical theoretical justification of information criteria relies entirely on the invertibility of this matrix. However, using recent methodology introduced to deal with models with a loss of identifiability, we prove that a suitable information criterion leads to consistent estimation of the true number of hidden units.

Manuscript from author [PDF]

ES2006-57

Optimal design of hierarchical wavelet networks for time-series forecasting

Yuehui Chen, Bo Yang, Ajith Abraham

Abstract
The purpose of this study is to identify Hierarchical Wavelet Neural Networks (HWNN) and select important input features for each sub-wavelet neural network automatically. Based on the pre-defined instruction/operator sets, a HWNN is created and evolved using tree-structure-based Extended Compact Genetic Programming (ECGP), and the parameters are optimized by the Differential Evolution (DE) algorithm. This framework also allows input variable selection. Empirical results on benchmark time-series approximation problems indicate that the proposed method is effective and efficient.

Manuscript from author [PDF]

ES2006-85

Recognition of handwritten digits using sparse codes generated by local feature extraction methods

Rebecca Steinert, Martin Rehn, Anders Lansner

Abstract
We investigate when sparse coding of sensory inputs can improve performance in a classification task. For this purpose, we use a standard data set, the MNIST database of handwritten digits. We systematically study combinations of sparse coding methods and neural classifiers in a two-layer network. We find that processing the image data into a sparse code can indeed improve the classification performance, compared to directly classifying the images. Further, increasing the level of sparseness leads to even better performance, up to a point where the reduction of redundancy in the codes is offset by loss of information.

Manuscript from author [PDF]

ES2006-82

Iterative context compilation for visual object recognition

Jens Teichert, Rainer Malaka

Abstract
This contribution describes an almost parameterless iterative context compilation method which produces feature layers that are especially suited for mixed bottom-up/top-down association architectures. The context model is simple and enables fast calculation. The resulting structures are invariant to position, scale and rotation of input patterns.

Manuscript from author [PDF]

ES2006-134

FPGA implementation of an integrate-and-fire LEGION model for image segmentation

Bernard Girau, Cesar Torres-Huitzil

Abstract
Despite several previous studies, little progress has been made in building successful neural systems for image segmentation in digital hardware. Spiking neural networks offer an opportunity to develop models of visual perception without any complex structure based on multiple neural maps. Such models use elementary asynchronous computations that have motivated several implementations on analog devices, whereas digital implementations have appeared quite unable to handle large spiking neural networks for lack of density. In this work, we consider a model of integrate-and-fire neurons organized according to the standard LEGION architecture to segment grey-level images. Taking advantage of the local and distributed structure of the model, a massively distributed implementation on FPGA using pipelined serial computations is developed. Results show that digital and flexible solutions may efficiently handle large networks of spiking neurons.

Manuscript from author [PDF]

ES2006-137

Visual object classification by sparse convolutional neural networks

Alexander Gepperth

Abstract
A convolutional network (CNN) architecture termed sparse convolutional neural network is proposed and tested on a real-world classification task (car classification). In addition to the usual error function based on the mean squared error (MSE), a penalty term for correlation between hidden layer neurons is introduced with the aim of enforcing a sparse coding of the objects' visual appearance. It is demonstrated that classification accuracies can be improved by this method compared to purely MSE-trained convolutional networks.

Manuscript from author [PDF]

ES2006-133

Modelling switching dynamics using prediction experts operating on distinct wavelet scales

Alexandre Aussem, Pierre Chainais

Abstract
We present a framework for modelling the switching dynamics of a time series with correlation structures spanning distinct time scales, based on a neural-based multi-expert prediction model. First, an orthogonal wavelet transform is used to decompose the time series into varying levels of temporal resolution so that the underlying temporal structures of the original time series become more tractable. The transitions between the resolution scales are assumed to be governed by a hidden Markov model (HMM). The best state sequence is obtained by the Viterbi algorithm assuming some prior knowledge on the state transition probabilities and energy-dependent observation probabilities. The model achieves a hard segmentation of the time series into distinct dynamical modes and the simultaneous specialization of the prediction experts on the segments. The predictive ability of this strategy is assessed on a synthetic time series.

Manuscript from author [PDF]

ES2006-107

Learning for stochastic dynamic programming

Sylvain Gelly, Jérémie Mary, Olivier Teytaud

Abstract
We present experimental results about learning function values (i.e. Bellman values) in stochastic dynamic programming (SDP). All results come from openDP (opendp.sourceforge.net), a freely available source code, and therefore can be reproduced. The goal is an independent comparison of learning methods in the framework of SDP.

Manuscript from author [PDF]

ES2006-54

Adaptive Sensor Modelling and Classification using a Continuous Restricted Boltzmann Machine (CRBM)

Tong Boon Tang, Alan Murray

Abstract
This paper presents a neural approach to sensor modelling and classification as the basis of local data fusion in a wireless sensor network. Data distributions are non-Gaussian. Data clusters are sufficiently complex that the classification problem is markedly non-linear. We prove that a Continuous Restricted Boltzmann Machine can model complex data distributions and can autocalibrate against real sensor drift. To highlight the adaptation, two trained but subsequently non-adaptive neural classifiers (SLP and MLP) were employed as benchmarks.

Manuscript from author [PDF]

ES2006-48

Non-linear gating network for the large scale classification model CombNET-II

Mauricio Kugler, Toshiyuki Miyatani, Susumu Kuroyanagi, Anto Satriyo Nugroho, Akira Iwata

Abstract
The linear gating classifier (stem network) of the large-scale model CombNET-II has always been the limiting factor restricting the number of expert classifiers (branch networks). The linear boundaries between its clusters cause a rapid decrease in performance as the number of clusters increases and, consequently, impair the overall performance. This work proposes the use of a non-linear classifier to learn the complex boundaries between the clusters, which increases the gating performance while keeping the balanced split of samples produced by the original sequential clustering algorithm. The experiments have shown that, for some problems, the proposed model outperforms the monolithic classifier.

Manuscript from author [PDF]

ES2006-94

Saliency extraction with a distributed spiking neural network

Sylvain Chevallier, Philippe Tarroux, Hélène Paugam-Moisy

Abstract
We present a distributed spiking neuron network (SNN) for handling low-level visual perception in order to extract salient locations in robot camera images. We describe a new method, stemming from our architectural choices, which reduces the computational load of the whole system. We also describe a model of the post-synaptic potential which allows the contribution of a sum of incoming spikes to a neuron's membrane potential to be computed quickly. The advantages of this saliency extraction method, which differs from classical image processing, are also discussed.

Manuscript from author [PDF]

ES2006-100

Connection strategies in neocortical networks

Andreas Herzog, Karsten Kube, Bernd Michaelis, Anna D. de Lima, Thomas Voigt

Abstract
This study considers the impact of different connection strategies in developing neocortical networks. Adequate connectivity is a prerequisite for synaptogenesis and for the development of synchronous oscillatory network activity during the maturation of cortical networks. In a defined time window early in development, neurites have to grow out and connect to other neurons. Based on morphological observations, we postulate that the underlying mechanism differs from common unspecific global or small-world strategies. We show here that a displaced local connection mode is a very effective approach to connect neurons at minimal cost.

Manuscript from author [PDF]



Feature extraction and variable projection


ES2006-130

Random Forests Feature Selection with K-PLS: Detecting Ischemia from Magnetocardiograms

Long Han, Mark J. Embrechts, Boleslaw Szymanski, Karsten Sternickel, Alexander Ross

Abstract
Random Forests were introduced by Breiman for feature (variable) selection and improved predictions for decision tree models. The resulting model is often superior to AdaBoost and bagging approaches. In this paper, the random forest approach is extended for variable selection with other learning models, in this case Partial Least Squares (PLS) and Kernel Partial Least Squares (K-PLS), to estimate the importance of variables. This variable selection method is demonstrated on two benchmark datasets (Boston Housing and South African heart disease data). Finally, this methodology is applied to magnetocardiogram data for the detection of ischemic heart disease.

Manuscript from author [PDF]

ES2006-101

Determination of the Mahalanobis matrix using nonparametric noise estimations

Amaury Lendasse, Francesco Corona, Jin Hao, Nima Reyhani, Michel Verleysen

Abstract
In this paper, the problem of an optimal transformation of the input space for function approximation problems is addressed. The transformation is defined by determining the Mahalanobis matrix that minimizes the variance of the noise. To compute the variance of the noise, a nonparametric estimator called the Delta test is used. The proposed approach is illustrated on two different benchmarks.

Manuscript from author [PDF]
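The Delta test mentioned in the abstract is a simple nearest-neighbour noise-variance estimator: if nn(i) is the nearest neighbour of x_i in the input space, then delta = (1/2N) * sum_i (y_nn(i) - y_i)^2 estimates the variance of the output noise. A minimal sketch with brute-force neighbour search; names are illustrative.

```python
import numpy as np

def delta_test(X, y):
    """Delta test: nonparametric noise-variance estimate
    delta = (1/2N) * sum_i (y[nn(i)] - y[i])^2, where nn(i) is the
    nearest neighbour of X[i] in the input space."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                # exclude self-matches
    nn = d.argmin(axis=1)                                      # nearest-neighbour index
    return float(((y[nn] - y) ** 2).mean() / 2.0)
```

In the paper's setting, the input transformation (the Mahalanobis matrix) is chosen so that this estimate is minimized; a k-d tree would replace the brute-force search for large N.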

ES2006-105

Bootstrap feature selection in support vector machines for ventricular fibrillation detection

Felipe Alonso Atienza, José Luis Rojo-Álvarez, Gustavo Camps-Valls, Alfredo Rosado Muñoz, Arcadio García Alberola

Abstract
Support Vector Machines (SVM) for classification are receiving special attention in a number of practical applications. When using nonlinear Mercer kernels, the mapping of the input space to a high-dimensional feature space makes input feature selection a difficult task to address. In this paper, we propose the use of the nonparametric bootstrap resampling technique to provide a statistical, distribution-independent criterion for input space feature selection. The confidence interval of the difference in error probability between the complete input space and an input space reduced by one variable is estimated via bootstrap resampling. Hence, a backward variable elimination procedure can be stated, removing one variable at each step according to its associated confidence interval. A practical application to early-stage detection of cardiac ventricular fibrillation (VF) is presented. Building on a previous nonlinear analysis based on temporal and spectral VF parameters, we use the SVM with a Gaussian kernel and bootstrap resampling to obtain the minimum input space feature set that still holds the classification performance of the complete data. Bootstrap resampling is a powerful input feature selection procedure for SVM classifiers.

Manuscript from author [PDF]
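One elimination step of the procedure can be sketched as follows. The classifier here is a nearest-centroid stand-in for the paper's Gaussian-kernel SVM, purely to keep the sketch dependency-free; all names and defaults are illustrative. For each feature, the out-of-bag error difference between the reduced and the complete input space is bootstrapped, and the feature whose removal hurts least is the candidate to drop.

```python
import numpy as np

def error_rate(Xtr, ytr, Xte, yte):
    """Stand-in classifier (nearest class centroid) used instead of the
    paper's Gaussian-kernel SVM to keep the sketch dependency-free."""
    c0, c1 = Xtr[ytr == 0].mean(axis=0), Xtr[ytr == 1].mean(axis=0)
    pred = (np.linalg.norm(Xte - c1, axis=1) <
            np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return float((pred != yte).mean())

def bootstrap_drop_candidate(X, y, n_boot=200, seed=0):
    """For each feature j, bootstrap the out-of-bag error difference
    err(all features minus j) - err(all features); return the feature
    whose removal hurts least, with its 95% confidence interval."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    diffs = np.zeros((p, n_boot))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                 # bootstrap resample
        oob = np.setdiff1d(np.arange(n), idx)       # out-of-bag test set
        if oob.size == 0 or len(set(y[idx])) < 2:
            continue                                # skip degenerate resamples
        full = error_rate(X[idx], y[idx], X[oob], y[oob])
        for j in range(p):
            keep = [k for k in range(p) if k != j]
            red = error_rate(X[idx][:, keep], y[idx], X[oob][:, keep], y[oob])
            diffs[j, b] = red - full
    j = int(diffs.mean(axis=1).argmin())
    lo, hi = np.percentile(diffs[j], [2.5, 97.5])
    return j, (float(lo), float(hi))
```

Backward elimination repeats this step, removing the chosen variable whenever its confidence interval indicates no significant loss of performance.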

ES2006-50

The permutation test for feature selection by mutual information

Damien Francois, Vincent Wertz, Michel Verleysen

Abstract
The estimation of mutual information for feature selection is often subject to inaccuracies due to noise, small sample size, a bad choice of parameters for the estimator, etc. The choice of a threshold above which a feature will be considered useful is thus difficult to make. Therefore, the use of the permutation test to assess the reliability of the estimation is proposed. The permutation test allows performing a non-parametric hypothesis test to select the relevant features and to build a Feature Relevance Diagram that visually synthesizes the result of the test.

Manuscript from author [PDF]

ES2006-74

Stochastic Processes for Canonical Correlation Analysis

Colin Fyfe, Gayle Leen

Abstract
We consider two stochastic process methods for performing canonical correlation analysis (CCA). The first uses a Gaussian Process formulation of regression in which we use the current projection of one data set as the target for the other, and then repeat in the opposite direction. The second uses a Dirichlet process of Gaussian models where the Gaussian models are determined by Probabilistic CCA (Bach and Jordan). The latter method is more computationally intensive but has the advantages of non-parametric approaches.

Manuscript from author [PDF]

[Back to Top]


Visualization methods for data mining


ES2006-3

Visual Data Mining and Machine Learning

Fabrice Rossi

Abstract
Information visualization and visual data mining leverage the human visual system to provide insight and understanding of unorganized data. In order to scale to massive sets of high dimensional data, simplification methods are needed, so as to select important dimensions and objects. Some machine learning algorithms try to solve those problems. We give in this paper an overview of information visualization and survey the links between this field and machine learning.

Manuscript from author [PDF]

ES2006-97

Sanger-driven MDSLocalize - a comparative study for genomic data

Marc Strickert, Nese Sreenivasulu, Udo Seiffert

Abstract
Multidimensional scaling (MDS) methods are designed to establish a one-to-one correspondence of input-output relationships. While the input may be given as high-dimensional data items or as adjacency matrix characterizing data relations, the output space is usually chosen as low-dimensional Euclidean, ready for visualization. MDSLocalize, an existing method, is reformulated in terms of Sanger's rule that replaces the original foundations of computationally costly singular value decomposition. The derived method is compared to the recently proposed high-throughput multi-dimensional scaling (HiT-MDS) and to the well-established XGvis system. For comparison, real-value gene expression data and corresponding DNA sequences, given as proximity data, are considered.

Manuscript from author [PDF]

ES2006-34

Visualizing the trustworthiness of a projection

Michaël Aupetit

Abstract
The visualization of continuous multi-dimensional data based on their projection into a 2-dimensional space is a way to detect interesting patterns visually, insofar as the projection provides a faithful image of the original data. We propose to visualize, directly in the projection space, how much the neighborhood has been preserved during the projection. We color the Voronoi cells associated with the segments of the Delaunay graph of the projections according to their stretching or compression. We experiment with these techniques using Principal Component Analysis and Curvilinear Component Analysis applied to different databases.

Manuscript from author [PDF]

ES2006-138

Data topology visualization for the Self-Organizing Map

Kadim Tasdemir, Erzsebet Merenyi

Abstract
The Self-Organizing map (SOM), a powerful method for data mining and cluster extraction, is very useful for processing high-dimensional and complex data. Visualization methods present different aspects of the information learned by the SOM to get more insight about the data. In this work, we propose a new visualization scheme that represents data topology superimposed on the SOM grid, and we show how it helps in the discovery of data structure.

Manuscript from author [PDF]

ES2006-26

Visual nonlinear discriminant analysis for classifier design

Tomoharu Iwata, Kazumi Saito, Naonori Ueda

Abstract
We present a new method for analyzing classifiers by visualization, which we call visual nonlinear discriminant analysis. Classifiers that output posterior probabilities are visualized by embedding samples and classes so as to approximate posteriors using parametric embedding. The visualization provides a better intuitive understanding of such classifier characteristics as separability and generalization ability than conventional methods. We evaluate our method by visualizing classifiers for artificial and real data sets.

Manuscript from author [PDF]

ES2006-78

Outlier identification with the Harmonic Topographic Mapping

Marian Pena, Colin Fyfe

Abstract
We review two versions of a topology-preserving algorithm, one of which we had previously found to be more successful in defining smooth manifolds and tight clusters. In the context of outlier detection, however, the other is shown to be more successful. Nevertheless, we show that, by using local kernels for the calculation of responsibilities, the first one can also be used in this manner.

Manuscript from author [PDF]

ES2006-155

A new hyperbolic visualization method for displaying the results of a neural gas model: application to Webometrics

Shadi Al Shehabi, Jean-Charles Lamirel

Abstract
The core model considered in this paper is the neural gas model. This paper proposes an original hyperbolic visualization approach suitable for application to the results of such a model. The main principle of this approach is to use a hierarchical algorithm to summarize the gas contents in the form of a hypertree in which information on data density, issued from the original neuron (i.e. class) description space, is preserved. An application of this approach to a dataset of websites from European universities is presented in order to demonstrate its accuracy.

Manuscript from author [PDF]

[Back to Top]


Semi-blind approaches for Source Separation and Independent Component Analysis (ICA)


ES2006-2

Semi-Blind Approaches for Source Separation and Independent component Analysis

Massoud Babaie-Zadeh, Christian Jutten

Abstract
This paper is a survey of semi-blind source separation approaches. Since Gaussian iid signals are not separable, the simplest priors assume non-Gaussian iid signals or Gaussian non-iid signals. Other priors can also be used, for instance discrete or bounded sources, positivity, etc. Although they provide a generic framework for semi-blind source separation, Sparse Component Analysis and Bayesian ICA will only be sketched in this paper, since two other survey papers develop these approaches in depth.

Manuscript from author [PDF]

ES2006-154

Bayesian source separation: beyond PCA and ICA

Ali Mohammad-Djafari

Abstract
Blind source separation (BSS) has become one of the major signal and image processing areas in many applications. Principal component analysis (PCA) and independent component analysis (ICA) have become the two main classical approaches to this problem. However, these two approaches have their limits, which are mainly the assumptions that the data are temporally iid and that the model is exact (no noise). In this paper, we first show that the Bayesian inference framework makes it possible to go beyond these limits while obtaining PCA and ICA algorithms as particular cases. Then, we propose different a priori models for sources which progressively account for different properties of the sources. Finally, we illustrate the application of these different models in spectrometry, astrophysical imaging, satellite imaging and hyperspectral imaging.

Manuscript from author [PDF]

ES2006-157

A survey of Sparse Component Analysis for blind source separation: principles, perspectives, and new challenges

Rémi Gribonval, Sylvain Lesage

Abstract
In this survey, we highlight the appealing features and challenges of Sparse Component Analysis (SCA) for blind source separation (BSS). SCA is a simple yet powerful framework to separate several sources from few sensors, even when the independence assumption is dropped. So far, SCA has been most successfully applied when the sources can be represented sparsely in a given basis, but many other potential uses of SCA remain unexplored. Among other challenging perspectives, we discuss how SCA could be used to exploit both the spatial diversity corresponding to the mixing process and the morphological diversity between sources to unmix even underdetermined convolutive mixtures. This raises several challenges, including the design of both provably good and numerically efficient algorithms for large-scale sparse approximation with overcomplete signal dictionaries.

Manuscript from author [PDF]

ES2006-62

Source separation with priors on the power spectrum of the sources

Jorge Igual, Raul Llinares, Andres Camacho

Abstract
A general approach introducing priors on the correlation function, or equivalently the power spectrum, of the sources in the Blind Source Separation problem is presented. This prior modifies or constrains the contrast function that measures the independence of the recovered signals, depending on its characteristics. For the case where the priors correspond to the sources that we are interested in recovering, a deflation approach is stated. This formulation is especially useful for large-dimension problems where the ancillary sources need not be estimated. We show its application to the biomedical problem of extracting the atrial activity from atrial fibrillation episodes, where discriminant information about the frequency content of the atrial activity with respect to the other components is available in advance.

Manuscript from author [PDF]

ES2006-153

A time-scale correlation-based blind separation method applicable to correlated sources

Yannick Deville, Dass Bissessur, Matthieu Puigt, Shahram Hosseini, Hervé Carfantan

Abstract
We first propose a correlation-based blind source separation (BSS) method based on time-scale (TS) representations of the observed signals. This approach consists in identifying the columns of the (permuted scaled) mixing matrix in TS zones where this method detects that a single source is active. It thus sets very limited constraints on the sparsity of the sources in the TS domain. Both the detection and identification stages of this approach use local correlation parameters of the TS transforms of the observed signals. This BSS method, called TISCORR (for TIme-Scale CORRelation-based BSS), is an extension of our previous two temporal and time-frequency versions of this class of methods. Our second contribution in this paper consists in proving that all three approaches apply if the (transformed) source signals are linearly independent, thus allowing them to be correlated. This extends our previous demonstration, which only guaranteed our previous two approaches to be applicable to uncorrelated sources. Experimental tests show that our TISCORR method achieves good separation for linear instantaneous mixtures of real, correlated or uncorrelated, speech signals (output SIRs are above 40 dB).

Manuscript from author [PDF]

ES2006-72

Independent dynamics subspace analysis

Alexander Ilin

Abstract
The paper presents an algorithm for identifying the independent subspace analysis model based on source dynamics. We propose to separate subspaces by decoupling their dynamic models. Each subspace is extracted by minimizing the prediction error given by a first-order nonlinear autoregressive model. The learning rules are derived from a cost function and implemented in the framework of denoising source separation.

Manuscript from author [PDF]

ES2006-103

Non-orthogonal Support Width ICA

John A. Lee, Frédéric Vrins, Michel Verleysen

Abstract
Independent Component Analysis (ICA) is a powerful tool with applications in many areas of blind signal processing; however, its key assumption, i.e. the statistical independence of the source signals, can be somewhat restrictive in particular cases. For example, when considering several images, it is tempting to regard them as independent sources (the picture subjects are different), although they may actually be highly correlated (the subjects are similar). Pictures of several landscapes (or faces) fall into this category. How can mixtures of such pictures be separated? This paper proposes an ICA algorithm that can tackle this apparently paradoxical problem. Experiments with mixtures of real images demonstrate the soundness of the approach.

Manuscript from author [PDF]

ES2006-156

Hierarchical markovian models for joint classification, segmentation and data reduction of hyperspectral images

Nadia Bali, Ali Mohammad-Djafari, Adel Mohammadpour

Abstract
Spectral classification, segmentation and data reduction are the three main problems in hyperspectral image analysis. In this paper we propose a Bayesian estimation approach which tries to give a joint solution for these three problems. The data reduction problem is modeled as blind source separation (BSS), where the data are the m hyperspectral images and the sources are the n < m images which must be mutually the most independent and piecewise homogeneous. To ensure these properties, we propose a hierarchical model for the sources with a common hidden classification variable which is modeled via a Potts Markov field. The joint Bayesian estimation of this hidden variable, as well as of the sources and the mixing matrix of the BSS problem, gives a solution to all three problems: spectral classification, segmentation and data reduction of hyperspectral images. An appropriate Gibbs Sampling (GS) algorithm is proposed for the Bayesian computation, and a few simulation results are given to illustrate the performance of the proposed method, together with a comparison with the classical PCA and ICA methods used for BSS.

Manuscript from author [PDF]

ES2006-20

A simple idea to separate convolutive mixtures in an undetermined scenario

Maciej Pedzisz, Ali Mansour

Abstract
We consider a blind separation problem for undetermined mixtures of two BPSK signals in a multi-path fading channel. We use the independence and frequency diversity of the two source signals to identify the mixture parameters, estimate the Pulse Shaping Filters (PSF) and channel responses, and extract both binary sequences from only one observation. The presented method uses a gradient descent algorithm to directly adapt the symbols, which are then used as a feedback sequence for PSF roll-off factor identification as well as for channel equalization.

Manuscript from author [PDF]

ES2006-68

FastISA: A fast fixed-point algorithm for independent subspace analysis

Aapo Hyvärinen, Urs Köster

Abstract
Independent Subspace Analysis (ISA; Hyvarinen & Hoyer, 2000) is an extension of ICA. In ISA, the components are divided into subspaces, so that components in different subspaces are assumed independent, whereas components in the same subspace have dependencies. In this paper we describe a fast fixed-point algorithm for ISA estimation, analogous to FastICA. In particular we give a proof of the quadratic convergence of the algorithm, and present simulations to confirm the fast convergence.

Manuscript from author [PDF]

ES2006-115

Discriminacy of the minimum range approach to blind separation of bounded sources

Dinh-Tuan Pham, Frédéric Vrins

Abstract
The Blind Source Separation (BSS) problem is often solved by maximizing objective functions reflecting the statistical dependency between outputs. Since global maximization may be difficult without exhaustive search, criteria for which it can be proved that all the local maxima correspond to an acceptable solution of the BSS problem have been developed. These criteria are used in a deflation procedure. This paper shows that the ``spurious maximum free'' property still holds for the minimum range approach when the sources are extracted simultaneously.

Manuscript from author [PDF]

[Back to Top]


Learning II


ES2006-148

Entropy-based principle and generalized contingency tables

Vincent Vigneron

Abstract
It is well known that the entropy-based concept of mutual information provides a measure of dependence between two discrete random variables. There are several ways to normalize this measure in order to obtain a coefficient similar, e.g., to Pearson's coefficient of contingency. This paper presents a measure of independence between categorical variables, which is applied to the clustering of multidimensional contingency tables. We propose and study a class of measures of directed discrepancy. Two factors make our divergence function attractive: first, we obtain a framework in which a Bregman divergence can be used as the objective function; second, we allow the specification of a larger class of constraints that preserve various statistics.

Manuscript from author [PDF]

ES2006-46

On the selection of hidden neurons with heuristic search strategies for approximation

Ignacio Barrio, Enrique Romero, Lluís Belanche

Abstract
Feature Selection techniques usually follow some search strategy to select a suitable subset from a set of features. Most neural network growing algorithms perform a search with Forward Selection with the objective of finding a reasonably good subset of neurons. Using this link between both fields (feature selection and neuron selection), we propose and analyze different algorithms for the construction of neural networks based on heuristic search strategies coming from the feature selection field. The results of an experimental comparison to Forward Selection using both synthetic and real data show that a much better approximation can be achieved, though at the expense of a higher computational cost.

Manuscript from author [PDF]

ES2006-71

Lag selection for regression models using high-dimensional mutual information

Geoffroy Simon, Michel Verleysen

Abstract
Mutual information may be used to select the embedding lag of a time series. However, this lag selection is usually limited to the analysis of the mutual information between a pair of lagged values in the series. In this paper, generalized mutual information estimators are proposed to take into account more than two variables in the lag selection. Experimental results show that lag selection using mutual information should also take into account the output of the regression model.

Manuscript from author [PDF]

ES2006-99

Learning what is important: feature selection and rule extraction in a virtual course

Terence Etchells, Angela Nebot, Alfredo Vellido, Paulo Lisboa, Francisco Mugica

Abstract
Virtual campus environments are becoming a mainstream alternative to traditional distance higher education. The Internet medium they use allows the gathering of information on students’ usage behaviour. The knowledge extracted from this information can be fed back to the e-learning environment to ease advisors’ workload. In this context, two problems are addressed in the current study: finding which usage features are best at predicting online students’ marks, and explaining mark prediction in the form of parsimonious and interpretable rules. To that effect, two methods are used: Fuzzy Inductive Reasoning (FIR) for feature selection and Orthogonal Search-Based Rule Extraction (OSRE). Experiments carried out on the available data indicate that students’ marks can be accurately predicted and that a small subset of variables explains the accuracy of such prediction, which can be described through a set of simple and actionable rules.

Manuscript from author [PDF]

ES2006-43

Data mining techniques for feature selection in blood cell recognition

Tomasz Markiewicz, Stanislaw Osowski

Abstract
The paper presents and compares data mining techniques for the selection of diagnostic features in the problem of blood cell recognition in leukaemia. Different techniques are compared: linear SVM ranking, correlation analysis, and statistical analysis of the centers and variances of the clusters corresponding to classes. The applied classifier is a Support Vector Machine network with a radial kernel. The results of the recognition of 10 classes of cells are presented and discussed.

Manuscript from author [PDF]

ES2006-104

A Gaussian process latent variable model formulation of canonical correlation analysis

Gayle Leen, Colin Fyfe

Abstract
We investigate a nonparametric model with which to visualize the relationship between two datasets. We base our model on Gaussian Process Latent Variable Models (GPLVM) [1],[7], a probabilistically defined latent variable model which takes the alternative approach of marginalizing the parameters and optimizing the latent variables. We optimize a latent variable set for each dataset in a way that preserves the correlations between the datasets, resulting in a GPLVM formulation of canonical correlation analysis which can be nonlinearized by the choice of covariance function.

Manuscript from author [PDF]

ES2006-35

Designing neural network committees by combining boosting ensembles

Vanessa Gómez-Verdejo, Anibal R. Figueiras-Vidal

Abstract
Constructing modified Real Adaboost ensembles by applying weighted emphasis to erroneous and critical (near the classification boundary) samples has been shown to lead to improved designs, both in performance and in ensemble size. In this paper, we propose to take advantage of the diversity among different weighted combinations to build committees of modified Real Adaboost designs. Experiments show that the expected improvements are obtained.

Manuscript from author [PDF]

ES2006-91

Using Regression Error Characteristic Curves for Model Selection in Ensembles of Neural Networks

Aloisio Carlos de Pina, Gerson Zaverucha

Abstract
Regression Error Characteristic (REC) analysis is a technique for evaluation and comparison of regression models that facilitates the visualization of the performance of many regression functions simultaneously in a single graph. The objective of this work is to present a new approach for model selection in ensembles of Neural Networks, in which we propose the use of REC curves in order to select a good threshold value, so that only residuals greater than that value are considered as errors. The algorithm was empirically evaluated and its results were analyzed also by means of REC curves.
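The REC curve underlying the approach above is simple to compute: sort the absolute residuals and pair each error tolerance with the fraction of points it covers. A minimal sketch (our own illustration, not the authors' code):

```python
import numpy as np

def rec_curve(y_true, y_pred):
    """REC curve: each sorted |residual| paired with the fraction covered."""
    residuals = np.sort(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    accuracy = np.arange(1, len(residuals) + 1) / len(residuals)
    return residuals, accuracy  # x = error tolerance, y = fraction of points
```

A threshold for "what counts as an error" could then be read off the curve, e.g. the smallest tolerance covering a desired fraction of the residuals.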

Manuscript from author [PDF]

ES2006-113

Diversity creation in local search for the evolution of neural network ensembles

Pete Duell, Iris Fermin, Xin Yao

Abstract
The EENCL algorithm [1] automatically designs neural network ensembles for classification, combining global evolution with local search based on gradient descent. Two mechanisms encourage diversity: Negative Correlation Learning (NCL) and implicit fitness sharing. This paper analyses EENCL, finding that NCL is not an essential component of the algorithm, while implicit fitness sharing is. Furthermore, we find that a local search based on independent training is equally effective in both accuracy and diversity. We propose that NCL is unnecessary in EENCL for the tested datasets, and that complementary diversity in local search and global evolution may lead to better ensembles.

Manuscript from author [PDF]

ES2006-67

Immune Network based Ensembles

Nicolás García-Pedrajas, Colin Fyfe

Abstract
This paper presents a new method for constructing ensembles of classifiers based on Immune Network Theory, one of the most interesting paradigms within the field of Artificial Immune Systems. Ensembles of classifiers are a very interesting alternative to single classifiers when facing difficult problems. In general, ensembles are able to achieve better performance in terms of learning and generalization error. We construct an Immune Network that constitutes an ensemble of classifiers. Using a neural network as base classifier we have compared the performance of this ensemble with five standard methods of ensemble construction. This comparison is made using 35 real-world classification problems from the UCI Machine Learning Repository. The results show a general advantage of the proposed model over the standard methods.

Manuscript from author [PDF]

ES2006-128

Classification by means of Evolutionary Response Surfaces

Rafael del Castillo-Gomariz, Nicolás García-Pedrajas

Abstract
Response surfaces are a powerful tool for both classification and regression as they are able to model many different phenomena and construct complex boundaries between classes. Nevertheless, the absence of efficient methods for obtaining manageable response surfaces for real-world problems due to the large number of terms needed, greatly undermines their applicability. In this paper we propose the use of real-coded genetic algorithms for overcoming these limitations. We apply the evolved response surfaces to classification in two classes. The proposed algorithm selects a model of minimum dimensionality improving the robustness and generalisation abilities of the obtained classifier. The algorithm uses a dual codification (real and binary) and specific operators adapted from the standard operators for real-coded genetic algorithms. The fitness function considers the classification error and a regularisation term that takes into account the number of terms of the model. The results obtained in 10 real-world classification problems from the UCI Machine Learning Repository are comparable with well-known classification algorithms with a more interpretable polynomial function.

Manuscript from author [PDF]

ES2006-36

Hierarchical analysis of GSM network performance data

Mikko Multanen, Kimmo Raivio, Pasi Lehtimäki

Abstract
In this study, a method for the hierarchical examination and visualization of GSM data using the Self-Organizing Map (SOM) is described. The data is examined in a few phases: at first, temporally averaged data is used; then, in each phase, some of the data is discarded and the rest is examined in more detail. The SOM is used both for clustering and for visualization. The actual clustering is performed on the nodes of the SOM to lower the computational cost and to help understand the properties of the clusters better.

Manuscript from author [PDF]

ES2006-125

Learning with monotonicity requirements for optimal routing with end-to-end quality of service constraints

Antoine Mahul, Alexandre Aussem

Abstract
In this paper, we adapt the classical learning algorithm for feed-forward neural networks to the case where monotonicity is required in the input-output mapping. Monotonicity can be imposed by adding suitable penalization terms to the error function. This yields a computationally efficient algorithm with little overhead compared to back-propagation. This algorithm is used to train neural networks for delay evaluation in an optimization scheme for optimal routing in a communication network.

Manuscript from author [PDF]

[Back to Top]


Biologically inspired models


ES2006-10

Evolving multi-segment 'super-lamprey' CPG's for increased swimming control

Leena Patel, Alan Murray, John Hallam

Abstract
'Super-lamprey' swimmers which operate over a greater control range are evolved. Propulsion in the lamprey, an eel-like fish, is governed by activity in its spinal neural network. This CPG is simulated, in accordance with Ekeberg's model, and then optimised alternatives are generated with genetic algorithms. Extending our prior lamprey work on single segment oscillators to multiple segments (including interaction with a mechanical model) demonstrates that Ekeberg's CPG is not a unique solution and that simpler versions with wider operative ranges can be generated. This work 'out-evolves' nature as an initial step in understanding how to control wave power devices, with similar motion to the lamprey.

Manuscript from author [PDF]

ES2006-140

Exploring the role of intrinsic plasticity for the learning of sensory representations

Nicholas Butko, Jochen Triesch

Abstract
Intrinsic plasticity (IP) refers to a neuron's ability to regulate its firing activity by adapting its intrinsic excitability. Previously, we showed that model neurons combining IP with Hebbian synaptic plasticity can adapt their weight vector to discover heavy-tailed directions in the input space. In this paper we consider networks of coupled model neurons and show how a population of such units can solve a standard non-linear ICA problem. We also present a simple model for the formation of maps of oriented receptive fields in primary visual cortex. Together, our results indicate that intrinsic plasticity may play an important role for learning efficient representations in populations of cortical neurons.

Manuscript from author [PDF]

[Back to Top]


Kernel methods


ES2006-75

LS-SVM functional network for time series prediction

Tuomas Kärnä, Fabrice Rossi, Amaury Lendasse

Abstract
Usually, time series prediction is done with regularly sampled data. In practice, however, the available data may be irregularly sampled. In this case the conventional prediction methods cannot be used. One solution is to use Functional Data Analysis (FDA). In FDA, an interpolating function is fitted to the data and the fitting coefficients are analyzed instead of the original data points. In this paper, we propose a functional approach to time series prediction. A Radial Basis Function Network (RBFN) is used for the interpolation. The interpolation parameters are optimized with a k-Nearest Neighbors (k-NN) model. A Least Squares Support Vector Machine (LS-SVM) is used for the prediction.
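To make the functional encoding step concrete, here is a minimal sketch of fitting Gaussian RBF coefficients to an irregularly sampled series by least squares. The center placement and width are illustrative assumptions; the paper's k-NN-based parameter optimization and LS-SVM prediction stages are not reproduced here.

```python
import numpy as np

def rbf_design(t, centers, width):
    """Gaussian RBF design matrix evaluated at sample times t."""
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def rbf_fit(t, x, centers, width):
    """Least-squares RBF coefficients representing the series (t, x)."""
    coef, *_ = np.linalg.lstsq(rbf_design(t, centers, width), x, rcond=None)
    return coef
```

The coefficient vectors, rather than the raw (irregular) samples, would then serve as inputs to the downstream predictor.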

Manuscript from author [PDF]

ES2006-61

Synthesis of maximum margin and multiview learning using unlabeled data

Sandor Szedmak, John Shawe-Taylor

Abstract
In this presentation we show that semi-supervised learning with two input sources can be transformed into a maximum margin problem similar to a binary SVM. Our formulation exploits the unlabeled data to reduce the complexity of the class of learning functions. To measure how much the complexity is decreased, we use Rademacher Complexity Theory. The corresponding optimization problem is convex, and it is efficiently solvable for large-scale applications as well.

Manuscript from author [PDF]

ES2006-116

Efficient Forward Regression with Marginal Likelihood

Ping Sun, Xin Yao

Abstract
We propose an efficient forward regression algorithm based on greedy optimization of marginal likelihood. It can be understood as a forward selection procedure which adds a new basis vector at each step with the largest increment to the marginal likelihood. The computational cost of our algorithm is linear in the number $n$ of training examples and quadratic in the number $k$ of selected basis vectors, i.e. $\mathcal{O}(nk^2)$. Moreover, our approach is only required to store a small fraction of all columns of the full design matrix. We compare our algorithm with the well-known Relevance Vector Machines (RVM) which also optimizes marginal likelihood iteratively. The results show that our algorithm can achieve comparable prediction accuracy but with significantly better scaling performance in terms of both computational cost and memory requirements.
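A naive version of such a greedy scheme can be sketched as follows. Note that this recomputes the full O(n³) marginal likelihood for every candidate column, whereas the algorithm in the paper reaches O(nk²) through incremental updates; the noise and prior variances below are arbitrary illustrative values.

```python
import numpy as np

def log_marglik(Phi, y, noise_var=0.1, prior_var=1.0):
    """Log marginal likelihood of y ~ N(0, noise_var*I + prior_var*Phi@Phi.T)."""
    n = len(y)
    C = noise_var * np.eye(n) + prior_var * Phi @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def greedy_forward(Phi_full, y, k_max):
    """Greedily add the basis column with the largest marginal-likelihood
    increment; stop early when no remaining column improves it."""
    selected = []
    ll = log_marglik(np.zeros((len(y), 0)), y)  # empty model baseline
    for _ in range(k_max):
        best_j, best_ll = None, ll
        for j in range(Phi_full.shape[1]):
            if j in selected:
                continue
            cand = log_marglik(Phi_full[:, selected + [j]], y)
            if cand > best_ll:
                best_j, best_ll = j, cand
        if best_j is None:
            break
        selected.append(best_j)
        ll = best_ll
    return selected, ll
```

Each greedy step can only increase the marginal likelihood, so the loop terminates either at k_max basis vectors or as soon as no candidate helps.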

Manuscript from author [PDF]

[Back to Top]


Nonlinear dynamics


ES2006-6

Nonlinear dynamics in neural computation

Tjeerd olde Scheper, Nigel Crook

Abstract
This tutorial reports on the use of nonlinear dynamics in several different models of neural systems. We discuss a number of distinct approaches to neural information processing based on nonlinear dynamics. The models we consider combine controlled chaotic models with phenomenological models of spiking mechanisms as well as using weakly chaotic systems. The recent work of several major researchers in this field is briefly introduced.

Manuscript from author [PDF]

ES2006-136

Dynamical reservoir properties as network effects

Carlos Lourenço

Abstract
It has been proposed that chaos can serve as a reservoir providing an infinite number of dynamical states. These can be interpreted as different behaviors, search actions or computational states which are selectively adequate for different tasks. The high flexibility of chaotic regimes has been noted, as well as other advantages over regular regimes. However, the model neurons used to demonstrate these ideas could be criticized as lacking physical or biological realism. In the present paper we show that the same kind of rich behavior displayed by the toy models can be found with a more realistic neural model [6]. Furthermore, much of the complex behavior arises from network properties often overlooked in the literature.

Manuscript from author [PDF]

ES2006-29

Nonlinear transient computation and variable noise tolerance

Nigel Crook

Abstract
A novel nonlinear transient computation device is presented which is designed to perform computations on multiple spike-train input signals. The input signals perturb the internal dynamic state of the device in a way that is characteristic of the input signal presented in each case. These characteristics are reflected in the output spike train of the device. Experimental evidence is presented in this paper which shows that this output spike train is both a noise tolerant and a noise sensitive response to the input signal presented.

Manuscript from author [PDF]

ES2006-149

Cultures of dissociated neurons display a variety of avalanche behaviours

Roberta Alessio, Laura Cozzi, Vittorio Sanguineti

Abstract
Avalanche dynamics has been described in organotypic cultures and acute slices from rat cortex. Its distinctive feature is a statistical distribution of avalanche size and duration following a power law with specific exponents, corresponding to a near-critical state. We asked whether the same dynamics is present in dissociated cultures from rat embryos, which are characterized by a complete lack of anatomic structure and high, random synaptic connectivity. We indeed observed such dynamics in some, but not all, experimental preparations. We conclude that the variability found in the dynamics of dissociated cultures also affects general features like the criticality of avalanche behavior.

Manuscript from author [PDF]

[Back to Top]


Neural Networks and Machine Learning in Bioinformatics - Theory and Applications


ES2006-7

Neural networks and machine learning in bioinformatics - theory and applications

Udo Seiffert, Barbara Hammer, Samuel Kaski, Thomas Villmann

Abstract
Bioinformatics is a promising and innovative research field. Despite a high number of techniques specifically dedicated to bioinformatics problems, as well as many successful applications, we are only at the beginning of a process to massively integrate the aspects and experiences of the different core subjects such as biology, medicine, computer science, engineering, chemistry, physics, and mathematics. Within this rather wide area we focus on neural networks and machine learning related approaches in bioinformatics, with particular emphasis on integrative research against the background of the above mentioned scope.

Manuscript from author [PDF]

ES2006-38

Using sampling methods to improve binding site predictions

Yi Sun, Mark Robinson, Rod Adams, Rene te Boeckhorst, Alistair G. Rust, Neil Davey

Abstract
Currently the best algorithms for transcription factor binding site prediction are severely limited in accuracy. In previous work we combined random under-sampling with the SMOTE over-sampling technique, working with several classification algorithms from the machine learning field to integrate binding site predictions. In this paper, we improve the classification results with the aid of Tomek links, used either as an under-sampling technique or to remove further noisy data after sampling.

Manuscript from author [PDF]
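For readers unfamiliar with the cleaning step, a minimal sketch of Tomek-link detection (mutual nearest neighbours of opposite classes; brute-force distances, not the authors' implementation):

```python
import numpy as np

def tomek_links(X, y):
    """Return index pairs that form Tomek links: pairs of
    opposite-class samples that are each other's nearest neighbour.
    Removing the majority-class member of each pair cleans the
    class boundary."""
    X = np.asarray(X, dtype=float)
    # pairwise squared distances, self-distance masked out
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)            # nearest neighbour of each sample
    links = set()
    for i, j in enumerate(nn):
        if nn[j] == i and y[i] != y[j]:   # mutual NNs, opposite classes
            links.add((min(i, j), max(i, j)))
    return sorted(links)

# toy data: a minority point sitting right next to a majority point
X = [[0.0], [0.1], [2.0], [2.1], [5.0]]
y = [0, 1, 0, 0, 0]
print(tomek_links(X, y))   # [(0, 1)]
```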

ES2006-28

Margin based Active Learning for LVQ Networks

Frank-Michael Schleif, Barbara Hammer, Thomas Villmann

Abstract
In this article, we extend a local prototype-based learning model by active learning, which gives the learner the capability to select training samples and thereby increase speed and accuracy of the model. Our algorithm is based on the idea of selecting a query on the borderline of the actual classification. This can be done by considering margins in an extension of learning vector quantization based on an appropriate cost function. The performance of the query algorithm is demonstrated on real life data.

Manuscript from author [PDF]

ES2006-129

Classification of Boar Sperm Head Images using Learning Vector Quantization

Michael Biehl, Piter Pasma, Marten Pijl, Lidia Sanchez, Nicolai Petkov

Abstract
We apply Learning Vector Quantization (LVQ) in the domain of medical image analysis for automated boar semen quality assessment. The classification of single boar spermatozoid heads into healthy (normal) and damaged (non-normal) ones is based on greyscale microscopic images. Sample data was classified by veterinary experts and is used for training a system with a number of prototypes for each class. We apply as training schemes Kohonen's LVQ1 and the recent variants Generalized LVQ (GLVQ) and Generalized Relevance LVQ (GRLVQ). We compare their performance and furthermore study the influence of the employed metric.

Manuscript from author [PDF]
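A minimal sketch of the LVQ1 update the comparison starts from (the winning prototype moves toward same-class samples and away from others; GLVQ and GRLVQ replace this heuristic with gradient steps on a cost function):

```python
import numpy as np

def lvq1(X, y, W, proto_labels, lr=0.05, epochs=30, seed=0):
    """Kohonen's LVQ1: the closest prototype is attracted by samples
    of its own class and repelled by samples of any other class."""
    rng = np.random.default_rng(seed)
    W = np.array(W, dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            w = np.argmin(((W - X[i]) ** 2).sum(axis=1))  # winner
            sign = 1.0 if proto_labels[w] == y[i] else -1.0
            W[w] += sign * lr * (X[i] - W[w])
    return W

# two well-separated classes, one prototype per class
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(4, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
W = lvq1(X, y, W=[[1.0, 1.0], [3.0, 3.0]], proto_labels=[0, 1])
pred = [np.argmin(((W - x) ** 2).sum(axis=1)) for x in X]
print(np.mean(pred == y))   # 1.0 on this toy problem
```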

ES2006-33

Selection of more than one gene at a time for cancer prediction from gene expression data

Oleg Okun, Nikolay Zagoruiko, Alexessander Alves, Olga Kutnenko, Irina Borisova

Abstract
A new gene selection method capable of selecting more than one gene at a time is introduced. This characteristic contrasts it with almost all known methods, which assume that there are no interactions between genes. The only exception is the pairwise gene selection method recently proposed by B{\o} and Jonassen~\cite{bj02}. Motivated by this method, we compare it with ours. Classification into healthy tissue and cancerous tumor is studied, where gene selection finds gene subsets well suited to discriminating between these classes. Experiments demonstrate the superiority of our method in terms of leave-one-out cross-validation error.

Manuscript from author [PDF]

ES2006-69

Visualizing gene interaction graphs with local multidimensional scaling

Jarkko Venna, Samuel Kaski

Abstract
Several bioinformatics data sets are naturally represented as graphs, for instance gene regulation, metabolic pathways, and protein-protein interactions. The graphs are often large and complex, and their straightforward visualizations are incomprehensible. We have recently developed a new method called local multidimensional scaling for visualizing high-dimensional data sets. In this paper we adapt it to visualize graphs, and compare it with two commonly used graph visualization packages in visualizing yeast gene interaction graphs. The new method outperforms the alternatives in two crucial respects: it produces graph layouts that are more trustworthy and have fewer edge crossings.

Manuscript from author [PDF]

ES2006-141

Fuzzy image segmentation with Fuzzy Labelled Neural Gas

Cornelia Brüß, Felix Bollenbeck, Frank-Michael Schleif, Winfriede Weschke, Thomas Villmann, Udo Seiffert

Abstract
Processing biological data often requires handling of uncertain and sometimes inconsistent information. Particularly when coping with image segmentation tasks against a biomedical background, a clear description of, for example, tissue borders is often hard to define. On the other hand, there are only a few promising segmentation algorithms able to process fuzzy input data. This paper describes one novel alternative, applying the recently introduced Fuzzy Labelled Neural Gas (FLNG) as a subsequent classification step to a biologically relevant fuzzy labelling with underlying image feature extraction.

Manuscript from author [PDF]

ES2006-144

Elucidating the structure of genetic regulatory networks: a study of a second order dynamical model on artificial data

Minh Quach, Pierre Geurts, Florence d'Alché-Buc

Abstract
Learning regulatory networks from time-series of gene expression is a challenging task. We propose to use synthetic data to analyze the ability of a state-space model to retrieve the network structure while varying a number of relevant problem parameters. ROC curves together with new tools such as spectral clustering of local solutions found by EM are used to analyze these results and provide relevant insights.

Manuscript from author [PDF]

[Back to Top]


Learning III


ES2006-56

OnlineDoubleMaxMinOver: a simple approximate time and information efficient online Support Vector Classification method

Daniel Schneegaß, Thomas Martinetz, Michael Clausohm

Abstract
We present the OnlineDoubleMaxMinOver approach to obtain the Support Vectors in two-class classification problems. With its linear time complexity and linear convergence, the algorithm achieves a competitive speed. We approach the problem of the impossibility of perfect non-trivial online Support Vector Learning by parameterising the exactness. Even in the case of linearly inseparable data within the feature space, the method converges to a solution expressible by a finite amount of information while observing an arbitrarily large number of input vectors. The results of the online method are comparable to the batch ones, occasionally even better.

Manuscript from author [PDF]

ES2006-80

Variants of Unsupervised Kernel Regression: General cost functions

Stefan Klanke, Helge Ritter

Abstract
We present an extension to a recent method for learning of nonlinear manifolds, which allows to incorporate general cost functions. We focus on the epsilon-insensitive loss and visually demonstrate our method on both toy and real data.

Manuscript from author [PDF]
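For reference, a sketch of the epsilon-insensitive loss in question, applied to a generic residual (the tube width `eps` is a free parameter, not a value from the paper):

```python
def eps_insensitive(residual, eps=0.1):
    """epsilon-insensitive loss: zero inside the tube |r| <= eps,
    growing linearly with |r| outside it (as in support vector
    regression)."""
    return max(0.0, abs(residual) - eps)

print(eps_insensitive(0.05))        # 0.0 (inside the tube)
print(eps_insensitive(-0.5, 0.1))   # 0.4
```

Small residuals inside the tube incur no penalty, which makes the resulting fit robust to minor noise around the manifold.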

ES2006-102

Degeneracy in model selection for SVMs with radial Gaussian kernel

Tobias Glasmachers

Abstract
We consider the model selection problem for support vector machines applied to binary classification. As the data generating process is unknown, we have to rely on heuristics as model selection criteria. In this study, we analyze the behavior of two criteria, the radius margin quotient and kernel polarization, applied to SVMs with radial Gaussian kernel. We prove necessary and sufficient conditions for local optima at the boundary of the kernel parameter space in the limit of arbitrarily narrow kernels. The theorems show that multi-modality of the model selection objectives can arise due to insignificant properties of the training dataset.

Manuscript from author [PDF]

ES2006-117

Evolino for recurrent support vector machines

Juergen Schmidhuber, Matteo Gagliolo, Daan Wierstra, Faustino Gomez

Abstract
We introduce a new class of recurrent, truly sequential SVM-like devices with internal adaptive states, trained by a novel method called EVOlution of systems with KErnel-based outputs (Evoke), an instance of the recent Evolino class of methods. Evoke evolves recurrent networks to detect and represent temporal dependencies while using SVM to produce precise outputs. Evoke is the first SVM-based mechanism learning to classify a context-sensitive language. It also outperforms recent state-of-the-art gradient-based recurrent neural networks (RNNs) on various time series prediction tasks.

Manuscript from author [PDF]

ES2006-114

Hybrid generative/discriminative training of radial basis function networks

Artur Ferreira, Mario Figueiredo

Abstract
We propose a new training algorithm for radial basis function networks (RBFN), which incorporates both generative (mixture-based) and discriminative (logistic) criteria. Our algorithm incorporates steps from the classical expectation-maximization algorithm for mixtures of Gaussians with a logistic regression step to update (in a discriminative way) the output weights. We also describe an incremental version of the algorithm, which is robust regarding initial conditions. Comparison of our approach with existing training algorithms, on (both synthetic and real) binary classification problems, shows that it achieves better performance.

Manuscript from author [PDF]

ES2006-124

Rotation-based ensembles of RBF networks

Juan J. Rodriguez, Jesus Maudes, Carlos Alonso

Abstract
Ensemble methods make it possible to improve the accuracy of classification methods. This work considers the application of one of these methods, named Rotation-based, when the classifiers to combine are RBF Networks. For each member of the ensemble, this method transforms the data set using a pseudo-random rotation of the axes. Then the classifier is constructed from this rotated data. The results of the ensembles obtained with this method are compared with the results of other ensemble methods (including Bagging and Boosting) over 34 data sets.

Manuscript from author [PDF]

ES2006-119

Learning and discrimination through STDP in a top-down modulated associative memory

Anthony Mouraud, Hélène Paugam-Moisy

Abstract
This article underlines the learning and discrimination capabilities of a model of associative memory based on artificial networks of spiking neurons. Inspired from neuropsychology and neurobiology, the model implements top-down modulations, as in neocortical layer V pyramidal neurons, with a learning rule based on synaptic plasticity (STDP), for performing a multimodal association learning task. A temporal correlation method of analysis proves the ability of the model to associate specific activity patterns to different samples of stimulation. Even in the absence of initial learning and with continuously varying weights, the activity patterns become stable enough for discrimination.

Manuscript from author [PDF]

ES2006-60

Gaussian and exponential architectures in small-world associative memories

Lee Calcraft, Rod Adams, Neil Davey

Abstract
The performance of sparsely-connected associative memory models built from a set of perceptrons is investigated using different patterns of connectivity. Architectures based on Gaussian and exponential distributions are compared to networks created by progressively rewiring a locally-connected network. It is found that while all three architectures are capable of good pattern-completion performance, the Gaussian and exponential architectures require a significantly lower mean wiring length to achieve the same results. In the case of networks of low connectivity, relatively tight Gaussian and exponential distributions achieve the best overall performance.

Manuscript from author [PDF]
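A minimal sketch of the Gaussian wiring scheme (assumed here on a ring of n units; the parameters and the ring topology are illustrative, not taken from the paper):

```python
import numpy as np

def gaussian_ring_connectivity(n, k, sigma, seed=0):
    """Each of n units on a ring receives k connections whose offset
    from the source unit is drawn from a Gaussian of width sigma, so
    most wiring is short-range with occasional longer links."""
    rng = np.random.default_rng(seed)
    conns = []
    for i in range(n):
        targets = set()
        while len(targets) < k:
            off = int(round(rng.normal(0.0, sigma)))
            if off != 0:                      # no self-connections
                targets.add((i + off) % n)
        conns.append(sorted(targets))
    return conns

n, k = 100, 5
conns = gaussian_ring_connectivity(n, k, sigma=3.0)
ring = lambda i, j: min(abs(i - j), n - abs(i - j))
mean_len = np.mean([ring(i, j) for i in range(n) for j in conns[i]])
print(mean_len < n / 4)   # far shorter mean wiring than uniform random (~n/4)
```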

ES2006-76

Parallel hardware implementation of a broad class of spiking neurons using serial arithmetic

Benjamin Schrauwen, Jan Van Campenhout

Abstract
Current digital, directly mapped implementations of spiking neural networks use serial processing and parallel arithmetic. On a standard CPU this may be a good choice, but when using a Field Programmable Gate Array (FPGA), other implementation architectures are possible. This work presents a hardware implementation of a broad class of integrate-and-fire spiking neurons with synapse models using parallel processing and serial arithmetic. This results in very fast and compact implementations of spiking neurons on FPGAs.

Manuscript from author [PDF]

ES2006-135

Generalization properties of spiking neurons trained with ReSuMe method

Filip Ponulak, Andrzej Kasiński

Abstract
In this paper we demonstrate the generalization ability of the spiking neurons trained with ReSuMe method. We show in a set of experiments that the learning neuron can approximate the input-output transformations defined by another - reference neuron with a high precision and that the learning process converges very quickly. We discuss the relationship between the neuron I/O properties and the weight distribution of its input connections. Finally, we discuss the conditions under which the neuron can approximate some given I/O transformations.

Manuscript from author [PDF]

ES2006-95

A sequence-encoding neural network for face recognition

Marek Barwiński, Rolf P. Würtz

Abstract
We propose a feature-based system for face recognition using contextual information to improve the recognition rate. A small (6 memory blocks, 3 cells each) recurrent neural network with internal memory cell states (LSTM) is trained on single images of 49 different identities randomly picked from the FERET database and tested on images with different facial expressions using a predefined saccade path. We present the improvement of the recognition rate and an outlook on the future development of the system, including autonomous saccade generation, evidence accumulation and novelty detection.

Manuscript from author [PDF]

ES2006-147

Freeform surface induction from projected planar curves via neural networks

Usman Khan, Abdelaziz Terchi, Sungwoo Lim, David Wright, Sheng-Feng Qin

Abstract
We propose a novel intelligent approach to 2D-to-3D reconstruction for on-line sketching in conceptual design. A Multilayer Perceptron neural network is employed to construct 3D freeform surfaces from 2D freehand curves. Planar curves were used to represent the boundary strokes of a freeform surface patch and varied iteratively to produce a training set. Sampled curves were used to train and test the network. The results obtained demonstrate that the network successfully learned the inverse-projection map and correctly inferred the respective surfaces from curves not previously encountered.

Manuscript from author [PDF]

ES2006-139

The combination of STDP and intrinsic plasticity yields complex dynamics in recurrent spiking networks

Andreea Lazar, Gordon Pipa, Jochen Triesch

Abstract
We analyze the dynamics of deterministic recurrent spiking neural networks with spike-timing dependent plasticity (STDP) and intrinsic plasticity (IP) that changes the excitability of individual units. We find that STDP and IP can synergistically interact to produce complex network dynamics. These dynamics are quite different from the dynamics of networks that lack one or the other form of plasticity. Our results suggest that a synergistic combination of different forms of plasticity may contribute to cortical dynamics of high complexity, and they underscore the need to carefully study the interaction of different plasticity forms.

Manuscript from author [PDF]
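The pairwise STDP window underlying such models can be sketched as follows (the amplitudes and time constant are illustrative defaults, not the values used in the paper):

```python
import numpy as np

def stdp_window(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for a pre/post spike pair, dt = t_post - t_pre (ms):
    potentiation when the presynaptic spike comes first (dt > 0),
    depression otherwise, both decaying exponentially with |dt|."""
    if dt > 0:
        return a_plus * np.exp(-dt / tau)
    return -a_minus * np.exp(dt / tau)

print(stdp_window(10.0) > 0)    # causal pair -> potentiation
print(stdp_window(-10.0) < 0)   # anti-causal pair -> depression
```

Intrinsic plasticity then adjusts each unit's excitability separately; the paper's point is that the two rules interact, not merely add up.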

ES2006-22

Reducing policy degradation in neuro-dynamic programming

Thomas Gabel, Martin Riedmiller

Abstract
We focus on neuro-dynamic programming methods to learn state-action value functions and outline some of the inherent problems to be faced, when performing reinforcement learning in combination with function approximation. In an attempt to overcome some of these problems, we develop a reinforcement learning method that monitors the learning process, enables the learner to reflect whether it is better to cease learning, and thus obtains more stable learning results.

Manuscript from author [PDF]

ES2006-122

Probabilistic classifiers and time-scale representations: application to the monitoring of a tramway guiding system

Zahra Hamou Mamar, Pierre Chainais, Alexandre Aussem

Abstract
We discuss a new diagnosis system combining wavelet analysis techniques and probabilistic classifiers for detecting tramway roller defects. A continuous wavelet transform is applied to the vibration signals measured by specific accelerometers located on the rail. A temporal segmentation of the signals is carried out in order to identify the contribution of each pair of rollers to the overall vibration signal. The singular value decomposition (SVD) method is applied to segments of the time-scale representation to extract the most significant features. The resulting multi-class problem is then solved using pairwise classifiers trained on two-class sub-problems. The efficiency of this approach is successfully illustrated in several experiments on the tramway.

Manuscript from author [PDF]
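The SVD feature-extraction step can be sketched roughly as follows (here a segment is a scales-by-time matrix from the wavelet transform; keeping the p largest singular values is an assumption for illustration):

```python
import numpy as np

def svd_features(segment, p=3):
    """Summarize one time-scale segment (scales x time) by its p largest
    singular values, which capture the dominant energy patterns."""
    s = np.linalg.svd(np.asarray(segment, dtype=float), compute_uv=False)
    return s[:p]

seg = np.random.default_rng(0).normal(size=(16, 64))
f = svd_features(seg)
print(len(f), bool(np.all(f[:-1] >= f[1:])))   # 3 True (sorted descending)
```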

ES2006-146

Pattern analysis in illicit heroin seizures: a novel application of machine learning algorithms

Frédéric Ratle, Anne-Laure Terrettaz, Mikhaďl Kanevski, Pierre Esseiva, Olivier Ribaux

Abstract
An application of machine learning algorithms to the clustering and classification of chemical data concerning heroin seizures is presented. The data concern the chemical constituents of heroin as given by a gas chromatography analysis. Following a preprocessing step, where the six initial constituents are reduced to only two significant features, the data are clustered in order to find natural classes, which we assume to correspond to the country of origin. A classification is then made using a multi-layer perceptron, a probabilistic neural network, a radial basis function network and the k-nearest neighbors method. Results are encouraging and add important information to previous work in the field.

Manuscript from author [PDF]

[Back to Top]