ESANN2019

27th European Symposium on Artificial Neural Networks
Bruges, Belgium, April 24-25-26


Content of the proceedings

WARNING: you need Adobe Acrobat Reader 7.0 or later to view the PDF files below



Classification and Bayesian learning


ES2019-35

Conditional BRUNO: a neural process for exchangeable labelled data

Iryna Korshunova, Yarin Gal, Arthur Gretton, Joni Dambre

Abstract
We present a neural process that models exchangeable sequences of high dimensional complex observations conditionally on a set of labels or tags. Our model combines the expressiveness of deep neural networks with the data-efficiency of Gaussian processes, resulting in a probabilistic model for which the posterior distribution is easy to evaluate and sample from, and the computational complexity scales linearly with the number of observations. The advantages of the proposed architecture are demonstrated on a challenging few-shot view reconstruction task which requires generalisation from short sequences of viewpoints.

Manuscript from author [PDF]

ES2019-98

Interpretable dynamics models for data-efficient reinforcement learning

Markus Kaiser, Clemens Otte, Thomas Runkler, Carl Henrik Ek

Abstract
In this paper, we present a Bayesian view on model-based reinforcement learning. We use expert knowledge to impose structure on the transition model and present an efficient learning scheme based on variational inference. This scheme is applied to a heteroskedastic and bimodal benchmark problem on which we compare our results to NFQ and show how our approach yields human-interpretable insight about the underlying dynamics while also increasing data-efficiency.

Manuscript from author [PDF]

ES2019-77

PAC-Bayes and Fairness: Risk and Fairness Bounds on Distribution Dependent Fair Priors

Luca Oneto, Michele Donini, Massimiliano Pontil

Abstract
We address the problem of algorithmic fairness: ensuring that sensitive information does not unfairly influence the outcome of a classifier. We tackle this issue within the PAC-Bayes framework and present an approach which trades off and bounds the risk and the fairness of the Gibbs classifier, measured with respect to different state-of-the-art fairness measures. For this purpose, we further develop the idea that the PAC-Bayes prior can be defined based on the data-generating distribution without actually needing to know it. In particular, we define a prior and a posterior which give more weight to functions that exhibit good generalization and fairness properties.

Manuscript from author [PDF]

ES2019-131

DropConnect for Evaluation of Classification Stability in Learning Vector Quantization

Jensun Ravichandran, Sascha Saralajew, Thomas Villmann

Abstract
In this paper we consider DropOut/DropConnect techniques known from deep neural networks to evaluate the stability of learning vector quantization (LVQ) classifiers. For this purpose, we regard LVQ as a multilayer network and transfer the respective concepts to it. In particular, we treat the output as a stochastic ensemble, from which an information-theoretic measure is obtained to judge the stability level.

Manuscript from author [PDF]
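
The stability measure described in the abstract can be illustrated in a few lines. The sketch below is a toy construction of ours (the function names, the dropping scheme and the normalized-entropy choice are assumptions, not the authors' code): connections to the prototypes are randomly zeroed, and the entropy of the resulting ensemble of nearest-prototype decisions serves as the stability score.

```python
import numpy as np

def lvq_predict(x, prototypes, labels):
    """Nearest-prototype classification (the standard LVQ decision rule)."""
    d = np.linalg.norm(prototypes - x, axis=1)
    return labels[np.argmin(d)]

def dropconnect_stability(x, prototypes, labels, p_drop=0.2, n_samples=200, seed=0):
    """Instability of one input's classification under DropConnect.

    Each trial zeroes a random fraction of the prototype components
    ("connections"); the ensemble of predictions is treated as a stochastic
    output and the normalized Shannon entropy of its label distribution is
    returned: 0 = perfectly stable, 1 = maximally unstable.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    counts = {c: 0 for c in classes}
    for _ in range(n_samples):
        mask = rng.random(prototypes.shape) >= p_drop
        counts[lvq_predict(x, prototypes * mask, labels)] += 1
    probs = np.array([counts[c] for c in classes]) / n_samples
    probs = probs[probs > 0]
    return float(-(probs * np.log2(probs)).sum() / np.log2(len(classes)))
```

On this toy geometry, a point sitting on a prototype is perfectly stable while a point halfway between two prototypes flips labels across the ensemble, which is exactly the behavior the entropy score is meant to expose.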

ES2019-189

Pixel-wise Conditioning of Generative Adversarial Networks

Cyprien Ruffino, Romain HERAULT, Eric Laloy, Gilles Gasso

Abstract
Generative Adversarial Networks (GANs) have proven successful for unsupervised image generation. Several works have extended GANs to image inpainting by conditioning the generation on the parts of the image one wants to reconstruct. However, these methods have limitations in settings where only a small subset of the image pixels is known beforehand. In this paper, we study the effectiveness of conditioning GANs by adding an explicit regularization term to enforce pixel-wise conditions when very few pixel values are provided. We also investigate the influence of this regularization term on the quality of the generated images and on the satisfaction of the conditions. Experiments conducted on MNIST and FashionMNIST show that this regularization term allows for controlling the trade-off between the quality of the generated images and constraint satisfaction.

Manuscript from author [PDF]
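
The kind of regularization term described in the abstract can be sketched as a masked reconstruction penalty added to the usual adversarial generator loss. This is a minimal illustration under our own naming, not the paper's exact formulation; `lam` plays the role of the trade-off weight between image quality and constraint satisfaction.

```python
import numpy as np

def pixelwise_condition_penalty(generated, target, mask, lam=1.0):
    """Masked reconstruction penalty for pixel-wise conditioning.

    Only the few known pixels (mask == 1) contribute; the average squared
    mismatch on those pixels is scaled by lam and would be added to the
    generator's adversarial loss during training.
    """
    diff = mask * (generated - target)
    return lam * float((diff ** 2).sum() / max(mask.sum(), 1))
```

Raising `lam` pushes the generator to reproduce the known pixel values exactly, at the possible cost of sample quality elsewhere, which is the trade-off the experiments in the paper investigate.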

ES2019-142

Committees as Artificial Organisms - Evolution and Adaptation

Roberto Alamino

Abstract
Generalised committee machines are proposed here to model an organism's DNA interaction with its environment and are shown to induce a unique genotype-phenotype map. An application to organisms subjected to a toxic environment is shown to allow antagonistic pleiotropy. The same scenario is studied to show the difference in adaptation when there is a fitness cost in the form of a lower reproduction rate.

Manuscript from author [PDF]

ES2019-68

Towards a device-free passive presence detection system with Bluetooth Low Energy beacons

Maximilian Münch, Karsten Huffstadt, Frank-Michael Schleif

Abstract
In an era of smart information systems and smart buildings, detecting, tracking and identifying the presence of attendants inside enclosed rooms has evolved into a key challenge in the research area of smart building systems. Several types of sensing systems have therefore been proposed over the past decade to tackle this challenge. Depending on the arrangement of the components, a distinction is made between so-called device-based active and device-free passive sensing systems. Here we focus on the device-free passive concept and introduce a strategy for using Bluetooth Low Energy beacons for passive presence detection.

Manuscript from author [PDF]

ES2019-84

Defending against poisoning attacks in online learning settings

Greg Collinge, Emil C Lupu, Luis Muñoz-González

Abstract
Machine learning systems are vulnerable to data poisoning, a coordinated attack where a fraction of the training dataset is manipulated by an attacker to subvert learning. In this paper we first formulate an optimal attack strategy against online learning classifiers to assess worst-case scenarios. We also propose two defence mechanisms to mitigate the effect of online poisoning attacks by analysing the impact of the data points in the classifier and by means of an adaptive combination of machine learning classifiers with different learning rates. Our experimental evaluation supports the usefulness of our proposed defences to mitigate the effect of poisoning attacks in online learning settings.

Manuscript from author [PDF]

ES2019-90

Hybrid vibration signal monitoring approach for rolling element bearings

Jarno Kansanaho, Tommi Kärkkäinen

Abstract
A new approach to identifying different lifetime stages of rolling element bearings, aimed at improving early bearing fault detection, is presented. We extract characteristic features from vibration signals generated by rolling element bearings. These data are first pre-labelled with an unsupervised clustering method. Then, supervised methods are used to improve the labelling. Moreover, we assess feature importance with each classifier. From a practical point of view, the classifiers are compared on how early they suggest the emergence of a bearing fault. The results show that all of the classifiers are usable for bearing fault detection and that feature importance was consistent across them.

Manuscript from author [PDF]

ES2019-93

Modal sense classification with task-specific context embeddings

Bo Li, Mathieu Dehouck, Pascal Denis

Abstract
Sense disambiguation of modal constructions is a crucial part of natural language understanding. Framed as a supervised learning task, this problem heavily depends on an adequate feature representation of the modal verb context. Inspired by recent work on general word sense disambiguation, we propose a simple approach of modal sense classification in which standard shallow features are enhanced with task-specific context embedding features. Comprehensive experiments show that these enriched contextual representations fed into a simple SVM model lead to significant classification gains over shallow feature sets.

Manuscript from author [PDF]

ES2019-114

Adversarial robustness of linear models: regularization and dimensionality

Istvan Megyeri, Istvan Hegedus, Mark Jelasity

Abstract
Many machine learning models are sensitive to adversarial input, meaning that very small but carefully designed noise added to correctly classified examples may lead to misclassification. The reasons for this are still poorly understood, even in the simple case of linear models. Here, we study linear models and offer a number of novel insights. We focus on the effect of regularization and dimensionality. We show that in very high dimensions adversarial robustness is inherently very low due to some mathematical properties of high-dimensional spaces that have received little attention so far. We also demonstrate that, although regularization may help, adversarial robustness is harder to achieve than high accuracy during the learning process. This is typically overlooked when researchers set optimization meta-parameters.

Manuscript from author [PDF]

ES2019-167

A Simple and Effective Scheme for Data Pre-processing in Extreme Classification

Sujay Khandagale, Rohit Babbar

Abstract
Extreme multi-label classification (XMC) refers to supervised multi-label learning involving hundreds of thousands or even millions of labels. It has been shown to be an effective framework for addressing crucial tasks such as recommendation, ranking and web-advertising. In this paper, we propose an effective and well-motivated data pre-processing scheme for XMC. We show that our proposed algorithm, PrunEX, can remove up to 90% of the input data that is redundant from a classification viewpoint. Our scheme is universal in the sense that it is applicable to all known public datasets in the domain of XMC.

Manuscript from author [PDF]

ES2019-154

MAP best performances prediction for endurance runners

Ángel Campo, Marc Francaux, Laurent Baijot, Michel Verleysen

Abstract
The preparation of long-distance runners requires estimating their potential race performances beforehand. Athlete performances can be modeled based on their past records, but the task is made difficult by the high variability in runner race performances. This paper presents a maximum a posteriori (MAP) estimation that addresses the issues related to this high variability. Athlete-specific priors and a dedicated residual model are inferred with the help of a large set of race results.

Manuscript from author [PDF]

ES2019-58

TrIK-SVM: an alternative decomposition for kernel methods in Kreı̆n spaces

Gaelle Loosli

Abstract
This work proposes an alternative kernel decomposition in the context of kernel machines with indefinite kernels. The original KSVM paper (SVM in Kreı̆n spaces) uses an eigen-decomposition; our proposition avoids this decomposition. We explain how this helps in designing an algorithm that does not require computing the full kernel matrix. Finally, we illustrate the good behavior of the proposed method compared to KSVM.

Manuscript from author [PDF]

[Back to Top]


Embeddings and Representation Learning for Structured Data


ES2019-4

Embeddings and Representation Learning for Structured Data

Benjamin Paaßen, Claudio Gallicchio, Alessio Micheli, Alessandro Sperduti

Abstract
Learning models of structured data, such as sequences, trees, and graphs, has become a rich and promising research objective in many fields of machine learning, such as (deep) neural networks, probabilistic models, kernels, metric learning, and dimensionality reduction. All these seemingly disparate approaches are connected by their construction of vectorial representations and embeddings of structured data, be it implicit or explicit, fixed or learned, deterministic or stochastic. Such embeddings can not only be utilized for classification or regression, but for generation of structured data, visualization, and interpretation.

Manuscript from author [PDF]

ES2019-107

Graph generation by sequential edge prediction

Davide Bacciu, Alessio Micheli, Marco Podda

Abstract
Graph generation with Machine Learning models is a challenging problem with applications in various research fields. Here, we propose a recurrent Deep Learning based model to generate graphs by learning to predict their ordered edge sequence. Despite its simplicity, our experiments on a wide range of datasets show that our approach is able to generate graphs originating from very different distributions, outperforming canonical graph generative models from graph theory, and reaching performances comparable to the current state of the art on graph generation.

Manuscript from author [PDF]

ES2019-137

On the definition of complex structured feature spaces

Nicolò Navarin, Dinh Van Tran, Alessandro Sperduti

Abstract
In this paper, we propose a graph kernel whose feature space is defined by combining pairs of features of an existing base graph kernel. Furthermore, we propose a variation in which the feature space adapts to the learning task at hand, allowing a representation suited to it to be learned. Experimental results on six real-world graph datasets from different domains show that the proposed kernels achieve a consistent performance improvement over the considered base kernel, and over feature combination methods previously defined in the literature.

Manuscript from author [PDF]

ES2019-60

Deep Weisfeiler-Lehman assignment kernels via multiple kernel learning

Nils Morten Kriege

Abstract
Kernels for structured data are commonly obtained by decomposing objects into their parts and adding up the similarities between all pairs of parts measured by a base kernel. Assignment kernels are based on an optimal bijection between the parts and have proven to be an effective alternative to the established convolution kernels. We explore how the base kernel can be learned as part of the classification problem. We build on the theory of valid assignment kernels derived from hierarchies defined on the parts. We show that the weights of this hierarchy can be optimized via multiple kernel learning. We apply this result to learn vertex similarities for the Weisfeiler-Lehman optimal assignment kernel for graph classification. We present first experimental results which demonstrate the feasibility and effectiveness of the approach.

Manuscript from author [PDF]

ES2019-67

Predicting vehicle behaviour using LSTMs and a vector power representation for spatial positions

Florian Mirus, Peter Blouw, Terrence Stewart, Jörg Conradt

Abstract
Predicting future vehicle behaviour is an essential task to enable safe and situation-aware automated driving. In this paper, we propose to encapsulate spatial information of multiple objects in a semantic vector-representation. Assuming that future vehicle motion is influenced not only by past positions but also by the behaviour of other traffic participants, we use this representation as input for a Long Short-Term Memory (LSTM) network for sequence to sequence prediction of vehicle positions. We train and evaluate our system on real-world driving data collected mainly on highways in southern Germany and compare it to other models for reference.

Manuscript from author [PDF]

ES2019-79

Efficient learning of email similarities for customer support

Jelle Bakker, Kerstin Bunte

Abstract
One way to increase customer satisfaction is efficient and consistent customer email support. In this contribution we investigate the use of dimensionality reduction, metric learning and classification methods to predict answer templates that can be used by an employee, or to retrieve historic conversations with potentially suitable answers given an email query. The strategies are tested on email data and the publicly available Reuters data. We conclude that prototype-based metric learning is fast to train and that the parameters provide a compressed representation of the database, enabling efficient content-based retrieval. Furthermore, learning customer email embeddings based on the similarity of employee answers is a promising direction for computer-aided customer support.

Manuscript from author [PDF]

ES2019-140

Nonnegative matrix factorization with polynomial signals via hierarchical alternating least squares

Cécile Hautecoeur, François Glineur

Abstract
Nonnegative matrix factorization (NMF) is a widely used tool in data analysis due to its ability to extract significant features from data vectors. Among the algorithms developed to solve NMF, hierarchical alternating least squares (HALS) is often used to obtain state-of-the-art results. We generalize HALS to tackle an NMF problem where both input data and features consist of nonnegative polynomial signals. Compared to standard HALS applied to a discretization of the problem, our algorithm is able to recover smoother features, with a computational time growing moderately with the number of observations compared to existing approaches.

Manuscript from author [PDF]
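
For context, the standard discretized HALS scheme that the paper generalizes updates one factor column (or row) at a time in closed form while keeping the rest fixed. The sketch below is this plain baseline in numpy, not the authors' polynomial variant.

```python
import numpy as np

def nmf_hals(X, rank, n_iter=200, seed=0):
    """Plain HALS for X ≈ W @ H with W, H >= 0.

    Each column of W (and row of H) has a closed-form nonnegative
    least-squares update when everything else is fixed; sweeping over
    them repeatedly is the HALS iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    eps = 1e-10  # small floor avoids zero-locked columns
    for _ in range(n_iter):
        # Update W column by column.
        HHt, XHt = H @ H.T, X @ H.T
        for k in range(rank):
            W[:, k] = np.maximum(
                eps, W[:, k] + (XHt[:, k] - W @ HHt[:, k]) / (HHt[k, k] + eps))
        # Update H row by row (same rule on the transposed problem).
        WtW, WtX = W.T @ W, W.T @ X
        for k in range(rank):
            H[k, :] = np.maximum(
                eps, H[k, :] + (WtX[k, :] - WtW[k, :] @ H) / (WtW[k, k] + eps))
    return W, H
```

On exact low-rank nonnegative data this baseline recovers the factors to high accuracy; the paper's contribution is to replace the discretized columns with nonnegative polynomial signals.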

[Back to Top]


Deep learning and CNN


ES2019-30

Deep Embedded SOM: joint representation learning and self-organization

Florent Forest, Mustapha Lebbah, Hanane Azzag, Jérôme Lacaille

Abstract
In the wake of recent advances in joint clustering and deep learning, we introduce the Deep Embedded Self-Organizing Map, a model that jointly learns representations and the code vectors of a self-organizing map. Our model is composed of an autoencoder and a custom SOM layer that are optimized in a joint training procedure, motivated by the idea that the SOM prior could help learning SOM-friendly representations. We evaluate SOM-based models in terms of clustering quality and unsupervised clustering accuracy, and study the benefits of joint training.

Manuscript from author [PDF]

ES2019-48

Deep convolutional neural network for survival estimation of Amyotrophic Lateral Sclerosis patients

Enrico Grisan, Alessandro Zandonà, Barbara Di Camillo

Abstract
We propose a convolutional neural network (CNN) coupled with a fully connected top layer for survival estimation. We design an objective function to directly estimate the probability of survival at discrete time intervals, conditional to the patient not having incurred any adverse event at previous time points. We test our CNN and objective function on a large dataset of longitudinal data of patients with Amyotrophic Lateral Sclerosis (ALS). We compare our CNN and the objective function against other neural networks designed for survival analysis, and against the optimization of Cox-partial-likelihood or a simple logistic classifier. The use of our objective function outperforms both Cox-partial-likelihood and logistic classifier, independently of the network architecture, and our deep CNN provides the best results in terms of AU-ROC, accuracy and mean absolute error.

Manuscript from author [PDF]

ES2019-89

Detecting adversarial examples with inductive Venn-ABERS predictors

Jonathan Peck, Bart Goossens, Yvan Saeys

Abstract
Inductive Venn-ABERS predictors (IVAPs) are a type of probabilistic predictor with the theoretical guarantee that their predictions are perfectly calibrated. We propose to exploit this calibration property for the detection of adversarial examples in binary classification tasks. By rejecting predictions when the uncertainty of the IVAP is too high, we obtain an algorithm that is both accurate on the original test set and significantly more robust to adversarial examples. The method appears to be competitive with the state of the art in adversarial defense, both in terms of robustness and scalability.

Manuscript from author [PDF]

ES2019-113

Learning Rich Event Representations and Interactions for Temporal Relation Classification

Onkar Pandit, Pascal Denis, Liva Ralaivola

Abstract
Most existing systems for identifying temporal relations between events rely heavily on hand-crafted features derived from event words and explicit temporal markers. Moreover, little attention has been given to automatically learning contextualized event representations or to finding complex interactions between events. This paper fills this gap by showing that a combination of rich event representations and interaction learning is essential to more accurate temporal relation classification. Specifically, we propose a neural architecture in which i) a Recurrent Neural Network (RNN) is used to extract contextual information for pairs of events, and ii) a deep Convolutional Neural Network (CNN) architecture is used to capture intricate interactions between events. We show that the proposed approach outperforms most existing systems on commonly used datasets, while using fully automatic feature extraction and simple local inference.

Manuscript from author [PDF]

ES2019-156

L1-norm double backpropagation adversarial defense

Ismaila Seck, Gaelle Loosli, Stéphane Canu

Abstract
Adversarial examples are a challenging open problem for deep neural networks. We propose in this paper to add a penalization term that forces the decision function to be flat in some regions of the input space, such that it becomes, at least locally, less sensitive to attacks. Our proposition is theoretically motivated and shows on a first set of carefully conducted experiments that it behaves as expected when used alone, and seems promising when coupled with adversarial training.

Manuscript from author [PDF]
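
The penalty is easiest to see on a model whose input gradient is analytic. The toy sketch below (our own construction, not the paper's network) computes the mean L1 norm of the decision function's gradient with respect to the inputs for a logistic model; for a deep net the same quantity is obtained via double backpropagation through the graph.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l1_input_gradient_penalty(w, b, X):
    """Mean L1 norm of the decision function's input gradient.

    For the logistic model f(x) = sigmoid(w.x + b) the gradient w.r.t.
    the input is analytic: df/dx = f(1-f) * w. Adding this term to the
    training loss flattens f locally around the data, making the model
    less sensitive to small adversarial perturbations.
    """
    p = sigmoid(X @ w + b)               # predictions, shape (n,)
    grads = (p * (1 - p))[:, None] * w   # input gradients, shape (n, d)
    return float(np.abs(grads).sum(axis=1).mean())
```

A zero-weight model is maximally flat and pays no penalty, while large weights near the decision boundary are penalized most, which is the local-flatness effect the paper aims for.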

ES2019-174

Application of deep neural networks for automatic planning in radiation oncology treatments

Ana Barragan Montero, Dan Nguyen, Weiguo Lu, Mu-Han Lin, Xavier Geets, Edmond Sterpin, Steve Jiang

Abstract
Treatment planning for radiotherapy patients is a time-consuming and manual process. In this work, we investigate the use of deep neural networks to learn from previous clinical cases and directly predict the optimal dose distribution for a new patient. The proposed model combines two architectures, UNet and DenseNet, and uses mean squared error as the loss function. Ten input channels were used to include dosimetric and anatomical information. A set of 100 patients was used for training/validation and 29 for testing. Dice similarity coefficients ≥ 0.9 for the isodose lines in the predicted versus the clinical dose showed the excellent accuracy of the model.

Manuscript from author [PDF]

ES2019-25

Conditional WGAN for grasp generation

Florian Patzelt, Robert Haschke, Helge Ritter

Abstract
This work proposes a new approach to robotic grasping exploiting conditional Wasserstein generative adversarial networks (WGANs), which output promising grasp candidates from depth image inputs. In contrast to discriminative models, the WGAN approach enables deliberative navigation in the set of feasible grasps and thus allows a smooth integration with other motion planning tools. We find that the training autonomously partitioned the space of feasible grasps into several regions corresponding to different grasp types. Each region forms a smooth grasp manifold with latent parameters corresponding to important grasp parameters like approach direction. We evaluate the model in simulation on the multi-fingered Shadow Robot hand, comparing it a) to a classical grasp planner for primitive geometric object shapes and b) to a state-of-the-art discriminative network model. The proposed generative model matches the grasp success rate of its trainer models and exhibits better generalization.

Manuscript from author [PDF]

ES2019-19

Multilingual short text categorization using convolutional neural network

Liriam Enamoto, Li Weigang

Abstract
One of the most meaningful uses of online social media is to communicate quickly during emergencies. In a global emergency, the threat might cross national borders and affect different cultures and languages. This article explores Convolutional Neural Networks (CNNs) for multilingual short text categorization in English, Japanese and Portuguese to identify useful information in social media. A CNN is constructed for this special purpose. The experimental results show that the CNN model performs better than SVM even on a small dataset. More interestingly, the cross-language tests suggest that English, Japanese and Portuguese text can use the same model with only a few hyperparameter changes.

Manuscript from author [PDF]

ES2019-26

Fast and reliable architecture selection for convolutional neural networks

Lukas Hahn, Lutz Roese-Koerner, Klaus Friedrichs, Anton Kummert

Abstract
The performance of a Convolutional Neural Network (CNN) depends on its hyperparameters, such as the number of layers, the kernel sizes, or the learning rate. Especially in smaller networks and applications with limited computational resources, optimisation is key. We present a fast and efficient approach for CNN architecture selection. Taking into account time consumption, precision and robustness, we develop a heuristic to quickly and reliably assess a network's performance. In combination with Bayesian optimisation, to effectively cover the vast parameter space, our contribution offers a plain and powerful architecture search for this machine learning technique.

Manuscript from author [PDF]

ES2019-32

On the Speedup of Deep Reinforcement Learning Deep Q-Networks (RL-DQNs)

Anas Albaghajati, Lahouari Ghouti

Abstract
Deep reinforcement learning (DRL) merges reinforcement learning (RL) and deep learning (DL). DRL-based agents rely on high-dimensional imagery inputs to make accurate decisions. Such excessively high-dimensional inputs and sophisticated algorithms require very powerful computing resources and long training times. To alleviate the need for powerful resources and reduce the training times, this paper proposes novel solutions to mitigate the curse of dimensionality without compromising the DRL agent's performance. Using these solutions, the deep Q-network model (DQN) and its improved versions require shorter training times while achieving better performance.

Manuscript from author [PDF]

ES2019-37

Deep Autoencoder Feature Extraction for Fault Detection of Elevator Systems

Krishna Mohan Mishra, Tomi Krogerus, Kalevi Huhtala

Abstract
In this research, we propose a generic deep autoencoder model for automated feature extraction from elevator sensor data. The extracted deep features are classified with a random forest algorithm for fault detection. Sensor data are labelled as healthy or faulty based on the maintenance actions recorded. We include all fault types present for each elevator. The remaining healthy data are used for validation of the model, to prove its efficacy in terms of avoiding false positives. We achieve nearly 100% accuracy in fault detection while avoiding false positives based on the newly extracted deep features, which outperform the results obtained with existing features.

Manuscript from author [PDF]

ES2019-121

Detecting Ghostwriters in High Schools

Magnus Stavngaard, August Sørensen, Stephan Lorenzen, Niklas Hjuler, Stephen Alstrup

Abstract
Students hiring ghostwriters to write their assignments is an increasing problem in educational institutions all over the world, with companies selling these services as a product. In this work, we develop automatic techniques with special focus on detecting such ghostwriting in high school assignments. This is done by training deep neural networks on an unprecedentedly large amount of data supplied by the Danish company MaCom, which covers 90% of Danish high schools. We achieve an accuracy of 0.875 and an AUC score of 0.947 on an evenly split data set.

Manuscript from author [PDF]

ES2019-125

Design of Power-Efficient FPGA Convolutional Cores with Approximate Log Multiplier

Leonardo Tavares Oliveira, Min Soo Kim, Alberto Antonio Del Barrio García, Nader Bagherzadeh, Ricardo Menotti

Abstract
This paper presents the design of a convolutional core that utilizes an approximate log multiplier to significantly reduce the power consumption of FPGA acceleration of convolutional neural networks. The core also exploits FPGA reconfigurability, as well as the parallelism and input-sharing opportunities in convolutional layers, to minimize the costs. The simulation results show reductions of up to 78.19% in LUT usage and 60.54% in power consumption compared to a core that uses an exact fixed-point multiplier, while maintaining comparable accuracy on a subset of the MNIST dataset.

Manuscript from author [PDF]
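
Approximate log multipliers of this kind typically follow Mitchell's algorithm, which replaces multiplication by an addition in an approximate log domain. The software sketch below is our illustration of that principle, not the paper's FPGA design: in hardware the pieces reduce to a leading-one detector, shifts and an adder.

```python
def mitchell_log2(x):
    """Mitchell's log approximation for a positive integer.

    Writing x = 2**k * (1 + m) with 0 <= m < 1, approximate
    log2(x) ≈ k + m; k comes from the leading-one position and m
    from the bits below it (a shift in hardware).
    """
    k = x.bit_length() - 1    # position of the leading one
    m = x / (1 << k) - 1.0    # fractional mantissa
    return k + m

def mitchell_mul(a, b):
    """Approximate a*b: add the two approximate logs, then take the
    approximate antilog 2**k * (1 + m)."""
    s = mitchell_log2(a) + mitchell_log2(b)
    k = int(s)                # s >= 0 for a, b >= 1
    return (1 << k) * (1.0 + (s - k))
```

The result is exact when both operands are powers of two, never exceeds the true product, and underestimates it by at most about 11.1%; that bounded error is what the convolutional core trades for the reported LUT and power savings.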

ES2019-144

Improving Pedestrian Recognition using Incremental Cross Modality Deep Learning

Danut Ovidiu Pop, Alexandrina Rogozan, Fawzi Nashashibi, Abdelaziz Bensrhair

Abstract
Late fusion schemes with deep learning classification patterns over multi-modality images play an essential role in pedestrian protection systems, since they have achieved prominent results in the pedestrian recognition task. In this paper, a late fusion scheme merged with Convolutional Neural Networks (CNNs) is investigated for pedestrian recognition based on the Daimler stereo vision data sets. An independent CNN-based classifier for each imaging modality (Intensity, Depth, and Optical Flow) is trained before the fusion of their probabilistic output scores with a Multi-Layer Perceptron, which provides the recognition decision. We set out to prove that the incremental cross-modality deep learning approach enhances pedestrian recognition performance and outperforms state-of-the-art pedestrian classifiers on the Daimler stereo-vision data sets.

Manuscript from author [PDF]

ES2019-152

Machine learning in research and development of new vaccines products: opportunities and challenges

Paul Smyth, Gaël de Lannoy, Moritz Von Stosch, Alexander Pysik, Amin Khan

Abstract
Modern high-throughput technologies deployed in research and development of new vaccine products have opened the door to machine learning applications that allow the automation of tasks and support for data-driven risk-based decision making. In this paper, the opportunities and the challenges faced for the deployment of machine learning algorithms in the field of vaccines development are discussed.

Manuscript from author [PDF]

ES2019-157

Real-time Convolutional Neural Networks for emotion and gender classification

Matias Valdenegro-Toro, Octavio Arriaga, Paul Plöger

Abstract
Emotion and gender recognition from facial features are important aspects of human empathy. Robots should also have these capabilities. For this purpose we have designed special convolutional modules that allow a model to recognize emotions and gender with a considerably lower number of parameters, enabling real-time evaluation on a constrained platform. We report accuracies of 96% on the IMDB gender dataset and 66% on the FER-2013 emotion dataset, while requiring a computation time of less than 0.008 seconds on a Core i7 CPU. All our code, demos and pre-trained architectures have been released under an open-source license in our repository at https://github.com/oarriaga/face_classification

Manuscript from author [PDF]

[Back to Top]


Learning methods and optimization


ES2019-57

Experimental study of the neuron-level mechanisms emerging from backpropagation

Simon Carbonnelle, Christophe De Vleeschouwer

Abstract
The backpropagation algorithm is the most successful learning algorithm for training deep artificial neural networks. Its inner workings are in stark contrast with other learning rules, as it is based on a global, black-box optimization procedure rather than the repetition of a local, neuron-level procedure (e.g. Hebbian learning). In this paper, we present preliminary evidence suggesting that local, neuron-level mechanisms do in fact emerge during backpropagation-based training of neural networks, and we describe what could be their key components.

Manuscript from author [PDF]

ES2019-69

Learning multimodal fixed-point weights using gradient descent

Lukas Enderich, Fabian Timm, Lars Rosenbaum, Wolfram Burgard

Abstract
Due to their high computational complexity, deep neural networks are still limited to powerful processing units. To promote a reduced model complexity by dint of low-bit fixed-point quantization, we propose a gradient-based optimization strategy to generate a symmetric mixture of Gaussian modes (SGM) where each mode belongs to a particular quantization stage. We achieve 2-bit state-of-the-art performance and illustrate the model's ability for self-dependent weight adaptation during training.

Manuscript from author [PDF]

ES2019-133

Preconditioned conjugate gradient algorithms for graph regularized matrix completion

Shuyu Dong, Pierre-Antoine Absil, Kyle Gallivan

Abstract
Low-rank matrix completion is the problem of recovering the missing entries of a data matrix under the assumption that a good low-rank approximation to the true matrix is possible. Much attention has recently been devoted to exploiting correlations between the column/row entities through side information to improve the matrix completion quality. In this paper, we propose an efficient algorithm for solving low-rank matrix completion with graph-based regularizers. Experiments on synthetic data show that our approach achieves a significant speedup compared to the alternating minimization scheme.

Manuscript from author [PDF]

ES2019-194

Direct calculation of out-of-sample predictions in multi-class kernel FDA

Matthias Treder

Abstract
After a two-class kernel Fisher Discriminant Analysis (KFDA) has been trained on the full dataset, matrix inverse updates allow for the direct calculation of out-of-sample predictions for different test sets. Here, this approach is extended to the multi-class case by casting KFDA in an Optimal Scoring framework. In simulations using 10-fold cross-validation and permutation tests the approach is shown to be more than 1000x faster than retraining the classifier in each fold. Direct out-of-sample predictions can be useful on large datasets and in studies with many training-testing iterations.
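As an aside for the reader, the matrix inverse updates mentioned in the abstract are, in the two-class case, instances of rank-one inverse updates. A minimal sketch of the Sherman-Morrison identity that makes such direct out-of-sample computation cheap (our own illustrative code, not the authors' implementation):

```python
import numpy as np

def sherman_morrison(Ainv, u, v):
    """Given A^{-1}, return (A + u v^T)^{-1} without refactorizing A."""
    Au = Ainv @ u
    vA = v @ Ainv
    return Ainv - np.outer(Au, vA) / (1.0 + v @ Au)

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 4 * np.eye(4)   # well-conditioned test matrix
u, v = rng.normal(size=4), rng.normal(size=4)
updated = sherman_morrison(np.linalg.inv(A), u, v)
# Agrees with a full re-inversion of the updated matrix.
print(np.allclose(updated, np.linalg.inv(A + np.outer(u, v))))
```

The update costs O(n^2) instead of the O(n^3) of a fresh inversion, which is the source of the speedups reported above.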

Manuscript from author [PDF]

ES2019-176

Complex Valued Gated Auto-encoder for Video Frame Prediction

Niloofar Azizi, Nils Wandel, Sven Behnke

Abstract
Over recent years, complex-valued artificial neural networks have gained increasing interest as they allow neural networks to learn richer representations while potentially requiring fewer parameters. Especially in the domain of computer graphics, many traditional operations such as image smoothing/sharpening rely heavily on computations in the complex domain, so complex-valued neural networks apply naturally. In this paper, we perform frame prediction in video sequences using a complex-valued gated auto-encoder with tied input weights. First, we motivate our method by showing how the Fourier transform can be seen as the basis for translational operations. Then, we present how a complex neural network can learn such transformations and compare its performance and parameter efficiency to a real-valued gated auto-encoder. Furthermore, we show how extending both the real-valued and the complex-valued networks with convolutional units can significantly improve prediction performance and parameter efficiency. All networks are assessed on the bouncing ball dataset.

Manuscript from author [PDF]

ES2019-80

On overfitting of multilayer perceptrons for classification

Joseph Rynkiewicz

Abstract
In this paper, we consider classification models involving multilayer perceptrons (MLP) with rectified linear (ReLU) functions for activation units. It is a difficult task to study the statistical properties of such models. The main reason is that in practice these models may be heavily overparameterized. We study the asymptotic behavior of the difference between the loss function of estimated models and the loss function of the theoretical best model. These theoretical results give us information on the overfitting properties of such models. Some simulations illustrate our theoretical findings and raise new questions.

Manuscript from author [PDF]

ES2019-141

Very Simple Classifier: a concept binary classifier to investigate features based on subsampling and locality

Luca Masera, Enrico Blanzieri

Abstract
We propose the Very Simple Classifier (VSC), a novel method designed to incorporate the concepts of subsampling and locality in the definition of features to be used as the input of a perceptron. The rationale is that locality theoretically guarantees a bound on the generalization error. Each feature in VSC is a max-margin classifier built on randomly selected pairs of samples. Locality in VSC is achieved by multiplying the value of the feature by a confidence measure that can be characterized in terms of the Chebyshev inequality. The output of the layer is then fed into an output layer of neurons, whose weights are determined by a regularized pseudoinverse. An extensive comparison of VSC against 9 competitors on the task of binary classification is carried out. Results on 22 benchmark datasets with fixed parameters show that VSC is competitive with the multilayer perceptron (MLP) and outperforms the other competitors. An exploration of the parameter space shows VSC can outperform MLP.

Manuscript from author [PDF]

ES2019-178

Sparse minimal learning machine using a diversity measure minimization

Madson Dias, Lucas Sousa, Ajalmar Rocha Neto, Cesar Mattos, Joao Gomes, Tommi Kärkkäinen

Abstract
The minimal learning machine (MLM) training procedure consists in solving a linear system with multiple measurement vectors (MMV) created between the geometric configurations of points in the input and output spaces. Such geometric configurations are built upon two matrices created using subsets of input and output points, named reference points (RPs). The present paper considers an extension of the focal underdetermined system solver (FOCUSS) for MMV linear system problems with additive noise, named regularized MMV FOCUSS (regularized M-FOCUSS), and evaluates it in the task of selecting input reference points for regression settings. Experiments were carried out using UCI datasets, where the proposed method was able to produce sparser models and achieve competitive performance when compared to the regular strategy of selecting MLM input RPs.

Manuscript from author [PDF]

ES2019-87

Minimax center to extract a common subspace from multiple datasets

Emilie Renard, Pierre-Antoine Absil, Kyle Gallivan

Abstract
We address the problem of extracting common information from multiple datasets. More specifically, we look for a common subspace minimizing the maximal dissimilarity with all datasets, and we propose an algorithm derived from the first-order necessary conditions of optimality. On synthetic datasets the proposed method gives results as good as a Riemannian-based approach, while also providing an evaluation of how far the iterate is from a critical point.

Manuscript from author [PDF]

ES2019-164

Interpolation on the manifold of fixed-rank positive-semidefinite matrices for parametric model order reduction: preliminary results

Estelle Massart, Pierre-Yves Gousenbourger, Thanh Son Nguyen, Tatjana Stykel, Pierre-Antoine Absil

Abstract
We present several interpolation schemes on the manifold of fixed-rank positive-semidefinite (PSD) matrices. We explain how these techniques can be used for model order reduction of parameterized linear dynamical systems, and obtain preliminary results on an application.

Manuscript from author [PDF]

ES2019-115

Progress Towards Graph Optimization: Efficient Learning of Vector to Graph Space Mappings

Stefan Mautner, Rolf Backofen, Fabrizio Costa

Abstract
Optimization in vector space domains is well understood. However, in high-dimensional settings or when dealing with structured data such as sequences and graphs, optimization becomes difficult. A possible strategy is to map graphs to vector codes and use machine learning to learn a map from codes back to graphs. This in turn allows standard optimization techniques over vectors to be employed to optimize graphs. Here we propose an approach to invert a vector mapping based on a combination of graph kernels and graph grammars. We evaluate the proposed approach in an artificial setup and on real molecular graphs.

Manuscript from author [PDF]

[Back to Top]


60 Years of Weightless Neural Systems


ES2019-1

Systems with 'subjective feelings' - the perspective from weightless automata

Igor Aleksander, Helen Morton

Abstract

Manuscript from author [PDF]

ES2019-108

Prediction of palm oil production with an enhanced n-Tuple Regression Network

Leopoldo Lusquino Filho, Luiz Oliveira, Aluizio Lima Filho, Gabriel Guarisa, Priscila Machado Vieira Lima, Felipe Maia Galvão França

Abstract
This paper introduces Regression WiSARD and ClusRegression WiSARD, two new weightless neural network models that were applied to the challenging task of predicting the total palm oil production of a set of 28 differently located sites under different climate and soil profiles. Both models were derived from the n-tuple regression weightless neural model and obtained error rates of 8.737% and 8.938%, respectively, which are very competitive with the state-of-the-art (7.569%), whilst being four orders of magnitude faster during the training phase.

Manuscript from author [PDF]

ES2019-83

Memory Efficient Weightless Neural Network using Bloom Filter

Leandro Santiago de Araújo, Letícia Dias Verona, Fábio Medeiros Rangel, Fabricio Firmino de Faria, Daniel Sadoc Menasche, Wouter Caarls, Maurício Breternitz, Sandip Kundu, Priscila Machado Vieira Lima, Felipe Maia Galvão França

Abstract
Weightless Neural Networks are artificial neural networks based on RAM memory, broadly explored as a solution for pattern recognition applications. Thanks to this memory-based approach, they can be easily implemented in hardware and software and provide an efficient learning mechanism. Unfortunately, the straightforward implementation requires a large amount of memory, making its adoption impracticable on memory-constrained systems. In this paper, we propose a new Weightless Neural Network model that uses Bloom filters to implement RAM nodes. Bloom filters greatly reduce the required memory resources, at the cost of allowing false-positive entries. Experimental results show that our model using Bloom filters achieves competitive accuracy, training time and testing time, while consuming up to six orders of magnitude less memory than the standard Weightless Neural Network model.
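As an illustration of the core idea (our own sketch, not the paper's implementation): each RAM node only needs to answer "was this address tuple seen during training?", which is exactly the approximate membership query a Bloom filter provides.

```python
import hashlib

class BloomNode:
    """A WiSARD-style RAM node backed by a Bloom filter: it remembers which
    address tuples were observed during training, with possible false
    positives but far less memory than a full 2^n-entry RAM."""
    def __init__(self, n_bits=64, n_hashes=3):
        self.bits = 0            # the filter, packed into one integer
        self.n_bits = n_bits
        self.n_hashes = n_hashes

    def _positions(self, addr):
        # Derive n_hashes bit positions from salted hashes of the tuple.
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{addr}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.n_bits

    def train(self, addr):
        for p in self._positions(addr):
            self.bits |= 1 << p

    def respond(self, addr):
        return all(self.bits >> p & 1 for p in self._positions(addr))

node = BloomNode()
node.train((1, 0, 1))
print(node.respond((1, 0, 1)))  # stored tuple -> True
print(node.respond((0, 0, 0)))  # unseen tuple -> almost surely False
```

A full discriminator would hold one such node per input tuple and sum the responses; the memory saving comes from `n_bits` being fixed rather than exponential in the tuple length.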

Manuscript from author [PDF]

ES2019-56

A WNN model based on Probabilistic Quantum Memories

Priscila G.M. dos Santos, Rodrigo S Sousa, Adenilton J. da Silva

Abstract
In this work, we evaluate a Weightless Neural Network model based on a Probabilistic Quantum Memory. The model does not require any training and performs classification by calculating the Hamming distance between a new sample and the training samples stored in the quantum memory. In order to evaluate the classification capabilities of this quantum model, we conducted classical experiments using an equivalent classical description of the Probabilistic Quantum Memory algorithm. We present the first evaluation of quantum weightless neural networks on public benchmark datasets.
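Classically, the classification rule described above reduces to nearest-pattern matching under the Hamming distance; a minimal sketch (illustrative names, not the paper's code):

```python
import numpy as np

def hamming_classify(sample, train_patterns, train_labels):
    """Return the label of the stored binary pattern with the smallest
    Hamming distance to the new sample (classical counterpart of the
    quantum memory's distance-based retrieval)."""
    distances = [np.sum(sample != p) for p in train_patterns]
    return train_labels[int(np.argmin(distances))]

patterns = np.array([[0, 0, 1, 1], [1, 1, 0, 0]])
labels = ["A", "B"]
print(hamming_classify(np.array([0, 1, 1, 1]), patterns, labels))  # -> A
```

The quantum model evaluates these distances over all stored patterns in superposition; the sketch above only shows the classical decision rule being approximated.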

Manuscript from author [PDF]

ES2019-153

Weightless neural systems for deforestation surveillance and image-based navigation of UAVs in the Amazon forest

Eduardo Ribeiro, Vitor Torres, Brayan James, Mateus Braga, Elcio Shiguemori, Haroldo Velho, Luiz Torres, Antônio Braga

Abstract
This work proposes a novel methodology for the recognition of deforestation areas in tropical forests using weightless neural systems in UAVs. The weightless neural systems, embedded in hardware, bring a considerable improvement in the processing speed of image-based UAV navigation. In our approach the UAV navigates along the frontier of the deforestation area by means of previously trained descriptors, being able to monitor the increase of the deforested area. Experiments using images of the Amazon rainforest have been performed to validate the proposed approach.

Manuscript from author [PDF]

ES2019-54

An evolutionary approach for optimizing weightless neural networks

Maurizio Giordano, Massimo De Gregorio

Abstract
WiSARD is a weightless neural network model that uses RAMs to store the function computed by each neuron rather than storing it in connection weights between neurons. Non-linearity in WiSARD is implemented by a mapping that splits the binary input into tuples of bits and associates these tuples to neurons. In this work we apply an evolutionary algorithm to evolve an initial population of mappings through combination and mutation, generating new mappings that grant significant improvements in classification accuracy in the conducted experiments.

Manuscript from author [PDF]

ES2019-187

Modeling Sparse Data as Input for Weightless Neural Network

Luis Kopp, Jose Barbosa Filho, Priscila Machado Vieira Lima, Claudio de Farias

Abstract
Dealing with large and sparse input data has been a challenge for machine learning algorithms. In Natural Language Processing (NLP), the number of words used in a text is only a small fraction of a dictionary of all possible words, which leads to a very sparse matrix. In this paper we propose an alternative method for constructing the input vector in a Weightless Neural Network model using WiSARD. Our algorithm significantly outperformed the benchmark method in accuracy, by 3.7% on average, when aggregating columns in groups of 3 or 6 words.

Manuscript from author [PDF]

[Back to Top]


Domain adaptation and learning


ES2019-20

Multi-target feature selection through output space clustering

Konstantinos Sechidis, Eleftherios Spyromitros-Xioufis, Ioannis Vlahavas

Abstract
A key challenge in information theoretic feature selection is to estimate mutual information expressions that capture three desirable terms: the relevancy of a feature with the output, the redundancy and the complementarity between groups of features. The challenge becomes more pronounced in multi-target problems, where the output space is multi-dimensional. Our work presents a generic algorithm that captures these three desirable terms and is suitable for the well-known multi-target prediction settings of multi-label/dimensional classification and multivariate regression. We achieve this by combining two ideas: deriving low-order information theoretic approximations for the input space and using clustering for deriving low-dimensional approximations of the output space.

Manuscript from author [PDF]

ES2019-162

Feature relevance bounds for ordinal regression

Lukas Pfannschmidt, Jonathan Jakob, Michael Biehl, Peter Tino, Barbara Hammer

Abstract
The increasing occurrence of ordinal data, mainly sociodemographic, led to a renewed research interest in ordinal regression, i.e. the prediction of ordered classes. Besides model accuracy, the interpretation of these models itself is of high relevance, and existing approaches therefore enforce e.g. model sparsity. For high dimensional or highly correlated data, however, this might be misleading due to strong variable dependencies. In this contribution, we aim for an identification of feature relevance bounds which – besides identifying all relevant features – explicitly differentiates between strongly and weakly relevant features.

Manuscript from author [PDF]

ES2019-44

User-steering interpretable visualization with probabilistic principal components analysis

Viet Minh Vu, Benoît Frénay

Abstract
A lack of interpretability is often encountered in machine learning generally and in visualization specifically. Integrating the user's feedback into the visualization process is a potential solution. This paper shows that the user's knowledge, expressed by the positions of fixed points in the visualization, can be transferred directly into a probabilistic principal components analysis (PPCA) model to help the user steer the visualization. Our proposed interactive PPCA model is evaluated on different datasets to prove the feasibility of creating explainable axes for the visualization.

Manuscript from author [PDF]

ES2019-49

Metric learning with submodular functions

Jiajun Pan, Hoel Le Capitaine

Abstract
Metric learning mainly focuses on learning distances (or similarities) that use single feature weights with Lp norms, or pairs of features with Mahalanobis distances. In this paper, we consider higher-order interactions in the feature space with the help of submodular set functions. We propose to define a distance metric for continuous features based on submodular functions, and then present a dedicated metric learning approach. This naturally comes at the price of higher complexity, so we propose a method to decrease this complexity by reducing the order of interactions that are taken into account. This approach finally gives a computationally feasible problem. Experiments on various datasets show the effectiveness of the approach.

Manuscript from author [PDF]

ES2019-135

Fusing Features based on Signal Properties and TimeNet for Time Series Classification

Arijit Ukil, Pankaj Malhotra, Soma Bandyopadhyay, Tulika Bose, Ishan Sahu, Ayan Mukherjee, Lovekesh Vig, Arpan Pal, Gautam Shroff

Abstract
Automated feature extraction from time series to capture statistical, temporal, spectral, and morphological properties is highly desirable but challenging due to the diverse nature of real-world time series applications. In this paper, we consider extracting a rich and robust set of time series features encompassing signal-processing-based features as well as generic hierarchical features extracted via deep neural networks. We present SPGF-TimeNet: a generic feature extractor for time series that allows fusion of signal processing, information-theoretic, and statistical features (Signal Properties based Generic Features (SPGF)) with features from an off-the-shelf pre-trained deep recurrent neural network (TimeNet). Through empirical evaluation on diverse benchmark datasets from the UCR Time Series Classification (TSC) Archive, we show that classifiers trained on SPGF-TimeNet-based hybrid and generic features outperform state-of-the-art TSC algorithms such as BOSS, while being computationally efficient.

Manuscript from author [PDF]

ES2019-51

Metric learning with relational data

Jiajun Pan, Hoel Le Capitaine

Abstract
The vast majority of metric learning approaches are meant to be applied to data described by feature vectors, with some notable exceptions such as time series, trees or graphs. The objective of this paper is to propose metric learning algorithms that consider multi-relational data. More specifically, we present a metric learning approach taking into account the features of the observations, as well as the relationships between observations. Experiments and comparisons of the two settings on a collective classification task on real-world datasets show that our method i) presents better performance than other approaches in both settings, and ii) scales well with the volume of the data.

Manuscript from author [PDF]

ES2019-110

Feature and Algorithm Selection for Capacitated Vehicle Routing Problems

Jussi Rasku, Nysret Musliu, Tommi Kärkkäinen

Abstract
Many exact, heuristic, and metaheuristic algorithms have been proposed to effectively produce high quality solutions to vehicle routing problems. However, it remains an open question which algorithm is the most appropriate for solving a given problem instance, mostly because the different strengths and weaknesses of algorithms are still not well understood. We propose an extensive feature set for describing capacitated vehicle routing problem instances and illustrate how it can be used in algorithm selection, and how different feature selection approaches can be used to recognize the most relevant features for this task.

Manuscript from author [PDF]

ES2019-112

Topic-based historical information selection for personalized sentiment analysis

Siwen Guo, Sviatlana Höhn, Christoph Schommer

Abstract
In this paper, we present a selection approach designed for personalized sentiment analysis, with the aim of extracting related information from a user's history. Analyzing a person's past is key to modeling individuality and understanding the current state of the person. We consider a user's past expressions as historical information, and target posts from social platforms, with Twitter texts chosen as an example. While implementing the personalized model PERSEUS, we observed information loss due to the lack of flexibility in the design of the input sequence. To compensate for this issue, we provide a procedure for information selection based on the similarities in the topics of a user's historical posts. Evaluation is conducted comparing different similarity measures, and improvements are seen with the proposed method.

Manuscript from author [PDF]

ES2019-143

Bridging face and sound modalities through domain adaptation metric learning

Christos Athanasiadis, Enrique Hortal, Stylianos Asteriadis

Abstract
Robust emotion recognition systems require extensive training, employing a huge number of training samples in order to generate sophisticated models. Furthermore, research is mostly focused on facial expression recognition, due mainly to the wide availability of related datasets. However, rich and publicly available datasets do not exist for other modalities such as sound. In this work, a heterogeneous domain adaptation framework is introduced for bridging two inherently different domains (namely face and audio). The purpose is to perform affect recognition on the modality where only a small amount of data is available, leveraging large amounts of data from another modality.

Manuscript from author [PDF]

ES2019-18

Model selection for Extreme Minimal Learning Machine using sampling

Tommi Kärkkäinen

Abstract
A combination of the Extreme Learning Machine (ELM) and the Minimal Learning Machine (MLM), which uses a distance-based basis from MLM in the ridge-regression-like learning framework of ELM, was proposed in [8]. In further experiments with the technique [9], it was concluded that in multilabel classification one can obtain a good validation error level without overlearning simply by using the whole training data for constructing the basis. Here, we consider possibilities to reduce the complexity of the resulting machine learning model, referred to as the Extreme Minimal Learning Machine (EMLM), by using a bidirectional sampling strategy: sampling both the feature space and the space of observations in order to identify a simpler EMLM without sacrificing its generalization performance.

Manuscript from author [PDF]

ES2019-34

Knowledge Discovery in Quarterly Financial Data of Stocks Based on the Prime Standard using a Hybrid of a Swarm with SOM

Michael Thrun

Abstract
Stocks in the German Prime Standard have to publish financial reports every three months, which so far have not been fully exploited for fundamental analysis. Through web scraping, an up-to-date high-dimensional dataset of 45 features of 269 companies was extracted, but finding meaningful cluster structures in a high-dimensional dataset with a low number of cases is still a challenge in data science. A hybrid of a swarm with a SOM, called the Databionic swarm (DBS), found meaningful structures in the financial reports. Using the Chord distance, the DBS algorithm results in a topographic map of the high-dimensional structures and a clustering. Knowledge from the clustering is acquired using CART. The cluster structures can be explained by simple rules that allow predicting which stock prices will fall with 70% probability.

Manuscript from author [PDF]

ES2019-55

Dimensionality reduction in a hydraulic valve positioning application

Travis Wiens

Abstract
This paper presents an application of neural network signal processing to estimate the position of a hydraulic valve spool, based on acoustic excitement of the spool's end chamber. The spool's end chamber acts somewhat like a Helmholtz resonator whose frequency response changes based on its volume (and therefore spool position). However, non-ideal characteristics of the system including wave propagation effects and distributed parameters mean that estimating the volume is more complicated than simply evaluating the resonant frequency. In this case the frequency response has high dimensionality with high redundancy and noise. We present the use of linear and nonlinear principal component analysis to preprocess the frequency response data prior to neural network regression.

Manuscript from author [PDF]

ES2019-117

Class-aware t-SNE: cat-SNE

Cyril de Bodt, Dounia Mulders, Daniel Lopez-Sanchez, Michel Verleysen, John Lee

Abstract
Stochastic Neighbor Embedding (SNE) and variants like $t$-distributed SNE are popular methods of unsupervised dimensionality reduction (DR) that deliver outstanding experimental results. Regular $t$-SNE is often used to visualize data with class labels in colored scatterplots, even if those labels are actually not involved in the DR process. This paper proposes a modification of $t$-SNE that uses class labels to adjust the individual widths of the Gaussian neighborhoods around each datum, instead of deriving those from a perplexity set by the user. The widths are adjusted such that neighbors of the same class around a datum exceed a certain fraction of the probability, typically above $50\%$. Doing so tends to shrink the bulk of the classes and to stretch their separation. Experimental results show that the proposed class-aware $t$-SNE ($\mathrm{ca}t$-SNE) outperforms regular $t$-SNE in a $K$NN classification task carried in the embedding.
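The class-aware width adjustment can be sketched as a per-datum bisection search (our own illustrative code, not the authors' implementation; it assumes the nearest neighbours are predominantly of the same class, so that shrinking the width increases the same-class probability mass):

```python
import numpy as np

def class_aware_sigma(dists, same_class, theta=0.5, iters=50):
    """Bisection on the Gaussian width sigma so that neighbors sharing
    the datum's class hold at least a fraction theta of the neighborhood
    probability. dists: distances to the other points; same_class:
    boolean mask of neighbors with the same label as the datum."""
    d2 = dists**2 - dists.min()**2       # shift exponents for stability
    lo, hi = 1e-3, 1e3                   # lo always satisfies the constraint
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = np.exp(-d2 / (2 * mid**2))
        mass = p[same_class].sum() / p.sum()
        if mass >= theta:
            lo = mid                     # wide enough, try an even wider sigma
        else:
            hi = mid                     # same-class mass too low, shrink
    return lo

# Two same-class points nearby, one other-class point far away.
dists = np.array([0.1, 0.2, 5.0])
same = np.array([True, True, False])
sigma = class_aware_sigma(dists, same, theta=0.9)
```

Requiring a high same-class mass per datum is what shrinks the bulk of each class and stretches the separation, as described above.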

Manuscript from author [PDF]

ES2019-42

Variational auto-encoders with Student’s t-prior

Najmeh Abiri, Mattias Ohlsson

Abstract
We propose a new structure for the variational autoencoder (VAE) prior, with the weakly informative multivariate Student-t distribution. In the proposed model all distribution parameters are trained, thereby allowing for a more robust approximation of the underlying data distribution. We used Fashion-MNIST data in two experiments to compare the proposed VAE with the standard Gaussian prior. Both experiments showed a better reconstruction of the images with VAE using Student-t prior distribution.

Manuscript from author [PDF]

[Back to Top]


Streaming data analysis, concept drift and analysis of dynamic data sets


ES2019-3

Recent trends in streaming data analysis, concept drift and analysis of dynamic data sets

Albert Bifet, Barbara Hammer, Frank-Michael Schleif

Abstract
Today, many data sources are no longer static but occur as dynamic data streams with high velocity, variability and volume. This leads to new challenges to be addressed by novel or adapted algorithms. In this tutorial we provide an introduction to the field of streaming data analysis, summarizing its major characteristics and highlighting important research directions in the analysis of dynamic data.

Manuscript from author [PDF]

ES2019-105

Online Bayesian Shrinkage Regression

Waqas Jamil, Abdelhamid Bouchachia

Abstract
The present work introduces a new online regression method that extends the Shrinkage via Limit of Gibbs sampler (SLOG) in the context of online learning. In particular, we theoretically demonstrate that the proposed Online SLOG (OSLOG) is derived using the Bayesian framework without resorting to the Gibbs sampler. We also prove the performance guarantee of OSLOG.

Manuscript from author [PDF]

ES2019-33

Reactive Soft Prototype Computing for frequent reoccurring Concept Drift

Christoph Raab, Moritz Heusinger, Frank-Michael Schleif

Abstract
Today's datasets, especially in the streaming context, are increasingly non-static and require algorithms that detect and adapt to change. Recent work shows vital research in the field, but mostly lacks stable performance during model adaptation. In this work, a bound-based detection strategy followed by a prototype-based insertion strategy is proposed. Validated through experimental results on a variety of typical non-static data, our solution provides stability and quick adjustment in times of change.

Manuscript from author [PDF]

ES2019-59

Beta Distribution Drift Detection for Adaptive Classifiers

Lukas Fleckenstein, Sebastian Kauschke, Johannes Fürnkranz

Abstract
With today's abundant streams of data, the only constant we can rely on is change. For stream classification algorithms, it is necessary to adapt to concept drift. This can be achieved by monitoring the model error, and triggering counter measures as changes occur. In this paper, we propose a drift detection mechanism that fits a beta distribution to the model error, and treats abnormal behavior as drift. It works with any given model, leverages prior knowledge about this model, and allows to set application-specific confidence thresholds. Experiments confirm that it performs well, in particular when drift occurs abruptly.
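A minimal sketch of the general idea (our own illustrative code, using a method-of-moments beta fit rather than the authors' exact procedure): fit a beta distribution to recent per-window error rates and flag drift when a new window's error rate falls in the upper tail.

```python
import numpy as np
from scipy import stats

def fit_beta(rates, eps=1e-6):
    """Method-of-moments fit of Beta(a, b) to past error rates in (0, 1)."""
    r = np.clip(np.asarray(rates, float), eps, 1 - eps)
    m, v = r.mean(), r.var()
    common = m * (1 - m) / max(v, eps) - 1
    return m * common, (1 - m) * common

def detect_drift(history, current_rate, confidence=0.99):
    """Flag drift if the current window's error rate exceeds the chosen
    quantile of a beta distribution fitted to the historical rates."""
    a, b = fit_beta(history)
    return current_rate > stats.beta.ppf(confidence, a, b)

# Stable error rate around 10%, then an abrupt jump to 40%.
rng = np.random.default_rng(0)
history = rng.binomial(100, 0.10, size=50) / 100
print(detect_drift(history, 0.11))  # stable regime
print(detect_drift(history, 0.40))  # abrupt change
```

The `confidence` parameter plays the role of the application-specific threshold mentioned in the abstract: raising it trades detection speed for fewer false alarms.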

Manuscript from author [PDF]

ES2019-63

Importance of user inputs while using incremental learning to personalize human activity recognition models

Pekka Siirtola, Heli Koskimäki, Juha Röning

Abstract
In this study, the importance of user inputs is studied in the context of personalizing human activity recognition models using incremental learning. Inertial sensor data from three body positions are used, and the classification is based on the Learn++ ensemble method. Three different approaches to updating models are compared: non-supervised, semi-supervised and supervised. The non-supervised approach relies fully on predicted labels, the supervised approach fully on user-labeled data, and the proposed semi-supervised method is a combination of the two. Our experiments show that by relying on predicted labels with high confidence, and asking the user to label only uncertain observations (from 12% to 26% of the observations, depending on the base classifier used), almost as low error rates can be achieved as with the supervised approach; the difference was less than 2 percentage points. Moreover, unlike the non-supervised approach, the semi-supervised approach does not suffer from drastic concept drift, and thus the error rate of the non-supervised approach is over 5 percentage points higher than that of the semi-supervised approach.

Manuscript from author [PDF]

[Back to Top]


Societal Issues in Machine Learning: When Learning from Data is Not Enough


ES2019-6

Societal Issues in Machine Learning: When Learning from Data is Not Enough

Davide Bacciu, Battista Biggio, Paulo Lisboa, José D. Martín, Luca Oneto, Alfredo Vellido

Abstract
It has been argued that Artificial Intelligence (AI) is experiencing a fast process of commodification. Such a characterization is in the interest of big IT companies, but it correctly reflects the current industrialization of AI. This phenomenon means that AI systems and products are reaching society at large and, therefore, that societal issues related to the use of AI and Machine Learning (ML) cannot be ignored any longer. Designing ML models from this human-centered perspective means incorporating human-relevant requirements such as safety, fairness, privacy, and interpretability, but also considering broad societal issues such as ethics and legislation. These are essential aspects to foster the acceptance of ML-based technologies, as well as to ensure compliance with an evolving legislation concerning the impact of digital technologies on ethically and privacy-sensitive matters. The ESANN special session for which this tutorial acts as an introduction aims to showcase the state of the art on these increasingly relevant topics among ML theoreticians and practitioners. For this purpose, we welcomed both solid contributions and preliminary relevant results showing the potential, the limitations and the challenges of new ideas, as well as refinements or hybridizations among the different fields of research, ML and related approaches in facing real-world problems involving societal issues.

Manuscript from author [PDF]

ES2019-29

Privacy Preserving Synthetic Health Data

Andrew Yale, Saloni Dash, Ritik Dutta, Isabelle Guyon, Adrien Pavao, Kristin Bennett

Abstract
We examine the feasibility of using synthetic medical data generated by GANs in the classroom, to teach data science in health informatics. We present an end-to-end methodology to retain instructional utility while preserving privacy to a level which meets regulatory requirements: (1) a GAN is trained by a certified medical-data security-aware agent, inside a secure environment; (2) the GAN is used outside of the secure environment by external users (instructors or researchers) to generate synthetic data. This second step facilitates data handling for external users, by avoiding de-identification, which may require special user training, be costly, and/or cause loss of data fidelity. We benchmark our proposed GAN against various baseline methods using a novel set of metrics. At equal levels of privacy and utility, GANs provide small-footprint models, meeting the desired specifications of our application domain. Data, code, and a challenge that we organized for educational purposes are available.

Manuscript from author [PDF]

ES2019-78

Fairness and Accountability of Machine Learning Models in Railway Market: are Applicable Railway Laws Up to Regulate Them?

Charlotte Ducuing, Luca Oneto, Renzo Canepa

Abstract
In this work we discuss whether the law is equipped to regulate the use of machine learning models in the context of the railway public transportation system. In particular, we deal with the problems of fairness and accountability of these models when exploited in the context of train traffic management. Railway sector-specific regulation, the railways being a network industry, serves here as a pilot. We show that, even where technological solutions are available, the law needs to keep up in order to support and accurately regulate their use, and we identify stumbling blocks in this regard.

Manuscript from author [PDF]

ES2019-134

Dynamic fairness - Breaking vicious cycles in automatic decision making

Benjamin Paaßen, Astrid Bunge, Carolin Hainke, Leon Sindelar, Matthias Vogelsang

Abstract
In recent years, machine learning techniques have been increasingly applied in sensitive decision making processes, raising fairness concerns. Past research has shown that machine learning may reproduce and even exacerbate human bias due to biased training data or flawed model assumptions, and thus may lead to discriminatory actions. To counteract such biased models, researchers have proposed multiple mathematical definitions of fairness according to which classifiers can be optimized. However, it has also been shown that the outcomes generated by some fairness notions may be unsatisfactory. In this contribution, we add to this research by considering decision making processes in time. We establish a theoretical model in which even perfectly accurate classifiers which adhere to almost all common fairness definitions lead to stable long-term inequalities due to vicious cycles. Only demographic parity, which enforces equal rates of positive decisions in all groups, avoids these effects and establishes instead a virtuous cycle leading to perfectly accurate and fair classification in the long term.

Manuscript from author [PDF]

ES2019-120

Detecting Black-box Adversarial Examples through Nonlinear Dimensionality Reduction

Francesco Crecchi, Davide Bacciu, Battista Biggio

Abstract
Deep neural networks are vulnerable to adversarial examples, i.e., carefully perturbed input samples aimed at misleading classification. In this work, we propose a detection method based on t-SNE, a powerful nonlinear dimensionality reduction technique. Our empirical findings show that the proposed approach is able to effectively detect black-box adversarial examples, i.e., adversarial perturbations not carefully tuned to also bypass the detection method. While we believe that our method may also improve the robustness of deep nets against white-box adversarial examples, we leave a more detailed investigation of this issue to future work.

Manuscript from author [PDF]

ES2019-97

Deep RL for autonomous robots: limitations and safety challenges

Olov Andersson, Patrick Doherty

Abstract
With the rise of deep reinforcement learning, there has also been a string of successes on continuous control problems using physics simulators. This has led to some optimism regarding its use in autonomous robots and vehicles. However, successfully applying such techniques to the real world requires a firm grasp of their limitations. As recent work has raised questions about how diverse these simulation benchmarks really are, we here instead analyze a popular deep RL approach on toy examples from robot obstacle avoidance. We find that these converge very slowly, if at all, to safe policies. We identify convergence issues in stochastic environments and local minima as problems that warrant more attention for safety-critical control applications.

Manuscript from author [PDF]

ES2019-124

Explaining classification systems using sparse dictionaries

Andrea Apicella, Francesco Isgro, Roberto Prevete, Andrea Sorrentino, Guglielmo Tamburrini

Abstract
A pressing research topic is to find ways to explain the decisions of machine learning systems to end users, data officers, and other stakeholders. These explanations must be understandable to human beings. Much work in this field focuses on image classification, as the required explanations can rely on images, thereby making communication relatively easy, and may take into account the image as a whole. Here, we propose to exploit the representational power of sparse dictionaries to determine image local properties that can be used as crucial ingredients of humanly understandable explanations of classification decisions.

Manuscript from author [PDF]



Statistical physics of learning and inference


ES2019-2

Statistical physics of learning and inference

Michael Biehl, Nestor Caticha, Manfred Opper, Thomas Villmann

Abstract

Manuscript from author [PDF]

ES2019-72

Trust, law and ideology in a NN agent model of the US Appellate Courts

Nestor Caticha, Felippe Alves

Abstract
Interacting neural networks (NN) are used to model US Appellate Court three-judge panels. Agents, whose initial states have three contributions derived from common knowledge of the law, political affiliation and personality, learn by exchange of opinions, updating their state and trust about other agents. The model replicates data patterns only if initially the agents trust each other and are certain about their trust independently of party affiliation, showing evidence of ideological voting, dampening and amplification. Absence of the law or party contribution destroys the theoretical-empirical agreement. We identify quantitative signatures for different levels of the law, ideological or idiosyncratic contributions.

Manuscript from author [PDF]

ES2019-173

On-line learning dynamics of ReLU neural networks using statistical physics techniques

Michiel Straat, Michael Biehl

Abstract
We introduce exact macroscopic on-line learning dynamics of two-layer neural networks with ReLU units in the form of a system of differential equations, using techniques borrowed from statistical physics. In first experiments, numerical solutions reveal behavior similar to that of the sigmoidal activations studied in earlier work, and the theoretical results show good correspondence with simulations. In overrealizable and unrealizable learning scenarios, however, the learning behavior of ReLU networks shows distinctive characteristics compared to sigmoidal networks.

Manuscript from author [PDF]

ES2019-92

Noise helps optimization escape from saddle points in the neural dynamics

Fang Ying, Yu Zhaofei, Chen Feng

Abstract
Synaptic connectivity in the brain is thought to encode the long-term memory of an organism, but experimental data point to surprising ongoing fluctuations in synaptic activity. Assuming that brain computation and plasticity can be understood as probabilistic inference, one of the essential roles of noise is to efficiently improve the performance of optimization in the form of stochastic gradient descent. We deduce the strict saddle condition for synaptic plasticity, under which noise can help escape from saddle points on high-dimensional domains. This theoretical result explains the stochasticity of synapses and guides us in how to make use of noise. Our simulation results show that, in the learning and test phases, the accuracy of synaptic sampling is almost 20% higher than without noise.
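
The claim that noise lets gradient descent escape strict saddle points can be illustrated on the classic saddle f(x, y) = x² − y² (a generic toy example, not the paper's synaptic model; step size and noise scale are arbitrary choices):

```python
import numpy as np

# f(x, y) = x**2 - y**2 has a strict saddle at the origin; started exactly on
# the unstable manifold (y = 0), plain gradient descent never leaves it.
def grad(p):
    x, y = p
    return np.array([2 * x, -2 * y])

def descend(noise_scale, steps=200, lr=0.05, seed=1):
    rng = np.random.default_rng(seed)
    p = np.array([0.5, 0.0])
    for _ in range(steps):
        p = p - lr * grad(p) + noise_scale * rng.standard_normal(2)
    return p

p_plain = descend(noise_scale=0.0)   # stuck: the y coordinate stays exactly 0
p_noisy = descend(noise_scale=0.01)  # noise kicks y off; the gradient amplifies it
```

Deterministic descent contracts x toward 0 but leaves y untouched, while even tiny noise pushes the iterate into the escape direction, where the gradient then does the rest.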

Manuscript from author [PDF]



Image processing and transfer learning


ES2019-169

Deep hybrid approach for 3D plane segmentation

Felipe Gomez Marulanda, Pieter Libin, Timothy Verstraeten, Ann Nowe

Abstract
We address the limitations of deep learning (DL) models for 3D geometry segmentation by using Conditional Random Fields (CRF). We show that CRFs can take advantage of the neighbouring structure of point clouds to assist the learning of the DL models. Our hybrid PN-CRF model is able to learn better weights by taking advantage of equal-segmentation assignments to neighbouring points. As a result, it increases the robustness of the model, especially for segmentation tasks where correctly detecting the boundaries between segments is very important.

Manuscript from author [PDF]

ES2019-66

Visualizing image classification in Fourier domain

Florian Franzen, Chunrong Yuan

Abstract
Image classification is successfully done with Convolutional Neural Networks (CNN). Alternatively, it can be done in the Fourier domain, avoiding the convolution process. In this work, we develop several neural networks (NN) for classifying images in the Fourier domain. In order to understand and explain the behaviour of the built NNs, we visualize neuron activities and analyze the underlying patterns relevant for the learning and classification process. We have carried out a comparative study based on several datasets. By using images of objects with partial occlusion, we are able to find out the parts that are important for the classification of certain objects.

Manuscript from author [PDF]

ES2019-71

Blind-spot network for image anomaly detection: A new approach to diabetic retinopathy screening

Shaon Sutradhar, José Rouco, Marcos Ortega

Abstract
The development of computer-aided screening (CAS) systems is motivated by the high prevalence and severity of the target disease, along with the time taken to manually assess each case. This is the case with diabetic retinopathy screening, which is based on the manual grading of retinography images. The development of CAS systems, however, usually involves data-driven approaches that require extensive and usually scarce manually labeled datasets. With this in mind, we propose the use of unsupervised anomaly detection methods for screening that can take advantage of the large amount of healthy cases available. Concretely, we focus on reconstruction-based anomaly detection methods, which are usually approached with autoencoders. We propose a new network architecture, the Blind-Spot Network, that, according to the presented experiments, improves the performance of autoencoders in this setting.

Manuscript from author [PDF]

ES2019-17

A document detection technique using convolutional neural networks for optical character recognition systems

Lorand Dobai, Mihai Teletin

Abstract
An important part of an optical character recognition pipeline is the preprocessing step, whose purpose is to enhance the conditions under which the text extraction is later performed. In this paper, we present a novel deep learning based preprocessing method to jointly detect and deskew documents in digital images. Our work intends to improve the optical recognition performance, especially on frames which are skewed (slightly rotated) or have cluttered backgrounds. The proposed method achieves good document detection and deskewing results on a dataset of photos of cash receipts.

Manuscript from author [PDF]

ES2019-100

Learning super-resolution 3D segmentation of plant root MRI images from few examples

Ali Oguz Uzman, Jannis Horn, Sven Behnke

Abstract
Analyzing plant roots is crucial to understanding plant performance in different soil environments. While magnetic resonance imaging (MRI) can be used to obtain 3D images of plant roots, extraction of the root structural model is challenging due to highly noisy soil environments and the low resolution of MRI images. To improve both contrast and resolution, we adapt the state-of-the-art method RefineNet for 3D segmentation of plant root MRI images in super-resolution. The networks are trained from a few manual segmentations that are augmented by geometric transformations, realistic noise, and other variabilities. The resulting segmentations contain most root structures, including branches not extracted by human supervision.

Manuscript from author [PDF]

ES2019-175

Analyzing spatial dissimilarities in high-resolution geo-data: a case study of four European cities

Julien Randon-Furling, William Clark, Madalina Olteanu

Abstract
The analysis of spatial dissimilarities across cities often relies on pre-defined areal units, leading to problems of scale, interpretability and cross-comparisons. Furthermore, traditional measures of dissimilarities tend to be single-number indices that fail to capture the complexity of segregation patterns. We present in this paper a method that allows one to extract and analyze information on all scales, at every point in the city, through a stochastic sequential aggregation procedure based on high-resolution data. This method provides insightful visual representations, as well as mathematical characterizations of segregation phenomena.

Manuscript from author [PDF]

ES2019-21

Computerized tool for identification and enhanced visualization of Macular Edema regions using OCT scans

Iago Otero Coto, Plácido Francisco Lizancos Vidal, Joaquim de Moura, Jorge Novo, Marcos Ortega

Abstract
We propose a novel methodology using Optical Coherence Tomography (OCT) images to detect the 3 clinically defined types of Macular Edema, which is among the main causes of blindness: Diffuse Retinal Thickening (DRT), Cystoid Macular Edema (CME) and Serous Retinal Detachment (SRD). To perform this detection, we sample the images and train models to create an intuitive color map that represents the 3 pathologies to facilitate the clinical evaluation. The proposed method was tested using a dataset composed of 96 OCT images. The system provided satisfactory results, with accuracy values of 90.49%, 93.23% and 88.87% for the CME, SRD and DRT detections, respectively.

Manuscript from author [PDF]

ES2019-201

A best-first branch-and-bound search for solving the transductive inference problem using support vector machines

Hygor Xavier Araújo, Raul Fonseca Neto, Saulo Moraes Villela

Abstract
In this paper we present a new method for solving the transductive inference problem, whose objective is predicting the binary labels of a subset of points of interest of an unknown decision function. We attempt to learn a decision boundary using SVM. To obtain the maximal-margin hypothesis over labeled and unlabeled samples, we employ an admissible best-first search based on margin values. Empirical evidence suggests that this globally optimal solution can obtain excellent results on the transduction problem. Due to the selection strategy used, the search algorithm explores only a small fraction of unlabeled samples, making it efficiently applicable to medium-sized datasets. We compare our results with those obtained by the TSVM, demonstrating better results in margin values.

Manuscript from author [PDF]

ES2019-46

LEAP nets for power grid perturbations

Benjamin Donnot, Balthazar Donon, Isabelle Guyon, Zhengying Liu, Antoine Marot, Patrick Panciatici, Marc Schoenauer

Abstract
We propose a novel neural network embedding approach to model power transmission grids, in which high voltage lines are disconnected and re-connected with one another from time to time, either accidentally or willfully. We call our architecture LEAP net, for Latent Encoding of Atypical Perturbation. Our method implements a form of transfer learning, permitting training on a few source domains and then generalizing to new target domains without learning on any example of those domains. We evaluate the viability of this technique to rapidly assess the curative actions that human operators take in emergency situations, using real historical data from the French high voltage power grid.

Manuscript from author [PDF]

ES2019-81

Active one-shot learning with Prototypical Networks

Rinu Boney, Alexander Ilin

Abstract
We consider the problem of active one-shot classification where a classifier needs to adapt to new tasks by requesting labels for one example per class from (potentially many) unlabeled examples. We propose a clustering approach to the problem. The features extracted with Prototypical Networks are clustered using K-means and the label for one representative sample from each cluster is requested to label the whole cluster. We demonstrate good performance of this simple active adaptation strategy using image data.
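
The cluster-then-query strategy can be sketched as follows, with a plain K-means stand-in for the Prototypical-Network feature space (the toy data, farthest-point initialization, and `active_label` helper are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    # Greedy farthest-point initialization, then plain Lloyd's algorithm.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        assign = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

def active_label(features, oracle, k):
    """Cluster the features, request one label per cluster (for the point
    nearest each centroid), and propagate it to the whole cluster."""
    centers, assign = kmeans(features, k)
    labels = np.empty(len(features), dtype=int)
    for j in range(k):
        members = np.where(assign == j)[0]
        rep = members[np.linalg.norm(features[members] - centers[j], axis=1).argmin()]
        labels[assign == j] = oracle(rep)  # one label request per cluster
    return labels

# Two well-separated toy "classes" standing in for embedded support examples.
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
true = np.array([0] * 20 + [1] * 20)
pred = active_label(X, oracle=lambda i: true[i], k=2)
```

With k clusters, only k label requests are issued, which is the point of the active adaptation strategy: the representative sample's label is assumed to hold for its entire cluster.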

Manuscript from author [PDF]

ES2019-123

Transfer Learning for transferring machine-learning based models among hyperspectral sensors

Patrick Menz, Andreas Backhaus, Udo Seiffert

Abstract
Using previously generated machine learning models under changing sensor hardware with nearly the same performance is a desirable goal. This constitutes a model transfer problem. We compare a Radial Basis Function Network adapted for transfer learning to a classical data alignment approach. This approach to transferring machine-learning models is tested on a material classification task using hyperspectral imaging recorded with different camera systems, with the aim of making camera systems interchangeable. The results show that the machine-learning based algorithm outperforms a state-of-the-art hyperspectral data alignment algorithm.

Manuscript from author [PDF]



Time series and signal processing


ES2019-126

Multiple-Kernel dictionary learning for reconstruction and clustering of unseen multivariate time-series

Babak Hosseini, Barbara Hammer

Abstract
There exist many approaches for the description and recognition of unseen classes in datasets. Nevertheless, this becomes a challenging problem when dealing with multivariate time series (MTS) (e.g., motion data), where vectorial algorithms cannot be applied directly to the inputs. In this work, we propose a novel multiple-kernel dictionary learning (MKD) method which learns semantic attributes based on specific combinations of MTS dimensions in the feature space. Hence, MKD can fully or partially reconstruct the unseen classes based on the training data (seen classes). Furthermore, we obtain sparse encodings for unseen classes based on the learned MKD attributes, upon which we propose a simple but effective incremental clustering algorithm to categorize the unseen MTS classes in an unsupervised way. According to the empirical evaluation of our MKD framework on real benchmarks, it provides an interpretable reconstruction of unseen MTS data as well as high performance in their online clustering.

Manuscript from author [PDF]

ES2019-130

Tensor factorization to extract patterns in multimodal EEG data

Dounia Mulders, Cyril de Bodt, Nicolas Lejeune, John Lee, André Mouraux, Michel Verleysen

Abstract
Noisy multi-way data sets are ubiquitous in many domains. In neuroscience, electroencephalogram (EEG) data are recorded during periodic stimulation from different sensory modalities, leading to steady-state (SS) recordings with at least four ways: the channels, the time, the subjects and the modalities. Improving the signal-to-noise ratio (SNR) of the SS responses is crucial to enable their practical use. Supervised spatial filtering methods can be considered for this purpose to relevantly guide the extraction of specific activity patterns. Nevertheless, such approaches are difficult to validate with few subjects and can process at most two data ways simultaneously, the remaining ones being either averaged or considered independently despite their dependencies. This paper hence designs unsupervised tensor factorization models to enable identifying meaningful underlying structures characterized in all ways of multimodal SS data. We show on EEG recordings from 15 subjects that such factorizations faithfully reveal consistent spatial topographies, time courses with enhanced SNR and subject variations of the periodic brain activity.

Manuscript from author [PDF]

ES2019-119

Beyond Pham's algorithm for joint diagonalization

Pierre Ablin, Jean-François Cardoso, Alexandre Gramfort

Abstract
The approximate joint diagonalization of a set of matrices consists in finding a basis in which these matrices are as diagonal as possible. This problem naturally appears in several statistical learning tasks such as blind signal separation. We consider the diagonalization criterion studied in a seminal paper by Pham (2001), and propose a new quasi-Newton method for its optimization. Through numerical experiments on simulated and real datasets, we show that the proposed method outperforms Pham’s algorithm. An open source Python package is released.
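
Pham's joint-diagonality criterion for symmetric positive-definite matrices — non-negative by Hadamard's inequality and zero exactly at a diagonalizing basis — can be written down directly (a sketch of the criterion only, not of the quasi-Newton optimizer the paper proposes; the toy matrices are constructed to be exactly jointly diagonalizable):

```python
import numpy as np

def pham_criterion(B, matrices):
    """Pham (2001) criterion: for each SPD matrix C, add
    log det diag(B C B^T) - log det(B C B^T).  Non-negative, and zero
    exactly when every transformed matrix is diagonal."""
    total = 0.0
    for C in matrices:
        M = B @ C @ B.T
        total += np.sum(np.log(np.diag(M))) - np.linalg.slogdet(M)[1]
    return total

rng = np.random.default_rng(0)
d = 4
# Jointly diagonalizable set: C_n = A D_n A^T with positive diagonal D_n.
A = rng.standard_normal((d, d))
mats = [A @ np.diag(rng.uniform(0.5, 2.0, d)) @ A.T for _ in range(5)]

crit_identity = pham_criterion(np.eye(d), mats)        # > 0: not diagonal yet
crit_inverse = pham_criterion(np.linalg.inv(A), mats)  # ~ 0: exact diagonalizer
```

Any joint-diagonalization solver, including the quasi-Newton method of the paper, can be read as driving this quantity toward zero.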

Manuscript from author [PDF]

ES2019-50

Frequency Domain Transformer Networks for Video Prediction

Hafez Farazi, Sven Behnke

Abstract
The task of video prediction is forecasting the next frames given some previous frames. Despite much recent progress, this task is still challenging mainly due to high nonlinearity in the spatial domain. To address this issue, we propose a novel architecture, Frequency Domain Transformer Network (FDTN), which is an end-to-end learnable model that formulates the transformations of the signal in the frequency domain. Experimental evaluations show that this approach can outperform some widely used video prediction methods like Video Ladder Network (VLN) and Predictive Gated Pyramids (PGP).

Manuscript from author [PDF]

ES2019-184

Comparison between DeepESNs and gated RNNs on multivariate time-series prediction

Claudio Gallicchio, Alessio Micheli, Luca Pedrelli

Abstract
We propose an experimental comparison between Deep Echo State Networks (DeepESNs) and gated Recurrent Neural Networks (RNNs) on multivariate time-series prediction tasks. In particular, we compare reservoir and fully-trained RNNs able to represent signals featuring multiple-time-scale dynamics. The analysis is performed in terms of efficiency and prediction accuracy on 4 polyphonic music tasks. Our results show that DeepESN is able to outperform ESN in terms of prediction accuracy and efficiency. Among fully-trained approaches, Gated Recurrent Units (GRU) outperform Long Short-Term Memory (LSTM) and simple RNN models in most cases. Overall, DeepESN turned out to be far more efficient than the other RNN approaches and the best solution in terms of prediction accuracy on 3 out of 4 tasks.
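
The efficiency argument rests on the reservoir idea: only a linear readout is trained. A minimal single-layer ESN (the shallow baseline the comparison builds on) can be sketched as below; the hyper-parameter values and the sine next-step task are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random reservoir; only the ridge-regression readout is trained.
n_res, washout, ridge = 100, 50, 1e-6
W_in = rng.uniform(-0.5, 0.5, n_res)
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

def run_reservoir(u):
    x = np.zeros(n_res)
    states = []
    for ut in u:
        x = np.tanh(W_in * ut + W @ x)  # untrained recurrent dynamics
        states.append(x)
    return np.array(states)

# Next-step prediction of a sine wave (needs memory: the current value alone
# is ambiguous about the phase).
t = np.arange(500)
u, y = np.sin(0.2 * t), np.sin(0.2 * (t + 1))
S = run_reservoir(u)[washout:]          # discard the initial transient
target = y[washout:]
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ target)
mse = np.mean((S @ W_out - target) ** 2)
```

Training reduces to one linear solve, which is why (Deep)ESNs are so much cheaper than back-propagating through a GRU or LSTM.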

Manuscript from author [PDF]

ES2019-159

Autoregressive Convolutional Recurrent Neural Network for Univariate and Multivariate Time Series Prediction

Matteo Maggiolo, Gerasimos Spanakis

Abstract
Time series forecasting (univariate and multivariate) is a problem of high complexity due to the different patterns that have to be detected in the input, ranging from high- to low-frequency ones. In this paper we propose a new model for time series prediction that utilizes convolutional layers for feature extraction, a recurrent encoder and a linear autoregressive component. We motivate the model and we test and compare it against a baseline of widely used existing architectures for univariate and multivariate time series. The proposed model appears to outperform the baselines in almost every case on the multivariate time series datasets, in some cases even with a 50% improvement, which shows the strength of such a hybrid architecture on complex time series.

Manuscript from author [PDF]

ES2019-15

Using Deep Learning and Evolutionary Algorithms for Time Series Forecasting

Rafael Thomazi Gonzalez, Dante Augusto Couto Barone

Abstract
Deep Learning is one of the latest approaches in the field of artificial neural networks. Since they were first proposed, Deep Learning models have obtained state-of-the-art results in some problems related to classification and pattern recognition. However, such models have been little used in time series forecasting. This work aims to investigate the use of some of these architectures in this kind of problem. Another contribution is the use of an Evolutionary Algorithm to optimize the hyperparameters of these models. The advantage of the proposed method is shown on two artificial time series datasets and one electricity load demand dataset.

Manuscript from author [PDF]

ES2019-103

Lightweight autonomous Bayesian optimization of Echo-State Networks

Cerina Luca, Giuseppe Franco, Marco Domenico Santambrogio

Abstract
Echo State Networks (ESN) represent a good option to tackle non-linear, time-dependent problems without the training complexity of standard Recurrent Neural Networks (RNNs), thanks to the intrinsic dynamics that arise from untrained sparse networks. However, the performance and stability of ESNs are determined by their hyper-parameters, e.g. reservoir dimension and sparsity, and by the characteristics of the input, whose optimal values require time-consuming procedures to be found. Here we propose an efficient automatic optimization framework for ESNs based on Bayesian Optimization, given user-defined objectives and bounded ranges on hyper-parameters. Results show performance comparable with exhaustive grid-search optimization algorithms.

Manuscript from author [PDF]

ES2019-99

Time series modelling of market price in real-time bidding

Manxing Du, Christian Hammerschmidt, Georgios Varisteas, Radu State, Mats Brorsson, Zhu Zhang

Abstract
Real-Time Bidding (RTB) is one of the most popular online advertisement selling mechanisms. Modeling the highly dynamic bidding environment is crucial for making good bids. Market prices of auctions fluctuate heavily within short time spans. State-of-the-art methods neglect the temporal dependencies of bidders' behaviors. In this paper, the bid requests are aggregated by time and the mean market price per aggregated segment is modeled as a time series. We show that the Long Short-Term Memory (LSTM) neural network outperforms state-of-the-art univariate time series models by capturing the nonlinear temporal dependencies in the market price. We further improve the prediction performance by adding a summary of exogenous features from bid requests.
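
The aggregation step — turning an auction log into a per-window mean-price series that a forecaster such as the LSTM can model — might look like this (the window size, toy log, and `aggregate_market_price` helper are illustrative assumptions):

```python
import numpy as np

def aggregate_market_price(timestamps, prices, window):
    """Bucket individual bid-request market prices into fixed-length time
    windows and return the mean price per window, in time order."""
    bins = (timestamps // window).astype(int)
    return np.array([prices[bins == b].mean() for b in np.unique(bins)])

# Toy auction log: timestamps in seconds and winning market prices.
ts = np.array([1, 5, 12, 14, 19, 25, 27])
pr = np.array([2.0, 4.0, 1.0, 3.0, 2.0, 6.0, 2.0])
series = aggregate_market_price(ts, pr, window=10)  # 10-second segments
```

The resulting univariate series is what the time series models in the comparison are fitted to; exogenous bid-request features would be summarized per window in the same way.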

Manuscript from author [PDF]



Dynamical systems and reinforcement learning


ES2019-65

Short-term trajectory planning using reinforcement learning within a neuromorphic control architecture

Florian Mirus, Benjamin Zorn, Jörg Conradt

Abstract
In this paper, we present a first step towards neuromorphic vehicle control. We propose a modular and hierarchical system architecture entirely implemented in a spiking neuron substrate, which allows for the adjustment of individual components through either supervised or reinforcement learning, as well as future deployment on dedicated neuromorphic hardware. In a sample instantiation, we investigate automated training of a neuromorphic trajectory selection module using reinforcement learning to demonstrate the general feasibility of our approach. We evaluate our system using the open-source race car simulator TORCS.

Manuscript from author [PDF]

ES2019-129

Training networks separately on static and dynamic obstacles improves collision avoidance during indoor robot navigation

Viktor Schmuck, David Meredith

Abstract
Autonomous robot navigation and dynamic obstacle avoidance in complex, cluttered, indoor environments is a challenging task. A robust solution would allow robots to be deployed in hospitals, airports or shopping centres to serve as guides and fulfil other functions requiring safe human-robot interaction. Previous studies have explored various approaches to selecting sensor types, collecting data, and training models capable of safely avoiding unmapped, possibly dynamic obstacles in an indoor environment. In this paper we address the problem of recognizing and anticipating collisions, in order to determine when avoidance manoeuvres are required. We propose and compare two sensor-fusion and neural-network-based solutions, one in which models are trained separately on static and dynamic samples and another in which a single model is trained on samples of collisions with both dynamic and static obstacles. The measured accuracies confirmed that the separately trained ensemble models had better recognition performance, but were slower to compute than the model trained without taking the obstacle types into account.

Manuscript from author [PDF]

ES2019-149

Human feedback in continuous actor-critic reinforcement learning

Cristian Millán, Bruno Fernandes, Francisco Cruz

Abstract
Reinforcement learning methods are used when an agent tries to learn from a changing environment. With continuous actions the performance is significantly better, but learning requires excessive time to find a proper policy. In this work, we focus on including human feedback in continuous action-space reinforcement learning. We combine the policy with the feedback to favor actions in regions of low density. We compare the performance of feedback over the continuous actor-critic algorithm and evaluate it on the cart-pole balancing task. The obtained results show that our approach increases the accumulated reward and improves performance during the task.

Manuscript from author [PDF]

ES2019-76

Chasing the Echo State Property

Claudio Gallicchio

Abstract
Reservoir Computing (RC) provides an efficient way of designing dynamical recurrent neural models. While training is restricted to a simple output component, the recurrent connections are left untrained after initialization, subject to stability constraints specified by the Echo State Property (ESP). Literature conditions for the ESP typically fail to properly account for the effects of driving input signals, often limiting the potential of the RC approach. In this paper, we study the fundamental aspect of asymptotic stability of RC models in the presence of driving input, introducing an empirical ESP index that enables easy analysis of the stability regimes of reservoirs. Results on two benchmark datasets reveal interesting insights into the dynamical properties of input-driven reservoirs, suggesting that the actual domain of ESP validity is much wider than what is covered by the literature conditions commonly used in RC practice.
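
An empirical stability probe in the spirit of the proposed ESP index can be built by driving the same reservoir from two different initial states with an identical input sequence and tracking their state distance (the specific index definition and hyper-parameters below are simplified assumptions, not the paper's formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Under the Echo State Property, the effect of the initial state must wash
# out: two trajectories driven by the same input should converge, so the
# final/initial distance ratio serves as a rough contractivity index.
n = 50
W = rng.standard_normal((n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9
W_in = rng.uniform(-0.5, 0.5, n)

def distance_ratio(W, steps=200):
    xa, xb = rng.standard_normal(n), rng.standard_normal(n)
    d0 = np.linalg.norm(xa - xb)
    for _ in range(steps):
        u = rng.uniform(-1, 1)          # the same input drives both copies
        xa = np.tanh(W_in * u + W @ xa)
        xb = np.tanh(W_in * u + W @ xb)
    return np.linalg.norm(xa - xb) / d0

index = distance_ratio(W)  # near 0 when echoes of the initial state die out
```

Sweeping the spectral radius (or input scaling) and watching where this ratio stops vanishing gives an input-aware, empirical picture of the stability regime, which is the kind of analysis the proposed index enables.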

Manuscript from author [PDF]
