Software and hardware models of auditory processing

We build large-scale neural systems for auditory and sonar-based scene analysis. This covers

  • pre-processing in the auditory periphery;

  • processing in the auditory thalamo-cortical system;

  • learning of STRFs from real-world data using spike-timing-dependent plasticity;

  • category learning of acoustic and sonar patterns in higher cortical areas;

  • self-organisation of hierarchical representations of auditory scenes;

  • implementation of auditory and sonar processing in aVLSI- and FPGA-based hardware.

General Introduction

The development of large-scale models of cortical information processing has recently seen great progress, driven by the availability of powerful computers and computer clusters. These models use large numbers of biologically plausible neurons and synapses that closely emulate the processes taking place in the brain during, for example, vision or hearing. Many large-scale models of the visual system have now been implemented; the auditory domain, however, has attracted much less attention.

We aim to develop a large-scale model of sound perception, more precisely of object formation and auditory scene analysis. This model will detect individual sounds, assign them to putative sources in an auditory scene, and decide whether individual sources are likely to correspond to a living entity. The model will be biologically motivated, underpinned by the experiments performed by our project partners. It will use spiking neurons, dynamic synapses and spike-timing-dependent plasticity (STDP), and will have an overall hierarchical structure: starting from artificial cochleae that pick up sounds and filter them into a number of frequency bands, it will extract acoustic features and learn spectro-temporal receptive fields within the thalamo-cortical loops. The pre-processed data will then be classified in a further categorisation hierarchy. Feedback processes will be used to guide attentional modulation of peripheral processing and to improve recognition.
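As an illustrative sketch of the central learning mechanism mentioned above, a pair-based STDP weight update can be written in a few lines of Python. All constants here are generic textbook values, not the project's actual parameters:

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                tau_plus=0.02, tau_minus=0.02, w_max=1.0):
    """Pair-based STDP: potentiate when the presynaptic spike precedes
    the postsynaptic spike, depress otherwise (times in seconds)."""
    dt = t_post - t_pre
    if dt >= 0:                          # pre before post -> LTP
        w += a_plus * np.exp(-dt / tau_plus)
    else:                                # post before pre -> LTD
        w -= a_minus * np.exp(dt / tau_minus)
    return float(np.clip(w, 0.0, w_max))

w = 0.5
w = stdp_update(w, t_pre=0.010, t_post=0.015)   # causal pairing: weight grows
print(w > 0.5)   # True
```

The exponential windows mean that only near-coincident spike pairs change the weight appreciably, which is what allows such networks to pick up temporal correlations in cochlear input.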

The auditory processing is complemented by an active sonar system that emits ultrasonic pulses and picks up their reflections from objects in the environment. The spectro-temporal properties of these signals carry information about the identity and behaviour of an object, which adds to the auditory information. Processing of the sonar signals will follow similar principles to those used in the auditory sub-system. The classification results from both systems will be combined for final decisions.

An ultimate goal of the project is a combined hardware-software system able to use sound and active sonar to analyse real environments and detect living entities in them through the recognition of patterns of behaviour. We develop special-purpose analog VLSI chips that implement integrate-and-fire neurons and spike-timing-dependent plasticity and will support real-time, low-level processing of incoming sound and sonar signals. At a higher level we use hierarchical recognition models on FPGAs to represent the composite temporal structure of sound signals and to discriminate between signals from different sources.


Robert Mill, Tamás Bőhm, Alexandra Bendixen, István Winkler, and Susan L. Denham (2011). 'CHAINS: Competition and Cooperation between Fragmentary Event Predictors in a Model of Auditory Scene Analysis', 45th International Conference on Information Sciences and Systems, Johns Hopkins University, USA, 23-25 March 2011

  • This paper presents an algorithm called CHAINS for separating temporal patterns of events that are mixed together. The algorithm is motivated by the task the auditory system faces when it attempts to analyse an acoustic mixture to determine the sources that contribute to it, and in particular, sources that emit regular sequences.

Salvador Dura-Bernal, Thomas Wennekers, Susan L. Denham (2011). Modelling object perception in cortex using hierarchical Bayesian networks and belief propagation, 45th International Conference on Information Sciences and Systems, Johns Hopkins University, USA, 23-25 March 2011.

  • Hierarchical generative models and Bayesian belief propagation have been shown to provide a theoretical framework that can account for perceptual processes, including feedback modulation. The framework explains both psychophysical and physiological experimental data and maps well onto the hierarchical, distributed cortical anatomy. We propose a novel methodology to implement selectivity and invariance using belief propagation on Bayesian networks, to combine feedback information from multiple parents (significantly reducing the number of parameters and operations), and to deal with loops using loopy belief propagation and different sampling methods.
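To make the message-passing idea concrete, here is a toy sum-product step on a two-node network A → B. The distributions are invented for illustration; the networks in the paper are hierarchical and far larger:

```python
import numpy as np

# Toy Bayesian network A -> B with binary states (hypothetical numbers).
p_a = np.array([0.7, 0.3])                 # top-down prior over A
p_b_given_a = np.array([[0.9, 0.1],        # rows: states of A
                        [0.2, 0.8]])       # columns: states of B

# Observing B = 1 sends a bottom-up likelihood message to A;
# the belief combines it with the prior: belief(A) ∝ prior * likelihood.
likelihood = p_b_given_a[:, 1]
belief = p_a * likelihood
belief /= belief.sum()
print(belief.round(3))   # [0.226 0.774]
```

Feedback in such models enters the same way: top-down messages (here the prior) multiply bottom-up evidence, so predictions from higher levels can bias recognition at lower ones.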

Denham, S.L., Dura-Bernal, S., Coath, M., & Balaguer-Ballester, E. (2010). Neurocomputational models of perceptual organization. In I. Czigler & I. Winkler (Eds.), Unconscious Memory Representations in Perception: Processes and Mechanisms in the Brain (pp. 147-178). John Benjamins: Amsterdam and Philadelphia.

  • We consider models of perceptual organisation in the visual and auditory modalities and motivate our view of perception as a process of inference and verification through a number of specific examples.

Mill, R., Coath, M., Wennekers, T., Denham, S.L. (2011), 'Abstract Stimulus-Specific Adaptation Models', Neural Computation, 23(2): 435-476.

  • Stimulus-specific adaptation (SSA) refers to a decrease in the spiking of a neuron in response to a repetitive stimulus. We address the computational problem of SSA when inputs are encoded as Poisson spike trains. How should a system—biological or artificial—maximise its response to rare stimuli and minimise its response to common ones? Detailed treatment of this question will be helpful to others designing computational or hardware models of SSA that receive Poisson inputs.
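A minimal illustration of the SSA principle (a simplification, not the paper's model) is a per-channel synaptic resource that is depleted by use and recovers slowly, so that a rare deviant evokes a larger response than the adapted standard:

```python
import numpy as np

def ssa_responses(events, n_channels=2, depletion=0.4, tau_rec=5.0):
    """Each input channel has a synaptic resource that is depleted when
    the channel is stimulated and recovers exponentially toward 1."""
    x = np.ones(n_channels)          # per-channel resource
    responses = []
    for ch in events:                # one event per time step
        x += (1.0 - x) / tau_rec     # recovery toward full resource
        responses.append(x[ch])      # response proportional to resource
        x[ch] *= (1.0 - depletion)   # use-dependent depletion
    return responses

# 'standard' stimuli on channel 0 with a rare 'deviant' on channel 1
events = [0, 0, 0, 0, 1, 0, 0, 0]
r = ssa_responses(events)
print(r[4] > r[3])   # True: the deviant escapes the adaptation
```

The standard's responses decay over the first few presentations, while the deviant channel's untouched resource yields a full-size response: the signature of SSA.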

Dura-Bernal, S.; Wennekers, T. and Denham, S. (2010). The role of feedback in a hierarchical model of object perception. Proceedings of BICS 2010 - Brain Inspired Cognitive Systems, 14-16 July 2010, Madrid, Spain.

  • This paper presents a hierarchical model of visual object perception based on the HMax model. It does not use spiking neurons and STDP, but implements a cortical recognition hierarchy with predictive feedback using the formalism of belief propagation. It can recognise a number of objects in a size-, rotation- and scale-invariant manner, and can learn new objects if a presented object cannot be matched against any stored one.

Coath, M.; Mill, R.; Denham, S. and Wennekers, T. (2010). The emergence of feature sensitivity in a recurrent model of auditory cortex with spike timing dependent plasticity. Proceedings of BICS 2010 - Brain Inspired Cognitive Systems, 14-16 July 2010, Madrid, Spain.

  • This paper describes the thalamo-cortical pre-processing we use in the SCANDLE project. It comprises spiking neurons with facilitating and depressing synapses, and STDP. Neurons are arranged in several cortical and sub-cortical layers. The model learns correlations in the stimulus patterns. Interestingly, correlations are often expressed more strongly in population responses than in single neurons. Moreover, testing single neurons with simple tones usually does not reveal the full complexity of their spectro-temporal receptive field structures.

Humble, J.; Furber, S.; Denham, S. and Wennekers, T. (2010). STDP pattern onset learning depends on background activity. Proceedings of BICS 2010 - Brain Inspired Cognitive Systems, 14-16 July 2010, Madrid, Spain.

  • This is a study of STDP learning in a feedforward spiking neural network. It analyses the learning effects reported by Masquelier et al. (2007), who claim that spiking neurons can learn the onset of a repetitive firing pattern. Our results are broadly consistent with this finding, but also show a specific impact of background noise on the behaviour of the neurons.

Wennekers, T.; Palm, G. (2009) Syntactic Sequencing in Hebbian Cell Assemblies. Cognitive Neurodynamics, in press, DOI 10.1007/s11571-009-9095-z.

  • This paper considers the generation of spatio-temporal patterns in neural networks, especially sequences with syntactic structure. It stores sequences as ordered sets of hetero-associative patterns and proposes a mechanism by which switching between patterns is possible given a global and entirely unspecific control stimulus. Parameter regions that guarantee stable operation of the model are derived analytically. This type of model may be useful for implementing the temporal structure of auditory objects in SCANDLE.

Garagnani, M.; Wennekers, T.; Pulvermuller, F. (2009) Recruitment and consolidation of cell assemblies for words by way of Hebbian learning and competition in a multi-layer neural network. Cognitive Computation 1, 160-176, 2009; DOI 10.1007/s12559-009-9011-1.

  • This work implements a multi-layer architecture of spiking neurons that represents auditory areas and speech production areas. It aims at learning cell assemblies for word representations in an audio-motor loop.

Wennekers, T. (2009) On the Natural Hierarchical Composition of Cliques in Cell Assemblies. Invited keynote paper for the inaugural issue of Cognitive Computation 1, 128-138, 2009.

  • This work considers the compositional structure of cell assemblies (CAs). CAs represent objects by possibly large collections of cells. Different cells can occur in different objects, and this overlap structure can implement compositional feature hierarchies. The paper generalises earlier learning rules for the learning of concept hierarchies in bi-directional associative memories.

Symes, A. and Wennekers, T. (2009) A Large-Scale Model of Spatiotemporal Patterns of Excitation and Inhibition Evoked by the Horizontal Network in Layer 2/3 of Ferret Visual Cortex. Neural Networks 22, 1079-1092, 2009.

  • This is a study of a large-scale model of the primary visual cortex, especially of the activation patterns expected in optical recordings from layer II/III slices under electrical stimulation. It aims at explaining experimentally observed optical dynamic patterns and tracing them back to excitatory and inhibitory synaptic pathways in the cortex. Similar models, although for auditory and sonar processing, will be used in SCANDLE.

Lanyon, L.L., Denham, S.L. (2010). Modelling Visual Neglect: Computational Insights into Conscious Perception, PLoS ONE, 5(6): e11128, June 2010.

  • The aim of this work was to examine the effects of parietal and frontal lesion in an existing computational model of visual attention and search and simulate visual search behaviour under lesion conditions. We find that unilateral parietal lesion in this model leads to symptoms of visual neglect in simulated search scan paths, including an inhibition of return (IOR) deficit, while frontal lesion leads to milder neglect and to more severe deficits in IOR and perseveration in the scan path.

Coath, M., Denham, S.L., Smith, L.M., Honing, H., Hazan, A., Holonowicz, P., Purwins, H. (2009). An auditory model for the detection of perceptual onsets and beat tracking in singing. Connection Science, 21:2, 193-205

  • We describe a biophysically motivated model of auditory salience and show that the derived measure of salience can be used to identify the position of perceptual onsets in a musical stimulus, and track and predict rhythmic structure.
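The published model is biophysically motivated; as a purely illustrative engineering proxy for the same task, onset salience is often approximated by half-wave-rectified spectral flux, i.e. frame-to-frame increases in the magnitude spectrum:

```python
import numpy as np

def onset_strength(signal, frame=256, hop=128):
    """Half-wave-rectified spectral flux: sums positive frame-to-frame
    changes in the magnitude spectrum as a rough onset-salience measure."""
    frames = [signal[i:i + frame] * np.hanning(frame)
              for i in range(0, len(signal) - frame, hop)]
    mags = [np.abs(np.fft.rfft(f)) for f in frames]
    flux = [np.sum(np.maximum(mags[i] - mags[i - 1], 0.0))
            for i in range(1, len(mags))]
    return np.array(flux)

# A tone that switches on halfway through produces a flux peak at the switch.
sr = 8000
t = np.arange(sr) / sr
sig = np.where(t < 0.5, 0.0, np.sin(2 * np.pi * 440 * t))
flux = onset_strength(sig)
print(int(np.argmax(flux)))   # frame index near the 0.5 s boundary
```

Peaks in the flux curve, suitably thresholded, mark candidate perceptual onsets; beat tracking then looks for periodicity in those peak times.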

Balaguer-Ballester, E., Clark, N., Coath, M., Krumbholz, K., Denham, S.L. (2009). Understanding pitch perception as a hierarchical process with top-down modulation, PLoS Computational Biology, 5(3): e1000301, March 2009.

  • Pitch is one of the most important features of natural sounds, underlying the perception of melody in music and prosody in speech. However, the temporal dynamics of pitch processing are still poorly understood. We describe a neurocomputational model, which provides for the first time a unified account of the multiple time scales observed in pitch perception. The model contains a hierarchy of integration stages and uses feedback to adapt the effective time scales of processing at each stage in response to changes in the input stimulus. The model has features in common with a hierarchical generative process and suggests a key role for efferent connections from central to sub-cortical areas in controlling the temporal dynamics of pitch processing.

Balaguer-Ballester, E., Denham, S.L., Meddis, R. (2008), A cascade autocorrelation model of pitch perception, J. Acoust. Soc. Am., 124(4), 2186-2195.

  • Autocorrelation algorithms, in combination with computational models of the auditory periphery, have been successfully used to predict the pitch of a wide range of complex stimuli. However, new stimuli are frequently offered as counterexamples to the viability of this approach. This study addresses the issue of whether in the light of these challenges the predictive power of autocorrelation can be preserved by changes to the peripheral model and the computational algorithm.
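For readers unfamiliar with the approach, a bare-bones autocorrelation pitch estimator (without any peripheral model, unlike the work above) looks like this; it recovers the 200 Hz missing fundamental of a harmonic complex:

```python
import numpy as np

def pitch_autocorr(signal, sr, fmin=50.0, fmax=500.0):
    """Estimate pitch as the lag maximising the autocorrelation
    within a plausible period range."""
    ac = np.correlate(signal, signal, mode='full')[len(signal) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

sr = 8000
t = np.arange(4000) / sr
# Harmonic complex with a missing fundamental at 200 Hz.
sig = sum(np.sin(2 * np.pi * f * t) for f in (400.0, 600.0, 800.0))
print(round(pitch_autocorr(sig, sr)))   # 200
```

The estimator finds the common period shared by the harmonics (40 samples at 8 kHz), which is exactly the kind of prediction that the counterexample stimuli discussed in the paper try to break.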

