TU Berlin

Journal Publications by GRK members

Extracting auditory cues in tone-in-noise detection with a sparse feature selection algorithm
Citation key Schoenfelder2011
Author Schönfelder, V.H., and Wichmann, F.A.
Year 2011
ISSN 1662-5188
DOI 10.3389/conf.fncom.2011.53.00074
Journal Front. Comput. Neurosci.
Volume BC11: Computational Neuroscience & Neurotechnology Bernstein Conference & Neurex Annual Meeting 2011
Number 00074
Abstract Introduction: Although it is a classical paradigm in auditory psychophysics, Tone-in-Noise (TiN) detection still leaves open the question of which auditory cues human observers use to detect the signal tone (Fletcher, 1938). For narrow-band noise, no conclusive answer has been given as to which stimulus features explain observer behavior on a trial-by-trial level (Davidson, 2009). In the present study, a large behavioral data set for TiN detection was analyzed with a modern machine learning algorithm, L1-regularized logistic regression (Tibshirani, 1996). By enforcing sparse solutions, this method serves as a feature selection technique that identifies the set of features critical to explaining observer behavior (Schönfelder and Wichmann, 2011).
Methods: An extensive data set (>20,000 trials/observer) was collected with six naïve observers performing TiN detection in a yes/no paradigm. Stimuli were short (200 ms) sound bursts consisting of a narrow-band Gaussian noise masker (100 Hz) centered around a signal tone (500 Hz). Data were collected in blocks with fixed signal-to-noise ratios (SNRs) at four levels along the slope of the psychometric function. Data on response consistency were also collected; consistency was estimated from responses to pairs of similar stimuli and served as a measure of the reproducibility of single-trial decisions. Subsequently, linear observer models were fit to the data with L1-regularized logistic regression, separately for each observer and each SNR. The set of features used during data fitting consisted of three components: energy, sound spectrum and envelope spectrum, with each component comprising one (energy) or multiple (spectra) scalar entries characterizing the presented sound.
Results: In terms of the psychometric function, observers could hardly be distinguished; only one, a trained musician, had a significantly lower threshold than the rest. Nevertheless, the analysis of perceptual features resulted in two groups of subjects using different combinations of auditory cues, as already observed by Richards (1993). Energy alone, as suggested by Green and Swets (1966), was not sufficient to explain responses, nor was the shape of the envelope spectrum, as proposed by Dau (1996). Instead, most observers relied predominantly on a mixture of sound energy and asymmetric spectral filters, with a peak frequency centered above the signal tone and a negative lobe below. These filters may correspond to off-frequency listening effects or result from the asymmetry of the auditory filters. The results suggest that observers relied on multiple detectors instead of one single feature in this task. Differences in detection strategy across SNRs were not observed. In general, observers showed poor consistency in their responses, in particular at low SNRs. Nevertheless, single-trial predictions from the extracted observer models were reliable within the boundaries dictated by response consistency (Neri, 2006).
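As an illustration of how L1-regularized logistic regression can act as a feature selection step for yes/no responses, the following is a minimal sketch in plain Python. It is not the authors' implementation. Everything in it is an assumption made for the example: the simulated observer, the six generic feature dimensions, the weight values, the penalty strength `lam`, and the choice of proximal gradient descent (soft-thresholding) as the solver.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_trials(true_w, true_b, n_trials, rng):
    """Draw Gaussian feature vectors and Bernoulli yes/no responses
    from a hypothetical linear observer (illustration only)."""
    X, y = [], []
    for _ in range(n_trials):
        x = [rng.gauss(0.0, 1.0) for _ in true_w]
        p_yes = sigmoid(true_b + sum(w * v for w, v in zip(true_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p_yes else 0)
    return X, y


def fit_l1_logistic(X, y, lam, step=1.0, n_iter=200):
    """Minimize mean log-loss + lam * ||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= step * grad_b / n  # the bias term is not penalized
        for j in range(d):
            v = w[j] - step * grad_w[j] / n
            # Soft-thresholding: shrink toward zero, clip small weights to zero.
            w[j] = math.copysign(max(abs(v) - step * lam, 0.0), v)
    return w, b


rng = random.Random(0)
# Hypothetical observer that uses only features 0 and 3 out of six.
true_w = [1.5, 0.0, 0.0, -1.0, 0.0, 0.0]
X, y = simulate_trials(true_w, 0.0, 1000, rng)

weights, bias = fit_l1_logistic(X, y, lam=0.08)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("fitted weights:", [round(wj, 3) for wj in weights])
print("selected feature indices:", selected)
```

In an analysis like the one described, a fit of this kind would be repeated for each observer and each SNR, and the penalty strength would normally be chosen by cross-validation rather than fixed by hand as it is here.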