A Review of Physical and Perceptual Feature Extraction Techniques for Speech, Music and Environmental Sounds

Alías-Pujol, Francesc; Socoró, Joan Claudi; Sevillano, Xavier

doi:https://doi.org/10.3390/app6050143

Publication date

2016-05

URI http://hdl.handle.net/20.500.14342/3449

Abstract

Endowing machines with sensing capabilities similar to those of humans is a prevalent quest in engineering and computer science. In the pursuit of making computers sense their surroundings, a huge effort has been made to allow machines and computers to acquire, process, analyze and understand their environment in a human-like way. Focusing on the sense of hearing, the ability of computers to sense their acoustic environment as humans do goes by the name of machine hearing. To achieve this ambitious aim, the representation of the audio signal is of paramount importance. In this paper, we present an up-to-date review of the most relevant audio feature extraction techniques developed to analyze the most common audio signals: speech, music and environmental sounds. Besides revisiting classic approaches for completeness, we include the latest advances in the field based on new domains of analysis together with novel bio-inspired proposals. These approaches are described following a taxonomy that organizes them according to their physical or perceptual basis and then subdivides them by domain of computation (time, frequency, wavelet, image-based, cepstral, or other domains). The description of the approaches is accompanied by recent examples of their application to problems related to machine hearing.

Document Type

Article

Published version

Language

English

Subject (CDU)

531/534 - Mechanics

Keywords

Auditory perception

Music perception

Speech perception

Pages

13 p.

Publisher

MDPI

Is part of

Applied Sciences. 2016, Vol. 6, No. 5 (May)

This item appears in the following Collection(s)

Articles published in journals [647]

Rights

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by/4.0/