Bird species classification with audio-visual data using CNN and multiple kernel learning

Naranchimeg Bold, Chao Zhang, Takuya Akashi*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

Recently, deep convolutional neural networks (CNNs) have become a new standard in many machine learning applications, not only in image processing but also in audio processing. However, most studies explore only a single type of training data. In this paper, we present a study on classifying bird species by combining deep neural features of both visual and audio data using a kernel-based fusion method. Specifically, we extract deep neural features from the activation values of an inner layer of a CNN. We combine these features by multiple kernel learning (MKL) to perform the final classification. In the experiment, we train and evaluate our method on the standard CUB-200-2011 data set combined with an audio data set that we collected ourselves, covering 200 bird species (classes). The experimental results indicate that our CNN+MKL method, which combines both categories of data, outperforms single-modality methods, some simple kernel combination methods, and the conventional early fusion method.
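The kernel-based fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which kernels, which MKL solver, or which CNN layer they used. The sketch assumes random arrays standing in for the CNN activations (X_visual, X_audio), RBF kernels, kernel weights chosen by a simple centered kernel-target alignment heuristic in place of a full MKL solver, and a kernel ridge classifier as the final stage. All function and variable names are hypothetical.

```python
import numpy as np

# ASSUMPTION: synthetic features stand in for CNN inner-layer activations.
rng = np.random.default_rng(0)
n_classes, n_per_class = 3, 20
y = np.repeat(np.arange(n_classes), n_per_class)

def fake_features(dim):
    """Class-dependent Gaussian blobs used as placeholder deep features."""
    means = rng.normal(size=(n_classes, dim)) * 3.0
    return means[y] + rng.normal(size=(len(y), dim))

X_visual, X_audio = fake_features(16), fake_features(8)

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def alignment_weights(kernels, labels):
    """Heuristic kernel weights from centered kernel-target alignment.

    ASSUMPTION: a stand-in for an MKL solver, not the paper's algorithm.
    """
    n = len(labels)
    ideal = (labels[:, None] == labels[None, :]).astype(float)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    ideal_c = H @ ideal @ H
    scores = []
    for K in kernels:
        Kc = H @ K @ H
        a = (Kc * ideal_c).sum() / (np.linalg.norm(Kc) * np.linalg.norm(ideal_c))
        scores.append(max(a, 0.0))
    w = np.array(scores)
    # Fall back to uniform weights if no kernel aligns with the labels.
    return w / w.sum() if w.sum() > 0 else np.full(len(kernels), 1.0 / len(kernels))

# Train/test split over sample indices.
idx = rng.permutation(len(y))
train, test = idx[:45], idx[45:]
modalities = [(X_visual, 1.0 / 16), (X_audio, 1.0 / 8)]  # (features, gamma)

# One kernel per modality, then a weighted sum (the fusion step).
K_train = [rbf_kernel(X[train], X[train], g) for X, g in modalities]
weights = alignment_weights(K_train, y[train])
K_fused = sum(w * K for w, K in zip(weights, K_train))

# Kernel ridge classifier on one-hot targets using the fused kernel.
targets = np.eye(n_classes)[y[train]]
alpha = np.linalg.solve(K_fused + 1e-2 * np.eye(len(train)), targets)

K_test = sum(w * rbf_kernel(X[test], X[train], g)
             for w, (X, g) in zip(weights, modalities))
pred = (K_test @ alpha).argmax(1)

print("kernel weights:", weights)
print("held-out accuracy:", (pred == y[test]).mean())
```

By contrast, early fusion would concatenate X_visual and X_audio into one feature vector before computing a single kernel. The sketch instead keeps one kernel per modality and learns how much each contributes, which is the general shape of the MKL approach the abstract describes.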

Original language: English
Title of host publication: Proceedings - 2019 International Conference on Cyberworlds, CW 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 85-88
Number of pages: 4
ISBN (Electronic): 9781728122977
State: Published - 2019/10
Event: 18th International Conference on Cyberworlds, CW 2019 - Kyoto, Japan
Duration: 2019/10/02 → 2019/10/04

Publication series

Name: Proceedings - 2019 International Conference on Cyberworlds, CW 2019

Conference

Conference: 18th International Conference on Cyberworlds, CW 2019
Country/Territory: Japan
City: Kyoto
Period: 2019/10/02 → 2019/10/04

Keywords

  • Bird species classification
  • Feature combination
  • Multimodal fusion
  • Multiple kernel learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Media Technology
