TY - GEN
T1 - Bird species classification with audio-visual data using CNN and multiple kernel learning
AU - Bold, Naranchimeg
AU - Zhang, Chao
AU - Akashi, Takuya
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Recently, deep convolutional neural networks (CNNs) have become a new standard in many machine learning applications, not only in image processing but also in audio processing. However, most studies explore only a single type of training data. In this paper, we present a study on classifying bird species by combining deep neural features of both visual and audio data using a kernel-based fusion method. Specifically, we extract deep neural features from the activation values of an inner layer of a CNN and combine them by multiple kernel learning (MKL) to perform the final classification. In the experiment, we train and evaluate our method on the standard CUB-200-2011 data set combined with an audio data set that we collected ourselves, covering 200 bird species (classes). The experimental results indicate that our CNN+MKL method, which uses both modalities, outperforms single-modality methods, some simple kernel combination methods, and the conventional early fusion method.
KW - Bird species classification
KW - Feature combination
KW - Multimodal fusion
KW - Multiple kernel learning
UR - http://www.scopus.com/inward/record.url?scp=85077117675&partnerID=8YFLogxK
U2 - 10.1109/CW.2019.00022
DO - 10.1109/CW.2019.00022
M3 - Conference contribution
AN - SCOPUS:85077117675
T3 - Proceedings - 2019 International Conference on Cyberworlds, CW 2019
SP - 85
EP - 88
BT - Proceedings - 2019 International Conference on Cyberworlds, CW 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th International Conference on Cyberworlds, CW 2019
Y2 - 2 October 2019 through 4 October 2019
ER -