1
Paper
Author
Title
publisher
keywords
year
sound classification applications/use cases
algorithms
preprocessing
Feature Extraction
steps/process
Denoising techniques
Tools used
Context awareness in sound classification
design/model/framework
setup images
field tests
results
challenges/limitations
Accuracy Levels
Other Information
Classifies
formula
graphs
Algorithm/Flowchart
Architecture
Pseudocode
Network/Component Diagram
2
ACM1
Hong R. et al.
Video Accessibility Enhancement for Hearing-Impaired Users
ACM
Accessibility, dynamic captioning, hearing impairment
2011
Dynamic Captioning
Viola Jones, Haar feature based cascade mouth detector
script location, script-speech alignment, and voice volume estimation
videos along with scripts but can be extended to process general videos without scripts
Dynamic captioning puts scripts at suitable positions to help hearing-impaired audiences better recognize the speakers
script location, script-speech alignment, and voice volume estimation
Video Feeds
better tracking of the scripts and perception of the moods conveyed by the variation of volume
Focuses on dynamic captioning rather than the user interface
80
Speech in video to dynamic caption
Y
Gaussian distance, linear representation
y
Y
accessibility enhancement & script-speech alignment
face mapping
3
ACM2
Wang W. et al.
A Smartphone-based Digital Hearing Aid to Mitigate Hearing Loss at Specific Frequencies
ACM
Digital hearing aids, smartphone, sound classification
2014
Hearing Loss of certain frequencies among elderly
GMM classifier, WOLA (Weighted Overlap-Add) filter bank
Speech processing in the frequency domain and sound classification to classify input sounds into speech and speech-with-noise categories. WOLA analysis filter banks then split the sound into different frequency bands, which are amplified (or attenuated) in the specific frequency ranges at which the user's hearing is impaired. Finally, the WOLA synthesis filter bank reconstructs the acoustic signal from the amplified sub-band signals, which is sent to the receiver for playout.
Smartphones
acoustic signals
Frequency-domain processing is currently slow due to its computational complexity
Audio Frequencies
Y
Y
Y
hearing aid app (Application and Storage)
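The frequency-domain pipeline this row describes (analysis filter bank, per-band amplification at the impaired frequencies, synthesis) can be sketched briefly. SciPy's STFT/ISTFT stands in for the paper's WOLA filter bank, and the band layout and gains are invented for the example:

```python
import numpy as np
from scipy.signal import stft, istft

def compensate(x, fs, band_gains_db, nperseg=256):
    """Boost the frequency bands where the user's hearing is impaired.

    SciPy's STFT/ISTFT stands in for the paper's WOLA analysis/synthesis
    filter bank; the uniform band layout is an illustrative assumption.
    """
    f, _, Z = stft(x, fs=fs, nperseg=nperseg)             # analysis filter bank
    edges = np.linspace(0, len(f), len(band_gains_db) + 1).astype(int)
    for b, gain_db in enumerate(band_gains_db):           # per-band gain
        Z[edges[b]:edges[b + 1]] *= 10 ** (gain_db / 20)
    _, y = istft(Z, fs=fs, nperseg=nperseg)               # synthesis filter bank
    return y

# Example: boost the upper half of the spectrum by 12 dB.
fs = 16000
x = np.random.randn(fs)  # placeholder for a recorded signal
y = compensate(x, fs, band_gains_db=[0, 0, 12, 12])
```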
4
ACM3
Bountourakis V. et al.
Machine Learning Algorithms for Environmental Sound Recognition: Towards Soundscape Semantics
ACM
Environmental Sound Recognition, audio classification, semantic audio analysis, computer audition, feature extraction, feature selection, machine learning algorithms
2015
Comparison between algorithms for sound classification
• k-Nearest Neighbors (k-NN)
• Naive Bayes
• Support Vector Machines (SVM)
• C4.5 algorithm (decision tree)
• Logistic Regression
• Artificial Neural Networks (ANN)
stationary (frequency-based) feature extraction and non-stationary (time-frequency-based) feature extraction
database, segmentation, feature extraction, feature selection, classification, evaluation
Environmental Sounds
database, segmentation, feature extraction, feature selection, classification, evaluation
Sound Signals
The highest classification rates were achieved by k-NN with feature set 3 (85.8%), ANN with feature set 2 and PCA (86.95%), and SVM with feature set 2 and PCA (85.41%).
airplanes, alarms, applause, birds, dogs, footsteps, motorcycles, rain, rivers, sea waves, thunders, wind.
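A minimal scikit-learn sketch of the comparison this row describes: the six listed classifiers evaluated on one feature matrix after standardization and PCA. The feature values are random placeholders; the paper's feature sets and tuning are not reproduced:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: 200 clips x 40 features, 12 classes (as in the row above).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 40)), rng.integers(0, 12, size=200)

classifiers = {
    "k-NN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(),  # CART as a stand-in for C4.5
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "ANN": MLPClassifier(max_iter=1000),
}
for name, clf in classifiers.items():
    # Standardize, keep 95% of the variance with PCA, then classify.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=0.95), clf)
    print(name, cross_val_score(pipe, X, y, cv=5).mean())
```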
5
ACM4
Bragg D. et al.
A Personalizable Mobile Sound Detector App Design for Deaf and Hard-of-Hearing Users
ACM
Sound detection, accessibility, deaf, hard-of-hearing
2016
Notifying deaf and hard-of-hearing people about sounds around them
Smartphones, mobile application
Accessibility
Sound Signals
87 participants (51 female, 36 male); 50 were deaf and 37 were hard-of-hearing. Ages ranged from 18 to 99 (mean 42, SD 17).
An app design usable by deaf and hard-of-hearing users, who record their own training examples of sounds
70
No participants used apps to monitor sounds outside of the study
Participants revealed they wanted classifications of dropping items, walking/running behind them, moving carts, fire drills, printers, conversations, and baby sounds.
vehicles passing by, children having bad dreams, smoke and carbon monoxide detectors, appliances making unusual noises, water running, socializing, something dropping on the floor, gunshots, conversations, and distinguishing between multiple sources with a similar frequency range
Users train the system using the sounds at home
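Because users train the system on sounds recorded at home, the core loop is enroll-then-match. A hypothetical sketch using librosa MFCC summaries and nearest-neighbor matching; the feature choice and distance threshold are assumptions, not the paper's design:

```python
import numpy as np
import librosa  # assumed available for MFCC extraction

def embed(path):
    """One MFCC summary vector per clip."""
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20).mean(axis=1)

enrolled = {}  # label -> list of embeddings from user recordings

def enroll(label, path):
    enrolled.setdefault(label, []).append(embed(path))

def detect(path, threshold=60.0):
    """Return the nearest enrolled label, or None if nothing is close."""
    q = embed(path)
    dist, label = min(((np.linalg.norm(q - e), lbl)
                       for lbl, vecs in enrolled.items() for e in vecs),
                      default=(None, None))
    return label if dist is not None and dist < threshold else None
```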
6
ACM6
Kurnaz S. & Aljabery M.
Predict the type of hearing aid of audiology patients using data mining techniques
ACM
audiology, National Health System, audiograms, BTE, ITE, Machine Learning, Data Mining, Hearing Aid.
2018
Choice of hearing aid
AdaBoost classifier, Random forests classification, Logistic Regression, Orange Canvas Modeler
Hearing aid choice
ML Model
7
ACM7
Li M. et al.
Environmental Noise Classification Using Convolution Neural Networks
ACM
Environmental noise; Convolution Neural Network (CNN); Short-Time Fourier Transform (STFT); Log Mel-Frequency Spectral Coefficients (MFSCs); Tensorflow
2018
CNN
Short-Time Fourier Transform (STFT)
Environment
Log Mel-Frequency Spectral Coefficients
Y
ML, STFT
CNN
8
ACM8
Alsouda Y. et al.
IoT-based Urban Noise Identification Using Machine Learning: Performance of SVM, KNN, Bagging, and Random Forest
ACM
urban noise; smart cities; support vector machine (SVM); k-nearest neighbors (KNN); bootstrap aggregation (Bagging); random forest; mel-frequency cepstral coefficients (MFCC); internet of things (IoT).
2019
classification of environmental sounds
SVM, KNN, Bagging, Random Forest
mel-frequency cepstral coefficients (MFCC)
Feature extraction, model training, classifier, prediction
Raspberry Pi, microphone HAT
Environment
Feature extraction, model training, classifier, prediction
Sound
High noise identification accuracy, in the range 88%–94%.
Accuracy [%] by classifier: SVM 93.87, KNN 93.88, Bagging 87.81, Random Forest 89.91
quietness, silence, car horn, children playing, gunshot, jackhammer, siren, and street music
K-Nearest Neighbors
ML, MFCC
9
ACM9
Wang et al.
Privacy-aware environmental sound classification for indoor human activity recognition
ACM
Smart Buildings, Privacy-aware Environmental Sound Recognition, Voice Bands Stripping, Internet Of Things, Computational Efficiency, Web Crawling, Mel Frequency Cepstral Coefficients, Linear Predictive Cepstral Coefficients, Support Vector Machine
2019
indoor environmental sound classification
Decision tree, Random Forest, Mixed Gaussian, Naive Bayes, SVM(Linear & RBF kernel), Artificial Neural Network
Environment
0.9
Y
ML, Feature extraction
10
ACM10
Inik O. & Seker H.
Convolutional Neural Networks for the Classification of Environmental Sounds
ACM
Environmental sound classification (ESC), Deep Learning, Convolutional Neural Networks (CNN), Urbansound8k
2020
classification of environmental sounds
CNN
Intel Core i9-7900X 3.30 GHz ×20 processor, 64 GB RAM, and 2× GeForce RTX 2080 Ti graphics cards. MATLAB R2020a 64-bit (win64)
Environment
0.825
air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music
CNN
11
ACM11
Sigtia S. et al.
Automatic Environmental Sound Recognition: Performance Versus Computational Cost
ACM
Automatic environmental sound recognition, computational auditory scene analysis, deep learning, machine learning.
2016
classification of environmental sounds
Gaussian Mixture Models, SVM, DNN, RNN
Mel-frequency cepstral coefficient (MFCC)
Baby Cry Data Set, Smoke Alarm Data Set
Environment
Deep Neural Networks yield the best ratio of sound classification accuracy across a range of computational costs, while Gaussian Mixture Models offer a reasonable accuracy at a consistently small cost, and Support Vector Machines stand between the two in terms of the compromise between accuracy and computational cost.
smoke alarms and baby cries
Gaussian Mixture Models, SVM, DNN (Feed forward) RNN
Y
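The GMM baseline in this row is usually one mixture model per class, with prediction by highest log-likelihood. A minimal scikit-learn sketch under that assumption (the component count is an arbitrary choice):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmms(X, y, n_components=4):
    """One GMM per class, fit on that class's feature vectors (e.g. MFCCs)."""
    return {c: GaussianMixture(n_components=n_components).fit(X[y == c])
            for c in np.unique(y)}

def predict(gmms, X):
    """Pick the class whose GMM gives the highest log-likelihood."""
    labels = list(gmms)
    scores = np.column_stack([gmms[c].score_samples(X) for c in labels])
    return np.asarray(labels)[scores.argmax(axis=1)]
```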
12
ACM12
Laar V. & Vries B.
A Probabilistic Modeling Approach to Hearing Loss Compensation
ACM
Hearing aids, hearing loss compensation, probabilistic modeling, factor graphs, message passing, machine learning
2016
Hearing aid (HA) algorithm tuning; a probabilistic modeling approach to the design of HA algorithms
Bayes factor (BF)
Evaluation of signal processing (SP), parameter estimation (PE), and model comparison (MC) tasks
Speech Understanding
performance evaluation, signal processing
Y
hearing aid signal processing
hearing aid agent
13
ACM13
Salehi H. et al.
Learning-Based Reference-Free Speech Quality Measures for Hearing Aid Applications
ACM
Hearing aids, speech quality, perceptual linear prediction, gammatone filterbank energies, reference-free quality assessment, support vector regression, machine learning.
2018
Speech quality of hearing aids
Speech Understanding
A group of 18 HI listeners was recruited to provide the speech quality ratings
Linear prediction
Feature extraction
14
IEEE1
Demir F. et al.
A New Deep CNN Model for Environmental Sound Classification
IEEE
Environmental sound classification, spectrogram images, CNN model, deep features
2020
Environmental sound classification
CNN
The spectrogram method converts the signals into time-frequency images, i.e., the loudness of a signal over time at the different frequencies present in a specific waveform
deep feature extraction
DCASE-2017 ASC and the UrbanSound8K datasets
Environment
86.7
air conditioner, car horn, children, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music
STFT, CNN, Accuracy
Y
ML, CNN, KNN
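The row above feeds spectrogram images to a CNN for deep feature extraction. A minimal PyTorch sketch of that idea; the layer sizes and depth are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class ESCNet(nn.Module):
    """Tiny CNN over log-spectrogram 'images' (batch, 1, mel_bins, frames)."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.head(self.features(x))

logits = ESCNet()(torch.randn(4, 1, 64, 128))  # 4 spectrogram images
```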
15
IEEE2
Ridha A. & Shehieb W.
Assistive Technology for Hearing-Impaired and Deaf Students Utilizing Augmented Reality
IEEE
Assistive technology; Augmented Reality; Deaf; Education; Hearing-Impairment; Machine Learning.
2021
augmented reality glasses that will assist students in their educational journey with real-time transcribing, speech emotion recognition, sound indications features, as well as classroom assistive tools.
AR Glasses
Environmental Sounds
A live transcription feature that uses Google Cloud services; transcribed lectures are stored for future reference in the classroom tools feature and can be shared with other students, making the platform usable for communication between students
71.3
Car Horn, Siren, Gunshots, Broken Glass
ML
PCB Schematic
16
IEEE3
Melati A. & Karyono K.
ANDROID BASED SOUND DETECTION APPLICATION FOR HEARING-IMPAIRED USING ADABOOSTM1 CLASSIFIER WITH REPTREE WEAKLEARNER
IEEE
sound detection for hearing-impaired, machine learning, AdaBoostM1, REPTree, Android
2014
Help hearing-impaired people detect and recognize sounds around them
AdaBoostM1 functioning as a classifier and REPTree as weak learner
Two databases: the first contains indoor sounds and the second outdoor sounds, with a total of 23 sounds
Environment
Low accuracy; a better approach is needed
40
baby crying, beep, broom sweeps, door creaking, door slam, door bell, foot step, hairdryer, knocking door, ringing, water runs, whistle, airplanes, applause, birds chirp, car honk, crowded, dog bark, engine start, screaming, thunder, train, wind blowing
Y
ML
AdaBoostM1, Bagging
17
IEEE4
Chen C. et al.
Audio-Based Early Warning System of Sound Events on the Road for Improving the Safety of Hearing-Impaired People
IEEE
Android application, warning, audio detection, machine learning
2019
Road safety for the hearing-impaired
CNNs
MFCC
UrbanSound8K
Safety
CNN is effective for environmental sound classification tasks with appropriate parameter settings and feature sets
66.4
Car-approaching, Car-horn, Children-playing, Dog-barking, Gun-shot, Construction, Siren, Engine-idling
MFCC
Y
ML
18
IEEE5
Bhat G. et al.
Automated machine learning based speech classification for hearing aid applications and its real-time implementation on smartphone
IEEE
Automated Machine Learning, AutoML, Voice Activity Detection (VAD), Hearing aid devices (HADs), smartphone, real-time
2020
speech classification
AutoML based VAD, CNN
Speech Understanding
Speech
Signal model and training feature
Y
ML
19
IEEE6
Healy E. & Yoho S.
Difficulty understanding speech in noise by the hearing impaired: Underlying causes and technological solutions
IEEE
2016
poor speech understanding
single-microphone algorithm to extract speech from noise, DNN
Speech Understanding
Groups of 10 NH and 10 HI subjects heard IEEE sentences in unprocessed speech-plus-noise conditions and corresponding algorithm-processed conditions. In this study, multi-talker babble and cafeteria noise, each at two SNRs, were employed.
Speech
ML
20
IEEE7
Jatturas C. et al.
Feature-based and Deep Learning-based Classification of Environmental Sound
IEEE
2019
comparison techniques for environmental sound classification
SVM, MLP, Deep Learning
Urban Sound 8k, Scikit-learn and Tensorflow
Environment
75
Air conditioner, children playing, engine idling, siren, and street music
STFT, NN, SVM
Y
CNN
21
IEEE8
Saleem N. et al.
Machine Learning Approach for Improving the Intelligibility of Noisy Speech
IEEE
Machine learning, speech enhancement, intelligibility, time-frequency masking, deep neural networks
2020
Intelligibility of Noisy Speech
Speech Understanding
RNN
Y
RNN
22
IEEE9
Jatturas C. et al.
Recurrent Neural Networks for Environmental Sound Recognition using Scikit-learn and Tensorflow
IEEE
2019
Environmental sound classification
MLP, SVM
MFCC
Urban Sound 8k
Environment
deep neural network models outperform both MLP and SVM with PCA
90
Car-approaching, Car-horn, Children-playing, Dog-barking, Gun-shot, Construction, Siren, Engine-idling
STFT, SVM
Y
RNN
23
IEEE10
Davis N. & Suresh K.
Environmental Sound Classification using Deep Convolutional Neural Networks and Data Augmentation
IEEE
2018
Environmental sound classification
Deep Convolutional Neural Network
Time Stretching, pitch shifting, Dynamic Range Compression, Background Noise, Linear Prediction Cepstral Coefficients (LPCC)
Urbansound 8K
Environment
80
air conditioner, car horns, children playing, dog bark, drilling, engine idling, gunshot, jackhammers, siren and street music
LPCC
Y
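The augmentations this row lists map to a few library calls. A sketch using librosa and NumPy; the parameter values and the tanh stand-in for dynamic range compression are assumptions:

```python
import numpy as np
import librosa

def augment(y, sr, rng=np.random.default_rng()):
    """Return augmented copies of one audio clip y sampled at rate sr."""
    return {
        "time_stretch": librosa.effects.time_stretch(y, rate=1.1),
        "pitch_shift": librosa.effects.pitch_shift(y, sr=sr, n_steps=2),
        "background_noise": y + 0.005 * rng.standard_normal(len(y)),
        # Crude tanh waveshaper as a stand-in for dynamic range compression.
        "drc": np.tanh(3.0 * y) / np.tanh(3.0),
    }
```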
24
IEEE11
Chu S. et al.
Environmental Sound Recognition With Time–Frequency Audio Features
IEEE
Audio classification, auditory scene recognition, data representation, feature extraction, feature selection, matching pursuit, Mel-frequency cepstral coefficient (MFCC)
2009
Environmental sound classification
Environment
Restaurant, Casino, Train, Rain, and Street ambulance
short time energy, zero crossing rate, signal decomposition
Y
25
IEEE12
Chu S. et al.
WHERE AM I? SCENE RECOGNITION FOR MOBILE ROBOTS USING AUDIO FEATURES
IEEE
2006
Environmental sound classification
Environment
26
IEEE13
Ullo S. et al.
Hybrid Computerized Method for Environmental Sound Classification
IEEE
Environmental sound classification, Optimal allocation sampling, spectrogram, convolutional neural network, classification techniques
2020
Environmental sound classification
AlexNet and Visual Geometry Group (VGG)-16 networks; decision tree (fine, medium, coarse kernel), k-nearest neighbor (fine, medium, cosine, cubic, coarse and weighted kernel), support vector machine, linear discriminant analysis, bagged tree and softmax classifiers
short-time Fourier transform (STFT)
Deep Feature Extraction
ESC-10, a ten-class environmental sound dataset,
The experiments were carried out in MATLAB (R2018) on a computer with an Intel i7 third-generation 3.4 GHz processor, 8 GB RAM, and 64-bit memory.
Environment
AlexNet (FC-6) features with a fine-kernel decision tree reach 89.9%.
90.1%, 95.8%, 94.7%, 87.9%, 95.6%, and 92.4% are obtained with a decision tree, k-nearest neighbor, support vector machine, linear discriminant analysis, bagged tree, and softmax classifier, respectively.
The methods proposed until now have been limited in terms of performance, so an effective and robust method is required to classify environmental signals accurately. In the present work, the authors propose a method in which the dimension of the data is reduced by OAS. The reduced data are then transformed into images by STFT, and several features are extracted from the spectrograms using two pre-trained CNNs.
Classes from dataset
STFT, Sample Size,
Y
ML
CNN
27
IEEE14
Zhang X. et al.
Dilated Convolution Neural Network with LeakyReLU for Environmental Sound Classification
IEEE
Environmental sound classification; Dilated Convolution Neural Network; Leaky Rectified Linear Unit; Activation Function
2017
Environmental sound classification
a dilated CNN-based ESC (D-CNN-ESC)
Transforming acoustic waves into low-level feature vectors following a commonly used method
UrbanSound8K, ESC50, and CICESE
Environment
The proposed D-CNN-ESC system outperforms the state-of-the-art ESC results obtained by a very deep CNN-ESC system on the UrbanSound8K dataset; the absolute error of the method is about 10% less than that of the compared method.
All classes in the 3 datasets
Y
ML
CNN
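A minimal PyTorch sketch of the two ingredients this row names, dilated convolutions and LeakyReLU; the channel counts and dilation schedule are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Stacked dilated convolutions grow the receptive field without pooling;
# padding = dilation keeps the feature-map size for 3x3 kernels.
block = nn.Sequential(
    nn.Conv2d(1, 32, 3, dilation=1, padding=1), nn.LeakyReLU(0.1),
    nn.Conv2d(32, 32, 3, dilation=2, padding=2), nn.LeakyReLU(0.1),
    nn.Conv2d(32, 32, 3, dilation=4, padding=4), nn.LeakyReLU(0.1),
)
out = block(torch.randn(1, 1, 64, 128))  # (batch, ch, mel_bins, frames)
```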
28
IEEE15
Han B. & Hwang E.
ENVIRONMENTAL SOUND CLASSIFICATION BASED ON FEATURE COLLABORATION
IEEE
Environmental sound recognition, discrete chirplet transform, discrete curvelet transform, discrete Hilbert transform, feature extraction
2009
Environmental sound classification
SVM
We then applied equal-loudness level contours to each frame, to ensure that the signal more accurately represented human sound perception, and we eliminated the silence signal from the start and end points of each frame.
For traditional features, we collected mel-frequency cepstral coefficients (MFCC), zero-crossing rate (ZCR), spectral centroid (SC), spectral spread (SS), spectral flatness (SF), and spectral flux (SFX).
Environment
CDFs and ATFs are more effective than TFs for classification. Furthermore, when combined with TFs, they achieved the maximum accuracy.
three types of features: traditional features (TFs), change detection features (CDFs), and acoustic texture features (ATFs)
Street, road, talking, raining, bar, car
Hilbert transform, discrete chirplet transform
Y
ML, Feature extraction
29
IEEE16
Wang J. et al.
Environmental Sound Classification Using Hybrid SVM/KNN Classifier and MPEG-7 Audio Low-Level Descriptor
IEEE
2006
Environmental sound classification
Hybrid SVM/KNN
Audio Spectrum Centroid, Audio Spectrum Spread, Audio Spectrum Flatness
Environment
The proposed hybrid SVM/KNN classifier outperforms the HMM classifier in the MPEG-7 sound recognition tool
male speech (50), female speech (50), cough (50), laughing (49), screaming (26), dog barking (50), cat mewing (45), frog wailing (50), piano (40), glass breaking (34), gun shooting (33), and knocking (50). There are 527 sound files in total in the database.
SVM, KNN, Feature extraction (Audio Spectrum Centroid, spread and flatness)
ML
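One plausible reading of the hybrid SVM/KNN classifier, sketched below: trust the SVM where its decision margin is confident, otherwise defer to KNN. The fusion rule and threshold are assumptions; the paper's exact combination scheme may differ.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

class HybridSVMKNN:
    """Use the SVM where its margin is confident, else fall back to KNN."""
    def __init__(self, margin=0.5):
        self.svm = SVC()
        self.knn = KNeighborsClassifier()
        self.margin = margin

    def fit(self, X, y):
        self.svm.fit(X, y)
        self.knn.fit(X, y)
        return self

    def predict(self, X):
        conf = np.abs(self.svm.decision_function(X))
        if conf.ndim > 1:               # multi-class: take the largest margin
            conf = conf.max(axis=1)
        pred = self.svm.predict(X)
        unsure = conf < self.margin
        if unsure.any():                # low-margin samples go to KNN
            pred[unsure] = self.knn.predict(X[unsure])
        return pred
```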
30
IEEE17
Piczak K.
ENVIRONMENTAL SOUND CLASSIFICATION WITH CONVOLUTIONAL NEURAL NETWORKS
IEEE
environmental sound, convolutional neural networks, classification
2015
Environmental sound classification
CNN
ESC-50 and ESC-10, UrbanSound 8k
Environment
Publicly available datasets of environmental recordings are still very limited, both in number and in size.
UrbanSound8K dataset (LP - 73.1%, US - 73.7%)
Classes from dataset
ReLU
Y
ML
CNN
31
IEEE18
Salamon J. & Bello J.
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
IEEE
Environmental sound classification, deep convolutional neural networks, deep learning, urban sound dataset
2016
Environmental sound classification
CNN
Environment
CNN
Y
32
IEEE19
Wang J. et al.
Gabor-Based Nonuniform Scale-Frequency Map for Environmental Sound Classification in Home Automation
IEEE
Environmental sound classification, feature extraction, Gabor function, home automation, matching pursuit (MP), nonuniform scale-frequency map
2013
Environmental sound classification
SVM
Gabor Dictionary Based on Critical Frequency Bands
Nonuniform Scale-Frequency Map
Dimensional Reduction of Scale-Frequency Maps Using Principal Component Analysis and Linear Discriminant Analysis
Environment
The proposed feature is more appropriate for practical use, especially environmental sound classification, since it has higher robustness against noise.
0.8621
nonuniform scale frequency classifier
Y
ML
33
IEEE20
Nayak D. et al.
Machine Learning Models for the Hearing Impairment Prediction in Workers Exposed to Complex Industrial Noise: A Pilot Study
IEEE
Complex noise exposure, Hearing impairment, Machine learning, Noise-induced hearing loss
2018
Hearing Impairment Prediction in Workers Exposed to Complex Industrial Noise
Environment
Data sets were collected from 1,644 workers exposed to complex noises in 53 workshops of 17 factories in the Zhejiang province of China
78.6 and 80.1
34
IEEE21
Tokozume Y. & Harada T.
LEARNING ENVIRONMENTAL SOUNDS WITH END-TO-END CONVOLUTIONAL NEURAL NETWORK
IEEE
Environmental sound classification, convolutional neural network, end-to-end system, feature learning
2017
ESC
We refer to our CNN as EnvNet
Urbansound 8K, YorNoise
Environment
81.3
CNN
35
IEEE22
local binary pattern
36
IEEE23
ML
CNN
ML
37
ScienceDirect1
Nossier S. et al.
Enhanced smart hearing aid using deep neural networks
SCIENCE DIRECT
Deep learning; Dropout; Noise of interest awareness; Smart hearing aid; Speech enhancement
2019
Smart hearing aid
DNN
Hearing aid
89
Car horn
NN
ML
38
ScienceDirect2
Abdoli et al.
End-to-end environmental sound classification using a 1D convolutional neural network
SCIENCE DIRECT
Convolutional neural network, environmental sound classification, deep learning, Gammatone filterbank
2019
Environmental sound classification using a 1D CNN
CNN
Environment
Sound
Feature extraction, MSE
Y
ML
CNN
39
ScienceDirect3
Mushtaq Z. & Su S.F.
Environmental sound classification using a regularized deep convolutional neural network with data augmentation
SCIENCE DIRECT
Data augmentation, environmental sound classification, regularization, deep convolutional neural network, Urbansound8k
2020
Environmental Sound Classification
DCNN
Environment
95.3
CNN
Y
ML
CNN
40
ScienceDirect4
Chen Y. et al.
Environmental sound classification with dilated convolutions
SCIENCE DIRECT
Sound information retrieval, environmental sound classification, dilated convolutions
2018
Sound signal retrieval
CNN
Sound retrieval
ReLU, Softmax value, cross entropy, CNN
Y
ML
CNN
41
ScienceDirect6
Demir F. et al.
A new pyramidal concatenated CNN approach for environmental sound classification
SCIENCE DIRECT
Sound classification, deep learning, SVM, STFT, CNN
2020
Environmental sound classification
Deep learning CNN
Short-Time Fourier Transform
VGGNet-16, VGGNet-19, DenseNet-201
UrbanSound8K, ESC-10, ESC-50
Environment
Sound
94.8, 81.4, 78.1
Short Time Fourier Transform (STFT)
Y
ML
42
ScienceDirect7
Mushtaq Z. et al.
Spectral images based environmental sound classification using CNN with meaningful data augmentation
SCIENCE DIRECT
Environmental sound classification, convolutional neural network, spectrogram, data augmentation, transfer learning
2021
An approach to spectral-image-based environmental sound classification using Convolutional Neural Networks (CNN) with meaningful data augmentation
CNN
ESC-10, ESC-50, UrbanSound8K
Environment
Sound
99.04, 99.49, 97.57
CNN
Y
ML, Transfer Learning
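This row pairs spectral images with transfer learning. A hedged sketch of the usual recipe, fine-tuning an ImageNet-pretrained backbone on spectrogram images; ResNet-18 and the 10-class head are illustrative assumptions, not necessarily the paper's choices:

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone and freeze its feature extractor.
net = models.resnet18(weights="IMAGENET1K_V1")
for p in net.parameters():
    p.requires_grad = False
# Replace the classification head for 10 environmental sound classes;
# spectrograms would be fed as 3-channel 224x224 images.
net.fc = nn.Linear(net.fc.in_features, 10)
```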
43
ScienceDirect8
Ahmad S. et al.
Environmental sound classification using optimum allocation sampling based empirical mode decomposition
SCIENCE DIRECT
Environmental sound classification, optimum allocation sampling, empirical mode decomposition, multi-class least squares support vector machine, extreme learning machine
2020
Automatic environmental sound classification
Optimum allocation sampling
ESC-10
Environment
Sound
87.25, 77.61
dog bark, rain, sea waves, baby cry, clock tick, person sneeze, helicopter, chainsaw, rooster, and fire crackling
Empirical Mode Decomposition, feature extraction (Approximate Entropy, Permutation Entropy, Log-energy Entropy, Zero Crossing Rate), SVM, NN
Y
ML
44
ScienceDirect9
Mushtaq Z. & Su S.
Environmental sound classification using a regularized deep convolutional neural network with data augmentation
SCIENCE DIRECT
Data augmentation, environmental sound classification, regularization, deep convolutional neural network, Urbansound8k, ESC-10, ESC-50
2020
ESC
CNN
Mel-spectrogram (Mel), Mel-Frequency Cepstral Coefficient (MFCC) and Log-Mel by using DCNN
ESC-10, ESC-50, US8K
94.9, 89.2, 95.3
Y
Y
Y
45
Springer1
Medhat F. et al.
Masked Conditional Neural Networks for Environmental Sound Classification
Springer
Conditional Neural Networks (CLNN), Masked Conditional Neural Networks (MCLNN), Restricted Boltzmann Machine (RBM), Conditional Restricted Boltzmann Machine (CRBM), Deep Belief Nets, Environmental Sound Recognition (ESR), YorNoise
2017
Environmental sound classification
Conditional Neural Network
Urbansound 8K, YorNoise
Environment
73
air conditioner, car horns, children playing, dog bark, drilling, engine idling, gunshot, jackhammers, siren and street music
CNN, Feature extraction
Y
CNN
CNN
46
Springer2
Zhang Z. et al.
Deep Convolutional Neural Network with Mixup for Environmental Sound Classification
Springer
Environmental sound classification, convolutional neural network, mixup
2018
ESC
CNN
The ESC-10 dataset is a subset of 10 classes (400 samples); the UrbanSound8K dataset is a collection of 8732 short (up to 4 s) audio clips of urban sounds.
Environment
91.7, 83.9, 83.7
dog bark, rain, sea waves, baby cry, clock tick, person sneeze, helicopter, chainsaw, rooster, fire crackling, air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music
We propose a novel CNN as our ESC system model, inspired by VGGNet. In order to achieve a better performance on ESC, the effect of the mixup hyper-parameter α is further explored: Figure 5 shows the change of accuracy with α ranging over [0.1, 0.5], and the best accuracy on all three datasets is achieved when α = 0.2.
Generating training data
Y
ML
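Mixup, as used in this row, trains on convex combinations of example pairs and their one-hot labels; the record reports α = 0.2 as best. A minimal sketch:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two training examples; y1/y2 are one-hot label vectors."""
    lam = np.random.beta(alpha, alpha)  # mixing weight drawn from Beta(α, α)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```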
47
INTERSPEECH1
Sailor B. et al.
Unsupervised Filterbank Learning Using Convolutional Restricted Boltzmann Machine for Environmental Sound Classification
INTERSPEECH
Unsupervised Filterbank Learning, ConvRBM, Sound Classification, CNN
2017
Environmental sound classification
Convolutional Restricted Boltzmann Machine (ConvRBM) filterbank learning with a supervised Convolutional Neural Network (CNN)
ESC -50 dataset
Environment
Sound/Audio Signal
The proposed ConvRBM-BANK outperforms EnvNet [18] even without the system combination; this shows the significance of unsupervised generative training using ConvRBM.
78.45
ConvRBM-BANK performs significantly better than CNN with FBEs
CNN
Y
Y
CNN
48
INTERSPEECH2
Sharma J. et al.
Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network
INTERSPEECH
Convolutional Neural Networks, Attention, Multiple Feature Channels, Environment Sound Classification
2020
Environmental sound classification
CNN
Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), Constant Q-transform (CQT) and Chromagram
ESC-10, ESC-50, US8K
Environment
94.75 (ESC-10), 87.45 (ESC-50), 97.52 (US8K)
We stop at 128 features, which produces the best results, to avoid increasing the complexity of the model.
CNN
N
Y
CNN
49
arXiv preprint
Mohaimenuzzaman Md. et al.
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices
arXiv
Deep Learning, Audio Classification, Environmental Sound Classification, Acoustics, Intelligent Sound Recognition, Micro-Controller, IoT, Edge-AI, ESC-50
2021
ESC
For ACDNet, which produces above state-of-the-art accuracy on ESC-10 (96.65%) and ESC-50 (87.1%), we describe the compression pipeline and show that it allows us to achieve 97.22% size reduction and 97.28%
ACDNet is implemented in PyTorch version 1.7.1 and the Wavio audio library is used to process the audio files.
ESC-10, ESC-50, US8K
Environment
While limitations of the programming environment have restricted the accuracy of our current test deployment on a physical MCU, we have conclusively shown that 81.5% accuracy is achievable on such a resource-impoverished device, close to the state-of-the-art and above human performance.
96.65
It is likely that the performance can be improved by using quantization-aware training and pruning. Secondly, we would like to try the SpArSe approach for further optimisation, now that we have developed Micro-ACDNet as a suitable starting point.
Training sample, learning rate, pruning process
Y
CNN
Hybrid pruning
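The compression pipeline above relies on pruning. A minimal PyTorch sketch of magnitude pruning; the paper's hybrid pruning and quantization procedure is more involved, and the 90% ratio here is illustrative:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Conv2d(16, 32, kernel_size=3)
prune.l1_unstructured(layer, name="weight", amount=0.9)  # zero the smallest 90%
prune.remove(layer, "weight")  # bake the sparsity into the weight tensor
```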
50
MDPI1
Dempster-Shafer Evidence Theory
ML
CNN
51
crossover fitting, user preference matching, population diversity
ML
Presentation, application, storage
52
MDPI3
Speech Enhancement, CNN
CNN
53
MDPI4
soundscaping, source mixing and source modeling, STFT, posterior distribution
soundscaping, source mixing and source modeling
54
MDPI5
acoustic feedback signal, microphone, feedback and error signals, computational complexity
Hearing aid structure/circuit
55
MDPI6
56
MDPI7
Sampling frequency, Speech Quality Perception Evaluation
ML, Feature extraction, Evaluation process
denoising autoencoder networks