Sound classification: Machine learning
No.13667870
I am trying to write something that scans an audio file and tells when and what kind of anuran is present (from a set of six different species). I am having trouble understanding what the workflow for this whole process should be.
What I am doing so far:
>scanning several sound samples from frog databases in the time domain and segmenting that audio around where the calls might be (pic related)
>using the frequency information from those time slices to train a supervised model
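A minimal sketch of those two steps, assuming each time slice is already cut out as a 1-D array of samples at least `n_fft` long. The names `spectral_features` and `CentroidClassifier` are hypothetical, and the nearest-centroid rule is only a stand-in for whatever supervised model ends up being used:

```python
import numpy as np

def spectral_features(clip, n_fft=512, n_bands=16):
    """Summarize a clip as the log of its average power in n_bands frequency bands."""
    clip = np.asarray(clip, dtype=float)
    n_frames = len(clip) // n_fft  # assumes len(clip) >= n_fft
    frames = clip[:n_frames * n_fft].reshape(n_frames, n_fft) * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Average over time, drop the Nyquist bin, then pool bins into equal-width bands.
    bins_per_band = (n_fft // 2) // n_bands
    mean_power = power.mean(axis=0)[:n_bands * bins_per_band]
    bands = mean_power.reshape(n_bands, bins_per_band).sum(axis=1)
    return np.log(bands + 1e-10)

class CentroidClassifier:
    """Toy supervised model: predict the label whose mean feature vector is closest."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack([X[y == label].mean(axis=0) for label in self.labels_])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Euclidean distance from every sample to every class centroid.
        dists = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[dists.argmin(axis=1)]
```

Usage would be along the lines of `CentroidClassifier().fit(features, species_labels).predict(new_features)`, with one feature vector per segmented call.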
Most articles I am reading use the mel-frequency cepstrum to "just do it", plus a whole lot of dark magic.
Where can I read more on MFCCs? How would you recommend preprocessing the data? I basically need to remove sections without calls from several recordings, and possibly reduce the noise a bit. (I already downsample them to 16 kHz and filter them.)
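For reference, the "dark magic" is a fairly short pipeline: frame the signal, window each frame, take the power spectrum, pool it with triangular filters spaced on the mel scale, take the log, and decorrelate with a DCT. Below is a rough NumPy-only sketch of that, assuming 16 kHz mono input; the function names and defaults (25 ms frames, 10 ms hop, 26 filters, 13 coefficients) are illustrative choices, pre-emphasis and liftering are left out, and in practice a library routine such as `librosa.feature.mfcc` would likely be used instead.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):    # rising edge of the triangle
            fbank[i, k] = (k - left) / (center - left)
        for k in range(center, right):   # falling edge of the triangle
            fbank[i, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sr, n_mfcc=13, n_filters=26, n_fft=512, frame_ms=25.0, hop_ms=10.0):
    """Return an array of shape (n_frames, n_mfcc). Assumes frame length <= n_fft."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the filter axis; keep the first n_mfcc coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T
```

Each row is one frame's coefficients; averaging the rows over a segmented call gives a fixed-length feature vector for a classifier.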
Pic related is my shitty segmentation algorithm: I just look for areas that cross mean ± std*factor. It doesn't section the calls cleanly and works meh.
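One possible variation on that thresholding, sketched under assumptions rather than as a fix for the exact algorithm in the pic: threshold a short-time RMS envelope instead of the raw samples, then merge detections separated by short gaps and drop very short ones. The function name `segment_calls` and every default (20 ms frames, 100 ms merge gap, 50 ms minimum length) are made up for illustration and would need tuning per recording.

```python
import numpy as np

def segment_calls(signal, sr, frame_ms=20.0, factor=1.0, min_gap_ms=100.0, min_len_ms=50.0):
    """Return (start_sec, end_sec) pairs where the RMS envelope exceeds mean + factor * std."""
    frame = max(1, int(sr * frame_ms / 1000))
    n_frames = len(signal) // frame
    if n_frames == 0:
        return []
    frames = np.asarray(signal[:n_frames * frame], dtype=float).reshape(n_frames, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    active = rms > rms.mean() + factor * rms.std()

    # Find runs of consecutive active frames as [start_frame, end_frame) pairs.
    padded = np.concatenate(([0], active.astype(int), [0]))
    edges = np.flatnonzero(np.diff(padded))
    runs = zip(edges[::2], edges[1::2])

    # Merge runs separated by a short gap, then drop runs that are too short.
    max_gap = int(min_gap_ms / frame_ms)
    min_len = int(min_len_ms / frame_ms)
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = end
        else:
            merged.append([start, end])
    return [(s * frame / sr, e * frame / sr) for s, e in merged if e - s >= min_len]
```

Working on the envelope smooths out individual zero crossings inside a call, and the gap merging is what keeps a pulsed call from being split into many tiny slices.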