I just want to classify a 20 ms audio clip by whether or not people are talking in it.
Answered by
snakers4
Aug 20, 2021
Hi,
20 ms is too small an audio chunk for our VAD - see https://github.com/snakers4/silero-vad#how-vad-works
For very small chunks, please use WebRTC VAD instead.
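
For anyone landing here later, a minimal sketch of the WebRTC VAD route might look like the following. It assumes the third-party `webrtcvad` Python package (py-webrtcvad), which the reply above does not name specifically, and raw 16-bit mono PCM at 16 kHz. The helper names `frame_size_bytes`, `split_frames`, and `classify_frames` are made up for illustration.

```python
SAMPLE_RATE = 16000      # assumption: 16 kHz input
FRAME_MS = 20            # the 20 ms chunk size from the question
BYTES_PER_SAMPLE = 2     # 16-bit PCM


def frame_size_bytes(sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Number of bytes in one frame of 16-bit mono PCM."""
    return sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE


def split_frames(pcm, sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Cut raw PCM bytes into whole frames; a trailing partial frame is dropped."""
    n = frame_size_bytes(sample_rate, frame_ms)
    return [pcm[i:i + n] for i in range(0, len(pcm) - n + 1, n)]


def classify_frames(pcm, aggressiveness=2, sample_rate=SAMPLE_RATE):
    """Return one True/False speech decision per 20 ms frame.

    Needs the third-party package: pip install webrtcvad
    """
    import webrtcvad  # imported here so the helpers above work without it

    vad = webrtcvad.Vad(aggressiveness)  # 0 = least aggressive, 3 = most
    return [vad.is_speech(frame, sample_rate) for frame in split_frames(pcm, sample_rate)]
```

A single 20 ms decision tends to be noisy, so in practice it may be worth smoothing over several consecutive frames before deciding that someone is talking.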