Hi, I have a wav file
and between each word in the wav file I have full silence (I checked with Hex Workshop and silence is represented with 0's).
How can I cut out the non-silent sound?
I'm programming using python
thanks
I have no experience with this, but have a look at the wave module in the standard library. That may do what you want. Otherwise you'll have to read the file as a byte stream and cut out sequences of 0-bytes (but you cannot just cut out all 0-bytes, as that would invalidate the file...)
You might want to try using sox, a command-line sound processing tool. It has many modes; one of them is silence:
silence: Removes silence from the beginning, middle, or end of a sound file. Silence is anything below a specified threshold.
It supports multiple sound formats and it's quite fast, so parsing large files shouldn't be a problem.
To remove silence from the middle of a file, specify a below_periods that is negative. This value is then treated as a positive value and is also used to indicate the effect should restart processing as specified by the above_periods, making it suitable for removing periods of silence in the middle of the sound file.
I haven't found any Python bindings for libsox, but you can call sox the same way you call any other command-line program from Python (or you can reimplement the effect yourself, using the sox sources for guidance).
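Calling sox from Python could look roughly like the sketch below. The helper names build_sox_command and run_sox_silence, the file names, and the 0.1 second / 1% values are all made up for illustration; the argument order is my reading of the silence effect's synopsis in the sox manual, so check it against your installed version before relying on it.

```python
import subprocess

def build_sox_command(src, dst, duration="0.1", threshold="1%"):
    """Build a sox command line that uses the `silence` effect.

    Assumed argument order (from the sox manual synopsis):
    silence above_periods duration threshold below_periods duration threshold
    """
    return [
        "sox", src, dst, "silence",
        "1", duration, threshold,   # above_periods=1: trim silence at the start
        "-1", duration, threshold,  # negative below_periods: also cut silence in the middle
    ]

def run_sox_silence(src, dst):
    # Needs the sox executable on PATH; raises CalledProcessError if sox fails.
    subprocess.run(build_sox_command(src, dst), check=True)
```

Something like run_sox_silence('beeps.wav', 'beeps_trimmed.wav') would then write the trimmed copy, assuming sox is installed. The duration and threshold values will probably need tuning for your recording.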
Python has a wave module. You can use it to open a wav file for reading and call readframes(1) to walk through the file frame by frame.
    import wave
    w = wave.open('beeps.wav', 'r')
    for i in range(w.getnframes()):
        frame = w.readframes(1)  # read exactly one frame per iteration
The frame returned will be a byte string with hex values in it. If the file is stereo the result will look something like this (4 bytes):
'\xe2\xff\xe2\xff'
If it's mono, it will have half the data (2 bytes):
'\xe2\xff'
Each channel is 2 bytes long because the audio is 16-bit. If it is 8-bit, each channel will only be one byte. You can use the getsampwidth() method to determine this. Also, getnchannels() will tell you whether it's mono or stereo.
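To make the frame layout concrete, here is a small Python 3 sketch that reports the channel count and sample width and decodes the first frame into signed integers with struct. The function name first_frame_samples is made up for illustration, and the code assumes 16-bit little-endian PCM, so treat it as a starting point only.

```python
import struct
import wave

def first_frame_samples(path):
    """Return (nchannels, sampwidth, samples) for the first frame of a WAV file."""
    with wave.open(path, "rb") as w:
        nchannels = w.getnchannels()  # 1 = mono, 2 = stereo
        sampwidth = w.getsampwidth()  # bytes per sample: 2 for 16-bit audio
        frame = w.readframes(1)       # one frame = nchannels * sampwidth bytes
    if sampwidth != 2:
        raise ValueError("this sketch only decodes 16-bit samples")
    # 'h' is a signed 16-bit integer, '<' means little-endian; one value per channel.
    samples = struct.unpack("<%dh" % nchannels, frame)
    return nchannels, sampwidth, samples
```

For the stereo frame '\xe2\xff\xe2\xff' shown above, the decoded samples are (-30, -30), i.e. both channels are very quiet but not exactly zero.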
You can loop over these bytes to see if they all equal zero, meaning both channels are silent. The following example is written for Python 3, where iterating over a bytes object yields integers directly; in Python 2 you would call ord() on each character to convert values such as '\xe2' to integers.
    import wave
    w = wave.open('beeps.wav', 'r')
    for i in range(w.getnframes()):
        frame = w.readframes(1)  # one frame at a time
        all_zero = True
        for value in frame:
            if value > 0:
                all_zero = False
                break
        if all_zero:
            # perform your cut here
            print('silence found at frame %s' % i)
You will need to come up with a threshold: a minimum number of consecutive zeros before you cut them. Otherwise you'll be removing perfectly valid zeros from the middle of normal audio data. You can iterate through the wave file, copying any non-zero values and buffering up zero values. When you're buffering zeros and eventually come across the next non-zero, copy the buffer over if it has fewer samples than the threshold; otherwise discard it.
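The buffering idea above could be sketched in Python 3 roughly as follows. The function name drop_long_silences and the default threshold of 1000 frames are assumptions for illustration, and the code treats a frame as silent only when every byte is zero, as in the question (that holds for 16-bit audio, but 8-bit WAV data is unsigned, so its silence is not stored as zeros).

```python
import wave

def drop_long_silences(src, dst, min_silent_frames=1000):
    """Copy src to dst, dropping runs of all-zero frames that are at least
    min_silent_frames long. Shorter runs are kept as ordinary audio."""
    with wave.open(src, "rb") as reader, wave.open(dst, "wb") as writer:
        # Same channels, sample width and rate; the frame count in the
        # header is corrected by the wave module when the writer closes.
        writer.setparams(reader.getparams())
        frame_size = reader.getnchannels() * reader.getsampwidth()
        silent_frame = bytes(frame_size)  # frame_size zero bytes
        pending = 0                       # consecutive silent frames buffered
        kept = bytearray()
        while True:
            frame = reader.readframes(1)
            if not frame:
                break                     # end of file
            if frame == silent_frame:
                pending += 1              # buffer it, decide later
                continue
            if pending < min_silent_frames:
                kept += silent_frame * pending  # short gap: keep it
            pending = 0                   # long gap: discarded
            kept += frame
        if pending < min_silent_frames:
            kept += silent_frame * pending      # short run at the very end
        writer.writeframes(bytes(kept))
```

Because the output goes through the wave module, the header is rewritten for you and the file stays valid. Reading one frame at a time is slow on long recordings, so for large files you would probably want to read bigger chunks and scan them instead.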
Python is not a great tool for this sort of task though. :(