I'm implementing a loudness measurement program based on the ITU standard. As the standard states, I should use some kind of gating so that silent regions don't affect the measured average sound level. For example, say I use a general integration time of 3 seconds: if the first second of the sound contains speech and the remaining 2/3 contains silence (people taking a breath, thinking, or similar), then the loudness value I get is lower than it should be, because I'm taking the silent regions into account. There is a suggested, but not very well documented, solution: besides the required (3000 ms) integration time you also take an "instant" (400 ms) loudness measurement, and if the "instant" loudness is 8 LU (LU stands for Loudness Unit) lower than the loudness measured over the "full time" (3000 ms), you pause the loudness measurement until the instant level is back within range of the long-term level.

Long story short: you get a number of incoming samples, for example 10 ms of them, and calculate your sliding short-term and long-term loudness. Then you check whether the short-term loudness is 8 units lower than the long-term loudness, and if so you discard that set of samples (pause the measurement for the 10 ms of samples you just got), effectively ignoring them and keeping your long-term loudness at the higher level, because those 10 ms are "too quiet relative to the long term".

So the problem is: since I'm ignoring all the samples (small chunks of samples, actually) that are 8 LU lower than my long-term loudness level, I'm effectively preventing my long-term loudness level from dropping when it actually should.

From the 2010 papers of the EBU P/LOUD working group:

"P/LOUD conducted listening tests in Q4/2009 and January 2010 to determine the best gating threshold. It was found that two candidate gating methods out of the four tested gave good results, both being statistically significantly better than the other two. Those two methods were a gate of 6LU relative to ungated LKFS (‘6rel’) and 10LU relative to ungated LKFS (‘10rel’). For all candidates a block length of 400ms was used. Pragmatically, a value of 8rel was chosen for further informal tests against the other gating function already used by broadcasters"

P.S. Sorry for my English, it's not my native language.

A: 

I don't see where the standard suggests an approach as complicated as the one you describe. Instead, from my admittedly cursory overview of this, I think you need to calculate the loudness in a sliding window by breaking the window into smaller time bins, and if any of the smaller time bins within this window fall below a threshold (-8 LU), you leave those bins out of your calculation.

Maybe you are doing this and just not calculating the average correctly. To find the average loudness correctly when you drop samples, you need to take the sum of the loudness levels that are not dropped (i.e. the ones above your cutoff threshold), and divide this by the amount of time that the loudness is above threshold. That is, I assume that when you say, "loudness level to become smaller [than] it actually should", what you're doing is dividing by the total time, which would incorrectly bring down the value of the average. Instead you should divide by only the amount of time used in calculating the sum, i.e. N*(small time bin size in seconds), where N is the number of bins above threshold.
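To make the division concrete, here is a minimal sketch of averaging only over the bins that pass the gate. It is not the standard's reference algorithm: it assumes you already have one mean-square energy value per small time bin (any frequency weighting is left out), it averages in the energy domain and converts to a level with `10*log10`, and the names `level_db`, `ungated_loudness` and `gated_loudness` are made up for illustration.

```python
import math

def level_db(energy):
    """Convert a mean-square energy to a level in dB (floored to avoid log(0))."""
    return 10.0 * math.log10(max(energy, 1e-12))

def ungated_loudness(bin_energies):
    """Level of the whole window: sum over all bins, divided by all bins."""
    return level_db(sum(bin_energies) / len(bin_energies))

def gated_loudness(bin_energies, gate_lu=8.0):
    """Level of the window using only bins within gate_lu of the ungated level."""
    threshold = ungated_loudness(bin_energies) - gate_lu
    kept = [e for e in bin_energies if level_db(e) >= threshold]
    # Divide by the number of kept bins (N), not by the total number of bins.
    return level_db(sum(kept) / len(kept))

# Hypothetical 3 s window of 100 ms bins: 1 s of "speech", then 2 s of near silence.
bins = [1.0] * 10 + [1e-6] * 20
print(ungated_loudness(bins))  # about -4.8 dB, pulled down by the quiet bins
print(gated_loudness(bins))    # about 0 dB, the level of the speech alone
```

At least one bin is always at or above the mean energy, so `kept` is never empty here.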

Maybe the algorithm seems more complicated than it really is because you're looking at an approach that decides whether each new time bin is above threshold as it comes into the sliding window, rather than recalculating everything on each shift of the sliding window? This is certainly possible, and is the way to do it efficiently, but the algorithm is somewhat more complex.
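A rough sketch of that incremental variant, under my own assumptions: each bin is a mean-square energy value, the gate decision is made once when the bin enters the window and is not revisited later (so results can differ from a full recalculation), and the class name `SlidingGatedLoudness` and its fields are hypothetical.

```python
import math
from collections import deque

class SlidingGatedLoudness:
    """Sliding window of energy bins with running sums, updated per bin."""

    def __init__(self, window_bins, gate_lu=8.0):
        self.window_bins = window_bins
        self.gate_lu = gate_lu
        self.bins = deque()      # (energy, kept) pairs, oldest first
        self.total_energy = 0.0  # sum over all bins in the window
        self.kept_energy = 0.0   # sum over bins that passed the gate
        self.kept_count = 0

    @staticmethod
    def _level(energy):
        return 10.0 * math.log10(max(energy, 1e-12))

    def push(self, energy):
        """Add one bin, evict the oldest if the window is full, return gated level."""
        if len(self.bins) == self.window_bins:
            old_energy, old_kept = self.bins.popleft()
            self.total_energy -= old_energy
            if old_kept:
                self.kept_energy -= old_energy
                self.kept_count -= 1
        self.total_energy += energy
        ungated = self._level(self.total_energy / (len(self.bins) + 1))
        # Gate the new bin once, against the current ungated level of the window.
        kept = self._level(energy) >= ungated - self.gate_lu
        self.bins.append((energy, kept))
        if kept:
            self.kept_energy += energy
            self.kept_count += 1
        if self.kept_count == 0:
            return ungated  # nothing passed the gate; fall back to ungated level
        return self._level(self.kept_energy / self.kept_count)

meter = SlidingGatedLoudness(window_bins=30)
for e in [1.0] * 10 + [1e-6] * 20:
    gated = meter.push(e)
print(gated)  # quiet bins were gated out, so this stays at the speech level
```

Because old bins eventually leave the window, the gated level can still come down: once all the loud bins have been evicted, the quiet bins are compared against a much lower ungated level and start passing the gate.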

tom10
The algorithm is complex. I can't simply scale it down by time, because it measures actual loudness, takes the characteristics of the waveform into account, and is not a simple sum of bins. I got away with it by somewhat "pausing" the level measurement whenever the current bin (the chunk of samples I got from the sound card) is 8 LU lower than the current loudness level calculated using the instant (400 ms) buffer, while also letting the fact that I'm skipping samples influence the gated loudness level. So your answer put me on the right track, thanks.
Oscar