So this is from the late 90s ... http://www.cs.princeton.edu/~prc/SingingSynth.html

Why hasn't this taken off? We can synthesize photorealistic images, but the synthesis of singing still seems to be at a very primitive stage.

What exactly is it that makes the synthesis of singing difficult?

http://www.interspeech2007.org/Technical/synthesis_of_singing_challenge.php <-- still seems primitive.

Thanks!

+1  A: 

My feeling is that we fall into the uncanny valley more easily with sounds than with images. While our brain accepts a badly formed image relatively well, it does not accept a badly formed sound unless it sounds natural. Anything that does not sound perfectly imperfect sounds creepy, and this creates a very strong barrier to actual applications. Synthetic voice is good enough for announcements and telephone services, but we are a long way from totally synthetic singing.

On the other hand, modification of actual voices is performed daily, both live and in the studio. Without Auto-Tune, all the "gangstas" and "Lady Gagas" out there would be doing a job more suited to their actual talent.

Stefano Borini
Do you have any references to back this up? I'm not sure why uncanny sounds would be more creepy ... unless, historically, things that sound weird = creepy because, back in caveman days, they meant predators.
anon
@anon: the reference is the first two words: "my feeling"
Stefano Borini