Are your two cases (overlapping and non-overlapping) using the exact same set of shapes? Because if the overlapping case involves a total area of all your shapes that is larger than the first case, then it would be expected to be slower. If it's the same set of shapes that slows down if some of them overlap, then that would be very unusual and shouldn't happen on any standard hardware OpenGL implementation (what platform are you using?). Backface culling won't be causing any problem.
Whenever a shape is drawn, the GPU has to do some work for each pixel that it covers on the screen. If you draw the same shape 100 times in the same place, then that's 100-times the pixel work. Depth buffering can reduce some of the extra cost for opaque objects if you draw objects in depth-sorted order, but that trick can't work for things that use transparency.
When using transparency, it's the sum of the area of each rendered shape that matters. Not the amount of the screen that's covered after everything is rendered.