I'm not aware of anyone having done that and made it public. Just IMHO, it doesn't sound very promising.
As Martinus points out, some compression algorithms are highly serial. Block compression algorithms like LZW can be parallelized by coding each block independently (see the sketch below). Zipping a large tree of files can be parallelized at the file level.
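For illustration, here's a minimal host-side C++ sketch of that block-level parallelism. The `compress_block` function is a hypothetical placeholder (here it just copies its input) standing in for any self-contained block compressor:

```
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder for a real block compressor (e.g. one LZW or DEFLATE stream
// per block); it just copies, to keep the sketch self-contained.
std::vector<unsigned char> compress_block(const unsigned char *data, std::size_t len)
{
    return std::vector<unsigned char>(data, data + len);
}

// Split the input into fixed-size blocks and compress each on its own thread.
// Each output stream is independent, so blocks can be coded in any order and
// concatenated (with a small index) afterward.
std::vector<std::vector<unsigned char>>
compress_parallel(const unsigned char *data, std::size_t len, std::size_t block_size)
{
    std::size_t num_blocks = (len + block_size - 1) / block_size;
    std::vector<std::vector<unsigned char>> out(num_blocks);
    std::vector<std::thread> workers;

    for (std::size_t b = 0; b < num_blocks; ++b) {
        std::size_t off = b * block_size;
        std::size_t n   = std::min(block_size, len - off);
        workers.emplace_back([&, off, n, b] {
            out[b] = compress_block(data + off, n);
        });
    }
    for (auto &t : workers) t.join();
    return out;
}
```

The tradeoff is slightly worse compression, since each block starts from an empty dictionary and can't reference data in other blocks.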
However, none of these is really SIMD-style parallelism (Single Instruction Multiple Data), and they're not massively parallel.
GPUs are basically vector processors: you can have hundreds or thousands of ADD instructions executing in lockstep, running programs with very few data-dependent branches.
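To make that concrete, this is the shape of code GPUs are built for: a CUDA kernel in which every thread executes the same instruction stream on its own element (the names and launch sizes below are just illustrative):

```
// Each thread computes one element; threads in a warp execute this same
// instruction stream in lockstep, with no data-dependent branching beyond
// the uniform bounds check.
__global__ void vec_add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

// Launch with enough threads to cover the whole array:
//   vec_add<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
```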
Compression algorithms in general sound more like an SPMD (Single Program Multiple Data) or MIMD (Multiple Instruction Multiple Data) programming model, which is better suited to multicore CPUs.
Video compression algorithms can be accelerated by GPGPU processing like CUDA only to the extent that there is a very large number of pixel blocks being cosine-transformed or convolved (for motion estimation) in parallel, and the IDCT or convolution subroutines can be expressed with branchless code.
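As a sketch of why that case works, here's a naive per-block 2D DCT kernel: one thread block per 8x8 pixel block, one thread per output coefficient. The inner loops are fixed-length and branch-free (this is the textbook O(64)-per-coefficient DCT-II, not the factored, separable transform real codecs use):

```
// One thread block per 8x8 pixel block, one thread per output coefficient.
// The inner loops are fixed-length and branch-free, which is exactly the
// shape of code that maps well onto a GPU.
__global__ void dct8x8(const float *src, float *dst, int num_blocks)
{
    int b = blockIdx.x;      // which 8x8 pixel block
    int u = threadIdx.x;     // horizontal frequency, 0..7
    int v = threadIdx.y;     // vertical frequency, 0..7
    if (b >= num_blocks) return;

    const float PI = 3.14159265f;
    const float *in = src + b * 64;
    float cu = (u == 0) ? 0.35355339f : 0.5f;   // sqrt(1/8), sqrt(2/8)
    float cv = (v == 0) ? 0.35355339f : 0.5f;

    float sum = 0.0f;
    for (int y = 0; y < 8; ++y)
        for (int x = 0; x < 8; ++x)
            sum += in[y * 8 + x]
                 * cosf((2 * x + 1) * u * PI / 16.0f)
                 * cosf((2 * y + 1) * v * PI / 16.0f);

    dst[b * 64 + v * 8 + u] = cu * cv * sum;
}

// Launch: dct8x8<<<num_blocks, dim3(8, 8)>>>(d_src, d_dst, num_blocks);
```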
GPUs also like algorithms that have high numeric intensity (the ratio of math operations to memory accesses). Algorithms with low numeric intensity, like adding two vectors, can be massively parallel and SIMD, and still run slower on the GPU than the CPU because they're memory bound.
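Back-of-the-envelope numbers for the vector add above (assuming 4-byte floats and no cache reuse; the bandwidth and FLOP figures are just representative):

```
// Per element:        1 FLOP (the add)
// Memory traffic:     2 loads + 1 store = 12 bytes
// Numeric intensity:  1 / 12 ~ 0.08 FLOP/byte
//
// A GPU with, say, ~10 TFLOP/s of compute but ~1 TB/s of memory bandwidth
// can only feed ~0.08 TFLOP/s to this kernel: it idles waiting on DRAM.
//
// Compare a naive NxN matrix multiply: ~2*N^3 FLOPs over ~12*N^2 bytes of
// traffic, i.e. intensity ~N/6, which grows with N and eventually becomes
// compute bound.
```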