There's some data freely available in the NLTK Corpora package. http://nltk.org
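For instance, a small parsed sample of the Wall Street Journal portion of the Penn Treebank ships as one of NLTK's downloadable corpora. A minimal sketch of getting at it (the `treebank` corpus id and `wsj_*.mrg` file names are standard NLTK, but check the downloader for what's currently available):

    import nltk

    # Fetch NLTK's free sample of the parsed WSJ data
    nltk.download('treebank')

    from nltk.corpus import treebank

    # Each entry is a hand-parsed WSJ sentence as an nltk.Tree
    print(treebank.fileids()[:5])
    print(treebank.parsed_sents('wsj_0001.mrg')[0])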
Dan Bikel offers his parser for free on his website, along with a training file based on the tagged Wall Street Journal corpus, a standard NLP benchmark. http://www.cis.upenn.edu/~dbikel/software.html
If "good enough" is acceptable, you can probably generate plenty of data with these parsers, train your own on it, and get results that are fine for many commercial uses. Unfortunately, that's the disappointing reality of working around the great, but licensed, resources available from the Linguistic Data Consortium. For a startup that focuses on NLP, though, data quality isn't really something you can skimp on. That's why, for many undertakings of this kind, it makes sense to run a pilot phase on poorer data (see above, and the sketch below) and measure your success rates before making the capital investment.
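As a rough illustration of that pilot-phase idea, here's a minimal sketch that induces a PCFG from NLTK's free treebank sample and parses with it. It's only a baseline, not anyone's production setup: coverage is limited to the sample's vocabulary, and exhaustive Viterbi parsing over the induced grammar is slow, but it lets you gauge success rates before paying for the full corpus.

    import nltk
    from nltk.corpus import treebank

    nltk.download('treebank')

    # Collect grammar productions from the freely available parsed sentences
    productions = []
    for tree in treebank.parsed_sents():
        productions.extend(tree.productions())

    # Induce a probabilistic grammar and wrap it in a simple Viterbi (CKY) parser.
    # This is a pilot-phase baseline, not a replacement for a parser trained on
    # the full licensed Penn Treebank.
    grammar = nltk.induce_pcfg(nltk.Nonterminal('S'), productions)
    parser = nltk.ViterbiParser(grammar)

    # Words must appear in the sample's vocabulary, or no parse will be found
    for tree in parser.parse(['the', 'company', 'reported', 'higher', 'earnings', '.']):
        print(tree)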
If you're just doing this for research, then by all means, seek out your nearest computational linguistics program and see what kinds of concessions they'll make for you to poke through their licensed corpora.
Good luck!