I'm building an open-source website built around user-contributed content. I think that if developers had access to nightly SQL dumps of the production database, they'd be more likely to check out the code from GitHub and play with it.
In line with that idea, I'm considering either:
- Not collecting private user information at all: using OpenID for accounts and making heavy use of memcache for things like session authentication.
- Anonymizing sensitive data before publishing the dumps.
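To make the second option concrete, here is a rough Python sketch of the kind of scrubbing pass I have in mind. The column names (`password_hash`, `ip_address`, `email`) are hypothetical placeholders, not my real schema, and the keyed-hash approach is just one way to do it: some columns are dropped outright, and emails are replaced with a pseudonym derived from a secret that would never be published.

```python
import hashlib
import hmac

# Hypothetical column names for illustration; the real schema may differ.
DROP_COLUMNS = {"password_hash", "ip_address"}
PSEUDONYMIZE_COLUMNS = {"email"}


def pseudonymize(value: str, secret: bytes) -> str:
    """Replace a value with a keyed hash that cannot be recomputed without the secret."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymize_row(row: dict, secret: bytes) -> dict:
    """Return a scrubbed copy of a row intended for a public dump."""
    clean = {}
    for column, value in row.items():
        if column in DROP_COLUMNS:
            continue  # never publish these columns at all
        if column in PSEUDONYMIZE_COLUMNS and value is not None:
            # Same input and secret always give the same pseudonym,
            # so rows can still be grouped by user in the dump.
            clean[column] = pseudonymize(value, secret) + "@example.invalid"
        else:
            clean[column] = value
    return clean
```

The secret would have to live outside the repository, and this does nothing about personal details that users type into free-text fields, which is part of what worries me.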
Sometimes I get carried away with "wouldn't it be cool if...?" ideas, so I'm hoping for a sanity check here. Any obvious flaws in either approach? Is this a sane idea?