I am about to start a new project in which a .NET application has to deal with on the order of a hundred gigabytes of data. It is too early to give much detail about the project, but here is an overview:
- Lots of writes and lots of reads on the same tables, in real time
- Scaling is very important: the client insists on adding database servers very frequently, and therefore application servers as well
- I foresee heavy use of aggregate queries
- Each row of data may contain lots of attributes
I am proposing the following as a solution:
- Use a distributed-hash-table style of persistence (not S3, but an in-house one)
- Use something like Hadoop/Hive (is there a .NET equivalent?) for analytical processing across the nodes
- Implement the GUI in ASP.NET/Silverlight (with lots of Ajax wherever required)
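To make the first two points concrete, here is a rough sketch of what I have in mind. It is written in Python only to keep it short, it is not tied to any real product, and everything in it is an assumption made for illustration: the names `HashRing`, `Cluster`, `node_for` and `average`, the choice of consistent hashing with 64 virtual nodes per server, and the in-memory dicts standing in for the database servers. Keys are placed on a hash ring so that adding a server only moves a fraction of the rows, and an aggregate is computed per node and then merged, which is roughly what a Hadoop/Hive-style job would do.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring (hypothetical helper, illustration only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per server, smooths the distribution
        self._points = []      # sorted hash positions on the ring
        self._owner = {}       # hash position -> server name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        if not self._points:
            raise ValueError("ring has no nodes")
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


class Cluster:
    """In-memory stand-in for a set of database servers (assumption)."""

    def __init__(self, nodes):
        self.ring = HashRing(nodes)
        self.stores = {node: {} for node in nodes}

    def put(self, key, row):
        self.stores[self.ring.node_for(key)][key] = row

    def get(self, key):
        return self.stores[self.ring.node_for(key)].get(key)

    def average(self, attribute):
        # Scatter: each node computes a partial (sum, count).
        # Gather: the partials are merged into one result.
        total, count = 0.0, 0
        for store in self.stores.values():
            values = [row[attribute] for row in store.values() if attribute in row]
            total += sum(values)
            count += len(values)
        return total / count if count else None
```

Usage would look something like `cluster = Cluster(["db1", "db2", "db3"])`, then `cluster.put("row:42", {"value": 42})` and `cluster.average("value")`. In a real system the rows owned by a newly added server would also have to be migrated to it, which this sketch leaves out.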
What do you guys think? Am I making any sense here?