views:

264

answers:

4

To understand how other people approach projects, I put together this project description:

  • The project involves several different customer databases where data should be read and uploaded to the project's database server.

  • Close to nothing is generally known about the customer's database when a new customer arrives, apart from some general knowledge about the business type of the customer.

  • It requires a program to read the data from the customers.

  • It requires a web site with diverse cross sections of the customers' data, in an attempt to do something in a general way for the different customers.

  • It requires handling a database of several GB and synchronizing millions of rows with the customers' databases.

  • The visual appearance and functionality of the web site should be dazzling, including charts and report-server-like functionality with email and SMS reports.

  • The different customers will probably also have different requirements so the system should be parameterized in some way.

  • The different users will probably also like to have some personalized pages.

  • Some advertisement pages for the project, documentation and manuals will probably also be needed.

  • The web pages should load faster than 0.1 second and serve hundreds of simultaneous users.

How would you approach such a requirement?

How many people would you take on the project, initially?

Which specialities and areas of expertise would you expect to need?

How carefully would you plan such a project?


EDIT: OK, it might sound unrealistic, but what should the first steps be, and what kind of organization would be capable of handling this appropriately?

+4  A: 

There are no requirements listed in this question other than:

  • Large unknown data files will be uploaded and processed into a database.
  • This large data set will be displayed on a website.
  • The website should be dazzling.
  • The website should be fast even with hundreds of users.

I recommend you hire someone to get better requirements. If you want technology recommendations, you might get them here:

  • C is fast (but so are many other things).
  • Flash websites are dazzling. :(
  • Apache or IIS, in combination with MySQL, PostgreSQL, or MSSQL, is capable of handling a large server load.
altCognito
Thanks for the comments.
Ole Lynge
+3  A: 

<troll>

How would you approach such a requirement?

Terrified

How many people would you take on the project, initially?

Day one? One: me. Day two? Probably more.

Which specialities and areas of expertise would you expect to need?

To start with: a database guy, a back-end guy, a front-end guy, and a psychic to guess the unknown database structures.

How carefully would you plan such a project?

Not very. Actually, I wouldn't accept this project, not without better specs.

Henrik Paul
I couldn't agree more with not accepting the project. I've found that the projects that caused me the most trouble were the ones where the clients refused to define a spec document and then got upset when unknowns popped up and caused delays.
Paulo
He he. Thanks for the comments.
Ole Lynge
+1  A: 
meade
Wow. It looks like it's based on my sort of general list of 'requirements'. Thanks a lot.
Ole Lynge
+1  A: 

To me it seems like a data warehouse project, if I understood your problem correctly.

From a technical perspective:

You need to set up a "staging area" for each customer database, where you put the relevant data to be loaded into your central system.

Your job will then be to load the data from each staging area, transform it into a common format, and store it in your database.

Then, use reporting tools over your database to build nice reports, data mining, etc.

You can use specialized ETL tools (they might be pretty expensive) or use simple SQL combined with some procedural language and scripts for data transformation and loading.
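As a rough illustration of the "simple SQL plus a scripting language" route, here is a minimal sketch in Python using the standard-library `sqlite3` module. The table names (`staging_acme`, `staging_globex`, `sales`), the column names, and the `load_customer` helper are all made up for the example; a real setup would run against your actual database server and whatever schemas the customers turn out to have.

```python
import sqlite3


def load_customer(conn, customer_id, staging_table, column_map):
    """Copy one customer's staging rows into the common `sales` table.

    column_map maps the common column names to the customer's own names.
    Table and column names are assumed to come from trusted configuration,
    never from user input, because they are interpolated into the SQL text.
    """
    select_cols = ", ".join(
        f"{column_map[common]} AS {common}" for common in ("sold_on", "amount")
    )
    rows = conn.execute(f"SELECT {select_cols} FROM {staging_table}").fetchall()
    # Transform step: normalise the amount to a float, tag rows with the customer.
    conn.executemany(
        "INSERT INTO sales (customer_id, sold_on, amount) VALUES (?, ?, ?)",
        [(customer_id, sold_on, float(amount)) for sold_on, amount in rows],
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customer_id TEXT, sold_on TEXT, amount REAL);
    CREATE TABLE staging_acme (order_date TEXT, total TEXT);
    CREATE TABLE staging_globex (dt TEXT, eur REAL);
    INSERT INTO staging_acme VALUES ('2009-05-01', '19.90'), ('2009-05-02', '5.10');
    INSERT INTO staging_globex VALUES ('2009-05-01', 100.0);
""")

loaded = load_customer(conn, "acme", "staging_acme",
                       {"sold_on": "order_date", "amount": "total"})
loaded += load_customer(conn, "globex", "staging_globex",
                        {"sold_on": "dt", "amount": "eur"})
conn.commit()

total = conn.execute("SELECT ROUND(SUM(amount), 2) FROM sales").fetchone()[0]
print(loaded, "rows loaded, total amount", total)
```

The point of the column map is that each new customer only adds a mapping, while the common table and the loading code stay the same.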

You can use specialized reporting tools (e.g. Business Objects) to build your reports over the data warehouse. One of their features is that they are designed to let end users customize and build their own reports as well.

From a staffing perspective:

You'll need people who have worked on data warehouses. I can't say anything about the size of the team, since it depends on the number of data sources and their complexity.

Planning:

I would suggest the following steps:

  1. Use the first customers to identify a common data model for the warehouse.
  2. Implement the common data warehouse features (common model, metadata, basic framework for ETL transformation, basic support for reports). This will set up the "infrastructure". Hardware setup and sizing is also needed here.
  3. Use iterations for setting up customers. In each iteration, analyze the source data, develop the data filter that selects only the relevant data to be loaded into the data warehouse, design and implement the data transformation, and design and implement new reports if necessary.

You can treat each new customer as a separate iteration, and handle multiple customers in parallel iterations once the data warehouse model and structure (the infrastructure) are in place.
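One way to picture the per-customer iterations is that each one only delivers a small customer-specific piece (a filter and a transformation) that plugs into the shared infrastructure. The sketch below is a hypothetical illustration in plain Python; `CustomerFeed`, `run_feed`, and the sample rows are invented names and data, not part of any particular ETL tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class CustomerFeed:
    """Everything that is specific to one customer (hypothetical structure)."""
    name: str
    is_relevant: Callable[[Row], bool]  # data filter for the source rows
    to_common: Callable[[Row], Row]     # transformation to the common model


def run_feed(feed: CustomerFeed, source_rows: List[Row]) -> List[Row]:
    """Shared infrastructure: filter one customer's rows, then transform them."""
    return [
        {"customer": feed.name, **feed.to_common(row)}
        for row in source_rows
        if feed.is_relevant(row)
    ]


# Each iteration adds one entry here; the code above stays untouched.
feeds = [
    CustomerFeed("acme",
                 lambda r: r["status"] == "paid",
                 lambda r: {"amount": float(r["total"])}),
    CustomerFeed("globex",
                 lambda r: r["eur"] > 0,
                 lambda r: {"amount": r["eur"]}),
]

# Made-up source data standing in for the customers' staging areas.
sources = {
    "acme": [{"status": "paid", "total": "19.90"},
             {"status": "open", "total": "5.00"}],
    "globex": [{"eur": 100.0}, {"eur": -3.0}],
}

warehouse: List[Row] = []
for feed in feeds:
    warehouse.extend(run_feed(feed, sources[feed.name]))
print(warehouse)
```

Because the feeds do not depend on each other, several customers can be worked on in parallel without touching the common code.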

Cătălin Pitiș
Thanks for the nice answer, which gives me some 'serious' advice :).
Ole Lynge