I am working with google app engine and using the low leval java api to access Big Table. I'm building a SAAS application with 4 layers:
- Client web browser
- RESTful resources layer
- Business layer
- Data access layer
I'm building an application to help manage my mobile auto detailing company (and others like it). I have to represent these four separate concepts, but am unsure if my current plan is a good one:
- Appointments
- Line Items
- Invoices
- Payments
Appointment: An "Appointment" is a place and time where employees are expected to be in order to deliver a service.
Line Item: A "Line Item" is a service, fee or discount and its associated information. An example of line items that might go into an appointment:
Name: Price: Commission: Time estimate Full Detail, Regular Size: 160 75 3.5 hours $10 Off Full Detail Coupon: -10 0 0 hours Premium Detail: 220 110 4.5 hours Derived totals(not a line item): $370 $185 8.0 hours
Invoice: An "Invoice" is a record of one or more line items that a customer has committed to pay for.
Payment: A "Payment" is a record of what payments have come in.
In a previous implementation of this application, life was simpler and I treated all four of these concepts as one table in a SQL database: "Appointment." One "Appointment" could have multiple line items, multiple payments, and one invoice. The invoice was just an e-mail or print out that was produced from the line items and customer record.
9 out of 10 times, this worked fine. When one customer made one appointment for one or a few vehicles and paid for it themselves, all was grand. But this system didn't work under a lot of conditions. For example:
- When one customer made one appointment, but the appointment got rained out halfway through resulting in the detailer had to come back the next day, I needed two appointments, but only one line item, one invoice and one payment.
- When a group of customers at an office all decided to have their cars done the same day in order to get a discount, I needed one appointment, but multiple invoices and multiple payments.
- When one customer paid for two appointments with one check, I needed two appointments, but only one invoice and one payment.
I was able to handle all of these outliers by fudging things a little. For example, if a detailer had to come back the next day, i'd just make another appointment on the second day with a line item that said "Finish Up" and the cost would be $0. Or if I had one customer pay for two appointments with one check, I'd put split payment records in each appointment. The problem with this is that it creates a huge opportunity for data in-congruency. Data in-congruency can be a serious problem especially for cases involving financial information such as the third exmaple where the customer paid for two appointments with one check. Payments must be matched up directly with goods and services rendered in order to properly keep track of accounts receivable.
Proposed structure:
Below, is a normalized structure for organizing and storing this data. Perhaps because of my inexperience, I place a lot of emphasis on data normalization because it seems like a great way to avoid data incongruity errors. With this structure, changes to the data can be done with one operation without having to worry about updating other tables. Reads, however, can require multiple reads coupled with in-memory organization of data. I figure later on, if there are performance issues, I can add some denormalized fields to "Appointment" for faster querying while keeping the "safe" normalized structure intact. Denormalization could potentially slow down writes, but I was thinking that I might be able to make asynchronous calls to other resources or add to the task que so that the client does not have to wait for the extra writes that update the denormalized portions of the data.
Tables:
Appointment
start_time
etc...
Invoice
due_date
etc...
Payment
invoice_Key_List
amount_paid
etc...
Line_Item
appointment_Key_List
invoice_Key
name
price
etc...
The following is the series of queries and operations required to tie all four entities (tables) together for a given list of appointments. This would include information on what services were scheduled for each appointment, the total cost of each appointment and weather or not payment as been received for each appointment. This would be a common query when loading the calendar for appointment scheduling or for a manager to get an overall view of operations.
- QUERY for the list of "Appointments" who's "start_time" field lies between the given range.
- Add each key from the returned appointments into a List.
- QUERY for all "Line_Items" who's appointment_key_List field includes any of the returns appointments
- Add each invoice_key from all of the line items into a Set collection.
- QUERY for all "Invoices" in the invoice ket set (this can be done in one asynchronous operation using app engine)
- Add each key from the returned invoices into a List
- QUERY for all "Payments" who's invoice_key_list field contains a key matching any of the returned invoices
- Reorganize in memory so that each appointment reflects the line_items that are scheduled for it, the total price, total estimated time, and weather or not it has been paid for.
...As you can see, this operation requires 4 datastore queries as well as some in-memory organization (hopefully the in-memory will be pretty fast)
Can anyone comment on this design? This is the best I could come up with, but I suspect there might be better options or completely different designs that I'm not thinking of that might work better in general or specifically under GAE's (google app engine) strengths, weaknesses, and capabilities.
Thanks!
Usage clarification
Most applications are more read-intensive, some are more write intensive. Below, I describe a typical use-case and break down operations that the user would want to perform:
Manager gets a call from a customer:
- Read - Manager loads the calendar and looks for a time that is available
- Write - Manager queries customer for their information, I pictured this to be a succession of asynchronous reads as the manager enters each piece of information such as phone number, name, e-mail, address, etc... Or if necessary, perhaps one write at the end after the client application has gathered all of the information and it is then submitted.
- Write - Manager takes down customer's credit card info and adds it to their record as a separate operation
- Write - Manager charges credit card and verifies that the payment went through
Manager makes an outgoing phone call:
- Read Manager loads the calendar
- Read Manager loads the appointment for the customer he wants to call
- Write Manager clicks "Call" button, a call is initiated and a new CallReacord entity is written
- Read Call server responds to call request and reads CallRecord to find out how to handle the call
- Write Call server writes updated information to the CallRecord
- Write when call is closed, call server makes another request to the server to update the CallRecord resource (note: this request is not time-critical)
Accepted answer:: Both of the top two answers were very thoughtful and appreciated. I accepted the one with few votes in order to imperfectly equalize their exposure as much as possible.