The Situation

I have a very compressed time schedule to write a simple (basically write-only) web app. The app is to be a mostly jQuery-driven question tree. The questions and tree will probably need to change both before and after the site launches.

The answers will get emailed... I probably don't even need to store them, but I'm going to just in case.

This needs to be slapped up on a shared host in very short order.

My proposed strategy

The Question Tree Itself

Implement the question tree and validation mostly in jQuery and HTML. Keep the question-answer state stored as a JavaScript object, with "Question text" : "Question answer" as the format for each question.

The form validation would be jQuery-based only; there would be no server-side validation of individual fields other than (as mentioned later) making sure only valid JSON is inserted.
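As a rough illustration of the state object described above, here is a minimal sketch. The function name, variable names, and question texts are all hypothetical; the only thing taken from the description is the "Question text" : "Question answer" shape.

```javascript
// Hypothetical names throughout; the question texts are made up for illustration.
const answers = {};

// Record (or overwrite) the answer to one question, keyed by the question text.
function recordAnswer(state, questionText, answer) {
  state[questionText] = answer;
  return state;
}

recordAnswer(answers, "What is your role?", "Developer");
recordAnswer(answers, "How many people are on your team?", "5");

// Serialized form that would be sent to the server.
const payload = JSON.stringify(answers);
console.log(payload);
```

Because the key is the question text itself, rewording a question later produces a new key, which may or may not matter for later analysis.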

Identifying a User

Handle session state with PHP, using the PHP session ID as the unique key for each user.

As each question is answered, make a simple AJAX call to a very simple PHP script that accepts the PHP session ID and the JSON representation of the object. (The reason for sending it each time is so that if the user quits answering questions, at least we get SOME data.)
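The per-answer save described above might look roughly like the following sketch. The field names `session_id` and `answers_json` and the URL `save.php` are assumptions for illustration, not part of the original plan.

```javascript
// Build the form fields for one save request.
// "session_id" and "answers_json" are hypothetical field names.
function buildSavePayload(sessionId, state) {
  return {
    session_id: sessionId,
    answers_json: JSON.stringify(state),
  };
}

// In the browser this could be handed to jQuery, for example:
//   $.post("save.php", buildSavePayload(sessionId, answers));
// ("save.php" is a placeholder URL.)
const example = buildSavePayload("abc123", { "Favorite color?": "Blue" });
console.log(example.answers_json);
```

Sending the whole object each time, rather than only the latest answer, keeps the server script simple: it never has to merge partial updates.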

Storage

Storage is handled in a (PHP-embedded) SQLite DB like this:

CREATE TABLE q_and_a_storage (
  php_session_id text primary key,
  json_storage text
);

Server Side

The PHP AJAX receive script is very dumb. It simply checks the DB to see if the session ID exists, and then INSERTs or UPDATEs as appropriate. It also makes sure that the submitted data is valid JSON before inserting.
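The flow of that receive script can be sketched as below. This is written in JavaScript only to stay consistent with the client-side examples; the real script would be PHP (where `json_decode` could play the role of the validity check), and the `Map` here is merely an in-memory stand-in for the `q_and_a_storage` table. All function names are hypothetical.

```javascript
// In-memory stand-in for the q_and_a_storage table (session id -> JSON text).
const storage = new Map();

// Returns true only if the text parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

// Mirrors the described flow: reject invalid JSON, otherwise insert or update.
function receiveAnswers(sessionId, jsonText) {
  if (typeof jsonText !== "string" || !isValidJson(jsonText)) {
    return "rejected";
  }
  const action = storage.has(sessionId) ? "updated" : "inserted";
  storage.set(sessionId, jsonText);
  return action;
}

console.log(receiveAnswers("abc123", '{"Q1":"yes"}'));           // inserted
console.log(receiveAnswers("abc123", '{"Q1":"yes","Q2":"no"}')); // updated
console.log(receiveAnswers("abc123", '{not json'));              // rejected
```

Note that the check only confirms the text is well-formed JSON; it says nothing about whether the keys are real questions or the values are plausible answers.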

I just want to know if this is incredibly foolhardy or if it is reasonable. Is there some big security hole I'm not thinking of?

Things people are going to want to know:

  • I'm estimating under a million fillers of the form in this iteration.
  • All we really need to do is make sure we send an initial email with the data, but I'm storing it just in case.
  • It's VERY likely I'll need to retune the question set, and I have almost no way of knowing which questions should go and which should stay.
  • If I need later analysis of the data, I can send it to a CouchDB and run map/reduce queries on it, which is why this model is attractive to me.
  • It SEEMS like the JavaScript-only form submission deters most spam, and the only payoff for an attack is useless JSON stored in the DB.
  • Super quick development time and flexibility of the question set are the really important factors here.
+1  A: 

Very thorough explanation, thanks. So the one weak spot I see here (if it matters) is that anybody who can evince or guess the session ID can "retroactively change" the JSON that represents the answers -- I know you say that JSON is "useless", but, if that's the case, then why are you storing it in the first place?-) Maybe I'm being overly paranoid about the PHP session ID's security (if it's essentially secure, then my objection crumbles), but if there's any value to a potential spammer in performing such retroactive changes, then I'd add a validation level (based on securely encrypted cookies under my own control...).
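One possible reading of such a validation level is sketched below. Note the substitutions: it signs the session ID with an HMAC rather than encrypting a cookie, and it uses Node.js's `crypto` module rather than PHP, purely for illustration. The secret and function names are hypothetical.

```javascript
const crypto = require("crypto");

// Hypothetical server-side secret; in practice it would not be hard-coded.
const SECRET = "replace-with-a-long-random-secret";

// Compute a hex HMAC-SHA256 tag over the session id.
function signSessionId(sessionId) {
  return crypto.createHmac("sha256", SECRET).update(sessionId).digest("hex");
}

// Check a tag presented with a request, using a constant-time comparison.
function isValidTag(sessionId, tag) {
  const expected = Buffer.from(signSessionId(sessionId), "hex");
  const given = Buffer.from(String(tag), "hex");
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}

const tag = signSessionId("abc123");
console.log(isValidTag("abc123", tag));       // true
console.log(isValidTag("someone-else", tag)); // false
```

The idea is that the server would hand out the tag in a cookie alongside the session ID and refuse updates whose tag does not match, so guessing a session ID alone would not be enough to overwrite someone's answers.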

Alex Martelli
Not a bad thought.
danieltalsky