I've heard it claimed that the simplest way to prevent SQL injection attacks is to HTML-encode all text before inserting it into the database, and then, obviously, decode all text when extracting it. The idea is that if the stored text contains only ampersands, semicolons and alphanumerics, then you can't do anything malicious with it.

While I see a number of cases where this may seem to work, I foresee the following problems in using this approach:

  • It claims to be a silver bullet, which may stop users of this technique from understanding all the related issues, such as second-order attacks.
  • It doesn't necessarily prevent any second-order / delayed payload attacks.
  • It's using a tool for a purpose other than the one it was designed for. This may lead to confusion amongst future users/developers/maintainers of the code. It's also likely to be far from optimal in performance or effect.
  • It adds a potential performance hit to every read and write of the database.
  • It makes the data harder to read directly from the database.
  • It increases the size of the data on disk. (Each encoded character may now take ~5 characters. In turn this may also impact disk space requirements, data paging, size of indexes, performance of indexes and more?)
  • There are potential issues with high-range Unicode characters and combining characters?
  • Some HTML [en|de]coding routines/libraries behave slightly differently (e.g. some encode an apostrophe and some don't; there may be more differences). This ties the data to the code used to read and write it. If code that [en|de]codes differently is used, the data may be changed/corrupted.
  • It potentially makes it harder to work with (or at least debug) any text which is already similarly encoded.
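
To make the points about size and encoder differences concrete, here is a minimal sketch using Python's standard `html` module. The sample string is made up for illustration, and the two `quote` settings merely stand in for two libraries that disagree about the apostrophe.

```python
import html

# Hypothetical sample value; any text containing ', & or < behaves similarly.
text = "O'Reilly & Sons <b>"

# Two encoder configurations standing in for "different libraries":
with_quotes = html.escape(text, quote=True)      # also encodes ' and "
without_quotes = html.escape(text, quote=False)  # leaves ' and " untouched

print(with_quotes)
print(without_quotes)

# The stored form differs between the two encoders and is longer than the input.
print(len(text), len(with_quotes), len(without_quotes))

# Decoding happens to recover the original here, but the stored bytes differ,
# so searching or comparing on the encoded column depends on which encoder wrote it.
print(html.unescape(with_quotes) == html.unescape(without_quotes) == text)
```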

Is there anything I'm missing?
Is this actually a reasonable approach to the problem of preventing SQL injection attacks?
Are there any fundamental problems with trying to prevent injection attacks in this way?

+7  A: 

You should prevent SQL injection by using parameter bindings (i.e. never concatenate user input into your SQL strings; use placeholders for your parameters and let the framework you use do the right escaping). HTML encoding, on the other hand, should be used to prevent cross-site scripting.
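
A minimal sketch of parameter binding, assuming Python's built-in `sqlite3` module and a made-up `users` table; other drivers use a different placeholder syntax, but the idea is the same.

```python
import sqlite3

# In-memory database and table name are assumptions for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Input that would break a query built by string concatenation.
malicious = "x'); DROP TABLE users; --"

# The ? placeholder sends the value separately from the SQL text,
# so it is stored as plain data and never parsed as SQL.
conn.execute("INSERT INTO users (name) VALUES (?)", (malicious,))

rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()
print(rows)  # the value comes back unmodified; the table still exists
```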

klausbyskov
HTML encoding should only be applied when displaying the data to the user, so that the original values are stored in the DB.
henchman
+1  A: 

How did you get the idea that HTML-encoded text contains only ampersands, semicolons and alphanumerics after decoding?

I can perfectly well encode a "'" in HTML, and that is one of the things needed to get you into trouble (as it is a string delimiter in SQL).

So, it works ONLY if you put the HTML encoded text into the database.

THEN you have quite some trouble with any text search... and with presenting readable text outside the application (like in SQL manager). I would consider that a really badly architected situation, as you have not solved the issue, just duct-taped over an obvious attack vector.

Numeric fields are still problematic, unless your HTML handling is perfect, which I would not assume given that workaround.
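
To illustrate the numeric-field problem, here is a small sketch assuming Python's `sqlite3` and `html` modules and a hypothetical `items` table. The payload contains nothing an HTML encoder touches, so encoding it changes nothing.

```python
import html
import sqlite3

# Hypothetical table for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

user_input = "1 OR 1=1"             # no quotes, no <, >, or &
encoded = html.escape(user_input)   # identical to the input

# Concatenating the "encoded" value still lets the input rewrite the WHERE clause.
unsafe_sql = "SELECT name FROM items WHERE id = " + encoded
leaked = conn.execute(unsafe_sql).fetchall()

# With a bound parameter the same input is only a value, and matches no row.
safe = conn.execute(
    "SELECT name FROM items WHERE id = ?", (user_input,)
).fetchall()

print(len(leaked), len(safe))
```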

Use SQL parameters ;)

TomTom
The idea 'Encoded text only contains ampersands, semi-colons and alphanumerics after ENcoding' is not mine, but a claim the person supporting this method used.
Matt Lacey
+1  A: 

The single character that enables most SQL injection is the SQL string delimiter ', also known as hex 27 or decimal 39.

This character is represented in the same way in SQL and in HTML, and many HTML encoders leave it untouched. With such an encoder, HTML encoding does not affect SQL injection attacks at all.
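
A short sketch of this point, assuming Python's `html.escape` with `quote=False` as an example of an encoder that leaves the apostrophe alone, plus a hypothetical `users` table in `sqlite3`:

```python
import html
import sqlite3

# Hypothetical table for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "' OR '1'='1"
encoded = html.escape(payload, quote=False)  # apostrophes pass through unchanged

# The apostrophe still closes the SQL string literal, so the condition is rewritten.
sql = "SELECT name FROM users WHERE name = '" + encoded + "'"
rows = conn.execute(sql).fetchall()

print(encoded == payload, len(rows))
```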

Andomar
+2  A: 

Absolutely not.

SQL injection should be prevented by parameterized queries or, in the worst case, by escaping the SQL parameter for SQL, not HTML. Each database has its own rules about this; the MySQL API (and most frameworks), for example, provides a particular function for that. The data itself should not be modified when it is stored in the database.

Escaping HTML entities prevents XSS and other attacks when returning web content to clients' browsers.
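
A minimal sketch of escaping at output time, assuming Python's standard `html` module; the stored value and the surrounding markup are invented for the example.

```python
import html

# The value as it would sit in the database: raw and unmodified.
stored = "<script>alert('x')</script>"

# Escape only when building the HTML that goes to the browser.
page = "<p>" + html.escape(stored) + "</p>"

print(page)
```

The database keeps the original text, so searches, exports and non-HTML consumers see the real data, and only the HTML view pays for the encoding.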

sibidiba