For example, I've often wanted to search Stack Overflow with

SELECT whatever FROM questions WHERE
   views * N + votes * M > answers AND NOT(answered) ORDER BY views;

or something like that.

Is there any reasonable way to allow users to use SQL as a search/filter language?

I see a few problems with it:

  • Accessing/changing stuff (a carefully set up user account should fix that)
  • SQL injection (given the previous point, the worst they should be able to do is get back junk and crash their session)
  • DoS attacks with pathological queries
  • What indexes do you give them?

Edit: I'd like to allow joins and whatnot as well.

+2  A: 

If you do SQLEncode your users' input (and make sure to remove all ; as well!), I see no huge safety flaw (other than that we're still handing nukes out to psychos...) in having three input boxes - one for table, one for columns and one for conditions. They won't be able to have strings in their conditions, but queries like your example should work. You will do the actual pasting together of the SQL statement, so you'll be in control of what is actually executed. If your setup is good enough you'll be safe.

BUT, I wouldn't for my life let my user enter SQL like that. If you want to really customize search options, give either a bunch of flags for the search field, or a bunch of form elements that can be combined at will.

Another option is to invent some kind of "markup language", sort of like Markdown (the framework SO uses for formatting all these questions and answers...), that you can translate to SQL. Then you can make sure that only "harmless" selects are performed, and you can protect user data etc.

In fact, if you ever implement this, you should see if you could run the commands from a separate account on the SQL server, which only has access to the very basic needs, and obviously only read access.
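
To make the "you do the pasting together" idea concrete, here is a minimal sketch in Python. It assumes the user picks a table, some columns, and conditions from form elements; the `ALLOWED` schema, the `OPERATORS` set, and `build_query` are hypothetical names made up for this illustration. Identifiers are only accepted if they appear in a whitelist, and values never touch the SQL text because they travel as bound parameters.

```python
# Hypothetical whitelist: table name -> columns users may reference.
ALLOWED = {"questions": {"id", "title", "views", "votes", "answers"}}
# Comparison operators the form is allowed to emit.
OPERATORS = {"=", "!=", "<", "<=", ">", ">="}


def build_query(table, columns, conditions):
    """Turn whitelisted choices into (sql, params).

    `conditions` is a list of (column, operator, value) triples.
    Anything not on the whitelist raises ValueError.
    """
    if table not in ALLOWED:
        raise ValueError(f"unknown table: {table!r}")
    allowed = ALLOWED[table]
    if not columns:
        raise ValueError("at least one column is required")
    for col in columns:
        if col not in allowed:
            raise ValueError(f"unknown column: {col!r}")

    clauses, params = [], []
    for col, op, value in conditions:
        if col not in allowed:
            raise ValueError(f"unknown column: {col!r}")
        if op not in OPERATORS:
            raise ValueError(f"unknown operator: {op!r}")
        # The value is bound as a parameter, never pasted into the SQL.
        clauses.append(f"{col} {op} ?")
        params.append(value)

    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

The returned pair can be handed to any driver that supports `?` placeholders, for example `conn.execute(sql, params)` with the standard `sqlite3` module. This only covers simple comparisons joined by AND; anything richer needs a real parser.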

Tomas Lycken
+1 I agree completely. Users should input data values, and perhaps choices about logic (like ALL keywords vs. ANY keywords), but users should not be able to input code that is run verbatim.
Bill Karwin
The problem is that by the time I make a system that lets the user query like I'd like them to be able to, I'd have reimplemented much/most of SQL's read-only parts.
BCS
I don't believe you. No application needs that much flexibility.
Bill Karwin
@Bill: I'd love to have read only SQL access to the stack overflow DB. I'll grant I wouldn't use ALL of SQL but I don't think there is much of the query system I could rule out as something I wouldn't ever want to use.
BCS
+5  A: 

Accessing/changing stuff
No problem, just run the query with a crippled user, with permissions only to select

SQL injection
Just sanitize the query

DOS attacks
Time-out the query and throttle the access by IP. I guess you can also throttle the CPU usage in some servers
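
As a rough illustration of the first and third points, here is a sketch using Python's built-in `sqlite3` module rather than a server database, so the mechanics differ from a real restricted account. `open_read_only` and `run_with_deadline` are hypothetical helper names. The read-only connection stands in for the select-only user, and the progress handler stands in for a server-side query timeout.

```python
import sqlite3
import time


def open_read_only(path):
    """Open an existing SQLite file so that any write attempt fails."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def run_with_deadline(conn, sql, params=(), seconds=1.0):
    """Run a query, aborting it once `seconds` of wall-clock time have passed."""
    deadline = time.monotonic() + seconds

    def check_deadline():
        # A non-zero return value tells SQLite to abort the statement.
        return 1 if time.monotonic() > deadline else 0

    # SQLite calls the handler roughly every 10,000 VM instructions.
    conn.set_progress_handler(check_deadline, 10_000)
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.OperationalError as exc:
        if time.monotonic() > deadline:
            raise TimeoutError("query exceeded its time budget") from exc
        raise
    finally:
        # Remove the handler so later queries are not affected.
        conn.set_progress_handler(None, 1)
```

A server database would normally do both jobs itself, through account permissions and a statement timeout setting. Per-IP throttling sits in front of all this, in the web layer.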

Eduardo Molteni
`SELECT * FROM questions, questions, questions, questions, questions, questions ORDER BY RAND()` -- I just crashed your server, with nothing but select privilege.
Bill Karwin
Not if the input is sanitized properly.
Bratch
@Bill Karwin: CPU/RAM/time limits could keep that query from crashing anything.
BCS
That query looks to me like a DOS, not something that should cause an actual crash (i.e., BSOD/kernel panic), so Eduardo already addressed it: Throttle CPU usage and set a query timeout.
Dave Sherohman
I agree. Just say, there are no guarantees and your query can be timed out at a time of our choosing.
Matthew Flaschen
+1  A: 

Facebook does this with FQL. See the blog post or presentation.

David Phillips
The Yahoo API query language is similar but the links don't help with the implementation part, just the usage.
Trey
A: 

I just thought of a strong sanitize method that could be used to restrict what can be used.

  • Use MySQL and grab its lex/yacc files
  • use the lex file as is
  • gut the yacc file to only the things you want to allow
  • use action rules that spit out the input on success.
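
The steps above can be sketched on a much smaller scale. This is not MySQL's grammar, just a hand-written recursive-descent parser in Python for a tiny filter-expression grammar, and the `COLUMNS` whitelist and `validate_filter` name are made up for the illustration. Anything the grammar does not cover is rejected, and on success the parser emits the expression back as text.

```python
import re

# Lexer: integers, identifiers/keywords, and a fixed set of operators.
TOKEN = re.compile(
    r"\s*(?:(?P<num>[0-9]+)|(?P<word>[A-Za-z_][A-Za-z0-9_]*)|(?P<op><=|>=|!=|[-+*/()<>=]))"
)
COLUMNS = {"views", "votes", "answers", "answered"}  # hypothetical whitelist
COMPARISONS = {"=", "!=", "<", "<=", ">", ">="}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(m.group(m.lastgroup))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        self.i += 1
        return tok

    def binary(self, operand, ops):
        # Left-associative chain: operand (op operand)*
        out = operand()
        while (tok := self.peek()) is not None and tok.upper() in ops:
            out = f"{out} {self.take().upper()} {operand()}"
        return out

    def expr(self):
        return self.binary(self.conj, {"OR"})

    def conj(self):
        return self.binary(self.neg, {"AND"})

    def neg(self):
        if (self.peek() or "").upper() == "NOT":
            self.take()
            return f"NOT {self.neg()}"
        return self.comparison()

    def comparison(self):
        left = self.total()
        if self.peek() in COMPARISONS:
            return f"{left} {self.take()} {self.total()}"
        return left

    def total(self):
        return self.binary(self.product, {"+", "-"})

    def product(self):
        return self.binary(self.atom, {"*", "/"})

    def atom(self):
        tok = self.take()
        if tok == "(":
            inner = self.expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return f"({inner})"
        if tok.isdigit() or tok in COLUMNS:
            return tok
        raise ValueError(f"not allowed here: {tok!r}")


def validate_filter(text):
    """Return the filter as normalized text, or raise ValueError."""
    parser = Parser(tokenize(text))
    sql = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token: {parser.peek()!r}")
    return sql
```

The output is only a WHERE-clause fragment, so the surrounding SELECT is still written by the application. String literals, function calls, and subqueries are deliberately absent from this toy grammar; each one would have to be added as an explicit rule.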
BCS
Reading David's links, it looks like that might be exactly what FQL is (or maybe it's done directly in a hacked MySQL to avoid the bounce back to text).
BCS