ansaurus

Question

Split multiple SQL statements into individual SQL statements

Answer 1

A:

Your best bet is to require the user to put some type of delimiter between statements. For example, require each statement to be separated by a line containing only the word GO or a "\", or to end with a ";".

This way you can easily break the single string into separate SQL statements.
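
As a minimal sketch of the GO-line variant, assuming a line counts as a separator when it contains nothing but the word GO (in any case, ignoring surrounding whitespace), something like the following could work. The class and method names are hypothetical, and a GO line that happens to sit inside a multi-line string literal or comment would be treated as a separator too.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a script on lines that contain only the word GO.
class GoBatchSplitter {
    static List<String> split(String script) {
        List<String> batches = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // \R matches any line break (\n, \r\n, \r, ...).
        for (String line : script.split("\\R")) {
            if (line.trim().equalsIgnoreCase("GO")) {
                // Separator line: close the current batch and start a new one.
                addIfNotBlank(batches, current);
                current.setLength(0);
            } else {
                current.append(line).append('\n');
            }
        }
        // The last batch does not need a trailing GO.
        addIfNotBlank(batches, current);
        return batches;
    }

    private static void addIfNotBlank(List<String> batches, StringBuilder sql) {
        String trimmed = sql.toString().trim();
        if (!trimmed.isEmpty()) {
            batches.add(trimmed);
        }
    }
}
```

Calling `GoBatchSplitter.split(script)` returns the batches in order, with the GO lines themselves removed.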

Jess 2009-03-11 01:47:26

Added point 7. Forgot to mention I definitely don't want a delimiter.

Justin 2009-03-11 02:23:01

Answer 2

A:

Maybe try this library. I have used it successfully for parsing SQL in the past. http://www.sqlparser.com/

Craig 2009-03-11 01:50:29

I'll check this out. I need to do this in code, so I'm not sure that would work, but I'll look tomorrow.

Justin 2009-03-11 02:24:39

Answer 3

A:

If you don't want your users to put in a delimiting character such as ';' or anything else, you will need to parse the input yourself and have logic to determine where statements begin.

Your logic will need to deal with the obvious query-starting keywords 'SELECT', 'UPDATE', 'INSERT', 'DELETE' and work forward to the next keyword (or end of input).

Nick Josevski 2009-03-11 01:52:33

I recently worked on a SQL parser. Initially I thought, like you did, that it should be relatively straightforward, but don't be fooled. Even with the help of a third-party SQL parser component I still had to write 600 lines of code to do some pretty simple parsing.

Craig 2009-03-11 01:59:10

Yah, I know I can write my own routine to do this. But every time I try, it turns out hideous and fails.

Justin 2009-03-11 02:25:33

Answer 4

A:

Have you tried using the keywords 'SELECT', 'UPDATE', 'INSERT' and 'DELETE' combined with counting the number of opening '(' and closing ')' parentheses?

This should allow you to skip over nested SELECT statements and find the correct end of each statement.

Craig T 2009-03-11 01:58:45

Yes, I did. The code got hideous and long, and I constantly found use cases that broke it, so I figured there must be a more flexible way.

Justin 2009-03-11 02:29:20

Answer 5

A:

You need to require the semicolon delimiter. Technically, without it a SQL statement is completely invalid; anyone omitting it is writing malformed SQL. Requiring the semicolon solves all of your problems, in a standardized way, and makes the software easy to write.

Perhaps do the following: if the user enters a query not containing one or more semicolons (outside of quotes, of course), add a semicolon at the end and run it as a single query. Otherwise, split the entered queries at semicolons and run each one individually, perhaps tacking on a semicolon at the end of the final query if omitted.

This solution is easy to write, SQL standard compliant, and just plain works. Not requiring the delimiter is a sure path to madness.
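
A minimal sketch of that approach, assuming only '...' and "..." literals need protecting and that a quote inside a literal is escaped by doubling it. The class name is hypothetical; backslash escapes, backticks and comments are deliberately not handled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits on semicolons that are not inside '...' or "..." literals.
class SemicolonSplitter {
    static List<String> split(String input) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 = outside a literal, otherwise the opening quote character
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // a doubled quote ('') closes and immediately reopens
                }
                current.append(c);
            } else if (c == '\'' || c == '"') {
                quote = c;
                current.append(c);
            } else if (c == ';') {
                addStatement(statements, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The final statement may lack a semicolon; one is tacked on below.
        addStatement(statements, current);
        return statements;
    }

    private static void addStatement(List<String> statements, StringBuilder sql) {
        String trimmed = sql.toString().trim();
        if (!trimmed.isEmpty()) {
            statements.add(trimmed + ";");
        }
    }
}
```

Input with no semicolon at all comes back as a single statement with a semicolon appended, which covers the first case described above.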

kquinn 2009-03-11 02:08:32

"Not requiring the delimiter is a sure path to madness." I totally disagree. That's one of the best features in MS SQL Management Studio.

Justin 2009-03-11 03:41:23

If LuckyLindy's comment above is to be believed, even SQL Management Studio uses the approach I describe. Not requiring delimiters will require you to write a full SQL parser, as complex as the server's itself. Don't do it. 'Saving' users the 'trouble' of semicolons will only hurt in the long run.

kquinn 2009-03-11 06:06:01

@kquinn - SQL Server Management Studio does _not_ require a delimiter.

Justin 2010-01-12 16:26:31

You're still crazy for demanding this. The alternative is simple and easy. Your demand for no delimiters is difficult, error-prone, and fragile.

kquinn 2010-01-13 06:58:14

@kquinn. Respectfully, I disagree. SQL Server Management Studio does not require delimiters and it is not an error-prone or fragile application. The idea is to design software for how people will tend to behave, not how I want them to behave. Look, I'm not saying this is easy, and I'm certainly not going to try to write my own SQL parser in PHP, but it's still worth trying to find a way.

Justin 2010-03-30 12:30:53

The fact that this question is now one year old and not yet solved should tell you something: namely, your demand is inane. Just require a semicolon or a double-newline or something, and be done with it.

kquinn 2010-03-30 19:44:06

Answer 6

A:

You could parse it yourself I suppose. Look for the keywords SELECT, DELETE, UPDATE, INSERT, EXEC, etc.

As you parse, if you encounter a "(" increment a counter: nest_level++

If you encounter a ")" decrement nest_level--

Then when you come across a keyword, and nest_level == 0, then you've come to the next statement.

You'll also have to handle cases like

 INSERT ...
 SELECT ....

So for an INSERT you would have to look for either SELECT or VALUES...

And no doubt other cases.
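
The counter idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name is hypothetical, the keyword list is the one given above, any text before the first keyword is dropped, and keywords inside string literals or comments (or clauses such as FOR UPDATE) will still fool it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the keyword + nesting-counter heuristic.
class KeywordSplitter {
    private static final Set<String> STARTERS =
        new HashSet<>(Arrays.asList("SELECT", "DELETE", "UPDATE", "INSERT", "EXEC"));
    // Matches a single parenthesis or a bare word.
    private static final Pattern TOKEN = Pattern.compile("[()]|[A-Za-z_][A-Za-z0-9_]*");

    static List<String> split(String input) {
        List<String> statements = new ArrayList<>();
        int nestLevel = 0;
        int start = -1;             // offset where the current statement began
        boolean insertOpen = false; // an INSERT still waiting for its SELECT or VALUES
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String token = m.group();
            if (token.equals("(")) {
                nestLevel++;
            } else if (token.equals(")")) {
                nestLevel--;
            } else if (nestLevel == 0) {
                String word = token.toUpperCase(Locale.ROOT);
                if (insertOpen && (word.equals("SELECT") || word.equals("VALUES"))) {
                    insertOpen = false; // still part of the INSERT statement
                } else if (STARTERS.contains(word)) {
                    // A top-level keyword marks the start of the next statement.
                    if (start >= 0) {
                        statements.add(input.substring(start, m.start()).trim());
                    }
                    start = m.start();
                    insertOpen = word.equals("INSERT");
                }
            }
        }
        if (start >= 0) {
            statements.add(input.substring(start).trim());
        }
        return statements;
    }
}
```

A nested SELECT inside parentheses is ignored because `nestLevel` is above zero when it is seen, and the SELECT of an INSERT ... SELECT is absorbed by the `insertOpen` flag.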

I agree with kquinn that you should just require the semicolon. I don't think there's anything "uncool" about that.

MikeW 2009-03-11 02:21:58

Yah, these are all the traps that I caught in trying to write my own algorithm.

Justin 2009-03-11 03:43:48

Answer 7

A:

I'm not sure this is possible at all. You would certainly need an in-depth knowledge of the SQL syntax of your target DBMS. For example, just off the top of my head, this is a single MySQL statement:

INSERT INTO things
SELECT * FROM otherthings ON DUPLICATE KEY
UPDATE thingness=thingness+1

It is likely there are constructs in some DBMSs that, without a delimiter, could be ambiguous.

"I don't want to require the user to enter a semi-colon after each SQL statement."

I think you may be forced to. It's totally the standard way to delimit SQL statements. Even if you can find a heuristic to detect probably-start-of-SQL-statement points, you risk disasters like an accidental “DELETE FROM things”-without-WHERE-clause.

"SQL statements can be on one or multiple lines, so I can't wrap on LBs/CRs"

Would double-newline-for-new-statement be acceptable?

"I tried some RegEx attempts, but that doesn't seem to be powerful enough."

No, even with semicolon delimiters, regex is nowhere near powerful enough to parse SQL. Problem points would include:

';'
";"
`;`
'\';'
''';'
-- ;
#;
/*;*/

and any interposition of these structures. Eek!
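
A hand-written scanner can step over each of those constructs before it looks for a semicolon. The sketch below is a rough illustration, not a full lexer, and rests on several assumptions: MySQL-style rules (backslash escapes inside '...' and "...", doubled quotes, backtick identifiers, `#` and `--` line comments, `/* */` block comments), `--` treated as a comment start even without the following space, and a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner: finds semicolons outside MySQL-style literals and comments.
class SqlScanner {
    static List<String> split(String sql) {
        List<String> statements = new ArrayList<>();
        int start = 0;
        int i = 0;
        int n = sql.length();
        while (i < n) {
            char c = sql.charAt(i);
            if (c == '\'' || c == '"' || c == '`') {
                i = skipQuoted(sql, i);
            } else if (c == '#' || sql.startsWith("--", i)) {
                // Line comment: skip to the end of the line.
                int eol = sql.indexOf('\n', i);
                i = (eol < 0) ? n : eol + 1;
            } else if (sql.startsWith("/*", i)) {
                // Block comment: skip to the closing marker.
                int end = sql.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
            } else if (c == ';') {
                add(statements, sql.substring(start, i));
                start = ++i;
            } else {
                i++;
            }
        }
        add(statements, sql.substring(start));
        return statements;
    }

    // Returns the index just past the closing quote (or end of input if unterminated).
    private static int skipQuoted(String sql, int open) {
        char quote = sql.charAt(open);
        int i = open + 1;
        while (i < sql.length()) {
            char c = sql.charAt(i);
            if (c == '\\' && quote != '`') {
                i += 2; // backslash escape: skip the escaped character
            } else if (c == quote) {
                if (i + 1 < sql.length() && sql.charAt(i + 1) == quote) {
                    i += 2; // doubled quote stays inside the literal
                } else {
                    return i + 1;
                }
            } else {
                i++;
            }
        }
        return sql.length();
    }

    private static void add(List<String> statements, String sql) {
        String trimmed = sql.trim();
        if (!trimmed.isEmpty()) {
            statements.add(trimmed);
        }
    }
}
```

Comments are kept as part of whichever statement they fall in, so a trailing comment after the last semicolon comes back as its own entry.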

bobince 2009-03-11 03:38:19

All good points, but I don't want to require a delimiter. It's very possible and safe to parse it without; just look at SQL Management Studio. I didn't say it was going to be easy.

Justin 2009-03-11 03:43:09

Answer 8

+1 A:

To add a quirk to the discussion that periodically causes issues:

DECLARE c CURSOR FOR
    SELECT * FROM SomeWhere ...
        FOR UPDATE

The trailing UPDATE tends to throw ad hoc parsers off their stride. It may well be that you don't have to worry about that because the DECLARE notation (which is really Embedded SQL, not plain SQL) is not permitted in the first place. But the FOR UPDATE clause can appear in some dialects of SQL even when not in a DECLARE statement, so beware.
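
One way an ad hoc parser might guard against this is to look at the tokens just before UPDATE. The sketch below is purely illustrative: the class and method names are hypothetical, the input is assumed to be already split into word tokens, and it only covers FOR UPDATE plus the MySQL ON DUPLICATE KEY UPDATE clause mentioned earlier in this thread; other dialects may need more exceptions.

```java
import java.util.List;

// Hypothetical check: does the UPDATE token at this index begin a new statement?
class UpdateKeywordCheck {
    static boolean startsStatement(List<String> tokens, int index) {
        if (!tokens.get(index).equalsIgnoreCase("UPDATE")) {
            return false;
        }
        if (index == 0) {
            return true;
        }
        String previous = tokens.get(index - 1);
        if (previous.equalsIgnoreCase("FOR")) {
            return false; // SELECT ... FOR UPDATE
        }
        // INSERT ... ON DUPLICATE KEY UPDATE (MySQL)
        boolean afterDuplicateKey = index >= 2
            && previous.equalsIgnoreCase("KEY")
            && tokens.get(index - 2).equalsIgnoreCase("DUPLICATE");
        return !afterDuplicateKey;
    }
}
```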

Jonathan Leffler 2009-03-11 04:36:16

Answer 9

+1 A:

Maybe with the following Java regexp? Check the test...

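// Assumes JUnit 4; the original post does not say which version was used.
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Wrapper class name is illustrative only.
public class SplitRegexpTest {
    @Test
    public void testRegexp() {
        // Two statements, each with a semicolon inside a string literal.
        String s = //
            "SELECT 'hello;world' \n" + //
            "FROM DUAL; \n" + //
            "\n" + //
            "SELECT 'hello;world' \n" + //
            "FROM DUAL; \n" + //
            "\n";

        // Lazily consume non-semicolon text and whole '...' literals up to the next ';'.
        String regexp = "([^;]*?('.*?')?)*?;\\s*";

        // Each statement, with its trailing whitespace, is replaced by one marker.
        assertEquals("<statement><statement>", s.replaceAll(regexp, "<statement>"));
    }
}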

mhoms 2010-05-12 12:36:03
