tags:

views:

16762

answers:

12

I'm looking for the best workarounds for the PreparedStatement "IN clause" issue, which apparently is not supported for multiple values due to sql injection attack security issues: One ?, One value. Not a list of values.

To illustrate:

select my_column from my_table where search_column in (?)

using ps.setString(1, "'A', 'B', 'C'"); is essentially a non-working attempt at a workaround of the reasons for using ? in the first place.

Ideas for workarounds?

I don't want to have to execute the prepared statement multiple times, once for each search value (not using an IN clause), but that's where I am at this point.

+1  A: 

I've never tried it, but would .setArray() do what you're looking for?

Update: Evidently not. setArray only seems to work with a java.sql.Array that comes from an ARRAY column that you've retrieved from a previous query, or a subquery with an ARRAY column.

Paul Tomblin
Doesn't work with all databases, but it's the "correct" approach.
skaffman
You mean all drivers. Some drivers have proprietary equivalents of this years old (last century?) standard. Another way is to bung a batch of values into a temporary table, but not all databases support that...
Tom Hawtin - tackline
http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html#996857According to Sun, Array content [typically] remains on the server side and is pulled as needed. PreparedStatement.setArray() can send back an Array from a previous ResultSet, not create a new Array on the client side.
Chris Mazzola
+4  A: 

I suppose you could (using basic string manipulation) generate the query string in the PreparedStatement to have a number of ?'s matching the number of items in your list.

Of course if you're doing that you're just a step away from generating a giant chained OR in your query, but without having the right number of ? in the query string, I don't see how else you can work around this.

Adam Bellaire
Not really a solution for me since I want to send in a different number of ? each time I call the ps. But don't think I hadn't considered it. :P
Chris Mazzola
Another hack: you can use a large number of parameter placeholders -- as many as the longest list of values you'll have -- and if your list of values is shorter, you can repeat values:...WHERE searchfield IN (?, ?, ?, ?, ?, ?, ?, ?)and then provide values: A, B, C, D, A, B, C, D
Bill Karwin
But overall I favor Adam's solution: generate the SQL dynamically, and concatenate ? placeholders to match the number of values you have to pass.
Bill Karwin
Bill, that solution is workable if I don't want to reuse the PreparedStatement. Another solution is to make the single param call multiple times and accumulate the results on the client side. Likely it would be more efficient to build/execute a new Statement with custom number of ? each time though.
Chris Mazzola
A: 

try using the instr function?

select my_column from my_table where  instr(?, ','||search_column||',') > 0

then

ps.setString(1, ",A,B,C,");

Admittedly this is a bit of a dirty hack, but it does reduce the opportunities for sql injection. Works in oracle anyway.

stjohnroe
Oh, and I am aware that it will not utilise indexes
stjohnroe
A: 

Just for completeness: So long as the set of values is not too large, you could also simply string-construct a statement like

... WHERE tab.col = ? OR tab.col = ? OR tab.col = ?

which you could then pass to prepare(), and then use setXXX() in a loop to set all the values. This looks yucky, but many "big" commercial systems routinely do this kind of thing until they hit DB-specific limits, such as 32 KB (I think it is) for statements in Oracle.

Of course you need to ensure that the set will never be unreasonably large, or do error trapping in the event that it is.

Carl Smotricz
Yes, you're right. My goal in this case was to reuse the PreparedStatement with different numbers of items each time.
Chris Mazzola
Using "OR" would obfuscate the intent. Stick with "IN" as its easier to read and the intent is more clear. The only reason to switch is if the query plans were different.
James Schek
A: 

Following Adam's idea. Make your prepared statement sort of select my_column from my_table where search_column in (#) Create a String x and fill it with a number of "?,?,?" depending on your list of values Then just change the # in the query for your new String x an populate

+4  A: 

An unpleasant work-around, but certainly feasible is to use a nested query. Create a temporary table MYVALUES with a column in it. Insert your list of values into the MYVALUES table. Then execute

select my_column from my_table where search_column in ( SELECT value FROM MYVALUES )

Ugly, but a viable alternative if your list of values is very large.

This technique has the added advantage of potentially better query plans from the optimizer (check a page for multiple values, tablescan only once instead once per value, etc) may save on overhead if your database doesn't cache prepared statements. Your "INSERTS" would need to be done in batch and the MYVALUES table may need to be tweaked to have minimal locking or other high-overhead protections.

James Schek
What advantages would that have over querying my_table one value at a time?
Paul Tomblin
The query optimizer can reduce I/O load by retrieving all possible matches from a loaded page. Tablescans or index scans may be performed once instead of once per value. Overhead for inserting values can be reduced with batch operations and may be less than several queries.
James Schek
+1  A: 

No simple way AFAIK. If the target is to keep statement cache ratio high (i.e to not create a statement per every parameter count), you may do the following:

  1. create a statement with a few (e.g. 10) parameters:

    ... WHERE A IN (?,?,?,?,?,?,?,?,?,?) ...

  2. Bind all actuall parameters

    setString(1,"foo"); setString(2,"bar");

  3. Bind the rest as NULL

    setNull(3,Types.VARCHAR) ... setNull(10,Types.VARCHAR)

NULL never matches anything, so it gets optimized out by the SQL plan builder.

The logic is easy to automate when you pass a List into a DAO function:

while( i < param.size() ) {
  ps.setString(i+1,param.get(i));
  i++;
}

while( i < MAX_PARAMS ) {
  ps.setNull(i+1,Types.VARCHAR);
  i++;
}
Vladimir Dyuzhev
+6  A: 

An analysis of the various options available, and the pros and cons of each is available here

Don
A: 

I just used Adam's solution; here's the source:

public void myQuery(List<String> items, int other) {
  ...
  String q4in = generateQsForIn(items.size());
  String sql = "select * from stuff where foo in ( " + q4in + " ) and bar = ?";
  PreparedStatement ps = connection.prepareStatement(sql);
  int i = 1;
  for (String item : items) {
    ps.setString(i++, item);
  }
  ps.setInt(i++, other);
  ResultSet rs = ps.executeQuery();
  ...
}

private String generateQsForIn(int numQs) {
    String items = "";
    for (int i = 0; i < numQs; i++) {
        items += "?";
        if (i < numQs - 1) {
            items += ", ";
        }
    }
    return items;
}
neu242
Please use `StringBuilder` and `String#format()`.
BalusC
There's no need to use StringBuilder anymore. The compiler converts the + signs to StringBuilder.append() anyway, so there is no performance hit. Try yourself :)
neu242
A: 

Here is the way I would suggest Click here

Hilo
+1  A: 

Here's how I do it:

public static String preparePlaceHolders(int length) {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < length;) {
        builder.append("?");
        if (++i < length) {
            builder.append(",");
        }
    }
    return builder.toString();
}

public static void setValues(PreparedStatement preparedStatement, Object... values) throws SQLException {
    for (int i = 0; i < values.length; i++) {
        preparedStatement.setObject(i + 1, values[i]);
    }
}

and how I use it:

private static final String SQL_FIND = "SELECT id, name, value FROM data WHERE id IN (%s)";

public List<Data> find(Set<Long> ids) throws SQLException {
    Connection connection = null;
    PreparedStatement statement = null;
    ResultSet resultSet = null;
    List<Data> list = new ArrayList<Data>();
    String sql = String.format(SQL_FIND, preparePlaceHolders(ids.size()));

    try{
        connection = database.getConnection();
        statement = connection.prepareStatement(sql);
        setValues(statement, ids.toArray());
        resultSet = statement.executeQuery();
        while (resultSet.next()) {
            Data data = new Data();
            data.setId(resultSet.getLong("id"));
            data.setName(resultSet.getString("name"));
            data.setValue(resultSet.getInt("value"));
            list.add(data);
        }
    } finally {
        close(connection, statement, resultSet);
    }

    return list;
}
BalusC
A: 

One of advantages of prepared statements is that sohuld can be compiled once for efficiency. By making the in clause dynamic this effectively negates the prepared statement.