views:

66

answers:

2

I have a table with user_ids that we've gathered from a streaming datasource of active accounts. Now I'm looking to go through and fill in the information about the user_ids that don't do much of anything.

Is there a SQL (postgres if it matters) way to have a query return random numbers not present in the table?

Eg something like this:

SELECT RANDOM(count, lower_bound, upper_bound) as new_id 
WHERE new_id NOT IN (SELECT user_id FROM user_table) AS user_id_table

Possible, or would it be best to generate a bunch of random numbers with a scripted wrapper and pass those into the DB to figure out non existant ones?

A: 

It is posible. If you want the IDs to be integers, try:

SELECT trunc((random() * (upper_bound - lower_bound)) + lower_bound) AS new_id 
FROM generate_series(1,upper_bound) 
WHERE new_id NOT IN (
    SELECT user_id 
    FROM user_table)
Matthew Flynn
Hmm, it looks like it should work but postgres complains that new_id does not exist. It does the same with replacing the nested select with just a list of numbers. Maybe a DB engine limitation?
Peck
Hmmm.... Yes, apparently you cannot use column aliases in a WHERE or HAVING clause in postgresql. You could consider using pl/pgsql to set a variable to a random number, test it against the table, and repeat until you get a good one.
Matthew Flynn
A: 

You can wrap the query above in a subselect, i.e.

SELECT * FROM (SELECT trunc(random() * (upper - lower) + lower) AS new_id  
FROM generate_series(1, count)) AS x 
WHERE x.new_id NOT IN (SELECT user_id FROM user_table)
Alexey Klyukin