views:

97

answers:

3

Ok, I have a problem that i've searched and searched for solutions to online and can't find any leads, here it is (oh, and i'm very new to this database stuff ;-)

Lets say I have thousands of retail stores and each one can carry 1 to 100 products, but there are 10,000 possible products available. I need each store owner to be able to see (and edit) the list of products in their store - this is straightforward and I can figure out the table structure for this - but what I also want to do is to be able to find the most common product combinations across stores, so for example I want to see something like:

120: pid34, pid234, pid876, pid120, pid100
118: pid45, pid54, pid657, pid763, pid237
115: pid23, pid546, pid657, pid543, pid23

Where the first number is the number of stores with those products (order of the products within the store doesn't matter), and the pid numbers are the product ids of the products (except remember there could be up to 100 products in each store).

So if I set up a table of:

ID, PID,    Store ID
1,  pid34,  10
2,  pid234, 10
3,  pid876, 10
4,  pid120, 10
5,  pid100, 10
6,  pid45,  45
... etc ...

I can keep track of my store inventories, but i've no idea how I can do my 'combinations' search, help!

A: 

This is categorically not a database problem. This is a problem which something like Prolog or ECLiPSe would be great for, or perhaps a constraint solver within some other language.

Justice
A: 
var1 = Select the unique store numbers

foreach var1
  select the products that apply to that store number.
/foreach

Justice is right though, this is not neccessarilly a database problem, but rather a more general programming problem.

SeanJA
+1  A: 

You should be able to solve most of this problem using a database query, but you will still need another programming language wrapped around this query (such as java along with JDBC). Assuming you are using a SQL database of some sort that is. By the looks of your structure I am thinking you are going to need to use a group by clause. While I don't know your database structure since you didn't mention it so I am going to for the sake of this say that your table is called "products"

Let's first design a query to get the number of each product:

SELECT pid, COUNT(*) AS NUM FROM products p GROUP BY pid ORDER BY NUM DESC;

So this query will return something as follows:

pid,    NUM
pid34   120
pid29   120
pid20   20

So, this is getting closer but still not exactly what is wanted. However, now by using the query combined with a programming language, things can easily be done. I have done some java code below as an example.

// Assumes that database connection has already been made and is in the variable: conn1
Statement stmt1 = conn1.createStatement();
ResultSet rs1 = stmt1.executeQuery("SELECT pid, COUNT(*) AS NUM FROM products p GROUP BY pid ORDER BY NUM DESC");

int prevNum = -1;
while(rs1.next())
{
    int thisNum = rs1.getInt("NUM");
    if(thisNum != prevNum)
    {
        // this means it is a different number than the last result, start a new line
        system.out.print("\n" + thisNum + ": ");
        prevNum = thisNum;
    }
    system.out.print(rs1.getString("pid") + ", ");
}
stmt1.close();
conn1.close();

While I may not have all the logic correct, especially when it comes to formatting, this should get you going on the right path with how you need to go about it. Really this problem needs to be solved by using a combination of queries and a programming language.

Queries can do a small subset of problems, however they do an extremely good job at doing those problems, whereas a programming language can solve a much wider array of problems. However, a programming language will in many cases not be as efficient as the database is at solving the same problems, which is why combining the two many times can allow for the solving of complex problems with a lot of efficiency.

Brian