Let's say I need to query the associates of a corporation. I have a table, "transactions", which contains data on every transaction made.
CREATE TABLE `transactions` (
`transactionID` int(11) unsigned NOT NULL,
`orderID` int(11) unsigned NOT NULL,
`customerID` int(11) unsigned NOT NULL,
`employeeID` int(11) unsigned NOT NULL,
`corporationID` int(11) unsigned NOT NULL,
PRIMARY KEY (`transactionID`),
KEY `orderID` (`orderID`),
KEY `customerID` (`customerID`),
KEY `employeeID` (`employeeID`),
KEY `corporationID` (`corporationID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
It's fairly straightforward to query this table for associates, but there's a twist: A transaction record is registered once per employee, and so there may be multiple records for one corporation per order.
For example, if employees A and B from corporation 1 were both involved in selling a vacuum cleaner to corporation 2, there would be two records in the "transactions" table; one for each employee, and both for corporation 1. This must not affect the results, though. A trade from corporation 1, regardless of how many of its employees were involved, must be treated as one.
Easy, I thought. I'll just make a join on a derived table, like so:
SELECT corporationID FROM transactions JOIN (SELECT DISTINCT orderID FROM transactions WHERE corporationID = 1) AS foo USING (orderID)
The query returns a list of corporations who have been involved in trades with corporation 1. That's exactly what I need, but it's very slow because MySQL can't use the corporationID index to determine the derived table. I understand that this is the case for all subqueries/derived tables in MySQL.
I've also tried to query a collection of orderIDs separately and use a ridiculously large IN() clause (typhically 100 000+ IDs), but as it turns out MySQL has issues using indices on ridiculously large IN() clauses as well and as a result the query time does not improve.
Are there any other options available, or have I exhausted them both?