(I've changed the details of this question to avoid NDA issues. I'm aware that if taken literally, there are better ways to run this theoretical company.)
There is a group of warehouses, each of which are capable of storing and distributing 200 different products, out of a possible 1000 total products that Company A manufactures. Each warehouse is stocked with 200 products, and assigned orders which they are then to fill from their stock on hand.
The challenge is that each warehouse needs to be self-sufficient. There will be an order for an arbitrary number of products (5-10 usually), which is assigned to a warehouse. The warehouse then packs the required products for the order, and ships them together. For any item which isn't available in the warehouse, the item must be delivered individually to the warehouse before the order can be shipped.
So, the problem lies in determining the best warehouse/product configurations so that the largest possible number of orders can be packed without having to order and wait for individual items.
For example (using products each represented by a letter, and warehouses capable of stocking 5 product lines):
Warehouse 1: [A, B, C, D, E]
Warehouse 2: [A, D, F, G, H]
Order: [A, C, D] -> Warehouse 1
Order: [A, D, H] -> Warehouse 2
Order: [A, B, E, F] -> Warehouse 1 (+1 separately ordered)
Order: [A, D, E, F] -> Warehouse 2 (+1 separately ordered)
The goal is to use historical data to minimize the number of individually ordered products in future. Once the warehouses had been set up a certain way, the software would just determine which warehouse could handle an order with minimal overhead.
This immediately strikes me as a machine learning style problem. It also seems like a combination of certain well known NP-Complete problems, though none of them seem to fit properly.
Is there a model which represents this type of problem?