views:

174

answers:

1

I have an existing (java) application that models an order book, as it stands each order is visible to every other. There is now a requirement to put (what is effectively) a per order ACL in place.

An example to illustrate, lets say I have access groups [V-Z] and orders [A-F]

      A B C D E F
   V  1 0 0 1 1 0
   W  0 1 1 0 0 1
   X  0 0 0 0 1 1
   Y  1 1 0 0 0 1
   Z  0 1 0 1 0 0

A new order comes in that specifies visibility as W & Y. What would be a fast way to return the set of values that can be seen by the incoming order?

One implementation that has been suggested is to represent each row as a BitSet and do W | Y though I wonder what will happen to performance as the size of the matrix increases.

A nice to have but not essential feature is to allow for a parent-child relationship on one dimension like

        A B C D E F
   V    1 0 0 1 1 0
   W    0 1 1 0 0 1
   X-1  0 0 0 0 1 1
   X-2  1 0 0 0 1 1
   X-3  0 1 0 0 1 1
   Y    1 1 0 0 0 1
   Z    0 1 0 1 0 0

It would be ideal if it were similarly efficient to retrieve "W | X" as "W | X-1"

Any hints in the direction of an algorithm and/or appropriate data structure much appreciated.

+1  A: 

The simple solution:

class AccessGroupName { ... }
class Order { ... }

Map<AccessGroupName, Collection<Order>> visibility = new HashMap<AccessGroupName, Collection<Order>>();

addVisibility(AccessGroupName group, Order order) {
 Collection<Order> orders = visibilities.get(group);
 if (orders == null) {
  orders = new ArrayList<Order>();
  visibility.put(group, orders);
 }
 if (!orders.contains(order)) orders.add(order);
}

public Set<Order> getVisibility(Collection<AccessGroupName> names) {
 Set<Order> visible = new HashSet<Order>();
 for (AccessGroupName name: names) {
  visible.addAll(visibilities.get(name));
 }
 return visible;
}

HashMap lookups are O(1). Iterating an ArrayList is O(n). Adding items to a HashSet is O(n). Overall, this will be O(n) where n is the total number of elements in the added lists (which might be more than the number of elements in the resulting set if there's overlap). The constant is, roughly, the time it takes to get an element from an ArrayList iterator plus the time it takes to add something to a HashSet - the former is on the order of 10 cycles, the latter nearer 100.

Memory use, over and above the AccessGroupName and Order instances themselves, is about 14-15 words per group plus 1-2 words per order. Mostly object headers.

This code doesn't do anything clever, but i think you'll be pretty hard pressed to beat O(n) with a constant of <200 cycles.

In particular, if the notional matrix is sparse (that is, if there are lots of access groups with a few orders each), this will beat the pants off a bitset approach, which will waste a hell of a lot of space on zeroes, and time on ORing zeroes together.

Tom Anderson