Is a customer identified by an id? Is it an int or a long? If the answer to both questions is yes, an array of 10,000,000 ints takes about 10M * 4 bytes = 40MB of memory (80MB if the ids are longs), which is not a big deal on decent hardware. Simply sort both arrays and compare them.
Btw, sorting an array of 10M random ints takes less than 2 seconds on my machine, so again, nothing to be afraid of.
Here's some very simple Java code:
    import java.util.Arrays;
    import java.util.Random;

    public class CommonClients {
        public static void main(final String[] args) {
            // elements in each log file
            int count = 10000000;

            // "read" our log files: random non-negative ids stand in for real data
            Random r = new Random();
            int[] a1 = new int[count];
            int[] a2 = new int[count];
            for (int i = 0; i < count; i++) {
                // nextInt(bound) is never negative, unlike Math.abs(Integer.MIN_VALUE)
                a1[i] = r.nextInt(Integer.MAX_VALUE);
                a2[i] = r.nextInt(Integer.MAX_VALUE);
            }

            // start timer
            long start = System.currentTimeMillis();

            // sort logs
            Arrays.sort(a1);
            Arrays.sort(a2);

            // positions in each sorted array and in the result array
            int i1 = 0, i2 = 0, i3 = 0;

            // result array (there can never be more matches than entries)
            int[] a3 = new int[count];

            // walk both sorted arrays in step until one of them is exhausted
            while (i1 < count && i2 < count) {
                int n1 = a1[i1], n2 = a2[i2];
                if (n1 == n2) {
                    // we found a match, save value if unique and advance both positions
                    if (i3 == 0 || a3[i3 - 1] != n1) a3[i3++] = n1;
                    i1++;
                    i2++;
                } else if (n1 < n2) {
                    // n1 is lower, advance in a1 (next value is equal or higher)
                    i1++;
                } else {
                    // n2 is lower, advance in a2 (next value is equal or higher)
                    i2++;
                }
            }

            // we found our results
            System.out.println(i3 + " common clients");
            System.out.println((System.currentTimeMillis() - start) + "ms");
        }
    }
Result (sample output on my machine):

    46308 common clients
    3643ms

As you can see, that's quite efficient for 10M records in each log.
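If the ids turn out to be longs instead of ints, the same sort-and-compare idea carries over unchanged. Below is a minimal sketch of it as a reusable method; the class name `SortedIntersection`, the method name `intersect`, and the tiny `day1`/`day2` arrays are made up for illustration and are not part of the code above. It also copies its inputs before sorting, which costs extra memory, so for really large logs you may prefer to sort in place as the code above does.

```java
import java.util.Arrays;

class SortedIntersection {

    // Returns the distinct values present in both arrays, in ascending order.
    // The inputs are copied first, so the caller's arrays are left untouched.
    static long[] intersect(long[] first, long[] second) {
        long[] a = first.clone();
        long[] b = second.clone();
        Arrays.sort(a);
        Arrays.sort(b);

        // there can never be more matches than entries in the shorter array
        long[] result = new long[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                // match: store it once, even if it repeats in either array
                if (n == 0 || result[n - 1] != a[i]) {
                    result[n++] = a[i];
                }
                i++;
                j++;
            }
        }
        // trim the result to the number of matches actually found
        return Arrays.copyOf(result, n);
    }

    public static void main(String[] args) {
        // hypothetical customer ids from two small "logs"
        long[] day1 = {5L, 1L, 3L, 3L, 9L};
        long[] day2 = {3L, 9L, 9L, 7L, 3L};
        System.out.println(Arrays.toString(intersect(day1, day2))); // prints [3, 9]
    }
}
```

Sorting costs O(n log n) and the comparison pass is linear, so for 10M entries the sort dominates the running time, just like in the int version.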