Karl's approach works just fine, but we can greatly reduce the amount of data produced by the mappers.
Let K be the number of samples you want. We'll assume that this is small enough to hold in memory on one of your nodes. We'll assign a random value to each matching row, and then use a modification of the selection algorithm to find the K smallest values.
At the setup stage of each mapper, create a priority queue; a Fibonacci heap is a good choice for this. We'll use floats as the priorities; if you have a huge amount of data, doubles may be more appropriate to reduce the chance of ties. For each row that matches your condition, insert that row into the priority queue with a uniformly random float between 0 and 1 as the priority. If you have more than K items in your queue, remove the item with the largest priority (this is the opposite of what a standard Fibonacci heap's delete-min gives you, so invert the comparison or use a max-heap).
Finally, at the end of the mapper, emit everything left in your queue. For each item you emit, use the priority as the key (a FloatWritable) and some representation of the corresponding row as the value (the row ID, or perhaps the entire row contents). Each mapper emits at most K values, or fewer if it saw fewer than K matching rows.
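Here is a minimal sketch of such a mapper, assuming a TextInputFormat job where each input line is one row. The "sample.k" configuration key and the matches() filter are placeholders for your own setup, and it uses a java.util.PriorityQueue ordered as a max-heap rather than an actual Fibonacci heap, which is enough to show the bounded-queue idea.

    import java.io.IOException;
    import java.util.PriorityQueue;
    import java.util.Random;

    import org.apache.hadoop.io.FloatWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Keeps only the K rows with the smallest random priorities seen by this mapper.
    public class SampleMapper extends Mapper<LongWritable, Text, FloatWritable, Text> {

        // One entry per candidate row: its random priority plus the row itself.
        private static class Candidate {
            final float priority;
            final String row;
            Candidate(float priority, String row) {
                this.priority = priority;
                this.row = row;
            }
        }

        private int k;
        private Random random;
        // Max-heap on priority, so the largest (worst) candidate is cheap to evict.
        private PriorityQueue<Candidate> queue;

        @Override
        protected void setup(Context context) {
            k = context.getConfiguration().getInt("sample.k", 100);  // assumed config key
            random = new Random();
            queue = new PriorityQueue<Candidate>(k + 1,
                    (a, b) -> Float.compare(b.priority, a.priority));
        }

        @Override
        protected void map(LongWritable key, Text value, Context context) {
            String row = value.toString();
            if (!matches(row)) {          // your filtering condition
                return;
            }
            queue.add(new Candidate(random.nextFloat(), row));
            if (queue.size() > k) {
                queue.poll();             // evict the largest priority seen so far
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // Emit at most K (priority, row) pairs; Hadoop sorts them by key for the reducer.
            for (Candidate c : queue) {
                context.write(new FloatWritable(c.priority), new Text(c.row));
            }
        }

        private boolean matches(String row) {
            return true;                  // placeholder: replace with the real condition
        }
    }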
In your single reducer, Hadoop will automatically present the keys in sorted order, from lowest to highest. Emit the rows corresponding to the first K keys you see (the K lowest), then stop emitting.
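A matching reducer sketch, under the same assumption about the "sample.k" configuration key; because keys arrive in ascending order, it simply copies values through until it has written K of them:

    import java.io.IOException;

    import org.apache.hadoop.io.FloatWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Emits the rows attached to the K smallest priorities, then ignores the rest.
    public class SampleReducer extends Reducer<FloatWritable, Text, NullWritable, Text> {

        private int k;
        private int emitted;

        @Override
        protected void setup(Context context) {
            k = context.getConfiguration().getInt("sample.k", 100);  // assumed config key
            emitted = 0;
        }

        @Override
        protected void reduce(FloatWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Several mappers could (rarely) produce the same priority, hence the loop.
            for (Text row : values) {
                if (emitted >= k) {
                    return;               // we already have our K samples
                }
                context.write(NullWritable.get(), row);
                emitted++;
            }
        }
    }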
This works because each matching row has the same probability of having one of the K smallest float values. We keep track of the K smallest floats for each mapper to make sure we don't miss any, and then send them to the reducer to find the K smallest overall.
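For completeness, a driver sketch showing the one setting this scheme depends on, a single reduce task, so that every mapper's candidates meet in one place; the class names, paths, and the value of "sample.k" are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.FloatWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class SampleDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setInt("sample.k", 100);          // assumed config key read by mapper and reducer

            Job job = Job.getInstance(conf, "k-sample");
            job.setJarByClass(SampleDriver.class);
            job.setMapperClass(SampleMapper.class);
            job.setReducerClass(SampleReducer.class);
            job.setNumReduceTasks(1);              // a single reducer sees all keys in sorted order

            job.setMapOutputKeyClass(FloatWritable.class);
            job.setMapOutputValueClass(Text.class);
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }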