views:

23

answers:

1

I recently gathered, using a questionnaire, a set of opinions on the importance of various software components. Figuring that some form of Condorcet voting method would be the best way to obtain an overall rank, I opted to use OpenSTV to analyze it.


My data is in tabular format, space delimited, and looks more or less like:

A B C D E F G    # Candidates
5 2 4 3 7 6 1    # First ballot. G is ranked first, and E is ranked 7th
4 2 6 5 1 7 3    # Second ballot
etc

In this format, the number indicates the rank and the sequence order indicates the candidate. Each "candidate" has a rank (required) from 1 to 7, where a 1 means most important and a 7 means least important. No duplicates are allowed.

This format struck me as the most natural way to represent the output, being a direct representation of the ballot format.


The OpenSTV/BLT format uses a different method of representing the same info, conceptually as follows:

G B D C A F E    # Again, G is ranked first and E is ranked 7th
E B G A D C F    # 
etc

The actual numeric file format uses the (1-based) index of the candidate, rather than the label, and so is more like:

7 2 4 3 1 6 5    # Same ballots as before.
5 2 7 1 4 3 6    # A -> 1, G -> 7

In this format, the number indicates the candidate, and the sequence order indicates the rank. The actual, real, BLT format also includes a leading weight and a following zero to indicate the end of each ballot, which I don't care too much about for this.


My question is, what is the most elegant way to convert from the first format to the (numeric) second?

A: 

Here's my solution in Python, and it works ok but feels a little clumsy. I'm sure there's a cleaner way(perhaps in another language?)

This took me longer than it should have to wrap my head around yesterday afternoon, so maybe somebody else can use this too.

Given:

ballot = '5 2 4 3 7 6 1'

Python one(ish)-liner to convert it:

rank = [i for r,i in sorted((int(r),i+1) for i,r in enumerate(ballot.split())]
rank = " ".join(rank)

Alternatively, in a slightly more understandable form:

# Split into a list and convert to integers
int_ballot = [int(x) for x in ballot.split()]

# This is the important bit.
# enumerate(int_ballot) yields pairs of (zero-based-candidate-index, rank)
# Use a list comprehension to swap to (rank, one-based-candidate-index)
ranked_ballot = [(rank,index+1) for index,rank in enumerate(int_ballot)]

# Sort by the ranking. Python sorts tuples in lexicographic order
# (ie sorts on first element)
# Use a comprehension to extract the candidate from each pair
rank = " ".join([candidate for rank,candidate in sorted(ranked_ballot)])
kibibu