I want to have a structure token that has start/end pairs for position, sentence, and paragraph information. I also want the members to be accessible in two different ways: as a start/end pair and individually. Given:
    struct token {
        struct start_end {
            int start;
            int end;
        };
        start_end pos;
        start_end sent;
        start_end para;
        typedef start_end token::*start_end_ptr;
    };
I can write a function, say distance(), that computes the distance between any of the three start/end pairs like:
    int distance( token const &i, token const &j, token::start_end_ptr mbr ) {
        return (j.*mbr).start - (i.*mbr).end;
    }
and call it like:
    token i, j;
    int d = distance( i, j, &token::pos );
This returns the distance of the pos pair. But I can also pass &token::sent or &token::para and it does what I want. Hence, the function is flexible.
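For concreteness, here is a minimal self-contained sketch of the pair-level case described above. The struct and distance() are as given in the question; make_token() is a hypothetical helper added only to build sample data.

```cpp
#include <cassert>

struct token {
    struct start_end {
        int start;
        int end;
    };
    start_end pos;
    start_end sent;
    start_end para;
    typedef start_end token::*start_end_ptr;
};

// Gap between the end of i's pair and the start of j's pair.
// The caller decides which pair via the pointer-to-member.
int distance( token const &i, token const &j, token::start_end_ptr mbr ) {
    return (j.*mbr).start - (i.*mbr).end;
}

// Hypothetical helper (not part of the question): fills all six ints.
token make_token( int pos_s, int pos_e, int sent_s, int sent_e,
                  int para_s, int para_e ) {
    token t;
    t.pos.start  = pos_s;  t.pos.end  = pos_e;
    t.sent.start = sent_s; t.sent.end = sent_e;
    t.para.start = para_s; t.para.end = para_e;
    return t;
}
```

With i = make_token( 0, 5, 0, 1, 0, 0 ) and j = make_token( 9, 12, 3, 4, 2, 2 ), the same function serves all three pairs: distance( i, j, &token::pos ) is 9 - 5, distance( i, j, &token::sent ) is 3 - 1, and distance( i, j, &token::para ) is 2 - 0.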
However, now I also want to write a function, say max(), that computes the maximum value of all the pos.start or all the pos.end or all the sent.start, etc.
If I add:
    typedef int token::start_end::*int_ptr;
I can write the function like:
    int max( list<token> const &l, token::int_ptr p ) {
        int m = numeric_limits<int>::min();
        for ( list<token>::const_iterator i = l.begin(); i != l.end(); ++i ) {
            int n = (*i).pos.*p; // NOT WHAT I WANT: It hard-codes 'pos'
            if ( n > m )
                m = n;
        }
        return m;
    }
and call it like:
    list<token> l;
    l.push_back( i );
    l.push_back( j );
    int m = max( l, &token::start_end::start );
However, as indicated in the comment above, I do not want to hard-code pos. I want the flexibility of accessing the start or end of any of pos, sent, or para, whichever is passed as a parameter to max().
I've tried several things to get this to work (tried using unions, anonymous unions, etc.) but I can't come up with a data structure that allows the flexibility both ways while having each value stored only once.
Any ideas how to organize the token struct so I can have what I want?
Attempt at clarification
Given a struct of pairs of integers, I want to be able to "slice" the data in two distinct ways:
1. By passing a pointer-to-member of a particular start/end pair so that the called function operates on any pair without knowing which pair. The caller decides which pair.
2. By passing a pointer-to-member of a particular int (i.e., only one int of any pair) so that the called function operates on any int without knowing either which int or which pair said int is from. The caller decides which int of which pair.
Another example for the latter would be to sum, say, all para.end or all sent.start.
Also, and importantly: for #2 above, I'd ideally like to pass only a single pointer-to-member to reduce the burden on the caller. Hence my attempts to figure something out using unions.
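For comparison, the two-pointer version that I would like to avoid looks roughly like the sketch below. It uses the summing example rather than max(); sum_of() is a hypothetical name, and the int_ptr typedef is placed inside token here as an assumption. It works, but the caller has to supply both the pair and the int.

```cpp
#include <cassert>
#include <list>

struct token {
    struct start_end {
        int start;
        int end;
    };
    start_end pos;
    start_end sent;
    start_end para;
    typedef start_end token::*start_end_ptr;
    typedef int start_end::*int_ptr;   // assumed to live inside token
};

// Hypothetical helper: sums one int (start or end) of one pair
// (pos, sent, or para) across every token in the list.
int sum_of( std::list<token> const &l,
            token::start_end_ptr pair, token::int_ptr field ) {
    int total = 0;
    for ( std::list<token>::const_iterator i = l.begin(); i != l.end(); ++i )
        total += ((*i).*pair).*field;   // select the pair, then the int
    return total;
}
```

A call then needs two arguments to name a single int, e.g. sum_of( l, &token::para, &token::start_end::end ), which is exactly the extra burden on the caller mentioned above.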
For #2, the struct would be optimally laid out like:
    struct token2 {
        int pos_start;
        int pos_end;
        int sent_start;
        int sent_end;
        int para_start;
        int para_end;
    };
The trick is to have token and token2 overlaid somehow with a union, but it's not apparent if/how that can be done and yet satisfy the accessibility requirements.
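To make the "overlaid" idea concrete, the sketch below only checks whether the two structs line up byte for byte on the compiler at hand; same_layout() is a hypothetical helper. I would expect it to return true on typical implementations, but that is an assumption about padding, and it says nothing about whether reading one struct through the other via a union is well-defined.

```cpp
#include <cassert>
#include <cstddef>

struct token {
    struct start_end {
        int start;
        int end;
    };
    start_end pos;
    start_end sent;
    start_end para;
};

struct token2 {
    int pos_start;
    int pos_end;
    int sent_start;
    int sent_end;
    int para_start;
    int para_end;
};

// Hypothetical helper: true when both structs have the same size and each
// pair in token begins at the same offset as its first int in token2.
bool same_layout() {
    return sizeof( token ) == sizeof( token2 )
        && offsetof( token, pos )  == offsetof( token2, pos_start )
        && offsetof( token, sent ) == offsetof( token2, sent_start )
        && offsetof( token, para ) == offsetof( token2, para_start );
}
```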