views:

721

answers:

2

I have read in few different places that using c++0x's new string literals it might be possible to compute a string's hash at compile time. However, no one seems to be ready to come out and say that it will be possible or how it would be done.

  • Is this possible?
  • What would the operator look like?

I'm particularly interested use cases like this.

void foo( const std::string& value )
{
   switch( std::hash(value) )
   {
      case "one"_hash: one(); break;
      case "two"_hash: two(); break;
      /*many more cases*/
      default: other(); break;
   }
}

Note: the compile time hash function doesn't have to look exactly as I've written it. I did my best to guess what the final solution would look like, but meta_hash<"string"_meta>::value could also be a viable solution.

+4  A: 

Note that the form shown here wasn't accepted into the standard, as noted below.

Compile time string processing is guessed to become possible through user-defined literals proposed in N2765.
As i already mentioned, i don't know of any compiler that currently implements it and without compiler support there can be only guess work.

In §2.13.7.3 and 4 of the draft we have the following:

Otherwise (S contains a literal operator template), L is treated as a call of the form
operator "" X<'c1', 'c2', ... , 'ck'>() where n is the source character sequence c1c2...ck. [Note: The sequence c1c2...ck can only contain characters from the basic source character set. —end note]

Combine that with constexpr and we should have compile time string processing.

update: i overlooked that i was reading the wrong paragraph, this form is allowed for user-defined-integer-literals and -floating-literals, but apparently not for -string-literals (§2.13.7.5).
This part of the proposal seems to have not been accepted.

That being said, with my limited glimpse at C++0x, it might look something like this (i most likely got something wrong):

template<char c, char... str>
struct hash {
    static const unsigned result = c + hash<str...>::result;
};

template<char c>
struct hash {
    static const unsigned result = c;
};

template<char... str> 
constexpr unsigned
operator "" _hash() {
    return hash<str>::result;
}

// update: probably wrong, because the above 
// form is not allowed for string-literals:    
const unsigned h = "abcd"_hash;

If Jerrys approach works, then the following should work however:

constexpr unsigned operator "" _hash(const char* s, size_t) {
    return const_hash(s);
}
Georg Fritzsche
Nice (and necessary) combination of var length templates and `constexpr` user defined literal. I'm not sure you can use a string literal as a template parameter, don't they have static linkage? (they do in C++98 at least and are therefore verboten as template parameters).
Motti
I have mixed up the paragraphs and thought that this case was an exception - sadly it doesn't appear to be so.
Georg Fritzsche
@Motti: where is gf using the string literal as a template parameter? Are you confusing passing the variadic template of chars as a string literal?
caspin
It seems from the original proposal, the template version for string literals wasn't accepted (due to concatenation issues? http://stackoverflow.com/questions/1108008/any-ideas-for-c1y/1109243#1109243 - comments) - too bad.
Georg Fritzsche
+6  A: 

At least by my reading of §7.1.5/3 and §5.19, the following might be legitimate:

unsigned int constexpr const_hash(char const *input) { 
    // really simple hash function...
    return static_cast<unsigned int>(*input) 
         && static_cast<unsigned int>(*input) + hash(input+1); 
}

This does seem to follow the basic rules in §7.1.5/3:

  1. The form is: "return expression;"
  2. Its only parameter is a pointer, which is a scalar type, and therefore a literal type.
  3. Its return is unsigned int, which is also scalar (and therefore literal).
  4. There is no implicit conversion to the return type.

There is some question whether the *inputs involve an illegal lvalue to rvalue conversion, and I'm not sure I understand the rules in §5.19/2/6/21 and §4.1 well enough to be sure about that.

Usage would be something like:

switch(std::hash(value)) {
    case const_hash("one"): one(); break;
    case const_hash("two"): two(); break;
    // ...
    default: other(); break;
}

To comply with the requirements of §5.19/2/6/2 you might have to do something like this though:

// one of the `constexpr`s is probably redundant, but I haven't figure out which.
char constexpr * constexpr v_one = "one"; 

// ....

case const_hash(v_one): one(); break;
  1. I'm using the extra 'slash' numbers to refer to unnumbered bullet points, so this is the second bullet point inside if the sixth bullet point under §5.19/2. I think I might have to talk to Pete Becker about whether it's possible to add some sort of numbers/letters/roman numerals down the hierarchy to identify pieces like this...
Jerry Coffin
Two things wrong with this. 1: Recursive calls are not allowed in `constexpr`, 2: You have no halting condition (where `*input == nullptr`) and as I understand `constexpr` you can't have one.
Motti
Jerry Coffin
I don't remember where it says that but I remember reading it somewhere (downvote cancelled)
Motti
recursive constexpr. I read some meeting minutes from 2008. There was discussion about allowing or disallowing recursive constexpr functions. The gist was the current wording didn't forbid it, and slightly implied it. The committee felt that such a powerful feature should be explicitly spelt out. That was back in 2008, I don't know what's happened since then.
caspin
I don't see it in the draft, but N2235 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2235.pdf) says in 4.1: "possibly involvingcalls of *previously defined* constant expression functions". Either that restriction was dropped or i am overlooking it in the draft.
Georg Fritzsche
Right -- I probably should have pointed out that I was going from N3000, which (I believe) is currently the most recent draft. I'm pretty sure it was forbidden at one time, but at least for now it seems to be allowed.
Jerry Coffin
I wish the other answer was correct, but it looks like this is the closest to correct. The only remaining question is if a pointer can be dereference at in a constexpr.
caspin