tags:

views:

81

answers:

1

While this seems like a duplicate of Crypt Kicker Problem, it isn't.

I've solved the problem, but I'm not all together that satisfied with my solution. The problem statement is:

A common but insecure method of encrypting text is to permute the letters of the alphabet. In other words, each letter of the alphabet is consistently replaced in the text by some other letter. To ensure that the encryption is reversible, no two letters are replaced by the same letter.

Your task is to decrypt several encoded lines of text, assuming that each line uses a different set of replacements, and that all words in the decrypted text are from a dictionary of known words.

Input

The input consists of a line containing an integer n, followed by n lowercase words, one per line, in alphabetical order. These n words compose the dictionary of words which may appear in the decrypted text. Following the dictionary are several lines of input. Each line is encrypted as described above.

There are no more than 1,000 words in the dictionary. No word exceeds 16 letters. The encrypted lines contain only lower case letters and spaces and do not exceed 80 characters in length.

Output

Decrypt each line and print it to standard output. If there are multiple solutions, any one will do. If there is no solution, replace every letter of the alphabet by an asterisk.

Sample Input

6

and

dick

jane

puff

spot

yertle

bjvg xsb hxsn xsb qymm xsb rqat xsb pnetfn

xxxx yyy zzzz www yyyy aaa bbbb ccc dddddd

Sample Output

dick and jane and puff and spot and yertle

**** *** **** *** **** *** **** *** ******


I brute forced the problem: I bucketed the dictionary into a set based on length. Then I did a recursive brute force where I tried every possible substitution based on word length and backtracked if there wasn't a match. It works, but I'm very unsatisfied with the solution. I may just be obsessing, but it seems like there should be a more elegant way to solve the problem. My code is below:

#include<iostream>
#include<algorithm>
#include<vector>
#include<sstream>
#include<string>
#include<map>
#include<set>
using namespace std;
bool Find(vector<set<string > > &dict,vector<string> &line, map<char,char> &dec,int spot){
  //Check that the end of the line hasn't been reached
  if(spot<line.size()){
    //Get the size of the current word
    int sSize=line[spot].size();
    string cand;
    cand.resize(sSize,'A');
    //Attempt to decode the current word
    for(int i=0;i<sSize;i++){
      if(dec.find(line[spot][i])!=dec.end())
         cand[i]=dec[line[spot][i]];
    }   
    //Check all strings in the dictionary of the current length                
    for(set<string>::iterator it=dict[sSize].begin();it!=dict[sSize].end();it++){
      bool notMatch=false;
      for(int i=0;i<sSize;i++)
       //A is used to signify an undecoded character, this if says if the character was
       // decoded and it does not equal to corresponding character in the word, it's not                   a match
       if(cand[i]!='A'&&cand[i]!=(*it)[i])
        notMatch=true;
     if(notMatch)
       continue;
      for(int i=0;i<sSize;i++)
      //if it is a feasible match, then add the learned characters to the decoder
    if(cand[i]=='A')
      dec.insert(pair<char,char> (line[spot][i],(*it)[i]));
      //Keep decoding
      if(Find(dict,line,dec,spot+1))
    return true;
      //If decoding failed, then remove added characters
      for(int i=0;i<sSize;i++)
    if(cand[i]=='A')
      dec.erase(line[spot][i]);
    }
    if(spot==0){
      //This means no solution was found, fill decoder with a map to astericks
      string b="qwertyuiopasdfghjklzxcvbnm";
      for(int i=0;i<b.size();i++)
    dec.insert(pair<char,char> (b[i],'*'));
    }
    return false;
  }
  return true;
}
int main(){
  int size;
  cin >> size;
  vector<set<string> > dict;
  dict.resize(17);
  string grab;
  for(int i=0;i<size;i++){
    //Bucket dictionary
    cin >> grab;
    dict[grab.size()].insert(grab);
  }
  while(getline(cin,grab)){
    stringstream in(stringstream::in |stringstream::out);
    in << grab;
    vector<string> line;
    while(in >> grab)
      line.push_back(grab);
    map<char,char> dec;
    Find(dict,line,dec,0);
    for(int i=0;i<line.size();i++){
      for(int j=0;j<line[i].size();j++)
    cout << dec[line[i][j]];
      if(i!=line.size()-1)
    cout << " ";
      else
    cout << endl;
    }
  }
}

Also, I'm not particularly interested in solutions that wouldn't work in c++. Just because it's the language I work with in the programming contest, so I'm confined to it for solving these problems. I also know that there are quite a few stylistic and minor efficiency things that I could do differently that don't concern me too much, I'm missing a break or two. Mainly I'm just wondering if there's a simpler solution, or if my implementation is over-complicating things. Thanks.

+3  A: 

I would solve this by comparing letter-patterns in the words. First I would convert the dictionary like so:

and -> 123
dick -> 1234
jane -> 1234
puff -> 1233
spot -> 1234
yertle -> 123452

This particular dictionary doesn't work too well, but the general idea is to map out the pattern formed by the letters. For example the word "letters" maps to 1233245 which is a much better example since there are multiple e's and t's.

Then I'd do the same thing to the encrypted text:

bjvg xsb hxsn xsb qymm xsb rqat xsb pnetfn -> 1234 123 1234 123 1233 123 1234 123 123452

We can do a reverse lookup and determine that second word is "and", the fifth is "puff" and ninth is "yertle". "dick", "jane" and "spot" all have the same pattern so we can't immediately tell them apart, but using the information gained from "and", "puff" and "yertle" you can fill in the rest.

Niki Yoshiuchi
It seems that you could prune the search space with this method, by narrowing down each bucket in the dictionary for each word. Which would be useful, in general, but in the confines of the problem would add complexity that isn't needed. A good thought though.
Jacob Schlather
You can also choose which word you start with more intelligently. For example, in your example dictionary you have 4 four letter words, but only 1 three letter word and only 1 six letter word. Your cipher text starts with a four letter word, but it also contains three and six letter words. If you started with those, you'd decrypt the text much faster since you wouldn't need to backtrack on those. This would of course require more information (the buckets would need to be sorted by size and you'd need to know the length of each word in your cipher text).
Niki Yoshiuchi