Alright so I am writing a function as part of a lexical analyzer that 'looks up' or searches for a match with a keyword. My lexer catches all the obvious tokens such as single and multi character operators (+ - * / > < = == etc
) (also comments and whitespace are already taken out) so I call a function after I've collected a stream of only alphanumeric characters (including underscores) into a string
, this string then needs to be matched as either a known keyword or an identifier.
So I was wondering how I might go about identifying it? I know I basically need to compare it to some list or array or something of all the built in keywords, and if it matches one return that match to it's corresponding enum value; otherwise, if there is no match, then it must be a function or variable identifier. So how should I look for matches? I read somewhere that something called a Binary Search Tree is an efficient way to do it or by using Hash Tables, problem is I've never used either so I am not sure if it's the right way. Could I possibly use a MySQL database?