How do I tokenize a string in C++?
Java has a convenient split method: String str = "The quick brown fox"; String[] results = str.split(" "); Is there an easy way to do this in C++? ...
I have a string like this: /c SomeText\MoreText "Some Text\More Text\Lol" SomeText I want to tokenize it, but I can't just split on the spaces. I've come up with a somewhat ugly parser that works, but I'm wondering if anyone has a more elegant design. This is in C#, by the way. EDIT: My ugly version, while ugly, is O(N) and may actually be ...
StringTokenizer? Convert the String to a char[] and iterate over that? Something else? ...
I have a string which is like this: this is [bracket test] "and quotes test " I'm trying to write something in Python to split it up by space while ignoring spaces within square braces and quotes. The result I'm looking for is: ['this','is','bracket test','and quotes test '] ...
I'm not sure if the title is very clear, but basically what I have to do is read a line of text from a file and split it up into 8 different string variables. Each line will have the same 8 chunks in the same order (title, author, price, etc). So for each line of text, I want to end up with 8 strings. The first problem is that the last ...
So here's what I'm looking to achieve. I would like to give my users a single google-like textbox where they can type their queries. And I would like them to be able to express semi-natural language such as "view all between 1/1/2008 and 1/2/2008" it's ok if the syntax has to be fairly structured and limited to this specific domain ...
I have a constantly growing database of keywords. I need to parse incoming text inputs (articles, feeds etc) and find which keywords from the database are present in the text. The database of keywords is much larger than the text. Since the database is constantly growing (users add more and more keywords to watch for), I figure the bes...
Basically I'm creating an indoor navigation system in J2ME. I've put the location details in a .txt file, i.e. location names and their coordinates, and edges with their respective start node, end node, and weight (length of the node). I put both details in the same file so users don't have to download multiple files to get their ma...
How do I parse tokens from an input string? For example: char *aString = "Hello world". I want the output to be: "Hello" "world" ...
I'm using jquery DynaCloud with wordCount to create a dynamic tagcloud. I have specific terms to include in the cloud (though the frequency is different for each user), and some of the terms are multiple word, or have special characters ("&", "'", " ", etc.) as part of the term. I break the terms with specific html blocks: <pre><span...
I recently started learning ANTLR and learned that a lexer and parser together can be used to construct programming languages. Other than DSLs and programming languages, have you ever directly or indirectly used lexer/parser tools (and knowledge) to solve a real-world problem? Is it possible to solve the same problem by an average progr...
I just learned about Java's Scanner class and now I'm wondering how it compares/competes with StringTokenizer and String.split. I know that StringTokenizer and String.split only work on Strings, so why would I want to use Scanner for a String? Is Scanner just intended to be one-stop shopping for splitting? ...
I have this string: %{Children^10 Health "sanitation management"^5} and I want to tokenize it into an array of hashes: [{:keywords=>"children", :boost=>10}, {:keywords=>"health", :boost=>nil}, {:keywords=>"sanitation management", :boost=>5}] I'm aware of StringScanner and the Syntax gem (http://syntax.rubyforge.org/) ...
I have a daemon that reads a configuration file in order to know where to write something. In the configuration file, a line like this exists: output = /tmp/foo/%d/%s/output Or, it may look like this: output = /tmp/foo/%s/output/%d ... or simply like this: output = /tmp/foo/%s/output ... or finally: output = /tmp/output I hav...
This problem is a challenging one. Our application allows users to post news on the homepage. That news is input via a rich text editor which allows HTML. On the homepage we want to only display a truncated summary of the news item. For example, here is the full text we are displaying, including HTML In an attempt to make a b...
Hi, this looks like homework, but please be assured that it isn't. It's just an exercise in the book we use in our C++ course; I'm trying to read ahead on pointers. The exercise tells me to split a sentence into tokens, convert each of them into pig latin, and then display them. Pig latin here is basically ...
I am writing a program which will tokenize the input text depending upon some specific rules. I am using C++ for this. Rules Letter 'a' should be converted to token 'V-A' Letter 'p' should be converted to token 'C-PA' Letter 'pp' should be converted to token 'C-PPA' Letter 'u' should be converted to token 'V-U' This is just a sample...
I've pretty much finished coding a SIC assembler for my systems programming class but I'm stumped on the tokenizing part. For example, take this line of source code: The format (free format) is: {LABEL} OPCODE {OPERAND{,X}} {COMMENT} The curls indicate that the field is optional. Also, each field must be separated by at least one sp...
I am using split() to tokenize a String separated with * following this format: name*lastName*ID*school*age % name*lastName*ID*school*age % name*lastName*ID*school*age I'm reading this from a file named "entrada.al" using this code: static void leer() { try { String ruta="entrada.al"; File myFile = new File (ruta...
I'm thinking about the tokenizer here. Each token calls a different function inside the parser. Which is more efficient: a map of std::function/boost::function objects, or a switch statement? I thank everyone in advance for their answers. ...
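For the headline question (an equivalent of Java's str.split(" ") in C++), one possible approach is sketched below using only the standard library. The function name split is a hypothetical helper, not a standard facility, and unlike Java's split this sketch drops empty tokens, so consecutive delimiters do not produce empty strings.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on a single delimiter character.
// Empty tokens (from leading, trailing, or repeated delimiters) are skipped.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    // std::getline reads up to (and consumes) the next delimiter.
    while (std::getline(stream, token, delim)) {
        if (!token.empty()) {
            tokens.push_back(token);
        }
    }
    return tokens;
}
```

Calling split("The quick brown fox", ' ') yields the four words as separate strings. This handles only single-character delimiters; quoted fields or bracketed groups, as in some of the questions above, would need a real scanner.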