Just for my own purposes, I'm trying to build a tokenizer in Java where I can define a regular grammar and have it tokenize input based on that. The StringTokenizer class is a legacy class whose use is discouraged, and I've found a couple of methods in Scanner that hint at what I want to do, but no luck yet. Does anyone know a good way of going about this?
...
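One common way to approach this is to describe each token type as a regular expression and combine them into a single alternation of named groups. The sketch below is in Python for brevity; the token names and patterns in `TOKEN_SPEC` are hypothetical and not taken from the question. The same alternation idea should carry over to `java.util.regex.Pattern` and `Matcher`, though that port is not shown here.

```python
import re

# Hypothetical token grammar: each entry is (token name, regular expression).
# Order matters: earlier entries win when more than one could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]

# One master pattern made of named groups, one group per token type.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise ValueError on unmatched input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        # lastgroup is the name of the alternative that matched.
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `tokenize("x = 42 + y1")` yields an IDENT, an OP, a NUMBER, an OP and another IDENT, with whitespace dropped.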
My database consists of 3 tables (one for storing all items, one for the tags, and one for the relation between the two):
Table: Post
Columns: PostID, Name, Desc
Table: Tag
Columns: TagID, Name
Table: PostTag
Columns: PostID, TagID
What is the best way to save a space separated string (e.g. "smart funny wonderful") into the 3 databas...
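A minimal sketch of one possible approach: split the string on whitespace, look up or create each tag, then link it to the post. The question does not say which database is in use, so SQLite (through Python's `sqlite3` module) is an assumption here, as are the function name `save_post` and the column types.

```python
import sqlite3


def save_post(conn, name, desc, tag_string):
    """Insert one post and link it to each whitespace-separated tag."""
    cur = conn.cursor()
    # "Desc" is quoted because DESC is an SQL keyword.
    cur.execute('INSERT INTO Post (Name, "Desc") VALUES (?, ?)', (name, desc))
    post_id = cur.lastrowid
    for tag in set(tag_string.split()):
        # Reuse the tag row if it already exists, otherwise create it.
        row = cur.execute("SELECT TagID FROM Tag WHERE Name = ?", (tag,)).fetchone()
        if row:
            tag_id = row[0]
        else:
            tag_id = cur.execute("INSERT INTO Tag (Name) VALUES (?)", (tag,)).lastrowid
        cur.execute("INSERT INTO PostTag (PostID, TagID) VALUES (?, ?)", (post_id, tag_id))
    conn.commit()
    return post_id


# Assumed schema mirroring the three tables described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Post (PostID INTEGER PRIMARY KEY, Name TEXT, "Desc" TEXT);
    CREATE TABLE Tag (TagID INTEGER PRIMARY KEY, Name TEXT UNIQUE);
    CREATE TABLE PostTag (PostID INTEGER, TagID INTEGER, PRIMARY KEY (PostID, TagID));
""")
save_post(conn, "first", "a post", "smart funny wonderful")
save_post(conn, "second", "another post", "funny")
```

After these two calls the Tag table holds three rows and PostTag holds four, because "funny" is stored once and linked twice.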
I have been trying to tokenize a string using a space as the delimiter, but it doesn't work. Does anyone have a suggestion as to why it doesn't work?
Edit: tokenizing using:
strtok(string, " ");
The code looks like the following:
pch = strtok(str, " ");
while (pch != NULL)
{
    printf("%s\n", pch);
    pch = strtok(NULL, " ");
}
...
I have a string to tokenize. Its form is HHmmssff where H, m, s, f are digits.
It's supposed to be tokenized into four 2-digit numbers, but I need it to also accept short-hand forms, like sff so it interprets it as 00000sff.
I wanted to use boost::tokenizer's offset_separator but it seems to work only with positive offsets and I'd lik...
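One way to read the short-hand requirement is "left-pad with zeros to eight digits, then cut into pairs". The sketch below shows that idea in Python rather than with boost::tokenizer; the function name `split_timecode` is hypothetical, and rejecting inputs longer than eight digits is an assumption.

```python
import re


def split_timecode(text):
    """Left-pad to 8 digits, then cut into (HH, mm, ss, ff) as integers."""
    if not re.fullmatch(r"[0-9]{1,8}", text):
        raise ValueError(f"expected 1 to 8 digits, got {text!r}")
    padded = text.zfill(8)  # "512" becomes "00000512"
    return tuple(int(padded[i:i + 2]) for i in range(0, 8, 2))
```

With this, `split_timecode("12345678")` gives `(12, 34, 56, 78)` and the short form `split_timecode("512")` gives `(0, 0, 5, 12)`.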
Hello,
I am looking for a clear definition of what a "tokenizer", "parser" and "lexer" are and how they are related to each other (e.g., does a parser use a tokenizer or vice versa)? I need to create a program that will go through C/H source files to extract data declarations and definitions.
I have been looking for examples and can find so...
Hello, I have been trying to get a tokenizer to work using the Boost library's tokenizer class.
I found this tutorial in the Boost documentation:
http://www.boost.org/doc/libs/1_36_0/libs/tokenizer/escaped_list_separator.htm
The problem is that I can't get the arguments to escaped_list_separator("","","");
but if I modify the boost/tokenizer.hpp...
Hi,
I'm going to implement a tokenizer in Python and I was wondering if you could offer some style advice?
I've implemented a tokenizer before in C and in Java so I'm fine with the theory; I'd just like to ensure I'm following Pythonic style and best practices.
Listing Token Types:
In Java, for example, I would have a list of fields...
I just learned about Java's Scanner class and now I'm wondering how it compares/competes with StringTokenizer and String.split. I know that StringTokenizer and String.split only work on Strings, so why would I want to use Scanner for a String? Is Scanner just intended to be one-stop shopping for splitting?
...
I need to process HTML submitted in my web application and don't want to munge the whole thing with regular expressions. What tokenizer approach and/or software should I use?
...
I am used to the C-style getchar(), but it seems like there is nothing comparable for Java. I am building a lexical analyzer, and I need to read in the input character by character.
I know I can use the scanner to scan in a token or line and parse through the token char-by-char, but that seems unwieldy for strings spanning multiple line...
Hi,
I need to convert C# code to an equivalent XML representation.
I plan to convert the C# code (C# 2.0 code snippets, no generics or nullable types) to an AST and then convert the AST to XML.
Looking for a simple lexer/parser for C# which outputs an AST.
Any pointers on converting C# code to an XML representation (which can be convert...
I have found myself designing a language for fun that is a cross between Ruby and Java, and as I work on the compiler / interpreter I find myself pondering using whitespace as a terminator, like:
class myClass extends baseClass
function someFunction(arg)
value eq firstValue
value2 eq anotherValue
x = 2
The alter...
See also: Is this a good substr() for C?
strtok() and friends skip over empty fields, and I do not know how to tell it not to skip but rather return empty in such cases.
Similar behavior from most tokenizers I could see, and don't even get me started on sscanf() (but then it never said it would work on empty fields to begin with).
I...
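The behaviour described comes from treating a run of delimiters as one separator. A tokenizer that instead cuts at every single delimiter character keeps the empty fields. Here is a small sketch of that loop, written in Python for brevity; the function name `split_keep_empty` is hypothetical. A C version could follow the same structure, for instance by scanning with `strcspn` or `strpbrk`, but that port is not shown.

```python
def split_keep_empty(text, delimiters=","):
    """Split at every single delimiter character, keeping empty fields."""
    fields = []
    start = 0
    for index, char in enumerate(text):
        if char in delimiters:
            # Close the current field even when it is empty.
            fields.append(text[start:index])
            start = index + 1
    # Whatever follows the last delimiter is the final field.
    fields.append(text[start:])
    return fields
```

For instance, `split_keep_empty("a,,b,")` returns `["a", "", "b", ""]`, where a strtok-style loop would report only `a` and `b`.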
I am playing around with the boost strings library and have just come across the awesome simplicity of the split method.
string delimiters = ",";
string str = "string, with, comma, delimited, tokens, \"and delimiters, inside a quote\"";
// If we didn't care about delimiter characters within a quoted section we could us
vector<s...
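For comparison, here is how the quoted-delimiter case can be handled with a CSV-style parser. This sketch swaps in Python's standard `csv` module instead of the Boost string library used in the question; the wrapper name `split_quoted` is hypothetical.

```python
import csv


def split_quoted(text, delimiter=","):
    """Split one line on the delimiter, keeping double-quoted sections whole."""
    reader = csv.reader(
        [text],
        delimiter=delimiter,
        quotechar='"',
        skipinitialspace=True,  # ignore the space that follows each comma
    )
    return next(reader)
```

Applied to the string from the question, the last field comes back as the single token `and delimiters, inside a quote`, with the surrounding quotes removed.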
I'm thinking about the tokenizer here.
Each token calls a different function inside the parser.
What is more efficient:
A map of std::functions/boost::functions
A switch case
I thank everyone in advance for their answer.
...
Hi,
I'm trying to teach myself some C++ from scratch at the moment.
I'm well-versed in Python, Perl and JavaScript but have only encountered C++ briefly, in a classroom setting in the past. Please excuse the naivete of my question.
I would like to split a string using a regular expression but have not had much luck finding
a cle...
How to get the same results as http://developer.yahoo.com/search/content/V1/termExtraction.html
This question has been asked quite a few times before.
http://stackoverflow.com/questions/1078766/best-approach-to-analyze-text-in-php
http://stackoverflow.com/questions/711062/what-is-a-good-keyword-extraction-web-service
http://stackoverf...
I know there are string tokenizers but is there an "int tokenizer"?
For example, I want to split the string "12 34 46" and have:
list[0]=12
list[1]=34
list[2]=46
In particular, I'm wondering if boost::tokenizer does this, although I couldn't find any examples that didn't use strings.
...
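The usual pattern is to tokenize into strings first and convert each token afterwards. A minimal sketch in Python (not Boost), with the hypothetical name `parse_ints`:

```python
def parse_ints(text):
    """Split on whitespace and convert every token to an int."""
    return [int(token) for token in text.split()]
```

`parse_ints("12 34 46")` returns `[12, 34, 46]`; a token that is not a valid integer raises `ValueError`.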
I have this string:
[a [a b] [c e f] d]
and I want a list like this
lst[0] = "a"
lst[1] = "a b"
lst[2] = "c e f"
lst[3] = "d"
My current implementation, which I don't think is elegant/pythonic, is two recursive functions (one splitting with '[' and the other with ']'), but I am sure it can be done using list comprehensions or regula...
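A possible regular-expression version, assuming the input has only one level of nesting inside the outer brackets, as in the example. The names `ITEM` and `split_bracketed` are hypothetical.

```python
import re

# Either a bracketed group (captured without its brackets) or a bare word.
ITEM = re.compile(r"\[([^\[\]]*)\]|([^\s\[\]]+)")


def split_bracketed(text):
    """Split '[a [a b] [c e f] d]' into its top-level items."""
    inner = text.strip()
    # Drop the outer pair of brackets if present.
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # Each match fills exactly one of the two groups; keep the filled one.
    return [group or word for group, word in ITEM.findall(inner)]
```

For the string above this returns `["a", "a b", "c e f", "d"]`. Deeper nesting would need a real parser or a recursive function, since a single regular expression of this kind does not track bracket depth.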
I have a .txt file with integers on each line e.g.
1
4
5
6
I want to count the occurrences in the file of the values that are in an array.
Here is an extract of my code:
String s = null;
FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr);
while ((s = br.readLine()) !=null) {
StringTokeni...
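Since each line holds a single integer, no tokenizer is strictly needed: the lines can be counted directly and the counts looked up for the values of interest. A sketch of that idea in Python (not the Java from the question), with the hypothetical name `count_wanted` and an in-memory stand-in for the file:

```python
from collections import Counter
import io


def count_wanted(lines, wanted):
    """Count how often each value in `wanted` appears, one integer per line."""
    counts = Counter(int(line) for line in lines if line.strip())
    # Counter returns 0 for values that never appeared.
    return {value: counts[value] for value in wanted}


# Stand-in for the .txt file; a real file object from open() works the same way.
sample = io.StringIO("1\n4\n5\n6\n4\n")
result = count_wanted(sample, [4, 6, 9])
```

Here `result` is `{4: 2, 6: 1, 9: 0}`.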