Does anyone know of a Java library that handles finding sentence boundaries? I'm thinking that it would be a smart StringTokenizer implementation that knows about all of the sentence terminators that languages can use.
Here's my experience with BreakIterator:
Using the example here:
I have the following Japanese:
今日はパソコンを買った。高性能のマックは早...
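For comparison with a locale-aware tool such as BreakIterator, here is a minimal Python sketch of the naive approach: split after any character from a fixed set of terminators. The terminator set and the function name `split_sentences` are assumptions made for illustration. This will mis-split abbreviations such as "Dr." and does not cover every language's punctuation.

```python
import re

# Assumed terminator set (ASCII plus a few full-width CJK marks); not exhaustive.
TERMINATORS = ".!?。！？"

# A run of non-terminators followed by any terminators that close it.
_SENTENCE = re.compile(r"[^{0}]+[{0}]*".format(re.escape(TERMINATORS)))


def split_sentences(text):
    """Return the sentences in text, with surrounding whitespace removed."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]
```

Because the terminators are just a character class, adding another language's marks means extending one string, but anything context-dependent (abbreviations, decimal points, quotes) needs a real boundary analyzer.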
My string is as follows:
smtp:[email protected];SMTP:[email protected];X400:C=US;A= ;P=Test;O=Exchange;S=Jack;G=Black;
I need back:
smtp:[email protected]
SMTP:[email protected]
X400:C=US;A= ;P=Test;O=Exchange;S=Jack;G=Black;
The problem is that the semicolons separate the addresses and are also part of the X400 address. Can anyone suggest how best to spl...
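One possible approach, sketched in Python: split only on a semicolon that is immediately followed by a known address-type prefix and a colon. The prefix list, the function name `split_addresses`, and the example.com addresses are assumptions for illustration; a real address book may contain other prefixes.

```python
import re

# Assumption: every address begins with one of these type prefixes.
ADDRESS_PREFIXES = ("smtp", "SMTP", "X400")


def split_addresses(raw):
    """Split on ';' only where the next thing is '<prefix>:'."""
    prefixes = "|".join(re.escape(p) for p in ADDRESS_PREFIXES)
    # The lookahead keeps the prefix in the following piece.
    return re.split(r";(?=(?:%s):)" % prefixes, raw)


raw = ("smtp:jack@example.com;SMTP:jack@example.com;"
       "X400:C=US;A= ;P=Test;O=Exchange;S=Jack;G=Black;")
parts = split_addresses(raw)
```

The semicolons inside the X400 address are followed by fields like `A=` or `P=`, which never match `<prefix>:`, so they are left alone.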
Pretty basic, I'm just curious how others might implement this algorithm and would like to see if there are any clever tricks to optimize the algorithm...I just had to implement this for a project that I am working on.
Given a string in CamelCase, how would you go about "spacifying" it?
e.g. given FooBarGork I want Foo Bar Gork back.
...
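A minimal sketch of one way to do this, in Python, using a zero-width regular expression that matches the position before every uppercase letter except the first character. The function name `spacify` is made up for this example.

```python
import re


def spacify(name):
    """Insert a space before each uppercase letter that is not at the start."""
    return re.sub(r"(?<!^)(?=[A-Z])", " ", name)
```

Note that this treats every capital as a word start, so an acronym like `HTTPServer` would be broken into single letters. Handling that needs a more elaborate pattern.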
Hi, I'm trying to adapt this answer
http://stackoverflow.com/questions/53849/how-do-i-tokenize-a-string-in-c#53921
to my current string problem which involves reading from a file till eof.
from this source file:
Fix grammatical or spelling errors
Clarify meaning without changing it
Correct minor mistakes
I want to create a vect...
Hello everyone,
I'm trying to do something simple here. When I execute the following code in Visual Studio 2008 using the unicode character set, xmlString is correct.
Unfortunately I need to convert the CString to an unsigned char*.
Using the code below, ucStr becomes "<" (i.e. the first character of xmlString).
How should I convert th...
I am faced with the need to pull out the information in a string of the format "blah.bleh.bloh" in ANSI C. Normally I would use strtok() to accomplish this, but since I am getting this string via strtok, and strtok is not thread-safe, I cannot use this option.
I have written a function to manually parse the string. Here is a snippet:
...
I currently have the code below to replace characters in a string, but I now need to replace characters only within the first X (in this case 3) characters and leave the rest of the string unchanged. In my example below I have 51115, but I need to replace any 5 within the first 3 characters, so I should end up with 61115.
My current code:
value = 51...
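The idea can be sketched in Python as: cut the string at position X, run the replacement on the head only, and glue the untouched tail back on. The function name `replace_in_prefix` and its parameters are hypothetical, not taken from the original code.

```python
def replace_in_prefix(value, old, new, prefix_len):
    """Replace old with new only within the first prefix_len characters."""
    head, tail = value[:prefix_len], value[prefix_len:]
    return head.replace(old, new) + tail
```

With the values from the question, `replace_in_prefix("51115", "5", "6", 3)` changes the leading 5 and leaves the trailing one alone.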
I'm writing a scanner as part of a compiler.
I'm having a major headache trying to write this one portion:
I need to be able to parse a stream of tokens and push them one by one into a vector, ignoring whitespace and tokenizing special symbols (as a simple case, let's just consider parentheses and braces).
Example:
int main(){ ...
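A rough Python sketch of this kind of scanner, assuming only identifiers, integer literals, and single-character symbols need to be recognized. The token pattern and the name `tokenize` are assumptions for illustration; a real compiler front end would also track token types and positions.

```python
import re

# Alternatives are tried left to right: whitespace, identifier,
# integer literal, bracket symbols, then any other single character.
TOKEN = re.compile(r"\s+|[A-Za-z_]\w*|\d+|[(){}]|.")


def tokenize(source):
    """Return the tokens in source as a list, dropping whitespace."""
    tokens = []
    for match in TOKEN.finditer(source):
        text = match.group()
        if not text.isspace():
            tokens.append(text)
    return tokens
```

Because parentheses and braces match on their own, `main(){` is split into four tokens even though there is no whitespace between them.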
The .NET Framework gives us the Format method:
string s = string.Format("This {0} very {1}.", "is", "funny");
// s is now: "This is very funny."
I would like an "Unformat" function, something like:
object[] params = string.Unformat("This {0} very {1}.", "This is very funny.");
// params is now: ["is", "funny"]
I know something simi...
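One way to approach an "Unformat" is to turn the format string into a regular expression: escape the literal text and replace each `{n}` placeholder with a capture group. The Python sketch below does that; the name `unformat` and the choice to return `None` on a mismatch are assumptions. It only handles plain `{n}` placeholders and returns strings, since the original types cannot be recovered.

```python
import re


def unformat(template, text):
    """Recover the placeholder values of template from text, or None."""
    # re.split with a group yields: literal, index, literal, index, ...
    parts = re.split(r"\{(\d+)\}", template)
    pattern = ""
    indices = []
    for position, part in enumerate(parts):
        if position % 2 == 0:
            pattern += re.escape(part)   # literal text
        else:
            indices.append(int(part))    # placeholder number
            pattern += "(.+?)"
    match = re.fullmatch(pattern, text, flags=re.DOTALL)
    if match is None:
        return None
    values = [None] * (max(indices) + 1) if indices else []
    for index, value in zip(indices, match.groups()):
        values[index] = value
    return values
```

The result is inherently ambiguous when two placeholders are adjacent or when a value contains the literal text that follows it; the non-greedy groups simply pick the shortest match.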
Or: Should I optimize my string-operations in PHP? I tried to ask PHP's manual about it, but I didn't get any hints to anything.
...
Hi all,
The problem is with the conversion of the text box value, but why?
string strChar = strTest.Substring(0, Convert.ToInt16(txtBoxValue.Text));
Error is: Input string was not in a correct format.
Thanks all.
...
Hi,
I wonder if there is an easy way to check if two strings match by excluding certain characters in the strings. See example below.
I can easily write such a method by writing a regular expression to find the "wild card" characters, and replace them with a common character. Then compare the two strings str1 and str2. I am not looking...
Hi,
I'm trying to make a method that returns a string of words in opposite order.
E.g. "The rain in Spain falls mostly on the"
would return: "the on mostly falls Spain in rain The"
For this I am not supposed to use any built-in Java classes, just basic Java.
So far I have:
lastSpace = stringIn.length();
for (int i = strin...
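The backwards scan described here can be sketched as follows. This is in Python rather than Java, uses only indexing and slicing (no split or reverse helpers), and the function name `reverse_words` is made up for the example. It assumes words are separated by single spaces.

```python
def reverse_words(sentence):
    """Return the words of sentence in opposite order."""
    result = ""
    end = len(sentence)  # one past the last character of the current word
    # Walk backwards; i == -1 acts as a virtual space before the first word.
    for i in range(len(sentence) - 1, -2, -1):
        if i == -1 or sentence[i] == " ":
            if i + 1 < end:              # a non-empty word ends here
                if result:
                    result += " "
                result += sentence[i + 1:end]
            end = i
    return result
```

The same loop translates to Java with `charAt` and `substring`; the key point is remembering where the current word ends while scanning for the space that starts it.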
StringBuilder has a reputation as being a faster string manipulation tool than simply concatenating strings. Whether or not that's true, I'm left wondering about the results of StringBuilder operations and the strings they produce.
A quick jaunt into Reflector shows that StringBuilder.ToString() doesn't always return a copy, sometimes ...
I'm trying to find out the best practice when removing characters from the start of a string.
In some languages you can use MID without a length parameter; however, in TSQL the length is required.
Considering the following code, what is the best practice? (The hex string is variable length.)
DECLARE @sHex VARCHAR(66)
SET @sHex = '0x7E2...
I'm trying to find a way to place a colon ( : ) into a string, two characters from the end of the string.
Examples of $meetdays:
1200 => 12:00
900 => 9:00
1340 => 13:40
Not sure if this should be a regular expression or just another function that I'm not aware of.
...
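No regular expression is strictly needed for this; slicing from the end of the string is enough. A minimal sketch in Python (the question is about PHP, where `substr` with a negative offset plays the same role); the name `insert_colon` is made up, and the input is assumed to have at least three digits.

```python
def insert_colon(digits):
    """Insert ':' two characters from the end, e.g. for HMM or HHMM times."""
    return digits[:-2] + ":" + digits[-2:]
```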
In Java is there a way to check the condition:
"Does this single character appear at all in string x"
without using a loop?
Thank you,
...
I have the following string:
"Look on http://www.google.com".
I need to convert it to:
"Look on http://www.google.com"
The original string can have more than 1 URL string.
How do I do this in php?
Thanks
...
I am building a text parser using regular expressions. I need to convert all tab characters in a string to space characters. I cannot assume how many spaces a tab should encompass, otherwise I could replace a tab with, say, 4 space characters. Is there any good solution for this type of problem? I need to do this in code so I cannot use a...
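If the goal is tab-stop behaviour (each tab advances to the next multiple of the tab width, so the number of spaces depends on the column), Python's built-in `str.expandtabs` does exactly that. The sketch below wraps it; the tab width is still a parameter the caller has to choose, and the name `tabs_to_spaces` is made up for the example.

```python
def tabs_to_spaces(text, tab_width=8):
    """Expand each tab to the next tab stop; the column resets at newlines."""
    return text.expandtabs(tab_width)
```

For example, with a width of 4 the tab in `"a\tbc\td"` after `a` becomes three spaces and the one after `bc` becomes two, so both land on a multiple of four.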
Is there a way to dynamically execute code contained in a string using .net 2.0, in a similar way to eval() in javascript or using sp_executeSQL in tsql?
I have a string value in a variable that I want to manipulate at some point in my application - so the code would essentially be string manipulation. I don't know what different manipu...