Basically you need to first separate the block of text into sentences. That's tricky enough, even in English since you need to look out for periods, question marks, exclamation marks and any other sentence terminators.
Then you process one sentence at a time after removing all punctuation (commas, semi-colons, colons, and so on).
Then, when you're left with an array of words, it becomes simpler:
for i = 1 to num_words-1:
for j = i+1 to num_words:
phrase = words[i through j inclusive]
store phrase
That's it, pretty simple (after initial massaging of the text block, which may not be as simple as you think).
This will give you all phrases of two or more words in every sentence.
The separation into sentences, separation into words, removal of punctuation and so on will be the hardest bit but I've already shown you some simple initial rules to follow. The rest should be added every time a block of text breaks the algorithm.
Update:
As requested, here's some Java code which gives the phrases:
public class testme {
public final static String text =
"My username is click upvote." +
" I have 4k rep on stackoverflow.";
public static void procSentence (String sent) {
System.out.println ("==========");
System.out.println ("sentence [" + sent + "]");
// Split sentence at whitspace into array.
String [] sa = sent.split("\\s+");
// Process each starting word.
for (int i = 0; i < sa.length - 1; i++) {
// Process each phrase.
for (int j = i+1; j < sa.length; j++) {
// Build the phrase.
String phrase = sa[i];
for (int k = i+1; k <= j; k++) {
phrase = phrase + " " + sa[k];
}
// This is where you have your phrase. I just
// print it out but you can do whatever you
// wish with it.
System.out.println (" " + phrase);
}
}
}
public static void main(String[] args) {
// This is the block of text to process.
String block = text;
System.out.println ("block [" + block + "]");
// Keep going until no more sentences.
while (!block.equals("")) {
// Remove leading spaces.
if (block.startsWith(" ")) {
block = block.substring(1);
continue;
}
// Find end of sentence.
int pos = block.indexOf('.');
// Extract sentence and remove it from text block.
String sentence = block.substring(0,pos);
block = block.substring(pos+1);
// Process the sentence (this is the "meat").
procSentence (sentence);
System.out.println ("block [" + block + "]");
}
System.out.println ("==========");
}
}
which outputs:
block [My username is click upvote. I have 4k rep on stackoverflow.]
==========
sentence [My username is click upvote]
My username
My username is
My username is click
My username is click upvote
username is
username is click
username is click upvote
is click
is click upvote
click upvote
block [ I have 4k rep on stackoverflow.]
==========
sentence [I have 4k rep on stackoverflow]
I have
I have 4k
I have 4k rep
I have 4k rep on
I have 4k rep on stackoverflow
have 4k
have 4k rep
have 4k rep on
have 4k rep on stackoverflow
4k rep
4k rep on
4k rep on stackoverflow
rep on
rep on stackoverflow
on stackoverflow
block []
==========
Now, keep in mind this is pretty basic Java (some might say it's C written in a Java dialect :-). It's just meant to illustrate how to output word groupings from a sentence as you asked for.
It does not do all the fancy sentence detection and punctuation removal I mentioned in the original answer.