I need to convert text entered in a textarea into a form like:

word1|word2|word3|word4|word5

How can I do this?

+3  A: 

Assuming the user enters the text into the textarea like this:

word1|word2|word3|word4|word5

and you store that in a string variable userText, then use:

var textArray = userText.split('|');
Chetan
I am asking for code that converts the text the user entered into this: "word1|word2|word3|word4|word5". I need to change any non-alphanumeric characters, spaces, tabs, etc. to a "|".
conrad
A: 

The first replace turns every run of whitespace (tabs, spaces, newlines) into a single '|' character. The second replace then removes every character that is neither a word character (letter, digit, or underscore) nor '|'. After that, you can split the text on '|' to get an array of the words.

var textIn= document.getElementById("myTextArea");
textIn.value = (textIn.value).replace(/\s+/g,'|').replace(/[^\w|]/g, '');
var textArr = textIn.value.split('|');

Also, if you don't want to actually replace the text in the textarea, you can store the result in a variable on the second line instead of assigning it back to textIn.value.
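
A minimal sketch of that variable-based variant, written as a plain function so it does not touch the textarea. The name textToWords is hypothetical, and the filter step is an addition of this sketch: leading or trailing whitespace, or punctuation standing alone between spaces, would otherwise leave empty strings in the array.

```javascript
// Hypothetical helper: same two replaces as above, but working on a string
// and returning the words instead of writing back to the textarea.
function textToWords(text) {
    var piped = text
        .replace(/\s+/g, '|')     // each run of whitespace -> one '|'
        .replace(/[^\w|]/g, '');  // drop anything that is not a word char or '|'

    // Assumption of this sketch: empty entries (from "||" or a leading or
    // trailing "|") are not wanted, so they are filtered out.
    return piped.split('|').filter(function (word) {
        return word.length > 0;
    });
}

var words = textToWords("  hello, world! - foo\n");
var joined = words.join('|'); // "hello|world|foo"
```

In the page you would call it with something like textToWords(textIn.value), which leaves the textarea content unchanged.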

kchau
... but what about punctuation (dots, commas, exclamation marks, etc.)? It should be removed, too. And what about splitting on the "|" character? There are no such characters in the source text.
Šime Vidas
Thanks, this is useful, but I will also need to remove dots, commas, question marks, etc.
conrad
@conrad, Taken care of.
kchau
A: 

Try this...

var textAreaWords=textAreaNode.value.replace(/[^\w\s]+/g,'').replace(/\s+/g,'|').split('|');

The first replace keeps only word characters (A-Za-z0-9_) and whitespace. The second replace turns each run of spaces, newlines, or tabs into a single pipe character, so multiple consecutive spaces become one pipe.
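
To make the order of the steps concrete, here is a rough sketch. The function name stripThenPipe is hypothetical, the character class uses `\s` so that tabs and newlines survive the first step, and the trim call is an extra assumption to avoid a leading or trailing pipe. Because punctuation is stripped before whitespace is collapsed, something like "a - b" does not produce a double pipe.

```javascript
// Hypothetical helper illustrating strip-then-collapse.
function stripThenPipe(text) {
    return text
        .replace(/[^\w\s]+/g, '') // 1. drop everything except word chars and whitespace
        .trim()                   // 2. assumption: no leading/trailing separator wanted
        .replace(/\s+/g, '|');    // 3. each run of whitespace -> one '|'
}

var a = stripThenPipe("one,  two\tthree\nfour!"); // "one|two|three|four"
var b = stripThenPipe("a - b");                   // "a|b"
var wordsFromA = a.split('|');                    // ["one", "two", "three", "four"]
```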

Eric
This will only match the first occurrence of `[^\w]` and `\s+` and doesn't work over newlines. (No `|` is inserted)
indieinvader
I was writing the code from memory and it was untested. I have updated the code above to include the g flag to make it a global search. After the changes, it works in Firefox for strings with multiple newline characters and multiple occurrences of "special characters".
Eric
+1  A: 

This should do the trick:

var input = textarea.value.
    replace(/\b/g, '|'). // Insert '|' at every word boundary
    replace(/\s|[^a-zA-Z0-9\|]/g, ''). // Remove whitespace and anything else that is not alphanumeric or '|'
    replace(/\|{2,}/g, '|'). // Replace repetitions of '|' (like '||') with '|'
    replace(/^\||\|$/g, ''); // Remove the leading and trailing '|'
var array = input.split('|');
indieinvader