Alright, so I'm working on a simple way of mimicking PHP's urlencode() with JS's escape() and some string replacement. I'm only looking at keyboard characters, so no extended ASCII or Unicode characters. The problem I'm running into is that the characters *, +, and / all have special meanings in RegExp, and since JavaScript's String methods take RegExp parameters, I cannot fully replicate PHP's urlencode(). So how would one replace these characters within a string using JS?

Background information:
escape() discrepancies with urlencode():
@: not converted, should be %40
&: considered an html entity, thus is escaped as %26amp%3B rather than %26
*: not converted, should be %2A
+: not converted, should be %2B
/: not converted, should be %2F
<: considered an html entity, thus is escaped as %26lt%3B rather than %3C
>: considered an html entity, thus is escaped as %26gt%3B rather than %3E
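
The list above can be checked directly. A minimal sketch, assuming an environment where the legacy global escape() is still available; the entity forms suggest the input was read from HTML source rather than typed in raw, which is an assumption:

```javascript
// Characters escape() leaves untouched but PHP's urlencode() encodes.
var untouched = escape('@*+/');      // stays '@*+/'

// A raw ampersand or angle bracket is percent-encoded as expected...
var rawAmp = escape('&');            // '%26'
var rawLt = escape('<');             // '%3C'

// ...but an HTML entity is escaped character by character.
var entityAmp = escape('&amp;');     // '%26amp%3B'

console.log(untouched, rawAmp, rawLt, entityAmp);
```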

HTML Entity conversion is as simple as

str.replace(/&amp;/g, '%26').replace(/&lt;/g, '%3C').replace(/&gt;/g, '%3E');

and @ can be replaced, as it is not a reserved RegExp character.
* represents the {0,} quantifier (zero or more)
+ represents the {1,} quantifier (one or more)
/ is the delimiter of a RegExp literal

Thank you for your time.

A: 

You can always use good ole' string array access.

var myString = "Hello this is a string. * It contains an asterisk.";

for(var i = 0; i < myString.length; i++)
{
    if(myString[i] == '*')
    {
        alert(i);
    }
}

EDIT: Lol - wrong *. Astérix is from a French cartoon... French class.

Mahir
It contains an asterisk, not an asterix.
Robusto
+2  A: 

Is there a reason not simply to escape the *, +, and / characters with backslashes in the regex?

s = s.replace( /\*/g, 'star' );
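
Extended to all three characters, that idea might look like the sketch below. The helper name encodeSpecials is made up for illustration, and the replacement values are the urlencode() codes listed in the question:

```javascript
// Hypothetical helper: percent-encodes only *, + and /.
function encodeSpecials(s) {
    return s
        .replace(/\*/g, '%2A')   // \* matches a literal asterisk
        .replace(/\+/g, '%2B')   // \+ matches a literal plus sign
        .replace(/\//g, '%2F');  // \/ keeps the slash from ending the literal
}

console.log(encodeSpecials('1*2+3/4')); // 1%2A2%2B3%2F4
```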
Ed J. Plunkett
+1  A: 

To me, chaining replaces isn't very elegant. I would try:

var symbols = {
    '@': '%40',
    '&amp;': '%26',
    '*': '%2A',
    '+': '%2B',
    '/': '%2F',
    '&lt;': '%3C',
    '&gt;': '%3E'
};
str = str.replace(/([@*+/]|&(amp|lt|gt);)/g, function (m) { return symbols[m]; });

Conveniently, this also avoids the original problem.
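
The reason it avoids the problem: inside a character class, * and + are ordinary characters, so they need no backslashes. A small sketch with a made-up sample string (the slash is escaped here only to be safe with older engines):

```javascript
// Inside [...] the quantifiers * and + lose their special meaning.
var specials = /[@*+\/]/g;

var replaced = '1*2+3/4@5'.replace(specials, function (m) {
    // Build the percent-encoding from the character code.
    return '%' + m.charCodeAt(0).toString(16).toUpperCase();
});

console.log(replaced); // 1%2A2%2B3%2F4%405
```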

Casey Hope
A: 

Sorry that I'm not using the same account, but to respond to all, here goes:

Mr. Plunkett, escaping the characters actually doesn't work in this case, for whatever reason (probably due to their reserved character status). I forgot to mention that I tried escaping them xD.

Mr. Hope, thank you for the bypass solution. I'll check that out and see if it works (it should).

Mahir (محير؟), firstly, string array access would be a very lengthy addition of code, especially when dealing with three extra characters that need to be replaced (yes, I know, it could just be an if/else if/else if, but that's still quite a bit of code). Yes, and suffice it to say that he is my favorite character in French literature.

With regards to William's response, if you do a bit of research, you'll notice that encodeURIComponent() does not actually encode a multitude of characters which are encoded by PHP's urlencode(). The object of this entire question was to build a simple replica of urlencode() within JS.
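
For comparison, the characters encodeURIComponent() leaves alone are few (! ' ( ) * ~), and it writes a space as %20 where urlencode() writes +. A rough sketch of closing that gap, not what this question settled on; the name phpUrlencode is hypothetical and only ASCII input is considered:

```javascript
// Hypothetical helper approximating PHP's urlencode() for ASCII input.
function phpUrlencode(str) {
    return encodeURIComponent(str)
        // encodeURIComponent() leaves these unescaped; urlencode() does not.
        .replace(/[!'()*~]/g, function (c) {
            return '%' + c.charCodeAt(0).toString(16).toUpperCase();
        })
        // urlencode() writes a space as '+', not '%20'.
        .replace(/%20/g, '+');
}

console.log(phpUrlencode("a b@c*d+e/f~(g)!'"));
// a+b%40c%2Ad%2Be%2Ff%7E%28g%29%21%27
```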

So thank you all for responding, it is much appreciated.

--addendum--
Mr. Hope's code works, with a single tweak:

function urlencode(str) {
   var symbols = {
      '@': '%40',
      '%26amp%3B': '%26',
      '*': '%2A',
      '+': '%2B',
      '/': '%2F',
      '%26lt%3B': '%3C',
      '%26gt%3B': '%3E'
   };
   return escape(str).replace(/([@*+/]|%26(amp|lt|gt)%3B)/g, function (m) { return symbols[m]; });
}

Obviously someone else might change the code a different way to make it work, but for me it is much more convenient to call escape() first and match the escaped entities afterwards, since that keeps escape() from re-encoding the percent signs in the replacements.
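
To illustrate the ordering issue, here is a small sketch (variable names are made up) of what happens to a replacement's percent sign when escape() runs after the replacement instead of before it:

```javascript
// Replace first, then escape: the '%' of '%40' is itself encoded as '%25'.
var replacedFirst = escape('a@b'.replace(/@/g, '%40'));

// Escape first, then replace: the replacement is left intact.
var escapedFirst = escape('a@b').replace(/@/g, '%40');

console.log(replacedFirst); // a%2540b
console.log(escapedFirst);  // a%40b
```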

Thank you all, once again.

Patrick Reilly