Hello,

T = {xmlelement,"presence",
     [{"xml:lang","en"}],
     [{xmlcdata,<<"\n">>},
      {xmlelement,"priority",[],
       [{xmlcdata,<<"5">>}]},
      {xmlcdata,<<"\n">>},
      {xmlelement,"c",
       [{"xmlns","http://jabber.org/protocol/caps"},
        {"node","http://psi-im.org/caps"},
        {"ver","0.12.1"},
        {"ext","cs ep-notify html"}],
       []},
      {xmlcdata,<<"\n">>}]}.

I want to remove all the whitespace, i.e. tabs, spaces, and newline characters. I tried the following, but it does not work:

trim_whitespace(Input) ->
    re:replace(Input, "(\r\n)*", "").
A: 

If you want to remove every match in a string, you need to pass the global option to re:replace(); without it, only the first match is replaced. Your regex also matches only \r\n sequences, not tabs or spaces. The call should probably look like this:

trim_whitespace(Input) -> re:replace(Input, "\\s+", "", [global]).
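
To make the effect of the global option concrete, here is a minimal sketch. The module name ws_demo and both function names are made up for illustration, and it assumes the input is a plain string or binary: re:replace/4 expects an iodata subject and raises a bad argument error for any other term, such as a tuple.

```erlang
-module(ws_demo).
-export([strip_first/1, strip_all/1]).

%% Hypothetical helper: without `global`, only the first run of
%% whitespace in Subject is removed.
strip_first(Subject) ->
    re:replace(Subject, "\\s+", "", [{return, list}]).

%% Hypothetical helper: with `global`, every run of whitespace is removed.
%% `{return, list}` asks for a flat string instead of iodata.
strip_all(Subject) ->
    re:replace(Subject, "\\s+", "", [global, {return, list}]).
```

Calling ws_demo:strip_all/1 on a string or binary strips spaces, tabs, and line breaks alike, because \s covers all of them.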
Matt Kane
I get the following error: socket:trim_whitespace(P). ** exception error: bad argument in function re:replace/4 called as re:replace({xmlelement,"presence", ....
Alec Smart
Sorry, I misread the man page! See my edit.
Matt Kane
A: 

All the whitespace in your question is in cdata sections - why not just filter those out of the tuple?

remove_cdata(List) when is_list(List) ->
    remove_list_cdata(List);
remove_cdata({xmlelement, Name, Attrs, Els}) ->
    {xmlelement, Name, remove_cdata(Attrs), remove_cdata(Els)}.

remove_list_cdata([]) ->
    [];
remove_list_cdata([{xmlcdata,_}|Rest]) ->
    remove_list_cdata(Rest);
remove_list_cdata([E = {xmlelement,_,_,_}|Rest]) ->
    [remove_cdata(E) | remove_list_cdata(Rest)];
remove_list_cdata([Item | Rest]) ->
    [Item | remove_list_cdata(Rest)].


remove_cdata(T) =:= 
    {xmlelement,"presence",
     [{"xml:lang","en"}],
     [{xmlelement,"priority",[],[]},
      {xmlelement,"c",
       [{"xmlns","http://jabber.org/protocol/caps"},
        {"node","http://psi-im.org/caps"},
        {"ver","0.12.1"},
        {"ext","cs ep-notify html"}],
       []}]}
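
Note that the result above also loses the "5" inside the priority element, because every cdata node is dropped. If that text should survive, a possible variant is to drop only cdata that consists entirely of whitespace. This is just a sketch: the module name xml_ws and the function names are made up, and it assumes the same {xmlelement, Name, Attrs, Children} shape as in the question.

```erlang
-module(xml_ws).
-export([strip_ws_cdata/1]).

%% Walk an {xmlelement, Name, Attrs, Children} tuple and drop only those
%% {xmlcdata, Data} children that are empty or whitespace-only.
%% Attributes are left untouched.
strip_ws_cdata({xmlelement, Name, Attrs, Children}) ->
    {xmlelement, Name, Attrs,
     [strip_ws_cdata(C) || C <- Children, not is_ws_cdata(C)]};
strip_ws_cdata(Other) ->
    Other.

%% True when the cdata content contains nothing but whitespace.
is_ws_cdata({xmlcdata, Data}) ->
    re:run(Data, "\\A\\s*\\z", [{capture, none}]) =:= match;
is_ws_cdata(_) ->
    false.
```

With this version the {xmlcdata,<<"\n">>} entries disappear, while {xmlcdata,<<"5">>} stays inside the priority element.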
archaelus
I am looking to remove all the \r\n characters. I want the tuple to appear on a single line so that I can send it to my Perl program. For that I need to remove all the \r\n from the entire tuple. Right now there is a newline character after every comma. How do I compress everything into a single line?
Alec Smart
You want to serialize that Erlang tuple into a string (with no newlines/carriage returns)?
archaelus
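
If single-line serialization is indeed the goal, one possible sketch is to format the term with ~p and a field width of 0. The line breaks after each comma come from the pretty printer, not from the tuple itself. The module name term_line and the function to_line/1 are made up for illustration, and this assumes an OTP release in which a field width of 0 for ~p means an unlimited line length.

```erlang
-module(term_line).
-export([to_line/1]).

%% Format any term as a flat string on one line.
%% Assumption: the running OTP release treats a field width of 0 in ~p
%% as "unlimited line length", so no line breaks are inserted.
to_line(Term) ->
    lists:flatten(io_lib:format("~0p", [Term])).
```

The returned string can then be written to a port or socket. A newline stored inside a binary such as <<"\n">> is printed as an escape sequence, so it does not break the line either.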