tags:

views:

374

answers:

3

Archaelus suggested in this post that writing a new format routine to handle named parameters may be a good learning exercise. So, in the spirit of learning the language I wrote a formatting routine which handles named parameters.



An Example:

1> fout:format("hello ~s{name}, ~p{one}, ~p{two}, ~p{three}~n",[{one,1},{three,3},{name,"Mike"},{two,2}]).
hello Mike, 1, 2, 3
ok



The Benchmark:

1> timer:tc(fout,benchmark_format_overhead,["hello ~s{name}, ~p{one}, ~p{two}, ~p{three}~n",[{one,1},{name,"Mike"},{three,3},{two,2}],100000]).
{421000,true}
= 4.21us per call

Although I suspect that much of this overhead is due to looping, as a calling the function with one loop yields a response in < 1us.

1> timer:tc(fout,benchmark_format_overhead,["hello ~s{name}, ~p{one}, ~p{two}, ~p{three}~n",[{one,1},{name,"Mike"},{three,3},{two,2}],1]).
{1,true}

If there is a better way of benchmarking in erlang, please let me know.



The Code: (which has been revised in accordance with Doug's suggestion)

-module(fout).

-export([format/2,benchmark_format_overhead/3]).

benchmark_format_overhead(_,_,0)->
    true;
benchmark_format_overhead(OString,OList,Loops) ->
    {FString,FNames}=parse_string(OString,ONames),
    benchmark_format_overhead(OString,OList,Loops-1).

format(OString,ONames) ->
    {FString,FNames}=parse_string(OString,ONames),
    io:format(FString,FNames).

parse_string(FormatString,Names) ->
    {F,N}=parse_format(FormatString),
    {F,substitute_names(N,Names)}.

parse_format(FS) ->
    parse_format(FS,"",[],"").

parse_format("",FormatString,ParamList,"")->
    {lists:reverse(FormatString),lists:reverse(ParamList)};
parse_format([${|FS],FormatString,ParamList,"")->
    parse_name(FS,FormatString,ParamList,"");
parse_format([$}|_FS],FormatString,_,_) ->
    throw({'unmatched } found',lists:reverse(FormatString)});
parse_format([C|FS],FormatString,ParamList,"") ->
    parse_format(FS,[C|FormatString],ParamList,"").

parse_name([$}|FS],FormatString,ParamList,ParamName) ->
    parse_format(FS,FormatString,[list_to_atom(lists:reverse(ParamName))|ParamList],"");
parse_name([${|_FS],FormatString,_,_) ->
    throw({'additional { found',lists:reverse(FormatString)});
parse_name([C|FS],FormatString,ParamList,ParamName) ->
    parse_name(FS,FormatString,ParamList,[C|ParamName]).

substitute_names(Positioned,Values) ->
    lists:map(fun(CN)->
                        case lists:keysearch(CN,1,Values) of
                            false ->
                                throw({'named parameter not found',CN,Values});
                            {_,{_,V}} ->
                                V
                        end end,
              Positioned).

As this was a learning exercise, I was hoping that those more experienced with erlang could give me tips on how to improve my code.

Cheers, Mike

+1  A: 

Without comment on the algorithm, or on use of appropriate library functions...

I would have expected to see more use of pattern matching and recursion; for example parse_character (no longer folded) might be replaced with something like:

parse_in_format ([], FmtStr, ParmStrs, ParmName) -> {FmtStr, ParmStrs};
parse_in_format ([${ | Vr], FmtStr, ParmStrs, ParmName) -> parse_in_name (Vr, FmtStr, ParmStrs, ParmName);
parse_in_format ([$} | Vr], FmtStr, ParmStrs, ParmName) -> throw() % etc.
parse_in_format ([V | Vr], FmtStr, ParmStrs, ParmName) -> parse_in_format (Vr, [V | FmtStr], ParmStrs, ParmName).

parse_in_name ([], FmtStr, ParmStrs, ParmName) -> throw() % etc.
parse_in_name ([$} | Vr], FmtStr, ParmStrs, ParmName) -> parse_in_format (Vr, FmtStr, [list_to_atom(lists:reverse(ParmName))|ParmStrs], "");
parse_in_name ([${ | Vr], FmtStr, ParmStrs, ParmName) -> throw() % etc.
parse_in_name ([V | Vr], FmtStr, ParmStrs, ParmName) -> parse_in_name (Vr, FmtStr, ParmStrs, [V | ParmName]).

Kicked off with a

parse_in_format (FormatStr,  [], [], "");
Doug Currie
Thanks Doug, I agree that that method is superior, I'm rewriting it at the moment using fewer library functions.Erlang is my first functional language so I am still transitioning into functional thinking.Thanks for your comment!
Mike Hamer
+1  A: 

If you don't know if looping overhead affect your code much you should measure it. It's simple.

-define(COLOOPS, 1000000).

-export([call_overhead/1,measure_call_overhead/0, measure_call_overhead/1]).

% returns overhead in us 
measure_call_overhead() -> measure_call_overhead(?COLOOPS).
measure_call_overhead(N) -> element(1, timer:tc(?MODULE, call_overhead, [N]))/N.

call_overhead(0)->ok;
call_overhead(N)->
    ok=nop(),
    call_overhead(N-1).

nop()->ok.

It is about 50ns on my laptop. I think this should not affect your current code so much.

Another way how to measure is using directly statistics(wall_clock) or statistics(runtime) which returns time in ms. Benefit is that you don't need export measured function. It is only cosmetics improvement.

Hynek -Pichi- Vychodil
+2  A: 

In addition to doug's suggestion, I'd avoid using atom_to_list/1 here - the substitute names code doesn't need them and generating atoms at runtime is almost always a bad idea. Strings will work perfectly well.

parse_name([$}|FS],FormatString,ParamList,ParamName) ->
    parse_format(FS,FormatString,[lists:reverse(ParamName)|ParamList],"");
parse_name([${|_FS],FormatString,_,_) ->
    throw({'additional { found',lists:reverse(FormatString)});
parse_name([C|FS],FormatString,ParamList,ParamName) ->
    parse_name(FS,FormatString,ParamList,[C|ParamName]).

I would also use proplists:get_value instead of lists:keysearch/3 - when you have a list of two element tuples {Name, Value} as we do here, using the proplists code is the way to go - it's still a little messy as we need the case statement to check for missing values so we can crash with a better error.

substitute_names(Positioned,Values) ->
    [ case proplists:get_value(Name, Values) of
          undefined -> erlang:exit({missing_parameter, Name});
          V -> V
      end
      || Name <- Positioned ].

As this is a library, it should be a replacement for io_lib, not io. This way we don't have to provide all the alternatives io offers (optional IoDevice argument and so on).

format(OString,ONames) ->
    {FString,FNames}=parse_string(OString,ONames),
    io_lib:format(FString,FNames).

All in all, solid code. If you're willing to license it under BSD or something similar, I'd quite like to add it to my web framework code Ejango.

archaelus
Sorry for the delayed reply, I've been on holiday for the past 2 weeks.Thank you for your detailed suggestions and explanations, they were most helpful.I am happy to license the code under BSD, so feel free to use it however you wish (and with whatever modifications you wish).Cheers,Mike
Mike Hamer