I've been looking into extending SAS string handling with some C functions, such as a longest common substring algorithm. Functions written in proc FCMP easily become quite inefficient.
The embedded C compiler in proc proto doesn't seem to produce the results I expect from an algorithm that I wrote and tested in Visual Studio. One thing I think I have verified is that strings passed to a C function seem to be space-padded to a length of approximately 100 characters.
Before I go on to write more code to work out where the string should end, I'd like to know: does anyone know of alternative approaches, or can anyone share general ideas about writing C functions for SAS?
Here's some code as an example:
/* C functions */
proc proto package=sasuser.funcs.sfuncs;
    /* A string length function */
    int cslen(const char *s);
    externc cslen;
    int cslen(const char *s)
    {
        int i=0;
        while (s[i++]!=0){}
        return i-1;
    }
    externcend;
    /* A char function */
    int cschar(const char *s,const int pos);
    externc cschar;
    int cschar(const char *s,const int pos)
    {
        return s[pos];
    }
    externcend;
run;
option cmplib=sasuser.funcs;
/* SAS wrappers */
proc fcmp outlib=sasuser.funcs.sfuncs;
    function slen(s $);
        val=cslen(s);
        return(val);
    endsub;
    function schar(s $,pos);
        val=cschar(s,pos);
        return(val);
    endsub;
quit;
Testing the functions with
/* Tests */
data _null_;
    length str $6.;
    str="foobar";
    len=slen(str);
    firstchar=schar(str,0);
    lastchar=schar(str,5);
    shouldbenull=schar(str,6);
    put _all_;
run;
gives
str=foobar len=91 firstchar=102 lastchar=114 shouldbenull=32 _ERROR_=0 _N_=1
EDIT: Well, it turns out that you can work around this by simply trimming the string in the wrappers, for example:
proc fcmp outlib=sasuser.funcs.sfuncs;
    function slen(s $);
        val=cslen(trim(s));
        return(val);
    endsub;
quit;
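An alternative to trimming in every wrapper might be to ignore the trailing blanks on the C side. The following is only a sketch and has not been tested inside proc proto. The names ctrimlen and cschar_trim are hypothetical, introduced here for illustration. It assumes that the padded string is still NUL-terminated (which the len=91 result above suggests) and that trailing blanks are never significant:

```c
/* Hypothetical helper: length of s with trailing blanks ignored.
   Assumes s is NUL-terminated somewhere after the padding. */
int ctrimlen(const char *s)
{
    int n = 0;
    int last = 0;   /* index just past the last non-blank character seen */
    while (s[n] != '\0') {
        if (s[n] != ' ')
            last = n + 1;
        n++;
    }
    return last;
}

/* Hypothetical bounded variant of cschar: returns 0 for any position
   outside the trimmed string instead of reading into the padding. */
int cschar_trim(const char *s, const int pos)
{
    if (pos < 0 || pos >= ctrimlen(s))
        return 0;
    return s[pos];
}
```

If these were used in place of cslen and cschar, the test above should presumably report len=6 and shouldbenull=0. Each function would still need its own prototype and externc/externcend block in proc proto, following the same pattern as the original functions.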