I've been looking into extending SAS string handling with some C functions, such as a longest common substring algorithm. Functions written in proc FCMP quickly become inefficient.

The embedded C compiler in proc proto doesn't seem to produce the results I expect after writing the algorithm in Visual Studio. One thing I think I have verified is that the strings passed to a C function seem to be space padded to a length of approximately 100 characters.

Before I go on to write more code to deduce the place where the string should end, I'd like to know if anyone knows of alternative approaches or can in general share ideas about writing C functions for SAS?

Here's some code as an example

/* C functions*/
proc proto package=sasuser.funcs.sfuncs;
    /* A string length function */
    int cslen(const char *s);
    externc cslen;
    int cslen(const char *s)
    {
        int i=0;
        while (s[i++]!=0){}
        return i-1;
    }
    externcend;
    /* A char function */
    int cschar(const char *s,const int pos);
    externc cschar;
    int cschar(const char *s,const int pos)
    {
        return s[pos];
    }
    externcend;
run;
option cmplib=sasuser.funcs;
/* SAS wrappers */
proc fcmp outlib=sasuser.funcs.sfuncs;
    function slen(s $);
        val=cslen(s);
        return(val);
    endsub;
    function schar(s $,pos);
        val=cschar(s,pos);
        return(val);
    endsub;
quit;

Testing the funcs with

/* Tests */
data _null_;
    length str $6.;
    str="foobar";
    len=slen(str);
    firstchar=schar(str,0);
    lastchar=schar(str,5);
    shouldbenull=schar(str,6);
    put _all_;
run;

gives

str=foobar len=91 firstchar=102 lastchar=114 shouldbenull=32 _ERROR_=0 _N_=1
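
The output above is consistent with the C routine receiving a blank-padded buffer whose terminating NUL sits well past the sixth character. Below is a minimal standalone C sketch of that situation, assuming an ASCII character set and a padded width of 91 (taken from the len value above). pad_with_blanks is a hypothetical helper that only simulates what SAS appears to pass; it is not something SAS provides.

```c
#include <string.h>

/* Same logic as cslen in the post: count bytes up to the first NUL. */
int cslen(const char *s)
{
    int i = 0;
    while (s[i] != 0) {
        i++;
    }
    return i;
}

/* Same logic as cschar in the post: character code at a zero-based position. */
int cschar(const char *s, int pos)
{
    return s[pos];
}

/* Hypothetical helper (an assumption, not a SAS API): copy src into buf,
   pad with blanks up to width characters, then NUL-terminate.
   buf must have room for width + 1 bytes. */
void pad_with_blanks(char *buf, size_t width, const char *src)
{
    size_t n = strlen(src);
    if (n > width) {
        n = width;
    }
    memcpy(buf, src, n);
    memset(buf + n, ' ', width - n);
    buf[width] = '\0';
}
```

Padding "foobar" to 91 characters in a 92-byte buffer and calling cslen on it gives 91, and cschar at position 6 gives 32 (a blank), which matches the values in the test output.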

EDIT: Well, it turns out that you can work around this by simply trimming the string in the wrappers, for example:

proc fcmp outlib=sasuser.funcs.sfuncs;
    function slen(s $);
        val=cslen(trim(s));
        return(val);
    endsub;
quit;
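
An alternative to trimming in every wrapper might be to have the C side ignore trailing blanks itself. This is only a sketch: cslen_trimmed is a hypothetical name, and it assumes the buffer is still NUL-terminated somewhere after the padding, as the len=91 result suggests.

```c
/* Length of s up to the first NUL, not counting trailing blanks.
   Blanks inside the string are kept; only the trailing run is ignored. */
int cslen_trimmed(const char *s)
{
    int end = 0;   /* index just past the last non-blank character seen */
    int i;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] != ' ') {
            end = i + 1;
        }
    }
    return end;
}
```

Like trim(), this treats trailing blanks as insignificant, so a value that really ends in spaces would be reported as shorter than it is.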
+1  A: 

I would contact SAS Technical Support ([email protected]) for help with PROC PROTO and how SAS passes strings into C routines.

There are other ways to access routines written in C. One is to use CALL MODULE, the MODULEN function or the MODULEC function. These routines have the ability to call functions that are stored in a .dll (or .so on Unix). A link to the Windows CALL MODULE documentation is here:

http://support.sas.com/documentation/cdl/en/hostwin/61924/HTML/default/win-func-module.htm

A link to the UNIX CALL MODULE documentation is here:

http://support.sas.com/documentation/cdl/en/hostunx/61879/HTML/default/unx-func-module.htm

Another option is to license SAS/TOOLKIT. SAS/TOOLKIT allows you to write functions, PROCs, formats, informats, and engines in C, and other languages, that can be used from SAS. Here is a page that has information about SAS/TOOLKIT:

http://www.sas.com/products/toolkit/index.html

secoskyj
Thanks, +1. For the sake of transportability and code control, I'd actually prefer to compile small functions in SAS rather than relying on DLLs.
Ville Koskinen