What is the most efficient way to drop a table in SAS?

I have a program that loops and drops a large number of tables, and I would like to know whether there is a performance difference between PROC SQL and PROC DATASETS when dropping a single table at a time.

Or is there perhaps another way?

+4  A: 

If it is reasonable to outsource to the OS, that might be fastest. Otherwise, my unscientific observations seem to suggest that drop table in proc sql is fastest. This surprised me as I expected proc datasets to be fastest.

In the code below, I create 4000 dummy data sets then try deleting them all with different methods. The first is with sql and on my system took about 11 seconds to delete the files.

The next two both use proc datasets. The first creates a delete statement for each data set and then deletes. The second just issues a blanket kill command to delete everything in the work directory (I had expected this technique to be the fastest). Both proc datasets routines reported about 20 seconds to delete all 4000 files.

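%macro create;
/* Send the log to a file named 'null' so 4000 data steps do not flood it */
proc printto log='null';run;
%do i=1 %to 4000;
/* One small dummy data set per iteration: temp1 ... temp4000 */
data temp&i;
x=1;
y="dummy";
output;run;
%end;
/* Restore the default log destination */
proc printto;run;
%mend;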

%macro delsql;
proc sql;
%do i=1 %to 4000;
drop table temp&i;
%end;
quit;
%mend;

%macro deldata1;
proc datasets library=work nolist;
   %do i=1 %to 4000;
   delete temp&i.;
   %end;
run;quit;
%mend;

%macro deldata2;
proc datasets library=work kill;
run;quit;
%mend;

option fullstimer;
%create;
%delsql;

%create;
%deldata1;

%create;
%deldata2;
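
To repeat this comparison without reading the FULLSTIMER notes off the log, a small timing wrapper is one option. This is only a sketch: the macro name timeit is hypothetical, and it assumes the create, delsql, deldata1 and deldata2 macros above are already defined.

```sas
/* Hypothetical helper: run the macro named in MAC and report wall-clock time */
%macro timeit(mac);
   %local t0 elapsed;
   %let t0 = %sysfunc(datetime());      /* start time in seconds */
   %&mac                                /* invoke the macro whose name was passed in */
   %let elapsed = %sysevalf(%sysfunc(datetime()) - &t0);
   %put NOTE: &mac took &elapsed seconds.;
%mend;

/* Recreate the dummy data sets before each timed delete */
%create;
%timeit(delsql);

%create;
%timeit(deldata1);

%create;
%timeit(deldata2);
```

The numbers will still vary with the file system and SAS configuration, so several runs per method are probably needed before drawing conclusions.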
cmjohns
How do you mean outsource to OS? Do you mean via an X command?
Bazil
Yup - and it does appear speedier, especially if you are just wiping out a whole directory. For example, this deletes all SAS data sets in the work directory using the x command: %macro osdel;options noxwait;%let p=%sysfunc(pathname(WORK,l)); x del "%mend;%osdel;
cmjohns
Correcting my earlier comment: I meant to say it deletes all SAS data sets in the work folder that start with "temp" (the prefix I used for the test in my answer).
cmjohns
I can confirm your results, cmjohns. PROC SQL: 9-13 seconds. PROC DATASETS (individual): 11-22 seconds. PROC DATASETS (KILL option): 20-29 seconds.
Martin Bøgelund
As mentioned in the other answer, you should really test. My guess is that the differences are not necessarily related to the approach/proc but more related to the system configuration and OS.
Jay Stevens
+1  A: 

Are we discussing tables or datasets?

Tables implies database tables. The fastest way to get rid of these would be the PROC SQL pass-through facility, especially if you can connect to the database once, drop all of the tables, and then disconnect.
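
As a rough sketch of that connect-once pattern, assuming a SAS/ACCESS Interface to ODBC is licensed: the data source name mydb, the alias db, and the table names below are all hypothetical placeholders, and the DROP TABLE text is passed to the database as-is, so it must be valid in that database's SQL dialect.

```sas
proc sql;
   /* Open one connection; "mydb" is a hypothetical ODBC data source name */
   connect to odbc as db (datasrc="mydb");

   /* Each EXECUTE sends its statement straight to the database */
   execute (drop table temp1) by db;
   execute (drop table temp2) by db;
   execute (drop table temp3) by db;

   /* Close the connection once all drops have been issued */
   disconnect from db;
quit;
```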

If we are discussing datasets in SAS, I would argue that proc sql and proc datasets are extremely similar. From an application standpoint, they both go through the same steps to build a system command that deletes a file. All the testing I have seen from SAS user groups or presentations has suggested that the difference between the two methods is marginal and depends on many variables.

If it is imperative that you have the absolute fastest way to drop the datasets / tables, you may just have to test it. Each install and setup of SAS is different enough to warrant testing.

AFHood
agreed - I just wondered if one method was quicker overall...
Bazil
+2  A: 

I tried to fiddle with the OS-delete approach.

Deleting with the X command cannot be recommended. It took forever!

I then tried with the system command in a datastep:

%macro delos;
data _null_;
do i=1 to 9;
delcmd="rm -f "!!trim(left(pathname("WORK","L")))!!"/temp"!!trim(left(put(i,4.)))!!"*.sas7*";
rc=system(delcmd);
end;
run;
%mend;

As you can see, I had to split my deletes into 9 separate delete commands. The reason is, I'm using wildcards, "*", and the underlying operating system (AIX) expands these to a list, which then becomes too large for it to handle...

The program basically constructs a delete command for each of the nine filegroups "temp[1-9]*.sas7*" and issues the command.

Using the create macro from cmjohns' answer to create 4000 data tables, I can delete them in only 5 seconds with this approach.

So a direct operating system delete is the fastest way to mass-delete, as I expected.

Martin Bøgelund
A: 

proc delete is another, albeit undocumented, solution.

http://www.sascommunity.org/wiki/PROC_Delete
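
For comparison with the macros in cmjohns' answer, a PROC DELETE version might look like the sketch below. The macro name delproc is hypothetical, it assumes the temp1 to temp4000 data sets from that answer exist in WORK, and it has not been timed against the other methods here.

```sas
/* Hypothetical counterpart to delsql/deldata1: one PROC DELETE step
   with every data set named in a single DATA= list */
%macro delproc;
proc delete data=
   %do i=1 %to 4000;
   work.temp&i
   %end;
   ;   /* this semicolon ends the PROC DELETE statement */
run;
%mend;

%delproc;
```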

Bazil