tags:

views:

960

answers:

6

The following snippet of code is used to find the PID of a user's terminal, by using ptree and grabbing the third PID from the results it returns. All terminal PID's are stored in a hash with the user's login as the key.

   ## If process is a TEMINAL.
   ## The command ptree is used to get the terminal's process ID.
   ## The user can then use this ID to peek the user's terminal.
   if ($PID =~ /(\w+)\s+(\d+) .+basic/) {
    $user = $1;
    if (open(PTREE, "ptree $2 |")) {
     while ($PTREE = <PTREE>) {
      if ($PTREE =~ /(\d+)\s+-pksh-ksh/) {
       $terminals{$user} = $terminals{$user} . " $1";
       last;
      }
      next;
     }
     close(PTREE);
    }
    next;
   }

Below is a sample ptree execution:

ares./home_atenas/lmcgra> ptree 29064
485   /usr/lib/inet/inetd start
  23054 /usr/sbin/in.telnetd
    23131 -pksh-ksh
      26107 -ksh
        29058 -ksh
          29064 /usr/ob/bin/basic s=61440 pgm=/usr/local/etc/logon -q -nr trans
            412   sybsrvr

I'd like to know if there is a better way to code this. This is the part of the script that takes longest to run.

Note: this code, along with other snippets, are inside a loop and are executed a couple of times.

+5  A: 

I think the main problem is that this code is in a loop. You don't need to run ptree and parse the results more than once! You need to figure out a way to run ptree once and put it into a data structure that you can use later. Probably be some kind of simple hash will suffice. You may even be able to just keep around your %terminals hash and keep reusing it.

Some nitpicks...

  • Both of your "next" statements seem unnecessary to me... you should be able to just remove them.

  • Replace

    $terminals{$user} = $terminals{$user} . " $1";
    

with:

    $terminals{$user} .= " $1";
  • Replace the bareword PTREE which you are using as a filehandle with $ptreeF or some such... using barewords became unnecessary for filehandles about 10 years ago :)

  • I don't know why your $PID variable is all caps... it could be confusing to readers of your code because it looks like there is something special about that variable, and there isn't.

JoelFan
AFAICT, ptree is called with different arguments each time around the loop.
Paul
Correct. ptree is called various time within a loop.
lamcro
Thanks for the nitpicks JoelFan.I have not used perl in more than 10 years, so any changes after that are foreign to me. I'm pretty much a noob again.
lamcro
np... actually, the second "next" might be necessary, because it's probably escaping from an outer loop that you are not showing
JoelFan
Correct. This code is used inside another loop.
lamcro
+4  A: 

I think you'll get the best performance improvement by avoiding the overhead of repeatedly executing an external command (ptree, in this case). I'd look for a CPAN module that provides a direct interface to the data structures that ptree is reading. Check the Linux:: namespace, maybe? (I'm not sure if ptree is setuid; that may complicate things.)

The above advice aside, some additional style and robustness notes based on the posted snippet only (forgive me if the larger code invalidates them):

  • I'd start by using strict, at the very least. Lexical filehandles would also be a good idea.

  • You appear to be silently ignoring the case when you cannot open() the ptree command. That could happen for many reasons, some of which I can't imagine you wanting to ignore, such as…

  • You're not using the full path to the ptree command, but rather assuming it's in your path—and that the one in your path is the right one.

John Siracusa
I was thinking of using ps to get the parents pid, but I would need to loop this to get the great-grandparent's pid. That's the one I need. Thanks.
lamcro
+3  A: 

How many users are on the system? Can you invert this? List all -pksh-ksh processes in the system along with their EUIDs, and build the map from that - that might be only one execution of ps/ptree.

Paul
Sorry, there are many users and each can have up to three terminals open. The whole script is used to find those terminals that are using a file. I use fuser to find the processes that use a file. Then use ptree to find the terminal's pid.
lamcro
I'd bet a single ps getting the data for all processes is going to be much faster, and I can't see there being any realistic problem with number of users.
ysth
+1  A: 

I just did some trivial timings here based on your script (calling "cat ptree.txt" instead of ptree itself) and confirmed my thoughts that all of your time is spent creating new sub-processes and running ptree itself. Unless you can factor away the need to call ptree (maybe there's a way to open up the connection once and reuse it, like with nslookup), you won't see any real gains.

Joe Casadonte
thanks. I will try something else.
lamcro
+2  A: 

Have you considered using 'who -u' to tell you which process is the login shell for a given tty instead of using ptree? This would simplify your search - irrespective of the other changes you should also make.

Jonathan Leffler
Thanks, but there are too many terminal processes and I need to find the ones using a file.
lamcro
+3  A: 

I was thinking of using ps to get the parents pid, but I would need to loop this to get the great-grandparent's pid. That's the one I need. Thanks. – lamcro


Sorry, there are many users and each can have up to three terminals open. The whole script is used to find those terminals that are using a file. I use fuser to find the processes that use a file. Then use ptree to find the terminal's pid. – lamcro


If you have (or can get) a list of PIDs using a file, and just need all of the grand-parents of that PID, there's an easier way, for sure.

#!perl

use warnings;
use strict;

#***** these PIDs are gotten with fuser or some other method *****
my($fpids) = [27538, 31812, 27541];

#***** get all processes, assuming linux PS *****
my($cmd) = "ps -ef";
open(PS, "$cmd |") || die qq([ERROR] Cannot open pipe from "$cmd" - $!\n);

my($processlist) = {};
while (<PS>) {
    chomp;

    my($user, $pid, $ppid, $rest) = split(/ +/, $_, 4);
    $processlist->{$pid} = $ppid;
}

close PS;

#***** lookup grandparent *****
foreach my $fpid (@$fpids) {
    my($parent) = $processlist->{$fpid} || 0;
    my($grandparent) = $processlist->{$parent} || 0;

    if ($grandparent) {
     #----- do something here with grandparent's pid -----
     print "PID:GRANDPID - $fpid:$grandparent\n";
    }
    else {
     #----- some error condition -----
     print "ERROR - Cannot determine GrandPID: $fpid ($parent)\n";
    }
}

Which for me produces:

ERROR - Cannot determine GrandPID: 27538 (1)
PID:GRANDPID - 31812:2804
PID:GRANDPID - 27541:27538
Joe Casadonte