views:

2069

answers:

8

Is there a idiomatic way of removing elements from PATH-like shell variables?

That is I want to take

PATH=/home/joe/bin:/usr/local/bin:/usr/bin:/bin:/path/to/app/bin:.

and remove or replace the /path/to/app/bin without clobbering the rest of the variable. Extra points for allowing me put new elements in arbitrary positions. The target will be recognizable by a well defined string, and may occur at any point in the list.

I know I've seen this done, and can probably cobble something together on my own, but I'm looking for a nice approach. Portability and standardization a plus.

I use bash, but example are welcome in your favorite shell as well.


The context here is one of needing to switch conveniently between multiple versions (one for doing analysis, another for working on the framework) of a large scientific analysis package which produces a couple dozen executables, has data stashed around the filesystem, and uses environment variable to help find all this stuff. I would like to write a script that selects a version, and need to be able to remove the $PATH elements relating to the currently active version and replace them with the same elements relating to the new version.


This is related to the problem of preventing repeated $PATH elemets when re-running login scripts and the like.


+3  A: 

For deleting an element you can use sed:

#!/bin/bash
NEW_PATH=$(echo -n $PATH | tr ":" "\n" | sed "/foo/d" | tr "\n" ":")
export PATH=$NEW_PATH

will delete the paths that contain "foo" from the path.

You could also use sed to insert a new line before or after a given line.

Edit: you can remove duplicates by piping through sort and uniq:

echo -n $PATH | tr ":" "\n" | sort | uniq -c | sed -n "/ 1 / s/.*1 \(.*\)/\1/p" | sed "/foo/d" | tr "\n" ":"
florin
Nice. I like the use of tr. Upvote. Still looking for something idiomatic, but this will work.
dmckee
I don't like the "sort | uniq" bit. Order matters in PATH.
dmckee
@dmckee: absolutely agree - sorting the elements in a PATH is, in general, a disaster. Specific case: /bin comes long before /usr/perl/bin, but I want Perl 5.10.0 from the latter, not Perl 5.6.x from the former.
Jonathan Leffler
A: 

The first thing to pop into my head to change just part of a string is a sed substitution.

example: if echo $PATH => "/usr/pkg/bin:/usr/bin:/bin:/usr/pkg/games:/usr/pkg/X11R6/bin" then to change "/usr/bin" to "/usr/local/bin" could be done like this:

## produces standard output file

## the "=" character is used instead of slash ("/") since that would be messy, # alternative quoting character should be unlikely in PATH

## the path separater character ":" is both removed and re-added here, # might want an extra colon after the last path

echo $PATH | sed '=/usr/bin:=/usr/local/bin:='

This solution replaces an entire path-element so might be redundant if new-element is similar.

If the new PATH'-s aren't dynamic but always within some constant set you could save those in a variable and assign as needed:

PATH=$TEMP_PATH_1; # commands ... ; \n PATH=$TEMP_PATH_2; # commands etc... ;

Might not be what you were thinking. some of the relevant commands on bash/unix would be:

pushd popd cd ls # maybe l -1A for single column; find grep which # could confirm that file is where you think it came from; env type

..and all that and more have some bearing on PATH or directories in general. The text altering part could be done any number of ways!

Whatever solution chosen would have 4 parts:

1) fetch the path as it is 2) decode the path to find the part needing changes 3) determing what changes are needed/integrating those changes 4) validation/final integration/setting the variable

waynecolvin
+1  A: 

There are a couple of relevant programs in the answers to "How to keep from duplicating path variable in csh". They concentrate more on ensuring that there are no repeated elements, but the script I provide can be used as:

export PATH=$(clnpath $head_dirs:$PATH:$tail_dirs $remove_dirs)

Assuming you have one or more directories in $head_dirs and one or more directories in $tail_dirs and one or more directories in $remove_dirs, then it uses the shell to concatenate the head, current and tail parts into a massive value, and then removes each of the directories listed in $remove_dirs from the result (not an error if they don't exist), as well as eliminating second and subsequent occurrences of any directory in the path.

This does not address putting path components into a specific position (other than at the beginning or end, and those only indirectly). Notationally, specifying where you want to add the new element, or which element you want to replace, is messy.

Jonathan Leffler
Very helpful. The repeated path issue is subsidiary to me here, but was in the back of my mind. Thanks. +1
dmckee
+1  A: 

OK, thanks to all responders. I've prepared an encapsulated version of florin's answer. The first pass looks like this:

# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
#    replace-path         PATH $PATH /exact/path/to/remove    
#    replace-path-pattern PATH $PATH <grep pattern for target path>    
#
# To replace a path:
#    replace-path         PATH $PATH /exact/path/to/remove /replacement/path   
#    replace-path-pattern PATH $PATH <target pattern> /replacement/path
#    
###############################################################################
# Finds the _first_ list element matching $2
#
#    $1 name of a shell variable to be set
#    $2 name of a variable with a path-like structure
#    $3 a grep pattern to match the desired element of $1
function path-element-by-pattern (){ 
    target=$1;
    list=$2;
    pat=$3;

    export $target=$(echo -n $list | tr ":" "\n" | grep -m 1 $pat);
    return
}

# Removes or replaces an element of $1
#
#   $1 name of the shell variable to set (i.e. PATH) 
#   $2 a ":" delimited list to work from (i.e. $PATH)
#   $2 the precise string to be removed/replaced
#   $3 the replacement string (use "" for removal)
function replace-path () {
    path=$1;
    list=$2;
    removestr=$3;
    replacestr=$4; # Allowed to be ""

    export $path=$(echo -n $list | tr ":" "\n" | sed "s|$removestr|$replacestr|" | tr "\n" ":" | sed "s|::|:|g");
    unset removestr
    return 
}

# Removes or replaces an element of $1
#
#   $1 name of the shell variable to set (i.e. PATH) 
#   $2 a ":" delimited list to work from (i.e. $PATH)
#   $2 a grep pattern identifying the element to be removed/replaced
#   $3 the replacement string (use "" for removal)
function replace-path-pattern () {
    path=$1;
    list=$2;
    removepat=$3; 
    replacestr=$4; # Allowed to be ""

    path-element-by-pattern removestr $list $removepat;
    replace-path $path $list $removestr $replacestr;
}

Still needs error trapping in all the functions, and I should probably stick in a repeated path solution while I'm at it.

You use it by doing a . /include/path/path_tools.bash in the working script and calling on of the the replace-path* functions.


I am still open to new and/or better answers.

dmckee
Incidentally, while bash may allow the hyphens in the function name, other shells do not.
Jonathan Leffler
See my second 'answer' (actually a commentary on the solution).
Jonathan Leffler
+3  A: 

Addressing the proposed solution from dmckee:

  1. While some versions of Bash may allow hyphens in function names, others (MacOS X) do not.
  2. I don't see a need to use return immediately before the end of the function.
  3. I don't see the need for all the semi-colons.
  4. I don't see why you have path-element-by-pattern export a value. Think of export as equivalent to setting (or even creating) a global variable - something to be avoided whenever possible.
  5. I'm not sure what you expect 'replace-path PATH $PATH /usr' to do, but it does not do what I would expect.

Consider a PATH value that starts off containing:

.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin

The result I got (from 'replace-path PATH $PATH /usr') is:

.
/Users/jleffler/bin
/local/postgresql/bin
/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/local/bin
/bin
/bin
/sw/bin
/sbin
/sbin

I would have expected to get my original path back since /usr does not appear as a (complete) path element, only as part of a path element.

This can be fixed in replace-path by modifying one of the sed commands:

export $path=$(echo -n $list | tr ":" "\n" | sed "s:^$removestr\$:$replacestr:" |
               tr "\n" ":" | sed "s|::|:|g")

I used ':' instead of '|' to separate parts of the substitute since '|' could (in theory) appear in a path component, whereas by definition of PATH, a colon cannot. I observe that the second sed could eliminate the current directory from the middle of a PATH. That is, a legitimate (though perverse) value of PATH could be:

PATH=/bin::/usr/local/bin

After processing, the current directory would no longer be on the PATH.

A similar change to anchor the match is appropriate in path-element-by-pattern:

export $target=$(echo -n $list | tr ":" "\n" | grep -m 1 "^$pat\$")

I note in passing that grep -m 1 is not standard (it is a GNU extension, also available on MacOS X). And, indeed, the-n option for echo is also non-standard; you would be better off simply deleting the trailing colon that is added by virtue of converting the newline from echo into a colon. Since path-element-by-pattern is used just once, has undesirable side-effects (it clobbers any pre-existing exported variable called $removestr), it can be replaced sensibly by its body. This, along with more liberal use of quotes to avoid problems with spaces or unwanted file name expansion, leads to:

# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
#    replace_path         PATH $PATH /exact/path/to/remove
#    replace_path_pattern PATH $PATH <grep pattern for target path>
#
# To replace a path:
#    replace_path         PATH $PATH /exact/path/to/remove /replacement/path
#    replace_path_pattern PATH $PATH <target pattern> /replacement/path
#
###############################################################################

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 a ":" delimited list to work from (e.g. $PATH)
#   $3 the precise string to be removed/replaced
#   $4 the replacement string (use "" for removal)
function replace_path () {
    path=$1
    list=$2
    remove=$3
    replace=$4        # Allowed to be empty or unset

    export $path=$(echo "$list" | tr ":" "\n" | sed "s:^$remove\$:$replace:" |
                   tr "\n" ":" | sed 's|:$||')
}

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 a ":" delimited list to work from (e.g. $PATH)
#   $3 a grep pattern identifying the element to be removed/replaced
#   $4 the replacement string (use "" for removal)
function replace_path_pattern () {
    path=$1
    list=$2
    removepat=$3
    replacestr=$4        # Allowed to be empty or unset

    removestr=$(echo "$list" | tr ":" "\n" | grep -m 1 "^$removepat\$")
    replace_path "$path" "$list" "$removestr" "$replacestr"
}

I have a Perl script called echopath which I find useful when debugging problems with PATH-like variables:

#!/usr/bin/perl -w
#
#   "@(#)$Id: echopath.pl,v 1.7 1998/09/15 03:16:36 jleffler Exp $"
#
#   Print the components of a PATH variable one per line.
#   If there are no colons in the arguments, assume that they are
#   the names of environment variables.

@ARGV = $ENV{PATH} unless @ARGV;

foreach $arg (@ARGV)
{
    $var = $arg;
    $var = $ENV{$arg} if $arg =~ /^[A-Za-z_][A-Za-z_0-9]*$/;
    $var = $arg unless $var;
    @lst = split /:/, $var;
    foreach $val (@lst)
    {
            print "$val\n";
    }
}

When I run the modified solution on the test code below:

echo
xpath=$PATH
replace_path xpath $xpath /usr
echopath $xpath

echo
xpath=$PATH
replace_path_pattern xpath $xpath /usr/bin /work/bin
echopath xpath

echo
xpath=$PATH
replace_path_pattern xpath $xpath "/usr/.*/bin" /work/bin
echopath xpath

The output is:

.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin

.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/work/bin
/bin
/sw/bin
/usr/sbin
/sbin

.
/Users/jleffler/bin
/work/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin

This looks correct to me - at least, for my definition of what the problem is.

I note that echopath LD_LIBRARY_PATH evaluates $LD_LIBRARY_PATH. It would be nice if your functions were able to do that, so the user could type:

replace_path PATH /usr/bin /work/bin

That can be done by using:

list=$(eval echo '$'$path)

This leads to this revision of the code:

# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
#    replace_path         PATH /exact/path/to/remove
#    replace_path_pattern PATH <grep pattern for target path>
#
# To replace a path:
#    replace_path         PATH /exact/path/to/remove /replacement/path
#    replace_path_pattern PATH <target pattern> /replacement/path
#
###############################################################################

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 the precise string to be removed/replaced
#   $3 the replacement string (use "" for removal)
function replace_path () {
    path=$1
    list=$(eval echo '$'$path)
    remove=$2
    replace=$3            # Allowed to be empty or unset

    export $path=$(echo "$list" | tr ":" "\n" | sed "s:^$remove\$:$replace:" |
                   tr "\n" ":" | sed 's|:$||')
}

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 a grep pattern identifying the element to be removed/replaced
#   $3 the replacement string (use "" for removal)
function replace_path_pattern () {
    path=$1
    list=$(eval echo '$'$path)
    removepat=$2
    replacestr=$3            # Allowed to be empty or unset

    removestr=$(echo "$list" | tr ":" "\n" | grep -m 1 "^$removepat\$")
    replace_path "$path" "$removestr" "$replacestr"
}

The following revised test now works too:

echo
xpath=$PATH
replace_path xpath /usr
echopath xpath

echo
xpath=$PATH
replace_path_pattern xpath /usr/bin /work/bin
echopath xpath

echo
xpath=$PATH
replace_path_pattern xpath "/usr/.*/bin" /work/bin
echopath xpath

It produces the same output as before.

Jonathan Leffler
I still need to look at this in some detail, but it appears to be a substantial improvement. Thanks. +1
dmckee
+1  A: 

Just a note that bash itself can do search and replace. It can do all the normal "once or all", cases [in]sensitive options you would expect.

From the man page:

${parameter/pattern/string}

The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If Ipattern begins with /, all matches of pattern are replaced with string. Normally only the first match is replaced. If pattern begins with #, it must match at the beginning of the expanded value of parameter. If pattern begins with %, it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted. If parameter is @ or *, the substitution operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the substitution operation is applied to each member of the array in turn, and the expansion is the resultant list.

You can also do field splitting by setting $IFS (input field separator) to the desired delimiter.

dj_segfault
Oddly enough, I had known that. It just escaped me the day I was working on that.
dmckee
+1  A: 

This is easy using awk.

Replace

{
  for(i=1;i<=NF;i++) 
      if($i == REM) 
          if(REP)
              print REP; 
          else
              continue;
      else 
          print $i; 
}

Start it using

function path_repl {
    echo $PATH | awk -F: -f rem.awk REM="$1" REP="$2" | paste -sd:
}

$ echo $PATH
/bin:/usr/bin:/home/js/usr/bin
$ path_repl /bin /baz
/baz:/usr/bin:/home/js/usr/bin
$ path_repl /bin
/usr/bin:/home/js/usr/bin

Append

Inserts at the given position. By default, it appends at the end.

{ 
    if(IDX < 1) IDX = NF + IDX + 1
    for(i = 1; i <= NF; i++) {
        if(IDX == i) 
            print REP 
        print $i
    }
    if(IDX == NF + 1)
        print REP
}

Start it using

function path_app {
    echo $PATH | awk -F: -f app.awk REP="$1" IDX="$2" | paste -sd:
}

$ echo $PATH
/bin:/usr/bin:/home/js/usr/bin
$ path_app /baz 0
/bin:/usr/bin:/home/js/usr/bin:/baz
$ path_app /baz -1
/bin:/usr/bin:/baz:/home/js/usr/bin
$ path_app /baz 1
/baz:/bin:/usr/bin:/home/js/usr/bin

Remove duplicates

This one keeps the first occurences.

{ 
    for(i = 1; i <= NF; i++) {
        if(!used[$i]) {
            print $i
            used[$i] = 1
        }
    }
}

Start it like this:

echo $PATH | awk -F: -f rem_dup.awk | paste -sd:

Validate whether all elements exist

The following will print an error message for all entries that are not existing in the filesystem, and return a nonzero value.

echo -n $PATH | xargs -d: stat -c %n

To simply check whether all elements are paths and get a return code, you can also use test:

echo -n $PATH | xargs -d: -n1 test -d
Johannes Schaub - litb
Clear and working, but I'm not sure that it is better than florin's pipeline. +1 Thanks.
dmckee
+1  A: 

Reposting my answer to What is the most elegant way to remove a path from the $PATH variable in Bash? :

#!/bin/bash
IFS=:
# convert it to an array
t=($PATH)
unset IFS
# perform any array operations to remove elements from the array
t=(${t[@]%%*usr*})
IFS=:
# output the new array
echo "${t[*]}"

or the one-liner:

PATH=$(IFS=':';t=($PATH);unset IFS;t=(${t[@]%%*usr*});IFS=':';echo "${t[*]}");
nicerobot
All bash, too. Impressive.
dmckee