I'm looking for the equivalent of a urlencode for terminal output -- I need to make sure that garbage characters I (may) print from an external source don't end up doing funky things to my terminal, so a prepackaged function to escape special character sequences would be ideal.

I'm working in Python, but anything I can readily translate works too. TIA!

A: 

You could pipe it through strings:

./command | strings

This keeps only the runs of printable characters and drops everything else.
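
Since the question mentions Python, here is a rough sketch of the same idea without the external tool. The function name printable_runs and the minimum run length of 4 are assumptions for illustration; it only approximates what strings does and is not a port of it.

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in data.

    "Printable" here means space through '~' plus tab; that choice is an
    assumption made for this sketch.
    """
    pattern = rb"[\x20-\x7e\t]{%d,}" % min_len
    return re.findall(pattern, data)


if __name__ == "__main__":
    sample = b"\x00\x01hello\x1b[0m\x00ab\x00world!\n"
    for run in printable_runs(sample):
        print(run.decode("ascii"))
```

Note that short fragments are dropped as well as the control bytes, so this loses information rather than escaping it.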

ng
I'd still like to display the characters, but in a way that won't cause side effects in the terminal itself. This is a nice back-up plan, though!
cdleary
+3  A: 

Unfortunately "terminal output" is a very poorly defined criterion for filtering (see question 418176). I would suggest simply whitelisting the characters that you want to allow (which would be most of string.printable), and replacing all others with whatever escaped format you like (\FF, %FF, etc), or even simply stripping them out.
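
A minimal sketch of that whitelist approach, assuming str input. The names ALLOWED and escape_for_terminal are made up for this example, and dropping \r, \x0b and \x0c from the whitelist is my own choice, since they can still move the cursor.

```python
import string

# Assumption: start from string.printable and remove the whitespace
# characters other than space, tab and newline.
ALLOWED = set(string.printable) - set("\r\x0b\x0c")


def escape_for_terminal(text):
    """Replace every character outside ALLOWED with a backslash escape."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in ALLOWED:
            out.append(ch)
        elif code < 0x100:
            out.append("\\x%02x" % code)
        elif code < 0x10000:
            out.append("\\u%04x" % code)
        else:
            out.append("\\U%08x" % code)
    return "".join(out)


if __name__ == "__main__":
    print(escape_for_terminal("ok\x1b[31mred\x07"))
```

The backslash itself is left alone because it is printable, so the output is safe to display but cannot be decoded back unambiguously. Escape the backslash too if you need a reversible format.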

Sparr
+2  A: 
$ ./command | cat -v

$ cat --help | grep nonprinting
-v, --show-nonprinting   use ^ and M- notation, except for LFD and TAB

Here's the same in py3k based on android/cat.c:

#!/usr/bin/env python3
"""Emulate `cat -v` behaviour.

use ^ and M- notation, except for LFD and TAB

NOTE: python exits on ^Z in stdin on Windows
NOTE: newlines handling skewed towards interactive terminal. 
      Particularly, applying the conversion twice might *not* be a no-op
"""
import fileinput, sys

def escape(data):
    # `data` is a bytes object; iterating over it yields ints in range(256)
    for b in data:
        assert 0 <= b < 0x100

        if  b in (0x09, 0x0a): # '\t\n' 
            yield b
            continue

        if  b > 0x7f: # not ascii
            yield 0x4d # 'M'
            yield 0x2d # '-'
            b &= 0x7f

        if  b < 0x20: # control char
            yield 0x5e # '^'
            b |= 0x40
        elif  b == 0x7f:
            yield 0x5e # '^'
            yield 0x3f # '?'
            continue

        yield b

if __name__ == '__main__':
    write_bytes = sys.stdout.buffer.write
    for line in fileinput.input(mode="rb"):
        # escape() is a generator; write() needs a bytes object
        write_bytes(bytes(escape(line)))

Example:

$ perl -e"print map chr,0..0xff" > bytes.bin 
$ cat -v bytes.bin  > cat-v.out 
$ python30 cat-v.py bytes.bin > python.out
$ diff -s cat-v.out python.out 

It prints:

Files cat-v.out and python.out are identical
J.F. Sebastian
Very nice, thanks for porting that/pointing out the implementation.
cdleary
+1  A: 

If logging or printing debugging output, I usually use repr() to get a harmless printable version of an object, including strings. This may or may not be what you wanted; the cat --show-nonprinting method others have used in other answers is better for lots of multi-line output.

x = get_weird_data()
print(repr(x))
Teddy