I'm confused by how the length of a string is calculated when expandtabs is used. I thought expandtabs replaces each tab with a fixed number of spaces (the default being 8 spaces per tab). However, when I ran it on strings of varying lengths and with varying numbers of tabs, the resulting length was different from what I expected (i.e., the string length didn't always increase by 8 for each instance of "\t").

Below is a detailed script output, with comments explaining what I thought the result of the preceding command should be. Would someone please explain how the length is calculated when expandtabs is used?

IDLE 2.6.5     
>>> s = '\t'
>>> print len(s)
1
>>> #the length of the string without expandtabs was one (1 tab counted as a single character), as expected.
>>> print len(s.expandtabs())
8
>>> #the length of the string with expandtabs was eight (1 tab counted as eight spaces).
>>> s = '\t\t'
>>> print len(s)
2
>>> #the length of the string without expandtabs was 2 (2 tabs, each counted as a single character).
>>> print len(s.expandtabs())
16
>>> #the length of the string with expandtabs was 16 (2 tabs counted as 8 spaces each).
>>> s = 'abc\tabc'
>>> print len(s)
7
>>> #the length of the string without expandtabs was seven (6 characters and 1 tab counted as a single character).
>>> print len(s.expandtabs())
11
>>> #the length of the string with expandtabs was NOT 14 (6 characters and one 8-space tab).
>>> s = 'abc\tabc\tabc'
>>> print len(s)
11
>>> #the length of the string without expandtabs was 11 (9 characters and 2 tabs, each counted as a single character).
>>> print len(s.expandtabs())
19
>>> #the length of the string with expandtabs was NOT 25 (9 characters and two 8-space tabs).
>>>
+1  A: 

Just as when you type a tab in a text editor, each tab character pads the string up to the next multiple of 8 (the default tab size) rather than adding a fixed 8 spaces.

So:

  • '\t' by itself is 8, obviously.
  • '\t\t' is 16.
  • 'abc\tabc' starts at 3 characters, then a tab pushes it up to 8, and then the last 'abc' pushes it from 8 to 11.
  • 'abc\tabc\tabc' likewise starts at 3, tab bumps it to 8, another 'abc' goes to 11, then another tab pushes it to 16, and the final 'abc' brings the length to 19.
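
The walk-through above can be sketched as a small function. This is only an illustration of the column arithmetic, not how Python implements it internally; the name expand_tabs_sketch is hypothetical, and the sketch assumes tabsize is greater than 0.

```python
def expand_tabs_sketch(text, tabsize=8):
    """Illustrative re-implementation of str.expandtabs (assumes tabsize > 0)."""
    out = []
    column = 0  # current column within the line
    for ch in text:
        if ch == '\t':
            # Pad up to the next multiple of tabsize (always at least 1 space).
            pad = tabsize - (column % tabsize)
            out.append(' ' * pad)
            column += pad
        elif ch in '\r\n':
            # A line break resets the column counter.
            out.append(ch)
            column = 0
        else:
            out.append(ch)
            column += 1
    return ''.join(out)


# The four strings from the question, compared against the built-in method.
samples = ['\t', '\t\t', 'abc\tabc', 'abc\tabc\tabc']
lengths = [len(expand_tabs_sketch(s)) for s in samples]
matches = [expand_tabs_sketch(s) == s.expandtabs() for s in samples]
```

Here lengths comes out as [8, 16, 11, 19], the same values seen in the question, and every entry of matches is True.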
Mark Rushakoff
Great explanation of how the column pointer is pushed up to the next multiple of 8. Thanks.
Mithrill
+2  A: 

The tab increments the column pointer to the next multiple of 8:

>>> 'abc\tabc'.expandtabs().replace(' ', '*')
'abc*****abc'
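
The same trick makes two further cases visible: a non-default tab size, and a tab that falls exactly on a multiple of the tab size. A small sketch (the helper name show is made up for this example):

```python
def show(text, tabsize=8):
    # Replace the padding with '*' so that it is visible, as above.
    return text.expandtabs(tabsize).replace(' ', '*')


a = show('abc\tabc', 4)   # tab stops every 4 columns: 'abc*abc'
b = show('abcdefg\tx')    # column 7, so the tab adds 1 space: 'abcdefg*x'
c = show('abcdefgh\tx')   # already at column 8, so the tab adds a full 8
```

In the last case the tab still advances to the next stop (column 16), so a tab never expands to zero spaces when the tab size is positive.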
Ignacio Vazquez-Abrams
Thanks. I've taken the knowledge gained from the answers I received from you and Mark and expanded the Python programming Wikibook. http://en.wikibooks.org/wiki/Python_Programming/Strings#expandtabs
Mithrill
@Mithrill: Moving the column pointer to the next multiple of N instead of adding N spaces is NOT Python-specific ... it's just how tabs work; see http://en.wikipedia.org/wiki/Tab_key
John Machin