This is a very newbie question and I will probably get downvoted for it, but I honestly couldn't find the answer after at least an hour of googling. I learned how to slice strings at "exact locations", where you have to know exactly where the word ends. But I did not find any article that explained how to do it on "non-static" strings that could change.

Also, I do not want to use string.split() in this case, as it's a little overkill for what I need.

I basically have a string like this:

myString = "!save python Python is a high-level object oriented language created by Guido van Rossum."
# the format is !save [singleword] [definition]

I need to "slice" this string, but I can't figure out a proper way to do it.

I need to save the title (python in this case) and the definition from this string to variables. Something like:

title = myString[1]  # pseudocode: the second word
definition = myString[everything after myString[1]]  # pseudocode: the rest of the string

I'm not exactly sure how to do this when you have a dynamic string where you don't know where each word ends.

I would greatly appreciate some pointers on which functions/methods I should read up on to achieve this. Thank you in advance.

+1  A: 

If you have spaces between your command, title, and definition you could:

wordList = myString.split()
cmd = wordList[0] # !save
title = wordList[1] # python
definition = ' '.join(wordList[2:])  # Python is a high-level object oriented language created by Guido van Rossum.

If you really would rather not use split you could use regular expressions:

import re
m = re.match(r'(\S+)\s+(\S+)\s+(.*)', myString)  # raw string; \s+ requires whitespace between the fields
cmd = m.group(1)
title = m.group(2)
definition = m.group(3)
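
A minimal runnable sketch of the regular-expression approach, assuming the sample string from the question. The helper name parse_command and the constant COMMAND_RE are hypothetical, made up for illustration. re.match returns None when the text does not fit the format, so the sketch checks for that before reading the groups:

```python
import re

# Hypothetical names, not part of the original answer.
# Pattern: command, whitespace, title, whitespace, rest of the line.
COMMAND_RE = re.compile(r'(\S+)\s+(\S+)\s+(.*)')


def parse_command(text):
    """Return (cmd, title, definition), or None if text does not match."""
    m = COMMAND_RE.match(text)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


my_string = "!save python Python is a high-level object oriented language created by Guido van Rossum."
result = parse_command(my_string)
print(result)

# Too few fields: no match, so None comes back instead of an exception.
print(parse_command("!save"))
```

Unlike split() followed by join(), this leaves any whitespace inside the definition exactly as it was typed.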
Jesse
This is a great answer! Thank you so much! I will definitely use some of your suggestions! Thanks!
Ah, you don't want to use a saw. Here is a chainsaw instead!
Tom Leys
Note that myString.split() normalizes whitespace, so doing the join() again afterwards doesn't get you back the original "definition".
James Antill
+11  A: 

Why is split overkill?

verb, title, definition = myString.split (' ', 2)
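
To see what the second argument does, here is a small sketch using the string from the question. The guard at the end is an addition for illustration, on the assumption that some inputs may be missing the definition:

```python
my_string = "!save python Python is a high-level object oriented language created by Guido van Rossum."

# maxsplit=2 means at most two splits, so at most three pieces come back.
verb, title, definition = my_string.split(' ', 2)

print(verb)        # the command
print(title)       # the single-word title
print(definition)  # everything after the second space, unchanged

# Unpacking into three names raises ValueError when fewer than three
# pieces are present, so a guarded version might look like this:
parts = "!save python".split(' ', 2)
if len(parts) == 3:
    verb, title, definition = parts
else:
    print("expected: !save [singleword] [definition]")
```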
Joe Holloway
Would this affect the third part of his string that has multiple spaces in it?
mandroid
Nope, the second parameter of split is the maximum number of times to split the string. After the first 2 splits, the rest of the string is returned as is.
Jesse
@mandroid The second parameter tells it how many splits to perform. In this case it will stop splitting after it hits the second occurrence of the delimiter. http://docs.python.org/library/stdtypes.html#str.split When in doubt, crack open a shell and give it a try.
Joe Holloway
+2  A: 

The selected answer (after PEP8ing):

verb, title, definition = my_string.split(' ', 2)

splits on a single space. It's likely a better choice to split on runs of whitespace, just in case there are tabs or multiple spaces on either side of the title:

verb, title, definition = my_string.split(None, 2)

Also consider normalising the whitespace in the definition:

definition = ' '.join(definition.split())
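
A short sketch of the difference, using a made-up input (the variable messy is hypothetical) that contains a tab and repeated spaces:

```python
# Sample input with a tab and repeated spaces (made up for illustration).
messy = "!save\tpython   Python  is a   high-level language."

# A single-space separator leaves the tab inside the first field
# and produces an empty string for the repeated spaces.
print(messy.split(' ', 2))

# sep=None splits on runs of any whitespace, still at most twice.
verb, title, definition = messy.split(None, 2)
print(repr(definition))

# Collapse whitespace runs inside the definition to single spaces.
definition = ' '.join(definition.split())
print(repr(definition))
```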
John Machin
+1 for addressing the multiple whitespace characters/runs
mhawke
@mhawke: I think that normali[sz]e_whitespace() would be a good candidate for a str method ... the join/split caper is very inefficient, especially when no change is required; a built-in could just return a reference to the original string in that case. One annoying thing about join/split is that \xA0 (aka &nbsp;, the no-break space) is not considered whitespace in the (default) C-locale in Python 2.x.
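
For comparison, a quick check one could run, assuming Python 3 rather than the Python 2.x described in the comment. There, str.split() treats U+00A0 as whitespace, while bytes.split() only recognises ASCII whitespace:

```python
text = "a\xa0b"    # str containing a no-break space (U+00A0)
data = b"a\xa0b"   # bytes containing the single byte 0xA0

print(text.split())  # Python 3 str: U+00A0 counts as whitespace
print(data.split())  # bytes: only ASCII whitespace separates fields
```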
John Machin
Generally speaking I like your recommendations, but I took his specification literally: # the format is !save [singleword] [definition]
Joe Holloway
@jholloway: re "I took his specification literally" ... your confession comes rather late. Have you truly repented?
John Machin
@John Machin Not sure what you mean, but I'll repent for the PEP8 mistake in my solution if that makes you happy :)
Joe Holloway
@jholloway: I meant: Have you truly repented taking the OP's specification literally?
John Machin
@John Machin I guess not. My answer solves the problem as stated without inferring additional requirements around whitespace. I agree that your amendment to my solution solves additional edge cases w/o adding complexity, but what is there to 'repent' for?
Joe Holloway
@John Machin Of course it occurs to me now that maybe you're being tongue-in-cheek and I'm not reading it correctly, so apologies in advance if I'm just being dense.
Joe Holloway
@jholloway: You need to repent the dreadful sin of taking a specification literally. Users and OPs never have extra whitespace, never have many to many relationships, never have duplicate data, etc etc. Apology accepted :-)
John Machin