Hello,

Is there any way to extract what's after the @ (if any) and before the next . (if any)?

Examples:

host
host.domain.com
user@host
first.last@host
user@host.domain.com
first.last@host.domain.com

I need to get host in a variable.

Suggestions in Python? Any method is welcome.

Thanks,

EDIT: I fixed my question. Need to match host and host.blah.blah too.

+1  A: 

You can use a couple of string.split calls, the first using '@' as a separator, the second using '.'

Ofir
+1 I think it's really the best way. Neither of the regex solutions given matches a string without @.
M42
A: 

Do a split by '@', and then take a substring.

Ankit Jain
+1  A: 
>>> x = "user@host.domain.com"
>>> x.split("@")[1].split(".")[0]
'host'
>>> y = "first.last@host"
>>> y.split("@")[1].split(".")[0]
'host'
>>> 

An IndexError will be raised if there is no @ in the string.
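
A minimal sketch of one way around that IndexError, assuming the part after the last @ is what you want: index the split result with `[-1]` instead of `[1]`. The function name `host_from` is made up for illustration.

```python
def host_from(address):
    """Return the text after the last '@' (if any), up to the first '.' (if any)."""
    # split("@")[-1] is the whole string when there is no '@', so nothing is raised.
    after_at = address.split("@")[-1]
    # split(".")[0] is the whole remainder when there is no '.'.
    return after_at.split(".")[0]


for example in ["host", "host.domain.com", "user@host", "first.last@host"]:
    print(example, "->", host_from(example))
```

Each of the four strings above should come out as `host`.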

Fabian
A: 
'user@host.domain.com'.split('@')[1].split('.')[0]
CrociDB
+1  A: 
host = re.search(r"@(\w+)(\.|$)", s).group(1)
Amarghosh
Doesn't look like it'll work if there's no `.` after the `@`
John Machin
@John You're right - fixed it.
Amarghosh
Nice one. I added a `|` after the `@` in case there is no `@`
Alix
@Amarghosh: You didn't fix it properly; `$` can match `\n` at the end of the string; use `\Z` instead of `$`.
John Machin
@Alix: Why? Apart from the `\Z` thing, the pattern fails to match in other circumstances e.g. `foo@#` so just leave failure to match as being generally "invalid input".
John Machin
@John That's a feature I need: to be able to specify a server name only e.g. `foo` or `foo.bar.tld` (it was not in my question though).
Alix
@John what if the string contains multiple lines and a line ends with host name as in `"user@host\nsome other text"`
Amarghosh
Will not match the first 2 examples from the OP, i.e. the ones without @ in the string.
M42
Yup, I'm working on it. Can't get it working without an `if:` `else:` though...
Alix
@Alix, @Amarghosh: not finished with you yet :) ... using `\w` allows the underscore `_` (dodgy) but disallows the hyphen `-` (definitely not a good idea) ... see http://en.wikipedia.org/wiki/Hostname ... I suggest use of `[-\w]` or `[-A-Za-z0-9]` instead of `\w`
John Machin
@Amarghosh: If the OP wants to allow trailing rubbish (including a newline), then just use `@([-\w]+)` with no end anchor.
John Machin
@John `[-A-Za-z0-9]` is the correct one.
Amarghosh
@Amarghosh: I know that `[-A-Za-z0-9]` is the "correct" one, but out there in the real world some folks have host names with underscores in them ...
John Machin
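
For what it's worth, here is a rough sketch that pulls the points from this comment thread together: an optional `something@` prefix, the `[-A-Za-z0-9]` character class, and no `$` anchor. It assumes Python 3.4+ for `re.fullmatch`, which has to consume the entire string and so does not accept a trailing newline after the host the way `$` does. The names `HOST_RE` and `extract_host` are invented for this example, and anything that does not fit the pattern is treated as invalid input and returns `None`.

```python
import re

# Hypothetical pattern: optional "local-part@", then one host label made of
# letters, digits and hyphens, then optionally "." plus the rest of the name.
HOST_RE = re.compile(r"(?:[^@]*@)?([-A-Za-z0-9]+)(?:\.[^@]*)?")


def extract_host(s):
    """Return the first host label, or None if s does not fit the pattern."""
    m = HOST_RE.fullmatch(s)
    return m.group(1) if m else None


for s in ["host", "foo.bar.tld", "user@host", "first.last@host", "foo@#"]:
    print(repr(s), "->", extract_host(s))
```

If underscores in host names need to be tolerated, swapping the class for `[-\w]` is the obvious change.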
A: 

Here is one more solution:

re.search("^.*@([^.]*).*", str).group(1)

edit: Much better solution thanks to the comment:

re.search("@([^.]*)[.]?", str).group(1)
Klark
(1) `re.search("^.*X")` is functionally equivalent to `re.search("X")` [except in the presence of newlines] and (as implemented) is horribly slower if X is not found. (2) The trailing `.*` (matches 0 or more of any character [except a newline]) is utterly pointless.
John Machin
Thanks for the comment. Those are the great things to know.
Klark
Any trailing pattern that can match 0 characters is utterly pointless, whether it's `.*` or `[.]?` or something else -- you don't want what it matches, so don't put it in.
John Machin
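
A quick way to check the point about pointless leading and trailing pieces: the sketch below runs both patterns from this answer plus the stripped-down `@([^.]*)` and compares what group 1 captures. The helper name `captures` is made up, and the sample strings are assumed to contain exactly one @ and no newlines.

```python
import re

PATTERNS = [
    r"^.*@([^.]*).*",   # original version
    r"@([^.]*)[.]?",    # edited version
    r"@([^.]*)",        # nothing before or after the part we want
]


def captures(s):
    """Return group 1 for each pattern, in the order listed above."""
    return [re.search(p, s).group(1) for p in PATTERNS]


for s in ["user@host", "first.last@host", "user@host.domain.com"]:
    print(s, captures(s))
```

All three patterns capture the same text for these inputs, so the shortest one is enough.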
A: 
>>> s="user@host.domain.com"
>>> s[s.index("@")+1:]
'host.domain.com'
>>> s[s.index("@")+1:].split(".")[0]
'host'
ghostdog74
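
Note that `s.index("@")` raises a ValueError when there is no @ in the string. A possible variant, sketched here with `str.rpartition` and `str.partition` (the name `host_part` is only for illustration), also covers the bare `host` and `host.domain.com` cases:

```python
def host_part(s):
    """Return what follows the last '@' (if any), cut at the first '.' (if any)."""
    # rpartition returns ('', '', s) when '@' is absent, so rest is then all of s.
    _, _, rest = s.rpartition("@")
    # partition returns (rest, '', '') when '.' is absent.
    return rest.partition(".")[0]


print(host_part("host.domain.com"))
print(host_part("first.last@host"))
```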
A: 
import re

hosts = """
user@host1
first.last@host2
user@host3.domain.com
first.last@host4.domain.com
"""

print(re.findall(r"@(\w+)", hosts))

returns:

['host1', 'host2', 'host3', 'host4']
ssoler