tags:

views:

551

answers:

8

I have a name as a string, in this example "markus johansson".

I'm trying to code a program that makes 'm' and 'j' uppercase:

name = "markus johansson"

for i in range(1, len(name)):
    if name[0] == 'm':
     name[0] = "M"
    if name[i] == " ":
     count = name[i] + 1
    if count == 'j':    
            name[count] = 'J'

I'm pretty sure this should work, but it gives me this error:

File "main.py", line 5 in <module> 
   name[0] = "M" 
TypeError: 'str' object does support item assignment

I know there is a library function called .title(), but I want to do "real programming".

How do I fix this?

+5  A: 

Strings are immutable. They can't be changed. You must create a new string with the changed content. If you want to make every 'j' uppercase:

def make_uppercase_j(char):
    if char == 'j':
        return 'J'
    else:
        return char
name = "markus johansson"
''.join(make_uppercase_j(c) for c in name)
unbeknown
''.join('J' if c == 'j' else c for c in name)
J.F. Sebastian
+8  A: 

I guess that what you're trying to achieve is:

from string import capwords
capwords(name)

Which yields:

'Markus Johansson'

EDIT: OK, I see you want to tear down a open door. Here's low level implementation.

''.join([char.upper() if prev==' ' else char for char,prev in zip(name,' '+name)])
vartec
Dude, he wants "real programming not wannabe." :)
Tyson
Yeah.. Real programmers use integers ;-)
vartec
All wrong, real programmers use butterflies. Proof: http://xkcd.com/378/
Georg
Real programmers genetically engineer an army of butterflies for parallel writes to the hard drive.
Tyson
The OP asks for lower level implementation without `str.title` or `string.capwords`.
J.F. Sebastian
+1  A: 

If I understand your original algorithm correctly, this is what you want to do:

namn = list("markus johansson")

if namn[0] == 'm':
    namn[0] = "M"

count = 0

for i in range(1, len(namn)):
    if namn[i] == " ":
        count = i + 1
    if count and namn[count] == 'j':    
        namn[count] = 'J'

print ''.join(namn)

Of course, there's a million better ways ("wannabe" ways) to do what you're trying to do, like as shown in vartec's answer. :)

As it stands, your code only works for names that start with a J and an M for the first and last names, respectively.

Tyson
+7  A: 
>>> "markus johansson".title()
'Markus Johansson'

Built in string methods are the way to go.

EDIT: I see you want to re-invent the wheel. Any particular reason ? You can choose from any number of convoluted methods like:

' '.join(j[0].upper()+j[1:] for j in "markus johansson".split())

Standard Libraries are still the way to go.

Christian Witts
Yeah, I didnt know title, great tip. standard library rulez again.
Jiri
`j[1:]` -> `j[1:].lower()` or just `j.capitalize()`. `str.split()` drops all spaces therefore `' '.join(...)` does different thing compared to `str.title()`.
J.F. Sebastian
This is the best and simplest method.
Selinap
A: 

"real programming"?

I would use .title(), and I'm a real programmer.

Or I would use regular expressions

re.sub(r"(^|\s)[a-z]", lambda m: m.group(0).upper(), "this   is a set of  words")

This says "If the start of the text or a whitespace character is followed by a lower-case letter" (in English - other languages are likely not supported), then for each match convert the match text to upper-case. Since the match text is the space and the lower-case letter, this works just fine.

If you want it as low-level code then the following works. Here I only allow space as the separator (but you may want to support newline and other characters). On the other hand, "string.lowercase" is internationalized, so if you're in another locale then it will, for the most part, still work. If you don't want that then use string.ascii_lowercase.

import string

def title(s):
    # Capitalize the first character
    if s[:1] in string.lowercase:
        s = s[0].upper() + s[1:]

    # Find spaces
    offset = 0
    while 1:
        offset = s.find(" ", offset)
        # Reached the end of the string or the
        # last character is a space
        if offset == -1 or offset == len(s)-1:
            break

        if s[offset+1:offset+2] in string.lowercase:
            # Is it followed by a lower-case letter?
            s = s[:offset+1] + s[offset+1].upper() + s[offset+2:]
            # Skip the space and the letter
            offset += 2
        else:
            # Nope, so start searching for the next space
            offset += 1

    return s

To elaborate on my comment to this answer, this question can only be an exercise for curiosity's sake. Real names have special capitalization rules: the "van der" in "Johannes Diderik van der Waals" is never capitalized, "Farrah Fawcett-Majors" has the "M", and "Cathal Ó hEochaidh" uses the non-ASCII Ó and h, which modify "Eochaidh" to mean "grandson of Eochaidh".

Andrew Dalke
It has poor performance for long inputs, say 50,000 words or more. That's because you are re-creating the string for each word. It probably has worse-than-linear big-O complexity. It would probably be more efficient to build the string in a StringIO object.
dangph
BTW, an easy way to generate a long input for testing: s = 'test ' * 50000
dangph
Yup, quadratic in the number of substitutions. But 1) this isn't meant as general purpose code 2) the OP wanted character-level code 3) 1000 substitutions is still < 1ms 4) who uses title for real? esp. since i18n, "quotes", etc. mess things up, 5) title on names like "van der Waal" is incorrect.
Andrew Dalke
Oh, and yes, I was writing that code like a C programmer. ;)
Andrew Dalke
+1  A: 

Plenty of good suggestions, so I'll be in good company adding my own 2 cents :-)

I'm assuming you want something a little more generic that can handle more than just names starting with 'm' and 'j'. You'll probably also want to consider hyphenated names (like Markus Johnson-Smith) which have caps after the hyphen too.

from string import lowercase, uppercase
name = 'markus johnson-smith'

state = 0
title_name = []

for c in name:
    if c in lowercase and not state:
        c = uppercase[lowercase.index(c)]
        state = 1
    elif c in [' ', '-']:
        state = 0
    else:
        state = 1 # might already be uppercase

    title_name.append(c)

print ''.join(title_name)

Last caveat is the potential for non-ascii characters. Using the uppercase and lowercase properties of the string module is good in this case becase their contents change depending on the user's locale (ie: system-dependent, or when locale.setlocale() is called). I know you want to avoid using upper() for this exercise, and that's quite neat... as an FYI, upper() uses the locale controlled by setlocale() too, so the practice of use uppercase and lowercase is a good use of the API without getting too high-level. That said, if you need to handle, say, French names on a system running an English locale, you'll need a more robust implementation.

Jarret Hardie
'markus johnson-smith'.title() -> 'Markus Johnson-Smith'
J.F. Sebastian
`uppercase[lowercase.index(c)]` makes it O(N**2). Use `c.islower()`, `c.lower()`, etc.
J.F. Sebastian
Totally agree... just trying to help him avoid the use of library functions as per his question.
Jarret Hardie
+5  A: 

string.capwords() (defined in string.py)

# Capitalize the words in a string, e.g. " aBc  dEf " -> "Abc Def".
def capwords(s, sep=None):
    """capwords(s, [sep]) -> string

    Split the argument into words using split, capitalize each
    word using capitalize, and join the capitalized words using
    join. Note that this replaces runs of whitespace characters by
    a single space.

    """
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))

str.title() (defined in stringobject.c)

PyDoc_STRVAR(title__doc__,
"S.title() -> string\n\
\n\
Return a titlecased version of S, i.e. words start with uppercase\n\
characters, all remaining cased characters have lowercase.");
static PyObject*
string_title(PyStringObject *self)
{
    char *s = PyString_AS_STRING(self), *s_new;
    Py_ssize_t i, n = PyString_GET_SIZE(self);
    int previous_is_cased = 0;
    PyObject *newobj = PyString_FromStringAndSize(NULL, n);
    if (newobj == NULL)
     return NULL;
    s_new = PyString_AsString(newobj);
    for (i = 0; i < n; i++) {
     int c = Py_CHARMASK(*s++);
     if (islower(c)) {
      if (!previous_is_cased)
          c = toupper(c);
      previous_is_cased = 1;
     } else if (isupper(c)) {
      if (previous_is_cased)
          c = tolower(c);
      previous_is_cased = 1;
     } else
      previous_is_cased = 0;
     *s_new++ = c;
    }
    return newobj;
}

str.title() in pure Python

class String(str):
    def title(self):
        s = []
        previous_is_cased = False
        for c in self:
            if c.islower():
               if not previous_is_cased:
                  c = c.upper()
               previous_is_cased = True
            elif c.isupper():
               if previous_is_cased:
                  c = c.lower()
               previous_is_cased = True
            else:
               previous_is_cased = False
            s.append(c)
        return ''.join(s)

Example:

>>> s = ' aBc  dEf '
>>> import string
>>> string.capwords(s)
'Abc Def'
>>> s.title()
' Abc  Def '
>>> s
' aBc  dEf '
>>> String(s).title()
' Abc  Def '
>>> String(s).title() == s.title()
True
J.F. Sebastian
+1  A: 

If you're looking into more generic solution for names, you should also look at following examples:

  • John Adams-Smith
  • Joanne d'Arc
  • Jean-Luc de'Breu
  • Donatien Alphonse François de Sade

Also some parts of the names shouldn't start with capital letters, like:

  • Herbert von Locke
  • Sander van Dorn
  • Edwin van der Sad

so, if you're looking into creating a more generic solution, keep all those little things in mind.

(This would be a perfect place to run a test-driven development, with all those conditions your method/function must follow).

kender
This isn't the place for test-driven development - this is the place to review your process and figure out why the names were all coming in as lowercase, and change it so that you don't need post-hoc code. Even with post-hoc code, make sure _any_ ambiguity is resolved by humans.
Andrew Dalke