I'm just beginning with Python and know enough to know I know nothing. I would like to find alternative ways of splitting a list into a list of dicts. Example list:

data = ['ID:0:0:0',
        'Status:Ok',
        'Name:PhysicalDisk0:0:0',
        'State:Online',
        'FailurePredicted:No',
        'ID:0:0:1',
        'Status:Ok',
        'Name:PhysicalDisk0:0:1',
        'State:Online',
        'FailurePredicted:No']

Finished list of dicts:

[{'Status': 'Ok',
  'State': 'Online',
  'ID': '0:0:0',
  'FailurePredicted': 'No',
  'Name': 'PhysicalDisk0:0:0'},
 {'Status': 'Ok',
  'State': 'Online',
  'ID': '0:0:1',
  'Name': 'PhysicalDisk0:0:1',
  'FailurePredicted': 'No'}]

The list has repeating elements that require multiple dicts and the list varies in length. My code seems like it could be simplified, if only I knew Python better. My current code:

DELETED CODE It didn't work. :(

----------- File output as requested -------------------

# omreport storage pdisk controller=0
List of Physical Disks on Controller PERC 5/i Integrated (Embedded)

Controller PERC 5/i Integrated (Embedded)
ID                        : 0:0:0
Status                    : Ok
Name                      : Physical Disk 0:0:0
State                     : Online
Failure Predicted         : No
Progress                  : Not Applicable
Type                      : SAS
Capacity                  : 136.13 GB (146163105792 bytes)
Used RAID Disk Space      : 136.13 GB (146163105792 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : DELL    
Product ID                : ST3146755SS     
Revision                  : T107
Serial No.                : 3LN1EF0G            
Negotiated Speed          : Not Available
Capable Speed             : Not Available
Manufacture Day           : 07
Manufacture Week          : 24
Manufacture Year          : 2005
SAS Address               : 5000C50004731C35

ID                        : 0:0:1
Status                    : Ok
Name                      : Physical Disk 0:0:1
State                     : Online
Failure Predicted         : No
Progress                  : Not Applicable
Type                      : SAS
Capacity                  : 136.13 GB (146163105792 bytes)
Used RAID Disk Space      : 136.13 GB (146163105792 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : DELL    
Product ID                : ST3146755SS     
Revision                  : T107
Serial No.                : 3LN1EF88            
Negotiated Speed          : Not Available
Capable Speed             : Not Available
Manufacture Day           : 07
Manufacture Week          : 24
Manufacture Year          : 2005
SAS Address               : 5000C500047320B9
+1  A: 
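import re

results = []
temp = {}
for item in data:
    # Non-greedy match: split on the first colon only, since values may contain colons.
    key, value = re.search('(.*?):(.*)', item).groups()
    if key in temp:  # a repeated key starts a new dict (has_key() is gone in Python 3)
        temp = {}
    temp[key] = value
    if temp not in results:
        results.append(temp)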
mkotechno
That produces a list of lists, I would ultimately like a list of dicts for easier parsing later on.
CarpeNoctem
Edited to use dicts instead of lists.
mkotechno
+1  A: 

If you have no more info than "each repetition of a key signals the need to start a new dict", your code can be improved only marginally, for example as:

results = []
curd = {}
for x in data:
  k, v = x.split(':', 1)
  if k in curd:
    results.append(curd)
    curd = {}
  curd[k] = v
results.append(curd)

That is, there is no need to keep an intermediate list tmp rather than an intermediate dict curd. The semantics are subtly different: you start a new dict only when both key and value coincide (so an item such as 'Status:Borked' would "trample over" one being built from 'Status:Ok', for example), whereas I take the key alone as the identifier (so no trampling in such a case). Are you sure the exact semantics you implement are what you require?
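To make the key-only rule concrete, here is a minimal sketch of the same loop wrapped in a function. The names `split_on_repeated_key` and `sample` are hypothetical and chosen only for illustration. With this rule, a repeated key always opens a new dict, whatever its value is.

```python
def split_on_repeated_key(items):
    """Start a new dict whenever a key already in the current dict shows up again."""
    results = []
    current = {}
    for item in items:
        # Split on the first colon only; values such as '0:0:0' keep their colons.
        key, value = item.split(':', 1)
        if key in current:
            results.append(current)
            current = {}
        current[key] = value
    if current:  # do not emit an empty dict for empty input
        results.append(current)
    return results


# Hypothetical input: the second record has a different Status value.
sample = ['ID:0:0:0', 'Status:Ok', 'ID:0:0:1', 'Status:Borked']
records = split_on_repeated_key(sample)
print(records)
```

The 'Status:Borked' entry ends up in the second dict and leaves the 'Status:Ok' entry of the first one untouched.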

Alex Martelli
"Sure" is a strong word that I am far from; this is my first Python script. I would like a list of dicts to make iteration easier later on. I didn't want to paste in the entire script as it might have scared you helpful people away.
CarpeNoctem
Yes, "each repetition of a key signals the need to start a new dict", since the number of key:value elements in the list can change. This is what I was trying to write but wasn't quite there. Thanks.
CarpeNoctem
+6  A: 
result = [{}]
for item in data:
    key, val = item.split(":", 1)
    if key in result[-1]:
        result.append({})
    result[-1][key] = val
teepark
Note that `result[-1]` means "the last element" - using negative indices counts from the back
Smashery
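A small sketch of what the negative index does in the loop above, assuming a `result` list that already holds two dicts (the sample values are made up for illustration). It also shows that `result[-1]` refers to the dict inside the list, not to a copy, which is why assigning through it updates the list in place.

```python
# Hypothetical state of the result list partway through the loop.
result = [{'ID': '0:0:0'}, {'ID': '0:0:1'}]

last = result[-1]            # same element as result[len(result) - 1]
second_to_last = result[-2]  # counting continues backwards from the end

# Writing through result[-1] modifies the dict stored in the list.
result[-1]['Status'] = 'Ok'
print(result)
```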
So would this be what people speak of when they say "pythonic"? It's concise and it works perfectly.
CarpeNoctem
Yeah, I'd describe this as "pythonic". Rather than storing temporary variables, everything is put straight into the result list.
Smashery
Good (and readable) code, although I personally would add a couple of comments. Just a question of preference.
Khelben
A: 
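ITEMS_AMOUNT = 5  # assumes every record has exactly this many fields

ret = []
# Step through the list one record at a time; unlike a "while True" loop that
# only stops when exactly ITEMS_AMOUNT items remain, this always terminates.
for start in range(0, len(data), ITEMS_AMOUNT):
    chunk = data[start:start + ITEMS_AMOUNT]
    ret.append(dict(item.split(':', 1) for item in chunk))

print(ret)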
nandu
A: 
d = {}
c = 0
whatiwant = ["ID", "Status", "Name", "State", "Failure Predicted"]
with open("file") as f:
    for line in f:
        sline = line.rstrip().split(":", 1)
        key = sline[0].strip()
        if key == "ID":  # every "ID" line starts a new disk record
            c += 1
            d.setdefault(c, [])
        if key in whatiwant:
            d[c].append((key, ' '.join(sline[1:])))
for i, j in d.items():
    print(i, j)

output

$ ./python.py
1 [('ID', ' 0:0:0'), ('Status', ' Ok'), ('Name', ' Physical Disk 0:0:0'), ('State', ' Online'), ('Failure Predicted', ' No')]
2 [('ID', ' 0:0:1'), ('Status', ' Ok'), ('Name', ' Physical Disk 0:0:1'), ('State', ' Online'), ('Failure Predicted', ' No')]
ghostdog74
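For completeness, a minimal sketch that applies the "repeated key starts a new dict" rule directly to the raw `omreport` text and yields the list of dicts asked for in the question. The function name `parse_records` and the shortened `sample_output` are assumptions made for illustration; the sketch assumes every data line has the form `Key : Value` and that any line without a colon can be ignored.

```python
def parse_records(lines):
    """Turn 'Key : Value' lines into a list of dicts, one per group of keys."""
    result = [{}]
    for line in lines:
        # partition() splits on the first colon only, so '0:0:0' stays intact.
        key, sep, value = line.partition(':')
        if not sep:
            continue  # blank lines and headings without a colon are skipped
        key, value = key.strip(), value.strip()
        if key in result[-1]:
            result.append({})
        result[-1][key] = value
    # Drop the initial empty dict if the input held no data lines.
    return [record for record in result if record]


# Shortened, hypothetical excerpt of the output shown in the question.
sample_output = """\
ID                        : 0:0:0
Status                    : Ok
Name                      : Physical Disk 0:0:0

ID                        : 0:0:1
Status                    : Ok
Name                      : Physical Disk 0:0:1
"""

disks = parse_records(sample_output.splitlines())
for disk in disks:
    print(disk)
```

Because both key and value are stripped, the padding around the colon does not leak into the dicts, and records may contain any number of fields.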