Below is the code:
s= "Name1=Value1;Name2=Value2;Name3=Value3"
dict(item.split("=") for item in s.split(";"))
I would like to understand how this works. Will it perform for loop first or will it split first?
List of dictionary
s1= "Name1=Value1,Name2=Value2,Name3=Value3;Name1=ValueA,Name2=ValueB,Name3=ValueC"
If you have Python installed, I recommend using its interactive REPL.
With the REPL you can run the parts of your program step by step:
s.split(";") will give you:
['Name1=Value1', 'Name2=Value2', 'Name3=Value3']
item.split("=") for item in s.split(";") is a generator expression that iterates over the list from step 1 and splits each item into a smaller list. Collected into a list, the results look like this:
[['Name1', 'Value1'], ['Name2', 'Value2'], ['Name3', 'Value3']]
Finally dict(...) on the pairs will turn them into key-value pairs in a python dictionary like this:
{'Name1': 'Value1', 'Name2': 'Value2', 'Name3': 'Value3'}
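To see the order of evaluation directly, here is a small sketch that wraps both splits in tracing helpers. The names outer_split and inner_split are made up for illustration and are not part of the original code. The outermost iterable, s.split(";"), is evaluated once when the generator expression is created; each item.split("=") then runs lazily as dict() asks for the next pair.

```python
s = "Name1=Value1;Name2=Value2;Name3=Value3"
calls = []  # records the order in which the helpers run

def outer_split(text):
    # Hypothetical wrapper around text.split(";") that logs the call.
    calls.append("outer")
    return text.split(";")

def inner_split(item):
    # Hypothetical wrapper around item.split("=") that logs the call.
    calls.append("inner " + item)
    return item.split("=")

gen = (inner_split(item) for item in outer_split(s))
calls_before_dict = list(calls)  # only the outer split has run at this point

result = dict(gen)  # dict() consumes the generator one pair at a time
print(calls_before_dict)
print(calls)
print(result)
```

So the split on ";" happens first and only once, and the loop and the split on "=" are interleaved after that.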
dict is being passed a generator expression, which produces a sequence of lists by first calling s.split(";"), then yielding the result of item.split("=") for each value in the result of the first split. A more verbose version:
s = "..."
d = dict()
name_value_pairs = s.split(";")
for item in name_value_pairs:
    name_value = item.split("=")
    d.update([name_value])
I use d.update rather than something simpler like d[x] = y because both dict and d.update can accept the same kind of sequence of key/value pairs as arguments.
From here, we can reconstruct the original by eliminating one temporary variable at a time, from
s = "..."
d = dict()
for item in s.split(";"):
    name_value = item.split("=")
    d.update([name_value])
to
s = "..."
d = dict()
for item in s.split(";"):
    d.update([item.split("=")])
to
s = "..."
d = dict(item.split("=") for item in s.split(";"))
If you write it like this, you might understand better what's happening.
s= "Name1=Value1;Name2=Value2;Name3=Value3"
semicolon_sep = s.split(";")
equal_sep = [item.split("=") for item in semicolon_sep]
a = dict(equal_sep)
print(a["Name1"])
First, it splits the text at every semicolon. This gives a list with three elements, "semicolon_sep":
>>> print(semicolon_sep)
['Name1=Value1', 'Name2=Value2', 'Name3=Value3']
Then it loops over this list and splits each item at "=". Each item becomes a two-element list (name and value). Passing this list of pairs (equal_sep) to dict() turns it into a dictionary.
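The question also mentions a "list of dictionary" input, s1, which the answers above do not cover. A minimal sketch, assuming ";" separates the records and "," separates the name=value pairs inside each record, is to nest the same dict(...) expression inside a list comprehension:

```python
s1 = "Name1=Value1,Name2=Value2,Name3=Value3;Name1=ValueA,Name2=ValueB,Name3=ValueC"

# Assumption: ";" separates records and "," separates the pairs inside a record.
records = [
    dict(pair.split("=") for pair in record.split(","))
    for record in s1.split(";")
]
print(records)
```

Each element of records is then an ordinary dictionary, so records[1]["Name2"] looks up a value in the second record.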
I have a list of filenames. I need to group them based on the ending names after underscore ( _ ). My list looks something like this:
[
'1_result1.txt',
'2_result2.txt',
'3_result2.txt',
'4_result3.txt',
'5_result4.txt',
'6_result1.txt',
'7_result2.txt',
'8_result3.txt',
]
My end result should be:
List1 = ['1_result1.txt', '6_result1.txt']
List2 = ['2_result2.txt', '3_result2.txt', '7_result2.txt']
List3 = ['4_result3.txt', '8_result3.txt']
List4 = ['5_result4.txt']
This will come down to making a dictionary of lists, then iterating the input and adding each item to its proper list:
output = {}
for item in inlist:
    output.setdefault(item.split("_")[1], []).append(item)
print(list(output.values()))
We use setdefault to make sure there's a list for the entry, then add our current filename to the list. output.values() will return just the lists, not the entire dictionary, which appears to be what you want.
Using defaultdict from the collections module:
from collections import defaultdict
output = defaultdict(list)
for item in data:
    output[item.split("_")[1]].append(item)
print(list(output.values()))
Using groupby from the itertools module:
from itertools import groupby
data.sort(key=lambda x: x.split('_')[1])
for key, group in groupby(data, lambda x: x.split('_')[1]):
    print(list(group))
Starting with Python 2.4, both list.sort() and sorted() added a key parameter to specify a function to be called on each list element prior to making comparisons.
The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record.
So if l is the name of your list then you could use something like :
l.sort(key=lambda s: s.split('_')[1])
More information about key functions is available in the Python documentation on sorting.
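Putting the key function and groupby together, a self-contained sketch for the filename list from the question might look like the following. The helper name suffix is made up for illustration. groupby only merges adjacent items, which is why the data is sorted with the same key first.

```python
from itertools import groupby

data = ['1_result1.txt', '2_result2.txt', '3_result2.txt', '4_result3.txt',
        '5_result4.txt', '6_result1.txt', '7_result2.txt', '8_result3.txt']

def suffix(name):
    # Hypothetical helper: the part of the name after the first underscore.
    return name.split('_', 1)[1]

# sorted() is stable, so names keep their original relative order within a group.
groups = [list(group) for _, group in groupby(sorted(data, key=suffix), key=suffix)]
print(groups)
```

Sorting makes this approach O(N log N), whereas the dictionary-based versions make a single pass over the list.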
I have a list of dictionaries like so:
[{'a':'21'},{},{'b':'20'},{'c':'89'},{}]
What's the most efficient way to purge empty dictionaries from this list, end result being:
[{'a':'21'},{'b':'20'},{'c':'89'}]
I'm trying:
new_list_of_dictionaries = []
for dictionary in list_of_dictionaries:
    if dictionary:
        new_list_of_dictionaries.append(dictionary)
return new_list_of_dictionaries
I don't suppose this can be done in O(1) or something?
Just use a list comprehension, and filter on the boolean truth. An empty dictionary is considered false:
return [d for d in list_of_dictionaries if d]
In Python 2, you could also use the filter() function, using None as the filter:
return filter(None, list_of_dictionaries)
In Python 3 that returns an iterator, not a list, so you'd have to call list() on that (so return list(filter(None, ...))), at which point the list comprehension is simply more readable. Of course, if you don't actually need to have random access to the result (so direct index access to result[whatever]), then an iterator might still be a good idea anyway.
Note that this has to take O(N) time, because you have to test each and every dictionary. Even if lists had some kind of automatically updated map that let you get the indices of the empty dictionaries in O(1) time, removing items from a list requires moving later entries forward.
Comprehension or filter (Python2, Python3):
return filter(None, list_of_dictionaries)
# Python3, if you prefer a list over an iterator
return list(filter(None, list_of_dictionaries))
None as filter function will filter out all non-truthy elements, which in the case of empty collections makes it quite concise.
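One consequence worth spelling out: filtering on truthiness removes every falsy element, not only empty dictionaries. The list mixed below is a made-up input to illustrate this; if the real list only ever contains dictionaries, the difference does not matter.

```python
mixed = [{'a': '21'}, {}, 0, '', None, [], {'b': '20'}]

# Both forms drop every falsy element, not only empty dictionaries.
by_filter = list(filter(None, mixed))
by_comprehension = [x for x in mixed if x]

# Stricter variant: drop only empty dicts and keep other falsy values.
only_empty_dicts_removed = [x for x in mixed if x != {}]

print(by_filter)
print(only_empty_dicts_removed)
```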
You could use a list comprehension:
myList = [{'a':'21'},{},{'b':'20'},{'c':'89'},{}]
result = [x for x in myList if x]
I did it this way:
d = [{'a': '21'}, {}, {'b': 20}, {'c': '89'}, {}]
new_d = []
for item in d:
    # An empty dict is falsy, so only non-empty ones are kept.
    if item:
        new_d.append(item)
print(new_d)
[{'a': '21'}, {'b': 20}, {'c': '89'}]
I have two lists that have a relationship with each other. List1 holds descriptors and List2 holds the rankings of those descriptors.
list1 = ["String1", "String2", "String3"]
list2 = ["2", "1", "3"]
What I want to be able to do is create variables that link these up. So if I want to print ranking number 1, I would get what was originally string2.
What's the best way to approach this?
Use a dictionary, such as
content = {"2":"String1", "1":"String2", "3":"String3"}
print(content["1"])
If you would like to generate the dict from the lists, you could use:
content = dict((key, value) for (key, value) in zip(list2, list1))
Thanks to @minitech, a much more beautiful statement would be:
content = dict(zip(list2, list1))
You could try a dictionary. Dictionaries contain key:value pairs:
rankings = {"1":"String2", "2":"String1", "3":"String3"}
then you can access the elements like:
print(rankings["1"])
it will print
String2
But how to create this dictionary? Use a for loop (assuming list1 and list2 have the same length)
rankings = {}  # Create empty dictionary
for i in range(len(list2)):  # Loop 'n' times, where 'n' is the length of the lists
    rankings[list2[i]] = list1[i]
Note: Depending on your implementation, you could just use a number instead of a string as the key in the dictionary:
rankings = {1:"String2", 2:"String1", 3:"String3"}
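A short sketch of how the integer-keyed dictionary could be built from the two lists in the question, assuming every entry of list2 is a string that int() can parse:

```python
list1 = ["String1", "String2", "String3"]
list2 = ["2", "1", "3"]

# Convert each ranking to int while pairing it with its descriptor.
rankings = dict(zip(map(int, list2), list1))
print(rankings[1])

# Descriptors listed in rank order.
ordered = [rankings[rank] for rank in sorted(rankings)]
print(ordered)
```

Integer keys also sort numerically, which matters once there are ten or more rankings ("10" sorts before "2" as a string).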
I have a list in the following format:
['CASE_1:a','CASE_1:b','CASE_1:c','CASE_1:d',
'CASE_2:e','CASE_2:f','CASE_2:g','CASE_2:h']
I want to create a new list which looks like this:
['CASE_1:a,b,c,d','CASE_2:e,f,g,h']
Any idea how to get this done elegantly??
You can use a defaultdict by treating case as the key, and appending to the list each letter, where case and the letter are obtained by splitting the elements of your list on ':' - such as:
from collections import defaultdict
case_letters = defaultdict(list)
start = ['CASE_1:a','CASE_1:b','CASE_1:c','CASE_1:d', 'CASE_2:e','CASE_2:f','CASE_2:g','CASE_2:h']
for el in start:
    case, letter = el.split(':')
    case_letters[case].append(letter)
result = sorted('{case}:{letters}'.format(case=key, letters=','.join(values)) for key, values in case_letters.items())
print(result)
As this is homework (edit: or was!!?), I recommend looking at collections.defaultdict, str.split (and other builtin string methods), the builtin type list and its methods (such as append, extend, sort, etc.), str.format, the builtin sorted function, and dicts in general. Use the working example here along with the manual for reference. All these things will come in handy later on, so it's in your best interest to understand them as best you can.
One other thing to consider is that having something like:
{1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']}
is a lot more of a useful format and could be used to recreate your desired list afterwards anyway...
I've deleted my full solution since I realized this is homework, but here's the basic idea:
A dictionary is a better data structure. I would look at a collections.defaultdict. e.g.
yourdict = defaultdict(list)
You can iterate through your list (splitting each element on ':'). Something like:
#only split string once -- resulting in a list of length 2.
case, value = element.split(':',1)
Then you can add these to the dict using the list .append method:
yourdict[case].append(value)
Now, you'll have a dict which maps keys (Case_1, Case_2) to lists (['a','b','c','d'], [...]).
If you really need a list, you can sort the items of the dictionary and join appropriately.
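The maxsplit argument in split(':', 1) makes a difference only when the value itself contains a colon. The input below is a made-up example, not taken from the question:

```python
element = 'CASE_1:a:b'  # hypothetical input whose value itself contains ':'

# With maxsplit=1 the string is cut at the first ':' only.
case, value = element.split(':', 1)
print(case, value)

# Without maxsplit there are three parts, so two-name unpacking fails.
try:
    left, right = element.split(':')
    unpack_failed = False
except ValueError:
    unpack_failed = True
print(unpack_failed)
```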
sigh. It looks like the homework tag has been removed (here's my original solution):
from collections import defaultdict
d = defaultdict(list)
for elem in yourlist:
    case, value = elem.split(':', 1)
    d[case].append(value)
Now you have a dictionary as I described above. If you really want to get your list back:
new_lst = [ case+':'+','.join(values) for case,values in sorted(d.items()) ]
data = ['CASE_1:a','CASE_1:b','CASE_1:c','CASE_1:d', 'CASE_2:e','CASE_2:f','CASE_2:g','CASE_2:h']
output = {}
for item in data:
    key, value = item.split(':')
    if key not in output:
        output[key] = []
    output[key].append(value)
result = []
for key, values in output.items():
    result.append('%s:%s' % (key, ",".join(values)))
print(result)
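outputs (on Python 2, where dictionary order is arbitrary; Python 3.7+ keeps insertion order, so CASE_1 comes first there)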
['CASE_2:e,f,g,h', 'CASE_1:a,b,c,d']
mydict = {}
for item in lst:  # lst is the input list; this avoids shadowing the builtin list
    key, value = item.split(":")
    if key in mydict:
        mydict[key].append(value)
    else:
        mydict[key] = [value]
[key + ":" + ",".join(value) for key, value in mydict.items()]
Not much elegance, to be honest. You know, I'd store your list as a dict, cause it behaves as a dict in fact.
output is ['CASE_2:e,f,g,h', 'CASE_1:a,b,c,d'] (the order depends on dictionary ordering; on Python 3.7+ CASE_1 comes first)
My list:
list = ['name1', 'option1.1 value1.1', 'option1.2 value1.2', 'name2', 'option2.1 value2.1', 'option2.2 value2.2', 'option2.3 value2.3']
And I want to create a dictionary like this:
dict = {'name1':{'option1.1':'value1.1', 'option1.2':'value1.2'}, 'name2':{'option2.1': 'value2.1', 'option2.2':'value2.2', 'option2.3':'value2.3'}}
I don't know how big my list is (the number of names, options and values). Any ideas?
with list and dict comprehension:
id=[b[0] for b in enumerate(lst) if 'name' in b[1]]+[None]
d={lst[id[i]]:dict(map(str.split,lst[id[i]+1:id[i+1]])) for i in range(len(id)-1)}
Here your original list is named lst.
This is perhaps not as readable as the answer by @sr22222, but it may be faster; it is partly a matter of taste.
Assuming name will always be a single token (also, don't use list and dict as variable names, as they are builtins):
result = {}
for val in my_list:
    split_val = val.split()
    if len(split_val) == 1:
        last_name = split_val[0]
        result[last_name] = {}
    else:
        result[last_name][split_val[0]] = split_val[1]
Note that this will choke if the list is badly formatted and the first value is not a name.
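If badly formatted input is a real possibility, one option is to fail with a clear message instead of a NameError. The sketch below follows the same loop; the function name parse_sections and the error text are made up for illustration, and it additionally assumes that everything after the first run of whitespace belongs to the value.

```python
def parse_sections(lines):
    # Hypothetical helper name; same loop as above plus explicit checks.
    result = {}
    current = None  # the inner dict of the most recent name, if any
    for line in lines:
        parts = line.split(None, 1)  # split on the first run of whitespace only
        if not parts:
            continue  # skip blank entries
        if len(parts) == 1:
            current = {}
            result[parts[0]] = current
        elif current is None:
            raise ValueError("option before any name: %r" % line)
        else:
            current[parts[0]] = parts[1]
    return result

my_list = ['name1', 'option1.1 value1.1', 'option1.2 value1.2',
           'name2', 'option2.1 value2.1', 'option2.2 value2.2', 'option2.3 value2.3']
parsed = parse_sections(my_list)
print(parsed)
```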