In Python, I am using the mincemeat map-reduce framework.
From my map function I would like to yield (k, v) pairs in a loop, which sends the output to the reduce function. Sample data, as output by my map function:
auth3 {'practical': 1, 'volume': 1, 'physics': 1}
auth34 {'practical': 1, 'volume': 1, 'chemistry': 1}
....
There would be many such entries; this is just a few as an example.
Here, auth3 and auth34 are keys and the respective values are dictionaries.
Inside the reduce function, when I try to print the keys and values, I get a "too many values to unpack" error. My reduce function looks like this:
def reducefn(k, v):
    for k,val in (k,v):
        print k, v
Please let me know how to resolve this error.
First, define your dictionary with Python's built-in dict:
>>> dic1 = dict(auth3 = {'practical': 1, 'volume': 1, 'physics': 1},
...             auth34 = {'practical': 1, 'volume': 1, 'chemistry': 1} )
>>> dic1
{'auth3': {'practical': 1, 'volume': 1, 'physics': 1},
'auth34': {'practical': 1, 'volume': 1, 'chemistry': 1}}
Then, your reduce function may go as
def reducefn(dictofdicts):
    for key, value in dictofdicts.iteritems():
        print key, value
In the end,
>>> reducefn(dic1)
auth3 {'practical': 1, 'volume': 1, 'physics': 1}
auth34 {'practical': 1, 'volume': 1, 'chemistry': 1}
Use zip
def reducefn(k, v):
    for k,val in zip(k,v):
        print k, v
>>> reducefn({'practical': 1, 'volume': 1, 'physics': 1} ,{'practical': 1, 'volume': 1, 'chemistry': 1})
practical {'practical': 1, 'volume': 1, 'chemistry': 1}
volume {'practical': 1, 'volume': 1, 'chemistry': 1}
physics {'practical': 1, 'volume': 1, 'chemistry': 1}
>>>
reducefn(k, v) receives a tuple of tuples ((k1, k2, k3, ...), (v1, v2, v3, ...)).
Zipping them gives you ((k1, v1), (k2, v2), (k3, v3), ...), and that's what you want.
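As a minimal sketch of what zip does (Python 3 syntax; the tuples `keys` and `values` are made-up sample data, not actual mincemeat output), it pairs items by position. Note that zipping two dictionaries directly pairs their keys with each other, not keys with values:

```python
# Hypothetical sample data: parallel tuples of keys and values.
keys = ("practical", "volume", "physics")
values = (1, 1, 1)

# zip pairs items by position: (k1, v1), (k2, v2), ...
pairs = list(zip(keys, values))
for key, value in pairs:
    print(key, value)

# Zipping two dicts iterates over their keys only.
d1 = {"a": 1}
d2 = {"b": 2}
key_pairs = list(zip(d1, d2))
print(key_pairs)
```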
def reducefn(*dicts): #collects multiple arguments and stores in dicts
    for dic in dicts: #go over each dictionary passed in
        for k,v in dic.items(): #go over key,value pairs in the dic
            print(k,v)
reducefn({'practical': 1, 'volume': 1, 'physics': 1} ,{'practical': 1, 'volume': 1, 'chemistry': 1})
Produces
>>>
physics 1
practical 1
volume 1
chemistry 1
practical 1
volume 1
Now, regarding your implementation:
def reducefn(k, v):
The function signature above takes two arguments. The arguments passed to the function are accessed via k and v respectively. So an invocation of reducefn({"key1":"value"},{"key2":"value"}) results in k being assigned {"key1":"value"} and v being assigned {"key2":"value"}.
When you try to invoke it like so: reducefn(dic1,dic2,dic3,...) you are passing in more than the allowed number of parameters as defined by the declaration/signature of reducefn.
for k,val in (k,v):
Now, assuming you passed in two dictionaries to reducefn, both k and v would be dictionaries. The for loop above would be equivalent to:
>>> a = {"Name":"A"}
>>> b = {"Name":"B"}
>>> for (d1,d2) in (a,b):
        print(d1,d2)
Which gives the following error:
ValueError: need more than 1 value to unpack
This occurs because you're essentially doing this when the for loop is invoked:
d1,d2=a
You can see we get this error when we try that in a REPL
>>> d1,d2=a
Traceback (most recent call last):
File "<pyshell#24>", line 1, in <module>
d1,d2=a
ValueError: need more than 1 value to unpack
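To make the cause concrete, here is a minimal sketch (Python 3, where the wording of the message differs from the traceback above). It assumes only the dictionary `a` from the example: iterating a dictionary yields its keys, so `a` supplies a single item for two targets.

```python
a = {"Name": "A"}

# Iterating a dict yields its keys, so `a` supplies a single item: "Name".
items_from_a = list(a)
print(items_from_a)

try:
    d1, d2 = a  # two targets, but only one item to unpack
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```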
We could do this:
>>> for (d1,d2) in [(a,b)]:
        print(d1,d2)
{'Name': 'A'} {'Name': 'B'}
Which assigns the tuple (a,b) to d1,d2. This is called unpacking and would look like this:
d1,d2 = (a,b)
However, in our for loop, for k,val in (k,v): wouldn't make sense, as we would end up with k and val representing the same things that k and v did originally. Instead, we need to go over the key,value pairs in the dictionaries. But seeing as we need to cope with n dictionaries, we need to rethink the function definition.
Hence:
def reducefn(*dicts):
When you invoke the function like this:
reducefn({'physics': 1},{'volume': 1, 'chemistry': 1},{'chemistry': 1})
*dicts collects the arguments, in such a way that dicts ends up as:
({'physics': 1}, {'volume': 1, 'chemistry': 1}, {'chemistry': 1})
As you can see, the three dictionaries passed into the function were collected into a tuple. Now we iterate over the tuple:
for dic in dicts:
So now, on each iteration, dic is one of the dictionaries we passed in, so now we go ahead and print out the key,value pairs inside it:
    for k,v in dic.items():
        print(k,v)
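In a map-reduce setting the reduce function is usually not called with the dictionaries as separate arguments. The sketch below assumes the framework calls the reducer with one key and a list of all values yielded for that key; that call shape is an assumption here, so check it against the framework you use. The sample input is hypothetical.

```python
from collections import Counter

def reducefn(key, values):
    """Merge a list of word-count dicts emitted for one key (assumed call shape)."""
    total = Counter()
    for counts in values:      # each item is one dict yielded by the map step
        total.update(counts)   # adds the counts of matching words
    return dict(total)

# Hypothetical input: two dicts yielded for the same author key.
merged = reducefn("auth3", [
    {"practical": 1, "volume": 1, "physics": 1},
    {"practical": 1, "chemistry": 1},
])
print(merged)
```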
Related
I have a complex dictionary:
l = {10: [{'a':1, 'T':'y'}, {'a':2, 'T':'n'}], 20: [{'a':3,'T':'n'}]}
When I iterate over the dictionary, I don't get a dictionary whose values are lists of dictionaries. Instead I get tuples, like so:
for m in l.items():
    print(m)
(10, [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}])
(20, [{'a': 3, 'T': 'n'}])
But when I just print l I get my original dictionary:
In [7]: l
Out[7]: {10: [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}], 20: [{'a': 3, 'T': 'n'}]}
How do I iterate over the dictionary? I still need the keys and to process each dictionary in the value list.
There are two questions here. First, you ask why this is turned into a "tuple". The answer is that this is what the .items() method on dictionaries yields: one (key, value) tuple per entry.
Knowing this, you can then decide how to use this information. You can choose to expand the tuple into its two parts during iteration:
for k, v in l.items():
    # Now k is the key and v is the value,
    # so you can either use the value directly
    print(v[0])
    # or access it using the key
    value = l[k]
    print(value[0])
    # Both yield the same value
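Since each value here is a list of dictionaries, a second loop is needed to reach the inner dictionaries. A minimal sketch using the dictionary from the question; the `rows` list is only there to collect the results for illustration:

```python
l = {10: [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}], 20: [{'a': 3, 'T': 'n'}]}

rows = []
for key, dict_list in l.items():   # each item is a (key, value) tuple
    for inner in dict_list:        # the value is a list of dictionaries
        rows.append((key, inner['a'], inner['T']))

print(rows)
```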
When iterating over a dictionary's items, you can unpack each pair into two variables:
for key, value in l.items():
    print(key,value)
I often rely on pprint when processing a nested object, to see at a glance what structure I am dealing with.
from pprint import pprint
l = {10: [{'a':1, 'T':'y'}, {'a':2, 'T':'n'}], 20: [{'a':3,'T':'n'}]}
pprint(l, indent=4, width=40)
Output:
{ 10: [ {'T': 'y', 'a': 1},
{'T': 'n', 'a': 2}],
20: [{'T': 'n', 'a': 3}]}
Others have already answered with implementations.
Thanks for all the help. I did figure out how to process this. Here is the implementation I came up with:
for m in l.items():
    k,v = m
    print(f"key: {k}, val: {v}")
    for n in v:
        print(f"key: {n['a']}, val: {n['T']}")
Thanks for everyone's help!
I have a Python list that is composed of multiple dictionaries.
{"timestamp":"2019-10-05T00:07:50Z","icao_address":"AACAA5","latitude":39.71273649,"longitude":-41.79022217,"altitude_baro":"37000","speed":567,"heading":77,"source":"FM89","collection_type":"satellite","vertical_rate":"0","ingestion_time":"2019-10-05T02:49:47Z"}
{"timestamp":"2019-10-05T00:11:00Z","icao_address":"C03CF1","latitude":48.12194824,"longitude":-44.94451904,"altitude_baro":"36000","speed":565,"heading":73,"source":"FM89","collection_type":"satellite","vertical_rate":"0","ingestion_time":"2019-10-05T02:49:47Z"}
{"timestamp":"2019-10-05T00:11:15Z","icao_address":"A0F4F6","latitude":48.82104492,"longitude":-34.43157489,"altitude_baro":"35000","source":"FM89","collection_type":"satellite","ingestion_time":"2019-10-05T02:49:47Z"}
I am trying to add the key minute to all of the dictionaries within the list (I don't care about its value at the moment), and I run into a runtime error which, after reading up on the reason, is expected.
{"timestamp":"2019-10-05T00:11:15Z","icao_address":"A0F4F6","latitude":48.82104492,"longitude":-34.43157489,"altitude_baro":"35000","source":"FM89","collection_type":"satellite","ingestion_time":"2019-10-05T02:49:47Z", **"minute": "test"**}
{"timestamp":"2019-10-05T00:11:15Z","icao_address":"A0F4F5","latitude":48.82104492,"longitude":-34.43157489,"altitude_baro":"35000","source":"FM89","collection_type":"land","ingestion_time":"2019-10-05T02:49:47Z", **"minute": "test"**}
for data in list:
    for value in data:
        if value == 'latitude' or value == 'longitude':
            data[value] = float('%.2f'%(data[value]))
What are possible ways to add keys to a dictionary while in a loop?
Use the standard dictionary assignment syntax in a loop to add a new key/value pair to each dictionary in your list:
>>> x = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
>>> for data in x:
... data['minute'] = 'test'
...
>>> x
[{'a': 1, 'b': 2, 'minute': 'test'}, {'a': 3, 'b': 4, 'minute': 'test'}]
You can read more about dictionaries in the docs here.
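The runtime error mentioned in the question is presumably "dictionary changed size during iteration", which Python raises when a key is added while looping over that same dictionary. A minimal sketch, assuming that is the cause: add the new key after the inner loop, and iterate over a snapshot of the keys if the dictionary may change inside it. The `records` list is a trimmed, hypothetical stand-in for the data in the question.

```python
records = [
    {"icao_address": "AACAA5", "latitude": 39.71273649, "longitude": -41.79022217},
    {"icao_address": "C03CF1", "latitude": 48.12194824, "longitude": -44.94451904},
]

for data in records:
    # list(data) takes a snapshot of the keys, so the dict can be modified safely.
    for field in list(data):
        if field in ("latitude", "longitude"):
            data[field] = round(data[field], 2)
    data["minute"] = "test"  # added once the inner loop has finished

print(records)
```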
I am trying to sort below dictionary based on "resume_match_score" in descending order.
{'16334': [{'skill_match': {'java': 33,
                            'python': 5,
                            'swing': 1,
                            'apache cassandra': 1},
            'skill_match_score': 0.8},
           {'doc_attachment': '97817_1560102392mahesh-java.docx',
            'document_path': '06_2019',
            'firstname': 'nan',
            'lastname': 'nan'},
           {'job_title_match': {'java developer': 3}, 'job_title_match_score': 0.5},
           {'resume_match_score': 0.71}],
 '4722': [{'skill_match': {'java': 24, 'python': 1, 'hadoop': 31},
           'skill_match_score': 0.6},
          {'doc_attachment': '4285_1560088607Srujan_Hadoop.docx',
           'document_path': '06_2019',
           'firstname': 'nan',
           'lastname': 'nan'},
          {'job_title_match': {'hadoop developer': 3, 'java developer': 2},
           'job_title_match_score': 1.0},
          {'resume_match_score': 0.72}]}
I tried the following, which seems to work, but it gives only the keys instead of the full dictionary entries:
result = sorted(test_d, key=lambda k: test_d[k][3].get("resume_match_score", 0), reverse=True)
which gives:
result = ['4722', '16334']
How can I get the complete dictionary in sorted order based on the key resume_match_score?
Thanks in advance.
I am not sure if this is the proper way to do it, but you can go through the sorted keys in a for loop and replace each key with a (key, value) tuple.
Basically, you can add something like this:
real_result = []
for key in result:
    real_result.append((key, test_d[key]))
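Another option, sketched here under the assumption that the score always sits in the fourth entry of each value list (as in the question), is to sort the (key, value) pairs directly. The data is a trimmed, hypothetical version of the question's dictionary, and `resume_score` is a helper name made up for this example.

```python
# Trimmed, hypothetical version of the data from the question.
test_d = {
    "16334": [{"skill_match_score": 0.8}, {}, {}, {"resume_match_score": 0.71}],
    "4722": [{"skill_match_score": 0.6}, {}, {}, {"resume_match_score": 0.72}],
}

def resume_score(item):
    """Return the resume_match_score stored in the fourth entry of the value list."""
    _key, entries = item
    return entries[3].get("resume_match_score", 0)

sorted_items = sorted(test_d.items(), key=resume_score, reverse=True)
sorted_dict = dict(sorted_items)  # dicts keep insertion order in Python 3.7+
print(list(sorted_dict))
```

Keeping the result as a list of tuples is the safer choice if the code must also run on Python versions older than 3.7.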
I just started learning python and found this snippet. It's supposed to count how many times a word appears. I guess, for all of you this will seem very logical, but unfortunately for me, it doesn't make any sense.
str = "house is where you live, you don't leave the house."
dict = {}
list = str.split(" ")
for word in list: # Loop over the list
    if word in dict: # How can I loop over the dictionary if it's empty?
        dict[word] = dict[word] + 1
    else:
        dict[word] = 1
So, my question here is, how can I loop over the dictionary? Shouldn't the dictionary be empty, because I didn't put anything in it?
Maybe I am not smart enough, but I don't see the logic. Can anybody explain to me how it works?
Many thanks
As somebody else pointed out, str, dict, and list shouldn't be used as variable names, because they are the names of Python's built-in types. For example, str(33) turns the number 33 into the string "33". Python lets you reassign these names, but doing so hides the built-ins for the rest of that scope, so to avoid confusion you really should use something else. So here's the same code with different variable names, plus some print statements at the end of each loop iteration:
mystring = "house is where you live, you don't leave the house."
mydict = {}
mylist = mystring.split(" ")
for word in mylist: # Loop over the list
    if word in mydict:
        mydict[word] = mydict[word] + 1
    else:
        mydict[word] = 1
    print("\nmydict is now:")
    print(mydict)
If you run this, you'll get the following output:
mydict is now:
{'house': 1}
mydict is now:
{'house': 1, 'is': 1}
mydict is now:
{'house': 1, 'is': 1, 'where': 1}
mydict is now:
{'house': 1, 'is': 1, 'where': 1, 'you': 1}
mydict is now:
{'house': 1, 'is': 1, 'live,': 1, 'where': 1, 'you': 1}
mydict is now:
{'house': 1, 'is': 1, 'live,': 1, 'where': 1, 'you': 2}
mydict is now:
{"don't": 1, 'house': 1, 'is': 1, 'live,': 1, 'you': 2, 'where': 1}
mydict is now:
{"don't": 1, 'house': 1, 'is': 1, 'live,': 1, 'leave': 1, 'you': 2, 'where': 1}
mydict is now:
{"don't": 1, 'house': 1, 'is': 1, 'live,': 1, 'leave': 1, 'you': 2, 'where': 1, 'the': 1}
mydict is now:
{"don't": 1, 'house': 1, 'is': 1, 'live,': 1, 'house.': 1, 'leave': 1, 'you': 2, 'where': 1, 'the': 1}
So mydict is indeed updating with every word it finds. This should also give you a better idea of how dictionaries work in Python.
To be clear, you're not "looping" over the dictionary. The for statement starts a loop; the if word in mydict: statement isn't a loop, but just a membership test. It checks whether mydict has a key equal to the string word. On the first pass the dictionary is empty, so the test is false and the else branch adds the first word.
Also, note that since you only split your sentence on spaces, your list of words includes for example both "house" and "house.". Since these two don't exactly match, they're treated as two different words, which is why you see 'house': 1 and 'house.': 1 in your dictionary instead of 'house': 2.
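If you want "house" and "house." counted together, one option is to strip punctuation from each word before counting. A minimal sketch, assuming that trimming leading and trailing punctuation is enough for this sentence; `collections.Counter` stands in for the manual if/else counting:

```python
import string
from collections import Counter

mystring = "house is where you live, you don't leave the house."

# Strip leading/trailing punctuation from each word so "house." matches "house".
words = [word.strip(string.punctuation) for word in mystring.split()]
counts = Counter(words)
print(counts)
```

The apostrophe in "don't" is kept because strip only removes characters from the ends of each word.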
I have dictionary in the following format:
dictionary = {'key' : ('value', row_number, col_number)}
I want that dictionary converted to below format:
converted_dict = {'key' : {'row':row_number, 'col':col_number}}
Using the following code, I am getting the error below:
dict_list = [(key, dict([('row',value[1]), ('column',value[2])])) for
key, value in cleaned_dict]
converted_dict = dict(dict_list)
ValueError: too many values to unpack (expected 2)
I don't quite understand why you try to convert the dictionary to a list when what you want is in fact a dict. It seems you don't understand how to write a dict comprehension. Try this approach:
converted_dict = {
    key: {'row': value[1], 'column': value[2]} for key, value in cleaned_dict.items()
}
Also note that if you want to iterate over both the keys and the values in a dictionary, you should call dictionary.items() (like in the code above).
You could use a dictionary comprehension:
dictionary = {'key' : ('value', 'row_number', 'col_number')}
>>> {k: {'row': row, 'col': col} for k, (_, row, col) in dictionary.items()}
{'key': {'row': 'row_number', 'col': 'col_number'}}
Iterating through a dictionary returns only its keys, and there is no need for the extra step of converting a list into a dictionary.
Here we need pairs of keys and values, so we need to use dict.items:
In [6]: lst = {'a':(0,1),'b':(2,1)}
In [7]: converted_dict = dict((key, dict([('row',value[0]), ('column',value[1])])) for key, value in lst.items())
Out[7]: {'a': {'column': 1, 'row': 0}, 'b': {'column': 1, 'row': 2}}
...and yet another dict-comprehension.
converted_dict = {k : dict(zip(('row', 'column'), v[1:])) for k, v in dictionary.items()}
which for:
dictionary = {'a' : ('value', 1, 2), 'b' : ('value', 0, 0)}
returns:
{'a': {'row': 1, 'column': 2}, 'b': {'row': 0, 'column': 0}}
The only thing wrong with your code is that it is missing the .items() call [1]. It should be:
dict_list = [... for key, value in cleaned_dict.items()]
[1] If you are using Python 2, .items() also works, but .iteritems() avoids building an intermediate list.
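A minimal sketch of why the original loop fails, using Python 3 and hypothetical row and column numbers: iterating the dictionary itself yields only its keys, so Python tries to unpack the three-character string "key" into two names.

```python
cleaned_dict = {"key": ("value", 3, 7)}  # hypothetical row/column numbers

# Iterating the dict itself yields only keys; unpacking the 3-character
# string "key" into two names raises the ValueError from the question.
try:
    for key, value in cleaned_dict:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# With .items(), each iteration yields a (key, value) pair instead.
converted_dict = {
    key: {"row": value[1], "col": value[2]}
    for key, value in cleaned_dict.items()
}
print(converted_dict)
```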