Appending values to a list within a dictionary - python

The following is my created dictionary testDict:
{'BTC': [23031.0897756201, 443936922524.46, 1], 'LTC': [89.6019345445, 6465505641.56, 2], 'NMC': [1.4363653274, 21166854.02, 3], 'TRC': [0.0180333433, 413601.88, 4]}
I'm looking to append the following names to each list respectively:
Bitcoin
Litecoin
Namecoin
Terracoin
Below is my code:
def symbol_and_price1(rawInfoList, testDict):
    # print(testDict)
    nameList = []
    for req in rawInfoList:
        name = req['data']['name']
        nameList.append(name)
    testDict['BTC'].append(nameList)
    print(testDict)
rawInfo()
Output:
{'BTC': [23031.0897756201, 443936922524.46, 1, ['Bitcoin', 'Litecoin', 'Namecoin', 'Terracoin']], 'LTC': [89.6019345445, 6465505641.56, 2], 'NMC': [1.4363653274, 21166854.02, 3], 'TRC': [0.0180333433, 413601.88, 4]}
It appends the whole list to the specified key (BTC). How can I make the key dynamic and append the corresponding name to each key's list?
Desired output:
{'BTC': [23031.0897756201, 443936922524.46, 1, 'Bitcoin'], 'LTC': [89.6019345445, 6465505641.56, 2, 'Litecoin'], 'NMC': [1.4363653274, 21166854.02, 3, 'Namecoin']...etc}

You should modify each list separately. Replace the following line:
testDict['BTC'].append(nameList)
with this, which assumes nameList is in the same order as the dictionary's keys:
for i, (key, value) in enumerate(testDict.items()):
    value.append(nameList[i])
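The same idea can be sketched with zip, which pairs each list in the dictionary with one name. This is a minimal sketch, not the asker's code: the helper name append_names, the shortened sample values, and the length check are assumptions added for illustration, and it relies on the names being in the same order as the dictionary's keys.

```python
# Sample data shaped like the question's dictionary (values shortened for illustration).
test_dict = {
    'BTC': [23031.09, 1],
    'LTC': [89.60, 2],
    'NMC': [1.44, 3],
    'TRC': [0.018, 4],
}
name_list = ['Bitcoin', 'Litecoin', 'Namecoin', 'Terracoin']


def append_names(target, names):
    """Append names[i] to the i-th list in target, following key insertion order."""
    # Hypothetical safety check: zip would otherwise silently stop at the shorter input.
    if len(target) != len(names):
        raise ValueError("need exactly one name per key")
    for values, name in zip(target.values(), names):
        values.append(name)
    return target


append_names(test_dict, name_list)
print(test_dict['BTC'])  # [23031.09, 1, 'Bitcoin']
```

If the names arrive keyed by symbol instead of by position, looking each one up by key would be safer than relying on order.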

Related

Finding multiple duplicates within nested lists, then deleting them whilst keeping the latest entry

I've been given the task of going through some signup data and deleting duplicate entries.
Each entry is contained within a list, all within a single list. I want to delete an entry if the first name, last name, and email are the same. A simplified version of the data looks like this.
data = [[0,1,'john','doe','test#email'],[0,1,'james','doe','eggs#email'],[0,1,'john','doe','test#email'],[2,11,'john','Stephenson','stack#email']]
The desired outcome is the same list, but with the first entry of each duplicate deleted and the second entry kept, as follows:
data = [[0,1,'james','doe','eggs#email'],[0,1,'john','doe','test#email'],[2,11,'john','Stephenson','stack#email']]
The data above has duplicate names, but of course, people can share first names or last names. However, if someone has the same first name, last name, and email, it's clearly a duplicate.
Anyways, how would I go about doing this? The data set I'm dealing with has well over 1000 entries. If I compared every list to every list, would that be 1000^1000 tests having to be conducted?
Test data:
data = [
    [0, 1, "john", "doe", "test#email"],
    [0, 1, "james", "doe", "eggs#email"],
    [0, 2, "john", "doe", "test#email"],
    [2, 11, "john", "Stephenson", "stack#email"],
]
data = data * 2_500
Pandas has a very nice drop_duplicates method that is very fast. In this case I only test on the last three columns (name and email) and keep the last item of all duplicates.
import pandas as pd
pd.DataFrame(data).drop_duplicates(subset=[2,3,4], keep="last").values.tolist()
Output:
CPU times: user 9.45 ms, sys: 1.64 ms, total: 11.1 ms
Wall time: 9.51 ms
[[0, 1, 'james', 'doe', 'eggs#email'],
 [0, 2, 'john', 'doe', 'test#email'],
 [2, 11, 'john', 'Stephenson', 'stack#email']]
data = [[0,1,'john','doe','test#email'],[0,1,'james','doe','eggs#email'],[0,1,'john','doe','test#email'],[2,11,'john','Stephenson','stack#email']]
final_list = [[*v, *k] for k, v in dict((tuple(i[2:]), i[0:2]) for i in data).items()]
print(final_list)
#[[0, 1, 'john', 'doe', 'test#email'], [0, 1, 'james', 'doe', 'eggs#email'], [2, 11, 'john', 'Stephenson', 'stack#email']]
Here is a more verbose answer that is easy to read (at least in my opinion). Note that it compares whole entries and keeps the first occurrence of each:
data = [[0,1,'john','doe',"test#email"],[0,1,'james','doe',"eggs#email"],[0,1,'john','doe',"test#email"],[2,11,'john','Stephenson',"stack#email"]]
def filter_lst(data):
    found = set()
    filtered_data = []
    for lst in data:
        lst_str = str(lst)
        if lst_str not in found:
            filtered_data.append(lst)
            found.add(lst_str)
    return filtered_data
You can store the items that have been seen in a dictionary and then iterate over that dictionary a second time and grab the last index of each seen item.
from collections import defaultdict
data = [[0,1,'john','doe','test#email'],[0,1,'james','doe','eggs#email'],[0,2,'john','doe','test#email'],[2,11,'john','Stephenson','stack#email']]
def no_dupes(data):
    tmp = defaultdict(list)
    for idx, d in enumerate(data):
        tmp[(d[2], d[3], d[4])].append(idx)
    final = []
    for v in tmp.values():
        idx = v[-1]
        final.append(data[idx])
    return final
print(no_dupes(data))
[[0, 2, 'john', 'doe', 'test#email'],
 [0, 1, 'james', 'doe', 'eggs#email'],
 [2, 11, 'john', 'Stephenson', 'stack#email']]
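The answers above differ in which occurrence they keep and in the order of the result. As a minimal sketch, assuming the key is the first name, last name, and email at positions 2, 3 and 4, the following walks the list backwards so that the last occurrence of each key is the one kept, then restores the original order. The function name drop_earlier_duplicates and its key_columns parameter are hypothetical.

```python
def drop_earlier_duplicates(rows, key_columns=(2, 3, 4)):
    """Keep only the last row for each key, preserving the survivors' relative order."""
    seen = set()
    kept_reversed = []
    # Walking backwards means the first row we meet for a key is its last occurrence.
    for row in reversed(rows):
        key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept_reversed.append(row)
    kept_reversed.reverse()
    return kept_reversed


data = [
    [0, 1, "john", "doe", "test#email"],
    [0, 1, "james", "doe", "eggs#email"],
    [0, 2, "john", "doe", "test#email"],
    [2, 11, "john", "Stephenson", "stack#email"],
]
print(drop_earlier_duplicates(data))
```

Each row is visited once and looked up in a set, so the work grows roughly linearly with the number of entries rather than requiring every entry to be compared with every other.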

How to count the first element in a list with 2-tuples of strings?

I am trying to define a function count_first_names that takes a list of names, each a 2-tuple of strings (first_name, last_name), and returns a dictionary whose keys are first names and whose values are the number of times each first name appears in the list.
For example, take the first 5 presidents of the U.S.,
presidents = [("George","Washington"),
              ("John","Adams"),
              ("Thomas","Jefferson"),
              ("James", "Madison"),
              ("James", "Monroe"),]
Then, I would like to see:
count_first_names(presidents)
{'George':1, 'John':1, 'Thomas':1, 'James':2}
First, I created an empty dictionary, and took the first element of each tuple in the list. But I am not sure what to do next. Please help?
To count multiple items use the Counter:
import collections
collections.Counter(p[0] for p in presidents)
# Counter({'James': 2, 'George': 1, 'John': 1, 'Thomas': 1})
The result is a dict subclass.
Try this code:
presidents = [("George","Washington"),
              ("John","Adams"),
              ("Thomas","Jefferson"),
              ("James", "Madison"),
              ("James", "Monroe"),]
count_dict = {}
for ele in presidents:
    count_dict[ele[0]] = count_dict.get(ele[0],0) + 1
print(count_dict)
output is:
{'George': 1, 'John': 1, 'Thomas': 1, 'James': 2}
If you want to try comprehension:
{firstName:sum(item[0]==firstName for item in presidents) for firstName in set(item[0] for item in presidents)}
OUTPUT
Out[7]: {'Thomas': 1, 'John': 1, 'George': 1, 'James': 2}
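For completeness, here is one way the Counter approach could be wrapped in the count_first_names function the question asks for. It is only a sketch: converting the Counter to a plain dict is an optional choice made here so the return value prints like the desired output.

```python
from collections import Counter


def count_first_names(names):
    """Return a plain dict mapping each first name to how often it appears."""
    # Each item is a (first_name, last_name) tuple; only the first part is counted.
    return dict(Counter(first for first, _last in names))


presidents = [
    ("George", "Washington"),
    ("John", "Adams"),
    ("Thomas", "Jefferson"),
    ("James", "Madison"),
    ("James", "Monroe"),
]
print(count_first_names(presidents))
# {'George': 1, 'John': 1, 'Thomas': 1, 'James': 2}
```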

Merge two json object in python

I am trying to merge two JSON objects in Python.
I'm doing:
import json
json_obj = json.dumps({"a": [1,2]})
json_obj1 = json.dumps({"a": [3,4]})
json_obj += json_obj1
print(json_obj)
I am expecting the output to be
{"a": [1, 2,3,4]}
but I got
{"a": [1, 2]}{"a": [3, 4]}
How can I get the expected output?
In the json module, dumps converts a Python object to a string, and loads converts a string into a Python object. So your original code just concatenates two JSON strings. Try something like this:
import json
from collections import defaultdict
def merge_dict(d1, d2):
    dd = defaultdict(list)
    for d in (d1, d2):
        for key, value in d.items():
            if isinstance(value, list):
                dd[key].extend(value)
            else:
                dd[key].append(value)
    return dict(dd)
if __name__ == '__main__':
    json_str1 = json.dumps({"a": [1, 2]})
    json_str2 = json.dumps({"a": [3, 4]})
    dct1 = json.loads(json_str1)
    dct2 = json.loads(json_str2)
    combined_dct = merge_dict(dct1, dct2)
    json_str3 = json.dumps(combined_dct)
    # {"a": [1, 2, 3, 4]}
    print(json_str3)
json.dumps() converts a dictionary to a str object, not to a JSON (dict) object.
Adding a type check to your code shows that the value is a str after json.dumps(), so + simply concatenates the two strings, which is why you get the concatenated output.
To merge the two dictionaries in your simple case, you can just use extend on the lists:
import json
json_obj = json.dumps({"a": [1,2]})
json_obj1 = json.dumps({"a": [3,4]})
print(type(json_obj1)) # the type is `str`
json_obj += json_obj1 # this concatenates the two str objects
json_obj = {"a": [1,2]}
json_obj1 = {"a": [3,4]}
json_obj["a"].extend(json_obj1["a"])
print(json_obj)
I suggest you study the fundamentals of Python for your own sake, as you don't seem to understand why your code doesn't work.
import json
# We have two dictionaries to combine
json_obj_1 = {"a": [1,2], "b":[2,3], 'c': [1,2,3]}
json_obj_2 = {"a": [3,4], 'd':[4,2], 'e': [4,2,2]}
# Merged dictionary will be stored here
hold_json_obj = {}
Don't worry, it's not actually that complicated. Read the code line by line with comments attached and you'll understand.
# We'll loop through every item in the json_obj_1 dictionary
for item_1 in json_obj_1:
    # We'll also loop through every item in the json_obj_2 dictionary
    for item_2 in json_obj_2:
        # Now let's compare whether they are the same KEYS (not values)
        if item_1 == item_2:
            # if they match, we create a list to store the array
            hold_array = []
            hold_array.extend(json_obj_1[item_1])
            hold_array.extend(json_obj_2[item_1])
            # finally putting the array to our hold_json_obj
            hold_json_obj[item_1] = hold_array
        else:
            # if they don't match, check if the key already exists in the
            # hold_json_obj because we might be iterating json_obj_2 for the second time.
            if item_2 not in hold_json_obj:
                # add the unmatched array to hold_json_obj
                hold_json_obj[item_2] = json_obj_2[item_2]
Now simply update json_obj_1 with the update method. The update function is required because if json_obj_1 has keys that json_obj_2 doesn't then we may have missed them out in the above loops.
json_obj_1.update(hold_json_obj)
print(json_obj_1)
This is what the print displays.
{'a': [1, 2, 3, 4], 'b': [2, 3], 'c': [1, 2, 3], 'd': [4, 2], 'e': [4, 2, 2]}
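The same merge can also be sketched in a single pass with dict.setdefault, without the nested loops. This is an alternative illustration rather than any of the answers' code: the function name merge_list_values is hypothetical, and it assumes every value in both dictionaries is a list.

```python
import json


def merge_list_values(first, second):
    """Return a new dict in which lists under the same key are concatenated."""
    # Copy the lists so that neither input dictionary is modified.
    merged = {key: list(values) for key, values in first.items()}
    for key, values in second.items():
        # setdefault creates an empty list for keys that only appear in `second`.
        merged.setdefault(key, []).extend(values)
    return merged


json_obj_1 = {"a": [1, 2], "b": [2, 3], "c": [1, 2, 3]}
json_obj_2 = {"a": [3, 4], "d": [4, 2], "e": [4, 2, 2]}
merged = merge_list_values(json_obj_1, json_obj_2)
print(json.dumps(merged))
# {"a": [1, 2, 3, 4], "b": [2, 3], "c": [1, 2, 3], "d": [4, 2], "e": [4, 2, 2]}
```

Returning a new dictionary instead of updating json_obj_1 in place is a design choice here; it leaves both inputs untouched in case they are needed later.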

Python: Remove top 'n' keys from a dictionary

I have a dictionary dict. For each key in dict, there is a list, that has two items in it. One is another dictionary, the other is an integer.
dict = {
    'hello' : [
        {
            'blah' : 1,
            'dodo' : 2
        },
        3
    ],
    'world' : [
        {
            'foo' : 7,
            'bar' : 1
        },
        8
    ]
}
I want to sort the dictionary dict on the second item in the list, the integer. And then remove the first 'n' keys from the dictionary. Is there any way to do it? The sorted function works only on lists.
Here is the function I'm trying to do this in.
def create_inverted_index(inverted_index, corpus_tokens, corpus_files):
    for file_tokens in corpus_tokens:
        file_id = corpus_files[file_tokens[0]]
        for token in file_tokens[1]:
            if token in inverted_index.keys():
                inverted_index[token][1] += 1
                if file_id in inverted_index[token][0].keys():
                    inverted_index[token][0][file_id] += 1
                else:
                    inverted_index[token][0][file_id] = 1
            else:
                inverted_index[token] = [{file_id : 1}, 1]
You can do it like this:
d = {1: [1, 2], 3: [2,4], 4:[3,3], 2:[4,1], 0:[5,0]} # dict to remove items from
sorted_list = sorted(d.items(), key=lambda x: x[1][1]) # (key, value) pairs ordered by the second item of each value
sorted_keys = [item[0] for item in sorted_list]
n = 2 # number of items to remove
for key in sorted_keys[:n]:
    del d[key]
This code copies the dict's items to a list ordered by the second item of each value. It then builds a list of the keys in that order and deletes the first n of them from the dictionary.
For my value of d and n=3, output is:
{3: [2, 4], 4: [3, 3]}
For n=2:
{1: [1, 2], 3: [2, 4], 4: [3, 3]}
Ps: Might not be most efficient way of doing this, but does the job
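Before Python 3.7, dictionaries had no guaranteed order. Since 3.7 they preserve insertion order, but a dict still cannot be sorted in place; you can build a new dict from the sorted items, or take a look at collections.OrderedDict.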
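Applied to the shape of dictionary in the question, the sort-then-drop idea might look like the sketch below. It assumes Python 3.7 or later (so the resulting dict keeps the sorted order) and that "first n" means the n entries with the smallest integer; pass reverse=True to sorted for the opposite. The function name drop_lowest and the 'again' entry are made up for illustration.

```python
def drop_lowest(index, n):
    """Return a new dict without the n entries whose integer (value[1]) is smallest."""
    # Each value is [inner_dict, integer]; sort the (key, value) pairs by the integer.
    ordered = sorted(index.items(), key=lambda item: item[1][1])
    return dict(ordered[n:])


inverted_index = {
    'hello': [{'blah': 1, 'dodo': 2}, 3],
    'world': [{'foo': 7, 'bar': 1}, 8],
    'again': [{'foo': 2}, 5],  # hypothetical extra entry
}
print(drop_lowest(inverted_index, 1))
# {'again': [{'foo': 2}, 5], 'world': [{'foo': 7, 'bar': 1}, 8]}
```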

Updating a list of python dictionaries with a key, value pair from another list

Let's say I have the following list of python dictionary:
dict1 = [{'domain':'Ratios'},{'domain':'Geometry'}]
and a list like:
list1 = [3, 6]
I'd like to update dict1 or create another list as follows:
dict1 = [{'domain':'Ratios', 'count':3}, {'domain':'Geometry', 'count':6}]
How would I do this?
>>> l1 = [{'domain':'Ratios'},{'domain':'Geometry'}]
>>> l2 = [3, 6]
>>> for d,num in zip(l1,l2):
...     d['count'] = num
...
>>> l1
[{'count': 3, 'domain': 'Ratios'}, {'count': 6, 'domain': 'Geometry'}]
Another way of doing it, this time with a list comprehension which does not mutate the original:
>>> [dict(d, count=n) for d, n in zip(l1, l2)]
[{'count': 3, 'domain': 'Ratios'}, {'count': 6, 'domain': 'Geometry'}]
You could do this:
for i, d in enumerate(dict1):
    d['count'] = list1[i]
You can do this:
# list index
l_index=0
# iterate over all dictionary objects in dict1 list
for d in dict1:
    # add a field "count" to each dictionary object with
    # the appropriate value from the list
    d["count"]=list1[l_index]
    # increase list index by one
    l_index+=1
This solution doesn't create a new list. Instead, it updates the existing dict1 list.
A list comprehension can also do it, although it is used here only for its side effects and the list of None values it builds is discarded:
[data.update({'count': list1[index]}) for index, data in enumerate(dict1)]
The dictionaries in dict1 will be updated with the corresponding values from list1.
