Find dict matching another dict - python

I have an unbounded list of elements (they are rows in a document DB), each of which is a dict, like:
list = [
    {a:''},
    {a:'', b:''},
    {a:'',b:'',c:''}
]
And I have an input element, a dict with an arbitrary number of keys, like:
{a:'', c:''}
I need a function that finds the index of the element sharing the most keys with the input.
In this case it would be list[2], because it contains both {a:''} and {c:''}.
Could you help me or point me toward how to do this?

You can use the builtin max function and provide a matching key:
# The input to search over
l = [{'a':''}, {'a':'', 'b':''}, {'a':'','b':'','c':''}]
# Extract the keys we'd like to search for
t = set({'a': '', 'c': ''}.keys())
# Find the item in l that shares maximum number of keys with the requested item
match = max(l, key=lambda item: len(t & set(item.keys())))
To extract the index in one pass:
max_index = max(enumerate(l), key=lambda item: len(t & set(item[1].keys())))[0]
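The same idea can be wrapped in a small helper. This is only a sketch: the name best_match_index and the tie-breaking rule (earliest candidate wins) are assumptions, not part of the answer above.

```python
def best_match_index(candidates, query):
    """Return the index of the dict sharing the most keys with `query`.

    Ties go to the earliest candidate; returns None for an empty list.
    """
    if not candidates:
        return None
    wanted = query.keys()  # dict key views support set operations
    # max() returns the first maximal item, so earlier indexes win ties.
    return max(range(len(candidates)),
               key=lambda i: len(wanted & candidates[i].keys()))


rows = [{'a': ''}, {'a': '', 'b': ''}, {'a': '', 'b': '', 'c': ''}]
print(best_match_index(rows, {'a': '', 'c': ''}))  # 2
```

Using the key views directly avoids building a new set for every candidate.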

>>> lst = [{'a':'a'},{'a':'a','b':'b'},{'a':'a','b':'b','c':'c'}]
>>> seen = {}
>>> [seen.update({key:value}) for dct in lst for key,value in dict(dct).items() if key not in seen.keys()]
>>> seen
Output
{'a': 'a', 'c': 'c', 'b': 'b'}

Related

how to find a match in a dictionary

I have two dictionaries, RFQDict and AwardsDict. I want to take the keys of RFQdict and search through AwardsDict values for matches.
So I tried this
RFQDict = {}
AwardsDict = {}
# Fetch RFQ
RFQref = db.reference('TestRFQ')
snapshot = RFQref.get()
for key, val in snapshot.items():
    RFQDict[key] = val
    print('{0} => {1}'.format(key, val))
Awardsref = db.reference('DibbsAwards')
dsnapshot = Awardsref.get()
for key, val in dsnapshot.items():
    AwardsDict[key] = val
    print('{0} => {1}'.format(key, val))
for key in RFQDict:
    if key in AwardsDict.values():
        print(key+ " Match found")
Is this the way to do it, or is there a better way? And how could I return the keys and values where the match was found?
In Python 3 you can do AwardsDict.values() & RFQDict.keys() and you will get a set of the RFQDict keys that also appear as values in AwardsDict.
The '&' operator is used for set intersection and works with the dictionary views returned by values() and keys(). More information of the view returned by those methods: https://docs.python.org/3/library/stdtypes.html?highlight=dictview#dictionary-view-objects
If you want to store the keys and values that match, it is probably best to store the key and value from the second dictionary. If you store just the matching key and value, you get elements like (a, a), which tell you little about where the match occurred in the second dictionary. Maybe something like this:
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'x': 'a', 'y': 1, 'z': 'c'}
res = [(i, {j: d2[j]}) for i in d1 for j in d2 if i == d2[j]]
print(res)
# [('a', {'x': 'a'}), ('c', {'z': 'c'})]
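The nested comprehension compares every key of d1 with every value of d2. For larger dictionaries, one possible alternative is to invert the second dictionary once and then look keys up directly. The sketch below assumes the values are hashable; find_matches is a hypothetical helper name.

```python
from collections import defaultdict


def find_matches(rfq, awards):
    """Map each key of `rfq` to the keys of `awards` whose value equals it."""
    # Invert `awards` once: value -> list of keys holding that value.
    # Assumes the values in `awards` are hashable.
    by_value = defaultdict(list)
    for award_key, award_value in awards.items():
        by_value[award_value].append(award_key)
    return {key: by_value[key] for key in rfq if key in by_value}


d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'x': 'a', 'y': 1, 'z': 'c', 'w': 'a'}
print(find_matches(d1, d2))  # {'a': ['x', 'w'], 'c': ['z']}
```

Each result entry records where in the second dictionary the match occurred, including repeated matches.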
I would do a list comprehension:
result=[x for x in AwardsDict.values() if x in RFQDict.keys() ]
This way you get a list that keeps the duplicates, that is, the cases where an RFQ key appears in more than one value of AwardsDict. With the & operator you lose that information (as sets only have unique elements).
For example:
RFQDict = {}
AwardsDict = {}
for i in range(5):
    RFQDict[i]=0
for i in range(5):
    AwardsDict[i]=i
for i in range(5,11):
    AwardsDict[i]=i//2 #integer division, i=8 and i=9 get a value of 4
result=[x for x in AwardsDict.values() if x in RFQDict.keys() ]
print('{}'.format(result))
#output [0, 1, 2, 3, 4, 2, 3, 3, 4, 4]

Divide alphanumeric word from a list and store as a key value pair of a dict

I have a list of alphanumeric data,
my_list = ["A1B2244", "B3H7654", "A1O6541", "J4777"]
I need to divide each word in dict form like
{"A1": ["B2244", "O6541"], "B3": ["H7654"], "J4": ["777"]}
Could you please let me know the easiest way to do this in python.
You can use itertools.groupby to group the elements of your list based on your condition (the first two characters). Then supply the result to the dict constructor:
>>> from itertools import groupby
>>> dict([(k,list(g)) for k,g in groupby(sorted(my_list),key=lambda x: x[:2])])
{'A1': ['A1B2244', 'A1O6541'], 'B3': ['B3H7654'], 'J4': ['J4777']}
list = ['A1B2244', 'B3H7654', 'A1O6541', 'J4777']
#first initialize lists based 2 first elements
d= {i[:2]:[] for i in list}
#loop to add items by key
[d.get(i[:2]).append(i[2:]) for i in list]
print(d)
output:
{'A1': ['B2244', 'O6541'], 'J4': ['777'], 'B3': ['H7654']}
my_list = ['A1B2244', 'B3H7654', 'A1O6541', 'J4777']
my_dict={i[:2]:i[2:] for i in my_list}
Edit:
Sorry, I didn't notice the repeated keys in your output. Others have short solutions, but a plain Pythonic way is:
my_list = ['A1B2244', 'B3H7654', 'A1O6541', 'J4777']
my_dict={}
for i in my_list:
    if i[:2] in my_dict:
        my_dict[i[:2]].append(i[2:])
    else:
        my_dict[i[:2]]=[i[2:]]
Just to add to the first answer by Willian Vieira, I thought it would be helpful to know the output print(d) right after d= {i[:2]:[] for i in list}, which is:
{'A1': [], 'B3': [], 'J4': []}
This clarifies the first line of the two-line solution: the keys are made from the first two characters of each element of the list (without duplicates), and the values are initialized as empty lists.
As per your comment to the question, rule to split is: split after the first digit. So, you can search for the index of the first digit, split and add to the dict. I ignored an input with no digits.
def first_index_of_digit(st):
    for i in range(len(st)):
        if st[i].isdigit():
            return i
    return -1
my_list = ["A1B2244", "B3H7654", "A1O6541", "J4777"]
dd = dict()
for item in my_list:
    i = first_index_of_digit(item)
    if (i == -1):
        continue
    k, v = item[:i+1], item[i+1:]
    if (dd.get(k, 0) == 0):
        dd[k] = list()
    dd[k].append(v)
print(dd)
# {'A1': ['B2244', 'O6541'], 'B3': ['H7654'], 'J4': ['777']}
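The "split after the first digit" rule can also be sketched with a regular expression and a defaultdict. The helper name group_by_prefix is an assumption; like the answer above, it skips items that contain no digit.

```python
import re
from collections import defaultdict


def group_by_prefix(items):
    """Group strings by their prefix up to and including the first digit."""
    groups = defaultdict(list)
    for item in items:
        # Non-digits followed by one digit form the key; the rest is the value.
        match = re.match(r'(\D*\d)(.*)', item, re.DOTALL)
        if match is None:
            continue  # no digit in the item, so there is nowhere to split
        prefix, rest = match.groups()
        groups[prefix].append(rest)
    return dict(groups)


my_list = ["A1B2244", "B3H7654", "A1O6541", "J4777"]
print(group_by_prefix(my_list))
# {'A1': ['B2244', 'O6541'], 'B3': ['H7654'], 'J4': ['777']}
```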

Mapping list elements into their positions (Python)

I have a list of URLs. (We can assume that a given URL appears in the list no more than once.)
I need a fast way to determine which of two URLs comes first in the list.
I think I should create a dict from each URL to its position in the list.
What is an easy way (without writing a for loop that manually increments a counter) to map the elements of a list to their positions in the list?
The best thing I came up with is:
order = {}
i = 0
for item in list:
    order[item] = i
    i += 1
Now to check if url1 is before url2, I check order[url1] < order[url2].
Can this code be shortened?
This creates your order
order = {k: v for v, k in enumerate(list)}
Example:
L = list('abc')
Your version:
order1 = {}
i = 0
for item in L:
    order1[item] = i
    i += 1
print(order1)
My version:
order2 = {k: v for v, k in enumerate(L)}
print(order2)
Output:
{'a': 0, 'b': 1, 'c': 2}
{'a': 0, 'b': 1, 'c': 2}
It is better not to use list as a variable name, because it shadows a built-in.
enumerate provides an iterator that gives you the index and the value on each iteration through your list.
If you want to know which comes first for a specific pair of items, you can use the index method on the list:
a = ['cat', 'dog', 'fish']
a.index('cat') < a.index('dog') # True
a.index('fish') < a.index('dog') # False
List of URLs:
urls = ['A', 'B', 'C', 'D']
List of indices:
index = range(len(urls))
Create the dict:
order = dict(zip(urls, index))
Test:
print(order['A'] < order['B']) # True
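Putting the pieces together, the lookup could be packaged as below. This is a sketch: make_order and comes_before are hypothetical names, and it assumes every URL is unique and present in the list.

```python
def make_order(urls):
    """Map each URL to its position; assumes URLs are unique and hashable."""
    return {url: position for position, url in enumerate(urls)}


def comes_before(order, first, second):
    """Return True if `first` appears earlier in the list than `second`."""
    # Raises KeyError if either URL was not in the original list.
    return order[first] < order[second]


urls = ['A', 'B', 'C', 'D']
order = make_order(urls)
print(comes_before(order, 'A', 'B'))  # True
print(comes_before(order, 'D', 'B'))  # False
```

Building the dict costs one pass over the list; each later comparison is then two dictionary lookups, whereas list.index scans the list on every call.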

Inverting a dictionary when some of the original values are identical

Say I have a dictionary called word_counter_dictionary that counts how many words are in the document in the form {'word' : number}. For example, the word "secondly" appears one time, so the key/value pair would be {'secondly' : 1}. I want to make an inverted list so that the numbers will become keys and the words will become the values for those keys so I can then graph the top 25 most used words. I saw somewhere where the setdefault() function might come in handy, but regardless I cannot use it because so far in the class I am in we have only covered get().
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    inverted_dictionary[new_key] = word_counter_dictionary.get(new_key, '') + str(key)
inverted_dictionary
So far, using this method above, it works fine until it reaches another word with the same value. For example, the word "saves" also appears once in the document, so Python will add the new key/value pair just fine. BUT it erases the {1 : 'secondly'} with the new pair so that only {1 : 'saves'} is in the dictionary.
So, bottom line, my goal is to get ALL of the words and their respective number of repetitions in this new dictionary called inverted_dictionary.
A defaultdict is perfect for this
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
from collections import defaultdict
d = defaultdict(list)
for key, value in word_counter_dictionary.items():
    d[value].append(key)
print(d)
Output:
defaultdict(<class 'list'>, {1: ['first'], 2: ['second', 'fourth'], 3: ['third']})
What you can do is turn each value into a list of the words that share the same key:
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(str(key))
    else:
        inverted_dictionary[new_key] = [str(key)]
print(inverted_dictionary)
# {1: ['first'], 2: ['second', 'fourth'], 3: ['third']}
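Since the question rules out setdefault(), the same inversion can be sketched with nothing beyond get(). The function name invert_counts is an assumption made for illustration.

```python
def invert_counts(word_counts):
    """Invert {word: count} into {count: [words...]} using only dict.get()."""
    inverted = {}
    for word in word_counts:
        count = word_counts[word]
        # get() returns the list built so far, or an empty list the first
        # time this count is seen; concatenation then adds the new word.
        inverted[count] = inverted.get(count, []) + [word]
    return inverted


word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}
print(invert_counts(word_counter_dictionary))
# {1: ['first'], 2: ['second', 'fourth'], 3: ['third']}
```

Note that get() is called on the inverted dictionary, not on the original one, which is what keeps earlier words from being overwritten.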
Python dicts do NOT allow repeated keys, so you can't use a simple dictionary to store multiple elements with the same key (1 in your case). For your example, I'd rather have a list as the value of your inverted dictionary, and store in that list the words that share the number of appearances, like:
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(key)
    else:
        inverted_dictionary[new_key] = [key]
In order to get the 25 most repeated words, you should iterate through the (sorted) keys in the inverted_dictionary and store the words:
common_words = []
for key in sorted(inverted_dictionary.keys(), reverse=True):
    if len(common_words) < 25:
        common_words.extend(inverted_dictionary[key])
    else:
        break
common_words = common_words[:25] # In case there are more than 25 words
Here's a version that doesn't "invert" the dictionary:
>>> import operator
>>> A = {'a':10, 'b':843, 'c': 39, 'd': 10}
>>> B = sorted(A.items(), key=operator.itemgetter(1), reverse=True)
>>> B
[('b', 843), ('c', 39), ('a', 10), ('d', 10)]
Instead, it creates a list that is sorted, highest to lowest, by value.
To get the top 25, you simply slice it: B[:25].
And here's one way to get the keys and values separated (after putting them into a list of tuples):
>>> [x[0] for x in B]
['b', 'c', 'a', 'd']
>>> [x[1] for x in B]
[843, 39, 10, 10]
or
>>> C, D = zip(*B)
>>> C
('b', 'c', 'a', 'd')
>>> D
(843, 39, 10, 10)
Note that if you only want to extract the keys or the values (and not both), you should do so earlier. These are just examples of how to handle the list of tuples.
For getting the largest elements of some dataset, an inverted dictionary might not be the best data structure.
Either put the items in a sorted list (the example assumes you want the two most frequent words):
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
counter_word_list = sorted((count, word) for word, count in word_counter_dictionary.items())
Result:
>>> print(counter_word_list[-2:])
[(2, 'second'), (3, 'third')]
Or use Python's included batteries (heapq.nlargest in this case):
import heapq, operator
print(heapq.nlargest(2, word_counter_dictionary.items(), key=operator.itemgetter(1)))
Result:
[('third', 3), ('second', 2)]
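Another standard-library option, not mentioned in the answers above, is collections.Counter, whose most_common method returns the highest counts directly. A minimal sketch, assuming Python 3.7 or later so that ties keep insertion order:

```python
from collections import Counter

word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

# A Counter can be built from an existing {word: count} mapping.
top_two = Counter(word_counter_dictionary).most_common(2)
print(top_two)  # [('third', 3), ('second', 2)]
```

For the original question, most_common(25) would give the 25 most used words with their counts.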

Appending values to dictionary in Python

I have a dictionary in which I want to append a list of numbers to each drug, like this:
append(0), append(1234), append(123), etc.
def make_drug_dictionary(data):
    drug_dictionary={'MORPHINE':[],
                     'OXYCODONE':[],
                     'OXYMORPHONE':[],
                     'METHADONE':[],
                     'BUPRENORPHINE':[],
                     'HYDROMORPHONE':[],
                     'CODEINE':[],
                     'HYDROCODONE':[]}
    prev = None
    for row in data:
        if prev is None or prev==row[11]:
            drug_dictionary.append[row[11][]
    return drug_dictionary
I later want to be able to access the entire set of entries in, for example, 'MORPHINE'.
How do I append a number into the drug_dictionary?
How do I later traverse through each entry?
Just use append:
list1 = [1, 2, 3, 4, 5]
list2 = [123, 234, 456]
d = {'a': [], 'b': []}
d['a'].append(list1)
d['a'].append(list2)
print(d['a'])
You should use append to add to the list. Here are also a few code tips:
I would use dict.setdefault or defaultdict to avoid having to specify the empty list in the dictionary definition.
If you use prev to filter out duplicated values, you can simplify the code using groupby from itertools.
Your code with the amendments looks as follows:
import itertools
def make_drug_dictionary(data):
    drug_dictionary = {}
    for key, row in itertools.groupby(data, lambda x: x[11]):
        drug_dictionary.setdefault(key,[]).append(row[?])
    return drug_dictionary
If you don't know how groupby works just check this example:
>>> list(key for key, val in itertools.groupby('aaabbccddeefaa'))
['a', 'b', 'c', 'd', 'e', 'f', 'a']
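groupby also yields, next to each key, an iterator over the consecutive items that share it. The sketch below uses made-up (name, number) rows, since the real column layout is not shown in the question.

```python
from itertools import groupby

# Hypothetical (name, number) rows; the real column layout is not shown above.
rows = [('MORPHINE', 1), ('MORPHINE', 2), ('CODEINE', 3), ('MORPHINE', 4)]

# Each group is an iterator over consecutive rows with the same key,
# so it has to be consumed (here with a list comprehension) to see the items.
runs = [(name, [number for _, number in group])
        for name, group in groupby(rows, key=lambda row: row[0])]
print(runs)
# [('MORPHINE', [1, 2]), ('CODEINE', [3]), ('MORPHINE', [4])]
```

Because only consecutive items are grouped, 'MORPHINE' shows up twice here; sort the data by the key first if each key should appear once.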
It sounds as if you are trying to set up a list of lists as each value in the dictionary. Your initial value for each drug in the dict is []. So assuming that you have list1 that you want to append to the list for 'MORPHINE', you should do:
drug_dictionary['MORPHINE'].append(list1)
You can then access the various lists in the way that you want as drug_dictionary['MORPHINE'][0] etc.
To traverse the lists stored against key you would do:
for listx in drug_dictionary['MORPHINE']:
    pass  # do stuff on listx
To append entries to the table:
for row in data:
    name = ??? # figure out the name of the drug
    number = ??? # figure out the number you want to append
    drug_dictionary[name].append(number)
To loop through the data:
for name, numbers in drug_dictionary.items():
    print(name, numbers)
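A runnable version of this outline might look as follows. The column positions are assumptions: the question reads the drug name from row[11] but never says where the number lives, so both indexes are parameters here and the sample rows are made up.

```python
from collections import defaultdict


def make_drug_dictionary(data, name_index=0, number_index=1):
    """Collect the numbers for each drug name into a list.

    `name_index` and `number_index` are hypothetical column positions.
    """
    drug_dictionary = defaultdict(list)  # missing names start as []
    for row in data:
        drug_dictionary[row[name_index]].append(row[number_index])
    return drug_dictionary


sample_rows = [('MORPHINE', 0), ('CODEINE', 123), ('MORPHINE', 1234)]
drugs = make_drug_dictionary(sample_rows)
print(drugs['MORPHINE'])  # [0, 1234]
for name, numbers in drugs.items():
    print(name, numbers)
```

With a defaultdict there is no need to list every drug up front; a plain dict with predefined keys works the same way if unknown names should raise a KeyError instead.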
If you want to append to the lists of each key inside a dictionary, you can append new values to them using the += operator (tested in Python 3.7):
mydict = {'a':[], 'b':[]}
print(mydict)
mydict['a'] += [1,3]
mydict['b'] += [4,6]
print(mydict)
mydict['a'] += [2,8]
print(mydict)
and the output:
{'a': [], 'b': []}
{'a': [1, 3], 'b': [4, 6]}
{'a': [1, 3, 2, 8], 'b': [4, 6]}
mydict['a'].extend([1,3]) does the same job as += here: both extend the existing list in place. It is the plain + operator (as in mydict['a'] = mydict['a'] + [1,3]) that creates a new list.
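Whether a new list gets created can be checked with a second reference to the same list. A small sketch (the variable names are illustrative only):

```python
numbers = [1, 3]
alias = numbers           # second name for the same list object

numbers += [2, 8]         # in-place: behaves like numbers.extend([2, 8])
print(alias)              # [1, 3, 2, 8]
print(alias is numbers)   # True

numbers = numbers + [5]   # plain + builds a new list and rebinds the name
print(alias is numbers)   # False
print(alias)              # [1, 3, 2, 8]
```

So for lists, += and extend both modify the existing object, while + followed by assignment leaves the original list untouched.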
You can use the update() method as well. Note that it adds or replaces whole key/value pairs rather than appending to an existing list:
d = {"a": 2}
d.update({"b": 4})
print(d) # {'a': 2, 'b': 4}
how do i append a number into the drug_dictionary?
Do you wish to add "a number" or a set of values?
I use dictionaries to build associative arrays and lookup tables quite a bit.
Since python is so good at handling strings,
I often use a string and add the values into a dict as a comma separated string
drug_dictionary = {}
drug_dictionary={'MORPHINE':'',
'OXYCODONE':'',
'OXYMORPHONE':'',
'METHADONE':'',
'BUPRENORPHINE':'',
'HYDROMORPHONE':'',
'CODEINE':'',
'HYDROCODONE':''}
drug_to_update = 'MORPHINE'
try:
    oldvalue = drug_dictionary[drug_to_update]
except KeyError:
    oldvalue = ''
# to increment a value
try:
    newval = int(oldvalue)
    newval += 1
except ValueError:
    newval = 1
drug_dictionary[drug_to_update] = "%s" % newval
# to append a value
try:
    newval = int(oldvalue)
    newval += 1
except ValueError:
    newval = 1
drug_dictionary[drug_to_update] = "%s,%s" % (oldvalue, newval)
The append approach allows storing a list of values but leaves you with a trailing comma,
which you can remove with
drug_dictionary[drug_to_update][:-1]
Appending the values as a string means that you can append lists of values as you need to, and
print("'%s':'%s'" % (drug_to_update, drug_dictionary[drug_to_update]))
can return
'MORPHINE':'10,5,7,42,12,'
vowels = ("a","e","i","o","u") # create a tuple of vowels
my_str = "this is my dog and a cat" # sample string to get the vowel count
count = dict.fromkeys(vowels, 0) # create a dict with each vowel's count initialized to 0
for char in my_str:
    if char in count:
        count[char] += 1
print(count)
