Remove repeat values from a dictionary 'a' in a dictionary 'b' python

dictionary = {'0': "Linda", "1": "Anna", "2": 'Theda', "3":'Thelma',"4": 'Thursa',"5" :"Mary"}
dictionary2 = ['Linda', 'Ula', 'Vannie', 'Vertie', 'Mary']
I want to remove the values that dictionary and dictionary2 have in common. I wrote the code like this:
[k for k, v in dictionary.items() if v not in dictionary2]
But it still prints the key of a repeated word, like this:
['0', '1', '2', '3', '4']
How can I remove all the repeated words, so that it prints this instead?
['1', '2', '3', '4']
How can I get just the last loop? Thank you

To get the keys whose values aren't contained in the other list, you can use the following list comprehension:
>>> [k for k,v in dictionary.items() if v not in dictionary2]
['2', '4', '3', '1']

Just do the following list comprehension:
>>> dictionary = {'0': "Linda", "1": "Anna", "2":"Mary"}
>>> dictionary2 = ['Linda', 'Theda', 'Thelma', 'Thursa', 'Ula', 'Vannie', 'Vertie', 'Mary']
>>> value = [i for i in dictionary2 if i not in dictionary.values()]
>>> value
['Theda', 'Thelma', 'Thursa', 'Ula', 'Vannie', 'Vertie']
>>>

This works:
dictionary = {'0': "Linda", "1": "Anna", "2":"Mary"}
dictionary2 = ['Linda', 'Theda', 'Thelma', 'Thursa', 'Ula', 'Vannie', 'Vertie', 'Mary']
# I want to remove the same values from dictionary to dictionary2, I wrote the code like this:
output = [i for i in dictionary2 if i not in dictionary.values()]
Result:
['Theda', 'Thelma', 'Thursa', 'Ula', 'Vannie', 'Vertie']

You could convert dictionary2 to a set, and use this syntax:
>>> dictionary = {'0': "Linda", "1": "Anna", "2": 'Theda', "3":'Thelma',"4": 'Thursa',"5" :"Mary"}
>>> dictionary2 = ['Linda', 'Ula', 'Vannie', 'Vertie', 'Mary']
>>> remove_names = set(dictionary2)
>>> [name for name in dictionary.values() if name not in remove_names]
['Anna', 'Theda', 'Thelma', 'Thursa']
>>> [id for id, name in dictionary.items() if name not in remove_names]
['1', '2', '3', '4']
Note that dictionary2 isn't a dict but a list.
Also, before Python 3.7, dicts did not guarantee any ordering, so on older versions you cannot be sure that the result will always be 1,2,3,4.
Finally, if all your dict keys are integers (or look like integers), you'd be better off using a list:
>>> names = ["Linda","Anna",'Theda','Thelma','Thursa',"Mary"]
>>> remove_names = set(['Linda', 'Ula', 'Vannie', 'Vertie', 'Mary'])
>>> list(enumerate(names))
[(0, 'Linda'), (1, 'Anna'), (2, 'Theda'), (3, 'Thelma'), (4, 'Thursa'), (5, 'Mary')]
>>> [name for name in names if name not in remove_names]
['Anna', 'Theda', 'Thelma', 'Thursa']
>>> [id for id, name in enumerate(names) if name not in remove_names]
[1, 2, 3, 4]
This should be more readable and faster than your code.
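If the goal is to drop the matching entries from the dict itself, rather than only list the surviving keys, a dict comprehension with the same set lookup is one option. This is a minimal sketch, not part of the original answer; the name `filtered` is introduced here for illustration.

```python
dictionary = {'0': "Linda", "1": "Anna", "2": 'Theda', "3": 'Thelma', "4": 'Thursa', "5": "Mary"}
dictionary2 = ['Linda', 'Ula', 'Vannie', 'Vertie', 'Mary']

# A set gives constant-time membership tests on average.
remove_names = set(dictionary2)

# Keep only the entries whose value does not appear in dictionary2.
# `filtered` is a hypothetical name used for this sketch.
filtered = {k: v for k, v in dictionary.items() if v not in remove_names}

print(filtered)
```

The original `dictionary` is left untouched, so both the full and the filtered mapping remain available.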

dictionary = {'0': "Linda", "1": "Anna", "2":"Mary"}
dictionary2 = ['Linda', 'Theda', 'Thelma', 'Thursa', 'Ula','Vannie','Vertie', 'Mary']
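# Walk the indices backwards so that deleting an item does not shift
# the positions of the items still to be checked.
for i in range(len(dictionary2) - 1, -1, -1):
    if dictionary2[i] in dictionary.values():
        del dictionary2[i]
print(dictionary2)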
Prints this:
['Theda', 'Thelma', 'Thursa', 'Ula', 'Vannie', 'Vertie']

Related

Grouping similar values in a dictionary

I'm new to programming and would appreciate if someone can help with the following in Python/Pandas.
I have a dictionary that has a list as the values. I'd like to be able to group together keys that have similar values. I've seen similar questions on here, but the catch in this case is that I want to disregard the order of the values. For example:
classmates={'jack':['20','male','soccer'],'brian':['26','male','tennis'],'charles':['male','soccer','20'],'zulu':['19','basketball','male']}
jack and charles have the same values but in different order. I'd like an output that will give the value irrespective of order. In this case, the output would be written to a csv as
['20','male','soccer']: jack, charles
['26','male','tennis']: brian
['19','basketball','male']: zulu
Using frozensets, apply, groupby + agg:
import pandas as pd
s = pd.DataFrame(classmates).T.apply(frozenset, 1)
s2 = pd.Series(s.index.values, index=s)\
    .groupby(level=0).agg(lambda x: list(x))
s2
(soccer, 20, male) [charles, jack]
(26, male, tennis) [brian]
(basketball, male, 19) [zulu]
dtype: object
You can invert the dictionary in the way you want with the following code:
classmates={'jack':['20','male','soccer'],'brian':['26','male','tennis'],'charles':['male','soccer','20'],'zulu':['19','basketball','male']}
out_dict = {}
for key, value in classmates.items():
    current_list = out_dict.get(tuple(sorted(value)), [])
    current_list.append(key)
    out_dict[tuple(sorted(value))] = current_list
print(out_dict)
This prints
{('20', 'male', 'soccer'): ['charles', 'jack'], ('26', 'male', 'tennis'): ['brian'], ('19', 'basketball', 'male'): ['zulu']}
from collections import defaultdict
ans = defaultdict(list)
classmates={'jack':['20','male','soccer'],
            'brian':['26','male','tennis'],
            'charles':['male','soccer','20'],
            'zulu':['19','basketball','male']
            }
for k, v in classmates.items():
    sorted_tuple = tuple(sorted(v))
    ans[sorted_tuple].append(k)
# ans is the dict you wanted:
# defaultdict(<class 'list'>, {('20', 'male', 'soccer'): ['jack','charles'],
# ('26', 'male', 'tennis'): ['brian'], ('19', 'basketball', 'male'): ['zulu']})
for k, v in ans.items():
    print(k, ':', v)
# output:
# ('20', 'male', 'soccer') : ['jack', 'charles']
# ('26', 'male', 'tennis') : ['brian']
# ('19', 'basketball', 'male') : ['zulu']
First of all convert your dictionary to a pandas dataframe.
df= pd.DataFrame.from_dict(classmates,orient='index')
Then sort it in ascending order by age.
df=df.sort_values(by=0,ascending=True)
Here 0 is a default column name. You can rename this column name.
You could do this in one line:
print({tuple(sorted(v)): [k for k, vv in classmates.items() if sorted(vv) == sorted(v)] for v in classmates.values()})
Or, here is a more detailed solution:
dict_1 = {'jack': ['20', 'male', 'soccer'], 'brian': ['26', 'male', 'tennis'], 'charles': ['male', 'soccer', '20'],
          'zulu': ['19', 'basketball', 'male']}
sorted_dict = {}
for key, value in dict_1.items():
    sorted_1 = sorted(value)
    sorted_dict[key] = sorted_1
tracking_of_duplicate = []
final_dict = {}
for key1, value1 in sorted_dict.items():
    if value1 not in tracking_of_duplicate:
        tracking_of_duplicate.append(value1)
        final_dict[tuple(value1)] = [key1]
    else:
        final_dict[tuple(value1)].append(key1)
print(final_dict)
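The question also asks for the groups to be written to a CSV, which the answers above do not show. A minimal sketch using the standard csv module follows. The column layout (shared values joined with '|' in the first column, names joined with ', ' in the second), the header row, and the file name `groups.csv` mentioned in the comment are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

classmates = {'jack': ['20', 'male', 'soccer'], 'brian': ['26', 'male', 'tennis'],
              'charles': ['male', 'soccer', '20'], 'zulu': ['19', 'basketball', 'male']}

# Group names by their values, ignoring the order of the values.
groups = defaultdict(list)
for name, values in classmates.items():
    groups[tuple(sorted(values))].append(name)

# An in-memory buffer keeps the sketch self-contained; to write a real file,
# use open('groups.csv', 'w', newline='') instead (hypothetical file name).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['values', 'names'])  # assumed header row
for values, names in groups.items():
    writer.writerow(['|'.join(values), ', '.join(names)])

csv_text = buffer.getvalue()
print(csv_text)
```

The csv module quotes the names field whenever it contains a comma, so a group with several names still occupies a single column.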

How to create a sentence from a dictionary

I am trying to make some code where the user inputs a sentence, the sentence is turned into a dict and then the dict is used to get the original sentence back.
Code:
import json
def code():
    sentence = input("Please write a sentence: ")
    dictionary = {v: k for k,v in enumerate(sentence.split(), start=1)}
    with open('Dict.txt', 'w') as fp:
        json.dump(dictionary, fp)
    print(dictionary)
    puncList = ["{","}",",",":","'","[","]","1","2","3","4","5"]
    for i in puncList:
        for sentence in dictionary:
            dictionary=[sentence.replace(i," ") for sentence in dictionary]
    print(' '.join(dictionary))
code()
Input:
Hello my name is Bob
Actual output:
{'Hello' : '1', 'name' : '3', 'Bob' : '5', 'my' : '2', 'is' : '4'}
Hello name Bob my is
Desired output:
{'Hello' : '1', 'name' : '3', 'Bob' : '5', 'my' : '2', 'is' : '4'}
Hello my name is Bob
This would be fine too:
{'Hello' : '1', 'my' : '2', 'name' : '3', 'is' : '4', 'Bob' : '5'}
Hello my name is Bob
For the part where I recreate the original sentence, it can't just print the sentence; it has to be rebuilt from the dict.
You need to either use OrderedDict to retain the element order, or sort the dictionary elements before you print them out. You've already got an OrderedDict answer, so here's how to use the dict you created:
print(' '.join(k for (k, v) in sorted(dictionary.items(), key=lambda x: x[1])))
Incidentally, your approach has a bug: If you apply it to a sentence with repeated words, e.g., "boys will be boys", you'll find that there's no element with index 1 in your dictionary since (boys, 4) will overwrite (boys, 1).
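The repeated-word problem described above can be reproduced in a few lines. This is a small illustrative sketch that reuses the dictionary construction from the question; the name `rebuilt` is introduced here and is not in the original post.

```python
sentence = "boys will be boys"

# Same construction as in the question: word -> position, starting at 1.
# The second "boys" overwrites the value stored for the first one.
dictionary = {v: k for k, v in enumerate(sentence.split(), start=1)}
print(dictionary)

# Rebuilding from the stored positions: position 1 no longer exists,
# so the first word of the sentence is lost.
rebuilt = ' '.join(k for k, v in sorted(dictionary.items(), key=lambda x: x[1]))
print(rebuilt)
```

Because each word can map to only one position, the mapping word -> position cannot represent a sentence in which a word occurs twice.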
Use an OrderedDict on enumerate, like so:
from collections import OrderedDict
s = "Hello my name is Bob"
d = OrderedDict((v, i) for i, v in enumerate(s.split(), 1))
print(d)
# OrderedDict([('Hello', 1), ('my', 2), ('name', 3), ('is', 4), ('Bob', 5)])
s_rebuild = ' '.join(d)
print(s_rebuild)
# 'Hello my name is Bob'
Since the dictionary is already ordered, the values are not used for rebuilding the string.
Your logic is flawed in that it can't handle sentences with repeated words:
Hello Bob my name is Bob too
{'name': 4, 'Hello': 1, 'Bob': 6, 'is': 5, 'too': 7, 'my': 3}
name Hello Bob is too my
We can deal with that using a defaultdict, making the values arrays of word positions instead of individual numbers. We can further improve things by dealing with your punch list up front via a split. Finally, we can reconstruct the original sentence using a pair of nested loops. We don't want/need an OrderedDict, or sorting, to do this:
import re
import json
from collections import defaultdict
PUNCH_LIST = r"[ {},:'[\]1-5]+"
def code():
    dictionary = defaultdict(list)
    sentence = input("Please write a sentence: ")
    for position, word in enumerate(re.split(PUNCH_LIST, sentence), start=1):
        dictionary[word].append(position)
    with open('Dict.txt', 'w') as fp:
        json.dump(dictionary, fp)
    print(dictionary)
    position = 1
    sentence = []
    while position:
        for word, positions in dictionary.items():
            if position in positions:
                sentence.append(word)
                position += 1
                break
        else:
            position = 0
    print(' '.join(sentence))
code()
EXAMPLE:
Please write a sentence: Hello Bob, my name is Bob too
defaultdict(<class 'list'>, {'is': [5], 'too': [7], 'Bob': [2, 6], 'Hello': [1], 'name': [4], 'my': [3]})
Hello Bob my name is Bob too
Where Dict.txt contains:
{"is": [5], "too": [7], "Bob": [2, 6], "Hello": [1], "name": [4], "my": [3]}
Note that the defaultdict is a convenience, not a requirement. A plain dictionary will do, but you'll have to initialize the lists for each key.
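As a rough illustration of the plain-dictionary variant mentioned above, `dict.setdefault` can create the list the first time a word is seen. This sketch also swaps the nested loops for an inverted position -> word lookup, which is a different reconstruction technique from the one in the answer; the names `by_position` and `rebuilt` are hypothetical, and the sentence is hard-coded instead of read with input().

```python
sentence = "Hello Bob my name is Bob too"

# Plain dict: setdefault inserts an empty list the first time a word appears.
dictionary = {}
for position, word in enumerate(sentence.split(), start=1):
    dictionary.setdefault(word, []).append(position)

# Invert the mapping to position -> word, then read the words back
# in ascending position order.
by_position = {p: word for word, positions in dictionary.items() for p in positions}
rebuilt = ' '.join(by_position[p] for p in sorted(by_position))

print(dictionary)
print(rebuilt)
```

The inverted lookup visits each stored position once, whereas the nested loops rescan the whole dictionary for every position.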

Grouping two lists in python

I have two lists which I want to group on the basis of the first element of the lists.
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
Here the first elements in the list inside the list are '1' , '2' and '3'.
I want my final list to be like :-
Final_List = [['1', 'abc', 'zef', 'rofl', 'pole'], ['3', 'lol', 'pop', 'lmao', 'wtf'], ['2', 'qwerty', 'opo', 'sole', 'pop']]
I have tried this using below code.
#!/usr/bin/python
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
d = {}
for i in list1:
    d[i[0]] = i[1:]
for i in list2:
    d[i[0]].extend(i[1:])
Final_List = []
for key, value in d.iteritems():
    value.insert(0,key)
    Final_List.append(value)
This code works, but I was wondering if there is an easier and cleaner way to do it.
Any help?
I would have written it much as you did, with a small modification.
First, prepare a dictionary that gathers, for each first element, all the remaining elements of the matching sublists.
d = {}
for items in (list1, list2):
    for item in items:
        d.setdefault(item[0], [item[0]]).extend(item[1:])
And then just get all the values from the dictionary (Thanks @jamylak) :-)
print(d.values())
Output
[['3', 'lol', 'pop', 'lmao', 'wtf'],
['1', 'abc', 'zef', 'rofl', 'pole'],
['2', 'qwerty', 'opo', 'sole', 'pop']]
If the order of the items inside each list of Final_List is not important, then this can be used:
[list(set(sum(itm, []))) for itm in zip(list1, list2)]
Your code seems correct. Just modify the following portion:
Final_List = []
for key in d:
    L = [key] + [x for x in d[key]]
    Final_List.append(L)
Yes, with a list comprehension and enumerate (this assumes both lists are in the same order):
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
print [set(v + list2[k]) for k,v in enumerate(list1)]
This groups the items as follows (the result actually holds sets, so the order within each group is not guaranteed):
[['1', 'abc', 'zef', 'rofl', 'pole'], ['2', 'qwerty', 'opo', 'sole', 'pop'], ['3', 'lol', 'pop', 'lmao', 'wtf']]
EDIT
With index relation
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['3','lmao','wtf'],['2','sole','pop']]
d1 = {a[0]:a for a in list1}
d2 = {a[0]:a for a in list2}
print [set(v + d2[k]) for k, v in d1.items()]
Using defaultdict and a list comprehension, you can shorten your code:
from collections import defaultdict
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
d = defaultdict(list)
for i in list1 + list2:
    d[i[0]].extend(i[1:])
Final_List = [[key] + value for key, value in d.iteritems()]
print Final_List
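list3 = []
# iterate over the sublists (not over the length of the first sublist)
for i in xrange(min(len(list1), len(list2))):
    list3.append(list(list1[i]))
    list3[i].extend(x for x in list2[i] if x not in list3[i])
With xrange, you iterate through the lists only once. This assumes both lists are in the same order.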
A bit of functional style:
import operator, itertools
from pprint import pprint
one = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
two = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
A few helpers:
zero = operator.itemgetter(0)
all_but_the_first = operator.itemgetter(slice(1, None))
data = (one, two)
def foo(group):
    # group is (key, iterator) from itertools.groupby
    key = group[0]
    lists = group[1]
    # start with the key itself; list(key) would split a multi-character key
    result = [key]
    for item in lists:
        result.extend(all_but_the_first(item))
    return result
Function to process the data
def process(data, func = foo):
    # concatenate all the sublists
    new = itertools.chain(*data)
    # group by item zero
    three = sorted(new, key = zero)
    groups = itertools.groupby(three, zero)
    # iterator that builds the new lists
    return itertools.imap(func, groups)
Usage
>>> pprint(list(process(data)))
[['1', 'abc', 'zef', 'rofl', 'pole'],
['2', 'qwerty', 'opo', 'sole', 'pop'],
['3', 'lol', 'pop', 'lmao', 'wtf']]
>>>
>>> for thing in process(data):
...     print thing
...
['1', 'abc', 'zef', 'rofl', 'pole']
['2', 'qwerty', 'opo', 'sole', 'pop']
['3', 'lol', 'pop', 'lmao', 'wtf']
>>>
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
Final_List = []
for i in range(0, len(list1)):
    Final_List.append(list1[i] + list2[i])
    del Final_List[i][3]
print Final_List
Output
[['1', 'abc', 'zef', 'rofl', 'pole'], ['2', 'qwerty', 'opo', 'sole', 'pop'], ['3', 'lol', 'pop', 'lmao', 'wtf']]
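The answers in this thread are written for Python 2 (print statement, iteritems, xrange, itertools.imap). As a rough sketch, the same dictionary-based grouping could look like this in Python 3. It assumes Python 3.7 or later, where a dict keeps keys in first-seen order; the name `grouped` is introduced here for illustration, and list2 is deliberately given in a different order to show that the match is by key, not by index.

```python
list1 = [['1', 'abc', 'zef'], ['2', 'qwerty', 'opo'], ['3', 'lol', 'pop']]
list2 = [['1', 'rofl', 'pole'], ['3', 'lmao', 'wtf'], ['2', 'sole', 'pop']]

# Map each first element to a list that starts with that element,
# then append the remaining items of every sublist sharing the key.
grouped = {}
for key, *rest in list1 + list2:
    grouped.setdefault(key, [key]).extend(rest)

Final_List = list(grouped.values())
print(Final_List)
```

Unlike the index-based answers, this does not require the two lists to be in the same order or to have the same length.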

How can i change the values in dictionary of lists in python

I have a dictionary like this
dict1 = [('var1','aa'),('var2','bb'),('var3','cc')]
I have another dictionary
dict2 = [('var2','22'),('var3','33'),('var5','23'),('var6','33'),('var7','23')]
What I want to do is replace the values in dict2 with the values from dict1 wherever the keys match.
I mean so that the final dict3 = dict2 = [('var2','bb'),('var3','cc'),('var5','23'),('var6','33'),('var7','23')]
>>> list1 = [('var1','aa'),('var2','bb'),('var3','cc')]
>>> list2 = [('var2','22'),('var3','33'),('var5','23'),('var6','33'),('var7','23')]
>>> dict1 = dict(list1)
>>> list2 = [(k, dict1.get(k, v)) for k, v in list2]
>>> list2
[('var2', 'bb'), ('var3', 'cc'), ('var5', '23'), ('var6', '33'), ('var7', '23')]
This will maintain order from the original list2
Those are not dictionaries. Here is how to do it if they were:
>>> dict1 = {'var1':'aa', 'var2':'bb' ,'var3':'cc'}
>>> dict2 = {'var2':'22', 'var3':'33', 'var5':'23', 'var6':'33', 'var7':'23'}
>>> for key in dict2:
...     if key in dict1:
...         dict2[key] = dict1[key]
...
>>> dict2
{'var5': '23', 'var7': '23', 'var6': '33', 'var3': 'cc', 'var2': 'bb'}
You probably want to start out with dict1 and dict2 actually being dictionaries, as in my example, but note that you can easily convert them using e.g. dict1 = dict(dict1).
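Putting the conversion and the update together, one possible sketch starts from the lists of pairs in the question, converts both with dict(), and overwrites only the keys that dict2 already has. Using `dict.update` with a generator of pairs is a choice made for this sketch, not something the answers above use.

```python
list1 = [('var1', 'aa'), ('var2', 'bb'), ('var3', 'cc')]
list2 = [('var2', '22'), ('var3', '33'), ('var5', '23'), ('var6', '33'), ('var7', '23')]

# Lists of (key, value) pairs convert directly to dictionaries.
dict1 = dict(list1)
dict2 = dict(list2)

# Overwrite only the keys that already exist in dict2,
# so 'var1' from dict1 is not added.
dict2.update((k, v) for k, v in dict1.items() if k in dict2)

print(dict2)
```

On Python 3.7 and later the keys of dict2 stay in their original order, since updating an existing key does not move it.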

Finding tuples with a common element

Suppose I have a set of tuples with people's names. I want to find everyone who shares the same last name, excluding people who don't share their last name with anyone else:
# input
names = set([('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')])
# expected output
{'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']} # or similar
This is what I am using
def find_family(names):
    result = {}
    try:
        while True:
            name = names.pop()
            if name[1] in result:
                result[name[1]].append(name[0])
            else:
                result[name[1]] = [name[0]]
    except KeyError:
        pass
    return dict(filter(lambda x: len(x[1]) > 1, result.items()))
This looks ugly and inefficient. Is there a better way?
defaultdict can be used to simplify the code:
from collections import defaultdict
def find_family(names):
    d = defaultdict(list)
    for fn, ln in names:
        d[ln].append(fn)
    return dict((k,v) for (k,v) in d.items() if len(v)>1)
names = set([('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')])
print find_family(names)
This prints:
{'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']}
Instead of using a while loop, use a for loop (or similar construct) over the set contents (and while you're at it, you can destructure the tuples):
for firstname, surname in names:
    # do your stuff
You might want to use a defaultdict or OrderedDict (http://docs.python.org/library/collections.html) to hold your data in the body of the loop.
>>> names = set([('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
... ('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')])
You can get a dictionary of all the people where the keys are their lastnames easily with a for-loop:
>>> families = {}
>>> for name, lastname in names:
...     families[lastname] = families.get(lastname, []) + [name]
...
>>> families
{'Miller': ['Mary'], 'Smith': ['Bob'], 'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']}
Then, you just need to filter the dictionary with the condition len(names) > 1. This filtering could be done using a "dictionary comprehension":
>>> filtered_families = {lastname: names for lastname, names in families.items() if len(names) > 1}
>>> filtered_families
{'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']}
