How to make a function called unique in Python

Okay, so I have to make a function called unique. This is what it should do:
If the input is: s1 = [{1,2,3,4}, {3,4,5}]
unique(s1) should return: {1,2,5} because 1, 2 and 5 are NOT in both sets.
And if the input is s2 = [{1,2,3,4}, {3,4,5}, {2,6}]
unique(s2) should return: {1,5,6} because those numbers appear in only one set of this collection of 3 sets.
I tried to make something like this:
unique_list = []
for x in s1:
    if x not in unique_list:
        unique_list.append(x)
    else:
        unique_list.remove(x)
print(unique_list)
But the problem with this is that it takes a whole set as "x" and not each element from each set.
Anyone that can help me a bit with this?
I am not allowed to import anything.

Python set objects have a symmetric_difference() method that returns the elements in either of two sets but not in both. You can reduce your list with it to find the elements unique to each set:
from functools import reduce
l = [{1,2,3,4}, {3,4,5}, {2,6}]
reduce(set.symmetric_difference, l)
# {1, 5, 6}
You can, of course, do this without reduce (and without any import) by looping over the list manually; the ^ operator produces the symmetric difference:
l = [{1,2,3,4}, {3,4,5}, {2,6}]
final = set()
for s in l:
    final = final ^ s
print(final)
# {1, 5, 6}
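One caveat worth knowing (my addition, not part of the original answer): with three or more sets, repeated symmetric difference keeps every element that appears in an odd number of sets, not only those that appear exactly once. It happens to give the right answer for the inputs above, but the counting approach in the next answer is the safer match for "appears in exactly one set". A small illustration:
sets_ = [{1, 2}, {1, 3}, {1, 4}]
final = set()
for s in sets_:
    final = final ^ s
print(final)
# {1, 2, 3, 4} -- 1 is in all three sets, yet it survives the symmetric difference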

In [13]: def f(sets):
    ...:     c = {}
    ...:     for s in sets:
    ...:         for x in s:
    ...:             c[x] = c.setdefault(x, 0) + 1
    ...:     return {x for x, v in c.items() if v == 1}
    ...:
In [14]: f([{1,2}, {2, 3}, {3, 4}])
Out[14]: {1, 4}
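Applied to the inputs from the question, this counting version returns the expected sets (continuing the session above):
In [15]: f([{1,2,3,4}, {3,4,5}])
Out[15]: {1, 2, 5}
In [16]: f([{1,2,3,4}, {3,4,5}, {2,6}])
Out[16]: {1, 5, 6}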

Related

Common characters between strings in an array

I am trying to find the common characters between the strings in the array. I am using a hashmap for this purpose, which is defined as Counter. After trying multiple times I am not able to get the correct answer. What am I doing wrong here?
Expected Ans: {(c,1),(o,1)}
What I am getting: {('c', 1)}
My code:
arr = ["cool","lock","cook"]
def Counter(arr):
d ={}
for items in arr:
if items not in d:
d[items] = 0
d[items] += 1
return d
res = Counter(arr[0]).items()
for items in arr:
res &= Counter(items).items()
print(res)
In [29]: from collections import Counter
In [30]: words = ["cool","coccoon","cook"]
In [31]: chars = ''.join(set(''.join(words)))
In [32]: counts = [Counter(w) for w in words]
In [33]: common = {ch: min(wcount[ch] for wcount in counts) for ch in chars}
In [34]: answer = {ch: count for ch, count in common.items() if count}
In [35]: answer
Out[35]: {'c': 1, 'o': 2}
Try using functools.reduce and collections.Counter:
>>> from functools import reduce
>>> from collections import Counter
>>> reduce(lambda x,y: x&y, (Counter(elem) for elem in arr[1:]), Counter(arr[0]))
Counter({'c': 1, 'o': 1})
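As a quick sanity check (my addition, not part of the original answer), & on two Counters keeps only the keys present in both, with the minimum count for each:
>>> from collections import Counter
>>> Counter("cool") & Counter("lock")
Counter({'c': 1, 'o': 1, 'l': 1})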
An approach without any other library could be like this:
arr = ["cool","lock","cook"]
def Counter(obj_str):
    countdict = {x: 0 for x in set(obj_str)}
    for char in obj_str:
        countdict[char] += 1
    return {(k, v) for k, v in countdict.items()}
print(Counter(arr[0]))
This should give you the result formatted as you want it.
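Note that the snippet above only counts the characters of arr[0]. If the goal is the common characters across all the words without any import, a minimal sketch along the same lines (my own addition; char_counts is a hypothetical helper name, and arr is assumed as defined in the question) could be:
arr = ["cool", "lock", "cook"]

def char_counts(word):
    # plain-dict character counter, no imports needed
    counts = {}
    for ch in word:
        counts[ch] = counts.get(ch, 0) + 1
    return counts

common = char_counts(arr[0])
for word in arr[1:]:
    counts = char_counts(word)
    # keep only characters present in both, with the smaller count
    common = {ch: min(n, counts[ch]) for ch, n in common.items() if ch in counts}

print({(ch, n) for ch, n in common.items()})
# {('c', 1), ('o', 1)}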

Unique elements of multiple sets

I have a list of sets like below. I want to write a function to return the elements that only appear once in those sets. The function I wrote kinda works, but I am wondering: is there a better way to handle this problem?
s1 = {1, 2, 3, 4}
s2 = {1, 3, 4}
s3 = {1, 4}
s4 = {3, 4}
s5 = {1, 4, 5}
s = [s1, s2, s3, s4, s5]
from collections import Counter

def unique(s):
    temp = []
    for i in s:
        temp.extend(list(i))
    c = Counter(temp)
    result = set()
    for k, v in c.items():
        if v == 1:
            result.add(k)
    return result
unique(s) # will return {2, 5}
You can use a Counter directly and then get the elements that only appear once.
from collections import Counter
import itertools
c = Counter(itertools.chain.from_iterable(s))
res = {k for k,v in c.items() if v==1}
# {2, 5}
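If you prefer not to import itertools, the same Counter can be built from a generator expression (a small variation on the answer above, not from the original post):
from collections import Counter
c = Counter(x for subset in s for x in subset)
res = {k for k, v in c.items() if v == 1}
# {2, 5}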
I love the Counter-based solution by @abc. But, just in case, here is a pure set-based one:
result = set()
for _ in s:
    result |= s[0] - set.union(*s[1:])
    s = s[-1:] + s[:-1]  # shift the list of sets
# {2, 5}
This solution is about 6 times faster but cannot be written as a one-liner.
set.union(*[i - set.union(*[j for j in s if j != i]) for i in s])
I think the proposed solution is similar to what @Bobby Ocean suggested, but not as compressed.
The idea is to loop over the complete set list "s" and compute, for each target subset "si", its difference with all the other subsets (avoiding itself).
For example, starting with s1 we compute st = s1-s2-s3-s4-s5, and starting with s5 we have st = s5-s1-s2-s3-s4.
The logic is that, thanks to these differences, for each target subset "si" we only keep the elements that are unique to "si" (compared to the other subsets).
Finally, the result is the union of these unique elements.
result = set()
for si in s:            # target subset
    st = si
    for sj in s:        # the other subsets
        if sj != si:    # avoid itself
            st = st - sj  # compute differences
    result = result.union(st)
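For the list s defined in the question, this loop ends with the same answer:
print(result)
# {2, 5}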

Find count of characters within the string in Python

I am trying to create a dictionary of each character and the number of times it repeats in the string. Say, suppose the string is like below:
str1 = "aabbaba"
I want to create a dictionary like this
word_count = {'a':4,'b':3}
I am trying to use dictionary comprehension to do this.
I did
dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
This ends up giving an error saying
File "<stdin>", line 1
dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
^
SyntaxError: invalid syntax
Can anybody tell me what's wrong with the syntax? Also, how can I create such a dictionary using dictionary comprehension?
As others have said, this is best done with a Counter.
You can also do:
>>> {e:str1.count(e) for e in set(str1)}
{'a': 4, 'b': 3}
But that traverses the string 1+n times, where n is the number of unique characters (once to create the set, and once per unique letter to count how many times it appears), i.e. quadratic runtime in the worst case. That is a bad deal if you have a lot of unique characters in a long string... A Counter traverses the string only once.
If you want a no-import version that is more efficient than using .count, you can use .setdefault to build a counter:
>>> count = {}
>>> for c in str1:
...     count[c] = count.setdefault(c, 0) + 1
...
>>> count
{'a': 4, 'b': 3}
That only traverses the string once no matter how long or how many unique characters.
You can also use defaultdict if you prefer:
>>> from collections import defaultdict
>>> count=defaultdict(int)
>>> for c in str1:
...     count[c] += 1
...
>>> count
defaultdict(<type 'int'>, {'a': 4, 'b': 3})
>>> dict(count)
{'a': 4, 'b': 3}
But if you are going to import collections -- Use a Counter!
The ideal way to do this is using collections.Counter:
>>> from collections import Counter
>>> str1 = "aabbaba"
>>> Counter(str1)
Counter({'a': 4, 'b': 3})
You can not achieve this with a plain dict comprehension expression, because you would need a reference to the previous value of the element's count. As mentioned in Dawg's answer, as a workaround you may use str1.count(e) to find the count of each element from the set of the string within your dict comprehension. But the time complexity will be n*m (where m is the number of unique elements), since it traverses the complete string for each unique element, whereas with a Counter it is n.
This is a nice case for collections.Counter:
>>> from collections import Counter
>>> Counter(str1)
Counter({'a': 4, 'b': 3})
It's a dict subclass, so you can work with the object like a standard dictionary:
>>> c = Counter(str1)
>>> c['a']
4
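A Counter also gives you the counts ordered from most to least common via most_common(), in case that is handy (a side note, not part of the original answer):
>>> c.most_common()
[('a', 4), ('b', 3)]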
You can do this without the Counter class as well. Simple and efficient Python code for this would be:
>>> d = {}
>>> for x in str1:
...     d[x] = d.get(x, 0) + 1
...
>>> d
{'a': 4, 'b': 3}
Note that this is not the correct way to count repeats, since the comprehension only sees the old dic and so won't count repeated characters more than once in a single pass (and it loses keys of the original dic that are not in the new string), but it answers the original question of whether if-else is possible in comprehensions and demonstrates how it can be done.
To answer your question, yes it's possible, but the approach is like this:
dic = {x: (dic[x] + 1 if x in dic else 1) for x in str1}
The condition is applied to the value only, not to the key:value mapping.
The above can be made clearer using dict.get:
dic = {x: dic.get(x, 0) + 1 for x in str1}
0 is returned if x is not in dic.
Demo:
In [78]: s = "abcde"
In [79]: dic = {}
In [80]: dic = {x: (dic[x] + 1 if x in dic else 1) for x in s}
In [81]: dic
Out[81]: {'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1}
In [82]: s = "abfg"
In [83]: dic = {x: dic.get(x, 0) + 1 for x in s}
In [84]: dic
Out[84]: {'a': 2, 'b': 2, 'f': 1, 'g': 1}

Python Removing duplicates (and not keeping them) in a list

Say I have:
x=[a,b,a,b,c,d]
I want a way to get
y=[c,d]
I have managed to do it with count:
unique = []
for i in x:
    if x.count(i) == 1:
        unique.append(i)
The problem is, this is very slow for bigger lists, help?
First use a dict to count:
d = {}
for i in x:
    if i not in d:
        d[i] = 0
    d[i] += 1
y = [i for i, j in d.iteritems() if j == 1]
x=["a","b","a","b","c","d"]
from collections import Counter
print([k for k,v in Counter(x).items() if v == 1])
['c', 'd']
Or to guarantee the order create the Counter dict first then iterate over the x list doing lookups for the values only keeping k's that have a value of 1:
x = ["a","b","a","b","c","d"]
from collections import Counter
cn = Counter(x)
print([k for k in x if cn[k] == 1])
So one pass over x to create the dict and another pass in the comprehension gives you an overall O(n) solution, as opposed to your quadratic approach using count.
The Counter dict counts the occurrences of each element:
In [1]: x = ["a","b","a","b","c","d"]
In [2]: from collections import Counter
In [3]: cn = Counter(x)
In [4]: cn
Out[4]: Counter({'b': 2, 'a': 2, 'c': 1, 'd': 1})
In [5]: cn["a"]
Out[5]: 2
In [6]: cn["b"]
Out[6]: 2
In [7]: cn["c"]
Out[7]: 1
Doing cn[k] returns the count for each element so we only end up keeping c and d.
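If you want to see the gap for yourself, a rough timing sketch (my addition; the helper names are just for illustration and the exact numbers depend on your machine and input size) could look like this:
import timeit
from collections import Counter

x = ["a", "b", "a", "b", "c", "d"] * 1000   # bigger input so the difference shows

def with_count(lst):
    # quadratic: count() rescans the whole list for every element
    return [i for i in lst if lst.count(i) == 1]

def with_counter(lst):
    # linear: one pass to count, one pass to filter
    cn = Counter(lst)
    return [k for k in lst if cn[k] == 1]

print(timeit.timeit(lambda: with_count(x), number=10))
print(timeit.timeit(lambda: with_counter(x), number=10))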
You can also use the set() function like this:
x = ['a','b','a','b','c','d']
print list(set(x))
Note, though, that this deduplicates the list: it returns every distinct element (['a', 'b', 'c', 'd'] here), not only the elements that appear exactly once. As set() returns an unordered result, the sorted() function can be used to order it:
x = ['a','b','a','b','c','d']
print list(sorted(set(x)))

Python: Elegantly merge dictionaries with sum() of values [duplicate]

I'm trying to merge logs from several servers. Each log is a list of tuples (date, count). date may appear more than once, and I want the resulting dictionary to hold the sum of all counts from all servers.
Here's my attempt, with some data for example:
from collections import defaultdict
a=[("13.5",100)]
b=[("14.5",100), ("15.5", 100)]
c=[("15.5",100), ("16.5", 100)]
input=[a,b,c]
output=defaultdict(int)
for d in input:
    for item in d:
        output[item[0]] += item[1]
print dict(output)
Which gives:
{'14.5': 100, '16.5': 100, '13.5': 100, '15.5': 200}
As expected.
I'm about to go bananas because of a colleague who saw the code. She insists that there must be a more Pythonic and elegant way to do it, without these nested for loops. Any ideas?
Doesn't get simpler than this, I think:
a=[("13.5",100)]
b=[("14.5",100), ("15.5", 100)]
c=[("15.5",100), ("16.5", 100)]
input=[a,b,c]
from collections import Counter
print sum(
    (Counter(dict(x)) for x in input),
    Counter())
Note that Counter (also known as a multiset) is the most natural data structure for your data: a type of set to which elements can belong more than once, or equivalently, a map with semantics Element -> OccurrenceCount. You could have used it in the first place, instead of lists of tuples.
Also possible:
from collections import Counter
from operator import add
print reduce(add, (Counter(dict(x)) for x in input))
Using reduce(add, seq) instead of sum(seq, initialValue) is generally more flexible and allows you to skip passing the redundant initial value.
Note that you could also use operator.and_ to find the intersection of the multisets instead of the sum.
The above variant is terribly slow, because a new Counter is created on every step. Let's fix that.
We know that Counter+Counter returns a new Counter with merged data. This is OK, but we want to avoid extra creation. Let's use Counter.update instead:
update(self, iterable=None, **kwds) unbound collections.Counter method
Like dict.update() but add counts instead of replacing them.
Source can be an iterable, a dictionary, or another Counter instance.
That's what we want. Let's wrap it with a function compatible with reduce and see what happens.
def updateInPlace(a, b):
    a.update(b)
    return a
print reduce(updateInPlace, (Counter(dict(x)) for x in input))
This is only marginally slower than the OP's solution.
Benchmark: http://ideone.com/7IzSx (Updated with yet another solution, thanks to astynax)
(Also: if you desperately want a one-liner, you can replace updateInPlace with lambda x,y: x.update(y) or x, which works the same way and even proves to be a split second faster, but fails at readability. Don't :-))
from collections import Counter
a = [("13.5",100)]
b = [("14.5",100), ("15.5", 100)]
c = [("15.5",100), ("16.5", 100)]
inp = [dict(x) for x in (a,b,c)]
count = Counter()
for y in inp:
    count += Counter(y)
print(count)
output:
Counter({'15.5': 200, '14.5': 100, '16.5': 100, '13.5': 100})
Edit:
As duncan suggested, you can replace these 3 lines:
count = Counter()
for y in inp:
    count += Counter(y)
with the single line:
count = sum((Counter(y) for y in inp), Counter())
You could use itertools' groupby:
from itertools import groupby, chain
a=[("13.5",100)]
b=[("14.5",100), ("15.5", 100)]
c=[("15.5",100), ("16.5", 100)]
input = sorted(chain(a,b,c), key=lambda x: x[0])
output = {}
for k, g in groupby(input, key=lambda x: x[0]):
    output[k] = sum(x[1] for x in g)
print output
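# -> {'13.5': 100, '14.5': 100, '15.5': 200, '16.5': 100} (dict key order may vary)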
The use of groupby instead of two loops and a defaultdict will make your code clearer.
You can use Counter or defaultdict, or you can try my variant:
def merge_with(d1, d2, fn=lambda x, y: x + y):
    res = d1.copy()  # "= dict(d1)" for lists of tuples
    for key, val in d2.iteritems():  # ".. in d2" for lists of tuples
        try:
            res[key] = fn(res[key], val)
        except KeyError:
            res[key] = val
    return res
>>> merge_with({'a':1, 'b':2}, {'a':3, 'c':4})
{'a': 4, 'c': 4, 'b': 2}
Or even more generic:
def make_merger(fappend=lambda x, y: x + y, fempty=lambda x: x):
    def inner(*dicts):
        res = dict((k, fempty(v)) for k, v
                   in dicts[0].iteritems())  # ".. in dicts[0]" for lists of tuples
        for dic in dicts[1:]:
            for key, val in dic.iteritems():  # ".. in dic" for lists of tuples
                try:
                    res[key] = fappend(res[key], val)
                except KeyError:
                    res[key] = fempty(val)
        return res
    return inner
>>> make_merger()({'a':1, 'b':2}, {'a':3, 'c':4})
{'a': 4, 'c': 4, 'b': 2}
>>> appender = make_merger(lambda x, y: x + [y], lambda x: [x])
>>> appender({'a':1, 'b':2}, {'a':3, 'c':4}, {'b':'BBB', 'c':'CCC'})
{'a': [1, 3], 'c': [4, 'CCC'], 'b': [2, 'BBB']}
Also you can subclass the dict and implement a __add__ method:
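The original snippet for that idea is not included here, but a minimal sketch of what such a subclass could look like (my own code; SummingDict is a hypothetical name) is:
class SummingDict(dict):
    # hypothetical helper: adds values for keys present in both dicts
    def __add__(self, other):
        result = SummingDict(self)
        for key, val in other.items():
            result[key] = result.get(key, 0) + val
        return result

print(SummingDict({'13.5': 100}) + SummingDict({'13.5': 100, '14.5': 100}))
# {'13.5': 200, '14.5': 100}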
