Python 2.7 Counting number of dictionary items with given value - python

First question here, so I will get right to it.
I am using Python 2.7.
I have a dictionary of items. The keys are x,y coordinates represented as tuples, (x,y), and all the values are Boolean.
I am trying to figure out a quick and clean method of getting a count of how many items have a given value. I do NOT need to know which keys have the given value, just how many.
There is a similar post here:
"How many items in a dictionary share the same value in Python". However, I do not need a dictionary returned, just an integer.
My first thought is to iterate over the items and test each one while keeping a count of the True values. I am just wondering, since I am still new to Python and don't know all the libraries, whether there is a better/faster/simpler way to do this.
Thanks in advance.

This first part is mostly for fun -- I probably wouldn't use it in my code.
sum(d.values())
will get the number of True values. (Of course, you can get the number of False values by len(d) - sum(d.values())).
Slightly more generally, you can do something like:
sum(1 for x in d.values() if some_condition(x))
(In this case, if x works just fine in place of if some_condition(x) and is what most people would use in real-world code.)
Of the three solutions I have posted here, the above is the most idiomatic and is the one I would recommend.
Finally, I suppose this could be written a little more cleverly:
sum( x == chosen_value for x in d.values() )
This is in the same vein as my first (fun) solution as it relies on the fact that True + True == 2. Clever isn't always better. I think most people would consider this version to be a little more obscure than the one above (and therefore worse).
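To compare the three variants side by side, here is a minimal sketch. The sample dictionary d and the helper name count_value are made up for illustration and are not from the question.

```python
# Hypothetical sample data: (x, y) coordinate tuples mapped to booleans.
d = {(0, 0): True, (0, 1): False, (1, 0): True, (1, 1): True}


def count_value(mapping, chosen_value):
    """Return how many values in mapping compare equal to chosen_value."""
    return sum(1 for v in mapping.values() if v == chosen_value)


# Variant 1: works only because the values are booleans (True counts as 1).
count_by_sum = sum(d.values())

# Variant 2: the generator-with-condition form recommended above.
count_true = count_value(d, True)
count_false = count_value(d, False)

# Variant 3: summing the comparison results directly.
chosen_value = True
count_by_eq = sum(v == chosen_value for v in d.values())

print(count_by_sum, count_true, count_false, count_by_eq)
```

All three variants agree on the number of True values for this data. Only the second and third generalize to values that are not booleans.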

If you want a data structure that you can quickly access to check the counts, you could try using a Counter (as @mgilson points out, this relies on the values themselves being hashable):
>>> from collections import Counter
>>> d = {(1, 2): 2, (3, 1): 2, (4, 4): 1, (5, 6): 4}
>>> Counter(d.values())
Counter({2: 2, 1: 1, 4: 1})
You could then plug in a value and get the number of times it appeared:
>>> c = Counter(d.values())
>>> c[2]
2
>>> c[4]
1
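Applied to a dictionary of booleans like the one in the question, the same idea might look like the sketch below. The sample data is hypothetical. One detail worth knowing is that a Counter returns 0 for a value it has never seen rather than raising KeyError.

```python
from collections import Counter

# Hypothetical sample data shaped like the question's dictionary.
d = {(0, 0): True, (0, 1): False, (1, 0): True, (1, 1): True}

counts = Counter(d.values())

num_true = counts[True]
num_false = counts[False]
# Looking up a value that never occurred gives 0, not a KeyError.
num_none = counts[None]

print(num_true, num_false, num_none)
```

Building the Counter is a full pass over the values, so it pays off mainly when you need counts for several different values.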

Related

How to check if multiple characters are in a list?

I have a list of combinations (say, 5-digit PIN numbers) and want to keep only the ones that have 1, 2 and 3 in them. I looked around here but couldn't find an answer.
if 1 in combination and 2 in combination and 3 in combination:
This seems to work, but I'm sure there is a more efficient way since mine is quite ugly.
If combination is a set you can perform a subset check:
if {1, 2, 3} <= combination:
Otherwise, you can do:
if all(x in combination for x in (1, 2, 3)):
You can convert your combination to a string and check the intersection of the two sets.
>>> combination = '456'
>>> needed = '123'
>>> set(needed) & set(combination)
set([])
>>> combination = '156'
>>> set(needed) & set(combination)
set(['1'])
A non-empty intersection means at least one of the needed values is in the combination. All of them are present only when the intersection equals set(needed).
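The subset check can also be wrapped so that it works whether or not the combination is already a set. This is only a sketch: the helper name contains_all and the sample combinations are assumptions for illustration.

```python
def contains_all(combination, needed):
    """Return True if every item in needed occurs in combination."""
    # set.issubset accepts any iterable, so combination can be a list,
    # tuple, string, or set.
    return set(needed).issubset(combination)


# Hypothetical sample combinations.
combinations = [(1, 2, 3, 4, 5), (1, 5, 6, 7, 8), (3, 9, 2, 0, 1)]
matching = [c for c in combinations if contains_all(c, (1, 2, 3))]

print(matching)
```

The same helper works for strings, for example contains_all('13524', '123').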

How to count number of unique lists within list?

I've tried using Counter and itertools, but since a list is unhashable, they don't work.
My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]
I would like to know that the list [1,2,3] appears twice, but I can't figure out how to do this. I was thinking of just converting each list to a tuple, then hashing with that. Is there a better way?
>>> from collections import Counter
>>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
The method that you state also works as long as there are no nested mutables in each sublist (such as [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):
>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})
If you do have other unhashable types nested in the sublists, use the str or repr method, since that deals with all nested sublists as well. Or recursively convert everything to tuples (more work).
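The recursive conversion could be sketched roughly as follows. The helper name to_hashable is made up here, and the sketch assumes the only unhashable type involved is list; dicts or sets inside the sublists would need their own handling.

```python
from collections import Counter


def to_hashable(item):
    """Recursively turn lists into tuples; leave everything else alone."""
    if isinstance(item, list):
        return tuple(to_hashable(sub) for sub in item)
    return item


# The nested example from the answer above.
nested = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]
counts = Counter(to_hashable(e) for e in nested)

print(counts[(1, 2, 3)])
print(counts[(2, 3, 4, (11, 12))])
```

Unlike the str approach, the keys stay real tuples, so they can be converted back to lists or compared element by element.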
ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))
Also, if you wanted to count the occurrences of a unique* list:
print(ll.count([1,2,3]))
(*value unique, not reference unique)
I think using the Counter class on tuples, like
Counter(tuple(item) for item in li)
will be optimal in terms of elegance and "pythoniticity": it's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it combines standard methods (and thus avoids reinventing the wheel).
The only performance drawback I can see is that every element has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also, the internal hash function on tuples may be suboptimal if you know that list elements will, e.g., always be integers.
In order to improve on performance, you would have to:
1. Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
2. Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)
At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.
As for the Counter class, I do not know if its standard implementation is in Python or in C. If the latter is the case, you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.
So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.
list = [ [1,2,3], [2,3,4], [1,2,3] ]
repeats = []
unique = 0
for i in list:
    count = 0
    if i not in repeats:
        for i2 in list:
            if i == i2:
                count += 1
        if count > 1:
            repeats.append(i)
        elif count == 1:
            unique += 1
print "Repeated Items"
for r in repeats:
    print r,
print "\nUnique items:", unique
This loops through the list to find repeated sequences, skipping items that have already been detected as repeats. It adds each repeated sequence to the repeats list and counts the number of lists that occur only once.
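For comparison, the same two results (the repeated lists and the number of lists that occur only once) can be obtained in a single pass with the Counter-on-tuples approach from the earlier answers. This is a sketch rather than a drop-in replacement, and it assumes the sublists contain only hashable elements.

```python
from collections import Counter

data = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]

# Count each sublist once, using tuples as hashable stand-ins.
counts = Counter(tuple(sub) for sub in data)

# Sublists seen more than once, converted back to lists.
repeats = [list(key) for key, n in counts.items() if n > 1]
# Number of sublists seen exactly once.
unique = sum(1 for n in counts.values() if n == 1)

print(repeats)
print(unique)
```

The nested-loop version compares every pair of sublists, so this form should scale better on long inputs.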

What is the Pythonic way to replace a specific set element?

I have a python set set([1, 2, 3]) and always want to replace the third element of the set with another value.
It can be done like below:
def change_last_elemnent(data):
    result = []
    for i,j in enumerate(list(data)):
        if i == 2:
            j = 'C'
        result.append(j)
    return set(result)
But is there a more Pythonic way to do that, one that is smarter and more readable?
Thanks in advance.
Sets are unordered, so the 'third' element doesn't really mean anything.
If removing an arbitrary element is what you want, you can simply do:
data.pop()
data.add(new_value)
If you wish to remove an item from the set by value and replace it, you can do:
data.remove(value) #data.discard(value) if you don't care if the item exists.
data.add(new_value)
If you want to keep ordered data, use a list and do:
data[index] = new_value
To show that sets are not ordered:
>>> list({"dog", "cat", "elephant"})
['elephant', 'dog', 'cat']
>>> list({1, 2, 3})
[1, 2, 3]
You can see that it is only a coincidence of CPython's implementation that '3' is the third element of a list made from the set {1, 2, 3}.
Your example code is also deeply flawed in other ways. new_list doesn't exist. At no point is the old element removed from the list, and the act of looping through the list is entirely pointless. Obviously, none of that really matters as the whole concept is flawed.
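If the original set should be left untouched, replacing by value can be wrapped in a small function. This is a sketch: the name replace_in_set is hypothetical, and it uses discard, so new_value is added even when old_value was not in the set.

```python
def replace_in_set(data, old_value, new_value):
    """Return a new set with old_value swapped for new_value."""
    result = set(data)          # copy, so the caller's set is not mutated
    result.discard(old_value)   # no error if old_value is absent
    result.add(new_value)
    return result


original = {1, 2, 3}
updated = replace_in_set(original, 3, 'C')

print(sorted(updated, key=str))
```

Switch discard to remove if a missing old_value should raise KeyError instead of being ignored.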

Compare Dictionary Values for each key in the same Dictionary in Python

Update:
Hello again. My question is: how can I compare the values of a dictionary for equality? More information about my dictionary:
keys are session numbers
values of each key are nested lists, e.g.
[[1,0],[2,0],[3,1]]
the lengths of the values for each key aren't the same, so session number 1 could have more values than session number 2
Here is an example dictionary:
order_session =
{1:[[100,0],[22,1],[23,2]],10:[[100,0],[232,0],[10,2],[11,2]],22:[[5,2],[23,2],....],
... }
My goal:
Step 1: compare the values of session number 1 with the values of all the other session numbers in the dictionary for equality
Step 2: take the next session number and compare its values with the values of the other session numbers, and so on
- finally, the values of every session number have been compared
Step 3: save the result into a list, e.g.
output = [[100,0],[23,2], ... ] or output = [(100,0),(23,2), ... ]
As you can see, the value pair [100,0] appears in both session 1 and session 10. The value pair [23,2] likewise appears in both session 1 and session 22.
Thanks for helping me out.
Update 2
Thank you for all your help and for the tip to change the nested list of lists into a list of tuples, which is much easier to handle.
I prefer Boaz Yaniv's solution ;)
I also like the use of collections.Counter(). Unfortunately I use 2.6.4 (Counter requires 2.7); maybe I will switch to 2.7 at some point.
If your dictionary is long, you'd want to use sets, for better performance (looking up already-encountered values in lists is going to be quite slow):
def get_repeated_values(sessions):
    known = set()
    already_repeated = set()
    for lst in sessions.itervalues():
        session_set = set(tuple(x) for x in lst)
        repeated = (known & session_set) - already_repeated
        already_repeated |= repeated
        known |= session_set
        for val in repeated:
            yield val
sessions = {1:[[100,0],[22,1],[23,2]],10:[[100,0],[232,0],[10,2],[11,2]],22:[[5,2],[23,2]]}
for x in get_repeated_values(sessions):
    print x
I also suggest (again, for performance reasons) nesting tuples inside your lists instead of lists, if you're not going to change them on the fly. The code I posted here will work either way, but it would be faster if the values are already tuples.
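The generator above is written for Python 2 (itervalues and the print statement). A rough Python 3 port of the same algorithm, offered as a sketch and not as the answer author's code, could look like this:

```python
def get_repeated_values(sessions):
    """Yield each value pair that occurs in more than one session, once."""
    known = set()             # every pair seen so far
    already_repeated = set()  # pairs that have already been yielded
    for lst in sessions.values():  # .values() replaces Python 2's .itervalues()
        session_set = set(tuple(x) for x in lst)
        repeated = (known & session_set) - already_repeated
        already_repeated |= repeated
        known |= session_set
        for val in repeated:
            yield val


sessions = {
    1: [[100, 0], [22, 1], [23, 2]],
    10: [[100, 0], [232, 0], [10, 2], [11, 2]],
    22: [[5, 2], [23, 2]],
}
repeated_pairs = sorted(get_repeated_values(sessions))

print(repeated_pairs)
```

The result is sorted here only to make the output order predictable, since the pairs come out of sets.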
There's probably a nicer and more optimal way to do this, but I'd work my way from here:
seen = []
output = []
for val in order_session.values():
    for vp in val:
        if vp in seen:
            if not vp in output:
                output.append(vp)
        else:
            seen.append(vp)
print(output)
Basically, what this does is to look through all the values, and if the value has been seen before, but not output before, it is appended to the output.
Note that this works with the actual values of the value pairs: the in operator on a list compares elements with ==, not by identity, so it does not matter whether two equal pairs are the same object. CPython does reuse the same object for small integers (-5 through 256), so after a = 5 and b = 5, a and b point to the same integer object, while for larger values such as 10**5 they may not. That only affects comparisons with is, though, not this code.
>>> from collections import Counter
>>> D = {1:[[100,0],[22,1],[23,2]],
... 10:[[100,0],[232,0],[10,2],[11,2]],
... 22:[[5,2],[23,2]]}
>>> [k for k,v in Counter(tuple(j) for i in D.values() for j in i).items() if v>1]
[(23, 2), (100, 0)]
If you really really need a list of lists
>>> [list(k) for k,v in Counter(tuple(j) for i in D.values() for j in i).items() if v>1]
[[23, 2], [100, 0]]
order_session = {1:[[100,0],[22,1],[23,2]],10:[[100,0],[232,0],[10,2],[11,2]],22:[[5,2],[23,2],[80,21]],}
output = []
for pair in sum(order_session.values(), []):
    if sum(order_session.values(), []).count(pair) > 1 and pair not in output:
        output.append(pair)
print output
...
[[100, 0], [23, 2]]

How to separate one list in two via list comprehension or otherwise

I have a list of dictionary items like so:
L = [{"a":1, "b":0}, {"a":3, "b":1}...]
I would like to split these entries based upon the value of "b", either 0 or 1.
A(b=0) = [{"a":1, "b":1}, ....]
B(b=1) = [{"a":3, "b":2}, .....]
I am comfortable with using simple list comprehensions, and I am currently looping through the list L two times.
A = [d for d in L if d["b"] == 0]
B = [d for d in L if d["b"] != 0]
Clearly this is not the most efficient way.
An else clause does not seem to be available within the list comprehension functionality.
Can I do what I want via list comprehension?
Is there a better way to do this?
I am looking for a good balance between readability and efficiency, leaning towards readability.
Thanks!
update:
Thanks everyone for the comments and ideas! The easiest one for me to read is the one by Thomas, but I will look at Alex's suggestion as well. I had not found any reference to the collections module before.
Don't use a list comprehension. List comprehensions are for when you want a single list result. You obviously don't :) Use a regular for loop:
A = []
B = []
for item in L:
    if item['b'] == 0:
        target = A
    else:
        target = B
    target.append(item)
You can shorten the snippet by doing, say, (A, B)[item['b'] != 0].append(item), but why bother?
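For completeness, the shortened form relies on a boolean being usable as a tuple index (False is 0, True is 1). A minimal sketch with made-up sample data:

```python
# Hypothetical sample data in the shape described in the question.
L = [{"a": 1, "b": 0}, {"a": 3, "b": 1}, {"a": 5, "b": 0}]

A = []
B = []
for item in L:
    # item["b"] != 0 is False (selects A) or True (selects B).
    (A, B)[item["b"] != 0].append(item)

print(A)
print(B)
```

It still makes only one pass over L, but the explicit if/else version is arguably easier to read.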
If the b value can be only 0 or 1, @Thomas's simple solution is probably best. For a more general case (in which you want to discriminate among several possible values of b -- your sample "expected results" appear to be completely divorced from and contradictory to your question's text, so it's far from obvious whether you actually need some generality;-):
from collections import defaultdict
separated = defaultdict(list)
for x in L:
    separated[x['b']].append(x)
When this code executes, separated ends up with a dict (actually an instance of collections.defaultdict, a dict subclass) whose keys are all values for b that actually occur in dicts in list L, the corresponding values being the separated sublists. So, for example, if b takes only the values 0 and 1, separated[0] would be what (in your question's text as opposed to the example) you want as list A, and separated[1] what you want as list B.
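As a usage illustration, the grouping can be wrapped in a small function and run on sample data. The function name group_by_key and the sample list are assumptions for this sketch, not part of the answer.

```python
from collections import defaultdict


def group_by_key(items, key):
    """Group a list of dicts by the value each one holds under key."""
    separated = defaultdict(list)
    for x in items:
        separated[x[key]].append(x)
    return separated


# Hypothetical sample data with three distinct values of "b".
L = [{"a": 1, "b": 0}, {"a": 3, "b": 1}, {"a": 5, "b": 0}, {"a": 7, "b": 2}]
separated = group_by_key(L, "b")

A = separated[0]
B = separated[1]
print(A)
print(B)
print(sorted(separated))
```

Because separated is a defaultdict, looking up a value of b that never occurred returns an empty list and also inserts that key as a side effect.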
