input_dict = {'ab':12, 'cd':4, 'ef':1, 'gh':8, 'kl':9}
out_dict = 2
Is there any way to count the keys of the dictionary whose values are greater than 2 and less than 9?
Try this,
In [61]: len([v for v in input_dict.values() if 2 < v < 9])
Out[61]: 2
I think you want the number of items in the dictionary whose value satisfies 2 < v < 9:
input_dict = {"ab": 12, "cd": 4, "ef": 1, "gh": 8, "kl": 9}
out = sum(2 < v < 9 for v in input_dict.values())
print(out)
Prints:
2
Just returning the relevant lengths:
[len(k) for k, v in input_dict.items() if 2 < v < 9]
Returns:
[2, 2]
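The answers above differ only in what they collect. A minimal sketch (Python 3; the helper name keys_in_range is made up for illustration) that returns the matching keys, from which both the count and the key lengths follow:

```python
def keys_in_range(d, low, high):
    """Return the keys whose values satisfy low < value < high."""
    return [k for k, v in d.items() if low < v < high]

input_dict = {"ab": 12, "cd": 4, "ef": 1, "gh": 8, "kl": 9}
matching = keys_in_range(input_dict, 2, 9)

print(matching)                    # ['cd', 'gh']
print(len(matching))               # 2
print([len(k) for k in matching])  # [2, 2]
```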
I have two lists of elements:
a = [1,2,3,2,3,1,1,1,1,1]
b = [3,1,2,1,2,3,3,3,3,3]
and I am trying to uniquely match each element of a to an element of b. My expected result looks like this:
1: 3
2: 1
3: 2
So I tried to construct an assignment matrix and then use scipy.optimize.linear_sum_assignment:
import numpy as np

a = [1,2,3,2,3,1,1,1,1,1]
b = [3,1,2,1,2,3,3,3,3,3]
total_true = np.unique(a)
total_pred = np.unique(b)
matrix = np.zeros(shape=(len(total_pred), len(total_true)))
for n, i in enumerate(total_true):
    for m, j in enumerate(total_pred):
        matrix[n, m] = sum(1 for item in b if item==(i))
I expected the matrix to be:
   1  2  3
1  0  2  0
2  0  0  2
3  6  0  0
But the output is:
[[2. 2. 2.]
[2. 2. 2.]
[6. 6. 6.]]
What mistake did I make here? Thank you very much.
You don't even need Pandas for this. Try zip and dict:
In [42]: a = [1,2,3,2,3,1,1,1,1,1]
...: b = [3,1,2,1,2,3,3,3,3,3]
...:
In [43]: c =zip(a,b)
In [44]: dict(c)
Out[44]: {1: 3, 2: 1, 3: 2}
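UPDATE: as the OP said, if we need to store all the values that share a key, we can use defaultdict. The pairs are zipped again here because in Python 3 the iterator c is already exhausted after dict(c):
In [58]: from collections import defaultdict
In [59]: d = defaultdict(list)
In [60]: for k,v in zip(a, b):
    ...:     d[k].append(v)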
...:
In [61]: d
Out[61]: defaultdict(list, {1: [3, 3, 3, 3, 3, 3], 2: [1, 1], 3: [2, 2]})
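Note that dict(zip(a, b)) silently keeps only the last value seen for each key, so it cannot tell you whether the match really is unique. A small sketch of a check for that, assuming Python 3 (the names partners, is_unique and mapping are illustrative only):

```python
from collections import defaultdict

a = [1, 2, 3, 2, 3, 1, 1, 1, 1, 1]
b = [3, 1, 2, 1, 2, 3, 3, 3, 3, 3]

# Collect the distinct partners seen for each element of a.
partners = defaultdict(set)
for x, y in zip(a, b):
    partners[x].add(y)

# The match is unique only if every element has exactly one partner.
is_unique = all(len(ys) == 1 for ys in partners.values())
mapping = {x: next(iter(ys)) for x, ys in partners.items()} if is_unique else None

print(is_unique, mapping)  # True {1: 3, 2: 1, 3: 2}
```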
This line:
    matrix[n, m] = sum(1 for item in b if item==(i))
counts the occurrences of i in b and saves the result to matrix[n, m]. Each cell of the matrix will contain either the number of 1's in b (i.e. 2) or the number of 2's in b (i.e. 2) or the number of 3's in b (i.e. 6). Notice that this value is completely independent of j, which means that the values in one row will always be the same.
In order to take j into consideration, try replacing the line with:
    matrix[n, m] = sum(1 for x, y in zip(a, b) if (x, y) == (j, i))
Regarding your expected output: the matrix is indexed as a(i, j), where i is the row index and j is the column index. The entry a(3, 1) in your matrix is 6, which means the combination (3, 1) occurs 6 times, with 3 taken from b and 1 taken from a. We can collect all the pairs from the two lists:
matches = list(zip(b, a))
Then we can count how often a specific combination occurs, for example a(3, 1):
result = matches.count((3,1))
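Counting one combination at a time rescans the list for every cell. As a rough sketch of the same idea applied to the whole matrix (Python 3, standard library only; the variable names are illustrative), collections.Counter can count every (b, a) pair in a single pass:

```python
from collections import Counter

a = [1, 2, 3, 2, 3, 1, 1, 1, 1, 1]
b = [3, 1, 2, 1, 2, 3, 3, 3, 3, 3]

# Count every (b, a) pair once instead of rescanning the lists per cell.
pair_counts = Counter(zip(b, a))

rows = sorted(set(b))  # row labels come from b
cols = sorted(set(a))  # column labels come from a
# A Counter returns 0 for pairs that never occur.
matrix = [[pair_counts[(r, c)] for c in cols] for r in rows]

for row in matrix:
    print(row)
# [0, 2, 0]
# [0, 0, 2]
# [6, 0, 0]
```

This reproduces the matrix the question expected, with rows labelled by b and columns by a.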
I have the following graph:
graph = {0 : {5:6, 4:8},
1 : {4:11},
2 : {3: 9, 0:12},
3 : {},
4 : {5:3},
5 : {2: 7, 3:4}}
I am trying to return the key that has the highest value in this graph. The expected output in this case would be 2 as key 2 has the highest value of 12.
Any help on how I can achieve this would be greatly appreciated.
Find the key whose maximum value is maximal:
max((k for k in graph), key=lambda k: max(graph[k].values(), default=float("-inf")))
The empty dictionaries are disqualified by the -inf default, which can never be the maximum. Alternatively, you can just pre-filter such keys:
max((k for k in graph if graph[k]), key=lambda k: max(graph[k].values()))
Assuming all the values are positive numbers:
graph = {0 : {5:6, 4:8},
         1 : {4:11},
         2 : {3: 9, 0:12},
         3 : {},
         4 : {5:3},
         5 : {2: 7, 3:4}}
highest_key = 0
highest_value = 0  # not named "max", to avoid shadowing the built-in
for key, value in graph.items():
    for key2, value2 in value.items():
        if highest_value < value2:
            highest_value = value2
            highest_key = key
print(highest_key)
You can also create (max_weight, key) tuples for each key and get the max of those:
max_val = max((max(e.values()), k) for k, e in graph.items() if e)
# (12, 2)
print(max_val[1])
# 2
Note that we don't need a custom key function for max here because the first value in the tuple is the one we want max to consider.
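To see why no key function is needed, here is a short sketch (Python 3) of how tuple comparison behaves. One consequence worth knowing, shown in the second assertion: if two keys share the same maximum weight, the larger key wins the tie.

```python
graph = {0: {5: 6, 4: 8}, 1: {4: 11}, 2: {3: 9, 0: 12},
         3: {}, 4: {5: 3}, 5: {2: 7, 3: 4}}

# Tuples compare element by element, so the weight decides first.
assert (12, 2) > (11, 1)
# On equal weights the second element (the key) breaks the tie.
assert (12, 5) > (12, 2)

best_weight, best_key = max((max(e.values()), k) for k, e in graph.items() if e)
print(best_key, best_weight)  # 2 12
```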
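The recursive solution is below. It makes no assumptions about the depth of your tree; it only assumes that the values are ints, floats or dicts.
def get_largest(d):
    """Return the largest number anywhere in a nested dict, or None if it has none."""
    largest = None
    for value in d.values():
        # Recurse into nested dicts; compare plain numbers directly.
        candidate = get_largest(value) if isinstance(value, dict) else value
        if candidate is not None and (largest is None or candidate > largest):
            largest = candidate
    return largest

largest_values = {k: get_largest(v) for k, v in graph.items()}
# Ignore keys with no values at all, then pick the key with the largest value.
answer = max((k for k in largest_values if largest_values[k] is not None), key=largest_values.get)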
You can also use a dict comprehension to flatten the dictionary to one maximum per key, and then print the key with the largest value:
graph = {0 : {5:6, 4:8},
         1 : {4:11},
         2 : {3: 9, 0:12},
         3 : {},
         4 : {5:3},
         5 : {2: 7, 3:4}}
flat_dict = {k: max(v.values()) for k, v in graph.items() if v}
print(max(flat_dict, key=flat_dict.get))
# output,
2
You can also try flattening your dictionary into a sequence of (key, value) tuples and then taking the tuple with the highest second value:
from operator import itemgetter
graph = {
0: {5: 6, 4: 8},
1: {4: 11},
2: {3: 9, 0: 12},
3: {},
4: {5: 3},
5: {2: 7, 3: 4},
}
result = max(((k, v) for k in graph for v in graph[k].values()), key=itemgetter(1))
print(result)
# (2, 12)
print(result[0])
# 2
I have a dictionary with {ID: (INITIALS, DATE, AREA)} like so:
>>> mydict = {1: ('JN', '2012-06-13', 2),
...           2: ('JN', '2012-06-13', 5),
...           3: ('JN', '2012-06-14', 8),
...           4: ('AM', '2012-06-13', 3),
...           5: ('OV', '2012-06-14', 4)}
I have been able to summarise the values like this:
>>> from collections import Counter
>>> mycounter = Counter((val[0], val[1]) for val in mydict.values())
>>> for (initials, date), count in mycounter.iteritems():
...     print ', '.join((initials, date, str(count)))
...
JN, 2012-06-13, 2
JN, 2012-06-14, 1
AM, 2012-06-13, 1
OV, 2012-06-14, 1
I would also like to include the sum of the AREA values from mydict for each group, resulting in this:
JN, 2012-06-13, 2, 7
JN, 2012-06-14, 1, 8
AM, 2012-06-13, 1, 3
OV, 2012-06-14, 1, 4
Thanks!
EDIT: WORKING CODE (modified slightly from Ashwini's code):
I moved v[1] to join v[0] in a tuple so it would summarise based on the initials AND date (I didn't make that clear in my initial question, now edited to reflect that), then calculate the count and sum the areas. Martijn's code also worked but this solution requires one less import.
stats = {}
for k, v in mydict.items():
    d = stats.setdefault((v[0], v[1]), [0, 0])
    d[1] += v[-1]
    d[0] += 1
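For readers on Python 3, the same grouping can be sketched as follows, including the printing step. This is an illustrative adaptation rather than the code from the thread; it relies on dicts preserving insertion order (Python 3.7+) for the order of the output lines.

```python
mydict = {1: ('JN', '2012-06-13', 2),
          2: ('JN', '2012-06-13', 5),
          3: ('JN', '2012-06-14', 8),
          4: ('AM', '2012-06-13', 3),
          5: ('OV', '2012-06-14', 4)}

stats = {}
for initials, date, area in mydict.values():
    # One [count, area_sum] pair per (initials, date) group.
    entry = stats.setdefault((initials, date), [0, 0])
    entry[0] += 1
    entry[1] += area

lines = [", ".join((initials, date, str(count), str(area)))
         for (initials, date), (count, area) in stats.items()]
print("\n".join(lines))
# JN, 2012-06-13, 2, 7
# JN, 2012-06-14, 1, 8
# AM, 2012-06-13, 1, 3
# OV, 2012-06-14, 1, 4
```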
You can use a normal dict here. Note that this groups by initials only, so all of JN's rows are merged under the first date encountered:
>>> dic = {}
>>> for k, v in mydict.items():
...     d = dic.setdefault(v[0], [v[1], 0, 0])
...     d[2] += v[-1]
...     d[1] += 1
...
>>> dic
{'OV': ['2012-06-14', 1, 4],
 'JN': ['2012-06-13', 3, 15],
 'AM': ['2012-06-13', 1, 3]}
Loop through the dict to print it:
>>> for k,v in dic.items():
...     print k +',',", ".join([str(x) for x in v])
...
OV, 2012-06-14, 1, 4
JN, 2012-06-13, 3, 15
AM, 2012-06-13, 1, 3
If order matters then you can use collections.OrderedDict:
>>> from collections import OrderedDict
>>> dic = OrderedDict()
>>> for k, v in mydict.items():
...     d = dic.setdefault(v[0], [v[1], 0, 0])
...     d[2] += v[-1]
...     d[1] += 1
...
>>> for k,v in dic.items():
...     print k +',',", ".join([str(x) for x in v])
...
JN, 2012-06-13, 3, 15
AM, 2012-06-13, 1, 3
OV, 2012-06-14, 1, 4
You are not using the full Counter API here, so you may as well replace it with a defaultdict:
from collections import defaultdict
stats = defaultdict(lambda: [0, 0])
for entry in mydict.values():
    counts = stats[tuple(entry[:2])]
    counts[0] += 1
    counts[1] += entry[-1]
then printing:
for (initials, date), (count, area) in stats.iteritems():
    print ', '.join((initials, date, str(count), str(area)))
which outputs (dictionary order is arbitrary):
OV, 2012-06-14, 1, 4
AM, 2012-06-13, 1, 3
JN, 2012-06-14, 1, 8
JN, 2012-06-13, 2, 7
I want to iterate through a list and remove the items that occur more than once, so they don't get printed repeatedly by the for loop.
However, some items appearing only one time in the list seem to get affected too by this, and I can't figure out why.
Any input would be greatly appreciated.
Example code:
listy = [2,2,1,3,4,2,1,2,3,4,5]
for i in listy:
    if listy.count(i)>1:
        print i, listy.count(i)
        while i in listy: listy.remove(i)
    else:
        print i, listy.count(i)
Outputs:
2 4
3 2
1 2
thus completely ignoring 4 and 5.
You should not modify a list while iterating over it. This one should work:
listy = [2,2,1,3,4,2,1,2,3,4,5]
found = set()
for i in listy:
    if i not in found:
        print i, listy.count(i)
        found.add(i)
The result is:
2 4
1 2
3 2
4 2
5 1
The reason for your problems is that you modify the list while you are iterating over it.
If you don't care about the order in which items appear in the output and don't care about the count, you can simply use a set:
>>> listy = [2,2,1,3,4,2,1,2,3,4,5]
>>> print set(listy)
set([1, 2, 3, 4, 5])
If you do care about the count, use the Counter class from the collections module in the Standard Library:
>>> import collections
>>> collections.Counter(listy)
Counter({2: 4, 1: 2, 3: 2, 4: 2, 5: 1})
>>> c = collections.Counter(listy)
>>> for item in c.iteritems():
...     print "%i has a count of %i" % item
...
1 has a count of 2
2 has a count of 4
3 has a count of 2
4 has a count of 2
5 has a count of 1
If you do care about both the order and the count, you have to build a second list:
>>> checked = []
>>> counts = []
>>> for item in listy:
...     if item not in checked:
...         checked.append(item)
...         counts.append(listy.count(item))
...
>>> print zip(checked, counts)
[(2, 4), (1, 2), (3, 2), (4, 2), (5, 1)]
This is the least efficient solution, of course.
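The inefficiency comes from calling listy.count once per distinct item, each call scanning the whole list. On Python 3.7 and later, where dicts (and therefore Counter) preserve insertion order, a single-pass sketch that keeps both order and counts could look like this:

```python
from collections import Counter

listy = [2, 2, 1, 3, 4, 2, 1, 2, 3, 4, 5]

counts = Counter(listy)        # one pass over the list
pairs = list(counts.items())   # first-seen order on Python 3.7+

print(pairs)  # [(2, 4), (1, 2), (3, 2), (4, 2), (5, 1)]
```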
If you don't want to keep the counts for later, you don't need the counts list:
listy = [2,2,1,3,4,2,1,2,3,4,5]
checked = set()
for item in listy:
    # "continue early" looks better when there is lots of code for
    # handling the other case
    if item in checked:
        continue
    checked.add(item)
    print item, listy.count(item)
Don't modify a list while iterating over it, it will mess you up every time:
listy = [2,2,1,3,4,2,1,2,3,4,5]
#        *     *     * Get hit
for i in listy:
    print i
    if listy.count(i) > 1:
        print i, listy.count(i), 'item and occurrences'
        while i in listy: listy.remove(i)
    else:
        print i, listy.count(i)
First, you remove four 2s. Two are right at the beginning, so that puts you at the first 1.
Then you advance one when you get the next i from listy, putting you at the first 3.
Then you remove two 3s. The first is right there, so that puts you at the first 4.
Then you advance one again. The 2 is gone already, so this puts you at the second 1.
You then delete both 1s; this moves you forward two spaces. The 2 and 3 are gone, so this puts you at the 5.
You advance one, this moves you off the end of the list so the loop is over.
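The walk-through above can be reproduced with an explicit index. The sketch below (Python 3) only approximates what the for loop does internally: the list iterator keeps a position and stops once that position reaches the current length of the list. The names visited and index are illustrative.

```python
listy = [2, 2, 1, 3, 4, 2, 1, 2, 3, 4, 5]
visited = []

index = 0
while index < len(listy):   # the length is re-checked on every step
    item = listy[index]
    visited.append(item)
    if listy.count(item) > 1:
        # Removing items shifts the rest left, underneath the loop's index.
        listy[:] = [x for x in listy if x != item]
    index += 1

print(visited)  # [2, 3, 1]
print(listy)    # [4, 4, 5]
```

Only 2, 3 and 1 are ever visited; 4 and 5 are left in the list but skipped because the index has already moved past them.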
If what you want is to print each item only once, you can use the simple set method, or you could use the itertools unique_everseen recipe:
from itertools import ifilterfalse

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in ifilterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Which extends the basic set version to allow you to specify a special way to compare items.
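In Python 3 the helper is called itertools.filterfalse instead of ifilterfalse. A sketch of the same recipe adapted for Python 3, with one call that uses the default comparison and one that passes a key function:

```python
from itertools import filterfalse

def unique_everseen(iterable, key=None):
    """Yield unique elements in first-seen order, optionally compared by key."""
    seen = set()
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen.add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen.add(k)
                yield element

unique_numbers = list(unique_everseen([2, 2, 1, 3, 4, 2, 1, 2, 3, 4, 5]))
unique_letters = list(unique_everseen("ABBCcAD", str.lower))

print(unique_numbers)  # [2, 1, 3, 4, 5]
print(unique_letters)  # ['A', 'B', 'C', 'D']
```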
If you want to know which items are only in the list once:
listy2 = filter(lambda i: listy.count(i) == 1, listy)
listy2 now has all the single occurrences.
If you don't like the lambda, just do:
def getsingles(listy):
    def singles(i):
        return listy.count(i) == 1
    return singles
then:
listy2 = filter(getsingles(listy), listy)
This makes a special function that will tell you which items are in listy only once.
The reason for the behavior you see is explained in the note here:
http://docs.python.org/reference/compound_stmts.html#index-811
Update 1
agf's solution isn't a good one for performance reasons: the list is filtered according to the count of each element, and that count is recomputed for every element. Each count runs through the entire list, so the list is scanned as many times as it has elements. That wastes a lot of time; imagine a list of length 1000.
A better solution I think is to use an instance of Counter:
import random
from collections import Counter
li = [ random.randint(0,20) for i in xrange(30)]
c = Counter(li)
print c
print type(c)
res = [ k for k in c if c[k]==1]
print res
result
Counter({8: 5, 0: 3, 4: 3, 9: 3, 2: 2, 5: 2, 11: 2, 3: 1, 6: 1, 10: 1, 12: 1, 15: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 1})
<class 'collections.Counter'>
[3, 6, 10, 12, 15, 16, 17, 18, 19, 20]
Another solution would be to add the elements already read to a set, so that the program avoids counting an element it has already seen.
Update 2
Errr... my solution misses the point: you don't want to select only the elements that appear once in the list.
Then the following code is the right one, I think:
import random
from collections import Counter
listy = [ random.randint(0,20) for i in xrange(30)]
print 'listy==',listy
print
c = Counter(listy)
print c
print type(c)
print
slimmed_listy = []
for el in listy:
    if el in c:
        slimmed_listy.append(el)
        print 'element',el,' count ==',c[el]
        del c[el]
print
print 'slimmed_listy==',slimmed_listy
result
listy== [13, 10, 1, 1, 13, 11, 18, 15, 3, 15, 12, 11, 15, 18, 11, 10, 14, 10, 20, 3, 18, 9, 11, 2, 19, 15, 5, 14, 1, 1]
Counter({1: 4, 11: 4, 15: 4, 10: 3, 18: 3, 3: 2, 13: 2, 14: 2, 2: 1, 5: 1, 9: 1, 12: 1, 19: 1, 20: 1})
<class 'collections.Counter'>
element 13 count == 2
element 10 count == 3
element 1 count == 4
element 11 count == 4
element 18 count == 3
element 15 count == 4
element 3 count == 2
element 12 count == 1
element 14 count == 2
element 20 count == 1
element 9 count == 1
element 2 count == 1
element 19 count == 1
element 5 count == 1
slimmed_listy== [13, 10, 1, 11, 18, 15, 3, 12, 14, 20, 9, 2, 19, 5]
If you don't need the result in the order of listy, the code is even simpler.
Update 3
If you want only to print, then I propose:
import random
from collections import Counter
listy = [ random.randint(0,20) for i in xrange(30)]
print 'listy==',listy
print
def gener(li):
    c = Counter(li)
    for el in li:
        if el in c:
            yield el,c[el]
            del c[el]
print '\n'.join('element %4s count %4s' % x for x in gener(listy))
result
listy== [16, 2, 4, 9, 15, 19, 1, 1, 3, 5, 12, 15, 12, 3, 17, 13, 8, 11, 4, 6, 15, 1, 0, 1, 3, 3, 6, 5, 0, 8]
element 16 count 1
element 2 count 1
element 4 count 2
element 9 count 1
element 15 count 3
element 19 count 1
element 1 count 4
element 3 count 4
element 5 count 2
element 12 count 2
element 17 count 1
element 13 count 1
element 8 count 2
element 11 count 1
element 6 count 2
element 0 count 2
Modifying a list while you iterate over it is a bad idea in every language I have encountered. My suggestion: don't do that. Here are some better ideas.
Use a set to get each distinct value once
source = [2,2,1,3,4,2,1,2,3,4,5]
for s in set(source):
    print s
And you get this:
>>> source = [2,2,1,3,4,2,1,2,3,4,5]
>>> for s in set(source):
...     print s
...
1
2
3
4
5
If you want the counts, use defaultdict
from collections import defaultdict
d = defaultdict(int)
source = [2,2,1,3,4,2,1,2,3,4,5]
for s in source:
    d[s] += 1
for k, v in d.iteritems():
    print k, v
You'll get this:
>>> for k, v in d.iteritems():
...     print k, v
...
1 2
2 4
3 2
4 2
5 1
If you want your results sorted by count, use sorted and operator.itemgetter
import operator
for k, v in sorted(d.iteritems(), key=operator.itemgetter(1)):
    print k, v
You'll get this:
>>> import operator
>>> for k, v in sorted(d.iteritems(), key=operator.itemgetter(1)):
...     print k, v
...
5 1
1 2
3 2
4 2
2 4
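On Python 3 the counting and sorting steps above can be sketched with collections.Counter, which also offers most_common for the descending order. This is an illustrative alternative, not the code from the answer; the order shown for equal counts assumes Python 3.7+, where ties keep first-seen order.

```python
from collections import Counter

source = [2, 2, 1, 3, 4, 2, 1, 2, 3, 4, 5]
counts = Counter(source)

# Ascending by count; sorted() is stable, so ties keep first-seen order.
ascending = sorted(counts.items(), key=lambda kv: kv[1])
# Descending by count, most frequent first.
descending = counts.most_common()

print(ascending)   # [(5, 1), (1, 2), (3, 2), (4, 2), (2, 4)]
print(descending)  # [(2, 4), (1, 2), (3, 2), (4, 2), (5, 1)]
```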
I am not sure if it is a good idea to iterate the list and remove elements at the same time. If you really just want to output all items and their number of occurrences, I would do it like this:
listy = [2,2,1,3,4,2,1,2,3,4,5]
listx = []
listc = []
for i in listy:
    if i not in listx:
        listx += [i]
        listc += [listy.count(i)]
for x, c in zip(listx, listc):
    print x, c
Like agf said, modifying a list while you iterate over it will cause problems. You could fix your code by using while and pop:
single_occurrences = []
while listy:
    i = listy.pop(0)
    count = listy.count(i)+1
    if count > 1:
        print i, count
        while i in listy: listy.remove(i)
    else:
        print i, count
        single_occurrences.append(i)
Output:
2 4
1 2
3 2
4 2
5 1
One way to do that would be to create a result list and test whether the current value is already in it (this prints only the items that occur more than once):
res=[]
listy = [2,2,1,3,4,2,1,2,3,4,5]
for i in listy:
    if listy.count(i)>1 and i not in res:
        res.append(i)
for i in res:
    print i, listy.count(i)
Result:
2 4
1 2
3 2
4 2