I have a list:
hello = ['1', '1', '2', '1', '2', '2', '7']
I wanted to display the most common element of the list, so I used:
m = max(set(hello), key=hello.count)
However, I realised that two elements of the list could occur with the same frequency, like the 1's and 2's in the list above. max only returns the first element it finds with the maximum frequency.
What kind of command could check a list to see if two elements both have the maximum number of instances, and if so, output them both? I am at a loss here.
Using an approach similar to your current one, you would first find the maximum count and then look for every item with that count:
>>> m = max(map(hello.count, hello))
>>> set(x for x in hello if hello.count(x) == m)
set(['1', '2'])
Alternatively, you can use the nice Counter class, which can be used to efficiently, well, count stuff:
>>> hello = ['1', '1', '2', '1', '2', '2', '7']
>>> from collections import Counter
>>> c = Counter(hello)
>>> c
Counter({'1': 3, '2': 3, '7': 1})
>>> common = c.most_common()
>>> common
[('1', 3), ('2', 3), ('7', 1)]
Then you can use a generator expression to collect all the items that have the maximum count:
>>> set(x for x, count in common if count == common[0][1])
set(['1', '2'])
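If you want this as a reusable helper, here is a minimal sketch of the Counter approach above for Python 3. The function name most_frequent is my own, not part of collections, and it assumes the items are hashable.

```python
from collections import Counter

def most_frequent(items):
    """Return every item that occurs the maximum number of times."""
    counts = Counter(items)
    if not counts:
        return []  # assumption: an empty input has no most common element
    top = max(counts.values())
    # On Python 3.7+ a Counter keeps first-seen order, so ties come out
    # in order of first appearance.
    return [item for item, count in counts.items() if count == top]

hello = ['1', '1', '2', '1', '2', '2', '7']
print(most_frequent(hello))
```

Returning a list rather than a set keeps the order of first appearance, which may or may not matter for your use.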
Edit: Changed solution
>>> from collections import Counter
>>> from itertools import groupby
>>> hello = ['1', '1', '2', '1', '2', '2', '7']
>>> max_count, max_nums = next(groupby(Counter(hello).most_common(),
...                                    lambda x: x[1]))
>>> print([num for num, count in max_nums])
['1', '2']
from collections import Counter
def myFunction(myDict):
    myMax = 0  # Keep track of the max frequency
    myResult = []  # A list for return
    for key in myDict:
        print('The key is', key, ', The count is', myDict[key])
        print('My max is:', myMax)
        # Finding out the max frequency
        if myDict[key] >= myMax:
            if myDict[key] == myMax:
                myMax = myDict[key]
                myResult.append(key)
            # Case when it is greater than, we will delete and append
            else:
                myMax = myDict[key]
                del myResult[:]
                myResult.append(key)
    return myResult
foo = ['1', '1', '5', '2', '1', '6', '7', '10', '2', '2']
myCount = Counter(foo)
print(myCount)
print(myFunction(myCount))
Output:
The list: ['1', '1', '5', '2', '1', '6', '7', '10', '2', '2']
Counter({'1': 3, '2': 3, '10': 1, '5': 1, '7': 1, '6': 1})
The key is 10 , The count is 1
My max is: 0
The key is 1 , The count is 3
My max is: 1
The key is 2 , The count is 3
My max is: 3
The key is 5 , The count is 1
My max is: 3
The key is 7 , The count is 1
My max is: 3
The key is 6 , The count is 1
My max is: 3
['1', '2']
I wrote this simple program, and I think it also works. I was not aware of the most_common() function until I did a search. It returns every element that shares the highest frequency. It works by tracking the maximum count seen so far: when it sees an element with a higher count, it clears the result list and appends that element; if the count equals the current maximum, it simply appends the element. It keeps going until the whole Counter has been iterated through.
Using Python, I need to split my_list = ['1','2','2','3','3','3','4','4','5'] into a list of sublists such that no sublist contains the same value twice. Correct output = [['1','2','3','4','5'],['2','3','4'],['3']]
Probably not the most efficient approach but effective nonetheless:
my_list = ['1','2','2','3','3','3','4','4','5']
output = []
for e in my_list:
    for f in output:
        if e not in f:
            f.append(e)
            break
    else:
        output.append([e])
print(output)
Output:
[['1', '2', '3', '4', '5'], ['2', '3', '4'], ['3']]
I assumed you are indexing every unique element with its occurrence and also sorted the result list to better suit your desired output.
uniques = list(set(my_list))
uniques.sort()
unique_counts = {unique:my_list.count(unique) for unique in uniques}
new_list = []
for _ in range(max(unique_counts.values())):
    new_list.append([])
for unique, count in unique_counts.items():
    for i in range(count):
        new_list[i].append(unique)
The output for new_list is
[['1', '2', '3', '4', '5'], ['2', '3', '4'], ['3']]
By using collections.Counter for recognizing the maximum number of the needed sublists and then distributing consecutive unique keys on sublists according to their frequencies:
from collections import Counter
my_list = ['1','2','2','3','3','3','4','4','5']
cnts = Counter(my_list)
res = [[] for i in range(cnts.most_common(1).pop()[1])]
for k in cnts.keys():
    for j in range(cnts[k]):
        res[j].append(k)
print(res)
[['1', '2', '3', '4', '5'], ['2', '3', '4'], ['3']]
Here's a way to do it based on getting unique values and counts using list comprehension.
my_list = ['1','2','2','3','3','3','4','4','5']
unique = [val for i,val in enumerate(my_list) if val not in my_list[0:i]]
counts = [my_list.count(val) for val in unique]
output = [[val for val,ct in zip(unique, counts) if ct > i] for i in range(max(counts))]
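The answers above rescan the list (or the sublists) for every element. As an alternative, here is a single-pass sketch that tracks how many times each value has been seen so far and uses that count as the sublist index. The name split_without_repeats is hypothetical, and it assumes the elements are hashable.

```python
from collections import defaultdict

def split_without_repeats(items):
    """Put the n-th occurrence of each value into the n-th sublist."""
    seen = defaultdict(int)  # value -> number of occurrences so far
    output = []
    for item in items:
        index = seen[item]
        if index == len(output):
            # First time any value reaches this occurrence count
            output.append([])
        output[index].append(item)
        seen[item] += 1
    return output

my_list = ['1', '2', '2', '3', '3', '3', '4', '4', '5']
print(split_without_repeats(my_list))
```

Unlike the sorted variants, this keeps elements in the order they appear in the input.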
Let's say we have a list:
listA = ['stack', 'overflow', '1', '2', '3', '4', '1', '5', '3', '7', '2', '3', 'L', '1', ..., 'a', '23', 'Q', '1']
I want to create a new list such as:
new_list = ['1234', '1537', '23L1', ..., 'a23Q1']
So in this case I want to build a new list from listA by removing the first two elements and merging the remaining elements in groups of 4, so that the elements 1, 2, 3, 4 become the single element '1234'.
How should I approach this when the list is very long? It would also be nice to know how to do it without creating a new list, if all I want is to modify the list I already have (such as: listA = ['1234', '1537', '23L1', ..., 'a23Q1']). Thanks.
You can create an iterator over that list and then use zip with four references to that same iterator, so that each tuple zip builds consumes four consecutive elements:
>>> it = iter(listA[2:])
>>> [''.join(x) for x in zip(*[it]*4)]
['1234', '1537', '23L1', 'a23Q1']
itertools.islice allows you to avoid making a temporary copy of that list via slicing:
>>> import itertools
>>> it = itertools.islice(iter(listA), 2, None)
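Here is a small sketch that puts the two pieces together and also covers the "modify the list I already have" part of the question by assigning back through a slice. The function name join_in_groups and its size/skip parameters are my own choices, and the sample list is the one from the question without the elided part.

```python
from itertools import islice

def join_in_groups(items, size=4, skip=2):
    """Skip the first `skip` items, then join each complete run of `size` items."""
    it = islice(items, skip, None)
    # zip pulls from the same iterator `size` times per tuple;
    # a trailing incomplete group is silently dropped.
    return [''.join(group) for group in zip(*[it] * size)]

listA = ['stack', 'overflow', '1', '2', '3', '4', '1', '5', '3', '7',
         '2', '3', 'L', '1', 'a', '23', 'Q', '1']
original = listA

listA[:] = join_in_groups(listA)  # replaces the contents, same list object
print(listA)
```

Slice assignment still builds the joined list temporarily, but any other reference to the original list sees the new contents.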
For your specific question, I would just loop through it.
I would do something like this. (Sorry if the format is a bit off, this is one of my first answers.) This will loop through and combine all complete sets of 4 elements. You could add custom logic if you wanted to keep the leftover elements at the end as well. If you don't want to create a new list, you can work on listA directly and skip the scrubbing step I did with listB.
listA = ['stack', 'overflow', '1', '2', '3', '4', '1', '5', '3', '7', '2', '3', 'L', '1', 'a', '23', 'Q', '1']
listB = listA[2:]  # remove first two elements
partialList = []
wholeList = []
for element in listB:
    partialList.append(element)
    # once four elements are collected, join them and start a new group
    if len(partialList) == 4:
        wholeList.append(''.join(partialList))
        partialList = []
print(wholeList)
If you don't need a new list you could just create one and then assign it back to the old name afterwards. You could try this iterative approach:
listA = ['stack', 'overflow', '1', '2', '3', '4', '1', '5', '3', '7', '2', '3', 'L', '1', 'a', '23']
new_list = [""]
count = 0
for item in listA[2:]:  # skip the first two elements
    if count % 4 == 0 and count > 0:
        new_list.append("")
    new_list[-1] += item
    count += 1
listA = new_list
print(listA)
Output:
['1234', '1537', '23L1', 'a23']
I can't use csv module so I'm opening a csv file like this:
def readdata(filename):
    res = []
    tmp = []
    with open(filename) as f:
        l = f.readlines()
    for i in range(len(l)):
        tmp.append(l[i].strip().split(';'))
    for i in range(len(tmp)):
        for j in range(len(tmp[i])):
            if j > len(res)-1:
                res.append([])
            res[j].append(tmp[i][j])
    return res
res_count_file = "count.csv"
data_count_file = readdata(res_count_file)
This csv file contains this:
ro1;ro2;ro3
5;3;5
8;2;4
6;2;666
15;6;3
2;1;
6;9;7
Now my function reads this and splits it into a list of 3 lists:
[['ro1', '5', '8', '6', '15', '2', '6'], ['ro2', '3', '2', '2', '6', '1', '9'], ['ro3', '5', '4', '666', '3', '', '7']]
I need to check whether every value in a row is less than or equal to x (let's say x = 10), and if so: score += 1
For example:
5;3;5 // none of them are greater than x so score += 1
8;2;4 // none of them are greater than x so score += 1
15;6;3 // 15 is greater than x so nothing happens
2;1; // none of them are greater than x so score += 1; even though there is nothing in ro3, I need to convert the empty string '' into 0
I've tried calling the function below in a for loop to check whether a number is less than x and to increment score when it returns True, but I can't figure out how to check all 3 numbers in ro1, ro2 and ro3 together, as shown in the example.
def apprenant_fiable(data, index_of, i):
    if data[index_of][i] == "":
        return True
    elif int(data[index_of][i]) <= 10:
        #print(data[index_of][i],"***PASS")
        return True
    else:
        #print(data[index_of][i],"***FAIL")
        return False
The goal is to output the total score.
You can use sum on a generator:
lst = [['ro1', '5', '8', '6', '15', '2', '6'], ['ro2', '3', '2', '2', '6', '1', '9'], ['ro3', '5', '4', '666', '3', '0', '7']]
val = 10
score = sum(all(y <= val for y in x) for x in zip(*[map(int, x[1:]) for x in lst]))
# 4
Note that I've replaced the empty string in the list with '0'; you need to handle that yourself while forming the list.
To see which rows counted, loop over the rows instead of summing:
val = 10
for x in zip(*[map(int, x[1:]) for x in lst]):
    if all(y <= val for y in x):
        print(x)
This will now print out all rows that contributed to the score.
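One way to handle the empty cell while forming the list, as a sketch: replace '' with '0' before converting. The name raw is hypothetical and stands for whatever readdata returned; the rest reuses the expression from this answer.

```python
# Assumption: `raw` is the column-wise list produced by readdata,
# still containing the empty string from the "2;1;" row.
raw = [['ro1', '5', '8', '6', '15', '2', '6'],
       ['ro2', '3', '2', '2', '6', '1', '9'],
       ['ro3', '5', '4', '666', '3', '', '7']]

# Keep each header, and turn every empty cell into '0'
lst = [[col[0]] + [cell or '0' for cell in col[1:]] for col in raw]

val = 10
score = sum(all(y <= val for y in row)
            for row in zip(*[map(int, col[1:]) for col in lst]))
print(score)
```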
Your problem is in the head of your function:
def apprenant_fiable(data, index_of, i):
########
You specifically tell your function to look at only one of the three lists. Get rid of this. Inside the function you will somewhere have
for index, value in enumerate(data):
You will need to check all the values before deciding what to return.
If you can't figure out how to do this, there are many places to teach you how to look for presence of a certain quality in a collection.
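As a sketch of what that could look like: the function takes only the row index and loops over every column before deciding what to return. The limit parameter and the treatment of '' as 0 are assumptions taken from the question, not something this answer spelled out.

```python
def apprenant_fiable(data, i, limit=10):
    """Return True if the value at row index i is <= limit in every column."""
    for column in data:
        cell = column[i]
        value = int(cell) if cell != "" else 0  # empty string counts as 0
        if value > limit:
            return False  # one value over the limit is enough to fail
    return True

data = [['ro1', '5', '8', '6', '15', '2', '6'],
        ['ro2', '3', '2', '2', '6', '1', '9'],
        ['ro3', '5', '4', '666', '3', '', '7']]

# Row index 0 holds the headers, so start at 1
score = sum(apprenant_fiable(data, i) for i in range(1, len(data[0])))
print(score)
```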
You can do it easily using the pandas module:
import pandas as pd
# read the csv, treating empty cells as 0
df = pd.read_csv('input.csv', delimiter=';').fillna(0)
# count the rows in which every value is at most 10
print((df <= 10).all(axis=1).sum())
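Something like this? Here list_of_3_lists is the result of reading your input file; zip turns the three columns back into rows, and an empty string counts as 0.
total = 0
for row in zip(*[col[1:] for col in list_of_3_lists]):
    if all(int(t or 0) <= 10 for t in row):
        total += 1
print(total)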
I have a dict like this
b = {'2': ['10', '5', '4'], '4': ['1', '9', '2'], '3': ['90', '87', '77'], '1': ['30']}
I need to compare every value across all the lists and return only the smallest value in the dict.
I have tried
for k,v in b.items():
    for r in range(len(v)):
        print(min(v[r] + v[r]))
It is giving me a weird output!
This is the output obtained from that code.
0
5
4
1
9
2
0
7
7
0
0
0
0
I need the key and the value that is the smallest in the entire dict, with output like this: d = {4: [1]}
Ugly one-liner:
b = {'2': ['10', '5', '4'], '4': ['1', '9', '2'], '3': ['90', '87', '77'], '1': ['30']}
result = dict([min(((int(k), [min(map(int, v))]) for k, v in b.items()), key=lambda t: t[1])])
print(result)
Output:
{4: [1]}
Breakdown:
b = {'2': ['10', '5', '4'], '4': ['1', '9', '2'], '3': ['90', '87', '77'], '1': ['30']}
# Generator of each key with its minimal element
# (here the generator would produce the list [(2, [4]), (4, [1]), (3, [77]), (1, [30])])
key_min = ((int(k), [min(map(int, v))]) for k, v in b.items())
# Pick tuple with minimal value
# (here the tuple (4, [1]) from the previous generator)
min_entry = min(key_min, key=lambda t: t[1])
# Make into dict
# (here {4: [1]}; first element of the tuple is the key and second element the value)
result = dict([min_entry])
print(result)
You can get the minimum of each list with a dict comprehension (this yields one entry per key, not just the overall minimum):
{int(key): [min( int(value) for value in value_list)] for key, value_list in b.items()}
If you want a straightforward answer without any confusion
min_list = {}
for k, v in b.items():
    min_value = min(map(int, v))  # compare as numbers, not as text
    min_list[min_value] = k
print({min_list[min(min_list)]: min(min_list)})
You want the minimum of minimums, or:
min({k: min(b[k], key=int) for k in b}.items(), key=lambda x: int(x[1]))
This returns the tuple ('4', '1').
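One thing to watch for with text values: min on strings compares them character by character, so the int conversion matters. Here is a small sketch of the difference, followed by one way to get the {4: [1]} shape the question asks for. The names per_key and result are my own.

```python
values = ['10', '5', '4']
print(min(values))           # '10' (text comparison: '1' sorts before '4')
print(min(values, key=int))  # '4'  (numeric comparison)

b = {'2': ['10', '5', '4'], '4': ['1', '9', '2'],
     '3': ['90', '87', '77'], '1': ['30']}

# Smallest number per key, then the smallest of those
per_key = ((k, min(map(int, v))) for k, v in b.items())
key, value = min(per_key, key=lambda kv: kv[1])

result = {int(key): [value]}
print(result)  # {4: [1]}
```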
First, your lists hold numbers as text. I did not correct that; if you can fix it at the source, you can drop the int conversion in this code.
for k, v in b.items():
    x = min(map(int, v))  # compare as numbers, not as text
    try:
        lowVal
    except NameError:
        lowVal = x
        lowKey = k
    else:
        if x < lowVal:
            lowKey = k
            lowVal = x
print('{0}: {1}'.format(lowKey, lowVal))
Step through each item in the dict.
Convert the values to int, find the lowest and store it in x for convenience.
Try to see if this is our first time through; if it is, set lowKey to the key and lowVal to the lowest value in the list.
Otherwise, if lowVal already exists, see if the current lowest value is lower than the previous lowest. If it is, set lowKey and lowVal to the current loop's values.
Print
????
Profit
Edit: a word
What if you have multiple key value pairs with same minimum value?
This solution works fine for that as well.
result={k:min(map(int,v)) for k,v in b.items()}
minVal=min(result.values())
result={k:[minVal] for k in result if result[k] == minVal}
print(result)
{'4': [1]}
For example, with:
b = {'2': ['10', '5', '4'], '4': ['1', '9', '2'], '3': ['90', '1', '77'], '1': ['30']}
Output will be :
{'3': [1], '4': [1]}
I have the following list_A:
['0', '1', '2', '3', '4', '5', '6', '7']
and this other list_B:
['2','6','7']
I would like to check this: For each element in "list_A", if it is one of the elements in "list_B"
So:
for 0 <-> are you one of these? ['2','6','7']
for 1 <-> are you one of these? ['2','6','7']
for 2 <-> are you one of these? ['2','6','7']
And at the end, I would like to come up with a "list_C" that is identical to "list_A" in terms of element count but more like a map that looks like that:
['-1', '-1', '2', '-1', '-1', '-1', '6', '7']
That is: "-1" for every non-matching element and the element itself for every matching one. Currently I am doing this with two nested for-each loops, and it works:
myStateMap = []
for a in list_A:
    elementString = -1
    for b in list_B:
        if a == b:
            # Update the elementString in case of a match
            elementString = a
            print("\tMatch")
        else:
            pass
            print("\tNO Match!")
    # Store the elementString
    myStateMap.append(elementString)
The question is: How would you optimize this? How would you make it shorter and more efficient?
You can use a list comprehension:
>>> [('-1' if item not in list_B else item) for item in list_A]
['-1', '-1', '2', '-1', '-1', '-1', '6', '7']
Use a list comprehension with a conditional expression:
[i if i in list_B else '-1' for i in list_A]
Demo:
>>> list_A = ['0', '1', '2', '3', '4', '5', '6', '7']
>>> list_B = ['2','6','7']
>>> [i if i in list_B else '-1' for i in list_A]
['-1', '-1', '2', '-1', '-1', '-1', '6', '7']
if list_B is large, you should make it a set instead:
set_B = set(list_B)
to speed up the membership testing. in on a list has linear cost (the more elements need to be scanned, the longer it takes), while the same test against a set takes constant cost (independent of the number of values in the set).
For your specific example, using a set is already faster:
>>> timeit.timeit("[i if i in list_B else '-1' for i in list_A]", "from __main__ import list_A, list_B")
1.8152308464050293
>>> timeit.timeit("set_B = set(list_B); [i if i in set_B else '-1' for i in list_A]", "from __main__ import list_A, list_B")
1.6512861251831055
but if the ratio of list_A to list_B is different and the sizes are small, building the set can cost more than it saves:
>>> list_A = ['0', '1', '2', '3']
>>> list_B = ['2','6','8','10']
>>> timeit.timeit("[i if i in list_B else '-1' for i in list_A]", "from __main__ import list_A, list_B")
0.8118391036987305
>>> timeit.timeit("set_B = set(list_B); [i if i in set_B else '-1' for i in list_A]", "from __main__ import list_A, list_B")
0.9360401630401611
That said, in the general case it is worth your while using sets.
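Putting the set version together for the lists in the question, as a minimal sketch (list_C is the name used in the question):

```python
list_A = ['0', '1', '2', '3', '4', '5', '6', '7']
list_B = ['2', '6', '7']

# Build the set once, outside the comprehension
set_B = set(list_B)

# Keep matching elements, replace everything else with '-1'
list_C = [item if item in set_B else '-1' for item in list_A]
print(list_C)  # ['-1', '-1', '2', '-1', '-1', '-1', '6', '7']
```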
The quickest way to optimize is to use if a in list_B: instead of your inner loop. So the new code would look like:
myStateMap = []
for a in list_A:
    if a in list_B:
        myStateMap.append(a)
        print('\tMatch')
    else:
        print('\tNO Match!')
        myStateMap.append(-1)
Here's another short list comprehension example that's a little different from the others:
a=[1,2,3,4,5,6,7]
b=[2,5,7]
c=[x * (x in b) for x in a]
Which gives c = [0, 2, 0, 0, 5, 0, 7]. If your list elements are actually strings, as they seem to be, then you get either the empty string '' or the original string. This takes advantage of the implicit conversion of a boolean value (x in b) to either 0 or 1 before multiplying it by the original value (which, in the case of strings, is "repeated concatenation").
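To make the string case concrete, here is a small sketch using the lists from the question. The second step, which swaps the blanks for '-1', is my own addition and only matters if you still want that placeholder.

```python
list_A = ['0', '1', '2', '3', '4', '5', '6', '7']
list_B = ['2', '6', '7']

# str * True repeats the string once, str * False gives ''
masked = [x * (x in list_B) for x in list_A]
print(masked)  # ['', '', '2', '', '', '', '6', '7']

# Assumption: the '-1' placeholder is still wanted, so replace the blanks
list_C = [x or '-1' for x in masked]
print(list_C)  # ['-1', '-1', '2', '-1', '-1', '-1', '6', '7']
```

Note that the `x or '-1'` step would also replace a genuine empty string in list_A, so it only fits data like the example.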