How to find the most repeated element in a list in Python?

I'm currently using collections.Counter for this. The issue I'm facing is that when several elements share the same highest count, most_common(1) returns the one that occurs first in the list.
from collections import Counter
a = [1, 3, 2, 2, 3]
coun = Counter(a)
print(coun.most_common(1))
Output: [(3, 2)]
a = [1, 2, 3, 2, 3]
coun = Counter(a)
print(coun.most_common(1))
Output: [(2, 2)]
I want the smallest key among the tied elements (2 here), regardless of the order in which they appear. I could sort the list, but I'm concerned that sorting would take a lot of time.
Please help
Sorry for the formatting mess.

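Depending on how many ties you expect, you could simply inspect more of the most_common values. Assuming no more than 100 values share the highest count, you could do:
print(min(coun.most_common(100), key=lambda kv: (-kv[1], kv[0])))
The key orders the (value, count) pairs by highest count first and then by smallest value, so a tie is resolved in favour of the lower key. (Sorting the plain tuples would order them by value alone and could return a less frequent element.) You could use a different value than 100, of course. Either way at most 100 tuples are compared, which isn't a problem.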
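If the number of ties is not known in advance, one option is to skip the cutoff and take the minimum over all (value, count) pairs in a single pass. This is only a sketch: the function name smallest_most_common is made up for illustration, and it assumes the values are mutually comparable and the list is non-empty (min raises ValueError on an empty input).

```python
from collections import Counter


def smallest_most_common(values):
    """Return (value, count) for the most frequent value, preferring the
    smallest value when several share the highest count."""
    counts = Counter(values)
    # Order pairs by highest count first, then by smallest value.
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))


print(smallest_most_common([1, 3, 2, 2, 3]))  # (2, 2)
print(smallest_most_common([1, 2, 3, 2, 3]))  # (2, 2)
```

Counting is one pass over the list and min is one pass over the distinct values, so nothing is sorted.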

Related

Python Matching Multiple Keys/ Unique Pairs to a Value

What would be the fastest, most efficient way to map multiple values to one value? As a use case, say you are multiplying two numbers and you want to remember whether you have multiplied those numbers before. Instead of making a giant X by Y matrix and filling it out, it would be nice to query a dict to see if dict[2,3] = 6 or dict[3,2] = 6. This would be especially useful for more than 2 values.
I have seen an answer similar to what I'm asking here, but would this be O(n) time or O(1)?
print value for matching multiple key
for key in responses:
    if user_message in key:
        print(responses[key])
Thanks!
Seems like the easiest way to do this is to sort the values before putting them in the dict. Then sort the x,y... values before looking them up. And note that you need to use tuples to map into a dictionary (lists are mutable).
the_dict = {(2, 3, 4): 24, (4, 5, 6): 120}
nums = tuple(sorted([6, 4, 5]))
if nums in the_dict:
    print(the_dict[nums])

Choosing N changing points in a sorted list

Imagine we have a sorted list of size P. How can we choose N indices whose values represent the range of the list as evenly as possible? For example, if our list is:
List=[0,0,0,0,0,0,0,0,0,0.1,0.1,0.9,0.91,0.91,0.92,0.99,0.99,0.99]
Then how do we choose, say, 5 indices that show the full range of the list?
In this example it would be something like:
indices=[0,9,11,14,15]
The final indices list doesn't have to be exactly like the one I wrote here though
This will give you a starting point, the first index of each distinct value:
sorted(List.index(x) for x in set(List))
The sorted call is needed because a set has no defined order. For the example list this yields [0, 9, 11, 12, 14, 15]. This may have too many elements, but "somehow" is subjective and not a clear enough definition of what you need to do. As a default you can keep the first and last element, then randomly pick as many as you require from the "middle".
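The default described above (keep the first and last change point, sample the rest at random) could be sketched as follows. The function name pick_change_points and the seed parameter are made up for illustration, and the input is assumed to be sorted already. Change points are found in one pass instead of calling index for every distinct value.

```python
import random


def pick_change_points(values, n, seed=None):
    """Pick up to n indices at which a sorted list changes value."""
    # Index where each distinct value first appears (requires sorted input).
    starts = [i for i, v in enumerate(values) if i == 0 or v != values[i - 1]]
    if n >= len(starts):
        return starts  # fewer change points than requested: return them all
    if n == 1:
        return starts[:1]
    rng = random.Random(seed)
    # Keep the first and last change point, sample the rest from the middle.
    middle = rng.sample(starts[1:-1], n - 2)
    return sorted([starts[0]] + middle + [starts[-1]])


List = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0.1, 0.1, 0.9, 0.91, 0.91, 0.92, 0.99, 0.99, 0.99]
print(pick_change_points(List, 5, seed=0))
```

The result always starts at index 0 and ends at 15 for this list. The three middle indices depend on the random seed.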

Optimizing min function for changing lists

I am dealing with a problem where I need to keep track of the minimum number in a list. However, this list is constantly shrinking, say from a million elements down to a single element. I am looking for a way to avoid recomputing the minimum every time the list gets one element smaller, for example by keeping track of the minimum element so that, if it is removed, the next smallest element becomes the minimum. I want to accomplish this in linear time. (It should be achievable given the mechanics of the problem.)
What I have thought of so far is to use collections.Counter to count the elements in the list. Then I find the minimum (two passes over the list so far, which is still O(n)), and every time I remove an element, I subtract 1 from the count stored under that key. However, when the minimum number's count is depleted, I would still need to find the second smallest element to replace it.
Please help me find a solution to this. I would say this is an interesting problem.
If your program can afford to sort the list once up front, the minimum is always the first element:
a = [10, 9, 10, 8, 7, 6, 5, 4, 3, 2, 1, 1, 1, 0]  # you're only removing elements
a = sorted(a)  # sort ascending
# then remove elements from the list as needed;
# a[0] is always the minimum element
# check that the list is not empty before reading the minimum
smallest = a[0]  # avoid naming this "min", which would shadow the built-in
So there is no need to search for it every time.
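If elements are removed in arbitrary order, deleting from a sorted Python list costs O(n) per removal because the remaining items are shifted. One possible alternative is a heap with lazy deletion, combining heapq with the Counter idea from the question. This is a sketch, not a definitive answer: the class name MinTracker and its methods are hypothetical, remove assumes the value is actually present, and the total cost is O(n) to build plus O(log n) amortized per removal, so it is not strictly linear.

```python
import heapq
from collections import Counter


class MinTracker:
    """Track the minimum of a shrinking multiset using lazy deletion."""

    def __init__(self, items):
        self._heap = list(items)
        heapq.heapify(self._heap)   # O(n) build
        self._pending = Counter()   # removed values still sitting in the heap

    def remove(self, value):
        # Assumes value is currently present; it is only marked here.
        self._pending[value] += 1

    def minimum(self):
        # Discard heap tops that were marked as removed earlier.
        while self._heap and self._pending[self._heap[0]] > 0:
            top = heapq.heappop(self._heap)
            self._pending[top] -= 1
        if not self._heap:
            raise ValueError("no elements left")
        return self._heap[0]


tracker = MinTracker([10, 9, 10, 8, 7, 6, 5, 4, 3, 2, 1, 1, 1, 0])
print(tracker.minimum())  # 0
tracker.remove(0)
print(tracker.minimum())  # 1
```

Each element is pushed once and popped at most once, which is where the amortized bound comes from.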

Pythonic way to get last index value of enumerate

When I enumerate through a list, is there an intended, pythonic way of obtaining the last index value provided?
Something that would get the equivalent of this:
highest = None
for index, value in enumerate(long_list):
    # do stuff with index and value
    highest = index
return highest
I don't like this approach: it performs hundreds of unnecessary variable assignments, and it's ugly.
Background: I have an ordered list built with an RDBMS and SQLAlchemy, using numbers as indexes in the relation table. I also store the highest used index number in the list table, for easy appending of new entries (without an extra max lookup on the relation table). In case things get messed up, for whatever reason, I included a reorg function that rebuilds the indexes starting from 0 (to remove any gaps). I do that by iterating over the association table with for and enumerate. After that I need to count the rows or take the max of the index to get my new highest index value for the list table. That kinda bugs me.
Suggestions? Preferably something that works in 2.7.
To get the last index of a list
len(mylist)-1
To get the last element of a list you can simply use a negative index.
mylist[-1]
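When the rows come from an iterator without a length, len is not available. In that case it may help that a Python for loop leaves its loop variable bound after the loop ends, so no separate highest variable is needed. A minimal sketch, in which the function name reindex and the "position" key are hypothetical stand-ins for the real association rows:

```python
def reindex(rows):
    """Renumber rows from 0 and return the highest index used (-1 if empty)."""
    index = -1  # returned unchanged when rows is empty
    for index, row in enumerate(rows):
        row["position"] = index  # hypothetical column holding the list index
    # After the loop, index still holds the last value enumerate produced.
    return index


rows = [{"position": 5}, {"position": 9}, {"position": 12}]
print(reindex(iter(rows)))  # 2
```

This relies only on plain loop semantics, so it behaves the same way in Python 2.7.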

Python Spark split list into sublists divided by the sum of value inside elements

I am trying to split a list of objects in Python into sublists based on the cumulative value of one of the parameters in each object. Let me illustrate with an example:
I have a list of objects like this:
[{x:1, y:2}, {x:3, y:2}, ..., {x:5, y: 1}]
and I want to divide this list into sub-lists where the total sum of x values inside a sublist will be the same (or roughly the same) so the result could look like this:
[[{x:3, y:1}, {x:3, y:1}, {x:4, y:1}], [{x:2, y:1}, {x:2, y:1}, {x:6, y:1}]]
Here the sum of the x values in each sublist is equal to 10. The objects I am working with are a little more complicated, and my x values are floats. So I want to aggregate the values from the ordered list until the sum of the x values is >= 10, and then start the next sublist.
In my case, the first list of elements is an ordered list, and the summation has to take place on the ordered list.
I did something like this already in C#, where I iterate through all my elements and keep a running counter of the "x" value. I sum the value of x for consecutive objects until it hits my threshold, and then I create a new sublist and reset my counter.
Now I want to reimplement it in Python, and then use it with Spark. So I am looking for a somewhat more "functional" implementation, maybe something that works nicely with a map-reduce framework. I can't figure out any way other than the iterative approach.
If you have any suggestions, or possible solutions, I would welcome all constructive comments.
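For reference, the iterative approach described in the question could be sketched in plain Python as a generator. This is not Spark-specific. The function name split_by_threshold and its parameters are made up for illustration, and the objects are assumed to be dicts with a numeric "x" key.

```python
def split_by_threshold(items, threshold=10.0, key=lambda item: item["x"]):
    """Yield consecutive sublists whose summed key reaches the threshold."""
    chunk, total = [], 0.0
    for item in items:
        chunk.append(item)
        total += key(item)
        if total >= threshold:
            yield chunk          # threshold reached: emit and start over
            chunk, total = [], 0.0
    if chunk:
        yield chunk              # leftover items that never reached it


data = [{"x": 3, "y": 1}, {"x": 3, "y": 1}, {"x": 4, "y": 1},
        {"x": 2, "y": 1}, {"x": 2, "y": 1}, {"x": 6, "y": 1}]
for group in split_by_threshold(data):
    print([d["x"] for d in group])
# [3, 3, 4]
# [2, 2, 6]
```

Because each group depends on the running total of everything before it, the loop is inherently sequential over the ordered list.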
