Python multidimensional sorting, combined values

I have a multidimensional list that I would like to sort on a combined weighting of two numeric elements. For example, these are the results of sorted(results, key=operator.itemgetter(2,3)):
[..,1,34]
...
...
[..,10,2]
[..,11,1]
[..,13,3]
[..,13,3]
[..,13,3]
[..,16,1]
[..,29,1]
The problem with itemgetter is that it first sorts by element 2, then by element 3, whereas
I would like to have the 13,3 entries at the top/bottom (depending on asc/desc sort).
Is this possible, and if so, how?
Many thanks
Edit 1.
Sorry for being obtuse. I am processing DOM data, results from search pages; it's a generic search engine searcher, so to speak.
What I am doing is finding the a and div tags, then counting how many times a particular class or id occurs in the div/a tag; this is element 2. Then I rescan the list of found tags and see which other classes/ids for the tags match the total for the tag currently being processed. So in this case item 13,3 has 13 matches for class/id for that type of tag, and 3 denotes that there are 3 other tags with classes/ids that occur the same number of times. That is why I wish to sort like that. And no, it is not a dict; it's definitely a list.
Thank you.

I'm making a total guess here, given lack of any other explanation, and assuming what you're actually trying to do is sort by the product of the last two keys in your list, secondarily sorted by magnitude of the first element in the product. That's the only explanation I can come up with offhand for why (13,3) would be the top result.
In that case, you'd be looking for something like this:
sorted(results, key=lambda x: (x[-2]*x[-1], x[-2]), reverse=True)
That would give you the following sort:
[[13, 3], [13, 3], [13, 3], [1, 34], [29, 1], [10, 2], [16, 1], [11, 1]]
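A minimal, self-contained sketch of that key on made-up rows. The "x" placeholder column and the names rows, combined_weight and ranked are assumptions standing in for the fields elided in the question:

```python
# Hypothetical rows: a placeholder field followed by the two numeric fields.
rows = [
    ["x", 1, 34], ["x", 10, 2], ["x", 11, 1], ["x", 13, 3],
    ["x", 13, 3], ["x", 13, 3], ["x", 16, 1], ["x", 29, 1],
]

def combined_weight(row):
    """Sort key: product of the last two fields, then the first of the two."""
    return (row[-2] * row[-1], row[-2])

# Largest combined weight first; ties fall back to the second-to-last field.
ranked = sorted(rows, key=combined_weight, reverse=True)
```

If a different weighting is wanted (a sum, a ratio, and so on), only the body of combined_weight would need to change.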
Alternatively, if what you're actually looking for here is to have the results ordered by the number of times they appear in your list, we can use a collections.Counter. Unfortunately, lists aren't hashable, so we'll cheat a bit and convert them to tuples to use as the keys. There are ways around this, but this is the simplest way for me for now to demonstrate what I'm talking about.
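import collections

def sort_results(results):
    # Lists aren't hashable, so convert each row to a tuple before counting.
    c = collections.Counter(tuple(k) for k in results)
    # Distinct rows, most frequent first.
    return sorted(c, key=lambda x: c[x], reverse=True)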
This gets you the following (the order of the rows that are tied on count can vary with the Python version):
[(13, 3), (1, 34), (16, 1), (29, 1), (11, 1), (10, 2)]
Thanks J.F. Sebastian for pointing out that tuples could be used instead of str!
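As a variation on the same idea, Counter.most_common() already returns the distinct items ordered by count, so the explicit sorted call may not be needed. A small sketch on made-up data (the sample rows and the name by_frequency are assumptions):

```python
from collections import Counter

# Hypothetical two-column rows; (13, 3) appears three times.
results = [[1, 34], [10, 2], [13, 3], [13, 3], [13, 3], [16, 1]]

# Tuples are hashable, so they can be counted directly.
counts = Counter(tuple(row) for row in results)

# most_common() yields (item, count) pairs, highest count first.
by_frequency = [item for item, _ in counts.most_common()]
```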

Yes, you can write whatever function you want as the key function. For example, if you wanted to sort by the sum of the elements at indices 2 and 3:
import operator

def keyfunc(item):
    # Add the values at indices 2 and 3 to get a single sort key.
    return sum(operator.itemgetter(2, 3)(item))

sorted(results, key=keyfunc)
So if you used this function as your keyfunc, an item with 13 at index 2 and 3 at index 3 would be sorted as though it were the value 16.
It's not clear how you want to sort these elements, but you can change the body of keyfunc to perform whatever operation you'd like.

Related

Python cross multiplication with an arbitrary number of lists

I'm not sure what the correct term is for the multiplication here but I need to multiply an element from List A for example by every element in List B and create a new list for the new elements, so that the total length of the new list is len(A)*len(B).
As an example
A = [1,3,5], B=[4,6,8]
I need to multiply the two together to get
C = [4,6,8,12,18,24,20,30,40]
I have researched this and found that itertools.product does exactly what I need; however, my code calls it with a fixed number of lists, and I need to generalise to any number of lists as requested by the user.
I don't have access to the full code right now but the code asks the user for some lists (can be any number of lists) and the lists can have any number of elements in the lists (but all lists contain the same number of elements). These lists are then stored in one big list.
For example (user input)
A = [2,5,8], B= [4,7,3]
The big list will be
C = [[2,5,8],[4,7,3]]
In this case there are two lists in the big list but in general it can be any number of lists.
Once the code has this I have
print([a*b for a,b in itertools.product(C[0],C[1])])
>> [8,14,6,20,35,15,32,56,24]
The output of this is exactly what I want, however in this case the code is written for exactly two lists and I need it generalised to n lists.
I've been thinking about creating a loop to somehow loop over it n times, but so far I have not been successful. Since C could be of any length, the loop needs a way to know when it has reached the end of the list. I don't need it to compute the product with n lists at the same time:
print([a0*a1*...*a(n-1) for a0,a1,...,a(n-1) in itertools.product(C[0],C[1],C[2],...C[n-1])])
The loop could multiply two lists at a time then use the result from that multiplication against the next list in C and so on until C[n-1].
I would appreciate any advice to see if I'm at least heading in the right direction.
p.s. I am using numpy and the lists are arrays.
You can pass a variable number of arguments to itertools.product with *. * is the unpacking operator: it unpacks the list and passes its values to the function as if they had been passed separately.
import itertools
import math
A = [[1, 2], [3, 4], [5, 6]]
result = list(map(math.prod, itertools.product(*A)))
print(result)
Result:
[15, 18, 20, 24, 30, 36, 40, 48]
You can find many explanations on the internet about * operator. In short, if you call a function like f(*lst), it will be roughly equivalent to f(lst[0], lst[1], ..., lst[len(lst) - 1]). So, it will save you from the need to know the length of the list.
Edit: I just realized that math.prod is a 3.8+ feature. If you're running an older version of Python, you can replace it with its numpy equivalent, np.prod.
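For Python versions before 3.8, another way to avoid math.prod without NumPy is functools.reduce with operator.mul. A minimal sketch, where cross_products is a hypothetical helper name and C is the sample input from the question:

```python
from functools import reduce
from itertools import product
from operator import mul

def cross_products(lists):
    """Multiply one element from each list, for every combination."""
    # product(*lists) yields one tuple per combination;
    # reduce(mul, combo, 1) multiplies the tuple's members together.
    return [reduce(mul, combo, 1) for combo in product(*lists)]

C = [[2, 5, 8], [4, 7, 3]]
result = cross_products(C)
```

Because the lists are unpacked with *, the same call should work for any number of inner lists.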
You could use a reduce function, which is intended exactly for this type of operation: it accumulates a result by applying a function to the running value and each element in turn. The standard library provides it as functools.reduce; here is an example with a hand-written version so you can better understand how it works:
lists = [
    [4, 6, 8],
    [1, 3, 5]
]

def reduce(function, iterable, initializer=None):
    it = iter(iterable)
    if initializer is None:
        value = next(it)
    else:
        value = initializer
    for element in it:
        value = function(value, element)
    return value

def cmp(a, b):
    for x in a:
        for y in b:
            yield x*y

summed = list(reduce(cmp, lists))
# OUTPUT
[4, 12, 20, 6, 18, 30, 8, 24, 40]
In case you need it sorted just make use of the sort() function.
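The same "two lists at a time" idea can be sketched with the standard library's functools.reduce instead of a hand-written one. The helper name pairwise_products and the third sample list are assumptions for illustration:

```python
from functools import reduce

def pairwise_products(a, b):
    """Every element of a multiplied by every element of b, as a list."""
    return [x * y for x in a for y in b]

# Hypothetical input: any number of inner lists is allowed.
lists = [[1, 3, 5], [4, 6, 8], [1, 2]]

# Fold over the first two lists only, then over all of them.
two = reduce(pairwise_products, lists[:2])
combined = reduce(pairwise_products, lists)
```

Returning a list rather than a generator from the helper keeps each intermediate result reusable, at the cost of holding it in memory.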

Python, Make variable equal to the second column of an array

I realise that there's a fair chance this has been asked somewhere else, but to be honest I'm not sure exactly what terminology I should be using to search for it.
But basically I've got a list with a varying number of elements. Each element contains 3 values: A string, another list, and an integer eg:
First element = ('A', [], 0)
so
ListofElements[0] = [('A', [], 0)]
And what I am trying to do is make a new list that consists of all of the integers(3rd thing in the elements) that are given in ListofElements.
I can do this already by stepping through each element of ListofElements and then appending the integer onto the new list shown here:
NewList = []
for element in ListofElements:
    NewList.append(element[2])
But using a for loop seems like the most basic way of doing it, is there a way that uses less code? Maybe a list comprehension or something such as that. It seems like something that should be able to be done on a single line.
That is just a step in my ultimate goal, which is to find out the index of the element in ListofElements that has the minimum integer value. So my process so far is to make a new list, and then find the integer index of that new list using:
index=NewList.index(min(NewList))
Is there a way that I can just avoid making the new list entirely and generate the index straight away from the original ListofElements? I got stuck with what I would need to fill in to here, or how I would iterate through :
min(ListofElements[?][2])
You can use a list comprehension:
[x[2] for x in ListofElements]
This is generally considered a "Pythonic" approach.
You can also find the minimum in a rather stylish manner using:
minimum = min(ListofElements, key=lambda x: x[2])
index = ListofElements.index(minimum)
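If only the index is needed, one option is to run min over enumerate, which avoids both the intermediate list and the second pass made by .index(). A sketch on made-up data (list_of_elements and its contents are assumptions):

```python
# Hypothetical elements: (string, list, integer).
list_of_elements = [("A", [], 7), ("B", [1], 2), ("C", [], 5)]

# enumerate yields (index, element) pairs; compare on the element's integer.
index, element = min(enumerate(list_of_elements), key=lambda pair: pair[1][2])
```

On ties, min returns the first pair it encountered, so index would be the lowest matching position.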
Some general notes:
In Python, lowercase names with underscores are the standard for variables rather than CamelCase.
In Python you can often avoid an explicit for loop. Where they read more clearly, prefer
comprehensions or a functional pattern (map, filter, etc.)
You can map your list with itemgetter (in Python 3, wrap map in list() to see the values):
>>> from operator import itemgetter
>>> l = [(1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3)]
>>> list(map(itemgetter(2), l))
[3, 3, 3, 3, 3]
Then you can go with your approach to find the position of the minimum value.

Grouping lists of numbers together in Python

I have the following which is a list of a list in Python, and is a partial list of values that I have:
[1,33]
[2,10,42]
[5,1,33,44]
[10,42,98]
[44,12,100,124]
Is there a way of grouping them so they collect the values that are common in each list?
For example, if I look at the first list [1,33], I can see that the value exists in the third list: [5,1,33,44]
So, those are grouped together as
[5,1,33,44]
If I carry on looking, I can see that 44 is in the final list, and so that will be grouped along with this list.
[44,12,100,124] is added onto [5,1,33,44]
to give:
[1,5,12,33,44,100,124]
The second list [2,10,42] has common values with [10,42,98] and are therefore joined together to give:
[2,10,42,98]
So the final lists are:
[1,5,12,33,44,100,124]
[2,10,42,98]
I am guessing there is a specific name for this type of grouping. Is there a library available that can deal with it automatically? Or would I have to write a manual way of searching?
I hope the edit makes it clearer as to what I am trying to achieve.
Thanks.
Here's a solution that does not require anything from the standard library or 3rd party packages. Note that this will modify a. To avoid that, just make a copy of a and work with that. The result is a list of lists containing your resulting sorted lists.
a = [
    [1,33],
    [2,10,42],
    [5,1,33,44],
    [10,42,98],
    [44,12,100,124]
]
res = []
while a:
    el = a.pop(0)
    res.append(el)
    # Iterate over a copy, because matching sublists are removed from a.
    for sublist in a[:]:
        if set(el).intersection(set(sublist)):
            res[-1].extend(sublist)
            a.remove(sublist)
res = [sorted(set(i)) for i in res]
print(res)
# [[1, 5, 12, 33, 44, 100, 124], [2, 10, 42, 98]]
How this works:
Form an empty result list res. Groupings from a will be "transferred" here.
.pop() off the first element of a. This modifies a in place and defines el as that element.
Then loop through each sublist in a, comparing your popped el to those sublists and "building up" common sets. This is where your problem is a tiny bit tricky in that you need to gradually increment your intersected set rather than finding the intersection of multiple sublists all at once.
Repeat this process until a is empty.
Alternatively, if you just want to group together the even- and odd-numbered sublists (still a bit unclear from your question), you can use itertools:
from itertools import chain
grp1 = sorted(set(chain.from_iterable(a[::2])))
grp2 = sorted(set(chain.from_iterable(a[1::2])))
print(grp1)
print(grp2)
# [1, 5, 12, 33, 44, 100, 124]
# [2, 10, 42, 98]
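The first approach above makes a single pass per popped element, so it can miss a merge when a sublist only becomes connected after a later one has been absorbed (for example [[1], [2, 3], [1, 2]]). This kind of grouping is usually described as finding connected components. A sketch that keeps a set of disjoint groups and folds each new list into every group it touches (merge_overlapping is a hypothetical name, and the order of the returned groups may differ from the output shown above):

```python
def merge_overlapping(lists):
    """Merge lists that share any value into disjoint, sorted groups."""
    groups = []  # invariant: the sets in here never overlap each other
    for values in lists:
        current = set(values)
        untouched = []
        for group in groups:
            if group & current:
                current |= group         # absorb every group sharing a value
            else:
                untouched.append(group)  # keep unrelated groups as they are
        untouched.append(current)
        groups = untouched
    return [sorted(group) for group in groups]

data = [[1, 33], [2, 10, 42], [5, 1, 33, 44], [10, 42, 98], [44, 12, 100, 124]]
merged = merge_overlapping(data)
```

Unlike the first approach, this does not modify the input list.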

Associating values from one list with values in another programmatically

(Asking again in a more concise way)
I have four lists of values and I need to link the first and last together like this:
so that I can plot the points (4, 8350.1416), (10, 13167.329), (15, 29200.063), etc.
The enumerate function can give me access to the indices of the rightmost list, but how can I associate the values in that one with the correct values in the leftmost list?
The lists change with each run of the code, so I need to do it programmatically, like in a for loop for example.
EDIT: My program reads the pixel values along a randomly selected row. List1 holds the minimum-valued pixels, and list2 holds their values. Then list3 holds the minimum values of those minimum values, and list4 holds their values. Describing it like that sounds a lot more confusing than it is!
I've tried using
ubermin_vals_x = []
for i in ubermin_values:
    value = ubermin_pixels[i]
    ubermin_vals_x.append(minimum_pixels[i])
but it tries to iterate over the values (8350.1416, 13167.329...) which of course can't be done.
I'm trying to plot the lists to look like this:
but have the black carets from list4 at the correct points along the x-axis, which are given in list1.
Naming the lists from left to right l1, l2, l3, l4: l2 seems redundant to me, since it just replicates the values in l4. So, if I understand the problem, the code could be:
for i, v in zip(l3, l4):
    print(l1[i], v)  # or plot
and you can replace v with l2[i].
Or even simpler:
for i in l3:
    print(l1[i], l2[i])
As noted in the comment below, the elements of l3 in your example seem to be single-element lists, so the code becomes:
for i in l3:
    print(l1[i[0]], l2[i[0]])
It's not very clear to me what you are trying to do, but here is my guess:
find the index of each element of the 4th array in the 2nd array
use that index to extract the number in the 1st array
and the implementation is as follows:
a4 = [ 8350.1416, 13167.329, 29200.063 ]
a2 = [13846, 8350.1416, 0, 13167.329, 0, 29200.063]
a1 = [1, 4, 7, 10, 12, 15, 18]
idx = [a1[a2.index(x)] for x in a4]
result = zip(idx, a4)
I also suspect @Vincenzooo's answer is already very close to what you want. Maybe
for i in l3:
    print(l1[i[0]], l2[i[0]])
Thanks, everyone, especially @Vincenzoo
It works now with this:
uberminxlist = []
uberminylist = []
for i in ubermin_pixels:
    uberminxlist.append(minimum_pixels[i[0]])
    uberminylist.append(minimum_values[i[0]])
Lovely :)
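The same loop can be written more compactly with a list comprehension and zip. This is only a sketch: the sample values below are assumptions borrowed from the example numbers earlier in the thread, and ubermin_pixels is assumed to hold single-element lists of indices:

```python
# Hypothetical sample data (x positions, their values, and index lists).
minimum_pixels = [1, 4, 7, 10, 12, 15]
minimum_values = [13846, 8350.1416, 0, 13167.329, 0, 29200.063]
ubermin_pixels = [[1], [3], [5]]

# Pair each selected x position with its value.
points = [(minimum_pixels[i[0]], minimum_values[i[0]]) for i in ubermin_pixels]

# zip(*points) splits the pairs back into x and y sequences for plotting.
uberminxlist, uberminylist = zip(*points)
```

Note that zip produces tuples rather than lists here; wrap them in list() if the plotting code needs lists.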

Swapping maximum and minimum values in a list

Given a list (for instance, [1,1,2,1,2,2,3]) which is not sorted highest to lowest, and contains multiples of all numbers, I need to swap, in place, the maximums with the minimums, the second maxes with the second mins, etc. So, our example list would become [3,3,2,3,2,2,1].
Also, just to clarify, it's not just the max and min, but each layer of maxes and mins. So if the max was 4, 1's and 4's should switch as well as 2's and 3's.
I found this question on the topic: How to swap maximums with the minimums? (python)
but the code examples given seemed verbose, and assumed that there were no duplicates in the list. Is there really no better way to do this? It seems like a simple enough thing.
This is one way to do it, possible because Python is such an expressive language:
>>> a = [1,1,2,1,2,2,3]
>>> d = dict(zip(sorted(set(a)), sorted(set(a), reverse=True)))
>>> [d[x] for x in a]
[3, 3, 2, 3, 2, 2, 1]
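The comprehension above builds a new list. Since the question asks for an in-place swap, one option is slice assignment, which replaces the contents of the existing list object. A sketch, where swap_extremes_in_place and mirror are hypothetical names:

```python
def swap_extremes_in_place(values):
    """Replace each value with its mirror rank: max<->min, 2nd max<->2nd min."""
    ranks = sorted(set(values))
    mirror = dict(zip(ranks, reversed(ranks)))
    # Slice assignment mutates the same list object instead of rebinding it.
    values[:] = [mirror[v] for v in values]

a = [1, 1, 2, 1, 2, 2, 3]
alias = a  # a second reference, to show the original object was changed
swap_extremes_in_place(a)

b = [1, 4, 2, 3]
swap_extremes_in_place(b)
```

A temporary list is still created on the right-hand side, so this is in place only in the sense that every reference to the list sees the new contents.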
