How to check if two lists are equal with duplicates? - python

I'm trying to compare two lists in Python, checking if they are the same. The problem is that both lists can contain duplicate elements, and in order to be considered equal, they need to have the same amount of duplicate elements.
I've currently "solved" this by creating a copy of both lists, and removing an element from both lists if they are equal:
def equals(v1: Vertex, v2: Vertex) -> bool:
    # also checks if neighbourhoods are the same size
    if v1.label == v2.label:
        # copy the neighbourhoods to prevent data loss on removal of checked vertices
        v1_neighbours = v1.neighbours.copy()
        v2_neighbours = v2.neighbours.copy()
        # for every Vertex in v1.neighbours, check if there is a corresponding Vertex in v2.neighbours
        # if there is, remove that Vertex from both lists
        for n1 in v1_neighbours:
            for n2 in v2_neighbours:
                if n1.label == n2.label:
                    v1_neighbours.remove(n1)
                    v2_neighbours.remove(n2)
                    break
            else:
                return False
        if len(v1_neighbours) == 0 and len(v2_neighbours) == 0:
            return True
    return False
I doubt this solution works: doesn't List.remove(element) remove all occurrences of that element? Also, I don't think it's memory efficient, which is important, as the neighborhoods will be pretty big.
Could anyone tell me how I can compare v1_neighbours and v2_neighbours properly, checking for an equal amount of duplicates while not altering the lists, without copying the lists?

Count them and compare the Counter-dicts:
a= [ (x,y) for x in range(5) for y in range(5)]+[ (x,y) for x in range(3) for y in range(3)]
b= [ (x,y) for x in range(5) for y in range(5)]+[ (x,y) for x in range(3) for y in range(3)]
c= [ (x,y) for x in range(5) for y in range(5)]+[ (x,y) for x in range(4) for y in range(3)]
from collections import Counter
ca = Counter(a)
cb = Counter(b)
cc = Counter(c)
print(ca==cb) # True
print(ca==cc) # False
print(ca)
Output:
True
False
Counter({(0, 0): 2, (0, 1): 2, (0, 2): 2, (1, 0): 2, (1, 1): 2, (1, 2): 2,
(2, 0): 2, (2, 1): 2, (2, 2): 2, (0, 3): 1, (0, 4): 1, (1, 3): 1,
(1, 4): 1, (2, 3): 1, (2, 4): 1, (3, 0): 1, (3, 1): 1, (3, 2): 1,
(3, 3): 1, (3, 4): 1, (4, 0): 1, (4, 1): 1, (4, 2): 1, (4, 3): 1,
(4, 4): 1})
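Applied to the question, a minimal sketch might count neighbour labels rather than the vertices themselves. The Vertex class below is a hypothetical stand-in (only label and neighbours are taken from the question), and same_neighbour_labels is an illustrative name, not something from the original post:

```python
from collections import Counter

class Vertex:
    # Hypothetical stand-in for the asker's class: only `label` and
    # `neighbours` are taken from the question.
    def __init__(self, label):
        self.label = label
        self.neighbours = []

def same_neighbour_labels(v1, v2):
    # Compare the vertex labels first, then the multisets of neighbour labels.
    # Neither neighbour list is copied or modified.
    if v1.label != v2.label:
        return False
    return (Counter(n.label for n in v1.neighbours)
            == Counter(n.label for n in v2.neighbours))

a, b, c = Vertex(1), Vertex(1), Vertex(1)
a.neighbours = [Vertex(2), Vertex(2), Vertex(3)]
b.neighbours = [Vertex(3), Vertex(2), Vertex(2)]
c.neighbours = [Vertex(2), Vertex(3), Vertex(3)]
print(same_neighbour_labels(a, b))  # True
print(same_neighbour_labels(a, c))  # False
```

Each Counter holds one entry per distinct label, so the extra memory grows with the number of distinct labels rather than with the length of the lists. This assumes the labels are hashable.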

While collections.Counter would be the usual way to perform this kind of multiset comparison in Python, I think comparing neighbors is a fundamentally misguided approach to vertex equality testing. Vertex equality should use either the default identity-based equality, or label-based equality, depending on the details of your program.
You seem to be trying to implement a comparison where two vertices are equal if they have equal labels and equal collections of neighbors. However, if it's possible for two different vertices to have equal labels, then it should be possible for two distinct vertices to have the same label and the same neighbors, making this a broken equality comparison. If it's not possible for two vertices to have equal labels, then comparing neighbors is unnecessary.
Your neighbor comparison nested loop also assumes that vertices are equal if they have equal labels, further supporting a label-based comparison. If this assumption is wrong, then you have the problem of how to determine that neighbors are equal. If you try to compare neighbors with ==, you'll run into infinite recursion.
With the additional clarification that you're implementing a color refinement algorithm, we can confirm that comparing neighbors by label only is actually correct. However, equals seems like a misleading name for the function you're implementing, since you're not testing whether the given Vertex objects represent the same vertex.

Related

Is this a reasonable way for choosing a certain index from a list by considering the elements of a corresponding list in python?

I am relatively new to programming, and have this problem:
There are two lists: C=[i,i,k,l,i] and D =[m,n,o,p,q]
I want to select the index of the minimum element of C. If k or l is the minimum, it is quite simple, since the min function will directly return the desired index. But if i is the minimum, there are several possibilities. In that case, I want to look at list D's elements, but only at the indices where i occurs in C. I then want to chose my sought-after index based on the minimum of those particular elements in D
I thought of the following code:
min_C = min(C)
if C.count(min_C) == 1:
    soughtafter_index = C.index(min_C)
else:
    possible_D_value = []
    for iterate in C:
        if iterate==min_C:
            possible_index = C.index(iterate)
            possible_D_value.append(D[possible_index])
    best_D_value = min(possible_D_value)
    soughtafter_index = D.index(best_D_value)
(Note that in the problem C and D will always have the same length)
I haven't had a chance to test the code yet, but wanted to ask whether it is reasonable. Is there a better way to handle this? (And what if there is a third list? Then this code will get even longer...)
Thank you all
Try this:
soughtafter_index = list(zip(C, D)).index(min(zip(C,D)))
UPDATE with the required explanation:
>>> C = [1, 5, 1, 3, 1, 4]
>>> D = [0, 1, 1, 3, 0, 1]
>>> list(zip(C, D))
[(1, 0), (5, 1), (1, 1), (3, 3), (1, 0), (4, 1)]
>>> min(zip(C, D))
(1, 0)
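Tuples compare element by element, so ties in C are broken by the matching value in D. A sketch of the same idea that avoids building the zipped list twice, assuming C and D have equal length (the key function is my own choice, not part of the answer above):

```python
C = [1, 5, 1, 3, 1, 4]
D = [0, 1, 1, 3, 0, 1]

# Pick the index whose (C[i], D[i]) pair is smallest; min returns the
# first such index when several pairs tie.
soughtafter_index = min(range(len(C)), key=lambda i: (C[i], D[i]))
print(soughtafter_index)  # 0
```

A third list E would only need another entry in the key tuple, e.g. (C[i], D[i], E[i]).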

Python, permutation to permutation-index function

I have some permutations of a list:
>>> import itertools
>>> perms = list(itertools.permutations([0,1,2,3]))
>>> perms
[(0, 1, 2, 3), (0, 1, 3, 2), (0, 2, 1, 3), (0, 2, 3, 1), (0, 3, 1, 2), (0, 3, 2, 1), (1, 0, 2, 3), (1, 0, 3, 2), (1, 2, 0, 3), (1, 2, 3, 0), (1, 3, 0, 2), (1, 3, 2, 0), (2, 0, 1, 3), (2, 0, 3, 1), (2, 1, 0, 3), (2, 1, 3, 0), (2, 3, 0, 1), (2, 3, 1, 0), (3, 0, 1, 2), (3, 0, 2, 1), (3, 1, 0, 2), (3, 1, 2, 0), (3, 2, 0, 1), (3, 2, 1, 0)]
>>> len(perms)
24
What function can I use (without access to the list perms) to get the index of an arbitrary permutation, e.g. (0, 2, 3, 1) -> 3?
(You can assume that permuted elements are always an ascending list of integers, starting at zero.)
Hint: The factorial number system may be involved. https://en.wikipedia.org/wiki/Factorial_number_system
Off the top of my head I came up with the following, didn't test it thoroughly.
from math import factorial
elements = list(range(4))
permutation = (3, 2, 1, 0)
index = 0
nf = factorial(len(elements))
for n in permutation:
    nf //= len(elements)
    index += elements.index(n) * nf
    elements.remove(n)
print(index)
EDIT: replaced nf /= len(elements) with nf //= len(elements)
I suppose this is a challenge, so here is my (recursive) answer:
import math
import itertools
def get_index(l):
    # In a real function, there should be more tests to validate that the input is valid, e.g. len(l)>0
    # Terminal case
    if len(l)==1:
        return 0
    # Number of possible permutations starting with l[0]
    span = math.factorial(len(l)-1)
    # Slightly modifying l[1:] to use the function recursively
    new_l = [ val if val < l[0] else val-1 for val in l[1:] ]
    # Actual solution
    return get_index(new_l) + span*l[0]
get_index((0,1,2,3))
# 0
get_index((0,2,3,1))
# 3
get_index((3,2,1,0))
# 23
get_index((4,2,0,1,5,3))
# 529
list(itertools.permutations((0,1,2,3,4,5))).index((4,2,0,1,5,3))
# 529
You need to write your own function. Something like this would work
import math
def perm_loc(P):
    N = len(P)
    assert set(P) == set(range(N))
    def rec(perm):
        nums = set(perm)
        if not perm:
            return 0
        else:
            sub_res = rec(perm[1:]) # Result for tail of permutation
            sub_size = math.factorial(len(nums) - 1) # How many tail permutations exist
            sub_index = sorted(nums).index(perm[0]) # Location of first element in permutation
                                                    # in the sorted list of numbers
            return sub_index * sub_size + sub_res
    return rec(P)
The function that does all the work is rec, with perm_loc just serving as a wrapper around it. Note that this algorithm relies on the order in which itertools.permutations happens to emit permutations (lexicographic order for sorted input).
The following code tests the above function. First on your sample, and then on all permutations of range(7):
print(perm_loc([0,2,3,1])) # Print the result from the example
import itertools
def test(N):
    correct = 0
    perms = list(itertools.permutations(range(N)))
    for (i, p) in enumerate(perms):
        pl = perm_loc(p)
        if i == pl:
            correct += 1
        else:
            print(":: Incorrect", p, perms.index(p), pl)
    print(":: Found %d correct results" % correct)
test(7) # Test on all permutations of range(7)
from math import factorial
def perm_to_permidx(perm):
    # Extract info
    n = len(perm)
    elements = range(n)
    # "Gone"s will be the elements of the given perm
    gones = []
    # According to each number in perm, we add the respective offsets
    offset = 0
    for i, num in enumerate(perm[:-1], start=1):
        idx = num - sum(num > gone for gone in gones)
        offset += idx * factorial(n - i)
        gones.append(num)
    return offset
the_perm = (0, 2, 3, 1)
print(perm_to_permidx(the_perm))
# 3
Explanation: All permutations of a given range can be considered as groups of permutations. So, for example, for the permutations of 0, 1, 2, 3 we first "fix" 0 and permute the rest, then fix 1 and permute the rest, and so on. Once we fix a number, the rest is again a set of permutations; so we again fix one number at a time from the remaining numbers and permute the rest. This goes on until we are left with one number only. Every level of fixing has a corresponding (n-i)! permutations.
So this code finds the "offsets" for each level of permutation. The offset corresponds to where the given permutation starts when we fix the numbers of perm in order. For the given example of (0, 2, 3, 1), we first look at the first number in the given perm, which is 0, and figure the offset as 0. Then this goes into the gones list (we will see its usage). Then, at the next level of permutation we see 2 as the fixing number. To calculate the offset for this, we need the "order" of this 2 among the remaining three numbers. This is where gones comes into play; if an already-fixed and considered number (in this case 0) is less than the current fixer, we subtract 1 to find the new order. Then the offset is calculated and accumulated. For the next number 3, the new order is 3 - (1 + 1) = 1 because both previous fixers 0 and 2 are at the "left" of 3.
This goes on up to, but not including, the last number of the given perm: there is no need to look at it, since it will have been determined anyway.
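The per-level offsets described above are the digits of the index in the factorial number system. A small sketch that makes those digits explicit; the function names are illustrative, and counting the smaller elements to the right gives the same digit as subtracting the smaller "gones" only because the elements are assumed to be exactly 0..n-1:

```python
from itertools import permutations
from math import factorial

def lehmer_digits(perm):
    # Digit k = how many elements after position k are smaller than perm[k].
    return [sum(later < num for later in perm[k + 1:])
            for k, num in enumerate(perm)]

def index_from_digits(digits):
    # Digit k is weighted by (n-1-k)!, the number of ways to arrange the rest.
    n = len(digits)
    return sum(d * factorial(n - 1 - k) for k, d in enumerate(digits))

digits = lehmer_digits((0, 2, 3, 1))
print(digits)                     # [0, 1, 1, 0]
print(index_from_digits(digits))  # 3 = 0*3! + 1*2! + 1*1! + 0*0!
```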

Filter generated permutations in python

I want to generate permutations of elements in a list, but only keep a set where each element is on each position only once.
For example [1, 2, 3, 4, 5, 6] could be a user list and I want 3 permutations. A good set would be:
[1,2,3,5,4,6]
[2,1,4,6,5,3]
[3,4,5,1,6,2]
However, one could not add, for example, [1,3,2,6,5,4] to the above, because 1 would then be in the first position twice and 5 would be in the 5th position twice, while the other elements appear in those positions only once.
My code so far is :
# this simply generates a number of permutations specified by number_of_samples
def generate_perms(player_list, number_of_samples):
    myset = set()
    while len(myset) < number_of_samples:
        random.shuffle(player_list)
        myset.add(tuple(player_list))
    return [list(x) for x in myset]
# And this is my function that takes the stratified samples for permutations.
def generate_stratified_perms(player_list, number_of_samples):
    user_idx_dict = {}
    i = 0
    while(i < number_of_samples):
        perm = generate_perms(player_list, 1)
        for elem in perm:
            if not user_idx_dict[elem]:
                user_idx_dict[elem] = [perm.index(elem)]
            else:
                user_idx_dict[elem] += [perm.index(elem)]
        [...]
    return total_perms
but I don't know how to finish the second function.
So in short, I want to give my function a number of permutations to generate, and the function should give me that number of permutations, in which no element appears on the same position more than the others (once, if all appear there once, twice, if all appear there twice, etc).
Let's start by solving the case of generating n or fewer rows. In that case, your output must be a Latin rectangle or a Latin square. These are easy to generate: start by constructing a Latin square, shuffle the rows, shuffle the columns, and then keep just the first r rows. The following always works for constructing a Latin square to start with:
1 2 3 ... n
2 3 4 ... 1
3 4 5 ... 2
... ... ...
n 1 2 3 ...
Shuffling rows is a lot easier than shuffling columns, so we'll shuffle the rows, then take the transpose, then shuffle the rows again. Here's an implementation in Python:
from random import shuffle
def latin_rectangle(n, r):
    square = [
        [1 + (i + j) % n for i in range(n)]
        for j in range(n)
    ]
    shuffle(square)
    square = list(zip(*square)) # transpose
    shuffle(square)
    return square[:r]
Example:
>>> latin_rectangle(5, 4)
[(2, 4, 3, 5, 1),
(5, 2, 1, 3, 4),
(1, 3, 2, 4, 5),
(3, 5, 4, 1, 2)]
Note that this algorithm can't generate all possible Latin squares; by construction, the rows are cyclic permutations of each other, so you won't get Latin squares in other equivalence classes. I'm assuming that's OK since generating a uniform probability distribution over all possible outputs isn't one of the question requirements.
The upside is that this is guaranteed to work, and consistently in O(n^2) time, because it doesn't use rejection sampling or backtracking.
Now let's solve the case where r > n, i.e. we need more rows. Each column can't have equal frequencies for each number unless r % n == 0, but it's simple enough to guarantee that the frequencies in each column will differ by at most 1. Generate enough Latin squares, put them on top of each other, and then slice r rows from it. For additional randomness, it's safe to shuffle those r rows, but only after taking the slice.
def generate_permutations(n, r):
    rows = []
    while len(rows) < r:
        rows.extend(latin_rectangle(n, n))
    rows = rows[:r]
    shuffle(rows)
    return rows
Example:
>>> generate_permutations(5, 12)
[(4, 3, 5, 2, 1),
(3, 4, 1, 5, 2),
(3, 1, 2, 4, 5),
(5, 3, 4, 1, 2),
(5, 1, 3, 2, 4),
(2, 5, 1, 3, 4),
(1, 5, 2, 4, 3),
(5, 4, 1, 3, 2),
(3, 2, 4, 1, 5),
(2, 1, 3, 5, 4),
(4, 2, 3, 5, 1),
(1, 4, 5, 2, 3)]
This uses the numbers 1 to n because of the formula 1 + (i + j) % n in the first list comprehension. If you want to use something other than the numbers 1 to n, you can take it as a list (e.g. players) and change this part of the list comprehension to players[(i + j) % n], where n = len(players).
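A sketch of that variant, with a check that no item repeats within a column. The name latin_rectangle_of and the sample player names are made up for illustration, and r is assumed to be at most len(players):

```python
from collections import Counter
from random import shuffle

def latin_rectangle_of(players, r):
    # Same construction as latin_rectangle, but the cells hold items
    # of `players` instead of the numbers 1..n.
    n = len(players)
    square = [[players[(i + j) % n] for i in range(n)] for j in range(n)]
    shuffle(square)
    square = list(zip(*square))  # transpose
    shuffle(square)
    return square[:r]

rows = latin_rectangle_of(["ann", "bob", "cat", "dan"], 3)
for column in zip(*rows):
    # No player may appear twice in the same position.
    assert max(Counter(column).values()) == 1
```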
If runtime is not that important I would go for the lazy way and generate all possible permutations (itertools can do that for you) and then filter out all permutations which do not meet your requirements.
Here is one way to do it.
import itertools
def permuts (l, n):
    all_permuts = list(itertools.permutations(l))
    picked = []
    for a in all_permuts:
        valid = True
        for p in picked:
            for i in range(len(a)):
                if a[i] == p[i]:
                    valid = False
                    break
        if valid:
            picked.append (a)
        if len(picked) >= n:
            break
    print (picked)
permuts ([1,2,3,4,5,6], 3)

How to calculate output of a multivariable equation

I have a function,
f(x,y)=4x^2*y+3x+y
displayed as
four_x_squared_y_plus_three_x_plus_y = [(4, 2, 1), (3, 1, 0), (1, 0, 1)]
where the first item in the tuple is the coefficient, the second item is the exponent of x, and the third item is the exponent of y. I am trying to calculate the output at a certain value of x and y
I have tried to split the list of terms up into what they represent and then feed in the values of x and y when I input them. However, I am getting an unsupported operand type error regarding ** and tuples, even though I tried to split them up into separate values within the terms.
Is this an effective method of splitting up tuples like this, or have I missed a trick?
def multivariable_output_at(list_of_terms, x_value, y_value):
    coefficient, exponent, intersect = list_of_terms
    calculation =int(coefficient*x_value^exponent*y_value)+int(coefficient*x_value)+int(y_value)
    return calculation
multivariable_output_at(four_x_squared_y_plus_three_x_plus_y, 1, 1) # 8 should be the output
please try this:
four_x_squared_y_plus_three_x_plus_y = [(4, 2, 1), (3, 1, 0), (1, 0, 1)]
def multivariable_output_at(list_of_terms, x_value, y_value):
    return sum(coeff*(x_value**x_exp)*(y_value**y_exp) for coeff,x_exp,y_exp in list_of_terms)
print(multivariable_output_at(four_x_squared_y_plus_three_x_plus_y, 1, 1))
NOTICE:
this is different from how your code originally treated variables, and is based on my intuition of what the list of term means, given your example.
If you have more examples of input -> output, you should check my answer with all of them to make sure what I did is correct.
The first line of code unpacks the list of tuples into three distinct tuples:
coefficient, exponent, intersect = list_of_terms
# coefficient = (4, 2, 1)
# exponent = (3, 1, 0)
# intersect = (1, 0, 1)
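Arithmetic operators such as ^ or ** are not supported between a number and a tuple or between two tuples (and * on a tuple and an integer repeats the tuple rather than multiplying its elements), do you see the issue?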
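Unpacking each term separately avoids the problem. A minimal sketch, assuming (as in the question) that each tuple is (coefficient, exponent of x, exponent of y); evaluate_terms is an illustrative name:

```python
terms = [(4, 2, 1), (3, 1, 0), (1, 0, 1)]  # 4x^2*y + 3x + y

def evaluate_terms(list_of_terms, x_value, y_value):
    total = 0
    # Unpack one term at a time, so each name is bound to a number.
    for coefficient, x_exp, y_exp in list_of_terms:
        total += coefficient * x_value ** x_exp * y_value ** y_exp
    return total

print(evaluate_terms(terms, 1, 1))  # 8
print(evaluate_terms(terms, 2, 3))  # 57 = 4*4*3 + 3*2 + 3
```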

Numpy: Efficient way to convert indices of a square matrix to its upper triangular indices [closed]

Question: given an index tuple, return its position among the upper triangular indices. Here is an example:
Suppose we have a square matrix A of shape (3, 3).
A has 6 upper triangular indices, namely, (0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2).
Now I know an element at index (1, 2), which is an index that belongs to the upper triangular part of A. I would like to return 4 (which means it is the 5th element among all upper triangular indices).
Any ideas on how to do that in general?
Best,
Zhihao
One can write down the explicit formula:
def utr_idx(N, i, j):
    return (2*N+1-i)*i//2 + j-i
Demo:
>>> N = 127
>>> X = np.transpose(np.triu_indices(N))
>>> utr_idx(N, *X[2123])
2123
For an n×n matrix, the (i, j)-th item of the upper triangle is element number i×(2×n-i+1)/2+j-i (counting from 0) in the row-by-row enumeration of the upper triangle.
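The formula follows from counting: rows 0 to i-1 of the upper triangle hold n + (n-1) + ... + (n-i+1) = i×(2×n-i+1)/2 entries, and (i, j) sits j-i places into row i. A quick sketch to check this against a direct enumeration (rank_in_upper_triangle is an illustrative name):

```python
from itertools import combinations_with_replacement

def rank_in_upper_triangle(n, i, j):
    # Entries in rows 0..i-1, plus the offset inside row i.
    return i * (2 * n - i + 1) // 2 + (j - i)

n = 6
# Pairs (i, j) with i <= j, in row-by-row order.
pairs = list(combinations_with_replacement(range(n), 2))
ranks = [rank_in_upper_triangle(n, i, j) for i, j in pairs]
print(ranks == list(range(len(pairs))))  # True
```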
We can also do the math in reverse and obtain the (i, j) element for the k-th element with:
i = ⌊(-√((2n+1)²-8k)+2n+1)/2⌋ and j = k+i-i×(2×n-i+1)/2
So for example:
from math import floor, sqrt
def coor_to_idx(n, i, j):
    return i*(2*n-i+1)//2+j-i
def idx_to_coor(n, k):
    i = floor((-sqrt((2*n+1)*(2*n+1)-8*k)+2*n+1)/2)
    j = k + i - i*(2*n-i+1)//2
    return i, j
For example:
>>> [idx_to_coor(4, i) for i in range(10)]
[(0, 0), (0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]
>>> [coor_to_idx(4, i, j) for i in range(4) for j in range(i, 4)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Given the numbers are not huge (well if these are huge, calculations are no longer done in constant time), we can thus calculate the k-th coordinate in O(1), for example:
>>> idx_to_coor(1234567, 123456789)
(100, 5139)
which is equivalent to obtaining it through enumeration:
>>> next(islice(((i, j) for i in range(1234567) for j in range(i, 1234567)), 123456789, None))
(100, 5139)
Here converting indices to a coordinate can also have, for large numbers, some rounding errors due to floating point imprecision.
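One way to sidestep the rounding issue is to stay in integer arithmetic. A sketch, assuming Python 3.8+ for math.isqrt and 0 <= k < n(n+1)/2; the helper names are my own, and the two correction loops guard against the integer square root leaving the row estimate off by one:

```python
from math import isqrt

def row_start(n, i):
    # Position of (i, i) in the row-by-row enumeration of the upper triangle.
    return i * (2 * n - i + 1) // 2

def idx_to_coor_exact(n, k):
    a = 2 * n + 1
    # Integer estimate of the row; isqrt rounds down, so it may be one too big.
    i = (a - isqrt(a * a - 8 * k)) // 2
    while i > 0 and row_start(n, i) > k:
        i -= 1
    while row_start(n, i + 1) <= k:
        i += 1
    return i, k - row_start(n, i) + i

print(idx_to_coor_exact(1234567, 123456789))  # (100, 5139)
```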
IIUC, you can get the indexes using itertools combinations with replacement
>>> ind = tuple(itertools.combinations_with_replacement(range(3),2))
>>> ind
((0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2))
To retrieve the index, just use index method
>>> ind.index((1,2))
4
You could use np.triu_indices and a dictionary:
import numpy as np
iu1 = np.triu_indices(3)
table = {(i, j): c for c, (i, j) in enumerate(zip(*iu1))}
print(table[(1, 2)])
Output
4
Similar to @DanielMesejo, you can use np.triu_indices with either argwhere or nonzero:
>>> my_index = (1,2)
>>> np.nonzero((np.stack(np.triu_indices(3), axis=1) == my_index).all(1))
(array([4]),)
>>> np.argwhere((np.stack(np.triu_indices(3), axis=1) == my_index).all(1))
array([[4]])
Explanation:
np.stack(np.triu_indices(3), axis=1) gives you the indices of your upper triangle in order:
array([[0, 0],
       [0, 1],
       [0, 2],
       [1, 1],
       [1, 2],
       [2, 2]])
So all you have to do is find where it matches [1,2] (which you can do with the == operator and all)
Constructing upper indices would be costly. We can directly get the corresponding index like so -
def triu_index(N, x, y):
    # Get index corresponding to (x,y) in upper triangular list
    idx = np.r_[0,np.arange(N,1,-1).cumsum()]
    return idx[x]+y-x
Sample run -
In [271]: triu_index(N=3, x=1, y=2)
Out[271]: 4
