Test if all N variables are different - python

I want to make a condition where all selected variables are not equal.
My solution so far is to compare every pair which doesn't scale well:
if A!=B and A!=C and B!=C:
I want to do the same check for multiple variables, say five or more, and it gets quite confusing with that many. What can I do to make it simpler?

Create a set and check whether the number of elements in the set is the same as the number of variables in the list that you passed into it:
>>> variables = [a, b, c, d, e]
>>> if len(set(variables)) == len(variables):
...     print("All variables are different")
A set doesn't have duplicate elements so if you create a set and it has the same number of elements as the number of elements in the original list then you know all elements are different from each other.

If you can hash your variables (and, uh, your variables have a meaningful __hash__), use a set.
def check_all_unique(li):
    unique = set()
    for i in li:
        if i in unique: return False #hey I've seen you before...
        unique.add(i)
    return True #nope, saw no one twice.
O(n) worst case. (And yes, I'm aware that you can also check len(li) == len(set(li)), but this variant returns early if a match is found.)
If you can't hash your values (for whatever reason) but can meaningfully compare them:
def check_all_unique(li):
    li.sort()
    for i in range(1,len(li)):
        if li[i-1] == li[i]: return False
    return True
O(nlogn), because sorting. Basically, sort everything, and compare pairwise. If two things are equal, they should have sorted next to each other. (If, for some reason, your __cmp__ doesn't sort things that are the same next to each other, 1. wut and 2. please continue to the next method.)
And if ne is the only operator you have....
import operator
import itertools
li = [a, b, c, d, e] #a list containing all the variables I must check
if all(operator.ne(*i) for i in itertools.combinations(li,2)):
    pass #do something
I'm basically using itertools.combinations to pair off all the variables, and then using operator.ne to check for not-equalness. This has a worst-case time complexity of O(n^2), although it should still short-circuit (because generators, and all is lazy). If you are absolutely sure that ne and eq are opposites, you can use operator.eq and any instead.
Addendum: Vincent wrote a much more readable version of the itertools variant that looks like
import itertools
lst = [a, b, c, d, e] #a list containing all the variables I must check
if all(a!=b for a,b in itertools.combinations(lst,2)):
    pass #do something
Addendum 2: Uh, for sufficiently large datasets, the sorting variant should possibly use heapq. Still would be O(nlogn) worst case, but O(n) best case. It'd be something like
import heapq
def check_all_unique(li):
    heapq.heapify(li) #O(n), compared to sorting's O(nlogn)
    prev = heapq.heappop(li)
    for _ in range(len(li)): #O(n)
        current = heapq.heappop(li) #O(logn)
        if current == prev: return False
        prev = current
    return True

Put the values into a container type. Then just loop through the container, comparing each value with every later one. This takes O(n^2) comparisons.
pseudo code:
a[0] = A; a[1] = B ... a[n];
for i = 0 to n do
    for j = i + 1 to n do
        if a[i] == a[j]
            condition failed
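A minimal Python sketch of the pseudo code above. The function name all_different is made up for illustration. It relies only on ==, so it also works for values that can be neither hashed nor sorted.

```python
def all_different(values):
    """Return True if no two items in `values` compare equal (O(n^2) comparisons)."""
    items = list(values)
    for i in range(len(items)):
        # Compare only with later items, so each pair is checked exactly once.
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return False  # "condition failed": found an equal pair
    return True


print(all_different([1, 2, 3]))          # True
print(all_different([[1], [2], [1]]))    # False (lists are unhashable, but == works)
```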

You can enumerate a list and check that all values are the first occurrence of that value in the list:
a = [5, 15, 20, 65, 48]
if all(a.index(v) == i for i, v in enumerate(a)):
    print("all elements are unique")
This allows for short-circuiting once the first duplicate is detected due to the behaviour of Python's all() function.
Or equivalently, enumerate a list and check if there are any values which are not the first occurrence of that value in the list:
a = [5, 15, 20, 65, 48]
if not any(a.index(v) != i for i, v in enumerate(a)):
    print("all elements are unique")

Related

Execution Timed Out on codewars when finding the least used element in an array. Needs to be optimized but dont know how

Instructions on codewars:
There is an array with some numbers. All numbers are equal except for one. Try to find it!
find_uniq([ 1, 1, 1, 2, 1, 1 ]) == 2
find_uniq([ 0, 0, 0.55, 0, 0 ]) == 0.55
It’s guaranteed that array contains at least 3 numbers.
The tests contain some very huge arrays, so think about performance.
This is the code I wrote:
def find_uniq(arr):
    for n in arr:
        if arr.count(n) == 1:
            return n
            exit()
It works as follows:
For every character in the array, if that character appears only once, it returns said character and exits the code. If the character appears more than once, it does nothing
When attempting the code on codewars, I get the following error:
STDERR
Execution Timed Out (12000 ms)
I am a beginner so I have no idea how to further optimize the code in order for it to not time out
The first version of my code looked like this:
def find_uniq(arr):
    arr.sort()
    rep = str(arr)
    for character in arr:
        cantidad = arr.count(character)
        if cantidad > 1:
            rep = rep.replace(str(character), "")
    rep = rep.replace("[", "")
    rep = rep.replace("]", "")
    rep = rep.replace(",", "")
    rep = rep.replace(" ", "")
    rep = float(rep)
    n = rep
    return n
After getting timed out, I assumed it was due to the repetitive replace functions and the fact that the code had to go through every element even if it had already found the correct one, since the code was deleting the incorrect ones, instead of just returning the correct one
After some iterations that I didn't save we got to the current code, which checks if the character is only once in the array, returns that and exits
def find_uniq(arr):
    for n in arr:
        if arr.count(n) == 1:
            return n
            exit()
I have no clue how to further optimize this
.count() iterates over the entire array every time that you call it. If your array has n elements, it will iterate over the array n times, which is quite slow.
You can use collections.Counter as Unmitigated suggests, but if you're not familiar with the module, it might seem overkill for this problem. Since in this case you know that there's only two unique elements in the array, you can get all of the unique elements using set(), and then check the frequency of each unique element:
def find_uniq(arr):
    for n in set(arr):
        if arr.count(n) == 1:
            return n
You can use a dict or collections.Counter to get the frequency of each element with linear time complexity. Then return the element with a frequency of one.
def find_uniq(l):
    from collections import Counter
    return Counter(l).most_common()[-1][0]
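For the plain-dict route mentioned above, a minimal sketch might look like the following. It reuses the function name from the question. The None fallback is an assumption, covering inputs that break the kata's guarantee.

```python
def find_uniq(arr):
    """Return the element that occurs exactly once, counting in a single pass."""
    counts = {}
    for n in arr:
        counts[n] = counts.get(n, 0) + 1
    # The kata guarantees only two distinct values, so this loop is tiny.
    for value, count in counts.items():
        if count == 1:
            return value
    return None  # assumption: no element occurred exactly once


print(find_uniq([1, 1, 1, 2, 1, 1]))   # 2
print(find_uniq([0, 0, 0.55, 0, 0]))   # 0.55
```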
Compare the first two numbers. If they match, find the one in the array that doesn't match (longest solution). Otherwise, return the one that doesn't match the third. Coded:
def find_uniq(arr):
    if arr[0]==arr[1]:
        target=arr[0]
        for i in range(2,len(arr)):
            if arr[i] != target:
                return arr[i]
    else:
        if arr[0]==arr[2]:
            return arr[1]
        else:
            return arr[0]
In your original code:
def find_uniq(arr):
    for n in arr:
        if arr.count(n) == 1:
            return n
            exit() # note: this line does nothing because you already returned
you're calling arr.count once for each element in the array (assuming the worst case scenario where the unique element is at the very end). Each call to arr.count(n) scans through the entire array counting up n -- so you're iterating over the entire array of N elements N times, which makes this O(N^2) -- very slow if N is big!
The second version of your code has the same problem, but it adds a huge amount of extra complexity by turning the list into a string and then trying to parse the string -- don't do that!
The way to make this fast is to iterate over the entire list once and keep track of the count of each item as you go. This is easiest to do with the built in collections.Counter class:
from collections import Counter
def find_uniq(arr):
    return next(i for i, c in Counter(arr).items() if c == 1)
Given the constraint that there are only two different values in the array and exactly one of them is unique, you can make this more efficient (such that you don't even need to iterate over the entire array in all cases) by breaking it into two possibilities: either the first two items are identical and you just need to look for the item that's not equal to those, or they're different and you just need to return the one that's not equal to the third.
def find_uniq(arr):
    if arr[0] == arr[1]:
        # First two items are the same, iterate through
        # the rest of the array to find the unique one.
        return next(i for i in arr if i != arr[0])
    # Otherwise, either arr[0] or arr[1] is unique.
    return arr[0] if arr[1] == arr[2] else arr[1]
In this approach, you only ever need to iterate through the array as far as the unique item (or exactly one item past it in the case where it's one of the first two items). In the specific case where the unique item is toward the start of a very long array, this will be much faster than an approach that iterates over the entire array no matter what. In the worst case scenario, you will still have only a single iteration.

Python's list comprehension: Modify list elements if a certain value occurs

How can I do the following in Python's list comprehension?
nums = [1,1,0,1,1]
oFlag = 1
res = []
for x in nums:
    if x == 0:
        oFlag = 0
    res.append(oFlag)
print(res)
# Output: [1,1,0,0,0]
Essentially in this example, zero out the rest of the list once a 0 occurs.
Some context: a list comprehension is a sort of "imperative" syntax for the map and filter functions that exist in many functional programming languages. What you're trying to do is usually referred to as an accumulate, which is a slightly different operation. You can't implement an accumulate in terms of a map and filter except by using side effects. Python allows you to have side effects in a list comprehension, so it's definitely possible, but list comprehensions with side effects are a little wonky. Here's how you could implement this using accumulate:
from itertools import accumulate
nums = [1,1,0,1,1]
def accumulator(last, cur):
    return 1 if (last == 1 and cur == 1) else 0
list(accumulate(nums, accumulator))
or in one line:
list(accumulate(nums, lambda last, cur: 1 if (last == 1 and cur == 1) else 0))
Of course there are several ways to do this using an external state and a list comprehension with side effects. Here's an example, it's a bit verbose but very explicit about how state is being manipulated:
class MyState:
    def __init__(self, initial_state):
        self.state = initial_state
    def getNext(self, cur):
        self.state = accumulator(self.state, cur)
        return self.state
mystate = MyState(1)
[mystate.getNext(x) for x in nums]
nums = [1,1,0,1,1]
[int(all(nums[:i+1])) for i in range(len(nums))]
This steps through the list, applying the all operator to the entire sub-list up to that point.
Output:
[1, 1, 0, 0, 0]
Granted, this is O(n^2), but it gets the job done.
Even more effective is simply to find the index of the first 0.
Make a new list made of that many 1s, padded with the appropriate quantity of zeros.
if 0 in nums:
    idx = nums.index(0)
    new_list = [1] * idx + [0] * (len(nums) - idx)
... or if the original list can contain elements other than 0 and 1, copy the list that far rather than repeating 1s:
new_list = nums[:idx] + [0] * (len(nums) - idx)
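Putting the pieces together, here is a small sketch that also covers the case where the list contains no 0 at all. The function name zero_after_first_zero is made up for illustration, and returning an unchanged copy when there is no 0 is an assumption about the desired behaviour.

```python
def zero_after_first_zero(nums):
    """Copy `nums` up to its first 0, then fill the rest with zeros."""
    if 0 not in nums:
        return list(nums)  # assumption: nothing to zero out, return a copy
    idx = nums.index(0)
    return nums[:idx] + [0] * (len(nums) - idx)


print(zero_after_first_zero([1, 1, 0, 1, 1]))  # [1, 1, 0, 0, 0]
print(zero_after_first_zero([2, 3, 0, 5]))     # [2, 3, 0, 0]
```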
I had an answer using list comprehension, but @Prune beat me to it. It was really just a cautionary tale, showing how it would be done while making an argument against that approach.
Here's an alternative approach that might fit your needs:
import itertools
import operator
nums = [1,1,0,1,1]
res = itertools.accumulate(nums, operator.and_)
In this case res is an iterable. If you need a list, then
res = list(itertools.accumulate(nums, operator.and_))
Let's break this down. The accumulate() function can be used to generate a running total, or 'accumulated sums'. If only one argument is passed the default function is addition. Here we pass in operator.and_. The operator module exports a set of efficient functions corresponding to the intrinsic operators of Python. When an accumulated and is run on a list of 0's and 1's the result is a list that has 1's up till the first 0 is found, then all 0's after.
Of course we're not limited to using functions defined in the operator module. You can use any function that accepts 2 parameters of the type of the elements in the first parameter (and probably returns the same type). You can get creative, but here I'll keep it simple and just implement and:
import itertools
nums = [1,1,0,1,1]
res = itertools.accumulate(nums, lambda a, b: a and b)
Note: using operator.and_ probably runs faster. Here we're just providing an example using the lambda syntax.
While a list comprehension is not used, to me it has a similar feel. It fits in one line and isn't too hard to read.
For a list comprehension approach, you could use index with enumerate:
firstIndex = nums.index(0) if 0 in nums else len(nums)
[1 if i < firstIndex else 0 for i, x in enumerate(nums)]
Another approach using numpy:
import numpy as np
print(np.cumprod(np.array(nums) != 0).tolist())
#[1, 1, 0, 0, 0]
Here we convert nums to a numpy array and check whether the values are not equal to 0. We then take the cumulative product of the array, knowing that once a 0 is found we will multiply by 0 from that point forward.
Here is a linear-time solution that doesn't mutate global state, doesn't require any other iterators except the nums, and that does what you want, albeit requiring some auxiliary data-structures, and using a seriously hacky list-comprehension:
>>> nums = [1,1,0,1,1]
>>> [f for f, ns in [(1, nums)] for n in ns for f in [f & (n==1)]]
[1, 1, 0, 0, 0]
Don't use this. Use your original for-loop. It is more readable, and almost certainly faster. Don't strive to put everything in a list-comprehension. Strive to make your code simple, readable, and maintainable, which your code already was, and the above code is not.

Test if all values of a dictionary are equal - when value is unknown

I have 2 dictionaries:
the values in each dictionary should all be equal.
BUT I don't know what that number will be...
dict1 = {'xx':A, 'yy':A, 'zz':A}
dict2 = {'xx':B, 'yy':B, 'zz':B}
N.B. A does not equal B
N.B. Both A and B are actually strings of decimal numbers (e.g. '-2.304998') as they have been extracted from a text file
I want to create another dictionary - that effectively summarises this data - but only if all the values in each dictionary are the same.
i.e.
summary = {}
if dict1['xx'] == dict1['yy'] == dict1['zz']:
summary['s'] = dict1['xx']
if dict2['xx'] == dict2['yy'] == dict2['zz']:
summary['hf'] = dict2['xx']
Is there a neat way of doing this in one line?
I know it is possible to create a dictionary using comprehensions
summary = {k:v for (k,v) in zip(iterable1, iterable2)}
but am struggling with both the underlying for loop and the if statement...
Some advice would be appreciated.
I have seen this question, but the answers all seem to rely on already knowing the value being tested (i.e. are all the entries in the dictionary equal to a known number) - unless I am missing something.
sets are a solid way to go here, but just for code golf purposes here's a version that can handle non-hashable dict values:
expected_value = next(iter(dict1.values())) # check for an empty dictionary first if that's possible
all_equal = all(value == expected_value for value in dict1.values())
all terminates early on a mismatch, but the set constructor is well enough optimized that I wouldn't say that matters without profiling on real test data. Handling non-hashable values is the main advantage to this version.
One way to do this would be to leverage set. You know a set of an iterable has a length of 1 if there is only one value in it:
if len(set(dct.values())) == 1:
    summary[k] = next(iter(dct.values()))
This of course, only works if the values of your dictionary are hashable.
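Applied to the two dictionaries from the question, the set check fits into a single dict comprehension. This is only a sketch: the sample values and the labelled mapping are made up for illustration, while the 's' and 'hf' labels come from the question.

```python
# Hypothetical sample data; the question only says the values are decimal strings.
dict1 = {'xx': '-2.304998', 'yy': '-2.304998', 'zz': '-2.304998'}
dict2 = {'xx': '0.125', 'yy': '0.125', 'zz': '0.5'}  # deliberately inconsistent

labelled = {'s': dict1, 'hf': dict2}

# Keep a label only if its dictionary holds exactly one distinct value.
summary = {
    label: next(iter(d.values()))
    for label, d in labelled.items()
    if len(set(d.values())) == 1
}
print(summary)  # {'s': '-2.304998'}
```

An empty dictionary yields an empty set, so it is skipped rather than raising StopIteration.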
While we can use set for this, doing so has a number of inefficiencies when the input is large. It can take memory proportional to the size of the input, and it always scans the whole input, even when two distinct values are found early. Also, the input has to be hashable.
For 3-key dicts, this doesn't matter much, but for bigger ones, instead of using set, we can use itertools.groupby and see if it produces multiple groups:
import itertools
groups = itertools.groupby(dict1.values())
# Consume one group if there is one, then see if there's another.
next(groups, None)
if next(groups, None) is None:
    # All values are equal.
    do_something()
else:
    # Unequal values detected.
    do_something_else()
Readability aside, I don't care for the answers involving set or .values() (which builds a list in Python 2). They are always O(N) in time and memory, whereas an early-exit search can be faster in practice, depending on the distribution of values.
Also, because set relies on hashing, you may pay a hefty constant multiplier on your time cost. And your values have to be hashable, when a test for equality is all that's needed.
It is theoretically better to take the first value from the dictionary and search the remaining values for the first one that is not equal to it.
In practice set might still be quicker than the solution below, because its work is done in C.
def all_values_equal(d):
    if len(d)<=1: return True # Treat len0 len1 as all equal
    i = iter(d.values()) # Python 3: values() is a lazy view, like Python 2's itervalues()
    firstval = next(i)
    try:
        # Incrementally generate all values not equal to firstval
        # next() raises StopIteration if empty.
        next(j for j in i if j!=firstval)
        return False
    except StopIteration:
        return True
print(all_values_equal({1:0, 2:1, 3:0, 4:0, 5:0})) # False
print(all_values_equal({1:0, 2:0, 3:0, 4:0, 5:0})) # True
print(all_values_equal({1:"A", 2:"B", 3:"A", 4:"A", 5:"A"})) # False
print(all_values_equal({1:"A", 2:"A", 3:"A", 4:"A", 5:"A"})) # True
In the above:
(j for j in i if j!=firstval)
is equivalent to:
def gen_neq(i, val):
    """
    Give me the values of iterator i that are not equal to val
    """
    for j in i:
        if j!=val:
            yield j
I found this solution by combining another solution found elsewhere:
user_min = {'test':1,'test2':2}
all(value == list(user_min.values())[0] for value in user_min.values())
>>> user_min = {'test':1,'test2':2}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
False
>>> user_min = {'test':2,'test2':2}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
True
>>> user_min = {'test':'A','test2':'B'}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
False
>>> user_min = {'test':'A','test2':'A'}
>>> all(value == list(user_min.values())[0] for value in user_min.values())
True
Good for a small dictionary, but I'm not sure about a large dictionary, since we get all the values to choose the first one

How to check if N can be expressed as sum of two other numbers in specific list

I have a list:
l = [1,3,4,6,7,8,9,11,13,...]
and a number n.
How do I efficiently check if the number n can be expressed as the sum of two numbers (repeats are allowed) within the list l.
If the number is in the list, it does not count unless it can be expressed as the sum of two numbers (e.g. for l = [2,3,4], 3 would not count, but 4 would).
This, embarrassingly, is what I've tried:
def is_sum_of_2num_inlist(n, num_list):
    # list() so the nested loops can iterate more than once in Python 3
    num_list = list(filter(lambda x: x < n, num_list))
    for num1 in num_list:
        for num2 in num_list:
            if num1+num2 == n:
                return True
    return False
Thanks
def summable(n, l):
    for v in l:
        l_no_v = l[:]
        l_no_v.remove(v)
        if n - v in l_no_v:
            return True
    return False
EDIT: Explanation...
itertools.combinations is a nice way to get all possible answers, but it's ~4x slower than this version since this is a single loop that bails out once it gets to a possible solution.
This loops over the values in l, makes a copy of l removing v so that we don't add v to itself (i.e. no false positive if n = 4; l = [2, 1]). Then subtract v from n and if that value is in the copy then there are two numbers that sum up to n. If you want to return those numbers instead of returning True just return v, n - v.
Although you can check this by running through the list twice, I would recommend for performance converting the list to a set, since x in set() searches in constant time on average.
Since n can be the sum of the same number, all you have to do is iterate through the set once and check if n - i occurs elsewhere in the set.
Something like the following should work.
>>> def is_sum_of_numbers(n, numbers):
...     for i in numbers:
...         if n - i in numbers:
...             return True
...     return False
...
>>>
>>>
>>> numbers = {2,7,8,9}
>>> is_sum_of_numbers(9, numbers) # 2 + 7
True
>>> is_sum_of_numbers(5, numbers)
False
>>> is_sum_of_numbers(18, numbers) # 9 + 9
True
If the list is ordered you could use two variables to go through the list, one starting at the beginning of the list and one at the end. If the sum of the two variables is greater than N, you assign to the variable at the end the value that precedes it; if the sum is less than N, you assign to the variable at the beginning the following value in the list. If the sum is N you've found the two values. You can stop when the two variables meet each other.
If the list is not ordered you start from the beginning of the list and use a variable x to go through the list. You'll need a second structure such as a hashset. At every step you look up whether the value N-x is in the hashset. If it is, you've found the two numbers that add up to N. If it isn't, you add x to the hashset and assign to x the following value. I recommend using a hashset because both looking up and inserting are O(1) on average.
Both algorithms are linear
I'm sorry I couldn't write directly the code in python because I don't use it.
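The two approaches described above can be sketched in Python roughly as follows. The function names are made up for illustration. Both sketches assume that "repeats are allowed" means one list element may be used twice (as in the question's example where 4 = 2 + 2), which is why the pointers are allowed to meet and why x is added to the set before the lookup.

```python
def has_pair_sum_sorted(n, sorted_nums):
    """Two-pointer scan; assumes sorted_nums is in ascending order."""
    lo, hi = 0, len(sorted_nums) - 1
    while lo <= hi:  # lo == hi lets one value be used twice
        total = sorted_nums[lo] + sorted_nums[hi]
        if total == n:
            return True
        if total > n:
            hi -= 1  # sum too large: move the end pointer back
        else:
            lo += 1  # sum too small: move the start pointer forward
    return False


def has_pair_sum_unsorted(n, nums):
    """Single pass over nums, keeping a set of the values seen so far."""
    seen = set()
    for x in nums:
        seen.add(x)  # added before the lookup so x can pair with itself
        if n - x in seen:
            return True
    return False


print(has_pair_sum_sorted(4, [2, 3, 4]))    # True  (2 + 2)
print(has_pair_sum_unsorted(3, [4, 2, 3]))  # False
```

If an element must not be paired with itself, change the loop condition to lo < hi and move seen.add(x) after the lookup.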
As I said in the comment, there's a video in which your problem is solved.
If I understood the OP's concern correctly:
Since the question says repeats are allowed within the list l, I think this process is good, though a bit slower. So if you need to count the occurrences as well as check that the condition holds, go with this answer; if you only want a boolean existence check, go with the others, purely for performance reasons.
You can use itertools.combinations. It will give you all the combinations, not permutations. Now you can just use the sum function to get the sum.
from itertools import combinations
l = [1,3,4,6,7,8,9,11,13]
checks = [4,6] #these are the numbers to check
for chk in checks:
    for sm in combinations(l,2):
        if chk == sum(sm): #sum(sm) means sum((1,3)) for the first pass of the loop
            pass #Do something
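Note that combinations(l, 2) never pairs an element with itself. If "repeats are allowed" means the same number may be used twice, itertools.combinations_with_replacement covers that case. A minimal sketch, with a made-up helper name pairs_summing_to:

```python
from itertools import combinations_with_replacement


def pairs_summing_to(n, nums):
    """List every pair from nums (an element may pair with itself) that sums to n."""
    return [p for p in combinations_with_replacement(nums, 2) if sum(p) == n]


l = [1, 3, 4, 6, 7, 8, 9, 11, 13]
print(pairs_summing_to(2, l))  # [(1, 1)]
print(pairs_summing_to(4, l))  # [(1, 3)]
```

This still examines every pair, so it is O(n^2); it is useful when you want the pairs themselves rather than a yes/no answer.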

How to get all the minimum elements according to its first element of the inside list in a nested list?

Simply put, there is a list, say LST = [[12,1],[23,2],[16,3],[12,4],[14,5]], and I want to get all the minimum elements of this list according to the first element of each inner list. So for the above example the answer would be [12,1] and [12,4]. Is there a typical way of doing this in Python?
Thanking you in advance.
Two passes:
minval = min(LST)[0]
return [x for x in LST if x[0] == minval]
One pass:
def all_minima(iterable, key=None):
    if key is None: key = lambda x: x # identity; the builtin id() would compare memory addresses
    hasminvalue = False
    minvalue = None
    minlist = []
    for entry in iterable:
        value = key(entry)
        if not hasminvalue or value < minvalue:
            minvalue = value
            hasminvalue = True
            minlist = [entry]
        elif value == minvalue:
            minlist.append(entry)
    return minlist
from operator import itemgetter
return all_minima(LST, key=itemgetter(0))
A compact single-pass solution requires sorting the list -- that's technically O(N log N) for an N-long list, but Python's sort is so good, and so many sequences "just happen" to have some embedded order in them (which timsort cleverly exploits to go faster), that sorting-based solutions sometimes have surprisingly good performance in the real world.
Here's a solution requiring 2.6 or better:
import itertools
import operator
f = operator.itemgetter(0)
def minima(lol):
    return list(next(itertools.groupby(sorted(lol, key=f), key=f))[1])
To understand this approach, looking "from the inside, outwards" helps.
f, i.e., operator.itemgetter(0), is a key-function that picks the first item of its argument for ordering purposes -- the very purpose of operator.itemgetter is to easily and compactly build such functions.
sorted(lol, key=f) therefore returns a sorted copy of the list-of-lists lol, ordered by increasing first item. If you omit the key=f the sorted copy will be ordered lexicographically, so it will also be in order of increasing first item, but that acts only as the "primary key" -- items with the same first sub-item will in turn be sorted among them by the values of their second sub-items, and so forth -- while with the key=f you're guaranteed to preserve the original order among items with the same first sub-item. You don't specify which behavior you require (and in your example the two behaviors happen to produce the same result, so we cannot distinguish from that example) which is why I'm carefully detailing both possibilities so you can choose.
itertools.groupby(sorted(lol, key=f), key=f) performs the "grouping" task that is the heart of the operation: it yields groups from the sequence (in this case, the sequence sorted provides) based on the key ordering criteria. That is, a group with all adjacent items producing the same value among themselves when you call f with the item as an argument, then a group with all adjacent items producing a different value from the first group (but the same among themselves), and so forth. groupby respects the ordering of the sequence it takes as its argument, which is why we had to sort the lol first (and this behavior of groupby makes it very useful in many cases in which the sequence's ordering does matter).
Each result yielded by groupby is a pair k, g: a key k which is the result of f(i) on each item in the group, an iterator g which yields each item in the group in sequence.
The next built-in (the only bit in this solution which requires Python 2.6) given an iterator produces its next item -- in particular, the first item when called on a fresh, newly made iterator (and, every generator of course is an iterator, as is groupby's result). In earlier Python versions, it would have to be groupby(...).next() (since next was only a method of iterators, not a built-in), which is deprecated since 2.6.
So, summarizing, the result of our next(...) is exactly the pair k, g where k is the minimum (i.e., first after sorting) value for the first sub-item, and g is an iterator for the group's items.
So, with that [1] we pick just the iterator, so we have an iterator yielding just the subitems we want.
Since we want a list, not an iterator (per your specs), the outermost list(...) call completes the job.
Is all of this worth it, performance-wise? Not on the tiny example list you give -- minima is actually slower than either code in @Kenny's answer (of which the first, "two-pass" solution is speedier). I just think it's worth keeping the ideas in mind for the next sequence processing problem you may encounter, where the details of typical inputs may be quite different (longer lists, rarer minima, partial ordering in the input, &c, &c;-).
import operator
m = min(LST, key=operator.itemgetter(0))[0]
print([x for x in LST if x[0] == m])
minval = min(x[0] for x in LST)
result = [x for x in LST if x[0]==minval]
