How to make map for a nested function? - python

My intention was to mutate the list in place rather than return a new one. My approach was to write a helper function that modifies a single value and then map it over the whole list. However, map only takes one function and one list, so I got stuck. Sorry for any inconvenience or misunderstanding.
I have a list and two boundaries: the lower boundary replaces any number in the list below it, and the upper boundary replaces any number above it.
def help(values,lower,upper):
    def abc(value):
        if value <= 100:
            value = 100
        elif value >= 0:
            value = 0
        else:
            value
    list(map (abc, values))
For example, given A=[0,1,2,3,4,5,6,7,8,9,10] with a lower boundary of 2 and an upper boundary of 8, it should produce the list:
[2,2,2,3,4,5,6,7,8,8,8]
The checks will look like this:
A=[0,1,2,3,4,5,6,7,8,9,10]
check.expect("Function",help(A,2,8),None)
check.expect("List",help(A,2,8)[2,2,2,3,4,5,6,7,8,8,8])

Just use a list comprehension that sets each element based on the conditions:
def rreplace(lst, l, u):
    return [l if x < l else u if x > u else x for x in lst]
A = [0,1,2,3,4,5,6,7,8,9,10]
print(rreplace(A, 2, 8))
# [2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 8]
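Note that the comprehension above returns a new list and leaves A untouched. If the goal is to mutate the original list and return nothing, as the checks in the question suggest, slice assignment is one way to get there. This is only a minimal sketch; the name clamp_in_place is made up for illustration and does not come from the question.

```python
def clamp_in_place(values, lower, upper):
    """Clamp every element of values into [lower, upper], modifying the list itself."""
    # Slice assignment replaces the contents of the existing list object,
    # so every reference to that list sees the change; nothing is returned.
    values[:] = [lower if v < lower else upper if v > upper else v for v in values]

A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
returned = clamp_in_place(A, 2, 8)
print(returned)  # None
print(A)         # [2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 8]
```

A plain map over a helper cannot do this on its own, because rebinding the helper's local variable never touches the list; the results have to be written back somewhere.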

Related

How do I move list elements to another list if a condition is met?

I have a list of values and I want to move certain (or all) values to another list if they exist in a reference list.
x = [2,3,4,5,6,7,8] # list of values
ref = [2,3,4,5,6,7,8] # reference list
result = [x.pop(i) for i, v in enumerate(x) if v in ref]
But because of popping the current index, it ends up giving every other value instead. Is there a nice straightforward way to do this?
What I want at the end of this example is x=[] and result=[2,3,4,5,6,7,8], but ref doesn't need to contain all elements of x, this was just for an example. In another case it might be:
x = [2,3,4,5,6,7,8] # list of values
ref = [2,6,7] # reference list
So then I want x = [3,4,5,8] and result = [2,6,7]
In general, we should avoid modifying objects while iterating over them.
For this problem, we could generate result and filter x in two steps using comprehensions (avoiding appending to lists) as in the following example.
result, x = [v for v in x if v in ref], [v for v in x if v not in ref]
You could do it the old-fashioned way, with a while loop and a pointer into x:
x = [2, 3, 4, 5, 6, 7, 8]
ref = [2, 6, 7]
result = []
i = 0
while i < len(x):
    if x[i] in ref:
        result.append(x.pop(i))
    else:
        i += 1
print(x)
print(result)
Output:
[3, 4, 5, 8]
[2, 6, 7]
You can simply iterate from the end to the start to avoid pop() changing the list size while iterating. Just call reverse() on your new list after running your loop if the order of the list matters.
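A minimal sketch of the reverse-iteration idea, using the second example from the question. Popping at index i only shifts the elements after i, which have already been visited, so no index is skipped.

```python
x = [2, 3, 4, 5, 6, 7, 8]
ref = [2, 6, 7]
result = []

# Walk the indices from the last one down to 0.
for i in range(len(x) - 1, -1, -1):
    if x[i] in ref:
        result.append(x.pop(i))

# Items were collected back to front, so restore the original order.
result.reverse()

print(x)       # [3, 4, 5, 8]
print(result)  # [2, 6, 7]
```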

Python arranging a list to include duplicates

I have a list in Python that is similar to:
x = [1,2,2,3,3,3,4,4]
Is there a way using pandas or some other list comprehension to make the list appear like this, similar to a queue system:
x = [1,2,3,4,2,3,4,3]
It is possible, by using cumcount:
import pandas as pd
s=pd.Series(x)
s.index=s.groupby(s).cumcount()
s.sort_index()
Out[11]:
0 1
0 2
0 3
0 4
1 2
1 3
1 4
2 3
dtype: int64
If you split your list into one separate list for each value (groupby), you can then use the itertools recipe roundrobin to get this behavior:
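from itertools import groupby  # roundrobin: see the itertools recipes
x = [1, 2, 2, 3, 3, 3, 4, 4]
# list(g) is needed: groupby invalidates a group once it advances to the next
list(roundrobin(*(list(g) for _, g in groupby(x))))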
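Since roundrobin is a recipe from the itertools documentation and not something you can import, here is a self-contained sketch of the same idea that swaps in itertools.zip_longest with a sentinel instead. The function name interleave_groups is made up for illustration, and the sketch assumes it is acceptable to sort the input first (groupby only groups consecutive equal items).

```python
from itertools import groupby, zip_longest

def interleave_groups(values):
    """Take one item from each group of equal values in turn until all are used."""
    # Materialise every group: groupby's sub-iterators are only valid
    # until groupby itself is advanced to the next group.
    groups = [list(g) for _, g in groupby(sorted(values))]
    missing = object()  # sentinel marking "this group has run out"
    return [v
            for row in zip_longest(*groups, fillvalue=missing)
            for v in row
            if v is not missing]

print(interleave_groups([1, 2, 2, 3, 3, 3, 4, 4]))  # [1, 2, 3, 4, 2, 3, 4, 3]
```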
If I'm understanding you correctly, you want to retain all duplicates, but then have the list arranged in an order where you create what are in essence separate lists of unique values, but they're all concatenated into a single list, in order.
I don't think this is possible in a listcomp, and nothing's occurring to me for getting it done easily/quickly in pandas.
But the straightforward algorithm is:
1. Create a different list for each set of unique values: for each i in x, if i is not in list1, add it to list1; else if it is not in list2, add it to list2; else if it is not in list3, add it to list3; and so on. There's certainly a way to do this with recursion, if the number of lists is unpredictable.
2. Evaluate the lists based on their values, to determine the order in which you want to have them listed in the final list. It's unclear from your post exactly what order you want them to be in. Querying by the value in the 0th position could be one way. Evaluating the entire lists as >= each other is another way.
3. Once you have that set of lists and their orders, it's straightforward to concatenate them in order, in the final list.
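The steps above can be sketched with a plain loop instead of recursion. This is only one possible reading of the algorithm: it assumes the buckets are concatenated in the order they were created, and the name arrange_in_passes is made up for illustration.

```python
def arrange_in_passes(values):
    """Put the k-th occurrence of each value into the k-th bucket, then concatenate."""
    buckets = []
    for v in values:
        for bucket in buckets:
            if v not in bucket:
                bucket.append(v)
                break
        else:
            # v is already present in every existing bucket: open a new one
            buckets.append([v])
    return [v for bucket in buckets for v in bucket]

print(arrange_in_passes([1, 2, 2, 3, 3, 3, 4, 4]))  # [1, 2, 3, 4, 2, 3, 4, 3]
```

The membership test on a list makes this quadratic in the worst case, which is fine for short inputs.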
Essentially, what you want is a pattern: the order in which unique numbers are first encountered while traversing the list x. For example, if x = [4,3,1,3,5], then the pattern is 4 3 1 5, and it helps us refill x so that the output is [4,3,1,5,3].
from collections import defaultdict
x = [1,2,2,3,3,3,4,4]
counts_dict = defaultdict(int)
for p in x:
    counts_dict[p]+=1
i =0
while i < len(x):
    for p,cnt in counts_dict.items():
        if i < len(x):
            if cnt > 0:
                x[i] = p
                counts_dict[p]-=1
                i+=1
            else:
                continue
        else:
            # we have placed all the 'p'
            break
print(x) # [1, 2, 3, 4, 2, 3, 4, 3]
Note: dicts preserve insertion order in Python 3.6+ (guaranteed by the language from 3.7), and I am assuming that you are using Python 3.6+.
This is what I thought of doing at first, but it fails in some cases:
'''
x = [3,7,7,7,4]
i = 1
while i < len(x):
    if x[i] == x[i-1]:
        x.append(x.pop(i))
        i = max(1,i-1)
    else:
        i+=1
print(x) # [1, 2, 3, 4, 2, 3, 4, 3]
# x = [2,2,3,3,3,4,4]
# output [2, 3, 4, 2, 3, 4, 3]
# x = [3,7,1,7,4]
# output [3, 7, 1, 7, 4]
# x = [3,7,7,7,4]
# output time_out
'''

Find the permutations that sums to the three smallest numbers

I asked the same thing yesterday but was finding a hard time finding the right sentence to describe my problem, so I deleted it. But here it is again.
Let us say that we have 3 lists:
list1 = [1, 2]
list2 = [2, 3]
list3 = [1]
Let us say I want to find the 3 permutations of these lists (one element taken from each list) whose sums are the smallest possible. Here, the permutations that we want would be:
1,2,1
2,2,1
1,3,1
Because the sum of the numbers on each permutation creates the smallest numbers possible.
2,3,1
Will not be a part of the solution since the sum is larger than the other three, thus, not a part of the three smallest.
Of course, using itertools to list all the permutations and summing the numbers in each one would be the most obvious solution, but I was wondering if there is a more efficient algorithm for this, considering it should be able to handle 1000 lists.
NOTE: If the number of lists is N, then I need to find N permutations. Thus, if there are 3 lists, I find the 3 smallest permutations.
PRECONDITIONS:
-A part of the precondition is that all of these lists are sorted.
-The number of elements on all list is 2N-1, to deal with the case where only one list have more than 1 element.
-All of the lists are sorted from smallest.
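For reference, the brute-force baseline mentioned in the question could look roughly like the sketch below. It enumerates every combination, so it is only usable for a handful of short lists; the function name brute_force_smallest is made up for illustration.

```python
from heapq import nsmallest
from itertools import product

def brute_force_smallest(lists):
    """Return the N combinations with the smallest sums, N being the number of lists."""
    n = len(lists)
    # product() yields every way of taking one element from each list.
    return nsmallest(n, product(*lists), key=sum)

lists = [[1, 2], [2, 3], [1]]
best = brute_force_smallest(lists)
for combo in best:
    print(combo, sum(combo))
```

With 1000 lists the number of combinations is astronomically large, which is why the answers avoid enumerating them.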
Since the lists are sorted, the smallest element in each list is the first one, the sum of which gives us the "minimal sum permutation". Picking any element except from the first one is going to increase the sum value.
We start off by calculating the difference between element i and the first one for each list. For example, for the lists [1, 3, 4, 8] and [3, 9, 12, 15], these differences would be [2, 3, 7] and [6, 9, 12] respectively. We keep them separate in cost_lists, because they will be needed later on. But in cost_global, we pool them all together and by sorting them in ascending order, we find a solution where for all lists but one we choose the minimal value. To keep track which element from which list will give us the next minimum sum, we group the difference values with both the index of the list it comes from and which element in that list it is.
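As a small illustration of this step, here is how the difference lists for the two example lists above could be built. The tuple layout (cost, list index, element index) mirrors the description; the variable names follow the text.

```python
lists = [[1, 3, 4, 8], [3, 9, 12, 15]]

# One cost list per input list: (difference to first element, list index, element index)
cost_lists = [
    [(num - lst[0], i, j) for j, num in enumerate(lst[1:], 1)]
    for i, lst in enumerate(lists)
]

# Pool everything together and sort by cost.
cost_global = sorted(cost for cost_list in cost_lists for cost in cost_list)

print([c[0] for c in cost_lists[0]])  # [2, 3, 7]
print([c[0] for c in cost_lists[1]])  # [6, 9, 12]
print(cost_global[0])                 # (2, 0, 1): swap in element 1 of list 0
```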
However, this is not a complete approach. It is possible, for example, that taking the next value from two lists incurs a smaller cost than taking the next value from one list. So, we have to search for the product of the combinations for k = 2, 3, ..., N. Doing that normally would result in N**N complexity, but we can take some really good shortcuts.
From the partial solution above, we have a list of the minimal costs in order. Since we want only the first N minimal sums, we check what the cost value of the Nth permutation is (threshold). So, when we search for a group of two next values, we can safely ignore their sum if it exceeds our current threshold. And since the difference values within lists are in ascending order, once we cross the threshold, we can instantly exit the loop. Similarly, if we haven't found any new combinations within the threshold for k = 2, it is pointless to look for k > 2. Considering that most likely the smallest sum costs will be the result of a single nonminimal value, or a few small ones (unless most lists have massive differences between sequential values), we are bound to exit these loops rather quickly. The code I came up with to achieve this is fairly ugly, but it effectively does the same as
for k in xrange(2, len(lists)):
    for comb in itertools.combinations(cost_lists, k):
        for group in itertools.product(*comb):
            if sum(g[0] for g in group) <= threshold:
                cost_global.append(group)
except that we exit the loops as soon as we can guarantee that no more results will be found, lest we pointlessly sift through an innumerable number of combinations/products which are over the threshold.
def filter_cost(cost_lists, threshold):
    cost = [[i for i in ilist if i[0] <= threshold] for ilist in cost_lists]
    # the algorithm requires that we remove any lists that have become empty
    return [ilist for ilist in cost if ilist]
def _combi(cost_lists, k, start, depth, subtotal, threshold):
    if depth == k:
        for i in xrange(start, len(cost_lists)):
            for value in cost_lists[i]:
                if value[0] + subtotal > threshold:
                    break
                yield (value,)
    else:
        for i in xrange(start, len(cost_lists)):
            for value in cost_lists[i]:
                if value[0] + subtotal > threshold:
                    break
                for c in _combi(cost_lists, k, i+1, depth+1,
                                value[0]+subtotal, threshold):
                    yield (value,) + c
def combinations_product(cost_lists, k, threshold):
    for i in xrange(len(cost_lists)-k+1):
        for value in cost_lists[i]:
            if value[0] > threshold:
                break
            for comb in _combi(cost_lists, k, i+1, 2, value[0], threshold):
                temp = (value,) + comb
                cost, ilists, ith_items = zip(*temp)
                yield sum(cost), ilists, ith_items
def find_smallest_sum_permutations(lists):
    # N was read from a global in the original; define it locally instead
    N = len(lists)
    minima = [min(x) for x in lists]
    cost_local = []
    cost_global = []
    for i, ilist in enumerate(lists):
        if len(ilist) > 1:
            first = ilist[0]
            diff = [(num-first, i, j) for j, num in enumerate(ilist[1:], 1)]
            cost_local.append(diff)
            cost_global.extend(diff)
    cost_global.sort()
    threshold_index = len(lists) - 2
    cost_threshold = cost_global[threshold_index][0]
    cost_local = filter_cost(cost_local, cost_threshold)
    for k in xrange(2, len(lists)):
        group_combinations = tuple(combinations_product(cost_local, k,
                                                        cost_threshold))
        if group_combinations:
            cost_global.extend(group_combinations)
            cost_global.sort()
            cost_threshold = cost_global[threshold_index][0]
            cost_local = filter_cost(cost_local, cost_threshold)
        else:
            break
    permutations = [minima]
    for k in xrange(N-1):
        _, ilist, ith_item = cost_global[k]
        if type(ilist) == int:
            permutation = [minima[i]
                           if i != ilist else lists[ilist][ith_item]
                           for i in xrange(N)]
        else:
            # multiple nonminimal values combination
            mapping = dict(zip(ilist, ith_item))
            permutation = [minima[i]
                           if i not in mapping else lists[i][mapping[i]]
                           for i in xrange(N)]
        permutations.append(permutation)
    return permutations
Examples
Example in the question.
>>> lists = [
[1, 2],
[2, 3],
[1],
]
>>> for p in find_smallest_sum_permutations(lists):
... print p, sum(p)
[1, 2, 1] 4
[2, 2, 1] 5
[1, 3, 1] 5
Example I had generated with random lists.
>>> import random
>>> N = 5
>>> random.seed(1024)
>>> lists = [sorted(random.sample(range(10*N), 2*N-1)) for _ in xrange(N)]
>>> for p in find_smallest_sum_permutations(lists):
... print p, sum(p)
[4, 4, 1, 6, 0] 15
[4, 6, 1, 6, 0] 17
[4, 4, 3, 6, 0] 17
[4, 4, 1, 6, 4] 19
[4, 6, 3, 6, 0] 19
Example by user2357112 which had caught a glaring error in my previous iteration.
>>> lists = [
[1, 2, 30, 40],
[1, 2, 30, 40],
[10, 20, 30, 40],
[10, 20, 30, 40],
]
>>> for p in find_smallest_sum_permutations(lists):
... print p, sum(p)
[1, 1, 10, 10] 22
[2, 1, 10, 10] 23
[1, 2, 10, 10] 23
[2, 2, 10, 10] 24
The trick is to only generate the combinations that might possibly be needed, and store them in a heap. Each one that you pull out is the smallest one you have not yet seen. And the fact that THAT combination has been pulled out tells you that there are new ones which might also be small.
See https://docs.python.org/2/library/heapq.html for how to use a heap. We also need code for generating combinations. And with that, here is working code for getting the first n combinations for any list of lists:
import heapq
# Helper class for storing combinations.
class ListSelector:
    def __init__(self, lists, indexes):
        self.lists = lists
        self.indexes = indexes
    # Tie-breaker: lets heapq order (value, selector) tuples that have
    # equal values, which Python 3 would otherwise reject with a TypeError.
    def __lt__(self, other):
        return self.indexes < other.indexes
    def value(self):
        answer = 0
        for i in range(0, len(self.lists)):
            answer = answer + self.lists[i][self.indexes[i]]
        return answer
    def values(self):
        return [self.lists[i][self.indexes[i]] for i in range(0, len(self.lists))]
    # These are the next combinations. We are willing to increment any
    # leading 0, or the first non-zero value. This will provide one and
    # only one path to each possible combination.
    def next_selectors(self):
        lists = self.lists
        indexes = self.indexes
        selectors = []
        for i in range(0, len(lists)):
            if len(lists[i]) <= indexes[i] + 1:
                if 0 == indexes[i]:
                    continue
                else:
                    break
            new_indexes = [
                indexes[j] + (0 if j != i else 1)
                for j in range(0, len(lists))]
            selectors.append(ListSelector(lists, new_indexes))
            if 0 < indexes[i]:
                break
        return selectors
# This will just return an iterator over all combinations, from smallest
# to largest. It does NOT generate them until needed.
def combinations(lists):
    sel = ListSelector(lists, [0 for _ in range(len(lists))])
    upcoming = [(sel.value(), sel)]
    while len(upcoming):
        value, sel = heapq.heappop(upcoming)
        yield sel
        for next_sel in sel.next_selectors():
            heapq.heappush(upcoming, (next_sel.value(), next_sel))
# This just gets the first n of them. (It will return fewer if there are fewer.)
def smallest_n_combinations(n, lists):
    i = 0
    for sel in combinations(lists):
        yield sel
        i = i + 1
        if i == n:
            break
# Example usage
lists = [
    [1, 2, 5],
    [2, 3, 4],
    [1]]
for sel in smallest_n_combinations(3, lists):
    print(sel.value(), sel.values(), sel.indexes)
(This could be made more efficient for a long list of lists with tricks like caching the value inside of ListSelector and calculating it incrementally for new ones.)
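To illustrate the incremental-sum idea, here is a compact sketch that stores (total, index tuple) pairs on the heap and derives each new total from its parent's. Note that it is a variation and not the code above: it uses a set of already-seen index tuples to avoid duplicates instead of the single-path rule in next_selectors. It assumes every list is non-empty and sorted in ascending order, and the name smallest_n_sums is made up for illustration.

```python
import heapq

def smallest_n_sums(lists, n):
    """Return up to n (total, picks) pairs in ascending order of total."""
    start = (0,) * len(lists)
    heap = [(sum(lst[0] for lst in lists), start)]
    seen = {start}
    found = []
    while heap and len(found) < n:
        total, idx = heapq.heappop(heap)
        found.append((total, [lst[i] for lst, i in zip(lists, idx)]))
        # Each successor bumps exactly one index by one.
        for pos, lst in enumerate(lists):
            nxt = idx[pos] + 1
            if nxt < len(lst):
                new_idx = idx[:pos] + (nxt,) + idx[pos + 1:]
                if new_idx not in seen:
                    seen.add(new_idx)
                    # Incremental update: swap the old element's value for the new one.
                    new_total = total - lst[idx[pos]] + lst[nxt]
                    heapq.heappush(heap, (new_total, new_idx))
    return found

for total, picks in smallest_n_sums([[1, 2, 5], [2, 3, 4], [1]], 3):
    print(total, picks)
```

Because the heap entries are plain tuples of numbers, ties are broken by the index tuple and no custom comparison is needed.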

Extract elements of list at odd positions

So I want to create a list which is a sublist of some existing list.
For example,
L = [1, 2, 3, 4, 5, 6, 7], I want to create a sublist li such that li contains all the elements in L at odd positions.
While I can do it by
L = [1, 2, 3, 4, 5, 6, 7]
li = []
count = 0
for i in L:
    if count % 2 == 1:
        li.append(i)
    count += 1
But I want to know if there is another way to do the same efficiently and in fewer number of steps.
Solution
Yes, you can:
l = L[1::2]
And this is all. The result will contain the elements placed on the following positions (0-based, so first element is at position 0, second at 1 etc.):
1, 3, 5
so the result (actual numbers) will be:
2, 4, 6
Explanation
The [1::2] at the end is just a notation for list slicing. Usually it is in the following form:
some_list[start:stop:step]
If we omitted start, the default (0) would be used, so the first element (at position 0, because the indexes are 0-based) would be selected. Here start is 1, so the selection begins with the second element.
Because the second argument (stop) is omitted, the default is used (the end of the list). So the list is iterated from the second element to the end.
We also provided the third argument (step), which is 2. This means that one element is selected, the next is skipped, and so on.
So, to sum up, in this case [1::2] means:
1. take the second element (which, by the way, is an odd element, if you judge from the index),
2. skip one element (because we have step=2, so we are skipping one, as opposed to step=1, which is the default),
3. take the next element,
4. repeat steps 2.-3. until the end of the list is reached.
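A short sketch showing that the slice is equivalent to stepping through the indexes by hand (the variable names are only for illustration):

```python
L = [1, 2, 3, 4, 5, 6, 7]

odd_positions = L[1::2]                          # start=1, stop omitted, step=2
same_with_slice = L[slice(1, None, 2)]           # the slice object the notation builds
by_index = [L[i] for i in range(1, len(L), 2)]   # explicit index walk
even_positions = L[::2]                          # start omitted, so it defaults to 0

print(odd_positions)   # [2, 4, 6]
print(even_positions)  # [1, 3, 5, 7]
```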
EDIT: @PreetKukreti gave a link for another explanation on Python's list slicing notation. See here: Explain Python's slice notation
Extras - replacing counter with enumerate()
In your code, you explicitly create and increase the counter. In Python this is not necessary, as you can enumerate through some iterable using enumerate():
for count, i in enumerate(L):
    if count % 2 == 1:
        l.append(i)
The above serves exactly the same purpose as the code you were using:
count = 0
for i in L:
    if count % 2 == 1:
        l.append(i)
    count += 1
More on emulating for loops with counter in Python: Accessing the index in Python 'for' loops
For the odd positions, you probably want:
>>> list_ = list(range(10))
>>> print list_[1::2]
[1, 3, 5, 7, 9]
I like List comprehensions because of their Math (Set) syntax. So how about this:
L = [1, 2, 3, 4, 5, 6, 7]
odd_numbers = [y for x,y in enumerate(L) if x%2 != 0]
even_numbers = [y for x,y in enumerate(L) if x%2 == 0]
Basically, if you enumerate over a list, you'll get the index x and the value y. What I'm doing here is putting the value y into the output list (even or odd) and using the index x to find out if that point is odd (x%2 != 0).
You can also use itertools.islice if you don't need to create a list but just want to iterate over the odd/even elements
import itertools
L = [1, 2, 3, 4, 5, 6, 7]
li = itertools.islice(L, 1, len(L), 2)
You can make use of bitwise AND operator &:
>>> x = [1, 2, 3, 4, 5, 6, 7]
>>> y = [i for i in x if i&1]
[1, 3, 5, 7]
This will give you the odd elements in the list. Now to extract the elements at odd indices you just need to change the above a bit:
>>> x = [10, 20, 30, 40, 50, 60, 70]
>>> y = [j for i, j in enumerate(x) if i&1]
[20, 40, 60]
Explanation
The bitwise AND operator is used with 1, and the reason it works is that an odd number, when written in binary, must have 1 as its last (least significant) digit. Let's check:
23 = 1 * (2**4) + 0 * (2**3) + 1 * (2**2) + 1 * (2**1) + 1 * (2**0) = 10111
14 = 1 * (2**3) + 1 * (2**2) + 1 * (2**1) + 0 * (2**0) = 1110
AND operation with 1 will only return 1 (1 in binary will also have last digit 1), iff the value is odd.
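A quick check of the two numbers above, as a small sketch:

```python
# n & 1 keeps only the lowest binary digit of n.
parity = {n: n & 1 for n in (23, 14)}

for n, bit in parity.items():
    print(n, bin(n), bit)
# 23 0b10111 1
# 14 0b1110 0
```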
Check the Python Bitwise Operator page for more.
P.S: You can tactically use this method if you want to select odd and even columns in a dataframe. Let's say x and y coordinates of facial key-points are given as columns x1, y1, x2, etc... To normalize the x and y coordinates with width and height values of each image you can simply perform:
for i in range(df.shape[1]):
    if i&1:
        df.iloc[:, i] /= heights
    else:
        df.iloc[:, i] /= widths
This is not exactly related to the question but for data scientists and computer vision engineers this method could be useful.

How to do an inverse `range`, i.e. create a compact range based on a set of numbers?

Python has a range function, which allows for things like:
>>> range(1, 6)
[1, 2, 3, 4, 5]
What I’m looking for is kind of the opposite: take a list of numbers, and return the start and end.
>>> magic([1, 2, 3, 4, 5])
[1, 5] # note: 5, not 6; this differs from `range()`
This is easy enough to do for the above example, but is it possible to allow for gaps or multiple ranges as well, returning the range in a PCRE-like string format? Something like this:
>>> magic([1, 2, 4, 5])
['1-2', '4-5']
>>> magic([1, 2, 3, 4, 5])
['1-5']
Edit: I’m looking for a Python solution, but I welcome working examples in other languages as well. It’s more about figuring out an elegant, efficient algorithm. Bonus question: is there any programming language that has a built-in method for this?
A nice trick to simplify the code is to look at the difference of each element of the sorted list and its index:
a = [4, 2, 1, 5]
a.sort()
print [x - i for i, x in enumerate(a)]
prints
[1, 1, 2, 2]
Each run of the same number corresponds to a run of consecutive numbers in a. We can now use itertools.groupby() to extract these runs. Here's the complete code:
from itertools import groupby
def sub(x):
    return x[1] - x[0]
a = [5, 3, 7, 4, 1, 2, 9, 10]
ranges = []
for k, iterable in groupby(enumerate(sorted(a)), sub):
    rng = list(iterable)
    if len(rng) == 1:
        s = str(rng[0][1])
    else:
        s = "%s-%s" % (rng[0][1], rng[-1][1])
    ranges.append(s)
print ranges
printing
['1-5', '7', '9-10']
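The same trick can be wrapped into a function with the interface the question asks for. This is a sketch in Python 3 syntax; the name compact_ranges is made up, and dropping duplicates with set() is an assumption that the code above does not make.

```python
from itertools import groupby

def compact_ranges(numbers):
    """Collapse integers into strings such as '1-5' for runs and '7' for singles."""
    ordered = sorted(set(numbers))
    ranges = []
    # Within a run of consecutive integers, value - index stays constant.
    for _, run in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        if len(values) == 1:
            ranges.append(str(values[0]))
        else:
            ranges.append("%s-%s" % (values[0], values[-1]))
    return ranges

print(compact_ranges([1, 2, 4, 5]))     # ['1-2', '4-5']
print(compact_ranges([1, 2, 3, 4, 5]))  # ['1-5']
```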
Sort numbers, find consecutive ranges (remember RLE compression?).
Something like this:
input = [5,7,9,8,6, 21,20, 3,2,1, 22,23, 50]
output = []
first = last = None # first and last number of current consecutive range
for item in sorted(input):
    if first is None:
        first = last = item # bootstrap
    elif item == last + 1: # consecutive
        last = item # extend the range
    else: # not consecutive
        output.append((first, last)) # pack up the range
        first = last = item
# the last range ended by iteration end
output.append((first, last))
print output
Result: [(1, 3), (5, 9), (20, 23), (50, 50)]. You figure out the rest :)
I thought you might like my generalised clojure solution.
(def r [1 2 3 9 10])
(defn successive? [a b]
(= a (dec b)))
(defn break-on [pred s]
(reduce (fn [memo n]
(if (empty? memo)
[[n]]
(if (pred (last (last memo)) n)
(conj (vec (butlast memo))
(conj (last memo) n))
(conj memo [n]))))
[]
s))
(break-on successive? r)
Since 9000 beat me to it, I'll just post the second part of the code, that prints pcre-like ranges from the previously computed output plus the added type check:
# validate the original numbers (output holds (first, last) tuples, not ints)
for i in input:
    if not isinstance(i, int) or i < 0:
        raise Exception("Only positive ints accepted in pcre_ranges")
result = [ str(x[0]) if x[0] == x[1] else '%s-%s' % (x[0], x[1]) for x in output ]
print result
Output: ['1-3', '5-9', '20-23', '50']
Let's try generators!
# ignore duplicate values
l = sorted( set( [5,7,9,8,6, 21,20, 3,2,1, 22,23, 50] ) )
# get the value differences
d = (i2-i1 for i1,i2 in zip(l,l[1:]))
# get the gap indices
gaps = (i for i,e in enumerate(d) if e != 1)
# get the range boundaries
def get_ranges(gaps, l):
    last_idx = -1
    for i in gaps:
        yield (last_idx+1, i)
        last_idx = i
    yield (last_idx+1,len(l)-1)
# make a list of strings in the requested format (thanks Frg!)
ranges = [ "%s-%s" % (l[i1],l[i2]) if i1!=i2 else str(l[i1]) \
for i1,i2 in get_ranges(gaps, l) ]
This has become rather scary, I think :)
This is kind of elegant but also kind of disgusting, depending on your point of view. :)
import itertools
def rangestr(iterable):
    # next(iterable) works on both Python 2.6+ and Python 3
    end = start = next(iterable)
    for end in iterable:
        pass
    return "%s" % start if start == end else "%s-%s" % (start, end)
class Rememberer(object):
    last = None
class RangeFinder(object):
    def __init__(self):
        self.o = Rememberer()
    def __call__(self, x):
        if self.o.last is not None and x != self.o.last + 1:
            self.o = Rememberer()
        self.o.last = x
        return self.o
def magic(iterable):
    return [rangestr(vals) for k, vals in
            itertools.groupby(sorted(iterable), RangeFinder())]
>>> magic([5,7,9,8,6, 21,20, 3,2,1, 22,23, 50])
['1-3', '5-9', '20-23', '50']
Explanation: it uses itertools.groupby to group the sorted elements together by a key, where the key is a Rememberer object. The RangeFinder class keeps a Rememberer object as long as a consecutive bunch of items belongs to the same range block. Once you've passed out of a given block, it replaces the Rememberer so that the key won't compare equal and groupby will make a new group. As groupby walks over the sorted list, it passes the elements one-by-one into rangestr, which constructs the string by remembering the first and the last element and ignoring everything in between.
Is there any practical reason to use this instead of 9000's answer? Probably not; it's basically the same algorithm.
