Find missing sequences in list that cycles - python

Given 2 lists:
target = [0, 1, 2, 3, 4, 5, 6, 7]
given = [1, 2, 5, 6]
The missing numbers are [[0], [3, 4], [7]]. However, both lists are circular: the end of each list is linked to its start, so they actually look like this:
target = [0,1,2,3,4,5,6,7,0,1,2,3,4,5,......]
given = [1,2, 5,6, 1,2, 5,6,......] # I put a space to better tell where numbers missing
So the desired output is actually [[7, 0], [3, 4]], since 7 and 0 are consecutive.
How do I write a function that does this?

First you can construct a list of missing patches using itertools.groupby and then glue the left-most and right-most sublists if necessary:
import itertools
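def missing(target, given):
    output = [list(g) for k, g in itertools.groupby(target, key=lambda x: x in given) if not k]
    # glue only when there are at least two patches; with a single patch,
    # pop(0) would empty the list and the += would raise an IndexError
    if len(output) > 1 and output[0][0] == target[0] and output[-1][-1] == target[-1]:
        output[-1] += output.pop(0)
    return output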
print(missing([0, 1, 2, 3, 4, 5, 6, 7], [1, 2, 5, 6])) # [[3, 4], [7, 0]]
print(missing([0, 1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 5, 6, 8])) # [[0], [3, 4], [7]]
Update: If you don't want to import the itertools module, replace the line output = [list(g) ...] with the following:
    output, temp = [], [] # an empty "temporary bucket"
    for x in target:
        if x in given:
            if temp: # if temp is nonempty
                output.append(temp) # put the bucket into output
                temp = [] # a new empty bucket
        else: # if x is "missing"
            temp.append(x) # append x into the bucket
    else: # this is called when for loop is over (or target is empty)
        if temp: # if the last temporary bucket is nonempty
            output.append(temp)
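A possible alternative that sidesteps the gluing step: rotate target so that it starts at an element that is present in given, after which no missing run can wrap around the end. This is only a sketch; the name missing_cyclic and the handling of the edge cases (nothing present, empty target) are assumptions rather than part of the answer above, and the runs come out in rotated order.

```python
from itertools import groupby

def missing_cyclic(target, given):
    # hypothetical helper: assumes target is a list and its items are hashable
    present = set(given)
    # index of the first element of target that is present in given
    start = next((i for i, x in enumerate(target) if x in present), None)
    if start is None:
        # nothing is present: the whole cycle is one missing run
        return [list(target)] if target else []
    # start the scan at a present element so no missing run wraps around
    rotated = target[start:] + target[:start]
    return [list(g) for k, g in groupby(rotated, key=present.__contains__) if not k]

print(missing_cyclic([0, 1, 2, 3, 4, 5, 6, 7], [1, 2, 5, 6]))        # [[3, 4], [7, 0]]
print(missing_cyclic([0, 1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 5, 6, 8]))  # [[3, 4], [7], [0]]
```

Using a set for the membership test also avoids rescanning given for every element of target.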

Related

How to get the unselected population in python random module

So, I know I can get a random list from a population using the random module,
l = [0, 1, 2, 3, 4, 8 ,9]
print(random.sample(l, 3))
# [1, 3, 2]
But how do I get the list of the unselected ones? Do I need to remove them manually from the list, or is there a method to get them too?
Edit: The list l in the example doesn't contain the same item multiple times, but when it does, I wouldn't want an item removed more times than it was selected in the sample.
l = [0, 1, 2, 3, 4, 8 ,9]
s1 = set(random.sample(l, 3))
s2 = set(l).difference(s1)
>>> s1
{0, 3, 8}
>>> s2
{1, 2, 4, 9}
Update: same items multiple times
You can shuffle a copy of your list first and then partition it in two:
l = [7, 4, 5, 4, 5, 9, 8, 6, 6, 6, 9, 8, 6, 3, 8]
pop = l[:]
random.shuffle(pop)
pop1, pop2 = pop[:3], pop[3:]
>>> pop1
[8, 4, 9]
>>> pop2
[7, 6, 8, 6, 5, 6, 9, 6, 5, 8, 4, 3]
Because your list can contain the same item multiple times, you can switch to the approach below:
import random
l = [0, 1, 2, 3, 4, 8 ,9]
random.shuffle(l)
selected = l[:3]
unselected = l[3:]
print(selected)
# [4, 0, 1]
print(unselected)
# [8, 2, 3, 9]
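If the original order matters, one option is to sample positions instead of values and then partition by index. The sketch below is an assumption layered on top of the shuffle idea above; split_sample and its rng parameter are hypothetical names introduced here for illustration.

```python
import random

def split_sample(population, k, rng=random):
    # sample k distinct positions, so duplicate values are handled naturally
    chosen = set(rng.sample(range(len(population)), k))
    selected = [x for i, x in enumerate(population) if i in chosen]
    unselected = [x for i, x in enumerate(population) if i not in chosen]
    return selected, unselected

population = [7, 4, 5, 4, 5, 9, 8, 6, 6]
selected, unselected = split_sample(population, 3)
print(selected)    # three items, in their original relative order
print(unselected)  # the remaining six items, also in original order
```

Both halves keep the population's relative order, and the input list is left unmodified.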
If you want to keep track of duplicates, you could count the items of each type and compare the population count to the sample count.
If you don't care about the order of items in the population, you could do it like this:
from collections import Counter
import random
population = [1, 1, 2, 2, 9, 7, 9]
sample = random.sample(population, 3)
pop_count = Counter(population)
samp_count = Counter(sample)
unsampled = [
    k
    for k in pop_count
    for i in range(pop_count[k] - samp_count[k])
]
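The same count comparison can probably be written more compactly with Counter subtraction, which keeps only positive counts. This is a sketch; the sample is fixed here (an assumption, for reproducibility) where the text above uses random.sample.

```python
from collections import Counter

population = [1, 1, 2, 2, 9, 7, 9]
sample = [1, 9, 2]  # fixed for the example; normally random.sample(population, 3)

# subtracting Counters drops items whose count falls to zero or below
remaining = Counter(population) - Counter(sample)
unsampled = list(remaining.elements())
print(sorted(unsampled))  # [1, 2, 7, 9]
```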
If you care about the order in the population, you could do something like this:
check = sample.copy()
unsampled = []
for val in population:
    if val in check:
        check.remove(val)
    else:
        unsampled.append(val)
Or there's this weird list comprehension (not recommended):
check = sample.copy()
unsampled = [
    x
    for x in population
    if x not in check or check.remove(x)
]
The if clause here uses two tricks:
both parts of the test will be falsy if x is in check (list.remove() always returns None), so x is left out, and
remove() will only be called if the first part fails, i.e., if x is in check.
Basically, if (and only if) x is in check, the test falls through to the second condition, which is also falsy (None), but has the side effect of removing one copy of x from check.
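To see the short-circuit behaviour step by step, here is a small trace with a fixed sample (the fixed values are an assumption made so the output is reproducible):

```python
population = [1, 1, 2, 2, 9, 7, 9]
sample = [1, 9, 2]  # fixed instead of random.sample, for a reproducible trace

check = sample.copy()
# `x not in check` is True  -> x is kept and remove() is never called
# `x not in check` is False -> remove(x) runs, returns None, and x is dropped
unsampled = [x for x in population if x not in check or check.remove(x)]

print(unsampled)  # [1, 2, 7, 9]
print(check)      # [] (every sampled item was matched once)
```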
You can do it with a set difference (note that this drops duplicates and does not preserve order):
import random
l = [0, 1, 2, 3, 4, 8 ,9]
rand = random.sample(l, 3)
rest = list(set(l) - set(rand))
print(f"initial list: {l}")
print(f"random list: {rand}")
print(f"rest list: {rest}")
Result:
initial list: [0, 1, 2, 3, 4, 8, 9]
random list: [2, 9, 0]
rest list: [8, 1, 3, 4]

List comprehension headaches

I have a nested list like this:
list = [[1,2,3], [2,5,7,6], [1,-1], [5,7], [6,3,7,4,3], [2, 5, 1, -5]]
What I am trying to do is remove the nested lists that contain both the positive and the negative of the same value. I have tried doing it with a list comprehension, but I couldn't figure it out.
def method(list):
    return [obj for obj in list if (x for x in obj if -x not in obj)]
The obtained results should be like:
list = [[1,2,3], [2,5,7,6], [5,7], [6,3,7,4,3]]
Assuming you want the lists whose elements are either all negative or all positive, you can use the built-in all function to check both possibilities:
result = [L for L in x if all(y>0 for y in L) or all(y<0 for y in L)]
EDIT:
In the comments you clarified what is a valid list (e.g. [-1, 2] is valid)... with this new formulation the test should be
result = [L for L in x if all(-y not in L for y in L)]
Each membership test now scans the whole sublist, so the check is quadratic in the sublist's length. Using a set avoids this:
result = [L for L in x if all(-y not in S for S in (set(L),) for y in L)]
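The two interpretations give different results for a list such as [-1, 2], which mixes signs but contains no value together with its negative. The sketch below contrasts them; same_sign and has_opposites are hypothetical helper names, not part of the answer above.

```python
def same_sign(values):
    # True when every element is positive, or every element is negative
    return all(v > 0 for v in values) or all(v < 0 for v in values)

def has_opposites(values):
    # True when some v and -v are both present
    # (note: 0 counts as its own opposite under this test)
    seen = set(values)
    return any(-v in seen for v in seen)

data = [[1, 2, 3], [1, -1], [-1, 2], [2, 5, 1, -5]]
kept_same_sign = [L for L in data if same_sign(L)]
kept_no_opposites = [L for L in data if not has_opposites(L)]
print(kept_same_sign)     # [[1, 2, 3]]
print(kept_no_opposites)  # [[1, 2, 3], [-1, 2]]
```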
Using a list comprehension you can do something like:
def method2(list):
    return [obj for obj in list if (all(n>0 for n in obj) or all(n<0 for n in obj))]
which, with your example, gives the output:
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]
In general it is better to split the task into steps:
Given a list, find the positives (positives function).
Given a list, find the negatives and multiply them by -1 (negatives function).
If the intersection of positives and negatives is not empty, remove the list.
So, you could do:
def positives(ls):
    return set(l for l in ls if l > 0)
def negatives(ls):
    return set(-1*l for l in ls if l < 0)
list = [[1, 2, 3], [2, 5, 7, 6], [1, -1], [5, 7], [6, 3, 7, 4, 3], [2, 5, 1, -5]]
result = [l for l in list if not negatives(l) & positives(l)]
print(result)
Output
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]
As a side note, you should not use list as a variable name, as it shadows the built-in list type.
Your generator should yield whether the condition to filter an object applies.
You then feed the generator to an aggregator to determine if obj should be filtered.
The aggregator could be any or all, or something different.
# assuming obj should be filtered if both x and the inverse of x are in obj
def method_with_all(src):
    return [obj for obj in src if all(-x not in obj for x in obj)]
def method_with_any(src):
    return [obj for obj in src if any(-x in obj for x in obj)]
You can filter out the lists that have both negative and positive elements:
def keep_list(nested_list):
    is_first_positive = nested_list[0] > 0
    for element in nested_list[1:]:
        if (element > 0) != is_first_positive:
            return False
    return True
my_list = [[1,2,3], [2,5,7,6], [1,-1], [5,7], [6,3,7,4,3], [2, 5, 1, -5]]
print(list(filter(keep_list, my_list)))
output:
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]
NumPy can be used as well. My solution here is similar to the "all" operation suggested by others, but coded explicitly, and it only needs one condition. It checks whether the sign of every element equals the sign of the first element (any other element would do as well).
import numpy as np
def f(b):
    return [a for a in b if np.sum(np.sign(np.array(a)) == np.sign(a[0])) == len(a)]
For your case...
data = [[1,2,3], [2,5,7,6], [1,-1], [5,7], [6,3,7,4,3], [2, 5, 1, -5]]
print(f(data))
...it will return:
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]

How to reshape a list without numpy

How do I reshape a list into an n-dimensional list?
Input:
list = [1, 2, 3, 4, 5, 6, 7, 8]
shape = [2, 2, 2]
output = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
This recursive approach should work.
lst = [1, 2, 3, 4, 5, 6, 7, 8]
shape = [2, 2, 2]
from functools import reduce
from operator import mul
def reshape(lst, shape):
    if len(shape) == 1:
        return lst
    n = reduce(mul, shape[1:])
    return [reshape(lst[i*n:(i+1)*n], shape[1:]) for i in range(len(lst)//n)]
reshape(lst, shape)
You probably want to wrap that with a check that your dimensions make sense... e.g.
assert reduce(mul, shape) == len(lst)
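One way to fold that check into the function itself is sketched below. It assumes Python 3.8+ (for math.prod) and strictly positive dimensions; reshape_checked is a hypothetical name, and raising ValueError instead of asserting is a design choice made here, not something from the answer above.

```python
from math import prod

def reshape_checked(flat, shape):
    # validate up front so a bad shape fails loudly instead of silently truncating
    if prod(shape) != len(flat):
        raise ValueError(
            f"cannot reshape list of length {len(flat)} into shape {list(shape)}"
        )
    if len(shape) == 1:
        return list(flat)
    step = prod(shape[1:])  # number of flat elements under each top-level item
    return [
        reshape_checked(flat[i:i + step], shape[1:])
        for i in range(0, len(flat), step)
    ]

print(reshape_checked([1, 2, 3, 4, 5, 6, 7, 8], [2, 2, 2]))
# [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
```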
Old post, but since I'm currently looking for a more elegant way than mine, here is my approach:
# first, I create some data
l = [ i for i in range(256) ]
# now I reshape it into slices of 4 items
x = [ l[x:x+4] for x in range(0, len(l), 4) ]
Here is an approach using the grouper once on each dimension except the first:
import functools as ft
# example
L = list(range(2*3*4))
S = 2,3,4
# if tuples are acceptable
tuple(ft.reduce(lambda x, y: zip(*y*(x,)), (iter(L), *S[:0:-1])))
# (((0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)), ((12, 13, 14, 15), (16, 17, 18, 19), (20, 21, 22, 23)))
# if it must be lists
list(ft.reduce(lambda x, y: map(list, zip(*y*(x,))), (iter(L), *S[:0:-1])))
# [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]
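The one-liners above rely on the "grouper" idiom: passing the same iterator to zip n times makes zip pull n consecutive items into each tuple. An unrolled sketch of the tuple version follows; reshape_tuples is a hypothetical name used only for illustration.

```python
def reshape_tuples(flat, shape):
    stream = iter(flat)
    # walk the dimensions from last to second; the first one is implied
    for size in reversed(shape[1:]):
        # the same iterator repeated `size` times: each tuple takes `size` items
        stream = zip(*(stream,) * size)
    return tuple(stream)

print(reshape_tuples(range(8), (2, 2, 2)))
# (((0, 1), (2, 3)), ((4, 5), (6, 7)))
```

Note that zip silently drops a trailing incomplete group, so the input length should match the product of the dimensions.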
The code below should do the trick.
The solution given below is very general. The input list can be a nested list of lists of any old/undesired shape; it need not be a list of integers.
Also, there are separate re-usable tools. For example the all_for_one function is very handy.
EDIT:
I failed to note something important. If you put 1s in the shape parameter, you get superfluous list nestings (only one list inside of a list instead of five or six lists inside of a list).
For example, if shape is [1, 1, 2],
then the return value will be [[[0.1, 0.2]]] instead of [0.1, 0.2].
The length of shape is the number of valid subscripts in the output list.
For example,
shape = [1, 2] # len(shape) == 2
lyst = [[0.1, 0.2]]
print(lyst[0][0]) # valid: two subscripts, no error raised
If you want a true column or row vector, then len(shape) must be 1.
For example, shape = [49] will give you a row/column vector of length 49.
shape = [2] # len(shape) == 1
output = [0.1, 0.2]
print(output[0])
Here's the code:
from operator import mul
import itertools as itts
import copy
import functools
one_for_all = lambda one: itts.repeat(one, 1)
def all_for_one(lyst):
    """
    EXAMPLE:
        INPUT:
            [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
        OUTPUT:
            iterator to [1, 2, 3, 4, 5, 6, 7, 8]
    IN GENERAL:
        Gets iterator to all nested elements
        of a list of lists of ... of lists of lists.
    """
    # make an iterator which **IMMEDIATELY**
    # raises a `StopIteration` exception
    its = itts.repeat("", 0)
    for sublyst in lyst:
        if hasattr(sublyst, "__iter__") and id(sublyst) != id(lyst):
            # Be careful ....
            #
            # "string"[0] == "s"[0] == "s"[0][0][0][0][0][0]...
            #
            # do not drill down while `sublyst` has an "__iter__" method
            # do not drill down while `sublyst` has a `__getitem__` method
            #
            it = all_for_one(sublyst)
        else:
            it = one_for_all(sublyst)
        # concatenate results to what we had previously
        its = itts.chain(its, it)
    return its
merged = list(all_for_one([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]))
print("merged == ", merged)
def reshape(xread_lyst, xshape):
    """
    similar to `numpy.reshape`
    EXAMPLE:
        lyst = [1, 2, 3, 4, 5, 6, 7, 8]
        shape = [2, 2, 2]
        result = reshape(lyst, shape)
        print(result)
        result ==
        [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    For this function, input parameter `xshape` can be
    any iterable containing at least one element.
    `xshape` is not required to be a tuple, but it can be.
    The length of xshape should be equal to the number
    of desired list nestings
    If you want a list of integers: len(xshape) == 1
    If you want a list of lists: len(xshape) == 2
    If you want a list of lists of lists: len(xshape) == 3
    If xshape = [1, 2],
        outermost list has 1 element
        that one element is a list of 2 elements.
        result == [[1, 2]]
    If xshape == [2]
        outermost list has 2 elements
        those 2 elements are non-lists:
        result: [1, 2]
    If xshape = [2, 2],
        outermost list has 2 elements
        each element is a list of 2 elements.
        result == [[1, 2], [3, 4]]
    """
    # BEGIN SANITIZING INPUTS
    # unfortunately, iterators are not re-usable
    # Also, they don't have `len` methods
    # (flatten both inputs with `all_for_one`, defined above)
    iread_lyst = [x for x in all_for_one(xread_lyst)]
    ishape = [x for x in all_for_one(xshape)]
    number_of_elements = functools.reduce(mul, ishape, 1)
    if(number_of_elements != len(iread_lyst)):
        msg = [str(x) for x in [
            "\nAn array having dimensions ", ishape,
            "\nMust contain ", number_of_elements, " element(s).",
            "\nHowever, we were only given ", len(iread_lyst), " element(s)."
        ]]
        if len(iread_lyst) < 10:
            msg.append('\nList before reshape: ')
            msg.append(str([str(x)[:5] for x in iread_lyst]))
        raise TypeError(''.join(msg))
    ishape = iter(ishape)
    iread_lyst = iter(iread_lyst)
    # END SANITIZATION OF INPUTS
    write_parent = list()
    parent_list_len = next(ishape)
    try:
        child_list_len = next(ishape)
        for _ in range(0, parent_list_len):
            write_child = []
            write_parent.append(write_child)
            # recurse through the internal helper `ilyst_reshape`
            ilyst_reshape(write_child, iread_lyst, child_list_len, copy.copy(ishape))
    except StopIteration:
        for _ in range(0, parent_list_len):
            write_child = next(iread_lyst)
            write_parent.append(write_child)
    return write_parent
def ilyst_reshape(write_parent, iread_lyst, parent_list_len, ishape):
    """
    You really shouldn't call this function directly.
    Try calling `reshape` instead
    The `i` in the name of this function stands for "internal"
    """
    try:
        child_list_len = next(ishape)
        for _ in range(0, parent_list_len):
            write_child = []
            write_parent.append(write_child)
            ilyst_reshape(write_child, iread_lyst, child_list_len, copy.copy(ishape))
    except StopIteration:
        for _ in range(0, parent_list_len):
            write_child = next(iread_lyst)
            write_parent.append(write_child)
    return None
three_dee_mat = reshape(merged, [2, 2, 2])
print("three_dee_mat == ", three_dee_mat)
Not particularly elegant:
from functools import reduce
from itertools import islice
l=[1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4]
s=[2,3,4]
if s and reduce(lambda x,y:x*y, s) == len(l):
    # if number of elements matches product of dimensions,
    # the first dimension is actually redundant
    s = s[1:]
else:
    raise ValueError("length of input list does not match shape")
while s:
    size = s.pop() # how many elements for this dimension
    # split the list based on the size of the dimension
    it = iter(l)
    l = list(iter(lambda: list(islice(it, size)), []))
# [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 0, 1, 2]],
# [[3, 4, 5, 6], [7, 8, 9, 0], [1, 2, 3, 4]]]

Creating a new list when before it reaches a number

How do I create a new list that contains sublists of ints, splitting the original list whenever the next number is the minimum (or equal to the first value found)?
For example
List1=[1,2,3,4,5,1,2,3,4,1,2,3,4,5,6]
The output that I am looking for is shown below:
Complete_List=[[1,2,3,4,5],[1,2,3,4],[1,2,3,4,5,6]]
I tried looping through the list and appending while the value is greater than 1. However, that does not work, as it doesn't create another list inside it.
Do I have to write a regex for this problem?
Some guidance would be really helpful.
Thank you
Here's something that will split a generic iterable on a given value.
def split_on_value(iterable, split_value):
    iterator = iter(iterable)
    outer, inner = [], [next(iterator)]
    for value in iterator:
        if value == split_value:
            outer.append(inner)
            inner = []
        inner.append(value)
    outer.append(inner)
    return outer
value_list = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
print(split_on_value(value_list, 1))
# [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
print(split_on_value(value_list, 3))
# [[1, 2], [3, 4, 5, 1, 2], [3, 4, 1, 2], [3, 4, 5, 6]]
A vanilla, straightforward, CS101 solution. It is possibly the most efficient one, because it scans the list exactly once. It also does not assume that segments begin with 1: a new segment starts whenever a value is not greater than the previous one.
fragment = []
result = []
prev = List1[0] - 1 # Preset the previous element marker
for n in List1:
    if n > prev:
        fragment.append(n)
    else:
        result.append(fragment)
        fragment = [n]
    prev = n
result.append(fragment)
#[[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
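The same single-pass idea can be wrapped in a function that also tolerates an empty input, by comparing against the last stored value instead of a preset marker. This is a sketch; split_on_drop is a hypothetical name.

```python
def split_on_drop(values):
    result = []
    for n in values:
        # extend the current run while the values keep increasing
        if result and n > result[-1][-1]:
            result[-1].append(n)
        else:
            # first item, or the value did not increase: start a new run
            result.append([n])
    return result

print(split_on_drop([1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6]))
# [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
print(split_on_drop([3, 4, 5, 2, 3, 1, 2]))
# [[3, 4, 5], [2, 3], [1, 2]]
```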
First you search for the 1's, or whatever your condition is, and get the indices within the list. Don't forget to append the len(list) to include the last segment.
idx = [i for i, l in enumerate(List1) if l == 1] + [len(List1)]
Optionally, if you do not know whether the list always has a 1 at index 0, also include index 0 so that the leading segment is kept.
idx = [0] + idx if idx[0] != 0 else idx
Then, split the list at those indices you found.
complete_list = [List1[ind1:ind2] for ind1, ind2 in zip(idx[:-1], idx[1:])]
and the result:
[[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
You can try this to split at every instance of 1:
List1=[1,2,3,4,5,1,2,3,4,1,2,3,4,5,6]
print([list(map(int, "1" + i)) for i in ''.join(map(str, List1)).split("1")][1:])
Mapping str over List1 and joining the results produces one long string of digits. Splitting that string at each "1" yields the digits that followed each 1. The code then puts the lost "1" back at the front of each piece and maps int over its characters, creating a list within a list; the trailing [1:] drops the empty piece before the first 1. Note that this only works when every value is a single digit.

Python: split list of integers based on step between them

I have the following problem. Given a list of integers, I want to split it into a list of lists whenever the step between two consecutive elements of the input list is not 1.
For example: input = [0, 1, 3, 5, 6, 7], output = [[0, 1], [3], [5, 6, 7]]
I wrote the following function, but it's ugly as hell, and I was wondering if anyone could help me find a nicer solution. I tried to use itertools, but couldn't solve it.
Here's my solution:
def _get_parts(list_of_indices):
    lv = list_of_indices
    tuples = zip(lv[:-1], lv[1:])
    split_values = []
    for i in tuples:
        if i[1] - i[0] != 1:
            split_values.append(i[1])
    string = '/'.join([str(i) for i in lv])
    substrings = []
    for i in split_values:
        part = string.split(str(i))
        substrings.append(part[0])
        string = string.lstrip(part[0])
    substrings.append(string)
    result = []
    for i in substrings:
        i = i.rstrip('/')
        result.append([int(n) for n in i.split('/')])
    return result
Thanks a lot!
This works with any iterable
>>> from itertools import groupby, count
>>> inp = [0, 1, 3, 5, 6, 7]
>>> [list(g) for k, g in groupby(inp, key=lambda i,j=count(): i-next(j))]
[[0, 1], [3], [5, 6, 7]]
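The trick in that one-liner is that, within a run of consecutive integers, value minus position stays constant, so groupby can use the difference as its key. A sketch that makes the keys visible by using enumerate in place of the count() default argument (this rewrite is an illustration, not the answer's code):

```python
from itertools import groupby

inp = [0, 1, 3, 5, 6, 7]

# value - index is constant inside each run of consecutive integers
keys = [value - index for index, value in enumerate(inp)]
print(keys)  # [0, 0, 1, 2, 2, 2]

groups = [
    [value for _, value in run]
    for _, run in groupby(enumerate(inp), key=lambda pair: pair[1] - pair[0])
]
print(groups)  # [[0, 1], [3], [5, 6, 7]]
```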
def _get_parts(i, step=1):
    o = []
    for x in i:
        if o and o[-1] and x - step == o[-1][-1]:
            o[-1].append(x)
        else:
            o.append([x])
    return o
_get_parts([0, 1, 3, 5, 6, 7], step=1)
# [[0, 1], [3], [5, 6, 7]]
Here is a solution utilizing a for loop.
def splitbystep(alist):
    newlist = [[alist[0]]]
    for i in range(1,len(alist)):
        if alist[i] - alist[i-1] == 1:
            newlist[-1].append(alist[i])
        else:
            newlist.append([alist[i]])
    return newlist
This is how I'd do it:
inp = [0, 1, 3, 5, 6, 7]
base = []
for item in inp:
    if not base or item - base[-1][-1] != 1: # If base is empty (first item) or diff isn't 1
        base.append([item]) # Append a new list containing just one item
    else:
        base[-1].append(item) # Otherwise, add current item to the last stored list in base
print(base) # => [[0, 1], [3], [5, 6, 7]]
This is the textbook use case for function split_when from module more_itertools:
import more_itertools
print(list(more_itertools.split_when([0, 1, 3, 5, 6, 7], lambda x,y: y-x != 1)))
# [[0, 1], [3], [5, 6, 7]]
Or, even simpler, with more_itertools.consecutive_groups:
print([list(g) for g in more_itertools.consecutive_groups([0, 1, 3, 5, 6, 7])])
# [[0, 1], [3], [5, 6, 7]]
