Extract subarray between certain value in Python - python

I have a list of values that are the result of merging many files. I need to pad some of the values. I know that each sub-section begins with the value -1. I am trying to basically extract a sub-array between -1's in the main array via iteration.
For example, suppose this is the main list:
-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1
I would like to extract the values between the -1s:
list_a = 1 2 3 4 5 7
list_b = 4 4 4 5 6 7 7 8
list_c = 0 2 3 5 ...
list_n = a1 a2 a3 ... aM
I have extracted the indices for each -1 by searching through the main list:
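from itertools import count, izip, tee  # Python 2; on Python 3 use the built-in zip instead of izip
minus_ones = [i for i, j in izip(count(), q) if j == -1]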
I also assembled them as pairs using a common recipe:
def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

for index in pairwise(minus_ones):
    print index
The next step I am trying to do is grab the values between the index pairs, for example:
list_b: (7 , 16) -> 4 4 4 5 6 7 7 8
so I can then do some work to those values (I will add a fixed int. to each value in each sub-array).

You mentioned numpy in the tags. If you're using it, have a look at np.split.
For example:
import numpy as np
x = np.array([-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2,
3, 5, -1])
arrays = np.split(x, np.where(x == -1)[0])
arrays = [item[1:] for item in arrays if len(item) > 1]
This yields:
[array([1, 2, 3, 4, 5, 7]),
array([4, 4, 4, 5, 6, 7, 7, 8]),
array([0, 2, 3, 5])]
What's going on is that where yields an array (actually a tuple of arrays, hence the where(blah)[0]) of the indices where the given expression is true. We can then pass these indices to split to get a sequence of arrays.
However, the result will contain the -1's and an empty array at the start, if the sequence starts with -1. Therefore, we need to filter these out.
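To make the filtering step concrete, here is a rough plain-list sketch of the same idea. It only mimics what np.where and np.split do with slicing; it is not numpy's implementation, and the names cuts, bounds and pieces are made up for illustration.

```python
seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]

# Positions of the -1 markers: the plain-list analogue of np.where(x == -1)[0].
cuts = [i for i, value in enumerate(seq) if value == -1]

# Cut the list at every marker position, the way np.split(x, cuts) would.
bounds = [0] + cuts + [len(seq)]
pieces = [seq[start:stop] for start, stop in zip(bounds, bounds[1:])]

# The first piece is empty and the last holds only a marker, so filter on
# length and drop the leading -1 from every remaining piece.
groups = [piece[1:] for piece in pieces if len(piece) > 1]

print(cuts)    # [0, 7, 16, 21]
print(groups)  # [[1, 2, 3, 4, 5, 7], [4, 4, 4, 5, 6, 7, 7, 8], [0, 2, 3, 5]]
```

Each piece starts with its own -1 marker, which is why both the [1:] slice and the length check are needed.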
If you're not already using numpy, though, your (or @DSM's) itertools solution is probably a better choice.

If you only need the groups themselves and don't care about the indices of the groups (you could always reconstruct them, after all), I'd use itertools.groupby:
>>> from itertools import groupby
>>> seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
>>> groups = [list(g) for k,g in groupby(seq, lambda x: x != -1) if k]
>>> groups
[[1, 2, 3, 4, 5, 7], [4, 4, 4, 5, 6, 7, 7, 8], [0, 2, 3, 5]]
I missed the numpy tags, though: if you're working with numpy arrays, using np.split/np.where is a better choice.
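As a sketch of how the grouped result could feed the next step from the question (adding a fixed integer to every value in each sub-list), something like the following should work. OFFSET is a hypothetical value chosen for illustration; the question does not say what the real padding is.

```python
from itertools import groupby

seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
OFFSET = 10  # hypothetical padding value, not taken from the question

# groupby yields (key, run) pairs; the key is True for runs of data values
# and False for the -1 separators, so keeping only True keys drops the markers.
groups = [list(run) for is_data, run in groupby(seq, key=lambda x: x != -1) if is_data]

# Add the fixed offset to every value in every sub-list.
padded = [[value + OFFSET for value in group] for group in groups]

print(padded)
```

Note that two adjacent -1 values produce no empty group with this approach, since groupby merges them into a single separator run.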

I would do it something like this, which is a little different from the path you started down:
input_list = [-1,1,2,3,4,5,7,-1,4,4,4,5,6,7,7,8,-1,0,2,3,5,-1]
list_index = -1
new_lists = []
for i in input_list:
    if i == -1:
        list_index += 1
        new_lists.append([])
        continue
    else:
        print list_index
        print new_lists
        new_lists[list_index].append(i)

I think when you build your list, you can directly add the values to a string. So rather than starting with a list like xx = [], you can start with xx = '', and then do an update like xx = xx + ' ' + str(val). The result will be a string rather than a list. Then, you can just use the split() method on the string.
In [48]: xx
Out[48]: '-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1'
In [49]: xx.split('-1')
Out[49]: ['', ' 1 2 3 4 5 7 ', ' 4 4 4 5 6 7 7 8 ', ' 0 2 3 5 ', '']
In [50]: xx.split('-1')[1:-1]
Out[50]: [' 1 2 3 4 5 7 ', ' 4 4 4 5 6 7 7 8 ', ' 0 2 3 5 ']
I'm sure you can take it from here.
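For completeness, a minimal sketch of that last step, turning the string chunks back into lists of integers. It assumes that no other value's text contains "-1" (a value such as -10 or -12 would be split in the wrong place), so treat it as an illustration rather than a robust parser.

```python
xx = '-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1'

# Split on the marker and drop the empty strings before the first
# and after the last -1.
chunks = xx.split('-1')[1:-1]

# Each chunk is a space-separated run of numbers; convert it back to ints.
groups = [[int(token) for token in chunk.split()] for chunk in chunks]

print(groups)  # [[1, 2, 3, 4, 5, 7], [4, 4, 4, 5, 6, 7, 7, 8], [0, 2, 3, 5]]
```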

Related

Given two integers m and n, loop repeatedly through an array of m and remove each nth element. Return the last element left

Given two integers m and n, loop repeatedly through an array of m and remove each nth element.
Return the last element left.
(If m = 7 and n = 4, then begin with the array [1, 2, 3, 4, 5, 6, 7] and remove, in order, [4, 1, 6, 5, 2, 7] and return 3)
Shouldn't this return 2 and not 3?
Yes, it should return 2 if the algorithm is meant to return the last element left in the list:
1 2 3 4 5 6 7
1 2 3 . 5 6 7   (4)
. 2 3   5 6 7   (1)
  2 3   5 . 7   (6)
  2 3   .   7   (5)
  2 3       .   (7)
  2 .           (3)
Yes, it should return 2.
Sample code
a=[i for i in range(1,8)]
n=4
k=n
m=7
while len(a)!=0:
    print(a)
    print("poped ",a.pop((n-1)%m))
    n = (n-1)%m+ k
    m = m-1
    print()
and the output,
[1, 2, 3, 4, 5, 6, 7]
poped 4
[1, 2, 3, 5, 6, 7]
poped 1
[2, 3, 5, 6, 7]
poped 6
[2, 3, 5, 7]
poped 5
[2, 3, 7]
poped 7
[2, 3]
poped 3
[2]
poped 2
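The same walk can be wrapped in a function that returns both the removal order and the survivor. This is only a sketch: the name last_remaining is hypothetical, and it assumes m >= 1 and n >= 1.

```python
def last_remaining(m, n):
    """Remove every n-th element from [1..m] in a circle; return (removed, survivor)."""
    items = list(range(1, m + 1))
    removed = []
    idx = 0  # position where counting resumes after each removal
    while len(items) > 1:
        # Count n items forward (wrapping around) and remove the one we land on.
        idx = (idx + n - 1) % len(items)
        removed.append(items.pop(idx))
    return removed, items[0]

removed, survivor = last_remaining(7, 4)
print(removed)   # [4, 1, 6, 5, 7, 3]
print(survivor)  # 2
```

For m = 7 and n = 4 this agrees with the trace above: the removal order is [4, 1, 6, 5, 7, 3] and the element left is 2, not 3.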

function to print all elements inside a list into a rectangle

Let's say I have a list:
li = [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10]
I need to make a rectangle with dimension 5*4 with all elements inside the list. Which should output (output is plain string, not list):
1 2 3 4 5
6 7 8 9 10
1 2 3 4 5
6 7 8 9 10
How can I do this? Besides that, I need a general formula for creating a rectangle of length*width dimensions that can take input from a list of any length.
Here is working code:
li = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
width = 5
length = len(li)//width
for i in range(length):
    print(*li[i*width:width*(i+1)])  # unpack the slice so the row prints as plain text, not as a list
I did it like this:
def list_rect(li,dim1,dim2):
    i=0
    for line in range(dim2):
        for col in range(dim1):
            if i>=len(li):
                i=0  # wrap around to the start once the list is exhausted
            print(li[i],end=' ')
            i+=1
        print()
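For the "list of any length" part of the question, one possible sketch is to fix only the width and let the number of rows follow from the list length (it comes out as the ceiling of len(items) / width, with a shorter last row if the list does not divide evenly). The function name format_rect is hypothetical.

```python
def format_rect(items, width):
    """Lay items out in rows of `width` and return the result as one plain string."""
    # Slice the list into consecutive rows; the last row may be shorter.
    rows = [items[start:start + width] for start in range(0, len(items), width)]
    return "\n".join(" ".join(str(item) for item in row) for row in rows)

li = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(format_rect(li, 5))
```

Returning a string instead of printing directly keeps the output "plain string, not list" and makes the function easy to check.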

How to resample an array by duplicating/skipping every N item?

I am confused on how to achieve the following:
Say I have an array of size X (e.g: 3000 items). I want to create a function that will stretch that array to size Y (e.g: 4000) by duplicating every N item.
Along with another function to do the opposite, remove every N items to make the array size 2000 for example.
I guess this is more of a math problem than a programming problem, and as you can tell maths aren't my strong point. Here's what I have so far:
def upsample(originalArray, targetSize):
    newArray = []
    j = 0
    for i in range (0, len(originalArray)):
        newArray.append(originalArray[i])
        # calculate at what interval items need to be duplicated
        # this is what I'm having trouble with
        if j == interval:
            newArray.append(originalArray[i])
            j = 0
        j+=1
    return newArray
Here is an example of what I'm trying to do:
# stretch array from 10 to 12 items
originalArray = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
upsample(originalArray, 12)
# output: [0, 1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 9]
Any help will be much appreciated
Create a floating-point linspace and map it back to integers to use as indices into your original array. (Since you wanted [0, 1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 9] instead of [0, 0, 1, 2, 3, 4, 4, 5, 6, 7, 8, 9], the index sequence has to be flipped in the if branch.)
The code avoids loops for performance.
import numpy as np

def upsample(originalArray, targetSize):
    x = np.linspace(0, originalArray.size, num=targetSize, endpoint=False)
    if targetSize > originalArray.size:
        x = -np.flip(x, axis=0) + originalArray.size
        x[-1] = originalArray.size - 1
    x = originalArray[x.astype(int)]
    return x
upsample(originalArray, 21) gives [0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 9]
upsample(originalArray, 23) gives [0 0 1 1 2 2 3 3 3 4 4 5 5 6 6 6 7 7 8 8 9 9 9]
upsample(originalArray, 5) gives [0 2 4 6 8]
etc.
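The same index-mapping idea can be sketched without numpy, using integer arithmetic on plain lists. This is an illustrative variant, not the code above: resample is a hypothetical name, it assumes a non-empty input, and it places the duplicates at slightly different positions than the output requested in the question.

```python
def resample(items, target_size):
    """Stretch or shrink items to target_size by repeating or skipping elements."""
    source_size = len(items)
    # Output position i maps back to source position i * source_size // target_size,
    # which repeats source indices when growing and skips them when shrinking.
    return [items[i * source_size // target_size] for i in range(target_size)]

originalArray = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(resample(originalArray, 12))  # [0, 0, 1, 2, 3, 4, 5, 5, 6, 7, 8, 9]
print(resample(originalArray, 5))   # [0, 2, 4, 6, 8]
```

One function covers both directions, because the mapping only depends on the ratio between the two sizes.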
To downsample your array, keep every N-th item:
N = 2  # downsampling by 2
new = originalArray[::N]
To upsample (a being originalArray), duplicate every N-th item:
new = [item for t in [[a[i]]*2 if i%N==0 else [a[i],] for i in range(0,len(a))] for item in t]
or more explicitly:
res = []
for i, item in enumerate(originalArray):
    res.append(item)
    if i % N == 0:
        res.append(item)  # duplicate every N-th item

Group lists of different rows by multiple columns values with Pandas

I have a dataframe df1 like this:
import pandas as pd
dic = {'A':[0,0,2,2,2,1,5,5],'B':[[1,5,3,8],[1,8,7,5],[7,8,9,5],[3],[1,5,9,3],[0,3,5],[],[4,2,3,1]],'C':['a','b','c','c','d','e','f','f'],'D':['0','8','7','6','4','5','2','2']}
df1 = pd.DataFrame(dic)
and looks like this:
#Initial dataframe
   A             B  C  D
0  0  [1, 5, 3, 8]  a  0
1  0  [1, 8, 7, 5]  b  8
2  2  [7, 8, 9, 5]  c  7
3  2           [3]  c  6
4  2  [1, 5, 9, 3]  d  4
5  1     [0, 3, 5]  e  5
6  5            []  f  2
7  5  [4, 2, 3, 1]  f  2
My goal is to group rows that have the same values in column A and C and merge the content of column B in such a way that the result looks like this:
#My GOAL
   A                B  C
0  0     [1, 5, 3, 8]  a
1  0     [1, 8, 7, 5]  b
2  2  [3, 7, 8, 9, 5]  c
3  2     [1, 5, 9, 3]  d
4  1        [0, 3, 5]  e
5  5     [4, 2, 3, 1]  f
As you can see, rows having the same items in column A and C are merged while if at least one is different they are left as is.
My idea was to use the groupby and sum functions like this:
df1.groupby(by=['A','C'],as_index=False,sort=True).sum()
but Python returns an error message: Function does not reduce
Could you please tell me what is wrong with my line of code? What should I write in order to achieve my goal?
Note: I do not care about what happens to column D, which can be discarded.
One of the possibilities would be to flatten the list of lists until it gets exhausted with the help of itertools.chain(*iterables)
import itertools
df1.groupby(['A', 'C'])['B'].apply(lambda x: list(itertools.chain(*x))).reset_index()
(Or)
Use sum with lambda:
df1.groupby(by=['A','C'])['B'].apply(lambda x: x.sum()).reset_index()
Both yield one row per (A, C) pair, with column B holding the concatenated lists.
By default, groupby().sum() expects numeric (scalar) values to aggregate, not collections of elements such as lists.
Another possibility:
df1.groupby(by=['A','C'],as_index=False,sort=True).agg({'B': lambda x: tuple(sum(x, []))})
Result:
   A  C                B
0  0  a     (1, 5, 3, 8)
1  0  b     (1, 8, 7, 5)
2  1  e        (0, 3, 5)
3  2  c  (7, 8, 9, 5, 3)
4  2  d     (1, 5, 9, 3)
5  5  f     (4, 2, 3, 1)
Based on this answer (it seems that lists do not work too well with aggregation).
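To show what these grouped concatenations compute, here is a plain-Python sketch of the same operation on the example data, without pandas. It is only an illustration of the logic (group on (A, C), then chain the B lists together); the variable names are made up and column D is left out.

```python
from itertools import chain

# (A, B, C) triples taken from the example dataframe in the question.
rows = [
    (0, [1, 5, 3, 8], 'a'), (0, [1, 8, 7, 5], 'b'),
    (2, [7, 8, 9, 5], 'c'), (2, [3], 'c'),
    (2, [1, 5, 9, 3], 'd'), (1, [0, 3, 5], 'e'),
    (5, [], 'f'), (5, [4, 2, 3, 1], 'f'),
]

# Collect the B lists that share the same (A, C) key.
grouped = {}
for a, b, c in rows:
    grouped.setdefault((a, c), []).append(b)

# Flatten each group's lists into one list, with keys sorted as groupby(sort=True) would.
merged = {key: list(chain.from_iterable(grouped[key])) for key in sorted(grouped)}

for key, values in merged.items():
    print(key, values)
```

The sum(x, []) variant does the same flattening: starting from an empty list, each + concatenates the next list onto the running result.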

Android pattern plotting in Python

I need to make an Android pattern, or just a pattern, in a 3x3 matrix. The pattern is [8, 7, 6, 5, 4, 3, 2, 0, 1] and I need to plot it in a 3x3 matrix. The first entry in the pattern is the starting point, and it connects to the second entry, and so on. The result needs to be the following:
8, 9, 7
6, 5, 4
3, 2, 1
pattern = [8, 7, 6, 5, 4, 3, 2, 0, 1]
matrix = [0,0,0,0,0,0,0,0,0]
lst = ([matrix[i:i + 3] for i in range(0, len(matrix), 3)])
for i in lst:
    print(i)
for char in pattern:
    matrix[char]=char
Do you mean something like this:
def print_pattern(pattern, cols=3):
    for ii, pp in enumerate(pattern):
        if ii % cols == 0:
            print("")  # start a new row
        print(pp),  # Python 2: the trailing comma suppresses the newline
Then you can call this function as
pattern = [8, 7, 6, 5, 4, 3, 2, 0, 1]
print_pattern(pattern)
This results in the following output:
8 7 6
5 4 3
2 0 1
If you want to print the pattern in the opposite order you can pass a reversed list of your pattern, e.g.:
print_pattern(reversed(pattern))
Gives the following output:
1 0 2
3 4 5
6 7 8
This function accepts an integer n and an iterable. It makes a list of tuples of width n from that iterable:
def mat(n, it):
    return list(zip(*[iter(it)]*n))
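A short usage sketch for this idiom, shown here for Python 3. It works because [iter(it)] * n repeats the same iterator object n times, so zip pulls n consecutive items for each tuple. Any leftover items that do not fill a complete row are dropped.

```python
def mat(n, it):
    # The list holds n references to ONE iterator, so each tuple that zip
    # builds consumes n consecutive items from it.
    return list(zip(*[iter(it)] * n))

pattern = [8, 7, 6, 5, 4, 3, 2, 0, 1]
rows = mat(3, pattern)
print(rows)  # [(8, 7, 6), (5, 4, 3), (2, 0, 1)]

# Print the rows as a plain 3x3 grid.
for row in rows:
    print(*row)
```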
