How to pad multiple lists with trailing zeros? - python

Suppose I have two lists containing the same number of elements which are lists of integers. For instance:
a = [[1, 7, 3, 10, 4], [1, 3, 8], ..., [2, 5, 10, 91, 54, 0]]
b = [[5, 4, 23], [1, 2, 0, 4], ..., [5, 15, 11]]
For each index, I want to pad the shorter list with trailing zeros. The example above should look like:
a = [[1, 7, 3, 10, 4], [1, 3, 8, 0], ..., [2, 5, 10, 91, 54, 0]]
b = [[5, 4, 23, 0, 0], [1, 2, 0, 4], ..., [5, 15, 11, 0, 0, 0]]
Is there an elegant way to perform this comparison and padding built into Python lists, or perhaps NumPy? I am aware that numpy.pad can perform the padding, but it's the iteration and comparison over the lists that has me stuck.

I'm sure there's an elegant Python one-liner for this sort of thing, but sometimes a straightforward imperative solution will get the job done:
for i in range(len(a)):
    x = len(a[i])
    y = len(b[i])
    longest = max(x, y)  # target length for this pair
    a[i].extend([0] * (longest - x))
    b[i].extend([0] * (longest - y))
print(a, b)
Be careful with "elegant" solutions too, because they can be very difficult to comprehend (I can't count the number of times I've come back to a piece of code I wrote using reduce() and had to struggle to figure out how it worked).

One line? Yes. Elegant? No.
In [2]: from itertools import zip_longest
In [3]: A, B = map(list, zip(*[map(list, zip(*zip_longest(l1, l2, fillvalue=0)))
   ...:                        for l1, l2 in zip(a, b)]))
In [4]: A
Out[4]: [[1, 7, 3, 10, 4], [1, 3, 8, 0], [2, 5, 10, 91, 54, 0]]
In [5]: B
Out[5]: [[5, 4, 23, 0, 0], [1, 2, 0, 4], [5, 15, 11, 0, 0, 0]]

Note: Creates 2 new lists. Preserves the old lists.
>>> from itertools import repeat
>>> b = [[5, 4, 23], [1, 2, 0, 4],[5, 15, 11]]
>>> a = [[1, 7, 3, 10, 4], [1, 3, 8],[2, 5, 10, 91, 54, 0]]
>>> [y+list(repeat(0, len(x)-len(y))) for x,y in zip(a,b)]
[[5, 4, 23, 0, 0], [1, 2, 0, 4], [5, 15, 11, 0, 0, 0]]
>>> [x+list(repeat(0, len(y)-len(x))) for x,y in zip(a,b)]
[[1, 7, 3, 10, 4], [1, 3, 8, 0], [2, 5, 10, 91, 54, 0]]

a = [[1, 7, 3, 10, 4], [1, 3, 8], [2, 5, 10, 91, 54, 0]]
b = [[5, 4, 23], [1, 2, 0, 4], [5, 15, 11]]
for idx in range(len(a)):
    size_diff = len(a[idx]) - len(b[idx])
    if size_diff < 0:
        a[idx].extend([0] * abs(size_diff))
    elif size_diff > 0:
        b[idx].extend([0] * size_diff)
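The answers above can be summarized as a small Python 3 helper. This is only a sketch: the name pad_pairs and the fill parameter are made up for illustration, and it returns new lists rather than modifying a and b in place.

```python
def pad_pairs(a, b, fill=0):
    """Return new lists in which each pair (a[i], b[i]) has equal length."""
    padded_a, padded_b = [], []
    for x, y in zip(a, b):
        longest = max(len(x), len(y))
        # multiplying a list by zero or a negative number gives an empty list,
        # so the longer sublist of each pair is copied unchanged
        padded_a.append(x + [fill] * (longest - len(x)))
        padded_b.append(y + [fill] * (longest - len(y)))
    return padded_a, padded_b


a = [[1, 7, 3, 10, 4], [1, 3, 8], [2, 5, 10, 91, 54, 0]]
b = [[5, 4, 23], [1, 2, 0, 4], [5, 15, 11]]
new_a, new_b = pad_pairs(a, b)
print(new_a)
print(new_b)
```

If in-place modification is preferred, the extend-based loops shown earlier are probably the simpler choice.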

Related

Splitting list using a list of indices

I'm trying to split a list into groups based on index pairs from another list, given:
>>> l = list(range(10))
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> idx = [0, 5]
I need to break up the list resulting in:
>>> l[0:5]
[0, 1, 2, 3, 4]
>>> l[5:]
[5, 6, 7, 8, 9]
The list idx will at a minimum always be [0], but may be of size n; values inside idx will always be sorted ascending.
Currently I have:
>>> l = list(range(10))
>>> idx = [0, 5]
>>> idx.append(None)
>>> [l[idx[i]:idx[i + 1]] for i in range(len(idx) - 1)]
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
Is there a way to accomplish this without explicitly appending None and iterating over a range?
Edit: for another example...
Given:
>>> l = list(range(14))
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
>>> idx = [0, 5, 10]
Desired result:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13]]
You could try itertools.zip_longest:
from itertools import zip_longest
l = list(range(14))
idx = [0, 5, 10]
print([l[start:stop] for start, stop in zip_longest(idx, idx[1:])])
Result:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13]]
With NumPy you can use numpy.split():
import numpy as np
# np.split yields an empty first chunk for index 0, so filter it out
res = [x.tolist() for x in np.split(l, idx) if x.size != 0]
print(res)
Output:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13]]
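If the indices are evenly spaced, you can also slice with a fixed step (this does not work for arbitrary indices):
>>> result = [l[curr_idx:curr_idx + idx[1]] for curr_idx in idx]
>>> result
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13]]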
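For completeness, here is a sketch that uses plain zip and leaves idx untouched. The helper name split_at is hypothetical, and it assumes idx is a sorted list, as stated in the question.

```python
def split_at(seq, idx):
    """Split seq into consecutive slices that start at each index in idx."""
    # Build the stop positions as a new list; None means "until the end".
    stops = idx[1:] + [None]
    return [seq[start:stop] for start, stop in zip(idx, stops)]


l = list(range(14))
idx = [0, 5, 10]
parts = split_at(l, idx)
print(parts)
```

Because idx[1:] + [None] creates a new list, the caller's idx is not modified, which avoids the explicit append from the question.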

How can i find sum of subtraction in numpy

I would like to use a NumPy function in a daily report, because my data is quite large.
Consider a NumPy 2D array:
A = array([[0, 1, 2],
           [1, 2, 3],
           [2, 3, 4],
           [3, 4, 5],
           [4, 5, 6],
           [5, 6, 7],
           [6, 7, 8],
           [7, 8, 9]])
I want to do something like this
abs(array([0, 1, 2]) - array([[3, 4, 5], [4, 5, 6], ..., [7, 8, 9]])).sum()
abs(array([1, 2, 3]) - array([[4, 5, 6], [5, 6, 7], ..., [7, 8, 9]])).sum()
...
abs(array([3, 4, 5]) - array([[0, 1, 2], [6, 7, 8], [7, 8, 9]])).sum()
abs(array([4, 5, 6]) - array([[0, 1, 2], [1, 2, 3], [7, 8, 9]])).sum()
...
abs(array([7, 8, 9]) - array([[0, 1, 2], [1, 2, 3], ..., [4, 5, 6]])).sum()
I have tried this, but I cannot skip the rows on the right side that share any element with the row on the left side.
for i in range(len(A)):
    temp = np.roll(A, -i, axis=0)
    print(abs(temp[0] - temp[3:]).sum())
These are the expected results:
results = [75, 54, ..., 30, 30, ..., 75]
Sorry for my poor English explanation, thank you.
If you wish to have a simple one-liner solution involving only NumPy functionality, I propose this:
import numpy as np
results = np.apply_along_axis(
    arr=A,
    axis=1,
    # for each row x, sum |x - row| over the rows of A that share no element with x
    func1d=lambda x: np.abs(x - A[~np.isin(A, x).any(axis=1), :]).sum(),
)
The result is as expected:
array([75, 54, 36, 30, 30, 36, 54, 75])
Here you go:
=^..^=
import numpy as np
A = np.array([[0, 1, 2],
              [1, 2, 3],
              [2, 3, 4],
              [3, 4, 5],
              [4, 5, 6],
              [5, 6, 7],
              [6, 7, 8],
              [7, 8, 9]])
def sum_data(select_row):
    # roll data so that the selected row comes first
    rolled_data = np.roll(A, -select_row, axis=0)
    drop_numbers = []
    for item in rolled_data[0]:
        drop_numbers.append(item)
    # find rows to drop
    drop_rows = []
    for item in drop_numbers:
        # get rows
        gg = np.unique(np.where(rolled_data == item)[0])
        for number in gg:
            drop_rows.append(number)
    # get unique row numbers, sorted so that row 0 is guaranteed to come first
    unique_rows = sorted(set(drop_rows))
    del unique_rows[0]  # delete first number, which is the selected row
    # delete rows
    rolled_data = np.delete(rolled_data, unique_rows, axis=0)
    # calculate
    difference_value = 0
    for i in range(1, len(rolled_data), 1):
        difference_value += abs(rolled_data[0] - rolled_data[i]).sum()
    return difference_value
# loop over each row
collect_values = []
for j in range(len(A)):
    collect_values.append(sum_data(j))
print(collect_values)
Output:
[75, 54, 36, 30, 30, 36, 54, 75]
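Since the question mentions large data, a loop-free variant based on broadcasting may also be worth a look. This is a sketch rather than a tuned solution: the names shares and dist are made up here, and the intermediate comparison array has shape (n, n, k, k), so memory use grows quickly with the number of rows.

```python
import numpy as np

A = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5],
              [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9]])

# shares[i, j] is True when row i and row j have at least one element in common
shares = (A[:, None, :, None] == A[None, :, None, :]).any(axis=(2, 3))

# dist[i, j] is the sum of absolute differences between row i and row j
dist = np.abs(A[:, None, :] - A[None, :, :]).sum(axis=2)

# for each row, add up the distances to the rows that share no element with it
results = np.where(shares, 0, dist).sum(axis=1)
print(results.tolist())
```

For very large inputs, processing the rows in chunks would keep the intermediate arrays manageable.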

Python- Removing items

I want to remove items from a list called mom. I have another list called cut:
mom = [[0, 8, 1], [0, 6, 2, 7], [0, 11, 12, 3, 9], [0, 5, 4, 10]]
cut = [0, 9, 8, 2]
How do I remove everything in cut from mom, except for zero?
My desired result is:
mom=[[0,1],[0,6,7],[0,11,12,3],[0,5,4,10]]
>>> [[e for e in l if e not in cut or e == 0] for l in mom]
[[0, 1], [0, 6, 7], [0, 11, 12, 3], [0, 5, 4, 10]]
This is how I'd do it with a list comprehension:
mom= [[0,8,1], [0, 6, 2, 7], [0, 11, 12, 3, 9], [0, 5, 4, 10]]
cut =[0, 9, 8, 2]
mom = [[x for x in subList if x not in cut or x == 0 ] for subList in mom ]
The answers provided by Ingnacio and Dom are perfect. The same can be done in a clearer, easier-to-understand way. Try the following:
mom = [[0, 8, 1], [0, 6, 2, 7], [0, 11, 12, 3, 9], [0, 5, 4, 10]]
cut = [0, 9, 8, 2]
for e in mom:
    for f in e[:]:  # iterate over a copy, because e is modified inside the loop
        if f in cut and f != 0:
            e.remove(f)  # used the remove() function of list
print(mom)
Much easier for a novice in Python. Isn't it?
Given cut = [0, 9, 8, 2] and
mom = [[0, 8, 1], [0, 6, 2, 7], [0, 11, 12, 3, 9], [0, 5, 4, 10]]
and assuming the 0 element is removed from the cut list:
cut = [9, 8, 2]
result = []
for e in mom:
    result.append(list(set(e) - set(cut)))
Output (note that sets do not preserve the original order and drop duplicates):
>>> result
[[0, 1], [0, 6, 7], [0, 11, 3, 12], [0, 10, 4, 5]]
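If the order inside each sublist matters, the list comprehension from the earlier answers can be combined with a set used only for membership tests. A minimal sketch, where the names to_drop and filtered are chosen just for this example:

```python
mom = [[0, 8, 1], [0, 6, 2, 7], [0, 11, 12, 3, 9], [0, 5, 4, 10]]
cut = [0, 9, 8, 2]

# values to drop: everything in cut except 0 (a set gives fast membership tests)
to_drop = set(cut) - {0}

# build new sublists; the original order inside each sublist is kept
filtered = [[x for x in sub if x not in to_drop] for sub in mom]
print(filtered)
```

This builds new lists and leaves mom unchanged; assign the result back to mom if that is what you need.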

How to randomly shuffle data and target in python?

I have a 4D array of training images whose dimensions correspond to (image_number, channels, width, height). I also have a 2D array of target labels whose dimensions correspond to (image_number, class_number). When training, I want to randomly shuffle the data using random.shuffle, but how can I keep the labels shuffled in the same order as my images? Thx!
from sklearn.utils import shuffle
import numpy as np
X = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
y = np.array([0, 1, 2, 3, 4])
X, y = shuffle(X, y)
print(X)
print(y)
[[1 1 1]
 [3 3 3]
 [0 0 0]
 [2 2 2]
 [4 4 4]]
[1 3 0 2 4]
There is another easy way to do that. Suppose there are N images in total. Then we can do the following:
from random import shuffle
ind_list = list(range(N))
shuffle(ind_list)
train_new = train[ind_list, :, :, :]
target_new = target[ind_list,]
If you want a numpy-only solution, you can just reindex the second array on the first, assuming you've got the same image numbers in both:
In [67]: train = np.arange(20).reshape(4,5).T
In [68]: target = np.hstack([np.arange(5).reshape(5,1), np.arange(100, 105).reshape(5,1)])
In [69]: train
Out[69]:
array([[ 0,  5, 10, 15],
       [ 1,  6, 11, 16],
       [ 2,  7, 12, 17],
       [ 3,  8, 13, 18],
       [ 4,  9, 14, 19]])
In [70]: target
Out[70]:
array([[  0, 100],
       [  1, 101],
       [  2, 102],
       [  3, 103],
       [  4, 104]])
In [71]: np.random.shuffle(train)
In [72]: target[train[:,0]]
Out[72]:
array([[  2, 102],
       [  3, 103],
       [  1, 101],
       [  4, 104],
       [  0, 100]])
In [73]: train
Out[73]:
array([[ 2,  7, 12, 17],
       [ 3,  8, 13, 18],
       [ 1,  6, 11, 16],
       [ 4,  9, 14, 19],
       [ 0,  5, 10, 15]])
If you're looking for a synchronized (unison) shuffle, you can use the following function:
def unisonShuffleDataset(a, b):
    assert len(a) == len(b)
    p = np.random.permutation(len(a))
    return a[p], b[p]
The function above handles only two NumPy arrays. It can be extended to more by adding input parameters and returning each array indexed by the same permutation.
Depending on what you want to do, you could also randomly generate a number for each dimension of your array with
random.randint(a, b) #a and b are the extremes of your array
which would select randomly amongst your objects.
Use the same seed to build the random generator multiple times to shuffle different arrays:
>>> seed = np.random.SeedSequence()
>>> arrays = [np.arange(10).repeat(i).reshape(10, -1) for i in range(1, 4)]
>>> for ar in arrays:
...     np.random.default_rng(seed).shuffle(ar)
...
>>> arrays
[array([[1],
        [2],
        [7],
        [8],
        [0],
        [4],
        [3],
        [6],
        [9],
        [5]]),
 array([[1, 1],
        [2, 2],
        [7, 7],
        [8, 8],
        [0, 0],
        [4, 4],
        [3, 3],
        [6, 6],
        [9, 9],
        [5, 5]]),
 array([[1, 1, 1],
        [2, 2, 2],
        [7, 7, 7],
        [8, 8, 8],
        [0, 0, 0],
        [4, 4, 4],
        [3, 3, 3],
        [6, 6, 6],
        [9, 9, 9],
        [5, 5, 5]])]
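To tie the permutation idea back to the shapes in the question, here is a small sketch with a 4D image array and a 2D label array. The data is a hypothetical stand-in (6 images, 3 channels, 4x4 pixels), and the fixed seed is there only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable

# hypothetical stand-ins for the real data: image i is filled with the value i
train = np.arange(6).reshape(6, 1, 1, 1) * np.ones((6, 3, 4, 4), dtype=int)
target = np.stack([np.arange(6), np.arange(6) + 100], axis=1)

# one permutation of the image axis, applied to both arrays
perm = rng.permutation(len(train))
train_new = train[perm]
target_new = target[perm]

# the first pixel of each image and the first label column should still match
print(train_new[:, 0, 0, 0])
print(target_new[:, 0])
```

Indexing with the permutation returns shuffled copies, so the original train and target stay in their original order.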

Number list with no repeats and ordered

This code returns a list of triples from [0,0,0] to [9,9,9] that contains no repeats, with the elements of each triple in order from smallest to largest.
def number_list():
    b = []
    for position1 in range(10):
        for position2 in range(10):
            for position3 in range(10):
                if position1 <= position2 and position2 <= position3:
                    b.append([position1, position2, position3])
    return b
I'm looking for a shorter and better way to write this code without using multiple variables (position1, position2, position3), instead using only one variable i.
Here is my attempt at modifying the code, but I'm stuck at implementing the if statements:
def number_list():
    b = []
    for i in range(1000):
        b.append(list(map(int, str(i).zfill(3))))
    return b
On the same note as the other itertools answer, there is another way with combinations_with_replacement:
list(itertools.combinations_with_replacement(range(10), 3))
Simply use list comprehension, one way to do it:
>>> [[x,y,z] for x in range(10) for y in range(10) for z in range(10) if x<=y and y<=z]
[[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 0, 3], [0, 0, 4], [0, 0, 5], [0, 0, 6],
[0, 0, 7], [0, 0, 8], [0, 0, 9], [0, 1, 1], [0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 1, 5], [0, 1, 6], [0, 1, 7], [0, 1, 8], [0, 1, 9], [0, 2, 2], [0, 2, 3],
[0, 2, 4], [0, 2, 5], [0, 2, 6], [0, 2, 7], [0, 2, 8], [0, 2, 9], [0, 3, 3],
[0, 3, 4], [0, 3, 5], [0, 3, 6], [0, 3, 7], [0, 3, 8],....[6, 8, 8], [6, 8, 9],
[6, 9, 9], [7, 7, 7], [7, 7, 8], [7, 7, 9], [7, 8, 8], [7, 8, 9], [7, 9, 9],
[8, 8, 8], [8, 8, 9], [8, 9, 9], [9, 9, 9]]
Here's a simpler way than doing the checks, but which is still IMO worse than combinations_with_replacement:
[(a, b, c) for a in range(10)
           for b in range(a, 10)
           for c in range(b, 10)]
Namely, instead of filtering values after producing them, you only produce the values you want in the first place.
You can use itertools.product() to eliminate nested loops:
>>> filter(lambda i: i[0] <= i[1] <= i[2],
... itertools.product(range(10), range(10), range(10)))
Or better with list comprehensions:
>>> numbers = itertools.product(range(10), range(10), range(10))
>>> [(a, b, c) for a, b, c in numbers if a <= b <= c]
I think it is worthwhile to point out that the original code is weird and can be rewritten easily to be simpler:
def number_list2():
    b = []
    for position1 in range(10):
        for position2 in range(position1, 10):
            for position3 in range(position2, 10):
                # always true given the range bounds above
                if position1 <= position2 and position2 <= position3:
                    b.append([position1, position2, position3])
    return b
There are better solutions here, but this one is the stepping stone to getting to them.
This could be done pretty easily with recursion, without using itertools.
n - the length of each tuple
m - the upper bound of each value
The code:
def non_decreasing(n, m):
    if n == 0:
        return []
    if n == 1:
        # values start at 0 so that the result matches the list in the question
        return [[i] for i in range(m + 1)]
    return [[i] + t for t in non_decreasing(n - 1, m) for i in range(t[0] + 1)]
The result is the output of non_decreasing(3, 9).
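As a quick sanity check, the combinations_with_replacement approach mentioned above can be compared against the nested-loop version from the question. This is a sketch; the variable name triples is chosen only for this example.

```python
from itertools import combinations_with_replacement


def number_list():
    # the nested-loop version from the question, kept as a reference
    return [[x, y, z]
            for x in range(10)
            for y in range(10)
            for z in range(10)
            if x <= y <= z]


# for a sorted input, combinations_with_replacement yields sorted tuples
# in lexicographic order, which is the same order the nested loops produce
triples = [list(t) for t in combinations_with_replacement(range(10), 3)]

print(len(triples))
print(triples == number_list())
```

The conversion to lists is only needed for the comparison; if tuples are acceptable, list(combinations_with_replacement(range(10), 3)) is enough.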
