What is the best way to do this? I want to take the element-wise difference, but not in this horrible way. Each of A, B, and C should be subtracted from subtract_from.
A = [500, 500, 500, 500, 5000]
B = [100, 100, 540, 550, 1200]
C = [540, 300, 300, 100, 10]
triples= [tuple(A),tuple(B), tuple(C)]
subtract_from = tuple([1234,4321,1234,4321,5555])
diff = [[] for _ in triples]  # one result list per tuple, so diff[i] exists
for main in subtract_from:
    for i in range(len(triples)):
        for t in triples[i]:
            diff[i].append(main - t)
Try something like this:
all_lists = [A, B, C]
[[i-j for i,j in zip(subtract_from,l)] for l in all_lists]
[
[734, 3821, 734, 3821, 555],
[1134, 4221, 694, 3771, 4355],
[694, 4021, 934, 4221, 5545]
]
This is a clean, idiomatic way to do it: there is no need to import any library, just use builtins.
You could try using map and operator:
import operator
A = [500, 500, 500, 500, 5000]
B = [100, 100, 540, 550, 1200]
C = [540, 300, 300, 100, 10]
l = [A, B, C]
subtract_from = [1234,4321,1234,4321,5555]
diff = [list(map(operator.sub, subtract_from, i)) for i in l]
print(diff)
# [[734, 3821, 734, 3821, 555], [1134, 4221, 694, 3771, 4355], [694, 4021, 934, 4221, 5545]]
First of all, if you want tuples, use tuples explicitly without converting lists. That being said, you should write something like this:
a = 500, 500, 500, 500, 5000
b = 100, 100, 540, 550, 1200
c = 540, 300, 300, 100, 10
vectors = a, b, c
data = 1234, 4321, 1234, 4321, 5555
diff = [
    [de - ve for de, ve in zip(data, vec)]
    for vec in vectors
]
If you want a list of tuples, use tuple(de - ve for de, ve in zip(data, vec)) instead of [de - ve for de, ve in zip(data, vec)].
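As a small sketch of the tuple variant described above (same data as in this answer, with the vectors written inline rather than as separate names):

```python
vectors = (
    (500, 500, 500, 500, 5000),
    (100, 100, 540, 550, 1200),
    (540, 300, 300, 100, 10),
)
data = (1234, 4321, 1234, 4321, 5555)

# A generator expression inside tuple() yields one tuple per vector.
diff = [tuple(de - ve for de, ve in zip(data, vec)) for vec in vectors]
print(diff)
# [(734, 3821, 734, 3821, 555), (1134, 4221, 694, 3771, 4355), (694, 4021, 934, 4221, 5545)]
```

The outer container is still a list; wrap the whole comprehension in tuple(...) as well if you need a tuple of tuples.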
I think everyone else has already nailed it with list comprehensions, so here are a couple of odd alternatives. If you are using mutable lists, and reusing them in an imperative style is acceptable, you can update them in place:
A = [500, 500, 500, 500, 5000]
B = [100, 100, 540, 550, 1200]
C = [540, 300, 300, 100, 10]
subtract_from = (1234,4321,1234,4321,5555)
for i, x in enumerate(subtract_from):
    A[i], B[i], C[i] = x-A[i], x-B[i], x-C[i]
# also with map
#for i,x in enumerate(zip(subtract_from,A,B,C)):
#    A[i], B[i], C[i] = map(x[0].__sub__, x[1:])
diff = [A,B,C]
It's less elegant, but possibly more efficient (I have not benchmarked this, so treat it as a guess).
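One way to check the efficiency guess is a quick timeit comparison. This is only a sketch: the function names with_comprehension and in_place are made up for the example, the in-place variant works on copies so that each run starts from the same data (which adds a little overhead of its own), and timings will differ between machines and Python versions.

```python
from timeit import timeit

A = [500, 500, 500, 500, 5000]
B = [100, 100, 540, 550, 1200]
C = [540, 300, 300, 100, 10]
subtract_from = (1234, 4321, 1234, 4321, 5555)

def with_comprehension():
    # Builds three new lists; the inputs are left untouched.
    return [[s - v for s, v in zip(subtract_from, vec)] for vec in (A, B, C)]

def in_place():
    # Works on copies so every timed run starts from the same data.
    a, b, c = A[:], B[:], C[:]
    for i, x in enumerate(subtract_from):
        a[i], b[i], c[i] = x - a[i], x - b[i], x - c[i]
    return [a, b, c]

# Both approaches should agree before their speed is compared.
same_result = with_comprehension() == in_place()
print(same_result)

for func in (with_comprehension, in_place):
    print(func.__name__, timeit(func, number=10_000))
```

For lists this small, any difference is likely to be negligible, so readability is probably the better criterion.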
I have a Python list consisting of data:
[60, 120, 180, 240, 480]
I need to calculate the intervals between the elements in the list and get the value in between the elements. For the above list, I'm looking for this output:
[[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
The first and last values of the list are carried over directly, but the values in between are obtained by the following method, e.g. for 60 and 120: (120 - 60 = 60 / 2 = 30 + 60 = 90)
I cannot work out how to do this in a simple pythonic fashion, and I have buried myself in if/else statements to solve it.
You can do this fairly simply with pairwise. It's included in Python as of version 3.10, but if you're on an earlier version, you can get it from more_itertools or implement it yourself. (I also use mean, which is a handy convenience even though it's trivial to reimplement.)
from itertools import pairwise
from statistics import mean
original = [60, 120, 180, 240, 480]
midpoints = [mean(pair) for pair in pairwise(original)]
output = list(pairwise([original[0], *midpoints, original[-1]]))
print(output)
[(60, 90), (90, 150), (150, 210), (210, 360), (360, 480)]
Note that this outputs each pair as a tuple, rather than a list, as in your sample output. I think this is more idiomatic, and I would prefer it in my own code. However, if you'd prefer lists, it's a simple change:
from itertools import pairwise
from statistics import mean
original = [60, 120, 180, 240, 480]
midpoints = [mean(pair) for pair in pairwise(original)]
output = [
list(pair) for pair in pairwise([original[0], *midpoints, original[-1]])
]
print(output)
[[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
We can replace inner points by midpoints and then turn adjacent pairs into intervals:
a[1:-1] = map(mean, pairwise(a))
a[:] = pairwise(a)
As a non-modifying function:
from itertools import pairwise
from statistics import mean
def intervals(a):
    a = a[:]
    a[1:-1] = map(mean, pairwise(a))
    return list(pairwise(a))
print(intervals([60, 120, 180, 240, 480]))
Output:
[(60, 90), (90, 150), (150, 210), (210, 360), (360, 480)]
The intervals are tuples instead of lists, but like CrazyChucky I think that tuples are more idiomatic for this (unless you actually have a need for them to be lists).
The shortest version I can come up with:
A = [60, 120, 180, 240, 480]
B = []
for i in range(len(A)):
    if i == 0:
        B.append([A[i], (A[i] + A[i+1])//2])
    else:
        if(i < len(A)-1):
            B.append([B[i - 1][1], (A[i] + A[i+1])//2])
        else:
            B.append([B[i - 1][1], A[i]])
print(B)
x_list = [60, 120, 180, 240, 480]
y = []
for i in range(len(x_list) - 1):
    y.append(int((x_list[i]+x_list[i+1])/2))
print(y)
z = [[x_list[0], y[0]]]
for i in range(0, len(y)-1):
    z.append([y[i], y[i+1]])
z.append([y[-1], x_list[-1]])
print(z)
Try the above.
The first line printed contains the midpoints, and the last line shows the final list.
It's a matter of understanding indices. There are probably faster ways to do it, but when you are learning, it's better to iterate explicitly.
$ python example.py
[90, 150, 210, 360]
[[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
$
You can use the following:
def getIntervals(l):
    l2 = []
    l2.append(l[0])
    for i in range(0, len(l) - 1):
        l2.append((l[i] + l[i+1]) // 2)
    l2.append(l[-1])
    ret = []
    for i in range(len(l2) - 1):
        ret.append([l2[i], l2[i+1]])
    return ret
And then you can call it like this:
x = [60, 120, 180, 240, 480]
y = getIntervals(x)
print(y) # prints [[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
Basically, you iterate through the list, and then you find the endpoints of each interval, and then you iterate through again and form the pairs of endpoints.
Most pythonic way I can come up with:
from itertools import pairwise # python ≥3.10 only
l = [60, 120, 180, 240, 480]
mid_points = map(lambda pair: sum(pair)//2, pairwise(l)) # [90, 150, 210, 360]
all_values = l[0], *mid_points, l[-1] # pre/append 60 and 480 resp. to mid_points
# all_values: (60, 90, 150, 210, 360, 480)
list(pairwise(all_values))
# [(60, 90), (90, 150), (150, 210), (210, 360), (360, 480)]
If you don't have Python 3.10 you can emulate pairwise with:
from itertools import tee
def pairwise(iterable):
    # pairwise('ABCDEFG') --> AB BC CD DE EF FG
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)
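If the same script has to run on both older and newer Python versions, the emulation above can be combined with the built-in in a single import guard. This is just a sketch of that pattern; the fallback is the tee-based function from this answer.

```python
try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        # Fallback for older versions: pairwise('ABCD') --> AB BC CD
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

pairs = list(pairwise([60, 120, 180, 240, 480]))
print(pairs)
# [(60, 120), (120, 180), (180, 240), (240, 480)]
```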
Solution written in function format (without needing to import anything):
Function:
def get_interval_list(my_list):
    interval_start = 0
    interval_end = 0
    interval_list = []
    for i in range(len(my_list)):
        if i == 0:
            interval_start = my_list[0]
        else:
            interval_start = interval_end
        if i == len(my_list)-1:
            interval_end = my_list[-1]
        else:
            current_num = my_list[i]
            next_num = my_list[i+1]
            interval_end = (current_num + next_num) / 2
        interval_list.append([int(interval_start), int(interval_end)])
    return interval_list
Function with commented explanations:
def get_interval_list(my_list):
    # variables (self explanatory... interval_list temporarily stores the interval start and
    # end pair for each number being iterated through)
    interval_start = 0
    interval_end = 0
    interval_list = []
    # for loop iterates through the conditional statements for a number of times dependent on the
    # length of my_list, finds interval_start and interval_end for each number in my_list,
    # and then add each pair to interval_list
    for i in range(len(my_list)):
        # conditional statement checks if the current number is the first item in my_list,
        # and if it is, assigns it to interval_start - if it isn't, the interval_end for the
        # previous number will be the same as the interval_start for the
        # current number (as per your requested output)
        if i == 0:
            interval_start = my_list[0]
        else:
            interval_start = interval_end
        # conditional statement checks if the current number is the last item in my_list,
        # and if it is, assigns it to interval_end - if it isn't, we calculate interval_end by
        # adding the current number and the next number in my_list and dividing by 2
        if i == len(my_list)-1:
            interval_end = my_list[-1]
        else:
            current_num = my_list[i]
            next_num = my_list[i+1]
            interval_end = (current_num + next_num) / 2
        # values of interval_start and interval_end are added as a pair to interval_list
        # (use int() here if you do not want your interval pairs to be returned as floats)
        interval_list.append([int(interval_start), int(interval_end)])
    return interval_list
Sample run using function above:
list_of_numbers = [60, 120, 180, 240, 480]
print(get_interval_list(list_of_numbers))
Output:
[[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
(Could be written more simply but I didn't want to sacrifice readability)
Just a side note: we don't need to use (120 - 60 = 60 / 2 = 30 + 60 = 90) to calculate the midpoints. There is a much simpler way; all we have to do is add the upper and lower limits and divide by 2 like so:
(60 + 120) / 2 = 90
(This mathematical method of finding midpoints works for any "range")
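To see that the two ways of computing the midpoint agree, here is a small sketch comparing them on the adjacent pairs from the question. The function names are invented for the example. Algebraically, low + (high - low) / 2 simplifies to (low + high) / 2.

```python
def midpoint_stepwise(low, high):
    # The question's method: half the gap, added to the lower value.
    return low + (high - low) / 2

def midpoint_average(low, high):
    # The shortcut: the plain average of the two values.
    return (low + high) / 2

pairs = [(60, 120), (120, 180), (180, 240), (240, 480)]
results = [(midpoint_stepwise(lo, hi), midpoint_average(lo, hi)) for lo, hi in pairs]
print(results)
# [(90.0, 90.0), (150.0, 150.0), (210.0, 210.0), (360.0, 360.0)]
```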
This is the best I can come up with, but I'm really unsure if this is the simplest approach:
def generate_split_intervals(input_list):
    first_list_value = input_list[0]
    last_list_value = input_list[-1]
    return_list = []
    last_append = 0
    for idx, item in enumerate(input_list):
        # compare positions, not values, so repeated values are handled correctly
        if idx == 0:
            last_append = int(first_list_value + (abs(input_list[idx+1]-first_list_value)/2))
            return_list.append([first_list_value, last_append])
        elif idx == len(input_list) - 1:
            return_list.append([last_append, last_list_value])
        else:
            this_append = int(item + (abs(input_list[idx+1]-item)/2))
            return_list.append([last_append, this_append])
            last_append = this_append
    return return_list
arr = [60, 120, 180, 240, 480]
def foo(x, i):
    return [(x[i] + x[max(0, i - 1)])//2,
            (x[i] + x[min(len(x) - 1, i + 1)])//2]
[foo(arr, i) for i in range(len(arr))]
# [[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
OR
lz = zip((zip(arr, [arr[0]] + arr[:-1])),
(zip(arr, arr[1:] + [arr[-1]])))
[[sum(a)//2, sum(b)//2] for a, b in lz]
# [[60, 90], [90, 150], [150, 210], [210, 360], [360, 480]]
Suppose I have an array (the elements can be floats also):
D = np.array([0,0,600,160,0,1200,1800,0,1800,900,900,300,1400,1500,320,0,0,250])
The goal is, starting from the beginning of the array, to find the max value (the last occurrence if it appears several times) and split off the part of the array up to and including it. Then repeat this procedure on the remainder until the end of the array. So, the expected result would be:
[[0,0,600,160,0,1200,1800,0,1800],
[900,900,300,1400,1500],
[320],
[0,0,250]]
I managed to find the last max value:
D_rev = D[::-1]
last_max_index = len(D_rev) - np.argmax(D_rev) - 1
i.e. I can get the first subarray of the desired answer. And then I can use a loop to get the rest.
My question is whether there is a numpy way to do it without looping.
IIUC, you can take the reverse cumulative maximum (see accumulate) of D to form groups, then split with itertools.groupby:
D = np.array([0,0,600,160,0,1200,1800,0,1800,900,900,300,1400,1500,320,0,0,250])
groups = np.maximum.accumulate(D[::-1])[::-1]
# array([1800, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1800, 1500, 1500,
# 1500, 1500, 1500, 320, 250, 250, 250])
from itertools import groupby
out = [list(list(zip(*g))[0]) for _, g in groupby(zip(D, groups), lambda x: x[1])]
# [[0, 0, 600, 160, 0, 1200, 1800, 0, 1800],
# [900, 900, 300, 1400, 1500],
# [320],
# [0, 0, 250]]
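If you would rather stay inside numpy for the splitting step as well, one possible sketch is to locate the positions where the reverse cumulative maximum changes and pass them to np.split. This swaps itertools.groupby for np.diff, np.flatnonzero and np.split; the name cut_points is made up for the example.

```python
import numpy as np

D = np.array([0, 0, 600, 160, 0, 1200, 1800, 0, 1800, 900, 900, 300,
              1400, 1500, 320, 0, 0, 250])

# Suffix maximum: for each position, the largest value from there to the end.
groups = np.maximum.accumulate(D[::-1])[::-1]

# A new group starts wherever the suffix maximum changes.
cut_points = np.flatnonzero(np.diff(groups)) + 1

parts = np.split(D, cut_points)
print([part.tolist() for part in parts])
# [[0, 0, 600, 160, 0, 1200, 1800, 0, 1800], [900, 900, 300, 1400, 1500], [320], [0, 0, 250]]
```

The result is a list of arrays of different lengths, since the pieces cannot form one rectangular array.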
I want to iterate over a 3d array (sequences) with shape (1134500, 1, 50)
array([[[1000, 1000, 1000, ..., 1005, 1005, 1005]],
[[1000, 1000, 1000, ..., 1004, 1005, 1004]],
[[1000, 1000, 1000, ..., 1004, 1005, 1004]],
...,
[[1000, 1000, 1000, ..., 1005, 1005, 1004]],
[[1000, 1000, 1000, ..., 1005, 1005, 1005]],
[[1000, 1000, 1000, ..., 1004, 1005, 1004]]], dtype=int32)
To do this, I use the following for loop, which works well except for it overwriting the results from the batch before:
batchsize = 500
for i in range(0, sequences.shape[0], batchsize):
    batch = sequences[i:i+batchsize]
    relevances = lrp_model.lrp(batch)
As a result, I want an array (relevances) with shape (1134500, 1, 50), but I get one with shape (500, 1, 50)
Can someone tell me what's going wrong?
If you want to keep the relevances from every batch, you could preallocate the result array and fill it slice by slice:
batchsize = 500
relevances = np.zeros(sequences.shape)
for i in range(0, sequences.shape[0], batchsize):
    batch = sequences[i:i+batchsize]
    relevances[i:i+batchsize, :, :] = lrp_model.lrp(batch)
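An alternative sketch, in case the output shape or dtype of the model is not known in advance: collect each batch result in a list and join them once at the end with np.concatenate. Here fake_lrp is a hypothetical stand-in for lrp_model.lrp (assumed to return an array with the same leading dimension as its batch), and the input is a small made-up array rather than the real sequences.

```python
import numpy as np

def fake_lrp(batch):
    # Hypothetical stand-in for lrp_model.lrp: returns an array shaped like the batch.
    return batch * 0.5

# Small made-up input; the last batch is deliberately shorter than batchsize.
sequences = np.arange(1200 * 1 * 50, dtype=np.int32).reshape(1200, 1, 50)
batchsize = 500

chunks = []
for i in range(0, sequences.shape[0], batchsize):
    batch = sequences[i:i + batchsize]
    chunks.append(fake_lrp(batch))  # keep every batch result instead of overwriting

relevances = np.concatenate(chunks, axis=0)
print(relevances.shape)
# (1200, 1, 50)
```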
I have the ohlc list as below:
ohlc = [["open", "high", "low", "close"],
[100, 110, 70, 100],
[200, 210, 180, 190],
[300, 310, 300, 310]]
I want to slice it as:
[["open"],[100],[200],[300]]
We can easily slice that list using numpy, but I don't know how to do it without numpy's help.
I tried the method listed below but it didn't show the value I wanted:
ohlc[:][0]
ohlc[:][:1]
ohlc[0][:]
The zip function gets you tuples containing elements from the i-th index of every sublist:
In [217]: ohlc = [["open", "high", "low", "close"],
...: [100, 110, 70, 100],
...: [200, 210, 180, 190],
...: [300, 310, 300, 310]]
...:
In [218]: for t in zip(*ohlc): print(t)
('open', 100, 200, 300)
('high', 110, 210, 310)
('low', 70, 180, 300)
('close', 100, 190, 310)
You're looking for the first one of these, so call on your friend next().
In [219]: next(zip(*ohlc))
Out[219]: ('open', 100, 200, 300)
But that's just a single tuple with all the elements and not a list of lists like you wanted, so use a list comprehension:
In [220]: [[t] for t in next(zip(*ohlc))]
Out[220]: [['open'], [100], [200], [300]]
You can iterate over the list and take the element at the given index from every sublist:
ohlc = [["open", "high", "low", "close"],
[100, 110, 70, 100],
[200, 210, 180, 190],
[300, 310, 300, 310]]
index = 0
result = [[o[index]] for o in ohlc] # [['open'], [100], [200], [300]]
Could this be the right solution?
list_ = []
for i in ohlc:
    list_.append([i[0]])  # wrap in a list to match the requested output
def slice_list(rows):
    ans = []
    for x in rows:
        ans.append([x[0]])  # one single-element list per row
    return ans
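For context on why the three attempts in the question return rows rather than a column, here is a short sketch. The variable names are made up for the example. The key point is that ohlc[:] only copies the outer list, so any index applied afterwards still selects whole rows.

```python
ohlc = [["open", "high", "low", "close"],
        [100, 110, 70, 100],
        [200, 210, 180, 190],
        [300, 310, 300, 310]]

# ohlc[:] is a shallow copy of the outer list, so indexing it afterwards
# behaves exactly like indexing ohlc itself.
first_row = ohlc[:][0]           # same as ohlc[0]: the header row
first_row_wrapped = ohlc[:][:1]  # same as ohlc[:1]: a list holding only the header row
row_copy = ohlc[0][:]            # a copy of the header row

# Selecting a column needs one step per row.
first_column = [[row[0]] for row in ohlc]
print(first_column)
# [['open'], [100], [200], [300]]
```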
I'm stuck again on trying to make this merge sort work.
Currently, I have a 2D array with a Unix timestamp in the first column (fig 1), and I am merge sorting it with the code in fig 2. I am trying to compare the first value in each row, i.e. array[x][0], and then move the whole row depending on that value. However, the merge sort creates duplicates of some rows and deletes others (fig 3). What am I doing wrong? I know the problem is in the merge sort, but I can't see the fix.
fig 1
[[1422403200 100]
[1462834800 150]
[1458000000 25]
[1540681200 150]
[1498863600 300]
[1540771200 100]
[1540771200 100]
[1540771200 100]
[1540771200 100]
[1540771200 100]]
fig 2
import numpy as np
def sort(data):
    if len(data) > 1:
        Mid = len(data) // 2
        l = data[:Mid]
        r = data[Mid:]
        sort(l)
        sort(r)
        z = 0
        x = 0
        c = 0
        while z < len(l) and x < len(r):
            if l[z][0] < r[x][0]:
                data[c] = l[z]
                z += 1
            else:
                data[c] = r[x]
                x += 1
            c += 1
        while z < len(l):
            data[c] = l[z]
            z += 1
            c += 1
        while x < len(r):
            data[c] = r[x]
            x += 1
            c += 1
        print(data, 'done')
unixdate = [1422403200, 1462834800, 1458000000, 1540681200, 1498863600, 1540771200, 1540771200,1540771200, 1540771200, 1540771200]
price=[100, 150, 25, 150, 300, 100, 100, 100, 100, 100]
array = np.column_stack((unixdate, price))
sort(array)
print(array, 'sorted')
fig 3
[[1422403200 100]
[1458000000 25]
[1458000000 25]
[1498863600 300]
[1498863600 300]
[1540771200 100]
[1540771200 100]
[1540771200 100]
[1540771200 100]
[1540771200 100]]
I couldn't spot any mistake in your code.
I have tried your code and can tell that the problem does not occur, at least with regular Python lists: the function doesn't change the number of occurrences of any element in the list.
data = [
[1422403200, 100],
[1462834800, 150],
[1458000000, 25],
[1540681200, 150],
[1498863600, 300],
[1540771200, 100],
[1540771200, 100],
[1540771200, 100],
[1540771200, 100],
[1540771200, 100],
]
sort(data)
from pprint import pprint
pprint(data)
Output:
[[1422403200, 100],
[1458000000, 25],
[1462834800, 150],
[1498863600, 300],
[1540681200, 150],
[1540771200, 100],
[1540771200, 100],
[1540771200, 100],
[1540771200, 100],
[1540771200, 100]]
Edit, taking into account the numpy context and the use of np.column_stack.
(Retracted, see Edit 2.) I expected that np.column_stack creates a view mapping over the two arrays, so that to get a real array rather than a link to your existing arrays, you should copy that array:
array = np.column_stack((unixdate, price)).copy()
Edit 2, taking into account the numpy context
This behavior has actually nothing to do with np.column_stack; np.column_stack already performs a copy.
The reason your code doesn't work is that slicing behaves differently with numpy arrays than with Python lists. Slicing a numpy array creates a view onto the same memory, whereas slicing a list creates a new list.
The erroneous lines are:
l = data[:Mid]
r = data[Mid:]
Since l and r just map to two pieces of the memory held by data, they are modified whenever data is. This is why the lines data[c] = l[z] and data[c] = r[x] overwrite values that have not been merged yet and create duplicates.
If data is a numpy array, we want l and r to be copies of data, not just views. This can be achieved using the copy method.
l = data[:Mid]
r = data[Mid:]
if isinstance(data, np.ndarray):
    l = l.copy()
    r = r.copy()
With this change the sort works; I tested it.
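The difference between a view and a copy can be seen in isolation with a tiny sketch. The array contents here are made up; np.shares_memory reports whether two arrays overlap in memory.

```python
import numpy as np

data = np.array([[3, 30], [1, 10], [2, 20]])

view = data[:2]          # basic slicing of an array returns a view
copy = data[:2].copy()   # .copy() allocates independent memory

data[0] = [9, 90]        # write through the original array

print(np.shares_memory(data, view))  # True
print(np.shares_memory(data, copy))  # False
print(view[0], copy[0])              # the view sees the write, the copy does not
```

This is the same effect that corrupts the merge step: writing into data also changes what l and r contain.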
Note
If you wanted to sort the data using python lists rather than numpy arrays, the equivalent of np.column_stack in vanilla python is zip:
z = zip([10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000])
z
# <zip at 0x7f6ef80ce8c8>
# `zip` creates an iterator, which is ready to give us our entries.
# Iterators can only be walked once, which is not the case of lists.
list(z)
# [(10, 100, 1000), (20, 200, 2000), (30, 300, 3000), (40, 400, 4000)]
The entries are (non-mutable) tuples. If you need the entries to be editable, map list on them:
z = zip([10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000])
li = list(map(list, z))
# [[10, 100, 1000], [20, 200, 2000], [30, 300, 3000], [40, 400, 4000]]
To transpose a matrix, use zip(*matrix):
def transpose(matrix):
    return list(map(list, zip(*matrix)))
transpose(li)
# [[10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000]]
You can also sort a Python list li in place using li.sort(), or get a new sorted list from any iterable (lists are iterables) using sorted(li).
Here, I would use (tested):
sorted(zip(unixdate, price))
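If you want to keep the data as a numpy array, a possible alternative (not part of the original answer) is to let numpy compute the row order with np.argsort on the first column and then index with it. A minimal sketch using the data from the question:

```python
import numpy as np

unixdate = [1422403200, 1462834800, 1458000000, 1540681200, 1498863600,
            1540771200, 1540771200, 1540771200, 1540771200, 1540771200]
price = [100, 150, 25, 150, 300, 100, 100, 100, 100, 100]
array = np.column_stack((unixdate, price))

# argsort gives the row order by the first column; indexing with that order
# builds a new array, so no row is overwritten along the way.
order = np.argsort(array[:, 0], kind="stable")
sorted_array = array[order]
print(sorted_array)
```

Note that this orders rows by the timestamp only, with kind="stable" keeping rows with equal timestamps in their original order, whereas sorted(zip(unixdate, price)) also compares prices when timestamps are equal.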