I have a list containing some values. I want to calculate the sum of every 5 elements, divide it by 5, and store the result in a new list. I am not sure whether I can iterate over a list the way I am doing it. I am new to Python, so any help would be much appreciated.
My list looks like this:
My code is:
a = []
i = np.arange(0,125,5)
j = np.arange(5,130,5)
for q,r in i,j:
    cov = (np.sum(l[q:r]))/5
    cov.append(a)
print(a)
I am getting the following error:
Instead of np.sum(l[i:i+5])/5 you can use np.average().
Instead of two index arrays you can use a single range(0, length, 5).
Try this:
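import numpy as np
# l is the list from the question
a = []
for r in range(0, len(l), 5):
    # Slicing past the end of a list just returns a shorter slice,
    # so no IndexError handling is needed for the last chunk.
    cov = np.average(l[r:r+5])
    a.append(cov)
print(a)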
If numpy is not a hard requirement I'd definitely do it with something simple like this:
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
values_avg = []
temp_sum = 0
for i in range(len(values)):
    temp_sum += values[i]
    if (i + 1) % 5 == 0:
        values_avg.append(temp_sum / 5)
        temp_sum = 0
print(values_avg)
# [3.0, 8.0, 8.0, 3.0]
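For comparison, here is a minimal sketch that slices the list in steps instead of keeping a running sum. The function name chunk_means and its size parameter are made up for this illustration. Unlike the loop above, it also averages a shorter trailing chunk when the length is not a multiple of 5, which may or may not be what you want.

```python
def chunk_means(values, size=5):
    """Average consecutive chunks of `size` items (hypothetical helper)."""
    means = []
    for start in range(0, len(values), size):
        # A slice that runs past the end is simply shorter; it never raises.
        chunk = values[start:start + size]
        means.append(sum(chunk) / len(chunk))
    return means


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(chunk_means(values))
# [3.0, 8.0, 8.0, 3.0]
```

Dividing by len(chunk) rather than a fixed 5 is a design choice: it keeps the last value a true average when the final chunk is short.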
Let's say I have an array (or even a list) that looks like:
tmp_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
And then I have another array of distance values:
dist_data = [15.625, 46.875, 78.125, 109.375, 140.625, 171.875, 203.125, 234.375, 265.625, 296.875]
Now, say I want to pick a distance threshold and perform an operation on the matching parts of tmp_data. For this example, let's just take the max value, and let's set the threshold distance to 100. What I would like to do is group the elements into bins of 100 distance units and replace every element in a bin with the maximum value of that bin. For example, I would want the final output to be
max_tmp_data_100 = [2,2,2,5,5,5,8,8,8,9]
This is because the first 3 elements in dist_data are below 100, so we take the first three elements of tmp_data (0,1,2), get their maximum, and replace all of them with that value, 2.
Then, the next set of data that would be below the next 100 value would be
tmp_dist_array_100 = [109.375 140.625 171.875]
tmp_data_100 = [3,4,5]
max_tmp_data_100 = [5,5,5]
(append to [2,2,2])
I have come up with the following:
# Initialize
final_array = []
d_array = []
idx = 1
for i in range(0,10):
    if dist_data[i] < idx * final_res:
        d_array.append(tmp_data[i])
    elif dist_data[i] > idx * final_res:
        # Now get the values
        max_val = np.amax(d_array)
        new_array = np.ones(len(d_array)) * max_val
        final_array.extend(new_array)
        idx = idx + 1
But the outcome is
[2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 5.0, 5.0]
When it should be [2,2,2,5,5,5,8,8,8,9]
With numpy:
import numpy as np
dist_data = [15.625, 46.875, 78.125, 109.375, 140.625, 171.875, 203.125, 234.375, 265.625, 296.875]
cut = 100
a = np.array(dist_data)
vals = np.searchsorted(a, np.r_[cut:a.max() + cut:cut]) - 1
print(vals[(a/cut).astype(int)])
It gives:
[2 2 2 5 5 5 9 9 9 9]
You can do it with groupby:
from itertools import groupby
dist_data = [15.625, 46.875, 78.125, 109.375, 140.625, 171.875, 203.125, 234.375, 265.625, 296.875]
tmp_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
result = []
index_list = [[dist_data.index(i) for i in l]
              for k, l in groupby(dist_data, key=lambda x:x//100)]
for i in tmp_data:
    for lst in index_list:
        if i in lst:
            result.append(max(lst))
print(result)
# [2, 2, 2, 5, 5, 5, 9, 9, 9, 9]
As per your requirements, the last 4 elements fall under the next threshold value (200 to 300), and the max of those 4 elements is 9.
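Both answers above return index positions, which only match the wanted values because tmp_data happens to be 0..9. Here is a minimal sketch that takes the maximum of the actual values in each bin. The function name replace_with_group_max is hypothetical, and the sketch assumes dist_data is sorted in ascending order, since groupby only merges consecutive items.

```python
from itertools import groupby


def replace_with_group_max(values, distances, step):
    """Replace each value by the max of its distance bin (bin = distance // step)."""
    result = []
    pairs = zip(values, distances)
    # Assumption: distances are sorted, so each bin is one consecutive run.
    for _, group in groupby(pairs, key=lambda pair: pair[1] // step):
        group_values = [value for value, _ in group]
        result.extend([max(group_values)] * len(group_values))
    return result


tmp_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
dist_data = [15.625, 46.875, 78.125, 109.375, 140.625,
             171.875, 203.125, 234.375, 265.625, 296.875]
print(replace_with_group_max(tmp_data, dist_data, 100))
# [2, 2, 2, 5, 5, 5, 9, 9, 9, 9]
```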
I have four given variables:
group size
total of groups
partial sum
1-D tensor
and I want to add zeros when the sum within a group reached the partial sum. For example:
groupsize = 4
totalgroups = 3
partialsum = 15
d1tensor = torch.tensor([ 3, 12, 5, 5, 5, 4, 11])
The expected result is:
[ 3, 12, 0, 0, 5, 5, 5, 0, 4, 11, 0, 0]
I have no clue how to achieve that in pure PyTorch. In plain Python it would be something like this:
target = [0]*(groupsize*totalgroups)
cursor = 0
current_count = 0
d1tensor = [ 3, 12, 5, 5, 5, 4, 11]
for idx, ele in enumerate(target):
    subgroup_start = (idx//groupsize) *groupsize
    subgroup_end = subgroup_start + groupsize
    if sum(target[subgroup_start:subgroup_end]) < partialsum:
        target[idx] = d1tensor[cursor]
        cursor +=1
Can anyone help me with that? I have already googled it but couldn't find anything.
Some logic, NumPy, and list comprehensions are sufficient here.
I will break it down step by step; you can make it slimmer and prettier afterwards:
import numpy as np
my_val = 15
block_size = 4
total_groups = 3
d1 = [3, 12, 5, 5, 5, 4, 11]
d2 = np.cumsum(d1)
d3 = d2 % my_val == 0 #find where sum of elements is 15 or multiple
split_points= [i+1 for i, x in enumerate(d3) if x] # find index where cumsum == my_val
#### Option 1
split_array = np.split(d1, split_points, axis=0)
padded_arrays = [np.pad(array, (0, block_size - len(array)), mode='constant') for array in split_array] #pad arrays
padded_d1 = np.concatenate(padded_arrays[:total_groups]) #put them together, discard extra group if present
#### Option 2
split_points = [el for el in split_points if el <len(d1)] #make sure we are not splitting on the last element of d1
split_array = np.split(d1, split_points, axis=0)
padded_arrays = [np.pad(array, (0, block_size - len(array)), mode='constant') for array in split_array] #pad arrays
padded_d1 = np.concatenate(padded_arrays)
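The same result can also be reached with index arithmetic instead of splitting and padding. The sketch below is plain Python, not PyTorch. The function name pad_groups is hypothetical, and it assumes every group sums to exactly partial_sum and fits within group_size. Each step (running sum, integer division, indexed assignment) has an obvious tensor counterpart, but that translation is not tested here.

```python
from itertools import accumulate


def pad_groups(values, group_size, total_groups, partial_sum):
    """Place each value in its group's slot, leaving zeros as padding."""
    out = [0] * (group_size * total_groups)
    placed = {}  # group id -> number of values already written to that group
    for value, running in zip(values, accumulate(values)):
        # The sum *before* this value tells us how many groups are complete.
        group = (running - value) // partial_sum
        offset = placed.get(group, 0)
        out[group * group_size + offset] = value
        placed[group] = offset + 1
    return out


print(pad_groups([3, 12, 5, 5, 5, 4, 11], group_size=4, total_groups=3, partial_sum=15))
# [3, 12, 0, 0, 5, 5, 5, 0, 4, 11, 0, 0]
```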
I have written code for SPC (statistical process control) and I am attempting to highlight certain out-of-control runs.
So I was wondering if there is a way to pull out runs of n (in my case 7) increasing elements in an array, so I can color them red when I plot them.
This is what I attempted, but I get an indexing error.
import numpy as np
import matplotlib.pyplot as plt
y = np.linspace(0,10,15)
x = np.array([1,2,3,4,5,6,7,8,9,1,4,6,4,6,8])
col =[]
for i in range(len(x)):
    if x[i]<x[i+1] and x[i+1]<x[i+2] and x[i+2]<x[i+3] and x[i+3]<x[i+4] and x[i+4]<x[i+5] and x[i+5]<x[i+6] and x[i+6]<x[i+7]:
        col.append('red')
    elif x[i]>x[i+1] and x[i+1]>x[i+2] and x[i+2]>x[i+3] and x[i+3]>x[i+4] and x[i+4]>x[i+5] and x[i+5]>x[i+6] and x[i+6]>x[i+7]:
        col.append('red')
    else:
        col.append('blue')
for i in range(len(x)):
    # plotting the corresponding x with y
    # and respective color
    plt.scatter(y[i], x[i], c = col[i], s = 10,
                linewidth = 0)
Any help would be greatly appreciated!
As Andy said in his comment, you get the index error because from i=8 onwards i+7 reaches 15, which is the length of x, so the comparisons can index past the end of the array.
Either you only loop over len(x)-7 and just repeat the last entry in col 7 times or you could do something like this:
import numpy as np
import matplotlib.pyplot as plt
y = np.linspace(0,10,20)
x = np.array([1,2,3,4,5,6,1,2,3,1,0,-1,-2,-3,-4,-5,-6,4,5])
col =[]
diff = np.diff(x) # get diff to see if x inc + or dec - // len(x)-1
diff_sign = np.diff(np.sign(diff)) # nonzero where the direction changes, 0 otherwise // len(x)-2
zero_crossings = np.where(diff_sign)[0] + 2 # get indices (-2 from len(x)-2) where a zero crossing occurs
diff_zero_crossings = np.diff(np.concatenate([[0],zero_crossings,[len(x)]])) # get how long the periods are till next zero crossing
for i in diff_zero_crossings:
    if i >= 6:
        for _ in range(i):
            col.append("r")
    else:
        for _ in range(i):
            col.append("b")
for i in range(len(x)):
    # plotting the corresponding x with y
    # and respective color
    plt.scatter(y[i], x[i], c = col[i], s = 10,
                linewidth = 0)
plt.show()
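If you would rather avoid the diff bookkeeping, a plain loop that tracks the current run can produce the color list directly. This is only a sketch: the name run_colors and its parameters are made up here, and run_length counts points in a strictly rising or falling run, which is an assumption about how you define a run of 7.

```python
def run_colors(values, run_length=7, hit="red", miss="blue"):
    """Colour every point that belongs to a strictly monotonic run of
    at least `run_length` points (hypothetical helper)."""
    colors = [miss] * len(values)
    start = 0      # index where the current run begins
    direction = 0  # +1 rising, -1 falling, 0 no direction yet
    for i in range(1, len(values) + 1):
        if i < len(values):
            # int() keeps this working for NumPy values as well as plain numbers.
            step = int(values[i] > values[i - 1]) - int(values[i] < values[i - 1])
        else:
            step = None  # sentinel that closes the final run
        if step == direction and step != 0:
            continue  # the run keeps going
        # The run covering indexes start..i-1 has ended; colour it if long enough.
        if direction != 0 and i - start >= run_length:
            colors[start:i] = [hit] * (i - start)
        if step is None:
            break
        # A turning point is shared by both runs; equal neighbours reset the run.
        start = i - 1 if step != 0 else i
        direction = step
    return colors


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 4, 6, 4, 6, 8]
print(run_colors(x))
```

The loop never reads past the last element, so no index error can occur, and the returned list has one entry per point, ready to pass to the plotting loop as col.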
To determine if all integer elements of a list are ascending, you could do this:-
def ascending(arr):
    _rv = True
    for i in range(len(arr) - 1):
        if arr[i + 1] <= arr[i]:
            _rv = False
            break
    return _rv
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 10, 11, 12, 13, 14, 16]
a2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16]
print(ascending(a1))
print(ascending(a2))
If you want to limit the sequence of ascending values then you could just use nested loops. It may look inelegant but it's surprisingly efficient and much simpler than bringing dataframes into the mix:-
def ascending(arr, seq):
    for i in range(len(arr) - seq + 1):
        state = True
        for j in range(i, i + seq - 1):
            if arr[j] >= arr[j + 1]:
                state = False
                break
        if state:
            return True
    return False
a1 = [100, 99, 98, 6, 7, 8, 10, 11, 12, 13, 14, 13]
a2 = [9, 8, 7, 6, 5, 4, 3, 2, 1]
print(ascending(a1, 7))
print(ascending(a2, 7))
I am trying to implement a crossover function. It should work this way:
It takes two lists.
It marks some indexes, preferably a few in the center.
Both parents swap the elements at the marked indexes.
The other indexes go sequentially to their parent element.
If the same element is already present in that parent, it maps and checks where the same element was in the other parent and goes there.
import random
def pm(indA, indB):
    size = min(len(indA), len(indB))
    c1, c2 = [0] * size, [0] * size
    # Initialize the position of each indices in the individuals
    for i in range(1,size):
        c1[indA[i]] = i
        c2[indB[i]] = i
    crosspoint1 = random.randint(0, size)
    crosspoint2 = random.randint(0, size - 1)
    if crosspoint2 >= crosspoint1:
        crosspoint2 += 1
    else: # Swap the two cx points
        crosspoint1, crosspointt2 = crosspoint2, crosspoint1
    for i in range(crosspoint1, crosspoint2):
        # Keep track of the selected values
        temp1 = indA[i]
        temp2 = indB[i]
        # Swap the matched value
        indA[i], indA[c1[temp2]] = temp2, temp1
        indB[i], indB[c2[temp1]] = temp1, temp2
        # Position bookkeeping
        c1[temp1], c1[temp2] = c1[temp2], c1[temp1]
        c2[temp1], c2[temp2] = c2[temp2], c2[temp1]
    return indA, indB
a,b = pm([3, 4, 8, 2, 7, 1, 6, 5],[4, 2, 5, 1, 6, 8, 3, 7])
Error:
in pm
c1[indA[i]] = i
IndexError: list assignment index out of range
Not sure whether there are other errors in your code (I didn't run it), but here's the explanation for this one. In Python (as in most other languages), list (more precisely, sequence) indexes are 0-based:
>>> l = [1, 2, 3, 4, 5, 6]
>>>
>>> for e in l:
...     print(e, l.index(e))
...
1 0
2 1
3 2
4 3
5 4
6 5
>>>
>>> l[0]
1
>>> l[5]
6
>>> l[6]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
To summarize your problem:
1. Your indA and indB lists each have 6 elements ([1..6]), with indexes [0..5]
2. Your c1 and c2 lists also have 6 elements (indexes also [0..5])
3. But you're using values from #1 as indexes into the lists from #2, and the value 6 is a problem, as there's no such index
To fix your problem, you should use valid index values. Either:
Have the proper values in indA and indB (this is the one I'd choose):
a, b = pmxCrossover([0, 3, 1, 2, 5, 4], [4, 0, 2, 3, 5, 1])
Subtract 1 wherever you encounter values from indA or indB used as indexes:
c1[indA[i] - 1] = i
As general advice: whenever you encounter errors, add print statements before the faulty line (printing parts of it), and that might give you clues that could lead to solving the problem yourself.
#EDIT0
Posting (a slightly modified version of) the original code, with the index conversion:
Before the algorithm: subtract 1 (from each element) to have valid indexes
After the algorithm: add 1 to come back to 1 based indexes
code00.py:
#!/usr/bin/env python3

import sys
import random


def pmx_crossover(ind_a, ind_b):
    size = min(len(ind_a), len(ind_b))
    c1, c2 = [0] * size, [0] * size

    # Initialize the position of each indices in the individuals
    # (position 0 is already covered by the zero-filled lists)
    for i in range(1, size):
        c1[ind_a[i]] = i
        c2[ind_b[i]] = i
    # Choose crossover points
    crosspoint1 = random.randint(0, size)
    crosspoint2 = random.randint(0, size - 1)
    if crosspoint2 >= crosspoint1:
        crosspoint2 += 1
    else:  # Swap the two cx points
        crosspoint1, crosspoint2 = crosspoint2, crosspoint1

    # Apply crossover between cx points
    for i in range(crosspoint1, crosspoint2):
        # Keep track of the selected values
        temp1 = ind_a[i]
        temp2 = ind_b[i]
        # Swap the matched value
        ind_a[i], ind_a[c1[temp2]] = temp2, temp1
        ind_b[i], ind_b[c2[temp1]] = temp1, temp2
        # Position bookkeeping
        c1[temp1], c1[temp2] = c1[temp2], c1[temp1]
        c2[temp1], c2[temp2] = c2[temp2], c2[temp1]
    return ind_a, ind_b


def main():
    #initial_a, initial_b = [1, 2, 3, 4, 5, 6, 7, 8], [3, 7, 5, 1, 6, 8, 2, 4]
    initial_a, initial_b = [1, 4, 2, 3, 6, 5], [5, 1, 3, 4, 6, 2]
    index_offset = 1
    temp_a = [i - index_offset for i in initial_a]
    temp_b = [i - index_offset for i in initial_b]
    a, b = pmx_crossover(temp_a, temp_b)
    final_a = [i + index_offset for i in a]
    final_b = [i + index_offset for i in b]
    print("Initial: {0:}, {1:}".format(initial_a, initial_b))
    print("Final: {0:}, {1:}".format(final_a, final_b))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main()
    print("\nDone.")
Output (one of the possibilities (due to random.randint)):
[cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q058424002]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code00.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] 64bit on win32
Initial: [1, 4, 2, 3, 6, 5], [5, 1, 3, 4, 6, 2]
Final: [1, 3, 2, 4, 6, 5], [5, 1, 4, 3, 6, 2]
Done.
c1 is indexed out of range because, in your for loop, at index 4 the value of indA[4] is 6.
The valid indexes of c1 are 0-5 (its length is 6).
With c1[indA[i]] = i
you try to do c1[6] = 4
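Another way to sidestep the off-by-one issue is to keep the position lookup in a dict keyed by value, so the values no longer have to be valid list indexes. The sketch below is an assumption-laden variant, not the original function: the name pmx and the explicit start/stop parameters (in place of random crossover points) are introduced here so the behavior is reproducible, and it works on copies rather than modifying the parents.

```python
def pmx(parent_a, parent_b, start, stop):
    """Partially matched crossover on copies of two permutations.

    start/stop are passed in explicitly (hypothetical parameters)
    instead of being drawn at random.
    """
    a, b = list(parent_a), list(parent_b)
    # value -> current position; works for 1-based or arbitrary hashable values
    pos_a = {value: i for i, value in enumerate(a)}
    pos_b = {value: i for i, value in enumerate(b)}
    for i in range(start, stop):
        va, vb = a[i], b[i]
        # In a: put vb at i and move va to where vb used to be.
        j = pos_a[vb]
        a[i], a[j] = vb, va
        pos_a[va], pos_a[vb] = j, i
        # In b: put va at i and move vb to where va used to be.
        k = pos_b[va]
        b[i], b[k] = va, vb
        pos_b[vb], pos_b[va] = k, i
    return a, b


child_a, child_b = pmx([3, 4, 8, 2, 7, 1, 6, 5], [4, 2, 5, 1, 6, 8, 3, 7], 3, 6)
print(child_a, child_b)
```

Because every step is a swap within one list, both children stay permutations of the same values as their parents.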
import math
def mean(values):
    return sum(values)*1.0/len(values)
def std():
    pass
print(std())
def std(values):
    length = len(values)
    if length < 2:
        return("Standard deviation requires at least two data points")
    m = mean(values)
    total_sum = 0
    for i in range(length):
        total_sum += (values[i]-m)**2
    under_root = total_sum*1.0/length
    return math.sqrt(under_root)
vals = [5]
stan_dev = std(vals)
print(stan_dev)
values = [1, 2, 3, 4, 5]
stan_dev = std(values)
print(stan_dev)
__________________________________________________________________________
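from functools import reduce
lst = [3, 19, 21, 1435, 653342]
total = reduce((lambda x, y: x + y), lst)  # renamed from sum to avoid shadowing the built-in
print(total)
# list = [3, 19, 21, 1435, 653342]
I need to be able to get the standard deviation without using sum or len.
I need to 'unpack' the stDev calculation somehow.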
You can do it with two loops (there are shorter ways but this is simple):
arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Calculate the mean first
N, X = 0, 0
for xi in arr:
    N += 1
    X += xi
mean = X/N
# Calculate the standard deviation
DSS = 0
for xi in arr:
    DSS += (xi - mean)**2
std = (DSS/N)**(1/2)
Outputs 4.5 for mean and 2.872 for std.
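As one of the shorter ways, here is a sketch of a single-pass version based on Welford's running-update method, a different technique from the two loops above. The function name mean_and_std is made up for this example. Like the code above, it computes the population standard deviation (dividing by N) and uses neither sum nor len.

```python
def mean_and_std(values):
    """Single pass over the data: running mean and running sum of squared deviations."""
    count = 0
    mean = 0.0
    m2 = 0.0  # accumulated squared distance from the running mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count == 0:
        raise ValueError("need at least one value")
    return mean, (m2 / count) ** 0.5


mean, std = mean_and_std([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(mean, round(std, 3))
# 4.5 2.872
```

Since it never needs the length up front, this also works on generators and other iterables that can only be read once.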