Calculating and plotting count ratios with Pandas


I have multidimensional data in a pandas DataFrame, with one variable indicating the class. For example, here is my attempt at a poor man's heatmap using a scatter plot:
import pandas as pd
import random
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.cm import get_cmap
nrows=1000
df=pd.DataFrame([[random.random(), random.random()]+[random.randint(0, 1)] for _ in range(nrows)],
                columns=list("ABC"))
bins=np.linspace(0, 1, 20)
df["Abin"]=[bins[i-1] for i in np.digitize(df.A, bins)]
df["Bbin"]=[bins[i-1] for i in np.digitize(df.B, bins)]
# .ix was removed from pandas; .loc does the same label-based selection here
g=df.loc[:,["Abin", "Bbin"]+["C"]].groupby(["Abin", "Bbin"])
data=g.agg(["sum", "count"])
data.reset_index(inplace=True)
data["classratio"]=data[("C", "sum")]/data[("C","count")]
plt.scatter(data.Abin, data.Bbin, c=data.classratio, cmap=get_cmap("RdYlGn_r"), marker="s")
I'd like to plot class densities over binned features. So far I have used np.digitize for binning and a rather complicated hand-made density calculation in Python to plot a heatmap.
Surely, this can be done more compactly with Pandas (pivot?)? Do you know a neat way to bin the two features (for example 10 bins on the interval 0...1) and then plot a class density heatmap where color indicates the ratio of 1's to total rows within this 2D-bin?

Yep, it can be done very concisely using the built-in cut function:
In [65]:
nrows=1000
df=pd.DataFrame([[random.random(), random.random()]+[random.randint(0, 1)] for _ in range(nrows)],
                columns=list("ABC"))
In [66]:
#This does the trick.
pd.crosstab(np.array(pd.cut(df.A, 20)), np.array(pd.cut(df.B, 20))).values
Out[66]:
array([[2, 2, 2, 2, 7, 2, 3, 5, 1, 4, 2, 2, 1, 3, 2, 1, 7, 2, 4, 2],
[1, 2, 4, 2, 0, 3, 3, 3, 1, 1, 2, 1, 4, 3, 2, 1, 1, 2, 2, 1],
[0, 4, 1, 3, 1, 3, 2, 5, 2, 3, 1, 1, 1, 4, 2, 3, 6, 5, 2, 2],
[5, 2, 3, 2, 2, 1, 3, 2, 4, 0, 3, 2, 0, 4, 3, 2, 1, 3, 1, 3],
[2, 2, 4, 1, 3, 2, 2, 4, 1, 4, 3, 5, 5, 2, 3, 3, 0, 2, 4, 0],
[2, 3, 3, 5, 2, 0, 5, 3, 2, 3, 1, 2, 5, 4, 4, 3, 4, 3, 6, 4],
[3, 2, 2, 4, 3, 3, 2, 0, 0, 4, 3, 2, 2, 5, 4, 0, 1, 2, 2, 3],
[0, 0, 4, 4, 3, 2, 4, 6, 4, 2, 0, 5, 2, 2, 1, 3, 4, 4, 3, 2],
[3, 2, 2, 3, 4, 2, 1, 3, 1, 3, 4, 2, 4, 3, 2, 3, 2, 3, 4, 4],
[0, 1, 1, 4, 1, 4, 3, 0, 1, 1, 1, 2, 6, 4, 3, 5, 3, 3, 1, 4],
[2, 2, 4, 1, 3, 4, 1, 2, 1, 3, 3, 3, 1, 2, 1, 5, 2, 1, 4, 3],
[0, 0, 0, 4, 2, 0, 2, 3, 2, 2, 2, 4, 4, 2, 3, 2, 1, 2, 1, 0],
[3, 3, 0, 3, 1, 5, 1, 1, 2, 5, 6, 5, 0, 0, 3, 2, 1, 5, 7, 2],
[3, 3, 2, 1, 2, 2, 2, 2, 4, 0, 1, 3, 3, 1, 5, 6, 1, 3, 2, 2],
[3, 0, 3, 4, 3, 2, 1, 4, 2, 3, 4, 0, 5, 3, 2, 2, 4, 3, 0, 2],
[0, 3, 2, 2, 1, 5, 1, 4, 3, 1, 2, 2, 3, 5, 1, 2, 2, 2, 1, 2],
[1, 3, 2, 1, 1, 4, 4, 3, 2, 2, 5, 5, 1, 0, 1, 0, 4, 3, 3, 2],
[2, 2, 2, 1, 1, 3, 1, 6, 5, 2, 5, 2, 3, 4, 2, 2, 1, 1, 4, 0],
[3, 3, 4, 7, 0, 2, 6, 4, 1, 3, 4, 4, 1, 4, 1, 1, 2, 1, 3, 2],
[3, 6, 3, 4, 1, 3, 1, 3, 3, 1, 6, 2, 2, 2, 1, 1, 4, 4, 0, 4]])
In [67]:
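abins=np.linspace(df.A.min(), df.A.max(), 21)
bbins=np.linspace(df.B.min(), df.B.max(), 21)
# Rows with C==1 per 2D bin, divided by all rows per 2D bin.
# (crosstab rejects aggfunc without values, so plain counts are used.)
ones=pd.crosstab(np.array(pd.cut(df.loc[df.C==1, 'A'], abins)),
                 np.array(pd.cut(df.loc[df.C==1, 'B'], bbins)))
total=pd.crosstab(np.array(pd.cut(df.A, abins)),
                  np.array(pd.cut(df.B, bbins)))
# Transpose so that rows follow y (B) and columns follow x (A).
Z=ones.div(total).values.T
# Mask bins where the ratio is undefined (NaN or inf).
Z = np.ma.masked_invalid(Z)
x=np.linspace(df.A.min(), df.A.max(), 20)
y=np.linspace(df.B.min(), df.B.max(), 20)
X,Y=np.meshgrid(x, y)
plt.contourf(X, Y, Z, vmin=0, vmax=1)
plt.colorbar()
# Alternative rendering of the same data in a separate figure:
plt.figure()
plt.pcolormesh(X, Y, Z, vmin=0, vmax=1)
plt.colorbar()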
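For the ratio itself, a more compact route may be to group by the two binned columns and take the mean of C, since the mean of a 0/1 column is the share of ones. This is a minimal sketch, assuming 10 equal-width bins on [0, 1]; the seeded NumPy generator and the name ratio are choices made for this illustration only:

```python
import numpy as np
import pandas as pd

# Sample data in the same shape as the question: two features and a 0/1 class.
rng = np.random.default_rng(0)
nrows = 1000
df = pd.DataFrame({
    "A": rng.random(nrows),
    "B": rng.random(nrows),
    "C": rng.integers(0, 2, nrows),
})

# 11 edges give 10 bins on [0, 1]; include_lowest keeps a value of exactly 0.
bins = np.linspace(0, 1, 11)
a_bin = pd.cut(df.A, bins, include_lowest=True)
b_bin = pd.cut(df.B, bins, include_lowest=True)

# Mean of C per (A bin, B bin) is the ratio of ones to all rows in that bin.
# observed=False keeps empty bins, which show up as NaN.
ratio = df.groupby([a_bin, b_bin], observed=False)["C"].mean().unstack()
print(ratio.shape)
```

The rows of ratio are the A bins and the columns are the B bins, so the table would need to be transposed before plotting with A on the x axis.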

Related

Randomize list without same entry successively

order_list_raw = []
for i in range(1, 73):
    order_list_raw.append(1)
    order_list_raw.append(2)
    order_list_raw.append(3)
How can I create the same list in a randomized order, but without the same entry appearing twice in a row (e.g. "1, 3, 2" is okay but "1, 1, 3" is not)?
For randomization I would create a new list like this:
order_list = random.sample(order_list_raw, len(order_list_raw))

A solution would be:
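import random
result = []
for i in range(72):
    options = [1, 2, 3]
    try:
        # Drop the previous value so it cannot be chosen again.
        last_item = result[-1]
        options.remove(last_item)
    except IndexError:
        # First iteration: the list is still empty.
        pass
    result.append(random.choice(options))
print(result)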
Output:
[1, 3, 2, 1, 2, 3, 1, 2, 3, 2, 1, 3, 2, 3, 2, 1, 2, 1, 2, 3, 1, 2, 1, 3, 1, 2, 3, 2, 3, 2, 3, 2, 1, 2, 3, 1, 2, 3, 2, 1, 2, 1, 3, 2, 3, 2, 3, 2, 1, 2, 3, 2, 3, 1, 3, 2, 1, 3, 1, 3, 1, 3, 1, 2, 3, 2, 1, 3, 1, 2, 1, 3]
Here we simply take our options, check what the last value in the list is and delete that value from the options. Then we take a random value from the left over options, and append it to the list.
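Note that choosing each element independently does not keep the counts of the original list. If every value must appear exactly as often as in order_list_raw, one possible sketch is to draw from the remaining items, weighted by how many are left, and to start over when a dead end is reached. The function name shuffle_no_repeats and the max_tries parameter are made up for this illustration:

```python
import random
from collections import Counter

def shuffle_no_repeats(items, max_tries=1000):
    """Return a random ordering of items with no equal neighbours."""
    for _ in range(max_tries):
        remaining = Counter(items)
        result = []
        while remaining:
            # Values still available that differ from the previous pick.
            candidates = [v for v in remaining if not result or v != result[-1]]
            if not candidates:
                break  # dead end: only the previous value is left, retry
            weights = [remaining[v] for v in candidates]
            choice = random.choices(candidates, weights=weights)[0]
            result.append(choice)
            remaining[choice] -= 1
            if remaining[choice] == 0:
                del remaining[choice]
        else:
            # The while loop ended without a break: every item was placed.
            return result
    raise ValueError("no valid ordering found within max_tries attempts")

order_list = shuffle_no_repeats([1, 2, 3] * 72)
print(order_list)
```

The retry loop keeps the code short; for inputs where one value dominates, a valid ordering may not exist at all, in which case the function raises after max_tries attempts.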

If you want to generate the input data randomly, you can use this solution. Note that a draw equal to the previous element is skipped, so the result can have fewer than 73 entries.
import random
b=[]
for i in range(0,73):
    x=random.randint(1,10)
    if len(b)==0 or b[-1]!=x:
        b.append(x)
print(b)
Output :
[6, 2, 3, 5, 6, 5, 3, 8, 1, 5, 4, 9, 4, 9, 8, 6, 9, 2, 1, 5, 8, 6, 1, 9, 6, 9, 3, 6, 5, 7, 9, 1, 9, 5, 9, 3, 4, 3, 7, 8, 3, 4, 5, 9, 1, 4, 9, 2, 1, 5, 7, 1, 10, 2, 4, 2, 1, 7, 1, 5, 4, 1, 2]
If your input data is fixed instead, you can try the following solution:
a=[1,1,4]
b=[]
c=[[b.append(i) for i in a if len(b)==0 or b[-1]!=i]for j in range(0,100)]
print(b)
Output :
[1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4]

selecting certain indices in Numpy ndarray using another array

I'm trying to mark the value and indices of max values in a 3D array, getting the max in the third axis.
Now this would have been obvious in a lower dimension:
argmaxes=np.argmax(array)
maximums=array[argmaxes]
but NumPy doesn't understand the second syntax properly for higher than 1D.
Let's say my 3D array has shape (8,8,250). argmaxes=np.argmax(array,axis=-1) would return an (8,8) array with numbers from 0 to 249. My expected output is an (8,8) array containing the maximum along the third dimension. I can achieve this with maxes=np.max(array,axis=-1), but that repeats the same calculation (I need both the values and the indices for later calculations).
I can also just do a crude nested loop:
for i in range(8):
    for j in range(8):
        maxes[i,j]=array[i,j,argmaxes[i,j]]
But is there a nicer way to do this?

You can use advanced indexing. This is a simpler case where the shape is (8,8,3):
import numpy as np
arr = np.random.randint(99, size=(8,8,3))
x, y = np.indices(arr.shape[:-1])
arr[x, y, np.argmax(arr,axis=-1)]
Sample run:
>>> x
array([[0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7]])
>>> y
array([[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7]])
>>> np.argmax(arr,axis=-1)
array([[2, 1, 1, 2, 0, 0, 0, 1],
[2, 2, 2, 1, 0, 0, 1, 0],
[1, 2, 0, 1, 1, 1, 2, 0],
[1, 0, 0, 0, 2, 1, 1, 0],
[2, 0, 1, 2, 2, 2, 1, 0],
[2, 2, 0, 1, 1, 0, 2, 2],
[1, 1, 0, 1, 1, 2, 1, 0],
[2, 1, 1, 1, 0, 0, 2, 1]], dtype=int64)
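As an alternative sketch, np.take_along_axis (available in NumPy 1.15 and later) can pick the values at the argmax positions without building the x and y index grids. The seeded generator and the (8,8,250) shape are taken as assumptions from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(0, 99, size=(8, 8, 250))

# Index of the maximum along the last axis, shape (8, 8).
argmaxes = np.argmax(arr, axis=-1)

# take_along_axis needs the index array to have the same number of
# dimensions as arr, so add a trailing axis and drop it again afterwards.
maxes = np.take_along_axis(arr, argmaxes[..., np.newaxis], axis=-1).squeeze(axis=-1)
print(maxes.shape)
```

This way the maximum is searched only once, and both argmaxes and maxes are available for later calculations.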

How can I vectorize this for loop below, where I need to set values to a range I need to round?

I have a np.array q with some values, for example [1,3,5,7], and a np.array z with some values that I need to round; the rounded values are then used as indices into a third array, mapping.
import numpy as np
q = [1,3,5,7]
z = [0,50.3,240.4,252.9,256]
mapping = np.zeros(256)
for i in range(len(q)):
    print(i)
    start, end = int(round(z[i])), int(round(z[i + 1]))
    mapping[start:end] = int(round(q[i]))
print(mapping)
The output here is:

Here's my approach:
repeats = np.diff(list(np.round(z))+ [256]).astype(int)
# repeats = array([ 49, 191, 12, 3])
np.repeat(np.round(q), repeats)
Output:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 7])
Note: this only has 255 elements and it's different from your expected output, because, tbh I don't really understand your logic.
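With the z listed in the question, which already starts at 0 and ends at 256, the rounded edges can be differenced directly, without appending another 256. A minimal sketch under that assumption (the name edges is introduced here for illustration):

```python
import numpy as np

q = [1, 3, 5, 7]
z = [0, 50.3, 240.4, 252.9, 256]

# Rounded bin edges: [0, 50, 240, 253, 256]
edges = np.round(z).astype(int)

# Length of each segment: [50, 190, 13, 3], which sums to 256
repeats = np.diff(edges)

# Repeat each rounded q value over its segment.
mapping = np.repeat(np.round(q), repeats)
print(len(mapping))
```

If z did not start at 0 or end at the array length, the uncovered ranges would need to be filled separately, as the zeros in the loop version are.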

COByLA does not terminate

I have a call to COBYLA that does not terminate.
I want to find a local minimum of a (multivariate) polynomial in a given orthant.
The smallest example I could produce is the following.
import numpy as np
import scipy.optimize
A = np.array([[ 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 6, 12, 4, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3, 1, 2],
[ 0, 0, 2, 2, 2, 4, 10, 0, 4, 4, 12, 4, 0, 2, 4, 0, 3, 4, 3, 3, 2, 3, 3, 4, 3, 2, 3, 3, 4, 3],
[ 0, 4, 0, 6, 10, 10, 4, 4, 4, 8, 2, 0, 4, 2, 4, 2, 4, 4, 3, 4, 3, 5, 3, 4, 4, 4, 3, 4, 4, 4],
[ 0, 0, 6, 0, 0, 6, 2, 12, 10, 0, 2, 8, 0, 8, 4, 2, 5, 3, 5, 3, 3, 4, 4, 4, 2, 3, 4, 4, 3, 4]])
b = np.array([ 3.81330727e+00, 1.30927853e+00, 1.89829563e+00, 1.55301205e+00, 2.05509780e+00, 4.72913144e+00, 8.64125139e+00, 6.78452109e+00, 1.97505381e+01, 8.10184002e+00, 8.56817472e+00, 1.76581791e+00, 6.90448362e+00, 8.44460914e-02, 1.52023325e+00, -1.97710183e+00, -1.66933212e-01, -2.71655065e-01, -2.03262146e+00, -6.74143747e-01, -1.53382538e+00, -9.94362458e-01, 1.86147837e-01, -6.23838626e-01, 1.04835921e+00, 3.49272629e-01, -6.47927068e-01, -4.69780766e-01, 1.48099164e-02, 3.61251102e-01])
x0 = np.array([ 3.75422451, -4.13253284, -46.27451838, -29.48396097])
def f(x):
    return np.dot(np.prod(np.power(x,A.T),axis = 1),b)
res = scipy.optimize.fmin_cobyla(f, x0, lambda x: x*np.array([1,-1,-1,-1]), disp = 3)
Then the last line of code does not terminate.
Even with maximum display level, I do not get a single line of output.
Worse, Ctrl+C does not terminate the computation in IPython (I assume, the code is stuck in Fortran).
How can I avoid this problem?

I believe there is an issue in the way you have described the constraints.
I tried the following two forms of the code, and they worked:
import numpy as np
import scipy.optimize
A = np.array([[ 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 6, 12, 4, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3, 1, 2],
[ 0, 0, 2, 2, 2, 4, 10, 0, 4, 4, 12, 4, 0, 2, 4, 0, 3, 4, 3, 3, 2, 3, 3, 4, 3, 2, 3, 3, 4, 3],
[ 0, 4, 0, 6, 10, 10, 4, 4, 4, 8, 2, 0, 4, 2, 4, 2, 4, 4, 3, 4, 3, 5, 3, 4, 4, 4, 3, 4, 4, 4],
[ 0, 0, 6, 0, 0, 6, 2, 12, 10, 0, 2, 8, 0, 8, 4, 2, 5, 3, 5, 3, 3, 4, 4, 4, 2, 3, 4, 4, 3, 4]])
b = np.array([ 3.81330727e+00, 1.30927853e+00, 1.89829563e+00, 1.55301205e+00, 2.05509780e+00, 4.72913144e+00, 8.64125139e+00, \
6.78452109e+00, 1.97505381e+01, 8.10184002e+00, 8.56817472e+00, 1.76581791e+00, 6.90448362e+00, 8.44460914e-02, 1.52023325e+00, \
-1.97710183e+00, -1.66933212e-01, -2.71655065e-01, -2.03262146e+00, -6.74143747e-01, -1.53382538e+00, -9.94362458e-01, 1.86147837e-01, \
-6.23838626e-01, 1.04835921e+00, 3.49272629e-01, -6.47927068e-01, -4.69780766e-01, 1.48099164e-02, 3.61251102e-01])
x0 = np.array([ 3.75422451, -4.13253284, -46.27451838, -29.48396097])
def fun(x):
    return np.dot(np.prod(np.power(x,A.T),axis = 1),b)
def constraint_func(x_in):
    factor = np.array([1,-1,-1,-1])
    constraints_list = []
    for i in range(len(x_in)):
        # Bind i as a default argument so each lambda keeps its own index.
        constraints_list.append(lambda x, i=i: x[i]*factor[i])
    return constraints_list
res = scipy.optimize.fmin_cobyla(fun, x0, constraint_func(x0), disp = 3)
Another way, using scipy.optimize.minimize:
import numpy as np
import scipy.optimize
A = np.array([[ 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 6, 12, 4, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3, 1, 2],
[ 0, 0, 2, 2, 2, 4, 10, 0, 4, 4, 12, 4, 0, 2, 4, 0, 3, 4, 3, 3, 2, 3, 3, 4, 3, 2, 3, 3, 4, 3],
[ 0, 4, 0, 6, 10, 10, 4, 4, 4, 8, 2, 0, 4, 2, 4, 2, 4, 4, 3, 4, 3, 5, 3, 4, 4, 4, 3, 4, 4, 4],
[ 0, 0, 6, 0, 0, 6, 2, 12, 10, 0, 2, 8, 0, 8, 4, 2, 5, 3, 5, 3, 3, 4, 4, 4, 2, 3, 4, 4, 3, 4]])
b = np.array([ 3.81330727e+00, 1.30927853e+00, 1.89829563e+00, 1.55301205e+00, 2.05509780e+00, 4.72913144e+00, 8.64125139e+00, \
6.78452109e+00, 1.97505381e+01, 8.10184002e+00, 8.56817472e+00, 1.76581791e+00, 6.90448362e+00, 8.44460914e-02, 1.52023325e+00, \
-1.97710183e+00, -1.66933212e-01, -2.71655065e-01, -2.03262146e+00, -6.74143747e-01, -1.53382538e+00, -9.94362458e-01, 1.86147837e-01, \
-6.23838626e-01, 1.04835921e+00, 3.49272629e-01, -6.47927068e-01, -4.69780766e-01, 1.48099164e-02, 3.61251102e-01])
x0 = np.array([ 3.75422451, -4.13253284, -46.27451838, -29.48396097])
def fun(x):
    return np.dot(np.prod(np.power(x,A.T),axis = 1),b)
def constraint_func(x_in):
    factor = np.array([1,-1,-1,-1])
    constraints_list = []
    for i in range(len(x_in)):
        # Bind i as a default argument so each lambda keeps its own index.
        constraints_list.append({'type': 'ineq', 'fun': lambda x, i=i: x[i]*factor[i]})
    return constraints_list
constraints = constraint_func(x0)
res = scipy.optimize.minimize(fun, x0, method='COBYLA', constraints= constraints)
print(res)
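When constraint lambdas are created in a loop, as in the code above, it is worth checking how the loop index is captured. Python closures look up the variable when the lambda is called, not when it is defined, so every lambda can end up using the last index. A small standalone sketch of the effect (the names late and bound are made up for this illustration):

```python
factor = [1, -1, -1, -1]

# Each lambda refers to the variable i itself, which is 3 once the loop ends.
late = [lambda x: x[i] * factor[i] for i in range(4)]

# A default argument stores the value of i at definition time.
bound = [lambda x, i=i: x[i] * factor[i] for i in range(4)]

x = [10, 20, 30, 40]
late_values = [c(x) for c in late]
bound_values = [c(x) for c in bound]
print(late_values)   # [-40, -40, -40, -40]
print(bound_values)  # [10, -20, -30, -40]
```

With late binding, all four constraints would restrict only the last coordinate, so the other sign constraints would be silently missing.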

Creating a list from data within another list

I have created a list
a=[1,2,3,4,5]*100
I now need to create another list that will contain the first 8 prime number locations from within a.
I have tried these two lines of code and they didn't work
b=a[2:3:5:7:11:13:17:19]
a[2:3:5:7:11:13:17:19]=b
List a is [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...] (the pattern 1 to 5 repeated 100 times), so I need the elements at locations 2, 3, 5, 7, 11, 13, 17, 19 of that list.

a=[1,2,3,4,5]*100
indices = [2,3,5,7,11,13,17,19]
b = []
for i in indices:
    b.append(a[i])
print(b)
You have to access each element individually. b=a[2:3:5:7:11:13:17:19] is not syntactically valid Python; a slice takes at most a start, a stop and a step, so it cannot select arbitrary indices.
A more Pythonic (and shorter) way to do the same thing is a list comprehension:
indices = [2,3,5,7,11,13,17,19]
b = [a[i] for i in indices]
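Another option from the standard library is operator.itemgetter, which builds a callable that fetches several positions at once. A minimal sketch using the same a and indices:

```python
from operator import itemgetter

a = [1, 2, 3, 4, 5] * 100
indices = [2, 3, 5, 7, 11, 13, 17, 19]

# itemgetter(*indices) returns a tuple of the items at those positions.
b = list(itemgetter(*indices)(a))
print(b)  # [3, 4, 1, 3, 2, 4, 3, 5]
```

The list comprehension is usually easier to read; itemgetter is mainly handy when the same set of positions is applied to many lists.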

I would try it like this using a list comprehension (beware: the test_prime function is not optimized at all):
def test_prime(n):
    if (n==1):
        return False
    elif (n==2):
        return True
    else:
        for x in range(2,n):
            if(n % x==0):
                return False
        return True
a=[1,2,3,4,5]*100
# Indices of a whose values are prime.
b = [item for item in range(len(a)) if test_prime(a[item])]
b = b[0:8]
print(b)
which outputs (note that Python counts from 0, so the first element of a list has index 0 and not 1):
[1, 2, 4, 6, 7, 9, 11, 12]
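The code above selects positions whose values are prime. If the positions themselves should be the first eight primes instead, as the 2, 3, 5, ... in the question suggests, they can be generated rather than typed in. A small sketch under that reading; first_primes is a helper name introduced here:

```python
def first_primes(count):
    """Return the first `count` prime numbers by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # A number is prime if no smaller prime divides it.
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

a = [1, 2, 3, 4, 5] * 100
indices = first_primes(8)
b = [a[i] for i in indices]
print(indices)  # [2, 3, 5, 7, 11, 13, 17, 19]
print(b)        # [3, 4, 1, 3, 2, 4, 3, 5]
```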
