Creating a list from data within another list

I have created a list
a=[1,2,3,4,5]*100
I now need to create another list that contains the elements of a at the first 8 prime-numbered locations.
I have tried these two lines of code and they didn't work:
b=a[2:3:5:7:11:13:17:19]
a[2:3:5:7:11:13:17:19]=b
The output for list a is "[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...]" (the pattern 1 to 5 repeated 100 times, 500 elements in total), so I want the locations 2, 3, 5, 7, 11, 13, 17, 19 out of that output.

a=[1,2,3,4,5]*100
indices = [2,3,5,7,11,13,17,19]
b = []
for i in indices:
    b.append(a[i])
print(b)
You have to access each element individually. b=a[2:3:5:7:11:13:17:19] is not syntactically valid in Python: slice notation takes at most start:stop:step, so it cannot be used to pick elements at arbitrary indices.
A more Pythonic (and shorter) way to do the same thing is a list comprehension:
indices = [2,3,5,7,11,13,17,19]
b = [a[i] for i in indices]
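If you would rather not hard-code the prime indices, they can be computed first. This is only a rough sketch: the helper name first_primes is made up for illustration, and it uses plain trial division, which is fine for a handful of primes but not for large counts.

```python
def first_primes(count):
    """Return the first `count` prime numbers (simple trial division)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # candidate is prime if none of the smaller primes divides it
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


a = [1, 2, 3, 4, 5] * 100
indices = first_primes(8)        # [2, 3, 5, 7, 11, 13, 17, 19]
b = [a[i] for i in indices]
print(b)                         # [3, 4, 1, 3, 2, 4, 3, 5]
```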

I would try it like this using a list comprehension (beware: the test_prime function is not optimized at all):
def test_prime(n):
    if n == 1:
        return False
    elif n == 2:
        return True
    else:
        # Trial division by every number from 2 to n - 1
        for x in range(2, n):
            if n % x == 0:
                return False
        return True
a=[1,2,3,4,5]*100
# Indices of a whose value is prime, limited to the first 8
b = [item for item in range(len(a)) if test_prime(a[item])]
b = b[0:8]
print(b)
which outputs (note that Python indexes from 0, so the first element of a list has index 0, not 1):
[1, 2, 4, 6, 7, 9, 11, 12]
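Since the primality test above is described as not optimized, here is one possible cheaper variant, as a sketch only. It assumes Python 3.8+ for math.isqrt, and the name is_prime is just an illustrative choice. It checks divisors only up to the square root of n and skips even numbers.

```python
import math


def is_prime(n):
    """Trial division up to sqrt(n), checking odd divisors only."""
    if n < 2:
        return False
    if n < 4:
        return True            # 2 and 3 are prime
    if n % 2 == 0:
        return False
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


a = [1, 2, 3, 4, 5] * 100
# Indices whose value is prime, limited to the first 8
b = [i for i, value in enumerate(a) if is_prime(value)][:8]
print(b)                       # [1, 2, 4, 6, 7, 9, 11, 12]
```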

Related

Assess clusters stability for each cluster

I have clustered some data points twice and obtained four clusters (A=1, B=2, C=3, D=4) in both runs. I want to assess the overall stability of the clustering, but also assess each cluster individually (cluster A in the first result (A1) vs cluster A in the second result (A2), B1 vs B2, C1 vs C2, and D1 vs D2).
For the overall stability, I am using the adjusted Rand index (ARI) and have no problem. However, when I want to assess e.g. A1 vs A2, I don't really know how to proceed.
The clustering results are the following:
c1 <- c(1, 2, 3, 2, 1, 3, 4, 3, 2, 2, 3, 4, 3, 2, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4, 2, 3, 2, 3, 2, 1, 3, 4, 4, 4, 4, 3, 2, 3, 2, 3, 1, 3, 2, 1, 2, 3, 4, 3, 2, 1, 4, 3, 2, 2, 2, 3, 4, 3, 3, 3, 2, 1, 1, 1, 2)
c2 <- c(1, 2, 4, 4, 1, 3, 4, 2, 2, 2, 3, 4, 1, 2, 1, 2, 3, 4, 3, 2, 1, 2, 2, 4, 2, 3, 2, 3, 2, 1, 3, 3, 4, 3, 4, 3, 2, 3, 2, 3, 1, 1, 1, 1, 2, 3, 4, 3, 2, 1, 4, 3, 2, 2, 2, 3, 4, 3, 3, 3, 2, 1, 1, 1, 2)
Is there any good strategy to compare each pair of corresponding clusters (e.g. A1 vs A2)?
Suggestions that require R or python syntax are accepted.
Thanks in advance!
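One possible starting point, offered only as a sketch: compute a Jaccard index per label, i.e. the number of points assigned to that label in both runs divided by the number assigned to it in at least one run. This assumes label k in the first run really corresponds to label k in the second run, as stated in the question; the function name and the toy label vectors below are made up for illustration.

```python
def per_cluster_jaccard(labels_1, labels_2):
    """Map each cluster label to the Jaccard index of its members in two runs."""
    scores = {}
    for label in sorted(set(labels_1) | set(labels_2)):
        members_1 = {i for i, l in enumerate(labels_1) if l == label}
        members_2 = {i for i, l in enumerate(labels_2) if l == label}
        # Shared points divided by points that carry this label in either run
        scores[label] = len(members_1 & members_2) / len(members_1 | members_2)
    return scores


# Toy example (not the data from the question)
run_1 = [1, 1, 2, 2, 3, 3]
run_2 = [1, 2, 2, 2, 3, 1]
print(per_cluster_jaccard(run_1, run_2))
```

A value of 1 means the cluster has exactly the same members in both runs; values near 0 mean the two runs barely agree on that cluster.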

Generating random result based on previous result

I want to generate random results with weights that depend on the previous result.
I am currently generating weighted random results in the following manner:
import random
a = [1, 2, 3, 4, 5, 6]
results = random.choices(a, weights=(10, 20, 30, 30, 10, 6), k=100)
print(results[0:100])
What I want is: if results[n] == 1, the next result cannot be 6 (i.e. it must be between 1 and 5).
I am new to this site and to Python. Any help would be appreciated.

Using Kelly Bundy's suggestion from the comments:
import random
a = [1, 2, 3, 4, 5, 6]
choices = random.choices(a, weights=(10, 20, 30, 30, 10, 6), k=100)
# print(choices)
# Index into the list itself so that each check sees values that were already rerolled
for i in range(1, len(choices)):
    if choices[i - 1] == 1 and choices[i] == 6:
        # print(f'Rerolled {i}')
        choices[i] = random.choices(a, weights=(10, 20, 30, 30, 10, 0), k=1)[0]
print(choices)
Example Output:
[5, 4, 4, 1, 3, 4, 5, 6, 3, 4, 1, 2, 4, 2, 4, 3, 1, 3, 2, 3, 1, 2, 5, 4, 4, 3, 6, 4, 3, 4, 1, 5, 1, 3, 1, 3, 2, 1, 3, 4, 2, 4, 4, 1, 5, 2, 4, 4, 4, 6, 3, 3, 3, 2, 1, 4, 2, 2, 5, 4, 4, 2, 2, 4, 1, 3, 4, 4, 5, 4, 4, 4, 4, 3, 1, 4, 3, 4, 4, 4, 4, 4, 4, 2, 3, 5, 3, 2, 4, 4, 1, 2, 5, 6, 3, 4, 4, 6, 4, 3]
You could change the corresponding weight for 6 to 0 for the next choice if the previous choice was a 1:
import random
a = [1, 2, 3, 4, 5, 6]
choices = []
k = 100
i = 0
previous_choice_1 = False
while i < k:
    if previous_choice_1:
        # The previous result was 1, so 6 gets a weight of 0
        choice = random.choices(a, weights=(10, 20, 30, 30, 10, 0), k=1)[0]
    else:
        choice = random.choices(a, weights=(10, 20, 30, 30, 10, 6), k=1)[0]
    choices.append(choice)
    previous_choice_1 = choice == 1
    i += 1
print(choices)
Example Output:
[6, 3, 3, 6, 2, 3, 3, 1, 2, 3, 6, 1, 2, 2, 6, 4, 3, 3, 2, 2, 1, 2, 3, 4, 4, 3, 2, 5, 4, 4, 3, 3, 4, 3, 3, 3, 1, 4, 5, 1, 2, 4, 4, 2, 4, 3, 4, 3, 6, 4, 1, 5, 3, 4, 4, 2, 4, 3, 4, 3, 3, 4, 3, 2, 2, 2, 3, 4, 3, 2, 1, 4, 4, 3, 3, 1, 3, 4, 5, 4, 4, 3, 1, 2, 2, 6, 3, 2, 3, 3, 4, 2, 1, 2, 4, 3, 3, 3, 1, 4]
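If more rules of this kind are needed later, the same idea can be written with a lookup table from the previous result to the weights for the next draw. This is just a sketch; the names weights_after, default_weights and draw_sequence are invented for the example.

```python
import random

values = [1, 2, 3, 4, 5, 6]
default_weights = (10, 20, 30, 30, 10, 6)
# Hypothetical rule table: previous result -> weights for the next draw.
# After a 1, the weight for 6 is 0, so 6 is never chosen.
weights_after = {1: (10, 20, 30, 30, 10, 0)}


def draw_sequence(k):
    """Draw k weighted results, picking the weights from the previous result."""
    results = []
    previous = None
    for _ in range(k):
        weights = weights_after.get(previous, default_weights)
        previous = random.choices(values, weights=weights, k=1)[0]
        results.append(previous)
    return results


results = draw_sequence(100)
print(results)
```

Adding another rule, for example a different set of weights after a 6, would only need one more entry in weights_after.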

Randomize list without same entry successively

order_list_raw = []
for i in range(1, 73):
    order_list_raw.append(1)
    order_list_raw.append(2)
    order_list_raw.append(3)
How can I create the same list in a randomized order, but without the same entry appearing twice in a row (e.g. "1, 3, 2" is okay but "1, 1, 3" is not)?
For the randomization alone I would create a new list like this:
order_list = random.sample(order_list_raw, len(order_list_raw))

A solution would be:
import random
result = []
for i in range(72):
    options = [1, 2, 3]
    try:
        # Remove the previous entry from the options, if there is one
        last_item = result[-1]
        options.remove(last_item)
    except IndexError:
        pass
    result.append(random.choice(options))
print(result)
Output:
[1, 3, 2, 1, 2, 3, 1, 2, 3, 2, 1, 3, 2, 3, 2, 1, 2, 1, 2, 3, 1, 2, 1, 3, 1, 2, 3, 2, 3, 2, 3, 2, 1, 2, 3, 1, 2, 3, 2, 1, 2, 1, 3, 2, 3, 2, 3, 2, 1, 2, 3, 2, 3, 1, 3, 2, 1, 3, 1, 3, 1, 3, 1, 2, 3, 2, 1, 3, 1, 2, 1, 3]
Here we simply take our options, check what the last value in the list is and delete that value from the options. Then we take a random value from the left over options, and append it to the list.
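Note that the loop above draws every entry independently, so the result is not guaranteed to contain each value as often as order_list_raw does. If the result has to be a true reordering of the original list, one possible sketch is to keep track of the remaining counts and only allow picks that leave the rest arrangeable. The function names are invented for the example, and the sampling is random but not claimed to be uniform over all valid orderings.

```python
import random
from collections import Counter

_NO_PREVIOUS = object()  # sentinel: nothing has been placed yet


def _feasible(counts, last):
    """True if the remaining items can be laid out with no equal neighbours,
    given that the item placed just before them was `last`."""
    total = sum(counts.values())
    for value, n in counts.items():
        # The value just placed cannot come first, so it gets one slot fewer.
        limit = total // 2 if value == last else (total + 1) // 2
        if n > limit:
            return False
    return True


def shuffle_no_repeats(items):
    """Return a random reordering of items without equal adjacent entries."""
    counts = Counter(items)
    if not _feasible(counts, _NO_PREVIOUS):
        raise ValueError("no arrangement without equal neighbours exists")
    result = []
    last = _NO_PREVIOUS
    for _ in range(len(items)):
        candidates = []
        for value in counts:
            if value == last or counts[value] == 0:
                continue
            # Tentatively take the value and check the rest is still solvable
            counts[value] -= 1
            if _feasible(counts, value):
                candidates.append(value)
            counts[value] += 1
        # Prefer values that still have many copies left
        weights = [counts[v] for v in candidates]
        last = random.choices(candidates, weights=weights, k=1)[0]
        counts[last] -= 1
        result.append(last)
    return result


order_list_raw = [1, 2, 3] * 72
order_list = shuffle_no_repeats(order_list_raw)
print(order_list)
```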

If you want to generate the input data randomly, you can use this solution:
import random
b=[]
for i in range(0,73):
    x=random.randint(1,10)
    # Skip the draw if it would repeat the previous entry
    if len(b)==0 or b[-1]!=x:
        b.append(x)
print(b)
Output :
[6, 2, 3, 5, 6, 5, 3, 8, 1, 5, 4, 9, 4, 9, 8, 6, 9, 2, 1, 5, 8, 6, 1, 9, 6, 9, 3, 6, 5, 7, 9, 1, 9, 5, 9, 3, 4, 3, 7, 8, 3, 4, 5, 9, 1, 4, 9, 2, 1, 5, 7, 1, 10, 2, 4, 2, 1, 7, 1, 5, 4, 1, 2]
But if your input data is fixed, you can try the solution below:
a=[1,1,4]
b=[]
# Cycle through a 100 times, skipping any value equal to the last one appended
for j in range(0,100):
    for i in a:
        if len(b)==0 or b[-1]!=i:
            b.append(i)
print(b)
Output :
[1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4]

How can I vectorize this for loop below, where I need to set values to a range I need to round?

I have a np.array q with some values, for example [1,3,5,7], and a np.array z with some values that I need to round and then use as indices into a third array, 'mapping'.
import numpy as np
q = [1,3,5,7]
z = [0,50.3,240.4,252.9,256]
mapping = np.zeros(256)
for i in range(len(q)):
    print(i)
    start, end = int(round(z[i])), int(round(z[i + 1]))
    mapping[start:end] = int(round(q[i]))
print(mapping)
The output here is:

Here's my approach:
repeats = np.diff(list(np.round(z))+ [256]).astype(int)
# repeats = array([ 49, 191, 12, 3])
np.repeat(np.round(q), repeats)
Output:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 7])
Note: this only has 255 elements and it's different from your expected output, because, tbh I don't really understand your logic.
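For the z shown in the question, which already starts at 0 and ends at 256, a variant of the same np.diff / np.repeat idea seems to reproduce the loop exactly. This is a sketch under that assumption: if the rounded z did not span 0 to 256, the result would not line up with a 256-element mapping.

```python
import numpy as np

q = np.array([1, 3, 5, 7])
z = np.array([0, 50.3, 240.4, 252.9, 256])

edges = np.round(z).astype(int)      # [0, 50, 240, 253, 256]
repeats = np.diff(edges)             # [50, 190, 13, 3]
mapping = np.repeat(np.round(q), repeats)

# Loop version from the question, for comparison
expected = np.zeros(256)
for i in range(len(q)):
    expected[edges[i]:edges[i + 1]] = q[i]

print(mapping.shape, np.array_equal(mapping, expected))
```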

Calculating and plotting count ratios with Pandas

I have multidimensional data in a pandas data frame with one variable indicating class. For example, here is my attempt at a poor man's heatmap using a scatter plot:
import pandas as pd
import random
import numpy as np
import matplotlib.pyplot as plt
nrows=1000
df=pd.DataFrame([[random.random(), random.random()]+[random.randint(0, 1)] for _ in range(nrows)],
                columns=list("ABC"))
bins=np.linspace(0, 1, 20)
# Label every row with the left edge of its bin
df["Abin"]=[bins[i-1] for i in np.digitize(df.A, bins)]
df["Bbin"]=[bins[i-1] for i in np.digitize(df.B, bins)]
g=df.loc[:,["Abin", "Bbin", "C"]].groupby(["Abin", "Bbin"])
data=g.agg(["sum", "count"])
data.reset_index(inplace=True)
# Share of rows with C == 1 in each 2D bin
data["classratio"]=data[("C", "sum")]/data[("C","count")]
plt.scatter(data.Abin, data.Bbin, c=data.classratio, cmap="RdYlGn_r", marker="s")
I'd like to plot class densities over binned features. So far I have used np.digitize for binning and a somewhat complicated hand-made density calculation to plot a heatmap.
Surely, this can be done more compactly with Pandas (pivot?)? Do you know a neat way to bin the two features (for example 10 bins on the interval 0...1) and then plot a class density heatmap where color indicates the ratio of 1's to total rows within this 2D-bin?

Yep, it can be done in a very concise way using pandas' built-in cut function:
In [65]:
nrows=1000
df=pd.DataFrame([[random.random(), random.random()]+[random.randint(0, 1)] for _ in range(nrows)],
columns=list("ABC"))
In [66]:
#This does the trick.
pd.crosstab(np.array(pd.cut(df.A, 20)), np.array(pd.cut(df.B, 20))).values
Out[66]:
array([[2, 2, 2, 2, 7, 2, 3, 5, 1, 4, 2, 2, 1, 3, 2, 1, 7, 2, 4, 2],
[1, 2, 4, 2, 0, 3, 3, 3, 1, 1, 2, 1, 4, 3, 2, 1, 1, 2, 2, 1],
[0, 4, 1, 3, 1, 3, 2, 5, 2, 3, 1, 1, 1, 4, 2, 3, 6, 5, 2, 2],
[5, 2, 3, 2, 2, 1, 3, 2, 4, 0, 3, 2, 0, 4, 3, 2, 1, 3, 1, 3],
[2, 2, 4, 1, 3, 2, 2, 4, 1, 4, 3, 5, 5, 2, 3, 3, 0, 2, 4, 0],
[2, 3, 3, 5, 2, 0, 5, 3, 2, 3, 1, 2, 5, 4, 4, 3, 4, 3, 6, 4],
[3, 2, 2, 4, 3, 3, 2, 0, 0, 4, 3, 2, 2, 5, 4, 0, 1, 2, 2, 3],
[0, 0, 4, 4, 3, 2, 4, 6, 4, 2, 0, 5, 2, 2, 1, 3, 4, 4, 3, 2],
[3, 2, 2, 3, 4, 2, 1, 3, 1, 3, 4, 2, 4, 3, 2, 3, 2, 3, 4, 4],
[0, 1, 1, 4, 1, 4, 3, 0, 1, 1, 1, 2, 6, 4, 3, 5, 3, 3, 1, 4],
[2, 2, 4, 1, 3, 4, 1, 2, 1, 3, 3, 3, 1, 2, 1, 5, 2, 1, 4, 3],
[0, 0, 0, 4, 2, 0, 2, 3, 2, 2, 2, 4, 4, 2, 3, 2, 1, 2, 1, 0],
[3, 3, 0, 3, 1, 5, 1, 1, 2, 5, 6, 5, 0, 0, 3, 2, 1, 5, 7, 2],
[3, 3, 2, 1, 2, 2, 2, 2, 4, 0, 1, 3, 3, 1, 5, 6, 1, 3, 2, 2],
[3, 0, 3, 4, 3, 2, 1, 4, 2, 3, 4, 0, 5, 3, 2, 2, 4, 3, 0, 2],
[0, 3, 2, 2, 1, 5, 1, 4, 3, 1, 2, 2, 3, 5, 1, 2, 2, 2, 1, 2],
[1, 3, 2, 1, 1, 4, 4, 3, 2, 2, 5, 5, 1, 0, 1, 0, 4, 3, 3, 2],
[2, 2, 2, 1, 1, 3, 1, 6, 5, 2, 5, 2, 3, 4, 2, 2, 1, 1, 4, 0],
[3, 3, 4, 7, 0, 2, 6, 4, 1, 3, 4, 4, 1, 4, 1, 1, 2, 1, 3, 2],
[3, 6, 3, 4, 1, 3, 1, 3, 3, 1, 6, 2, 2, 2, 1, 1, 4, 4, 0, 4]])
In [67]:
abins=np.linspace(df.A.min(), df.A.max(), 21)
bbins=np.linspace(df.B.min(), df.B.max(), 21)
# Count of C == 1 rows per 2D bin divided by the count of all rows per 2D bin
Z=pd.crosstab(np.array(pd.cut(df.loc[df.C==1, 'A'], abins)),
              np.array(pd.cut(df.loc[df.C==1, 'B'], bbins))).div(
    pd.crosstab(np.array(pd.cut(df.A, abins)),
                np.array(pd.cut(df.B, bbins)))).values
Z = np.ma.masked_where(np.isinf(Z),Z)
x=np.linspace(df.A.min(), df.A.max(), 20)
y=np.linspace(df.B.min(), df.B.max(), 20)
X,Y=np.meshgrid(x, y)
plt.contourf(X, Y, Z, vmin=0, vmax=1)
plt.colorbar()
plt.pcolormesh(X, Y, Z, vmin=0, vmax=1)
plt.colorbar()
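Another way to get the class ratio per 2D bin, sketched here under the assumption that C only contains 0 and 1: group by the two binned columns and take the mean of C, which is then the share of 1's in each cell. The data is regenerated with NumPy's random generator and the variable names are chosen just for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
nrows = 1000
df = pd.DataFrame({
    "A": rng.random(nrows),
    "B": rng.random(nrows),
    "C": rng.integers(0, 2, nrows),   # class label, 0 or 1
})

bins = np.linspace(0, 1, 11)          # 10 equal-width bins on [0, 1]
a_bin = pd.cut(df["A"], bins, include_lowest=True)
b_bin = pd.cut(df["B"], bins, include_lowest=True)

# The mean of a 0/1 column is the ratio of 1's to total rows in each cell.
# observed=False keeps empty bins as NaN so the grid stays 10 x 10.
ratio = df.groupby([a_bin, b_bin], observed=False)["C"].mean().unstack()
print(ratio.shape)                    # (10, 10)
```

The resulting table has the A bins as rows and the B bins as columns, so its values can be passed to a plotting call such as plt.pcolormesh with vmin=0 and vmax=1.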
