Plotly python bar plot stack order - python

Here is my code for the dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({'var_1': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
                   'var_2': ['m', 'n', 'o', 'm', 'n', 'o', 'm', 'n', 'o'],
                   'var_3': [np.random.randint(25, 33) for _ in range(9)]})
Here is the dataframe that I have:
  var_1 var_2  var_3
0     a     m     27
1     a     n     28
2     a     o     28
3     b     m     31
4     b     n     30
5     b     o     25
6     c     m     27
7     c     n     32
8     c     o     27
Here is the code I used to get the stacked bar plot:
import plotly.express as px

fig = px.bar(df, x='var_3', y='var_1', color='var_2', orientation='h', text='var_3')
fig.update_traces(textposition='inside', insidetextanchor='middle')
fig.show()
But I want the bars to stack in descending order of value, with the largest at the start/bottom and the smallest at the top.
How should I update the layout to get that?

Sort the rows by var_1 and var_3 in descending order, then add one trace per row so that the bars stack in that order:
import numpy as np
import pandas as pd
import plotly.graph_objects as go

df = pd.DataFrame({'var_1': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
                   'var_2': ['m', 'n', 'o', 'm', 'n', 'o', 'm', 'n', 'o'],
                   'var_3': [np.random.randint(25, 33) for _ in range(9)]})
# within each var_1 group, put the largest var_3 first
df.sort_values(['var_1', 'var_3'], ignore_index=True, inplace=True, ascending=False)
# colors
colors = {'o': 'red',
          'm': 'blue',
          'n': 'green'}
# traces
data = []
seen = set()  # var_2 values that already have a legend entry
# loop across the different rows
for i in range(df.shape[0]):
    name = df['var_2'][i]
    data.append(go.Bar(x=[df['var_3'][i]],
                       y=[df['var_1'][i]],
                       orientation='h',
                       text=str(df['var_3'][i]),
                       marker=dict(color=colors[name]),
                       name=name,
                       legendgroup=name,
                       # show each var_2 value only once in the legend
                       showlegend=name not in seen))
    seen.add(name)
# layout
layout = dict(barmode='stack',
              yaxis={'title': 'var_1'},
              xaxis={'title': 'var_3'})
# figure
fig = go.Figure(data=data, layout=layout)
fig.update_traces(textposition='inside', insidetextanchor='middle')
fig.show()

Related

How to print an index 0 of a list, with for loop?

Why does this code not print index 0 of the list?
N = int(input())
letters = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
for i in range(N):
    print(letters[i]*i)
The output should be:
A
BB
CCC
DDDD
EEEEE
FFFFFF
GGGGGGG
But I'm getting this, without the "A":
B
CC
DDD
EEEE
FFFFF
GGGGGG
Because i starts counting at zero.
If you type:
for i in range(10):
    print(i)
You'll see:
0
1
2
3
4
5
6
7
8
9
If you want to count from 1 to N instead of 0 to N-1, type:
    print(letters[i]*(i+1))
Multiplying a string by 0 gives an empty string, so nothing visible is printed for the first letter. You need to add 1:
N = int(input())
letters = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
for i in range(N):
    print(letters[i]*(i+1))
The loop for i in range(N) starts at 0. It then tries to execute
print(letters[i]*i)
At index 0, this translates to
print(letters[0]*0)
Since the string is multiplied by zero, the result is an empty string, so only a blank line is printed. Changing i to i + 1 solves this issue, as other commenters have pointed out.
The variable i starts from 0.
So you need:
for i in range(N):
    print(letters[i]*(i+1))
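An alternative sketch, not taken from the answers above: `enumerate` with `start=1` yields the repeat count directly, so no `i+1` is needed. The helper name `letter_triangle` is hypothetical, and `string.ascii_uppercase` stands in for the hand-written list of letters.

```python
import string


def letter_triangle(n):
    """Return the first n lines of the pattern A, BB, CCC, ..."""
    # count starts at 1, so the first letter is repeated once rather than zero times
    return [letter * count
            for count, letter in enumerate(string.ascii_uppercase[:n], start=1)]


lines = letter_triangle(3)
print("\n".join(lines))
```

For n = 3 this prints A, BB and CCC on separate lines.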

Change the set to a separate list in python

a = [{'w', 'r', 't', 'y', 'e'}, {'f', 'g', 'w', 's', 'd'}]
How can I split this list of sets into separate lists, so that
b = ['w', 'r', 't', 'y', 'e']
c = ['f', 'g', 'w', 's', 'd']
assert len(a) == 2
b, c = [list(s) for s in a]
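One caveat worth keeping in mind: Python sets are unordered, so `list(s)` may return the elements in any order, not necessarily the order in which they were written. If a reproducible order matters, a small variation (an assumption about what is wanted, not part of the original answer) is to sort each set while unpacking:

```python
a = [{'w', 'r', 't', 'y', 'e'}, {'f', 'g', 'w', 's', 'd'}]

# sorted() accepts any iterable and always returns a list
b, c = (sorted(s) for s in a)

print(b)  # ['e', 'r', 't', 'w', 'y']
print(c)  # ['d', 'f', 'g', 's', 'w']
```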

adding data to only few column of an empty dataframe in pandas for each iteration

I have code with a loop in which each iteration creates a dataframe df1 with different column names. For example, iteration 1 produces columns A, B, C and iteration 2 produces columns A, X, D. Each dataframe contains only a single record.
In total there can be 26 distinct columns plus one iteration_seq column. I created an empty pandas dataframe df2 with all the possible columns.
import random
import numpy as np
import pandas as pd

df2 = pd.DataFrame(columns = ['iteration_seq','A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
                              'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
                              'U', 'V', 'W', 'X', 'Y', 'Z'])
lis = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
       'P', 'Q', 'R', 'S', 'T','U', 'V', 'W', 'X', 'Y', 'Z']
random.shuffle(lis)
for i in range(len(lis)-1):
    pax = lis[i] # the problem starts here: say I want to take an arbitrary variable, put a random value in it, then update df2, and later append df2 to df1
    df2 = pd.DataFrame(columns = ['iteration_seq'])
    df2[pax] = np.random.rand(1,1000) # how do I use the value of a variable as the column? In R we had the option of "with"
    df2['iteration_seq'] = i+1
How can I add the values from each iteration of the loop as a separate row in the empty dataframe df2, filling only the columns that appear in that iteration's df1, along with an additional column specifying the iteration number?
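One possible approach, sketched under the assumption that each iteration's df1 holds exactly one row: collect each record as a dict, then build df2 once at the end. The list `iteration_outputs` and its values are hypothetical stand-ins for whatever the real loop produces.

```python
import string

import pandas as pd

# All possible columns: iteration_seq plus A..Z
all_columns = ['iteration_seq'] + list(string.ascii_uppercase)

# Hypothetical stand-ins for the single-row df1 created in each iteration
iteration_outputs = [
    pd.DataFrame({'A': [1.0], 'B': [2.0], 'C': [3.0]}),
    pd.DataFrame({'A': [4.0], 'X': [5.0], 'D': [6.0]}),
]

rows = []
for seq, df1 in enumerate(iteration_outputs, start=1):
    row = df1.iloc[0].to_dict()  # the single record as {column: value}
    row['iteration_seq'] = seq   # record which iteration produced it
    rows.append(row)

# Columns that a given iteration did not produce are filled with NaN
df2 = pd.DataFrame(rows, columns=all_columns)
print(df2[['iteration_seq', 'A', 'B', 'C', 'D', 'X']])
```

Building the dataframe once from a list of dicts avoids growing it row by row inside the loop, which tends to be slow.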

iterating markers in plots

I'm trying to denote the predictions with a color and the correct labels as markers for the iris data set. Here is what I have so far:
from sklearn.mixture import GMM
import pandas as pd
from sklearn import datasets
import matplotlib.pyplot as plt
import itertools
iris = datasets.load_iris()
x = iris.data
y = iris.target
gmm = GMM(n_components=3).fit(x)
labels = gmm.predict(x)
fig, axes = plt.subplots(4, 4)
Superman = iris.feature_names
markers = ["o" , "s" , "D"]
Mi=[]
for i in range(150):
    Mi.append(markers[y[i]])
for i in range(4):
    for j in range(4):
        if(i != j):
            axes[i, j].scatter(x[:, i], x[:, j], c=labels, marker = Mi, s=40, cmap='viridis')
        else:
            axes[i,j].text(0.15, 0.3, Superman[i], fontsize = 8)
I'm not sure why colors iterate and markers do not. Is there a way to assign each marker a certain value, as with color? It also fails when I just pass the numeric values from y.
The error it returns is:
Unrecognized marker style ['o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D', 'D']
Using several markers in a single scatter is currently not a feature matplotlib supports. There is however a feature request for this at https://github.com/matplotlib/matplotlib/issues/11155
It is of course possible to draw several scatters, one for each marker type.
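A minimal sketch of that several-scatters idea, using made-up data rather than the iris set (the arrays `true_class` and `predicted` are hypothetical stand-ins for the correct labels and the model's predictions). Each marker type gets its own `scatter` call on a boolean mask, and fixed `vmin`/`vmax` keep the color scale the same across calls:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.random((2, 30))
true_class = rng.integers(0, 3, size=30)  # hypothetical correct labels
predicted = rng.integers(0, 3, size=30)   # hypothetical predicted labels

markers = ["o", "s", "D"]
fig, ax = plt.subplots()
for cls, marker in enumerate(markers):
    mask = true_class == cls  # points whose correct label is cls
    # marker encodes the correct label, color encodes the prediction
    ax.scatter(x[mask], y[mask], c=predicted[mask], marker=marker,
               s=40, cmap="viridis", vmin=0, vmax=2)
```

This needs only as many `scatter` calls as there are marker types, rather than one per point.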
A different option is the one I proposed in the above thread, which is to set the markers after creating the scatter:
import numpy as np
import matplotlib.pyplot as plt

def mscatter(x,y,ax=None, m=None, **kw):
    import matplotlib.markers as mmarkers
    if not ax: ax=plt.gca()
    sc = ax.scatter(x,y,**kw)
    if (m is not None) and (len(m)==len(x)):
        # build one marker path per point and attach them to the scatter
        paths = []
        for marker in m:
            if isinstance(marker, mmarkers.MarkerStyle):
                marker_obj = marker
            else:
                marker_obj = mmarkers.MarkerStyle(marker)
            path = marker_obj.get_path().transformed(
                        marker_obj.get_transform())
            paths.append(path)
        sc.set_paths(paths)
    return sc

N = 40
x, y, c = np.random.rand(3, N)
s = np.random.randint(10, 220, size=N)
# integer division: np.repeat needs an integer repeat count
m = np.repeat(["o", "s", "D", "*"], N//4)
fig, ax = plt.subplots()
scatter = mscatter(x, y, c=c, s=s, m=m, ax=ax)
plt.show()
If you only have numbers instead of marker symbols, you would first need to map the numbers to symbols and supply the list of symbols to the function.
You could modify your code like the following to get the desired result:
markers = ["o" , "s" , "D"]
colors = ["red", "green", "blue"]
for i in range(4):
    for j in range(4):
        if i != j:
            # one scatter call per point, so each point can have its own marker
            for k in range(x.shape[0]):
                axes[i, j].scatter(x[k, i], x[k, j], color=colors[labels[k]], marker=markers[y[k]], s=40)
        else:
            # draw the feature name once on the diagonal
            axes[i, j].text(0.15, 0.3, Superman[i], fontsize=8)

Creating a Matrix in Python without numpy [duplicate]

I'm trying to create and initialize a matrix. The issue is that every row of the matrix I create is the same, rather than moving through the data set.
I tried to correct it by checking whether the value was already in the matrix, but that didn't solve my problem.
def createMatrix(rowCount, colCount, dataList):
    mat = []
    for i in range (rowCount):
        rowList = []
        for j in range (colCount):
            if dataList[j] not in mat:
                rowList.append(dataList[j])
        mat.append(rowList)
    return mat

def main():
    alpha = ['a','b','c','d','e','f','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
    mat = createMatrix(5,5,alpha)
    print (mat)
The output should be like this:
['a','b','c','d','e'] , ['f','h','i','j','k'], ['l','m','n','o','p'] , ['q','r','s','t','u'], ['v','w','x','y','z']
My issue is that I keep getting the first list, a,b,c,d,e, for all 5 rows.
You need to keep track of the current index in your loop.
Essentially you want to turn a list like 0,1,2,3,4,....24 (these are the indices of your initial array, alpha) into:
R1C1, R1C2, R1C3, R1C4, R1C5
R2C1, R2C2... etc
I added the logic to do this the way you are currently doing it:
def createMatrix(rowCount, colCount, dataList):
    mat = []
    for i in range(rowCount):
        rowList = []
        for j in range(colCount):
            # you need to increment through dataList here, like this:
            # each completed row skips colCount elements
            rowList.append(dataList[colCount * i + j])
        mat.append(rowList)
    return mat

def main():
    alpha = ['a','b','c','d','e','f','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
    mat = createMatrix(5,5,alpha)
    print (mat)

main()
which then prints out:
[['a', 'b', 'c', 'd', 'e'], ['f', 'h', 'i', 'j', 'k'], ['l', 'm', 'n', 'o', 'p'], ['q', 'r', 's', 't', 'u'], ['v', 'w', 'x', 'y', 'z']]
The reason you were always getting a,b,c,d,e is that when you write this:
rowList.append(dataList[j])
j runs from 0 to 4 for every row, regardless of i. So basically:
i = 0
rowList.append(dataList[0])
rowList.append(dataList[1])
rowList.append(dataList[2])
rowList.append(dataList[3])
rowList.append(dataList[4])
i = 1
rowList.append(dataList[0]) # should be 5
rowList.append(dataList[1]) # should be 6
rowList.append(dataList[2]) # should be 7
rowList.append(dataList[3]) # should be 8
rowList.append(dataList[4]) # should be 9
etc.
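To check the index arithmetic on a shape that is not square, here is a small sketch (the function name `create_matrix` and the sample data are illustrative, not from the question). In row-major order, element (i, j) sits at flat index i * col_count + j, because each completed row skips col_count elements:

```python
def create_matrix(row_count, col_count, data):
    # element (i, j) of a row-major matrix lives at flat index i * col_count + j
    return [[data[i * col_count + j] for j in range(col_count)]
            for i in range(row_count)]


mat = create_matrix(2, 3, list("abcdef"))
print(mat)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```

Multiplying by the row count instead happens to give the same result for a 5x5 matrix, but it would pick the wrong elements as soon as the two dimensions differ.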
You can use a list comprehension:
>>> li= ['a','b','c','d','e','f','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
>>> [li[i:i+5] for i in range(0,len(li),5)]
[['a', 'b', 'c', 'd', 'e'], ['f', 'h', 'i', 'j', 'k'], ['l', 'm', 'n', 'o', 'p'], ['q', 'r', 's', 't', 'u'], ['v', 'w', 'x', 'y', 'z']]
Or, if you don't mind tuples, use zip (in Python 3, zip returns an iterator, so wrap it in list to see the result):
>>> list(zip(*[iter(li)]*5))
[('a', 'b', 'c', 'd', 'e'), ('f', 'h', 'i', 'j', 'k'), ('l', 'm', 'n', 'o', 'p'), ('q', 'r', 's', 't', 'u'), ('v', 'w', 'x', 'y', 'z')]
Or apply list to the tuples:
>>> list(map(list, zip(*[iter(li)]*5)))
[['a', 'b', 'c', 'd', 'e'], ['f', 'h', 'i', 'j', 'k'], ['l', 'm', 'n', 'o', 'p'], ['q', 'r', 's', 't', 'u'], ['v', 'w', 'x', 'y', 'z']]
