I am trying to add column data (e.g. temperature, molarity) to an ASE Atoms object stored in a .json file, which is formatted as follows:
{"1": {
"cell": {"__ndarray__": [[3, 3], "float64", [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]},
"ctime": 23.062761176728078,
"mtime": 23.062761176728078,
"numbers": {"__ndarray__": [[63], "int32", [35, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 6, 6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]},
"pbc": {"__ndarray__": [[3], "bool", [false, false, false]]},
"positions": {"__ndarray__": [[63, 3], "float64", [1.02088, -0.07049, -0.03212, 1.0973, -3.2788, 3.87728, 2.61528, -3.29081, 3.8975, 3.15998, -3.00975, 2.50308, 4.67887, -3.0308, 2.49295, 5.20986, -2.78486, 1.07948, 6.73103, -2.81918, 1.01952, 7.26742, -4.1725, 1.4482, 8.77632, -4.17848, 1.43408, 9.26424, -5.51692, 1.94735, 10.77335, -5.50232, 1.97542, 11.28332, -6.82712, 2.48445, 12.80015, -6.79016, 2.50407, 13.29845, -8.13591, 2.97445, 14.81673, -8.19203, 2.94905, 15.31473, -9.54583, 3.43333, 14.81153, -10.67945, 2.53755, 15.25421, -12.06622, 3.00288, 16.77287, -12.17072, 3.02548, 14.67692, -12.39547, 4.37401, 14.70058, -13.08165, 1.99749, 0.6998, -3.46574, 4.87783, 0.71285, -4.05308, 3.20454, 0.7212, -2.31013, 3.5324, 2.967, -4.26501, 4.25235, 2.97929, -2.53549, 4.60251, 2.81017, -2.03287, 2.15577, 2.78347, -3.76456, 1.80124, 4.99943, -4.00574, 2.86952, 5.07838, -2.27055, 3.17637, 4.86624, -1.80795, 0.71974, 4.80511, -3.54038, 0.39431, 7.14617, -2.02959, 1.65792, 7.05652, -2.60496, -0.00639, 6.87826, -4.96031, 0.78846, 6.94637, -4.41838, 2.46536, 9.15862, -3.37521, 2.07596, 9.15115, -3.99773, 0.42005, 8.89989, -6.32829, 1.30471, 8.87374, -5.70588, 2.95654, 11.12913, -4.69141, 2.62325, 11.16301, -5.3113, 0.96742, 10.92897, -7.63817, 1.83648, 10.89587, -7.0253, 3.4914, 13.15552, -5.99488, 3.16603, 13.17901, -6.58063, 1.49654, 12.87249, -8.89368, 2.31322, 12.93068, -8.34259, 3.98611, 15.22396, -7.40473, 3.5943, 15.18383, -7.99669, 1.93376, 14.98703, -9.68396, 4.46777, 16.4104, -9.51576, 3.43886, 15.16831, -10.54649, 1.51163, 13.71549, -10.72486, 2.51819, 17.16035, -11.53426, 3.81916, 17.15315, -11.86178, 2.05054, 17.04038, -13.21533, 3.22842, 15.13502, -11.7425, 5.12017, 14.91931, -13.43378, 4.61043, 13.59434, -12.25198, 4.34169, 15.10052, -12.84284, 1.00836, 13.60916, -13.00919, 1.99737, 15.01561, -14.08621, 2.30661]]},
"unique_id": "8fc6b41faacf669e07523ea9932aae59",
"user": null},
"ids": [1],
"nextid": 2}
So far I have tried: adding an additional node to the positions graph; adding a fourth dimension to the positions graph; incrementing one axis of the positions graph by the column value; adding a new key to the JSON file; and adding the value under the built-in "tags" key. Each attempt either had no effect or raised an error that cannot be resolved without editing the ASE library. I am not sure what to try next, or whether I have missed something obvious.
Full Code:
https://github.com/nfurth1/MatDeepLearn
I am creating a Sankey diagram with plotly as follows:
import plotly.graph_objects as go
fig = go.Figure(data=[go.Sankey(
valueformat = ".0f",
valuesuffix = " %",
orientation = "h",
node = dict(
pad = 20,
thickness = 20,
line = dict(color = "red", width = 1),
label = ['Equity',
'Global Equity',
'Tier 1',
'A looooooooooong',
'Tier 2',
'B looooooooooong',
'C looooooooooong',
'Tier 3',
'D looooooooooong',
'E looooooooooong',
'F looooooooooong',
'G looooooooooong',
'H looooooooooong'],
color = ['aqua',
'aqua',
'yellow',
'orange',
'yellow',
'orange',
'orange',
'yellow',
'orange',
'orange',
'orange',
'orange',
'orange'],
),
link = dict(
source = [0, 2, 1, 4, 4, 2, 7, 7, 7, 7, 7, 4],
target = [1, 3, 2, 5, 6, 4, 8, 9, 10, 11, 12, 7],
value = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
color = ['aqua',
'yellow',
'aqua',
'yellow',
'yellow',
'aqua',
'yellow',
'yellow',
'yellow',
'yellow',
'yellow',
'aqua'],
hovertemplate='This link has total value %{value}<extra></extra>'
))])
fig.update_layout(title_text="Waterfall Diagram",
font_size=16,
plot_bgcolor='white',
paper_bgcolor='white')
fig.show()
Output looks like this:
Is there a way to:
1. make sure the aqua links are always below the yellow ones, to separate them better visually? I am not sure why the current setup shows them in that order.
2. give more space between the links, spreading them out more? In particular, I need links and nodes not to overlap each other.
3. spread out the aqua links even further, i.e. visually dissociate them from the others?
4. control where and how node labels are shown, i.e. to the right of or below the node, and also control the font for each node?
For points 1, 2, and 3, you can set the node locations explicitly as follows:
import plotly.graph_objects as go
fig = go.Figure(go.Sankey(
arrangement = "snap",
node = {
"label": ["A", "B", "C", "D", "E", "F"],
"x": [0.2, 0.1, 0.5, 0.7, 0.3, 0.5], #these are fractions of the domain (0,1)
"y": [0.7, 0.5, 0.2, 0.4, 0.2, 0.3],
'pad':10}, # 10 Pixels
link = {
"source": [0, 0, 1, 2, 5, 4, 3, 5],
"target": [5, 3, 4, 3, 0, 2, 2, 3],
"value": [1, 2, 1, 1, 1, 1, 1, 2]}))
fig.show()
Example from the plotly docs.
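To apply the same idea to the 13 nodes in the question, the x and y lists can be computed rather than typed by hand. The following is only a sketch: `sankey_positions` is a hypothetical helper (not part of plotly), the column assigned to each node is my reading of the question's source/target lists, and keeping the values slightly inside (0, 1) is an assumption based on commonly reported behaviour for positions of exactly 0 or 1.

```python
from collections import defaultdict


def sankey_positions(columns, lo=0.01, hi=0.99):
    """Return (x, y) lists of node positions as fractions of the plot domain.

    columns[i] is the column (0 = leftmost) of node i. Nodes that share a
    column are spread evenly along y in index order.
    """
    n_cols = max(columns) + 1
    members = defaultdict(list)
    for node, col in enumerate(columns):
        members[col].append(node)

    def spread(k, n):
        # k-th of n evenly spaced values in [lo, hi]; a lone value sits in the middle
        return (lo + hi) / 2 if n == 1 else lo + (hi - lo) * k / (n - 1)

    x = [0.0] * len(columns)
    y = [0.0] * len(columns)
    for col, nodes in members.items():
        for rank, node in enumerate(nodes):
            x[node] = round(spread(col, n_cols), 3)
            y[node] = round(spread(rank, len(nodes)), 3)
    return x, y


# Columns inferred from the question's links (assumption):
# Equity, Global Equity, Tier 1, A, Tier 2, B, C, Tier 3, D, E, F, G, H
columns = [0, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5]
x, y = sankey_positions(columns)
print(x)
print(y)
```

The resulting lists would be passed as the node's "x" and "y" entries, together with arrangement = "snap" as in the example above. Changing the order in which nodes appear within a column changes their vertical order, which is one way to keep the aqua chain apart from the yellow links.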
This question is closely related to an earlier one that I posted. I would like to draw confidence intervals for each bar within subplots of a figure, using the information from two columns in my data frame describing the upper and lower limit of each confidence interval. I tried to use the solution from that earlier post, but it does not seem to be applicable when one wants to use different colors and/or different rows in order to draw subplots for the figure.
For example, the following code does not produce the right confidence intervals. For instance, the CI of the 3rd bar in the second row should go from 11 to 5:
import pandas as pd
import plotly.express as px
df = pd.DataFrame(
{"x": [0, 1, 2, 3, 0, 1, 2, 3],
"y": [6, 10, 2, 5, 8, 9, 10, 11],
"ci_upper": [8, 11, 2.5, 4, 9, 10, 11, 12],
"ci_lower": [5, 9, 1.5, 3, 7, 6, 5, 10],
"state": ['foo','foo','foo','foo','bar','bar','bar','bar'],
"color": ['0','0','1','1','0','0','1','1']}
)
fig = px.bar(df, x="x", y="y",facet_row='state',color='color').update_traces(
error_y={
"type": "data",
"symmetric": False,
"array": df["ci_upper"] - df["y"],
"arrayminus": df["y"] - df["ci_lower"],
}
)
fig.update_yaxes(dtick=1)
fig.show(renderer='png')
It's the same technique, but the solution needs to account for there being multiple traces (4 in this example).
The facet and color are encoded in the hovertemplate of each trace. Extract these and filter the data down to the appropriate rows.
Then build the error-bar specification as in the simpler case.
import pandas as pd
import plotly.express as px
df = pd.DataFrame(
{
"x": [0, 1, 2, 3, 0, 1, 2, 3],
"y": [6, 10, 2, 5, 8, 9, 10, 11],
"ci_upper": [8, 11, 2.5, 4, 9, 10, 11, 12],
"ci_lower": [5, 9, 1.5, 3, 7, 6, 5, 10],
"state": ["foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
"color": ["0", "0", "1", "1", "0", "0", "1", "1"],
}
)
fig = px.bar(df, x="x", y="y", facet_row="state", color="color")
fig.update_yaxes(dtick=1)
def error_facet(t):
    # filter data frame based on contents of hovertemplate
    d = df.query(
        " and ".join(
            [
                f"{q.split('=')[0]}==\"{q.split('=')[1]}\""
                for q in t.hovertemplate.split("<br>")[0:2]
            ]
        )
    )
    t.update(
        {
            "error_y": {
                "type": "data",
                "symmetric": False,
                "array": d["ci_upper"] - d["y"],
                "arrayminus": d["y"] - d["ci_lower"],
            }
        }
    )
fig.for_each_trace(error_facet)
fig
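The filtering step is easier to follow in isolation. The sketch below reproduces only the string handling from error_facet, using the standard library. The example hovertemplate string is an assumption about what Plotly Express generates for this figure (the color and facet fields first, separated by `<br>`), and `hovertemplate_to_query` is a hypothetical name.

```python
def hovertemplate_to_query(hovertemplate, n_fields=2):
    """Turn the leading 'column=value' fields of a hovertemplate into a
    string suitable for DataFrame.query()."""
    parts = hovertemplate.split("<br>")[:n_fields]
    clauses = []
    for part in parts:
        column, value = part.split("=", 1)
        # values are quoted because both columns hold strings in this data
        clauses.append(f'{column}=="{value}"')
    return " and ".join(clauses)


# Assumed shape of the template for the trace with color "1" in facet "bar"
example = "color=1<br>state=bar<br>x=%{x}<br>y=%{y}<extra></extra>"
query = hovertemplate_to_query(example)
print(query)  # color=="1" and state=="bar"
```

Passing that string to df.query() leaves the two rows with x = 2 and x = 3 in the "bar" facet, so the error arrays have the same length as the trace they are attached to. This is what the single update_traces call in the question does not guarantee.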
What I want to achieve:
>>> from cerberus import Validator
>>> schema = {"x": {"type": "integer", "required": False}, "y": {"type": "integer", "required": False}}
>>> v = Validator(schema)
>>> v.validate({"x": 5})
True
>>> v.validate({"y": 6})
True
>>> v.validate({"x": 5, "y": 6})
True
>>> v.validate({})
False
I have read through the documentation but still don't know how to achieve this result. How should I define the schema?
One workable solution is to validate the document against several schemas, each making a different key required, and accept it if any of them passes.
from cerberus import Validator
def composite_validator(document):
    REQUIRED_INTEGER = {"type": 'integer', "required": True}
    OPTIONAL_INTEGER = {"type": 'integer', "required": False}
    schemas = [
        {"x": REQUIRED_INTEGER, "y": OPTIONAL_INTEGER},
        {"x": OPTIONAL_INTEGER, "y": REQUIRED_INTEGER},
    ]
    common_schema = {"z1": REQUIRED_INTEGER, "z2": OPTIONAL_INTEGER, "z3": REQUIRED_INTEGER}
    for s in schemas:
        s.update(common_schema)
    validator = Validator()
    return any(validator(document, s) for s in schemas)
Test results:
for case in [
        {"x": 5, "z1": 0, "z3": -1},
        {"y": 6, "z1": 0, "z3": -1},
        {"x": 5, "y": 6, "z1": 0, "z3": -1},
        {"z1": 0, "z3": -1}]:
    print(case)
    print(composite_validator(case))
#{'x': 5, 'z1': 0, 'z3': -1}
#True
#{'y': 6, 'z1': 0, 'z3': -1}
#True
#{'x': 5, 'y': 6, 'z1': 0, 'z3': -1}
#True
#{'z1': 0, 'z3': -1}
#False
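If more than two keys take part in the "at least one of" rule, writing the schema list by hand becomes tedious. A minimal sketch of generating it is below. `at_least_one_schemas` is a hypothetical helper, not part of Cerberus, and it only builds plain dictionaries in the same shape as the schemas above.

```python
def at_least_one_schemas(keys, rule, common=None):
    """Build one schema per key in `keys`, each making exactly that key
    required and the others optional. `common` is merged into every schema."""
    schemas = []
    for required_key in keys:
        schema = {
            key: {**rule, "required": key == required_key}
            for key in keys
        }
        schema.update(common or {})
        schemas.append(schema)
    return schemas


schemas = at_least_one_schemas(["x", "y"], {"type": "integer"})
for schema in schemas:
    print(schema)
```

The returned list could replace the hand-written schemas inside composite_validator, with the any(...) check left unchanged.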
I am trying to draw a scatter plot using matplotlib, but I am getting an "IndexError: pop from empty list" error and I am not sure how to fix it.
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
import time
import itertools
d = {'5000cca229d10d09': {374851: 1}, '5000cca229cf3f8f': {372496:3},'5000cca229d106f9': {372496: 3, 372455: 2}, '5000cca229d0b3e4': {380904: 2, 380905: 1, 380906: 1, 386569: 1}, '5000cca229d098f8': {379296: 2, 379297: 2, 379299: 2, 379303: 1, 379306: 1, 379469: 1, 379471: 1, 379459: 1, 379476: 1, 379456: 4, 379609: 4}, '5000cca229d03957': {380160: 3, 380736: 3, 380162: 1, 380174: 1, 381072: 2, 379608: 2, 380568: 3, 380569: 1, 380570: 1, 379296: 3, 379300: 1, 380328: 3, 379306: 1, 380331: 1, 379824: 2, 379825: 1, 379827: 1, 380344: 1, 379836: 1, 379456: 3, 380737: 1, 380739: 1, 379462: 1, 379476: 1, 379992: 3, 379609: 1, 379994: 1, 379611: 1, 379621: 1, 380006: 1, 380904: 3, 380905: 1, 380907: 1, 380535: 3, 380536: 1, 380538: 1}, '5000cca229cf6d0b': {372768: 10, 372550: 15, 372616: 14, 372617: 20, 372653: 3, 372505: 2}, '5000cca229cec4f1': {372510: 132}}
colors = list("rgbcmyk")
for data_dict in d.values():
    x = data_dict.keys()
    #print x
    #X= time.asctime(time.localtime(x))
    y = data_dict.values()
    #plt.scatter(x,y,color=colors.pop(),s = 60)
    plt.scatter(x,y,color=colors.pop(),s = 90, marker='^')
plt.ylabel("Errors" , fontsize=18, color="Green")
plt.xlabel("Occured on",fontsize=18, color="Green")
plt.title("DDN23b", fontsize=25, color="Blue")
plt.gca().get_xaxis().get_major_formatter().set_useOffset(False)
plt.xticks(rotation='vertical')
#plt.ylim(min(y),max(y))
#plt.grid()
#for x, y in dict(itertools.chain(*[item.items() for item in d.values()])).items():
# plt.text(x, y, time.strftime("%m/%d/%y, %H:%M:%S", time.localtime(x*3600)), ha='center', va='top', rotation='vertical', fontsize = '11', fontstyle = 'italic', color = '#844d4d')
plt.xticks(plt.xticks()[0], [time.strftime("%m/%d/%y, %H:%M:%S", time.localtime(item)) for item in plt.xticks()[0]*3600])
plt.legend(d.keys())
mng = plt.get_current_fig_manager()
mng.resize(*mng.window.maxsize())
plt.subplots_adjust(bottom=.24,right=.98,left=0.03,top=.89)
plt.grid()
plt.show()
I have several data sets for d, which is a dictionary. When the data set is small, the code works without errors. When the data set is large, it runs out of colors. How do I add more colors to the list so that every key in d gets its own color?
Feel free to edit my code and make suggestions.
Colormaps are callable. When passed a float between 0 and 1, a colormap returns an RGBA color:
In [73]: jet = plt.cm.jet
In [74]: jet(0.5)
Out[74]: (0.49019607843137247, 1.0, 0.47754585705249841, 1.0)
So, you could generate len(d) number of colors by passing the NumPy array np.linspace(0, 1, len(d)) to the colormap:
jet = plt.cm.jet
colors = jet(np.linspace(0, 1, len(d)))
The colors selected will then be equally spaced along the colormap gradient.
import matplotlib.pyplot as plt
import numpy as np
import time
d = {'5000cca229d10d09': {374851: 1}, '5000cca229cf3f8f': {372496:3},'5000cca229d106f9': {372496: 3, 372455: 2}, '5000cca229d0b3e4': {380904: 2, 380905: 1, 380906: 1, 386569: 1}, '5000cca229d098f8': {379296: 2, 379297: 2, 379299: 2, 379303: 1, 379306: 1, 379469: 1, 379471: 1, 379459: 1, 379476: 1, 379456: 4, 379609: 4}, '5000cca229d03957': {380160: 3, 380736: 3, 380162: 1, 380174: 1, 381072: 2, 379608: 2, 380568: 3, 380569: 1, 380570: 1, 379296: 3, 379300: 1, 380328: 3, 379306: 1, 380331: 1, 379824: 2, 379825: 1, 379827: 1, 380344: 1, 379836: 1, 379456: 3, 380737: 1, 380739: 1, 379462: 1, 379476: 1, 379992: 3, 379609: 1, 379994: 1, 379611: 1, 379621: 1, 380006: 1, 380904: 3, 380905: 1, 380907: 1, 380535: 3, 380536: 1, 380538: 1}, '5000cca229cf6d0b': {372768: 10, 372550: 15, 372616: 14, 372617: 20, 372653: 3, 372505: 2}, '5000cca229cec4f1': {372510: 132}}
jet = plt.cm.jet
colors = jet(np.linspace(0, 1, len(d)))
fig, ax = plt.subplots()
for color, data_dict in zip(colors, d.values()):
    # wrap the dict views in list() so scatter receives plain sequences on Python 3
    x = list(data_dict.keys())
    y = list(data_dict.values())
    ax.scatter(x, y, color=color, s=90, marker='^')
plt.ylabel("Errors" , fontsize=18, color="Green")
plt.xlabel("Occured on",fontsize=18, color="Green")
plt.title("DDN23b", fontsize=25, color="Blue")
ax.get_xaxis().get_major_formatter().set_useOffset(False)
plt.xticks(rotation='vertical')
plt.xticks(plt.xticks()[0],
[time.strftime("%m/%d/%y, %H:%M:%S", time.localtime(item))
for item in plt.xticks()[0]*3600])
plt.legend(d.keys())
plt.subplots_adjust(bottom=.24,right=.98,left=0.03,top=.89)
plt.grid()
plt.show()
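For reference, the original error can be reproduced without matplotlib: list("rgbcmyk") holds seven colors, while d in the question has eight keys, so the eighth call to pop() finds the list empty. The snippet below is a small standard-library sketch of that failure; `n_series` stands in for len(d).

```python
colors = list("rgbcmyk")   # seven single-letter color codes
n_series = 8               # number of keys in d in the question

message = None
try:
    for _ in range(n_series):
        colors.pop()       # the eighth call has nothing left to pop
except IndexError as exc:
    message = str(exc)

print(message)  # pop from empty list
```

Sampling the colormap at len(d) points, as above, avoids this because the number of colors always matches the number of keys.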