Decision trees graph not working python 3.6 not saving - python

I am trying to print a decision tree in Python, but I am getting this error message:
InvocationException: GraphViz's executables not found
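import graphviz
from io import StringIO
import pydotplus
from sklearn.tree import DecisionTreeClassifier, export_graphviz
# X_train and y_train are assumed to be an existing pandas DataFrame and label vector
tree = DecisionTreeClassifier(criterion='entropy', max_depth=18, random_state=0)
tree.fit(X_train, y_train)
# Export the fitted tree as DOT text into an in-memory buffer
dot_data = StringIO()
export_graphviz(tree, out_file=dot_data, filled=True, rounded=True,
                feature_names=X_train.columns.values.tolist(),
                class_names=['0', '1'], special_characters=True)
# Rendering the PNG needs the Graphviz executables to be on PATH
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_png("C:/Temp/Tree.png")
print('Visible tree plot saved as png.')
graph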

You need to add the Graphviz executables to PATH. Find the Graphviz directory in your own installation; it will look something like this:
C:\Users\Env\Library\bin\graphviz
Then add that directory to PATH.
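If you would rather not change the system settings, the same effect can be had for the current Python process only. The following is a minimal sketch, not a definitive fix: `ensure_on_path` is a hypothetical helper name, and the directory is the example path from above, which you would replace with your own.

```python
import os
import shutil


def ensure_on_path(directory, executable="dot"):
    """Append `directory` to PATH if `executable` cannot be found yet.

    Returns the resolved location of the executable, or None if it is
    still missing after the directory has been added.
    """
    if shutil.which(executable) is None:
        os.environ["PATH"] = os.environ.get("PATH", "") + os.pathsep + directory
    return shutil.which(executable)


# Example location only; substitute the directory from your own installation.
graphviz_bin = r"C:\Users\Env\Library\bin\graphviz"
print(ensure_on_path(graphviz_bin))
```

If this prints None, the directory does not contain the dot executable and the path needs to be corrected. The change lasts only for the running interpreter, so it has to be executed before the call that renders the graph.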

Related

Generating a graph with greek or arabic symbols

I am using a combination of networkx, pygraphviz and graphviz to create and visualise graphs in Python.
However, I keep encountering UTF-8 encoding errors when writing a dot file from a networkx graph that has Greek letters as nodes.
I would like to be able to create dot graphs with nodes such as these: Κύριος, Θεός, Πᾶσα, Μέγας, Νέμεσις but am unable to do so.
Are there any encoding tricks I need to know about?
The example you posted works fine for me (using ipython with python 3.7) when using the correct dot file path.
import networkx as nx
from networkx.drawing.nx_pydot import write_dot
import graphviz as grv
n = "Νέμεσις"
G = nx.Graph()
G.add_node(n)
write_dot(G, 'test.dot')
grv.render('neato', 'svg', 'test.dot')
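If the errors persist on your machine, one thing worth ruling out is the encoding used when the dot file is written, since Graphviz reads its input as UTF-8 by default. The sketch below sidesteps networkx entirely and writes the DOT text with an explicit encoding, using only the standard library. `write_dot_utf8` and the output file name are hypothetical names chosen for illustration.

```python
import os
import tempfile


def write_dot_utf8(node_names, path):
    """Write a minimal undirected DOT graph with one quoted node per name."""
    lines = ["graph G {"]
    for name in node_names:
        escaped = name.replace('"', '\\"')  # escape double quotes inside the label
        lines.append(f'    "{escaped}";')
    lines.append("}")
    # Pass the encoding explicitly instead of relying on the platform default
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")


nodes = ["Κύριος", "Θεός", "Πᾶσα", "Μέγας", "Νέμεσις"]
dot_path = os.path.join(tempfile.gettempdir(), "greek_nodes.dot")
write_dot_utf8(nodes, dot_path)

with open(dot_path, encoding="utf-8") as handle:
    print(handle.read())
```

If a file written this way renders correctly, the problem is more likely in how the original file was opened or encoded than in Graphviz itself.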

Python Databricks cannot visualise dtreeviz decision tree

I need to visualize a decision tree in dtreeviz in Databricks.
The code seems to be working fine.
However, instead of showing the decision tree, it only outputs the following:
Out[23]: <dtreeviz.trees.DTreeViz at 0x7f5b27a91160>
Running the following code:
import pandas as pd
from sklearn import preprocessing, tree
from dtreeviz.trees import dtreeviz
Things = {'Feature01': [3,4,5,0],
          'Feature02': [4,5,6,0],
          'Feature03': [1,2,3,8],
          'Target01': ['Red','Blue','Teal','Red']}
df = pd.DataFrame(Things,
                  columns=['Feature01', 'Feature02',
                           'Feature03', 'Target01'])
# Encode the string labels as integers for the classifier
label_encoder = preprocessing.LabelEncoder()
label_encoder.fit(df.Target01)
df['target'] = label_encoder.transform(df.Target01)
# Fit on the three feature columns
classifier = tree.DecisionTreeClassifier()
classifier.fit(df.iloc[:,:3], df.target)
dtreeviz(classifier,
         df.iloc[:,:3],
         df.target,
         target_name='toy',
         feature_names=df.columns[0:3],
         class_names=list(label_encoder.classes_)
         )
If you look at the dtreeviz documentation, you'll see that the dtreeviz function only creates an object; you then need to call a method such as .view() to show it. On Databricks, .view() won't work, but you can use the .svg() method to generate the output as SVG and then pass it to the displayHTML function. The following code:
viz = dtreeviz(classifier,
...)
displayHTML(viz.svg())
will give you the desired output.
P.S. You need the dot command-line tool to generate the output. It can be installed by running the following in a notebook cell:
%sh apt-get install -y graphviz

Cannot set graphviz output to pdf

I'm trying to use Graphviz for the decision tree classifier; the code is below.
I expect a diagram as output, but the actual output is this message:
warning, language pdf not recognized, use one of:
dot canon plain plain-ext
dot: option -O unrecognized
warning, language svg not recognized, use one of:
dot canon plain plain-ext
Any help is appreciated
import graphviz
from sklearn import tree
# dtc is assumed to be an already fitted DecisionTreeClassifier
dot_data = tree.export_graphviz(dtc, out_file=None)
graph = graphviz.Source(dot_data)
graph.render("data")
graph

Using graphviz to plot decision tree in python

I am following the answer presented to a previous post: Is it possible to print the decision tree in scikit-learn?
from sklearn.datasets import load_iris
from sklearn import tree
from sklearn.externals.six import StringIO
import pydot
clf = tree.DecisionTreeClassifier()
iris = load_iris()
clf = clf.fit(iris.data, iris.target)
tree.export_graphviz(clf, out_file='tree.dot')
dot_data = StringIO()
tree.export_graphviz(clf, out_file=dot_data)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("iris.pdf")
Unfortunately, I cannot figure out the following error:
'list' object has no attribute 'write_pdf'
Does anyone know a way around this as the structure of the generated tree.dot file is a list?
Update
I have attempted using the web application http://webgraphviz.com/. This works, however, the decision tree conditions, together with the classes are not displayed. Is there any way to include these in the tree.dot file?
It looks like the value you collect in graph is of type list.
graph = pydot.graph_from_dot_data(dot_data.getvalue())
type(graph)
<type 'list'>
We are only interested in the first element of the list.
So you can do this in one of the following two ways:
1) Change the line where you assign the result to graph to
(graph, ) = pydot.graph_from_dot_data(dot_data.getvalue())
2) Or keep the entire list in graph and use only its first element to write the PDF:
graph[0].write_pdf("iris.pdf")
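The two options differ slightly in how they behave when the list does not hold exactly one element. The sketch below illustrates this with a hypothetical stand-in function, `load_graphs`, rather than with pydot itself, so it runs without Graphviz installed.

```python
def load_graphs():
    """Hypothetical stand-in for a parser that returns a list holding one graph."""
    return ["graph-object"]


# Option 1: tuple unpacking. Raises ValueError unless the list has exactly one element.
(graph,) = load_graphs()

# Option 2: indexing. Takes the first element and ignores any others.
first = load_graphs()[0]

print(graph == first)
```

Unpacking is the stricter choice: it fails loudly if the parser ever returns several graphs, whereas indexing quietly discards the extras.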
Here is what I get as output of iris.pdf
Update
To get around the path error
Exception: "dot.exe" not found in path.
install Graphviz. Then either add the following to your code:
import os
os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin/'
Or simply add the following to your Windows PATH in the Control Panel:
C:\Program Files (x86)\Graphviz2.38\bin
According to the Graphviz documentation, the installer does not add this directory to the Windows PATH.

Python, PyDot and DecisionTree

I'm trying to visualize my DecisionTree, but I'm getting an error.
The code is:
X = [i[1:] for i in dataset]#attribute
y = [i[0] for i in dataset]
clf = tree.DecisionTreeClassifier()
dot_data = StringIO()
tree.export_graphviz(clf.fit(train_X, train_y), out_file=dot_data)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("tree.pdf")
And the error is
Traceback (most recent call last):
if data.startswith(codecs.BOM_UTF8):
TypeError: startswith first arg must be str or a tuple of str, not bytes
Can anyone explain to me what the problem is? Thanks a lot!
If you are using Python 3, just use pydotplus instead of pydot. It also installs easily with pip.
import pydotplus
<your code>
dot_data = StringIO()
tree.export_graphviz(clf, out_file=dot_data)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("iris.pdf")
I had the same exact problem and just spent a couple hours trying to figure this out. I can't guarantee what I share here will work for others but it may be worth a shot.
1) I tried installing the official pydot packages, but I have Python 3 and they simply did not work. After finding a note in a thread on one of the many websites I scoured, I ended up installing this forked repository of pydot.
2) I went to graphviz.org and installed their software on my Windows 7 machine. If you don't have Windows, look under their Download section for your system.
3) After a successful install, I opened Environment Variables (Control Panel\All Control Panel Items\System\Advanced system settings > click the Environment Variables button), found the variable path under System variables, clicked Edit... and added ;C:\Program Files (x86)\Graphviz2.38\bin to the end of the Variable value: field.
4) To confirm that I could now use dot commands in the Command Line (Windows Command Processor), I typed dot -V, which returned dot - graphviz version 2.38.0 (20140413.2041).
In the below code, keep in mind that I'm reading a dataframe from my clipboard. You might be reading it from file or whathaveyou.
In IPython Notebook:
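import pandas as pd
import numpy as np
from sklearn import tree
import pydot
from IPython.display import Image
from io import StringIO  # sklearn.externals.six was removed from newer scikit-learn releases
# Read the data frame from the clipboard; the last column is the target
df = pd.read_clipboard()
X = df[df.columns[:-1]]
y = df[df.columns[-1]]
dtr = tree.DecisionTreeRegressor(max_depth=3)
dtr.fit(X, y)
# Export the fitted tree as DOT text into an in-memory buffer
dot_data = StringIO()
tree.export_graphviz(dtr, out_file=dot_data, feature_names=X.columns)
graphs = pydot.graph_from_dot_data(dot_data.getvalue())
# Some pydot versions return a list of graphs rather than a single graph
graph = graphs[0] if isinstance(graphs, list) else graphs
Image(graph.create_png())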
Alternatively, if you're not using IPython, you can generate your own image from the command line as long as you have graphviz installed (step 2 above). Using my same example code above, you use this line after fitting the model:
tree.export_graphviz(dtr, out_file='treepic.dot', feature_names=X.columns)
Then open a command prompt in the directory that contains treepic.dot and enter this command:
dot -T png treepic.dot -o treepic.png
A .png file should be created with your decision tree.
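The same command can also be issued from Python rather than typed by hand. This is only a sketch under the assumption that dot is already on PATH; `build_dot_command` and `render` are hypothetical helper names, and the argument order simply mirrors the command line shown above.

```python
import shutil
import subprocess


def build_dot_command(dot_file, out_file, fmt="png"):
    """Return the argument list for: dot -T <fmt> <dot_file> -o <out_file>."""
    return ["dot", "-T", fmt, dot_file, "-o", out_file]


def render(dot_file, out_file, fmt="png"):
    """Run dot if it is on PATH; return False when the executable is missing."""
    if shutil.which("dot") is None:
        return False
    # check=True raises CalledProcessError if dot exits with an error
    subprocess.run(build_dot_command(dot_file, out_file, fmt), check=True)
    return True


print(build_dot_command("treepic.dot", "treepic.png"))
```

Calling render("treepic.dot", "treepic.png") after export_graphviz has written the file would then produce the image without leaving the script.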
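The line in question checks whether the stream/file starts with the UTF-8 byte-order mark. Under Python 3, data is a str while codecs.BOM_UTF8 is bytes, so the comparison fails.
Instead of:
if data.startswith(codecs.BOM_UTF8):
use a check in which both sides are str:
if data.startswith(codecs.BOM_UTF8.decode('utf-8')):
You will likely have more success...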
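The type mismatch behind this TypeError can be reproduced in isolation with the standard library alone. The sketch below assumes data is a str, as it is when it comes from StringIO.getvalue() under Python 3. Under that assumption, a membership test with `in` raises the same kind of TypeError as startswith, while a decoded BOM compares cleanly.

```python
import codecs

data = "digraph Tree { }"  # str, as returned by StringIO.getvalue() in Python 3

# str.startswith rejects a bytes argument
try:
    data.startswith(codecs.BOM_UTF8)
    startswith_failed = False
except TypeError:
    startswith_failed = True

# `bytes in str` is rejected as well
try:
    codecs.BOM_UTF8 in data
    in_failed = False
except TypeError:
    in_failed = True

# Decoding the BOM yields the str "\ufeff", which can be compared safely
bom = codecs.BOM_UTF8.decode("utf-8")
has_bom = data.startswith(bom)

print(startswith_failed, in_failed, has_bom)
```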
