How to create box plot from pandas object using matplotlib? - python

I read data into a pandas object and then want to create a box plot using matplotlib (not pandas.boxplot()). This is just for learning purposes. Here is my code, in which the line using myData['MyColumn'] fails.
import matplotlib.pyplot as plt
import pandas as pd
myData = pd.read_csv('data/myData.csv')
plt.boxplot(myData['MyColumn'])
plt.show()

Your code works fine with fake data. Check the type of the data you're trying to plot.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
myData = pd.DataFrame(np.random.rand(10, 2), columns=['MyColumn', 'blah'])
plt.boxplot(myData['MyColumn'])
plt.show()
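As a rough sketch of what "check the type" could look like: the data below is made up, and it assumes the failure comes from a column that was read as text because of one stray entry. The real CSV may have a different problem.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: a numeric column read as text because of one bad entry
myData = pd.DataFrame({'MyColumn': ['1.5', '2.0', 'n/a', '3.5', '4.0']})
print(myData['MyColumn'].dtype)  # a non-numeric dtype here points at the problem

# Coerce to numbers; entries that cannot be parsed become NaN and are dropped
values = pd.to_numeric(myData['MyColumn'], errors='coerce').dropna()

fig, ax = plt.subplots()
# Passing a plain NumPy array sidesteps any dependence on the Series index
result = ax.boxplot(values.to_numpy())
```

Remove the matplotlib.use("Agg") line and call plt.show() to see the figure interactively.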

Related

Cannot create a boxplot from a CSV file in Python with pandas and matplotlib

I cannot create a full boxplot from my CSV file with pandas and matplotlib in Python.
This is what I get:
My CSV file:
id,type_EN,nb_EN
1,VP,15
2,VN,600
3,FP,78
4,FN,17
5,NOK,974
My code:
# import the required library
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# load the dataset
df = pd.read_csv("boxplot.csv")
# display 5 rows of dataset
# df.head()
# print the boxplot
df.boxplot(by="type_EN", column=["nb_EN"], grid=False)
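One possible explanation, offered as a guess since the resulting image is not shown: every type_EN value appears in exactly one row, so grouping by type_EN gives each box a single data point and it collapses to a flat line. The sketch below uses the CSV rows from the question and draws one box over the whole nb_EN column instead. The ax and csv_text names are only for illustration.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# The rows from the question, inlined so the example is self-contained
csv_text = """id,type_EN,nb_EN
1,VP,15
2,VN,600
3,FP,78
4,FN,17
5,NOK,974
"""
df = pd.read_csv(io.StringIO(csv_text))

# Each group holds exactly one value, so a per-group box has no spread
counts = df.groupby("type_EN")["nb_EN"].count()
print(counts)

# A single box over all five counts does have quartiles and whiskers
fig, ax = plt.subplots()
ax.boxplot(df["nb_EN"].to_numpy())
ax.set_xticks([1])
ax.set_xticklabels(["nb_EN"])
```

If the goal is to compare the five counts rather than show a distribution, a bar chart such as df.plot.bar(x="type_EN", y="nb_EN") may be a better fit.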

How to change or customize the colors from pandas?

Hi, I'm just starting to use pandas in Python to graph some data instead of Excel.
I want to customize the colors as well as the opacity of some of the data, because it always falls back to the default color list.
Here's my code:
from pandas import DataFrame
import matplotlib.pyplot as plt
import numpy as np
x=np.array([[4,8,5,7,6],[2,3,4,2,6],[4,7,4,7,8],[2,6,4,8,6],[2,4,3,3,2]])
df=DataFrame(x, columns=['a','b','c','d','e'], index=[2,4,6,8,10])
df.plot(kind='bar')
plt.show()
You can call df.plot.bar directly and pass a dictionary of column name to color mappings to the color parameter.
from pandas import DataFrame
import matplotlib.pyplot as plt
import numpy as np
x=np.array([[4,8,5,7,6],[2,3,4,2,6],[4,7,4,7,8],[2,6,4,8,6],[2,4,3,3,2]])
df=DataFrame(x, columns=['a','b','c','d','e'], index=[2,4,6,8,10])
df.plot.bar(color={'a':'gold','b':'silver','c':'green','d':'purple','e':'blue'})
plt.show()
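For the opacity part of the question, a minimal sketch: extra keyword arguments such as alpha are passed through to matplotlib, so the same call can take both the color mapping and a transparency value. The 0.5 here is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
from pandas import DataFrame

x = np.array([[4,8,5,7,6],[2,3,4,2,6],[4,7,4,7,8],[2,6,4,8,6],[2,4,3,3,2]])
df = DataFrame(x, columns=['a','b','c','d','e'], index=[2,4,6,8,10])

colors = {'a':'gold','b':'silver','c':'green','d':'purple','e':'blue'}
# alpha ranges from 0 (fully transparent) to 1 (opaque) and applies to every bar
ax = df.plot.bar(color=colors, alpha=0.5)
```

Remove the matplotlib.use("Agg") line and call plt.show() to display the chart.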

Convert and plot date and time with pandas or numpy

Trying to plot date and time using pandas. 'dt' and 'quality'
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
%matplotlib inline
data = pd.read_csv('pontina_FCD.csv')
I've tried lots of options for converting the date, but I'm still facing the error.
Pandas has methods that support plotting.
Can you try the following:
data.index = pd.to_datetime(data['dt'], format='%d/%m/%Y %H:%M')
data['quality'].plot()
plt.show()
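Since pontina_FCD.csv is not shown, here is a small stand-in to illustrate the conversion. The three rows are invented, and the day-first format string is an assumption carried over from the snippet above; adjust it to match the actual file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows standing in for the real CSV (day/month/year hour:minute)
data = pd.DataFrame({
    'dt': ['01/02/2019 08:00', '01/02/2019 08:15', '01/02/2019 08:30'],
    'quality': [3, 5, 4],
})

# Parse the strings with an explicit format and use the result as the index
data.index = pd.to_datetime(data['dt'], format='%d/%m/%Y %H:%M')

# With a DatetimeIndex, the x axis is drawn as dates and times
ax = data['quality'].plot()
```

If the format string does not match the text in the column, pd.to_datetime raises a ValueError, which is a useful hint about what the data really looks like.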

How do I make one line in this graph a different color from the rest?

I have a graph, and I would like to make one of my lines a different color.
I tried the matplotlib recommendation, which just made me print two graphs.
import numpy as np
import pandas as pd
import seaborn as sns
data = pd.read_csv("C:\\Users\\Nathan\\Downloads\\markouts_changed_maskedNEW.csv")
data.columns = ["Yeet","Yeet1","Yeet 2","Yeet 3","Yeet 4","Yeet 7","Exchange 5","Yeet Average","Intelligent Yeet"]
mpg = data[data.columns]
mpg.plot(color='green', linewidth=2.5)
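One way this might be done, sketched with random stand-in data and only three of the column names: draw the ordinary lines first, then draw the highlighted column on the same Axes by passing ax=. Reusing one Axes is what avoids the second graph. The choice of "Intelligent Yeet" as the highlighted column and red as its color are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV, with a subset of the column names
data = pd.DataFrame(np.random.rand(20, 3).cumsum(axis=0),
                    columns=["Yeet", "Yeet1", "Intelligent Yeet"])
highlight = "Intelligent Yeet"

fig, ax = plt.subplots()
# All other columns share one color
data.drop(columns=[highlight]).plot(ax=ax, color='green', linewidth=2.5)
# The highlighted column goes on the same Axes in a different color
data[highlight].plot(ax=ax, color='red', linewidth=2.5, legend=True)
```

Remove the matplotlib.use("Agg") line and call plt.show() to display the result.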

How to change the space between histograms in pandas

I'm currently using df.hist(alpha = .5), but all of the subplots are too close to each other, like this:
Histograms
What is the best way to change the space between them?
Or is it better to plot each one in a separate .png file?
One simple way is to manipulate figsize and add pyplot.tight_layout. Below is the example.
Without adjustment:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(6400).reshape((100, 64)),
                  columns=['col_{}'.format(i) for i in range(64)])
df.hist(alpha=0.5)
plt.show()
This produces the crowded layout you showed.
In contrast, if you add figsize (with an arbitrary size) and pyplot.tight_layout as below:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(6400).reshape((100, 64)),
                  columns=['col_{}'.format(i) for i in range(64)])
df.hist(alpha=0.5, figsize=(20, 10))
plt.tight_layout()
plt.show()
In this case you will get a more evenly spaced view.
Hope this helps.
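If you want to set the gaps explicitly rather than let tight_layout choose them, pyplot.subplots_adjust is another option. This is only a sketch: the 9-column frame and the hspace and wspace values are arbitrary and would need tuning for 64 subplots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Smaller hypothetical frame so the grid is easy to inspect
df = pd.DataFrame(np.random.randn(100, 9),
                  columns=['col_{}'.format(i) for i in range(9)])

axes = df.hist(alpha=0.5, figsize=(12, 8))
# hspace and wspace are fractions of the average subplot height and width
plt.subplots_adjust(hspace=0.6, wspace=0.4)
fig = plt.gcf()
```

Saving with fig.savefig('hists.png') writes the whole grid to one file; the file name is just an example.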
