Python: Iris Data Set, include the species - python

I have the below code and it returns me the min and max values of the chosen column, however, I would also like to include the species that this value relates to. I have also included the column names in the csv file.
import pandas as pd
import numpy as np
import scipy as sp
import matplotlib as mpl
import seaborn as sns
df = pd.read_csv("iris head.csv")
print(min(df['Sepal Length']))
print(max(df['Sepal Length']))

IIUC is as follows:
df.groupby(['class'])['Sepal Length'].agg(['max','min'])

Related

Pandas converting sub-columns into rows

I have added a picture of my dataframe. There are sub-columns names which are 'GBPEUR=X' 'GBPJPY=X' and 'USDMXN=X'. I would like to take these sub-headings and turn them into different rows. Any idea of how to do this as cant find anything else on the internet? Code below:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
%matplotlib inline
df = yf.Tickers('GBPJPY=X GBPEUR=X USDMXN=X')
currencies = df.history(period='max')

How do I find covariance and correlation?

I have 2 data sets saved in the csv file. Column names "avg" and "hu". I want to find the covariance and correlation values ​​of these two data sets. I tried it with some simple codes. But every time I got an error. What am I doing wrong ?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data=pd.read_csv("80hucov.csv")
avg=data["avg"]
hu=data["hu"]
data = np.array(["avg, hu"])
covMatrix = np.cov(data,bias=True)
print (covMatrix)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data=pd.read_csv("80hucov.csv")
data = {'A': ["avg"],
'B': ["hu"],}
df = pd.DataFrame(data,columns=['A','B'])
covMatrix = pd.DataFrame.cov(df)
sn.heatmap(covMatrix, annot=True, fmt='g')
plt.show()
It seems you may need to redefine your definition of the array.
Currently you have:
data = np.array(["avg, hu"])
You can do:
data_array = data[['avg', 'hu']].to_numpy()
I recommend using different names for different objets within your code. In your example you use "data" for both your dataframe and your array.

How to change or customize the colors from pandas?

hi I'm just starting to use pandas on python to graph some data instead of excel,
i want to customize the colors as well as the opacity of some given data because its always going into its default color lists
heres my code :
from pandas import DataFrame
import matplotlib.pyplot as plt
import numpy as np
x=np.array([[4,8,5,7,6],[2,3,4,2,6],[4,7,4,7,8],[2,6,4,8,6],[2,4,3,3,2]])
df=DataFrame(x, columns=['a','b','c','d','e'], index=[2,4,6,8,10])
df.plot(kind='bar')
plt.show()
You can call df.plot.bar directly and pass a dictionary of column name to color mappings to the color parameter.
from pandas import DataFrame
import matplotlib.pyplot as plt
import numpy as np
x=np.array([[4,8,5,7,6],[2,3,4,2,6],[4,7,4,7,8],[2,6,4,8,6],[2,4,3,3,2]])
df=DataFrame(x, columns=['a','b','c','d','e'], index=[2,4,6,8,10])
df.plot.bar(color={'a':'gold','b':'silver','c':'green','d':'purple','e':'blue'})
plt.show()

Taking data from specific columns in a dataset

I need to take data from only 3 columns in my dataset, how do I do this? I am trying to make a correlation graph. This is my code:
import matplotlib.pyplot as plt
import pandas as pd
crimedata = pd.read_csv('MasterFileCSV.csv')
crime_df = pd.DataFrame(crimedata)
plt.matshow(crime_df.corr())
plt.show

How can I show True/False graph on Python

I have a csv file and I want to show this data on grap. I have date,place and status data but I don't need place so I fetch data like this.
And going like this
Here is my code. How can I get a graph with 1-0 values according to date value. Which method should I use ? Thanks
import pandas as pd from pandas
import DataFrame
import datetime
import pandas.io.data
import matplotlib.pyplot as plt from mpl_toolkits.mplot3d
import Axes3D import pylab rows_list=[] df=pd.read_csv('filepath',header=None,parse_dates=True,pr‌​efix='column')
for row in df.iterrows():
if row[1][1]=='Beweging in de living':
if row[1][2]=='OPEN': rows_list.append([row[1][0],'1'])
else: rows_list.append([row[1][0],'0'])
df2 = pd.DataFrame(rows_list)
df3=df2.set_index(0)
print df3 plt.plot(df3)
plt.show()

Categories

Resources