How to plot multiple layers with GeoDataFrames in Python?

Context: I have two data frames that I read with pandas from .csv files. One of them (dfevents) has latitude and longitude fields; the other (dfplacedetails) has multiple points that form a polygon. I'm using the "intersects" method to check when the points of the first data frame intersect the polygons of the other one. That works fine, but when I try to plot both layers together it does not work: they plot separately.
My code is as follows:
# Libraries
from matplotlib import pyplot as plt
import geopandas as gp
import pandas as pd
# Creating data frames
dfevents = pd.read_csv (r'C:\Users\alan_\Desktop\TAT\Inputs\Get Events\Get_Events.csv')
print(dfevents)
dfplacedetails = pd.read_csv (r'C:\Users\alan_\Desktop\TAT\Inputs\Get Place Details\Get_Place_Details.csv')
print(dfplacedetails)
# Make them proper Geometrys
dfevents['point'] = gp.GeoSeries.from_xy(dfevents.longitude, dfevents.latitude)
dfplacedetails['polygon'] = gp.GeoSeries.from_wkt('POLYGON' + dfplacedetails.polygon)
# Make them GeoDataFrames
dfevents = gp.GeoDataFrame(dfevents, geometry='point')
dfplacedetails = gp.GeoDataFrame(dfplacedetails, geometry='polygon')
# Output (It works fine)
dfout = dfevents.intersects(dfplacedetails)
print(dfout)
# Plot
fig, ax =plt.subplots(figsize =(20,10))
dfplacedetails.plot(ax=ax, color='blue')
dfevents.plot(ax=ax, color='red',markersize=10)
ax.set_axis_on()
The result I get when I plot as described in the code above is as follows:
But when I plot the layers separately, both of them plot fine:
Is there any way to plot both of them in the same image?
Thanks for your help!
By the way, I'm using Visual Studio Code.
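When two layers plot correctly on their own but not together, one thing worth checking is whether their extents overlap at all, for example because latitude and longitude ended up in a different order in one of the sources. Whether that is the cause here cannot be told from the post. The sketch below is only a diagnostic: bounds_overlap is a hypothetical helper and the coordinates are made up. It compares two bounding boxes given in (minx, miny, maxx, maxy) order, which is the order returned by the total_bounds attribute of a GeoDataFrame.

```python
def bounds_overlap(a, b):
    """Return True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    return (a_minx <= b_maxx and b_minx <= a_maxx
            and a_miny <= b_maxy and b_miny <= a_maxy)


# Hypothetical extents in (minx, miny, maxx, maxy) order, i.e. longitude first.
points_bounds = (-99.20, 19.30, -99.10, 19.45)
polygon_bounds = (-99.18, 19.35, -99.12, 19.40)
# The same polygon extent with latitude and longitude swapped.
swapped_bounds = (19.35, -99.18, 19.40, -99.12)

print(bounds_overlap(points_bounds, polygon_bounds))  # True
print(bounds_overlap(points_bounds, swapped_bounds))  # False
```

With the data frames from the question, the same check would be bounds_overlap(dfevents.total_bounds, dfplacedetails.total_bounds). If it returns False, the two layers are being drawn far apart on the shared axes.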

Related

How do I plot a graph using a 2D list?

I can't figure out how to plot a graph, as I always get the error message "no numeric data to plot". I have also tried plotting from a csv file, but that has not been successful either.
This is my 2D list:
listofstock.append([1,"Microsoft","Mega",100,188,207])
listofstock.append([2,"Amazon","Mega",5,1700,3003])
listofstock.append([3,"PayPal","Large",80,100,188])
listofstock.append([4,"Apple","Large",100,60,110])
listofstock.append([5,"Fastly","Mid",30,40,76])
listofstock.append([6,"Square","Mid",30,40,178])
You can try this
import pandas as pd
listofstock = []
listofstock.append([1,"Microsoft","Mega",100,188,207])
listofstock.append([2,"Amazon","Mega",5,1700,3003])
listofstock.append([3,"PayPal","Large",80,100,188])
listofstock.append([4,"Apple","Large",100,60,110])
listofstock.append([5,"Fastly","Mid",30,40,76])
listofstock.append([6,"Square","Mid",30,40,178])
# if you are in an IPython notebook
pd.DataFrame.from_records(listofstock).drop(0, axis=1).set_index([1]).plot()
# if you want to save the figure to a file
fig = pd.DataFrame.from_records(listofstock).drop(0, axis=1).set_index([1]).plot().get_figure()
fig.savefig('test.png')
# if you want to open in a new window
fig.show()
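As a variation on the answer above: pandas raises "no numeric data to plot" when none of the columns being plotted has a numeric dtype, so it can help to name the columns and select the numeric ones explicitly. This is a minimal sketch; the column names are assumptions for illustration, since the question does not say what the three numbers mean.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import pandas as pd

listofstock = [
    [1, "Microsoft", "Mega", 100, 188, 207],
    [2, "Amazon", "Mega", 5, 1700, 3003],
    [3, "PayPal", "Large", 80, 100, 188],
    [4, "Apple", "Large", 100, 60, 110],
    [5, "Fastly", "Mid", 30, 40, 76],
    [6, "Square", "Mid", 30, 40, 178],
]

# Column names are assumed for illustration; the question does not name them.
columns = ["id", "name", "cap", "value_a", "value_b", "value_c"]
df = pd.DataFrame(listofstock, columns=columns)

# Keep only the numeric columns, indexed by company name.
numeric = df.set_index("name")[["value_a", "value_b", "value_c"]]

# One group of bars per company, one bar per numeric column.
ax = numeric.plot(kind="bar")
ax.figure.savefig("stocks.png")
```

In a notebook the "Agg" line can be dropped so the plot renders inline.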

Interactive 3D scatter plot from text file created by concatenating multiple files

I have multiple text files with a specific filename format in a directory. I want to concatenate the content of all the files into a single .csv file and make an interactive 3D scatter plot using specific data columns from the final CSV file. For this, I tried to concatenate the files' data into one. But my output has around 5000 entries instead of five hundred (after the first 500 entries, the values repeat themselves). Help me find the error.
[Interactive plot: able to zoom in, zoom out, and rotate the plot using the mouse]
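import fnmatch
import os
import pandas as pd
frames = []
for f_name in os.listdir(os.getcwd()):
    if fnmatch.fnmatch(f_name, 'hypoDD.reloc.*'):
        print(f_name)
        # raw string so the regex backslash is not treated as a string escape
        df = pd.read_csv(f_name, header=None, sep=r"\s+|\t")
        frames.append(df)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
data = pd.concat(frames, ignore_index=True)
#print(data)
data.to_csv('outfile.txt', index=False)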
OR
I want to make a single interactive 3D scatter plot using specific data columns from each file, with each file's data shown in a different scatter color. (I have ~18 different files and I don't even know 18 different colour names!)
Finally, I was able to write the code, although the figure still needs some modifications (setting axis limits, reducing the marker size, coloring the points by file, and making the Z-axis point downward).
Suggestions?
import os
import glob
mypath = os.getcwd()
file_count = len(glob.glob1(mypath,"hypoDD.reloc.*"))
print("Number of clusters is:" ,file_count)
# Get .txt files
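import fnmatch
import pandas as pd
frames = []
for f_name in os.listdir(os.getcwd()):
    if fnmatch.fnmatch(f_name, 'hypoDD.reloc.*'):
        print(f_name)
        # raw string so the regex backslash is not treated as a string escape
        df = pd.read_csv(f_name, header=None, sep=r"\s+|\t")
        frames.append(df)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
data = pd.concat(frames, ignore_index=True)
#print(data)
data.to_csv('outfile.txt', index=False)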
latitude=data.iloc[:,1]
longitude=data.iloc[:,2]
depth=data.iloc[:,3]
scatter_data = pd.concat([longitude, latitude,depth], axis=1)
scatter_data.columns=['lon','lat','depth']
#------------------------------3D scatter--------------------------------
#----setting default renderer------------
import plotly.io as pio
pio.renderers  # lists the available renderers when run interactively
pio.renderers.default = "browser"
#-----------------------------------------
import plotly.express as px
fig = px.scatter_3d(scatter_data,x='lon', y='lat', z='depth')
fig.show()
fig.write_image("fig1.jpg")  # static image export requires the kaleido package
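For the per-file colors, one option is to tag every row with the file it came from while reading, so that no colour names are needed at all. The sketch below is only an illustration under assumptions: load_clusters is a hypothetical helper, the "cluster" column name is made up, and the demo files contain invented values rather than real hypoDD output. The column positions (1 = latitude, 2 = longitude, 3 = depth) follow the code in the question.

```python
import glob
import os
import tempfile

import pandas as pd


def load_clusters(directory, pattern="hypoDD.reloc.*"):
    """Read every matching file once and tag rows with their source file."""
    frames = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        df = pd.read_csv(path, header=None, sep=r"\s+")
        df["cluster"] = os.path.basename(path)  # assumed column name
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


# Demo with two tiny synthetic files (made-up values).
with tempfile.TemporaryDirectory() as tmp:
    demo_files = {
        "hypoDD.reloc.001": ["1 10.0 76.0 5.0", "2 10.1 76.1 6.0"],
        "hypoDD.reloc.002": ["3 10.2 76.2 7.0"],
    }
    for name, rows in demo_files.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("\n".join(rows) + "\n")
    data = load_clusters(tmp)

# Same column positions as in the question: 1 = lat, 2 = lon, 3 = depth.
scatter_data = data[[2, 1, 3, "cluster"]].copy()
scatter_data.columns = ["lon", "lat", "depth", "cluster"]
print(scatter_data)
```

Passing color="cluster" to px.scatter_3d should then give each file its own color automatically. For the other adjustments, fig.update_traces(marker_size=2) and fig.update_scenes(zaxis_autorange="reversed") are worth trying for smaller markers and a downward depth axis.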

How to plot a time series graph in Jupyter?

I have tried to plot the data in order to achieve something like this:
But I could not and I just achieved this graph with plotly:
Here is the small sample of my data
Does anyone know how to achieve that graph?
Thanks in advance
You'll find a lot of good stuff on timeseries on plotly.ly/python. Still, I'd like to share some practical details that I find very useful:
1) Organize your data in a pandas dataframe.
2) Set up a basic plotly structure using fig=go.Figure(go.Scatter()).
3) Make your desired additions to that structure using fig.add_traces(go.Scatter()).
Plot:
Code:
import plotly.graph_objects as go
import pandas as pd
import numpy as np
# random data or other data sources
np.random.seed(123)
observations = 200
timestep = np.arange(0, observations/10, 0.1)
dates = pd.date_range('1/1/2020', periods=observations)
val1 = np.sin(timestep)
val2=val1+np.random.uniform(low=-1, high=1, size=observations)#.tolist()
# organize data in a pandas dataframe
df = pd.DataFrame({'Timestep': timestep, 'Date': dates,
                   'Value_1': val1,
                   'Value_2': val2})
# Main plotly figure structure
fig = go.Figure([go.Scatter(x=df['Date'], y=df['Value_2'],
                            marker_color='black',
                            opacity=0.6,
                            name='Value 2')])
# One of many possible additions
fig.add_traces([go.Scatter(x=df['Date'], y=df['Value_1'],
                           marker_color='blue',
                           name='Value 1')])
# plot figure
fig.show()

Holoviews scatter plot color by categorical data

I've been trying to understand how to accomplish the very simple task of plotting two datasets, each with a different color, but nothing I found online seems to do it. Here is some sample code:
import pandas as pd
import numpy as np
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
ds1x = np.random.randn(1000)
ds1y = np.random.randn(1000)
ds2x = np.random.randn(1000) * 1.5
ds2y = np.random.randn(1000) + 1
ds1 = pd.DataFrame({'dsx' : ds1x, 'dsy' : ds1y})
ds2 = pd.DataFrame({'dsx' : ds2x, 'dsy' : ds2y})
ds1['source'] = ['ds1'] * len(ds1.index)
ds2['source'] = ['ds2'] * len(ds2.index)
ds = pd.concat([ds1, ds2])
The goal is to combine the two datasets in a single frame, with a categorical column keeping track of the source. Then I try plotting a scatter plot.
scatter = hv.Scatter(ds, 'dsx', 'dsy')
scatter
And that works as expected. But I cannot figure out how to color the two datasets differently based on the source column. I tried the following:
scatter = hv.Scatter(ds, 'dsx', 'dsy', color='source')
scatter = hv.Scatter(ds, 'dsx', 'dsy', cmap='source')
Both throw warnings and produce no color. I tried this:
scatter = hv.Scatter(ds, 'dsx', 'dsy')
scatter.opts(color='source')
Which throws an error. I tried converting the data to a Holoviews dataset, with the same kind of result.
Why is something that is supposed to be so simple so obscure?
P.S. Yes, I know I can split the data and overlay two scatter plots, and that will give different colors. But surely there has to be a way to accomplish this based on categorical data.
You can create a scatterplot in Holoviews with different colors per category as follows. They are all elegant one-liners:
1) By simply using .hvplot() on your dataframe to do this for you.
import hvplot
import hvplot.pandas
df.hvplot(kind='scatter', x='col1', y='col2', by='category_col')
# If you are using bokeh as a backend you can also just use 'color' parameter.
# I like this one more because it creates a hv.Scatter() instead of hv.NdOverlay()
# 'category_col' is here just an extra vdim, which is used for colors
df.hvplot(kind='scatter', x='col1', y='col2', color='category_col')
2) By creating an NdOverlay scatter plot as follows:
import holoviews as hv
hv.Dataset(df).to(hv.Scatter, 'col1', 'col2').overlay('category_col')
3) Or doppler's answer slightly adjusted, which sets 'category_col' as an extra vdim and is then used for the colors:
hv.Scatter(
data=df, kdims=['col1'], vdims=['col2', 'category_col'],
).opts(color='category_col', cmap=['blue', 'orange'])
Resulting plot:
You need the following sample data if you want to use my example directly:
import numpy as np
import pandas as pd
# create sample dataframe
df = pd.DataFrame({
'col1': np.random.normal(size=30),
'col2': np.random.normal(size=30),
'category_col': np.random.choice(['category_1', 'category_2'], size=30),
})
As an extra:
I find it interesting that there are basically 2 solutions to the problem.
You can create a hv.Scatter() with the category_col as an extra vdim which provides the colors or alternatively 2 separate scatterplots which are put together by hv.NdOverlay().
In the backend the hv.Scatter() solution will look like this:
:Scatter [col1] (col2,category_col)
And the hv.NdOverlay() backend looks like this:
:NdOverlay [category_col] :Scatter [col1] (col2)
This may help: http://holoviews.org/user_guide/Style_Mapping.html
Concretely, you cannot use a dim transform on a dimension that is not declared, not obscure at all :)
scatter = hv.Scatter(ds, 'dsx', ['dsy', 'source']
).opts(color=hv.dim('source').categorize({'ds1': 'blue', 'ds2': 'orange'}))
should get you there (haven't tested it myself).
Related:
Holoviews color per category
Overlay NdOverlays while keeping color / changing marker

Reordering heatmap from seaborn using column info from additional text file

I wrote a Python script to read in a distance matrix that was provided via a CSV text file. This distance matrix shows the difference between different animal species, and I'm trying to sort them in different ways (diet, family, genus, etc.) using data from another CSV file that has just one row of ordering information. The code used is here:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as mp
dietCols = pd.read_csv("label_diet.txt", header=None)
df = pd.read_csv("distance_matrix.txt", header=None)
ax = sns.heatmap(df)
fig = ax.get_figure()
fig.savefig("fig1.png")
mp.clf()
dfDiet = pd.read_csv("distance_matrix.txt", header=None, names=dietCols)
ax2 = sns.heatmap(dfDiet, linewidths=0)
fig2 = ax2.get_figure()
fig2.savefig("fig2.png")
mp.clf()
When plotting the distance matrix, the original graph looks like this:
However, when the additional naming information is read from the text file, the graph produced only has one column and looks like this:
You can see the matrix data is being used as row labels, and I'm not sure why that would be. Some of the rows provided have no values, so they're listed as "NaN", and I'm not sure if that could be causing a problem. Is there any easy way to order this distance matrix using an external file? Any help would be appreciated!
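One point that may be relevant: the names argument of read_csv only assigns column labels and does not reorder anything, and if it ends up with fewer names than the file has columns, pandas uses the leading columns as the index. That may be what produced the single-column plot, though it cannot be confirmed without the files. A minimal sketch of reordering a distance matrix by an external label row follows, using a made-up 4x4 matrix and made-up diet labels; it assumes one label per matrix row/column, in the same order as the matrix.

```python
import pandas as pd

# Synthetic symmetric distance matrix (invented values).
dist = pd.DataFrame([
    [0.0, 0.9, 0.2, 0.8],
    [0.9, 0.0, 0.7, 0.1],
    [0.2, 0.7, 0.0, 0.6],
    [0.8, 0.1, 0.6, 0.0],
])
# One label per row/column, in the same order as the matrix (assumed layout).
labels = pd.Series(["herbivore", "carnivore", "herbivore", "carnivore"])

# Stable sort keeps the original order inside each label group.
order = labels.sort_values(kind="mergesort").index.to_numpy()

# Apply the same permutation to rows and columns so the matrix stays symmetric.
reordered = dist.iloc[order, order].copy()
reordered.index = labels.iloc[order].to_numpy()
reordered.columns = labels.iloc[order].to_numpy()
print(reordered)
```

With the files from the question, the labels could probably be obtained with pd.read_csv("label_diet.txt", header=None).iloc[0], and reordered passed to sns.heatmap as before.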
