How to replicate an Excel line plot in Python

I want to replicate a line plot from Excel using an x-y data table, but the output from my code looks different from the plot I get in Excel.
I noticed that if I change the chart type from line to scatter in Excel, both plots look the same.
Here is sample code to reproduce the problem:
import matplotlib.pyplot as plt
X=[1,5,7,15,20,25,30]
Y=[12,9,10,8,7,9,6]
plt.plot(X,Y)
How can I replicate an Excel line plot in Python?

A line chart in Excel has a categorical horizontal (X) axis by default (or time if the data is dates). In a category axis, the data points are spread evenly across the X axis and not according to their value.
In an XY Scatter chart, however, both axes are numerical and the data points are plotted according to their value. A scatter chart series can be formatted to show dots, lines and any combination thereof, so it can look like a line chart.
If the Python plot looks the same as the Excel scatter chart, that means matplotlib is using a numeric X axis.
To make the Python plot look like an Excel line chart with a categorical axis, replace the X values in your Python data with the consecutive numbers 1 to 7. That spaces the data points evenly along the X axis. You can then relabel the ticks with the original X values.
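As a minimal sketch of that suggestion, the snippet below plots Y against evenly spaced positions and then labels the ticks with the original X values. The helper name plot_like_excel_line is made up for illustration; it is not part of matplotlib.

```python
import matplotlib.pyplot as plt

X = [1, 5, 7, 15, 20, 25, 30]
Y = [12, 9, 10, 8, 7, 9, 6]


def plot_like_excel_line(x_values, y_values):
    """Plot y against evenly spaced positions, labelling ticks with x.

    Hypothetical helper: mimics a categorical (Excel "line chart") axis.
    """
    # Consecutive positions 1..n, so the points are evenly spaced.
    positions = list(range(1, len(x_values) + 1))
    fig, ax = plt.subplots()
    ax.plot(positions, y_values)
    # Show the original X values as tick labels at those positions.
    ax.set_xticks(positions)
    ax.set_xticklabels([str(x) for x in x_values])
    return fig, ax


fig, ax = plot_like_excel_line(X, Y)
```

Call plt.show() afterwards to display the figure. Recent matplotlib versions also treat a list of strings as categories, so plt.plot([str(x) for x in X], Y) should give a similar evenly spaced result.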

Related

Matplotlib plotting data that doesn't exist

I am trying to plot three lines on one figure. I have data for three years for three sites, and I am simply trying to plot them with the same x-axis and the same y-axis. The first two lines span all three years of data, while the third dataset is usually sparser. Using the object-oriented axes matplotlib interface, when I plot my third set of data, I get points at the end of the graph that are outside the range of that dataset. My third dataset is structured as tuples of dates and values, such as:
data=
[('2019-07-15', 30.6),
('2019-07-16', 20.88),
('2019-07-17', 16.94),
('2019-07-18', 11.99),
('2019-07-19', 13.76),
('2019-07-20', 16.97),
('2019-07-21', 19.9),
('2019-07-22', 25.56),
('2019-07-23', 18.59),
...
('2020-08-11', 8.33),
('2020-08-12', 10.06),
('2020-08-13', 12.21),
('2020-08-15', 6.94),
('2020-08-16', 5.51),
('2020-08-17', 6.98),
('2020-08-18', 6.17)]
The data ends in August 2020, yet the graph includes points at the end of 2020. This happens for all my sites; the first two datasets, knowndf['DATE'] and knowndf['Value'] below, stay constant.
Here is the problematic graph.
And here is what I have for the plotting:
fig, ax=plt.subplots(1,1,figsize=(15,12))
fig.tight_layout(pad=6)
ax.plot(knowndf['DATE'], knowndf['Value1'],'b',alpha=0.7)
ax.plot(knowndf['DATE'], knowndf['Value2'],color='red',alpha=0.7)
ax.plot(*zip(*data), 'g*', markersize=8) #when i plot this set of data i get nonexistent points
ax.tick_params(axis='x', rotation=45) #rotating for aesthetic
ax.set_xticks(ax.get_xticks()[::30]) #only want every 30th tick instead of every daily tick
I've tried ax.twinx(), but that gives me two y-axes, which doesn't help since I want to use the same x-axis and y-axis for all three sites. I've also tried not using the axes approach, but there are axes features that I need for plotting. Please help!

How can I make a line chart with uneven time intervals in a FacetGrid?

I made a line chart with seaborn FacetGrid. The x-axis is the column called "dato", which holds the date each sample was taken.
The time intervals between samples are unequal, so I want the spacing between ticks on the x-axis to reflect that. How can I do that?
I also have a problem getting the dates in the correct order. I tried some code, but it does not work with FacetGrid.
Here is my code and the graph I made:
mer=sns.FacetGrid(df, col='anlegg',hue='Merd',sharey=False,ylim=(0,4000))
sns.set_style('white')
sns.set_context('paper', font_scale=1.2)
mer.map_dataframe(sns.lineplot,x='dato',y='BW',marker='o',err_style='bars')
mer.add_legend()
mer.set_axis_labels('','BW')
mer.set_titles(col_template='anlegg {col_name}')
mer.set_xticklabels(rotation=90)

Is there a way to normalize the scales of axes in subplots in the cufflinks library in Python?

I have used cufflinks to plot a few subplots, and their scale ranges are different. I want the same scales so I can compare the subplots. Is there a way to customize the axes of each subplot in cufflinks?
This link shows a picture of the subplots; the subplots in row 1, column 1 and column 2 have different ranges on their y-axes. Is there a way to change the range of the graph DRV.FV to -5 to 10?
MCVE:
df=cf.datagen.lines(4)
df.iplot(subplots=True, subplot_titles=True, legend=False)
I took this from the official plotly website.
Don't be alarmed if you run this and get different subplots: the generated data is random, so the plot changes every time. The question here is just how to get the same scale ranges on the subplots.

Turning matplotlib grid of shaded values into a series of bar charts, one per row?

Using matplotlib, I can create figures that look like this:
Here, each row consists of a series of numbers from 0 to 0.6. The left hand axis text indicates the maximum value in each row. The bottom axis text represents the column indices.
The code for the actual grid essentially involves this line:
im = ax[r,c].imshow(info_to_use, vmin=0, vmax=0.6, cmap='gray')
where ax[r,c] is the current subplot axes at row r and column c, and info_to_use is a numpy array of shape (num_rows, num_cols) and has values between 0 and 0.6.
I am wondering if there is a way to convert the code above so that it instead displays bar charts, one per row? Something like this hand-drawn figure:
(The number of columns is not the same in my hand-drawn figure compared to the earlier one.) I know this would result in a very hard-to-read plot if it were embedded into a plot like the first one here. I would have this for a plot with fewer rows, which would make the bars easier to read.
The references that helped me make the first plot above were mostly from:
Python - Plotting colored grid based on values
custom matplotlib plot : chess board like table with colored cells
https://matplotlib.org/3.1.1/gallery/subplots_axes_and_figures/colorbar_placement.html#sphx-glr-gallery-subplots-axes-and-figures-colorbar-placement-py
https://matplotlib.org/3.1.1/gallery/images_contours_and_fields/image_annotated_heatmap.html#sphx-glr-gallery-images-contours-and-fields-image-annotated-heatmap-py
But I'm not sure how to make the jump from these to a bar chart in each row. Or at least something that could mirror it, e.g., instead of shading the full cell gray, only shade as much of it based on the percentage of the vmax?
import numpy as np
from matplotlib import pyplot as plt
a = np.random.rand(10,20)*.6
In a loop, call plt.subplot then plt.bar for each row in the 2-d array.
for i, thing in enumerate(a,1):
    plt.subplot(a.shape[0],1,i)
    plt.bar(range(a.shape[1]),thing)
plt.show()
plt.close()
Or, create all the subplots; then in a loop make a bar plot with each Axes.
fig, axes = plt.subplots(a.shape[0],1,sharex=True)
for ax, data in zip(axes, a):
    ax.bar(range(a.shape[1]), data)
plt.show()
plt.close()
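The second approach can be extended to keep two features of the original grid: a common scale for every row (the vmin=0, vmax=0.6 range) and the row maximum shown on the left. This is only a sketch under those assumptions; the function name bars_per_row and its vmax parameter are hypothetical, not from the question.

```python
import numpy as np
import matplotlib.pyplot as plt


def bars_per_row(values, vmax=0.6):
    """Draw one bar chart per row of a 2-D array, all on the same y-scale.

    Hypothetical helper; vmax mirrors the vmax passed to imshow in the question.
    """
    values = np.asarray(values)
    n_rows, n_cols = values.shape
    # squeeze=False keeps axes 2-D even when there is a single row.
    fig, axes = plt.subplots(n_rows, 1, sharex=True, squeeze=False)
    columns = np.arange(n_cols)
    for ax, row in zip(axes[:, 0], values):
        ax.bar(columns, row, color="gray")
        ax.set_ylim(0, vmax)  # same scale in every row
        ax.set_yticks([])     # hide y ticks; the label carries the row maximum
        ax.set_ylabel(f"{row.max():.2f}", rotation=0, ha="right", va="center")
    # Column indices along the bottom axis, as in the original grid.
    axes[-1, 0].set_xticks(columns)
    return fig, axes[:, 0]


rng = np.random.default_rng(0)
a = rng.random((5, 20)) * 0.6
fig, axes = bars_per_row(a)
```

Call plt.show() to display the result. Fixing the y-limits keeps bar heights comparable across rows, much as a shared vmin and vmax keep the gray levels comparable in the imshow version.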

Scatter plots in python from a csv file with string in x-axis

I have a CSV file with strings for the x-axis. I need to make a scatter plot from it using matplotlib and pandas,
but it raises the error "scatter plots require number in x axis".
I tried reading the file into a variable df and making a scatter plot from it, but it isn't able to handle the strings. It's a huge file, so I can't define the string values by hand as
x = ["x1","x2",...etc]
*There are also multiple values for the same x-axis id.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv("scatter2")
ax = plt.gca()
df.plot(kind='scatter',x='Spectral Type',y='Hα EW',ax=ax)
The error message:
...ValueError: scatter requires x column to be numeric
Your x values have to be a plain numeric range (e.g. something like list(range(len(x)))), and then you set the tick labels to the desired strings (something like plt.xticks(x_range_values, x_str_values)).
Have a look at "Python Matplotlib - how to specify values on y axis?" for a bit more information.
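A minimal sketch of that idea follows. It assumes repeated labels should share one x position, so it uses pd.factorize to turn each distinct string into an integer code. The small DataFrame is made-up stand-in data for the CSV; only the column names come from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for pd.read_csv("scatter2"); the values are invented.
df = pd.DataFrame({
    "Spectral Type": ["M0", "M1", "M0", "M2", "M1", "M2"],
    "Hα EW": [1.2, 3.4, 0.8, 5.1, 2.9, 4.4],
})

# Map each distinct string to an integer code (0, 1, 2, ...).
# sort=True orders the codes by sorted label; repeated labels share a code.
codes, labels = pd.factorize(df["Spectral Type"], sort=True)

fig, ax = plt.subplots()
ax.scatter(codes, df["Hα EW"])
# Put the original strings back as tick labels at the integer positions.
ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels)
ax.set_xlabel("Spectral Type")
ax.set_ylabel("Hα EW")
```

Call plt.show() to display it. Because the codes are derived from the column itself, nothing has to be typed out by hand, which matters for a large file.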
