I have a data set with a somewhat strange layout. It looks like this:
It's actually in CSV format, but I showed it above in Excel for convenience.
The x-axis values are in the first column and the y-axis values are in the first row. Each blue cell is a data point, with the value in the cell representing an intensity. So for the top-left cell, the coordinates would be [0.000146, 0, 0]. The one below would be [0.000478, 0, 114]. I'm trying to figure out how to plot this data as a 2-D heatmap where the intensity is represented by the color, but haven't been able to do so yet. Any suggestions for how to go about this? I've looked at some matplotlib and pandas tutorials, but they don't seem to address data formatted this way.
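One possible approach, sketched under assumptions: read the CSV with the first column as the index, then pass the index, the column labels, and the values to `pcolormesh`. The inline `csv_text` below is a hypothetical stand-in for the real file (only the two coordinates quoted in the question are taken from it), and the axis labels are placeholders.

```python
import io

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the CSV file: the first row holds the y values,
# the first column holds the x values, and the rest are intensities.
csv_text = """,0,1,2,3
0.000146,0,5,9,1
0.000478,114,80,3,2
0.000810,60,200,7,4
"""

# index_col=0 turns the first column into the row index (the x values);
# the header row becomes the column labels (the y values).
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

x = df.index.to_numpy(dtype=float)
y = df.columns.astype(float).to_numpy()
z = df.to_numpy(dtype=float)  # shape (len(x), len(y))

fig, ax = plt.subplots()
# pcolormesh expects the color array with shape (len(y), len(x)),
# hence the transpose.
mesh = ax.pcolormesh(x, y, z.T, shading="nearest")
fig.colorbar(mesh, ax=ax, label="intensity")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

For the real data, replace `io.StringIO(csv_text)` with the path to the CSV file, and call `plt.show()` or `fig.savefig(...)` to see the result.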
I posted this question about 3D plots of data frames:
3D plot of 2d Pandas data frame
and a user very helpfully referred me to this:
Plotting Pandas Crosstab Dataframe into 3D bar chart
It was useful and the code worked in principle, but the result looks like a mess (see image below) for several reasons:
I have a huge number of values to plot (470 or so, along the y-axis), so perhaps a bar chart is not the best way (I am going for a histogram kind of look, so I assumed very narrow bars would be suitable)
my counts (z axis) give almost no information, because the differences I need to see are between 100 and the max value
how can I make the 3D plot that shows up interactive (being able to rotate it, etc.)? I have seen it done in blogs/videos, but I'm not sure if it's something in Tools -> Preferences that I can't find
So, re: the second issue, which seemed simple enough: I tried to change the limits of the z-axis as I would for a 2D plot, by adding:
ax.set_zlim([110,150])
just before the axis labels, but obviously this is the wrong way:
So do I have to limit the values in the original data set (i.e. filter out <110), or is there a way to do this from the plot?
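As far as I know, 3D axes in matplotlib do not clip bars to the axis limits, so one workaround is to filter the data and start each bar at the lower limit. A minimal sketch, assuming a small hypothetical `counts` array in place of the real crosstab and taking the 110 and 150 limits from the `set_zlim` call above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical counts standing in for the crosstab (rows x columns).
counts = np.array([[105, 120, 149],
                   [112, 98, 133]])
z_min = 110  # lower limit of interest, taken from the question

# Bar positions on the x/y grid, flattened to 1-D.
xpos, ypos = np.meshgrid(np.arange(counts.shape[1]),
                         np.arange(counts.shape[0]))
xpos, ypos, values = xpos.ravel(), ypos.ravel(), counts.ravel()

# Start each bar at z_min and draw only the part above it; bars whose
# count does not exceed z_min are dropped.
keep = values > z_min
heights = values[keep] - z_min

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# bar3d(x, y, z, dx, dy, dz): z is the base of each bar, dz its height.
ax.bar3d(xpos[keep], ypos[keep], np.full(keep.sum(), z_min),
         0.8, 0.8, heights)
ax.set_zlim(z_min, 150)
```

The bar width of 0.8 is an arbitrary choice; with around 470 values along one axis, a smaller width is probably more suitable.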
I've got a large dataframe and each row has a count of the number of good, okay, bad and other events against it.
I'm trying to replace those 4 columns with a single column that visually represents the same data, i.e. I'd like to replace 4 cells with a single cell containing a stacked horizontal bar.
So I want my table to look like the above rather than like the bottom.
I've been struggling with the Python so far, as the only route I can think of for combining them would be to generate each bar separately in matplotlib, export to JPG and then import into the dataframe.
That feels like it won't be scalable...
Any suggestions?
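One route that avoids image files, assuming the table is going to be displayed as HTML (for example in a notebook or a report): build a small HTML snippet per row and render the frame with escaping turned off. This is only a sketch; the sample data, the colors, and the helper `stacked_bar_html` are all hypothetical.

```python
import pandas as pd

# Hypothetical data; the column names follow the question.
df = pd.DataFrame(
    {"good": [5, 1], "okay": [3, 1], "bad": [2, 6], "other": [0, 2]},
    index=["row_a", "row_b"],
)

# Colors are an arbitrary choice for illustration.
COLORS = {"good": "green", "okay": "gold", "bad": "red", "other": "grey"}


def stacked_bar_html(row, width_px=120):
    """Return one HTML snippet with a segment per category, sized by share."""
    total = row.sum()
    if total == 0:
        return ""
    segments = []
    for name, color in COLORS.items():
        share = 100 * row[name] / total
        segments.append(
            f'<span style="display:inline-block;height:10px;'
            f'width:{share:.1f}%;background:{color}"></span>'
        )
    return f'<div style="width:{width_px}px">{"".join(segments)}</div>'


# One new column replaces the four count columns.
out = pd.DataFrame({"events": df[list(COLORS)].apply(stacked_bar_html, axis=1)})

# max_colwidth=None guards against long cell text being truncated;
# escape=False keeps the HTML in the cells instead of escaping it as text.
with pd.option_context("display.max_colwidth", None):
    html = out.to_html(escape=False)
```

The resulting `html` string can be written to a file or displayed in a notebook. Since each bar is just text, this should scale to many rows better than exporting one image per row.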
I have 2 arrays of shape (1,N) which are right ascension and declination. The scatter plot of that data is
As you can see in the top-left, data was not collected in this region and so is empty.
I would like to form the histogram of this data as a method of investigating data clustering. That empty spot (and many others like it), it seems to me, will cause a problem -- when numpy.histogram2d draws a grid on this data and begins counting data points in the cells, it will see cells that fall on the empty region and determine that there is no data there; hence the cell histogram value is zero. This pulls down the mean of the whole histogram. Having a sensible mean is important because I identify clusters via their standard deviation from the mean.
The problem is, of course, not that those cells are empty, but that there is no data to count; ideally those cells should be ignored.
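A minimal sketch of one way to ignore those cells, assuming that a zero count means "no coverage": mask the zero cells with a NumPy masked array before computing the statistics. The random `ra` and `dec` arrays are hypothetical stand-ins for the real data. Note that this also masks any genuinely empty cell inside the surveyed area, so an explicit coverage mask would be more accurate if one is available.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the right ascension and declination arrays.
ra = rng.uniform(0.0, 10.0, size=5000)
dec = rng.uniform(0.0, 10.0, size=5000)

# Mimic an unobserved corner by discarding points from one region.
observed = ~((ra < 3.0) & (dec > 7.0))
ra, dec = ra[observed], dec[observed]

counts, ra_edges, dec_edges = np.histogram2d(
    ra, dec, bins=10, range=[[0, 10], [0, 10]]
)

# Mask cells with zero counts so they do not enter the statistics.
masked = np.ma.masked_equal(counts, 0)
mean = masked.mean()
std = masked.std()

# Deviation of each unmasked cell from the mean, in units of std.
significance = (masked - mean) / std
```

Here `mean` and `std` are computed over the unmasked cells only, so the empty corner no longer pulls the mean down.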
I am very new to Python, but I am determined to learn how to use it. At the moment, I am working with .dat files that have only a few columns, separated by spaces.
I was wondering if there is any way to make a scatter plot of a specific column against each value's position in that column, counted as a non-negative integer (i.e. 0, 1, 2, 3, 4...)?
Here is an example of the data I am working with:
Here is an example graph of what I would want to be the end result:
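A minimal sketch, assuming the file is purely numeric and whitespace-separated: load it with `np.loadtxt`, pick a column, and plot it against `np.arange` of the same length. The inline `dat_text` and the choice of the second column are hypothetical.

```python
import io

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for a space-separated .dat file.
dat_text = """1.0 0.50 10
2.0 0.75 20
3.0 0.20 30
"""

# loadtxt splits on whitespace by default; pass a filename instead of
# the StringIO object when reading a real file.
data = np.loadtxt(io.StringIO(dat_text))

column = data[:, 1]             # the column to plot (second column here)
index = np.arange(column.size)  # 0, 1, 2, ... one integer per value

fig, ax = plt.subplots()
ax.scatter(index, column)
ax.set_xlabel("index")
ax.set_ylabel("value")
```

Call `plt.show()` afterwards to display the figure.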
I have a specific problem that maybe you can help me with. I currently have 3 arrays of data, and I want to make a 2D histogram of the first two while using the third array as values that get summed up in each bin. I also want to include a color bar that shows the scale of the colors in the histogram.
As a start I looked into using matplotlib.pyplot.hexbin to do this and it seems to work fine but I don't want to have hexagons as the shape of my bins. Is somebody able to point me to some resources on how to do this?
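One way to get rectangular bins is `numpy.histogram2d` with its `weights` argument, which sums the third array in each bin. A minimal sketch with hypothetical data (`x`, `y`, `w` stand in for the three arrays):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: x and y are binned, w is summed within each bin.
x = np.array([0.1, 0.2, 0.8, 0.9, 0.9])
y = np.array([0.1, 0.3, 0.7, 0.8, 0.2])
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# With weights, each bin holds the sum of w for the points that fall in it.
sums, x_edges, y_edges = np.histogram2d(
    x, y, bins=2, range=[[0, 1], [0, 1]], weights=w
)

fig, ax = plt.subplots()
# histogram2d indexes its result as [x_bin, y_bin]; pcolormesh wants
# [y, x], hence the transpose.
mesh = ax.pcolormesh(x_edges, y_edges, sums.T)
fig.colorbar(mesh, ax=ax, label="sum of weights")
```

`matplotlib.pyplot.hist2d` also accepts a `weights` argument and draws the plot in one call, if the intermediate array is not needed.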