Customizable heat map for strings - python

I am asking here because I did not find a satisfactory answer to my question. Specifically, I have a pandas data frame in which every variable holds strings. It looks like this:
Own AUSTRIA. Own BELGIUM.
"-1.3" "-0.34"
"-0.43" "-1.89**"
"-1.2**" "-4.5"
"-1.9" "-2.3"
"-2**" "-6.1**"
"-.7" "-0.3"
"-0.06" "-7.2**"
... ...
"-1.1**" "-10.34"
My goal is to produce a heatmap where entries containing the "**" characters are coloured red and the others blue (other colours are fine). I know that heatmaps are based on the input values, but if I try to rescale the values marked with "**", either other values get picked up as well, or I have to set values so high (or low) that the heatmap can tell which cells need to be coloured.
Thank you,
Federico

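In a 'heatmap dataframe', use a large value for red and a low value for blue.
heatmap = df.copy()
# DataFrame.apply passes whole columns, so test each string through the .str accessor
heatmap = heatmap.apply(lambda col: col.str.contains('**', regex=False)) * 255
You can of course use any range you want, such as [0, 1], [0, 255] or [-1, 1].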
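As a rough end-to-end sketch of the idea above, the flag frame can be drawn with a two-colour map while the original strings are written into the cells. The three-row frame is a shortened copy of the data in the question, and the names `flagged`, `fig` and `ax` are made up for illustration. It assumes matplotlib is available and picks the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import ListedColormap

# Shortened copy of the frame from the question
df = pd.DataFrame({
    "Own AUSTRIA": ["-1.3", "-0.43", "-1.2**"],
    "Own BELGIUM": ["-0.34", "-1.89**", "-4.5"],
})

# 1 where the string carries the "**" marker, 0 elsewhere
flagged = df.apply(lambda col: col.str.contains("**", regex=False)).astype(int)

fig, ax = plt.subplots()
# Two-colour map: 0 -> blue, 1 -> red
ax.imshow(flagged.to_numpy(), cmap=ListedColormap(["tab:blue", "tab:red"]),
          vmin=0, vmax=1, aspect="auto")
ax.set_xticks(range(df.shape[1]))
ax.set_xticklabels(df.columns)
ax.set_yticks(range(df.shape[0]))

# Write the original strings into the cells
for i in range(df.shape[0]):
    for j in range(df.shape[1]):
        ax.text(j, i, df.iat[i, j], ha="center", va="center", color="white")
```

Because the colour depends only on the 0/1 flag, the numeric magnitude of the entries never has to be rescaled.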

Related

Colour bars based on values in pandas dataframe when using plotnine

I am trying to build a waterfall chart using plotnine. I would like to colour the starting and ending bars as grey (ideally I want to specify hexadecimal colours), increases as green and decreases as red.
Below is some sample data and my current plot. I am trying to set fill to the pandas column colour, but the bars are all black. I have also tried putting fill in the geom_segment, but this does not work either.
df = pd.DataFrame({})
df['label'] = ('A','B','C','D','E')
df['percentile'] = (10,)*5
df['value'] = (100,80,90,110,110)
df['yStart'] = (0,100,80,90,0)
df['barLabel'] = ('100','-20','+10','+20','110')
df['labelPosition'] = ('105','75','95','115','115')
df['colour'] = ('grey','red','green','green','grey')
p = (ggplot(df, aes(x=np.arange(0,5,1), xend=np.arange(0,5,1), y='yStart',yend='value',fill='colour'))
+ theme_light(6)
+ geom_segment(size=10)
+ ylab('value')
+ scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)
EDIT
Based on teunbrand's comment of changing fill to color, I have the following. How do I specify the actual colour, preferably in hexadecimal format?
Just to close this question off, credit goes to teunbrand in the comments for the solution.
geom_segment() has a colour aesthetic but not a fill aesthetic. Replace fill='colour' with colour='colour'.
Plotnine will use default colours for the bars. Use scale_color_identity() if the contents of the DataFrame column are literal colours, or scale_colour_manual() to manually specify a tuple or list of colours. Both forms accept hexadecimal colours.

Visualize multiple 2d Array with same color scheme

I am currently trying to visualize three 2D arrays with the same color. The arrays are 13x13 and contain integers. In an external file I have a color code in hex for each integer.
When I now try to visualize the arrays, two out of three arrays look good. All numbers match the color codes and display the arrays correctly. But in the last picture a part of the data is not assigned correctly.
color_names = [c.strip() for c in open(colors).readlines()]
color_dict = {v: k for v, k in enumerate(color_names)}
unique_classes = (np.unique(np.asarray(feature_map))).tolist()
number_classes = len(unique_classes)
color_code = [color_dict.get(cla) for cla in unique_classes]
cmap = plt.colors.ListedColormap(color_code)
norm = plt.colors.BoundaryNorm(unique_classes, cmap.N)
img = pyplot.imshow(feature_map[0],interpolation='nearest',
cmap = cmap,norm=norm)
pyplot.colorbar(img,cmap=cmap,
norm=norm,boundaries=unique_classes)
pyplot.show()
img1 = pyplot.imshow(feature_map[1],interpolation='nearest',
cmap = cmap,norm=norm)
pyplot.show()
img2 = pyplot.imshow(feature_map[2],interpolation='nearest',
cmap = cmap,norm=norm)
pyplot.colorbar(img2,cmap=cmap,
norm=norm,boundaries=unique_classes)
pyplot.show()
Exactly the same data as on the picture:
feature_map = [[[25,25,25,25,56,56,2,2,2,2,2,2,25],[25,25,25,25,25,25,59,7,72,72,72,72,2],[25,25,25,25,25,25,59,72,72,72,72,72,2],[25,25,25,24,24,24,62,0,0,0,0,25,25],[25,25,24,24,24,24,24,24,24,24,25,25,25],[26,26,24,24,24,24,24,26,26,26,6,6,6],[26,26,26,24,24,26,26,26,26,26,26,6,6],[26,26,26,0,0,26,26,26,26,26,26,6,6],[28,28,28,28,28,28,28,26,26,26,26,6,6],[28,28,28,28,28,28,28,26,26,26,13,13,6],[28,28,28,28,28,28,28,26,13,13,13,13,13],[28,28,28,28,28,28,28,13,13,13,13,13,13],[28,28,28,28,28,28,28,13,13,13,13,13,13]],[[25,25,25,25,59,56,59,2,0,0,0,0,0],[25,25,25,25,25,59,59,7,72,72,72,72,72],[25,25,25,25,25,25,59,72,72,72,72,72,72],[25,25,25,0,0,25,25,6,0,0,0,72,0],[25,25,0,0,0,0,6,0,0,0,0,25,6],[26,26,26,0,0,0,24,26,0,0,6,6,6],[26,26,26,0,0,0,26,26,26,26,26,6,6],[0,26,0,0,0,0,26,26,0,26,26,6,6],[0,28,28,28,28,28,28,26,0,26,26,6,6],[28,28,28,28,28,28,28,26,0,26,0,0,0],[28,28,28,28,28,28,28,26,13,13,13,13,0],[56,56,28,28,28,28,28,13,13,13,13,13,13]],[[0,28,28,28,28,28,28,13,13,13,13,13,0],[25,25,25,25,59,59,59,4,0,0,0,0,0],[25,25,25,25,59,59,59,7,7,7,72,72,6],[25,25,25,25,25,25,59,7,7,73,73,25,0],[25,25,25,0,0,25,6,7,0,6,6,6,0],[25,0,0,0,6,6,6,6,0,0,6,6,6],[0,0,0,0,0,6,6,6,0,0,6,6,6],[0,0,0,0,0,0,6,6,0,0,6,6,6],[0,0,0,0,0,0,6,0,0,0,6,6,6],[0,0,28,0,28,28,13,0,0,0,6,6,6],[28,28,28,28,28,28,13,13,13,0,13,6,6],[28,28,28,28,28,28,28,13,13,13,13,13,13],[56,28,28,28,28,28,28,13,13,13,13,13,13],[28,28,28,28,28,28,28,13,13,13,13,13,13]]]
The color code file is simply a file where each line contains a single hex code such as: #deb887
I have been working on this problem for several hours and can't reproduce the problem at the moment
I have tried to reproduce your results and something got my attention.
If you look closely at the feature_map[2] values, you can see that the pixel you describe as misclassified actually has a different value from the pixels around it, so it has the correct colour for its value. I think the cause is your data rather than a misclassification. That is my answer if by "part of the data" you mean the pixel at position (0,11); otherwise I have misunderstood, and I apologise for this answer.
NOTE: About colors, I just picked some random colors. Don't worry if they don't match.
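Independently of the data, one thing that may be worth checking is the norm: a BoundaryNorm built from N boundaries defines only N-1 bins, so passing the class values themselves as boundaries does not give one bin per class. A possible sketch that keeps the value-to-colour mapping identical across arrays is to convert each class value to its index and use one bin per index. The small `color_dict` and `maps` below are made-up stand-ins for the colour file and for feature_map.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

# Hypothetical stand-ins for the colour file and the 13x13 arrays
color_dict = {0: "#000000", 2: "#deb887", 6: "#1f77b4", 13: "#2ca02c"}
maps = [np.array([[0, 2], [6, 13]]), np.array([[13, 13], [0, 6]])]

classes = sorted(color_dict)  # every class that may occur in any array
cmap = ListedColormap([color_dict[c] for c in classes])
# N + 1 edges at -0.5, 0.5, ... give exactly one bin per class index
norm = BoundaryNorm(np.arange(len(classes) + 1) - 0.5, cmap.N)

def to_index(arr):
    """Replace each class value by its position in the sorted class list."""
    return np.searchsorted(classes, arr)

fig, axes = plt.subplots(1, len(maps))
for ax, arr in zip(axes, maps):
    img = ax.imshow(to_index(arr), cmap=cmap, norm=norm, interpolation="nearest")

# One shared colour bar, labelled with the original class values
cbar = fig.colorbar(img, ax=axes, ticks=np.arange(len(classes)))
cbar.ax.set_yticklabels([str(c) for c in classes])
```

Building `classes` from the full colour table rather than from `np.unique` of one array means a class that is absent from one picture cannot shift the colours of the others.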

How do I create a colormap in VTK?

I created a simple unstructured square in VTK:
x = [0,10,0,10]
y = [0, 0, 10, 10]
z = [0,0,0,0]
data = np.asarray([x,y,z]).T
for i in range(0, len(x)):
points.InsertPoint(i, data[i])
quad = [2,3,1,0]
ugrid.InsertNextCell(vtk.VTK_QUAD, 4, quad)
ugrid.SetPoints(points)
Say I wanted to create a colormap of the temperature across the square. I know the temperature at the corners:
temp = [0,20,40,60]
How can I color the entire square knowing these values?
VTK provides one example to create a colormap in their tutorials (the ColoredElevationMap tutorial), however, I don't completely understand it and I believe there is another simpler way to create a colormap in VTK that I don't know of.
You should add your data array and set it as the scalars (= default array to use in VTK, particularly for coloring)
temperature = vtk.vtkIntArray()
temperature.SetName("Temp")
temp = [0, 20, 40, 60]
for t in temp:
temperature.InsertNextValue(t)
ugrid.GetPointData().SetScalars(temperature)
Then in the rendering part, this array will be used by default for coloration. You still have to update the color range:
mapper.SetScalarRange(ugrid.GetScalarRange())
You can take a look at this example (edit: link updated thanks to #paulo-carvalho)
Adding to Nico Vuaille's answer. The color of the map can be changed using a lookup table such as:
lut = vtk.vtkLookupTable()
lut.SetHueRange(0, 0)
lut.SetSaturationRange(0, 0)
lut.SetValueRange(0.2, 1.0)
lut.Build()
and then attaching it to the mapper:
mapper.SetLookupTable(lut)

Map boolean values to strings

I am plotting a graph where my x variable is 'Mg' and my y variable is 'Si'. I have a third variable called 'binary'. If binary is equal to 0 or 1, how do I colour the plotted point in red or black respectively?
I need to use the functions plt.scatter and colourbar(). I've read about colourbar but it seems to generate a continuous spectrum of colour. I've tried using plt.colors.from_levels_and_colors instead but I'm not really sure how to use it properly.
levels = [0,1]
colors = ['r','b']
cmap, norm = plt.colors.from_levels_and_colors(levels, colors)
plt.scatter(data_train['Mg'], data_train['Si'], c = data_train['binary'])
plt.show()
Also, in the future, instead of asking a question like this in this forum what can I do to solve the problem on my own? I try to read the documentation online first but often find it hard to understand.
np.where makes encoding binary values easy.
np.where([1, 0, 0, 1], 'yes', 'no')
# array(['yes', 'no', 'no', 'yes'], dtype='<U3')
colors = np.where(data_train['binary'], 'black', 'red')
plt.scatter(data_train['Mg'], data_train['Si'], c=colors)
If you're working with several discrete colours rather than a colormap, you should probably convert c from the binary values to a matplotlib-friendly format, i.e. a list with one colour per point:
point_colors = [colors[binary] for binary in data_train['binary']]
plt.scatter(data_train['Mg'], data_train['Si'], c=point_colors)
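If a colour bar is required as well, a two-bin discrete colormap is one option. The sketch below uses `matplotlib.colors.from_levels_and_colors` with three level edges so that 0 and 1 each get their own bin. The small `data_train` frame is invented for illustration and only mimics the column names from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import from_levels_and_colors

# Made-up data with the same column names as in the question
data_train = pd.DataFrame({
    "Mg": [0.1, 0.4, 0.7, 0.9],
    "Si": [1.2, 0.8, 0.5, 0.3],
    "binary": [0, 1, 1, 0],
})

# Three edges -> two bins: [-0.5, 0.5) holds 0 (red), [0.5, 1.5) holds 1 (black)
cmap, norm = from_levels_and_colors([-0.5, 0.5, 1.5], ["red", "black"])

fig, ax = plt.subplots()
sc = ax.scatter(data_train["Mg"], data_train["Si"],
                c=data_train["binary"], cmap=cmap, norm=norm)
fig.colorbar(sc, ax=ax, ticks=[0, 1])  # discrete bar with one tick per class
ax.set_xlabel("Mg")
ax.set_ylabel("Si")
```

The number of colours must be one less than the number of level edges, which is why `levels = [0, 1]` with two colours in the question cannot work as written.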

Map discrete value to color by type and position

I asked how to map discrete value to color yesterday and got the following useful answer.
Map discrete value to color
I am trying to graph colors based on 4 discrete values: 1, 2, 3 and 4. I want to define 1 as black, 2 as red, 3 as yellow and 4 as green. Does anyone know how to do it?
You could try imshow instead, and use a dict to map the colors you want:
colordict = {1:(0,0,0),2:(1,0,0),3:(1,1,0),4:(0,1,0)}
test = ([1,2,2,1,3],[1,1,1,1,4],[2,1,1,2,1])
test_rgb = [[colordict[i] for i in row] for row in test]
plt.imshow(test_rgb, interpolation = 'none')
However, the lists of values have some additional attributes that I would like to show in the graph as well. Two things in particular:
Each list of colors has a unique type, so the graph should show the type for each horizontal color bar.
Each list of colors has a corresponding list of positions, which give the end position of each segment measured from the end of the previous one (every list starts at 0 and ends at 20).
dict = {'Type': ['A', 'B','C'], 'ColorList': [[1,2,2,1,3],[1,1,1,1,4],[2,1,1,2]], 'Position': [[3,6,9,15,20], [2,10,13,16,20], [6, 10, 12, 20]]}
df = pd.DataFrame(dict)
So the graph should look similar, except that the x and y axes change to show the positions of the colors and the type of each list of colors.
Any help is highly appreciated
You need the plt.annotate function; eg, for the first square plt.annotate ('black', xy = (0, 0), color = 'white')
Regarding the rest of your code, you'd be better off using a subclass of Enum (requires Python 3.4 or later):
>>> from enum import Enum
>>> class Color(Enum):
...     black = 1
...     red = 2
...     yellow = 3
...     green = 4
Possibly including something along the lines of:
...     def RGB(self):
...         return {1: (0, 0, 0), 2: (1, 0, 0), 3: (1, 1, 0), 4: (0, 1, 0)}[self.value]
Rewriting using a subclass of Enum and a list comprehension should make your code easier to work with.
PS: I think you're overriding the standard function dict with dict = ...
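For the position and type requirements in the question, one possible approach (a sketch, not the only way) is matplotlib's `broken_barh`, which draws a row of rectangles from (start, width) pairs. The helper `segments` and the `color_of` mapping are names introduced here for illustration; the data is copied from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

color_of = {1: "black", 2: "red", 3: "yellow", 4: "green"}
data = {
    "Type": ["A", "B", "C"],
    "ColorList": [[1, 2, 2, 1, 3], [1, 1, 1, 1, 4], [2, 1, 1, 2]],
    "Position": [[3, 6, 9, 15, 20], [2, 10, 13, 16, 20], [6, 10, 12, 20]],
}

def segments(ends):
    """Turn end positions into (start, width) pairs, starting from 0."""
    starts = [0] + list(ends[:-1])
    return [(s, e - s) for s, e in zip(starts, ends)]

fig, ax = plt.subplots()
for row, (codes, ends) in enumerate(zip(data["ColorList"], data["Position"])):
    # One horizontal bar per type, one rectangle per color segment
    ax.broken_barh(segments(ends), (row - 0.4, 0.8),
                   facecolors=[color_of[c] for c in codes])

ax.set_yticks(range(len(data["Type"])))
ax.set_yticklabels(data["Type"])  # the y axis shows the type of each list
ax.set_xlim(0, 20)
ax.set_xlabel("Position")
```

Keeping the data in a variable called `data` rather than `dict` also avoids shadowing the built-in mentioned above.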
