Customizable heat map for strings - python
I could not find a satisfactory answer to my question. I have a pandas DataFrame in which every column holds strings. It looks like this:
Own AUSTRIA. Own BELGIUM.
"-1.3" "-0.34"
"-0.43" "-1.89**"
"-1.2**" "-4.5"
"-1.9" "-2.3"
"-2**" "-6.1**"
"-.7" "-0.3"
"-0.06" "-7.2**"
... ...
"-1.1**" "-10.34"
My goal is to produce a heatmap where cells containing the "**" characters are coloured red and the others blue (other colours are fine). I know that heatmaps are based on the input values, but if I try to rescale the values marked with "**", either other values get caught as well, or I have to set them so high (or low) that the heatmap can tell which cells need to be coloured.
Thank you,
Federico
In a 'heatmap dataframe', use large values for red, low values for blue.
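heatmap = df.copy()
# apply() passes whole columns, so test each cell through the .str accessor
heatmap = heatmap.apply(lambda col: col.str.contains('**', regex=False).astype(int) * 255)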
You can of course use any range you want, like [0, 1], [0, 255], [-1, 1].
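A minimal sketch of how such a mask could be drawn, assuming matplotlib is available. The small sample frame and the two-colour ListedColormap are illustrative choices, not part of the answer above. The original strings are written into the cells so the numbers stay readable.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import ListedColormap

# Hypothetical sample in the same shape as the question's data
df = pd.DataFrame({
    "Own AUSTRIA": ["-1.3", "-0.43", "-1.2**", "-1.9"],
    "Own BELGIUM": ["-0.34", "-1.89**", "-4.5", "-2.3"],
})

# 1 where the cell carries the "**" marker, 0 elsewhere
mask = df.apply(lambda col: col.str.contains("**", regex=False)).astype(int)

fig, ax = plt.subplots()
# Two-colour map: 0 -> blue, 1 -> red
ax.imshow(mask.to_numpy(), cmap=ListedColormap(["tab:blue", "tab:red"]),
          vmin=0, vmax=1, aspect="auto")

# Write the original string into each cell
for i in range(df.shape[0]):
    for j in range(df.shape[1]):
        ax.text(j, i, df.iat[i, j], ha="center", va="center", color="white")

ax.set_xticks(range(df.shape[1]))
ax.set_xticklabels(df.columns)
ax.set_yticks(range(df.shape[0]))
# fig.savefig("heatmap.png") would write the image to disk
```

Because the mask only holds 0 and 1, the choice of range ([0, 1], [0, 255], ...) does not matter as long as vmin and vmax match it.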
Related
Colour bars based on values in pandas dataframe when using plotnine
I am trying to build a waterfall chart using plotnine. I would like to colour the starting and ending bars grey (ideally I want to specify hexadecimal colours), increases green and decreases red. Below is some sample data and my current plot. I am trying to set fill to the pandas column colour, but the bars are all black. I have also tried putting fill in the geom_segment, but this does not work either.
df = pd.DataFrame({})
df['label'] = ('A','B','C','D','E')
df['percentile'] = (10)*5
df['value'] = (100,80,90,110,110)
df['yStart'] = (0,100,80,90,0)
df['barLabel'] = ('100','-20','+10','+20','110')
df['labelPosition'] = ('105','75','95','115','115')
df['colour'] = ('grey','red','green','green','grey')
p = (ggplot(df, aes(x=np.arange(0,5,1), xend=np.arange(0,5,1), y='yStart',yend='value',fill='colour'))
    + theme_light(6)
    + geom_segment(size=10)
    + ylab('value')
    + scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)
EDIT Based on teunbrand's comment of changing fill to color, I have the following. How do I specify the actual colour, preferably in hexadecimal format?
Just to close this question off, credit goes to teunbrand in the comments for the solution. geom_segment() has a colour aesthetic but not a fill aesthetic. Replace fill='colour' with colour='colour'. Plotnine will use default colours for the bars. Use scale_color_identity() if the contents of the DataFrame column are literal colours, or scale_colour_manual() to manually specify a tuple or list of colours. Both forms accept hexadecimal colours.
Visualize multiple 2d Array with same color scheme
I am currently trying to visualize three 2D arrays with the same color. The arrays are 13x13 and contain integers. In an external file I have a color code in hex for each integer. When I now try to visualize the arrays, two out of three arrays look good. All numbers match the color codes and display the arrays correctly. But in the last picture a part of the data is not assigned correctly.
color_names = [c.strip() for c in open(colors).readlines()]
color_dict = {v: k for v, k in enumerate(color_names)}
unique_classes = (np.unique(np.asarray(feature_map))).tolist()
number_classes = len(unique_classes)
color_code = [color_dict.get(cla) for cla in unique_classes]
cmap = plt.colors.ListedColormap(color_code)
norm = plt.colors.BoundaryNorm(unique_classes, cmap.N)
img = pyplot.imshow(feature_map[0],interpolation='nearest', cmap = cmap,norm=norm)
pyplot.colorbar(img,cmap=cmap, norm=norm,boundaries=unique_classes)
pyplot.show()
img1 = pyplot.imshow(feature_map[1],interpolation='nearest', cmap = cmap,norm=norm)
pyplot.show()
img2 = pyplot.imshow(feature_map[2],interpolation='nearest', cmap = cmap,norm=norm)
pyplot.colorbar(img2,cmap=cmap, norm=norm,boundaries=unique_classes)
pyplot.show()
Exactly the same data as on the picture:
feature_map =
[[[25,25,25,25,56,56,2,2,2,2,2,2,25],[25,25,25,25,25,25,59,7,72,72,72,72,2],[25,25,25,25,25,25,59,72,72,72,72,72,2],[25,25,25,24,24,24,62,0,0,0,0,25,25],[25,25,24,24,24,24,24,24,24,24,25,25,25],[26,26,24,24,24,24,24,26,26,26,6,6,6],[26,26,26,24,24,26,26,26,26,26,26,6,6],[26,26,26,0,0,26,26,26,26,26,26,6,6],[28,28,28,28,28,28,28,26,26,26,26,6,6],[28,28,28,28,28,28,28,26,26,26,13,13,6],[28,28,28,28,28,28,28,26,13,13,13,13,13],[28,28,28,28,28,28,28,13,13,13,13,13,13],[28,28,28,28,28,28,28,13,13,13,13,13,13]],[[25,25,25,25,59,56,59,2,0,0,0,0,0],[25,25,25,25,25,59,59,7,72,72,72,72,72],[25,25,25,25,25,25,59,72,72,72,72,72,72],[25,25,25,0,0,25,25,6,0,0,0,72,0],[25,25,0,0,0,0,6,0,0,0,0,25,6],[26,26,26,0,0,0,24,26,0,0,6,6,6],[26,26,26,0,0,0,26,26,26,26,26,6,6],[0,26,0,0,0,0,26,26,0,26,26,6,6],[0,28,28,28,28,28,28,26,0,26,26,6,6],[28,28,28,28,28,28,28,26,0,26,0,0,0],[28,28,28,28,28,28,28,26,13,13,13,13,0],[56,56,28,28,28,28,28,13,13,13,13,13,13]],[[0,28,28,28,28,28,28,13,13,13,13,13,0],[25,25,25,25,59,59,59,4,0,0,0,0,0],[25,25,25,25,59,59,59,7,7,7,72,72,6],[25,25,25,25,25,25,59,7,7,73,73,25,0],[25,25,25,0,0,25,6,7,0,6,6,6,0],[25,0,0,0,6,6,6,6,0,0,6,6,6],[0,0,0,0,0,6,6,6,0,0,6,6,6],[0,0,0,0,0,0,6,6,0,0,6,6,6],[0,0,0,0,0,0,6,0,0,0,6,6,6],[0,0,28,0,28,28,13,0,0,0,6,6,6],[28,28,28,28,28,28,13,13,13,0,13,6,6],[28,28,28,28,28,28,28,13,13,13,13,13,13],[56,28,28,28,28,28,28,13,13,13,13,13,13],[28,28,28,28,28,28,28,13,13,13,13,13,13]]] The color code file is simply a file where each line contains a single hex code such as: #deb887 I have been working on this problem for several hours and can't reproduce the problem at the moment
I have tried to reproduce your results and something caught my attention. If you look closely at the feature_map[2] values, you might see that the pixel you claim is misclassified actually has a different value from the pixels around it, so it has the correct color for its value. I therefore think the cause is your data rather than a misclassification. That would be my answer IF what you mean by "part of the data" is the pixel at position (0,11); otherwise I have gotten it all wrong, and sorry about this answer. NOTE: About colors, I just picked some random colors. Don't worry if they don't match.
How do I create a colormap in VTK?
I created a simple unstructured square in VTK:
x = [0,10,0,10]
y = [0, 0, 10, 10]
z = [0,0,0,0]
data = np.asarray([x,y,z]).T
for i in range(0, len(x)):
    points.InsertPoint(i, data[i])
quad = [2,3,1,0]
ugrid.InsertNextCell(vtk.VTK_QUAD, 4, quad)
ugrid.SetPoints(points)
Say I wanted to create a colormap of the temperature across the square. I know the temperature at the corners:
temp = [0,20,40,60]
How can I color the entire square knowing these values? VTK provides one example to create a colormap in their tutorials (the ColoredElevationMap tutorial), however, I don't completely understand it and I believe there is another simpler way to create a colormap in VTK that I don't know of.
You should add your data array and set it as the scalars (the default array VTK uses, in particular for coloring):
temperature = vtk.vtkIntArray()
temperature.SetName("Temp")
temp = [0, 20, 40, 60]
for t in temp:
    temperature.InsertNextValue(t)
ugrid.GetPointData().SetScalars(temperature)
Then in the rendering part, this array will be used by default for coloring. You still have to update the color range:
mapper.SetScalarRange(ugrid.GetScalarRange())
You can take a look at this example (edit: updated link thanks to #paulo-carvalho)
Adding to Nico Vuaille's answer: the colors of the map can be changed using a lookup table such as
lut = vtk.vtkLookupTable()
lut.SetHueRange(0, 0)
lut.SetSaturationRange(0, 0)
lut.SetValueRange(0.2, 1.0)
lut.Build()
and
mapper.SetLookupTable(lut)
Map boolean values to strings
I am plotting a graph where my x variable is 'Mg' and my y variable is 'Si'. I have a third variable called 'binary'. If binary is equal to 0 or 1, how do I colour the plotted point in red or black respectively? I need to use the functions plt.scatter and colourbar(). I've read about colourbar but it seems to generate a continuous spectrum of colour. I've tried using plt.colors.from_levels_and_colors instead but I'm not really sure how to use it properly.
levels = [0,1]
colors = ['r','b']
cmap, norm = plt.colors.from_levels_and_colors(levels, colors)
plt.scatter(data_train['Mg'], data_train['Si'], c = data_train['binary'])
plt.show()
Also, in the future, instead of asking a question like this in this forum what can I do to solve the problem on my own? I try to read the documentation online first but often find it hard to understand.
np.where makes encoding binary values easy.
np.where([1, 0, 0, 1], 'yes', 'no')
# array(['yes', 'no', 'no', 'yes'], dtype='<U3')
colors = np.where(data_train['binary'], 'black', 'red')
plt.scatter(data_train['Mg'], data_train['Si'], c=colors)
If you're working with a few discrete colors rather than a colormap, you should probably convert c from the binary values to a format matplotlib understands, i.e. one color per point (colors being the list from the question, indexed by the binary value):
point_colors = [colors[binary] for binary in data_train['binary']]
plt.scatter(data_train['Mg'], data_train['Si'], c=point_colors)
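A self-contained sketch of the same idea, with a legend in place of a continuous colour bar. It maps each category to a fixed colour and plots one scatter call per category so the legend can label them. Here data_train is a hypothetical stand-in for the asker's frame and palette is an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for data_train
data_train = pd.DataFrame({
    "Mg": [1.0, 2.0, 3.0, 4.0],
    "Si": [2.5, 1.5, 3.5, 0.5],
    "binary": [0, 1, 1, 0],
})

# 0 -> red, 1 -> black, as asked in the question
palette = {0: "red", 1: "black"}

fig, ax = plt.subplots()
for value, colour in palette.items():
    # One scatter call per category gives each category its own legend entry
    subset = data_train[data_train["binary"] == value]
    ax.scatter(subset["Mg"], subset["Si"], c=colour, label=f"binary = {value}")

ax.set_xlabel("Mg")
ax.set_ylabel("Si")
ax.legend()
```

For two categories a legend tends to read more naturally than a colour bar, which is designed for continuous ranges.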
Map discrete value to color by type and position
I asked how to map discrete values to colors yesterday and got the following useful answer.
Map discrete value to color
I am trying to graph colors based on 4 discrete values: 1, 2, 3, 4. I want to define 1 as black, 2 as red, 3 as yellow and 4 as green. Does anyone know how to do it?
You could try imshow instead, and use a dict to map the colors you want:
colordict = {1:(0,0,0),2:(1,0,0),3:(1,1,0),4:(0,1,0)}
test = ([1,2,2,1,3],[1,1,1,1,4],[2,1,1,2,1])
test_rgb = [[colordict[i] for i in row] for row in test]
plt.imshow(test_rgb, interpolation = 'none')
However, the lists of values have some data attributes that I would like to show in the graph as well. Two things in particular: (1) each list of colors has a unique type, so the graph should show the type for each horizontal color bar; (2) each list of colors has a corresponding list of positions, which give the end position of each color, counted from the previous one (all of them start at 0 and end at 20).
dict = {'Type': ['A', 'B','C'], 'ColorList': [[1,2,2,1,3],[1,1,1,1,4],[2,1,1,2]], 'Position': [[3,6,9,15,20], [2,10,13,16,20], [6, 10, 12, 20]]}
df = pd.DataFrame(dict)
So the graph should look similar, except that the x and y axes change to indicate the positions of the colors and the type of each list of colors. Any help is highly appreciated.
You need the plt.annotate function; e.g., for the first square:
plt.annotate('black', xy=(0, 0), color='white')
Regarding the rest of your code, you'd be better off using a subclass of Enum (requires Python 3.4 or later):
>>> from enum import Enum
>>> class Color(Enum):
...     black = 1
...     red = 2
...     yellow = 3
...     green = 4
Possibly including something along the lines of:
...     def RGB(self):
...         return {1: (0, 0, 0), 2: (1, 0, 0), 3: (1, 1, 0), 4: (0, 1, 0)}[self.value]
Rewriting using a subclass of Enum and a list comprehension should make your code easier to work with.
PS: I think you're overriding the built-in dict with dict = ...
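Put together, the Enum suggestion could look roughly like the sketch below. It uses only the standard library; exposing the RGB lookup as an rgb property is an illustrative choice, and the test data is the tuple from the question.

```python
from enum import Enum


class Color(Enum):
    black = 1
    red = 2
    yellow = 3
    green = 4

    @property
    def rgb(self):
        """RGB tuple with components in the 0-1 range."""
        return {
            1: (0, 0, 0),
            2: (1, 0, 0),
            3: (1, 1, 0),
            4: (0, 1, 0),
        }[self.value]


# Data from the question: each inner list is one horizontal colour bar
test = ([1, 2, 2, 1, 3], [1, 1, 1, 1, 4], [2, 1, 1, 2, 1])

# Color(value) looks a member up by its number; .rgb converts it for plotting
test_rgb = [[Color(value).rgb for value in row] for row in test]
```

The resulting test_rgb can be passed to plt.imshow(test_rgb, interpolation='none') exactly as in the quoted answer, and Color(value).name gives the label text for plt.annotate.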