Replace pixel values in a Numpy image - python

I have a zeros image with dimension 720*1280 and I have a list of pixels' coordinates to change:
x = [623, 623, 583, 526, 571, 669, 686, 697, 600, 594, 606, 657, 657, 657, 617, 646, 611, 657, 674, 571, 693, 688, 698, 700, 686, 687, 687, 693, 690, 686, 694]
y = [231, 281, 270, 270, 202, 287, 366, 428, 422, 517, 608, 422, 518, 608, 208, 214, 208, 231, 653, 652, 436, 441, 457, 457, 453, 461, 467, 469, 475, 477, 467]
Here is the scatter plot:
yy= [720 -x for x in y]
plt.scatter(x, yy, s = 25, c = "r")
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(0, 1280)
plt.ylim(0, 720)
plt.show()
Here is the code to generate a binary image by setting the pixel values to 255:
image_zeros = np.zeros((720, 1280), dtype=np.uint8)
for i, j in zip(x, y):
    image_zeros[i, j] = 255
plt.imshow(image_zeros, cmap='gray')
plt.show()
Here is the result. What is the problem?

As Goyo pointed out, the resolution of the image is the problem. The default figure size is 6.4 inches by 4.8 inches, and the default resolution is 100 dpi (at least for the current version of matplotlib). So the default figure is 640 x 480 pixels. The figure includes not only the imshow image, but also the tick marks, tick labels, the x and y axes and a white border. So there are even fewer than 640 x 480 pixels available for the imshow image by default.
Your image_zeros has shape (720, 1280). The array is too large to be fully rendered in an image of 640 x 480 pixels.
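Thus, to generate white dots using imshow, set the figsize and dpi so that the number of pixels available for the imshow image is bigger than (1280, 720). Note also that the array is indexed as image_zeros[y, x] (row first, then column), not image_zeros[x, y] as in the question: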
import numpy as np
import matplotlib.pyplot as plt
x = np.array([623, 623, 583, 526, 571, 669, 686, 697, 600, 594, 606, 657, 657, 657, 617, 646, 611, 657, 674, 571, 693, 688, 698, 700, 686, 687, 687, 693, 690, 686, 694])
y = np.array([231, 281, 270, 270, 202, 287, 366, 428, 422, 517, 608, 422, 518, 608, 208, 214, 208, 231, 653, 652, 436, 441, 457, 457, 453, 461, 467, 469, 475, 477, 467])
image_zeros = np.zeros((720, 1280), dtype=np.uint8)
image_zeros[y, x] = 255
fig, ax = plt.subplots(figsize=(26, 16), dpi=100)
ax.imshow(image_zeros, cmap='gray', origin='lower')
fig.savefig('/tmp/out.png')
Here is a closeup showing some of the white dots:
To make the white dots easier to see, you may wish to use scatter instead of imshow:
import numpy as np
import matplotlib.pyplot as plt
x = np.array([623, 623, 583, 526, 571, 669, 686, 697, 600, 594, 606, 657, 657, 657, 617, 646, 611, 657, 674, 571, 693, 688, 698, 700, 686, 687, 687, 693, 690, 686, 694])
y = np.array([231, 281, 270, 270, 202, 287, 366, 428, 422, 517, 608, 422, 518, 608, 208, 214, 208, 231, 653, 652, 436, 441, 457, 457, 453, 461, 467, 469, 475, 477, 467])
yy = 720 - y
fig, ax = plt.subplots()
ax.patch.set_facecolor('black')
ax.scatter(x, yy, s=25, c='white')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_xlim(0, 1280)
ax.set_ylim(0, 720)
fig.savefig('/tmp/out-scatter.png')
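The pixel budget described in this answer can be checked directly. The following is a minimal sketch, assuming a non-interactive Agg backend and an explicitly set figsize and dpi (so the numbers do not depend on a local matplotlibrc); the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

# Explicit values equal to the defaults quoted in the answer (6.4 x 4.8 inches, 100 dpi).
fig, ax = plt.subplots(figsize=(6.4, 4.8), dpi=100)

# Figure size in pixels = size in inches * dots per inch.
fig_w_px, fig_h_px = fig.get_size_inches() * fig.dpi

# The axes occupy only part of the figure, so they get fewer pixels still.
axes_bbox = ax.get_window_extent()

print(f"figure: {fig_w_px:.0f} x {fig_h_px:.0f} px")
print(f"axes:   {axes_bbox.width:.0f} x {axes_bbox.height:.0f} px")
```

Both axes dimensions come out smaller than the 1280 x 720 array, which is why isolated single pixels can disappear when the image is downsampled for display.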

Related

How to create a histogram from counts with bins spaced every 0.1

I have the following dataframe:
df = {'count1': [2.2336, 2.2454, 2.2538, 2.2716999999999996, 2.2798000000000003, 2.2843, 2.2906, 2.2969, 2.3223000000000003, 2.3282, 2.3356999999999997, 2.3544, 2.3651999999999997, 2.3727, 2.3775, 2.3823000000000003, 2.392, 2.4051, 2.4092, 2.4133, 2.4168000000000003, 2.4175, 2.4209, 2.4392, 2.4476, 2.456, 2.461, 2.4723, 2.4776, 2.4882, 2.4989, 2.5095, 2.5221999999999998, 2.5318, 2.5422, 2.5494, 2.559, 2.5654, 2.5814, 2.5878, 2.6238, 2.6178000000000003, 2.624, 2.6303, 2.6366, 2.6425, 2.6481999999999997, 2.6525, 2.6553, 2.663, 2.6712, 2.6898, 2.7051, 2.7144, 2.727, 2.7416, 2.7472, 2.7512, 2.7557, 2.7574, 2.7594000000000003, 2.7636, 2.7699000000000003, 2.7761, 2.7809, 2.7855, 2.7902, 2.7948000000000004, 2.7995, 2.8043, 2.815, 2.8249, 2.8352, 2.8455, 2.8708, 2.8874, 2.9004000000000003, 2.9301, 2.9399, 2.9513000000000003, 2.9634, 2.9745999999999997, 2.9852, 2.9959000000000002, 3.0037, 3.0093, 3.015, 3.0184, 3.0206, 3.0225, 3.0245, 3.0264, 3.0282, 3.0305999999999997, 3.0331, 3.0334, 3.0361, 3.0388, 3.0418000000000003, 3.0443000000000002, 3.0463, 3.0464, 3.0481, 3.0496999999999996, 3.0514, 3.0530999999999997, 3.0544000000000002, 3.0556, 3.0569, 3.0581, 3.0623, 3.0627, 3.0633000000000004, 3.0638, 3.0643000000000002, 3.0648, 3.0652, 3.0656999999999996, 3.0663, 3.0675, 3.0682, 3.0688, 3.0695, 3.0702, 3.0721, 3.0741, 3.0761, 3.078, 3.08, 3.082, 3.0839000000000003, 3.0859, 3.0879000000000003, 3.0898000000000003, 3.0918, 3.0938000000000003, 3.0994, 3.1050999999999997, 3.1144000000000003, 3.1613, 3.1649000000000003, 3.1752, 3.1869, 3.1899, 3.1925, 3.1976, 3.2001, 3.2051999999999996, 3.2098, 3.2123000000000004],
'count2': [3144, 3944, 7888, 4428, 68874, 5480, 56697, 20560, 8744, 91190, 352, 924, 1308611, 480, 51146, 170373, 58792, 11424, 1288673, 1845105, 401464, 657930, 1361172, 199373, 19753, 39082, 776, 7533, 9289, 36731, 53865, 100140, 59274, 35740, 2648, 144998, 78616, 848241, 34579, 216591, 22512, 4024, 17168, 1552, 13760, 8344, 65589, 43104, 44672, 917115, 16256, 4168, 29679, 22571, 7720, 452, 8836, 6888, 18578, 5148, 9289, 442, 214, 485, 3164, 1101, 1010, 9048, 293, 1628, 960, 517, 2362, 1262, 1524, 1173, 1348, 1288, 25568, 8416, 5792, 4944, 504, 4696, 2336, 458, 453, 1220, 1149, 6688, 6956, 7324, 7100, 7784, 5650, 5076, 5336, 6792, 5212, 4592, 5260, 1279, 654, 842, 990, 782, 1412, 1363, 935, 996, 775, 1471, 1525, 1398, 1097, 1082, 1668, 1007, 497, 598, 645, 698, 541, 504, 549, 540, 1568, 514, 578, 2906, 4360, 3916, 11944, 1434, 1589, 732, 641, 477, 307, 1884, 3232, 2408, 1016, 332, 139, 344, 4784, 1784, 1324, 204]}
df = pd.DataFrame(df)
I want to draw a bar plot from it, with count1 on the x axis and count2 on the y axis, using bins spaced at 0.1 intervals.
I used this:
plt.bar(x=df['count1'], y=df['count2'], width=0.1)
But it raises this error:
TypeError: bar() missing 1 required positional argument: 'height'
I'm trying to replicate an R code:
ggplot(df, aes(x= count1,
y= count2)) +
geom_col() +
ylim(0, 2000000) +
scale_x_binned()
That generates the following graph:
To get a histogram from values and counts, you can use the weights= parameter of plt.hist.
To create bins with a width of 0.1, you can use np.arange(...,..., 0.1).
The rwidth=0.9 parameter makes the bars a bit narrower.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = {'count1': [2.2336, 2.2454, 2.2538, 2.2716999999999996, 2.2798000000000003, 2.2843, 2.2906, 2.2969, 2.3223000000000003, 2.3282, 2.3356999999999997, 2.3544, 2.3651999999999997, 2.3727, 2.3775, 2.3823000000000003, 2.392, 2.4051, 2.4092, 2.4133, 2.4168000000000003, 2.4175, 2.4209, 2.4392, 2.4476, 2.456, 2.461, 2.4723, 2.4776, 2.4882, 2.4989, 2.5095, 2.5221999999999998, 2.5318, 2.5422, 2.5494, 2.559, 2.5654, 2.5814, 2.5878, 2.6238, 2.6178000000000003, 2.624, 2.6303, 2.6366, 2.6425, 2.6481999999999997, 2.6525, 2.6553, 2.663, 2.6712, 2.6898, 2.7051, 2.7144, 2.727, 2.7416, 2.7472, 2.7512, 2.7557, 2.7574, 2.7594000000000003, 2.7636, 2.7699000000000003, 2.7761, 2.7809, 2.7855, 2.7902, 2.7948000000000004, 2.7995, 2.8043, 2.815, 2.8249, 2.8352, 2.8455, 2.8708, 2.8874, 2.9004000000000003, 2.9301, 2.9399, 2.9513000000000003, 2.9634, 2.9745999999999997, 2.9852, 2.9959000000000002, 3.0037, 3.0093, 3.015, 3.0184, 3.0206, 3.0225, 3.0245, 3.0264, 3.0282, 3.0305999999999997, 3.0331, 3.0334, 3.0361, 3.0388, 3.0418000000000003, 3.0443000000000002, 3.0463, 3.0464, 3.0481, 3.0496999999999996, 3.0514, 3.0530999999999997, 3.0544000000000002, 3.0556, 3.0569, 3.0581, 3.0623, 3.0627, 3.0633000000000004, 3.0638, 3.0643000000000002, 3.0648, 3.0652, 3.0656999999999996, 3.0663, 3.0675, 3.0682, 3.0688, 3.0695, 3.0702, 3.0721, 3.0741, 3.0761, 3.078, 3.08, 3.082, 3.0839000000000003, 3.0859, 3.0879000000000003, 3.0898000000000003, 3.0918, 3.0938000000000003, 3.0994, 3.1050999999999997, 3.1144000000000003, 3.1613, 3.1649000000000003, 3.1752, 3.1869, 3.1899, 3.1925, 3.1976, 3.2001, 3.2051999999999996, 3.2098, 3.2123000000000004],
'count2': [3144, 3944, 7888, 4428, 68874, 5480, 56697, 20560, 8744, 91190, 352, 924, 1308611, 480, 51146, 170373, 58792, 11424, 1288673, 1845105, 401464, 657930, 1361172, 199373, 19753, 39082, 776, 7533, 9289, 36731, 53865, 100140, 59274, 35740, 2648, 144998, 78616, 848241, 34579, 216591, 22512, 4024, 17168, 1552, 13760, 8344, 65589, 43104, 44672, 917115, 16256, 4168, 29679, 22571, 7720, 452, 8836, 6888, 18578, 5148, 9289, 442, 214, 485, 3164, 1101, 1010, 9048, 293, 1628, 960, 517, 2362, 1262, 1524, 1173, 1348, 1288, 25568, 8416, 5792, 4944, 504, 4696, 2336, 458, 453, 1220, 1149, 6688, 6956, 7324, 7100, 7784, 5650, 5076, 5336, 6792, 5212, 4592, 5260, 1279, 654, 842, 990, 782, 1412, 1363, 935, 996, 775, 1471, 1525, 1398, 1097, 1082, 1668, 1007, 497, 598, 645, 698, 541, 504, 549, 540, 1568, 514, 578, 2906, 4360, 3916, 11944, 1434, 1589, 732, 641, 477, 307, 1884, 3232, 2408, 1016, 332, 139, 344, 4784, 1784, 1324, 204]}
df = pd.DataFrame(df)
bin_start = np.trunc(df['count1'].min() * 10) / 10
bin_end = df['count1'].max() + 0.1
plt.style.use('ggplot')
plt.hist(x=df['count1'], weights=df['count2'], bins=np.arange(bin_start, bin_end, 0.1), rwidth=0.9)
plt.gca().get_yaxis().get_major_formatter().set_scientific(False)
plt.xlabel('count1')
plt.ylabel('count2')
plt.tight_layout()
plt.show()
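For comparison with the plt.bar attempt in the question: the TypeError arises because plt.bar takes the bar heights through its height parameter, not y. A minimal sketch, assuming a small made-up sample (the values and counts below are hypothetical, not the question's data), sums the counts per bin with np.histogram and then draws them with bar:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for df['count1'] and df['count2'].
values = np.array([2.23, 2.25, 2.31, 2.47, 2.52])
counts = np.array([10, 20, 5, 7, 3])

# Same bin construction as in the answer: edges every 0.1.
bin_start = np.trunc(values.min() * 10) / 10
edges = np.arange(bin_start, values.max() + 0.1, 0.1)

# weights= makes each value contribute its count instead of 1.
sums, edges = np.histogram(values, bins=edges, weights=counts)

fig, ax = plt.subplots()
# Bars start at the left bin edge; a width of 0.09 leaves a small gap.
ax.bar(edges[:-1], height=sums, width=0.09, align="edge")
ax.set_xlabel("count1")
ax.set_ylabel("count2")
print(sums)
```

This gives the same bar heights as plt.hist with weights=, but keeps the binned totals available in sums for further use.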

Fill area of overlap between two normal distributions in seaborn / matplotlib

I want to fill the area overlapping between two normal distributions. I've got the x min and max, but I can't figure out how to set the y boundaries.
I've looked at the plt documentation and some examples. I think this related question and this one come close, but no luck. Here's what I have so far.
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
pepe_calories = np.array([361, 291, 263, 284, 311, 284, 282, 228, 328, 263, 354, 302, 293,
254, 297, 281, 307, 281, 262, 302, 244, 259, 273, 299, 278, 257,
296, 237, 276, 280, 291, 278, 251, 313, 314, 323, 333, 270, 317,
321, 307, 256, 301, 264, 221, 251, 307, 283, 300, 292, 344, 239,
288, 356, 224, 246, 196, 202, 314, 301, 336, 294, 237, 284, 311,
257, 255, 287, 243, 267, 253, 257, 320, 295, 295, 271, 322, 343,
313, 293, 298, 272, 267, 257, 334, 276, 337, 325, 261, 344, 298,
253, 302, 318, 289, 302, 291, 343, 310, 241])
modern_calories = np.array([310, 315, 303, 360, 339, 416, 278, 326, 316, 314, 333, 317, 357,
304, 363, 387, 279, 350, 367, 321, 366, 311, 308, 303, 299, 363,
335, 357, 392, 321, 361, 285, 321, 290, 392, 341, 331, 338, 326,
314, 327, 320, 293, 333, 297, 315, 365, 408, 352, 359, 312, 300,
263, 358, 345, 360, 336, 378, 315, 354, 318, 300, 372, 305, 336,
286, 296, 413, 383, 328, 418, 388, 416, 371, 313, 321, 321, 317,
402, 290, 328, 344, 330, 319, 309, 327, 351, 324, 278, 369, 416,
359, 381, 324, 306, 350, 385, 335, 395, 308])
ax = sns.distplot(pepe_calories, fit_kws={"color":"blue"}, kde=False,
fit=stats.norm, hist=None, label="Pepe's");
ax = sns.distplot(modern_calories, fit_kws={"color":"orange"}, kde=False,
fit=stats.norm, hist=None, label="Modern");
# Get the two lines from the axes to generate shading
l1 = ax.lines[0]
l2 = ax.lines[1]
# Get the xy data from the lines so that we can shade
x1 = l1.get_xydata()[:,0]
y1 = l1.get_xydata()[:,1]
x2 = l2.get_xydata()[:,0]
y2 = l2.get_xydata()[:,1]
x2min = np.min(x2)
x1max = np.max(x1)
ax.fill_between(x1,y1, where = ((x1 > x2min) & (x1 < x1max)), color="red", alpha=0.3)
#> <matplotlib.collections.PolyCollection at 0x1a200510b8>
plt.legend()
#> <matplotlib.legend.Legend at 0x1a1ff2e390>
plt.show()
Any ideas?
Created on 2018-12-01 by the reprexpy package
import reprexpy
print(reprexpy.SessionInfo())
#> Session info --------------------------------------------------------------------
#> Platform: Darwin-18.2.0-x86_64-i386-64bit (64-bit)
#> Python: 3.6
#> Date: 2018-12-01
#> Packages ------------------------------------------------------------------------
#> matplotlib==2.1.2
#> numpy==1.15.4
#> reprexpy==0.1.1
#> scipy==1.1.0
#> seaborn==0.9.0
While gathering the pdf data from get_xydata is clever, you are now at the mercy of matplotlib's rendering / segmentation algorithm. Having x1 and x2 span different ranges also makes comparing y1 and y2 difficult.
You can avoid these problems by fitting the normals yourself instead of
letting sns.distplot do it. Then you have more control over the values you are
looking for.
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
norm = stats.norm
pepe_calories = np.array([361, 291, 263, 284, 311, 284, 282, 228, 328, 263, 354, 302, 293,
254, 297, 281, 307, 281, 262, 302, 244, 259, 273, 299, 278, 257,
296, 237, 276, 280, 291, 278, 251, 313, 314, 323, 333, 270, 317,
321, 307, 256, 301, 264, 221, 251, 307, 283, 300, 292, 344, 239,
288, 356, 224, 246, 196, 202, 314, 301, 336, 294, 237, 284, 311,
257, 255, 287, 243, 267, 253, 257, 320, 295, 295, 271, 322, 343,
313, 293, 298, 272, 267, 257, 334, 276, 337, 325, 261, 344, 298,
253, 302, 318, 289, 302, 291, 343, 310, 241])
modern_calories = np.array([310, 315, 303, 360, 339, 416, 278, 326, 316, 314, 333, 317, 357,
304, 363, 387, 279, 350, 367, 321, 366, 311, 308, 303, 299, 363,
335, 357, 392, 321, 361, 285, 321, 290, 392, 341, 331, 338, 326,
314, 327, 320, 293, 333, 297, 315, 365, 408, 352, 359, 312, 300,
263, 358, 345, 360, 336, 378, 315, 354, 318, 300, 372, 305, 336,
286, 296, 413, 383, 328, 418, 388, 416, 371, 313, 321, 321, 317,
402, 290, 328, 344, 330, 319, 309, 327, 351, 324, 278, 369, 416,
359, 381, 324, 306, 350, 385, 335, 395, 308])
pepe_params = norm.fit(pepe_calories)
modern_params = norm.fit(modern_calories)
xmin = min(pepe_calories.min(), modern_calories.min())
xmax = max(pepe_calories.max(), modern_calories.max())
x = np.linspace(xmin, xmax, 100)
pepe_pdf = norm(*pepe_params).pdf(x)
modern_pdf = norm(*modern_params).pdf(x)
y = np.minimum(modern_pdf, pepe_pdf)
fig, ax = plt.subplots()
ax.plot(x, pepe_pdf, label="Pepe's", color='blue')
ax.plot(x, modern_pdf, label="Modern", color='orange')
ax.fill_between(x, y, color='red', alpha=0.3)
plt.legend()
plt.show()
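Since the shaded region is the pointwise minimum of the two densities, the same array also gives the size of the overlap. A minimal sketch, assuming two normal distributions with made-up parameters (not the fitted calorie data) and a hand-written trapezoidal rule:

```python
import numpy as np
import scipy.stats as stats

# Hypothetical parameters chosen so that a closed form is available.
x = np.linspace(-8.0, 10.0, 2001)
pdf_a = stats.norm(loc=0.0, scale=1.0).pdf(x)
pdf_b = stats.norm(loc=2.0, scale=1.0).pdf(x)

# Upper boundary of the overlap region, as passed to fill_between.
y = np.minimum(pdf_a, pdf_b)

# Trapezoidal rule written out by hand to avoid version-specific helpers.
overlap_area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# With equal standard deviations the curves cross midway between the means,
# so the overlap is twice the tail probability beyond that midpoint.
expected = 2 * stats.norm.cdf(-abs(2.0 - 0.0) / 2)

print(overlap_area, expected)
```

The two printed numbers should agree closely, which is a quick sanity check that np.minimum really traces the boundary of the overlapping area.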
If, let's say, sns.distplot (or some other plotting function) made a plot that you did not want to have to reproduce, then you could use the data from get_xydata this way:
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
pepe_calories = np.array([361, 291, 263, 284, 311, 284, 282, 228, 328, 263, 354, 302, 293,
254, 297, 281, 307, 281, 262, 302, 244, 259, 273, 299, 278, 257,
296, 237, 276, 280, 291, 278, 251, 313, 314, 323, 333, 270, 317,
321, 307, 256, 301, 264, 221, 251, 307, 283, 300, 292, 344, 239,
288, 356, 224, 246, 196, 202, 314, 301, 336, 294, 237, 284, 311,
257, 255, 287, 243, 267, 253, 257, 320, 295, 295, 271, 322, 343,
313, 293, 298, 272, 267, 257, 334, 276, 337, 325, 261, 344, 298,
253, 302, 318, 289, 302, 291, 343, 310, 241])
modern_calories = np.array([310, 315, 303, 360, 339, 416, 278, 326, 316, 314, 333, 317, 357,
304, 363, 387, 279, 350, 367, 321, 366, 311, 308, 303, 299, 363,
335, 357, 392, 321, 361, 285, 321, 290, 392, 341, 331, 338, 326,
314, 327, 320, 293, 333, 297, 315, 365, 408, 352, 359, 312, 300,
263, 358, 345, 360, 336, 378, 315, 354, 318, 300, 372, 305, 336,
286, 296, 413, 383, 328, 418, 388, 416, 371, 313, 321, 321, 317,
402, 290, 328, 344, 330, 319, 309, 327, 351, 324, 278, 369, 416,
359, 381, 324, 306, 350, 385, 335, 395, 308])
ax = sns.distplot(pepe_calories, fit_kws={"color":"blue"}, kde=False,
fit=stats.norm, hist=None, label="Pepe's");
ax = sns.distplot(modern_calories, fit_kws={"color":"orange"}, kde=False,
fit=stats.norm, hist=None, label="Modern");
# Get the two lines from the axes to generate shading
l1 = ax.lines[0]
l2 = ax.lines[1]
# Get the xy data from the lines so that we can shade
x1, y1 = l1.get_xydata().T
x2, y2 = l2.get_xydata().T
xmin = max(x1.min(), x2.min())
xmax = min(x1.max(), x2.max())
x = np.linspace(xmin, xmax, 100)
y1 = np.interp(x, x1, y1)
y2 = np.interp(x, x2, y2)
y = np.minimum(y1, y2)
ax.fill_between(x, y, color="red", alpha=0.3)
plt.legend()
plt.show()
I suppose that not using seaborn is often a useful strategy in cases where you want full control over the resulting plot. Hence just calculate the fits, plot them, and use fill_between on the curves up to the point where they cross each other.
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
pepe_calories = np.array(...)
modern_calories = np.array(...)
x = np.linspace(150,470,1000)
y1 = stats.norm.pdf(x, *stats.norm.fit(pepe_calories))
y2 = stats.norm.pdf(x, *stats.norm.fit(modern_calories))
cross = x[y1-y2 <= 0][0]
fig, ax = plt.subplots()
ax.fill_between(x,y1,y2, where=(x<=cross), color="red", alpha=0.3)
ax.plot(x,y1, label="Pepe's")
ax.plot(x,y2, label="Modern")
ax.legend()
plt.show()

How to add a legend to matplotlib scatter plot

I'm attempting to plot a PCA and one of the colours is label 1 and the other should be label 2. When I want to add a legend with ax1.legend() I only get the label for the blue dot or no label at all. How can I add the legend with the correct labels for both the blue and purple dots?
sns.set(style = 'darkgrid')
fig, ax1 = sns.plt.subplots()
x1, x2 = X_bar[:,0], X_bar[:,1]
ax1.scatter(x1, x2, 100, edgecolors='none', c = colors)
fig.set_figheight(8)
fig.set_figwidth(15)
It looks like you are plotting points that alternate between two colours. As per the answer to the question "subsampling every nth entry in a numpy array", you can use NumPy's array slicing to plot two separate arrays, then add the legend as normal.
For some sample data:
import numpy as np
import numpy.random as nprnd
import matplotlib.pyplot as plt
A = nprnd.randint(1000, size=100)
A.shape = (50,2)
x1, x2 = np.sort(A[:,0], axis=0), np.sort(A[:,1], axis=0)
x1
Out[50]:
array([ 46, 63, 84, 96, 118, 127, 137, 142, 181, 187, 187, 207, 210,
238, 238, 330, 334, 335, 346, 346, 350, 392, 400, 426, 467, 531,
550, 567, 569, 572, 583, 625, 637, 661, 671, 677, 698, 713, 777,
796, 837, 850, 866, 868, 874, 890, 919, 972, 992, 993])
x2
Out[51]:
array([ 2, 44, 49, 51, 72, 84, 86, 118, 120, 133, 150, 155, 156,
159, 199, 202, 250, 281, 289, 317, 317, 386, 405, 414, 427, 461,
507, 510, 543, 552, 553, 555, 559, 576, 618, 622, 633, 647, 665,
672, 682, 685, 745, 767, 776, 802, 808, 813, 847, 973])
labels=['blue','red']
fig, ax1 = plt.subplots()
ax1.scatter(x1[0::2], x2[0::2], 100, edgecolors='none', c='red', label = 'red')
ax1.scatter(x1[1::2], x2[1::2], 100, edgecolors='none', c='black', label = 'black')
plt.legend()
plt.show()
For your code, you can do:
sns.set(style = 'darkgrid')
fig, ax1 = plt.subplots()  # recent seaborn versions no longer expose sns.plt; use matplotlib.pyplot as plt
x1, x2 = X_bar[:,0], X_bar[:,1]
ax1.scatter(x1[0::2], x2[0::2], 100, edgecolors='none', c = colors[0], label='one')
ax1.scatter(x1[1::2], x2[1::2], 100, edgecolors='none', c = colors[1], label='two')
fig.set_figheight(8)
fig.set_figwidth(15)
plt.legend()
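If the two classes are not strictly alternating, the same idea works with boolean masks instead of slices. A minimal sketch, assuming a hypothetical integer array point_labels holding one class label per row (the data, label values and colours here are made up and stand in for X_bar and colors):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(50, 2))            # hypothetical stand-in for X_bar
point_labels = rng.integers(1, 3, size=50)   # hypothetical class per point: 1 or 2

fig, ax = plt.subplots(figsize=(15, 8))
# One scatter call per class, so each call contributes one legend entry.
for value, colour in zip((1, 2), ("blue", "purple")):
    mask = point_labels == value
    ax.scatter(points[mask, 0], points[mask, 1], s=100,
               edgecolors="none", c=colour, label=f"label {value}")
legend = ax.legend()
```

The legend picks up one entry per scatter call because each call carries its own label.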

Imposing one graph onto another

I'm in a physics lab class and we have to write some code to analyze some data we have collected. My question is simple and probably stupid, but I was just wondering how to plot a graph on top of another graph using Python. Here is my code so far, thanks:
%pylab
import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
#SIGNAL DATA
dataSig = [658, 679, 683, 691, 693, 693, 695, 696, 696, 696, 697, 699, 699, 700, 700, 700, 702, 703, 703, 704, 706, 706, 708, 708, 709, 709, 712, 712, 713, 714, 714, 715, 715, 715, 716, 716, 716, 717, 717, 717, 718, 718, 718, 718, 719, 720, 720, 721, 721, 721, 722, 723, 723, 724, 725, 725, 725, 726, 726, 726, 727, 727, 728, 728, 729, 730, 730, 731, 731, 731, 731, 732, 732, 733, 734, 734, 734, 734, 735, 736, 737, 738, 738, 738, 738, 740, 740, 741, 741, 741, 742, 743, 743, 743, 743, 743, 743, 743, 744, 744, 745, 746, 746, 746, 746, 747, 747, 747, 747, 748, 749, 749, 750, 750, 750, 750, 751, 751, 751, 751, 752, 752, 752, 754, 754, 756, 756, 757, 757, 757, 759, 759, 760, 760, 760, 762, 762, 762, 762, 762, 762, 763, 764, 765, 765, 765, 765, 766, 766, 766, 767, 767, 768, 769, 769, 770, 770, 771, 773, 775, 776, 780, 786, 786, 786, 787, 790, 790, 793, 796, 797, 798, 817, 823]
#[658,679,683,691,693,695,696,697,699,700,702,703,704,706,708,709,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,740,741,742,743,744,745,746,747,748,749,750,751,752,754,756,757,759,760,762,763,764,765,766,767,768,769,770,771,773,775,776,780,786,787,790,793,796,797,798,817,823] #[1,1,1,1,1,1,3,1,2,3,1,2,1,2,2,1,2,1,2,3,2,3,3,1,2,3,1,2,1,3,3,2,2,1,1,4,2,1,4,1,1,1,4,2,3,1,7,2,1,4,4,1,2,4,4,3,2,2,2,2,3,6,1,1,4,3,2,1,2,2,1,1,1,1,1,3,1,2,1,1,1,1,1,1]
#SIGNAL DEFINED VARIABLES
ntestpoints = 175
themean = 739.1
#sigma = ?
#amp = center/guassian
#SIGNAL GAUSSIAN FITTING FUNCTION
def mygauss(x, amp, center, sigma):
    """This is an example gaussian function, which takes in x values, the amplitude (amp),
    the center x value (center) and the sigma of the Gaussian, and returns the respective y values."""
    y = amp * np.exp(-.5*((x-center)/sigma)**2)
    return y
#SIGNAL PLOT, NO GAUSS
plt.figure(figsize=(10,6))
plt.hist(dataSig,bins=ntestpoints//10,histtype="stepfilled",alpha=.5,color='g',range=[600,900])  # bins must be an integer
plt.xlabel('Number of Counts/Second',fontsize=20)
plt.ylabel('Number of Measurements',fontsize=20)
plt.title('Measured Signal Count Rate Fitting with Gaussian Function',fontsize=22)
plt.axvline(themean,linestyle='-',color='r')
#plt.axvline(themean+error_on_mean,linestyle='--',color='b')
#plt.axvline(themean-error_on_mean,linestyle='--',color='b')
#plt.axvline(testmean,color='k',linestyle='-')
plt.show()
#------------------------------------------------------------
# define a function to make a gaussian with input values, used later
def mygauss(x, amp, center, sigma):
    """This is an example gaussian function, which takes in x values, the amplitude (amp),
    the center x value (center) and the sigma of the Gaussian, and returns the respective y values."""
    y = amp * np.exp(-.5*((x-center)/sigma)**2)
    return y
npts = 40 # the number of points on the x axis
x = np.linspace(600,900,npts) # make a series of npts linearly spaced values between 600 and 900
amp = 40
center = 740.5
sigma = 40
y = mygauss(x, amp, center, sigma)
print(y)
plt.figure(figsize=(10,6))
plt.plot(x,y,'bo', label='data points')
plt.text(center, amp, "<-- peak is here",fontsize=16) # places text at any x/y location on the graph
plt.xlabel('X axis',fontsize=20)
plt.ylabel('Y axis', fontsize=20)
plt.title('A gaussian plot \n with some extras!',fontsize=20)
plt.legend(loc='best')
plt.show()
When you call plt.figure(), you create a new plot area in a different figure.
If you don't call it the second time, the second plot is drawn in the same graph as the first one.
That is not always a solution by itself, however: if the two datasets have very different scales, one graph can be massively outscaled by the other.
Fortunately that is not the case here, so I won't go into the details of using two different scales in one graph, but you can check it here (http://matplotlib.org/examples/api/two_scales.html).
By commenting out the second plt.figure() in your code you get this:
Hope it helps!
PS: next time, try posting with a matplotlib tag. It will get a faster response than physics, since this is basically a matplotlib question.
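The point about plt.figure() can be sketched as follows, assuming synthetic data in place of the measured counts (the random sample below is made up for illustration; the amplitude, center and sigma are the values from the question). Both the histogram and the Gaussian are drawn on one Axes object, so they share a single figure:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

def mygauss(x, amp, center, sigma):
    """Gaussian with peak height amp, centred at center, with width sigma."""
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2)

rng = np.random.default_rng(1)
data = rng.normal(740, 30, size=175)  # hypothetical stand-in for dataSig

# Create the figure once; every later call draws on the same axes.
fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(data, bins=17, range=(600, 900), histtype="stepfilled",
        alpha=0.5, color="g", label="histogram")

x = np.linspace(600, 900, 200)
ax.plot(x, mygauss(x, amp=40, center=740.5, sigma=40), "b-", label="Gaussian")

ax.set_xlabel("Number of Counts/Second")
ax.set_ylabel("Number of Measurements")
ax.legend(loc="best")
```

Using the object-oriented interface (fig, ax) makes it explicit which axes each plot lands on, so an accidental second plt.figure() cannot split the two plots.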

assign value of arbitrary line in 2-d array to nans

I have a 2D numpy array, z, in which I would like to assign values to nan based on the equation of a line +/- a width of 20. I am trying to implement the Raman 2nd scattering correction as it is done by the eem_remove_scattering method in the eemR package listed here:
https://cran.r-project.org/web/packages/eemR/vignettes/introduction.html
but the method isn't visible.
import numpy as np
ex = np.array([240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300,
305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365,
370, 375, 380, 385, 390, 395, 400, 405, 410, 415, 420, 425, 430,
435, 440, 445, 450])
em = np.array([300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324,
326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350,
352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376,
378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402,
404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428,
430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454,
456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480,
482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506,
508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532,
534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558,
560, 562, 564, 566, 568, 570, 572, 574, 576, 578, 580, 582, 584,
586, 588, 590, 592, 594, 596, 598, 600])
X, Y = np.meshgrid(ex, em)
z = np.sin(X) + np.cos(Y)
The equation that I would like to apply is em = -2*ex / (0.00036*ex - 1) + 500.
I want to set every value in the array that intersects this line (+/- 20) to nan. It's simple enough to set a single element to nan, but I haven't been able to locate a Python function that applies this equation to the array and sets only the values that intersect this line to nan.
The desired output would be a new array with the same dimensions as z, but with the values that intersect the line equivalent to nan. Any suggestions on how to proceed are greatly appreciated.
Use np.where in the form np.where( "condition for intersection", np.nan, z):
zi = np.where( np.abs(-2*X/(0.00036*X-1) + 500 - Y) <= 20, np.nan, z)
As a matter of fact, there are no intersections here because (0.00036*ex-1) is close to -1 for all your values, which makes - 2*ex/(0.00036*ex-1) close to 2*ex, and adding 500 brings this over any values you have in em. But in principle this works.
Also, I suspect that the goal you plan to achieve by setting those values to NaN would be better achieved by using a masked array.
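Both remarks can be checked with a short sketch. It rebuilds ex and em with np.arange (equivalent to the arrays in the question), confirms that the band around the stated line misses the grid, and then, purely to show the mechanics, uses a hypothetical offset of 0 instead of 500 so that the band does cross the grid. The helper band is made up for this illustration; the masked-array call is np.ma.masked_where.

```python
import numpy as np

# Same grids as in the question, built with arange instead of literals.
ex = np.arange(240, 451, 5)
em = np.arange(300, 601, 2)
X, Y = np.meshgrid(ex, em)
z = np.sin(X) + np.cos(Y)

def band(offset, width=20):
    """Boolean array: True where em lies within +/- width of the line."""
    line = -2 * X / (0.00036 * X - 1) + offset
    return np.abs(line - Y) <= width

# Offset from the question: reports whether any grid point falls in the band.
print(band(500).any())

# Hypothetical offset of 0, for illustration only.
hit = band(0)
z_nan = np.where(hit, np.nan, z)          # NaN-based version
z_masked = np.ma.masked_where(hit, z)     # masked-array version

print(hit.sum(), np.isnan(z_nan).sum(), z_masked.mask.sum())
```

Unlike the NaN version, the masked array keeps the original numbers in z_masked.data, and reductions such as z_masked.mean() skip the masked entries automatically.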
