I've run into some weird behaviour in matplotlib that I can't explain, and I was wondering if someone could see what's going on. Essentially, I'm trying to place what used to be two figures into one. I do so by creating two GridSpec objects, one for the left half of the figure and one for the right. I draw the left-hand side and add a colorbar, but when I select my first subplot on the right-hand side, the figure on the left shifts to the right under the colorbar. If you execute the example code excluding the last two lines, you will see what you expect, but if you execute all of it, the plot on the left shifts. What's going on?
import matplotlib.gridspec as gridspec
import numpy as np
import pylab as pl
scores = np.array([[ 0.32 , 0.32 , 0.32 , 0.32 , 0.32 ,
0.32 , 0.32 , 0.32 , 0.32 ],
[ 0.32 , 0.32 , 0.32 , 0.49333333, 0.85333333,
0.92666667, 0.32 , 0.32 , 0.32 ],
[ 0.32 , 0.32 , 0.51333333, 0.87333333, 0.96 ,
0.95333333, 0.89333333, 0.44 , 0.34 ],
[ 0.32 , 0.51333333, 0.88 , 0.96 , 0.96666667,
0.95333333, 0.90666667, 0.47333333, 0.34 ],
[ 0.51333333, 0.88 , 0.96 , 0.96 , 0.96 ,
0.96 , 0.90666667, 0.47333333, 0.34 ],
[ 0.88 , 0.96 , 0.96 , 0.96 , 0.94666667,
0.96 , 0.90666667, 0.47333333, 0.34 ],
[ 0.96 , 0.96 , 0.96666667, 0.96 , 0.94 ,
0.96 , 0.90666667, 0.47333333, 0.34 ],
[ 0.96 , 0.96666667, 0.96666667, 0.94666667, 0.94 ,
0.96 , 0.90666667, 0.47333333, 0.34 ],
[ 0.96666667, 0.97333333, 0.96 , 0.94666667, 0.94 ,
0.96 , 0.90666667, 0.47333333, 0.34 ],
[ 0.96666667, 0.96666667, 0.96666667, 0.94666667, 0.94 ,
0.96 , 0.90666667, 0.47333333, 0.34 ],
[ 0.95333333, 0.96 , 0.96666667, 0.94666667, 0.94 ,
0.96 , 0.90666667, 0.47333333, 0.34 ]])
C_range = 10.0 ** np.arange(-2, 9)
gamma_range = 10.0 ** np.arange(-5, 4)
pl.figure(0, figsize=(16,6))
gs = gridspec.GridSpec(1,1)
gs.update(left=0.05, right=0.45, bottom=0.15, top=0.95)
pl.subplot(gs[0,0])
pl.imshow(scores, interpolation='nearest', cmap=pl.cm.spectral)
pl.xlabel('gamma')
pl.ylabel('C')
pl.colorbar()
pl.xticks(np.arange(len(gamma_range)), gamma_range, rotation=45)
pl.yticks(np.arange(len(C_range)), C_range)
gs = gridspec.GridSpec(3,3)
gs.update(left=0.5, right=0.95, bottom=0.05, top=0.95)
pl.subplot(gs[0,0]) # here's where the shift happens
You can create the colorbar after the line marked # here's where the shift happens, by saving a reference to the left-hand axes and passing it to pl.colorbar explicitly:
pl.figure(0, figsize=(16,6))
gs = gridspec.GridSpec(1,1)
gs.update(left=0.05, right=0.45, bottom=0.15, top=0.95)
ax = pl.subplot(gs[0,0]) # save the axes to ax
pl.imshow(scores, interpolation='nearest', cmap=pl.cm.spectral)
pl.xlabel('gamma')
pl.ylabel('C')
pl.xticks(np.arange(len(gamma_range)), gamma_range, rotation=45)
pl.yticks(np.arange(len(C_range)), C_range)
gs = gridspec.GridSpec(3,3)
gs.update(left=0.5, right=0.95, bottom=0.05, top=0.95)
pl.subplot(gs[0,0]) # here's where the shift happens
pl.colorbar(ax=ax) # create colorbar for ax
pl.show()
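As an aside (this is my addition, not part of the original answer), another way to keep the left-hand plot stable is to reserve a dedicated colorbar axes inside the left GridSpec itself, so no space has to be stolen from the image axes later on. A rough sketch, reusing the names from the question:
gs_left = gridspec.GridSpec(1, 2, width_ratios=[20, 1])
gs_left.update(left=0.05, right=0.45, bottom=0.15, top=0.95)
ax = pl.subplot(gs_left[0, 0])
im = pl.imshow(scores, interpolation='nearest', cmap=pl.cm.spectral)
cax = pl.subplot(gs_left[0, 1])  # dedicated axes just for the colorbar
pl.colorbar(im, cax=cax)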
I'm in the process of decoding a hex string made up of analog values. I was told that it consists of one 4-byte hex value per number.
#0-799 Load data, 4 bytes each
#800-1599 Position data, 4 bytes each
I'm trying to decode it but can't seem to get the results that were given to me. I'm wondering whether my scaling functions are incorrect for the unsigned-32-bit-to-float conversion. I realize it could be the engineering units (new max, new min) that the data is being scaled to, but I want to rule out that I am doing something incorrectly.
Here is my code
#INSERT LIBRARIES
import matplotlib.pyplot as plt
#this hex string is from the sample given
hex_string = '00706A450090574500F0484500C0394500802D45001027450050284500E0304500603E4500204D450010594500C05E4500505C4500F051450060414500C02D4500E01A4500C00C4500C0064500000B4500D019450060314500E04D4500006A450008804500A08545007884450030794500205F4500E03F4500D0214500F00A450080FF440070024500D0124500202E450060504500F0744500D08B4500009B4500A0A7450040B24500F8BB4500F0C54500D0D0450080DC450028E8450068F2450000FA450028FE4500D8FE4500F0FC4500B8F9450088F6450068F44500D8F34500C8F44500B8F6450028F94500A0FB4500F0FD45000C0046002401460058024600B803460038054600B8064600F4074600980846005C08460010074600B40446009001460050FC450040F6450048F2450068F1450020F4450040FA45006C01460044064600BC0A4600F80D4600280F4600CC0D4600B409460010034600C0F44500D0E04500F8CB4500C8B7450090A5450038964500308A45009881450060784500B072450010704500206E4500706A45000000005C8F0240D7A380408FC2CD409A9911413D0A3F4152B86E4152B89041F628AC41713DCA41A470EB413D0A08427B141C42CDCC314285EB4842C3F56042B81E79420A578842856B93423D8A9D4214AEA6420000AF42ECD1B6425C8FBE423D8AC6425C0FCF425238D8420000E2423333EC42E17AF64252380043ECD104439AD9084333330C43D7E30E43CD0C114348E11243299C1443146E1643D7631843146E1A43856B1C43A4301E43B89E1F43F6A82043B85E214329DC21431F45224385AB22430A1723435278234314AE23439A9923437B1423437B1422435C8F20437B941E4333331C438F821943CD8C16430A5713439AD90F433D0A0C4366E60743A4700343C375FD4214AEF34252B8E94252B8DF4271BDD542CDCCCB4285EBC1427B14B8428F42AE42A470A44200809A42CD4C9042F6A88542B81E75427B145E420AD74642F628304285EB1A4252B807429A99ED410000D0418FC2B5410AD79D410AD787413D0A6741713D4241A4702141EC5104413333D3400000A040CDCC5C4000000040295C4F3F0AD7A33D00000000'
#Signed Integer: A 16-bit signed integer ranging from -32,768 to +32,767
#Unsigned Integer: A 16-bit unsigned integer ranging from 0 to 65535.
#signed 32 bit int range is -2147483648 to 2147483647
#unsigned 32 int range is 0 to 4294967295
#sample hex string: 5C8F0240
#Converted 32 bit equivalent: 1552876096
#Card hex string byte layout Description
#0-799 Load data, 4 bytes each
#800-1599 Position data, 4 bytes each
#create a new function that scales the 32 bit unsigned integer position value into a float value. Assuming a stroke range of 0-168 inches and that the integer value is unsigned 32 bit
def u32int_pos_to_float(u32int_value):
    OldValue = u32int_value
    OldMin = 0
    OldMax = 4294967295
    NewMin = 0
    NewMax = 168
    NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin
    return NewValue
#create a new function that scales the 32 bit unsigned integer load value into a float value. Assuming a load cell of 0-30000 lbs and that the integer value is unsigned 32 bit
def u32int_load_to_float(u32int_value):
    OldValue = u32int_value
    OldMin = 0
    OldMax = 4294967295
    NewMin = 0
    NewMax = 30000
    NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin
    return NewValue
#Card hex string byte layout Description
#0-799 Load data, 4 bytes each
#800-1599 Position data, 4 bytes each
#A byte (or octet) is 8 bits so is always represented by 2 Hex characters in the range 00 to FF
#4 bytes = 32 bits
#Find the middle index of the hex_string and split the string into two halves
print(' ')
print('-----this is the start of the hex string conversion logic-----')
print(' ')
print('The hex string is: ' + hex_string)
print(' ')
print('hex string length: ' + str(len(hex_string)))
middle_of_String = int(len(hex_string)/2)
print('middle of string is:',middle_of_String,)
lastloadinstring = middle_of_String
startposinstring = middle_of_String
print('')
hex_load_string = hex_string[:lastloadinstring]
print(len(hex_load_string))
hex_pos_string = hex_string[startposinstring:]
print(len(hex_pos_string))
print(' ')
print('----Start of the hexadecimal load and position lists from dividing the hex string in half----')
print(' ')
print('hex_load_string length:',len(hex_load_string))
print(hex_load_string)
print('hex_pos_string length:',len(hex_pos_string))
print(hex_pos_string)
#parse the hex strings into 4 byte chunks
hex_load_list = [hex_load_string[i:i+8] for i in range(0, len(hex_load_string), 8)]
hex_pos_list = [hex_pos_string[i:i+8] for i in range(0, len(hex_pos_string), 8)]
print(' ')
print('---start of the hexadecimal load and position 4 byte "chunks" list----')
print('----Note from developer: 0-799 Load data and 800-1599 Position data are 4 bytes each----')
print(' ')
print('hex_load_list length:',len(hex_load_list))
print(hex_load_list)
print('hex_pos_list length:',len(hex_pos_list))
print(hex_pos_list)
#convert each hex chunk to 32 bit unsigned integer
load_list_int = []
pos_list_int = []
for i in range(0, len(hex_load_list)):
    load_list_int.append(int(hex_load_list[i],16))
    pos_list_int.append(int(hex_pos_list[i],16))
print(' ')
print('----start of the load and position unsigned 32 bit integer list----')
print(' ')
print('load_list_int length:',len(load_list_int))
print(load_list_int)
print('pos_list_int length:',len(pos_list_int))
print(pos_list_int)
#using the new function, convert the 32 bit unsigned integers to a new list of floats
load_list_float = []
for i in range(0, len(load_list_int)):
    load_list_float.append(u32int_load_to_float(load_list_int[i]))
pos_list_float = []
for i in range(0, len(pos_list_int)):
    pos_list_float.append(u32int_pos_to_float(pos_list_int[i]))
print(' ')
print('----start of the load and position floating value lists----')
print('--these are scaled using the functions at the top of the code. All scaled as unsigned 32 bit integers---')
print('---engineering units for scaling in function comments-----')
print(' ')
print('load_list_float length:',len(load_list_float))
print(load_list_float)
print('pos_list_float length:',len(pos_list_float))
print(pos_list_float)
#create a scatter plot of the position and load data
plt.scatter(pos_list_float, load_list_float)
plt.xlabel('Position (inches)')
plt.ylabel('Load (lbs)')
plt.title('Position vs Load')
plt.show()
The following are the expected results.
200
position load
0.11 , 14083
0.23 , 14033
0.46 , 14013
0.69 , 13905
0.92 , 13767
1.55 , 13744
2.22 , 13585
2.89 , 13675
3.56 , 13677
4.54 , 13539
5.61 , 13357
6.67 , 13287
7.74 , 13668
9.03 , 14350
10.45 , 15073
11.86 , 15544
13.27 , 15654
14.82 , 15944
16.53 , 16215
18.24 , 16453
19.94 , 16725
21.72 , 16908
23.67 , 17149
25.62 , 17466
27.57 , 17828
29.54 , 18133
31.69 , 18312
33.83 , 18585
35.97 , 19025
38.09 , 19668
40.38 , 20443
42.66 , 20364
44.94 , 20018
47.2 , 19619
49.54 , 19211
51.96 , 18805
54.38 , 18448
56.81 , 18173
59.23 , 18015
61.79 , 17828
64.32 , 17677
66.87 , 17542
69.42 , 17434
72.03 , 17454
74.64 , 17399
77.26 , 17269
79.87 , 17038
82.52 , 16739
85.17 , 16888
87.82 , 17075
90.48 , 17272
93.12 , 17502
95.75 , 17799
98.37 , 17993
100.99 , 18033
103.59 , 18058
106.15 , 18090
108.7 , 18078
111.27 , 18101
113.81 , 18048
116.26 , 17970
118.72 , 17905
121.19 , 17877
123.66 , 17920
125.96 , 17903
128.28 , 17775
130.6 , 17558
132.92 , 17398
135.07 , 17159
137.2 , 16945
139.34 , 16794
141.48 , 16707
143.41 , 16641
145.3 , 16543
147.19 , 16539
149.09 , 16520
150.75 , 16440
152.37 , 16306
153.98 , 16215
155.59 , 16181
156.95 , 16071
158.23 , 15892
159.5 , 15711
160.8 , 15621
161.82 , 15576
162.75 , 15569
163.69 , 15520
164.63 , 15415
165.3 , 15279
165.89 , 15162
166.48 , 15122
167.07 , 14976
167.39 , 14915
167.65 , 15001
167.9 , 15173
168.15 , 15251
168.15 , 15349
168.09 , 15502
168.03 , 15784
167.97 , 15959
167.68 , 15832
167.34 , 15507
167 , 15381
166.66 , 15151
166.11 , 14970
165.51 , 14816
164.92 , 14636
164.33 , 14475
163.55 , 14238
162.73 , 13731
161.91 , 13326
161.1 , 13048
160.12 , 12775
159.1 , 12715
158.08 , 12748
157.07 , 12773
155.91 , 12764
154.71 , 12703
153.5 , 12570
152.3 , 12555
150.99 , 12406
149.62 , 12135
148.25 , 11804
146.89 , 11512
145.42 , 11047
143.89 , 10545
142.38 , 9960
140.86 , 9393
139.28 , 9066
137.61 , 8782
135.94 , 8582
134.28 , 8473
132.61 , 8429
130.78 , 8317
128.98 , 8143
127.18 , 8133
125.38 , 8347
123.44 , 8811
121.49 , 9368
119.53 , 9770
117.57 , 10086
115.52 , 10395
113.41 , 10840
111.29 , 11309
109.16 , 11632
106.99 , 11774
104.71 , 11814
102.42 , 11686
100.14 , 11488
97.85 , 11265
95.43 , 11063
93.01 , 10980
90.6 , 10640
88.19 , 10245
85.7 , 9841
83.19 , 9367
80.69 , 9014
78.19 , 8772
75.66 , 8608
73.11 , 8477
70.55 , 8295
67.99 , 8260
65.43 , 8337
62.84 , 8508
60.26 , 8815
57.68 , 9184
55.07 , 9562
52.5 , 9889
49.92 , 10058
47.33 , 10357
44.72 , 10660
42.18 , 10967
39.68 , 11273
37.16 , 11525
34.64 , 11619
32.22 , 11675
29.9 , 11720
27.58 , 11749
25.25 , 11714
23.04 , 11613
21.01 , 11502
18.98 , 11349
16.95 , 11172
15.04 , 11061
13.4 , 11069
11.75 , 11137
10.1 , 11163
8.47 , 11180
7.37 , 11287
6.17 , 11385
4.96 , 11603
3.83 , 11862
3.1 , 12134
2.38 , 12405
1.65 , 12643
0.92 , 12875
0.72 , 13118
0.48 , 13388
0.24 , 13680
0.11 , 14083
0
7.4
168
0
0
1
"22-07"
0.82
I'm a newbie at Python and have been staring at this too long, so any help would be appreciated.
Your most likely error is a little-endian/big-endian misinterpretation. Looking at the structure of the data, it's much more likely that these are little-endian values. Beyond that, you haven't shown realistic enough code or a complete enough specification to pinpoint exactly why your actual data don't match the sample, but at least the endianness fix produces results that are more plausible.
To even get into the neighbourhood of the expected values, you need to replace your scales (e.g. 30,000) with values inferred from linear regression. This does not produce exactly the expected results, but again, they're in the neighbourhood.
Don't use a scatter plot. Since these data vary somewhat continuously, it's important to see the relationship of readings over time.
import struct
import matplotlib.pyplot as plt
hex_string = (
'00706A450090574500F0484500C0394500802D45001027450050284500E0304500603E4500204D450010594500C05E4500505C4500F05145'
'0060414500C02D4500E01A4500C00C4500C0064500000B4500D019450060314500E04D4500006A450008804500A085450078844500307945'
'00205F4500E03F4500D0214500F00A450080FF440070024500D0124500202E450060504500F0744500D08B4500009B4500A0A7450040B245'
'00F8BB4500F0C54500D0D0450080DC450028E8450068F2450000FA450028FE4500D8FE4500F0FC4500B8F9450088F6450068F44500D8F345'
'00C8F44500B8F6450028F94500A0FB4500F0FD45000C0046002401460058024600B803460038054600B8064600F4074600980846005C0846'
'0010074600B40446009001460050FC450040F6450048F2450068F1450020F4450040FA45006C01460044064600BC0A4600F80D4600280F46'
'00CC0D4600B409460010034600C0F44500D0E04500F8CB4500C8B7450090A5450038964500308A45009881450060784500B0724500107045'
'00206E4500706A45000000005C8F0240D7A380408FC2CD409A9911413D0A3F4152B86E4152B89041F628AC41713DCA41A470EB413D0A0842'
'7B141C42CDCC314285EB4842C3F56042B81E79420A578842856B93423D8A9D4214AEA6420000AF42ECD1B6425C8FBE423D8AC6425C0FCF42'
'5238D8420000E2423333EC42E17AF64252380043ECD104439AD9084333330C43D7E30E43CD0C114348E11243299C1443146E1643D7631843'
'146E1A43856B1C43A4301E43B89E1F43F6A82043B85E214329DC21431F45224385AB22430A1723435278234314AE23439A9923437B142343'
'7B1422435C8F20437B941E4333331C438F821943CD8C16430A5713439AD90F433D0A0C4366E60743A4700343C375FD4214AEF34252B8E942'
'52B8DF4271BDD542CDCCCB4285EBC1427B14B8428F42AE42A470A44200809A42CD4C9042F6A88542B81E75427B145E420AD74642F6283042'
'85EB1A4252B807429A99ED410000D0418FC2B5410AD79D410AD787413D0A6741713D4241A4702141EC5104413333D3400000A040CDCC5C40'
'00000040295C4F3F0AD7A33D00000000'
)
"""
Card hex string byte layout description
0-799 Load data, 4 bytes each
800-1599 Position data, 4 bytes each
"""
uints = struct.unpack('<200I', bytearray.fromhex(hex_string))
floats = [x/(1 << 32) for x in uints]
load = [x*173611 - 32993 for x in floats[:100]]
pos = [x*63 - 15.6437 for x in floats[100:]]
print(f'{"pos":>8} {"load":>8}')
for p, l in zip(pos, load):
    print(f'{p:8.2f} {l:8.2f}')
# Omit 0-valued position endpoints
plt.plot(pos[1:-1], load[1:-1])
plt.xlabel('Position (inches)')
plt.ylabel('Load (lbs)')
plt.title('Position vs Load')
plt.show()
pos load
-15.64 14082.55
0.11 14032.55
0.23 13993.81
0.30 13953.58
0.37 13921.12
0.41 13904.07
0.46 13907.38
0.49 13930.06
0.52 13965.83
0.55 14004.90
0.58 14036.52
0.61 14051.59
0.63 14045.13
0.65 14017.65
0.67 13973.77
0.69 13921.79
0.71 13871.78
0.73 13834.37
0.74 13818.47
0.75 13829.73
0.76 13868.97
0.77 13931.39
0.77 14006.89
0.78 14081.39
0.79 14139.76
0.80 14154.57
0.81 14151.51
0.82 14121.63
0.83 14052.58
0.84 13969.80
0.84 13890.16
0.85 13829.56
0.85 13799.27
0.86 13807.05
0.86 13850.43
0.86 13922.78
0.86 14013.51
0.86 14110.37
0.87 14170.97
0.87 14211.20
0.87 14244.64
0.87 14272.79
0.87 14298.54
0.87 14324.94
0.88 14353.75
0.88 14384.71
0.88 14415.59
0.88 14442.75
0.88 14462.86
0.88 14473.87
0.88 14475.69
0.88 14470.64
0.88 14462.12
0.88 14453.67
0.88 14448.04
0.88 14446.55
0.87 14449.04
0.87 14454.17
0.87 14460.63
0.87 14467.17
0.86 14473.29
0.86 14478.88
0.86 14481.78
0.85 14484.97
0.85 14488.61
0.84 14492.58
0.83 14496.56
0.82 14499.83
0.81 14501.52
0.80 14500.90
0.79 14497.47
0.78 14491.22
0.78 14482.90
0.77 14468.99
0.76 14452.93
0.75 14442.42
0.74 14440.10
0.73 14447.30
0.71 14463.53
0.69 14482.52
0.67 14495.36
0.64 14507.19
0.62 14515.76
0.61 14518.91
0.58 14515.31
0.55 14504.46
0.53 14486.87
0.50 14448.96
0.48 14396.14
0.45 14340.92
0.42 14287.44
0.38 14239.18
0.36 14198.53
0.31 14166.66
0.26 14143.90
0.20 14119.47
0.11 14104.41
-0.06 14097.45
-0.47 14092.32
-15.64 14082.55
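As a quick sanity check of the endianness point above (this snippet is my addition, not part of the original answer), the sample chunk quoted in the question can be read both ways:
import struct
chunk = bytes.fromhex('5C8F0240')
print(struct.unpack('>I', chunk)[0])   # 1552876096, the big-endian reading quoted in the question
print(struct.unpack('<I', chunk)[0])   # 1073909596, the little-endian unsigned reading
print(struct.unpack('<f', chunk)[0])   # about 2.04, the little-endian float reading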
For anyone encountering the same problem: when converting data types in Python, I think it's good practice to convert the original data to binary first and THEN to the target format. In my case I was going directly from the original type to the target type, which was causing all sorts of problems due to Python making assumptions. I suppose that's both a good and a bad thing. Anyway, use the struct and binascii libraries for this conversion. Below is my code. Also refer to these sections of the Python documentation:
https://docs.python.org/3/library/struct.html?highlight=struct#module-struct
https://docs.python.org/3/library/binascii.html
import struct as st
import binascii as b
#this function is used to convert the hex string to a position and load list of floating points
def hex_str_to_pl_lists(hex_string):
    #convert the hex string to binary
    hex_string_bin = b.unhexlify(hex_string)
    #unpack the binary string into a list of floats in little-endian format
    hex_string_float = st.unpack('<' + 'f' * (len(hex_string_bin) // 4), hex_string_bin)
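The code appears to be cut off at this point. A hedged completion of the function body, assuming (per the layout comments in the question) that the first half of the unpacked values are the load data and the second half the position data, might read:
    #split the unpacked floats into load (first half) and position (second half)
    half = len(hex_string_float) // 2
    load_list = list(hex_string_float[:half])
    pos_list = list(hex_string_float[half:])
    return pos_list, load_list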
First of all, I'm sorry if my question doesn't make sense; I am new to the CVXPY library and I don't understand everything yet.
I am trying to solve a minimization problem that I thought would be easy to handle.
I have a matrix S of dimensions (9,7) with known coefficients, B of dimensions (1,7) with known coefficients, and Alpha of dimensions (1,7), which is what I need to find, with various constraints:
Alpha must be positive
The sum of all the coefficients of Alpha must be equal to 1
I need to optimize Alpha such that S @ Alpha - B = 0.
I discovered CVXPY and thought least-squares optimization was perfect for this issue.
This is the code I wrote:
Alpha = cp.Variable(7)
objective = cp.Minimize(cp.sum_squares(S @ Alpha - B))
constraints = [0 <= Alpha, Alpha<=1, np.sum(Alpha.value)==1]
prob = cp.Problem(objective, constraints)
result = prob.solve()
print(Alpha.value)
With
S= np.array([[0.03,0.02,0.072,0.051,0.058,0.0495,0.021 ],
[0.0295, 0.025 , 0.1 , 0.045 , 0.064 , 0.055 , 0.032 ],
[0.02 , 0.018 , 0.16 , 0.032 , 0.054 , 0.064 , 0.025 ],
[0.0195, 0.03 , 0.144 , 0.027 , 0.04 , 0.06 , 0.04 ],
[0.02 , 0.0315, 0.156 , 0.0295 ,0.027 , 0.0615 ,0.05 ],
[0.021 , 0.033 , 0.168 , 0.03 , 0.0265 ,0.063 , 0.09 ],
[0.02 , 0.05 , 0.28 , 0.039 , 0.035 , 0.055 , 0.04 ],
[0.021 , 0.03 , 0.22 , 0.0305, 0.0255, 0.057 , 0.009 ],
[0.0195, 0.008 , 0.2 , 0.021 , 0.01 , 0.048 , 0.0495]])
B=np.array([0.1015, 0.0888, 0.0911, 0.0901, 0.0945, 0.0909, 0.078 , 0.0913,
0.0845])
My issue is the following:
Without the constraint np.sum(Alpha.value)==1, the code gives me results; but when I add the constraint it returns
None
I presume the formulation is not right, but I have no idea how to write it another way.
Or maybe the problem doesn't have a solution?
Thank you for your time
Use just sum(Alpha) == 1. You are not supposed to use numpy functions in CVXPY expressions; you must use the CVXPY functions listed at https://www.cvxpy.org/tutorial/functions/index.html
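For concreteness, here is a minimal sketch of the corrected snippet (using cp.sum is my choice; the plain built-in sum mentioned above works too, and S and B are copied from the question):
import cvxpy as cp
import numpy as np
S = np.array([[0.03, 0.02, 0.072, 0.051, 0.058, 0.0495, 0.021],
              [0.0295, 0.025, 0.1, 0.045, 0.064, 0.055, 0.032],
              [0.02, 0.018, 0.16, 0.032, 0.054, 0.064, 0.025],
              [0.0195, 0.03, 0.144, 0.027, 0.04, 0.06, 0.04],
              [0.02, 0.0315, 0.156, 0.0295, 0.027, 0.0615, 0.05],
              [0.021, 0.033, 0.168, 0.03, 0.0265, 0.063, 0.09],
              [0.02, 0.05, 0.28, 0.039, 0.035, 0.055, 0.04],
              [0.021, 0.03, 0.22, 0.0305, 0.0255, 0.057, 0.009],
              [0.0195, 0.008, 0.2, 0.021, 0.01, 0.048, 0.0495]])
B = np.array([0.1015, 0.0888, 0.0911, 0.0901, 0.0945, 0.0909, 0.078, 0.0913, 0.0845])
Alpha = cp.Variable(7)
objective = cp.Minimize(cp.sum_squares(S @ Alpha - B))
# constrain the variable Alpha itself (not Alpha.value, which is None before solving)
# and use a CVXPY-compatible sum for the simplex constraint
constraints = [Alpha >= 0, Alpha <= 1, cp.sum(Alpha) == 1]
prob = cp.Problem(objective, constraints)
prob.solve()
print(Alpha.value)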
I'm trying to create a data visualization that's essentially a time series chart, but I have to use Pandas, Python, and Plotly, and I'm stuck on how to actually label the dates. Right now, the x labels are just integers from 1 to 60, and when you hover over the chart, you get that integer instead of the date.
I'm pulling values from a Google spreadsheet, and for now I'd like to avoid parsing CSV files.
I'd really like some help on how to label x as dates! Here's what I have so far:
import pandas as pd
import numpy as np  # needed for np.arange / np.array below
from matplotlib import pyplot as plt
import bpr
%matplotlib inline
import chart_studio.plotly as pl
import plotly.express as px
import plotly.graph_objects as go
f = open("../credentials.txt")
u = f.readline()
plotly_user = str(u[:-1])
k = f.readline()
plotly_api_key = str(k)
pl.sign_in(username = plotly_user, api_key = plotly_api_key)
rand_x = np.arange(61)
rand_x = np.flip(rand_x)
rand_y = np.array([0.91 , 1 , 1.24 , 1.25 , 1.4 , 1.36 , 1.72 , 1.3 , 1.29 , 1.17 , 1.57 , 1.95 , 2.2 , 2.07 , 2.03 , 2.14 , 1.96 , 1.87 , 1.25 , 1.34 , 1.13 , 1.31 , 1.35 , 1.54 , 1.38 , 1.53 , 1.5 , 1.32 , 1.26 , 1.4 , 1.89 , 1.55 , 1.98 , 1.75 , 1.14 , 0.57 , 0.51 , 0.41 , 0.24 , 0.16 , 0.08 , -0.1 , -0.24 , -0.05 , -0.15 , 0.34 , 0.23 , 0.15 , 0.12 , -0.09 , 0.13 , 0.24 , 0.22 , 0.34 , 0.01 , -0.08 , -0.27 , -0.6 , -0.17 , 0.28 , 0.38])
test_data = pd.DataFrame(columns=['X', 'Y'])
test_data['X'] = rand_x
test_data['Y'] = rand_y
test_data.head()
def create_line_plot(data, x, y, chart_title="Rate by Date", labels_dict={}, c=["indianred"]):
    fig = px.line(
        data,
        x = x,
        y = y,
        title = chart_title,
        labels = labels_dict,
        color_discrete_sequence = c
    )
    fig.show()
    return fig
fig = create_line_plot(test_data, 'X', 'Y', labels_dict={'X': 'Date', 'Y': 'Rate (%)'})
Right now, the x labels are just integers from 1 to 60, and when you hover over the chart, you get that integer instead of the date.
This happens because you are using rand_x as the x values, and rand_x is an array of integers. Setting labels_dict={'X': 'Date', 'Y': 'Rate (%)'} only adds the text "Date" in front of the x value. What you need to do is pass an array of datetime values as x. For example:
rand_x = np.array(['2020-01-01','2020-01-02','2020-01-03'], dtype='datetime64')
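Extending that idea to the question's DataFrame, a minimal sketch (the start date and daily frequency here are assumptions for illustration; substitute the real dates from the spreadsheet):
import pandas as pd
import plotly.express as px
# 61 daily dates to line up with the 61 y-values; the start date is made up
dates = pd.date_range(start='2020-01-01', periods=61, freq='D')
test_data = pd.DataFrame({'X': dates, 'Y': rand_y})  # rand_y as defined in the question
fig = px.line(test_data, x='X', y='Y', title='Rate by Date',
              labels={'X': 'Date', 'Y': 'Rate (%)'},
              color_discrete_sequence=['indianred'])
fig.show()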
I want to create a numpy array.
T = 200
I want to create an array from 0 to 199, in which each value will be divided by 200.
l = [0, 1/200, 2/200, ...]
Does numpy have any method for this calculation?
Alternatively one can use linspace:
>>> np.linspace(0, 1., 200, endpoint=False)
array([ 0. , 0.005, 0.01 , 0.015, 0.02 , 0.025, 0.03 , 0.035,
0.04 , 0.045, 0.05 , 0.055, 0.06 , 0.065, 0.07 , 0.075,
...
0.92 , 0.925, 0.93 , 0.935, 0.94 , 0.945, 0.95 , 0.955,
0.96 , 0.965, 0.97 , 0.975, 0.98 , 0.985, 0.99 , 0.995])
Use np.arange:
>>> import numpy as np
>>> np.arange(200, dtype=float)/200
array([ 0. , 0.005, 0.01 , 0.015, 0.02 , 0.025, 0.03 , 0.035,
0.04 , 0.045, 0.05 , 0.055, 0.06 , 0.065, 0.07 , 0.075,
0.08 , 0.085, 0.09 , 0.095, 0.1 , 0.105, 0.11 , 0.115,
...
0.88 , 0.885, 0.89 , 0.895, 0.9 , 0.905, 0.91 , 0.915,
0.92 , 0.925, 0.93 , 0.935, 0.94 , 0.945, 0.95 , 0.955,
0.96 , 0.965, 0.97 , 0.975, 0.98 , 0.985, 0.99 , 0.995])
T = 200.0
l = [x / float(T) for x in range(200)]
import numpy as np
T = 200
np.linspace(0.0, 1.0 - 1.0 / float(T), T)
Personally I prefer linspace for creating evenly spaced arrays in general. It is more complex in this case as the endpoint depends on the number of points T.
I have a dataframe of 16k records with multiple groups of countries and other fields. I have produced an initial output of the data that looks like the snippet below. Now I need to do some data cleansing and manipulation, remove skews or outliers, and replace them with values based on certain rules.
I.e. in the data below, how could I identify the skewed points (any value greater than 1) and replace them with the average of the next two records, or the previous record if there are no later records, within that group?
So in the dataframe below I would like to replace the Bill%4 value of 1.21 for IT week1 with the average of week2 and week3 for IT, which is 0.81.
Any tricks for this?
Country Week Bill%1 Bill%2 Bill%3 Bill%4 Bill%5 Bill%6
IT week1 0.94 0.88 0.85 1.21 0.77 0.75
IT week2 0.93 0.88 1.25 0.80 0.77 0.72
IT week3 0.94 1.33 0.85 0.82 0.76 0.76
IT week4 1.39 0.89 0.86 0.80 0.80 0.76
FR week1 0.92 0.86 0.82 1.18 0.75 0.73
FR week2 0.91 0.86 1.22 0.78 0.75 0.71
FR week3 0.92 1.29 0.83 0.80 0.75 0.75
FR week4 1.35 0.87 0.84 0.78 0.78 0.74
I don't know of any built-ins to do this, but you should be able to customize this to meet your needs, no?
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(10,5),columns=list('ABCDE'))
df.index = list('abcdeflght')
# Define cutoff value
cutoff = 0.90
for col in df.columns:
    # Identify index locations above cutoff
    outliers = df[col][ df[col]>cutoff ]
    # Browse through outliers and average according to index location
    for idx in outliers.index:
        # Get index location
        loc = df.index.get_loc(idx)
        # If not one of last two values in dataframe
        if loc<df.shape[0]-2:
            df[col][loc] = np.mean( df[col][loc+1:loc+3] )
        else:
            df[col][loc] = np.mean( df[col][loc-3:loc-1] )
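A side note (my addition, not part of the original answer): on newer pandas versions, chained assignment such as df[col][loc] = ... can raise SettingWithCopyWarning or fail to write back. An equivalent assignment via .iloc, keeping the same logic, would be:
col_pos = df.columns.get_loc(col)
if loc < df.shape[0] - 2:
    df.iloc[loc, col_pos] = df[col].iloc[loc+1:loc+3].mean()
else:
    df.iloc[loc, col_pos] = df[col].iloc[loc-3:loc-1].mean()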