use string to reference already assigned local variable [duplicate]

This question already has answers here:
How can I select a variable by (string) name?
(5 answers)
Closed 8 months ago.
I want to use form_state_data, which is a string, to reference the local variable of the same name inside the is_valid_phone function, as you can see in the print call.
form_state_data will always be a two-character state short code that also exists as a local variable in is_valid_phone, holding a list of that state's area codes as integers.
form_state_data = 'AL'
form_phone_data_sliced = 205
def is_valid_phone(form_state_data, form_phone_data_sliced):
    # Some codes are not correct.
    AL = [205, 251, 256, 334, 938]
    AK = [907]
    AZ = [480, 520, 602, 623, 928]
    AR = [479, 501, 870]
    CA = [209, 213, 279, 310, 323, 408, 415, 424, 442, 510, 530, 559, 562, 619, 626, 628, 650, 657, 661, 669, 707, 714, 747, 760, 805, 818, 820, 831, 858, 909, 916, 925, 949, 951]
    CO = [303, 719, 720, 970]
    CT = [203, 475, 860, 959]
    DE = [302]
    DC = [202]
    FL = [239, 305, 321, 352, 386, 407, 561, 727, 754, 772, 786, 813, 850, 863, 904, 941, 954]
    GA = [229, 404, 470, 478, 678, 706, 762, 770, 912]
    HI = [808]
    ID = [208, 986]
    IL = [217, 224, 309, 312, 331, 618, 630, 708, 773, 779, 815, 847, 872]
    IN = [219, 260, 317, 463, 574, 765, 812, 930]
    IA = [319, 515, 563, 641, 712]
    KS = [316, 620, 785, 913]
    KY = [270, 364, 502, 606, 859]
    LA = [225, 318, 337, 504, 985]
    ME = [207]
    MT = [339, 351, 413, 508, 617, 774, 781, 857, 978]
    NE = [308, 402, 531]
    NV = [702, 725, 775]
    NH = [603]
    NJ = [201, 551, 609, 640, 732, 848, 856, 862, 908, 973]
    NM = [505, 575]
    NY = [212, 315, 332, 347, 516, 518, 585, 607, 631, 646, 680, 716, 718, 838, 845, 914, 917, 929, 934]
    NC = [252, 336, 704, 743, 828, 910, 919, 980, 984]
    ND = [701]
    OH = [216, 220, 234, 330, 380, 419, 440, 513, 567, 614, 740, 937]
    OK = [405, 539, 580, 918]
    OR = [458, 503, 541, 971]
    MD = [240, 301, 410, 443, 667]
    MA = [218, 320, 507, 612, 651, 763, 952]
    MI = [228, 601, 662, 769]
    MN = [218, 320, 507, 612, 651, 763, 952]
    MS = [314, 417, 573, 636, 660, 816]
    MO = [406]
    PA = [215, 223, 267, 272, 412, 445, 484, 570, 610, 717, 724, 814, 878]
    RI = [401]
    SC = [803, 843, 854, 864]
    SD = [605]
    TN = [423, 615, 629, 731, 865, 901, 931]
    TX = [210, 214, 254, 281, 325, 346, 361, 409, 430, 432, 469, 512, 682, 713, 726, 737, 806, 817, 830, 832, 903, 915, 936, 940, 956, 972, 979]
    UT = [385, 435, 801]
    VT = [802]
    VA = [276, 434, 540, 571, 703, 757, 804]
    WA = [206, 253, 360, 425, 509, 564]
    WV = [304, 681]
    WI = [262, 414, 534, 608, 715, 920]
    WY = [307]
    print(form_phone_data_sliced in form_state_data)
is_valid_phone(form_state_data,form_phone_data_sliced)

You should use a dictionary to store the state codes; below is an example of how to achieve this.
states = {
'AL': [205, 251, 256, 334, 938],
'AK': [907],
'AZ': [480, 520, 602, 623, 928],
'AR': [479, 501, 870],
'CA': [209, 213, 279, 310, 323, 408, 415, 424, 442, 510, 530, 559, 562, 619, 626, 628, 650, 657, 661, 669, 707, 714, 747, 760, 805, 818, 820, 831, 858, 909, 916, 925, 949, 951],
'CO': [303, 719, 720, 970],
'CT': [203, 475, 860, 959],
'DE': [302],
'DC': [202],
'FL': [239, 305, 321, 352, 386, 407, 561, 727, 754, 772, 786, 813, 850, 863, 904, 941, 954],
'GA': [229, 404, 470, 478, 678, 706, 762, 770, 912],
'HI': [808],
'ID': [208, 986],
'IL': [217, 224, 309, 312, 331, 618, 630, 708, 773, 779, 815, 847, 872],
'IN': [219, 260, 317, 463, 574, 765, 812, 930],
'IA': [319, 515, 563, 641, 712],
'KS': [316, 620, 785, 913],
'KY': [270, 364, 502, 606, 859],
'LA': [225, 318, 337, 504, 985],
'ME': [207],
'MT': [339, 351, 413, 508, 617, 774, 781, 857, 978],
'NE': [308, 402, 531],
'NV': [702, 725, 775],
'NH': [603],
'NJ': [201, 551, 609, 640, 732, 848, 856, 862, 908, 973],
'NM': [505, 575],
'NY': [212, 315, 332, 347, 516, 518, 585, 607, 631, 646, 680, 716, 718, 838, 845, 914, 917, 929, 934],
'NC': [252, 336, 704, 743, 828, 910, 919, 980, 984],
'ND': [701],
'OH': [216, 220, 234, 330, 380, 419, 440, 513, 567, 614, 740, 937],
'OK': [405, 539, 580, 918],
'OR': [458, 503, 541, 971],
'MD': [240, 301, 410, 443, 667],
'MA': [218, 320, 507, 612, 651, 763, 952],
'MI': [228, 601, 662, 769],
'MN': [218, 320, 507, 612, 651, 763, 952],
'MS': [314, 417, 573, 636, 660, 816],
'MO': [406],
'PA': [215, 223, 267, 272, 412, 445, 484, 570, 610, 717, 724, 814, 878],
'RI': [401],
'SC': [803, 843, 854, 864],
'SD': [605],
'TN': [423, 615, 629, 731, 865, 901, 931],
'TX': [210, 214, 254, 281, 325, 346, 361, 409, 430, 432, 469, 512, 682, 713, 726, 737, 806, 817, 830, 832, 903, 915, 936, 940, 956, 972, 979],
'UT': [385, 435, 801],
'VT': [802],
'VA': [276, 434, 540, 571, 703, 757, 804],
'WA': [206, 253, 360, 425, 509, 564],
'WV': [304, 681],
'WI': [262, 414, 534, 608, 715, 920],
'WY': [307],
}
is_valid_phone = lambda state, code : code in states[state]
print(is_valid_phone('AL', 205))
print(is_valid_phone('AL', 2000005))
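One thing the lambda above does not cover is a state code that is missing from the dictionary: states[state] would raise KeyError. A minimal sketch of a more forgiving lookup follows; the AREA_CODES name and the two-entry table are hypothetical stand-ins for the full dictionary, and using sets instead of lists is an optional choice, not part of the original answer.

```python
# Hypothetical subset of the answer's table; sets give fast membership tests.
AREA_CODES = {
    'AL': {205, 251, 256, 334, 938},
    'AK': {907},
}


def is_valid_phone(state, code):
    """Return True if `code` is a known area code for `state`."""
    # .get() falls back to an empty set, so an unknown state yields False
    # rather than raising KeyError.
    return code in AREA_CODES.get(state, set())


print(is_valid_phone('AL', 205))   # True
print(is_valid_phone('ZZ', 205))   # False
```

Whether an unknown state should return False or raise an error depends on how the form data is validated upstream, so treat the fallback as one possible design.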

If you really want to keep these as separate variables (bad practice), you can turn the function into a class and then look up its attributes by name by calling vars() on the class.
class Phone:
    AL = [1, 2]
print(vars(Phone)['AL'])
vars() returns the __dict__ of the object it is given, a mapping from attribute names (strings) to values. Class attributes such as AL live on the class, not on its instances, so call vars() on Phone itself.

As people in the comments have said, put all of the states into a dictionary and then use this code inside the function:
if form_phone_data_sliced in states[form_state_data]:
    return True

Related

How to create a histogram from counts with bins spaced every 0.1

I have the following dataframe:
df = {'count1': [2.2336, 2.2454, 2.2538, 2.2716999999999996, 2.2798000000000003, 2.2843, 2.2906, 2.2969, 2.3223000000000003, 2.3282, 2.3356999999999997, 2.3544, 2.3651999999999997, 2.3727, 2.3775, 2.3823000000000003, 2.392, 2.4051, 2.4092, 2.4133, 2.4168000000000003, 2.4175, 2.4209, 2.4392, 2.4476, 2.456, 2.461, 2.4723, 2.4776, 2.4882, 2.4989, 2.5095, 2.5221999999999998, 2.5318, 2.5422, 2.5494, 2.559, 2.5654, 2.5814, 2.5878, 2.6238, 2.6178000000000003, 2.624, 2.6303, 2.6366, 2.6425, 2.6481999999999997, 2.6525, 2.6553, 2.663, 2.6712, 2.6898, 2.7051, 2.7144, 2.727, 2.7416, 2.7472, 2.7512, 2.7557, 2.7574, 2.7594000000000003, 2.7636, 2.7699000000000003, 2.7761, 2.7809, 2.7855, 2.7902, 2.7948000000000004, 2.7995, 2.8043, 2.815, 2.8249, 2.8352, 2.8455, 2.8708, 2.8874, 2.9004000000000003, 2.9301, 2.9399, 2.9513000000000003, 2.9634, 2.9745999999999997, 2.9852, 2.9959000000000002, 3.0037, 3.0093, 3.015, 3.0184, 3.0206, 3.0225, 3.0245, 3.0264, 3.0282, 3.0305999999999997, 3.0331, 3.0334, 3.0361, 3.0388, 3.0418000000000003, 3.0443000000000002, 3.0463, 3.0464, 3.0481, 3.0496999999999996, 3.0514, 3.0530999999999997, 3.0544000000000002, 3.0556, 3.0569, 3.0581, 3.0623, 3.0627, 3.0633000000000004, 3.0638, 3.0643000000000002, 3.0648, 3.0652, 3.0656999999999996, 3.0663, 3.0675, 3.0682, 3.0688, 3.0695, 3.0702, 3.0721, 3.0741, 3.0761, 3.078, 3.08, 3.082, 3.0839000000000003, 3.0859, 3.0879000000000003, 3.0898000000000003, 3.0918, 3.0938000000000003, 3.0994, 3.1050999999999997, 3.1144000000000003, 3.1613, 3.1649000000000003, 3.1752, 3.1869, 3.1899, 3.1925, 3.1976, 3.2001, 3.2051999999999996, 3.2098, 3.2123000000000004],
'count2': [3144, 3944, 7888, 4428, 68874, 5480, 56697, 20560, 8744, 91190, 352, 924, 1308611, 480, 51146, 170373, 58792, 11424, 1288673, 1845105, 401464, 657930, 1361172, 199373, 19753, 39082, 776, 7533, 9289, 36731, 53865, 100140, 59274, 35740, 2648, 144998, 78616, 848241, 34579, 216591, 22512, 4024, 17168, 1552, 13760, 8344, 65589, 43104, 44672, 917115, 16256, 4168, 29679, 22571, 7720, 452, 8836, 6888, 18578, 5148, 9289, 442, 214, 485, 3164, 1101, 1010, 9048, 293, 1628, 960, 517, 2362, 1262, 1524, 1173, 1348, 1288, 25568, 8416, 5792, 4944, 504, 4696, 2336, 458, 453, 1220, 1149, 6688, 6956, 7324, 7100, 7784, 5650, 5076, 5336, 6792, 5212, 4592, 5260, 1279, 654, 842, 990, 782, 1412, 1363, 935, 996, 775, 1471, 1525, 1398, 1097, 1082, 1668, 1007, 497, 598, 645, 698, 541, 504, 549, 540, 1568, 514, 578, 2906, 4360, 3916, 11944, 1434, 1589, 732, 641, 477, 307, 1884, 3232, 2408, 1016, 332, 139, 344, 4784, 1784, 1324, 204]}
df = pd.DataFrame(df)
I want to draw a bar plot from it, with count1 on the x axis and count2 on the y axis, using bins spaced at 0.1 intervals.
I used this:
plt.bar(x=df['count1'], y=df['count2'], width=0.1)
But it raises this error:
TypeError: bar() missing 1 required positional argument: 'height'
I'm trying to replicate this R code:
ggplot(df, aes(x= count1,
y= count2)) +
geom_col() +
ylim(0, 2000000) +
scale_x_binned()
That generates the following graph:

To get a histogram from values and counts, you can use the weights= parameter of plt.hist.
To create bins with a width of 0.1, you can use np.arange(...,..., 0.1).
The rwidth=0.9 parameter makes the bars a bit narrower.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = {'count1': [2.2336, 2.2454, 2.2538, 2.2716999999999996, 2.2798000000000003, 2.2843, 2.2906, 2.2969, 2.3223000000000003, 2.3282, 2.3356999999999997, 2.3544, 2.3651999999999997, 2.3727, 2.3775, 2.3823000000000003, 2.392, 2.4051, 2.4092, 2.4133, 2.4168000000000003, 2.4175, 2.4209, 2.4392, 2.4476, 2.456, 2.461, 2.4723, 2.4776, 2.4882, 2.4989, 2.5095, 2.5221999999999998, 2.5318, 2.5422, 2.5494, 2.559, 2.5654, 2.5814, 2.5878, 2.6238, 2.6178000000000003, 2.624, 2.6303, 2.6366, 2.6425, 2.6481999999999997, 2.6525, 2.6553, 2.663, 2.6712, 2.6898, 2.7051, 2.7144, 2.727, 2.7416, 2.7472, 2.7512, 2.7557, 2.7574, 2.7594000000000003, 2.7636, 2.7699000000000003, 2.7761, 2.7809, 2.7855, 2.7902, 2.7948000000000004, 2.7995, 2.8043, 2.815, 2.8249, 2.8352, 2.8455, 2.8708, 2.8874, 2.9004000000000003, 2.9301, 2.9399, 2.9513000000000003, 2.9634, 2.9745999999999997, 2.9852, 2.9959000000000002, 3.0037, 3.0093, 3.015, 3.0184, 3.0206, 3.0225, 3.0245, 3.0264, 3.0282, 3.0305999999999997, 3.0331, 3.0334, 3.0361, 3.0388, 3.0418000000000003, 3.0443000000000002, 3.0463, 3.0464, 3.0481, 3.0496999999999996, 3.0514, 3.0530999999999997, 3.0544000000000002, 3.0556, 3.0569, 3.0581, 3.0623, 3.0627, 3.0633000000000004, 3.0638, 3.0643000000000002, 3.0648, 3.0652, 3.0656999999999996, 3.0663, 3.0675, 3.0682, 3.0688, 3.0695, 3.0702, 3.0721, 3.0741, 3.0761, 3.078, 3.08, 3.082, 3.0839000000000003, 3.0859, 3.0879000000000003, 3.0898000000000003, 3.0918, 3.0938000000000003, 3.0994, 3.1050999999999997, 3.1144000000000003, 3.1613, 3.1649000000000003, 3.1752, 3.1869, 3.1899, 3.1925, 3.1976, 3.2001, 3.2051999999999996, 3.2098, 3.2123000000000004],
'count2': [3144, 3944, 7888, 4428, 68874, 5480, 56697, 20560, 8744, 91190, 352, 924, 1308611, 480, 51146, 170373, 58792, 11424, 1288673, 1845105, 401464, 657930, 1361172, 199373, 19753, 39082, 776, 7533, 9289, 36731, 53865, 100140, 59274, 35740, 2648, 144998, 78616, 848241, 34579, 216591, 22512, 4024, 17168, 1552, 13760, 8344, 65589, 43104, 44672, 917115, 16256, 4168, 29679, 22571, 7720, 452, 8836, 6888, 18578, 5148, 9289, 442, 214, 485, 3164, 1101, 1010, 9048, 293, 1628, 960, 517, 2362, 1262, 1524, 1173, 1348, 1288, 25568, 8416, 5792, 4944, 504, 4696, 2336, 458, 453, 1220, 1149, 6688, 6956, 7324, 7100, 7784, 5650, 5076, 5336, 6792, 5212, 4592, 5260, 1279, 654, 842, 990, 782, 1412, 1363, 935, 996, 775, 1471, 1525, 1398, 1097, 1082, 1668, 1007, 497, 598, 645, 698, 541, 504, 549, 540, 1568, 514, 578, 2906, 4360, 3916, 11944, 1434, 1589, 732, 641, 477, 307, 1884, 3232, 2408, 1016, 332, 139, 344, 4784, 1784, 1324, 204]}
df = pd.DataFrame(df)
bin_start = np.trunc(df['count1'].min() * 10) / 10
bin_end = df['count1'].max() + 0.1
plt.style.use('ggplot')
plt.hist(x=df['count1'], weights=df['count2'], bins=np.arange(bin_start, bin_end, 0.1), rwidth=0.9)
plt.gca().get_yaxis().get_major_formatter().set_scientific(False)
plt.xlabel('count1')
plt.ylabel('count2')
plt.tight_layout()
plt.show()
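To make it clearer what weights= does in the call above, here is a rough pure-Python sketch of the same idea: each value is assigned to a 0.1-wide bin and its count is added to that bin's total. The function name weighted_bin_totals is made up for illustration, and the sample data is just the first few (count1, count2) pairs that fall into the 2.2 and 2.3 bins; matplotlib's own binning handles edges slightly differently.

```python
import math
from collections import defaultdict


def weighted_bin_totals(values, weights, bin_width=0.1):
    """Sum `weights` per bin of width `bin_width`, keyed by the bin's left edge."""
    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        # Index of the bin containing `value`; the small epsilon guards against
        # a value on a bin edge landing one bin too low due to float rounding.
        index = math.floor(value / bin_width + 1e-9)
        totals[round(index * bin_width, 10)] += weight
    return dict(sorted(totals.items()))


sample_values = [2.2336, 2.2454, 2.2538, 2.3223, 2.3282]
sample_weights = [3144, 3944, 7888, 8744, 91190]
print(weighted_bin_totals(sample_values, sample_weights))
# {2.2: 14976.0, 2.3: 99934.0}
```

The totals are what the bar heights would be for those two bins; plt.bar expects them through its height argument, which is why passing y= raised the TypeError in the question.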

How to implement different sequences in shell sort in python?

Hi, I have the following code implementing Shell sort in Python. How can I use the following gap sequences with it? (Note that these are gap sequences, not the list I want to sort.)
1, 4, 13, 40, 121, 364, 1093, 3280, 9841, 29524 (Knuth’s sequence)
1, 5, 17, 53, 149, 373, 1123, 3371, 10111, 30341
1, 10, 30, 60, 120, 360, 1080, 3240, 9720, 29160
interval = n // 2
while interval > 0:
    for i in range(interval, n):
        temp = array[i]
        j = i
        while j >= interval and array[j - interval] > temp:
            array[j] = array[j - interval]
            j -= interval
        array[j] = temp
    interval //= 2

You could modify the pseudo-code provided in the Wikipedia article for Shellsort to take in the gap sequence as a parameter:
from random import choices
from timeit import timeit
RAND_SEQUENCE_SIZE = 500
GAP_SEQUENCES = {
'CIURA_A102549': [701, 301, 132, 57, 23, 10, 4, 1],
'KNUTH_A003462': [29524, 9841, 3280, 1093, 364, 121, 40, 13, 4, 1],
'SPACED_OUT_PRIME_GAPS': [30341, 10111, 3371, 1123, 373, 149, 53, 17, 5, 1],
'SPACED_OUT_EVEN_GAPS': [29160, 9720, 3240, 1080, 360, 120, 60, 30, 10, 1],
}
def shell_sort(seq: list[int], gap_sequence: list[int]) -> None:
    n = len(seq)
    # Start with the largest gap and work down to a gap of 1. Similar to
    # insertion sort but instead of 1, gap is being used in each step.
    for gap in gap_sequence:
        # Do a gapped insertion sort for every element in gaps.
        # Each gap sort includes (0..gap-1) offset interleaved sorting.
        for offset in range(gap):
            for i in range(offset, n, gap):
                # Save seq[i] in temp and make a hole at position i.
                temp = seq[i]
                # Shift earlier gap-sorted elements up until the correct
                # location for seq[i] is found.
                j = i
                while j >= gap and seq[j - gap] > temp:
                    seq[j] = seq[j - gap]
                    j -= gap
                # Put temp (the original seq[i]) in its correct location.
                seq[j] = temp
def main() -> None:
    seq = choices(population=range(1000), k=RAND_SEQUENCE_SIZE)
    print(f'{seq = }')
    print(f'{len(seq) = }')
    for name, gap_sequence in GAP_SEQUENCES.items():
        print(f'Shell sort using {name} gap sequence: {gap_sequence}')
        print(f'Time taken to sort 100 times: {timeit(lambda: shell_sort(seq.copy(), gap_sequence), number=100)} seconds')
if __name__ == '__main__':
    main()
Example Output:
seq = [331, 799, 153, 700, 373, 38, 203, 535, 894, 500, 922, 939, 507, 506, 89, 40, 442, 108, 112, 359, 280, 946, 395, 708, 140, 435, 588, 306, 202, 23, 6, 189, 570, 600, 857, 949, 606, 617, 556, 863, 521, 776, 436, 801, 501, 588, 927, 279, 210, 72, 460, 52, 340, 632, 385, 965, 730, 360, 88, 216, 991, 520, 74, 112, 770, 853, 483, 787, 229, 812, 259, 349, 967, 227, 957, 728, 780, 51, 604, 748, 3, 679, 33, 488, 130, 203, 493, 471, 397, 53, 49, 172, 7, 306, 613, 519, 575, 64, 168, 161, 376, 903, 338, 800, 58, 729, 421, 238, 967, 294, 967, 218, 456, 823, 649, 569, 144, 103, 970, 780, 859, 719, 15, 536, 263, 917, 0, 54, 370, 703, 911, 518, 78, 41, 106, 452, 355, 571, 249, 58, 274, 327, 500, 341, 743, 536, 432, 799, 597, 681, 301, 856, 219, 63, 653, 680, 891, 725, 537, 673, 815, 504, 720, 573, 60, 91, 909, 892, 964, 119, 793, 540, 303, 538, 130, 717, 755, 968, 46, 229, 837, 398, 182, 303, 99, 808, 56, 780, 415, 33, 511, 771, 875, 593, 120, 727, 505, 905, 619, 295, 958, 566, 8, 291, 811, 529, 789, 523, 545, 5, 631, 28, 107, 292, 831, 657, 952, 239, 814, 862, 912, 2, 147, 750, 132, 528, 408, 916, 718, 261, 488, 621, 261, 963, 880, 625, 151, 982, 819, 749, 224, 572, 690, 766, 278, 417, 248, 987, 664, 515, 691, 940, 860, 172, 898, 321, 381, 662, 293, 354, 642, 219, 133, 133, 854, 162, 254, 816, 630, 21, 577, 486, 792, 731, 714, 581, 633, 794, 120, 386, 874, 177, 652, 159, 264, 414, 417, 730, 728, 716, 973, 688, 106, 345, 153, 909, 382, 505, 721, 363, 230, 588, 765, 340, 142, 549, 558, 189, 547, 728, 974, 468, 182, 255, 637, 317, 40, 775, 696, 135, 985, 884, 131, 797, 84, 89, 962, 810, 520, 843, 24, 400, 717, 834, 170, 681, 333, 68, 159, 688, 422, 198, 621, 386, 391, 839, 283, 167, 655, 314, 820, 432, 412, 181, 440, 864, 828, 217, 491, 593, 298, 885, 831, 535, 92, 305, 510, 90, 949, 461, 627, 851, 606, 280, 413, 624, 916, 16, 517, 700, 776, 323, 161, 329, 25, 868, 258, 97, 219, 620, 69, 24, 794, 981, 361, 691, 20, 90, 825, 442, 531, 562, 240, 0, 440, 418, 338, 526, 34, 230, 
381, 598, 734, 925, 209, 231, 980, 122, 374, 752, 144, 105, 920, 780, 828, 948, 515, 443, 810, 81, 303, 751, 779, 516, 394, 455, 116, 448, 652, 293, 327, 367, 793, 47, 946, 653, 927, 910, 583, 845, 442, 989, 393, 490, 564, 54, 656, 689, 626, 531, 941, 575, 628, 865, 705, 219, 42, 19, 10, 155, 436, 319, 510, 520, 869, 101, 918, 170, 826, 146, 389, 200, 992, 404, 982, 889, 818, 684, 524, 642, 991, 973, 561, 104, 418, 207, 963, 192, 410, 33]
len(seq) = 500
Shell sort using CIURA_A102549 gap sequence: [701, 301, 132, 57, 23, 10, 4, 1]
Time taken to sort 100 times: 0.06717020808719099 seconds
Shell sort using KNUTH_A003462 gap sequence: [29524, 9841, 3280, 1093, 364, 121, 40, 13, 4, 1]
Time taken to sort 100 times: 0.34870366705581546 seconds
Shell sort using SPACED_OUT_PRIME_GAPS gap sequence: [30341, 10111, 3371, 1123, 373, 149, 53, 17, 5, 1]
Time taken to sort 100 times: 0.3563524999190122 seconds
Shell sort using SPACED_OUT_EVEN_GAPS gap sequence: [29160, 9720, 3240, 1080, 360, 120, 60, 30, 10, 1]
Time taken to sort 100 times: 0.38147866702638566 seconds
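In the timings above, the three sequences that start near 30000 are slower than Ciura's on a 500-element list. In the implementation shown, every gap is looped over with range(gap) even when the gap is larger than the list, so oversized gaps cost time without moving anything. One way around this, sketched below under the assumption that only gaps smaller than the list length are worth using, is to generate Knuth's sequence for the list at hand. The helper name knuth_gaps is made up for illustration, and the sort here is a compact variant of the one above, not a replacement for it.

```python
def knuth_gaps(n):
    """Return Knuth's gaps (1, 4, 13, 40, ...) smaller than n, largest first."""
    gaps = []
    gap = 1
    while gap < n:
        gaps.append(gap)
        gap = 3 * gap + 1  # next term of 1, 4, 13, 40, 121, ...
    return gaps[::-1]


def shell_sort(seq, gap_sequence):
    """Sort seq in place using the supplied gaps (the last gap must be 1)."""
    for gap in gap_sequence:
        # Gapped insertion sort over all elements from index `gap` onwards.
        for i in range(gap, len(seq)):
            temp = seq[i]
            j = i
            while j >= gap and seq[j - gap] > temp:
                seq[j] = seq[j - gap]
                j -= gap
            seq[j] = temp


data = [23, 5, 17, 1, 40, 13, 8, 2]
shell_sort(data, knuth_gaps(len(data)))
print(data)  # [1, 2, 5, 8, 13, 17, 23, 40]
```

The same filtering idea applies to a fixed list of gaps: `[g for g in gaps if g < len(seq)]` keeps only the ones that can do any work.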

Get all list elements between two specified elements

I have a list like this:
['|', 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, '|', 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, '|', 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 
1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, '|']
And I'd like to convert it so that the elements between each pair of "|" symbols are put into a nested list instead.
So what I'd like it to look like is:
[[480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720], [840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900], [960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 
1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140]]

You can use itertools.groupby for this. As the key for grouping, just test whether the current element is your separator, then discard all the segments where it is the separator.
>>> from itertools import groupby
>>> data = ["|", 1, 2, 3, 4, "|", 5, 6, "|", 7, 8, 9, "|"]
>>> [list(g) for k, g in groupby(data, key=lambda x: x != "|") if k]
[[1, 2, 3, 4], [5, 6], [7, 8, 9]]
This also works well whether or not your list starts and/or ends with a separator. Empty segments are discarded, though.
>>> data = [1, 2, "|", "|", 3, 4]
>>> [list(g) for k, g in groupby(data, key=lambda x: x != "|") if k]
[[1, 2], [3, 4]]
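If empty segments do matter, groupby is not the right tool, since adjacent separators collapse into one group. A small sketch of an alternative that keeps them, behaving much like str.split on a list; the function name split_keep_empty is hypothetical:

```python
def split_keep_empty(items, sep="|"):
    """Split `items` on `sep`, keeping empty segments between adjacent separators."""
    segments = [[]]
    for item in items:
        if item == sep:
            segments.append([])        # a separator starts a new segment
        else:
            segments[-1].append(item)  # otherwise extend the current segment
    return segments


print(split_keep_empty([1, 2, "|", "|", 3, 4]))  # [[1, 2], [], [3, 4]]
```

Note that a leading or trailing separator produces an empty first or last segment with this approach, so for the list in the question the groupby version is the closer match.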

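superList = []
segment = []  # renamed from "list" to avoid shadowing the built-in
for item in inputList:
    if item == '|':
        if segment:
            superList.append(segment)
            segment = []
    else:
        segment.append(item)
if segment:  # keep a trailing segment that has no closing '|'
    superList.append(segment)
print(superList)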

assign value of arbitrary line in 2-d array to nans

I have a 2D numpy array, z, in which I would like to assign values to nan based on the equation of a line +/- a width of 20. I am trying to implement the Raman 2nd scattering correction as it is done by the eem_remove_scattering method in the eemR package listed here:
https://cran.r-project.org/web/packages/eemR/vignettes/introduction.html
but the method isn't visible.
import numpy as np
ex = np.array([240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300,
305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365,
370, 375, 380, 385, 390, 395, 400, 405, 410, 415, 420, 425, 430,
435, 440, 445, 450])
em = np.array([300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324,
326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350,
352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376,
378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402,
404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428,
430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454,
456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480,
482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506,
508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532,
534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558,
560, 562, 564, 566, 568, 570, 572, 574, 576, 578, 580, 582, 584,
586, 588, 590, 592, 594, 596, 598, 600])
X, Y = np.meshgrid(ex, em)
z = np.sin(X) + np.cos(Y)
The equation that I would like to apply is em = -2*ex / (0.00036*ex - 1) + 500.
I want every value in the array that intersects this line (+/- 20) to be set to nan. It's simple enough to set a single element to nan, but I haven't been able to find a Python function that applies this equation to the array and sets only the values that intersect the line to nan.
The desired output would be a new array with the same dimensions as z, but with the values that intersect the line equivalent to nan. Any suggestions on how to proceed are greatly appreciated.

Use np.where in the form np.where( "condition for intersection", np.nan, z):
zi = np.where( np.abs(-2*X/(0.00036*X-1) + 500 - Y) <= 20, np.nan, z)
As a matter of fact, there are no intersections here because (0.00036*ex-1) is close to -1 for all your values, which makes - 2*ex/(0.00036*ex-1) close to 2*ex, and adding 500 brings this over any values you have in em. But in principle this works.
Also, I suspect that the goal you plan to achieve by setting those values to NaN would be better achieved by using a masked array.
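The claim that the line never touches the grid can be checked with a few lines of plain Python. This is only a sanity check of the arithmetic, using the ex values from the question (240 to 450 in steps of 5) and the largest em value (600); the function name raman_line is made up for illustration.

```python
def raman_line(ex):
    """Emission wavelength on the question's line for excitation `ex`."""
    return -2 * ex / (0.00036 * ex - 1) + 500


excitations = range(240, 451, 5)  # same values as `ex` in the question
em_max = 600                       # largest value in `em`
band = 20                          # half-width of the band around the line

lowest = min(raman_line(ex) for ex in excitations)
print(round(lowest, 1))            # 1025.4
print(lowest - band > em_max)      # True: the band lies entirely above the grid
```

Since even the lowest point of the band is far above 600, the np.where call above leaves z unchanged for this particular data.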

Why my python multiprocessing code return the same result in randomized number? [duplicate]

This question already has answers here:
Python Multiprocessing Numpy Random [duplicate]
(2 answers)
Closed 7 years ago.
I'm analyzing a large graph. So, I divide the graph into chunks and hopefully with multi-core CPU it would be faster. However, my model is a randomized model so there's a chance that the results of each run won't be the same. I'm testing the idea and I get the same result all the time so I'm wondering if my code is correct.
Here's my code
from multiprocessing import Process, Queue
# split a list into evenly sized chunks
def chunks(l, n):
    return [l[i:i+n] for i in range(0, len(l), n)]
def multiprocessing_icm(queue, nodes):
    queue.put(independent_cascade_igraph(twitter_igraph, nodes, steps=1))
def dispatch_jobs(data, job_number):
    total = len(data)
    chunk_size = total // job_number  # integer division: range() needs an int step
    slice = chunks(data, chunk_size)
    jobs = []
    processes = []
    queue = Queue()
    for i, s in enumerate(slice):
        j = Process(target=multiprocessing_icm, args=(queue, s))
        jobs.append(j)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    return queue
dispatch_jobs(['121817564', '121817564'], 2)
If you're wondering what independent_cascade_igraph is, here's the code:
import copy
import random
def independent_cascade_igraph(G, seeds, steps=0):
    # init activation probabilities
    for e in G.es():
        if 'act_prob' not in e.attributes():
            e['act_prob'] = 0.1
        elif e['act_prob'] > 1:
            raise Exception("edge activation probability:", e['act_prob'], "cannot be larger than 1")
    # perform diffusion
    A = copy.deepcopy(seeds) # prevent side effect
    if steps <= 0:
        # perform diffusion until no more nodes can be activated
        return _diffuse_all(G, A)
    # perform diffusion for at most "steps" rounds
    return _diffuse_k_rounds(G, A, steps)
def _diffuse_all(G, A):
    tried_edges = set()
    layer_i_nodes = [ ]
    layer_i_nodes.append([i for i in A]) # prevent side effect
    while True:
        len_old = len(A)
        (A, activated_nodes_of_this_round, cur_tried_edges) = _diffuse_one_round(G, A, tried_edges)
        layer_i_nodes.append(activated_nodes_of_this_round)
        tried_edges = tried_edges.union(cur_tried_edges)
        if len(A) == len_old:
            break
    return layer_i_nodes
def _diffuse_k_rounds(G, A, steps):
    tried_edges = set()
    layer_i_nodes = [ ]
    layer_i_nodes.append([i for i in A])
    while steps > 0 and len(A) < G.vcount():
        len_old = len(A)
        (A, activated_nodes_of_this_round, cur_tried_edges) = _diffuse_one_round(G, A, tried_edges)
        layer_i_nodes.append(activated_nodes_of_this_round)
        tried_edges = tried_edges.union(cur_tried_edges)
        if len(A) == len_old:
            break
        steps -= 1
    return layer_i_nodes
def _diffuse_one_round(G, A, tried_edges):
    activated_nodes_of_this_round = set()
    cur_tried_edges = set()
    for s in A:
        for nb in G.successors(s):
            if nb in A or (s, nb) in tried_edges or (s, nb) in cur_tried_edges:
                continue
            if _prop_success(G, s, nb):
                activated_nodes_of_this_round.add(nb)
            cur_tried_edges.add((s, nb))
    activated_nodes_of_this_round = list(activated_nodes_of_this_round)
    A.extend(activated_nodes_of_this_round)
    return A, activated_nodes_of_this_round, cur_tried_edges
def _prop_success(G, src, dest):
    '''
    act_prob = 0.1
    for e in G.es():
        if (src, dest) == e.tuple:
            act_prob = e['act_prob']
            break
    '''
    return random.random() <= 0.1
Here's the result of multiprocessing
[['121817564'], [1538, 1539, 4, 517, 1547, 528, 2066, 1623, 1540, 538, 1199, 31, 1056, 1058, 547, 1061, 1116, 1067, 1069, 563, 1077, 1591, 1972, 1595, 1597, 1598, 1088, 1090, 1608, 1656, 1098, 1463, 1105, 1619, 1622, 1111, 601, 1627, 604, 1629, 606, 95, 612, 101, 1980, 618, 1652, 1897, 1144, 639, 640, 641, 647, 650, 1815, 1677, 143, 1170, 1731, 660, 1173, 1690, 1692, 1562, 1563, 1189, 1702, 687, 689, 1203, 1205, 1719, 703, 1219, 1229, 1744, 376, 1746, 211, 1748, 213, 1238, 218, 221, 735, 227, 1764, 741, 230, 1769, 1258, 1780, 1269, 1783, 761, 763, 1788, 1789, 1287, 769, 258, 1286, 263, 264, 780, 1298, 1299, 1812, 473, 1822, 1828, 806, 811, 1324, 814, 304, 478, 310, 826, 1858, 1349, 326, 327, 1352, 329, 1358, 336, 852, 341, 854, 1879, 1679, 868, 2022, 1385, 1902, 1904, 881, 1907, 1398, 1911, 888, 1940, 1402, 1941, 1920, 1830, 387, 1942, 905, 1931, 1411, 399, 1426, 915, 916, 917, 406, 407, 1433, 1947, 1441, 419, 1445, 1804, 428, 1454, 1455, 948, 1973, 951, 1466, 443, 1468, 1471, 1474, 1988, 966, 1479, 1487, 976, 467, 1870, 2007, 985, 1498, 990, 1504, 1124, 485, 486, 489, 492, 2029, 2033, 1524, 1534, 2038, 1018, 1535, 510, 1125]]
[['121817564'], [1538, 1539, 4, 517, 1547, 528, 2066, 1623, 1540, 538, 1199, 31, 1056, 1058, 547, 1061, 1116, 1067, 1069, 563, 1077, 1591, 1972, 1595, 1597, 1598, 1088, 1090, 1608, 1656, 1098, 1463, 1105, 1619, 1622, 1111, 601, 1627, 604, 1629, 606, 95, 612, 101, 1980, 618, 1652, 1897, 1144, 639, 640, 641, 647, 650, 1815, 1677, 143, 1170, 1731, 660, 1173, 1690, 1692, 1562, 1563, 1189, 1702, 687, 689, 1203, 1205, 1719, 703, 1219, 1229, 1744, 376, 1746, 211, 1748, 213, 1238, 218, 221, 735, 227, 1764, 741, 230, 1769, 1258, 1780, 1269, 1783, 761, 763, 1788, 1789, 1287, 769, 258, 1286, 263, 264, 780, 1298, 1299, 1812, 473, 1822, 1828, 806, 811, 1324, 814, 304, 478, 310, 826, 1858, 1349, 326, 327, 1352, 329, 1358, 336, 852, 341, 854, 1879, 1679, 868, 2022, 1385, 1902, 1904, 881, 1907, 1398, 1911, 888, 1940, 1402, 1941, 1920, 1830, 387, 1942, 905, 1931, 1411, 399, 1426, 915, 916, 917, 406, 407, 1433, 1947, 1441, 419, 1445, 1804, 428, 1454, 1455, 948, 1973, 951, 1466, 443, 1468, 1471, 1474, 1988, 966, 1479, 1487, 976, 467, 1870, 2007, 985, 1498, 990, 1504, 1124, 485, 486, 489, 492, 2029, 2033, 1524, 1534, 2038, 1018, 1535, 510, 1125]]
But here's an example of what I get if I run independent_cascade_igraph twice:
independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
[514,
1773,
1540,
1878,
2057,
1035,
1550,
2064,
1042,
533,
1558,
1048,
1054,
544,
545,
1061,
1067,
1885,
1072,
350,
1592,
1460,...
independent_cascade_igraph(twitter_igraph, ['121817564'], steps=1)
[['121817564'],
[1027,
2055,
8,
1452,
1546,
1038,
532,
1045,
542,
546,
1059,
549,
1575,
1576,
2030,
1067,
1068,
1071,
564,
573,
575,
1462,
584,
1293,
1105,
595,
599,
1722,
1633,
1634,
614,
1128,
1131,
1286,
621,
1647,
1648,
627,
636,
1662,
1664,
1665,
130,
1671,
1677,
656,
1169,
148,
1686,
1690,
667,
1186,
163,
1700,
1191,
1705,
1711,...
So, what I'm hoping to get out of this is if I have a list of 500 ids, I would like the first CPU to calculate the first 250 and the second CPU to calculate the last 250 and then merge the result. I'm not sure if I understand multiprocessing correctly.

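As mentioned, e.g., in this SO answer, on *nix systems child processes inherit the state of the parent's RNG. Call random.seed() in every child process to re-initialize it, either with a per-process seed of your own or, with no argument, from the operating system's randomness source (or the current time if none is available).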
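A minimal sketch of that advice, assuming the work is started with Process and collected through a Queue as in the question. The names fresh_draws, worker and run_children are hypothetical, and worker stands in for multiprocessing_icm; the only essential line is the random.seed() call at the start of the child's work.

```python
import os
import random
from multiprocessing import Process, Queue


def fresh_draws(n):
    """Re-seed the module-level RNG, then return n random floats."""
    # With no argument, random.seed() re-initialises the generator from
    # OS-provided randomness (or the current time), discarding any state
    # inherited from the parent process.
    random.seed()
    return [random.random() for _ in range(n)]


def worker(queue, n):
    # Hypothetical stand-in for multiprocessing_icm in the question.
    queue.put((os.getpid(), fresh_draws(n)))


def run_children(num_children=2, n=3):
    """Start the children and return one (pid, draws) pair per child."""
    queue = Queue()
    procs = [Process(target=worker, args=(queue, n)) for _ in range(num_children)]
    for p in procs:
        p.start()
    results = [queue.get() for _ in procs]  # collect before joining
    for p in procs:
        p.join()
    return results
```

Call run_children() from under an `if __name__ == "__main__":` guard so that it also works on platforms that start child processes by re-importing the main module. With the seed call in place, each child should report a different list of draws.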

I haven't read your program in detail, but my general feeling is that you probably have a random number generator seed problem. If you run the program twice on the same CPU, the generator's state will be different the second time you run it. But if you run it on two different CPUs, your generators may be initialized with the same default seed, thus giving the same results.
