Reading just the part after the decimal point for an entire data file (Python)

I'm trying to read just the value after the decimal point of a parameter calculated in my Python script for a whole data set. I've read up on math.modf and I understand how it should work, but I'm unsure how to apply it to my dataset as a whole. Ultimately I'm trying to plot a scatter graph with the calculated values.
I only need the part after the decimal point from this equation, where x is the imported dataset:
p = (x[:,0]-y)/z
I understand math.modf returns a tuple (fractional, integer), so I tried adding [0] at the end, but I think that interferes when I'm trying to read certain lines from the dataset.
Sorry if this is really basic, I'm new to Python. Thanks in advance.
This is what I have so far:
norm1 = np.loadtxt("dataset")
y = (numerical value)
z = (numerical value)
p = (norm1[:,0] - y)/z
decp = math.modf(p)
plt.scatter(decp, norm1[:,2])

Use it like this:
import math
x = 5/2
print(x)
x = math.modf(x)[0]
print(x)
output:
2.5
0.5
Edit 1:
For an entire dataset:
import math
x = 5/2
y = 5/2
values = []  # avoid naming this "list", which would shadow the built-in
values.append(x)
values.append(y)
print(values)
for i in range(len(values)):
    values[i] = math.modf(values[i])[0]
print(values)
output:
[2.5, 2.5]
[0.5, 0.5]
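For a NumPy array such as the `p` in the question, the loop can be avoided: `math.modf` accepts only a single number, while `np.modf` works element-wise. A minimal sketch, assuming the data is already a float array (the sample values below are made up for illustration):

```python
import numpy as np

# Made-up stand-in for p = (norm1[:, 0] - y) / z
p = np.array([2.5, 3.25, 10.0, 7.75])

# np.modf returns two arrays: (fractional parts, integer parts)
frac, whole = np.modf(p)
print(frac)   # fractional part of every element
```

`frac` could then be passed to `plt.scatter(frac, norm1[:, 2])`. For negative inputs `np.modf` keeps the sign (the fractional part of -2.5 is -0.5), whereas `p % 1` would give 0.5.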

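Can't you handle the numbers as strings and cast the digit back to int? Note that this returns only the first digit after the decimal point, not the whole fractional part.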
# example input
nbrs = [45646.45646, 45646.15649, 48646.67845, 486468.15684]
def get_first_frac(n: float) -> int:
    # first digit after the decimal point in the string form of n
    return int(str(n).split('.')[1][0])
nbrs_frac = [get_first_frac(n) for n in nbrs]
print(nbrs_frac)
result:
[4, 1, 6, 1]
Edit: to apply this to a NumPy array, do the following:
result = np.array(list(map(get_first_frac, x)))

Related

creating initial population for genetic algorithm

I need to create the initial population to solve the equation z = x⁵ - 10x³ + 30x - y² + 21y using a genetic algorithm. The population must be binary and must follow these rules:
X and Y range: [-2.5, 2.5]
The first bit represents the sign (0 or 1)
The second and third bits represent the integer part, values from 0 to 2 (00, 01, 10)
The remaining bits represent the fractional part, values from 0 to 5000.
def pop(pop_size):
    pop = []
    for i in range(pop_size):
        for j in range(2):
            signal = bin(np.random.randint(0, 2))[2:]
            integer = bin(np.random.randint(0, 3))[2:]
            float = bin(np.random.randint(0, 5001))[2:].zfill(13)
            binary = [signal, integer, float]
            binary = [''.join(binary)]
            pop.append(binary)
    return pop
My output right now looks like this: [['1110001000110000'], ['1100010010011000'], ['11000100010001010'], ['0100011000000010'], ['0100010111100001'], ['01000111001101110']]
But I need it to look like this: [['1110001000110000', '1100010010011000'], ['11000100010001010', '0100011000000010'], ['0100010111100001', '01000111001101110']] because each pair represents the value for X and Y.
Any idea of what I'm missing?
How about:
def pop(pop_size):
    rlt = []
    for i in range(pop_size):
        # start a new [X, Y] pair and fill it in below
        rlt.append([None, None])
        for j in range(2):
            signal = bin(np.random.randint(0, 2))[2:]
            integer = bin(np.random.randint(0, 3))[2:]
            floats = bin(np.random.randint(0, 5001))[2:].zfill(13)
            rlt[-1][j] = signal+integer+floats
    return rlt
Demo
>>> pop(3)
[['0100111010000110', '000110111010111'], ['100101100010010', '010010000101100'], ['0100011000011010', '0100111100001011']]
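In the demo above the strings differ in length (15 or 16 characters) because the integer part is not zero-padded. A possible fixed-width variant, sketched under the assumption that every gene should be exactly 1 + 2 + 13 = 16 bits; the helper names `encode_gene` and `make_population` are hypothetical, not from the original posts:

```python
import numpy as np

def encode_gene(rng):
    """Return one 16-bit string: 1 sign bit, 2 integer bits, 13 fraction bits."""
    sign = format(int(rng.integers(0, 2)), "01b")
    integer = format(int(rng.integers(0, 3)), "02b")       # 0..2, padded to 2 bits
    fraction = format(int(rng.integers(0, 5001)), "013b")  # 0..5000, padded to 13 bits
    return sign + integer + fraction

def make_population(pop_size, seed=None):
    """Return pop_size individuals, each an [x_bits, y_bits] pair."""
    rng = np.random.default_rng(seed)
    return [[encode_gene(rng), encode_gene(rng)] for _ in range(pop_size)]

population = make_population(3, seed=0)
print(population)
```

With fixed widths, each string can later be decoded by slicing at positions 1 and 3 without guessing where the integer part ends.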

Finding similar numbers in a list and getting the average

I currently have the numbers above in a list. How would you go about grouping similar numbers (those within 850 of each other) and replacing each group with its average, to make the list smaller?
For example, I have the list
l = [2000,2200,5000,2350]
In this list, I want to find the numbers that are within 500 of each other (n+500).
Those are 2000, 2200 and 2350, so they should be added and divided by their count, 3, to find the mean. The mean then replaces the three numbers, so the list becomes l = [2183,5000].
The image above shows the numbers in my actual list. There, I would like all the numbers within 850 of each other (n+850) to be selected and replaced by their mean.
It seems you are looking for a clustering algorithm, something like K-means.
This algorithm is implemented in the scikit-learn package.
After you find your K means, you can count how many of your data points were assigned to each mean and make your computations.
However, it's not clear what K is in your case. You can try running the algorithm for several values of K until your constraint is met (the n+500 distance between the means).
You can use:
import numpy as np
l = np.array([2000,2200,5000,2350])
# bucket the numbers by which multiple of 500 they fall into (floor division)
similar = l // 500
# average each bucket and convert the result to an integer (as in the desired output)
new_list = [int(np.average(l[similar == num])) for num in np.unique(similar)]
print(new_list)
Output:
[2183, 5000]
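Floor division groups by fixed 500-wide buckets, so two close values on either side of a bucket boundary (for example 2499 and 2501) end up in different groups. An alternative sketch, assuming "similar" means that consecutive sorted values are no more than a given gap apart; `merge_close` is a hypothetical helper name:

```python
def merge_close(values, max_gap):
    """Group sorted values whose neighbours differ by at most max_gap; return group means."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)   # close enough to the previous value: same group
        else:
            groups.append([v])     # gap too large: start a new group
    return [sum(g) / len(g) for g in groups]

l = [2000, 2200, 5000, 2350]
print(merge_close(l, 500))   # two means: roughly 2183.33 and 5000
```

Because groups are chained through neighbours, a long run of closely spaced values merges into one group even if its two ends are far apart; whether that is acceptable depends on the data.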
Step 1:
list = [5620.77978515625,
7388.43017578125,
7683.580078125,
8296.6513671875,
8320.82421875,
8557.51953125,
8743.5,
9163.220703125,
9804.7939453125,
9913.86328125,
9940.1396484375,
9951.74609375,
10074.23828125,
10947.0419921875,
11048.662109375,
11704.099609375,
11958.5,
11964.8232421875,
12335.70703125,
13103.0,
13129.529296875,
16463.177734375,
16930.900390625,
17712.400390625,
18353.400390625,
19390.96484375,
20089.0,
34592.15625,
36542.109375,
39478.953125,
40782.078125,
41295.26953125,
42541.6796875,
42893.58203125,
44578.27734375,
45077.578125,
48022.2890625,
52535.13671875,
58330.5703125,
61597.91796875,
62757.12890625,
64242.79296875,
64863.09765625,
66930.390625]
Step 2:
seen = [] #to log used indices pairs
diff_dic = {} #to record indices and diff
for i,a in enumerate(list):
    for j,b in enumerate(list):
        if i!=j and (i,j)[::-1] not in seen:
            seen.append((i,j))
            diff_dic[(i,j)] = abs(a-b)
keys = []
for ind, diff in diff_dic.items():
    if diff <= 850:
        keys.append(ind)
uniques_k = [] #to record unique indices
for pair in keys:
    for key in pair:
        if key not in uniques_k:
            uniques_k.append(key)
import numpy as np
list_arr = np.array(list)
nearest_avg = np.mean(list_arr[uniques_k])
list_arr = np.delete(list_arr, uniques_k)
list_arr = np.append(list_arr, nearest_avg)
list_arr
output:
array([ 5620.77978516, 34592.15625, 36542.109375, 39478.953125, 48022.2890625, 52535.13671875, 58330.5703125 , 61597.91796875, 62757.12890625, 66930.390625 , 20566.00205365])
You just need a conditional list comprehension like this:
import numpy as np
l = [2000,2200,5000,2350]
n = 2000
# keep the values within 250 of n
a = [x for x in l if n - 250 < x < n + 250]
Then you can average with
np.mean(a)
or whatever method you prefer.

How would you store results of a for-loop as a single array?

I've managed to create a for-loop, which provides me with the results that I want, but I'm struggling to collate these results into a single array, so that I can plot it as my x value on a graph.
I have considered collating them into a single list first (but am also struggling to do this).
I have also tried to append, extend, and stack the array below, but nothing seems to work.
When I tried to append, I got an error message that appears to say that no 'value' is present.
a = 0.1
x = 0.2
for i in range(1,10):
    a = a**3
    x = x**2
    array = np.array([a, x])
    print(array)
The code above prints 9 individual arrays instead of a single one.
What I want is one array, i.e. [(a1, x1), (a2, x2), ... (a9, x9)]
Any suggestions to fix this or alternative methods would be greatly appreciated! Thank you!
OK, so you want to store both variables' values in the pattern (a1, x1), (a2, x2), ...
One way is to keep two separate lists for a and x, then merge them into the desired format.
The whole code is shown here:
a = 0.1
x = 0.2
list1 = []
list2 = []
for i in range(1,10):
    a = a**3
    x = x**2
    list1.append(a)
    list2.append(x)
# pair up the i-th entries of both lists
merged_list = [(list1[i], list2[i]) for i in range(0, len(list1))]
print(merged_list)
This will give you the desired output.
Use append to add each pair to a list:
a = 0.1
x = 0.2
array = []
for i in range(1,10):
    a = a**3
    x = x**2
    array.append([a, x])
print(array)
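If you want a numpy.array, note that after k iterations a equals 0.1**(3**k) and x equals 0.2**(2**k), so the loop can be vectorized:
import numpy as np
k = np.arange(1, 10)  # iteration numbers 1..9
a = 0.1 ** (3.0 ** k)
x = 0.2 ** (2.0 ** k)
print(np.column_stack((a, x)))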
Do you want to append multiple items to a list? Both solutions below produce one flat list of 18 values (a1, x1, a2, x2, ...).
First solution:
a, x = 0.1, 0.2
l = []
for i in range(1,10):
    a = a**3
    x = x**2
    l.extend([a, x])
print(l)
Second solution:
a, x = 0.1, 0.2
l = []
for i in range(1,10):
    a = a**3
    x = x**2
    l += [a, x]
print(l)
Do you want to append multiple items to a numpy array?
import numpy as np
a, x = 0.1, 0.2
array = np.array([])
for i in range(1,10):
    a = a**3
    x = x**2
    array = np.append(array, [a,x])
print(array)
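Since the goal is to plot the values, one option is to collect the pairs in a list, convert once at the end, and then slice out the columns. A small sketch along those lines, using the question's starting values; the names `pairs`, `a_values` and `x_values` are illustrative:

```python
import numpy as np

a, x = 0.1, 0.2
pairs = []
for _ in range(1, 10):
    a = a**3
    x = x**2
    pairs.append((a, x))

result = np.array(pairs)   # shape (9, 2): one row per iteration
a_values = result[:, 0]    # first column: every a
x_values = result[:, 1]    # second column: every x
print(result.shape)
```

Both sequences shrink very quickly and underflow to 0.0 after a few iterations, which is expected for these starting values.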

Error: list indices must be integers, not float, when computing the median

I'm trying to find the median, but I keep getting a "list indices must be integers, not float" error and am not sure what to do.
sorted_data = sorted(data, key=lambda d:d.all_around_points_earned)
if len(data)%2==0:
    a = sorted_data[len(data)/2]
    b = sorted_data[len(data)/2-1]
    median_val = (a+b)/2
else:
    median_val = sorted_data[(len(data)-1)/2]
print(median_val) # median val
If you are using Python 3, len(data)/2 always returns a float, even when len(data) is even, and a float cannot be used as a list index. Use // instead of / to get an integer result.
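A sketch of the question's logic with `//`, assuming plain numbers rather than the objects with `all_around_points_earned` from the question; `median_of` is a hypothetical name:

```python
def median_of(values):
    """Median of a non-empty sequence of numbers; // keeps the indices integers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        # even length: average the two middle values
        return (ordered[mid - 1] + ordered[mid]) / 2
    # odd length: the single middle value
    return ordered[mid]

print(median_of([1, 2, 3, 4]))
print(median_of([3, 1, 2]))
```

For the question's objects, the two middle elements would first need to be reduced to their numeric attribute before averaging.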
The statistics module is part of the standard library:
import statistics
data = [1, 2, 3, 4]
statistics.median(data)

Python 'for' loop issue: why are these two variables not adding together properly in my 'for' loop?

I am writing a code snippet for a random algebraic equation generator for a larger project. Up to this point, everything has worked well. The main issue is simple. I combined the contents of a dictionary in sequential order. For the sake of argument, say the dictionary is exdict = {a:1 , b:2 , c:3 , d:4}; I append those to a list like so: exlist = [a, b, c, d, 1, 2, 3, 4]. The length of my list is 8, half of which is 4. The algorithm is quite simple: whatever random number is generated between 1 and 4 (index 0-3 in Python), adding half the length of the list to that index gives the index of the matching value.
I have done research online and on Stack Overflow but cannot find any answer that I can apply to my situation...
Below is the bug-check version of my code. It prints out each variable as it goes. The issue I am having is towards the bottom, under the ### ITERATIONS & SETUP comment. The rest of the code is there so it can be run properly. The primary issue is that a + x should be m, but a + x never equals m; m is always tragically lower.
Bug check code:
from random import randint as ri
from random import shuffle as sh
#def randomassortment():
letterss = ['a','b','x','d','x','f','u','h','i','x','k','l','m','z','y','x']
rndmletters = letterss[ri(1,15)]
global newdict
newdict = {}
numberss = []
for x in range(1,20):
    #range defines max number in equation
    numberss.append(ri(1,20))
for x in range(1,20):
    rndmnumber = numberss[ri(1,18)]
    rndmletters = letterss[ri(1,15)]
    newdict[rndmletters] = rndmnumber
#x = randomassortment()
#print x[]
z = []
# set variable letter : values in list
for a in newdict.keys():
    z.append(a)
for b in newdict.values():
    z.append(b)
x = len(z)/2
test = len(z)
print 'x is value %d' % (x)
### ITERATIONS & SETUP
iteration = ri(2,6)
for x in range(1,iteration):
    a = ri(1,x)
    m = a + x
    print 'a is value: %d' % (a)
    print 'm is value %d' %(m)
    print
    variableletter = z[a]
    variablevalue = z[m]
    # variableletter , variablevalue
Edit: My question is ultimately, why is a + x returning a value that isn't a + x? If you run this code, it will print x, a, and m. m is supposed to be the value of a + x, but for some reason it isn't.
This isn't working as you expect because your variable x originally holds half the length of the list, but it is then overwritten by the loop variable in for x in range(...), and you still expect it to equal half the length of the list. You could just change the line to
for i in range(iteration):
instead.
Also note that you could replace all the code in the for loop with the line below (this needs import random; in Python 3, wrap the items in list(...) first):
variableletter, variablevalue = random.choice(newdict.items())
Your problem is variable reuse within the same scope.
Which x are you looking for here?
x = len(z)/2 # This is the first x
print 'x is value %d' % (x)
### ITERATIONS & SETUP
iteration = ri(2,6)
# the for loop rebinds x to each value from range(...), replacing the first x
for x in range(1,iteration):
    a = ri(1,x)
    m = a + x
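A Python 3 sketch of the same lookup with the two roles kept in separate names, assuming a small fixed dictionary in place of the randomly generated one; `half`, `pairs` and the seeded `rng` are illustrative choices, not part of the original code:

```python
import random

newdict = {"a": 1, "b": 2, "c": 3, "d": 4}

# keys first, then values, so z[i + half] is the value for key z[i]
z = list(newdict.keys()) + list(newdict.values())
half = len(z) // 2          # integer division keeps this usable as an index offset

rng = random.Random(0)      # seeded only to make the run repeatable
pairs = []
for _ in range(rng.randint(2, 6)):   # loop variable no longer overwrites half
    a = rng.randrange(half)          # random key index, 0 .. half - 1
    m = a + half                     # index of the matching value
    pairs.append((z[a], z[m]))

print(pairs)
```

Every pair printed should match an entry of `newdict`, which is the property the original loop was meant to have.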
