I have some code that produces an x value (frame number) and a y value (pixel intensity) on every pass of an infinite loop that runs until the program ends. I would like to append these values to a .txt file on every loop iteration so that I can work with the data later. The data comes out as NumPy arrays.
Say, for example, after 5 loops (5 frames) I get these values:
1 2 3 4 5 (x values)
0 0 8 0 0 (y values)
I would like these appended to a file every loop, so that after closing the program I get this:
1, 0
2, 0
3, 8
4, 0
5, 0
What would be the fastest way to implement this?
So far I have tried np.savetxt('data.txt', x), but this only saves the values from the last pass of the loop rather than adding data on each pass. Is there a way to change this function, or another function I could use, that appends the data to the text document?
First I will zip the values into (x, y) coordinate form and put them into a list so they are easier to append to a text file. In your program you won't need this step, since you will already have generated x and y inside the loop.
x = [1, 2, 3, 4, 5]  # x values
y = [0, 0, 8, 0, 0]  # y values
coordinate = list(zip(x, y))
print(coordinate)  # [(1, 0), (2, 0), (3, 8), (4, 0), (5, 0)]
So I used the zip function to store the sample results as (x_n, y_n) pairs in a list for later. The for loop below appends each pair to the text file; within your own loop you can use:
for element in coordinate:  # you won't need this outer loop, since you are already inside one
    file1 = open("file.txt", "a")
    file1.write(f"{element}\n")
    file1.close()
Output:
(1, 0)
(2, 0)
(3, 8)
(4, 0)
(5, 0)
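A side note on the design: opening and closing the file on every iteration works, but if the loop runs fast you can open the file once in append mode and let a with-block close it for you. A minimal sketch of that variant:

coordinate = [(1, 0), (2, 0), (3, 8), (4, 0), (5, 0)]

# Open once, write many times; the with-block closes the file automatically.
with open("file.txt", "a") as file1:
    for element in coordinate:
        file1.write(f"{element}\n")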
You can do something like this. It is not complete, because it simply appends to any old file of the same name. The other caveat is that the data may not actually be written to the file until you close it. If you really need the file saved on each pass of the loop, another approach is required (see the note after the code).
import numpy as np

variable_to_ID_file = 3.
file_name_str = 'Foo_Var{0:.0f}.txt'.format(variable_to_ID_file)

# Need code here to delete the old file if it is there (with proper error
# checking), or write a blank file first, then open it in append mode.
f_id = open(file_name_str, 'a')
for ii in range(4):
    # Pick the delimiter you desire. I prefer tab '\t'.
    np.savetxt(f_id, np.column_stack((ii, 3*ii/4)), delimiter=', ', newline='\n')
f_id.close()
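If you really do need the data on disk after every pass of the loop, one option (a sketch, not a tested drop-in; the file name is arbitrary) is to flush the handle each iteration. flush() pushes Python's buffer to the OS, and os.fsync() asks the OS to commit it to disk:

import os
import numpy as np

f_id = open('data.txt', 'a')
for ii in range(4):
    np.savetxt(f_id, np.column_stack((ii, 3*ii/4)), delimiter=', ', newline='\n')
    f_id.flush()             # push Python's buffer to the OS
    os.fsync(f_id.fileno())  # ask the OS to write it to disk now
f_id.close()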
If you do not need to write the file at each step of the loop, I recommend this option. It requires the NumPy arrays to be the same size.
import numpy as np

array1 = np.arange(1, 5, 1)
array2 = np.zeros(array1.size)

variable_to_ID_file = 3.
file_name_str = 'Foo_Var{0:.0f}.txt'.format(variable_to_ID_file)

# Pick the delimiter you desire. I prefer tab '\t'.
np.savetxt(file_name_str, np.column_stack((array1, array2)), delimiter=', ')
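One small point worth knowing: np.savetxt writes scientific notation by default (its fmt parameter defaults to '%.18e'). If you would rather see compact values like 1, 0 in the file, pass a format string:

np.savetxt(file_name_str, np.column_stack((array1, array2)), delimiter=', ', fmt='%g')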
I am currently using Python 3.9.7.
I have a lot of serial port data continuously incoming. The data arrives as dictionaries that I append to a list; each element is a dictionary in which the first part is an integer and the rest is a string. I then subcategorise each string and save it to an Excel spreadsheet.
I need to prevent duplicates being appended to my list. Below is my code trying to do this; however, when I view the Excel log being created, I often see sometimes 50k rows of the same data repeated.
I was able to successfully prevent duplicate Excel rows when reading from a text file with my approaches, but I can't seem to find a solution for the continuously incoming data.
The output I would like is unique values only for each element appended to my list and appearing in my Excel file.
Code below:
import serial
import xlsxwriter

#ser-prt-connection
ser = serial.Serial(port='COM2',baudrate=9600)
#separate X (int) and Y (string)
regex = '(?:.*\)?X=(?P<X>-?\d+)\sY=(?P<Y>.*)'
extracted_vals = [] #list to be appended
less_vals = [] #want list with no duplicates
row = 0
workbook = xlsxwriter.Workbook('Serial_port_data.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write(row, 0, 'X')
worksheet.write(row, 1, 'Ya')
worksheet.write(row, 2, 'Yb')
worksheet.write(row, 3, 'Yc')
while True:
    for line in ser.readlines():
        signal = str(line)
        for part in parts:
            m = re.match(regex,parts)
            if m is None:
                continue
            X,Y = m.groupdict(),values()
            # each element appending to list (incl. duplicates)
            item = dict(X=int(X),Y=str(Y.lower())
            extracted_vals.append(item)
        for i in range(0, len(extracted_vals):
            i+=0
            X_val = extracted_vals[i].setdefault('X')
            Key = extracted_vals[i].keys()
            data = extracted_vals[i].setdefault('Y')
            for val in extracted_vals:
                if val not in less_vals:
                    less_vals.append(val)
            for j in range(0,len(less_vals)):
                j+=0
                X_val = less_vals[j].setdefault('X')
                less_data = less_vals[j].setdefault('Y')
                #separate Y part into substrings
                Ya = less_data[0:3]
                Yb = less_data[3:6]
                Yc = less_data[6:]
                less = X_val,Ya,Yb,Yc
                #to check for no duplicates in output, compared with raw data
                print(signal) #raw incoming data continuously
                print(less) #want list, no duplicates
                # write to excel file, column for X value
                # 3 more columns for Ya, Yb, Yc each
                if True:
                    row = row+1
                    worksheet.write(row,0,X_val)
                    worksheet.write(row,1,Ya)
                    worksheet.write(row,2,Yb)
                    worksheet.write(row,3,Yc)
                #Chosen no. rows to write to file
                if row == 10:
                    workbook.close()
                    serial.close()
Example of what a line of raw data 'signal' looks like:
X=-10Y=yyAyyBthisisYc
Example of what list 'less' is looking like for one line of raw data:
(-10, 'yyA','yyB', 'thisisYc')
#(repeats in a similar fashion for subsequent lines)
#each part has its own row in excel file
#-10 is X value
My main issue is that sometimes the data being printed is unique, but the Excel file has many duplicates.
My other issue is that sometimes the data is printed as alternating duplicates, like 1, 2, 1, 2, 1, 2, 1, 2,
and the same is being saved to the Excel file.
I have only been programming for a few weeks now, so any advice at all is welcome.
Your program has a lot of problems. Below I try to show how you can improve it; bear in mind that I did not test it.
import serial
import xlsxwriter
import re  # you call re.match() below, so re must be imported

#ser-prt-connection
ser = serial.Serial(port='COM2', baudrate=9600)
#separate X (int) and Y (string)
regex = r'(?:.*\)?X=(?P<X>-?\d+)\sY=(?P<Y>.*)'  # raw string, so the backslashes reach re intact
# First problem:
# maybe you don't need these lists. Perhaps you could use a set and a list. Let's do it:
# extracted_vals = [] #list to be appended
extracted_vals = set()
less_vals = []  #want list with no duplicates
row = 0  # this will be used at the end of the program.
MAXROWS = 10  # Given that you want to limit the number of rows, I'd write that limit where it can be seen and changed easily.
#
workbook = xlsxwriter.Workbook('Serial_port_data.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write(row, 0, 'X')
worksheet.write(row, 1, 'Ya')
worksheet.write(row, 2, 'Yb')
worksheet.write(row, 3, 'Yc')
while True:  # Second problem: this loop lacks a way to terminate!
    # When you write 'while True:', you also MUST write 'break' or 'return' somewhere inside the loop! I'll take care of it.
    for line in ser.readlines():
        signal = str(line)
        #
        # Third problem: where is 'parts' defined? Your "for part in parts:" loop would throw a NameError.
        # You want to match the line itself, so the inner loop can simply go:
        m = re.match(regex, signal)  # was 're.match(regex,parts)'
        if m is None:
            continue
        X, Y = m.groupdict().values()  # Fourth problem: you wrote 'm.groupdict(),values()'; that comma must be a dot
        #
        # each element appending to list (incl. duplicates)
        # Fifth problem: "dict(X=int(X),Y=str(Y.lower())" is missing a closing paren -- a syntax error --
        # and you don't need a dictionary anyway. A tuple is hashable, so it can go straight into the set:
        item = (int(X), Y.lower())
        extracted_vals.add(item)
        # instead of 'extracted_vals.append(item)'
        # At this point you have no duplicates: sets automatically ignore them!
    less_vals = list(extracted_vals)  # And now you have a non-duplicated list.
    # Sixth problem: all the following lines were included in the "for line in ser.readlines():" loop;
    # you forgot to de-dent them. I've done it for you. Written the way you wrote them, for each line
    # read from serial you repeated all of the following! That might be the main reason for the duplicate rows.
    # Seventh problem: to process all items in a sequence, use "for item in sequence:",
    # not "for i in range(len(sequence)):". Moreover, range() does not need a starting value of 0.
    # Eighth problem: "i+=0" and "j+=0" do nothing. Why did you write them?
    # Ninth problem: "Key = extracted_vals[i].keys()" assigns a dictionary view object that you never use.
    # Tenth problem: setdefault('X') silently inserts {'X': None} when the key is missing and can hand
    # you None where you expect data; with tuples none of that is needed, and your
    # "for val in extracted_vals: if val not in less_vals: ..." de-duplication loop is now useless too.
    for X_val, data in less_vals:
        #separate Y part into substrings
        Ya = data[0:3]
        Yb = data[3:6]
        Yc = data[6:]
        less = X_val, Ya, Yb, Yc
        #to check for no duplicates in output, compared with raw data
        print(signal)  # careful: it only contains the last line read from serial
        print(less)  # I don't think this check is useful, but if you like it...
        # write to excel file, column for X value
        # 3 more columns for Ya, Yb, Yc each
        # Eleventh problem: "if True:" does nothing, so I removed it
        row = row + 1
        worksheet.write(row, 0, X_val)
        worksheet.write(row, 1, Ya)
        worksheet.write(row, 2, Yb)
        worksheet.write(row, 3, Yc)
        #
        if row >= MAXROWS:
            break  # this exits the "for X_val, data in less_vals:" loop
    if row >= MAXROWS:
        workbook.close()
        break  # this exits the "while True:" loop, solving the Second problem.
ser.close()  # Twelfth problem: 'serial' is the module; the port object 'ser' is what has close()
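The key idea in isolation: tuples are hashable, so a set of (X, Y) tuples silently discards repeats. A tiny standalone demonstration:

seen = set()
seen.add((-10, 'yyayybthisisyc'))
seen.add((-10, 'yyayybthisisyc'))  # duplicate: the set ignores it
seen.add((-11, 'other'))
print(len(seen))  # 2 -- only the distinct tuples remain (print order of a set may vary)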
This is a refined version of your program; I tried to follow my own advice. I have no serial communication (and no time to mock one), so I didn't test it properly.
# A bit refined, albeit untested.
import re
import serial
import xlsxwriter

ser = serial.Serial(port='COM2', baudrate=9600)
#separate X (int) and Y (string)
regex = r'(?:.*\)?X=(?P<X>-?\d+)\sY=(?P<Y>.*)'
MAXROWS = 50
row = 0  # this will be used at the end of the program.
#
workbook = xlsxwriter.Workbook('Serial_port_data.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write(row, 0, 'X')
worksheet.write(row, 1, 'Ya')
worksheet.write(row, 2, 'Yb')
worksheet.write(row, 3, 'Yc')
while True:
    extracted_vals = set()
    for line in ser.readlines():
        m = re.match(regex, line.decode())  # readlines() yields bytes; decode before matching
        if m is None:
            continue
        values = m.groupdict()
        extracted_vals.add((int(values['X']), values['Y'].lower()))  # each element is a tuple (integer, string)
    #
    for this in extracted_vals:
        X_val = this[0]
        data = this[1]
        Ya = data[0:3]
        Yb = data[3:6]
        Yc = data[6:]
        row = row + 1
        worksheet.write(row, 0, X_val)
        worksheet.write(row, 1, Ya)
        worksheet.write(row, 2, Yb)
        worksheet.write(row, 3, Yc)
        #
        if row >= MAXROWS:
            break
    if row >= MAXROWS:
        workbook.close()
        break
ser.close()  # close the port object; the 'serial' module has no close()
I have a program that's looking for certain values in a log file and listing them out. Essentially, a few lines of a 50000-line file look like this:
Step Elapsed Temp Press Volume TotEng KinEng PotEng E_mol E_pair Pxx Pyy Pzz Pxz Pxy Pyz
0 0 298 -93.542117 448382.78 -67392.894 17986.81 -85379.704 12349.955 -97729.659 -313.09273 44.936408 -12.47003 100.97953 -215.4029 254.07517
10 10 301.05619 -14.956923 448382.78 -66191.142 18171.277 -84362.419 12474.283 -96836.702 -56.794471 103.79453 -91.870824 300.09707 -27.638439 196.2738
The bit of code that's doing the searching and appending looks like this:
line = fp.readline()
while line:
    line = fp.readline()
    words = line.split()
    if (words[0] == "Step"):
        break
numcol = len(words)
header = words
data = numpy.zeros((numcol, 100000))
ln = 0
while line:
    line = fp.readline()
    words = line.split()
    if (words[0] == "Loop"):
        break
    for i in range(numcol):
        data[i][ln] = (float(words[i]))
    ln_original = ln
    ln = ln + 1
Currently, I'm preallocating the size of my array, and I can't seem to figure out how to get appending to work instead. Any ideas as to what I could change so that the array can be dynamic for log files of various lengths, rather than preallocating something like 1,000,000 lines to begin with?
Make a list of lists and append items to those lists; when you get to the end of the file, cast the list of lists to an np.ndarray.
change
data = numpy.zeros((numcol,100000))
to
data = [[] for i in range(numcol)]
and change
data[i][ln]=(float(words[i]))
to
data[i].append(float(words[i]))
at the end of the code add
data = np.array(data)
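Put together, a minimal sketch of the reading loop with growable lists (the file name and the numcol value are stand-ins for whatever your header parsing finds):

import numpy as np

numcol = 16                          # len(header), found as in the question
data = [[] for _ in range(numcol)]   # one growable list per column

with open("log.txt") as fp:          # placeholder file name
    for line in fp:
        words = line.split()
        if not words or words[0] == "Loop":
            break
        if words[0] == "Step":       # skip the header line itself
            continue
        for i in range(numcol):
            data[i].append(float(words[i]))

data = np.array(data)                # shape: (numcol, number_of_data_lines)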
Here, my code fetches values from text files and builds matrices as a multidimensional array. The problem is that the code creates an array of more than two dimensions, which I can't manipulate; I need a two-dimensional array. How do I do that?
Explanation of my code's algorithm:
Goal of the code:
My code fetches values from a specific folder. Each subfolder contains 7 txt files generated from one user, so multiple subfolders contain the data of multiple users.
Step 1: Start a first for loop, controlled by how many subfolders there are, and store the path of the current subfolder in the variable 'path'.
Step 2: Open the path and fetch the data of the 7 txt files using a second for loop. After fetching, close the second for loop and execute the rest of the code.
Step 3: Concatenate the data of the 7 txt files into one 1-D array.
Step 4 (here the problem arises): Store the 1-D array of each folder as a row of a 2-D array. End the first for loop.
Code:
import numpy as np
from array import *
import os

f_path = 'Result'
array_control_var = 0
#fetch each directory path
for (path, dirs, file) in os.walk(f_path):
    if (path == f_path):
        continue
    f_path_1 = path + '\page_1.txt'
    #Get data from page_1 individually, because string-type data exists there
    pgno_1 = np.array(np.loadtxt(f_path_1, dtype='U', delimiter=','))
    #only for page_2.txt
    f_path_2 = path + '\page_2.txt'
    with open(f_path_2) as f:
        str_arr = ','.join([l.strip() for l in f])
    pgno_2 = np.asarray(str_arr.split(','), dtype=int)
    #using a loop, fetch data from the remaining text files. data type = int
    for j in range(3, 8):
        #store the file path in a variable
        txt_file_path = path + '\page_' + str(j) + '.txt'
        if os.path.exists(txt_file_path) == True:
            #generate a variable name that auto-increments with the for loop
            foo = 'pgno_' + str(j)
        else:
            break
        #pass the variable name as a string and store the value
        exec(foo + " = np.array(np.loadtxt(txt_file_path, dtype='i', delimiter=','))")
    #z=np.array([pgno_2,pgno_3,pgno_4,pgno_5,pgno_6,pgno_7])
    #merge all arrays from page 2 onward into a single one-dimensional array
    f_array = np.concatenate((pgno_2, pgno_3, pgno_4, pgno_5, pgno_6, pgno_7), axis=0)
    #on the first pass of the loop, assign this value
    if array_control_var == 0:
        main_f_array = f_array
    else:
        #here the problem arises
        main_f_array = np.array([main_f_array, f_array])
    array_control_var += 1
print(main_f_array)
Currently my code generates an array like this (for 3 folders):
[
array([[0,0,0],[0,0,0]]),
array([0,0,0])
]
Note: I don't know how many dimensions it has.
But I want:
[
 array([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
]
I wrote a recursive function that flattens the list of lists into one list. It gives the desired output for your case, but I did not try it on many other inputs (and it is buggy for certain cases, such as list = [0, [[0,0,0],[0,0,0]], [0,0,0]])...
flat = []

def main():
    list = [[[0, 0, 0], [0, 0, 0]], [0, 0, 0]]  # note: this name shadows the built-in 'list'
    recFlat(list)
    print(flat)

def recFlat(Lists):
    if len(Lists) == 0:
        return Lists
    head, tail = Lists[0], Lists[1:]
    if isinstance(head, (list,)):
        recFlat(head)
        return recFlat(tail)
    else:
        return flat.append(Lists)

if __name__ == '__main__':
    main()
My idea behind the code was to traverse the head of each list and check whether it is an instance of a list or an element. If the head is an element, I have a flat list and can return it; otherwise I recurse further.
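For the original goal (one 2-D array with one row per folder), an alternative worth considering is to collect each folder's 1-D result in a plain list and stack once at the end; np.vstack requires the arrays to be the same length. A minimal sketch with stand-in data:

import numpy as np

rows = []
for f_array in ([0, 0, 0], [0, 0, 0], [0, 0, 0]):  # stand-ins for each folder's 1-D result
    rows.append(np.asarray(f_array))

main_f_array = np.vstack(rows)  # 2-D array: one row per folder
print(main_f_array.shape)       # (3, 3)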
I have a question concerning my code for the data evaluation of an experiment:
In a first for loop I open file after file that I want to analyze. Inside this loop, so inside one file, I have a second for loop to evaluate some specific parameters. When I do it for just one file, the parameters are correct, but when I loop over all files, the parameters from the second for loop appear to be summed up. The normal value should be in the range of ar = 0.0001, which works perfectly for one file; looping over the files I get 0.0001 for the first, 0.0002 for the second, 0.0003 for the third, etc.
Update:
OK, so here is the whole relevant part of the code. For each file, after fitting the data, I want the sum over the difference between two data points in the first column (x[j]) multiplied by the corresponding value in the second column (y[j]) (each file has two columns with 720 data points each), and the result should be stored in AR for each file.
def sum_list(l):
    sum = 0
    for k in l:
        sum += k
    return sum

INV = []
DIFFS = []
AR = []
for i in range(0, len(fnames)):
    data = np.loadtxt(fnames[i])
    x = data[:, 0]
    y = data[:, 1]
    gmod = lm.Model(linmod)
    result = gmod.fit(y, x=x, p=0.003, bg=0.001)
    plt.plot(x, y)
    plt.plot(x, result.best_fit, 'r-')
    plt.show()
    print result.best_values['bg']
    print result.best_values['p']
    p = result.best_values['p']
    bg1 = result.best_values['bg']
    for j in range(0, 719):
        diffs = ((x[j+1] - x[j]) * y[j])
        DIFFS.append(diffs)
    ar = sum_list(DIFFS)
    AR.append(ar)
    inr = (x[0] - bg1) * (y[0]**3) / 3 + ar
    INV[i] = inr
If you are working with files (e.g. opening them), I suggest using the os module; maybe a construct like this will help you organize the loop over the files:
import os

for root, dirs, files in os.walk(os.getcwd()):
    for i in files:
        with open(os.path.join(root, i)) as f:
            pass  #do your summation here
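As for the steadily growing values you describe: DIFFS is created once, outside the file loop, so every file's differences pile on top of the previous file's. If that is the cause, resetting the list inside the per-file loop would be the fix, sketched here with hypothetical file names:

import numpy as np

AR = []
for fname in ["file1.txt", "file2.txt"]:   # hypothetical file names
    data = np.loadtxt(fname)
    x, y = data[:, 0], data[:, 1]
    DIFFS = []                             # reset for every file
    for j in range(719):
        DIFFS.append((x[j + 1] - x[j]) * y[j])
    AR.append(sum(DIFFS))                  # one ar value per file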
I have a file that is space-delimited with values for x, y, z. I need to visualise the data, so I guess I need to read the file into 3 separate arrays (X, Y, Z) and then plot them. How do I read the file into 3 separate arrays? I have this so far, which removes the whitespace element at the end of every line:
import csv
import sys

def fread(f=None):
    """Reads in test and training CSVs."""
    X = []
    Y = []
    Z = []
    if (f == None):
        print("No file given to read, exiting...")
        sys.exit(1)
    read = csv.reader(open(f, 'r'), delimiter=' ')
    for line in read:
        line = line[:-1]
I tried to add something like:
for x, y, z in line:
    X.append(x)
    Y.append(y)
    Z.append(z)
But I get an error like "ValueError: too many values to unpack".
I have done lots of googling, but nothing seems to address reading a file into a separate array per element.
I should add that my data isn't sorted nicely into rows/columns; it just looks like this:
"107745590026 2 0.02934046648 0.01023879368 3.331810236 2 0.02727724425 0.07867902517 3.319272757 2 0.01784882881"......
Thanks!
EDIT: If your data isn't actually separated into 3-element lines (and is instead one long space-separated list of values), you could use Python list slicing with a stride to make this easier (here read is assumed to already be a flat list of all the values):
X = read[::3]
Y = read[1::3]
Z = read[2::3]
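To make the stride concrete, here is what that slicing does on a small stand-in list:

vals = ["x0", "y0", "z0", "x1", "y1", "z1", "x2", "y2", "z2"]
X = vals[::3]   # ['x0', 'x1', 'x2'] -- every 3rd element, starting at index 0
Y = vals[1::3]  # ['y0', 'y1', 'y2'] -- every 3rd element, starting at index 1
Z = vals[2::3]  # ['z0', 'z1', 'z2'] -- every 3rd element, starting at index 2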
This error might be happening because some of the lines in read contain more than three space-separated values. It's unclear from your question exactly what you'd want to do in those cases. If you're using Python 3, you could put the first element of a line into X, the second into Y, and all the rest of that line into Z with the following:
for x, y, *z in line:
    X.append(x)
    Y.append(y)
    for elem in z:
        Z.append(elem)
If you're not using Python 3, you can perform the same basic logic in a slightly more verbose way:
for i, elem in enumerate(line):  # enumerate() supplies the index; 'for i, elem in line' would fail
    if i == 0:
        X.append(elem)
    elif i == 1:
        Y.append(elem)
    else:
        Z.append(elem)