I would like to ask how to transform a file that looks like this:
123 111 1
146 204 2
178 398 1
...
...
The first column is x, the second is y, and the third is the number in each square.
My matrix has dimensions 400x400. I would like to change it to a simple file.
My file doesn't contain every square (for example, 0 0 doesn't exist, which means that in the output file I would like to have 0 in the first place of the first row).
My output file should look like this:
0 0 1 0 0 0 1 0 7 9 3 0 2 0 ...
8 0 0 1 0 0 0 0 0 0 0 0 0 0 ...
7 8 9 0 7 5 0 0 3 2 4 5 5 7 ...
...
...
How can I change my file?
From the first file I would like to reach the second: a text file with 400 lines, each containing 400 values separated by " " (a blank space).
Just initialize your matrix as a list of lists of zeros, then iterate over the lines in the file and set the values in the matrix accordingly. Cells that are not in the file will remain zero.
matrix = [[0 for i in range(400)] for k in range(400)]
with open("filename") as data:
    for row in data:
        x, y, n = map(int, row.split())
        matrix[x][y] = n
Finally, write that matrix to another file:
with open("outfile", "w") as outfile:
    for row in matrix:
        outfile.write(" ".join(map(str, row)) + "\n")
You could also use numpy:
import numpy
matrix = numpy.zeros((400, 400), dtype=numpy.int8)
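A sketch of the whole round trip done with numpy instead of the loop above; "sparse.txt" and "dense.txt" are placeholder filenames, and the sample triples are taken from the question:

```python
import numpy as np

# Write a small sample of "x y n" triples so the snippet is self-contained.
with open("sparse.txt", "w") as f:
    f.write("123 111 1\n146 204 2\n178 398 1\n")

matrix = np.zeros((400, 400), dtype=int)
data = np.loadtxt("sparse.txt", dtype=int)
# fancy indexing writes every (x, y) -> n triple in one step
matrix[data[:, 0], data[:, 1]] = data[:, 2]
np.savetxt("dense.txt", matrix, fmt="%d", delimiter=" ")
```

This avoids the Python-level loop entirely; `np.savetxt` with `fmt="%d"` produces the space-separated integer rows the question asks for.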
There are several files like this:
sample_a.txt containing:
a
b
c
sample_b.txt containing:
b
w
e
sample_c.txt containing:
a
m
n
I want to make a matrix of absence/presence like this:
a b c w e m n
sample_a 1 1 1 0 0 0 0
sample_b 0 1 0 1 1 0 0
sample_c 1 0 0 0 0 1 1
I know a dirty and dumb way to solve it: make a list of all possible letters in those files, then iteratively compare each line of each file against this 'library' and fill in the final matrix by index. But I guess there's a smarter solution. Any ideas?
Update:
the sample files can be of different lengths.
You can try:
import pandas as pd
from collections import defaultdict

dd = defaultdict(list)  # dictionary where each value per key is a list
files = ["sample_a.txt", "sample_b.txt", "sample_c.txt"]

for file in files:
    with open(file, "r") as f:
        for row in f:
            dd[file.split(".")[0]].append(row[0])
            # appending to dictionary dd:
            # KEY: file.split(".")[0] is the file name without extension
            # VALUE: row[0] is the first character of the line in the text file
            # (the second character was the newline '\n', so I removed it)

df = pd.DataFrame.from_dict(dd, orient='index').T.melt()  # convert dictionary to long-format dataframe
pd.crosstab(df.variable, df.value)  # make a crosstab, similar to pd.pivot_table
result:
value     a  b  c  e  m  n  w
variable
sample_a  1  1  1  0  0  0  0
sample_b  0  1  0  1  0  0  1
sample_c  1  0  0  0  1  1  0
Please note the letters (columns) are in alphabetical order.
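For comparison, a pandas-free sketch of the same idea using plain sets; the sample contents are inlined here instead of being read from the .txt files:

```python
# Pandas-free presence/absence matrix; file reading is omitted and the
# question's sample contents are inlined for brevity.
samples = {
    "sample_a": ["a", "b", "c"],
    "sample_b": ["b", "w", "e"],
    "sample_c": ["a", "m", "n"],
}
letters = sorted(set().union(*samples.values()))  # all letters, alphabetical
table = {name: [1 if ch in seen else 0 for ch in letters]
         for name, seen in samples.items()}
print(" ".join(letters))
for name, row in table.items():
    print(name, " ".join(map(str, row)))
```

Membership tests against a set (or short list) per sample replace the "compare each line against the library" loop from the question.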
Ben
5 0 0 0 0 0 0 1 0 1 -3 5 0 0 0 5 5 0 0 0 0 5 0 0 0 0 0 0 0 0 1 3 0 1 0 -5 0 0 5 5 0 5 5 5 0 5 5 0 0 0 5 5 5 5 -5
Moose
5 5 0 0 0 0 3 0 0 1 0 5 3 0 5 0 3 3 5 0 0 0 0 0 5 0 0 0 0 0 3 5 0 0 0 0 0 5 -3 0 0 0 5 0 0 0 0 0 0 5 5 0 3 0 0
Reuven
I was wondering how to read multiple lines of this sort of file into a list or dictionary, so that the ratings (the numbers) stay with the name of the person they correspond to.
You could read the file in pairs of two lines and populate a dictionary.
path = ...  # path to your file
out = {}
with open(path) as f:
    # iterate over the lines in the file
    for line in f:
        # the 1st, 3rd, ... lines contain the names
        name = line
        # the 2nd, 4th, ... lines contain the ratings
        ratings = next(f)  # consuming the ratings line here makes the loop advance two lines per iteration
        # write the values to the dictionary, using strip to get rid of whitespace
        out[name.strip()] = [int(rating) for rating in ratings.split()]
It could also be done with a while loop:
path = ...  # path to your file
out = {}
with open(path) as f:
    while True:
        # read the name and the ratings, which are on consecutive lines
        name = f.readline()
        ratings = f.readline()
        # stop condition: end of file reached
        if name == '':
            break
        # write the values to the dictionary:
        # use the name as the key and convert the ratings to integers,
        # stripping whitespace along the way
        out[name.strip()] = [int(rating) for rating in ratings.split()]
You can use zip to combine lines by pairs to form the dictionary
with open("file.txt", "r") as f:
    lines = f.read().split("\n")
d = {n: [*map(int, r.split())] for n, r in zip(lines[::2], lines[1::2])}
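A quick demonstration of the pairing trick with the lines inlined (names and ratings are shortened from the question's sample instead of being read from "file.txt"):

```python
# lines[::2] are the names, lines[1::2] are the rating rows;
# zip pairs them back up in order.
lines = ["Ben", "5 0 -3 5", "Moose", "5 5 0 1"]
d = {n: [*map(int, r.split())] for n, r in zip(lines[::2], lines[1::2])}
```

Note that if the file ends with a trailing newline, `split("\n")` leaves an empty string at the end of `lines`; `zip` silently drops it because it stops at the shorter slice.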
I want to read a 2D list from the user as space-separated values, in this format:
0 0 0 0 0
0 0 0 0 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
and here is my code :
# Read Matrix of 5X5
mat = [[int(input()) for x in range(5)] for y in range(5)]
Maybe you can try this:
mat = [list(map(int, input().split())) for y in range(5)]
This will read 5 lines, turning each into a list, to create the matrix.
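The same parsing applied to a fixed string instead of interactive input(), just to show what each line becomes:

```python
# Each line is split on whitespace and mapped to int, so each row of
# the matrix is a list of 5 integers.
text = """0 0 0 0 0
0 0 0 0 1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0"""
mat = [list(map(int, line.split())) for line in text.splitlines()]
```

The original `int(input())` version fails because `input()` returns a whole line like `"0 0 0 0 1"`, which `int()` cannot parse; splitting the line first is what fixes it.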
I have a text file which looks like this:
~Date and Time of Data Converting: 15.02.2019 16:12:44
~Name of Test: XXX
~Address: ZZZ
~ID: OPP
~Testchannel: CH06
~a;b;DateTime;c;d;e;f;g;h;i;j;k;extract;l;m;n;o;p;q;r
0;1;04.03.2019 07:54:19;0;0;2;Pause;3,57263521596443;0;0;0;0;24,55957;1;3;0;0;0;0;0
5,5523894132E-7;2;04.03.2019 07:54:19;5,5523894132E-7;5,5523894132E-7;2;Pause;3,57263521596443;0;0;0;0;24,55957;1;0;0;0;0;0;0
0,00277777777779538;3;04.03.2019 07:54:29;0,00277777777779538;0,00277777777779538;2;Pause;3,5724446855812;0;0;0;0;24,55653;1;1;0;0;0;0;0
0,00555555532278617;4;04.03.2019 07:54:39;0,00555555532278617;0,00555555532278617;2;Pause;3,57263521596443;0;0;0;0;24,55957;1;1;0;0;0;0;0
0,00833333333338613;5;04.03.2019 07:54:49;0,00833333333338613;0,00833333333338613;2;Pause;3,57263521596443;0;0;0;0;24,55653;1;1;0;0;0;0;0
0,0111112040002119;6;04.03.2019 07:54:59;0,0111112040002119;0,0111112040002119;2;Pause;3,57263521596443;0;0;0;0;24,55653;1;1;0;0;0;0;0
0,013888887724954;7;04.03.2019 07:55:09;0,013888887724954;0,013888887724954;2;Pause;3,57263521596443;0;0;0;0;24,55653;1;1;0;0;0;0;0
I need to extract the values from the column named extract, and need to store the output as an excel file.
Can anyone give me any idea how I can proceed?
So far, I have only been able to create an empty excel file for the output, and I have read the text file. However, I don't know how to write the output into that excel file.
import os
import numpy as np

file = open('extract.csv', "a")
if os.path.getsize('extract.csv') == 0:
    file.write(" " + ";" + "Datum" + ";" + "extract" + ";")
with open('myfile.txt') as f:
    dat = [f.readline() for x in range(10)]
datum = dat[7].split(' ')[3]
data = np.genfromtxt('myfile.txt', delimiter=';', skip_header=12, dtype=str)
You can use the pandas module.
You need to skip the first lines of your text file. Here, I assume their number is unknown, so I loop until I find a data row.
Then read the data.
Finally, export it as a dataframe with to_excel (doc).
Here is the code:
# Import module
import pandas as pd

# Read file
with open('temp.txt') as f:
    content = f.read().split("\n")

# Skip the first lines (find where the data starts)
for i, line in enumerate(content):
    if line and line[0] != '~':
        break

# Column names and data (the [1:] drops the leading '~' from the header line)
header = content[i - 1][1:].split(';')
data = [row.split(';') for row in content[i:] if row]  # skip trailing empty lines

# Store in dataframe
df = pd.DataFrame(data, columns=header)
print(df)
# a b DateTime c d e f ... l m n o p q r
# 0 0 1 04.03.2019 07:54:19 0 0 2 Pause ... 1 3 0 0 0 0 0
# 1 5,5523894132E-7 2 04.03.2019 07:54:19 5,5523894132E-7 5,5523894132E-7 2 Pause ... 1 0 0 0 0 0 0
# 2 0,00277777777779538 3 04.03.2019 07:54:29 0,00277777777779538 0,00277777777779538 2 Pause ... 1 1 0 0 0 0 0
# 3 0,00555555532278617 4 04.03.2019 07:54:39 0,00555555532278617 0,00555555532278617 2 Pause ... 1 1 0 0 0 0 0
# 4 0,00833333333338613 5 04.03.2019 07:54:49 0,00833333333338613 0,00833333333338613 2 Pause ... 1 1 0 0 0 0 0
# 5 0,0111112040002119 6 04.03.2019 07:54:59 0,0111112040002119 0,0111112040002119 2 Pause ... 1 1 0 0 0 0 0
# 6 0,013888887724954 7 04.03.2019 07:55:09 0,013888887724954 0,013888887724954 2 Pause ... 1 1 0 0 0 0 0
# Select only the extract column
# df = df['extract']
# Save the data in an excel file
df.to_excel("OutPut.xlsx", "MySheetName", index=False)
Note: if you know the number of lines to skip, you can simply load the dataframe with read_csv using the skiprows parameter (doc).
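A sketch of that shortcut, assuming the metadata block is exactly 5 lines before the '~'-prefixed header (as in the sample); `decimal=','` handles the comma decimals, and a shortened copy of the file is written here so the snippet is self-contained:

```python
import pandas as pd

# Shortened copy of the sample file from the question.
sample = (
    "~Date and Time of Data Converting: 15.02.2019 16:12:44\n"
    "~Name of Test: XXX\n"
    "~Address: ZZZ\n"
    "~ID: OPP\n"
    "~Testchannel: CH06\n"
    "~a;b;DateTime;c;d;e;f;g;h;i;j;k;extract;l;m;n;o;p;q;r\n"
    "0;1;04.03.2019 07:54:19;0;0;2;Pause;3,57263521596443;0;0;0;0;24,55957;1;3;0;0;0;0;0\n"
)
with open("temp.txt", "w") as f:
    f.write(sample)

df = pd.read_csv("temp.txt", sep=";", skiprows=5, decimal=",")
df.columns = [c.lstrip("~") for c in df.columns]  # drop the '~' from the first column name
# df["extract"].to_excel("OutPut.xlsx")  # final step; needs an excel writer such as openpyxl
```

With `decimal=","` the extract column comes back as proper floats instead of strings, which the manual `split(';')` approach above does not give you.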
Hope that helps!
I am quite new to programming (Python) and I am trying to write a script that compares the values in two separate files, such that if a value is the same it assigns 0, and if it is different it assigns 1.
Say both initial files are 4 rows by 3 columns; the final file will then be a 4-rows-by-3-columns file of just 1's and 0's.
Also, I'd like to sum all the values in this new file (that is, sum all the 1's together).
I have checked around and come across modules such as difflib, but I don't know whether they are suitable.
I am wondering if anyone can help out with something simple...
Thanks a lot in advance :)
Both files shown below consist of 5 rows and 6 columns.
File 1 (ain.txt)
0 1 0 1 0 0
0 0 0 0 0 0
0 1 0 1 0 0
0 0 0 0 0 0
0 1 0 1 0 0
File 2 (bin.txt)
1 1 1 1 1 0
1 1 1 1 1 0
1 1 1 1 1 0
1 1 1 1 1 0
1 1 1 1 1 0
The script below outputs True and False...
import numpy as np
infile = np.loadtxt('ain.txt')
data = np.array(infile)
infile1 = np.loadtxt('bin.txt')
data1 = np.array(infile1)
index = (data==data1)
np.savetxt('comparrr.txt', (index), delimiter = ' ', fmt='%s')
The output shown below:
comparrr.txt
FALSE TRUE FALSE TRUE FALSE TRUE
FALSE FALSE FALSE FALSE FALSE TRUE
FALSE TRUE FALSE TRUE FALSE TRUE
FALSE FALSE FALSE FALSE FALSE TRUE
FALSE TRUE FALSE TRUE FALSE TRUE
However, I would want the "False" values to be represented by 1 and the "True" values by 0.
I hope this clarifies my question.
Thanks very much in advance.
Sorry for all the trouble. I found out the issue with the previous script was the format I chose (fmt='%s'); changing that to fmt='%d' gives the output as 1's and 0's. However, I want them flipped (i.e. the 1's become 0's and the 0's become 1's).
Thanks
The output after the change in format mentioned above, shown below:
0 1 0 1 0 1
0 0 0 0 0 1
0 1 0 1 0 1
0 0 0 0 0 1
0 1 0 1 0 1
EDIT: Ok, updating answer
You don't need to import numpy to solve this problem.
Open file objects are already iterable, yielding one line at a time as a string. You can use split() to turn each line into a list, then use zip() and list comprehensions to quickly figure out whether the values are equal. Then turn each row back into a string (with map() and join()) and write it to the output file.
foo1 = open('foo1', 'r')
foo2 = open('foo2', 'r')
outArr = [[0 if p == q else 1 for p, q in zip(i.split(), j.split())] for i, j in zip(foo1, foo2)]
totalSum = sum(sum(row) for row in outArr)
with open('outFile', 'w') as out:
    for row in outArr:
        out.write(' '.join(map(str, row)) + '\n')
In regards to your code: while the index = (data==data1) bit technically works because of how numpy array comparisons broadcast, it isn't very readable in my opinion.
To invert your array, numpy provides invert, which can be applied directly to a boolean array as np.invert(index) (the ~ operator does the same). Also, np.loadtxt() already returns an np.ndarray, so you don't need to wrap it in np.array again. To make your code work as you have outlined, I would do the following...
import numpy as np
infile = np.loadtxt('foo1')
infile1 = np.loadtxt('foo2')
index = np.invert(infile==infile1).astype(int)
totalSum = sum(sum(index))
np.savetxt('outFile', index, fmt='%d')
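A tiny isolated check of the inversion step on made-up 2x2 arrays, showing that `~` and `np.invert` are equivalent here:

```python
import numpy as np

# 1 where values differ, 0 where they are equal
a = np.array([[0., 1.], [1., 0.]])
b = np.array([[1., 1.], [1., 0.]])
index = np.invert(a == b).astype(int)
```

Casting with `.astype(int)` is what turns the True/False mask into the 1/0 values the question asks for; without it, savetxt with fmt='%s' prints the booleans as words.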
'''
assume file 'a.txt' is:
1 2 3
4 5 6
7 8 9
10 11 12
'''
# 1. read in the two files.
with open('a.txt', 'r') as fa:
    a = [list(map(int, line.split())) for line in fa]
with open('b.txt', 'r') as fb:
    b = [list(map(int, line.split())) for line in fb]

# 2. compare the values in the two files: 0 where equal, 1 where different.
sum_value = 0
c = []
for i in range(4):
    c.append([])
    for j in range(3):
        if a[i][j] == b[i][j]:
            c[i].append(0)
        else:
            c[i].append(1)
            sum_value += 1

# 3. print the comparison array.
print(c)

# 4. print the sum of the 1's.
print(sum_value)