My code:
import ast
with open('input.txt', 'r') as file:
    filedata = file.read()
    filedata = filedata.replace('|', ',')
out = []
buff = []
for c in filedata:
    if c == '\n':
        out.append(''.join(buff))
        buff = []
    else:
        buff.append(c)
else:
    if buff:
        out.append(''.join(buff))
list = [[i] for i in out]
print(list)
Input :
10|1|SELL|toaster_1|10.00|20 12|8|BID|toaster_1|7.50
13|5|BID|toaster_1|12.50 15|8|SELL|tv_1|250.00|20 16
17|8|BID|toaster_1|20.00 18|1|BID|tv_1|150.00 19|3|BID|tv_1|200.00
20 21|3|BID|tv_1|300.00
Expected Output
[["10","1","SELL","toaster_1","10.00","20"],
["12","8","BID","toaster_1","7.50"],
["13","5","BID","toaster_1","12.50"],
["15","8","SELL","tv_1","250.00","20"], ["16"],
["17","8","BID","toaster_1","20.00"],
["18","1","BID","tv_1","150.00"], ["19","3","BID","tv_1","200.00"],
["20"], ["21","3","BID","tv_1","300.00"]]
The Output I am getting:
[['10,1,SELL,toaster_1,10.00,20'],
['12,8,BID,toaster_1,7.50'], ['13,5,BID,toaster_1,12.50'],
['15,8,SELL,tv_1,250.00,20'], ['16'], ['17,8,BID,toaster_1,20.00'],
['18,1,BID,tv_1,150.00'], ['19,3,BID,tv_1,200.00'], ['20'],
['21,3,BID,tv_1,300.00']]
I want to access individual elements within each sublist, e.g. SELL or
toaster, but I am not able to access them. Can someone advise, please?
Use:
# filedata = file.read()
filedata = """10|1|SELL|toaster_1|10.00|20 12|8|BID|toaster_1|7.50
13|5|BID|toaster_1|12.50 15|8|SELL|tv_1|250.00|20 16
17|8|BID|toaster_1|20.00 18|1|BID|tv_1|150.00 19|3|BID|tv_1|200.00
20 21|3|BID|tv_1|300.00 """
result = []
for i in filedata.split():  # split on any whitespace (spaces and newlines)
    result.append(i.split("|"))  # split by `|` and append to result
print(result)
Or use a list comprehension:
result = [i.split("|") for i in filedata.split()]
Output:
[['10', '1', 'SELL', 'toaster_1', '10.00', '20'],
['12', '8', 'BID', 'toaster_1', '7.50'],
['13', '5', 'BID', 'toaster_1', '12.50'],
['15', '8', 'SELL', 'tv_1', '250.00', '20'],
['16'],
['17', '8', 'BID', 'toaster_1', '20.00'],
['18', '1', 'BID', 'tv_1', '150.00'],
['19', '3', 'BID', 'tv_1', '200.00'],
['20'],
['21', '3', 'BID', 'tv_1', '300.00']]
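Once each record is its own list of fields, individual values can be reached by plain indexing. The following is a minimal sketch, assuming the same pipe-separated format as above; the sample data is shortened and the names `records`, `action`, `item_name` and `sells` are only illustrative:

```python
# Sample data in the same format as the question: records are separated
# by whitespace and fields by "|".
filedata = """10|1|SELL|toaster_1|10.00|20
12|8|BID|toaster_1|7.50
16
"""

records = [item.split("|") for item in filedata.split()]

first = records[0]
action = first[2]      # third field of the first record
item_name = first[3]   # fourth field of the first record

# Short records such as ["16"] have no third field, so guard on length.
sells = [r for r in records if len(r) > 2 and r[2] == "SELL"]

print(action, item_name)
print(sells)
```

The length check matters because some records in the input hold only an ID, so indexing them at position 2 would raise an IndexError.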
Your code never splits each line into comma-separated values. It reads the file character by character, joins the characters of each line back into a single string, and appends that string to the out list.
The following code should work (I changed your own code as little as possible; I would instead use a cleaner solution like the one by Rakesh):
with open('input.txt', 'r') as file:
    filedata = file.read()
filedata = filedata.replace('|', ',')
out = []
buff = []
for c in filedata:
    if c == '\n':
        # end of a line: split the buffered text into its fields
        out.append(''.join(buff).split(','))
        buff = []
    else:
        buff.append(c)
if buff:
    # the last line may not end with a newline
    out.append(''.join(buff).split(','))
print(out)
By the way, it is recommended not to use list as a variable name, because it shadows the built-in list type.
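As a small illustration of what can go wrong, the sketch below rebinds the name `list` inside one function and leaves it alone in another. The function names and sample values are made up for this example:

```python
def shadowed():
    list = [["10"], ["12"]]   # the local name now hides the built-in type
    try:
        return list("abc")    # tries to call a list object, not the type
    except TypeError as exc:
        return "TypeError: " + str(exc)


def not_shadowed():
    rows = [["10"], ["12"]]   # a different name leaves the built-in usable
    return list("abc")


print(shadowed())
print(not_shadowed())
```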
I have a list of {n} dictionaries in a txt file, one dictionary per line as illustrated below. I want to export them in csv format, with each key in its own column.
{'a':'1','b':'2','c':'3'}
{'a':'4','b':'5','c':'6'}
{'a':'7','b':'8','c':'9'}
{'a':'10','b':'11','c':'12'}
...
{'a':'x','b':'y','c':'z'}
I want csv output for {n} rows as below, with an index:
a b c
0 1 2 3
1 4 5 6
2 7 8 9
... ... ... ...
n x y z
You can use ast.literal_eval to load your data from the text file.
With contents of input file file.txt:
{'a':'1','b':'2','c':'3'}
{'a':'4','b':'5','c':'6'}
{'a':'7','b':'8','c':'9'}
{'a':'10','b':'11','c':'12'}
{'a':'x','b':'y','c':'z'}
You could use this script to load the data and write file.csv:
import csv
from ast import literal_eval
with open('file.txt', 'r') as f_in:
    lst = [literal_eval(line) for line in f_in if line.strip()]
with open('file.csv', 'w', newline='') as csvfile:
    fieldnames = ['a', 'b', 'c']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(lst)
file.csv will become:
a,b,c
1,2,3
4,5,6
7,8,9
10,11,12
x,y,z
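The question also asked for an index column, which the output above does not have. One possible way to add it is to number the rows with `enumerate` before writing them. This is only a sketch: it uses an in-memory buffer and an inline list in place of the real files, and `index` is an assumed column name.

```python
import csv
import io
from ast import literal_eval

# Stand-in for file.txt; in practice iterate over the opened file instead.
lines = [
    "{'a':'1','b':'2','c':'3'}",
    "{'a':'4','b':'5','c':'6'}",
    "",
    "{'a':'x','b':'y','c':'z'}",
]

rows = [literal_eval(line) for line in lines if line.strip()]

out = io.StringIO()  # stand-in for open('file.csv', 'w', newline='')
writer = csv.DictWriter(out, fieldnames=["index", "a", "b", "c"])
writer.writeheader()
for i, row in enumerate(rows):
    # Copy the row and add the running index as an extra column.
    writer.writerow({"index": i, **row})

print(out.getvalue())
```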
x = [{'a':'1','b':'2','c':'3'},
     {'a':'4','b':'5','c':'6'},
     {'a':'7','b':'8','c':'9'},
     {'a':'10','b':'11','c':'12'}]
n = len(x)
keys = list(x[0].keys())
newdict = dict()
for m in keys:
    newdict[m] = []
    for i in range(n):
        newdict[m].append(x[i][m])
newdict
Output is
{'a': ['1', '4', '7', '10'],
'b': ['2', '5', '8', '11'],
'c': ['3', '6', '9', '12']}
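The same column-wise dictionary can also be built with a dict comprehension. This is just a compact sketch of the loop above and, like it, assumes every dictionary has the keys of the first one:

```python
x = [{'a': '1', 'b': '2', 'c': '3'},
     {'a': '4', 'b': '5', 'c': '6'},
     {'a': '7', 'b': '8', 'c': '9'},
     {'a': '10', 'b': '11', 'c': '12'}]

# One list per key, collected across all rows in their original order.
newdict = {key: [row[key] for row in x] for key in x[0]}
print(newdict)
```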
Or you can use pandas.concat which is used to combine DataFrames with the same columns.
import pandas as pd
x = [{'a':'1','b':'2','c':'3'},
     {'a':'4','b':'5','c':'6'},
     {'a':'7','b':'8','c':'9'},
     {'a':'10','b':'11','c':'12'}]
xpd = []
for i in x:
    df = pd.DataFrame(i, index=[0])
    xpd.append(df)
pd.concat(xpd, ignore_index=True)
I have a recursive function that reads a list of scout records from a file and adds them to a list box in order of their IDs. The function is called with addScouts(1). The function is below:
def addScouts(self,I):
    i = I
    with open(fileName,"r") as f:
        lines = f.readlines()
        for line in lines:
            if str(line.split(",")[3])[:-1] == str(i):
                self.scoutList.insert(END,line[:-1])
                i += 1
                return self.addScouts(i)
    return
My issue is that my file IDs are ordered 1,2,4,5, as at some point I removed the scout with an ID of 3. However, when I run the function above to re-order the scouts in the list box, it only lists the scouts up to and including ID 2. This is because when i = 3, none of the items in the file are equal to 3, so the function reaches the end and returns before it gets a chance to check the remaining records.
File contents:
Kris,Rice,17,1
Olly,Fallows,17,2
Olivia,Bird,17,4
Louis,Martin,18,5
Any ideas on how to fix this?
Just sort on the last column:
sorted(f,key=lambda x: int(x.split(",")[-1]))
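Put together, a non-recursive version might look like the sketch below. It works on a plain list of lines so that it runs on its own. `sorted_scout_lines` is a hypothetical helper, not part of the original code; in the real method each returned line would be passed to `self.scoutList.insert(END, ...)`.

```python
def sorted_scout_lines(lines):
    """Return the non-empty lines ordered by the numeric ID in the last column."""
    records = [line.strip() for line in lines if line.strip()]
    return sorted(records, key=lambda rec: int(rec.split(",")[-1]))


# Same records as the file in the question, deliberately out of order.
lines = [
    "Olivia,Bird,17,4\n",
    "Kris,Rice,17,1\n",
    "Louis,Martin,18,5\n",
    "Olly,Fallows,17,2\n",
]

for rec in sorted_scout_lines(lines):
    print(rec)
```

Because the records are sorted rather than looked up one ID at a time, gaps in the ID sequence (such as the missing 3) no longer stop the listing early.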
You can use bisect to find where to put the new data to keep the data ordered after it is sorted once:
from bisect import bisect
import csv
with open("foo.txt") as f:
    r = list(csv.reader(f))
    keys = [int(row[-1]) for row in r]
    new = ["foo","bar","12","3"]
    ind = bisect(keys, int(new[-1]))
    r.insert(ind,new)
    print(r)
Output:
[['Kris', 'Rice', '17', '1'], ['Olly', 'Fallows', '17', '2'], ['foo', 'bar', '12', '3'], ['Olivia', 'Bird', '17', '4'], ['Louis', 'Martin', '18', '5']]
A simpler way is to look for the first row that has a higher ID; if none is higher, just append to the end:
import csv
with open("foo.txt") as f:
    r = list(csv.reader(f))
    new = ["foo","bar","12","3"]
    key = int(new[-1])
    ind = None
    for i, row in enumerate(r):
        if int(row[-1]) >= key:
            ind = i
            break
    r.insert(ind, new) if ind is not None else r.append(new)
    print(r)
Output:
[['Kris', 'Rice', '17', '1'], ['Olly', 'Fallows', '17', '2'], ['foo', 'bar', '12', '3'], ['Olivia', 'Bird', '17', '4'], ['Louis', 'Martin', '18', '5']]
To always keep that file in order when adding a new value we just need to write to a temp file, writing the line in the correct place and then replace the original with the updated file:
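import csv
from tempfile import NamedTemporaryFile
from shutil import move
# "w" opens the temp file in text mode, which csv.writer needs on Python 3
with open("foo.csv", newline="") as f, NamedTemporaryFile("w", dir=".", delete=False, newline="") as temp:
    r = csv.reader(f)
    wr = csv.writer(temp)
    new = ["foo", "bar", "12", "3"]
    key = int(new[-1])
    for row in r:
        if int(row[-1]) >= key:
            # first row with an equal or higher ID: write the new row before it
            wr.writerow(new)
            wr.writerow(row)
            wr.writerows(r)  # copy the remaining rows unchanged
            break
        wr.writerow(row)
    else:
        # no row had a higher ID, so the new row goes last
        wr.writerow(new)
move(temp.name, "foo.csv")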
foo.csv after will have the data in order:
Kris,Rice,17,1
Olly,Fallows,17,2
foo,bar,12,3
Olivia,Bird,17,4
Louis,Martin,18,5
You can check whether the list box has as many entries as the file has lines. If it does not, call addScouts again with the next ID; if it does, stop. Like this:
def addScouts(self,I):
    i = I
    with open(fileName,"r") as f:
        lines = f.readlines()
        for line in lines:
            if str(line.split(",")[3])[:-1] == str(i):
                self.scoutList.insert(END,line[:-1])
                i += 1
                return self.addScouts(i)
    # no record has this ID: skip it if some records are still missing
    # (size() assumes scoutList is a Tkinter Listbox)
    if self.scoutList.size() < len(lines):
        return self.addScouts(i+1)
    else:
        return
I'm attempting to turn .csv data into a dictionary in Python but I appear to be getting duplicate dictionary entries.
This is an example of what the .csv data looks like:
ticker,1,2,3,4,5,6
XOM,10,15,17,11,13,20
AAPL,12,11,12,13,11,22
My intention is to use the first column as the key and the remaining columns as the values. Ideally I should have 3 entries: ticker, XOM, and AAPL. But instead I get this:
{'ticker': ['1', '2', '3', '4', '5', '6']}
{'ticker': ['1', '2', '3', '4', '5', '6']}
{'XOM': ['10', '15', '17', '11', '13', '20']}
{'ticker': ['1', '2', '3', '4', '5', '6']}
{'XOM': ['10', '15', '17', '11', '13', '20']}
{'AAPL': ['12', '11', '12', '13', '11', '22']}
So it looks like I'm getting row 1, then row 1 & 2, then row 1, 2 & 3.
This is the code I'm using:
def data_pull():
    #gets data out of a .csv file
    datafile = open("C:\sample.csv")
    data = [] #blank list
    dict = {} #blank dictionary
    for row in datafile:
        data.append(row.strip().split(",")) #removes whitespace and commas
        for x in data: #organizes data from list into dictionary
            k = x[0]
            v = x[1:]
            dict = {k:v for x in data}
            print dict
data_pull()
I'm trying to figure out why the duplicate entries are showing up.
You have too many loops; you extend data then loop over the whole data list with all entries gathered so far:
for row in datafile:
    data.append(row.strip().split(",")) #removes whitespace and commas
    for x in data:
        # will loop over all entries parsed so far
So you append a row to data, then loop over the list, which holds one item:
data = [['ticker', '1', '2', '3', '4', '5', '6']]
Then you read the next line and append it to data, so you loop over data again and process:
data = [
    ['ticker', '1', '2', '3', '4', '5', '6'],
    ['XOM', '10', '15', '17', '11', '13', '20'],
]
So you iterate twice, then add the next line and loop three times, and so on.
You could simplify this to:
for row in datafile:
    x = row.strip().split(",")
    dict[x[0]] = x[1:]
You can save yourself some work by using the csv module:
import csv
def data_pull():
    results = {}
    with open("C:\sample.csv", 'rb') as datafile:
        reader = csv.reader(datafile)
        for row in reader:
            results[row[0]] = row[1:]
    return results
Use the built in csv module:
import csv
output = {}
with open("C:\sample.csv") as f:
    freader = csv.reader(f)
    for row in freader:
        output[row[0]] = row[1:]
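This approach can be tried without a file on disk. The sketch below feeds the sample data from the question through `csv.reader` by way of `io.StringIO` (Python 3), which is an assumption made only to keep the example self-contained:

```python
import csv
import io

sample = "ticker,1,2,3,4,5,6\nXOM,10,15,17,11,13,20\nAAPL,12,11,12,13,11,22\n"

output = {}
for row in csv.reader(io.StringIO(sample)):
    if row:  # csv.reader yields an empty list for a blank line
        output[row[0]] = row[1:]

print(output)
```

Each key is assigned exactly once, so the dictionary ends up with three entries rather than the repeated output seen in the question.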
The loop for x in data should be outside of the loop for row in datafile:
for row in datafile:
    data.append(row.strip().split(",")) #removes whitespace and commas
for x in data: #organizes data from list into dictionary
    k = x[0]
Or, the csv module can be your friend:
import csv
with open("text.csv") as lines:
    print({row[0]: row[1:] for row in csv.reader(lines)})
A side note: it's always a good idea to use raw strings for Windows paths:
open(r"C:\sample.csv")
If your file were named, e.g., C:\text.csv, then \t would be interpreted as a tab character.
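The difference is easy to check in isolation. A short sketch comparing the two spellings of that path (the variable names are only for illustration):

```python
plain = "C:\text.csv"   # "\t" is read as a single tab character
raw = r"C:\text.csv"    # raw string: the backslash and the "t" are both kept

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

The plain string is one character shorter, because the backslash and the `t` have collapsed into a tab.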
I have a text file from which I need each column, preferably in a dictionary or list. The format is:
N ID REMAIN VERS
2 2343333 bana twelve
3 3549287 moredp twelve
3 9383737 hinsila twelve
3 8272655 hinsila eight
I have tried:
crs = open("file.txt", "r")
for columns in ( raw.strip().split() for raw in crs ):
    print columns[0]
Result = 'Out of index error'
Also tried:
crs = csv.reader(open("file.txt", "r"), delimiter=',', quotechar='|', skipinitialspace=True)
for row in crs:
    for columns in row:
        print columns[3]
Which seems to read each char as a column, instead of each 'word'
I would like to get the four columns, ie:
2
2343333
bana
twelve
into separate dictionaries or lists
Any help is great, thanks!
This works fine for me:
>>> crs = open("file.txt", "r")
>>> for columns in ( raw.strip().split() for raw in crs ):
...     print columns[0]
...
N
2
3
3
3
If you want to convert columns to rows, use zip.
>>> crs = open("file.txt", "r")
>>> rows = (row.strip().split() for row in crs)
>>> zip(*rows)
[('N', '2', '3', '3', '3'),
('ID', '2343333', '3549287', '9383737', '8272655'),
('REMAIN', 'bana', 'moredp', 'hinsila', 'hinsila'),
('VERS', 'twelve', 'twelve', 'twelve', 'eight')]
If you have blank lines, filter them before using zip.
>>> crs = open("file.txt", "r")
>>> rows = (row.strip().split() for row in crs)
>>> zip(*(row for row in rows if row))
[('N', '2', '3', '3', '3'), ('ID', '2343333', '3549287', '9383737', '8272655'), ('REMAIN', 'bana', 'moredp', 'hinsila', 'hinsila'), ('VERS', 'twelve', 'twelve', 'twelve', 'eight')]
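Note that on Python 3 `zip` returns an iterator rather than a list, so the result has to be wrapped in `list()` (or consumed some other way) to see the columns. Below is a sketch of the same transposition that also keys each column by its header. It uses a few inline sample lines in place of file.txt, and `by_header` is just an illustrative name:

```python
lines = [
    "N ID REMAIN VERS",
    "2 2343333 bana twelve",
    "",
    "3 3549287 moredp twelve",
]

# Split each non-blank line into its whitespace-separated fields.
rows = [line.split() for line in lines if line.strip()]

columns = list(zip(*rows))  # one tuple per column, header first

# Map each header to the values below it.
by_header = {col[0]: list(col[1:]) for col in columns}
print(by_header["REMAIN"])
```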
>>> with open("file.txt") as f:
...     c = csv.reader(f, delimiter=' ', skipinitialspace=True)
...     for line in c:
...         print(line)
...
['N', 'ID', 'REMAIN', 'VERS', ''] #that '' is for the trailing space after the columns.
['2', '2343333', 'bana', 'twelve', '']
['3', '3549287', 'moredp', 'twelve', '']
['3', '9383737', 'hinsila', 'twelve', '']
['3', '8272655', 'hinsila', 'eight', '']
Or, old-fashioned way:
>>> with open("file.txt") as f:
...     [line.split() for line in f]
...
[['N', 'ID', 'REMAIN', 'VERS'],
['2', '2343333', 'bana', 'twelve'],
['3', '3549287', 'moredp', 'twelve'],
['3', '9383737', 'hinsila', 'twelve'],
['3', '8272655', 'hinsila', 'eight']]
And for getting column values:
>>> l
[['N', 'ID', 'REMAIN', 'VERS'],
['2', '2343333', 'bana', 'twelve'],
['3', '3549287', 'moredp', 'twelve'],
['3', '9383737', 'hinsila', 'twelve'],
['3', '8272655', 'hinsila', 'eight']]
>>> {l[0][i]: [line[i] for line in l[1:]] for i in range(len(l[0]))}
{'ID': ['2343333', '3549287', '9383737', '8272655'],
'N': ['2', '3', '3', '3'],
'REMAIN': ['bana', 'moredp', 'hinsila', 'hinsila'],
'VERS': ['twelve', 'twelve', 'twelve', 'eight']}
You could use a list comprehension like this:
with open("split.txt","r") as splitfile:
    for columns in [line.split() for line in splitfile]:
        print(columns)
You will then have it in a 2d array allowing you to group it any way you like it.
How about this?
f = open("file.txt")
for i in f:
    k = i.split()
    for j in k:
        print j
Just use a list of lists:
import csv
columns = [[] for _ in range(4)] # 4 columns expected
with open('path', 'rb') as f: # Python 2; on Python 3 use open('path', newline='')
    reader = csv.reader(f, delimiter=' ')
    for row in reader:
        for i, col in enumerate(row):
            columns[i].append(col)
or if the number of columns needs to grow dynamically:
import csv
columns = []
with open('path', 'rb') as f: # Python 2; on Python 3 use open('path', newline='')
    reader = csv.reader(f, delimiter=' ')
    for row in reader:
        while len(row) > len(columns):
            columns.append([])
        for i, col in enumerate(row):
            columns[i].append(col)
In the end, you can then print your columns with:
for i, col in enumerate(columns, 1):
    print 'List{}: {{{}}}'.format(i, ','.join(col))