I would like to convert my text file below into a list:
4,9,2
3,5,7
8,1,6
Here's my python code so far, but I couldn't understand why it doesn't work:
def main():
    file = str(input("Please enter the full name of the desired file (with extension) at the prompt below: \n"))
    print (parseCSV(file))
def parseCSV(file):
    file_open = open(file)
    #print (file_open.read())
    with open(file) as f:
        d = f.read().split(',')
        data = list(map(int, d))
        print (data)
main()
The error message is:
line 12, in parseCSV
data = list(map(int, d))
ValueError: invalid literal for int() with base 10: '2\n3'
Thanks :)
With d = f.read().split(','), you're reading the entire file and splitting on commas. Since the file consists of multiple lines, it will contain newline characters. These characters are not removed by split(',').
To fix this, iterate over the lines first instead of splitting the whole thing on commas:
d = (item for line in f for item in line.split(','))
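A minimal sketch of that approach, assuming the file holds the three lines from the question. io.StringIO stands in for the opened file here so the snippet runs on its own; with a real file you would use the f from the with open(file) block instead:

```python
import io

# Stand-in for the opened file; io.StringIO behaves like a text file object.
f = io.StringIO("4,9,2\n3,5,7\n8,1,6\n")

# Iterate over the lines first, then split each line on commas.
d = (item for line in f for item in line.split(','))

# int() ignores surrounding whitespace, so an item like '2\n' converts cleanly.
data = list(map(int, d))
print(data)
```

The result is one flat list. If you would rather have one list per row, build a list of lists instead.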
f.read() reads the entire file, newlines included. So your actual data looks like:
'4,9,2\n3,5,7\n8,1,6'
You can either read the content one line at a time using
line = f.readline()
while line != "":
    data = list(map(int, line.split(',')))
    print(data)
    line = f.readline()
Or you can turn the newlines into commas before splitting (in text mode Python already translates "\r\n" to "\n"):
d = f.read().strip().replace("\n", ",").split(',')
f.read() reads everything, including the newline characters (\n), so map(int, d) raises a ValueError. Iterate over the file line by line instead:
with open(file) as f:
    for line in f:
        d = line.split(',')
        data = list(map(int, d))
        print (data)
for line in f is the standard way to read a file line by line in Python.
You need to split on newlines ('\n') as well as on commas. In this case you should use the csv library:
>>> import csv
>>> with open('foo.csv') as f:
...     print([list(map(int, row)) for row in csv.reader(f)])
...
[[4, 9, 2], [3, 5, 7], [8, 1, 6]]
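For reference, a self-contained Python 3 sketch of the csv approach. io.StringIO is a stand-in for the real file, and the blank-line filter is an extra assumption, not something the question asked for:

```python
import csv
import io

# Stand-in for open('foo.csv'); note the blank line in the middle.
f = io.StringIO("4,9,2\n3,5,7\n\n8,1,6\n")

# csv.reader yields one list of strings per line (an empty list for a blank line).
rows = [[int(cell) for cell in row] for row in csv.reader(f) if row]
print(rows)
```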
Related
filename = r"D:\PythonFiles\Python_File.txt"
f = open(filename)
List1 = ['E','H']
for line in f:
    if line == List1:
        print(line)
When I execute this, all that comes up is the entire contents of the txt file:
H
E
L
L
O
All that should be printed is:
H
E
I think the answers above are what you're looking for. However, for completeness and because it is a better way to do what you're asking to do, I'll also propose using a set intersection:
In [20]: f = open('file.txt').read().split('\n') # Since readlines keeps newlines
...: interesting = ['E', 'H']
...: set(interesting) & set(f)
...:
Out[20]: {'E', 'H'}
Use in, which tells you whether a value is contained in a collection. Strip the trailing newline from each line first, otherwise 'H\n' never matches 'H'.
filename = r"D:\PythonFiles\Python_File.txt"
f = open(filename)
List1 = ['E','H']
for line in f:
    ## Use the 'in' operator here instead of '=='
    if line.strip() in List1:
        print(line)
    ## If List1 is going to be dynamic
    ## else:
    ##     List1.append(line)
# Iterate over the file contents line by line
for line in f.readlines():
    # Strip the trailing newline and test whether the line is in List1
    if line.strip() in List1:
        # Prints the line, newline included
        print(line)
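To see why the strip matters, here is a small sketch. io.StringIO stands in for Python_File.txt, and the names wanted, matches and unstripped are made up for the example. Unlike the set intersection, this keeps the order in which the lines appear in the file:

```python
import io

# Stand-in for the text file from the question.
f = io.StringIO("H\nE\nL\nL\nO\n")
wanted = ['E', 'H']

lines = list(f)
print(repr(lines[0]))  # each line still carries its trailing newline

# Comparing the raw lines finds nothing, because 'H\n' != 'H'.
unstripped = [line for line in lines if line in wanted]

# Stripping first gives the expected matches, in file order.
matches = [line.strip() for line in lines if line.strip() in wanted]
print(unstripped)
print(matches)
```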
Try this:
filename = r"D:\PythonFiles\Python_File.txt"
f = open(filename, 'r')
List1 = ['E','H']
for line in f.readlines():
    if line.strip() in List1:
        print(line)
I am trying to put the following text file into a dictionary but I would like any section starting with '#' or empty lines ignored.
My text file looks something like this:
# This is my header info followed by an empty line
Apples 1 # I want to ignore this comment
Oranges 3 # I want to ignore this comment
#~*~*~*~*~*~*~*Another comment~*~*~*~*~*~*~*~*~*~*
Bananas 5 # I want to ignore this comment too!
My desired output would be:
myVariables = {'Apples': 1, 'Oranges': 3, 'Bananas': 5}
My Python code reads as follows:
filename = "myFile.txt"
myVariables = {}
with open(filename) as f:
    for line in f:
        if line.startswith('#') or not line:
            next(f)
        key, val = line.split()
        myVariables[key] = val
        print "key: " + str(key) + " and value: " + str(val)
The error I get:
Traceback (most recent call last):
  File "C:/Python27/test_1.py", line 11, in <module>
    key, val = line.split()
ValueError: need more than 1 value to unpack
I understand the error but I do not understand what is wrong with the code.
Thank you in advance!
Given your text:
text = """
# This is my header info followed by an empty line
Apples 1 # I want to ignore this comment
Oranges 3 # I want to ignore this comment
#~*~*~*~*~*~*~*Another comment~*~*~*~*~*~*~*~*~*~*
Bananas 5 # I want to ignore this comment too!
"""
We can do this in two ways: with a regex, or with Python generator expressions. I would choose the latter (described below), as a regex is not particularly faster in such cases.
To open the file:
with open('file_name.xyz', 'r') as file:
    # everything else below. Just substitute `for line in lines` with
    # `for line in file`
To simulate that here, we split the text into a list of lines:
lines = text.split('\n') # as if read from a file using `open`.
Here is how we do all you want in a couple of lines:
# Discard all comments and empty values.
comment_less = filter(None, (line.split('#')[0].strip() for line in lines))
# Separate items and totals.
separated = {item.split()[0]: int(item.split()[1]) for item in comment_less}
Let's test:
>>> print(separated)
{'Apples': 1, 'Oranges': 3, 'Bananas': 5}
Hope this helps.
This doesn't exactly reproduce your error, but there's a problem with your code:
>>> x = "Apples\t1\t# This is a comment"
>>> x.split()
['Apples', '1', '#', 'This', 'is', 'a', 'comment']
>>> key, val = x.split()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
Instead try:
key = line.split()[0]
val = line.split()[1]
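Edit: I think your "need more than 1 value to unpack" error comes from the blank lines. A blank line read from a file is "\n", not "", so not line is never true for it. Also, next(f) does not skip the current line: it consumes the following one, and the loop body then still tries to split the current line. I would do something like:
if line.startswith('#') or line == "\n":
    pass
else:
    key = line.split()[0]
    val = line.split()[1]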
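Putting those pieces together, a minimal sketch of the corrected loop, using continue to move on to the next line. io.StringIO stands in for myFile.txt, and converting the value with int() is an assumption based on the desired output in the question:

```python
import io

# Stand-in for myFile.txt (comment line shortened for readability).
sample = io.StringIO(
    "# This is my header info followed by an empty line\n"
    "\n"
    "Apples 1 # I want to ignore this comment\n"
    "Oranges 3 # I want to ignore this comment\n"
    "#~*~*~*Another comment~*~*~*\n"
    "Bananas 5 # I want to ignore this comment too!\n"
)

my_variables = {}
for line in sample:
    # Drop everything from the first '#' onward, then surrounding whitespace.
    content = line.split('#', 1)[0].strip()
    if not content:
        continue  # comment-only or blank line: go to the next iteration
    key, val = content.split()
    my_variables[key] = int(val)

print(my_variables)
```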
To strip comments, you could use str.partition(), which works whether or not the comment sign is present in the line:
for line in file:
    line, _, comment = line.partition('#')
    if line.strip(): # non-blank line
        key, value = line.split()
line.split() may raise an exception in this code too. It happens if a non-blank line does not contain exactly two whitespace-separated words. What to do in that case is application-dependent (ignore such lines, print a warning, etc.).
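One way to handle such lines, sketched under the assumption that you want to skip them and remember their line numbers. The function name parse_pairs and the skipped list are hypothetical, introduced only for this example:

```python
def parse_pairs(lines):
    """Return (pairs, skipped): the parsed dict and the 1-based numbers of malformed lines."""
    pairs, skipped = {}, []
    for number, line in enumerate(lines, start=1):
        # partition() returns the whole line as the first item when '#' is absent.
        content, _, _comment = line.partition('#')
        if not content.strip():
            continue  # blank or comment-only line
        fields = content.split()
        if len(fields) != 2:
            skipped.append(number)  # not exactly "key value": remember and move on
            continue
        key, value = fields
        pairs[key] = value
    return pairs, skipped


pairs, skipped = parse_pairs([
    "Apples 1 # fine\n",
    "\n",
    "Oranges # missing value\n",
    "Bananas 5 extra # too many fields\n",
    "Pears 7\n",
])
print(pairs)
print(skipped)
```

Whether to skip, warn, or raise is a design choice. Returning the skipped line numbers leaves that decision to the caller.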
You need to ignore empty lines and lines starting with #, then cut the trailing comment off each remaining line, either by splitting on # or by slicing with rfind as below. An empty line still contains a newline, so you need line.strip() to detect it. You cannot just split on whitespace and unpack, because the words of the comment add more than two elements to the result:
with open("in.txt") as f:
    d = dict(line[:line.rfind("#")].split() for line in f
             if not line.startswith("#") and line.strip())
print(d)
Output:
{'Apples': '1', 'Oranges': '3', 'Bananas': '5'}
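One caveat with the rfind slice, offered as a hedged aside: when a line has no '#', rfind returns -1, so the slice drops the last character. That is harmless when the last character is the newline, but it would bite on a final line that has neither a comment nor a trailing newline. str.partition avoids the special case. The sample line below is made up for illustration:

```python
# A last line with no comment and no trailing newline.
line = "Bananas 5"

cut = line[:line.rfind("#")]   # rfind returns -1 here, so this is line[:-1]
safe = line.partition("#")[0]  # the whole line when '#' is absent

print(repr(cut))
print(repr(safe))
```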
Another option is to split twice and slice:
with open("in.txt") as f:
    d = dict(line.split(None,2)[:2] for line in f
             if not line.startswith("#") and line.strip())
print(d)
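Or splitting twice and unpacking in an explicit loop (Python 3; the starred target absorbs the comment when there is one, so lines without a comment also work):
with open("in.txt") as f:
    d = {}
    for line in f:
        if not line.startswith("#") and line.strip():
            k, v, *_ = line.split(None, 2)
            d[k] = v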
You can also use itertools.groupby to group the lines you want.
from itertools import groupby
with open("in.txt") as f:
    grouped = groupby(f, lambda x: not x.startswith("#") and x.strip())
    d = dict(next(v).split(None, 2)[:2] for k, v in grouped if k)
print(d)
To handle where we have multiple words in single quotes we can use shlex to split:
import shlex
with open("in.txt") as f:
    d = {}
    for line in f:
        if not line.startswith("#") and line.strip():
            data = shlex.split(line)
            d[data[0]] = data[1]
print(d)
So changing the Banana line to:
Bananas 'north-side disabled' # I want to ignore this comment too!
We get:
{'Apples': '1', 'Oranges': '3', 'Bananas': 'north-side disabled'}
And the same will work for the slicing:
with open("in.txt") as f:
    d = dict(shlex.split(line)[:2] for line in f
             if not line.startswith("#") and line.strip())
print(d)
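A possible refinement, sketched as a suggestion rather than as part of the answer above: shlex.split accepts comments=True, which makes it drop everything from '#' onward before tokenizing. That matters when a comment contains an apostrophe, since an unmatched quote would otherwise make shlex raise a ValueError. The sample line is made up for illustration:

```python
import shlex

# The comment contains an apostrophe ("don't"), which would confuse plain shlex.split.
line = "Bananas 'north-side disabled'  # don't count these\n"

# comments=True tells shlex to treat '#' as the start of a comment and skip the rest.
tokens = shlex.split(line, comments=True)
key, value = tokens[:2]
print(tokens)
```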
If the format of the file is well defined, you can try a solution with regular expressions.
Here's just an idea:
import re
fruits = {}
with open('fruits_list.txt', mode='r') as f:
    for line in f:
        match = re.match(r"([a-zA-Z0-9]+)\s+([0-9]+).*", line)
        if match:
            fruit_name, fruit_amount = match.groups()
            fruits[fruit_name] = fruit_amount
print(fruits)
UPDATED:
I changed the way lines are read to take care of large files. The code now reads line by line instead of all at once, which reduces memory usage.
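A self-contained sketch of the regex idea that also converts the amounts to integers, as in the desired output. io.StringIO stands in for the file, and the compiled pattern PAIR is an illustrative name, not part of the answer:

```python
import io
import re

# One or more word characters, whitespace, then the digits of the amount.
PAIR = re.compile(r"(\w+)\s+(\d+)")

sample = io.StringIO(
    "# This is my header info followed by an empty line\n"
    "\n"
    "Apples 1 # I want to ignore this comment\n"
    "Oranges 3 # I want to ignore this comment\n"
    "#~*~*~*Another comment~*~*~*\n"
    "Bananas 5 # I want to ignore this comment too!\n"
)

fruits = {}
for line in sample:
    # match() anchors at the start of the line, so lines starting with '#'
    # and blank lines simply do not match.
    match = PAIR.match(line)
    if match:
        name, amount = match.groups()
        fruits[name] = int(amount)

print(fruits)
```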
Say I have a file "stuff.txt" that contains the following on separate lines:
q:5
r:2
s:7
I want to read each of these lines from the file, and convert them to dictionary elements, the letters being the keys and the numbers the values.
So I would like to get
y ={"q":5, "r":2, "s":7}
I've tried the following, but it just prints an empty dictionary "{}"
y = {}
infile = open("stuff.txt", "r")
z = infile.read()
for line in z:
    key, value = line.strip().split(':')
    y[key].append(value)
print(y)
infile.close()
Try this:
d = {}
with open('text.txt') as f:
    for line in f:
        key, value = line.strip().split(':')
        d[key] = int(value)
You are appending to y[key] as if it were a list. What you want is to assign to it directly, as above. Also, infile.read() returns a single string, so for line in z iterates over characters rather than lines; iterate over the file object instead.
Also, using with to open the file is good practice, as it automatically closes the file after the code in the with block has executed.
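A slightly more defensive sketch of the same loop, assuming the file may contain blank lines and that only the first ':' separates key from value. io.StringIO stands in for stuff.txt:

```python
import io

# Iterating over a string yields single characters, which is why
# looping over the result of read() does not give you lines.
chars = list("q:5")
print(chars)

# Stand-in for stuff.txt, with a blank line thrown in.
infile = io.StringIO("q:5\nr:2\n\ns:7\n")

y = {}
for line in infile:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    # maxsplit=1 keeps any further ':' characters inside the value.
    key, value = line.split(':', 1)
    y[key] = int(value)

print(y)
```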
There are some possible improvements to be made. The first is to use a context manager for file handling, that is with open(...): it closes the file for you even if an exception is raised.
Second, you have a small mistake in your dictionary assignment: values are assigned using the = operator, such as dict[key] = value. Convert the value with int() if you want numbers rather than strings.
y = {}
with open("stuff.txt", "r") as infile:
    for line in infile:
        key, value = line.strip().split(':')
        y[key] = int(value)
print(y)
Python3:
e = {}
with open('input.txt', 'r', encoding="utf-8") as f:
    for line in f:
        # Each line may hold one or more whitespace-separated key:value pairs.
        for pair in line.split():
            key, value = pair.split(":")
            e[key] = int(value)  # convert the value from string to integer
print(e)
I have this code written in Python:
with open ('textfile.txt') as f:
    list=[]
    for line in f:
        line = line.split()
        if line:
            line = [int(i) for i in line]
            list.append(line)
    print(list)
This reads integers from a text file and puts them in a list. But the result is:
[[10,20,34]]
However, I would like it to display like:
10 20 34
How can I do this? Thanks for your help!
You probably want to extend the list with the items rather than append each sublist:
with open('textfile.txt') as f:
    list = []
    for line in f:
        line = line.split()
        if line:
            list += [int(i) for i in line]
    print(" ".join(str(i) for i in list))
If you append a list to a list, you create a sublist:
a = [1]
a.append([2,3])
print(a) # [1, [2, 3]]
If you add it with +=, you get a flat list:
a = [1]
a += [2,3]
print(a) # [1, 2, 3]
with open('textfile.txt') as f:
    lines = [x.strip() for x in f.readlines()]
print(' '.join(lines))
With an input file 'textfile.txt' that contains:
10
20
30
prints:
10 20 30
It sounds like you are trying to print a list of lists. The easiest way to do that is to iterate over it and print each inner list:
for line in list:
    print(" ".join(str(i) for i in line))
Also, list is the name of a built-in type in Python (not a keyword, so the assignment is legal), and reusing it shadows the built-in, so try to avoid naming your variables that.
If you know the file is not extremely long and you want one flat list of integers, you can build it in one go (two lines, one of which is the with open(...) line). To print it your way, convert the elements to strings and join them with ' '.join(...), like this:
#!python3
# Load the content of the text file as one list of integers.
with open('textfile.txt') as f:
    lst = [int(element) for element in f.read().split()]
# Print the formatted result.
print(' '.join(str(element) for element in lst))
Do not use list as an identifier for your variables, as it shadows the built-in list type.
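As a small variation, sketched here with io.StringIO standing in for textfile.txt: in Python 3, print accepts the unpacked list directly and separates the items with a space by default, so the explicit join is only needed when you want the string itself:

```python
import io

# Stand-in for textfile.txt; the numbers may be spread over several lines.
f = io.StringIO("10 20\n34\n")

# split() with no arguments splits on any whitespace, newlines included.
lst = [int(element) for element in f.read().split()]

# Build the string explicitly...
joined = ' '.join(str(element) for element in lst)
print(joined)

# ...or let print() do the spacing by unpacking the list.
print(*lst)
```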
I've got a text file like this:
1;2;3;4
5;6;7;8
And I'd like to transform it to:
[[1,2,3,4],[5,6,7,8]]
Using Python, how can I achieve this?
You can use the following:
data = [[int(i) for i in line.split(';')] for line in open(filename)]
Alternative using the csv module:
import csv
data = [[int(i) for i in ln] for ln in csv.reader(open(filename), delimiter=';')]
If lists of strings are acceptable:
data = [line.split(';') for line in open(filename)]
Or the csv equivalent:
data = list(csv.reader(open(filename), delimiter=';'))
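The one-liners above leave closing the file to the garbage collector. If you prefer to close it explicitly, a sketch along these lines should work. The temporary file only exists to make the snippet self-contained; in real code filename would be the path to your own file:

```python
import csv
import os
import tempfile

# Create a throwaway input file so the example runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("1;2;3;4\n5;6;7;8\n")
    filename = tmp.name

# The with block closes the file as soon as the list has been built.
# newline='' is what the csv documentation recommends when opening files for csv.reader.
with open(filename, newline='') as f:
    data = [[int(i) for i in row] for row in csv.reader(f, delimiter=';')]

os.remove(filename)  # clean up the throwaway file
print(data)
```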
As a multi-line string:
>>> s = """1;2;3;4
... 5;6;7;8"""
>>> [[int(x) for x in a.split(';')] for a in s.splitlines()]
[[1, 2, 3, 4], [5, 6, 7, 8]]
As your data seems to be some sort of CSV-like data, why not use Python's csv parsing module? It handles quoting and supports custom delimiters, all for free.
If you just want some code, use a list comprehension and split using the split method of str:
result = [line.split(';') for line in text.split("\n")]
'1;2;3;4'.split(';') will produce the list ['1', '2', '3', '4'] from the string '1;2;3;4', so you just need to do that for each line in your file (stripping the trailing newline first, and converting with int() if you need numbers):
def split_lists(filepath, sep=';'):
    with open(filepath) as f:
        line_lists = []
        for line in f:
            line_lists.append(line.strip().split(sep))
    return line_lists
Or more compactly with a comprehension:
def split_lists(filepath, sep=';'):
    with open(filepath) as f:
        return [line.strip().split(sep) for line in f]
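Thanks for the interesting question. It can be solved with two map calls and a list comprehension (in Python 3, wrap the result in list() because map returns an iterator):
s='1;2;3;4\n5;6;7;8'
list(map(lambda seq: [int(i) for i in seq], map(lambda x: x.split(';'), s.split('\n'))))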