I'm trying to take a .txt file populated by 88 rows, each of which has two characters separated by a space, copy the first character in each row into list #1, copy the second character of each row into list #2, and then populate a dictionary with those two lists. Something, however, is going wrong when I try to copy the data from the file into my lists. Could you tell me what I'm not doing correctly?
I keep getting this error: "IndexError: string index out of range" at the line where I have typed "column1[count] = readit[0]"
def main():
    modo = open('codes.txt', 'r')  #opening file
    filezise = 0  #init'ing counter
    for line in modo:
        filezise+=1  #counting lines in the file
    column1 = []*filezise
    column2 = []*filezise  #making the lists as large as the file
    count = 0  #init'ing next counter
    while count < filezise+1:
        readit = str(modo.readline())
        column1[count] = readit[0]  ##looping through the file and
        column2[count] = readit[2]  ##populating the first list with the
        count+=1  #first character and the second list
    print(column1, column2)  #with the second character
    index = 0
    n = 0
    codebook = {}  #init'ing dictionary
    for index, n in enumerate(column1):  #looping through to bind the key
        codebook[n] = column2[index]  #to its concordant value
    print(codebook)
main()
When you write
for line in modo:
    filezise+=1
You have already consumed the file.
If you want to consume it again, you need to do modo.seek(0) first to rewind the file back.
If you do not rewind the file, the line below will return an empty string, because there is nothing left in the file.
readit = str(modo.readline())
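Here is a minimal sketch of that behaviour. It writes a hypothetical two-line sample to a temporary file (standing in for codes.txt), so the file name and contents are assumptions rather than the asker's data:

```python
import os
import tempfile

# Hypothetical two-line sample standing in for codes.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a 1\nb 2\n")
    path = tmp.name

with open(path) as modo:
    line_count = sum(1 for _ in modo)  # the first pass consumes the file
    after_loop = modo.readline()       # nothing left, so this is ''
    modo.seek(0)                       # rewind to the beginning
    after_seek = modo.readline()       # the first line is readable again

os.remove(path)
print(line_count, repr(after_loop), repr(after_seek))
```

Indexing the empty string with [0] is what raises "string index out of range".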
Of course, there's no real need to go through the file twice. You can just do it once and append to your lists.
column1 = []
column2 = []
for line in modo:
    filezise+=1
    column1.append(line[0])
    column2.append(line[2])
Try this
codebook = dict([''.join(line.strip().split(' ')) for line in open('codes.txt').readlines()])
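Part of the problem is that column1 = []*filezise doesn't actually make a list of length filezise. (If you look at the result, you will see that column1 is just an empty list.) Assigning to column1[count] fails for every value of count, including 0, with "IndexError: list assignment index out of range", because an empty list has no position to assign to. The "string index out of range" message itself comes from readit[0]: once the file has been read through, readline() returns an empty string.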
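A quick way to check what list repetition produces. The value 3 is just a hypothetical line count, and [None] * n is shown only for contrast, not as a recommendation:

```python
filezise = 3  # hypothetical line count

empty = [] * filezise       # repeating an empty list is still an empty list
filled = [None] * filezise  # repeating a one-item list gives three slots

try:
    empty[0] = "a"          # there is no slot 0 to assign to
    error_message = None
except IndexError as exc:
    error_message = str(exc)

filled[0] = "a"             # works because the slot already exists
print(empty, filled, error_message)
```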
You shouldn't be trying to initialize the list. Instead, iterate over the lines in the file and append the appropriate characters:
column1=[]
column2=[]
for line in open('codes.txt'):  # file() only exists in Python 2
    column1.append(line[0])
    column2.append(line[2])
There's a much simpler way to get a dictionary from your file, by using the csv module and the dict() built-in function:
import csv
# Python 3: open in text mode with newline='' (use 'rb' on Python 2)
with open('codes.txt', newline='') as csvfile:
    codebook = dict(csv.reader(csvfile, delimiter=' '))
So long as the intermediate lists aren't being used, you could also use a dictionary comprehension to do everything in one go.
with open('codes.txt', 'r') as f:
    codebook = {line[0]: line[-1] for line in f.read().splitlines()}
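If the file might contain blank or malformed rows, a slightly more defensive variant is sketched below. It assumes each valid row is exactly two whitespace-separated fields, and it uses an in-memory sample instead of the real codes.txt:

```python
import io

# Hypothetical sample in the same "key value" layout as codes.txt.
sample = io.StringIO("a 1\nb 2\n\nc 3\n")

codebook = {}
for line in sample:
    parts = line.split()   # split on whitespace; this also drops the newline
    if len(parts) != 2:    # skip blank or malformed rows
        continue
    key, value = parts
    codebook[key] = value

print(codebook)
```

With a real file you would replace the StringIO object with open('codes.txt') inside a with block.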
I'm trying to store each line of a text file as a separate list within a list, where each character of the line is an individual item of that nested list. Right now it only appends the last character of each line, and I'm not sure why, given the nested while loop. Does anyone see the mistake? Thanks
def read_lines(filename):
    ls_1 = []
    x = open(filename, 'r')
    i = 0
    t = 0
    while True:  #nested while loop to read lines and separate lines into individual characters (cells)
        read = x.readline()
        if read == '':
            break
        st = read.strip("''\n''")
        while t < len(st):
            ls_2 = []
            ls_2.append(st[t])
            t += 1
        ls_1.append(ls_2)  #append a new list to the original list every time the while loop resets and a new line is read
        #ls_2.clear()  # removes contents so the next loop doesn't repeat the first readline (doesn't work for unknown reason)
        t = 0  # resets the index of read so the next new line can be read from start of line
        i += 1
    x.close()
    return ls_1
Whole txt file:
Baby on board, how I've adored
That sign on my car's windowpane.
Bounce in my step,
Loaded with pep,
'Cause I'm driving in the carpool lane.
Call me a square,
Friend, I don't care.
That little yellow sign can't be ignored.
I'm telling you it's mighty nice.
Each trip's a trip to paradise
With my baby on board!
The reason you are only getting the last character is that you create a new list on every pass through your inner loop:
while t < len(st):
    ls_2 = []
    ls_2.append(st[t])
    t += 1
ls_1.append(ls_2)
Instead, you would have to do:
ls_2 = []
while t < len(st):
    ls_2.append(st[t])
    t += 1
ls_1.append(ls_2)
However, don't use while loops to read from files; file objects are iterators, so just use a for loop. Similarly, don't use a while loop to iterate over a string.
Here is how you would do it, Pythonically:
result = []
with open(filename) as f:
    for line in f:
        result.append(list(line.strip()))
Or with a list comprehension:
with open(filename) as f:
    result = [list(line.strip()) for line in f]
You rarely need while loops for this kind of iteration in Python. Files, strings, and lists are all iterable, so a for loop covers them.
I suggest using the readlines method, which lets you iterate over each line of the opened file. You can then convert each string to a list, which gives you a list of all the characters that make up that string (which seems to be what you want).
Try using the following code:
def read_lines(filename):
    x = open(filename, 'r')
    ls_1 = [list(line.strip()) for line in x.readlines()]
    x.close()
    return ls_1
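To see what list() does to each line, here is a small sketch on two lines of the lyrics, using an in-memory file as a stand-in for the real one. It uses rstrip("\n") rather than strip(), which is my own choice so that only the newline is removed:

```python
import io

# Hypothetical stand-in for the opened lyrics file.
sample = io.StringIO("Bounce in my step,\nLoaded with pep,\n")

# list(some_string) splits the string into its individual characters.
rows = [list(line.rstrip("\n")) for line in sample]

print(len(rows))    # one inner list per line
print(rows[0][:6])  # the first six characters of the first line
```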
I have a text file that has three lines. I would like the first number of each line stored in one array, the second in another, and so on and so forth, and then have each array printed out.
The text file:
0,1,2,3,0
1,3,0,0,2
2,0,3,0,1
The code I'm using (I've only showed the first array for simplicity):
f=open("ballot.txt","r")
for line in f:
    num1=line[0]
    num1=[]
    print(num1)
I expect the result for it to print out the first number of each line:
0
1
2
The actual result I get is:
[]
[]
[]
It looks like you reset num1: on every iteration, num1 is reset to an empty list just before it is printed.
f=open("ballot.txt","r")
for line in f:
    num1=line[0]
    #num1=[] <-- remove this line
    print(num1)
This will print the first character of the line. If you want the first number (i.e. everything before the first comma), you can try this:
f=open("ballot.txt","r")
for line in f:
    num1=line.split(',')[0]
    print(num1)
You read in the line fine and assign the first char of the line to the variable, but then you overwrite the variable with an empty list.
f=open("ballot.txt","r")
for line in f:
    num1=line.strip().split(',')[0]  # splits the line by commas and grabs 1st val
    #    ^^^^^^^^^^^^^^^^^^^^^^^^^^
    print(num1)
This should do what you want. In your simple case, it's index 0, but you could index any value.
Since the file is comma-delimited, splitting the line by the comma will give you all the columns. Then you index the one you want. The strip() call gets rid of the newline character (which would otherwise be hanging off the last column value).
As for the big picture (getting a list for each column): read the whole file into a data structure, then process that data structure to make your lists.
def get_column_data(data, index):
    return [values[index] for values in data]
with open("ballot.txt", "r") as f:
    data = f.read()
data_struct = []
for line in data.splitlines():
    values = line.split(',')
    data_struct.append(values)
print(data, '\nData Struct is ', data_struct)
print(get_column_data(data_struct, 0))
print(get_column_data(data_struct, 1))
The get_column_data function parses the data structure and makes a list (via list comprehension) of the values of the proper column.
In the end, the data_struct is a list of lists, which can be accessed as a two-dimensional array if you wanted to do that.
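If you want every column at once, one possible sketch is to transpose the list of lists with zip(*rows). This is an alternative I'm suggesting rather than part of the code above, and it uses an in-memory copy of the ballot data in place of ballot.txt:

```python
import io

# In-memory copy of the three lines from the question.
sample = io.StringIO("0,1,2,3,0\n1,3,0,0,2\n2,0,3,0,1\n")

rows = [line.strip().split(",") for line in sample]

# zip(*rows) pairs up the i-th value of every row, i.e. it yields the columns.
columns = [list(col) for col in zip(*rows)]

for col in columns:
    print(col)
```

Note that zip stops at the shortest row, so this assumes every line has the same number of values.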
I am trying to take data from one file and create two lists, both of which are written to a new file. The first list contains names of 6 characters or fewer, and the second contains names that do not contain "a" or "e". I have the code that forms both lists; I have tried each separately and they work, but I cannot make them both append to the new file at the same time. Whichever list I do first is the only one that gets appended to the new file. Any help would be much appreciated!
Code
main_file = open("words.txt", "r")
#loops to find lists
lists = open('test.txt')
lists.read()
with open("test.txt", "a") as lists:
    for names in main_file:
        if len(names) <= 6:
            lists.write(names)
    line = True
    for line in main_file:
        if "a" in line or "e" in line or "A" in line:
            line = False
        else:
            lists.write(line)
Both lists need to be appended to a new and the SAME file
If I understand your question correctly:
import re
l1 = []  # names of 6 characters or fewer
l2 = []  # names without "a" or "e"
with open('words.txt') as f:
    for line in f.read().splitlines():
        if len(line) <= 6:
            l1.append(line)
        if not re.search(r'a|e', line):
            l2.append(line)
with open('test.txt', 'w') as lists:
    lists.write('\n'.join(l1 + l2))
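The reason only the first list was written in the original code is that the first for loop reads main_file to the end, so the second loop has nothing left to iterate over. Below is a rough single-pass sketch of the same idea. The word list is made up, the output is an in-memory buffer rather than test.txt, and treating "a"/"e" case-insensitively is my assumption based on the "A" check in the question:

```python
import io

# Hypothetical word list standing in for words.txt.
main_file = io.StringIO("Anna\nBob\nChristopher\nLily\nJohn\n")

short_names = []  # six characters or fewer
no_a_or_e = []    # no "a" or "e", in either case (an assumption)

for raw in main_file:  # single pass: the file is only read once
    name = raw.strip()
    if not name:
        continue
    if len(name) <= 6:
        short_names.append(name)
    if not set(name.lower()) & {"a", "e"}:
        no_a_or_e.append(name)

output = io.StringIO()  # stands in for open("test.txt", "w")
output.write("\n".join(short_names + no_a_or_e) + "\n")
print(output.getvalue())
```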
I have the following text file in the same folder as my Python Code.
78459581
Black Ballpoint Pen
12345670
Football
49585922
Perfume
83799715
Shampoo
I have written this Python code.
file = open("ProductDatabaseEdit.txt", "r")
d = {}
for line in file:
    x = line.split("\n")
    a=x[0]
    b=x[1]
    d[a]=b
print(d)
This is the result I receive.
b=x[1] # IndexError: list index out of range
My dictionary should appear as follows:
{"78459581" : "Black Ballpoint Pen"
"12345670" : "Football"
"49585922" : "Perfume"
"83799715" : "Shampoo"}
What am I doing wrong?
A line is terminated by a linebreak, thus line.split("\n") will never give you more than one line.
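You can check what split("\n") returns for a single line in the interpreter. The sketch below uses two strings from the question's data; whether the real file ends with a trailing newline is an assumption on my part:

```python
# A line as the file iterator yields it, including its trailing newline.
with_newline = "78459581\n".split("\n")

# A final line that has no trailing newline (assumed for the last row).
last_line = "Shampoo".split("\n")

print(with_newline)  # ['78459581', '']
print(last_line)     # ['Shampoo']
```

Either way, the second line of the file never shows up in the result; at best x[1] is an empty string, and for a last line without a newline it does not exist at all.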
You could cheat and do:
for first_line in file:
    second_line = next(file)
You can simplify your solution by passing a generator expression to dict(); this is probably the most Pythonic solution I can think of:
>>> with open("in.txt") as f:
...     my_dict = dict((line.strip(), next(f).strip()) for line in f)
...
>>> my_dict
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}
Where in.txt contains the data as described in the problem. It is necessary to strip() each line otherwise you would be left with a trailing \n character for your keys and values.
You need to strip the \n, not split:
file = open("products.txt", "r")
d = {}
for line in file:
    a = line.strip()
    b = next(file).strip()  # use file.next().strip() on Python 2
    d[a]=b
print(d)
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}
What's going on
When you open a file you get an iterator, which will give you one line at a time when you use it in a for loop.
Your code is iterating over the file, splitting every line into a list with \n as the delimiter, but that never gives you the following line: at most you get the text of the line you already had plus an empty string, and for a last line with no trailing newline you get a list with only one item. In that case the second item in the list doesn't exist, and trying to access it gives you the IndexError: list index out of range.
How to fix it
What you need is this:
file = open('products.txt','r')
d = {}
for line in file:
    d[line.strip()] = next(file).strip()
In every iteration you add a new key to the dictionary (by assigning a value to a key that didn't exist yet) and assign the next line as the value. The next() function just tells the file iterator "please move on to the next line". So, to drive the point home: in the first iteration you set the first line as a key and assign the second line as the value; in the second iteration, you set the third line as a key and assign the fourth line as the value; and so on.
The reason you need to use the .strip() method every time is that every line read from the file ends with a newline character (and your example file also had a space at the end of every line); .strip() removes both.
Or...
You can also get the same result using a dictionary comprehension:
file = open('products.txt','r')
d = {line.strip():next(file).strip() for line in file}
Basically, it is a shorter version of the same code above. It's shorter, but less readable: not necessarily something you want (a matter of taste).
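Another way to pair up consecutive lines, offered here as an alternative sketch rather than as part of the answer above, is to zip an iterator with itself. The in-memory sample stands in for products.txt:

```python
import io

# Hypothetical stand-in for products.txt (first two products only).
sample = io.StringIO("78459581\nBlack Ballpoint Pen\n12345670\nFootball\n")

lines = (line.strip() for line in sample)

# zip pulls from the same iterator twice per pair: first the key, then the value.
d = dict(zip(lines, lines))

print(d)
```

If the file has an odd number of lines, zip silently drops the unpaired last key instead of raising an error, which may or may not be what you want.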
In my solution I tried not to use any loops. Therefore, I first load the txt data with pandas:
import pandas as pd
file = pd.read_csv("test.txt", header = None)
Then I separate the keys and values for the dict:
keys, values = file[0::2].values, file[1::2].values
Then, we can directly zip these two as lists and create a dict:
result = dict(zip(list(keys.flatten()), list(values.flatten())))
To create this solution I used the information provided in the questions "How to remove every other element of an array in python? (The inverse of np.repeat()?)" and "Map two lists into a dictionary in Python".
You can loop over a list two items at a time:
file = open("ProductDatabaseEdit.txt", "r")
data = file.readlines()
d = {}
for i in range(0,len(data)-1,2):
    # strip() removes the trailing newline from the key and the value
    d[data[i].strip()] = data[i+1].strip()
Try this code (where the data is in /tmp/tmp5.txt):
#!/usr/bin/env python3
d = dict()
iskey = True
with open("/tmp/tmp5.txt") as infile:
    for line in infile:
        if iskey:
            _key = line.strip()
        else:
            _value = line.strip()
            d[_key] = _value
        iskey = not iskey
print(d)
Which gives you:
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}