Index error iterating over list in Python

So I have this file that contains two words on each line. It looks like this:
[/lang:F </lang:foreign>
[lipmack] [lipsmack]
[Fang:foreign] <lang:foreign>
The first word is incorrectly formatted and the second one is correctly formatted. I am trying to put them in a dictionary. Below is my code.
textFile = open("hinditxt1.txt", "r+")
textFile = textFile.readlines()
flist = []
for word in textFile:
    flist.append(word.split())
fdict = dict()
for num in range(len(flist)):
    fdict[flist[num][0]] = flist[num][1]
First I split each line, then I try to put the words in a dictionary. But for some reason I get "IndexError: list index out of range" when trying to put them in the dictionary. What can I do to fix it? Thanks!

It is better in Python to iterate over the items of a list rather than over a range of indices. My guess is that the IndexError is coming from a line in the input file that is blank or does not contain any spaces.
with open("input.txt", 'r') as f:
    flist = [line.split() for line in f]
fdict = {}
for k, v in flist:
    fdict[k] = v
print(fdict)
The code above avoids needing to access elements of the list using an index by simply iterating over the items of the list itself. We can further simplify this by using a dict comprehension:
with open("input.txt", 'r') as f:
    flist = [line.split() for line in f]
fdict = {k: v for k, v in flist}
print(fdict)
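Note that unpacking with for k, v in flist still fails (with a ValueError rather than an IndexError) if any line does not split into exactly two words. A minimal sketch of a more tolerant version follows; the helper name pairs_from_lines and the sample lines are made up for illustration, and skipping bad lines is only one possible policy.

```python
def pairs_from_lines(lines):
    """Build a dict from lines holding exactly two whitespace-separated words.

    Returns the dict plus the 1-based numbers of the lines that were skipped.
    """
    result = {}
    skipped = []
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) != 2:
            # Blank lines give [], single words give a one-item list.
            skipped.append(number)
            continue
        wrong, right = parts
        result[wrong] = right
    return result, skipped


# Hypothetical sample: the question's three lines plus one blank line.
sample = [
    "[/lang:F </lang:foreign>\n",
    "[lipmack] [lipsmack]\n",
    "\n",
    "[Fang:foreign] <lang:foreign>\n",
]
fdict, skipped = pairs_from_lines(sample)
print(fdict)
print(skipped)
```

Reporting the skipped line numbers makes it easy to go back and inspect the input file instead of silently losing data.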

With dictionaries you can also use the .update() method to add new key-value pairs. It would look more like:
for num in range(len(flist)):
    fdict.update({flist[num][0] : flist[num][1]})
A full example without file reading would look like:
in_words = ["[/lang:F </lang:foreign>",
            "[lipmack] [lipsmack]",
            "[Fang:foreign] <lang:foreign>"]
flist = []
for word in in_words:
    flist.append(word.split())
fdict = dict()
for num in range(len(flist)):
    fdict.update({flist[num][0]: flist[num][1]})
print(fdict)
Yielding:
{'[lipmack]': '[lipsmack]', '[Fang:foreign]': '<lang:foreign>', '[/lang:F': '</lang:foreign>'}
The order of your output may vary, though: before Python 3.7, dictionaries did not guarantee any ordering (from 3.7 on they keep insertion order).
As @Alex points out, the IndexError is likely caused by improperly formatted lines in your data (i.e. a line with only 1 or 0 items on it). I suspect the most likely cause is a \n at the end of your file that makes the last line(s) blank.
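The blank-line explanation is easy to check in isolation. The short sketch below, using made-up sample lines, shows that splitting a line holding only a newline gives an empty list, that indexing it raises the same IndexError, and that filtering blank lines first avoids it.

```python
# A blank line consists only of "\n"; splitting it yields an empty list.
blank_parts = "\n".split()
print(blank_parts)

# Indexing into that empty list reproduces the error from the question.
try:
    blank_parts[0]
except IndexError as exc:
    message = str(exc)
    print("IndexError:", message)

# Dropping lines that are empty after stripping avoids the problem.
raw_lines = ["[lipmack] [lipsmack]\n", "\n"]  # hypothetical input
non_blank = [line for line in raw_lines if line.strip()]
print(non_blank)
```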

Related

Python, check strings for values and remove other characters

f = open("new_sample.txt", "r")
for ranges_in_file in f:
    if ranges_in_file.find('ranges:') != -1:
        new_data = ranges_in_file.split(" ")
        print('success')
Hi guys, currently I am reading a .txt file line by line to find a certain value. I am able to find the line, for example: ranges: [1.3,1.9,2.05,inf,1.64]. How do I store that line in a list and then remove any excess items such as the word "ranges" and "inf"?
Suppose you have read the lines of the file and already have a list such as
ranges_in_file = [1.3, 1.9, 2.05, math.inf, 1.64]
(this needs import math). You can then use a list comprehension to keep only the items you want:
wanted = [x for x in ranges_in_file if x not in [math.inf, "range"]]
You can split the label ranges from the list using
a, b = ranges_in_file.split(':')
b = b.strip()
So b contains your required list as a string. Use c = b[1:-1].split(',') to convert it into a list. Then you can iterate over the list and discard any values you don't want. (Remember that all entries of the list are now strings!)
For new_sample.txt:
ranges: [1.3,1.9,2.05,inf,1.64]
you can split data with delimiter space and ", ":
f = open("new_sample.txt", "r")
file = f.read().splitlines()
f.close()
result = []
for i in file:
    b = i.split(" ")
    if b[0] == "ranges:":
        temp = b[1][1:-1].split(",")
        for j in temp:
            if j not in ["inf"]:
                result.append(j)
print(result)
OUTPUT: ['1.3', '1.9', '2.05', '1.64']
If your lines always look like this, so they start with "ranges: ", followed by a list of values, and the only thing you want to remove is inf, you can turn the line into a list of floats easily using the map function like this:
line = "ranges: [1.3,1.9,2.05,inf,1.64]"
values = list(map(float, [x.strip('[]') for x in (line.split(' ')[1]).split(',') if 'inf' not in x]))
Output:
[1.3, 1.9, 2.05, 1.64] # list of float values
You can then apply this to every line of the file that starts with 'ranges:', which will give you a list containing one list of values per line. Notice the use of with open(...), which is safer to use for files in general, because the file will always be closed properly no matter what happens.
values = []
with open('new_sample.txt', 'r') as f:
    for line in f.readlines():
        if line.startswith('ranges:'):
            line_values = list(map(float, [x.strip #.... and so on, see above
            values.append(line_values)
But if your lines can be different, a more general approach is needed.
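One possible direction for a slightly more general parser is sketched below. It relies on the fact that float() accepts the string 'inf', so the non-finite values can be filtered numerically with math.isfinite instead of by string matching. The function name parse_ranges and its label parameter are hypothetical, and the sketch still assumes a comma-separated list in square brackets.

```python
import math


def parse_ranges(line, label="ranges:"):
    """Return the finite floats from a line such as 'ranges: [1.3,inf,1.64]'.

    Returns None when the line does not start with the label.
    """
    line = line.strip()
    if not line.startswith(label):
        return None
    # Drop the label, surrounding whitespace and the square brackets.
    body = line[len(label):].strip().strip("[]")
    values = []
    for token in body.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate empty items such as a trailing comma
        number = float(token)  # float() accepts 'inf' and 'nan'
        if math.isfinite(number):
            values.append(number)
    return values


print(parse_ranges("ranges: [1.3, 1.9, 2.05, inf, 1.64]"))
```

Because it works on the stripped tokens, this version tolerates spaces after the commas, which the split(' ')-based one-liner does not.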

Sort file by key

I am learning Python 3 and I'm having issues completing this task. I'm given a file with a string on each line. I have to sort its content by the string located between the first hyphen and the second hyphen and write the sorted content into a different file. This is what I tried so far, but nothing gets sorted:
def sort_keys(path, input, output):
    list = []
    with open(path+'\\'+input, 'r') as f:
        for line in f:
            if line.count('-') >= 1:
                list.append(line)
    sorted(list, key = lambda s: s.split("-")[1])
    with open(path + "\\"+ output, 'w') as o:
        for line in list:
            o.write(line)
sort_keys("C:\\Users\\Daniel\\Desktop", "sample.txt", "results.txt")
This is the input file: https://pastebin.com/j8r8fZP6
Question 1: What am I doing wrong with the sorting? I've used it to sort the words of a sentence by the last letter and it worked fine, but here I don't know what I am doing wrong.
Question 2: I feel that reading the content of the input file into a list, sorting the list and afterwards writing that content out is not very efficient. What is the "pythonic" way of doing it?
Question 3: Do you know any good exercises to learn working with files + folders in Python 3?
Kind regards
Your sort key is fine. The problem is that sorted() returns a new list rather than altering the one provided, and your code discards that return value. It's also much easier to use a list comprehension to read the file:
def sort_keys(path, infile, outfile):
    with open(path+'\\'+infile, 'r') as f:
        inputlines = [line.strip() for line in f.readlines() if "-" in line]
    outputlines = sorted(inputlines, key=lambda s: s.split("-")[1])
    with open(path + "\\" + outfile, 'w') as o:
        for line in outputlines:
            o.write(line + "\n")
sort_keys("C:\\Users\\Daniel\\Desktop", "sample.txt", "results.txt")
I also changed a few variable names, for legibility's sake.
EDIT: I understand that there are easier ways of doing the sorting (list.sort() sorts in place), however this way seems more readable to me.
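The difference between the two sorting calls can be seen in a small sketch. The sample strings and the helper name second_field are invented for illustration; only sorted() and list.sort() are standard.

```python
lines = ["x-banana-1", "y-apple-2", "z-cherry-3"]  # hypothetical data


def second_field(s):
    """Return the text between the first and second hyphen."""
    return s.split("-")[1]


# sorted() leaves the original untouched and returns a new list.
ordered = sorted(lines, key=second_field)

# list.sort() reorders the list in place and returns None.
in_place = list(lines)
returned = in_place.sort(key=second_field)

print(lines)
print(ordered)
print(returned)
```

This is why calling sorted(list, ...) without assigning the result appears to do nothing.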
First, your data has a couple lines without hyphens. Is that a typo? Or do you need to deal with those lines? If it is NOT a typo and those lines are supposed to be part of the data, how should they be handled?
I'm going to assume those lines are typos and ignore them for now.
Second, do you need to write out the whole line, with the lines sorted by the group of characters between the first and second hyphens? If that's the case...
First, read in the file:
f = open('./text.txt', 'r')
There are a couple ways to go from here, but let's clean up the file contents a little and make a list object:
l = [i.replace("\n","") for i in f]
This will create a list l with all the newline characters removed. This particular way of creating the list is called a list comprehension. You can do the exact same thing with the following code:
l = []
for i in f:
    l.append(i.replace("\n",""))
Now let's create a dictionary with the key as the 2nd group and the value as the whole line. Again, there are some lines with no hyphens, so we are going to just skip those for now with a simple try/except block:
d = {}
for i in l:
    try:
        d[i.split("-")[1]] = i
    except IndexError:
        pass
Now, here things can get slightly tricky. It depends on how you want to approach the problem. Dictionaries in Python are not sorted by key, so there is not a really good way to simply sort the dictionary. ONE way (not necessarily the BEST way) is to create a sorted list of the dictionary keys:
s = sorted([k for k, v in d.items()])
Again, I used a list comprehension here, but you can rewrite that line to do the exact same thing here:
s = []
for k, v in d.items():
    s.append(k)
s = sorted(s)
Now, we can write the dictionary back to a file by iterating through the dictionary using the sorted list. To see what I mean, let's print out the dictionary one value at a time using the sorted list as the keys:
for i in s:
    print(d[i])
But instead of printing, we will now append the line to a file:
o = open('./out.txt', 'a')
for i in s:
    o.write(d[i] + "\n")
o.close()
Depending on your system and formatting, you may or may not need the + "\n" part. Also note that the file is opened only once, before the loop, so both 'a' and 'w' write every line: 'a' appends to whatever out.txt already contains, while 'w' starts from an empty file. Using 'w' would leave only the last item only if you reopened the file inside the loop.
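One caveat of the dictionary approach: two lines that share the same key overwrite each other, so only the last one survives. A sketch that sorts the lines directly avoids this. The function name and the sample lines are made up for illustration, and dropping hyphen-less lines simply mirrors the assumption made above.

```python
def sort_lines_by_second_field(lines):
    """Sort lines by the text between the first and second hyphen.

    Lines without a hyphen are dropped, matching the assumption above.
    """
    cleaned = [line.rstrip("\n") for line in lines]
    keyed = [line for line in cleaned if "-" in line]
    # sorted() is stable, so lines with equal keys keep their input order.
    return sorted(keyed, key=lambda s: s.split("-")[1])


# Hypothetical sample with a duplicate key ("pear") and a hyphen-less line.
sample = ["1-pear-a\n", "no hyphen here\n", "2-apple-b\n", "3-pear-c\n"]
result = sort_lines_by_second_field(sample)
print(result)
```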

Breaking txt file into list of lists by character and by row

I am just learning to code and am trying to take an input txt file and break into a list (by row) where each row's characters are elements of that list. For example if the file is:
abcde
fghij
klmno
I would like to create
[['a','b','c','d','e'], ['f','g','h','i','j'],['k','l','m','n','o']]
I have tried this, but the results aren't what I am looking for.
file = open('alpha.txt', 'r')
lst = []
for line in file:
    lst.append(line.rstrip().split(','))
print(lst)
[['abcde', 'fghij', 'klmno']]
I also tried this, which is closer, but I don't know how to combine the two codes:
file = open('alpha.txt', 'r')
lst = []
for line in file:
    for c in line:
        lst.append(c)
print(lst)
['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
I tried to add the rstrip into the lst.append but it didn't work (or I didn't do it properly). Sorry - complete newbie here!
I should mention that I don't want newline characters included. Any help is much appreciated!
This is very simple. You have to use the list() constructor to make a string into its respective characters.
with open('alpha.txt', 'r') as file:
    print([list(line)[:-1] for line in file.readlines()])
(The with open construct is just an idiom, so you don't have to do all the handling with the file like closing it, which you forgot to do)
If you want to split a string into its characters you can just use list(s) (where s = 'asdf'):
file = open('alpha.txt', 'r')
lst = []
for line in file:
    lst.append(list(line.strip()))
print(lst)
You are appending each entry to your original list. You want to create a new list for each line in your input, append to that list, and then append that list to your master list. For example,
file = open('alpha.txt', 'r')
lst = []
for line in file:
    newLst = []
    for c in line:
        newLst.append(c)
    lst.append(newLst)
print(lst)
Use a nested list comprehension. The outer loop iterates over the lines in the file and the inner loop over the characters in the string of each line.
with open('alpha.txt') as f:
    out = [[char for char in line.strip()] for line in f]
req = [['a','b','c','d','e'], ['f','g','h','i','j'],['k','l','m','n','o']]
print(out == req)
prints
True
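If the text is already in memory, str.splitlines() is another way to drop the line endings without touching other characters. The sketch below assumes that is acceptable; the function name chars_by_row is hypothetical.

```python
def chars_by_row(text):
    """Split text into rows, and each row into a list of its characters."""
    # splitlines() removes the line endings, including a missing final one.
    return [list(row) for row in text.splitlines()]


grid = chars_by_row("abcde\nfghij\nklmno")
print(grid)
```

With a file this could be called as chars_by_row(f.read()). Unlike slicing with [:-1], it does not cut off a real character when the last line has no trailing newline.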

Dictionary out of strings with blanks as the key/value separator

I have a text file that looks like this:
A (four spaces) 16
B (four spaces) 25
etc.
I need to make a dictionary that looks like this: dic = { A:16, B:25, etc.}
I have already tried "split" (which turned out doesn't work with spaces).
Here is my code so far:
lines = []
with open("ABC.txt","r") as filein:
    for line in filein:
        lines.append(line.strip('\n'))
print lines
which for the moment gives me a list of strings: "A(four spaces)16", "B(four spaces)25", etc. I just need to separate the key from the value using the four spaces as a cutoff.
Any advice?
The constructor dict() takes an iterable of key-value pairs. So the dictionary can be built directly like so:
with open("ABC.txt","r") as filein:
    mydict = dict(line.split() for line in filein)
I don't see why split shouldn't work... Try the following.
dic = {}
with open("ABC.txt",'r') as f:
    for l in f:
        key,value = l.split() #Set two temporary variables for easy access to data
        dic[key] = int(value) #Set the appropriate key of the dict to its respective value
print dic
lines is not a dictionary, it is a list. To make it a dictionary, you should initialize it as:
lines = {}
You could also do:
for line in filein:
    lines[line[0]] = line[-2:]
assuming the length of your keys and values is always the same, and also assuming one line looks something like
A 25
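A likely reason split seemed not to work with spaces is the difference between split(" ") and split() with no argument; that is a guess, since the question does not show the call that failed. The Python 3 sketch below contrasts the two and then builds the dictionary. The name parse_pairs and the sample lines are hypothetical.

```python
line = "A    16\n"  # four spaces between key and value

# Splitting on a single space yields empty strings between the spaces.
by_single_space = line.split(" ")

# Splitting with no argument treats any run of whitespace as one separator.
by_whitespace = line.split()


def parse_pairs(lines):
    """Map the first word of each line to the integer that follows it."""
    result = {}
    for text in lines:
        parts = text.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        result[key] = int(value)
    return result


table = parse_pairs(["A    16\n", "B    25\n", "\n"])
print(by_single_space)
print(by_whitespace)
print(table)
```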

Trying to input data from a txt file in a list, then make a list, then assign values to the lines

allt = []
with open('towers1.txt','r') as f:
    towers = [line.strip('\n') for line in f]
for i in towers:
    allt.append(i.split('\t'))
print allt [0]
Now I need help. I'm inputting this text:
mw91 42.927 -72.84 2.8
yu9x 42.615 -72.58 2.3
HB90 42.382 -72.679 2.4
and when I output I'm getting
['mw91 42.927 -72.84 2.8']
Where in my code, and with what functions, can I pick out the 1st, 2nd, 3rd and 4th values in this list and in all the lines below it? I'm trying
allt[0][2] or
allt[i][2]
but that doesn't give me -72.84, it's an error; other times it says list has no attribute split.
Update: maybe I need to use enumerate? I also need to make sure the middle two values I'm inputting can be used as numbers and not strings, because I'm subtracting them.
Are you sure those are tabs? You can specify no argument for split and it automatically splits on whitespace (which means you won't have to strip newlines beforehand either). I copied your sample into a file and got it to work like this:
allt = []
with open('towers1.txt','r') as f:
    for line in f:
        allt.append(line.split())
>>> print allt[0]
['mw91', '42.927', '-72.84', '2.8']
>>> print allt[0][1]
'42.927'
Footnote: if you get rid of your first list comprehension, you're only iterating the file once, which is less wasteful.
Just saw that you want help converting the float values as well. Assuming that line.split() splits up the data correctly, something like the following should probably work:
allt = []
with open('towers1.txt','r') as f:
    for line in f:
        first, *_else = line.split() #Python3
        data = [first]
        float_nums = [float(x) for x in _else]
        data.extend(float_nums)
        allt.append(data)
>>> print allt[0]
['mw91', 42.927, -72.84, 2.8]
For Python2, substitute the first, *_else = line.split() with the following:
first, _else = line.split()[0], line.split()[1:]
Finally (in response to comments below), if you want a list of a certain set of values, you're going to have to iterate again and this is where list comprehensions can be useful. If you want the [2] index value for each element in allt, you'll have to do something like this:
>>> some_items = [item[2] for item in allt]
>>> some_items
[-72.84, -72.58, -72.679]
[] implies a list.
'' implies a string.
allt = ['mw91 42.927 -72.84 2.8']
allt is a list that contains a string, so you have to index the list before you can call split:
allt[0] --> 'mw91 42.927 -72.84 2.8'
allt[0][2] --> '9'
allt[0].split() --> ['mw91', '42.927', '-72.84', '2.8']
allt[0].split()[2] --> '-72.84' #This is still a string.
float(allt[0].split()[2]) --> -72.84 #This is now a float.
I think this should also work
with open('towers.txt', 'r') as f:
    allt = list(map(str.split, f))  # list() so this also works in Python 3, where map is lazy
And if you need the values after the first one to be floats...
with open('towers.txt', 'r') as f:
    allt = [line[:1] + list(map(float, line[1:])) for line in map(str.split, f)]
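Putting the pieces together for Python 3, a minimal sketch might look like this. The function name parse_towers is hypothetical, the sample lines are copied from the question, and the sketch assumes every column after the first is numeric.

```python
def parse_towers(lines):
    """Turn 'name n1 n2 n3' lines into [name, float, float, float] rows."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, *numbers = fields
        rows.append([name] + [float(x) for x in numbers])
    return rows


sample = [
    "mw91 42.927 -72.84 2.8\n",
    "yu9x 42.615 -72.58 2.3\n",
    "HB90 42.382 -72.679 2.4\n",
]
towers = parse_towers(sample)

# The third value of every row, as floats that can be used in arithmetic.
third_values = [row[2] for row in towers]
difference = towers[0][1] - towers[1][1]
print(towers[0])
print(third_values)
```

With a real file, parse_towers(f) can be called inside a with open(...) as f block, since a file object iterates over its lines.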
