Split method in Python is outputting an index error - python

This program takes in a txt file and prints out the first word of each line. It works perfectly, but at the end it prints out this error:
Traceback (most recent call last):
  File "C:/Users/vipku/PycharmProjects/untitled/test.py", line 7, in <module>
    print(f.readline().split()[0])
IndexError: list index out of range
This is the code that I wrote:
f = open("example.txt", "r")
for line in f:
    for first in line:
        print(f.readline().split()[0])

Check if the line is empty before getting the first element:
f = open("example.txt", "r")
lines = f.readlines()
for line in lines:
    if line.strip():
        print(line.split()[0])

Note that this:
f = open("example.txt", "r")
for line in f:
    for first in line:
actually means:
open file "example.txt" for reading
for every line in that file:
    for every character in that line:
So you call readline more times than there are lines in the file. Once the file is exhausted, readline returns an empty str; as the docs say, if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline. Calling .split() on that empty string gives an empty list, so indexing it with [0] raises the IndexError.
A single for loop is enough if you want to process the file line by line. You should also check for blank lines (lines consisting of a single newline; calling .split() on them results in an empty list), so a solution might look like:
f = open("example.txt", "r")
for line in f:
    words = line.split()
    if words:
        print(words[0])
f.close()
I rely on the fact that empty lists are falsy and non-empty lists are truthy, so print is executed only if words has at least one element. Note that files should be closed after use. You might do it explicitly, as above, or use the with open... approach instead. You can learn about the latter from this realpython tutorial.
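As a sketch of the with open... approach mentioned above, the same logic could look like this. The helper name first_words and the sample file contents are assumptions made for illustration, not part of the original answer.

```python
import os
import tempfile


def first_words(path):
    """Return the first word of every non-blank line in the file at `path`."""
    result = []
    # The with block closes the file automatically, even if an error occurs.
    with open(path, "r") as f:
        for line in f:
            words = line.split()
            if words:  # skip blank or whitespace-only lines
                result.append(words[0])
    return result


# Hypothetical sample data, written to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello world\n\nfoo bar baz\n   \nlast\n")

demo = first_words(tmp.name)
print(demo)
os.remove(tmp.name)
```

Collecting the words in a list instead of printing them directly is only a convenience here, so the result can be inspected afterwards.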

Related

Parsing Logs with Regular Expressions Python

Coding and Python lightweight :)
I've gotta iterate through some logfiles and pick out the ones that say ERROR. Boom, done, got that. What I've gotta do now is figure out how to grab the following 10 lines containing the details of the error. It's gotta be some combo of an if statement and a for/while loop, I presume. Any help would be appreciated.
import os
import re
# Regex used to match
line_regex = re.compile(r"ERROR")
# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("NodeOut.log")
# Overwrites the file, ensure we're starting out with a blank file
#TODO Append this later
with open(output_filename, "w") as out_file:
    out_file.write("")
# Open output file in 'append' mode
with open(output_filename, "a") as out_file:
    # Open input file in 'read' mode
    with open("MXNode1.stdout", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If log line matches our regex, print remove later, and write > file
            if (line_regex.search(line)):
                # for i in range():
                print(line)
                out_file.write(line)
There is no need for regex to do this, you can just use the in operator ("ERROR" in line).
Also, to clear the content of the file without opening it in w mode, you can simply place the cursor at the beginning of the file and truncate.
import os
output_filename = os.path.normpath("NodeOut.log")
with open(output_filename, 'a') as out_file:
    out_file.seek(0, 0)
    out_file.truncate(0)
    with open("MXNode1.stdout", 'r') as in_file:
        line = in_file.readline()
        while line:
            if "ERROR" in line:
                out_file.write(line)
                for i in range(10):
                    out_file.write(in_file.readline())
            line = in_file.readline()
We use a while loop to read lines one by one with in_file.readline(). The advantage is that, once an ERROR line is found, you can easily read the following lines with further readline() calls inside a for loop.
See the doc:
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
Assuming you would only want to always grab the next 10 lines, then you could do something similar to:
with open("MXNode1.stdout", "r") as in_file:
    # Loop over each log line
    lineCount = 11
    for line in in_file:
        # If log line matches our regex, print remove later, and write > file
        if (line_regex.search(line)):
            # for i in range():
            print(line)
            lineCount = 0
        if (lineCount < 11):
            lineCount += 1
            out_file.write(line)
The second if statement writes the current line whenever fewer than 11 lines have been written since the last match. The magic number 11 is there so that you grab the line the ERROR was found on plus the next 10 lines.
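To make the counter idea above easier to try out, here is a minimal, self-contained sketch. The function name copy_error_context, the context parameter, and the sample log text are assumptions made for illustration; the io.StringIO objects only stand in for the real MXNode1.stdout and NodeOut.log files.

```python
import io


def copy_error_context(in_file, out_file, context=10):
    """Copy each line containing "ERROR" plus the `context` lines after it."""
    remaining = 0  # how many follow-up lines still have to be copied
    for line in in_file:
        if "ERROR" in line:
            out_file.write(line)
            remaining = context  # restart the countdown on every match
        elif remaining > 0:
            out_file.write(line)
            remaining -= 1


# Hypothetical log content used only for this demonstration.
sample_log = io.StringIO("ok 1\nERROR boom\ndetail 1\ndetail 2\ndetail 3\nok 2\n")
captured = io.StringIO()
copy_error_context(sample_log, captured, context=2)
print(captured.getvalue())
```

Restarting the countdown on each match means a second ERROR inside the window extends the capture rather than being skipped.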

How do I split each line into two strings and print without the comma?

I'm trying to have output to be without commas, and separate each line into two strings and print them.
My code so far yields:
173,70
134,63
122,61
140,68
201,75
222,78
183,71
144,69
But I'd like it to print without the comma, with the two values on each line separated as strings.
if __name__ == '__main__':
    # Complete main section of code
    file_name = "data.txt"
    # Open the file for reading here
    my_file = open('data.txt')
    lines = my_file.read()
    with open('data.txt') as f:
        for line in f:
            lines.split()
            lines.replace(',', ' ')
    print(lines)
In your sample code, lines contains the full content of the file as a str.
my_file = open('data.txt')
lines = my_file.read()
You then later re-open the file to iterate the lines:
with open('data.txt') as f:
    for line in f:
        lines.split()
        lines.replace(',', ' ')
Note, however, str.split and str.replace do not modify the existing value, as strs in python are immutable. Also note you are operating on lines there, rather than the for-loop variable line.
Instead, you'll need to assign the result of those functions into new values, or give them as arguments (E.g., to print). So you'll want to open the file, iterate over the lines and print the value with the "," replaced with a " ":
with open("data.txt") as f:
    for line in f:
        # each line already ends with a newline, so suppress print's own
        print(line.replace(",", " "), end="")
Or, since you are operating on the whole file anyway:
with open("data.txt") as f:
    print(f.read().replace(",", " "), end="")
Or, as your file appears to be CSV content, you may wish to use the csv module from the standard library instead:
import csv
with open("data.txt", newline="") as csvfile:
    for row in csv.reader(csvfile):
        print(*row)
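The point about immutability above can be checked directly. This is a small sketch with a made-up sample value; the variable names are illustrative only.

```python
line = "173,70\n"  # hypothetical sample line, shaped like the data in the question

# Calling the method without using its return value leaves `line` unchanged.
line.replace(",", " ")
unchanged = line

# Assigning the result keeps the modified copy.
replaced = line.replace(",", " ")

# Splitting on the comma gives the two values as separate strings.
first, second = line.strip().split(",")

print(repr(unchanged), repr(replaced), first, second)
```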
with open('data.txt', 'r') as f:
    for line in f:
        for value in line.strip().split(','):
            print(value)
While Python offers several ways to open files, this is the preferred one for working with them, because the file is read lazily (which matters especially for large files), and after exiting the with scope (the indented block) the file is closed automatically.
Here we open the file in read mode. File objects follow the iterator protocol, so we can iterate over them like lists. Each item is one line of the file, as a string.
After getting the line in the line variable, we split it (see str.split()) into two tokens, one before the comma and the other after it. split returns a newly constructed list of strings. If you need to drop unwanted characters such as the trailing newline, you can use the str.strip() method. strip and split are usually combined.
Elegant and efficient file reading - method 1
with open("data.txt", 'r') as io:
    for line in io:
        sl = line.strip().split(',')  # now sl is a list of strings, without the trailing newline
        print("{} {}".format(sl[0], sl[1]))  # use format to print the results on the screen
Not elegant, but efficient file reading - method 2
fp = open("data.txt", 'r')
# Assignment inside the condition needs the := operator (Python 3.8+).
while (line := fp.readline()) != '':  # when line becomes an empty string, EOF has been reached
    sl = line.strip().split(',')
    print("{} {}".format(sl[0], sl[1]))
fp.close()

break when empty line from a File

I have a file that contains text like Hello:World
#!/usr/bin/python
f = open('m.txt')
while True:
    line = f.readline()
    if not line :
        break
    first = line.split(':')[0]
    second = line.split(':')[1]
f.close()
I want to split the string and put the parts into 2 variables.
On the second iteration I get the error
List index out of range
It doesn't break when the line is empty. I searched for the answer in related topics and the solution was
if not line:
    print break
But it does not work.
If there are lines after an empty line (or your text editor inserted an empty line at the end of the file), that line is not actually empty. It contains a newline character and/or carriage return.
You need to strip it off:
with open('m.txt') as f:
    for line in f:
        if not line.strip():
            break
        first, second = line.split(':')
You can do this relatively easily by using an optional feature of the built-in iter() function: pass it a second argument (called sentinel in the docs), and iteration stops as soon as the callable returns that value.
Here's how to use it to make the line-processing loop terminate when readline() returns an empty string, which happens at the end of the file (a blank line comes back as '\n', so pass '\n' as the sentinel if the loop should stop there instead):
with open('m.txt') as fp:
    for line in iter(fp.readline, ''):
        first, second = line.rstrip().split(':')
        print(first, second)
Note the rstrip() which removes the newline at the end of each line read.
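For a quick check of how the sentinel behaves, the sketch below feeds readline from an in-memory io.StringIO object instead of m.txt. The sample text and the parse_pairs helper are assumptions for illustration. It passes '\n' as the sentinel so that the loop stops at the first blank line.

```python
import io


def parse_pairs(fp):
    """Split "key:value" lines into tuples until a blank line or EOF."""
    pairs = []
    # iter(callable, sentinel) calls fp.readline() repeatedly and stops
    # as soon as it returns exactly "\n", i.e. a blank line.
    for line in iter(fp.readline, "\n"):
        if line == "":  # readline() returns "" at end of file
            break
        first, second = line.rstrip().split(":")
        pairs.append((first, second))
    return pairs


# Hypothetical content: two valid lines, a blank line, then one more line.
sample = io.StringIO("Hello:World\nfoo:bar\n\nskipped:line\n")
pairs = parse_pairs(sample)
print(pairs)
```

The explicit check for "" is needed because, with '\n' as the sentinel, reaching the end of the file would otherwise never end the loop.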
Your code is fine, I can't put a picture in a comment. It all works, here:

Read in every line that starts with a certain character from a file

I am trying to read in every line in a file that starts with an 'X:'. I don't want to read the 'X:' itself just the rest of the line that follows.
with open("hnr1.abc","r") as file: f = file.read()
id = []
for line in f:
    if line.startswith("X:"):
        id.append(f.line[2:])
print(id)
It doesn't have any errors but it doesn't print anything out.
Try this:
with open("hnr1.abc","r") as fi:
    id = []
    for ln in fi:
        if ln.startswith("X:"):
            id.append(ln[2:])
print(id)
Don't use names like file or line.
Note that append just uses the loop variable, not an attribute of the file (ln[2:], not f.line[2:]).
Because the file was pre-read into memory with read(), the for loop was iterating over the data character by character, not line by line.
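The difference described above can be reproduced without a file on disk. In this sketch, io.StringIO stands in for hnr1.abc and the sample content is made up for illustration.

```python
import io

sample_text = "X:1\nT:Some title\nX:2\n"  # hypothetical file content

# Iterating over a str (what file.read() returns) yields single characters,
# so startswith("X:") can never be true for any item.
from_string = [item[2:] for item in sample_text if item.startswith("X:")]

# Iterating over a file-like object yields whole lines.
from_file = [ln[2:].strip() for ln in io.StringIO(sample_text) if ln.startswith("X:")]

print(from_string, from_file)
```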
for line in f:
    search = line.split()
    if search and search[0].startswith("X:"):
        storagearray.extend(search)
That should give you a list of all the lines you want, but they'll be split into separate words. Also, you'll need to define storagearray (for example, storagearray = []) before the above block of code runs. It's an inelegant solution, as I'm a learner myself, but it should do the job!
Edit: if you want to output the lines, simply use Python's built-in print function:
print(storagearray)
Read every line in the file (for loop).
Select the lines that start with X:.
Slice the line from index 0 to keep the whole line, including the leading X: (ln[0:]); use ln[2:] to drop the prefix instead.
Print the lines that begin with X:.
for ln in input_file:
    if ln.startswith('X:'):
        X_ln = ln[0:]
        print(X_ln)

How do you read a specific line of a text file in Python?

I'm having trouble reading an entire specific line of a text file using Python. I currently have this:
load_profile = open('users/file.txt', "r")
read_it = load_profile.readline(1)
print(read_it)
Of course this will just read one byte of the first line, which is not what I want. I also tried Google but didn't find anything.
What are the conditions of this line? Is it at a certain index? Does it contain a certain string? Does it match a regex?
This code will match a single line from the file based on a string:
load_profile = open('users/file.txt', "r")
read_it = load_profile.read()
myLine = ""
for line in read_it.splitlines():
    if line == "This is the line I am looking for":
        myLine = line
        break
print(myLine)
And this will give you the first line of the file (there are several other ways to do this as well):
load_profile = open('users/file.txt', "r")
read_it = load_profile.read().splitlines()[0]
print(read_it)
Or:
load_profile = open('users/file.txt', "r")
read_it = load_profile.readline()
print(read_it)
Check out the Python File Objects docs:
file.readline([size])
    Read one entire line from the file. A trailing newline character is kept in the string (but may be absent when a file ends with an incomplete line). If the size argument is present and non-negative, it is a maximum byte count (including the trailing newline) and an incomplete line may be returned. When size is not 0, an empty string is returned only when EOF is encountered immediately.
    Note: Unlike stdio's fgets(), the returned string contains null characters ('\0') if they occurred in the input.
file.readlines([sizehint])
    Read until EOF using readline() and return a list containing the lines thus read. If the optional sizehint argument is present, instead of reading up to EOF, whole lines totalling approximately sizehint bytes (possibly after rounding up to an internal buffer size) are read. Objects implementing a file-like interface may choose to ignore sizehint if it cannot be implemented, or cannot be implemented efficiently.
Edit:
Answer to your comment Noah:
load_profile = open('users/file.txt', "r")
read_it = load_profile.read()
myLines = []
for line in read_it.splitlines():
    # if line.startswith("Start of line..."):
    # if line.endswith("...line End."):
    # if line.find("SUBSTRING") > -1:
    if line == "This is the line I am looking for":
        myLines.append(line)
print(myLines)
You can use Python's inbuilt module linecache
import linecache
line = linecache.getline(filepath,linenumber)
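As a runnable sketch of the linecache approach above: the temporary file and its contents are assumptions made for the demonstration. Note that linecache.getline counts lines from 1 and returns an empty string if the line does not exist.

```python
import linecache
import os
import tempfile

# Hypothetical three-line file created just for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")

second_line = linecache.getline(tmp.name, 2)    # line numbers start at 1
missing_line = linecache.getline(tmp.name, 99)  # out of range gives ""

print(repr(second_line), repr(missing_line))

linecache.clearcache()  # drop the cached copy before deleting the file
os.remove(tmp.name)
```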
load_profile.readline(1)
specifically caps the read at 1 byte. It doesn't mean 1 line. Try:
read_it = load_profile.readline()
def readline_number_x(file,x):
    for index,line in enumerate(iter(file)):
        if index+1 == x: return line
    return None
f = open('filename')
x = 3
line_number_x = readline_number_x(f,x) #This will return the third line
