Reading a .txt file in Python

I have a problem with some code in Python. I want to read a .txt file. I use this code:
f = open('test.txt', 'r') # We need to re-open the file
data = f.read()
print(data)
I would like to read ONLY the first line from this .txt file. I use
f = open('test.txt', 'r') # We need to re-open the file
data = f.readline(1)
print(data)
But only the first letter of the line is shown on screen.
Could you help me read all the characters of the line? (I mean, read the whole line of the .txt file.)

with open("file.txt") as f:
print(f.readline())
This will open the file using a with context block (which will close the file automatically when we are done with it) and read the first line. It is the same as:
f = open("file.txt")
print(f.readline())
f.close()
Your attempt with f.readline(1) won't work because the argument is the maximum number of characters to read from the line, so it will only return the first character.
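For illustration, here is a minimal sketch of what that size argument does, assuming a hypothetical test.txt whose first line is hello world:
with open('test.txt') as f:
    print(f.readline(1))   # "h"            - at most 1 character of the line
    print(f.readline())    # "ello world\n" - the rest of that same line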
Second method:
with open("file.txt") as f:
print(f.readlines()[0])
Or you could also do the above which will get a list of lines and print only the first line.
To read the fifth line, use
with open("file.txt") as f:
print(f.readlines()[4])
Or:
with open("file.txt") as f:
lines = []
lines += f.readline()
lines += f.readline()
lines += f.readline()
lines += f.readline()
lines += f.readline()
print(lines[-1])
The -1 represents the last item of the list
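As a side note (my own sketch, not part of the answer above), itertools.islice can grab the nth line lazily without repeating readline() or reading the whole file; the file name is the same example:
from itertools import islice

with open("file.txt") as f:
    # islice(f, 4, 5) skips the first four lines and yields only the fifth;
    # next(..., None) returns None if the file has fewer than five lines.
    fifth_line = next(islice(f, 4, 5), None)
print(fifth_line)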
Learn more:
with statement
files in python
readline method

Your first try is almost there; you should have done the following:
f = open('my_file.txt', 'r')
line = f.readline()
print(line)
f.close()
A safer approach to reading the file is:
with open('my_file.txt', 'r') as f:
    print(f.readline())
Both ways will print only the first line.
Your error was that you passed 1 to readline, which means you want to read a size of 1, i.e. only a single character. Please refer to https://www.w3schools.com/python/ref_file_readline.asp

I tried this after your suggestions and it works:
f = open('test.txt', 'r')
data = f.readlines()[1]
print(data)

Use with open(...) instead:
with open("test.txt") as file:
line = file.readline()
print(line)

Call f.readline() without arguments.
It will return the first line as a string and move the cursor to the second line.
The next time you call f.readline() it will return the second line and move the cursor to the next one, and so on.
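For example, here is a minimal sketch of that cursor behaviour, assuming a hypothetical lines.txt with three lines (first, second, third):
with open("lines.txt") as f:
    print(f.readline())   # "first\n"  - cursor now at the start of line 2
    print(f.readline())   # "second\n" - cursor now at the start of line 3
    print(f.readline())   # "third\n"  (or "third" if there is no trailing newline)
    print(f.readline())   # ""         - an empty string means end of file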

Related

Adding a comma to end of first row of csv files within a directory using python

I've got some code that lets me open all CSV files in a directory and run through them removing the top 2 lines of each file. Ideally, during this process, I would also like it to add a single comma at the end of the new first line (what would originally have been line 3).
Another possible approach could be to remove the trailing commas on all other rows that appear in each of the CSVs.
Any thoughts or approaches would be gratefully received.
import glob
path = r'P:\pytest'   # raw string so the backslash is not treated as an escape
for filename in glob.iglob(path + '/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
    o = open(filename, 'w')
    for line in lines:
        o.write(line + '\n')
    o.close()
Adding a counter in there can solve this:
import glob
path = r'C:/Users/dsqallihoussaini/Desktop/dev_projects/stack_over_flow'
for filename in glob.iglob(path + '/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
    o = open(filename, 'w')
    counter = 0
    for line in lines:
        counter = counter + 1
        if counter == 1:
            o.write(line + ',\n')
        else:
            o.write(line + '\n')
    o.close()
One possible problem with your code is that you are reading the whole file into memory, which might be fine. If you are reading larger files, then you want to process the file line by line.
The easiest way to do that is to use the fileinput module: https://docs.python.org/3/library/fileinput.html
Something like the following should work:
#!/usr/bin/env python3
import glob
import fileinput

# inplace makes a backup of the file, then any output to stdout is written
# to the current file.
# Change the glob pattern below; it is just an example.
#
# Iterate through each file in the glob.iglob() results
with fileinput.input(files=glob.iglob('*.csv'), inplace=True) as f:
    for line in f:  # Iterate over each line of the current file.
        if f.filelineno() > 2:  # Skip the first two lines
            # Note: 'line' has the newline in it.
            # Insert the comma if line 3 of the file, otherwise output the original line
            print(line[:-1] + ',') if f.filelineno() == 3 else print(line, end="")
I've added an encoding as well, as mine was throwing an error, but the encoding fixed that up nicely:
import glob
path = r'C:/whateveryourfolderis'
for filename in glob.iglob(path + '/*.csv'):
    with open(filename, 'r', encoding='utf-8') as f:
        lines = f.read().split("\n")
        #print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
    o = open(filename, 'w', encoding='utf-8')
    counter = 0
    for line in lines:
        counter = counter + 1
        if counter == 1:
            o.write(line + ',\n')
        else:
            o.write(line + '\n')
    o.close()

Parsing Logs with Regular Expressions Python

I'm a coding and Python lightweight :)
I've got to iterate through some log files and pick out the ones that say ERROR. Boom, done, got that. What I've got to do next is figure out how to grab the following 10 lines containing the details of the error. It's got to be some combo of an if statement and a for/while loop, I presume. Any help would be appreciated.
import os
import re
# Regex used to match
line_regex = re.compile(r"ERROR")
# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("NodeOut.log")
# Overwrites the file, ensure we're starting out with a blank file
#TODO Append this later
with open(output_filename, "w") as out_file:
    out_file.write("")
# Open output file in 'append' mode
with open(output_filename, "a") as out_file:
    # Open input file in 'read' mode
    with open("MXNode1.stdout", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If log line matches our regex, print remove later, and write > file
            if (line_regex.search(line)):
                # for i in range():
                print(line)
                out_file.write(line)
There is no need for a regex to do this; you can just use the in operator ("ERROR" in line).
Also, to clear the contents of the file without opening it in w mode, you can simply place the cursor at the beginning of the file and truncate.
import os

output_filename = os.path.normpath("NodeOut.log")
with open(output_filename, 'a') as out_file:
    out_file.seek(0, 0)
    out_file.truncate(0)
    with open("MXNode1.stdout", 'r') as in_file:
        line = in_file.readline()
        while line:
            if "ERROR" in line:
                out_file.write(line)
                for i in range(10):
                    out_file.write(in_file.readline())
            line = in_file.readline()
We use a while loop to read lines one by one using in_file.readline(). The advantage is that, when a match is found, you can easily read the following lines with a for loop.
See the doc:
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
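As a small aside (my own sketch, not part of the answer above), the same read-until-empty-string idea can be written with iter() and an empty-string sentinel; the file name follows the question's example:
with open("MXNode1.stdout", "r") as in_file:
    # iter(in_file.readline, '') keeps calling readline() until it returns ''
    # (end of file), which is exactly the condition the while loop checks.
    for line in iter(in_file.readline, ''):
        if "ERROR" in line:
            print(line, end="")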
Assuming you would only want to always grab the next 10 lines, then you could do something similar to:
with open("MXNode1.stdout", "r") as in_file:
# Loop over each log line
lineCount = 11
for line in in_file:
# If log line matches our regex, print remove later, and write > file
if (line_regex.search(line)):
# for i in range():
print(line)
lineCount = 0
if (lineCount < 11):
lineCount += 1
out_file.write(line)
The second if statement ensures you always grab the matching line itself. The magic number 11 is there so that you grab the next 10 lines after the initial line the ERROR was found on.
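A more compact variant (my own sketch, not from the answer above) uses itertools.islice to copy the matching line plus the next 10 lines in one go; the file names follow the question's example:
from itertools import islice

with open("MXNode1.stdout") as in_file, open("NodeOut.log", "w") as out_file:
    for line in in_file:
        if "ERROR" in line:
            out_file.write(line)                       # the matching line itself
            out_file.writelines(islice(in_file, 10))   # plus up to the next 10 lines
Note that the 10 copied lines are consumed from the iterator, so they are not themselves re-checked for ERROR.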

Why is my if statement not working and just outputting the else, everything works till there? [duplicate]

In Python, calling e.g. temp = open(filename,'r').readlines() results in a list in which each element is a line from the file. However, these strings have a newline character at the end, which I don't want.
How can I get the data without the newlines?
You can read the whole file and split lines using str.splitlines:
temp = file.read().splitlines()
Or you can strip the newline by hand:
temp = [line[:-1] for line in file]
Note: this last solution only works if the file ends with a newline, otherwise the last line will lose a character.
This assumption is true in most cases (especially for files created by text editors, which often do add an ending newline anyway).
If you want to avoid this you can add a newline at the end of file:
with open(the_file, 'rb+') as f:
    f.seek(-1, 2)  # go to the last byte (end-relative seeks need binary mode in Python 3)
    if f.read(1) != b'\n':
        # add the missing newline if not already present
        f.write(b'\n')
        f.flush()
with open(the_file, 'r') as f:
    lines = [line[:-1] for line in f]
Or a simpler alternative is to strip the newline instead:
[line.rstrip('\n') for line in file]
Or even, although pretty unreadable:
[line[:-(line[-1] == '\n') or len(line)+1] for line in file]
Which exploits the fact that the return value of or isn't a boolean, but the object that was evaluated true or false.
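A quick interactive illustration of that behaviour of or (the values are arbitrary examples):
>>> 'x' or 42          # a truthy left operand is returned unchanged
'x'
>>> 0 or 42            # a falsy left operand makes `or` return the right operand
42
>>> -(False) or 10     # -False is 0 (falsy), so the expression falls back to 10
10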
The readlines method is actually equivalent to:
def readlines(self):
    lines = []
    for line in iter(self.readline, ''):
        lines.append(line)
    return lines

# or equivalently

def readlines(self):
    lines = []
    while True:
        line = self.readline()
        if not line:
            break
        lines.append(line)
    return lines
Since readline() keeps the newline, readlines() keeps it as well.
Note: for symmetry to readlines() the writelines() method does not add ending newlines, so f2.writelines(f.readlines()) produces an exact copy of f in f2.
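For instance, a minimal sketch of that round trip (the file names f.txt and f2.txt are assumed examples):
# readlines() keeps each trailing '\n' and writelines() adds none,
# so f2.txt ends up as an exact copy of the text in f.txt.
with open("f.txt") as f, open("f2.txt", "w") as f2:
    f2.writelines(f.readlines())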
temp = open(filename,'r').read().split('\n')
Read the file one row at a time and remove unwanted characters from the end of the string with str.rstrip(chars):
with open(filename, 'r') as fileobj:
    for row in fileobj:
        print(row.rstrip('\n'))
See also str.strip([chars]) and str.lstrip([chars]).
I think this is the best option.
temp = [line.strip() for line in file.readlines()]
temp = open(filename,'r').read().splitlines()
My preferred one-liner -- if you don't count from pathlib import Path :)
lines = Path(filename).read_text().splitlines()
This auto-closes the file, so there is no need for with open().
Added in Python 3.5.
https://docs.python.org/3/library/pathlib.html#pathlib.Path.read_text
Try this:
u=open("url.txt","r")
url=u.read().replace('\n','')
print(url)
To get rid of trailing end-of-line (\n) characters and of empty list values (''), try:
f = open(path_sample, "r")
lines = [line.rstrip('\n') for line in f.readlines() if line.strip() != '']
You can read the file as a list easily using a list comprehension
with open("foo.txt", 'r') as f:
lst = [row.rstrip('\n') for row in f]
my_file = open("first_file.txt", "r")
for line in my_file.readlines():
if line[-1:] == "\n":
print(line[:-1])
else:
print(line)
my_file.close()
This script will take the lines from file and save every line, without its newline and with ,0 appended, into file2.
file = open("temp.txt", "r+")
file2 = open("res.txt", "w+")
for line in file:
    file2.writelines(f"{line.splitlines()[0]},0\n")
file2.close()
If you look at line, its value is something like 'data\n', so we use splitlines() to turn it into a list and [0] to pick out just the word 'data'.
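A quick interactive illustration of what splitlines()[0] does to one such line (the value 'data\n' is just the example used above):
>>> "data\n".splitlines()      # drops the trailing newline and returns a list of lines
['data']
>>> "data\n".splitlines()[0]   # the line's text without the newline
'data'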
import csv

with open(filename) as f:
    csvreader = csv.reader(f)
    for line in csvreader:
        print(line[0])

How to read one particular line from .txt file in python?

I know I can read the line by line with
dataFile = open('myfile.txt', 'r')
firstLine = dataFile.readline()
secondLine = dataFile.readline()
...
I also know how to read all the lines in one go
dataFile = open('myfile.txt', 'r')
allLines = dataFile.read()
But my question is how to read one particular line from .txt file?
I wish to read that line by its index.
e.g. I want the 4th line, I expect something like
dataFile = open('myfile.txt', 'r')
allLines = dataFile.readLineByIndex(3)
Skip 3 lines:
with open('myfile.txt', 'r') as dataFile:
    for i in range(3):
        next(dataFile)
    the_4th_line = next(dataFile)
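Another option (my own sketch, not from the original answers) is to pair enumerate with an early break, which makes the skipping explicit:
with open('myfile.txt', 'r') as dataFile:
    the_4th_line = None
    for i, line in enumerate(dataFile):
        if i == 3:            # zero-based index, so this is the 4th line
            the_4th_line = line
            break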
Or use linecache.getline:
import linecache
the_4th_line = linecache.getline('myfile.txt', 4)
From another answer:
Use Python Standard Library's linecache module:
line = linecache.getline(thefilename, 33)
should do exactly what you want. You don't even need to open the file -- linecache does it all for you!
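A small self-contained usage sketch of linecache (the file name is an example); note that getline() uses 1-based line numbers:
import linecache

line = linecache.getline('myfile.txt', 4)   # line numbers start at 1, so this is the 4th line
linecache.clearcache()                      # optionally free the cached file contents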
You can do exactly as you wanted with this:
DataFile = open('mytext.txt', 'r')
content = DataFile.readlines()
oneline = content[5]
DataFile.close()
You could take this down to three lines by removing oneline = content[5] and using content[5] directly without creating another variable (print(content[5]), for example). I did it this way just to make it clear that content must be indexed like a list to read a single line.
