Hopefully this is an easy fix. I'm trying to edit one field of a file we use for import; however, when I run the following code it leaves the file blank and 0 kB. Could anyone advise what I'm doing wrong?
import re  # import regex so we can use the commands

name = raw_input("Enter filename:")  # prompt for file name, press enter to just open test.nhi
if len(name) < 1: name = "test.nhi"
count = 0
fhand = open(name, 'w+')
for line in fhand:
    words = line.split(',')  # obtain individual words by using split
    words[34] = re.sub(r'\D', "", words[34])  # remove non-numeric chars from string using regex
    if len(words[34]) < 1: continue  # If the 34th field is blank go to the next line
    elif len(words[34]) == 2: "{0:0>3}".format([words[34]])  # Add leading zeroes depending on the length of the field
    elif len(words[34]) == 3: "{0:0>2}".format([words[34]])
    elif len(words[34]) == 4: "{0:0>1}".format([words[34]])
    fhand.write(words)  # write the line
fhand.close()  # Close the file after the loop ends
I have taken the text below in 'a.txt' as input and modified your code. Please check if it works for you.
# Initial content of a.txt
This,program,is,Java,program
This,program,is,12Python,programs
Modified code as follows:
import re

# Reading from file and updating values
fhand = open('a.txt', 'r')
tmp_list = []
for line in fhand:
    # Split line using ','
    words = line.split(',')
    # Remove non-numeric chars from the 4th field (words[3]) using regex
    words[3] = re.sub(r'\D', "", words[3])
    # Update the 4th field
    if len(words[3]) < 1:
        # Removed the continue from here; we still need to reconstruct
        # the original line and write it to the file
        print("Field empty. Continue...")
    elif len(words[3]) >= 1 and len(words[3]) < 5:
        # format won't add leading zeros here; zfill(5) adds the required
        # number of leading zeros depending on the length of words[3]
        words[3] = words[3].zfill(5)
    # After updating the value in the words list, rebuild the line from it
    tmp_str = ",".join(words)
    tmp_list.append(tmp_str)
fhand.close()

# Writing to the same file
whand = open("a.txt", 'w')
for val in tmp_list:
    whand.write(val)
whand.close()
File content after running code
This,program,is,,program
This,program,is,00012,programs
The file mode 'w+' truncates your file to 0 bytes, so you'll only be able to read lines that you've written.
Look at Confused by python file mode "w+" for more information.
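As a quick illustration of why the file ends up empty (a minimal sketch, reusing the test.nhi name from the question):

with open("test.nhi", "w+") as fhand:
    # the file is already empty at this point: 'w+' truncated it on open
    print(repr(fhand.read()))  # prints '' -- there is nothing left to iterate over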
An idea would be to read the whole file first, close it, and re-open it to write to it.
Not sure which OS you're on, but I think reading from and writing to the same file has undefined behaviour.
I guess internally the file object holds the position (try fhand.tell() to see where it is). You could probably adjust it back and forth as you went using fhand.seek(last_read_position), but really that's asking for trouble.
Also, I'm not sure how the script would ever end, as it would end up reading the stuff it had just written (in a sort of infinite loop).
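For what it's worth, a rough sketch of that position juggling (assuming 'r+' mode so the existing content survives; not something I'd recommend):

fhand = open("test.nhi", "r+")      # 'r+' reads and writes without truncating
line = fhand.readline()
last_read_position = fhand.tell()   # remember where the read finished
fhand.seek(0, 2)                    # jump to the end of the file to append
fhand.write("extra line\n")
fhand.seek(last_read_position)      # jump back before reading the next line
fhand.close()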
Best bet is to read the entire file first:
with open(name, 'r') as f:
lines = f.read().splitlines()
with open(name, 'w') as f:
for l in lines:
# ....
f.write(something)
For 'Printing to a file via Python' you can use:
ofile = open("test.txt", "w")  # open for writing, not reading
print("Some text...", file=ofile)
I'm creating a program that allows users to remove users, which works; however, when it removes a user at the end of the file, a newline character is not removed, which breaks the program. The following is part of the function that removes the user.
with open("users.txt", "r") as input:
with open("temp.txt", "w") as output: # Iterate all lines from file
for line in input:
if not line.strip("\n").startswith(enteredUsername):
# If line doesn't start with the username entered, then write it in temp file.
output.write(line)
os.replace('temp.txt', 'users.txt') # Replace file with original name
This creates a temporary file, and anything which doesn't start with the given string is written to it. The name is then swapped back to "users.txt". I've looked at other threads on Stack Overflow as well as other websites and nothing has worked; is there anything I should change about this solution?
EDIT --------------------
I managed to fix this with the following code (and thanks to everyone for your suggestions!):
count = 1  # Keeps count of the number of lines
removed = False  # Initially nothing has been removed

with open(r"users.txt", 'r') as fp:
    x = len(fp.readlines())  # Finds the number of lines in the file

if login(enteredUsername, enteredPassword) == True:  # Checks if the username and password combination is correct
    with open("users.txt", "r") as my_input:
        with open("temp.txt", "w") as output:  # Iterate all lines from file
            for line in my_input:
                if not line.strip("\n").startswith(enteredUsername):  # If line doesn't start with the username entered, then write it in temp file.
                    if count == x - 1 and removed == False:  # If nothing has been removed yet, get rid of the newline character
                        output.write(line[:-1])
                    else:
                        output.write(line)
                else:
                    removed = True  # This only becomes True if the previous check fails, i.e. something has been 'removed'
                count += 1  # Increments the count for every line
    os.replace('temp.txt', 'users.txt')  # Replace file with original name
with open("users.txt", "r") as input:
with open("temp.txt", "w") as output: # Iterate all lines from file
for line in input:
if not line.strip("\n").startswith(enteredUsername):
# If line doesn't start with the username entered, then write it in temp file.
# output.write(line) # <-- you are writing the line that still have the new line character
output.write(line.strip("\n")) # try this?
os.replace('temp.txt', 'users.txt') # Replace file with original name
Also, as a general tip, I would recommend not using "input" as a variable name, since it shadows the built-in input function in Python. Just letting you know, as it can potentially cause some whacky errors that are a pain to debug (speaking from personal experience here!)
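For example (a contrived sketch of the kind of error that shadowing the built-in can cause):

input = open("users.txt", "r")         # the name now refers to a file object
# ...much later in the program...
# name = input("Enter a username: ")   # TypeError: '_io.TextIOWrapper' object is not callable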
================================================================
EDIT:
I realize that doing this will likely leave no newline characters after the lines you write, which will put all usernames on the same line. You will need to write a newline character after every name you write down except for the last one; the newline after the last one is the trailing newline character that is causing your problem.
with open("users.txt", "r") as my_input:
with open("temp.txt", "w") as output: # Iterate all lines from file
for line in my_input:
if not line.strip("\n").startswith(enteredUsername):
# If line doesn't start with the username entered, then write it in temp file.
output.write(line)
os.replace('temp.txt', 'users.txt') # Replace file with original name
# https://stackoverflow.com/questions/18857352/remove-very-last-character-in-file
# remove last new line character from the file
with open("users.txt", 'rb+') as filehandle:
filehandle.seek(-1, os.SEEK_END)
filehandle.truncate()
This is admittedly a hacky way to go about it, but it should work! The last section removes the last character of the file, which is a newline character.
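Alternatively, here's a sketch of the idea described in the edit above, writing a newline between every kept name and none at the end (this assumes enteredUsername is defined as in your code):

import os

with open("users.txt", "r") as my_input:
    # keep every line that doesn't start with the entered username, minus its newline
    kept = [line.strip("\n") for line in my_input
            if not line.strip("\n").startswith(enteredUsername)]

with open("temp.txt", "w") as output:
    output.write("\n".join(kept))   # newlines between names, none at the end

os.replace("temp.txt", "users.txt")  # Replace file with original name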
You don't need to use a temporary file for this.
def remove_user(filename, enteredUsername):
    last = None
    with open(filename, 'r+') as users:
        lines = users.readlines()
        users.seek(0)
        for line in lines:
            if not line.startswith(enteredUsername):
                users.write(line)
                last = line
        # ensure that the last line is newline terminated
        if last and last[-1] != '\n':
            users.write('\n')
        users.truncate()
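Usage would then be something along these lines (with enteredUsername coming from your existing prompt):

remove_user("users.txt", enteredUsername)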
I have a text file. I want to count the last names ending with "E". This is the code I have so far. I know it is not correct, but I am stuck and do not know what else to do to make it work.
def ans9(file):
    infile = open(file)
    contents = infile.read().split()
    infile.close()
    return len(contents)

ans9.reverse()
for word in ans9:
    print(word[e])
From what I can see in the file, the name and the float number are delimited by a tab. What you want to do is open the file and read it line by line. Then go through those lines (one line at a time), split each on the tab character (\t), take the first element of that list (the name), and then the last character of that name. In code, it would look like this:
with open(file, 'r') as f:
    lines = f.readlines()

cnt = 0
for i in lines:
    if i.split('\t')[0][-1] == 'e' or i.split('\t')[0][-1] == 'E':
        cnt += 1
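To tie that back into your ans9 function, here is a sketch that returns the count instead of the file length (still assuming the name is the first tab-separated field on each line):

def ans9(file):
    cnt = 0
    with open(file) as f:
        for line in f:
            name = line.split('\t')[0]
            # count names whose last letter is 'e' or 'E'
            if name and name[-1] in ('e', 'E'):
                cnt += 1
    return cnt

print(ans9("names.txt"))  # "names.txt" is a placeholder filename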
New to coding and trying to figure out how to fix a broken CSV file so that I can work with it properly.
The file has been exported from a case management system and contains fields for username, casenr, time spent, notes and date.
The problem is that occasional notes have newlines in them, and when exporting the CSV the tooling does not add quotation marks to mark the note as a single string within the field.
See the example below:
user;case;hours;note;date;
tnn;123;4;solved problem;2017-11-27;
tnn;124;2;random comment;2017-11-27;
tnn;125;3;I am writing a comment
that contains new lines
without quotation marks;2017-11-28;
HJL;129;8;trying to concatenate lines to re form the broken csv;2017-11-29;
I would like to concatenate lines 3, 4 and 5 to show the following:
tnn;125;3;I am writing a comment that contains new lines without quotation marks;2017-11-28;
Since every line starts with a username (always 3 letters) I thought I would be able to iterate the lines to find which lines do not start with a username and concatenate that with the previous line.
It is not really working as expected though.
This is what I have got so far:
import re

with open('Rapp.txt', 'r') as f:
    for line in f:
        previous = line  # keep current line in variable to join next line
        if not re.match(r'^[A-Za-z]{3}', line):  # regex to match 3 letters
            print(previous.join(line))
The script shows no output and just finishes silently. Any thoughts?
I think I would go a slightly different way:
import re

all_the_data = ""
with open('Rapp.txt', 'r') as f:
    for line in f:
        if not re.search(r"\d{4}-\d{1,2}-\d{1,2};\n", line):
            line = re.sub("\n", "", line)
        all_the_data = "".join([all_the_data, line])
print(all_the_data)
There are several ways to do this, each with pros and cons, but I think this keeps it simple.
Loop the file as you have done, and if the line doesn't end in a date and ;, take off the newline and stuff the line into all_the_data. That way you don't have to play with looking back 'up' the file. Again, lots of ways to do this. If you would rather use the logic of "starts with 3 letters and a ;" and looking back, this works:
import re

all_the_data = ""
with open('Rapp.txt', 'r') as f:
    for line in f:
        if not re.search(r"^[A-Za-z]{3};", line):
            all_the_data = re.sub("\n$", "", all_the_data)
        all_the_data = "".join([all_the_data, line])

print("results:")
print(all_the_data)
Pretty much what was asked for. The logic being: if the current line doesn't start right, take the previous line's newline out of all_the_data.
If you need help playing with the regex itself, this site is great: http://regex101.com
The regex in your code matches all the lines (strings) in the txt file (it finds a valid match to the pattern in every line), so the if condition is never true and hence nothing prints.
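You can see this by testing the pattern against one of the continuation lines from your sample data:

import re

# Continuation lines also begin with three letters, so the pattern matches
# and `if not re.match(...)` is never entered.
print(bool(re.match(r'^[A-Za-z]{3}', "that contains new lines")))            # True
print(bool(re.match(r'^[A-Za-z]{3}', "tnn;125;3;I am writing a comment")))   # True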
with open('./Rapp.txt', 'r') as f:
    join_words = []
    for line in f:
        line = line.strip()
        if len(line) > 3 and ";" in line[0:4] and len(join_words) > 0:
            print(';'.join(join_words))
            join_words = []
            join_words.append(line)
        else:
            join_words.append(line)
    print(";".join(join_words))
I've tried not to use regex here, to keep it a little clearer if possible. But regex is a better option.
A simple way would be to use a generator that acts as a filter on the original file. That filter would concatenate a line to the previous one if it does not have a semicolon (;) in its 4th column. The code could be:
def preprocess(fd):
    previous = next(fd)
    for line in fd:
        if line[3] == ';':
            yield previous
            previous = line
        else:
            previous = previous.strip() + " " + line
    yield previous  # don't forget last line!
You could then use:
import csv

with open('test.txt') as fd:
    rd = csv.DictReader(preprocess(fd), delimiter=';')  # the sample data is ';' delimited
    for row in rd:
        ...
The trick here is that the csv module only requires an object that returns a line each time the next function is applied to it, so a generator is appropriate.
But this is only a workaround, and the correct way would be for the previous step to directly produce a correct CSV file.
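For the record, a sketch of what that "correct" export step could look like with the csv module, which quotes fields containing newlines automatically (the row data here is made up to match the sample):

import csv

rows = [
    ["tnn", "125", "3", "I am writing a comment\nthat contains new lines\nwithout quotation marks", "2017-11-28"],
]

with open("fixed.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter=";", quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)  # the note field is wrapped in quotes because it contains newlines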
I am new to Python.
Scenario:
Search for the pattern apple=gravity in a file: look for apple, and if it exists, fetch the corresponding value for apple; if the value is gravity (i.e. the line is apple=gravity) then the case passes.
File structure (test.txt):
car=stop
green=go
apple=gravity
Please provide some suggestions as to how I can search for the value of a key in a file using Python.
Sample:
f = open('test.txt', 'r')
wordCheck = "apple=gravity";
for line in f:
    if 'wordCheck' == line:
        print('found')
    else:
        print('notfound')
        break
Split your line with =.
Check if apple is present in the first index! If it is, print the second index!
Note:
While reading lines from a file, the '\n' character will be present. To get your lines without \n, read the content from the file and use splitlines()!
To make it clean, strip the spaces from the beginning and end of each line to avoid glitches caused by leading and trailing spaces!
That is,
f = open('test.txt', 'r')
for line in map(str.strip, f.read().splitlines()):
    line = line.split('=')
    if 'apple' == line[0]:
        print(line[1])
    else:
        print('notfound')
Output:
notfound
notfound
gravity
Hope it helps!
Iterating through the file directly as you are doing is just fine, and considered more 'Pythonic' than readlines() (or indeed read().splitlines()).
Here, I strip the newline from each line and then split by the = to get the two halves.
Then, I test for the check word, and if present print out the other half of the line.
Note also that I have used the with context manager to open the file. This makes sure that the file is closed, even if an exception occurs.
with open('test.txt', 'r') as f:
    wordcheck = "apple"
    for line in f:
        key, val = line.strip().split('=')
        if wordcheck == key:
            print(val)
        else:
            print('notfound')
I am trying to parse some text files and need to extract blocks of text. Specifically, the line that starts with "1:" and the 19 lines after it. The "1:" does not start on the same row in each file, and there is only one instance of "1:". I would prefer to save the block of text and export it to a separate file. In addition, I need to preserve the formatting of the text in the original file.
Needless to say I am new to Python. I generally work with R but these files are not really compatible with R and I have about 100 to process. Any information would be appreciated.
The code that I have so far is:
tmp = open(files[0], "r")
lines = tmp.readlines()
tmp.close()

num = 0
a = 0
for line in lines:
    num += 1
    if "1:" in line:
        a = num
        break
a = num is the line number for the block of text I want. I then want to save the next 19 lines to another file, but can't figure out how to do this. Any help would be appreciated.
Here is one option. Read all the lines from your file. Iterate till you find your line and return the next 19 lines. You would need to handle situations where your file doesn't contain an additional 19 lines.
def get_block(fname):  # wrapped in a function so that `return` is valid
    fh = open(fname, 'r')
    all_lines = fh.readlines()
    fh.close()
    for count, line in enumerate(all_lines):
        if "1:" in line:
            return all_lines[count+1:count+20]
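A possible follow-up, using the function above (get_block and block_out.txt are just placeholder names) to export the block to a separate file as you described:

block = get_block('yourfile.txt')
if block:  # None if "1:" was never found
    with open('block_out.txt', 'w') as out:
        out.writelines(block)  # lines keep their original newlines and formatting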
Could be done in a one-liner...
open(files[0]).read().split('1:', 1)[1].split('\n')[:19]
or more readable
txt = open(files[0]).read() # read the file into a big string
before, after = txt.split('1:', 1) # split the file on the first "1:"
after_lines = after.split('\n') # create lines from the after text
lines_to_save = after_lines[:19] # grab the first 19 lines after "1:"
then join the lines with a newline (and add a newline to the end) before writing it to a new file:
out_text = "1:" # add back "1:"
out_text += "\n".join(lines_to_save) # add all 19 lines with newlines between them
out_text += "\n" # add a newline at the end
open("outputfile.txt", "w").write(out_text)
To comply with best practice for reading and writing files, you should also be using the with statement to ensure that the file handles are closed as soon as possible. You can create convenience functions for it:
def read_file(fname):
    "Returns contents of file with name `fname`."
    with open(fname) as fp:
        return fp.read()

def write_file(fname, txt):
    "Writes `txt` to a file named `fname`."
    with open(fname, 'w') as fp:
        fp.write(txt)
then you can replace the first line above with:
txt = read_file(files[0])
and the last line with:
write_file("outputfile.txt", out_text)
I always prefer to read the file into memory first, but sometimes that's not possible. If you want to use iteration then this will work:
def process_file(fname):
    with open(fname) as fp:
        for line in fp:
            if line.startswith('1:'):
                break
        else:
            return  # no '1:' in file
        yield line  # yield line containing '1:'
        for i, line in enumerate(fp):
            if i >= 19:
                break
            yield line

if __name__ == "__main__":
    with open('output.txt', 'w') as fp:
        for line in process_file('intxt.txt'):
            fp.write(line)
It's using the else: clause on a for-loop, which you don't see very often anymore, but which was created for just this purpose (the else clause is executed if the for-loop doesn't break).
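A tiny standalone example of that behaviour, if it helps:

# The else: block runs only when the for-loop finishes without hitting break.
for line in ["alpha", "beta", "gamma"]:
    if line.startswith('1:'):
        break
else:
    print("no line starting with '1:' was found")   # this prints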