In case you didn't catch it in the title, this is Python 3.6
I'm running into an issue where I was able to write to a file, and now I cannot. The crazy thing is that this was working fine earlier.
I'm trying to either append to my file if it exists, or write to a new file if it doesn't exist.
main_area_text represents the div tag text below:
<div id="1131607" align="center" style="width:970px;padding:0px;margin:0px;overflow:visible;text-align:center"></div>
and below is my code:
import os
main_area_text = "..."  # this is equal to the HTML text above
# I've verified this with a watch during debugging.
# But this doesn't actually matter, because you can put
# anything in here and it still doesn't work.
html_file_path = os.getcwd() + "\\data\\myfile.html"
if os.path.isfile(html_file_path):
    print("File exists!")
    actual_file = open(html_file_path, "a")
    actual_file.write(main_area_text)
else:
    print("File does not exist!")
    actual_file = open(html_file_path, "w")
    actual_file.write(main_area_text)
Earlier, in its working state, I could create/write/append to .html and .txt files.
NOTE: If the file doesn't exist, the program still creates a new file... It's just empty.
I'm somewhat new to the Python language, so I realize it's very possible that I could be overlooking something simple. (It's actually why I'm writing this code: to familiarize myself with Python.)
Thanks in advance!
Since you're not closing your file, the data isn't being flushed to disk. Instead try this:
import os
main_area_text = "stuff"
html_file_path = os.getcwd() + "\\data\\myfile.html"
if os.path.isfile(html_file_path):
    print("File exists!")
    with open(html_file_path, "a") as f:
        f.write(main_area_text)
else:
    print("File does not exist!")
    with open(html_file_path, "w") as f:
        f.write(main_area_text)
The Python with statement will handle flushing the data to disk and closing the file automatically. It's generally good practice to use with when handling files.
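As a side note on the existence check: both "a" and "w" create the file when it is missing, so the isfile() branch is only needed for the printed message. A minimal sketch, assuming a hypothetical helper named append_text and assuming it is acceptable to create the data folder when it does not exist yet:

```python
import os

def append_text(path, text):
    """Append text to path, creating the file (and its folder) if needed."""
    folder = os.path.dirname(path)
    if folder:
        # Assumption: we are allowed to create the missing folder.
        os.makedirs(folder, exist_ok=True)
    # "a" creates the file when it does not exist, so no isfile() check is needed.
    with open(path, "a") as f:
        f.write(text)
    # Leaving the with block flushes and closes the file.
```

Calling append_text twice on the same path leaves both pieces of text in the file, in order.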
I'm trying to create a web-scraping script in Python where I follow a bunch of links and insert them into a .txt file. However, I want to do this only if the website doesn't already exist in the file.
I have written this code to insert the given website link into the file, so far (not working):
def writeSite(site):
    file = open("websites.txt", 'a+')
    # print(site)
    if site in file.read():
        return
    file.write(site + "\n")
    file.close()
Thanks in advance.
You were pretty close, but because you open the file to append to it, it starts with the file pointer at the end. You need to seek to the start to read its contents again:
def writeSite(site):
    file = open("websites.txt", 'a+')
    file.seek(0)
    # print(site)
    if site in file.read():
        return
    file.write(site + "\n")
    file.close()
However, keep in mind that site in file.read() is very crude.
For example, imagine you already have 'http://somesite.com/page/' in the file but now you want to add 'http://somesite.com/' - the URL as a whole is not in the file, but your test will find it.
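The false positive is easy to reproduce without any file at all. A small sketch (the variable names are made up for illustration):

```python
# Contents of a hypothetical websites.txt, read as one string.
contents = "http://somesite.com/page/\n"
new_site = "http://somesite.com/"

# Substring test: matches because new_site is a prefix of the stored URL.
substring_hit = new_site in contents

# Whole-line test: compares against complete lines only.
line_hit = new_site in contents.splitlines()
```

Here substring_hit is True even though the URL was never stored, while line_hit is False.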
If you want to check whole lines (and be sure you deal with the file nicely), this would be better:
def writeSite(site):
    site += '\n'
    with open("websites.txt", 'a+') as f:
        f.seek(0)
        if site in f.readlines():
            return
        f.write(site)
It adds a newline to the name of the site to separate the URLs in the file and uses readlines to make use of that fact to check for the whole URL. Using with ensures the file always gets closed.
And since you want to read before writing anyway, you could use 'r+' as a mode, and skip the seek - but only if you can be sure the file already exists. I assume you chose 'a+' because that isn't the case.
(In case you worry that this changes the value of site: that's only true for the parameter inside the function. Whatever value you passed in from outside the function will remain unaffected.)
I have a python project for a GUI to be used with the slurm queuing manager at our computing cluster. One thing I can do is to print the contents of certain files for a specific job in a text window.
However, the extensions people use for the same type of file will sometimes change. I could program it such that it works for me, but I also want to be able to look up other people's files.
The way I have solved this is the following
extensions = [".ex1", ".ext2", ".ext3"]
for ext in extensions:
    try:
        f = open(jobname + ext, "r")
        content = f.read()
        f.close()
        # doing some stuff with content
    except IOError:
        if ext == extensions[-1]:
            print("File not found")
            return
If the actual extension used is covered by extensions, then my code will find it. I would like to know if more experienced programmers have a better/more elegant/more efficient way of doing it. Luckily the files to be read are very small, so looping over all the possibilities will not take much time. But this particular solution might not be suitable for other cases.
As I understand the question, you already know the filename and path, and only the extension is unknown.
Use the glob package to find all files with that name like this:
from glob import glob
matches = glob("/path/to/files/knownfilename.*")
if not matches:
    print("File not found!")
    return
try:
    with open(matches[0], "r") as f:
        content = f.read()
    # do stuff
except IOError:
    print("Error reading file {}".format(matches[0]))
In this case you might have to deal with the possibility that:
- there are multiple files with that name and different extensions
- the first file in the matches list is not the kind of file you want (maybe some backup file with a .bak extension or whatever), so you might also want to blacklist some extensions
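One way to handle both cases is to filter the glob results against a list of acceptable extensions. A rough sketch, where find_job_file and the allowed tuple are hypothetical names rather than anything from the original code:

```python
import glob
import os

def find_job_file(base, allowed=(".ext1", ".ext2", ".ext3")):
    """Return the first file named base.<something> whose extension is allowed."""
    # glob.escape guards against special characters such as [ or * in base.
    candidates = sorted(glob.glob(glob.escape(base) + ".*"))
    for path in candidates:
        # splitext returns (root, ext); ext includes the leading dot.
        if os.path.splitext(path)[1] in allowed:
            return path
    return None  # nothing matched, or only unwanted extensions such as .bak
```

Sorting the candidates makes the choice deterministic when several allowed files exist; whether alphabetical order is the right priority is a design decision for your project.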
You could use the with statement to open a file and then have it automatically closed. Also, you could omit the mode parameter to open() (which defaults to 'r') and probably add a break after you found a valid extension:
extensions = [".ex1", ".ext2", ".ext3"]
for ext in extensions:
    try:
        with open(jobname + ext) as f:
            content = f.read()
            # do some stuff with content
        break
    except IOError:
        if ext == extensions[-1]:
            print("File not found")
            return
You can use os.listdir('.') to get a list of file names in the current working directory and iterate through it with a for loop. For each name, slice off the part after jobname and use the in operator to test whether it is one of the extensions in the extensions list/tuple. break after processing the file once one with the desired name is found. Use the else block of the for loop to print a "File not found" message if the loop finishes without breaking:
import os
extensions = '.ext1', '.ext2', '.ext3'
for filename in os.listdir('.'):
    if filename.startswith(jobname) and filename[len(jobname):] in extensions:
        with open(filename) as f:
            content = f.read()
            # doing some stuff with content
        break
else:
    print("File not found")
Even if that works, the logic of comparing the current extension with the end of the list feels weird. In the worst case, if the last extension is accidentally duplicated earlier in the list, this would lead to hard-to-diagnose errors.
Since (presumably) you already return out of the loop as soon as you have found the file, you could just put the "missing-file" behavior after the loop (where it will only be reached if no file was found), and leave the except block empty:
extensions = [".ex1", ".ext2", ".ext3"]
for ext in extensions:
    try:
        with open(jobname + ext, "r") as f:
            content = f.read()
        # doing some stuff with content
        return
    except IOError:
        pass
print("File not found")
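Wrapped in a function, the same idea might look like the sketch below. The name read_first_existing and the None return value are assumptions made for illustration, not part of the original project:

```python
def read_first_existing(jobname, extensions):
    """Return the contents of the first jobname+ext that opens, else None."""
    for ext in extensions:
        try:
            with open(jobname + ext, "r") as f:
                return f.read()
        except IOError:
            # This candidate is missing or unreadable; try the next extension.
            continue
    return None
```

The caller can then print "File not found" when the result is None, which keeps the lookup separate from the GUI code.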
I'm learning how to read files and I want to know why this is happening and how to fix it. I made a .txt file just for practicing this and I have it in my documents. When I run the code, though, it tells me:
Errno2 no such file or directory: jub.txt
I have tried listing it as C:\Users and so on as well. I have watched tons of tutorials. Can someone please explain this to me so I can get it to work?
print ("Opening and closing a file")
text_file = open("jub.txt", "r")
print (text_file('jub.txt'))
text_file.close()
First, check that your file exists in the current directory; you can add this simple validation.
Second, use a with block, which closes the file for you when the block exits. Third, read from the file using the read or readlines methods.
import os
print("Opening and closing a file")
f_name = "jub.txt"
if not os.path.exists(f_name):
    print('File %s does not exist' % f_name)
else:
    with open(f_name, "r") as text_file:
        print(text_file.read())
To make your path more precise, maybe use the full system path rather than a relative one. Example: '/home/my_user/doc/myfile.txt'
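A relative name such as "jub.txt" is resolved against the current working directory, which is often not the Documents folder. A small sketch for checking where Python is actually looking (describe_path is a hypothetical helper name):

```python
import os

def describe_path(name):
    """Return the absolute path Python would use for name, and whether it exists."""
    full = os.path.abspath(name)  # joins name onto os.getcwd()
    return full, os.path.isfile(full)

if __name__ == "__main__":
    print("Working directory:", os.getcwd())
    print(describe_path("jub.txt"))
```

If the printed path is not where the file lives, either move the file there or pass the full path to open().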
Just to complement the code provided by Beri, I would rather use a try/except statement, and the new-style string formatting:
print("Opening and closing a file")
f_name = 'jub.txt'
try:
    with open(f_name, 'r') as text_file:
        print(text_file.read())
except FileNotFoundError:
    print("File {} does not exist".format(f_name))
By the way, I would recommend reading the official Python documentation directly; it's pretty clear and concise:
https://docs.python.org/3.4/tutorial/inputoutput.html#reading-and-writing-files
Here is my code:
# header.py
def add_header(filename):
    header = '"""\nName of Project"""'
    try:
        f = open(filename, 'w')
    except IOError:
        print "Sorry could not open file, please check path"
    else:
        with f:
            f.seek(0,0)
            f.write(header)
            print "Header added to", filename
if __name__ == "__main__":
    filename = raw_input("Please provide path to file: ")
    add_header(filename)
When I run this script (by doing python header.py), even when I provide a filename which does not exist it does not return the messages in the function. It returns nothing even when I replace the print statements with return statements. How would I show the messages in the function?
I believe you are always creating the file, so you will never see a file-not-found exception. It still doesn't hurt to wrap a write, or an open for writing, in try/except, because you might not have the privileges to create the file.
For constructs like try, except and else, I have found it useful to test them at the Python command line, which is an excellent place to work out cockpit error, and I'm very experienced at generating a lot of cockpit error while proving out a concept.
The fact that you're using try/except is very good. I would just review what happens when the logic flows through each branch; the command line is a good place to do that.
The correct course of action here is to try to read the file; if that works, read the data, then write the file back with the new data.
Writing to a file will create the file if it doesn't exist, and overwrite any existing contents.
I'd also note you are using the with statement in an odd manner. Consider:
try:
    with open(filename, 'w') as f:
        f.seek(0,0)
        f.write(header)
        print("Header added to", filename)
except IOError:
    print("Sorry could not open file, please check path")
This way is more readable.
To see how to do this the best way possible, see user1313312's answer. My method works but isn't the best way, I'll leave it up for my explanation.
Old answer:
Now, to solve your problem, you really want to do something like this:
def add_header(filename):
    header = '"""\nName of Project"""'
    try:
        with open(filename, 'r') as f:
            data = f.read()
        with open(filename, 'w') as f:
            f.write(header+"\n"+data)
        print("Header added to " + filename)
    except IOError:
        print("Sorry could not open file, please check path")
if __name__ == "__main__":
    filename = raw_input("Please provide path to file: ")
    add_header(filename)
As we only have the choices of writing to a file (overwriting the existing contents) and appending (at the end), we need to construct a way to prepend data. We can do this by reading the contents (which handily checks that the file exists at the same time) and then writing the header followed by the contents (here I added a newline for readability).
This is a slightly modified version of Lattyware's solution. Since it is not possible to append data to the beginning of a file, the whole content is read and the file is written anew, including your header. By opening the file in read/write mode we can do both operations with the same file handle without releasing it. This should provide some protection against race conditions.
try:
    with open(filename, 'r+') as f:
        data = f.read()
        f.seek(0,0)
        f.write(header)
        f.write(data)
        # f.truncate() is not needed here as the file will always grow
        print("Header added to", filename)
except IOError:
    print("Sorry, could not open file for reading/writing")
This script opens the file in "w" mode (write mode), which means that if the file does not exist, it will be created. So no IOError is raised.
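The difference between the modes can be checked in isolation. The sketch below is written for Python 3 (where a missing file raises FileNotFoundError, a subclass of OSError/IOError) and works in a temporary directory; try_open is a made-up helper name:

```python
import os
import tempfile

def try_open(path, mode):
    """Return 'ok' if path opens in the given mode, else the exception name."""
    try:
        with open(path, mode):
            return "ok"
    except IOError as exc:
        return type(exc).__name__

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing.py")
    read_result = try_open(missing, "r")      # file is absent: raises
    update_result = try_open(missing, "r+")   # 'r+' also needs an existing file
    write_result = try_open(missing, "w")     # 'w' silently creates the file
    created = os.path.exists(missing)
```

Only the "r" and "r+" attempts fail, which is why the read-then-write versions above report a missing path and the original "w" version never does.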
I am FTPing a zip file from a remote FTP site using Python's ftplib. I then attempt to write it to disk. The file write works, however most attempts to open the zip using WinZip or WinRar fail; both apps claim the file is corrupted. Oddly however, when right clicking and attempting to extract the file using WinRar, the file will extract.
So to be clear, the file write will work, but will not open inside the popular zip apps, but will decompress using those same apps. Note that the Python zipfile module never fails to extract the zips.
Here is the code that I'm using to get the zip file from the FTP site (please ignore the bad tabbing, that's not the issue).
filedata = None
def appender(chunk):
    global filedata
    filedata += chunk
def getfile(filename):
    try:
        ftp = None
        try:
            ftp = FTP(address)
            ftp.login('user', 'password')
        except Exception, e:
            print e
        command = 'RETR ' + filename
        idx = filename.rfind('/')
        path = filename[0:idx]
        ftp.cwd(path)
        fileonly = filename[idx+1:len(filename)]
        ftp.retrbinary('RETR ' + filename, appender)
        global filedata
        data = filedata
        ftp.close()
        filedata = ''
        return data
    except Exception, e:
        print e
data = getfile('/archives/myfile.zip')
file = open(pathtoNTFileShare, 'wb')
file.write(data)
file.close()
Pass file.write directly to the retrbinary function as the callback instead of passing appender. This will work, and it will also not use that much RAM when you are downloading a big file.
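A minimal sketch of that suggestion, assuming an already connected and logged-in ftplib.FTP object; the function name download_to_file and its parameters are hypothetical:

```python
def download_to_file(ftp, remote_path, local_path):
    """Stream remote_path into local_path chunk by chunk."""
    # 'wb' matters: a zip archive must be written as raw bytes.
    with open(local_path, "wb") as out:
        # retrbinary calls the callback once per received block,
        # so each block goes straight to disk instead of into RAM.
        ftp.retrbinary("RETR " + remote_path, out.write)
```

With the names from the question, this would be called as download_to_file(ftp, '/archives/myfile.zip', pathtoNTFileShare) after ftp.login(...).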
If you'd like the data stored inside a variable though, you can also have a variable named:
blocks = []
Then pass to retrbinary instead of appender:
blocks.append
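Put together, the in-memory variant could look like this sketch (again assuming a connected ftplib.FTP object; fetch_bytes is a made-up name, and the bytes join assumes Python 3):

```python
def fetch_bytes(ftp, remote_path):
    """Return the whole remote file as a single bytes object."""
    blocks = []
    # Each received chunk is appended to the list as-is.
    ftp.retrbinary("RETR " + remote_path, blocks.append)
    # Joining once at the end avoids repeated copying from +=.
    return b"".join(blocks)
```

The result can then be written in one go with a file opened in 'wb' mode.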
Your current appender function is wrong. += will not work correctly when there is binary data because it will try to do a string append and stop at the first NULL it sees.
As mentioned by @Lee B, you can also use urllib2 or Curl. But your current code is almost correct if you make the small modifications I mentioned above.
I've never used that library, but urllib2 works fine, and is more straightforward. Curl is even better.
Looking at your code, I can see a couple of things wrong. Your exception catching only prints the exception, then continues. For fatal errors like not getting an FTP connection, they need to print the message and then exit. Also, your filedata starts off as None, then your appender uses += to add to that, so you're trying to append a string + None, which gives a TypeError when I try it here. I'm surprised it's working at all; I would have guessed that the appender would throw an exception, and so the FTP copy would abort.
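The TypeError is easy to confirm in isolation. A small sketch (Python 3 syntax; the chunk values are arbitrary):

```python
filedata = None

try:
    filedata += b"chunk"  # None does not support +=
    outcome = "appended"
except TypeError:
    outcome = "TypeError"

# Starting from an empty bytes object works, NUL bytes included.
filedata = b""
filedata += b"PK\x00\x00"
```

So initialising the buffer to an empty value, rather than None, is the minimum change needed before any chunk can be appended.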
While re-reading, I just noticed another answer about the use of += on binary data. That could well be it; Python tries to be smart sometimes, and could be "helping" when you join strings with whitespace or NULs in them, or something like that. Your best bet there is to have the file open (let's call it outfile), and use your appender to just call outfile.write(chunk).