I'm working on a script in Python 3.8.6 to load .sql files into BigQuery. We're adding some non-.sql files to our repo, and I want my Python script to look only at .sql files, so I added an if statement to my loop. Now I get an error: Invalid Character in identifier.
for filename in os.listdir(self.script_dir):
    if os.path.splitext(filename)[1] == '.sql':
        self.logger.info(os.path.join(self.script_dir, filename))
        sql = self.read_sql(os.path.join(self.script_dir, filename))
Any idea why this is happening? There is actually only one file in the directory that it's running for, and it does not have a .sql extension. The original file was a text file saved with no extension (we use it to check in empty folders). I added a .txt extension to it as well and still get the same error.
Maybe there is a zero-width space somewhere, copied from some website or PDF. Try deleting the line and retyping it.
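If retyping is not practical, one way to look for such characters is to scan the source text for non-ASCII format or space characters. This is only a rough sketch: the function name find_invisible_chars is made up for illustration, and the choice of Unicode categories (Cf and Zs) is an assumption about what counts as "invisible".

```python
import unicodedata


def find_invisible_chars(text):
    """Return (line, column, code point, name) for suspicious characters.

    Hypothetical helper: flags non-ASCII characters in the Unicode
    categories Cf (format, e.g. zero-width space) and Zs (space
    separators, e.g. no-break space).
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) < 128:
                continue  # plain ASCII is never flagged
            if unicodedata.category(ch) in ('Cf', 'Zs'):
                findings.append(
                    (line_no, col_no, 'U+%04X' % ord(ch),
                     unicodedata.name(ch, 'UNKNOWN'))
                )
    return findings
```

You could run it on the script itself by reading the file with open(path, encoding='utf-8') and passing the text in; any hit points at the line and column to retype.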
Related
I am trying to collect every txt file on my computer and print its contents to the terminal when I run the script. I do not know how to do it. Is there a way to read every txt file on the computer and then print the contents (not just a certain folder or directory)?
In Python, the glob module would give you a list of filenames matching a given string. In your case, glob.glob('dir/*.txt') would give you a list of filenames in directory dir that end in .txt. You can then open each file and print() it to the terminal. Depending on your OS, you might be able to do it in your terminal without writing a separate script.
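A minimal sketch of that approach, assuming the files are UTF-8 text (undecodable bytes are replaced rather than raising). The function name print_text_files is just for illustration, and it only covers one directory, not the whole computer.

```python
import glob
import os


def print_text_files(directory):
    """Print the contents of every .txt file directly inside directory."""
    pattern = os.path.join(directory, '*.txt')
    paths = sorted(glob.glob(pattern))
    for path in paths:
        # errors='replace' is an assumption: it keeps odd encodings from crashing the loop
        with open(path, encoding='utf-8', errors='replace') as handle:
            print(handle.read())
    return paths
```

For subdirectories as well, glob.glob also accepts a '**' pattern together with recursive=True, though walking an entire drive this way can be slow.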
I have a list of dicts in Python and want to upload it as a JSON file to Azure File Storage. When I print the list locally, the line breaks exist. After uploading and manually checking the file on Azure File Storage, I noticed that the line breaks were gone.
list_of_dicts = my_json_dicts
transformed_dict_str = '\n'.join([json.dumps(x) for x in list_of_dicts])
# print(transformed_dict_str) gives me the "dicts"/lines separated by linebreaks.
service.create_file_from_text(share_name, file_path, file_name.json, transformed_dict_str, encoding='utf-8')
Can anyone tell me why the uploaded file (when I open it in Notepad after downloading it manually via the browser interface of Azure) does not contain any line breaks?
Edit:
When I write the string to a local path with the following code, the linebreaks still exist. So it must happen during the create_file_from_text function?
file = open("myjson.json", "w")
file.write(transformed_dict_str)
file.close()
Please use '\r\n' instead of '\n' in your code.
I can reproduce your issue when using '\n', but it works fine with '\r\n' (the line breaks show up in Notepad).
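For illustration, here is the join with the separator swapped, wrapped in a small helper whose name (to_json_lines) is made up. The likely reason the local file looked fine is that on Windows, open() in text mode translates '\n' to '\r\n' on write, whereas the uploaded string is sent as is.

```python
import json


def to_json_lines(records, newline='\r\n'):
    """Serialize each dict on its own line, separated by newline."""
    return newline.join(json.dumps(record) for record in records)
```

The resulting string can then be passed to create_file_from_text in place of transformed_dict_str.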
I mounted Google Drive in Colaboratory and set the path to drive/.
It seems that the path is set correctly; however, when I run my code, the files can't be read. I am quite sure that proc_data_lib.py is a Python file rather than a zip file, and the first few lines of that Python file are in the second picture.
Error Images
File:proc_data_lib.py
ERROR INFO: File "drive/ECMLDeepAudio-Master/lib/proc_data_lib.py", line 1
PK�����rpL^�2'���'������mimetypeapplication/vnd.oasis.opendocument.textPK�����rpL/�4z���������Thumbnails/thumbnail.png�PNG
^
SyntaxError: invalid syntax
The problem here is that the file is not just a text file -- from the error message, it seems to be a custom format of type oasis.opendocument.text, which looks to be an OpenOffice file. Were you writing the file with something like OpenOffice?
Switching to a vanilla text file should fix things.
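A quick way to confirm this from Python is to look at the first bytes of the file: OpenDocument files are zip containers, so they start with "PK", just like the error output above. A small sketch (describe_file is a made-up name):

```python
import zipfile


def describe_file(path):
    """Return the first four bytes of path and whether it is a zip archive."""
    with open(path, 'rb') as handle:
        head = handle.read(4)
    return head, zipfile.is_zipfile(path)
```

A real Python source file should come back with ordinary text bytes and False; a zip-based document comes back with bytes beginning b'PK' and True.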
I have the following code:
with open('EcoDocs TK pdfs.csv', 'rb') as pdf_in:
    pdflist = csv.reader(pdf_in, quotechar='"')
    for row in pdflist:
        if row[1].endswith(row[2]):  # check if file type is appended to file name
            pathname = ''.join(row[0:2])
        else:
            pathname = ''.join(row)
        if os.path.isfile(pathname):
            filehash = md5.md5(file(pathname).read()).hexdigest()
It reads in file paths, file names and file types from a csv file. It then checks to see if the file type is appended to the file name, before joining the file path and file name. It then checks to see if the file exists, before doing something with the file. There are about 5000 file names in the csv file, but isfile only returns True for about half of these. I've manually checked that some of the files for which isfile returns False do exist. As all the data is read in, there shouldn't be any problems with escape characters or single backslashes, so I'm a bit stumped. Any ideas? An example of the csv file format is below, as well as an example of some of the pathnames that isfile can't find.
csv file-
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l B.xls,.xls
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l A.pdf,.pdf
pathname created-
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l B.xls
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l A.pdf
Thanks.
You can safely assume that os.path.isfile() works correctly. Here is my process to debug issues like this:
Add a print(pathname) before I use it.
Eyeball the output. Does anything look suspicious?
Copy the output to the clipboard, press Win+R, type cmd and press Return. In the new command prompt, type dir, Space, a double quote, paste the path, type another double quote and press Return.
That checks whether the path is really correct (finds slight mistakes that eyeballing will miss). It also helps to validate the insane DOS naming conventions which are still enforced even on Windows.
If this also works, the next step is to check file and folder permissions: make sure the user that runs the script actually has permission to see and read the file.
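The same check can be partly automated. The sketch below is an assumption about how one might do it, not part of the process above: it walks the path from the root down and reports the first component that does not exist, which narrows down where a stray space or character sits. The name first_missing_component is made up.

```python
import os


def first_missing_component(pathname):
    """Return the shortest prefix of pathname that does not exist, or None."""
    head = os.path.abspath(pathname)
    parts = []
    while True:
        head, tail = os.path.split(head)
        if not tail:
            break  # head is now the root, e.g. '/' or 'C:\\'
        parts.append(tail)
    current = head
    for part in reversed(parts):
        current = os.path.join(current, part)
        if not os.path.exists(current):
            return current
    return None
```

Printing repr(pathname) next to the result also helps, since repr makes trailing spaces and unusual characters visible.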
EDIT Paths on Windows are ... complicated. An important detail, for example, is that "." is a very, very special character. The name "a.something very long" isn't valid in the command prompt because it demands that you have at most three characters after the last "." in a file name! You're just lucky that it doesn't demand that the name before the last dot is at most 8 characters.
Conclusion: You must be very, very, very careful with "strange characters" in file names and paths on Windows. The only characters which are safe are listed in this document.
try:
    directoryListing = os.listdir(inputDirectory)
    # other code goes here; it iterates through the list of files in the directory
except WindowsError as winErr:
    print("Directory error: " + str(winErr))
This works fine, and I have tested that it doesn't choke and die when the directory doesn't exist, but I was reading in a Python book that I should be using "with" when opening files. Is there a preferred way to do what I am doing?
You are perfectly fine. The os.listdir function does not open files, so ultimately you are alright. You would use the with statement when reading a text file or similar.
An example of a with statement:
with open('yourtextfile.txt') as file:  # this is like file = open('yourtextfile.txt')
    lines = file.readlines()  # read all the lines in the file
# When the code in the with block is done, the file is closed automatically, which is why most people use this (no need for .close()).
What you are doing is fine. With is indeed the preferred way for opening files, but listdir is perfectly acceptable for just reading the directory.
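To show both pieces side by side, here is a rough sketch that lists a directory with plain os.listdir and uses with only when a file is actually opened. It catches OSError rather than WindowsError (in Python 3, WindowsError is an alias of OSError on Windows, so this also runs on other systems). The function name read_directory and the UTF-8 assumption are mine.

```python
import os


def read_directory(input_directory):
    """Return {file name: contents} for the regular files in input_directory."""
    try:
        names = os.listdir(input_directory)  # no file is opened here
    except OSError as err:
        print("Directory error: " + str(err))
        return {}
    contents = {}
    for name in sorted(names):
        path = os.path.join(input_directory, name)
        if os.path.isfile(path):
            # with closes the file even if read() raises
            with open(path, encoding='utf-8', errors='replace') as handle:
                contents[name] = handle.read()
    return contents
```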