User input path not recognised by os.path.abspath - python

I am writing a script in which I want the user to input a file path that will then be parsed by the rest of the script as a way to know which file to work on.
For brevity here is the beginning of the code which is where I have a problem:
### import modules
import pyabf
#### open file and extract basic data
file_path = input('file path?')
abf = pyabf.ABF(file_path)
data = abf.data
If I run the script in this form I get the following error:
File "/Users/XXX/anaconda3/envs/blah_blah/lib/python3.6/site-packages/pyabf/abf.py", line 65, in init
raise ValueError("ABF file does not exist: %s" % self.abfFilePath)
ValueError: ABF file does not exist
And here the portion of the script that gives me this error:
# clean-up file paths and filenames, then open the file
self.abfFilePath = os.path.abspath(abfFilePath)
if not os.path.exists(self.abfFilePath):
raise ValueError("ABF file does not exist: %s" % self.abfFilePath)
self.abfID = os.path.splitext(os.path.basename(self.abfFilePath))[0]
log.debug(self.__repr__())
If I run the lines of the code individually in the console everything works:
abf = pyabf.ABF(file_path) # same path as before works to open the file
data = abf.data # then manages to extract the correct numpy array.
What happens to the path when it is fed into as user input as opposed to being given directly into the argument where it will be used? I have tried to type the path in different ways with or without '' or () but I can't get it to be recognised by the pyabf.ABF script.
I had a look at the os.path info and from what I can understand the os.path.abspath(abfFilePath) line, which is bugging now, should just return an absolute path name. I'm sure it's probably something simple and obvious that I just don't get.
Hopefully someone here can help.
Thanks!

Related

How do I write the time from datetime to a file in Python?

I'm trying to have my Python code write everything it does to a log, with a timestamp. But it doesn't seem to work.
this is my current code:
filePath= Path('.')
time=datetime.datetime.now()
bot_log = ["","Set up the file path thingy"]
with open ('bot.log', 'a') as f:
f.write('\n'.join(bot_log)%
datetime.datetime.now().strftime("%d-%b-%Y (%H:%M:%S.%f)"))
print(bot_log[0])
but when I run it it says:
Traceback (most recent call last):
File "c:\Users\Name\Yuna-Discord-Bot\Yuna Discord Bot.py", line 15, in <module>
f.write('\n'.join(bot_log)%
TypeError: not all arguments converted during string formatting
I have tried multiple things to fix it, and this is the latest one. is there something I'm doing wrong or missing? I also want the time to be in front of the log message, but I don't think it would do that (if it worked).
You need to put "%s" somewhere in the input string before string formatting. Here's more detailed explanation.
Try this:
filePath= Path('.')
time=datetime.datetime.now()
bot_log = "%s Set up the file path thingy\n"
with open ('bot.log', 'a') as f:
f.write(bot_log % datetime.datetime.now().strftime("%d-%b-%Y (%H:%M:%S.%f)"))
print(bot_log)
It looks like you want to write three strings to your file as separate lines. I've rearranged your code to create a single list to pass to writelines, which expects an iterable:
filePath= Path('.')
time=datetime.datetime.now()
bot_log = ["","Set up the file path thingy"]
with open ('bot.log', 'a') as f:
bot_log.append(datetime.datetime.now().strftime("%d-%b-%Y (%H:%M:%S.%f)"))
f.writelines('\n'.join(bot_log))
print(bot_log[0])
EDIT: From the comments the desire is to prepend the timestamp to the message and keep it on the same line. I've used f-strings as I prefer the clarity they provide:
import datetime
from pathlib import Path
filePath = Path('.')
with open('bot.log', 'a') as f:
time = datetime.datetime.now()
msg = "Set up the file path thingy"
f.write(f"""{time.strftime("%d-%b-%Y (%H:%M:%S.%f)")} {msg}\n""")
You could also look at the logging module which does a lot of this for you.

Python telethon get message media file name

I have the following code to get messages from group:
getmessage = client.get_messages(dialog, limit=1000)
for message in getmessage:
try:
if message.media == None:
print("message")
continue
else:
print("Media**********")
client.download_media(message)
The code above write the media.
I need to know the file name/file type before I write the file, How can I get it?
You can use a filename of your choosing to ensure that said filename will be used:
filename = 'some-file'
filename = client.download_media(message, filename)
Then filename will be some-file with the correct extension.
Otherwise, Telethon will generate a file name for the file (it will try the original name, and if it exists, append (n) to avoid overwriting existing files), so you can't really know where it will be saved beforehand (but the method does return the final filename).

Overwrite excel file via VBS + python

I have a function that should create a VBScript and save (overwrite) the Excel file, specified in this script with a password on modifying.
The function:
def setPassword(self, excel_file_path, password):
"""Locks excel file with password. Modification allowed only when password is entered"""
excel_file_path = Path(excel_file_path)
vbs_script = \
f"""' Save with password required upon opening
Set excel_object = CreateObject("Excel.Application")
Set workbook = excel_object.Workbooks.Open(FileName:="{excel_file_path}", ReadOnly:=False, Notify:=False)
excel_object.DisplayAlerts = False
excel_object.Visible = False
workbook.SaveAs "{excel_file_path}",,, "{password}"
excel_object.Application.Quit
"""
# write
vbs_script_path = excel_file_path.parent.joinpath("set_password.vbs")
with open(vbs_script_path, "w") as file:
file.write(vbs_script)
# execute
result = subprocess.Popen(['cscript.exe', str(vbs_script_path)], creationflags=subprocess.CREATE_NO_WINDOW, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = result.communicate()
res = {
"result": output.decode(encoding='cp866'),
"err": self._errMessage(err.decode(encoding='cp866'))
}
# remove
vbs_script_path.unlink()
return res
The issue is that on my PC it does it correctly. However, if I run it on my work PC, the VBS throws an exception that it can't access the file when trying to save it.
If I use a different name to save the file with password, it works just fine.
Sure, I could save the file with a different name, delete the initial file and then rename the new file. But it is a duck tape and I want a good solution.
PS
I saw information that it is impossible to overwrite the file this way. But since it works on one computer, I suppose there should be a way to make it work on another.
PPS
Inserting this XlSaveConflictResolution argument into the save func didn't help as well
What is wrong with this VBS and how to make it work properly?
Solved it by saving the file under a new name, then deleting the initial file and renaming the new file with the initial name.
For example, I have a file named foo.xlsx. I run the script and save the processed file as foe.xlsx. Then I delete foo.xlsx with os.os.unlink(). Finally, I rename the foe.xlsx back to foo.xlsx with os.rename()
Hope one day this helps someone

weird no such file or directory in python

I'm new to python and the following piece of code is driving me crazy. It lists the files in a directory and for each file does some stuff. I get a IOError: [Errno2] No such file or directory: my_file_that_is_actually_there!
def loadFile(aFile):
f_gz = gzip.open(aFile, 'rb')
data = f_gz.read()
#do some stuff...
f_gz.close()
return data
def main():
inputFolder = '../myFolder/'
for aFile in os.listdir(inputFolder):
data = loadFile(aFile)
#do some more stuff
The file exists and it's not corrupted. I do not understand how it's possible that python first finds the file when it checks the content of myFolder, and then it cannot find itanymore... This happens on the second iteration of my for loop only with any files.
NOTE: Why does this exception happen ONLY at the second iteration of the loop?? The first file in the folder is found and opened without any issues...
This is because open receives the local name (returned from os.listdir). It doesn't know that you mean that it should look in ../myFolder. So it receives a relative path and applies it to the current dir. To fix it, try:
data = loadFile(os.path.join(inputFolder, aFile))

unzipping file results in "BadZipFile: File is not a zip file"

I have two zip files, both of them open well with Windows Explorer and 7-zip.
However when i open them with Python's zipfile module [ zipfile.ZipFile("filex.zip") ], one of them gets opened but the other one gives error "BadZipfile: File is not a zip file".
I've made sure that the latter one is a valid Zip File by opening it with 7-Zip and looking at its properties (says 7Zip.ZIP). When I open the file with a text editor, the first two characters are "PK", showing that it is indeed a zip file.
I'm using Python 2.5 and really don't have any clue how to go about for this. I've tried it both with Windows as well as Ubuntu and problem exists on both platforms.
Update: Traceback from Python 2.5.4 on Windows:
Traceback (most recent call last):
File "<module1>", line 5, in <module>
zipfile.ZipFile("c:/temp/test.zip")
File "C:\Python25\lib\zipfile.py", line 346, in init
self._GetContents()
File "C:\Python25\lib\zipfile.py", line 366, in _GetContents
self._RealGetContents()
File "C:\Python25\lib\zipfile.py", line 378, in _RealGetContents
raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file
Basically when the _EndRecData function is called for getting data from End of Central Directory" record, the comment length checkout fails [ endrec[7] == len(comment) ].
The values of locals in the _EndRecData function are as following:
END_BLOCK: 4096,
comment: '\x00',
data: '\xd6\xf6\x03\x00\x88,N8?<e\xf0q\xa8\x1cwK\x87\x0c(\x82a\xee\xc61N\'1qN\x0b\x16K-\x9d\xd57w\x0f\xa31n\xf3dN\x9e\xb1s\xffu\xd1\.....', (truncated)
endrec: ['PK\x05\x06', 0, 0, 4, 4, 268, 199515, 0],
filesize: 199806L,
fpin: <open file 'c:/temp/test.zip', mode 'rb' at 0x045D4F98>,
start: 4073
files named file can confuse python - try naming it something else. if it STILL wont work, try this code:
def fixBadZipfile(zipFile):
f = open(zipFile, 'r+b')
data = f.read()
pos = data.find('\x50\x4b\x05\x06') # End of central directory signature
if (pos > 0):
self._log("Trancating file at location " + str(pos + 22)+ ".")
f.seek(pos + 22) # size of 'ZIP end of central directory record'
f.truncate()
f.close()
else:
# raise error, file is truncated
I run into the same issue. My problem was that it was a gzip instead of a zip file. I switched to the class gzip.GzipFile and it worked like a charm.
astronautlevel's solution works for most cases, but the compressed data and CRCs in the Zip can also contain the same 4 bytes. You should do an rfind (not find), seek to pos+20 and then add write \x00\x00 to the end of the file (tell zip applications that the length of the 'comments' section is 0 bytes long).
# HACK: See http://bugs.python.org/issue10694
# The zip file generated is correct, but because of extra data after the 'central directory' section,
# Some version of python (and some zip applications) can't read the file. By removing the extra data,
# we ensure that all applications can read the zip without issue.
# The ZIP format: http://www.pkware.com/documents/APPNOTE/APPNOTE-6.3.0.TXT
# Finding the end of the central directory:
# http://stackoverflow.com/questions/8593904/how-to-find-the-position-of-central-directory-in-a-zip-file
# http://stackoverflow.com/questions/20276105/why-cant-python-execute-a-zip-archive-passed-via-stdin
# This second link is only losely related, but echos the first, "processing a ZIP archive often requires backwards seeking"
content = zipFileContainer.read()
pos = content.rfind('\x50\x4b\x05\x06') # reverse find: this string of bytes is the end of the zip's central directory.
if pos>0:
zipFileContainer.seek(pos+20) # +20: see secion V.I in 'ZIP format' link above.
zipFileContainer.truncate()
zipFileContainer.write('\x00\x00') # Zip file comment length: 0 byte length; tell zip applications to stop reading.
zipFileContainer.seek(0)
return zipFileContainer
I had the same problem and was able to solve this issue for my files, see my answer at
zipfile cant handle some type of zip data?
I'm very new at python and i was facing the exact same issue, none of the previous methods were working.
Trying to print the 'corrupted' file just before unzipping it returned an empty byte object.
Turned out, I was trying to unzip the file right after writing it to disk, without closing the file handler.
with open(path, 'wb') as outFile:
outFile.write(data)
outFile.close() # was missing this
with zipfile.ZipFile(path, 'r') as zip:
zip.extractall(destination)
Closing the file stream then unzipping the file resolved my issue.
Sometime there are zip file which contain corrupted files and upon unzipping the zip gives badzipfile error. but there are tools like 7zip winrar which ignores these errors and successfully unzip the zip file. you can create a sub process and use this code to unzip your zip file without getting BadZipFile Error.
import subprocess
ziploc = "C:/Program Files/7-Zip/7z.exe" #location where 7zip is installed
cmd = [ziploc, 'e',your_Zip_file.zip ,'-o'+ OutputDirectory ,'-r' ]
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
Show the full traceback that you got from Python -- this may give a hint as to what the specific problem is. Unanswered: What software produced the bad file, and on what platform?
Update: Traceback indicates having problem detecting the "End of Central Directory" record in the file -- see function _EndRecData starting at line 128 of C:\Python25\Lib\zipfile.py
Suggestions:
(1) Trace through the above function
(2) Try it on the latest Python
(3) Answer the question above.
(4) Read this and anything else found by google("BadZipfile: File is not a zip file") that appears to be relevant
I faced this problem and was looking for a good and clean solution; But there was no solution until I found this answer. I had the same problem that #marsl (among the answers) had. It was a gzipfile instead of a zipfile in my case.
I could unarchive and decompress my gzipfile with this approach:
with tarfile.open(archive_path, "r:gz") as gzip_file:
gzip_file.extractall()
Have you tried a newer python, or if that is too much trouble, simply a newer zipfile.py? I have successfully used a copy of zipfile.py from Python 2.6.2 (latest at the time) with Python 2.5 in order to open some zip files that weren't supported by Py2.5s zipfile module.
In some cases, you have to confirm if the zip file is actually in gzip format. this was the case for me and i solved it by :
import requests
import tarfile
url = ".tar.gz link"
response = requests.get(url, stream=True)
file = tarfile.open(fileobj=response.raw, mode="r|gz")
file.extractall(path=".")
for this this happened when the file wasn't downloaded fully I think. So I just delete it in my download code.
def download_and_extract(url: str,
path_used_for_zip: Path = Path('~/data/'),
path_used_for_dataset: Path = Path('~/data/tmp/'),
rm_zip_file_after_extraction: bool = True,
force_rewrite_data_from_url_to_file: bool = False,
clean_old_zip_file: bool = False,
gdrive_file_id: Optional[str] = None,
gdrive_filename: Optional[str] = None,
):
"""
Downloads data and tries to extract it according to different protocols/file types.
note:
- to force a download do:
force_rewrite_data_from_url_to_file = True
clean_old_zip_file = True
- to NOT remove file after extraction:
rm_zip_file_after_extraction = False
Tested with:
- zip files, yes!
Later:
- todo: tar, gz, gdrive
force_rewrite_data_from_url_to_file = remvoes the data from url (likely a zip file) and redownloads the zip file.
"""
path_used_for_zip: Path = expanduser(path_used_for_zip)
path_used_for_zip.mkdir(parents=True, exist_ok=True)
path_used_for_dataset: Path = expanduser(path_used_for_dataset)
path_used_for_dataset.mkdir(parents=True, exist_ok=True)
# - download data from url
if gdrive_filename is None: # get data from url, not using gdrive
import ssl
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
print("downloading data from url: ", url)
import urllib
import http
response: http.client.HTTPResponse = urllib.request.urlopen(url, context=ctx)
print(f'{type(response)=}')
data = response
# save zipfile like data to path given
filename = url.rpartition('/')[2]
path2file: Path = path_used_for_zip / filename
else: # gdrive case
from torchvision.datasets.utils import download_file_from_google_drive
# if zip not there re-download it or force get the data
path2file: Path = path_used_for_zip / gdrive_filename
if not path2file.exists():
download_file_from_google_drive(gdrive_file_id, path_used_for_zip, gdrive_filename)
filename = gdrive_filename
# -- write downloaded data from the url to a file
print(f'{path2file=}')
print(f'{filename=}')
if clean_old_zip_file:
path2file.unlink(missing_ok=True)
if filename.endswith('.zip') or filename.endswith('.pkl'):
# if path to file does not exist or force to write down the data
if not path2file.exists() or force_rewrite_data_from_url_to_file:
# delete file if there is one if your going to force a rewrite
path2file.unlink(missing_ok=True) if force_rewrite_data_from_url_to_file else None
print(f'about to write downloaded data from url to: {path2file=}')
# wb+ is used sinze the zip file was in bytes, otherwise w+ is fine if the data is a string
with open(path2file, 'wb+') as f:
# with open(path2file, 'w+') as f:
print(f'{f=}')
print(f'{f.name=}')
f.write(data.read())
print(f'done writing downloaded from url to: {path2file=}')
elif filename.endswith('.gz'):
pass # the download of the data doesn't seem to be explicitly handled by me, that is done in the extract step by a magic function tarfile.open
# elif is_tar_file(filename):
# os.system(f'tar -xvzf {path_2_zip_with_filename} -C {path_2_dataset}/')
else:
raise ValueError(f'File type {filename=} not supported.')
# - unzip data written in the file
extract_to = path_used_for_dataset
print(f'about to extract: {path2file=}')
print(f'extract to target: {extract_to=}')
if filename.endswith('.zip'):
import zipfile # this one is for zip files, inspired from l2l
zip_ref = zipfile.ZipFile(path2file, 'r')
zip_ref.extractall(extract_to)
zip_ref.close()
if rm_zip_file_after_extraction:
path2file.unlink(missing_ok=True)
elif filename.endswith('.gz'):
import tarfile
file = tarfile.open(fileobj=response, mode="r|gz")
file.extractall(path=extract_to)
file.close()
elif filename.endswith('.pkl'):
# no need to extract it, but when you use the data make sure you torch.load it or pickle.load it.
print(f'about to test torch.load of: {path2file=}')
data = torch.load(path2file) # just to test
assert data is not None
print(f'{data=}')
pass
else:
raise ValueError(f'File type {filename=} not supported, edit code to support it.')
# path_2_zip_with_filename = path_2_ziplike / filename
# os.system(f'tar -xvzf {path_2_zip_with_filename} -C {path_2_dataset}/')
# if rm_zip_file:
# path_2_zip_with_filename.unlink(missing_ok=True)
# # raise ValueError(f'File type {filename=} not supported.')
print(f'done extracting: {path2file=}')
print(f'extracted at location: {path_used_for_dataset=}')
print(f'-->Succes downloading & extracting dataset at location: {path_used_for_dataset=}')
you can use my code with pip install ultimate-utils for the most up to date version.
In the other case, this warning showing up when the ml/dl model has different format.
For the example:
you want to open pickle, but the model format is .sav
Solution:
you need to change the format to original format
pickle --> .pkl
tensorflow --> .h5
etc.
In my case, the zip file itself was missing from that directory - thus when I tried to unzip it, I got the error "BadZipFile: File is not a zip file". It got resolved after I moved the .zip file to the directory. Please confirm that the file is indeed present in your directory before running the python script.
In my case, the zip file was corrupted. I was trying to download the zip file with urllib.request.urlretrieve but the file wouldn't completely download for some reason.
I connected to a VPN, the file downloaded just fine, and I was able to open the file.

Categories

Resources