How do I delete temp .bed file created by shellfish.py - python

I am running shellfish.py to perform principal component analysis, but I get the error below. How do I fix this?
17:18:39 Found .bed format data data_WTCCC_f_650.bed
17:18:39 Found binary mapfile data_WTCCC_f_650.bim
17:18:39 shellfish error: Trying to create link to original data file data_WTCCC_f_650.bed, but link file shellfish-temp-16297/848716990677.bed already exists, presumably from a previous shellfish run. Delete any such files before running again.

You can remove this file from Python with:
import os
os.remove('path/to/bed/file')
Of course you can always delete this file by hand. Just find it in the file explorer (or the tool you use) and delete it.
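If several such leftovers have piled up, the cleanup can be scripted. This is a minimal sketch, not part of shellfish itself: the function name is made up for illustration, and the `shellfish-temp-*` pattern is an assumption taken from the directory name in the error message above.

```python
import shutil
from pathlib import Path


def remove_stale_temp(base_dir, pattern="shellfish-temp-*"):
    """Delete leftover temp directories (or links) matching `pattern`.

    The default pattern is an assumption based on the error message;
    adjust it if your shellfish version names its temp files differently.
    Returns the names of the entries that were removed.
    """
    removed = []
    for path in sorted(Path(base_dir).glob(pattern)):
        if path.is_dir() and not path.is_symlink():
            # rmtree removes links inside the directory, not their targets
            shutil.rmtree(path)
        else:
            path.unlink()
        removed.append(path.name)
    return removed
```

Calling `remove_stale_temp(".")` from the directory where shellfish was run should clear the stale link files while leaving the original .bed data untouched.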

Related

How to get the path of a ".lnk" file using tkinter.filedialog.askopenfilenames() in python 3.10? Or any other ways?

My work requires me to collect some file names and the time each file was generated.
I use fileName = tkinter.filedialog.askopenfilenames() to pop up a window for selecting files. That gives me the files' paths, and I then use fileGeneratedTime = datetime.datetime.fromtimestamp(os.path.getmtime(fileName)) to get the time each file was generated.
Here is the problem: when I select a .lnk file, the dialog returns the path of the file that the .lnk file points to. That is fine on the original computer that has the .lnk files, but when I copy the .lnk files to other computers, the program raises FileNotFoundError.
So, is there any parameter that makes fileName = tkinter.filedialog.askopenfilenames() return the path of the .lnk file itself (not the path of the file it points to)? Or is there another way to achieve the same thing?
Thanks for your answers!

os.listdir() adds characters to the beginning of file name?

I had a quick google of this but couldn't find anything. I'm using os to get a list of all the file names in the current working directory using the following code:
path = os.getcwd()
files = os.listdir(path)
The list of files returns fine, but the last element has an extra '~$' that isn't in the actual file name. For example:
files
['File1.xlsx', 'File2.xlsx', '~$File3.xlsx']
This is then causing an issue when I iterate through these files to try and import them, as I get the error of:
[Errno 2] No such file or directory: 'C:\\Users\\$File3.xlsx'
If anyone knows why this happens and how I can fix/prevent it, that would be great!
Just thought I'd answer in case anyone else has this issue.
It has nothing to do with os. It happened because I had File3 open in Excel while pulling the list of file names. I've found out that opening a Microsoft Office document creates a temporary 'lock' file, denoted by '~$' (this is how it can re-open unsaved data if it crashes, etc.).
I found the below from here:
The files you are describing are so-called owner files (sometimes
referred to as "lock" files). An owner file is created when you work
with a document ... and it should be deleted when you save your
document and exit.
There's also a SO question about this within Microsoft files, which can be found here
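To keep these lock files out of the import loop, the list can be filtered by the '~$' prefix. A minimal sketch; the two function names are illustrative and not part of os:

```python
import os


def is_office_lock_file(name):
    """Return True for Office owner ("lock") files such as '~$File3.xlsx'."""
    return os.path.basename(name).startswith("~$")


def list_real_files(path="."):
    """List directory entries, leaving out Office lock files."""
    return [name for name in os.listdir(path) if not is_office_lock_file(name)]
```

With File3 open in Excel, `list_real_files(os.getcwd())` would then skip the '~$' entry and return only the real workbooks.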

Using paramiko, renaming a file and changing directory fails. Why?

I have to process a file on an SFTP server and, when done, move that file to an archive directory using Paramiko. However, if the file already exists in the archive directory, I want to rename the file at the same time. I have the basics for detecting the existing file in archive and adjusting the name. Basically, the final call looks like:
client.rename('/main-path/file.txt', '/main-path/archive/file_1.txt')
or
client.posix_rename('/main-path/file.txt', '/main-path/archive/file_1.txt')
These commands work on SOME servers with no problem. On other servers, I get an "Errno 2" error from paramiko.
Am I going about this wrong? Maybe I need to rename the file, in place, first?
client.rename('/main-path/file.txt', '/main-path/file_1.txt')
and then
client.rename('/main-path/file_1.txt', '/main-path/archive/file_1.txt')
???
Any help would be appreciated.
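You can check whether the file already exists by listing the archive directory. Note that os.listdir only sees the local filesystem; for a directory on the SFTP server, use the client's own listdir:
list_element = client.listdir("/main-path/archive")
if "file.txt" in list_element:
    client.posix_rename('/main-path/file.txt', '/main-path/archive/file_1.txt')
else:
    client.posix_rename('/main-path/file.txt', '/main-path/archive/file.txt')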
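The hard-coded file_1.txt only covers the first collision. A sketch of a helper that picks the first free name from a listing of the archive directory; the helper name and the `_N` suffix scheme are assumptions extrapolated from the question's example, and the paramiko calls appear only in comments so the block runs without a server:

```python
import posixpath


def archive_name(filename, existing):
    """Return `filename`, or 'stem_N.ext' with the first free N.

    `existing` is the list of names already present in the archive
    directory, for example the result of client.listdir(archive_dir).
    """
    existing = set(existing)
    if filename not in existing:
        return filename
    stem, ext = posixpath.splitext(filename)
    n = 1
    while f"{stem}_{n}{ext}" in existing:
        n += 1
    return f"{stem}_{n}{ext}"


# Hypothetical usage with an open paramiko SFTPClient called `client`:
#   archive_dir = "/main-path/archive"
#   name = archive_name("file.txt", client.listdir(archive_dir))
#   client.posix_rename("/main-path/file.txt", posixpath.join(archive_dir, name))
```

This does not explain the "Errno 2" seen on some servers; it only makes the target name safe once the archive directory is known to exist.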

Why Does a Strange File Show Up in Directory When Using os.walk()?

The project is written in Pycharm on Windows 10.
I wrote a program that grabs .docx files from a directory and searches for information. At the end of the list of file names I get this file: "~$640188.docx"
I get this error when it hits this file:
raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file
This error happens when I try to put file '~$640188.docx' into the docx2text method process
text = docx2txt.process(r'C:\path\to\folder\~$640188.docx')
From what I can see, this file does not exist in the directory I'm searching nor anywhere on my computer. The other strange part is that yesterday I wasn't getting this error.
I know there are sometimes "hidden" files in directories and I ran into those before on my mac (specifically '.DS_Store') but this is a .docx file.
I currently have an ugly solution, which says "don't run the code if you run into '~$640188.docx'". My concern is that this will become more of a problem when I dump 11000 files into the directory.
Where does this file come from?
Below is the code for reference
import docx2txt
import os
check_files = []
for dir, subdir, files in os.walk(r'C:\path\to\folder'):
    for file in files:
        check_files.append(file)
for file in check_files:
    print("file: {0}".format(file))
    text = docx2txt.process(r'C:\path\to\folder\{0}'.format(file))
Hidden .docx files starting with ~$ are simply temporary files created by Word while a file is open and being edited; the first two characters of the parent file's name are replaced with ~$. They are usually deleted once you save and close the document, but sometimes they stick around after you quit anyway. Since they are designed to be temporary complements to a proper .docx file, they do not necessarily have the correct zip package structure at all times.
You will do well to skip those. Checking if the file name starts with '~' should be good enough. Just add the following filtering:
check_files2 = [fl for fl in check_files if not fl.startswith('~')]
for file in check_files2:
    ...  # process each file as before
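The same filter can be applied while walking the tree, which also lets each file keep the directory it was actually found in. A minimal sketch; the generator name is illustrative:

```python
import os


def iter_docx_paths(root):
    """Yield full paths of .docx files under root, skipping Word's '~$' temp files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("~$") or not name.lower().endswith(".docx"):
                continue
            # Join with dirpath so files in subfolders resolve correctly
            yield os.path.join(dirpath, name)
```

Each yielded path can be passed straight to docx2txt.process, so the temporary file never reaches the zip reader.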

Permission denied when pandas dataframe to tempfile csv

I'm trying to store a pandas dataframe to a tempfile in CSV format (on Windows), but am being hit by:
[Errno 13] Permission denied: 'C:\Users\Username\AppData\Local\Temp\tmpweymbkye'
import tempfile
import pandas
with tempfile.NamedTemporaryFile() as temp:
    df.to_csv(temp.name)
Where df is the dataframe. I've also tried changing the temp directory to one I am sure I have write permissions:
tempfile.tempdir='D:/Username/Temp/'
This gives me the same error message
Edit:
The tempfile appears to be locked for editing, because when I change the block to:
with tempfile.NamedTemporaryFile() as temp:
    df.to_csv(temp.name + '.csv')
I can write the file in the temp directory, but then it is not automatically deleted at the end of the block, as it is no longer a temp file.
However, if I change the code to:
with tempfile.NamedTemporaryFile(suffix='.csv') as temp:
    training_data.to_csv(temp.name)
I get the same error message as before. The file is not open anywhere else.
I encountered the same error message and the issue was resolved after adding "/df.csv" to file_path.
df.to_csv('C:/Users/../df.csv', index = False)
Check your permissions and, according to this post, you can run your program as an administrator by right click and run as administrator.
We can use the to_csv command to export a DataFrame in CSV format. Note that the code below will by default save the data into the current working directory. We can save it to a different folder by adding the folder name and a slash to the file name:
verticalStack.to_csv('foldername/out.csv').
Check your working directory to make sure the CSV was written properly and that you can open it. If you want, read it back into Python to make sure it imports properly.
newOutput = pd.read_csv('out.csv', keep_default_na=False, na_values=[""])
ref
Unlike TemporaryFile(), the user of mkstemp() is responsible for deleting the temporary file when done with it.
Use of mktemp() may introduce a security hole in your program. By the time you get around to doing anything with the file name it returns, someone else may have beaten you to the punch. mktemp() usage can be replaced easily with NamedTemporaryFile(), passing it the delete=False parameter.
Read more.
After export to CSV you can close your file with temp.close().
with tempfile.NamedTemporaryFile(delete=False) as temp:
    df.to_csv(temp.name + '.csv')
    temp.close()
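A likely cause of the original error, going by the tempfile documentation: on Windows, a NamedTemporaryFile cannot be opened a second time by name while it is still open, and to_csv(temp.name) does exactly that. One way around it is to write into a temporary directory instead, which still cleans up automatically. The sketch below is an assumption-laden illustration: the function name is made up, and the standard csv module stands in for df.to_csv so that the block has no third-party dependency.

```python
import csv
import os
import tempfile


def roundtrip_via_temp_csv(rows):
    """Write rows to a CSV inside a temporary directory and read them back.

    The directory and the file inside it are removed when the
    TemporaryDirectory block exits. Returns the rows read back and the
    (by then deleted) file path.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "data.csv")
        # With pandas, this step would be: df.to_csv(path, index=False)
        with open(path, "w", newline="") as fh:
            csv.writer(fh).writerows(rows)
        # The file is closed here, so it can be reopened by name on any platform
        with open(path, newline="") as fh:
            result = list(csv.reader(fh))
    return result, path
```

Everything that needs the CSV has to happen inside the with block, since the file no longer exists once the block ends.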
Sometimes you need to check that you have permission to read and write at the file path, especially when you use a relative path.
xxx.to_csv('%s/file.csv'%(file_path), index = False)
Sometimes this error occurs simply because another file with the same name already exists and the program has no permission to delete it and replace it with the new file.
So either give the file a different name when saving it,
or
if you are working in Jupyter Notebook or a similar environment, delete the file after executing the cell that reads it into memory, so that no file with that name exists when you execute the cell that writes it to disk.
I encountered the same error. I simply had not yet saved my Python file. Once I saved it in VS Code as "insertyourfilenamehere".py in Documents (which is on my path), I ran my code again and was able to save my data frame as a CSV file.
As far as I know, this error appears when you try to save a file that already exists and is currently open in another program.
Try closing the file first and then rerun the code.
Just give a valid path and a file name
e.g:
final_df.to_csv(r'D:\Study\Data Science\data sets\MNIST\sample.csv')
