Deeptools cannot find file that exists

Deeptools cannot find file that exists - python

An intern of mine is trying to use deeptools' bamCoverage function, and it throws a '/my/dir/data.bam' file does not exist error, despite the file being there. Yes, the file does exist, we can manipulate it using bash commands just fine so there's no real reason for it to throw that error.
According to this thread, it could be an issue with pysam or python. Both are fully up to date. Do you know how I could investigate issues with pysam or python IO further?
All of this is happening on a server that our team uses. Could it be an issue with python's paths for his user session?
For reference, here is the code I'm running in my bash terminal. It's pretty basic:
my.name#server:/mnt/data1/my.name/PROJET_X/DATA/BIGWIG$ bamCoverage -p 8 -b /mnt/data1/my.name/PROJET_X/DATA/BAM_sort/G0-G00.Inputs.sorted.bam -o /mnt/data1/my.name/PROJET_X/DATA/BIGWIG/G0-G00.Inputs.RPGC.bw --normalizeUsing RPGC --effectiveGenomeSize 2913022398 -bs 10
The file '/mnt/data1/my.name/PROJET_X/DATA/BAM_sort/G0-G00.Inputs.sorted.bam' does not exist
my.name#server:/mnt/data1/my.name/PROJET_X/DATA/BIGWIG$ ll /mnt/data1/my.name/PROJET_X/DATA/BAM_sort/G0-G00.Inputs.sorted.bam
-rwxr-xr-x 1 my.name bioinfo 4171366400 juin 22 10:14 /mnt/data1/my.name/PROJET_X/DATA/BAM_sort/G0-G00.Inputs.sorted.bam
I have used this line probably 100 times this past year, but now it won't find the file.
I've also tried installing deeptools in my home directory using conda, but that gives the same errors.
EDIT: Apparently, it was just a problem with that one file. bamCoverage will work on other data files. It would be nice if deeptools would tell you that instead of just "file not found"...

It would be helpful to see what code is being run here - please,
provide a minimal reproducible example, including the full output / error, if at all possible.
Right now, I can think of 4 possible solutions to common problems:
As mentioned in the comments, providing the full path might be one solution. You can get the full path by using the pwd bash command in the folder where the file is stored and then add the file name at the end of the string.
Another thing that sometimes helps with relative (local) paths is to start the path with a ".", so the path would be something like this: './my/dir/data.bam'.
Another problem I've encountered (mostly when running things on Windows environments) was that the backslashes had to be used instead: '\my\dir\data.bam'
For the sake of completeness here, I'll mention that especially with beginning programmers, the path is often erroneously provided without commas (to make it a string), but that does not seem to be the case here.
Hope this helps!

Related

Please help me run a Python script from Excel 365 VBA on a Mac Mini running Big Sur

My first question on this subject was closed for not being specific enough so I'll try to tell you what I have tried so far that hasn't worked. I'm trying to move Excel 2010 workbooks to my new Mac mini Big Sur machine using Office 365. It is my first Mac so I am very new to the operating system and the differences from the Microsoft world. It is also the first time I'm using Office 365. I am hopeful you will be kind enough to help me. In my VBA macros I run Python scripts to scrape web pages for data. Usually I have the Python script create a .csv file which I then open in the VBA macro to read and populate the necessary cells in the workbook. This is the code I use in the PC version to call the Python script:
Kill strFileOfData
Set objWsh = VBA.CreateObject("WScript.shell")
strPython = """C:\Users\Phil\AppData\Local\Programs\Python\Python37-32\python.exe"""
strPyScript = """C:\Users\Phil\Documents\PyScripts\GetMarketsStks1-0.py"""
objWsh.Run strPython & strPyScript
Err = 0
Do Until Dir(StrFileOfData) <> 0
Application.Wait (Now() + TimeValue("0:00:01"))
Loop
It may not be the best but it works reliably on the PC. I delete the data file first, run the Python script, wait for the data file to be created, then continue.
I installed the latest version of Python on the Mac and rewrote the Python script to get more data and ran the Python script on Terminal to make sure it executed properly. It ran fine and created the .csv file correctly.
I then changed the code in the VBA macro to account for the different file structures. This is the new code:
Kill strFileOfData
Set objWsh = VBA.CreateObject("WScript.shell")
strPython = """/Library/Frameworks/Python.framework/Python"""
strPyScript = """/Users/minime/MyDocuments/Finance/GetHistory1-0.py"""
objWsh.Run strPython & strPyScript
Err = 0
Do Until Dir(StrFileOfData) <> 0
Application.Wait (Now() + TimeValue("0:00:01"))
Loop
When I run this on the Mac I get:
Run-Time error '429': ActiveX component can't create object
Suspecting this was a difference in the way shells are created and used I began researching how to call Python from Excel on the Mac. After considerable dead ends I found this thread from 4 years ago:
How can I launch an external python process from Excel 365 VBA on OSX?
I tried to simply plow ahead and follow the instructions. I learned a bit about AppleScript, added the folder: "~/Library/Application Scripts/com.microsoft.Excel/", created the AppleScript named PythonCommand.scpt and placed it in that folder. Since I couldn't find the path in the example I substituted what I thought to be the correct path, assuming it was due to the difference in MacOS from 4 years ago. My AppleScript looks like this:
on PythonCommand(pythonScript)
do shell script "/Library/Frameworks/Python.framework/Python" & pythonScript
end PythonCommand
I then added this code to the VBA macro:
strPyScript = """/Users/minime/MyDocuments/Finance/GetHistory1-0.py"""
Dim result As String
result = AppleScriptTask("PythonCommand.scpt", "PythonCommand", strPyScript)
When I ran the VBA macro. I got this message:
Run-time error '13': Type mismatch
I tried it again with single quotes instead of the triple quotes and got the same result.
I tried to work backward to make sure the pieces worked. I again ran the Python script from a Terminal window with no problem so the next step I tried was running the AppleScript from IDLE. I typed in this:
AppleScriptTask("PythonCommand.scpt", "PythonCommand","/Users/minime/MyDocuments/Finance/GetHistory1-0.py")
and got this result:
Traceback (most recent call last): File "<pyshell#0>", line 1, in
AppleScriptTask("PythonCommand.scpt", "PythonCommand","/Users/minime/My Documents/Finance/GetHistory1-0.py") NameError: name 'AppleScriptTask'
is not defined
After more research I tried this AppleScript in the editor:
do shell script "/Library/Frameworks/Python.framework/Versions/3.9/Python /Users/minime/MyDocuments/Finance/GetHistory1-0.py"
I got this result:
error "sh: /Library/Frameworks/Python.framework/Versions/3.9/Python:
cannot execute binary file" number 126
Next I tried this:
do shell script "Python /Users/minime/MyDocuments/Finance/GetHistory1-0.py"
and got this:
error "Traceback (most recent call last):
File \"/Users/philipackermann/MyDocuments/Finance/GetHistory1-0.py\", line 2, in <module>
from urllib.request import urlopen
ImportError: No module named request" number 1
This might actually be some progress! That import statement is the first line of code in my Python script but I have no clue why this got further than the last attempt and why this is running differently than the execution of the Python script in Terminal.
But I just noticed that in Terminal I enter Python3 so I tried this:
do shell script "Python3 /Users/minime/MyDocuments/Finance/GetHistory1-0.py"
and got a pop up with a script error:
and this:
After more reading about Terminal I realized there are hidden files on the Mac so I tried this AppleScript:
do shell script "/usr/local/bin/Python3 /Users/minime/MyDocuments/Finance/GetHistory1-0.py"
And it worked!! The .csv file is created successfully. So now I need to figure out how to call this from Excel. I found an article here:
https://stackoverflow.com/questions/38723420/how-to-simply-run-an-applescript-task-from-mac-excel-2016
That actually works. I converted the AppleScript into an app (just changed the extension from scpt to app) and moved it to the Applications folder, then put this code into the VBA macro:
ThisWorkbook.FollowHyperlink Address:="/Applications/RunGetHistory.app", NewWindow:=True
This worked! Of course it isn't passing any arguments or receiving any results. One issue though, like my previous PC version, the code doesn't wait for the app to finish so I need a way to check for it to be done and since I changed the Python Script to open the file in the beginning and write lines to it while scraping the web pages the file is created right away so this code which I was using in the PC version isn't sufficient:
Do Until Dir(strFileNmHistory) <> ""
Application.Wait (Now() + TimeValue("0:00:01"))
Loop
I suppose I could add a separate file written at the end of the Python script just to say it's done but that's a pretty lame hack. So technically I guess I could say I solved this and CAN run a Python script from Excel but I have to believe there is a better way and I'm sure this community has folks much smarter than me who can make this better. Has anyone been able to get the solution from 4 years ago to work? That might solve the problems of arguments, results, and waiting for the Python script to end. Any suggestions you can make would be most helpful. I have researched this issue extensively and some answers said that sandboxing is preventing Excel from running Python but if we can run an app that runs the Python script I guess that gets around it. If you can comment on the specific errors I encountered above I'll try different approaches and report back.
Please help.
Phil

SOLVED! ...and I learned a lot along the way! I'll try to lay out only the steps needed to make this work but I do tend to ramble, sorry. One issue that made this harder than it needed to be was a simple one: I was missing a space in the "do shell script" statement of the AppleScript. Here is the VBA code that works for me:
Result = AppleScriptTask("PythonCommand.scpt", "PythonCommandHandler", "/Users/minime/MyDocuments/Finance/PythonScripts/GetHistory1-0.py")
Here is the AppleScript code:
on PythonCommandHandler(pythonScript)
do shell script "/usr/local/bin/Python3 " & pythonScript
return "Handler succeeded! " & pythonScript
end PythonCommandHandler
Note the space after "Python3". Create the AppleScript in a convenient folder and copy it to
"/Users/minime/Library/Application Scripts/com.microsoft.Excel"
Trying to edit it in that folder is problematic, trust me. Move it there.
The Python script can be anywhere as long as you give the path in the AppleScriptTask command in the VBA code.
Be sure all references to .csv files point to an Excel sandbox folder. I used this one but I suspect others would work too:
"/Users/minime/Library/Containers/com.microsoft.Excel/Data/Documents"
See notes below if you can't find this folder. This worked on 7/1/2021! We'll see what happens when the new OS X comes out. The call to AppleScriptTask waits for the full execution of the Python script before continuing. I haven't figured out a good way to handle errors in the Python script yet.
Here are some notes on important things I learned along the way that might be helpful for first time Mac users like myself:
CHANGE YOUR VBA EDITOR FONT. Ok, not really necessary but the default font for the VBA editor on the Mac Excel 365 version I'm using was not a proportional font so things like the "as" part of the Dim statements wouldn't line up for me. Minor maybe but a big annoyance easily solved: In the editor, click the word "Excel" in the Menu Bar (the top row of words and icons.) Click "Preferences" and a window will pop up with the title "Options". Click the "Editor Format" tab and select "Courier New" from the "Font:" drop down. I changed the "Size:" to 16. Oh and by the way, "Command, shift, i" will step you through the VBA code like F8 does on the PC.
PUT SOME OF THE FILES IN THE SANDBOX. A word about the "sandbox" which I'm sure others could explain better than I. To improve security the sandbox concept is supposed to restrict the code from executing something outside it's expected area: e.g. you wouldn't want an error in your code (or malicious code) to change some system settings that had nothing to do with what you intended. I get that. Good concept. In ancient times (mainframes) we got a SOC 1 for that type of issue. The sandbox basically is a portion of the computer in which you are allowed to play. Don't try to go outside of it (there are exceptions I don't fully understand yet). On the Mac this means that if you even try to do anything with a .csv file in your VBA code that isn't in the sandbox you will get an alert saying you need to grant permission for that to happen. The Excel workbook with the VBA code doesn't need to be in the sandbox because what matters is that the Excel executable is in the sandbox (I assume) and is allowed to reach out to your workbook (an exception to the rule?) and do all the things you asked it to including run your VBA code, with limitations. The AppleScript also needs to be in the sandbox but the Python script does not (I don't know why). We are only interested here with the Excel sandbox, there are, apparently, different ones for different apps. Now a word about aliases. For simplicity consider your Excel sandbox to be here:
"/Users/philipackermann/Library/Containers/com.microsoft.Excel"
There is an alias located here:
"/Users/philipackermann/Library/Containers/com.microsoft.Excel/Data/Library/Application Scripts/com.microsoft.Excel"
Moving a file to one will, by magic, move the file to the other one as well. By the way, the first "com.microsoft.Excel" in that alias path will show up as "Microsoft Excel" in Finder if you are trying to look for it by walking the path from the beginning. Oh by the way, "Command, shift, ." will let you see the hidden folders and files in Finder. You'll need that to get started. If I understand this correctly, the folder icons in Finder with the little arrow in the lower left corner indicate that the folder is an alias (or has an alias pointing to it? I'm not sure). The folder I used for the .csv files is not an alias or doesn't have an alias) so I had to use the really long path. So the Excel workbook and the Python script don't need to be in the sandbox but the AppleScript and the .csv files do need to be there.
DON'T EDIT IN THE SANDBOX ALIAS FOLDER! Check out the link I mentioned in the comments above with VaughanR. He helped point the way for me to understand the alias issue better. Thank you VaughanR. Trying to edit in those folders was a madding exercise in futility. Save yourself the trauma and edit your AppleScript in a local folder and copy it to the sandbox folder. I did this by opening two Finder windows (one with the local folder and one with the sandbox folder) and holding down the Command and Option keys while dragging the AppleScript file from the local folder to the sandbox folder.
TEST THE APPLESCRIPT IN IN THE SCRIPT EDITOR. As I mentioned somewhere else, that pesky Error 5: can pop up for a number of reasons. In the script editor add this line above the PythonCommandHandler on statement:
"PythonCommandHandler("GetHistory1-0.py")"
and run it in the editor to be sure it is doing what you want it to do. That's how I found the problem with the missing space and tested the different paths to get it running.
MISC. EXCEL NOTES. If you are trying to bring Excel workbooks from a Windows environment to the Mac watch out for these issues I have run into so far: SORTS. I had hardcoded sorts in my VBA code that caused Excel to crash when I tried to run them on the Mac. I recorded a sort and that works fine in the code. There are NO USERFORMS on the Mac! I have not yet found an alternative to use.

Is there any way to change the name of creator of a module?

I have created a code that automates some kind of task using python.
So for that I imported a module named Pywhatkit.
But after importing when I run the program first I get this(shown in the attached image below) note from creator of Pywhatkit(highlighted part in the attachment below)and then I am able to run my program.
Sometimes this seems unprofessional.
So is there any way too either change the name that appears on running the program or is there any way with which I can remove this part?

Python modules are generally written in python. If you need to make a modification like this, you can usually just look up where the module is installed and change the relevant bit of the code (usually in your python version's site_modules directory, in a folder named the same as the module, where the file __init__.py is the first thing executed, so that's a good place to start if you're blindly searching for something to change).
In this particular case, do the following:
Run the console command pip show pywhatkit to find the location of the installed pywhatkit module. Should be the third-to-last line of that command's output. I'll call this $pwkdir
Open the file $pwkdir/pywhatkit/mainfunctions.py in your text editor of choice
Comment out lines 300 through 304, and save the file.
the cause of the output you're seeing is a straightforward print() call, so removing it is easy and harmless.
I found the location that needs to be commented out by doing the command grep -nr 'Hello from the' $pwkdir/pywhatkit (i.e. searching for any usages of the observed phrase, because the first three words are enough to identify it), and reading the code.
You will probably need to do this again every time you reinstall or update this module to a new version.
Note that there are other places within the module where it prints to console. You may wish to search for and comment out those lines as well, or disable printing to stdout before importing the module for the first time.

Noted, this "Note from the creator" will be removed from the next update. Meanwhile, you can do as suggested in the other answer, you may consider editing the "mainfunctions.py", commenting the last print statement of that file will stop printing the message.

Flaky file deletion under Windows 7?

I have a Python test suite that creates and deletes many temporary files. Under Windows 7, the shutil.rmtree operations sometimes fail (<1% of the time). The failure is apparently random, not always on the same files, not always in the same way, but it's always on rmtree operations. It seems to be some kind of timing issue. It is also reminiscent of Windows 7's increased vigilance about permissions and administrator rights, but there are no permission issues here (since the code had just created the files), and there are no administrator rights in the mix.
It also looks like a timing issue between two threads or processes, but there is no concurrency here either.
Two examples of (partial) stack traces:
File "C:\ned\coverage\trunk\test\test_farm.py", line 298, in clean
shutil.rmtree(cleandir)
File "c:\python23\lib\shutil.py", line 142, in rmtree
raise exc[0], (exc[1][0], exc[1][1] + ' removing '+arg)
WindowsError: [Errno 5] Access is denied removing xml_1
File "C:\ned\coverage\trunk\test\test_farm.py", line 298, in clean
shutil.rmtree(cleandir)
File "c:\python23\lib\shutil.py", line 142, in rmtree
raise exc[0], (exc[1][0], exc[1][1] + ' removing '+arg)
WindowsError: [Errno 3] The system cannot find the path specified removing out
On Windows XP, it never failed. On Windows 7, it fails like this, across a few different Python versions (2.3-2.6, not sure about 3.1).
Anyone seen anything like this and have a solution? The code itself is on bitbucket for the truly industrious.

It's a long shot, but are you running anything that scans directories in the background? I'm thinking antivirus/backup (maybe Windows 7 has something like that built in? I don't know). I have experienced occasional glitches when deleting/moving files from the TSVNCache.exe process that TortoiseSVN starts -- seems it watches directories for changes, then presumably opens them for scanning the files.

We had similar problems with shutil.rmtree on Windows, particularly looking like your first stack trace. We solved it by using an exception handler with rmtree. See this answer for details.

Just a thought, but if the test behavior (creating and deleting lots of temp files) isn't typical of what the app actually does, maybe you could move those test file operations to (c)StringIO, and keep a suite of function tests that exercises your app's actual file creation/deletion behavior.
That way, you can make sure that your app behaves correctly, without introducing extra complexity not related to the app.

My guess is that you should check up on the code that creates the file, and make SURE they are closed explicitly before moving on to delete it. If nothing is obvious there in the code, download a copy of Process Monitor and watch what's happening on the file system there. This tool will give you the exact error code coming from Windows, and should shed some light on the situation.

That "The system cannot find the path specified:" error will appear intermittently if the path is too long for Windows (260 chars). Automated tasks often create folder hierarchies using relative references that produce fully qualified paths longer than 260 characters. Any script that attempts to delete those folders using fully qualified paths will fail.
I built a quick workaround that used relative path references, but don't have generic code solution to offer, just a warning that the excellent answers provided might not help you.

I met same problem with shutil.rmtree command and this issue maybe cause by special file name.(Ex:Other country language:леме / Ö)
Please use the following format to delete the directory which you want:
import shutil
shutil.rmtree(os.path.join("<folder_name>").decode('ascii'))
Enjoy it !

Bash alias to Python script -- is it possible?

The particular alias I'm looking to "class up" into a Python script happens to be one that makes use of the cUrl -o (output to file) option. I suppose I could as easily turn it into a BASH function, but someone advised me that I could avoid the quirks and pitfalls of the different versions and "flavors" of BASH by taking my ideas and making them Python scripts.
Coincident with this idea is another notion I had to make a feature of legacy Mac OS (officially known as "OS 9" or "Classic") pertaining to downloads platform-independent: writing the URL to some part of the file visible from one's file navigator {Konqueror, Dolphin, Nautilus, Finder or Explorer}. I know that only a scant few file types support this kind of thing using some other command-line tools (exiv2, wrjpgcom, etc). Which is perfectly fine with me as I only use this alias to download single-page image files such as JPEGs anyways.
I reckon I might as well take full advantage of the power of Python by having the script pass the string which is the source URL of the download (entered by the user and used first by cUrl) to something like exiv2 which could write it to the Comment block, EXIF User Comment block, and (taking as a first and worst example) Windows XP's File Description field. Starting small is sometimes a good way to start.
Hope someone has advice or suggestions.
BZT

The relevant section from the Bash manual states:
Aliases allow a string to be
substituted for a word when it is used
as the first word of a simple command.
So, there should be nothing preventing you from doing e.g.
$ alias geturl="python /some/cool/script.py"
Then you could use it like any other shell command:
$ geturl http://example.com/excitingstuff.jpg
And this would simply call your Python program.

I thought Pycurl might be the answer. Ahh Daniel Sternberg and his innocent presumptions that everybody knows what he does. I asked on the list whether or not pycurl had a "curl -o" analogue, and then asked 'If so: How would one go about coding it/them in a Python script?' His reply was the following:
"curl.setopt(pycurl.WRITEDATA, fp)
possibly combined with:
curl.setopt(pycurl.WRITEFUNCITON, callback) "
...along with Sourceforge links to two revisions of retriever.py. I can barely recall where easy_install put the one I've got; how am I supposed to compare them?
It's pretty apparent this gentleman never had a helpdesk or phone tech support job in the Western Hemisphere, where you have to assume the 'customer' just learned how to use their comb yesterday and be prepared to walk them through everything and anything. One-liners (or three-liners with abstruse links as chasers) don't do it for me.
BZT

Python file read problem

file_read = open("/var/www/rajaneesh/file/_config.php", "r")
contents = file_read.read()
print contents
file_read.close()
The output is empty, but in that file all contents are there. Please help me how to do read and replace a string in __conifg.php.

Usually, when there is such kind of issues, it is very useful to start the interactive shell and analyze all commands.
For instance, it could be that the file does not exists (see comment from freiksenet) or you do not have privileges to it, or it is locked by another process.
If you execute the script in some system (like a web server, as the path could suggest), the exception could go to a log - or simply be swallowed by other components in the system.
On the contrary, if you execute it in the interactive shell, you can immediately see what the problem was, and eventually inspect the object (by using help(), dir() or the module inspect). By the way, this is also a good method for developing a script - just by tinkering around with the concept in the shell, then putting altogether.
While we are here, I strongly suggest you usage of IPython. It is an evolution of the standard shell, with powerful aids for introspection (just press tab, or a put a question mark after an object). Unfortunately in the latest weeks the site is not often not available, but there are good chances you already have it installed on your system.

I copied your code onto my own system, and changed the filename so that it works on my system. Also, I changed the indenting (putting everything at the same level) from what shows in your question. With those changes, the code worked fine.
Thus, I think it's something else specific to your system that we probably cannot solve here (easily).

Would it be possible that you don't have read access to the file you are trying to open?

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.