I am writing a Python script with the following objectives:
Starting from current working directory, change directory to child directory 'A'
Make slight adjustments to a fort.4 file
Run a Fortran binary file (the syntax of which is ../../../../ continuing until I hit the folder containing the binary); return to 2. until my particular objective is complete, then
Back out of child directory to parent, then enter another child directory and return to 2. until I have iterated through all the folders in question.
The code is coming along well. I am having to rely heavily upon Python's OS module for the directory work. However, I have never had any experience a) making minor adjustments of a file using python and b) running an executable. Could you guys give me some ideas on Python modules, direct me to a similar stack source etc, or perhaps give ways that this can be accomplished? I understand this is a vague question, so please ask if you do not understand what I am asking and I will elaborate. Also, the changes I have to make to this fort.4 file are repetitive in nature; they all happen at the same position in the file.
Cheers
EDIT::
entire fort.4 file:
file_name
movie1.dat !name of a general file the binary reads
nbr_box ! line3-8 is general info
2
this_box
1
lrdf_bead
.true.
beadid1
C1 !this is the line I must change
beadid2
F4 !This is a second line I must change
lrdf_com
.false.
bin_width
0.04
rcut
7
So really, I need to change "C1" to "C2" for example. The changes are very insignificant to make, but I must emphasize the fact that the main fortran executable reads this fort.4, as well as this movie1.dat file that I have already created. Hope this helps
Ok so there is a few important things here, first we need to be able to manage our cwd, for that we will use the os module
import os
whenever a method operates on a folder it is important to change directories into the folder and back to the parent folder. This can also be achieved with the os module.
def operateOnFolder(folder):
os.chdir(folder)
...
os.chdir("..")
Now we need to do some method for each directory, that comes with this,
for k in os.listdir(".") if os.path.isdir(k):
operateOnFolder(k)
Finally in order to operate on some preexisting FORTRAN file we can use the builtin file operators.
fileSource = open("someFile.f","r")
fileText = fileSource.read()
fileSource.close()
fileLines = fileText.split("\n")
# change a line in the file with -> fileLines[42] = "the 42nd line"
fileText = "\n".join(fileLines)
fileOutput = open("someFile.f","w")
fileOutput.write(fileText)
You can create and run your executable output.fx from source.f90::
subprocess.call(["gfortran","-o","output.fx","source.f90"])#create
subprocess.call(["output.fx"]) #execute
Related
When I run the code below, I get different results when I run in .py compared to when I run in .exe using pyinstaller
import win32com.client
import os
ConfigMacroName = "test.xls"
xl=win32com.client.Dispatch("Excel.Application")
Configmacrowb = xl.Workbooks.Open(os.getcwd()+ "\\Completed\\" + ConfigMacroName)
SlotPlansheet = Configmacrowb.Sheets("SlotPlan")
Header = SlotPlansheet.Rows(1)
SOcol = Header.Find('SO', LookAt=1).Column #I used LookAt=1 which is equivalent to LookAt:=xlWhole in excel VBA
SOlinecol = Header.Find('SO Line').Column
print("SO is " + str(SOcol) + "\nSo line is " + str(SOlinecol))
SlotPlansheet = None
Configmacrowb.Close(False)
Configmacrowb = None
xl.Quit()
xl = None
The excel input
The output in .py
The output in .exe
The output in .py file is the correct output I need. If I run it in .exe there will be duplicate variable since they both will refer to column B. For temporary solution I can just loop through the header to check each cell.
But I'm using find() function a lot so I don't know if my other programs are also affected by this inconsistency
Try changing the object creation line to:
xl=win32com.client.gencache.EnsureDispatch(‘Excel.Application’)
In my experience, the win32com.client.Dispatch() function can sometimes cause issues in that it does not guarantee the same result every time it runs. The caller doesn't know if they have an early- or late-bound object. If you have no cached makepy files then you will get a late-bound IDispatch automation interface, but if win32com finds an early-bound interface then it will use it (even if it wasn't your programme that created it). Hence code that ran fine previously may stop working.
Unless you have a good reason to be indifferent, I think it is better to be explicit and choose win32com.client.gencache.EnsureDispatch() or win32com.client.dynamic.Dispatch() for early- or late-binding respectively. I generally choose the EnsureDispatch() route, as it is quicker, enforces case-sensitivity, and gives access to any constants in the type library (eg win32com.client.constants.xlWhole) rather than rely on 'magic' integers.
Also, in the past, I have experienced odd behaviour around indexing (eg this SO question), and this was cured by deleted any gencache wrappers (see below).
Add this line to your debug code:
print('Cache directory:',win32com.client.gencache.GetGeneratePath())
This will tell you where the gencache early-binding python files are being generated, and where win32com.client.Dispatch() will look for any cached wrapper files to attempt early-binding. If you want to clear the cached of generated files just delete the contents of this directory. It will be interesting to see if the OP's two routes have the same directory.
Working with scientific data, specifically climate data, I am constantly hard-coding paths to data directories in my Python code. Even if I were to write the most extensible code in the world, the hard-coded file paths prevent it from ever being truly portable. I also feel like having information about the file system of your machine coded in your programs could be security issue.
What solutions are out there for handling the configuration of paths in Python to avoid having to code them out explicitly?
One of the solution rely on using configuration files.
You can store all your path in a json file like so :
{
"base_path" : "/home/bob/base_folder",
"low_temp_area_path" : "/home/bob/base/folder/low_temp"
}
and then in your python code, you could just do :
import json
with open("conf.json") as json_conf :
CONF = json.load(json_conf)
and then you can use your path (or any configuration variable you like) like so :
print "The base path is {}".format(CONF["base_path"])
First off its always good practise to add a main function to go with each class to test that class or functions in the file. Along with this you determine the current working directory. This becomes incredibly important when running python from a cron job or from a directory that is not the current working directory. No JSON files or environment variables are then needed and you will obtain interoperation across Mac, RHEL and Debian distributions.
This is how you do it, and it will work on windows also if you use '\' instead of '/' (if that is even necessary, in your case).
if "__main__" == __name__:
workingDirectory = os.path.realpath(sys.argv[0])
As you can see when you run your command, the working directory is calculated if you provide a full path or relative path, meaning it will work in a cron job automatically.
After that if you want to work with data that is stored in the current directory use:
fileName = os.path.join( workingDirectory, './sub-folder-of-current-directory/filename.csv' )
fp = open( fileName,'r')
or in the case of the above working directory (parallel to your project directory):
fileName = os.path.join( workingDirectory, '../folder-at-same-level-as-my-project/filename.csv' )
fp = open( fileName,'r')
I believe there are many ways around this, but here is what I would do:
Create a JSON config file with all the paths I need defined.
For even more portability, I'd have a default path where I look for this config file but also have a command line input to change it.
In my opinion passing arguments from command line would be best solution. You should take a look at argparse . This allows you to create nice way to handle arguments from the command line. for example:
myDataScript.py /home/userName/datasource1/
I have a string of Java source code in Python that I want to compile, execute, and collect the output (stdout and stderr). Unfortunately, as far as I can tell, javac and java require real files, so I have to create a temporary directory.
What is the best way to do this? The tempfile module seems to be oriented towards creating files and directories that are only visible to the Python process. But in this case, I need Java to be able to see them too. However, I also want the other stuff to be handled intelligently if possible (such as deleting the folder when done or using the appropriate system temp folder)
tempfile.NamedTemporaryFile and tempfile.TemporaryDirectory work perfectly fine for your purposes. The resulting objects have a .name attribute that provides a file system visible name that java/javac can handle just fine, just make sure to:
Set the suffix appropriately if the compiler insists on files being named with a .java extension
Always call .flush() on the file handle before handing the .name of a NamedTemporaryFile to an external process or it may (usually will) see an incomplete file
If you don't want Python cleaning up the files when you close the objects, either pass delete=False to NamedTemporaryFile's constructor, or use the mkstemp and mkdtemp functions (which create the objects, but don't clean them up for you).
So for example, you might do:
# Create temporary directory for source and class files
with tempfile.TemporaryDirectory() as d:
# Write source code
srcpath = os.path.join(d.name, "myclass.java")
with open(srcpath, "w") as srcfile:
srcfile.write('source code goes here')
# Compile source code
subprocess.check_call(['javac', srcpath])
# Run source code
# Been a while since I've java-ed; you don't include .java or .class
# when running, right?
invokename = os.path.splitext(srcpath)[0]
subprocess.check_call(['java', invokename])
... with block for TemporaryDirectory done, temp directory cleaned up ...
tempfile.mkstemp creates a file that is normally visible in the filesystem and returns you the path as well. You should be able to use this to create your input and output files - assuming javac will atomically overwrite the output file if it exists there should be no race condition if other processes on your system don't misbehave.
Quite simply, I am cycling through all sub folders in a specific location, and collecting a few numbers from three different files.
def GrepData():
import glob as glob
import os as os
os.chdir('RUNS')
RUNSDir = os.getcwd()
Directories = glob.glob('*.*')
ObjVal = []
ParVal = []
AADVal = []
for dir in Directories:
os.chdir(dir)
(X,Y) = dir.split(sep='+')
AADPath = glob.glob('Aad.out')
ObjPath = glob.glob('fobj.out')
ParPath = glob.glob('Par.out')
try:
with open(os.path.join(os.getcwd(),ObjPath[0])) as ObjFile:
for line in ObjFile:
ObjVal.append(list([X,Y,line.split()[0]]))
ObjFile.close()
except(IndexError):
ObjFile.close()
try:
with open(os.path.join(os.getcwd(),ParPath[0])) as ParFile:
for line in ParFile:
ParVal.append(list([X,Y,line.split()[0]]))
ParFile.close()
except(IndexError):
ParFile.close()
try:
with open(os.path.join(os.getcwd(),AADPath[0])) as AADFile:
for line in AADFile:
AADVal.append(list([X,Y,line.split()[0]]))
AADFile.close()
except(IndexError):
AADFile.close()
os.chdir(RUNSDir)
Each file open command is placed in a try - except block, as in a few cases the file that is opened will be empty, and thus appending the line.split() will lead to an index error since the list is empty.
However when running this script i get the following error: "OSError: [Errno 24] Too Many open files"
I was under the impression that the idea of the "with open..." statement was that it took care of closing the file after use? Clearly that is not happening.
So what I am asking for is two things:
The answer to: "Is my understanding of with open correct?"
How can I correct whatever error is inducing this problem?
(And yes i know the code is not exactly elegant. The whole try - except ought to be a single object that is reused - but I will fix that after figuring out this error)
Try moving your try-except inside the with like so:
with open(os.path.join(os.getcwd(),ObjPath[0])) as ObjFile:
for line in ObjFile:
try:
ObjVal.append(list([X,Y,line.split()[0]]))
except(IndexError):
pass
Notes: there is no need to close your file manually, this is what with is for. Also, there is no need to use as os in your imports if you are using the same name.
"Too many open files" has nothing to do with writing semantically incorrect python code, and you are using with correctly. The key is the part of your error that says "OSError," which refers to the underlying operating system.
When you call open(), the python interpreter will execute a system call. The details of the system call vary a bit by which OS you are using, but on linux this call is open(2). The operating system kernel will handle the system call. While the file is open, it has an entry in the system file table and takes up OS resources -- this means effectively it is "taking up space" whilst it is open. As such the OS has a limit to the number of files that can be opened at any one time.
Your problem is that while you call open(), you don't call close() quickly enough. In the event that your directory structure requires you to have many thousands files open at once that might approach this cap, it can be temporarily changed (at least on linux, I'm less familiar with other OSes so I don't want to go into too many details about how to do this across platforms).
I have some code that loads a default configuration file and then allows users to supply their own Python files as additional supplemental configuration or overrides of the defaults:
# foo.py
def load(cfg_path=None):
# load default configuration
exec(default_config)
# load user-specific configuration
if cfg_path:
execfile(cfg_path)
There is a problem, though: execfile() executes directives in the file specified by cfg_path as if it were in the working directory of foo.py, not its own working directory. Thus, import directives might fail if the cfg_path file does, say, from m import x where m is a module in the same directory as cfg_path.
How do I execfile() from the working directory of its argument, or otherwise achieve an equivalent result? Also, I've been told that execfile is deprecated in Python 3 and that I should be using exec, so if there's a better way that I should be doing this, I'm all ears.
Note: I don't think solutions which merely change the working directory are correct. That won't put those modules on the interpreter's module-lookup path, as far as I can tell.
os.chdir lets you change the working directory as you wish (you can extract the working directory of cfg_path with os.path.dirname); be sure to first get the current directory with os.getcwd if you want to restore it when you're done exec'ing cfg_path.
Python 3 does indeed remove execfile (in favor of a sequence where you read the file, compile the contents, then exec them), but you need not worry about that, if you're currently coding in Python 2.6, since the 2to3 source to source translation deals with all this smoothly and seamlessly.
Edit: the OP says, in a comment, that execfile launches a separate process and does not respect the current working directory. This is false, and here's an example showing that it is:
import os
def makeascript(where):
f = open(where, 'w')
f.write('import os\nprint "Dir in file:", os.getcwd()\n')
f.close()
def main():
where = '/tmp/bah.py'
makeascript(where)
execfile(where)
os.chdir('/tmp')
execfile(where)
if __name__ == '__main__':
main()
Running this on my machine produces output such as:
Dir in file: /Users/aleax/stko
Dir in file: /private/tmp
clearly showing that execfile does keep using the working directory that's set at the time execfile executes. (If the file executed changes the working directory, that will be reflected after execfile returns -- exactly because everything is taking place in the same process!).
So, whatever problems the OP is still observing are not tied to the current working directory (it's hard to diagnose what they may actually be, without seeing the code and the exact details of the observed problems;-).