I ran this line to retrieve the paths of all the files in a directory and its subdirectories, but I forgot to save the result in a variable, and it would take hours to run again because of the size of the dataset.
list(glob.glob(str(train_root)+'/**/info.txt',recursive=True))
Is the result saved under some variable name reserved by Python, like temp, so that I could just run my_list=temp?
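If you ran the code as a script, Python released the memory once the script finished executing, and you will have to rerun it to get the output. If you ran it in an interactive session (the Python shell, IDLE, IPython or Jupyter), the value of the last expression evaluated is kept in the variable _ (a single underscore), so my_list = _ recovers it, provided you have not evaluated another expression since.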
If you ran this in IDLE, then you can certainly save to a file with something like:
f = open('<output_file_name>','w') and then print(<string_of_data>, file=f) or f.write(<string_of_data>). Finish off with f.close()
Running that will save an output text file containing that data. You may need to convert the list to a string first and specify a path for the output file.
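As a minimal sketch of the file approach, assuming the result is a list of path strings: the helper names save_paths and load_paths are made up for illustration. Writing one path per line lets a later session reload the list without repeating the glob.

```python
from pathlib import Path


def save_paths(paths, output_file):
    """Write one path per line so the list can be reloaded later."""
    text = "".join(path + "\n" for path in paths)
    Path(output_file).write_text(text, encoding="utf-8")


def load_paths(output_file):
    """Read the saved file back into a list of path strings."""
    return Path(output_file).read_text(encoding="utf-8").splitlines()
```

This assumes no path contains a newline character; if that is a concern, a format such as JSON would be a safer choice.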
I've been attempting to execute a certain CLI from within python and store the output for later use within the same script. I suspect this question has a simple answer, but if one wishes to go through the entire pipeline, here is the tool in question.
wget http://rna.urmc.rochester.edu/Releases/current/RNAstructureForLinux.tgz
Extract it with tar xvf as usual, go inside the resulting directory, and execute 'make all'. The executables I use in the bash script are in the 'exe' directory.
I attempted to execute the commands with os.system(), but with little luck. The CLI I am using does, however, seem to be running. The function that executes the os.system() commands contains the following block.
txt = open('home/spectre/tools/RNAstructure/exe/RNAStructure_nucleic_acid.txt',"w")
txt.write('AAGGCTGTCCAGGCGCAATGTGGTGGCTGCTTCTCTGGGGAGTCCTCCAGGCTTGCCCAACCCGGGGCTCCGTCCTCTTGGCCCAAGAGCTACCCCAGCAGCTGACATCCCCCGGGTACCCAGAGCCGTATGGCAAAGGCCAAGAGAGCAGCACGGACATCAAGGCTCCAGAGGGCTTTGCTGTGAGGCTCGTCTTCCAGGACTTCGACCTGGAGCCGTCCCAGGACTGTGCAGGGGACTCTGTCACAGTGAGCTGGGGATGGGGGGGGTCCCGCCAGGACTGTGGCCAGGGAGATTCCCGGGGTTGTGGGAAGTGGCGGTGCCCTGAATCCCCCATCTGGAGGAGGGATGAAT')
os.system(' cd ~/tools/RNAstructure/exe ; ./python_RNA_structure.sh')
nucleotides, structure, MFE = RNAStructure_from_file('home/spectre/tools/RNAstructure/exe/RNAStructure_bracket_output.txt')
The executable *.sh file contains this.
#!/bin/bash
cd ~/tools/RNAstructure/exe
./Fold RNAStructure_nucleic_acid.txt RNAStructure_nucleic_acid_output.txt
./ct2dot RNAStructure_nucleic_acid_output.txt -1 RNAStructure_bracket_output.txt
If I execute the bash script from the command line, the output looks a little like this:
Initializing nucleic acids...
Using auto-detected DATAPATH: "../data_tables" (set DATAPATH to avoid this warning).
done.
98% [==================================================] \ done.
Writing output ct file...done.
Single strand folding complete.
Converting CT file...
Using auto-detected DATAPATH: "../data_tables" (set DATAPATH to avoid this warning).
CT file conversion complete.
If I execute the bash script from the Python file, I get this instead:
Initializing nucleic acids...
Using auto-detected DATAPATH: "../data_tables" (set DATAPATH to avoid this warning).
Error reading sequence. The file did not contain any nucleotides.
Single strand folding complete with errors.
Converting CT file...
Using auto-detected DATAPATH: "../data_tables" (set DATAPATH to avoid this warning).
CT file conversion complete.
It looks an awful lot like my CLI can find the files it needs when run from the terminal, but not when run from Python. I haven't experimented with parameters such as absolute paths. I understood that os.system() would let me execute a bash script, but it is not clear to me why it changes how that script behaves.
What I've done to resolve the problem:
The problem seems to resolve when I reopen the file within the Python script, like so, but I am still working out why:
txt = open('home/spectre/tools/RNAstructure/exe/RNAStructure_nucleic_acid.txt',"w")
txt.write('AAGGCTGTCCAGGCGCAATGTGGTGGCTGCTTCTCTGGGGAGTCCTCCAGGCTTGCCCAACCCGGGGCTCCGTCCTCTTGGCCCAAGAGCTACCCCAGCAGCTGACATCCCCCGGGTACCCAGAGCCGTATGGCAAAGGCCAAGAGAGCAGCACGGACATCAAGGCTCCAGAGGGCTTTGCTGTGAGGCTCGTCTTCCAGGACTTCGACCTGGAGCCGTCCCAGGACTGTGCAGGGGACTCTGTCACAGTGAGCTGGGGATGGGGGGGGTCCCGCCAGGACTGTGGCCAGGGAGATTCCCGGGGTTGTGGGAAGTGGCGGTGCCCTGAATCCCCCATCTGGAGGAGGGATGAAT')
txt = open('home/spectre/tools/RNAstructure/exe/RNAStructure_nucleic_acid.txt')
os.system(' cd ~/tools/RNAstructure/exe ; ./python_RNA_structure.sh')
nucleotides, structure, MFE = RNAStructure_from_file('home/spectre/tools/RNAstructure/exe/RNAStructure_bracket_output.txt')
I am not sure why this resolves the problem; I found this solution serendipitously. I'll update the answer when I figure out why, unless someone wants to beat me to it. It's magic to me for now.
It seems that after opening the file RNAStructure_nucleic_acid.txt for writing and assigning it to the txt variable, I need to reopen it once writing is complete. Otherwise the file is blank when I try printing its contents within the program, yet after the program finishes executing, the file contains the correct text.
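A likely explanation, offered as a sketch rather than a confirmed diagnosis: write() only fills Python's in-memory buffer, and the text reaches the disk when the file is flushed or closed. In the first version the file is still open when os.system() starts the script, so Fold sees an empty file. Rebinding txt to a second open() drops the last reference to the first file object, which in CPython closes (and therefore flushes) it. A with block makes the close explicit. The function name write_then_run and the use of subprocess.run in place of os.system are my own choices for illustration.

```python
import subprocess


def write_then_run(path, sequence, command):
    """Write the sequence file, close it, then run the external command."""
    # Leaving the with block closes the file, which flushes the buffered
    # text to disk before the child process is started.
    with open(path, "w") as txt:
        txt.write(sequence)
    # Only now is it safe for another program to read the file.
    return subprocess.run(command, capture_output=True, text=True)
```

Calling txt.close() (or txt.flush()) right after txt.write() in the original block should have the same effect.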
I am trying to collect every txt file on my computer and print its contents to the terminal when I run the script, but I do not know how to do it. Is there a way to read every txt file on the computer and then print the contents (not just those in a certain folder or directory)?
In Python, the glob module gives you a list of filenames matching a given pattern. For example, glob.glob('dir/*.txt') gives you the filenames in directory dir that end in .txt; to descend into subdirectories as well, use a '**' pattern with recursive=True, as in glob.glob('dir/**/*.txt', recursive=True). You can then open each file and print() it to the terminal. Depending on your OS, you might be able to do it in your terminal without writing a separate script.
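A minimal sketch of the recursive search, using pathlib's rglob as an alternative to glob.glob. The function names are hypothetical, and the root directory is something you would choose yourself; scanning an entire disk this way can take a long time.

```python
from pathlib import Path


def find_text_files(root):
    """Return every .txt file under root, including subdirectories."""
    return sorted(p for p in Path(root).rglob("*.txt") if p.is_file())


def print_text_files(root):
    """Print the name and contents of each .txt file found under root."""
    for path in find_text_files(root):
        print(f"--- {path} ---")
        try:
            # Undecodable bytes are replaced instead of raising an error.
            print(path.read_text(errors="replace"))
        except OSError as exc:
            # Files that cannot be opened are reported and skipped.
            print(f"(could not read: {exc})")
```

For example, print_text_files(Path.home()) would cover your home directory rather than the whole machine.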
I'm using a script from a third party that I can't modify or show (let's call it original.py), which takes a file and produces some calculations. At the end it outputs a result (using the print statement).
Since I have many files, I decided to make a second script that gets all the wanted files and runs them through original.py:
1st get list of all files to run
2nd run each file through the original.py
3rd obtain results from each file
I have the 1st and 2nd steps. However, the end result only contains the calculations from the last file it read.
import sys
import original
import glob
import os
fn=str(sys.argv[1])
for filename in sys.argv[1:]:
    print(filename)
ficheiros = [f for f in glob.glob(fn)]
for ficheiro in ficheiros:
    original.file = bytes(ficheiro,'utf-8')
    original.function()
To summarize:
Knowing I can't change the original script (which outputs its result with a print statement), how can I obtain the results for each loop iteration? Is there a better way than using a for loop?
The first script can be invoked with python original.py
It requires the file to be changed manually inside the script in the original.file line.
This script outputs the result in the console and I redirect it with: python original.py > result.txt
At the moment when I try to run my script, it reads all the correct files in the folder but only returns the results for the last file.
(I tried to reformulate the question; hopefully it's easier to understand.)
The problem was a mistake in the line ficheiros = [f for f in glob.glob(fn)]: fn holds only sys.argv[1], so it was only reading one file and hence only outputting one result.
Thanks for the time.sleep() trick in the comments.
Solved:
I changed the initial part to:
fn=str(sys.argv[1])
ficheiros= []
for filename in sys.argv[1:]:
    ficheiros.append(filename)
    #print(filename)
Now it correctly reads all the files and outputs all the results.
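One way to make the second script work whether or not the shell has already expanded a wildcard is to treat every command-line argument as a pattern. This is only a sketch; expand_patterns is a hypothetical helper, not part of the original scripts.

```python
import glob


def expand_patterns(patterns):
    """Expand each argument with glob; keep it as-is if nothing matches."""
    files = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        # An argument with no wildcard match is passed through unchanged.
        files.extend(matches if matches else [pattern])
    return files
```

In the script above this would be used as ficheiros = expand_patterns(sys.argv[1:]).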
Depending on your operating system there are different ways to take what is printed to the console and append it to a file.
For example, on Linux you could run your script (the one that calls original.py for every file) as python yourfile.py >> outputfile.txt, which appends everything that is printed to outputfile.txt.
The syntax is similar for Windows.
I'm not quite sure what you're asking, but you could try one of these:
Either redirecting all output to a file for later use, by running the script like so: python secondscript.py > outfilename.txt
Or, and this might or might not work for you, redefining the print command to a function that outputs the result how you want, eg:
def print(x):
    # Open in append mode so each call adds a line instead of overwriting
    with open('outfile.txt','a') as f:
        f.write('example: ' + str(x) + '\n')
If you choose the second option, I recommend saving the old print function (oldprint = print) so you can restore and use the regular print later.
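Note that a print function defined in your own script is not picked up by an imported module, which still finds the built-in print. A way to capture what an imported function prints, sketched here with the standard contextlib.redirect_stdout, is to swap sys.stdout for an in-memory buffer during the call. The helper name capture_printed is made up for illustration.

```python
import contextlib
import io


def capture_printed(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()
```

Inside the loop over files this could look like results[ficheiro] = capture_printed(original.function), with results being a dictionary you create beforehand.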
I don't know if I got exactly what you want. You have a first script named original.py which takes some arguments and returns things in the form of print statements and you would like to grab these prints statements in your scripts to do things?
If so, a solution could be the subprocess module:
Let's say that this is original.py:
print("Hi, I'm original.py")
print("print me!")
And this is main.py:
import subprocess
script_path = "original.py"
print("Executing", script_path)
process = subprocess.Popen(["python3", script_path], stdout=subprocess.PIPE)
for line in process.stdout:
    # Each line already ends with a newline, so do not add another
    print(line.decode("utf8"), end="")
To pass arguments to the script, append them to the list given to Popen, e.g. ["python3", script_path, "arg1", "arg2"].
Output:
Executing original.py
Hi, I'm original.py
print me!
and you can grab the lines in the main.py to do what you want with them.
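The same idea can be written with subprocess.run, which waits for the script to finish and hands back its output in one piece. This is a sketch under the assumption that the script accepts the file name as a command-line argument (the third-party original.py in the question may not); run_python and collect_results are hypothetical names.

```python
import subprocess
import sys


def run_python(args):
    """Run the current Python interpreter with args and return its stdout."""
    completed = subprocess.run(
        [sys.executable, *args],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


def collect_results(script_path, filenames):
    """Run the script once per file and map each file name to its output."""
    return {name: run_python([script_path, name]) for name in filenames}
```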
I am trying to run a Python script inside my load script in a Qlik Sense app.
I know that I need to put OverrideScriptSecurity=1 in Settings.ini.
I put
Execute py lib://python/getSolution.py 100 'bla'; // 100 and 'bla' are parameters
and I get no error in Qlik Sense, but the script is not executed (I think), because inside the script I have
f = open("file.xml", "wb")
f.write(xml)
f.close()
and the file is not saved.
If I run the script from the terminal, it is executed properly.
What could go wrong?
By the way, my full path to python interpreter is
C:\Users\Marko Z\AppData\Local\Programs\Python\Python37-32\python.exe
EDIT :
Even if I add this
Set vPythonPath = "C:\Users\Marko Z\AppData\Local\Programs\Python\Python37-32\python.exe";
Set vPythonFile = "C:\Users\Marko Z\Documents\Qlik\Sense\....\getSolution.py";
Execute $(vPythonPath) $(vPythonFile);
I get the same behaviour: no error, but it does not work.
I also see that if I give an incorrect path to the interpreter I get an error, but an incorrect script file does not give me an error (and I am sure it is the right file path).
My python code is
xml = "Marko"
xml = xml.encode('utf-8')
f = open("C:\\Users\\Marko Z\\Test.xml", "wb")
f.write(xml)
f.close()
I figured out what was wrong.
For anyone else with a similar problem:
The problem is the space in the path.
If I move my script to c:\Windows\getSolution.py, it works. I also need to change the Python path to c:\Windows\py.exe,
so the final script looks like:
Execute c:\Windows\py.exe c:\Windows\getSolution.py 100 'bla';
But I still need to figure out how to handle a space in the path...
Strange. With exactly the same Python file and QS script, the result file is generated correctly for me.
The content of my settings.ini:
[Settings 7]
StandardReload=0
OverrideScriptSecurity=1
According to Qlik's documentation, there should be an empty line at the end of the file (point 4 in the list).
I was able to get the below to work in qlik sense:
set vPyExe = C:\Program Files\Python37\python.exe;
set vPyScript = D:\...\PythonScript.py;
Execute
"$(vPyExe)" "$(vPyScript)"
;
I'm currently creating a script that will simply open a program in the SAME directory as the script. I want to have a text file named "target.txt", and basically the script will read what's in "target.txt" and open a file based on its contents.
For example, the text file will contain "program.exe", and the script will read that and open program.exe. The reason I'm doing this is to easily change the program the script opens without having to change the script itself.
The current script I'm using for this is:
import subprocess
def openclient():
    with open("target.txt", "rb") as f:
        subprocess.call(f.read())
    print '''Your file is opening'''
It's giving me an error saying it cannot find target.txt, even though I have it in the same directory. I have tried taking away the .txt, still nothing. This code actually worked before; however, it stopped working for some strange reason. I'm using PythonWin instead of IDLE; I don't know if this is the reason.
There are two possible issues:
1. target.txt probably ends with a newline, which messes up subprocess.call().
2. If target.txt is not in the current directory, you can access the directory containing the currently executing Python file by parsing the magic variable __file__.
However, __file__ is set at script load time, and if the current directory is changed between loading the script and calling openclient(), the value of __file__ may be relative to the old current directory. So you have to save __file__ as an absolute path when the script is first read in, then use it later to access files in the same directory as the script.
This code works for me, with target.txt containing the string date to run the Unix date command:
#!/usr/bin/env python2.7
import os
import subprocess
def openclient(orig__file__=os.path.abspath(__file__)):
    target = os.path.join(os.path.dirname(orig__file__), 'target.txt')
    with open(target, "rb") as f:
        subprocess.call(f.read().strip())
    print '''Your file is opening'''
if __name__ == '__main__':
    os.chdir('foo')
    openclient()
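For Python 3, the same two fixes (strip the trailing newline, and look next to the script rather than in the current directory) can be sketched with pathlib. The helper names below are my own, and passing the command to subprocess.call as a one-element list assumes target.txt holds a single program name without arguments.

```python
import subprocess
from pathlib import Path


def script_directory(module_file):
    """Return the absolute directory containing the given script file."""
    return Path(module_file).resolve().parent


def read_target(directory):
    """Return the program name stored in target.txt, without the newline."""
    return (Path(directory) / "target.txt").read_text().strip()


def open_client(directory):
    """Launch the program named in target.txt found in directory."""
    command = read_target(directory)
    print("Your file is opening")
    return subprocess.call([command])
```

In a script this would be called as open_client(script_directory(__file__)); computing the directory once at import time avoids the problem of the current directory changing later.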