Cannot get the output of pstats - python

I'm trying to use cProfile from: https://docs.python.org/2/library/profile.html#module-cProfile
I can get the data to print, but I want to be able to manipulate and sort the data so that I get just the info I want. To get the data to print I use:
b = cProfile.run("function_name")
But after that runs and prints, b is None, and I cannot figure out where the printed data is so that I can manipulate it. Of course, I can see the data, but in order to analyze it I need to be able to get some sort of output into my IDE. I've tried pstats, but I get error messages. It seems that to use pstats I have to save some sort of file, but I cannot figure out how to run the program and save the results to a file.
UPDATE:
I almost have a solution
cProfile.run('re.compile("foo|bar")', 'restats')
There is a second argument that lets you save the stats to a file, here named 'restats'. Now I should be able to open it and read it.
SOLVED:
cProfile.run("get_result()", 'data_stats')
p = pstats.Stats('data_stats')
p.strip_dirs().sort_stats(-1).print_stats()
p.sort_stats('name')

cProfile.run("get_result()", 'data_stats')
p = pstats.Stats('data_stats')
p.strip_dirs().sort_stats(-1).print_stats()
p.sort_stats('name')
In addition to the first argument, which runs the code, the second argument saves the output to a file. The next line then opens that file. Once the file is open you should be able to see the values of p in your IDE and use normal Python operations to manipulate them.
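For example, here is a minimal sketch of manipulating the stats programmatically (assuming get_result() is the function from the question; note that the stats attribute is an implementation detail of pstats rather than documented API):
import cProfile
import pstats

cProfile.run("get_result()", 'data_stats')
p = pstats.Stats('data_stats')

# Sort by cumulative time and show only the 10 most expensive entries.
p.strip_dirs().sort_stats('cumulative').print_stats(10)

# p.stats is a plain dict keyed by (filename, line number, function name),
# so it can be iterated like any other Python object.
for (filename, line, func), (cc, nc, tt, ct, callers) in p.stats.items():
    print("%s called %d times, cumulative %.3fs" % (func, nc, ct))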

Related

How to save python notebook cell code to file in Colab

TLDR: How can I make a notebook cell save its own python code to a file so that I can reference it later?
I'm doing tons of small experiments where I make adjustments to Python code to change its behaviour, and then run various algorithms to produce results for my research. I want to save the cell code (the actual python code, not the output) into a new uniquely named file every time I run it so that I can easily keep track of which experiments I have already conducted. I found lots of answers on saving the output of a cell, but this is not what I need. Any ideas how to make a notebook cell save its own code to a file in Google Colab?
For example, I'm looking to save a file that contains the entire below snippet in text:
df['signal adjusted'] = df['signal'].pct_change() + df['baseline']
results = run_experiment(df)
All cell code is stored in a list variable named In.
For example, you can print the latest cell with:
print(In[-1]) # show itself
# output: print(In[-1]) # show itself
So you can easily save the content of In[-1] or In[-2] to wherever you want.
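Building on that, here is a minimal sketch that writes a cell's source to a uniquely named file (save_cell and the 'experiment' prefix are hypothetical names, not part of Colab or IPython):
import time

def save_cell(cell_source, prefix='experiment'):
    # Write the given cell source to a uniquely named .py file.
    filename = '%s_%d.py' % (prefix, int(time.time()))
    with open(filename, 'w') as f:
        f.write(cell_source)
    return filename

# Run from the *next* cell, where In[-2] is the previous cell's source:
# save_cell(In[-2])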
Posting one potential solution but still looking for a better and cleaner option.
By defining the entire cell as a string, I can execute it and save to file with a separate command:
cell_str = '''
df['signal adjusted'] = df['signal'].pct_change() + df['baseline']
results = run_experiment(df)
'''
exec(cell_str)
with open('cell.txt', 'w') as f:
    f.write(cell_str)

How to get LibreOffice headless Calc calculate to save new values from uno?

I am trying to open an excel file from python, get it to recalculate and then save it with the newly calculated values.
The spreadsheet is large and opens fine in LibreOffice with the GUI, initially showing old values. If I then do Data -> Calculate -> Recalculate Hard I see the correct values, and I can of course Save As and all seems fine.
But there are multiple large spreadsheets I want to do this for, so I don't want to use a GUI; instead I want to use Python. The following all seems to work to create a new spreadsheet, but it doesn't have the new values (unless I again manually do a Recalculate Hard).
I'm running on Linux. First I do this:
soffice --headless --nologo --nofirststartwizard --accept="socket,host=0.0.0.0,port=8100,tcpNoDelay=1;urp"
Then, here is sample python code:
import uno
local = uno.getComponentContext()
resolver = local.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", local)
context = resolver.resolve("uno:socket,host=localhost,port=8100;urp;StarOffice.ServiceManager")
remoteContext = context.getPropertyValue("DefaultContext")
desktop = context.createInstanceWithContext("com.sun.star.frame.Desktop", remoteContext)
document = desktop.getCurrentComponent()
file_url="file://foo.xlsx"
document = desktop.loadComponentFromURL(file_url, "_blank", 0, ())
controller=document.getCurrentController()
sheet=document.getSheets().getByIndex(0)
controller.setActiveSheet(sheet)
document.calculateAll()
file__out_url="file://foo_out.xlsx"
from com.sun.star.beans import PropertyValue
pv_filtername = PropertyValue()
pv_filtername.Name = "FilterName"
pv_filtername.Value = "Calc MS Excel 2007 XML"
document.storeAsURL(file__out_url, (pv_filtername,))
document.dispose()
After running the above code and opening foo_out.xlsx, it shows the "old" values, not the recalculated ones. I know that calculateAll() takes a little while, as I would expect for the recalculation. But the new values don't seem to actually get saved.
If I open it in Excel it does an auto-recalculate and shows the correct values, and if I open it in LibreOffice and do a Recalculate Hard it shows the correct values. But what I need is to save it, from Python as above, so that it already contains the recalculated values.
Is there any way to do that?
Essentially, what I want to do from python is:
open, recalculate hard, saveas
It seems that this was a problem with an older version of LibreOffice. I was using 5.0.6.2, on Linux, and even though I was recalculating, the new values were not even showing up when I extracted the cell values directly.
However, I upgraded to 6.2 and the problem has gone away, using the same code and the same input files.
I decided to answer my own question, instead of deleting it, as this was a source of frustration until I solved it.
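As a side note, the hard-coded file://foo.xlsx URLs in the snippet above are not the absolute file URLs UNO generally expects. A minimal sketch of building them from ordinary paths (the paths themselves are just placeholders):
import os
import uno

# Convert ordinary filesystem paths into the absolute file URLs that UNO expects.
file_url = uno.systemPathToFileUrl(os.path.abspath('foo.xlsx'))
file_out_url = uno.systemPathToFileUrl(os.path.abspath('foo_out.xlsx'))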

Constant first row of a .csv file?

I have a Python code which is logging some data into a .csv file.
import datetime

logging_file = 'test.csv'
dt = datetime.datetime.now()
f = open(logging_file, 'a')
f.write('\n "{:%H:%M:%S}",{},{}'.format(dt, x, y))
The above code is the core part and this produces continuous data in .csv file as
"00:34:09" ,23.05,23.05
"00:36:09" ,24.05,24.05
"00:38:09" ,26.05,26.05
... etc.,
Now I wish to add the following line as the first row of this data: time, data1, data2. I expect the output to be:
time, data1, data2
"00:34:09" ,23.05,23.05
"00:36:09" ,24.05,24.05
"00:38:09" ,26.05,26.05
... etc.,
I have tried many ways, but none of them produced the result in my preferred format. Please help me solve this problem.
I would recommend writing a class specifically for creating and managing logs. Have it initialize a file, on creation, with the expected first line (don't forget a \n character!), and keep track of any necessary information about that log (the name of the log it created, where it is, etc.). You can then have the class 'write' to the log (append to it, really), you can create new logs as necessary, and you can have it check for existing logs and decide whether to update what exists or scrap it and start over.
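A minimal sketch of that idea (the class name, header text, and file name are just placeholders):
import datetime
import os

class CsvLogger(object):
    def __init__(self, path, header='time,data1,data2'):
        self.path = path
        # Write the header row only if the file does not exist yet.
        if not os.path.exists(path):
            with open(path, 'w') as f:
                f.write(header + '\n')

    def log(self, x, y):
        dt = datetime.datetime.now()
        with open(self.path, 'a') as f:
            f.write('"{:%H:%M:%S}",{},{}\n'.format(dt, x, y))

logger = CsvLogger('test.csv')
logger.log(23.05, 23.05)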

How to operate on unsaved Excel file?

I'd like to automate a loop:
ABAQUS generates an Excel file;
Matlab utilises data in Excel file;
loop 1 and 2.
Now my question is: after step 1, the Excel file from ABAQUS is unsaved, as Book1. I cannot use a Matlab command to save it. Is there a way to use the data in this "Book1" file without saving it? Or can I find where it is, so I can use the data inside? (I assume Excel always saves the file somewhere even if the user doesn't?)
Thank you! 
As agentp mentioned, if you are running Abaqus via a Python script, you can just use Python to create a .txt file to save all the relevant information. If well structured, a .txt file can be as readable as an Excel spreadsheet. Because Matlab and Python have built-in functions to read and write files, this communication can be done easily.
As for Matlab calling Abaqus, you can use something similar to:
system('abaqus cae nogui=YOUR_SCRIPT.py')
Your script that pipes to Excel should have some code similar to this:
abq_ExcelUtilities.excelUtilities.XYtoExcel(
    xyDataNames='S:Mises PI: PART-1-1 E: 4309 IP: 1', trueName='')
Writing the same data to a report (.rpt) file, the code looks like this:
x0 = session.xyDataObjects['S:Mises PI: PART-1-1 E: 4309 IP: 1']
session.writeXYReport(fileName='abaqus.rpt', xyData=(x0, ))
now to "roll your own", use that x0 object: x0.data is a regular python tuple holding the actual data which you can write to a file however you like, eg:
f = open('myfile.csv', 'w')
for point in x0.data:
    f.write('%g,%g\n' % point)
f.close()
(You can comment out or delete the writeXYReport call.)
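Generalising that slightly, here is a hedged sketch that dumps every XY data object in the session to its own CSV file (assuming the session repository can be iterated like a dict, as Abaqus repositories normally allow; output file names are derived from the object names):
# Inside the Abaqus Python script, after the XY data objects exist.
for name, xy in session.xyDataObjects.items():
    # Replace characters that are awkward in file names.
    safe_name = name.replace(':', '_').replace(' ', '_')
    out = open(safe_name + '.csv', 'w')
    for point in xy.data:
        out.write('%g,%g\n' % point)
    out.close()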

From spreadsheet to dictionary in ipython/python & more?

I would like to be able to take data from a file (spreadsheet or other) and create a dictionary that I can then iterate over in a loop for the keys, and have corresponding values inserted in my command for each key. Sorry if that does not make much sense, I will explain in more detail below.
I have several samples that I am running through a bioinformatics pipeline and I am trying to automate the process. One of the steps is adding "read group" information to my files which is done with the following shell command:
picard-tools AddOrReplaceReadGroups I=input.bam O=output.bam RGID=IDXX
RGLB=LBXX RGPL=PLXX RGPU=PUXX RGSM=SMXX VALIDATION_STRINGENCY=SILENT
SORT_ORDER=coordinate CREATE_INDEX=true
For each sample ID there is a different RGID, RGLB, RGPL, RGPU, and RGSM (and different input files, but I already know how to call that info). What I would like to do is have a loop that executes this command for each sample ID and has the corresponding RGID, RGLB, RGPL, RGPU, and RGSM inserted into the command. Is there an easy way to do this? I have been reading a bit and it seems like a dictionary is probably the way to go, but it is not clear to me how to generate the dictionary and substitute the individual values into my command.
This should be pretty easy, but how you do it depends on the format of your input file. You're going to want something basically like this:
import subprocess  # This is how we're going to call the commands.

samples = {}  # Empty dict
with open('inputfile', 'r') as f:
    for line in f:
        # Extract sampleID and the other fields depending on your file format...
        samples[sampleID] = [rgid, rglb, rgpl, rgpu, rgsm]  # Populate dict

for sampleID in samples:
    rgid, rglb, rgpl, rgpu, rgsm = samples[sampleID]
    # Now you can run your commands using the subprocess module.
    # Remember to adjust e.g. the input/output file names based on sampleID if they differ.
    subprocess.call(['picard-tools', 'AddOrReplaceReadGroups', 'I=input.bam',
                     'O=output.bam', 'RGID=%s' % rgid, 'RGLB=%s' % rglb,
                     'RGPL=%s' % rgpl, 'RGPU=%s' % rgpu, 'RGSM=%s' % rgsm,
                     'VALIDATION_STRINGENCY=SILENT', 'SORT_ORDER=coordinate',
                     'CREATE_INDEX=true'])
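If the spreadsheet is exported as CSV, the "extract sampleID" step could look something like this (the file name and column headers are assumptions about your data, not known values):
import csv

samples = {}
with open('samples.csv') as f:
    for row in csv.DictReader(f):
        # Assumed column headers: SampleID, RGID, RGLB, RGPL, RGPU, RGSM
        samples[row['SampleID']] = [row['RGID'], row['RGLB'], row['RGPL'],
                                    row['RGPU'], row['RGSM']]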
