How to link interactive problems (w.r.t. CodeJam)? - python

I'm not sure whether asking for help is allowed (if not, I don't mind waiting for an answer until the competition period is over).
I was solving the Interactive Problem (Dat Bae) on CodeJam. On my local files, I can run the judge (testing_tool.py) and my program (<name>.py) separately and copy paste the I/O manually. However, I assume I need to find a way to do this automatically.
Edit: To make it clear, I want every output of one file to become the input of the other file, and vice versa.
Some details:
I've used sys.stdout.write / sys.stdin.readline instead of print / input throughout my program
I tried running interactive_runner.py, but I can't figure out how to use it.
I tried running it on their server, with my program in the first tab and the judge file in the second. It always throws a TLE error.
I can't find any tutorial on this either, so any help will be appreciated! :/

The usage is documented in comments inside the scripts:
interactive_runner.py
# Run this as:
# python interactive_runner.py <cmd_line_judge> -- <cmd_line_solution>
#
# For example:
# python interactive_runner.py python judge.py 0 -- ./my_binary
#
# This will run the first test set of a python judge called "judge.py" that
# receives the test set number (starting from 0) via command line parameter
# with a solution compiled into a binary called "my_binary".
testing_tool.py
# Usage: `testing_tool.py test_number`, where the argument test_number
# is 0 for Test Set 1 or 1 for Test Set 2.
So use them like this:
python interactive_runner.py python testing_tool.py 0 -- python dat_bae.py
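To make the "every output of one file is the input of the other" idea concrete, here is a minimal sketch of that kind of wiring using only the standard library. It is not the code of interactive_runner.py; the function name run_interactive and the two toy one-line programs are assumptions made for illustration. Note that the toy solution flushes after writing: with sys.stdout.write, an unflushed answer can sit in a buffer while the judge keeps waiting, which is a common cause of TLE in interactive problems.

```python
import subprocess
import sys


def run_interactive(judge_cmd, solution_cmd):
    """Start both commands with each one's stdout wired to the other's stdin."""
    judge = subprocess.Popen(judge_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # The solution reads what the judge prints and writes into the judge's stdin.
    solution = subprocess.Popen(solution_cmd, stdin=judge.stdout, stdout=judge.stdin)
    # The parent drops its own copies of the pipe ends so that each child
    # sees end-of-file as soon as the other child exits.
    judge.stdin.close()
    judge.stdout.close()
    return judge.wait(), solution.wait()


# Toy stand-ins (hypothetical) for testing_tool.py and a solution:
# the judge sends 3 and expects the doubled value back.
JUDGE = (
    "import sys; print(3, flush=True); "
    "sys.exit(0 if sys.stdin.readline().strip() == '6' else 1)"
)
SOLUTION = (
    "import sys; n = int(sys.stdin.readline()); "
    "sys.stdout.write(str(n * 2) + '\\n'); sys.stdout.flush()"
)

judge_code, solution_code = run_interactive(
    [sys.executable, "-c", JUDGE],
    [sys.executable, "-c", SOLUTION],
)
print("judge exit code:", judge_code, "solution exit code:", solution_code)
```

For the real problem, the two command lists would be replaced by the judge and solution commands, in the same order as on the interactive_runner.py command line.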

Related

max_iters doesn't seem to work with GLPK_MI solver in Python

I'm debugging my code right now, and since it runs with some data sets but not with others, I wanted to set the 'max_iters' option to 1 to see whether it works in only 1 iteration or needs more. I realised it doesn't even seem to use the option: I tried passing the string "hello" instead of an int and it still ran. Does anyone know if this is a known problem?
self.prob.solve(solver="GLPK_MI", max_iters=1)
I'm using the CVXPY module with CVXOPT.
EDIT:
I want to do this because I don't get an error, it just continues to run forever. And with the project I'm working on it can take a lot of time to run so I wonder if it's really not working or if it's just a question of time.
Wouldn't it be better to set the max iterations as a variable? (just a suggestion)
In any case, in CVXOPT you need to set the max number of iteration as
'maxiters' : 1
or you can set it as a variable and then call solver as per below
opts = {'maxiters' : 1}
self.prob.solve(solver="GLPK_MI", options = opts)

Snakemake rule stops with 'MissingOutputException' after successful processing of first input

I wrote my first snakemake rule that uses a python script for processing files:
rule sanitize_labels:
    input:
        "data/raw/labels/rois_essence_31_10_2019_final.shp",
        "data/raw/labels/pts_carte_auto_final.shp"
    output:
        "data/interim/labels/rois_essence_31_10_2019_final.csv",
        "data/interim/labels/pts_carte_auto_final.csv"
    params:
        crs = 32189,
        log = True
    script:
        "../../scripts/data/sanitize_labels.py"
It runs successfully for the first file, then stops with this message:
Waiting at most 5 seconds for missing files.
MissingOutputException in line 9 of E:\code\projects\essences\workflow\rules\preprocessing.smk:
Missing files after 5 seconds:
data/interim/labels/pts_carte_auto_final.csv
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Removing output files of failed job sanitize_labels since they might be corrupted:
data/interim/labels/rois_essence_31_10_2019_final.csv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: E:\code\projects\essences\.snakemake\log\2020-02-10T025157.458955.snakemake.log
I tried swapping file order both in input and output; always only the first file gets processed.
In my python script, I refer to input and output as snakemake.input[0] and snakemake.output[0]. If I understand correctly snakemake.input[0] is assigned the current input in each call of the script (no matter what's the number of inputs in the rule). Same goes for snakemake.output[0]. Is that correct?
Do you have other hints at what can cause this error?
I'm running snakemake version 5.10.0 (installed as snakemake-minimal from bioconda channel.).
Thanks a lot for any hint.
Adding to Maarten's answer, once you have specified the generic rule he provided, you then request the final outputs you want as the rule 'all' as your first rule:
rule all:
    input: expand("data/interim/labels/{name}.csv", name=DATASETS)
If you place the input directive in your generic sanitize_labels rule, it is no longer generic. Snakemake expands the input you provide to create the same rule as in your question.
Go through the tutorial again if it's still not clear. While you may think and write your rules from start to finish, snakemake evaluates from finish to start. You request the final outputs in all (as inputs) and snakemake decides what needs to be run. It's confusing at first, but just remember to request your final output in all and keep your rules generic.
I think you need to take a look again at the "idea" behind snakemake. Probably what you need is something like this:
rule sanitize_labels:
    input:
        "data/raw/labels/{name}.shp"
    output:
        "data/interim/labels/{name}.csv"
    params:
        crs = 32189,
        log = True
    script:
        "../../scripts/data/sanitize_labels.py"
Here you do not specify the exact filenames; instead, you tell Snakemake how it can generate a certain output from a certain input. In this case, if you request both data/interim/labels/rois_essence_31_10_2019_final.csv and data/interim/labels/pts_carte_auto_final.csv, Snakemake "understands" how to make these files, and it knows which inputs it needs.
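To illustrate the "from finish to start" evaluation, here is a plain-Python sketch of the bookkeeping involved. It is not Snakemake's implementation: the helper plan_jobs is hypothetical, and DATASETS is assumed to hold the two file stems from the question. Each requested target is matched against the output pattern, and the wildcard value then determines the single input for that job. With the generic rule the script therefore runs once per file, so snakemake.input[0] and snakemake.output[0] are the right things to use. In the original rule, both files belonged to one job, so the script ran once and only wrote the first output.

```python
import re

# File stems taken from the two files in the question (assumed value of DATASETS).
DATASETS = ["rois_essence_31_10_2019_final", "pts_carte_auto_final"]

INPUT_PATTERN = "data/raw/labels/{name}.shp"
OUTPUT_PATTERN = "data/interim/labels/{name}.csv"


def plan_jobs(targets):
    """Work backwards from requested targets to one (input, output) pair each."""
    prefix, suffix = OUTPUT_PATTERN.split("{name}")
    matcher = re.compile(re.escape(prefix) + r"(?P<name>.+)" + re.escape(suffix) + r"$")
    jobs = []
    for target in targets:
        match = matcher.match(target)
        if match is None:
            raise ValueError(f"no rule produces {target}")
        # The wildcard value recovered from the output fixes the input file.
        jobs.append((INPUT_PATTERN.format(name=match.group("name")), target))
    return jobs


# Plays the role of expand("data/interim/labels/{name}.csv", name=DATASETS).
targets = [OUTPUT_PATTERN.format(name=name) for name in DATASETS]
for job_input, job_output in plan_jobs(targets):
    print(job_input, "->", job_output)
```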

Can you delete or replace a python logging message?

I'm looking for a way to replace or delete the last message written by Python's logging module. The goal is to log a change in a variable once it occurs. If the variable changes again, the old log message should be deleted and the new one printed instead.
Hi,
I am using Python's logging module for a deep learning project I'm currently working on. As some GPUs just don't have enough memory to support the default batch size during training, and there is no apparent connection between batch size and actual memory usage that could be used for calculations beforehand, I'm catching the runtime error once it occurs and decreasing the batch size by one.
This process can be repeated quite a bit and I'm always logging which batch size did not work and which will be the next one tried. Instead of having 10-30 of these messages (or more) I'd like to simply delete the last one and replace it with the newer one instead.
I've already checked out the python logging documentation, stumbled upon the LogRecord object, but upon trying to deal with it, it seems this object does not actually keep a record of all logs, but rather saves some more info on one specific log instead.
If there is simply no way to do this, I will look into some kind of bundling solution as described here: Python logging: bundle reoccurring messages
The code below shows the log message I'm looking to replace.
Any help is greatly appreciated.
training_not_successful = True
while training_not_successful:
    try:
        model.run_training(global_settings['epochs'],
                           train_loader,
                           test_loader,
                           global_settings['checkpoint_output_path'],
                           model_name,
                           global_settings['best_net_criterion'])
        training_not_successful = False
    except MemoryError:
        logging.warning("Ran out of CUDA memory using batch size " + str(batch_size) +
                        ". Trying again with batch size " + str(batch_size-1))
        batch_size -= 1
        train_loader, test_loader = get_train_test_loaders(
            train_dataset_list,
            test_dataset_list,
            value_counts,
            batch_size
        )
I believe (correct me if I'm wrong) that the logging module does not allow you to suppress newlines, meaning that it's simply not possible to do something like that.
It is possible to do it with a print though:
import shutil

def display(variable, rewritable=False):
    columns, lines = shutil.get_terminal_size(fallback=(80, 20))
    text = str(variable)
    filled = text + ((columns - len(text)) * ' ')
    print(filled, end='\r' if rewritable else '\n')

if __name__ == "__main__":
    from random import random
    from time import sleep
    for i in range(10):
        display(f"x = {random()}", True)
        sleep(1)
    display("x = 0.0")  # test if old value is overwritten completely
    display("Done!")
Tested this on Linux, but it (in particular the shutil.get_terminal_size function) should work everywhere.
It's not mandatory, but it is very nice when the entire line is overwritten as opposed to only the part that changed.
The key is the character \r: it returns the cursor to the beginning of the line, that's it. Now you can start writing from the front again, overwriting whatever the line already contains, which is exactly what you want.
The display function is simple, but I'll explain it anyway:
The first line gets the terminal size. What we need is the width of the line, so that we can pad the text with spaces and completely overwrite the previous line no matter what it contained.
Then we convert our variable to a string.
After that, it's just simple math: our string takes n characters, so the rest should be spaces. We add width - n spaces to the final string and print it, so the entire line is overwritten.
The rewritable flag controls whether the line will be overwritten the next time you call display.
This is not exactly what you want, since it does not use the logging module, and I know of no way to make the logging module print \r instead of \n. Still, I think it is a good enough substitute that could be used if it turns out that you cannot do this with the logging module.
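For completeness: logging.StreamHandler has had a terminator attribute since Python 3.2, so a handler can be told to end each record with \r instead of \n. The sketch below is only an illustration of that attribute; the logger name, the helper make_rewriting_logger and the example batch sizes are assumptions, and an in-memory buffer stands in for the terminal so the effect can be inspected.

```python
import io
import logging


def make_rewriting_logger(stream, name="rewrite_demo"):
    """Return a logger whose records end with a carriage return, not a newline."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from the root logger's handlers
    handler = logging.StreamHandler(stream)
    handler.terminator = "\r"  # the default is "\n"
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()  # stand-in for sys.stderr
log = make_rewriting_logger(buffer)
for batch_size in (32, 31, 30):
    log.warning("Ran out of memory at batch size %d, trying %d", batch_size, batch_size - 1)
```

On a real terminal each message would be drawn over the previous one. A shorter message leaves the tail of a longer one visible, so padding to the terminal width as in display above is still useful, and any handler writing to a file will of course still record every message.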

LLDB: Set breakpoint at offset from function start using python api

I have a lldb python module with a simple setup:
def __lldb_init_module(debugger, dict):
    target = debugger.GetSelectedTarget()
    target.DeleteAllBreakpoints()
    process = target.GetProcess()
I can easily set a breakpoint at the start of a function using:
breakpoint = target.BreakpointCreateByName("functionName", "moduleName")
breakpoint.SetScriptCallbackFunction( "python_module.bp_hit" )
and I know this works because my bp_hit function is called correctly.
However, I really need the breakpoint set at X number of bytes from the start of the address of functionName. If I knew the address of the functionName, I could simply add X to the address and use BreakpointCreateByAddress.
What is the python code that will provide me the address of the functionName?
Use SBTarget::FindFunctions to find the SBSymbolContext(s) that match your function name. That returns a list of SBSymbolContext matches (since there may be more than one). The symbol or function held by each context (SBSymbolContext::GetSymbol or SBSymbolContext::GetFunction) has a GetStartAddress method that will give you an SBAddress for its start. Then use SBAddress::OffsetAddress to add your offset. There is an SBTarget::BreakpointCreateByAddress, but annoyingly enough it only takes an lldb::addr_t, not an SBAddress. You can get an lldb::addr_t from an SBAddress with SBAddress::GetLoadAddress(), passing in your target.
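The steps above could be put together roughly as follows. This is a sketch under assumptions rather than tested debugger code: the helper name break_at_offset is made up, the start address is taken from the symbol of each match, and the constant mirrors lldb.LLDB_INVALID_ADDRESS so that the function only relies on the objects passed in. Load addresses are only meaningful once the module is loaded in a running process, so matches without a valid load address are skipped.

```python
LLDB_INVALID_ADDRESS = 0xFFFFFFFFFFFFFFFF  # same value as lldb.LLDB_INVALID_ADDRESS


def break_at_offset(target, function_name, offset):
    """Set a breakpoint `offset` bytes past the start of every match of function_name."""
    breakpoints = []
    matches = target.FindFunctions(function_name)  # SBSymbolContextList
    for index in range(matches.GetSize()):
        context = matches.GetContextAtIndex(index)  # SBSymbolContext
        address = context.GetSymbol().GetStartAddress()  # SBAddress
        if not address.IsValid():
            continue
        address.OffsetAddress(offset)  # moves the SBAddress in place
        load_address = address.GetLoadAddress(target)  # lldb::addr_t
        if load_address == LLDB_INVALID_ADDRESS:
            continue  # module not loaded yet, nothing to break on
        breakpoints.append(target.BreakpointCreateByAddress(load_address))
    return breakpoints
```

Inside __lldb_init_module this would be called as break_at_offset(target, "functionName", X), and the callback can then be attached to each returned breakpoint with SetScriptCallbackFunction as in the question.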
An alternative to Jim's answer is to use the FindSymbols function of SBTarget.

How to check python code including libraries?

I'm working on some machine learning code and today I lost about 6 hours because of a simple typo.
It was this:
numpy.empty(100,100)
instead of
numpy.empty([100,100])
As I'm not really used to numpy, I forgot the brackets. The code happily crunched the numbers and at the end, just before saving the results to disk, it crashed on that line.
Just to put things in perspective, I code on a remote machine in a shell, so an IDE is not really an option. Also, I doubt an IDE would catch this.
Here's what I already tried:
running pylint - well, pylint kind of works. After I disabled everything apart from errors and warnings, it even seems to be useful. But pylint has a serious issue with imported modules. As seen on the official bug tracker, the devs know about it but cannot or won't do anything about it. There is a suggested workaround, but ignoring the whole module would not help in my case.
running pychecker - if I create a code snippet with the mistake I made, pychecker reports an error, the same error as the Python interpreter. However, when I ran pychecker on the actual source file (~100 LOC), it reported other errors (unused vars, unused imports, etc.), but the faulty numpy line was skipped.
Finally, I tried pyflakes, but it does even less checking than the pychecker/pylint combo.
So is there any reliable method which can check code in advance? Without actually running it.
A language with stronger type checking would have been able to save you from this particular error, but not from errors in general. There are plenty of ways to go wrong that pass static type checking. So if you have computations that take a long time, it makes sense to adopt the following strategies:
Test the code from end to end on small examples (that run in a few seconds or minutes) before running it on big data that will consume hours.
Structure long-running computations so that intermediate results are saved to files on disk at appropriate points in the computation. This means that when something breaks, you can fix the problem and restart the computation from the last save point.
Run the code from the interactive interpreter, so that in the event of an exception you are returned to the interactive session, giving you a chance of being able to recover the data using a post-mortem debugging session. For example, suppose I have some long-running computation:
def work(A, C):
    B = scipy.linalg.inv(A)  # takes a long time when A is big
    return B.dot(C)
I run this from the interactive interpreter and it raises an exception:
>>> D = work(A, C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "q22080243.py", line 6, in work
    return B.dot(C)
ValueError: matrices are not aligned
Oh no! I forgot to transpose C! Do I have to do the inversion of A again? Not if I call pdb.pm:
>>> import pdb
>>> pdb.pm()
> q22080243.py(6)work()
-> return B.dot(C)
(Pdb) B
array([[-0.01129249, 0.06886091, ..., 0.08530621, -0.03698717],
[ 0.02586344, -0.04872148, ..., -0.04853373, 0.01089163],
...,
[-0.11463087, 0.15048804, ..., 0.0722889 , -0.12388141],
[-0.00467437, -0.13650975, ..., -0.13894875, 0.02823997]])
Now, unlike in Lisp, I can't just set things right and continue the execution. But at least I can recover the intermediate results:
(Pdb) D = B.dot(C.T)
(Pdb) numpy.savetxt('result.txt', D)
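The second strategy, saving intermediate results so that a failed run can resume from the last save point, can be sketched with the standard library alone. The helper name cached, the pickle format and the toy slow_step function are assumptions made for illustration; for large arrays, a numpy-specific format may be a better fit.

```python
import os
import pickle
import tempfile


def cached(path, compute):
    """Return the result stored at `path`, computing and saving it first if absent."""
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    result = compute()
    # Write to a temporary file first so a crash cannot leave a half-written checkpoint.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        pickle.dump(result, handle)
    os.replace(tmp_path, path)
    return result


calls = []


def slow_step():
    calls.append(1)  # stands in for hours of number crunching
    return [x * x for x in range(5)]


checkpoint = os.path.join(tempfile.mkdtemp(), "step1.pkl")
first = cached(checkpoint, slow_step)
second = cached(checkpoint, slow_step)  # loaded from disk, slow_step is not called again
```

If a later step crashes, rerunning the script repeats only the steps whose checkpoint files are missing.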
Do you use unit tests? There is really no better way.
