PyCharm: pass console variables to script - Python

I'm running a script which loads a huge amount of data from pickles.
Because of the size of this data, running the script takes a long time, which in turn makes it very hard to work with (especially to debug).
To solve this, I thought about passing some of the variables defined in the console to the script. That would let me load the pickles only once and hand the loaded objects to the script every time I want to use their data.
I tried to find a way to do this but couldn't find any.
Is there any way of passing console variables to a script?

Never mind. I can just create a function in the script and call it from the console, instead of relying on a __main__ block.
For example, for a script A.py, add a function b(params) and then in the console just run
from <pathToA>.A import *
b(params)
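For instance, A.py itself can be laid out like this (a minimal sketch; b and params are just the hypothetical names from above):

# A.py -- the work lives in b() instead of under a __main__ block
def b(params):
    # logic that would otherwise sit under if __name__ == '__main__':
    print("processing with", params)

The pickles can then be loaded once in the console and passed to b(...) as often as needed; the script itself never has to reload them.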

Related

Is it possible to initialise a module before running a python program?

I wrote a python program which uses a module (pytesseract, specifically) and I notice it takes a few seconds to import the module when I run it. I am wondering if there is a way to initialise the module before running the main program, in order to cut the duration of the actual program by a few seconds. Any suggestions?
One possible solution for the slow startup time would be to split your program into two parts: one that is always running as a daemon or service, and another that communicates with it to process individual tasks.
As a quick answer without more info, pytesseract also imports (if they are installed) PIL, numpy, and pandas. If you don't need these, you could uninstall them to reduce load time.
I presume that you need to start your application multiple times with different arguments, and you don't want to waste time on imports every time, right?
You can wrap the actual code in a while True: loop and use input() to get new arguments. Or read the arguments from a file.
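A minimal sketch of that idea (assuming the slow import is pytesseract and each argument is an image path):

from PIL import Image
import pytesseract  # the slow import happens exactly once

while True:
    path = input("image path (blank line to quit): ")
    if not path:
        break
    # the actual work reuses the already-imported module
    print(pytesseract.image_to_string(Image.open(path)))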

How can I initialise an object to be used in multiple Python calls from the command line

I have a script I've written that uses a very large object. I load the object with pickle, but it takes quite a few seconds to do so. That's not a big deal if it has to happen once or twice, but I'm hoping to use the code many hundreds or thousands of times!
I think my issue is that I'd like to almost 'leave' the object alive and then be able to call it from the command line whenever I need it. I'm reasonably new to Python, so I'm not sure how possible that is; sorry if I haven't used the right terminology in my question. I'm writing and running my python in Spyder at the moment, but eventually I'd like to run it on a server, calling the code as and when required.
If your script is looping over the python program, move the loop inside the program.
If on the other hand, you want to be able to use the large object on demand, you probably need a client/server configuration. Thriftpy is a very simple way to achieve this. The thriftpy server will hold the object and the processing logic, and the client will be a command line script that will call the server and pass whatever parameters you need to process the object.
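For illustration, a minimal sketch along the lines of thriftpy2's ping-pong example (thriftpy2 is the maintained fork; the process.thrift IDL file, the service name, and the pickle path are all assumptions):

# process.thrift would contain:
#   service BigObject {
#       string process(1: string param),
#   }

# server.py -- loads the object once and keeps it resident
import pickle
import thriftpy2
from thriftpy2.rpc import make_server

big_thrift = thriftpy2.load("process.thrift", module_name="big_thrift")

class Handler(object):
    def __init__(self):
        with open("big_object.pkl", "rb") as f:  # the slow load happens once
            self.obj = pickle.load(f)

    def process(self, param):
        return "stand-in result for %s" % param  # real logic would use self.obj

make_server(big_thrift.BigObject, Handler(), "127.0.0.1", 6000).serve()

# client.py -- cheap to run from the command line, many times
import sys
import thriftpy2
from thriftpy2.rpc import make_client

big_thrift = thriftpy2.load("process.thrift", module_name="big_thrift")
client = make_client(big_thrift.BigObject, "127.0.0.1", 6000)
print(client.process(sys.argv[1]))

Each client invocation pays only the cost of a local RPC call, while the pickle stays loaded in the server process.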

Python - run another python script with the current environment, passing arguments over and capturing the printed output

A little bit of an ugly question, but I didn't find existing SO posts which cover it.
Right now I need to use an existing python tool available on this github.
This is a rather big piece of code with a lot of dependencies which I don't want to mess with. In a nutshell, one can run its module by passing command line arguments, for example:
timesearch.py timesearch -r "subreddit1" -l "1466812800" -up "1498348800"
Now, I need to run this tool a bunch of times using a for loop, passing different argument values each time. The tool also prints some output to the command line when you run it, and I would like to intercept and print it from my python script as well. Finally, before I move on in my loop and run the tool another time, I need to ensure that the current execution of the timesearch tool has completed.
One side note here: I do need to ensure that timesearch is executed using the same environment I use to run my main script with the for loop.
I am trying to understand the best way to do this.
If I just go for this it doesn't work:
import os
#for loop will go here
os.system('python timesearch.py timesearch -r "ethereum" -l "1466812800" -up "1498348800"')
It fails for several reasons: it doesn't use the environment in which my looping script runs, and it doesn't capture the print output of timesearch.
Any advice on how to achieve it?
Just to highlight: I can't just go and pull out the function I need from timesearch, since its __init__ sets up some things based on the arguments you pass.
I wouldn't call a python script with os.system. There is basically one function which you need to use: main(sys.argv[1:]), defined here:
https://github.com/voussoir/timesearch/blob/master/timesearch/__init__.py#L435
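In other words, something along these lines (a sketch; the subreddit list is made up, and it assumes the timesearch package directory is importable from your working directory):

from timesearch import main

for subreddit in ["ethereum", "bitcoin"]:
    # same argv the command line would produce after the script name;
    # main() returns only when this run is finished, and its prints
    # go to our own stdout
    main(["timesearch", "-r", subreddit, "-l", "1466812800", "-up", "1498348800"])

Because everything runs inside your own interpreter, the environment question disappears, each call blocks until that run of timesearch has completed, and its output appears in your console (contextlib.redirect_stdout can capture it as a string if needed). If you ever do need a separate process, subprocess.run([sys.executable, "timesearch.py", ...], capture_output=True, text=True) would also use the current interpreter and capture the output, unlike os.system.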

refactoring code to keep large objects/models in memory in IPython to be reused in python scripts

My script spends about a minute loading lots of variables, which are then used globally in many functions. Every time I call the script in IPython, it loads them all again, taking time.
I tried to move these load-and-populate calls out of the script, but then the global variables are not available to the functions in the script.
It gives a NameError: name 'clf' is not defined error message.
Is there a good way to refactor this code so that these globals stay in memory and the script can use them? The script loads many variables like these, and uses them in other functions as globals:
(vectorizer_title, vectorizer_desc, clf,
 df_instance, vocab, all_tokens, df_dist_all,
 df_soc2class_proba, dict_p2s,
 dict_f2m, token_pattern, cleanup_pattern,
 excluded_words) = load_data_and_model(lang)
(dict_token2idx_all, dict_token2idx_instance,
 dist_array, token_dist_to_instance_min,
 dict_bigram_by_instance, denominate,
 similar_threshold) = populate_data(1)
I had asked this question after trying
from depended_library import *
which had not worked in IPython.
But run with plain python, and used in a Flask Web API, it works.
Importing a library with the from statement also executes the code outside of functions in depended_library, in addition to defining the functions.
(If someone explains the problem with IPython and suggests a solution, I shall select it as the answer.)
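For reference, the pattern that works under plain python looks something like this (a sketch with hypothetical names):

# depended_library.py -- module-level code runs once, at first import
import pickle

with open("model.pkl", "rb") as f:  # assumed path to the pickled classifier
    clf = pickle.load(f)

def classify(text):
    return clf.predict([text])  # uses the module-level global

After 'from depended_library import *', both clf and classify are available, and repeated imports in the same interpreter cost nothing, because Python caches the module in sys.modules after the first load.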

How can I call a python script with arguments from within Processing?

I have a python script which outputs JSON when called with different arguments. I am looking for a way to call that script from within Processing and load the output using something like loadJSONObject().
The problem is that I don't know how to call the python script with arguments from within Processing.
Any tip will be appreciated, thanks!
One option, as pointed out in the comments, is to use open, and then load the file that it generates the normal way.
Another, arguably much better, way is to not do this at all, and instead run your python script as a service with a web interface, so that it sits listening on http://localhost:1234, for instance, and your Processing sketch can simply load "http://localhost:1234/somefile?input=whatever" without even caring what is actually generating the content.
The upside there is also that you can run your script anywhere that can be reached via a URL, and that doesn't need to rely on python being available as an executable.
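As a sketch of that service idea in Python (Flask is an assumption; any HTTP framework, or even http.server, would do), matching the localhost URL above:

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/somefile")
def somefile():
    user_input = request.args.get("input", "")
    # stand-in for the real JSON-producing logic
    return jsonify({"echo": user_input})

app.run(port=1234)  # Processing's loadJSONObject() can now fetch the URL above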
