Passing an array to a slave python script - python

I am quite new to python.
I learned how to pass arguments as strings or floats to a slave script.
For instance, here is the main script:
#main script (mainscript.py)
import subprocess, sys
import numpy as np
x = np.linspace(0.5,3.2,10)
for i in range(x.size):
    subprocess.call([sys.executable, 'slavescript.py',
                     '%s' % sys.argv[1], '%s' % sys.argv[2], '%s' % x[i]])
And here is the slave script:
#slave script (slavescript.py)
import sys
sys.argv[1] = str(sys.argv[1])
sys.argv[2] = int(sys.argv[2])
sys.argv[3] = float(sys.argv[3])
...
...
Now, if in python I run the following command:
run mainscript.py N 5
Then slavescript.py starts using N as a string, 5 as an integer and the third argument is converted to a float. slavescript.py is run m times, where m is the size of the array x.
I would like to pass the whole content of the array x at once, i.e. without the for loop in the main script. I think subprocess.call can only take strings among its arguments... I hope someone may have time to help me or give me some hints.
Thanks for the attention.
Noctu

The only reason to use a separate process is if you need parallel processing. If you do need that and you're managing lots of workers, use something like celery.
If you do find it appropriate to roll your own, you need to reduce what you want to send to a textual representation. I suggest using the json module.
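For example, a minimal sketch of the json idea applied to the scripts above, assuming slavescript.py is changed to decode its third argument (these lines are illustrative, not part of the original scripts):
# main script (sketch)
import json, subprocess, sys
import numpy as np
x = np.linspace(0.5, 3.2, 10)
subprocess.call([sys.executable, 'slavescript.py',
                 sys.argv[1], sys.argv[2],
                 json.dumps(x.tolist())])   # the whole array as one JSON string
# slave script (sketch)
import json, sys
name = str(sys.argv[1])
count = int(sys.argv[2])
x = json.loads(sys.argv[3])   # back to a list of floats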
If you don't need a separate process, just import the other python module and access its functionality directly in code (it should already be wrapped up in functions).

Related

Passing list as an argument in python script in a subprocess call

I am using a subprocess call to execute an Insert_API from the main script. The values to be inserted are stored in a list, and I need to pass the list as an argument in the subprocess call. It gives me an error saying arguments must be strings. If I use str(list) it works, but the insert does not work properly because it is passed as a string, not as a list. I also tried passing it as a string and converting it back to a list in the script, but that does not work properly either.
Is there any way to pass the list as it is? I have two lists to pass every time. Kindly let me know if there is any way to pass two lists in a subprocess call.
Sample code :
subprocess.check_output(["python","Metadata_Index_EAV_LOAD.py",Dataset1,MetaDataIndex_Table,SchemaFields,FieldValues,r_date])
SchemaFields and FieldValues are the two lists that I need to pass as arguments.
Perhaps you wish to pass a list through the command-line arguments, but this shouldn't be done that way.
Why? Because the length of the command line passed to the kernel is limited. Recent Linux distributions accept fairly long command lines, but still: when you pass a lot of data that can contain special characters, things can go terribly wrong, especially if you give the receiving script to other people, who may forget to bypass the shell when sending the data to it. In that case the whole thing can turn into a mess.
So I advise you to use pipes, i.e. read from sys.stdin in the receiver, and use Popen().communicate() in the sender.
Something like this:
# Receiver:
from cPickle import loads
import sys
data = sys.stdin.read() # This will return after '^D' is received.
data = loads(data)
print "I received:"
for x in data:
    print x
    print "\t", data[x]
# Sender:
from cPickle import dumps
from subprocess import Popen, PIPE
data = {"list1": [1, 2, 3, 4],
"list2": [3, 4, 5, 6]}
cmd = ["python", "receiver.py"]
p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
result, error = p.communicate(dumps(data))
if error:
    raise Exception(error)
print "Receiver sent:"
print result
Now, if the people sending data to your script will not necessarily be using Python, then do not use pickling but json, as suggested in another answer.
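For instance, a minimal variant of the sketch above with json instead of cPickle, in the same Python 2 style as the example above (only the serialization calls change):
# Receiver (json variant):
import json, sys
data = json.loads(sys.stdin.read())
# Sender (json variant):
import json
from subprocess import Popen, PIPE
data = {"list1": [1, 2, 3, 4], "list2": [3, 4, 5, 6]}
p = Popen(["python", "receiver.py"], stdin=PIPE, stdout=PIPE, stderr=PIPE)
result, error = p.communicate(json.dumps(data))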
Or you can even use literal_eval():
# In receiver:
from ast import literal_eval
import sys
data = sys.stdin.read()
data = literal_eval(data)
#...
# And in sender:
# ...
result, error = p.communicate(repr(data))
#...
literal_eval() is a secure version of eval() that will evaluate only built-in data types, not expressions and other objects.
If you still insist on using args, then you should use something like:
list1 = [1, 2, 3, 4]
def convert(l):
    return ",".join(str(x) for x in l)

def get(s):
    return s.split(",")
But then you have to make sure that your separator doesn't appear anywhere else in the strings, or add an escaping mechanism. Also, if you are sending more than alphanumeric characters, you should encode the values with URL-safe base64 to avoid possible clashes with the OS and/or shell.
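A rough sketch of that base64 step (assuming each item is text; encode before joining, decode after splitting):
import base64

def encode_item(s):
    # make one value safe to embed in a comma-separated argument
    return base64.urlsafe_b64encode(s.encode("utf-8")).decode("ascii")

def decode_item(s):
    return base64.urlsafe_b64decode(s).decode("utf-8")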
You could set up argparse in your mainScript to handle input arguments using
parser = argparse.ArgumentParser()
parser.add_argument('-sf','--schema_field', action='append', help="List of id's")
args = parser.parse_args()
This way, on the command line you just add ids to the list by re-using the shortcut:
python mainScript.py -sf 1001 -sf 1002 -sf 1003
And this should return args.schema_field as a list inside your mainScript.
Handling complex optional and positional command-line arguments is definitely the job for argparse, so I recommend going over the official docs here: https://docs.python.org/2/howto/argparse.html
You could probably find a way of passing the arguments as a list, as suggested in this answer, but an easier solution may be to use the json.loads method.
For instance, instead of running list(SchemaFields), try something like:
import json
schema_list = json.loads(SchemaFields)
values = json.loads(FieldValues)
And you probably will have to keep sending the arguments as strings.
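On the sending side that would mean serializing each list before building the command, roughly like this (the variable values below are placeholders, the script name follows the question, and the json.dumps calls are the only real change):
import json
import subprocess

# placeholder values standing in for the question's variables
Dataset1 = "dataset1"
MetaDataIndex_Table = "metadata_index"
SchemaFields = ["field_a", "field_b"]
FieldValues = ["1", "2"]
r_date = "2016-01-01"

subprocess.check_output(["python", "Metadata_Index_EAV_LOAD.py",
                         Dataset1, MetaDataIndex_Table,
                         json.dumps(SchemaFields), json.dumps(FieldValues),
                         r_date])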
Still, I agree with Roseman's comment: it seems that if you had some base class with all these operations ready to go, and simply invoked them from your main script, this would be much simpler (I'm not sure what you mean by hard-coding values this way, since you could still take sys.argv input in the main script).

Python - How to take input from command line and pipe it into socket.gethostbyaddr("")

I have been scouring the internet looking for the answer to this. Please note my Python coding skills are not all that great. I am trying to create a command line script that will take input from the command line like this:
$ python GetHostID.py serverName.com
The last part is what I want to pass on as a variable to the socket.gethostbyaddr("") function. This is the code that I have so far. Can someone help me figure out how to put that variable into the (" ")? I think the quotes are creating the problem: the value is treated as a literal string of text as opposed to a variable name.
here is the code I have in my script:
#!/bin/python
#
import sys, os
import optparse
import socket
remoteServer = input("Enter a remote host to scan: ")
remoteServerIP = socket.gethostbyaddr(remoteServer)
socket.gethostbyaddr('remoteServer')[0]
os.getenv('remoteServer')
print (remoteServerIP)
Any help would be welcome. I have been racking my brain over this...
Thanks
The command line arguments are available as the list sys.argv, whose first element is the path to the program. There are a number of libraries you can use (argparse, optparse, etc.) to analyse the command line, but for your simple application you could do something like this:
import sys
import socket
remoteServer = sys.argv[1]
remoteServerIP = socket.gethostbyaddr(remoteServer)
print (remoteServerIP)
Running this program with the command line
$ python GetHostID.py holdenweb.com
gives the output
('web105.webfaction.com', [], ['108.59.9.144'])
os.getenv('remoteServer') does not use the variable remoteServer as an argument. Instead it uses the string 'remoteServer'.
Also, are you trying to take input as a command line argument? Or are you trying to take it as user input? Your problem description and implementation differ here. The easiest way would be to run your script using
python GetHostID.py
and then in your code include
remoteServer = raw_input().strip()
to get the input you want for remoteserver.
You can use sys.argv. For
$ python GetHostID.py serverName.com
sys.argv would be
['GetHostID.py', 'serverName.com']
But to be friendly to the user, have a look at the argparse tutorial.
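A minimal argparse sketch for this script might look like the following (the positional argument name host is just an example):
import argparse
import socket

parser = argparse.ArgumentParser(description="Look up a host")
parser.add_argument("host", help="host name or IP address to look up")
args = parser.parse_args()

print(socket.gethostbyaddr(args.host))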
In Python 2, input reads text and evaluates it as a Python expression in the current context. This is almost never what you want; you want raw_input instead. However, in Python 3, input does what raw_input did in version 2, and raw_input is not available.
So, if you need your code to work in both Python 2 and 3, you should do something like this after your imports block:
# Apply Python 3 semantics to input() if running under v2.
try:
    input = raw_input
    def raw_input(*a, **k):
        raise NameError('use input()')
except NameError:
    pass
This has no effect in Python 3, but in v2 it replaces the stock input with raw_input, and raw_input with a function that always throws an exception (so you notice if you accidentally use raw_input).
If you find yourself needing to smooth over lots of differences between v2 and v3, the python-future library will probably make your life easier.

How to Consume an mpi4py application from a serial python script

I tried to make a library based on mpi4py, but I want to use it in serial python code.
$ python serial_source.py
but inside serial_source.py there is a call to some function parallel_bar:
from foo import parallel_bar
# Can I make this work with mpi4py like ordinary Python source code?
result = parallel_bar(num_proc = 5)
The motivation for this question is about finding the right way to use mpi4py to optimize programs in python which were not necessarily designed to be run completely in parallel.
This is indeed possible and is covered in the mpi4py documentation in the section Dynamic Process Management. What you need is the so-called Spawn functionality, which is not available with MSMPI (in case you are working on Windows); see also Spawn not implemented in MSMPI.
Example
The first file provides a kind of wrapper to your function to hide all the MPI stuff, which I guess is your intention. Internally it calls the "actual" script containing your parallel code in 4 newly spawned processes.
Finally, you can open a python terminal and call:
from my_prog import parallel_fun
parallel_fun()
# Hi from 0/4
# Hi from 3/4
# Hi from 1/4
# Hi from 2/4
# We got the magic number 6
my_prog.py
import sys
import numpy as np
from mpi4py import MPI
def parallel_fun():
    comm = MPI.COMM_SELF.Spawn(
        sys.executable,
        args=['child.py'],
        maxprocs=4)
    N = np.array(0, dtype='i')
    comm.Reduce(None, [N, MPI.INT], op=MPI.SUM, root=MPI.ROOT)
    print(f'We got the magic number {N}')
Here the child file with the parallel code:
child.py
from mpi4py import MPI
import numpy as np
comm = MPI.Comm.Get_parent()
print(f'Hi from {comm.Get_rank()}/{comm.Get_size()}')
N = np.array(comm.Get_rank(), dtype='i')
comm.Reduce([N, MPI.INT], None, op=MPI.SUM, root=0)
Unfortunately I don't think this is possible as you have to run the MPI code specifically with mpirun.
The best you can do is the opposite where you write generic chunks of code which can be called either by an MPI process or a normal python process.
The only other solution is to wrap the whole MPI part of your code into an external call and invoke it with subprocess from your non-MPI code. However, this will be tied quite heavily to your system configuration and is not really portable.
Subprocess is detailed in this thread, Using python with subprocess Popen, and is worth a look. The complexity here is making the correct call in the first place, i.e.
command = "/your/instance/of/mpirun /your/instance/of/python your_script.py -arguments"
And then getting the result back into your single-threaded code; depending on the size there are many ways to do that, but something like parallel HDF5 would be a good place to look if you have to pass back big array data.
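A rough sketch of that wrapper idea (the script name and process count are placeholders, and it assumes mpirun is on the PATH and the MPI script writes its result to stdout):
import subprocess

def run_mpi_job(script, nprocs=4):
    # launch the MPI part as an external command and capture its stdout
    cmd = ["mpirun", "-n", str(nprocs), "python", script]
    return subprocess.check_output(cmd)

result = run_mpi_job("your_mpi_script.py")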
Sorry I can't give you an easy solution.

Common variables in modules

I have three python files, let's call them master.py, slave1.py and slave2.py. Now slave1.py and slave2.py do not have any functions, but are required to do two different things using the same input (say the variable inp).
What I'd like to do is to call both the slave programs from master, and specify the one input variable inp in master, so I don't have to do it twice. Also so I can change the outputs of both slaves in one master program etc.
I'd like to keep the code of both slave1.py and slave2.py separate, so I can debug them individually if required, but when I try to do
#! /usr/bin/python
# This is master.py
import slave1
import slave2
inp = #some input
both slave1 and slave2 run before I can change the input. As I understand it, the way python imports modules is to execute them first. But is there some way to delay executing them so I can specify the common input? Or any other way to specify the input for both files from one place?
EDIT: slave1 and slave2 perform two different simulations being given a particular initial condition. Since the output of the two are the same, I'd like to display them in a similar manner, as well as have control over which files to write the simulated data to. So I figured importing both of them into a master file was the easiest way to do that.
Write the code in your slave modules as functions, import the functions, then call the functions from master with whatever input you need. If you need to have more stateful information, consider constructing an object.
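For instance, a small sketch of the stateful variant (the class name and the computation are placeholders):
# slave1.py
class Simulation:
    def __init__(self, inp):
        self.inp = inp          # the shared initial condition
        self.result = None

    def run(self):
        # the code that currently runs at import time goes here
        self.result = self.inp * 2   # placeholder computation
        return self.result

# master.py
from slave1 import Simulation
sim = Simulation(inp=3)
print(sim.run())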
You can do imports at any time:
inp = #some input
import slave1
import slave2
Note that this is generally considered bad design - you would be better off making the modules contain a function, rather than just having it happen when you import the module.
It looks like the architecture of your program is not really optimal. I think you have two files that execute immediately when you run them with python slave1.py. That is nice for scripting, but when you import them you run into trouble, as you have experienced.
The best approach is to wrap the code in your slave files in a function (as suggested by @sr2222) and call it explicitly from master.py:
slave1.py/ slave2.py
def run(inp):
    # your code
master.py
import slave1, slave2
inp = "foo"
slave1.run(inp)
slave2.run(inp)
If you still want to be able to run the slaves independently you could add something like this at the end:
if __name__ == "__main__":
    inp = "foo"
    run(inp)

How to execute an arbitrary shell script and pass multiple variables via Python?

I am building an application plugin in Python which allows users to arbitrarily extend the application with simple scripts (working under Mac OS X). Executing Python scripts is easy, but some users are more comfortable with languages like Ruby.
From what I've read, I can easily execute Ruby scripts (or other arbitrary shell scripts) using subprocess and capture their output with a pipe; that's not a problem, and there are lots of examples online. However, I need to provide the script with multiple variables (say a chunk of text along with some simple boolean information about the text the script is modifying), and I'm having trouble figuring out the best way to do this.
Does anyone have a suggestion for the best way to accomplish this? My goal is to provide scripts with the information they need with the least code needed to access that information within the script.
Thanks in advance!
See http://docs.python.org/library/subprocess.html#using-the-subprocess-module
args should be a string, or a sequence of program arguments. The program to execute is normally the first item in the args sequence or the string if a string is given, but can be explicitly set by using the executable argument.
So, your call can look like this
p = subprocess.Popen( args=["script.sh", "-p", p_opt, "-v", v_opt, arg1, arg2] )
You put your own Python values into the args of subprocess.Popen, converting each to a string where needed.
If you are going to be launching multiple scripts and need to pass the same information to each of them, you might consider using the environment (warning, I don't know Python, so the following code most likely sucks):
#!/usr/bin/python
import os
try:
    # if the environment is already set
    if os.environ["child"] == "1":
        print os.environ["string"]
except:
    # set the environment
    os.environ["child"] = "1"
    os.environ["string"] = "hello world"
    # run this program 5 times as a child process
    for n in range(5):
        os.system(__file__)
One approach you could take would be to use json as a protocol between parent and child scripts, since json support is readily available in many languages, and is fairly expressive. You could also use a pipe to send an arbitrary amount of data down to the child process, assuming your requirements allow you to have the child scripts read from standard input. For example, the parent could do something like (Python 2.6 shown):
#!/usr/bin/env python
import json
import subprocess
data_for_child = {
    'text' : 'Twas brillig...',
    'flag1' : False,
    'flag2' : True
}
child = subprocess.Popen(["./childscript"], stdin=subprocess.PIPE)
json.dump(data_for_child, child.stdin)
And here is a sketch of a child script:
#!/usr/bin/env python
# Imagine this were written in a different language.
import json
import sys
d = json.load(sys.stdin)
print d
In this trivial example, the output is:
$ ./foo12.py
{u'text': u'Twas brillig...', u'flag2': True, u'flag1': False}
