Using pipe with Python

import re
import subprocess
import sys

sub = subprocess.Popen(['/home/karthik/Downloads/stanford-parser-2011-06-08/lexparser.csh'],
                       stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sub.stdin.write("i am a fan of ac milan which is the best club in the world")
relns = []
while True:
    rel = sub.stdout.readline()
    m = re.search("Sentence skipped", rel)
    if m is not None:
        print 'stop'
        sys.exit(0)
    if rel == '\n':
        break
    relns.append(rel)
print relns
sub.terminate()
So I want to use the Stanford parser, calling lexparser.csh to parse this line of text. But when I run this piece of code I get the output for the parser's default text; the text I actually pass in is not being parsed. So am I using pipes the right way? And I've seen a '-' used along with the command in a lot of examples. Why is that used? Because when I use it the script just stalls at sub.stdout.readline().

You may need to call flush() on sub.stdin after writing.
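A minimal sketch of that idea (Python 2, matching the question; the trailing '-' argument is kept only because the question mentions it, and whether lexparser.csh actually reads stdin when given it is an assumption, not something verified here):
import subprocess

sub = subprocess.Popen(
    ['/home/karthik/Downloads/stanford-parser-2011-06-08/lexparser.csh', '-'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Write the sentence, then flush so it leaves Python's buffer immediately.
sub.stdin.write("i am a fan of ac milan which is the best club in the world\n")
sub.stdin.flush()

# If the parser only starts emitting output once its input is complete,
# closing stdin signals EOF and unblocks the readline() on the other side.
sub.stdin.close()

for rel in sub.stdout:
    print rel.rstrip()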

Related

My python script stops... no error, just stops

I am running a script that iterates through a text file. On each line of the text file there is an IP address. The script grabs the banner, then writes the IP + banner to another file.
The problem is that it just stops around 500 lines, more or less, with no error.
Another weird thing: if I run it with python3 it does what I said above. If I run it with python it iterates through those 500 lines, then starts at the beginning again. I noticed this when I saw repetitions in my output file. Anyway, here is the code; maybe you can tell me what I'm doing wrong:
import os
import subprocess
import concurrent.futures
#import time, random
import threading
import multiprocessing

with open("ipuri666.txt") as f:
    def multiprocessing_func():
        try:
            line2 = line.rstrip('\r\n')
            a = subprocess.Popen(["curl", "-I", line2, "--connect-timeout", "1", "--max-time", "1"], stdout=subprocess.PIPE)
            b = subprocess.Popen(["grep", "Server"], stdin=a.stdout, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            #a.stdout.close()
            out, err = b.communicate()
            g = open("IP_BANNER2", "a")
            print("out: {0}".format(out))
            g.write(line2 + " " + "out: {0}\n".format(out))
            print("err: {0}".format(err))
        except IOError:
            print("Connection timed out")

    if __name__ == '__main__':
        #starttime = time.time()
        processes = []
        for line in f:
            p = multiprocessing.Process(target=multiprocessing_func, args=())
            processes.append(p)
            p.start()
        for process in processes:
            process.join()
If your use case allows, I would recommend just rewriting this as a shell script; there is no need to use Python. (This would likely solve your issue indirectly.)
#!/usr/bin/env bash

readarray -t ips < ipuri666.txt

for ip in "${ips[@]}"; do
    output=$(curl -I "$ip" --connect-timeout 1 --max-time 1 | grep "Server")
    echo "$ip $output" >> fisier.txt
done
The script is slightly simpler than what you are trying to do, for instance I do not capture the error. This should be pretty close to what you are trying to accomplish. I will update again if needed.

Get local DNS settings in Python

Is there any elegant and cross-platform (Python) way to get the local DNS settings?
It could probably work with a complex combination of modules such as platform and subprocess, but maybe there is already a good module, such as netifaces, which can retrieve it at a low level and save some "reinventing the wheel" effort.
Less ideally, one could probably query something like dig, but I find it "noisy", because it would run an extra request instead of just retrieving something that already exists locally.
Any ideas?
Using subprocess you could do something like this on a macOS or Linux system:
import subprocess

process = subprocess.Popen(['cat', '/etc/resolv.conf'],
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
stdout, stderr = process.communicate()
print(stdout, stderr)
or do something like this
import subprocess

with open('dns.txt', 'w') as f:
    process = subprocess.Popen(['cat', '/etc/resolv.conf'], stdout=f)
The first output will go to stdout and the second to a file
Maybe this one will solve your problem
import subprocess

def get_local_dns(cmd_):
    with open('dns1.txt', 'w+') as f:
        with open('dns_log1.txt', 'w+') as flog:
            try:
                process = subprocess.Popen(cmd_, stdout=f, stderr=flog)
            except FileNotFoundError as e:
                flog.write(f"Error while executing this command {str(e)}")

linux_cmd = ['cat', '/etc/resolv.conf']
windows_cmd = ['windows_command', 'parameters']
commands = [linux_cmd, windows_cmd]

if __name__ == "__main__":
    for cmd in commands:
        get_local_dns(cmd)
Thanks @MasterOfTheHouse.
I ended up writing my own function. It's not so elegant, but it does the job for now. There's plenty of room for improvement, but well...
import os
import subprocess

def get_dns_settings() -> dict:
    # Initialize the output variables
    dns_ns, dns_search = [], ''

    # For Unix based OSs
    if os.path.isfile('/etc/resolv.conf'):
        for line in open('/etc/resolv.conf', 'r'):
            if line.strip().startswith('nameserver'):
                nameserver = line.split()[1].strip()
                dns_ns.append(nameserver)
            elif line.strip().startswith('search'):
                search = line.split()[1].strip()
                dns_search = search
    # If it is not a Unix based OS, try "the Windows way"
    elif os.name == 'nt':
        cmd = 'ipconfig /all'
        raw_ipconfig = subprocess.check_output(cmd)
        # Convert the bytes into a string
        ipconfig_str = raw_ipconfig.decode('cp850')
        # Convert the string into a list of lines
        ipconfig_lines = ipconfig_str.split('\n')

        for n in range(len(ipconfig_lines)):
            line = ipconfig_lines[n]
            # Parse nameserver in current line and next ones
            if line.strip().startswith('DNS-Server'):
                nameserver = ':'.join(line.split(':')[1:]).strip()
                dns_ns.append(nameserver)
                next_line = ipconfig_lines[n+1]
                # If there's too much blank at the beginning, assume we have
                # another nameserver on the next line
                if len(next_line) - len(next_line.strip()) > 10:
                    dns_ns.append(next_line.strip())
                    next_next_line = ipconfig_lines[n+2]
                    if len(next_next_line) - len(next_next_line.strip()) > 10:
                        dns_ns.append(next_next_line.strip())
            elif line.strip().startswith('DNS-Suffix'):
                dns_search = line.split(':')[1].strip()

    return {'nameservers': dns_ns, 'search': dns_search}

print(get_dns_settings())
By the way... how did you manage to write two answers with the same account?

Processing realtime output of access_log with subprocess

I am trying to accomplish something I really don't know how to do. Basically I am trying to capture a value through a grep command as shown below:
p = subprocess.Popen('tail -f /data/qantasflight/run/tomcat/logs/localhost_access_log.2016-02-29.txt | grep /qantas-ui/int/price?', stdout=subprocess.PIPE, shell = True)
stdout = p.communicate()[0]
Process the stdout value and then push the value as shown below
f = urllib.urlopen("http://162.16.1.90:9140/TxnService", params2)
params2 is the value I get by processing the result given by the subprocess.Popen call.
In a nutshell I want the following:
-- Wait for the new value --> -- Process the value --> -- Push the value -->
This should happen in real time: the Python script keeps getting new values, processing them, and pushing them.
I tried some code along the lines of what you want, but it only prints new lines from the log; you can add the processing and pushing:
from __future__ import print_function
import subprocess
from time import sleep

f = open('test.log', 'r')
while True:
    line = ''
    while len(line) == 0 or line[-1] != '\n':
        tail = f.readline()
        if tail == '':
            continue
        line = tail
    print(line, end='')
This prints each new line to the console; just edit it to process and push the line instead. Maybe it helps :)
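If you want to keep the tail -f subprocess from the question, a minimal sketch could look like this (Python 2; the log path, the URL pattern to match, and the TxnService address are taken from the question, and build_params is a hypothetical helper standing in for whatever processing you need):
import subprocess
import urllib

log = '/data/qantasflight/run/tomcat/logs/localhost_access_log.2016-02-29.txt'
p = subprocess.Popen(['tail', '-f', log], stdout=subprocess.PIPE)

# tail -f never exits, so this loop keeps running and handles each new line.
for line in iter(p.stdout.readline, ''):
    if '/qantas-ui/int/price?' not in line:
        continue
    params2 = build_params(line)  # hypothetical: turn the log line into request params
    urllib.urlopen("http://162.16.1.90:9140/TxnService", params2)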

Python 3.4.3 subprocess.Popen get output of command without piping?

I am trying to assign the output of a command to a variable without the command thinking that it is being piped. The reason for this is that the command in question gives unformatted text as output if it is being piped, but it gives color formatted text if it is being run from the terminal. I need to get this color formatted text.
So far I've tried a few things. I've tried Popen like so:
output = subprocess.Popen(command, stdout=subprocess.PIPE)
output = output.communicate()[0]
output = output.decode()
print(output)
This will let me print the output, but it gives me the unformatted output that I get when the command is piped. That makes sense, as I'm piping it here in the Python code. But I am curious if there is a way to assign the output of this command, directly to a variable, without the command running the piped version of itself.
I have also tried the following version that relies on check_output instead:
output = subprocess.check_output(command)
output = output.decode()
print(output)
And again I get the same unformatted output that the command returns when the command is piped.
Is there a way to get the formatted output, the output the command would normally give from the terminal, when it is not being piped?
Using pexpect:
2.py:
import sys

if sys.stdout.isatty():
    print('hello')
else:
    print('goodbye')
subprocess:
import subprocess

p = subprocess.Popen(
    ['python3.4', '2.py'],
    stdout=subprocess.PIPE
)
print(p.stdout.read())
--output:--
goodbye
pexpect:
import pexpect
child = pexpect.spawn('python3.4 2.py')
child.expect(pexpect.EOF)
print(child.before) #Print all the output before the expectation.
--output:--
hello
Here it is with grep --colour=auto:
import subprocess

p = subprocess.Popen(
    ['grep', '--colour=auto', 'hello', 'data.txt'],
    stdout=subprocess.PIPE
)
print(p.stdout.read())
import pexpect
child = pexpect.spawn('grep --colour=auto hello data.txt')
child.expect(pexpect.EOF)
print(child.before)
--output:--
b'hello world\n'
b'\x1b[01;31mhello\x1b[00m world\r\n'
Yes, you can use the pty module.
>>> import subprocess
>>> p = subprocess.Popen(["ls", "--color=auto"], stdout=subprocess.PIPE)
>>> p.communicate()[0]
# Output does not appear in colour
With pty:
import subprocess
import pty
import os
master, slave = pty.openpty()
p = subprocess.Popen(["ls", "--color=auto"], stdout=slave)
p.communicate()
print(os.read(master, 100)) # Print 100 bytes
# Prints with colour formatting info
Note from the docs:
Because pseudo-terminal handling is highly platform dependent, there
is code to do it only for Linux. (The Linux code is supposed to work
on other platforms, but hasn’t been tested yet.)
A less than beautiful way of reading the whole output to the end in one go:
def num_bytes_readable(fd):
    import array
    import fcntl
    import termios
    buf = array.array('i', [0])
    if fcntl.ioctl(fd, termios.FIONREAD, buf, 1) == -1:
        raise Exception("We really should have had data")
    return buf[0]

print(os.read(master, num_bytes_readable(master)))
Edit: nicer way of getting the content at once, thanks to @Antti Haapala:
os.close(slave)
f = os.fdopen(master)
print(f.read())
Edit: people are right to point out that this will deadlock if the process generates a large output, so @Antti Haapala's answer is better.
A working polyglot example (works the same for Python 2 and Python 3), using pty.
import subprocess
import pty
import os
import sys

master, slave = pty.openpty()

# direct stderr also to the pty!
process = subprocess.Popen(
    ['ls', '-al', '--color=auto'],
    stdout=slave,
    stderr=subprocess.STDOUT
)

# close the slave descriptor! otherwise we will
# hang forever waiting for input
os.close(slave)

def reader(fd):
    try:
        while True:
            buffer = os.read(fd, 1024)
            if not buffer:
                return
            yield buffer
    # Unfortunately with a pty, an
    # IOError will be thrown at EOF
    # On Python 2, OSError will be thrown instead.
    except (IOError, OSError) as e:
        pass

# read chunks (yields bytes)
for i in reader(master):
    # and write them to stdout file descriptor
    os.write(1, b'<chunk>' + i + b'</chunk>')
Many programs automatically turn off colour printing codes when they detect they are not connected directly to a terminal. Many programs will have a flag so you can force colour output. You could add this flag to your process call. For example:
grep "search term" inputfile.txt
# prints colour to the terminal in most OSes
grep "search term" inputfile.txt | less
# output goes to less rather than terminal, so colour is turned off
grep "search term" inputfile.txt --color | less
# forces colour output even when not connected to terminal
Be warned though. The actual colour output is done by the terminal. The terminal interprets special character escape codes and changes the text colour and background colour accordingly. Without the terminal to interpret the colour codes you will just see the text in black with these escape codes interspersed throughout.
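As a sketch of that idea from Python (GNU grep's --color=always is the flag that forces colour even when stdout is a pipe; whether your particular command has an equivalent flag is something you'd have to check):
import subprocess

# --color=always forces GNU grep to emit ANSI escape codes even though
# its stdout is a pipe; with "auto" the colour would be switched off here.
p = subprocess.Popen(
    ['grep', '--color=always', 'hello', 'data.txt'],
    stdout=subprocess.PIPE
)
out = p.communicate()[0]
print(out)  # bytes containing \x1b[...m sequences around each match
The escape codes only render as actual colour when this output is itself written to a terminal.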

How can I use Python to pipe stdin/stdout to Perl script

This Python code pipes data through a Perl script fine.
import subprocess

kw = {}
kw['executable'] = None
kw['shell'] = True
kw['stdin'] = None
kw['stdout'] = subprocess.PIPE
kw['stderr'] = subprocess.PIPE

args = ' '.join(['/usr/bin/perl', '-w', '/path/script.perl', '<', '/path/mydata'])

subproc = subprocess.Popen(args, **kw)
for line in iter(subproc.stdout.readline, ''):
    print line.rstrip().decode('UTF-8')
However, it requires that I first save my buffers to a disk file (/path/mydata). It's cleaner to loop through the data in Python code and pass it line-by-line to the subprocess like this:
import codecs
import subprocess

kw = {}
kw['executable'] = '/usr/bin/perl'
kw['shell'] = False
kw['stderr'] = subprocess.PIPE
kw['stdin'] = subprocess.PIPE
kw['stdout'] = subprocess.PIPE

args = ['-w', '/path/script.perl']

subproc = subprocess.Popen(args, **kw)

f = codecs.open('/path/mydata', 'r', 'UTF-8')
for line in f:
    subproc.stdin.write('%s\n' % (line.strip().encode('UTF-8')))
    print line.strip()  ### code hangs after printing this ###
    for line in iter(subproc.stdout.readline, ''):
        print line.rstrip().decode('UTF-8')

subproc.terminate()
f.close()
The code hangs at the readline after sending the first line to the subprocess. I have other executables that use this exact same code perfectly.
My data files can be quite large (1.5 GB). Is there a way to accomplish piping the data without saving it to a file? I don't want to re-write the Perl script for compatibility with other systems.
Your code is blocking at the line:
for line in iter(subproc.stdout.readline, ''):
because the only way this iteration can terminate is when EOF (end-of-file) is reached, which will happen when the subprocess terminates. You don't want to wait till the process terminates, however; you only want to wait till it has finished processing the line that was sent to it.
Furthermore, you're encountering issues with buffering, as Chris Morgan has already pointed out. Another question on Stack Overflow discusses how you can do non-blocking reads with subprocess. I've hacked up a quick and dirty adaptation of the code from that question to your problem:
import codecs
import subprocess
import threading
import Queue

def enqueue_output(out, queue):
    for line in iter(out.readline, ''):
        queue.put(line)
    out.close()

kw = {}
kw['executable'] = '/usr/bin/perl'
kw['shell'] = False
kw['stderr'] = subprocess.PIPE
kw['stdin'] = subprocess.PIPE
kw['stdout'] = subprocess.PIPE

args = ['-w', '/path/script.perl']

subproc = subprocess.Popen(args, **kw)
f = codecs.open('/path/mydata', 'r', 'UTF-8')

q = Queue.Queue()
t = threading.Thread(target=enqueue_output, args=(subproc.stdout, q))
t.daemon = True
t.start()

for line in f:
    subproc.stdin.write('%s\n' % (line.strip().encode('UTF-8')))
    print "Sent:", line.strip()
    try:
        line = q.get_nowait()
    except Queue.Empty:
        pass
    else:
        print "Received:", line.rstrip().decode('UTF-8')

subproc.terminate()
f.close()
It's quite likely that you'll need to make modifications to this code, but at least it doesn't block.
Thanks srgerg. I had also tried the threading solution. That solution alone, however, always hung. Both my previous code and srgerg's code were missing the final piece; your tip gave me one last idea.
The final solution writes enough dummy data to force the final valid lines out of the buffer. To support this, I added code that tracks how many valid lines were written to stdin. The threaded loop opens the output file, saves the data, and breaks when the number of lines read equals the number of valid input lines. This solution ensures it reads and writes line-by-line for a file of any size.
def std_output(stdout, outfile=''):
    out = 0
    f = codecs.open(outfile, 'w', 'UTF-8')
    for line in iter(stdout.readline, ''):
        f.write('%s\n' % (line.rstrip().decode('UTF-8')))
        out += 1
        if i == out: break
    stdout.close()
    f.close()

outfile = '/path/myout'
infile = '/path/mydata'

subproc = subprocess.Popen(args, **kw)

t = threading.Thread(target=std_output, args=[subproc.stdout, outfile])
t.daemon = True
t.start()

i = 0
f = codecs.open(infile, 'r', 'UTF-8')
for line in f:
    subproc.stdin.write('%s\n' % (line.strip().encode('UTF-8')))
    i += 1
subproc.stdin.write('%s\n' % (' ' * 4096))  ### push dummy data ###
f.close()

t.join()
subproc.terminate()
See the warnings mentioned in the manual about using Popen.stdin and Popen.stdout (just above Popen.stdin):
Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
I realise that having a gigabyte-and-a-half string in memory all at once isn't very desirable, but using communicate() is a way that will work, while as you've observed, once the OS pipe buffer fills up, the stdin.write() + stdout.read() way can become deadlocked.
Is using communicate() feasible for you?
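For reference, a minimal communicate()-based sketch (Python 2, reusing the paths from the question; it holds the whole input and output in memory, which is the trade-off discussed above):
import codecs
import subprocess

subproc = subprocess.Popen(
    ['/usr/bin/perl', '-w', '/path/script.perl'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Read the whole input file, then let communicate() feed stdin and
# drain stdout/stderr for us, so the OS pipe buffers cannot deadlock.
with codecs.open('/path/mydata', 'r', 'UTF-8') as f:
    data = f.read()

out, err = subproc.communicate(data.encode('UTF-8'))
for line in out.splitlines():
    print line.decode('UTF-8')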
