Event Hub send fails with MessageSendResult.Timeout (Python)

I have been having some issues with timeouts while sending messages to EventHub.
import sys
import logging
import datetime
import time
import os

from azure.eventhub import EventHubClient, Sender, EventData

logger = logging.getLogger("azure")

ADDRESS = "xxx"
USER = "xxx"
KEY = "xxx"
ENDPOINT = "xxx"

try:
    if not ADDRESS:
        raise ValueError("No EventHubs URL supplied.")

    # Create Event Hubs client
    client = EventHubClient(ADDRESS, username=USER, password=KEY, debug=True)
    sender = client.add_sender(partition="0", send_timeout=300, keep_alive=10)
    client.run()
    try:
        start_time = time.time()
        for i in range(10000):
            print("Sending message: {}".format(i))
            message = "Message {}".format(i)
            sender.send(EventData(message))
    except:
        raise
    finally:
        end_time = time.time()
        client.stop()
        run_time = end_time - start_time
        logger.info("Runtime: {} seconds".format(run_time))
except KeyboardInterrupt:
    pass
except KeyboardInterrupt:
pass
My context is as follows: I am able to send messages without problems from my personal development computer, from a virtual machine in Azure, and from on-premises server1, but when trying to send messages from on-premises server2 I receive the error:
azure.eventhub.common.EventHubError: Send failed: Message send failed with result: MessageSendResult.Timeout
I have tried modifying the send_timeout and the keep_alive (even though I don't believe these settings are to blame), but with no success. My guess is that something on on-premises server2 is blocking or interfering with the communication.

Firstly, am I changing the timeout value correctly? I have checked the source code of the class here: link and it seems I am doing it right, but I believe that property covers how long a message may wait in the send queue rather than how long we wait for a response to the event.

Secondly, is there a way I can verify that the problem lies in the environment of on-premises server2, for example by exploring the network path with traceroute or dig? The system is CentOS.

Could it be related to recent upgrades in the Python SDK? I just saw this other question showing that the send method I use was changed as recently as "01/08/2020", so maybe it is related to those upgrades (I doubt it)?

Anyhow, any clues would be greatly appreciated. For now I will be testing on other servers and checking whether I can migrate my implementation to the newer version to see if that solves the issue.

It sounds like a networking issue. Try testing TCP connectivity from server2 to the endpoint of your namespace on port 9354. If a firewall is blocking the outbound connection to the endpoint, either fix the firewall rule or enable WebSockets, which can go through port 443.
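As a minimal sketch of both suggestions, assuming the newer azure-eventhub v5 SDK (EventHubProducerClient and TransportType); the namespace, hub name and connection string below are placeholders:

import socket

from azure.eventhub import EventData, EventHubProducerClient, TransportType

# 1) Quick reachability probe to run on server2 (5671 is the default AMQP TLS
#    port; swap in 9354 or 443 to test the other paths the answer mentions).
sock = socket.create_connection(("<namespace>.servicebus.windows.net", 5671), timeout=10)
sock.close()
print("TCP connection succeeded")

# 2) If only port 443 is open, route AMQP over WebSockets instead.
producer = EventHubProducerClient.from_connection_string(
    conn_str="<connection string>",
    eventhub_name="<event hub name>",
    transport_type=TransportType.AmqpOverWebsocket,
)
with producer:
    batch = producer.create_batch()
    batch.add(EventData("Message 0"))
    producer.send_batch(batch)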

Related

pyzmq proxy in a strange state after subscribing multiple processes

I'm having a weird issue with the proxy in pyzmq. Here's the code of that proxy:
import zmq
context = zmq.Context.instance()
frontend_socket = context.socket(zmq.XSUB)
frontend_socket.bind("tcp://0.0.0.0:%s" % sub_port)
backend_socket = context.socket(zmq.XPUB)
backend_socket.bind("tcp://0.0.0.0:%s" % pub_port)
zmq.proxy(frontend_socket, backend_socket)
I'm using that proxy to send messages between ~50 processes that run on 6 different machines. The total amount of topics is around 1,000, but since multiple processes can listen on the same topics, the total amount of subscriptions is around 10,000.
In normal times this works very well; messages go through the proxy correctly as long as a process publishes them and at least one other process is subscribed to the topic. It works whether the publisher or the subscriber was started first.
But at some point in time, when we start a new process (let's call it X), it starts behaving strangely. Everything that was already connected keeps working, but the new processes that we connect can only get messages to go through if the publisher is connected before the subscriber. X can be any one of the processes that normally work, and it can be from any machine, and the result is the same. When we get in this state, killing X makes everything work again, and starting it again makes it fail. If we stop other processes and then start X, it works well (so it's not related with X's code in particular).
Could we be reaching some limit of ZMQ? I've read examples of people who seem to have way more processes, subscriptions, etc. than us. It could be some option we should set on the proxy; so far here are the ones we've tried without success:
Changing RCVHWM on frontend_socket
Changing SNDHWM on backend_socket
Setting XPUB_VERBOSE on backend_socket
Setting XPUB_VERBOSER on backend_socket
Here is sample code of how we publish messages to the proxy:
topic = "test"
message = {"test": "test"}
context = zmq.Context.instance()
socket = context.socket(zmq.PUB)
socket.connect("tcp://1.2.3.4:1234")
while True:
time.sleep(1)
socket.send_multipart([topic.encode(), json.dumps(message).encode()])
Here is sample code of how we subscribe to messages from the proxy:
topic = "test"
context = zmq.Context.instance()
socket = context.socket(zmq.SUB)
socket.connect("tcp://1.2.3.4:5678")
socket.subscribe(topic)
while True:
multi_part = socket.recv_multipart()
[topic, message] = multi_part
print(topic.decode(), message.decode())
Has anyone ever seen a similar issue? Is there something we can do to avoid the proxy getting in this state?
Thanks!
Make all the publishers (the proxy and the publishing processes) XPUB (plus the verbose/verboser socket option), then read from the publisher sockets in a poll loop. The first byte of each subscription message tells you whether it is a subscribe or an unsubscribe, followed by the subject/topic. If you log all of this information with timestamps, it should tell you which component is at fault (it could be any of the three) and help with a fix.
The format of the subscription messages that arrive on the publisher (XPUB) will be
Subscription [0x01][topic]
Unsubscription [0x00][topic]
Code needed
I usually work in C++, but this is the general idea in Python.
proxy
You need to create a capture socket (this acts like a network tap). You connect a ZMQ_PAIR socket to the proxy (capture) over inproc and then read the contents at the other end of the socket. As you are using XPUB/XSUB you will see the subscription messages.
zmq.proxy(frontend, backend, capture)
Read the docs/examples for the Python proxy.
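Here is a minimal sketch of that capture tap, assuming the same XSUB/XPUB proxy as in the question (the TCP ports and the inproc address are placeholders). Every frame flowing through the proxy, including the XPUB subscription frames, is copied to the capture socket, so the tap thread can log subscribe/unsubscribe events with timestamps:

import threading
import time
import zmq

context = zmq.Context.instance()

frontend_socket = context.socket(zmq.XSUB)
frontend_socket.bind("tcp://0.0.0.0:1234")
backend_socket = context.socket(zmq.XPUB)
backend_socket.bind("tcp://0.0.0.0:5678")

# The capture socket acts like a network tap on the proxy.
capture_socket = context.socket(zmq.PAIR)
capture_socket.bind("inproc://capture")

def log_captured_frames():
    tap = zmq.Context.instance().socket(zmq.PAIR)
    tap.connect("inproc://capture")
    while True:
        frame = tap.recv()
        # Subscription frames start with 0x01 (subscribe) or 0x00 (unsubscribe),
        # followed by the topic; other frames are ordinary message traffic.
        if frame[:1] == b"\x01":
            print(time.time(), "SUB", frame[1:].decode(errors="replace"))
        elif frame[:1] == b"\x00":
            print(time.time(), "UNSUB", frame[1:].decode(errors="replace"))

threading.Thread(target=log_captured_frames, daemon=True).start()
zmq.proxy(frontend_socket, backend_socket, capture_socket)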
publisher
In this case you need to read from the publishing socket in the same thread as you are sending on it. That's the reason I said a poll loop might be best.
This code is not tested at all.
topic = "test"
message = {"test": "test"}
context = zmq.Context.instance()
socket = context.socket(zmq.XPUB)
socket.connect("tcp://1.2.3.4:1234")
poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)
timeout = 1000 #ms
while True:
socks = dict(poller.poll(timeout))
if not socks : # 1
socket.send_multipart([topic.encode(), json.dumps(message).encode()])
if socket in socks:
sub_msg = socket.recv()
# print out the message here.

ZeroMQ REQ .recv() hangs with messages larger than ~1kB if run inside Docker

I'm working on a relatively simple Python / ZeroMQ based work distribution system, using REQ/ROUTER sockets. The system is distributed and worker nodes are geographically distributed on different continents.
The ROUTER, responsible for distributing work, .bind()-s a ROUTER socket. Workers .connect() to it over TCP using a REQ socket.
In the process of setting up a new worker node, I've noticed that while smaller messages (up to 1kB) do the trip with no issues, replies of ~2kB and up, sent by the ROUTER-end are never received by the worker into their REQ-socket - when I call recv(), the socket just hangs.
The worker code runs inside Docker containers, and I was able to work around the issue when running the same image with --net=host - it seems to not happen if Docker is using the host network.
I'm wondering if this is something in the network stack configuration on the host machine or in Docker, or maybe something that can be prevented in my code?
Here is a simplified version of my code that reproduces this issue:
Worker
import sys
import zmq
import logging
import time

READY = 'R'

def worker(connect_to):
    ctx = zmq.Context()
    socket = ctx.socket(zmq.REQ)
    socket.connect(connect_to)
    log = logging.getLogger(__name__)
    while True:
        socket.send_string(READY)
        log.debug("Send READY message, waiting for reply")
        message = socket.recv()
        log.debug("Got reply of %d bytes", len(message))
        time.sleep(5)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    worker(sys.argv[1])
Router
import sys
import zmq
import logging

REPLY_SIZE = 1024 * 8

def router(bind_to):
    ctx = zmq.Context()
    socket = ctx.socket(zmq.ROUTER)
    socket.bind(bind_to)
    poller = zmq.Poller()
    poller.register(socket, zmq.POLLIN)
    log = logging.getLogger(__name__)
    while True:
        socks = dict(poller.poll(5000))
        if socks.get(socket) == zmq.POLLIN:
            message = socket.recv_multipart()
            log.debug("Received message of %d parts", len(message))
            identity, _ = message[:2]
            res = handle_message(message[2:])
            log.debug("Sending %d bytes back in response on socket", len(res))
            socket.send_multipart([identity, '', res])

def handle_message(parts):
    log = logging.getLogger(__name__)
    log.debug("Got message: %s", parts)
    return 'A' * REPLY_SIZE

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    router(sys.argv[1])
FWIW I was able to reproduce this on Ubuntu 16.04 (both router and worker) with Docker 17.09.0-ce, libzmq 4.1.5 and PyZMQ 15.4.0.
No, sir, the socket does not hang at all.

Why?

The issue is that you have instructed the Socket() instance to enter an infinitely blocking state by calling the .recv() method without the zmq.NOBLOCK flag (the ZMQ_DONTWAIT flag in the original ZeroMQ API).

That is why the code blocks forever under the circumstances reported: something else prevents the Docker container from delivering any first message to the worker's Docker-embedded ZeroMQ Context() I/O engine and on to the REQ access point. The REQ archetype is a strict two-step finite-state automaton, strictly alternating .send() -> .recv() -> .send() -> ... ad infinitum.

Reversing this cause and effect is wrong and misleading: the symptom that "the socket just hangs" cannot be distinguished from the real issue, which is that Docker never delivers a single message (delivery would allow .recv() to return).

Next steps:
Use .poll() on the REQ side to check, without blocking, whether any message has already arrived at the worker.
If none has, focus on Docker first; after that, the ZeroMQ Context() I/O engine's performance and link-level configuration options may be worth tweaking.
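As a sketch of that first step (not the original worker, just the suggested .poll() applied to it), the blocking .recv() can be bounded so a missing reply surfaces as a timeout rather than a hang:

import logging
import sys
import zmq

READY = 'R'

def worker(connect_to, timeout_ms=10000):
    log = logging.getLogger(__name__)
    ctx = zmq.Context()
    socket = ctx.socket(zmq.REQ)
    socket.connect(connect_to)
    poller = zmq.Poller()
    poller.register(socket, zmq.POLLIN)
    while True:
        socket.send_string(READY)
        if poller.poll(timeout_ms):  # an empty list means the poll timed out
            message = socket.recv()
            log.debug("Got reply of %d bytes", len(message))
        else:
            log.warning("No reply within %d ms - inspect the Docker/network path", timeout_ms)
            # A REQ socket must complete .recv() before the next .send(),
            # so rebuild the socket before retrying.
            poller.unregister(socket)
            socket.close(linger=0)
            socket = ctx.socket(zmq.REQ)
            socket.connect(connect_to)
            poller.register(socket, zmq.POLLIN)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    worker(sys.argv[1])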

Python Pika and RabbitMQ Connecting to Publish

Trying to send data into a RabbitMQ queue using Python.
I haven't configured the server but it is running for other processes. I have a working login and can access the web output without problem.
The example code RabbitMQ gives for python uses Pika:
#!/usr/bin/env python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(
    host='xxx.xxx.xxx.xxx:xxxxx'))
channel = connection.channel()

channel.queue_declare(queue='Test')
channel.basic_publish(exchange='',
                      routing_key='hello',
                      body='Hello World!')
print(" [x] Sent 'Hello World!'")
connection.close()
This runs and kicks me off with:
pika.exceptions.ConnectionClosed
Not a lot to go on, but a safe assumption is that it's a login issue, because the example code doesn't have any login info.
So I added it.
import pika
import sys

try:
    credentials = pika.PlainCredentials('username', 'password')
    connection = pika.BlockingConnection(pika.ConnectionParameters('xxx.xxx.xxx.xxx',
                                                                   xxxxx,
                                                                   'virtualhostnamehere',
                                                                   credentials,))
    channel = connection.channel()
    channel.queue_declare(queue='Test')
    channel.basic_publish(exchange='amq.direct',
                          body='Hello World!')
    print(" [x] Sent 'Hello World!'")
except:
    e = sys.exc_info()[0]
    print e
It seems to hang around for a good few minutes before giving me:
<class 'pika.exceptions.IncompatibleProtocolError'>
The server is running other services fine but I can't seem to pinpoint what I've done wrong.
The login is correct. The vhost name is correct. The host is correct. The exchange name is correct.
Would appreciate a point in the right direction.
Update:
I've tried using URLParameters as well with the same results.
parameters = pika.URLParameters('amqp://username:password@xxx.xxx.xxx.xxx:xxxxx/notmyvhostname')
connection = pika.BlockingConnection(parameters)
But I guess the port doesn't change anything. It's port 15672, and the login is the same one I use to get to the browser output.
Use port 5672, or whichever port you have configured for the AMQP listener. Port 15672 is for web UI access, which is done over HTTP, hence the incompatible protocol error.
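A minimal sketch of the corrected connection, assuming the broker's AMQP listener is on the default port 5672 (the host, vhost and credentials stay the placeholders from the question):

import pika

credentials = pika.PlainCredentials('username', 'password')
parameters = pika.ConnectionParameters(host='xxx.xxx.xxx.xxx',
                                       port=5672,  # AMQP listener, not the 15672 web UI
                                       virtual_host='virtualhostnamehere',
                                       credentials=credentials)
connection = pika.BlockingConnection(parameters)
channel = connection.channel()
channel.queue_declare(queue='Test')
channel.basic_publish(exchange='',
                      routing_key='Test',  # publish straight to the declared queue
                      body='Hello World!')
print(" [x] Sent 'Hello World!'")
connection.close()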

How to achieve tcpflow functionality (follow tcp stream) purely within python

I am writing a tool in Python (platform is Linux); one of the tasks is to capture a live TCP stream and to apply a function to each line. Currently I'm using
import subprocess

proc = subprocess.Popen(['sudo', 'tcpflow', '-C', '-i', interface, '-p', 'src', 'host', ip],
                        stdout=subprocess.PIPE)
for line in iter(proc.stdout.readline, ''):
    do_something(line)
This works quite well (with the appropriate entry in /etc/sudoers), but I would like to avoid calling an external program.
So far I have looked into the following possibilities:
flowgrep: a Python tool which looks just like what I need, BUT it uses pynids internally, which is 7 years old and seems pretty much abandoned. There is no pynids package for my Gentoo system, and it ships with a patched version of libnids which I couldn't compile without further tweaking.
scapy: this is a packet manipulation program/library for Python; I'm not sure if TCP stream reassembly is supported.
pypcap or pylibpcap as wrappers for libpcap. Again, libpcap is for packet capturing, whereas I need stream reassembly, which is not possible according to this question.
Before I dive deeper into any of these libraries, I would like to know if someone has a working code snippet (this seems like a rather common problem). I would also be grateful if someone could give advice about the right way to go.
Thanks
Jon Oberheide has led efforts to maintain pynids, which is fairly up to date, at:
http://jon.oberheide.org/pynids/
So this might permit you to further explore flowgrep. Pynids itself handles stream reconstruction rather elegantly. See http://monkey.org/~jose/presentations/pysniff04.d/ for some good examples.
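For reference, a rough sketch of what pynids-based stream reassembly looks like; this follows the examples in the presentation linked above, so treat the exact parameter names and the interface as assumptions to verify against your pynids version:

import nids

def handle_tcp(tcp):
    # Called by libnids for every TCP stream event after reassembly.
    if tcp.nids_state == nids.NIDS_JUST_EST:
        tcp.client.collect = 1  # collect data flowing towards the client
        tcp.server.collect = 1  # collect data flowing towards the server
    elif tcp.nids_state == nids.NIDS_DATA:
        tcp.discard(0)  # keep buffering the stream
        count = tcp.client.count
        new_data = tcp.client.data[count - tcp.client.count_new:count]
        for line in new_data.splitlines():
            do_something(line)  # the per-line callback from the question

nids.param("device", "eth0")  # interface to sniff on (assumed name)
nids.init()
nids.register_tcp(handle_tcp)
nids.run()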
Just as a follow-up: I abandoned the idea of monitoring the stream at the TCP layer. Instead I wrote a proxy in Python and let the connection I want to monitor (an HTTP session) go through this proxy. The result is more stable and does not need root privileges to run. This solution depends on pymiproxy.
This goes into a standalone program, e.g. helper_proxy.py
from multiprocessing.connection import Listener
import StringIO
from httplib import HTTPResponse
import threading
import time
from miproxy.proxy import RequestInterceptorPlugin, ResponseInterceptorPlugin, AsyncMitmProxy

class FakeSocket(StringIO.StringIO):
    def makefile(self, *args, **kw):
        return self

class Interceptor(RequestInterceptorPlugin, ResponseInterceptorPlugin):
    conn = None

    def do_request(self, data):
        # do whatever you need to sent data here, I'm only interested in responses
        return data

    def do_response(self, data):
        if Interceptor.conn:  # if the listener is connected, send the response to it
            response = HTTPResponse(FakeSocket(data))
            response.begin()
            Interceptor.conn.send(response.read())
        return data

def main():
    proxy = AsyncMitmProxy()
    proxy.register_interceptor(Interceptor)
    ProxyThread = threading.Thread(target=proxy.serve_forever)
    ProxyThread.daemon = True
    ProxyThread.start()
    print "Proxy started."
    address = ('localhost', 6000)  # family is deduced to be 'AF_INET'
    listener = Listener(address, authkey='some_secret_password')
    while True:
        Interceptor.conn = listener.accept()
        print "Accepted Connection from", listener.last_accepted
        try:
            Interceptor.conn.recv()
        except:
            time.sleep(1)
        finally:
            Interceptor.conn.close()

if __name__ == '__main__':
    main()
Start it with python helper_proxy.py. This creates a proxy listening for HTTP connections on port 8080 and listening for another Python program on port 6000. Once the other Python program has connected on that port, the helper proxy sends all HTTP replies to it. This way the helper proxy can keep running, keeping up the HTTP connection, while the listener can be restarted for debugging.
Here is how the listener works, e.g. listener.py:
from multiprocessing.connection import Client

def main():
    address = ('localhost', 6000)
    conn = Client(address, authkey='some_secret_password')
    while True:
        print conn.recv()

if __name__ == '__main__':
    main()
This will just print all the replies. Now point your browser to the proxy running on port 8080 and establish the HTTP connection you want to monitor.

Execute a Go script from Django webapp safely

How do you launch a Go script from a Django app safely?
I made a Go script which is self-contained. I would like to be able to launch a job from a Django web app (I use Celery to have the job run in the background). What would be the proper/safer way of achieving this? Maybe a way to isolate this process?
I feel that running...
os.system(f"./goscript -o {option1} -b {optiom2}")
...is quite unsafe.
As a bonus, I'd like to be able to get the output to see if the script crashes, etc., but that is a bonus question.
Something like this should help IMHO
import logging
import shlex
import subprocess

def get_output(command, working_folder=None):
    logging.debug("Executing %s in %s", command, working_folder)
    try:
        output = subprocess.check_output(shlex.split(command), cwd=working_folder)
        return output.decode("utf-8")
    except OSError:
        logging.error("Command being executed: {}".format(command))
        raise
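For example, the Go binary from the question could then be launched like this (the option names are the placeholders from the question, and the working folder is hypothetical):

output = get_output("./goscript -o {} -b {}".format(option1, option2),
                    working_folder="/path/to/goscript/dir")
logging.info("goscript finished, output was:\n%s", output)

Because shlex.split() keeps each option as a separate argument and no shell is involved, the arguments cannot be abused for shell injection the way the os.system() call could.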
Thanks to the people who've answered. I've actually just remembered that a much better solution would be to use an asynchronous messaging library. The one I'm familiar with, and which is very easy to adapt, is ZMQ: https://zeromq.org/. It's dead easy to do a server/client setup with the Go script listening as a server and the Django app making requests as a client for a job.
As a proof of concept, here are snippets from the documentation of the different libraries.
Server in Go
This script is the server, written in Go. I believe it can be set up as a service to run continuously, waiting for Django to send it a job to do.
// source: https://github.com/pebbe/zmq4/blob/master/examples/hwserver.go
//
// Hello World server.
// Binds REP socket to tcp://*:5555
// Expects "Hello" from client, replies with "World"
//
package main

import (
	zmq "github.com/pebbe/zmq4"

	"fmt"
	"time"
)

func main() {
	// Socket to talk to clients
	responder, _ := zmq.NewSocket(zmq.REP)
	defer responder.Close()
	responder.Bind("tcp://*:5555")

	for {
		// Wait for next request from client
		msg, _ := responder.Recv(0)
		fmt.Println("Received ", msg)

		// Do some 'work', which can take a while
		time.Sleep(time.Second)

		// Send reply back to client
		reply := "World"
		responder.Send(reply, 0)
		fmt.Println("Sent ", reply)
	}
}
Client in Python
Here is the Python app that can easily be called within any HTTP request. It can create the ZMQ context and start sending work to the Go server.
# source: http://zguide.zeromq.org/py:hwclient
#
# Hello World client in Python
# Connects REQ socket to tcp://localhost:5555
# Sends "Hello" to server, expects "World" back
#
import zmq

context = zmq.Context()

# Socket to talk to server
print("Connecting to hello world server…")
socket = context.socket(zmq.REQ)
socket.connect("tcp://localhost:5555")

# Do 10 requests, waiting each time for a response
for request in range(10):
    print("Sending request %s …" % request)
    socket.send(b"Hello")

    # Get the reply.
    message = socket.recv()
    print("Received reply %s [ %s ]" % (request, message))

# Gracefully closing the sockets
socket.close()
context.term()

# Back to normal django stuff
What's great with this approach is that the client can dynamically create and shut down the ZMQ context. Furthermore, you don't even have to have the Go script running on the same server; you could communicate with any IP address, provided you take care to at least encrypt the messages, or look at the security features ZMQ provides.
--
Note: I know I'm answering my own question, but it's like calling IT support: the problem solves itself right when they pick up the phone.
