Azure servicebus python - how to send a bunch of messages - python

To avoid errors when a group of messages exceeds the maximum batch size, I wrote a class that sends the messages in several batches.
First of all, it would be wonderful if somebody could show me an example of how to avoid this problem.
While trying to solve it myself, I found it extremely hard to determine the size of a message (ServiceBusMessage).
The method sb_msg.message.get_message_encoded_size() is the closest thing to what I need.
Do you know how to calculate the message size?
def send_as_json(self, msg, id_field=None, size_in_bytes=262144):
    if isinstance(msg, list):
        payload = self.topic_sender.create_message_batch(size_in_bytes)
        for m in msg:
            try:
                # add a message to the batch
                sb_msg = ServiceBusMessage(json.dumps(m), message_id=m.get(id_field, uuid.uuid4()), content_type='application/json')
                total_size = payload.size_in_bytes + sb_msg.message.get_message_encoded_size()
                if total_size > size_in_bytes:
                    _log.info(f'sending partial batch of {payload.size_in_bytes} bytes')
                    self.send_service_bus_message(payload)
                    payload = self.topic_sender.create_message_batch(size_in_bytes)
                payload.add_message(sb_msg)
            except ValueError as e:
                # ServiceBusMessageBatch object reaches max_size.
                # New ServiceBusMessageBatch object can be created here to send more data.
                raise Exception('', e)
        self.send_service_bus_message(payload)
    else:
        sb_msg = ServiceBusMessage(json.dumps(msg), message_id=msg.get(id_field, uuid.uuid4()), content_type = 'application/json')
        self.send_service_bus_message(sb_msg)

Azure ServiceBus - Python:
The code below shows how to send a batch of messages:
def send_batch_message(sender):
    # create a batch of messages
    batch_message = sender.create_message_batch()
    for _ in range(10):
        try:
            # add a message to the batch
            batch_message.add_message(ServiceBusMessage("Message inside a ServiceBusMessageBatch"))
        except ValueError:
            # ServiceBusMessageBatch object reaches max_size.
            # New ServiceBusMessageBatch object can be created here to send more data.
            break
    # send the batch of messages to the queue
    sender.send_messages(batch_message)
    print("Sent a batch of 10 messages")
For more details, see the Microsoft documentation, which clearly explains how to send messages individually, as a list, or as a batch.
How to find the size of a message:
The function below returns the size of a Python object in bytes:
import sys
# Bytes occupied in memory by the Python object. Note that this is not
# the encoded size of the message on the wire.
sys.getsizeof(variable_name)
To check the max_size_in_bytes of a batch, simply print the batch:
batch_message = sender.create_message_batch()  # Step 1: create the batch
batch_message.add_message(ServiceBusMessage("Message we want to add"))  # Step 2: add messages to the batch in a loop
print(batch_message)  # Step 3: print the batch
Output:
ServiceBusMessageBatch(max_size_in_bytes=1048576, message_count=10)
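One way to handle a list that does not fit into a single batch is to let add_message signal when the batch is full, instead of computing sizes by hand. The sketch below assumes, as the comments in the snippet above indicate, that add_message raises ValueError when a message does not fit. The names send_in_batches and make_message are hypothetical, and sender is any object that offers create_message_batch() and send_messages().

```python
def send_in_batches(sender, items, make_message):
    """Send items in as many batches as needed; return the number of batches sent.

    Hypothetical helper: `sender` must provide create_message_batch() and
    send_messages(); `make_message` turns one item into a message object.
    """
    batch = sender.create_message_batch()
    in_batch = 0       # messages added to the current batch
    batches_sent = 0
    for item in items:
        message = make_message(item)
        try:
            batch.add_message(message)
        except ValueError:
            if in_batch == 0:
                # A single message does not fit into an empty batch:
                # splitting into more batches cannot help, so give up.
                raise
            # The current batch is full: send it and start a new one.
            sender.send_messages(batch)
            batches_sent += 1
            batch = sender.create_message_batch()
            batch.add_message(message)  # still raises if this message alone is too large
            in_batch = 0
        in_batch += 1
    if in_batch:
        # Send whatever is left in the last, partially filled batch.
        sender.send_messages(batch)
        batches_sent += 1
    return batches_sent
```

With the SDK this might be called as send_in_batches(sender, rows, lambda row: ServiceBusMessage(json.dumps(row))), where rows is your list of dictionaries.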

Related

How to Compile a While Loop statement in PySpark on Apache Spark with Databricks

I'm trying to send data to my Data Lake with a while loop.
Basically, the intention is to loop continually and send data to my Data Lake whenever data is received from my Azure Service Bus, using the following code:
This code receives a message from my Service Bus:
def myfunc():
    with ServiceBusClient.from_connection_string(CONNECTION_STR) as client:
        # max_wait_time specifies how long the receiver should wait with no incoming messages before stopping receipt.
        # Default is None; to receive forever.
        with client.get_queue_receiver(QUEUE_NAME, session_id=session_id, max_wait_time=5) as receiver:
            for msg in receiver:
                # print("Received: " + str(msg))
                themsg = json.loads(str(msg))
                # complete the message so that the message is removed from the queue
                receiver.complete_message(msg)
                return themsg
This code assigns a variable to the message:
result = myfunc()
The following code sends the message to my data lake:
rdd = sc.parallelize([json.dumps(result)])
spark.read.json(rdd) \
    .write.mode("overwrite").json('/mnt/lake/RAW/FormulaClassification/F1Area/')
I would like help looping through the code so that it continually checks for messages and sends the results to my data lake.
I believe this can be accomplished with a while loop, but I'm not sure.
Just because you're using Spark doesn't mean you cannot loop.
First of all, you're only returning the first message from your receiver, so it should look like this:
with client.get_queue_receiver(QUEUE_NAME, session_id=session_id, max_wait_time=5) as receiver:
    msg = next(receiver)
    # print("Received: " + str(msg))
    themsg = json.loads(str(msg))
    # complete the message (the message object, not its string form) so that it is removed from the queue
    receiver.complete_message(msg)
    return themsg
To answer your question:
while True:
    result = json.dumps(myfunc())
    rdd = sc.parallelize([result])
    # You should use rdd.toDF().json here instead
    spark.read.json(rdd) \
        .write.mode("overwrite").json('/mnt/lake/RAW/FormulaClassification/F1Area/')
Keep in mind that the output file names aren't consistent, and you might not want them to be overwritten.
Alternatively, you could look into writing your own Source / SparkDataStream class that defines a Spark SQL source, so that you don't need a loop in your main method and Spark handles it natively.
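If overwriting is the concern, one option is to write each result to its own directory. The sketch below is only an illustration: unique_output_path is a hypothetical helper, and the timestamp-plus-random-suffix naming scheme is an assumption, not something Spark requires.

```python
from datetime import datetime, timezone
import uuid

def unique_output_path(base_dir, now=None):
    """Return a fresh sub-directory path under base_dir for a single write."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%S")
    # A short random suffix keeps two writes within the same second apart.
    suffix = uuid.uuid4().hex[:8]
    return f"{base_dir.rstrip('/')}/{stamp}-{suffix}/"
```

Inside the loop, the path could then be passed as .json(unique_output_path('/mnt/lake/RAW/FormulaClassification/F1Area')). Another option is mode("append"), which adds new files to the directory instead of replacing it.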

Pulling historical channel messages python

I am attempting to create a small dataset by pulling messages/responses from a Slack channel I am a part of. I would like to use Python to pull the data from the channel; however, I am having trouble figuring out my API key. I have created an app on Slack, but I am not sure how to find my API key. I see my client secret, signing secret, and verification token, but can't find my API key.
Here is a basic example of what I believe I am trying to accomplish:
import slack
sc = slack.SlackClient("api key")
sc.api_call(
    "channels.history",
    channel="C0XXXXXX"
)
I am willing to just download the data manually if that is possible as well. Any help is greatly appreciated.
messages
Below is example code showing how to pull messages from a channel in Python.
It uses the official Python Slack library and calls conversations_history with paging. It will therefore work with any type of channel and can fetch large numbers of messages if needed.
The result is written to a file as a JSON array.
You can specify the channel and the maximum number of messages to retrieve.
threads
Note that the conversations.history endpoint will not return thread messages. Those have to be retrieved additionally, with one call to conversations.replies for every thread you want to retrieve messages for.
Threads can be identified in the messages for each channel by checking for the thread_ts property in the message. If it exists, there is a thread attached to it. See this page for more details on how threads work.
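As a rough sketch of the thread check described above: the helper below assumes that a message which starts a thread carries a thread_ts property equal to its own ts. The name thread_parents is hypothetical.

```python
def thread_parents(messages):
    """Return the ts of every message that is the parent of a thread.

    Assumption: a thread's parent message has thread_ts equal to its own ts.
    """
    return [
        m["ts"]
        for m in messages
        if "thread_ts" in m and m["thread_ts"] == m.get("ts")
    ]
```

Each returned value could then be passed as the ts argument of a conversations_replies call for the same channel.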
IDs
This script will not replace IDs with names, though. If you need that, here are some pointers on how to implement it:
You need to replace IDs for users, channels, bots, and usergroups (if on a paid plan).
You can fetch the lists for users, channels and usergroups from the API with users_list, conversations_list and usergroups_list respectively; bots need to be fetched one by one with bots_info (if needed).
IDs occur in many places in messages:
user top level property
bot_id top level property
as a link in any property that allows text, e.g. <@U12345678> for users or <#C1234567> for channels. Those can occur in the top level text property, but also in attachments and blocks.
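The link replacement mentioned in the last pointer could be sketched as follows. This assumes user mentions look like <@U12345678> and channel links like <#C1234567>, optionally followed by a |label part. The helper replace_ids and its two lookup dictionaries are hypothetical and would be built from the users and channels lists.

```python
import re

# Matches <@U...> and <#C...>, with an optional "|label" before the closing bracket.
MENTION = re.compile(r"<([@#])([A-Z0-9]+)(?:\|[^>]*)?>")

def replace_ids(text, users, channels):
    """Replace user and channel links in text with @name / #name."""
    def repl(match):
        kind, ident = match.group(1), match.group(2)
        table = users if kind == "@" else channels
        name = table.get(ident)
        # Leave unknown IDs untouched rather than guessing a name.
        return match.group(0) if name is None else kind + name
    return MENTION.sub(repl, text)
```

The same function would need to be applied to text inside attachments and blocks as well, not only to the top level text property.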
Example code
import os
import slack
import json
from time import sleep
CHANNEL = "C12345678"
MESSAGES_PER_PAGE = 200
MAX_MESSAGES = 1000
# init web client
client = slack.WebClient(token=os.environ['SLACK_TOKEN'])
# get first page
page = 1
print("Retrieving page {}".format(page))
response = client.conversations_history(
    channel=CHANNEL,
    limit=MESSAGES_PER_PAGE,
)
assert response["ok"]
messages_all = response['messages']
# get additional pages if below max message and if they are any
while len(messages_all) + MESSAGES_PER_PAGE <= MAX_MESSAGES and response['has_more']:
    page += 1
    print("Retrieving page {}".format(page))
    sleep(1)  # need to wait 1 sec before next call due to rate limits
    response = client.conversations_history(
        channel=CHANNEL,
        limit=MESSAGES_PER_PAGE,
        cursor=response['response_metadata']['next_cursor']
    )
    assert response["ok"]
    messages = response['messages']
    messages_all = messages_all + messages
print(
    "Fetched a total of {} messages from channel {}".format(
        len(messages_all),
        CHANNEL
    ))
# write the result to a file
with open('messages.json', 'w', encoding='utf-8') as f:
    json.dump(
        messages_all,
        f,
        sort_keys=True,
        indent=4,
        ensure_ascii=False
    )
This uses the Slack Web API. You need to install the requests package. It should grab all the messages in a channel. You need a token, which can be obtained from the apps management page. You can use the getChannels() function to look up channel IDs. Once you have all the messages, you will need to see who wrote each message by matching IDs to usernames; for that you can use the getUsers() function. Follow https://api.slack.com/custom-integrations/legacy-tokens to generate a legacy token if you do not want to use a token from your app.
import requests

def getMessages(token, channelId):
    print("Getting Messages")
    # this function gets the messages from the given channel
    # (here, the team-search channel)
    slack_url = "https://slack.com/api/conversations.history?token=" + token + "&channel=" + channelId
    messages = requests.get(slack_url).json()
    return messages

def getChannels(token):
    '''
    function returns an object containing a dictionary that maps the name
    of every channel in a given workspace to its id
    '''
    channelsURL = "https://slack.com/api/conversations.list?token=%s" % token
    channelList = requests.get(channelsURL).json()["channels"]  # an array of channels
    channels = {}
    # putting the channels and their ids into a dictionary
    for channel in channelList:
        channels[channel["name"]] = channel["id"]
    return {"channels": channels}

def getUsers(token):
    # this function gets a list of users in the workspace, including bots
    usersURL = "https://slack.com/api/users.list?token=%s&pretty=1" % token
    members = requests.get(usersURL).json()["members"]
    return members
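The ID matching step described above can be sketched like this. It assumes each member returned by users.list has an id and a name, and possibly a real_name, and that each message may have a top level user property. Both helper names are hypothetical.

```python
def user_names_by_id(members):
    """Build an id -> display name lookup from a users.list members array."""
    return {m["id"]: m.get("real_name") or m["name"] for m in members}

def annotate_messages(messages, names):
    """Return copies of the messages with an added user_name key.

    Messages whose user is unknown (or missing) keep the raw value.
    """
    return [
        dict(m, user_name=names.get(m.get("user"), m.get("user")))
        for m in messages
    ]
```

For example, annotate_messages(messages, user_names_by_id(getUsers(token))) would attach a name to every message that getMessages returned in its messages list.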

I cannot fetch a mail using imap in python

The fetch method gives this error:
imaplib.IMAP4.error: FETCH command error: BAD [b'Could not parse command']
I am not attaching all of my code. I want to fetch the unseen messages over IMAP, save each body as text, and then download the attachments.
import imaplib, email, os
user= "test9101997"
password="Monday#123"
imap_url="imap.gmail.com"
# raw strings, so that backslashes in Windows paths are not treated as escapes
attach_dir=r'E:\PROJECT\attachment'
filePath=r'D:\ATTACH'
con=imaplib.IMAP4_SSL(imap_url)
con.login(user,password)
con.select('INBOX')
#UIDs=con.search(None,'UNSEEN')
#print(UIDs)
(result, messages) = con.search(None, 'UnSeen')
if result == "OK":
    for message in messages:
        try:
            ret, data =con.fetch(message,'(RFC822)')
        except:
            print ("No new emails to read.")
            #self.close_connection()
            #exit()
        #result, data=con.fetch(i,'(RFC822)')
        raw=email.message_from_bytes(data[0][1])
I think you may be confused about the return value of con.search(). If you take a look at the value of messages after that call (assuming that result is OK), it's a list of space-separated strings of message ids, not a list of individual message ids. That is, after a call like:
result, messages = con.search(None, 'UnSeen')
The value of messages may look something like:
['1 2 15 20']
So when you try to iterate over it like this:
for message in messages:
The value of message in the first loop iteration will be 1 2 15 20, and that's why you're getting the command error: the request you're making doesn't make any sense. You'll want to do something like this instead:
(result, blocks) = con.search(None, 'UnSeen')
if result == "OK":
    for messages in blocks:
        for message in messages.split():
            ret, data = con.fetch(message, '(RFC822)')
            raw = email.message_from_bytes(data[0][1])
There's really no good reason for the imaplib module to return data in this fashion.
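The splitting step can also be pulled out into a small helper, which makes it easy to check on its own. This is only a sketch. The name message_ids is hypothetical, and the input is assumed to have the shape shown above (in Python 3 the blocks are bytes).

```python
def message_ids(search_data):
    """Flatten the data part of an IMAP SEARCH response into individual ids."""
    ids = []
    for block in search_data:
        # Each block is a whitespace-separated run of ids, e.g. b'1 2 15 20'.
        ids.extend(block.split())
    return ids
```

Each element of the returned list can then be passed to con.fetch(message, '(RFC822)') one at a time. An empty result means there are no unseen messages.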

receiving and sending mavlink messages using pymavlink library

I have created a proxy between QGC(Ground Control Station) and vehicle in Python. Here is the code:
gcs_conn = mavutil.mavlink_connection('tcpin:localhost:15795')
gcs_conn.wait_heartbeat()
print("Heartbeat from system (system %u component %u)" %(gcs_conn.target_system, gcs_conn.target_component))
vehicle = mavutil.mavlink_connection('tcp:localhost:5760')
vehicle.wait_heartbeat() # receiving heartbeat from the vehicle
print("Heartbeat from system (system %u component %u)" %(vehicle.target_system, vehicle.target_component))
while True:
    gcs_msg = gcs_conn.recv_match()
    if gcs_msg is None:
        pass
    else:
        vehicle.mav.send(gcs_msg)
        print(gcs_msg)
    vcl_msg = vehicle.recv_match()
    if vcl_msg is None:
        pass
    else:
        gcs_conn.mav.send(vcl_msg)
        print(vcl_msg)
I need to receive the messages from the QGC and then forward them to the vehicle and also receive the messages from the vehicle and forward them to the QGC.
When I run the code I get this error.
Is there anyone who can help me?
If you print your message before sending, you'll notice it always fails when you try to send a BAD_DATA message type.
So this should fix it (same for vcl_msg):
if gcs_msg and gcs_msg.get_type() != 'BAD_DATA':
    vehicle.mav.send(gcs_msg)
PS: I noticed that you don't specify tcp as input or output, so it defaults to input. That means both connections are inputs. I recommend setting up the GCS connection as output:
gcs_conn = mavutil.mavlink_connection('tcp:localhost:15795', input=False)
https://mavlink.io/en/mavgen_python/#connection_string
For forwarding MAVLink successfully a few things need to happen. I'm assuming you need a usable connection to a GCS, like QGroundControl or MissionPlanner. I use QGC, and my design has basic testing with it.
Note that this is written with Python3. This snippet is not tested, but I have a (much more complex) version tested and working.
from pymavlink import mavutil
import time
# PyMAVLink has an issue that received messages which contain strings
# cannot be resent, because they become Python strings (not bytestrings)
# This converts those messages so your code doesn't crash when
# you try to send the message again.
def fixMAVLinkMessageForForward(msg):
    msg_type = msg.get_type()
    if msg_type in ('PARAM_VALUE', 'PARAM_REQUEST_READ', 'PARAM_SET'):
        if type(msg.param_id) == str:
            msg.param_id = msg.param_id.encode()
    elif msg_type == 'STATUSTEXT':
        if type(msg.text) == str:
            msg.text = msg.text.encode()
    return msg
# Modified from the snippet in your question
# UDP will work just as well or better
gcs_conn = mavutil.mavlink_connection('tcp:localhost:15795', input=False)
gcs_conn.wait_heartbeat()
print(f'Heartbeat from system (system {gcs_conn.target_system} component {gcs_conn.target_component})')
vehicle = mavutil.mavlink_connection('tcp:localhost:5760')
vehicle.wait_heartbeat()
print(f'Heartbeat from system (system {vehicle.target_system} component {vehicle.target_component})')
while True:
    # Don't block for a GCS message - we have messages
    # from the vehicle to get too
    gcs_msg = gcs_conn.recv_match(blocking=False)
    if gcs_msg is None:
        pass
    elif gcs_msg.get_type() != 'BAD_DATA':
        # We now have a message we want to forward. Now we need to
        # make it safe to send
        gcs_msg = fixMAVLinkMessageForForward(gcs_msg)
        # Finally, in order to forward this, we actually need to
        # hack PyMAVLink so the message has the right source
        # information attached.
        vehicle.mav.srcSystem = gcs_msg.get_srcSystem()
        vehicle.mav.srcComponent = gcs_msg.get_srcComponent()
        # Only now is it safe to send the message
        vehicle.mav.send(gcs_msg)
        print(gcs_msg)
    vcl_msg = vehicle.recv_match(blocking=False)
    if vcl_msg is None:
        pass
    elif vcl_msg.get_type() != 'BAD_DATA':
        # We now have a message we want to forward. Now we need to
        # make it safe to send
        vcl_msg = fixMAVLinkMessageForForward(vcl_msg)
        # Finally, in order to forward this, we actually need to
        # hack PyMAVLink so the message has the right source
        # information attached.
        gcs_conn.mav.srcSystem = vcl_msg.get_srcSystem()
        gcs_conn.mav.srcComponent = vcl_msg.get_srcComponent()
        gcs_conn.mav.send(vcl_msg)
        print(vcl_msg)
    # Don't abuse the CPU by running the loop at maximum speed
    time.sleep(0.001)
Notes
Make sure your loop isn't being blocked
The loop must quickly check whether a message is available from one connection or the other, instead of waiting for a message to be available from a single connection. Otherwise a message on the other connection will not go through until the blocking connection has a message.
Check message validity
Check that you actually got a valid message, as opposed to a BAD_DATA message. Attempting to send BAD_DATA will crash.
Make sure the recipient gets the correct information about the sender
By default PyMAVLink, when sending a message, will encode YOUR system and component IDs (usually left at zero) instead of the IDs from the message. A GCS receiving this (e.g. QGC) may be confused and not properly connect to the vehicle, despite showing the messages in MAVLink inspector.
This is fixed by hacking PyMAVLink such that your system and component IDs match the forwarded message. This can be reverted after the message is sent if necessary. See the example to see how I did it.
Loop update rate
It's important that the update rate is fast enough to handle high traffic conditions (especially, say, for downloading params), but it shouldn't peg the CPU either. I find that a 1000 Hz update rate works well enough.
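Reverting the source IDs after a send, as mentioned in the notes, could be wrapped in a small context manager. This is a sketch under the assumption that the sender object exposes writable srcSystem and srcComponent attributes, as the example above uses them. The name forwarding_as is hypothetical.

```python
from contextlib import contextmanager

@contextmanager
def forwarding_as(mav, src_system, src_component):
    """Temporarily send with the given source IDs, then restore the originals."""
    original = (mav.srcSystem, mav.srcComponent)
    mav.srcSystem, mav.srcComponent = src_system, src_component
    try:
        yield mav
    finally:
        # Restore our own IDs even if the send raised an exception.
        mav.srcSystem, mav.srcComponent = original
```

In the loop this might be used as: with forwarding_as(vehicle.mav, gcs_msg.get_srcSystem(), gcs_msg.get_srcComponent()): vehicle.mav.send(gcs_msg).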

Deleting Messages in Slack

Sooo, I'm relatively new to programming, and trying to learn how to consume APIs. I figured I would start out by building a Slack bot for moderation purposes since I use Slack a lot. For the most part, everything works except for when I try to delete a message. The API returns saying it can't find the message even though it is there in the channel (the Slack API uses timestamps to locate said message). The timestamps match, but the API claims the message doesn't exist. Here is my code:
def __init__(self, token):
    self.token = token
    self.users = {}
    self.channels = {}
    self.slack = SlackClient(self.token)
    self.as_user = True
def connect(self):
    if self.slack.rtm_connect():
        self.post_message('#test', "*AUTOMOD* _v0.1_")
        while True:
            # print(self.slack.rtm_read())
            self.parse_data(self.slack.rtm_read())
            time.sleep(1)
def parse_data(self, payload):
    if payload:
        if payload[0]['type'] == 'message':
            print(("user: {} message: {} channel: {}").format(payload[0]['user'], payload[0]['text'], payload[0]['channel']))
            self.handle_message(payload[0])
def handle_message(self, data):
    # these users can post whatever they want.
    WHITELISTED = ["U4DU2TS2F", "U3VSRJJD8", "U3WLZUTQE", "U3W1Q2ULT"]
    # get userid
    sent_from = (data['user'])
    # ignore whitelisted
    if sent_from in WHITELISTED:
        return
    # if message was sent from someone not in WHITELISTED, delete it
    else:
        print(("\n\ntimestamp of message: {}").format(data['ts']))
        self.delete_message(data['channel'], data['ts'])
        self.post_message(data['channel'], "```" + random.choice(dongers) + "```")
def delete_message(self, channel, timestamp):
    print(("deleting message in channel '{}'...").format(channel))
    print("timestamp check (just to make sure): ", timestamp)
    deleted = self.slack.api_call("chat.delete",
        channel=channel,
        timestamp=timestamp,
        as_user=self.as_user
    )
    if deleted.get('ok'):
        print("\nsuccessfully deleted.\n")
    else:
        print(("\ncouldn't delete message: {}\n").format(deleted['error']))
OUTPUT
timestamp of message: 1488822718.000040
deleting message in channel: 'G4DGYCW2X'
timestamp check (just to make sure...): 1488822718.000040
couldn't delete message: message_not_found
Any ideas on what could be happening? Here is the chat.delete method for context.
EDIT:
Due to @pvg's recommendation of a "Minimal, Complete, and Verifiable example", I have placed the ENTIRE code from the project in a gist.
One issue might be that you appear to be passing a timestamp parameter to chat.delete, when the API method takes a ts parameter instead. (See docs)
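A corrected call could look like the sketch below, which passes the message timestamp as ts. The function takes the client's api_call as an argument so it can be tried without a live connection. The standalone delete_message signature is hypothetical and differs from the method in the question.

```python
def delete_message(api_call, channel, ts, as_user=True):
    """Call chat.delete through api_call and return (ok, error).

    api_call is expected to behave like SlackClient.api_call from the
    question: it takes the method name plus keyword arguments and
    returns the decoded response as a dictionary.
    """
    response = api_call("chat.delete", channel=channel, ts=ts, as_user=as_user)
    return bool(response.get("ok")), response.get("error")
```

Inside the class from the question this would be called as delete_message(self.slack.api_call, data['channel'], data['ts']).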
