Confluent Kafka python schema parser causes conflict with fastavro

Confluent Kafka python schema parser causes conflict with fastavro - python

I am running Python 3.9 with Confluent Kafka 1.7.0, avro-python3 1.10.0 and fastavro 1.4.1.
The following code uses Avro schema encoder in order to encode a message, which succeeds only if we transform the resulting schema encoding by getting rid of the MappingProxyType:
from confluent_kafka import Producer
from confluent_kafka.avro import CachedSchemaRegistryClient, MessageSerializer
from fastavro.schema import parse_schema
from fastavro.validation import validate
from types import MappingProxyType
from typing import Any
import sys
def transformMap(item: Any) -> Any:
if type(item) in {dict, MappingProxyType}:
return {k:transformMap(v) for k,v in item.items()}
elif type(item) is list:
return [transformMap(v) for v in item]
else:
return item
def main(argv = None):
msgType = 'InstrumentIdMsg'
idFigi = 'BBG123456789'
head = {'sDateTime': 1, 'msgType': msgType, 'srcSeq': 1,
'rDateTime': 1, 'src': 'Brownstone', 'reqID': None,
'sequence': 1}
msgMap = {'head': head, 'product': 'Port', 'idIsin': None, 'idFigi': idFigi,
'idBB': None, 'benchmark': None, 'idCusip': None,'idCins': None}
registryClient = CachedSchemaRegistryClient(url = 'http://local.KafkaRegistry.com:8081')
schemaId, schema, version = registryClient.get_latest_schema(msgType)
serializer = MessageSerializer(registry_client = registryClient)
schemaMap = schema.to_json()
# NOTE:
# schemaMap cannot be used since it uses mappingproxy
# which causes validate() and parse_schema() to throw
schemaDict = transformMap(schemaMap)
isValid = validate(datum = msgMap, schema = schemaDict, raise_errors = True)
parsed_schema = parse_schema(schema = schemaDict)
msg = serializer.encode_record_with_schema_id(schema_id = schemaId,
record = msgMap)
producer = Producer({'bootstrap.servers': 'kafkaServer:9092'})
producer.produce(key = idFigi,
topic = 'TOPIC_NAME',
value = msg)
return 0
if __name__ == '__main__':
sys.exit(main())
The transformation basically leaves everything unchanged except altering MappingProxyType to dict instances.
Is there a problem in the way I am calling the standard library which causes mapping proxy to be used, which in turn causes fastavro to throw? Can this be fixed by something as a user, or is this really a bug in the Confluent Kafka library?
In addition, the output schemaId from registryClient.get_latest_schema() is marked in the docs to return str but returns int. If I understand correctly, this is the intended input into the schema_id parameter of serializer.encode_record_with_schema_id() (and it works correctly if I call it), which is also marked as int. Is that a typo in the docs? In other words, it seems either registryClient.get_latest_schema() should return an integer, or serializer.encode_record_with_schema_id() should take a string, or I am doing something incorrectly :) Which one is it?
Thank you very much.

Related

In the controllers.py file of odoo, how to get the json string in the post request body?

I have defined a class in the controllers.py file to receive HTTP requests. The remote server sends a POST request and the data in the request body is a JSON string.
I can get the data in the request body directly by converting the JSON string to a dictionary via method http.request.jsonrequest, but for now, I need to get the original JSON string in the request body directly instead of a dictionary to verify a signature.
The method(json.dumps()) of directly converting to JSON strings cannot be used, as the string obtained in this way is not the same as the JSON string in the original request body, which can lead to a failure when verifying the signature.
What should I do about it? Please help me. Thank you.
this is my controllers.py
# -*- coding: utf-8 -*-
from odoo import http
class CallbackNotification(http.Controller):
def safety_judgement(self):
"""
:return:
"""
header = http.request.httprequest.headers.environ
signature = header['HTTP_X_TSIGN_OPEN_SIGNATURE']
time_stamp = header['HTTP_X_TSIGN_OPEN_TIMESTAMP']
remote_addr = http.request.httprequest.remote_addr
if remote_addr != '47.99.80.224':
return
#http.route('/signature/process/my_odoo', type='json', auth='none')
def receive_institution_auth(self, **kw):
"""
:param kw:
:return:
"""
self.safety_judgement()
request_body = http.request.jsonrequest
action = request_body['action']
flow_num = request_body['flowId']
http_env = http.request.env
sign_process_id = http_env['sign.process'].sudo().search([('flow_num', '=', flow_num)]).id
if action == 'SIGN_FLOW_UPDATE':
third_order = request_body['thirdOrderNo']
name_id_user_list = third_order.split(',')
model = name_id_user_list[0]
record_id = name_id_user_list[1]
approve_user_id = name_id_user_list[2]
if approve_user_id != 'p':
record_obj = http_env[model].sudo(user=int(approve_user_id)).browse(int(record_id))
sign_result = request_body['signResult']
result_description = request_body['resultDescription']
account_num = request_body['accountId']
org_or_account_num = request_body['authorizedAccountId']
sign_user_id = http_env['sign.users'].sudo().search([('account_num','=',account_num)]).id
http_manual_env = http_env['manual.sign'].sudo()
if account_num == org_or_account_num:
manual_id = http_manual_env.search([('sign_process_id','=',sign_process_id),
('sign_user_id','=',sign_user_id)]).id
else:
institution_id = http_env['institution.account'].sudo().search([('org_num','=',org_or_account_num)]).id
manual_id = http_manual_env.search([('sign_process_id', '=', sign_process_id),
('sign_user_id', '=', sign_user_id),
('institution_id','=',institution_id)]).id
if sign_result == 2:
http_manual_env.browse(manual_id).write({'sign_result':'success'})
http.request._cr.commit()
if approve_user_id != 'p':
record_obj.approve_action('approved','')
else:
http_env[model].sudo().browse(int(record_id)).write({'partner_sign_state':'success'})
elif sign_result == 3:
http_manual_env.browse(manual_id).write({'sign_result':'failed'})
if approve_user_id == 'p':
http_env[model].sudo().browse(int(record_id)).write({'partner_sign_state':'failed'})
elif sign_result == 4:
http_manual_env.browse(manual_id).write({'sign_result':'reject'})
http.request._cr.commit()
if approve_user_id != 'p':
record_obj.approve_action('reject', result_description)
else:
http_env[model].sudo().browse(int(record_id)).write({'partner_sign_state':'reject','partner_reject_reason':result_description})

To obtain a raw body data, you can use the following code:
raw_body_data = http.request.httprequest.data

pychong
You can send the value from JSON-RPC into your Json controller.
Js File:
Where pass the value before calling the controller like this.
var ajax = require('web.ajax');
ajax.jsonRpc("/custom/url", 'call', {'Your Key': Your Value}).then(function(data) {
if (data) {
// Code
} else {
// Code
}});
Py File :
Get the data from the post like this,
#http.route(['/custom/url'], type='json', auth="public", website=True)
def custom_cotroller(self, **post):
get_data = post.get('Your Key')
# Your Customise Code
Thanks

#Dipen Shah #CoolFlash95 #Charif DZ
Hello everyone,I've found a solution to this problem.But as I lay out the solution, I hope we can understand the root cause of the problem, so let's examine odoo's source code.
from odoo.http import JsonRequest--odoo version 10.0--line598
from odoo.http import JsonRequest--odoo version 11.0--line609
In Odoo10
request = self.httprequest.stream.read(),thenself.jsonrequest = json.loads(request)
In Odoo11
request=self.httprequest.get_data().decode(self.httprequest.charset),thenself.jsonrequest = json.loads(request)
We find that the self object of JsonRequest has the attribute jsonrequest that the type is dict.Unfortunately, the source code does not allow self to have 'another' attribute, which contains the original string in the request body.However,it is very easy to add the 'another' attribute,why not?
We can use setattr to dynamically change the methods in the source code.Let's change the method__init__of JsonRequest and add another attribute named stream_str.
eg.Odoo version is 10,python version is 2.7
# -*- coding: utf-8 -*-
import logging
from odoo.http import JsonRequest
import werkzeug
import json
_logger = logging.getLogger(__name__)
def __init__(self, *args):
"""
We have copied the method __init__ directly from the source code and added
only one line of code to it
"""
super(JsonRequest, self).__init__(*args)
self.jsonp_handler = None
args = self.httprequest.args
jsonp = args.get('jsonp')
self.jsonp = jsonp
request = None
request_id = args.get('id')
if jsonp and self.httprequest.method == 'POST':
# jsonp 2 steps step1 POST: save call
def handler():
self.session['jsonp_request_%s' % (request_id,)] = self.httprequest.form['r']
self.session.modified = True
headers = [('Content-Type', 'text/plain; charset=utf-8')]
r = werkzeug.wrappers.Response(request_id, headers=headers)
return r
self.jsonp_handler = handler
return
elif jsonp and args.get('r'):
# jsonp method GET
request = args.get('r')
elif jsonp and request_id:
# jsonp 2 steps step2 GET: run and return result
request = self.session.pop('jsonp_request_%s' % (request_id,), '{}')
else:
# regular jsonrpc2
request = self.httprequest.stream.read()
# We added this line of code,a new attribute named stream_str contains the origin string in request body when the request type is json.
self.stream_str = request
# Read POST content or POST Form Data named "request"
try:
self.jsonrequest = json.loads(request)
except ValueError:
msg = 'Invalid JSON data: %r' % (request,)
_logger.info('%s: %s', self.httprequest.path, msg)
raise werkzeug.exceptions.BadRequest(msg)
self.params = dict(self.jsonrequest.get("params", {}))
self.context = self.params.pop('context', dict(self.session.context))
# Replacing the __init__ method in the source code with the new __init__ method, but without changing the source code
setattr(JsonRequest, '__init__', __init__)
In the definition of the routing function, we can do this.
# -*- coding: utf-8 -*-
from odoo.http import Controller,route,request
class CallbackNotification(http.Controller):
#route('/signature/process/my_odoo', type='json', auth='none')
def receive_institution_auth(self, **kw):
# When the type='json',the request is the object of JsonRequest,we can get the new attribute stream_str very easy!
stream_str = request.stream_str
Now the problem has been solved.

How can we list all the parameters in the aws parameter store using Boto3? There is no ssm.list_parameters in boto3 documentation?

SSM — Boto 3 Docs 1.9.64 documentation
get_parameters doesn't list all parameters?

For those who wants to just copy-paste the code:
import boto3
ssm = boto3.client('ssm')
parameters = ssm.describe_parameters()['Parameters']
Beware of the limit of max 50 parameters!

This code will get all parameters, by recursively fetching until there are no more (50 max is returned per call):
import boto3
def get_resources_from(ssm_details):
results = ssm_details['Parameters']
resources = [result for result in results]
next_token = ssm_details.get('NextToken', None)
return resources, next_token
def main()
config = boto3.client('ssm', region_name='us-east-1')
next_token = ' '
resources = []
while next_token is not None:
ssm_details = config.describe_parameters(MaxResults=50, NextToken=next_token)
current_batch, next_token = get_resources_from(ssm_details)
resources += current_batch
print(resources)
print('done')

You can use get_paginator api. find below example, In my use case i had to get all the values of SSM parameter store and wanted to compare it with a string.
import boto3
import sys
LBURL = sys.argv[1].strip()
client = boto3.client('ssm')
p = client.get_paginator('describe_parameters')
paginator = p.paginate().build_full_result()
for page in paginator['Parameters']:
response = client.get_parameter(Name=page['Name'])
value = response['Parameter']['Value']
if LBURL in value:
print("Name is: " + page['Name'] + " and Value is: " + value)

One of the responses from above/below(?) (by Val Lapidas) inspired me to expand it to this (as his solution doesn't get the SSM parameter value, and some other, additional details).
The downside here is that the AWS function client.get_parameters() only allows 10 names per call.
There's one referenced function call in this code (to_pdatetime(...)) that I have omitted - it just takes the datetime value and makes sure it is a "naive" datetime. This is because I am ultimately dumping this data to an Excel file using pandas, which doesn't deal well with timezones.
from typing import List, Tuple
from boto3 import session
from mypy_boto3_ssm import SSMClient
def ssm_params(aws_session: session.Session = None) -> List[dict]:
"""
Return a detailed list of all the SSM parameters.
"""
# -------------------------------------------------------------
#
#
# -------------------------------------------------------------
def get_parameter_values(ssm_client: SSMClient, ssm_details: dict) -> Tuple[list, str]:
"""
Retrieve additional attributes for the SSM parameters contained in the 'ssm_details'
dictionary passed in.
"""
# Get the details
ssm_param_details = ssm_details['Parameters']
# Just the names, ma'am
param_names = [result['Name'] for result in ssm_param_details]
# Get the parames, including the values
ssm_params_with_values = ssm_client.get_parameters(Names=param_names,
WithDecryption=True)
resources = []
result: dict
for result in ssm_params_with_values['Parameters']:
# Get the matching parameter from the `ssm_details` dict since this has some of the fields
# that aren't in the `ssm_params_with_values` returned from "get_arameters".
param_details = next((zz for zz in ssm_param_details if zz.get('Name', None) == result['Name']), {})
param_policy = param_details.get('Policies', None)
if len(param_policy) == 0:
param_policy = None
resources.append({
'Name': result['Name'],
'LastModifiedDate': to_pdatetime(result['LastModifiedDate']),
'LastModifiedUser': param_details.get('LastModifiedUser', None),
'Version': result['Version'],
'Tier': param_details.get('Tier', None),
'Policies': param_policy,
'ARN': result['ARN'],
'DataType': result.get('DataType', None),
'Type': result.get('Type', None),
'Value': result.get('Value', None)
})
next_token = ssm_details.get('NextToken', None)
return resources, next_token
# -------------------------------------------------------------
#
#
# -------------------------------------------------------------
if aws_session is None:
raise ValueError('No session.')
# Create SSM client
aws_ssm_client = aws_session.client('ssm')
next_token = ' '
ssm_resources = []
while next_token is not None:
# The "describe_parameters" call gets a whole lot of info on the defined SSM params,
# except their actual values. Due to this limitation let's call the nested function
# to get the values, and a few other details.
ssm_descriptions = aws_ssm_client.describe_parameters(MaxResults=10,
NextToken=next_token)
# This will get additional details for the params, including values.
current_batch, next_token = get_parameter_values(ssm_client=aws_ssm_client,
ssm_details=ssm_descriptions)
ssm_resources += current_batch
print(f'SSM Parameters: {len(ssm_resources)}')
return ssm_resources
pythonawsboto3amazon-web-services

There's no ListParameters only DescribeParameter, which lists all the paremeters, or you can set filters.
Boto3 Docs Link:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ssm.html#SSM.Client.describe_parameters
AWS API Documentation Link:
https://docs.aws.amazon.com/systems-manager/latest/APIReference/API_DescribeParameters.html

You can use get_parameters() and get_parameters_by_path().

Use paginators.
paginator = client.get_paginator('describe_parameters')
More information here.

How to mock a redis client in Python?

I just found that a bunch of unit tests are failing, due a developer hasn't mocked out the dependency to a redis client within the test. I'm trying to give a hand in this matter but have difficulties myself.
The method writes to a redis client:
redis_client = get_redis_client()
redis_client.set('temp-facility-data', cPickle.dumps(df))
Later in the assert the result is retrieved:
res = cPickle.loads(get_redis_client().get('temp-facility-data'))
expected = pd.Series([set([1, 2, 3, 4, 5])], index=[1])
assert_series_equal(res.variation_pks, expected)
I managed to patch the redis client's get() and set() successfully.
#mock.patch('redis.StrictRedis.get')
#mock.patch('redis.StrictRedis.set')
def test_identical(self, mock_redis_set, mock_redis_get):
mock_redis_get.return_value = ???
f2 = deepcopy(self.f)
f3 = deepcopy(self.f)
f2.pk = 2
f3.pk = 3
self.one_row(f2, f3)
but I don't know how to set the return_value of get() to what the set() would set in the code, so that the test would pass.
Right now this line fails the test:
res = cPickle.loads(get_redis_client().get('temp-facility-data'))
TypeError: must be string, not MagicMock
Any advice please?

Think you can use side effect to set and get value in a local dict
data = {}
def set(key, val):
data[key] = val
def get(key):
return data[key]
mock_redis_set.side_effect = set
mock_redis_get.side_effect = get
not tested this but I think it should do what you want

If you want something more complete, you can try fakeredis
#patch("redis.Redis", return_value=fakeredis.FakeStrictRedis())
def test_something():
....

I think you can do something like this.
redis_cache = {
"key1": (b'\x80\x04\x95\x08\x00\x00\x00\x00\x00\x00\x00\x8c\x04test\x94.', "test"),
"key2": (None, None),
}
def get(redis_key):
if redis_key in redis_cache:
return redis_cache[redis_key][0]
else:
return None
mock = MagicMock()
mock.get = Mock(side_effect=get)
with patch('redis.StrictRedis', return_value=mock) as p:
for key in redis_cache:
result = self.MyClass.my_function(key)
self.assertEqual(result, redis_cache[key][1])

MessagePack and datetime

I need a fast way to send 300 short messages a second over zeromq between the python multiprocessing processes. Each message needs to contain an ID and time.time()
msgpack seems like the best way to serialize the dict before sending it via zeromq, and conveniently, msgpack has an example of exactly what I need, except it has a datetime.datetime.now().
import datetime
import msgpack
useful_dict = {
"id": 1,
"created": datetime.datetime.now(),
}
def decode_datetime(obj):
if b'__datetime__' in obj:
obj = datetime.datetime.strptime(obj["as_str"], "%Y%m%dT%H:%M:%S.%f")
return obj
def encode_datetime(obj):
if isinstance(obj, datetime.datetime):
return {'__datetime__': True, 'as_str': obj.strftime("%Y%m%dT%H:%M:%S.%f")}
return obj
packed_dict = msgpack.packb(useful_dict, default=encode_datetime)
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime)
The problem is that their example doesn't work, I get this error:
obj = datetime.datetime.strptime(obj["as_str"], "%Y%m%dT%H:%M:%S.%f")
KeyError: 'as_str'
Maybe because I'm on python 3.4, but I don't know what the issue with strptime. Would appreciate your help.

Given that messagepack ( import msgpack ) is good at serializing integers, I created a solution which only uses integers:
_datetime_ExtType = 42
def _unpacker_hook(code, data):
if code == _datetime_ExtType:
values = unpack(data)
if len(values) == 8: # we have timezone
return datetime.datetime(*values[:-1], dateutil.tz.tzoffset(None, values[-1]))
else:
return datetime.datetime(*values)
return msgpack.ExtType(code, data)
# This will only get called for unknown types
def _packer_unknown_handler(obj):
if isinstance(obj, datetime.datetime):
if obj.tzinfo:
components = (obj.year, obj.month, obj.day, obj.hour, obj.minute, obj.second, obj.microsecond, int(obj.utcoffset().total_seconds()))
else:
components = (obj.year, obj.month, obj.day, obj.hour, obj.minute, obj.second, obj.microsecond)
# we effectively double pack the values to "compress" them
data = msgpack.ExtType(_datetime_ExtType, pack(components))
return data
raise TypeError("Unknown type: {}".format(obj))
def pack(obj, **kwargs):
# we don't use a global packer because it wouldn't be re-entrant safe
return msgpack.packb(obj, use_bin_type=True, default=_packer_unknown_handler, **kwargs)
def unpack(payload):
try:
# we temporarily disable gc during unpack to bump up perf: https://pypi.python.org/pypi/msgpack-python
gc.disable()
# This must match the above _packer parameters above. NOTE: use_list is faster
return msgpack.unpackb(payload, use_list=False, encoding='utf-8', ext_hook=_unpacker_hook)
finally:
gc.enable()

Python3 and Python2 manage differently strings encoding : encoding-and-decoding-strings-in-python-3-x
Then it is needed to :
use b'as_str' (instead of 'as_str') as dictionary key
use encode and decode for the stored value
Modifying the code like this works with python2 and python3 :
import datetime
import msgpack
useful_dict = {
"id": 1,
"created": datetime.datetime.now(),
}
def decode_datetime(obj):
if b'__datetime__' in obj:
obj = datetime.datetime.strptime(obj[b'as_str'].decode(), "%Y%m%dT%H:%M:%S.%f")
return obj
def encode_datetime(obj):
if isinstance(obj, datetime.datetime):
obj = {'__datetime__': True, 'as_str': obj.strftime("%Y%m%dT%H:%M:%S.%f").encode()}
return obj
packed_dict = msgpack.packb(useful_dict, default=encode_datetime)
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime)

In the documentation it says that the default encoding of unicode strings of packb (and pack) is utf-8. You could simply search for the unicode string '__datetime__' in the decode_datetime function, instead of the byte object b'__datetime__', and add the encoding='utf-8' parameter to unpack. Like so:
def decode_datetime(obj):
if '__datetime__' in obj:
obj = datetime.datetime.strptime(obj["as_str"], "%Y%m%dT%H:%M:%S.%f")
return obj
packed_dict = msgpack.packb(useful_dict, default=encode_datetime)
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime, encoding='utf-8')

Using DPAPI with Python?

Is there a way to use the DPAPI (Data Protection Application Programming Interface) on Windows XP with Python?
I would prefer to use an existing module if there is one that can do it. Unfortunately I haven't been able to find a way with Google or Stack Overflow.
EDIT: I've taken the example code pointed to by "dF" and tweaked it into a standalone library which can be simply used at a high level to crypt and decrypt using DPAPI in user mode. Simply call dpapi.cryptData(text_to_encrypt) which returns an encrypted string, or the reverse decryptData(encrypted_data_string), which returns the plain text. Here's the library:
# DPAPI access library
# This file uses code originally created by Crusher Joe:
# http://article.gmane.org/gmane.comp.python.ctypes/420
#
from ctypes import *
from ctypes.wintypes import DWORD
LocalFree = windll.kernel32.LocalFree
memcpy = cdll.msvcrt.memcpy
CryptProtectData = windll.crypt32.CryptProtectData
CryptUnprotectData = windll.crypt32.CryptUnprotectData
CRYPTPROTECT_UI_FORBIDDEN = 0x01
extraEntropy = "cl;ad13 \0al;323kjd #(adl;k$#ajsd"
class DATA_BLOB(Structure):
_fields_ = [("cbData", DWORD), ("pbData", POINTER(c_char))]
def getData(blobOut):
cbData = int(blobOut.cbData)
pbData = blobOut.pbData
buffer = c_buffer(cbData)
memcpy(buffer, pbData, cbData)
LocalFree(pbData);
return buffer.raw
def Win32CryptProtectData(plainText, entropy):
bufferIn = c_buffer(plainText, len(plainText))
blobIn = DATA_BLOB(len(plainText), bufferIn)
bufferEntropy = c_buffer(entropy, len(entropy))
blobEntropy = DATA_BLOB(len(entropy), bufferEntropy)
blobOut = DATA_BLOB()
if CryptProtectData(byref(blobIn), u"python_data", byref(blobEntropy),
None, None, CRYPTPROTECT_UI_FORBIDDEN, byref(blobOut)):
return getData(blobOut)
else:
return ""
def Win32CryptUnprotectData(cipherText, entropy):
bufferIn = c_buffer(cipherText, len(cipherText))
blobIn = DATA_BLOB(len(cipherText), bufferIn)
bufferEntropy = c_buffer(entropy, len(entropy))
blobEntropy = DATA_BLOB(len(entropy), bufferEntropy)
blobOut = DATA_BLOB()
if CryptUnprotectData(byref(blobIn), None, byref(blobEntropy), None, None,
CRYPTPROTECT_UI_FORBIDDEN, byref(blobOut)):
return getData(blobOut)
else:
return ""
def cryptData(text):
return Win32CryptProtectData(text, extraEntropy)
def decryptData(cipher_text):
return Win32CryptUnprotectData(cipher_text, extraEntropy)

I have been using CryptProtectData and CryptUnprotectData through ctypes, with the code from
http://article.gmane.org/gmane.comp.python.ctypes/420
and it has been working well.

Also, pywin32 implements CryptProtectData and CryptUnprotectData in the win32crypt module.

The easiest way would be to use Iron Python.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Confluent Kafka python schema parser causes conflict with fastavro - python

Related

In the controllers.py file of odoo, how to get the json string in the post request body?

How can we list all the parameters in the aws parameter store using Boto3? There is no ssm.list_parameters in boto3 documentation?

How to mock a redis client in Python?

MessagePack and datetime

Using DPAPI with Python?

Categories

Resources