I am trying to read data from the Kafka internal topic __consumer_offsets using the kafka-python client. I create a consumer and successfully fetch data, but the data is serialized and looks like it is in the wire format. I want to deserialize it into some readable format. I figured out that there are key_deserializer and value_deserializer options available in the consumer API, but I couldn't figure out what values to give these fields. Could anyone please help me with this?
My consumer code looks like this:
consumer = KafkaConsumer(bootstrap_servers=Settings.instance().kafka_server,
                         consumer_timeout_ms=2000,
                         enable_auto_commit=False,          # booleans, not the string "False"
                         exclude_internal_topics=False,
                         value_deserializer=bytes.decode,   # not working
                         group_id=self._group_id
                         )
and the consumed message looks like:
ConsumerRecord(topic='__consumer_offsets', partition=26, offset=12983, timestamp=1520765864606, timestamp_type=0, key=b'\x00\x01\x00\x16console-consumer-56707\x00\x06events\x00\x00\x00\x00', value=b'\x00\x01\x00\x00\x00\x00\x00\x00\x00\xb6\x00\x00\x00\x00\x01b\x14\xb5\x8a\x9d\x00\x00\x01b\x19\xdb\xe6\x9d', checksum=-1872169212, serialized_key_size=38, serialized_value_size=28)
Well, you need to implement a custom Serializer (at the producer end) and Deserializer (at the consumer end). Make sure the same custom value class is on the classpath at the consumer end as at the producer end.
import java.io.IOException;
import java.util.Map;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Deserializer;

public class CustomDeserializer implements Deserializer<ValueClass> {

    public CustomDeserializer() {
    }

    @Override
    public void configure(Map<String, ?> map, boolean b) {
    }

    @Override
    public ValueClass deserialize(String topic, byte[] messageBytes) {
        ValueClass eventMessage = null;
        ObjectMapper objectMapper = new ObjectMapper();
        try {
            eventMessage = objectMapper.readValue(messageBytes, ValueClass.class);
        } catch (IOException ex) {
            // handle/log the deserialization failure
        }
        return eventMessage;
    }

    @Override
    public void close() {
    }
}
Set this custom class as the value.deserializer in the consumer properties.
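For completeness, since the question is about kafka-python: key_deserializer and value_deserializer just take any callable that receives the raw bytes and returns whatever you want. For an internal topic like __consumer_offsets you do not control the producer, so that callable has to decode Kafka's internal binary layout itself. Below is a minimal, untested sketch, assuming the v0/v1 offset-commit key layout (int16 version, string group, string topic, int32 partition) and the v1 value layout (int16 version, int64 offset, string metadata, int64 commit timestamp, int64 expire timestamp), which matches the 38-byte key and 28-byte value in the record above. That layout is a broker implementation detail and can change between Kafka versions, so treat this as illustrative only; the bootstrap server is a placeholder.

import struct

from kafka import KafkaConsumer


def decode_offset_key(raw):
    # Offset-commit keys (schema versions 0 and 1): group, topic, partition.
    # Version 2+ keys hold group metadata and use a different layout.
    (version,) = struct.unpack_from('>h', raw, 0)
    if version > 1:
        return {'version': version, 'raw': raw}
    (group_len,) = struct.unpack_from('>h', raw, 2)
    pos = 4
    group = raw[pos:pos + group_len].decode('utf-8')
    pos += group_len
    (topic_len,) = struct.unpack_from('>h', raw, pos)
    pos += 2
    topic = raw[pos:pos + topic_len].decode('utf-8')
    pos += topic_len
    (partition,) = struct.unpack_from('>i', raw, pos)
    return {'version': version, 'group': group, 'topic': topic, 'partition': partition}


def decode_offset_value(raw):
    # Assumes the v1 offset-commit value schema; other versions differ slightly.
    if raw is None:
        return None
    (version, offset) = struct.unpack_from('>hq', raw, 0)
    (metadata_len,) = struct.unpack_from('>h', raw, 10)
    pos = 12
    metadata = raw[pos:pos + metadata_len].decode('utf-8')
    pos += metadata_len
    (commit_timestamp,) = struct.unpack_from('>q', raw, pos)
    return {'version': version, 'offset': offset,
            'metadata': metadata, 'commit_timestamp': commit_timestamp}


consumer = KafkaConsumer(
    '__consumer_offsets',
    bootstrap_servers='localhost:9092',    # placeholder, adjust for your cluster
    enable_auto_commit=False,
    exclude_internal_topics=False,         # a real boolean, not the string "False"
    key_deserializer=decode_offset_key,
    value_deserializer=decode_offset_value,
)

for record in consumer:
    print(record.key, record.value)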
So I have some Swift code that sends a request to my local host:
//
// ContentView.swift
// Shared
//
// Created by Ulto4 on 10/23/21.
//
import SwiftUI

struct ContentView: View {
    var body: some View {
        VStack {
            Text("Hello, world!")
                .padding()
            Button(action: {
                self.fu()
            }, label: {
                Image(systemName: "pencil").resizable().aspectRatio(contentMode: .fit)
            })
        }
    }

    func fu() {
        let url = URL(string: "http://127.0.0.1:5000/232")
        guard let requestUrl = url else { fatalError() }
        var request = URLRequest(url: requestUrl)
        request.httpMethod = "GET"
        let task = URLSession.shared.dataTask(with: request) { (data, response, error) in
            if let error = error {
                print("Error took place \(error)")
                return
            }
            if let response = response as? HTTPURLResponse {
                print("Response HTTP Status code: \(response.statusCode)")
            }
        }
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}
However, my Flask app isn't receiving any GET requests and the function isn't running. There also isn't anything printing to the console.
I am fairly new to Swift so I don't really know how to fix this.
Is there any other way to send requests in Swift? If not, how would I fix this?
You are creating the URLSessionDataTask, but you never start it. Call task.resume(), e.g.
func performRequest() {
    guard let url = URL(string: "http://127.0.0.1:5000/232") else {
        fatalError()
    }

    let task = URLSession.shared.dataTask(with: url) { data, response, error in
        if let error = error {
            print("Error took place \(error)")
            return
        }
        if let response = response as? HTTPURLResponse {
            print("Response HTTP Status code: \(response.statusCode)")
        }
    }
    task.resume() // you must call this to start the task
}
That having been said, a few caveats:
You are doing http rather than https. Make sure to temporarily enable insecure network requests with an App Transport Security exception in your app's Info.plist.
You didn't say whether this is for macOS or iOS.
If running on a physical iOS device, it will not find your macOS web server at 127.0.0.1 (i.e., it will not find a web server running on the iPhone itself). You will want to specify the IP address of your web server on your LAN.
If macOS, make sure to enable outgoing network connections in the target's capabilities.
You asked:
Is there any other way to send requests in Swift?
It is probably beyond the scope of your question, but longer term, when using SwiftUI, you might consider using Combine, e.g., dataTaskPublisher. When running a simple “what was the status code” routine, the difference is immaterial, but when you get into more complicated scenarios where you have to parse and process the responses, Combine is more consistent with SwiftUI’s declarative patterns.
Let us consider a more complicated example where you need to parse JSON responses. For illustrative purposes, below I am testing with httpbin.org, which echoes whatever parameters you send. It illustrates the use of dataTaskPublisher and how it can be combined with functional chaining patterns to avoid a mess of hairy imperative code:
import SwiftUI
import Combine

struct SampleObject: Decodable {
    let value: String
}

struct HttpBinResponse<T: Decodable>: Decodable {
    let args: T
}

class RequestService: ObservableObject {
    var request: AnyCancellable?
    let decoder = JSONDecoder()

    @Published var status: String = "Not started yet"

    func startRequest() {
        request = createRequest().sink { completion in
            print("completed")
        } receiveValue: { [weak self] object in
            self?.status = "Received " + object.value
        }
    }

    func createRequest() -> AnyPublisher<SampleObject, Error> {
        var components = URLComponents(string: "https://httpbin.org/get")
        components?.queryItems = [URLQueryItem(name: "value", value: "foo")]
        guard let url = components?.url else {
            fatalError("Unable to build URL")
        }

        return URLSession.shared.dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: HttpBinResponse<SampleObject>.self, decoder: decoder)
            .map(\.args)
            .receive(on: DispatchQueue.main)
            .eraseToAnyPublisher()
    }
}

struct ContentView: View {
    @ObservedObject var requestService = RequestService()

    var body: some View {
        VStack {
            Text("Hello, world!")
                .padding()
            Button {
                requestService.startRequest()
            } label: {
                Image(systemName: "pencil").resizable().aspectRatio(contentMode: .fit)
            }
            Text(requestService.status)
        }
    }
}
But, like I said, it is beyond the scope of this question. You might want to make sure you get comfortable with SwiftUI and basic URLSession programming patterns (e.g., making sure you resume any tasks you create). Once you have that mastered, you can come back to Combine to write elegant networking code.
FWIW, like workingdog said, you could also use the new async-await rendition of data(for:delegate:). But when in the declarative world of SwiftUI, I would suggest Combine.
I have a Node.js application which is currently a web-based API. For one of my API functions, I make a call to a short Python script that I've written to achieve some extra functionality.
After reading up on communicating between Node and Python using the child_process module, I gave it a try and achieved my desired results. I call my Node function with an email address, send it to Python through stdin; my Python script performs the necessary external API call using the provided e-mail, writes the output of that call to stdout, and sends it back to my Node function.
Everything works properly until I fire off several requests consecutively. Despite Python correctly logging the changed e-mail address and also making the request to the external API with the updated e-mail address, after the first request I make to my API (returning the correct data), I keep receiving the same old data again and again.
My initial guess was that Python's input stream wasn't being flushed, but after testing the Python script I saw that I was correctly updating the e-mail address being received from Node and receiving the proper query results.
I think there are some underlying workings of the child_process module that I may not be understanding... since I'm fairly certain that the corresponding data is being correctly passed back and forth.
Below is the Node function:
exports.callPythonScript = (email) => {
    let getPythonData = new Promise(function (success, fail) {
        const spawn = require('child_process').spawn;
        const pythonProcess = spawn('python', ['./util/emailage_query.py']);

        pythonProcess.stdout.on('data', (data) => {
            let dataString = singleToDoubleQuote(data.toString());
            let emailageResponse = JSON.parse(dataString);
            success(emailageResponse);
        })

        pythonProcess.stdout.on('end', function () {
            console.log("python script done");
        })

        pythonProcess.stderr.on('data', (data) => {
            fail(data);
        })

        pythonProcess.stdin.write(email);
        pythonProcess.stdin.end();
    })
    return getPythonData;
}
And here is the Python script:
import sys
from emailage.client import EmailageClient


def read_in():
    lines = sys.stdin.readlines()
    return lines[0]


def main():
    client = EmailageClient('key', 'auth')
    email = read_in()
    json_response = client.query(email, user_email='authemail@mail.com')
    print(json_response)
    sys.stdout.flush()


if __name__ == '__main__':
    main()
Again, upon making a single call to callPythonScript everything is returned perfectly. It is only upon making multiple calls that I'm stuck returning the same output over and over.
I'm hitting a wall here and any and all help would be appreciated. Thanks all!
I've used a mutex lock for this kind of problem. I can't seem to find the question the code originally comes from, though; I found it on SO when I had the same kind of issue:
class Lock {
    constructor() {
        this._locked = false;
        this._waiting = [];
    }

    lock() {
        const unlock = () => {
            let nextResolve;
            if (this._waiting.length > 0) {
                // take the oldest waiter (FIFO); Array.prototype.pop(0) would
                // ignore the argument and take the newest waiter instead
                nextResolve = this._waiting.shift();
                nextResolve(unlock);
            } else {
                this._locked = false;
            }
        };

        if (this._locked) {
            return new Promise((resolve) => {
                this._waiting.push(resolve);
            });
        } else {
            this._locked = true;
            return new Promise((resolve) => {
                resolve(unlock);
            });
        }
    }
}

module.exports = Lock;
I would then implement it like this, with your code:
class Email {
    constructor(Lock) {
        this._lock = new Lock();
    }

    async callPythonScript(email) {
        const unlock = await this._lock.lock();
        try {
            const getPythonData = new Promise(function (success, fail) {
                const spawn = require('child_process').spawn;
                const pythonProcess = spawn('python', ['./util/emailage_query.py']);

                pythonProcess.stdout.on('data', (data) => {
                    let dataString = singleToDoubleQuote(data.toString());
                    let emailageResponse = JSON.parse(dataString);
                    success(emailageResponse);
                })

                pythonProcess.stdout.on('end', function () {
                    console.log("python script done");
                })

                pythonProcess.stderr.on('data', (data) => {
                    fail(data);
                })

                pythonProcess.stdin.write(email);
                pythonProcess.stdin.end();
            })

            // wait for the Python result before releasing the lock, otherwise
            // the next caller can spawn its process while this one is still running
            return await getPythonData;
        } finally {
            unlock();
        }
    }
}
I haven't tested this code, and my own implementation deals with arrays, with each array value calling Python... but this should at least give you a good start.
I am using pykafka to consume messages, currently with a balanced_consumer reading from one topic. Now I need to consume messages from another topic as well, and, if possible, to prioritize consumption between the different topics. How can I handle this? Maybe there is another library for Python?
I just posted a post about this issue.
Even though I am using Java, you may find the concept described there useful for your case.
What we did to tackle the issue of prioritizing Kafka topics is this:
We developed a mechanism to prioritize the consumption of Kafka topics. Such a mechanism checks whether we want to process a message that was consumed from Kafka, or hold its processing for later.
We mapped the partitions to Booleans that block the consumption of each partition when necessary (topicPartitionLocks). Blocking the partitions that are ahead, while continuing to consume from the ones that lag behind, creates the prioritization between topics. A TimerTask updates this map, and our consumers check whether they are "allowed" to consume or have to wait, as you can see in the method waitForLatePartitionIfNeeded.
public class Prioritizer extends TimerTask {

    private Map<String, Boolean> topicPartitionLocks = new ConcurrentHashMap<>();
    private Map<String, Long> topicPartitionLatestTimestamps = new ConcurrentHashMap<>();

    @Override
    public void run() {
        updateTopicPartitionLocks();
    }

    private void updateTopicPartitionLocks() {
        Optional<Long> minValue = topicPartitionLatestTimestamps.values().stream().min((o1, o2) -> (int) (o1 - o2));
        if (!minValue.isPresent()) {
            return;
        }
        Iterator it = topicPartitionLatestTimestamps.entrySet().iterator();
        while (it.hasNext()) {
            Boolean shouldLock = false;
            Map.Entry<String, Long> pair = (Map.Entry) it.next();
            String topicPartition = pair.getKey();
            if (pair.getValue() > (minValue.get() + maxGap)) {
                shouldLock = true;
                if (isSameTopicAsMinPartition(minValue.get(), topicPartition)) {
                    shouldLock = false;
                }
            }
            topicPartitionLocks.put(topicPartition, shouldLock);
        }
    }

    public boolean isLocked(String topicPartition) {
        return topicPartitionLocks.get(topicPartition).booleanValue();
    }
}
The waitForLatePartitionIfNeeded method:
private void waitForLatePartitionIfNeeded(final String topic, int partition) {
    String topicPartition = topic + partition;
    prioritizer.getTopicPartitionLocks().putIfAbsent(topicPartition, false);
    while (prioritizer.isLocked(topicPartition)) {
        monitorWaitForLatePartitionTimes(topicPartition, startTime);
        Misc.sleep(timeToWaitBetweenGapToTardyPartitionChecks.get());
    }
}
Using this caused an increase in rebalances, so we solved it by changing the following Kafka configuration:
request.timeout.ms: 7300000 (~2hrs)
max.poll.interval.ms: 7200000 (2hrs)
For graphs and general descriptions about the issue you can check my post:
How I Resolved Delays in Kafka Messages by Prioritizing Kafka Topics
Good Luck!
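Since the question is about Python, here is a rough, untested sketch of the same prioritization idea, written with kafka-python rather than pykafka (the pykafka API differs, and the topic names, server and timeouts below are placeholders): favour the high-priority topic and only take a small batch from the low-priority one while the preferred topic is idle.

from kafka import KafkaConsumer

# Hypothetical topic names; one consumer per priority level.
high = KafkaConsumer('orders-high',
                     bootstrap_servers='localhost:9092',
                     group_id='prio-demo-high')
low = KafkaConsumer('orders-low',
                    bootstrap_servers='localhost:9092',
                    group_id='prio-demo-low')


def handle(record):
    print(record.topic, record.partition, record.offset, record.value)


while True:
    # Drain the high-priority topic first.
    batches = high.poll(timeout_ms=100)
    if batches:
        for records in batches.values():
            for record in records:
                handle(record)
        continue  # keep favouring the high-priority topic while it has data
    # Only when the high-priority topic is idle, take a small batch from the low one.
    for records in low.poll(timeout_ms=100, max_records=10).values():
        for record in records:
            handle(record)

The answer above implements the same idea at the partition level with locks; this loop is just the simplest topic-level version of it.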
I have the following code
public static class Post {
    public String author;
    public String title;

    public Post() {
        // Default constructor required for calls to DataSnapshot.getValue(Post.class)
    }

    public Post(String author, String title) {
        this.author = author;
        this.title = title;
    }
}

// Get a reference to our posts
final FirebaseDatabase database = FirebaseDatabase.getInstance();
DatabaseReference ref = database.getReference("server/saving-data/fireblog/posts");

// Attach a listener to read the data at our posts reference
ref.addValueEventListener(new ValueEventListener() {
    @Override
    public void onDataChange(DataSnapshot dataSnapshot) {
        Post post = dataSnapshot.getValue(Post.class);
        System.out.println(post);
    }

    @Override
    public void onCancelled(DatabaseError databaseError) {
        System.out.println("The read failed: " + databaseError.getCode());
    }
});
I would like to write this in Python, but it seems that the Firebase library for Python doesn't support it. Any ideas?
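For what it's worth, a rough Python sketch of the same listener. This assumes the official firebase_admin Admin SDK and its Reference.listen() callback API; if your installed version doesn't expose listen(), this won't work as written, and the credentials path and database URL below are placeholders.

import firebase_admin
from firebase_admin import credentials, db

# Placeholder service-account path and database URL; substitute your project's values.
cred = credentials.Certificate('path/to/serviceAccountKey.json')
firebase_admin.initialize_app(cred, {
    'databaseURL': 'https://your-project-id.firebaseio.com'
})

ref = db.reference('server/saving-data/fireblog/posts')


def on_posts_change(event):
    # event.event_type, event.path and event.data describe what changed
    # under the listened path, roughly like onDataChange in the Java SDK.
    print(event.event_type, event.path, event.data)


# Starts a background listener and fires the callback on every change.
listener = ref.listen(on_posts_change)

# ... later, to stop listening (assuming the registration object supports it):
# listener.close()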
I have 2 services that are defined in the same Thrift file and share a port. I can use any method from ServiceA with no problem, but whenever I try to call any of ServiceB's methods I get the exception.
This is my Thrift file (service-a.thrift):
service ServiceA extends common.CommonService {
    list<i64> getByIds(1: list<i64> ids)
    ...
}

service ServiceB extends common.CommonService {
    list<i64> getByIds(1: list<i64> ids)
    ...
}
Notes:
I'm working with a Python client.
Thrift version 0.8.0.
Any ideas?
We had this need as well and solved it by writing a new implementation of TProcessor that creates a map of multiple processors. The only gotcha is that with this implementation you need to ensure no method names overlap, i.e., don't use nice generic names like Run() in different services. Apologies for not converting the C# to Python...
Example Class:
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using Thrift;
using Thrift.Protocol;

/// <summary>
/// Processor that allows for multiple services to run under one roof. Requires no method name conflicts across services.
/// </summary>
public class MultiplexProcessor : TProcessor {

    public MultiplexProcessor(IEnumerable<TProcessor> processors) {
        ProcessorMap = new Dictionary<string, Tuple<TProcessor, Delegate>>();
        foreach (var processor in processors) {
            var processMap = (IDictionary) processor.GetType().GetField("processMap_", BindingFlags.NonPublic | BindingFlags.Instance).GetValue(processor);
            foreach (string pmk in processMap.Keys) {
                var imp = (Delegate) processMap[pmk];
                try {
                    ProcessorMap.Add(pmk, new Tuple<TProcessor, Delegate>(processor, imp));
                }
                catch (ArgumentException) {
                    throw new ArgumentException(string.Format("Method already exists in process map: {0}", pmk));
                }
            }
        }
    }

    protected readonly Dictionary<string, Tuple<TProcessor, Delegate>> ProcessorMap;

    internal protected Dictionary<string, Tuple<TProcessor, Delegate>> GetProcessorMap() {
        return new Dictionary<string, Tuple<TProcessor, Delegate>>(ProcessorMap);
    }

    public bool Process(TProtocol iprot, TProtocol oprot) {
        try {
            TMessage msg = iprot.ReadMessageBegin();
            Tuple<TProcessor, Delegate> fn;
            ProcessorMap.TryGetValue(msg.Name, out fn);
            if (fn == null) {
                TProtocolUtil.Skip(iprot, TType.Struct);
                iprot.ReadMessageEnd();
                var x = new TApplicationException(TApplicationException.ExceptionType.UnknownMethod, "Invalid method name: '" + msg.Name + "'");
                oprot.WriteMessageBegin(new TMessage(msg.Name, TMessageType.Exception, msg.SeqID));
                x.Write(oprot);
                oprot.WriteMessageEnd();
                oprot.Transport.Flush();
                return true;
            }
            Console.WriteLine("Invoking service method {0}.{1}", fn.Item1, fn.Item2);
            fn.Item2.Method.Invoke(fn.Item1, new object[] { msg.SeqID, iprot, oprot });
        }
        catch (IOException) {
            return false;
        }
        return true;
    }
}
Example Usage:
Processor = new MultiplexProcessor(
    new List<TProcessor> {
        new ReportingService.Processor(new ReportingServer()),
        new MetadataService.Processor(new MetadataServer()),
        new OtherService.Processor(new OtherService())
    }
);
Server = new TThreadPoolServer(Processor, Transport);
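Since the question mentions a Python client, here is a rough, untested Python analogue of the same idea. It leans on the _processMap attribute that Thrift's generated Python Processor classes happen to populate, which is an implementation detail rather than a supported API, so treat it as a sketch:

from thrift.Thrift import TApplicationException, TMessageType, TType


class MultiplexProcessor(object):
    """Dispatches incoming calls to one of several generated processors,
    keyed by method name (so method names must not collide across services)."""

    def __init__(self, processors):
        self.processor_map = {}
        for processor in processors:
            # _processMap is populated by the Thrift-generated Python code.
            for name, fn in processor._processMap.items():
                if name in self.processor_map:
                    raise ValueError("Method already exists in process map: %s" % name)
                self.processor_map[name] = (processor, fn)

    def process(self, iprot, oprot):
        name, msg_type, seqid = iprot.readMessageBegin()
        if name not in self.processor_map:
            iprot.skip(TType.STRUCT)
            iprot.readMessageEnd()
            x = TApplicationException(TApplicationException.UNKNOWN_METHOD,
                                      "Invalid method name: '%s'" % name)
            oprot.writeMessageBegin(name, TMessageType.EXCEPTION, seqid)
            x.write(oprot)
            oprot.writeMessageEnd()
            oprot.trans.flush()
            return True
        processor, fn = self.processor_map[name]
        # The generated process_* functions take (self, seqid, iprot, oprot).
        fn(processor, seqid, iprot, oprot)
        return True


# Example usage (the handler classes are placeholders for your implementations):
# processor = MultiplexProcessor([
#     ServiceA.Processor(ServiceAHandler()),
#     ServiceB.Processor(ServiceBHandler()),
# ])
# server = TServer.TThreadPoolServer(processor, transport, tfactory, pfactory)

As with the C# version, method names must be unique across all services. If upgrading from 0.8.0 is an option, newer Thrift releases also ship an official TMultiplexedProcessor/TMultiplexedProtocol pair that solves this without relying on generated-code internals.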
As far as I know, there's no straightforward way to bind several services to a single port without adding this field to TMessage and recompiling Thrift. If you want to have two services using the same server, you would have to reimplement the Thrift server, which doesn't seem like an easy task.