I am trying to compare MD5 hashes computed in PHP and in Python. Our server works fine with PHP clients, but when we do the same thing in Python, we always get an invalid response from the server.
I have the following piece of code in Python:
import hashlib
keyString = '96f6e3a1c4748b81e41ac58dcf6ecfa0'
decodeString = ''
length = len(keyString)
for i in range(0, length, 2):
    subString1 = keyString[i:(i + 2)]
    decodeString += chr(int(subString1, 16))
print(hashlib.md5(decodeString.encode("utf-8")).hexdigest())
Produces: 5a9536a1490714cb77a02080f902be4c
Now, the same concept in PHP:
$serverRandom = "96f6e3a1c4748b81e41ac58dcf6ecfa0";
$length = strlen($serverRandom);
$server_rand_code = '';
for($i = 0; $i < $length; $i += 2)
{
    $server_rand_code .= chr(hexdec(substr($serverRandom, $i, 2)));
}
echo 'SERVER CODE: '.md5($server_rand_code).'<br/>';
Produces: b761f889707191e6b96954c0da4800ee
I tried checking the encoding, but no luck: the two MD5 outputs don't match at all. Any help?
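It looks like your method of generating the byte string is incorrect, so the input to hashlib.md5 is wrong. chr() builds a str of Unicode code points, and encoding that str as UTF-8 turns every code point from 0x80 to 0xFF into two bytes: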
print(decodeString.encode('utf-8'))
# b'\xc2\x96\xc3\xb6\xc3\xa3\xc2\xa1\xc3\x84t\xc2\x8b\xc2\x81\xc3\xa4\x1a\xc3\x85\xc2\x8d\xc3\x8fn\xc3\x8f\xc2\xa0'
The easiest way to interpret the string as a hex string of bytes is to use binascii.unhexlify, or bytes.fromhex:
import binascii
decodeString = binascii.unhexlify(keyString)
decodeString2 = bytes.fromhex(keyString)
print(decodeString)
# b'\x96\xf6\xe3\xa1\xc4t\x8b\x81\xe4\x1a\xc5\x8d\xcfn\xcf\xa0'
print(decodeString == decodeString2)
# True
You can now directly use the resulting bytes object in hashlib.md5:
import hashlib
result = hashlib.md5(decodeString)
print(result.hexdigest())
# b761f889707191e6b96954c0da4800ee
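To see where the two approaches diverge, here is a minimal sketch that keeps the question's chr() loop and only changes the final encoding. It assumes Python 3; the variable names are illustrative and not part of the original code. Encoding with latin-1 maps each code point below 256 to exactly one byte, which is what PHP's chr() produces.

```python
import hashlib

key_string = '96f6e3a1c4748b81e41ac58dcf6ecfa0'

# Same idea as the loop in the question: one code point per pair of hex digits.
decoded = ''.join(chr(int(key_string[i:i + 2], 16))
                  for i in range(0, len(key_string), 2))

utf8_bytes = decoded.encode('utf-8')      # code points >= 0x80 become two bytes
latin1_bytes = decoded.encode('latin-1')  # every code point becomes one byte

print(len(utf8_bytes), len(latin1_bytes))
print(latin1_bytes == bytes.fromhex(key_string))
print(hashlib.md5(latin1_bytes).hexdigest())
```

The latin-1 bytes are identical to what bytes.fromhex returns, so the digest should match the PHP value quoted in the question. bytes.fromhex remains the shorter route.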
I'm reading content from a JSON file and appending it to a text file. When I run this code I get the following error, and I haven't been able to fix it:
'write() argument must be str, not generator'
with open('stackExchangeAPI.json','r') as json_file:
    tags_list = []
    data = json.load(json_file)
    for i in range(0,99):
        for j in range(0,99):
            tags = data[i]["items"][j]["tags"]
            with open('tags.txt','a+') as tags_file:
                tags_file.seek(0)
                d = tags_file.read(100)
                if len(d) > 0 :
                    tags_file.write("\n")
                tags_file.write(f'{tags[i]}' for i in range(0,(len(tags)-1)))
The error comes from the last line, tags_file.write(f'......).
Can someone please help me fix this?
You're passing a generator expression to write(), which only accepts a single string. Try replacing the last line with a loop that writes one tag per call:
for k in range(0, (len(tags) - 1)):
    tags_file.write(f'{tags[k]}')
As the error says, you are passing a generator to write(). Convert it to a string first, for example with join:
out = ''.join(f'{tags[i]}' for i in range(0,(len(tags)-1)))
tags_file.write(out)
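The join approach can be wrapped in a small helper. This is only a sketch under assumptions: append_tags, its sep parameter, and the demo file are hypothetical names, and it joins every tag with a separator, whereas the question's range stops one short of the last tag.

```python
import os
import tempfile

def append_tags(path, tags, sep=" "):
    """Append `tags` as one line to the file at `path` (hypothetical helper)."""
    # write() needs a single str, so build it with join first.
    line = sep.join(str(tag) for tag in tags)
    # Only start a new line if the file already has content.
    needs_newline = os.path.exists(path) and os.path.getsize(path) > 0
    with open(path, "a") as tags_file:
        if needs_newline:
            tags_file.write("\n")
        tags_file.write(line)

# Demo on a throwaway file so no real data is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "tags.txt")
append_tags(demo_path, ["python", "json"])
append_tags(demo_path, ["file-io"])
with open(demo_path) as f:
    content = f.read()
print(content)
```

Checking the file size before opening avoids the seek-and-read step inside the 'a+' handle.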
At the moment I have a byte stream of a string that is received by my Python code and must be converted into a string. For now I managed to extract each character, convert them and append them to a string individually. The code looks something like this:
import struct
# The byte stream is received and stored in byte_stream
text = ''
i = 0
while i < len(byte_stream):
    text = text + struct.unpack('c', byte_stream[i])[0]
    i += 1
print(text)
But that surely cannot be the most efficient way. Is there a more elegant way to achieve the same result?
From "Convert bytes to a Python string":
byte_stream = [112, 52, 52]
''.join(map(chr, byte_stream))
>> p44
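If the data arrives as a bytes object (the usual case on Python 3), a single decode() call does the conversion. A minimal sketch, assuming the stream is ASCII text; the sample bytes below are a stand-in for the received data:

```python
byte_stream = bytes([112, 52, 52])  # stand-in for the received data

# decode() converts the whole buffer in one call; pick the codec the
# sender actually used ('ascii' is an assumption here).
text = byte_stream.decode('ascii')
print(text)  # p44
```

With a non-ASCII payload the codec matters, so 'utf-8' or 'latin-1' may be the right choice instead.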
I have the following JSON array:
[u'steve@gmail.com']
"u" is apparently the unicode character, and it was automatically created by Python. Now, I want to bring this back into Objective-C and decode it into an array using this:
+(NSMutableArray*)arrayFromJSON:(NSString*)json
{
    if(!json) return nil;
    NSData *jsonData = [json dataUsingEncoding:NSUTF8StringEncoding];
    //I've also tried NSUnicodeStringEncoding here, same thing
    NSError *e;
    NSMutableArray *result= [NSJSONSerialization JSONObjectWithData:jsonData options:NSJSONReadingMutableContainers error:&e];
    if (e != nil) {
        NSLog(@"Error:%@", e.description);
        return nil;
    }
    return result;
}
However, I get an error: (Cocoa error 3840.)" (Invalid value around character 1.)
How do I remedy this?
Edit: Here's how I bring the entity from Python back into objective-c:
First I convert the entity to a dictionary:
def to_dict(self):
    return dict((p, unicode(getattr(self, p))) for p in self.properties()
                if getattr(self, p) is not None)
I add this dictionary to a list, set the value of my responseDict['entityList'] to this list, then self.response.out.write(json.dumps(responseDict))
However the result I get back still has that 'u' character.
[u'steve@gmail.com'] is the repr of the decoded Python list; it is not valid JSON.
The valid JSON string would be just ["steve@gmail.com"].
Dump the data from Python back into a JSON string by doing:
import json
python_data = [u'steve@gmail.com']
json_string = json.dumps(python_data)
The u prefix on Python string literals marks them as unicode strings rather than byte strings (str), which are the default string type in Python 2.x.
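The difference between the Python repr and the JSON text is easy to check. A minimal sketch, assuming Python 3, where every str is already Unicode and the u prefix is optional:

```python
import json

python_data = ['steve@gmail.com']

json_string = json.dumps(python_data)
print(json_string)        # ["steve@gmail.com"]   <- valid JSON, send this
print(repr(python_data))  # ['steve@gmail.com']   <- Python repr, not JSON

# Parsing the JSON text gives back an equal list.
round_tripped = json.loads(json_string)
```

Only the output of json.dumps should reach the Objective-C side; a repr or str() of the list is what triggers the "Invalid value" error.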
I have thousands of text files containing multiple JSON objects, but unfortunately there is no delimiter between the objects. Objects are stored as dictionaries and some of their fields are themselves objects. Each object might have a variable number of nested objects. Concretely, an object might look like this:
{field1: {}, field2: "some value", field3: {}, ...}
and hundreds of such objects are concatenated without a delimiter in a text file. This means that I can use neither json.load() nor json.loads().
Any suggestions on how I can solve this problem? Is there a known parser that does this?
This decodes your "list" of JSON Objects from a string:
from json import JSONDecoder
def loads_invalid_obj_list(s):
    decoder = JSONDecoder()
    s_len = len(s)
    objs = []
    end = 0
    while end != s_len:
        obj, end = decoder.raw_decode(s, idx=end)
        objs.append(obj)
    return objs
The bonus here is that you work with the parser rather than around it, so it still tells you exactly where it found an error.
Examples
>>> loads_invalid_obj_list('{}{}')
[{}, {}]
>>> loads_invalid_obj_list('{}{\n}{')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "decode.py", line 9, in loads_invalid_obj_list
    obj, end = decoder.raw_decode(s, idx=end)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 376, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting object: line 2 column 2 (char 5)
Clean Solution (added later)
import json
import re
#shameless copy paste from json/decoder.py
FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL
WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)
class ConcatJSONDecoder(json.JSONDecoder):
    def decode(self, s, _w=WHITESPACE.match):
        s_len = len(s)
        objs = []
        end = 0
        while end != s_len:
            obj, end = self.raw_decode(s, idx=_w(s, end).end())
            end = _w(s, end).end()
            objs.append(obj)
        return objs
Examples
>>> print json.loads('{}', cls=ConcatJSONDecoder)
[{}]
>>> print json.load(open('file'), cls=ConcatJSONDecoder)
[{}]
>>> print json.loads('{}{} {', cls=ConcatJSONDecoder)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "decode.py", line 15, in decode
    obj, end = self.raw_decode(s, idx=_w(s, end).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 376, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting object: line 1 column 5 (char 5)
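On Python 3 the same raw_decode idea can be written as a generator. This is a minimal sketch, assuming the whole input fits in memory as one str; the name iter_concatenated_json is made up for illustration. raw_decode does not skip leading whitespace, so the loop does that by hand.

```python
import json

def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # Skip whitespace between documents; raw_decode would reject it.
        while pos < end and text[pos] in " \t\r\n":
            pos += 1
        if pos == end:
            break
        # raw_decode returns the parsed value and the index where it stopped.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

sample = '{"a": 1}{"b": {"c": "}{"}}\n {"d": []}'
objects = list(iter_concatenated_json(sample))
print(objects)
```

Because the real parser finds each object's end, a }{ inside a string value or a nested object causes no trouble, and malformed input still raises a ValueError (json.JSONDecodeError) with a position.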
Sebastian Blask has the right idea, but there's no reason to use regexes for such a simple change.
objs = json.loads("[%s]"%(open('your_file.name').read().replace('}{', '},{')))
Or, more legibly:
import json
raw_objs_string = open('your_file.name').read()  # read in raw data
raw_objs_string = raw_objs_string.replace('}{', '},{')  # insert a comma between each object
objs_string = '[%s]' % (raw_objs_string)  # wrap in a list, to make valid JSON
objs = json.loads(objs_string)  # parse JSON
How about something like this:
import re
import json
jsonstr = open('test.json').read()
p = re.compile( r'}\s*{' )
jsonstr = p.sub( '}\n{', jsonstr )
jsonarr = jsonstr.split( '\n' )
for jsonstr in jsonarr:
    jsonobj = json.loads( jsonstr )
    print( json.dumps( jsonobj ) )
Solution
As far as I know, }{ does not appear in valid JSON outside of string values, so the following should be safe for splitting concatenated objects into separate strings, as long as no string value contains }{ (txt is the content of your file). It does not require any import (not even the re module):
retrieved_strings = map(lambda x: '{'+x+'}', txt.strip()[1:-1].split('}{'))
or, if you prefer list comprehensions (as David Zwicker mentioned in the comments):
retrieved_strings = ['{'+x+'}' for x in txt.strip()[1:-1].split('}{')]
Only the first and last characters are sliced off, so the closing braces of a nested object at the end of the text stay intact. The result, retrieved_strings, is a list of strings (an iterator for the map version on Python 3), each containing a separate JSON object. See proof here: http://ideone.com/Purpb
Example
The following string:
'{field1:"a",field2:"b"}{field1:"c",field2:"d"}{field1:"e",field2:"f"}'
will be turned into:
['{field1:"a",field2:"b"}', '{field1:"c",field2:"d"}', '{field1:"e",field2:"f"}']
as proven in the example I mentioned.
Why don't you load the file as a string, replace every }{ with },{ and surround the whole thing with []? Something like:
'[' + re.sub(r'\}\s*\{', '},{', string_read_from_a_file) + ']'
Or use a simple string replace if you are sure you always have }{ without whitespace in between.
In case you expect }{ to occur inside strings as well, you could also split on }{ and evaluate each fragment with json.loads. If you get an error, the fragment wasn't complete, so you append the next fragment to it and try again, and so forth.
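The split-and-retry idea from the last paragraph could look roughly like this. It is a sketch under assumptions: the function name is hypothetical, objects are assumed to be joined by exactly }{ with no whitespace in between, and re-parsing the growing fragment is not efficient for large inputs.

```python
import json

def split_and_retry(text):
    """Split on '}{' and glue fragments back together until each one parses."""
    fragments = text.split("}{")
    objects = []
    pending = ""
    for i, fragment in enumerate(fragments):
        # Put back the braces that split() removed.
        if i > 0:
            fragment = "{" + fragment
        if i < len(fragments) - 1:
            fragment = fragment + "}"
        pending += fragment
        try:
            objects.append(json.loads(pending))
        except ValueError:
            # The '}{' was inside a string value: keep accumulating.
            continue
        pending = ""
    if pending.strip():
        raise ValueError("trailing data is not a complete JSON object")
    return objects

result = split_and_retry('{"a": "x}{y"}{"b": 1}')
print(result)
```

Here the first }{ sits inside a string, so the first fragment fails to parse on its own and is completed by the second one.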
import json
file1 = open('filepath', 'r')
data = file1.readlines()
for line in data :
    values = json.loads(line)
    # Now you can access all the objects using values.get('key')
How about reading through the file, incrementing a counter every time a { is found and decrementing it when you come across a }? When the counter reaches 0, you know you've reached the end of the first object, so send that text through json.loads and start counting again. Then just repeat to completion.
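A sketch of this counter approach follows. It adds one thing the description leaves out: braces inside string values must not be counted, so the loop also tracks whether it is inside a string. The function name is hypothetical, and the input is assumed to be held in memory as one str.

```python
import json

def split_by_brace_depth(text):
    """Cut `text` into top-level {...} objects by tracking brace depth."""
    objects = []
    depth = 0
    start = None
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            # Inside a string only an unescaped quote changes state.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i  # beginning of a new top-level object
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Counter is back at zero: one complete object.
                objects.append(json.loads(text[start:i + 1]))
    return objects

parsed = split_by_brace_depth('{"a": {"b": "}{"}} {"c": "q\\"{"}')
print(parsed)
```

Unbalanced input is not diagnosed here; a stricter version would raise if depth is nonzero at the end.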
Suppose you added a [ to the start of the text in a file, and used a version of json.load() which, when it detected the error of finding a { instead of an expected comma (or hits the end of the file), spit out the just-completed object?
Replace a file with that junk in it:
$ sed -i -e 's;}{;}, {;g' foo
Do it on the fly in Python:
junkJson.replace('}{', '}, {')