Incorrect padding while decoding a Base64 string


I am trying to decode a Base64-encoded byte string into a valid HTTP URL. I have tried appending the necessary padding (=), but it still does not work.
I have tried the following code:
import base64
encoded = b"aHR0cHM6Ly9mb3Jtcy5nbGUvWU5ZXQ0d2NRWHVLNnNwdjU="
decoded = base64.b64decode(encoded)
print(decoded)
The encoded string is missing one character because of noise. Is there a way to detect the missing character and then perform the decode operation?

So, you have this Base64 encoding of a URL, aHR0cHM6Ly9mb3Jtcy5nbGUvWU5ZXQ0d2NRWHVLNnNwdjU=, with exactly one character missing.
For the missing character you have 64 choices (the Base64 alphabet abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/) and 48 possible positions to insert it: -a-H-R-0-c-H-M-6-L-y-9-m-b-3-J-t-c-y-5-n-b-G-U-v-W-U-5-Z-X-Q-0-d-2-N-R-W-H-V-L-N-n-N-w-d-j-U-=- (each - marks a possible position).
That gives 64 * 48 = 3072 possible encoded strings. You can generate them by hand or write some code to do it.
Once you have generated them, decode each string with a built-in library and check whether the result is a valid URL. If you also need to know whether the URL exists, make an HTTP request to it and check the response status code.
Code:
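package main

import (
    "encoding/base64"
    "fmt"
    "net/http"
)

func main() {
    encodedURL := "aHR0cHM6Ly9mb3Jtcy5nbGUvWU5ZXQ0d2NRWHVLNnNwdjU="
    // Standard Base64 alphabet, which includes '+' and '/'.
    options := "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/"
    for i := 0; i <= len(encodedURL); i++ {
        for idx := 0; idx < len(options); idx++ {
            // Insert one candidate character at position i.
            tempEncoded := encodedURL[:i] + options[idx:idx+1] + encodedURL[i:]
            // StdEncoding matches the alphabet above; URLEncoding expects '-' and '_' instead.
            decodedURL, err := base64.StdEncoding.DecodeString(tempEncoded)
            if err != nil {
                continue // not valid Base64, skip this candidate
            }
            resp, err := http.Get(string(decodedURL))
            if err != nil {
                continue // not a fetchable URL
            }
            resp.Body.Close()
            if resp.StatusCode == http.StatusOK {
                fmt.Println("this URL is valid & exists: ", string(decodedURL))
            }
        }
    }
}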
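
Since the question is tagged Python, here is a rough Python sketch of the same search without the network step. It only keeps candidates that decode to printable ASCII starting with https://. The function name candidate_urls and the prefix filter are my own assumptions, not something from the question, and several candidates may survive the filter, so you would still have to pick the right one (for example with an HTTP request, as above).

```python
import base64
import string

# Standard Base64 alphabet (the same 64 characters listed above).
ALPHABET = string.ascii_letters + string.digits + "+/"


def candidate_urls(broken, prefix="https://"):
    """Try every single-character insertion and keep plausible URLs."""
    found = set()
    for pos in range(len(broken) + 1):
        for ch in ALPHABET:
            attempt = broken[:pos] + ch + broken[pos:]
            try:
                # validate=True rejects malformed input instead of skipping it.
                text = base64.b64decode(attempt, validate=True).decode("ascii")
            except ValueError:
                # Covers binascii.Error and UnicodeDecodeError.
                continue
            if text.startswith(prefix) and text.isprintable():
                found.add(text)
    return sorted(found)


if __name__ == "__main__":
    broken = "aHR0cHM6Ly9mb3Jtcy5nbGUvWU5ZXQ0d2NRWHVLNnNwdjU="
    for url in candidate_urls(broken):
        print(url)
```

The printable-ASCII filter is a cheap way to throw away most of the 3072 candidates before doing anything expensive.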

When the length of the unencoded input is not a multiple of three, the encoded output must be padded so that its length is a multiple of four.
len(encoded) is 47 but should be 48, so append another =:
encoded = b"aHR0cHM6Ly9mb3Jtcy5nbGUvWU5ZXQ0d2NRWHVLNnNwdjU=="
decoded = base64.b64decode(encoded)
print(decoded)
b'https://forms.gle/YNY]\r\x1d\xd8\xd4V\x1dR\xcd\x9c\xdc\x1d\x8d'
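
If you don't want to count characters by hand, the padding rule can be sketched like this. fix_padding is a hypothetical helper name of mine. Note that it only repairs the padding: as the output above shows, it cannot restore a data character that is actually missing.

```python
import base64


def fix_padding(encoded: bytes) -> bytes:
    """Strip any existing '=' and re-pad to a multiple of four."""
    stripped = encoded.rstrip(b"=")
    return stripped + b"=" * (-len(stripped) % 4)


print(fix_padding(b"YWI"))                    # b'YWI='
print(base64.b64decode(fix_padding(b"YQ")))   # b'a'
```

One caveat: if the stripped length leaves a remainder of one character, no amount of padding makes it valid Base64, and b64decode will still raise an error.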

Related

MD5 Mismatch between Python and PHP

I am trying to compare MD5 hashes between PHP and Python. Our server works fine with PHP clients, but when we tried to do the same in Python, we always get an invalid response from the server.
I have the following piece of code in Python:
import hashlib
keyString = '96f6e3a1c4748b81e41ac58dcf6ecfa0'
decodeString = ''
length = len(keyString)
for i in range(0, length, 2):
    subString1 = keyString[i:(i + 2)]
    decodeString += chr(int(subString1, 16))
print(hashlib.md5(decodeString.encode("utf-8")).hexdigest())
Produces: 5a9536a1490714cb77a02080f902be4c
Now, the same concept in PHP:
$serverRandom = "96f6e3a1c4748b81e41ac58dcf6ecfa0";
$length = strlen($serverRandom);
$server_rand_code = '';
for($i = 0; $i < $length; $i += 2)
{
    $server_rand_code .= chr(hexdec(substr($serverRandom, $i, 2)));
}
echo 'SERVER CODE: '.md5($server_rand_code).'<br/>';
Produces: b761f889707191e6b96954c0da4800ee
I tried checking the encoding, but no luck: the two MD5 outputs don't match at all. Any help?

Your method of generating the byte string is incorrect, so the input to hashlib.md5 is wrong. chr() produces Unicode code points, and encoding them as UTF-8 turns every value above 0x7F into two bytes:
print(decodeString.encode('utf-8'))
# b'\xc2\x96\xc3\xb6\xc3\xa3\xc2\xa1\xc3\x84t\xc2\x8b\xc2\x81\xc3\xa4\x1a\xc3\x85\xc2\x8d\xc3\x8fn\xc3\x8f\xc2\xa0'
The easiest way to interpret the string as a hex string of bytes is to use binascii.unhexlify, or bytes.fromhex:
import binascii
decodeString = binascii.unhexlify(keyString)
decodeString2 = bytes.fromhex(keyString)
print(decodeString)
# b'\x96\xf6\xe3\xa1\xc4t\x8b\x81\xe4\x1a\xc5\x8d\xcfn\xcf\xa0'
print(decodeString == decodeString2)
# True
You can now directly use the resulting bytes object in hashlib.md5:
import hashlib
result = hashlib.md5(decodeString)
print(result.hexdigest())
# 'b761f889707191e6b96954c0da4800ee'
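
To see why the original loop went wrong, here is a small sketch comparing the two encodings. The variable names are mine. It also shows that if you really want to keep the chr() loop, encoding with latin-1 instead of utf-8 gives the same bytes as bytes.fromhex, because latin-1 maps code points 0x00-0xFF to single bytes.

```python
import hashlib

key_string = "96f6e3a1c4748b81e41ac58dcf6ecfa0"

# Build the str the way the question does: one code point per hex pair.
as_text = "".join(
    chr(int(key_string[i:i + 2], 16)) for i in range(0, len(key_string), 2)
)

utf8_bytes = as_text.encode("utf-8")      # code points above 0x7F become two bytes
latin1_bytes = as_text.encode("latin-1")  # each code point 0x00-0xFF stays one byte

print(len(utf8_bytes), len(latin1_bytes))         # 29 16
print(latin1_bytes == bytes.fromhex(key_string))  # True
print(hashlib.md5(latin1_bytes).hexdigest()
      == hashlib.md5(bytes.fromhex(key_string)).hexdigest())  # True
```

bytes.fromhex is still the cleaner choice; the latin-1 route is only here to show where the extra bytes came from.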

Rijndael Padding in Python

I know where the problem is but I don't know how to fix it. The problem is with the padding. I have absolutely no idea about it and how it works. I tried searching online but nothing seemed to help. I am trying to implement this function to work with my website. The python encrypts and sends the data and PHP decrypts it.
Here's the actual code of my python:
from rijndael.cipher.crypt import new
from rijndael.cipher.blockcipher import MODE_CBC
import base64
PADDING = b'.'
def r_pad(payload, block_size=32):
    return payload + (block_size - len(payload) % block_size) * PADDING
KEY = 'lkirwf897+22#bbtrm8814z5qq=498j5'
IV = '741952hheeyy66#cs!9hjv887mxx7#8y'
plain_text = "A padded string to BLOCKSIZE length."
rjn = new(KEY, MODE_CBC, IV, blocksize=32)
encd = rjn.encrypt(r_pad(plain_text))
data = base64.b64encode(encd)
print(data)
rjn = new(KEY, MODE_CBC, IV, blocksize=32)
data = base64.b64decode(data)
decd = rjn.decrypt(r_pad(data))
print (decd)
This is the output:
Dv0Y/AFXdFMlDrcldFCu8v5o9zAlLNgyM+vO+PFeSrqWdzP1S1cumviFiEjNAjz5njnMMC9lfxsBl71x5y+xCw==
A padded string to BLOCKSIZE length.............................Å¿:è°⌐┘n┤«╞Px╜:æC┬♣╬Q┤▼«U_♦â☻ìr
I need the output of the encrypted string to be something like this:
Dv0Y/AFXdFMlDrcldFCu8v5o9zAlLNgyM+vO+PFeSrpO8Ve82mdUcc4rkzp9afDYc75NmkSd4mdflt38kceOdA==
A padded string to BLOCKSIZE length
I tried to make RIJNDAEL256 function out of this code:
EncryptRJ256("lkirwf897+22#bbtrm8814z5qq=498j5", "741952hheeyy66#cs!9hjv887mxx7#8y", "A padded string to BLOCKSIZE length.")
Public Function EncryptRJ256(ByVal prm_key As String, ByVal prm_iv As String, ByVal prm_text_to_encrypt As String) As String
Dim s As String = prm_text_to_encrypt
Dim managed2 As New RijndaelManaged With {
.Padding = PaddingMode.Zeros,
.Mode = CipherMode.CBC,
.BlockSize = 256
}
Dim stream As New MemoryStream
Dim stream2 As New CryptoStream(stream, managed2.CreateEncryptor(Encoding.ASCII.GetBytes(prm_key), Encoding.ASCII.GetBytes(prm_iv)), CryptoStreamMode.Write)
Dim bytes As Byte() = Encoding.ASCII.GetBytes(s)
stream2.Write(bytes, 0, bytes.Length)
stream2.FlushFinalBlock()
Return Convert.ToBase64String(stream.ToArray)
End Function
Can anyone please help? I am lost at this point. :/
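
One thing that stands out when comparing the two snippets: the VB.NET code uses PaddingMode.Zeros, while the Python code pads with '.'. That would explain why the first ciphertext block matches and the second does not. Below is a minimal sketch of zero padding in plain Python. zero_pad and zero_unpad are hypothetical helper names, and I am assuming that no padding is added when the input is already block-aligned, so check that against your .NET side.

```python
BLOCK_SIZE = 32  # bytes; Rijndael with a 256-bit block


def zero_pad(payload: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pad with NUL bytes up to the next multiple of block_size."""
    return payload + b"\x00" * (-len(payload) % block_size)


def zero_unpad(payload: bytes) -> bytes:
    """Drop trailing NUL bytes (ambiguous if the plaintext may end in NULs)."""
    return payload.rstrip(b"\x00")


plain = b"A padded string to BLOCKSIZE length."
padded = zero_pad(plain)
print(len(plain), len(padded))  # 36 64
```

Also note that the decrypt call in the question runs r_pad over the ciphertext. The ciphertext is already 64 bytes, so r_pad appends a whole extra block of '.', which is probably where the garbage at the end of the decrypted output comes from. Pad only the plaintext before encrypting, and unpad only after decrypting.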

Python String Prefix by 4 Byte Length

I'm trying to write a server in Python to communicate with a pre-existing client whose message packets are ASCII strings, but prepended by four-byte unsigned integer values representative of the length of the remaining string.
I've done a receiver, but I'm sure there's a more pythonic way. Until I find it, I haven't done the sender. I can easily calculate the message length, convert it to bytes and transmit the message. The bit I'm struggling with is creating an integer which is an array of four bytes.
Let me clarify: If my string is 260 characters in length, I wish to prepend a big-endian four byte integer representation of 260. So, I don't want the ASCII string "0260" in front of the string, rather, I want four (non-ASCII) bytes representative of 0x00000104.
My code to receive the length prepended string from the client looks like this:
sizeBytes = 4 # size of the integer representing the string length
# receive big-endian 4 byte integer from client
data = conn.recv(sizeBytes)
if not data:
    break
dLen = 0
for i in range(sizeBytes):
    # each byte position is worth 256**i (pow(2, i) gives the wrong length)
    dLen = dLen + pow(256, i) * data[sizeBytes-i-1]
data = str(conn.recv(dLen),'UTF-8')
I could simply do the reverse. I'm new to Python and feel that what I've done is probably longhand!!
1) Is there a better way of receiving and decoding the length?
2) What's the "sister" method to encode the length for transmission?
Thanks.

The struct module is helpful here.
For writing:
import struct
msg = 'some message containing 260 ascii characters'
length = len(msg)
encoded_length = struct.pack('>I', length)
For a 260-character message, encoded_length is the 4 bytes b'\x00\x00\x01\x04'.
For reading:
length = struct.unpack('>I', received_msg[:4])[0]
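
Putting both directions together for a plain blocking socket might look like the sketch below. The helper names send_msg, recv_exact and recv_msg are my own. The loop in recv_exact is there because recv() is allowed to return fewer bytes than you asked for, which a single conn.recv(4) call does not account for.

```python
import socket
import struct


def send_msg(sock: socket.socket, text: str) -> None:
    """Send text prefixed by its length as a big-endian unsigned 32-bit int."""
    data = text.encode("ascii")
    sock.sendall(struct.pack(">I", len(data)) + data)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """recv() may return fewer bytes than requested, so loop until done."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("socket closed before the message was complete")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> str:
    """Read the 4-byte length, then exactly that many payload bytes."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length).decode("ascii")


if __name__ == "__main__":
    a, b = socket.socketpair()  # in-process pair of connected sockets for the demo
    send_msg(a, "x" * 260)
    print(len(recv_msg(b)))     # 260
    a.close()
    b.close()
```

If you prefer not to use struct, len(data).to_bytes(4, "big") and int.from_bytes(header, "big") do the same conversion.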

An example using asyncio:
import asyncio
import struct
def send_message(writer, message):
    data = message.encode()
    size = struct.pack('>L', len(data))
    writer.write(size + data)

async def receive_message(reader):
    data = await reader.readexactly(4)
    size = struct.unpack('>L', data)[0]
    data = await reader.readexactly(size)
    return data.decode()

Base64 encoding of CryptoDigest / SHA1 - string doesn't match the result from Java / Python

I'm trying to get the message digest of a string on iOS. I have tried the nv-ios-digest third-party hash library, but to no avail.
Below is the function I'm using to get the Base64-encoded string of a message digest.
-(NSString*) sha1:(NSString*)input //sha1- Digest
{
    NSData *data = [input dataUsingEncoding:NSUTF8StringEncoding];
    uint8_t digest[CC_SHA1_DIGEST_LENGTH];
    CC_SHA1(data.bytes, data.length, digest);
    NSMutableString* output = [NSMutableString stringWithCapacity:CC_SHA1_DIGEST_LENGTH * 2];
    for(int i = 0; i < CC_SHA1_DIGEST_LENGTH; i++){
        [output appendFormat:@"%02x", digest[i]];//digest
    }
    return [NSString stringWithFormat:@"%@",[[[output description] dataUsingEncoding:NSUTF8StringEncoding]base64EncodedStringWithOptions:0]]; //base64 encoded
}
Here is my sample input string: '530279591878676249714013992002683ec3a85216db22238a12fcf11a07606ecbfb57b5'
When I use this string in either Java or Python, I get the same result: '5VNqZRB1JiRUieUj0DufgeUbuHQ='
But on iOS I get 'ZTU1MzZhNjUxMDc1MjYyNDU0ODllNTIzZDAzYjlmODFlNTFiYjg3NA=='
Here is the code I'm using in Python:
import hashlib
import base64
def checkForDigestKey(somestring):
    msgDigest = hashlib.sha1()
    msgDigest.update(somestring)
    print base64.b64encode(msgDigest.digest())
Let me know if there is any way to get the same result on iOS.

You are producing a binary digest in Python but a hexadecimal digest in iOS.
The digests are otherwise equal:
>>> # iOS-produced base64 value
...
>>> 'ZTU1MzZhNjUxMDc1MjYyNDU0ODllNTIzZDAzYjlmODFlNTFiYjg3NA=='.decode('base64')
'e5536a65107526245489e523d03b9f81e51bb874'
>>> # Python-produced base64 value
...
>>> '5VNqZRB1JiRUieUj0DufgeUbuHQ='.decode('base64')
'\xe5Sje\x10u&$T\x89\xe5#\xd0;\x9f\x81\xe5\x1b\xb8t'
>>> from binascii import hexlify
>>> # Python-produced value converted to a hex representation
...
>>> hexlify('5VNqZRB1JiRUieUj0DufgeUbuHQ='.decode('base64'))
'e5536a65107526245489e523d03b9f81e51bb874'
Use base64.b64encode(msgDigest.hexdigest()) in Python to produce the same value, or Base-64 encode the digest bytes instead of hexadecimal characters in iOS.
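
The session above is Python 2 (str.decode('base64') no longer exists in Python 3). For reference, here is a Python 3 sketch of the same comparison. The function names are hypothetical. hex_b64_to_binary_b64 converts the iOS-style value into the Java/Python-style one, which is a quick way to confirm that the underlying digests agree.

```python
import base64
import hashlib


def b64_binary_digest(data: bytes) -> str:
    """Base64 of the 20 raw SHA-1 bytes (what the Python code produces)."""
    return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")


def b64_hex_digest(data: bytes) -> str:
    """Base64 of the 40-character hex text (what the iOS code produces)."""
    hex_text = hashlib.sha1(data).hexdigest()
    return base64.b64encode(hex_text.encode("ascii")).decode("ascii")


def hex_b64_to_binary_b64(value: str) -> str:
    """Convert Base64-of-hex-text into Base64-of-raw-bytes."""
    hex_text = base64.b64decode(value).decode("ascii")
    return base64.b64encode(bytes.fromhex(hex_text)).decode("ascii")


ios_value = "ZTU1MzZhNjUxMDc1MjYyNDU0ODllNTIzZDAzYjlmODFlNTFiYjg3NA=="
print(hex_b64_to_binary_b64(ios_value))  # 5VNqZRB1JiRUieUj0DufgeUbuHQ=
```

The length is a handy tell: 28 Base64 characters means 20 raw bytes were encoded, 56 characters means the 40-character hex string was encoded.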

Decoding JSON array with unicode character

I have the following JSON array:
[u'steve@gmail.com']
The "u" is apparently the unicode marker, and it was automatically created by Python. Now, I want to bring this back into Objective-C and decode it into an array using this:
+(NSMutableArray*)arrayFromJSON:(NSString*)json
{
    if(!json) return nil;
    NSData *jsonData = [json dataUsingEncoding:NSUTF8StringEncoding];
    //I've also tried NSUnicodeStringEncoding here, same thing
    NSError *e;
    NSMutableArray *result= [NSJSONSerialization JSONObjectWithData:jsonData options:NSJSONReadingMutableContainers error:&e];
    if (e != nil) {
        NSLog(@"Error:%@", e.description);
        return nil;
    }
    return result;
}
However, I get an error: (Cocoa error 3840.)" (Invalid value around character 1.)
How do I remedy this?
Edit: Here's how I bring the entity from Python back into objective-c:
First I convert the entity to a dictionary:
def to_dict(self):
    return dict((p, unicode(getattr(self, p))) for p in self.properties()
                if getattr(self, p) is not None)
I add this dictionary to a list, set the value of my responseDict['entityList'] to this list, then self.response.out.write(json.dumps(responseDict))
However the result I get back still has that 'u' character.

[u'steve@gmail.com'] is the decoded Python value of the array; it is not valid JSON.
The valid JSON string would be just ["steve@gmail.com"].
Dump the data from Python back into a JSON string by doing:
import json
python_data = [u'steve@gmail.com']
json_string = json.dumps(python_data)
The u prefix on Python string literals indicates that those strings are unicode strings rather than Python 2.x's default byte strings (ASCII).
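
A quick way to see the difference between the Python representation and the JSON text (shown here in Python 3, where the u prefix no longer appears in the repr, but the single quotes still make it invalid JSON). The address is a placeholder of mine, not the one from the question.

```python
import json

python_data = ["user@example.com"]  # placeholder address, not from the question

as_repr = repr(python_data)        # Python's own notation, not JSON
as_json = json.dumps(python_data)  # real JSON text

print(as_repr)  # ['user@example.com']
print(as_json)  # ["user@example.com"]

# Round trip: the JSON text parses back to an equal list.
print(json.loads(as_json) == python_data)  # True
```

If the client ever receives something that looks like the first line, the server is sending str() or repr() of the list somewhere instead of the json.dumps output.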
