Python Read Binary vs NodeJS Read Binary

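I have a method in a Python REST service that expects the binary contents of an image file in the request and saves them to a file that can later be opened as a valid image:
from django.views.decorators.http import require_http_methods  # assuming Django, as request.FILES suggests
@require_http_methods(['POST'])
def my_method(request):
    # fs is presumably a storage object (e.g. Django's FileSystemStorage) created elsewhere
    image_data = request.FILES['image']
    fs.save('image.png', image_data)
If I send a request to this service from a Python script, it works fine: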
import requests
image_file = '......image.png'
image_data = open(image_file, 'rb')
requests.post('http://127.0.0.1:8080/...', files = dict(image = image_data))
If, however, I use NodeJS to send a request to the Python service, it doesn't work properly:
import { readFileSync } from 'fs';
import FormData from 'form-data';
import fetch from 'node-fetch';
const IMAGE_DATA = readFileSync('......image.png', { encoding: 'binary' });
const FORM_DATA = new FormData();
FORM_DATA.append('image', IMAGE_DATA, { filename: 'image.png' });
fetch('http://127.0.0.1:8080/...', { method: 'POST', body: FORM_DATA });
Opening the image files saved by the Python service for both requests in VSCode as binary data makes the reason clear: there are encoding/charset differences between the two clients.
The window on the right is the PNG file sent by Python script and saved by the Python service.
The window on the left is the PNG file sent by NodeJS script and saved by the Python service.
Is there a way to fix this in NodeJS?
EDIT
Upon further testing, I found the following:
const IMAGE_DATA_1 = readFileSync('imageSource.png', { encoding: 'binary' });
writeFileSync('imageDest1.png', IMAGE_DATA_1);
// THE FILE imageDest1.png CONTAINS BAD DATA AND CANNOT BE OPENED IN A GRAPHICS PROGRAM
const IMAGE_DATA_2 = readFileSync('imageSource.png');
writeFileSync('imageDest2.png', IMAGE_DATA_2);
// THE FILE imageDest2.png CONTAINS VALID DATA AND CAN BE OPENED IN A GRAPHICS PROGRAM
So the solution was to not use the { encoding: 'binary' } option. However, I would have thought that image files contain binary data and should be opened with binary encoding. So why is this causing issues?
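A likely explanation: in Node.js, 'binary' is an alias for the 'latin1' string encoding, so readFileSync with that option returns a text string rather than a Buffer, and writing or sending that string encodes it again (as UTF-8 by default). The sketch below reproduces the same round trip in Python on the eight PNG signature bytes. It is only an illustration of the effect, not the Node code itself.

```python
# The first eight bytes of every PNG file (the PNG signature).
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Step 1: what reading with { encoding: 'binary' } amounts to:
# each byte becomes one character of a text string (Latin-1 decoding).
as_text = png_signature.decode("latin-1")

# Step 2: what happens when that string is later written or sent as text:
# it is encoded again, this time as UTF-8.
re_encoded = as_text.encode("utf-8")

print(png_signature.hex(" "))  # 89 50 4e 47 0d 0a 1a 0a
print(re_encoded.hex(" "))     # c2 89 50 4e 47 0d 0a 1a 0a
```

Every byte at or above 0x80 grows into two bytes, which is why the saved file no longer matches the original. Omitting the encoding option keeps the data as raw bytes from start to finish.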

Related

Save the API response (Json format ) as image file locally to system

Below is the API response I'm getting:
{"contentType":"image/jpeg","createdTime":"2021-10-10T11:00:47.000Z","fileName":"Passport_Chris J Passport Color - pp.jpg","id":10144,"size":105499,"updatedTime":"2021-10-10T11:00:47.000Z","links":[{"rel":"self","href":"https://dafzprod.custhelp.com/services/rest/connect/v1.4/CompanyRegd.ManagerDetails/43/FileAttachments/10144?download="},{"rel":"canonical","href":"https://dafzprod.custhelp.com/services/rest/connect/v1.4/CompanyRegd.ManagerDetails/43/FileAttachments/10144"},{"rel":"describedby","href":"https://dafzprod.custhelp.com/services/rest/connect/v1.4/metadata-catalog/CompanyRegd.ManagerDetails/FileAttachments","mediaType":"application/schema+json"}]}
I need to save this file locally in JPG format. Could you please provide a solution in Python?

You might have to decode the JSON-string (if not already done):
import json
json_decoded = json.loads(json_string)
Afterwards you can get the URL to retrieve and the filename from this JSON-structure
url = json_decoded['links'][0]['href']
local_filename = json_decoded['fileName']
Now you can download the file and save it (as seen here How to save an image locally using Python whose URL address I already know?):
import urllib.request
urllib.request.urlretrieve(url, local_filename)
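Indexing links[0] relies on the order of the entries in the response. As a slightly more defensive sketch, the link can be selected by its "rel" value instead. The helper name pick_download_target and the sample data below are made up for illustration, and whether the real endpoint needs authentication is not shown in the question.

```python
import json


def pick_download_target(json_string, rel="self"):
    """Return (url, filename) for the first link whose 'rel' matches."""
    doc = json.loads(json_string)
    for link in doc.get("links", []):
        if link.get("rel") == rel:
            return link["href"], doc["fileName"]
    raise ValueError(f"no link with rel={rel!r} in response")


# Made-up response with the same shape as the one in the question.
sample = json.dumps({
    "fileName": "example.jpg",
    "links": [
        {"rel": "canonical", "href": "https://example.invalid/files/1"},
        {"rel": "self", "href": "https://example.invalid/files/1?download="},
    ],
})

url, local_filename = pick_download_target(sample)
print(url, local_filename)
```

The returned pair can then be passed to urllib.request.urlretrieve(url, local_filename) exactly as in the answer.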

Write in-memory data to an encrypted ZIP or 7Z (without using an on-disk temp file)

I have 2 GB of data in memory (for example data = b'ab' * 1000000000) that I would like to write to an encrypted ZIP or 7Z file.
How to do this without writing data to a temporary on-disk file?
Is it possible with only Python built-in tools (+ optionally 7z)?
I've already looked at this:
ZipFile.writestr writes from an in-memory string/bytes, which is good, but:
ZipFile.setpassword works only for reading, not for writing.
How to create an encrypted ZIP file?: most answers use a file as input (and cannot work with in-memory data), especially the solutions with pyminizip and those with:
subprocess.call(['7z', 'a', '-mem=AES256', '-pP4$$W0rd', '-y', 'myarchive.zip']...
Other solutions require trusting a third-party tool's implementation of cryptography (see comments), so I would like to avoid them.

7z.exe has the -si flag, which lets it read data for a file from stdin. This way you could still use 7z's commandline from a subprocess even without an extra file:
from subprocess import Popen, PIPE
# inputs
szip_exe = r"C:\Program Files\7-Zip\7z.exe" # ... get from registry maybe
memfiles = {"data.dat" : b'ab' * 1000000000}
arch_filename = "myarchive.zip"
arch_password = "Swordfish"
for filename, data in memfiles.items():
    args = [szip_exe, "a", "-mem=AES256", "-y", "-p{}".format(arch_password),
            "-si{}".format(filename), arch_filename]
    proc = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    proc.stdin.write(data)
    proc.stdin.close()  # causes 7z to terminate
    # proc.stdin.flush()  # instead of close() when on Mac, see comments
    proc.communicate()  # wait until 7z has actually terminated
The write() takes somewhat above 40 seconds on my machine, which is quite a lot. I can't say though if that's due to any inefficiencies from piping the whole data through stdin or if it's just how long compressing and encrypting a 2GB file takes. EDIT: Packing the file from HDD took 47 seconds on my machine, which speaks for the latter.
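One possible variant, sketched here under assumptions, hands the data to Popen.communicate(input=...) instead of writing to and closing stdin by hand. communicate() writes the input, closes stdin, and drains stdout and stderr while it waits, which avoids a stall if 7z fills one of the output pipes. The helper names build_7z_args and add_member_from_memory are hypothetical, the executable is assumed to be reachable as "7z", and this has not been timed against the 2 GB case.

```python
from subprocess import Popen, PIPE


def build_7z_args(szip_exe, arch_filename, arch_password, member_name):
    """Build the same 7z command line as above for one in-memory member."""
    return [
        szip_exe, "a", "-mem=AES256", "-y",
        "-p{}".format(arch_password),
        "-si{}".format(member_name),
        arch_filename,
    ]


def add_member_from_memory(args, data):
    """Feed data to 7z on stdin; communicate() writes, closes and waits."""
    proc = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    out, err = proc.communicate(input=data)
    if proc.returncode != 0:
        raise RuntimeError(err.decode(errors="replace"))
    return out


args = build_7z_args("7z", "myarchive.zip", "Swordfish", "data.dat")
print(args)
# add_member_from_memory(args, b"ab" * 10)  # requires 7z to be installed
```

Note that a password passed with -p is part of the command line and can therefore be visible to other users in the process list.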

ORIGINAL POST 03.19.2022
Here is one way to accomplish your use case using pyzipper
import fs
import pyzipper
# create in-memory file system
mem_fs = fs.open_fs('mem://')
mem_fs.makedir('hidden_dir')
# generate data
data = b'ab' * 10
secret_password = b'super secret password'
# Create encrypted password protected ZIP file in-memory
with pyzipper.AESZipFile(mem_fs.open('/hidden_dir/password_protected.zip', 'wb'),
                         'w',
                         compression=pyzipper.ZIP_LZMA,
                         encryption=pyzipper.WZ_AES) as zf:
    zf.setpassword(secret_password)
    zf.writestr('data.txt', data)
# Read encrypted password protected ZIP file from memory
with pyzipper.AESZipFile(mem_fs.open('/hidden_dir/password_protected.zip', 'rb')) as zf:
    zf.setpassword(secret_password)
    my_secrets = zf.read('data.txt')
    print(my_secrets)
# output
b'abababababababababab'
UPDATED 03.21.2022
Reading through our comments, you continue to raise concerns about the cryptography components of modules such as pyzipper, but not about the 7Z LIB/SDK. Here is an academic paper on 7Z LIB/SDK version 19 cryptography.
Based on your concerns, have you considered encrypting your data in memory prior to writing it to a zipfile?
Here is an example for doing this and writing the encrypted data to a file in memory:
import os
import fs
import base64
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
mem_fs = fs.open_fs('mem://')
mem_fs.makedir('hidden_dir')
password = b"password"
salt = os.urandom(16)
kdf = PBKDF2HMAC(
    algorithm=hashes.SHA256(),
    length=32,
    salt=salt,
    iterations=390000,
)
key = base64.urlsafe_b64encode(kdf.derive(password))
f = Fernet(key)
data = b'ab' * 10
encrypted_message = f.encrypt(data)
with mem_fs.open('hidden_dir/encrypted.text', 'wb') as in_file_in_memory:
    in_file_in_memory.write(encrypted_message)
with mem_fs.open('hidden_dir/encrypted.text', 'rb') as out_file_in_memory:
    raw_data = out_file_in_memory.read()
    decrypted_data = f.decrypt(raw_data)
    print(decrypted_data)
# output
b'abababababababababab'
Previously in the comments I mentioned key management, which is similar to maintaining a list of passwords for your zip archives.
I don't know your setup, but you could pregenerate keys and store them securely for use in your code.
I don't have 7z installed on my Mac, so I could only give you pseudocode. The examples below aren't using 7z.
import os
import fs
import base64
import pyzipper
from zipfile import ZipFile
from cryptography.fernet import Fernet
mem_fs = fs.open_fs('mem://')
mem_fs.makedir('hidden_dir')
# pregenerate key
f = Fernet(b'-6_WO-GLrlXexdSbon_fKJoVOVBh66LdYrEM0Kvcwf0=')
data = b'ab' * 10
encrypted_message = f.encrypt(data)
with mem_fs.open('hidden_dir/encrypted.text', 'wb') as in_file_in_memory:
    in_file_in_memory.write(encrypted_message)
# This uses standard ZIP with no password, but the data
# is already encrypted
with mem_fs.open('hidden_dir/encrypted.text', 'rb') as out_file_in_memory:
    raw_data = out_file_in_memory.read()
with ZipFile('archive.zip', mode='w') as zip_file:
    zip_file.writestr('file.txt', raw_data)
# This uses pyzipper to create a password-protected
# encrypted file, which stores the encrypted.text.
# Overkill, because the data is already encrypted beforehand.
with mem_fs.open('hidden_dir/encrypted.text', 'rb') as out_file_in_memory:
    raw_data = out_file_in_memory.read()
secret_password = b'super secret password'
# Create encrypted, password-protected ZIP file
with pyzipper.AESZipFile('password_protected.zip',
                         'w',
                         compression=pyzipper.ZIP_LZMA,
                         encryption=pyzipper.WZ_AES) as zf:
    zf.setpassword(secret_password)
    zf.writestr('data.txt', raw_data)
I'm still looking into how to pipe this encrypted.text to subprocess 7-zip.

It would probably be simplest to use third-party applications such as RadeonRAMDisk to emulate disk operations in-memory, but you stated you prefer not to. Another possibility is to extend PyFilesystem to allow encrypted zip-file operations on a memory filesystem.

I don't know about Python, but it is possible to do this using the 7zip C++ interface. However, it is a lot of work. Here's an excerpt from my implementation that I'm using for a project that has to pack zip files:
class CArchiveUpdateCallback : public IArchiveUpdateCallback
{
public:
    STDMETHODIMP GetProperty(UInt32 Index, PROPID PropID, PROPVARIANT* PropValue)
    {
        const std::wstring& FilePath = m_FileList[Index].first;
        const std::wstring& ItemPath = m_FileList[Index].second;
        switch (PropID)
        {
        case kpidPath:
            V_VT(PropValue) = VT_BSTR;
            V_BSTR(PropValue) = SysAllocString(ItemPath.c_str());
            break;
        case kpidSize:
            V_VT(PropValue) = VT_UI8;
            PropValue->uhVal.QuadPart = Utils::GetSize(FilePath);
            break;
        }
        return S_OK;
    }
    STDMETHODIMP GetStream(UInt32 ItemIndex, ISequentialInStream** InStream)
    {
        const std::wstring& FilePath = m_FileList[ItemIndex].first;
        HRESULT hr = CInStream::Create(FilePath, IID_ISequentialInStream, (void**)InStream);
        return hr;
    }
protected:
    std::vector<std::pair<std::wstring, std::wstring>> m_FileList;
};
It currently works exclusively with on-disk files, but could be modified to accommodate in-memory buffers. For example, it could operate on a list of (in-memory) IInStream objects instead of a list of file paths.

I don't understand why you don't want a temporary on-disk file, as it would reduce the complexity.
That said, I have found a few solutions that require only Python's built-in modules:
You can use subprocess to interact with PowerShell and create the zip file using a PowerShell command. You can either run the command directly or save it in a .ps1 file and execute that. (This solution requires the 7-Zip software to be installed.)
import subprocess
def run(cmd):
    # Run a PowerShell command and return the CompletedProcess (captured stdout/stderr)
    completed = subprocess.run(["powershell", "-Command", cmd], capture_output=True)
    return completed
and the powershell code would be:
# Note: this code was not written by me; a link to the original author is provided
function Write-ZipUsing7Zip([string]$FilesToZip, [string]$ZipOutputFilePath, [string]$Password, [ValidateSet('7z','zip','gzip','bzip2','tar','iso','udf')][string]$CompressionType = 'zip', [switch]$HideWindow)
{
    # Look for the 7zip executable.
    $pathTo32Bit7Zip = "C:\Program Files (x86)\7-Zip\7z.exe"
    $pathTo64Bit7Zip = "C:\Program Files\7-Zip\7z.exe"
    $THIS_SCRIPTS_DIRECTORY = Split-Path $script:MyInvocation.MyCommand.Path
    $pathToStandAloneExe = Join-Path $THIS_SCRIPTS_DIRECTORY "7za.exe"
    if (Test-Path $pathTo64Bit7Zip) { $pathTo7ZipExe = $pathTo64Bit7Zip }
    elseif (Test-Path $pathTo32Bit7Zip) { $pathTo7ZipExe = $pathTo32Bit7Zip }
    elseif (Test-Path $pathToStandAloneExe) { $pathTo7ZipExe = $pathToStandAloneExe }
    else { throw "Could not find the 7-zip executable." }
    # Delete the destination zip file if it already exists (i.e. overwrite it).
    if (Test-Path $ZipOutputFilePath) { Remove-Item $ZipOutputFilePath -Force }
    $windowStyle = "Normal"
    if ($HideWindow) { $windowStyle = "Hidden" }
    # Create the arguments to use to zip up the files.
    # Command-line argument syntax can be found at: http://www.dotnetperls.com/7-zip-examples
    $arguments = "a -t$CompressionType ""$ZipOutputFilePath"" ""$FilesToZip"" -mx9"
    if (!([string]::IsNullOrEmpty($Password))) { $arguments += " -p$Password" }
    # Zip up the files.
    $p = Start-Process $pathTo7ZipExe -ArgumentList $arguments -Wait -PassThru -WindowStyle $windowStyle
    # If the files were not zipped successfully.
    if (!(($p.HasExited -eq $true) -and ($p.ExitCode -eq 0)))
    {
        throw "There was a problem creating the zip file '$ZipOutputFilePath'."
    }
}
Use the PowerShell module 7zip4PowerShell and then interact with the shell using subprocess. (Link provided)
Launch PowerShell with administrative escalation.
Install the 7-zip module by entering the cmdlet below. It does query the PS gallery and uses a third-party repository to download the dependencies. If you’re OK with the security considerations, approve the installation to proceed:
Install-Module -Name 7zip4PowerShell -Verbose
Change directories to where you want the compressed file saved.
Create a secure string for your compressed file’s encryption by entering the cmdlet below:
$SecureString = Read-Host -AsSecureString
Enter the password you wish to use in PowerShell. The password will be obfuscated by asterisks. The plain text entered will be converted to $SecureString, and you’ll use that in the next step.
Enter the following cmdlet to encrypt the resulting compressed file:
Compress-7zip -Path "\path ofiles" -ArchiveFileName "Filename.zip" -Format Zip -SecurePassword $SecureString
The resulting ZIP file will be saved to the chosen directory once the command has completed processing.
You can either follow the process in powershell terminal, or just interact with the terminal after installing the dependency using subprocess.
References:
The Powershell Code (Method 1)
2nd Method
Executing powershell command in python

PIL Image as Bytes with BytesIO to prevent hard disk saving

Problem
I have a PIL Image and I want to convert it to a bytes array. I can't save the image to my hard disk, so I can't use the usual open(file_path, 'rb') function.
What I tried
To work around this problem, I'm trying to use the io library like this:
buf = io.BytesIO()
image.save(buf, format='JPEG')
b_image = buf.getvalue()
Here, image is assumed to be a valid PIL Image.
"b_image" will be used as the argument for the Microsoft Azure Cognitive Services function read_in_stream().
If we look at the documentation, we can see that this function's image argument has to be:
image
xref:Generator
Required
An image stream.
Documentation available here
The issue
When I execute it, I get this error:
File "C:...\envs\trainer\lib\site-packages\msrest\service_client.py", line 137, in stream_upload
chunk = data.read(self.config.connection.data_block_size)
AttributeError: 'bytes' object has no attribute 'read'
There is no error in the client authentication or anywhere else, because when I pass an image opened with this line as the parameter:
image = open("./1.jpg", 'rb')
everything works correctly.
Sources
I also saw this post that explains exactly what I want to do, but in my case it's not working. Any ideas would be appreciated.

The read_in_stream method needs a stream, i.e. an object with a read() method, but BytesIO.getvalue() returns the content of the stream as bytes. Pass the buffer itself instead:
buf = io.BytesIO()
image.save(buf, format='JPEG')
buf.seek(0)  # rewind so the client reads from the start of the buffer
computervision_client.read_in_stream(buf)
For more details, please refer to here
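The difference between the bytes returned by getvalue() and the buffer object itself can be seen without PIL or Azure. The sketch below uses a made-up payload in place of image.save(buf, format='JPEG') and shows two things: bytes has no read() method, and a BytesIO that has just been written to must be rewound before a consumer can read it.

```python
import io

payload = b"\xff\xd8\xff\xe0 fake JPEG bytes"  # stand-in for real image data

buf = io.BytesIO()
buf.write(payload)  # stands in for image.save(buf, format='JPEG')

# bytes has no read(), which is what triggered the AttributeError.
print(hasattr(buf.getvalue(), "read"))  # False
print(hasattr(buf, "read"))             # True

# After writing, the position is at the end, so read() returns nothing.
at_end = buf.read()

# Rewind first; now a consumer calling read() sees the whole payload.
buf.seek(0)
from_start = buf.read()
print(len(at_end), len(from_start))
```

If a client reads an empty stream even though the buffer was passed correctly, a missing seek(0) is a likely cause.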
Update
Regarding the issue, I suggest you use the REST API instead.
import io
import requests
from PIL import Image
import time
url = "{your endpoint}/vision/v3.1/read/analyze"
key = ''
headers = {
    'Ocp-Apim-Subscription-Key': key,
    'Content-Type': 'application/octet-stream'
}
# process image (im below is the PIL Image produced by this elided step)
...
with io.BytesIO() as buf:
    im.save(buf, 'jpeg')
    response = requests.request(
        "POST", url, headers=headers, data=buf.getvalue())
# get result
while True:
    res = requests.request(
        "GET", response.headers['Operation-Location'], headers=headers)
    status = res.json()['status']
    if status == 'succeeded':
        print(res.json()['analyzeResult'])
        break
    time.sleep(1)
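The polling loop above only exits on 'succeeded', so it would spin forever if the operation ends in any other way. A bounded variant could look like the sketch below. The helper poll_until_done is hypothetical, the 'failed' status name is an assumption about the service, and the fake responses stand in for res.json() so the example runs without network access.

```python
import time


def poll_until_done(fetch_status, interval=1.0, max_attempts=30):
    """Call fetch_status() until it reports a terminal state or we give up."""
    for _ in range(max_attempts):
        result = fetch_status()
        status = result["status"]
        if status == "succeeded":
            return result
        if status == "failed":  # assumed name of the failure state
            raise RuntimeError("operation failed")
        time.sleep(interval)
    raise TimeoutError("operation did not finish in time")


# Stand-in for res.json(): reports 'running' twice, then 'succeeded'.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"demo": True}},
])
final = poll_until_done(lambda: next(responses), interval=0)
print(final["analyzeResult"])
```

In the real script, fetch_status would wrap the GET request against the Operation-Location URL and return res.json().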

How to copy / download file created in Pyodide in browser?

I managed to run Pyodide in the browser and created a hello.txt file, but how can I access it?
Pyodide https://github.com/iodide-project/pyodide/blob/master/docs/using_pyodide_from_javascript.md
pyodide.runPython('open("hello.txt", "w")')
What I tried in Chrome DevTools:
pyodide.runPython('os.chdir("../")')
pyodide.runPython('os.listdir()')
pyodide.runPython('os.path.realpath("hello.txt")')
Output for listdir
["hello.txt", "lib", "proc", "dev", "home", "tmp"]
Output for realpath
"/hello.txt"
Also,
pyodide.runPython('import platform')
pyodide.runPython('platform.platform()')
Output
"Emscripten-1.0-x86-JS-32bit"
All outputs are from the Chrome DevTools console.
The file is created in the root folder. But how can it be accessed in the file explorer, or is there any way to copy it to the Downloads folder?
Thanks

Indeed pyodide operates in an in-memory (MEMFS) filesystem created by Emscripten. You can't directly write files to disk from pyodide since it's executed in the browser sandbox.
You can, however, pass your file to JavaScript, create a Blob out of it and then download it. For instance, using:
let txt = pyodide.runPython(`
with open('/test.txt', 'rt') as fh:
    txt = fh.read()
txt
`);
const blob = new Blob([txt], {type : 'application/text'});
let url = window.URL.createObjectURL(blob);
window.location.assign(url);
It should also have been possible to do all of this from the Python side, using the type conversions included in pyodide, i.e.
from js import Blob, document
from js import window
with open('/test.txt', 'rt') as fh:
    txt = fh.read()
blob = Blob.new([txt], {type : 'application/text'})
url = window.URL.createObjectURL(blob)
window.location.assign(url)
However, at present this unfortunately doesn't work, as it depends on pyodide#788 being resolved first.

I have modified the answer by rth so that the file is downloaded under its own name.
let txt = pyodide.runPython(`
with open('/test.txt', 'rt') as fh:
    txt = fh.read()
txt
`);
const blob = new Blob([txt], {type : 'application/text'});
let url = window.URL.createObjectURL(blob);
var downloadLink = document.createElement("a");
downloadLink.href = url;
downloadLink.download = "test.txt";
document.body.appendChild(downloadLink);
downloadLink.click();

Flask Application File Upload - Error while getting contents of file

I am developing a flask application which uploads a file to IBM Bluemix Cloudant DB. I need to save the contents of the file as a key value pair in Cloudant.
If I try to save a text file, it reads the content correctly. For other types of files it does not work.
The following is my Flask REST API code:
@app.route('/upload', methods=['POST'])
def upload_file():
    file_to_upload = request.files['file_upload']
    response = CloudantDB().upload_file_to_db(file_to_upload)
# The body of the upload function under CloudantDB is shown below.
file_name = file.filename
uploaded_file_content = file.read()
data = {
    'file_name': file_name,
    'file_contents': uploaded_file_content,
    'version': version
}
my_doc = self.database.create_document(data)
I know the error occurs because "uploaded_file_content" is in a different format (e.g. for PDFs, JPGs etc.).
Is there any way I can overcome this?
Thanks!

The difference is that text files contain ordinary text whereas JPG, PNG etc. contain binary data.
Binary data should be uploaded as an attachment with a MIME type, and you need to base64-encode the data. You don't show what create_document() is doing, but it's unlikely that it is able to treat binary data as an attachment. This might fix it for you:
from base64 import b64encode
# b64encode returns bytes; decode to str so the document can be serialized as JSON
uploaded_file_content = b64encode(file.read()).decode('ascii')
data = {
    'file_name': file_name,
    'version': version,
    '_attachments': {
        file_name: {
            'content_type': 'image/png',  # use the MIME type of the uploaded file
            'data': uploaded_file_content
        }
    }
}
my_doc = self.database.create_document(data)
It should also be possible with your current code to simply base64-encode the file content and upload it. So that you know what type of data is stored should you later retrieve it, you will need to add another key-value pair to store the MIME type, as the content type entry does above.
Attachments have advantages in that they can be individually addressed, read, deleted, updated without affecting the containing document, so you are probably better off using them.
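As a self-contained sketch of the inline-attachment idea, the helper below builds such a document and checks that the binary content survives a JSON round trip. The function name build_attachment_doc and the sample values are made up for illustration, "content_type" is the key used by CouchDB-style inline attachments, and no Cloudant call is made here.

```python
import json
from base64 import b64decode, b64encode


def build_attachment_doc(file_name, content, mime_type, version):
    """Build a JSON-serialisable document with one inline attachment."""
    return {
        "file_name": file_name,
        "version": version,
        "_attachments": {
            file_name: {
                "content_type": mime_type,
                # b64encode returns bytes; decode so json.dumps accepts it.
                "data": b64encode(content).decode("ascii"),
            }
        },
    }


raw = bytes(range(256))  # arbitrary binary content, including non-text bytes
doc = build_attachment_doc("sample.bin", raw, "application/octet-stream", 1)
as_json = json.dumps(doc)  # would raise TypeError if 'data' were bytes
restored = b64decode(json.loads(as_json)["_attachments"]["sample.bin"]["data"])
print(restored == raw)  # True
```

In the Flask handler, the MIME type could be taken from the uploaded file object rather than hard-coded.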
