I am using http://cloudinary.com/documentation/image_upload_api_reference as a reference.
There are two cases in which I want to upload files to Cloudinary:
1. Upload an image by directly giving its URL.
2. Upload image bytes obtained from a different source.
I could solve case 1 but had trouble with case 2. I am pasting my code flow below for reference.
import logging
import cloudinary
import cloudinary.uploader
from io import BytesIO
from StringIO import StringIO

def upload_image_to_cloudinary(img_tag):
    logging.debug("Uploading Image to cloudinary : %s" % img_tag)
    if 'src' not in img_tag.attrs:
        del img_tag
        return
    img_src = img_tag['src']
    if img_src.startswith('/blob'):
        quip_client = pgquip.get_client()
        blob_ids = img_src.split('/')
        blob_response = quip_client.get_blob(blob_ids[2], blob_ids[3])
        img_src_str = blob_response.read()  # this returns str object.
        # img_src = BytesIO(img_src_str)
        img_src = StringIO(img_src_str)
    cloudinary_response = cloudinary.uploader.upload_image(
        img_src,
        use_filename=True,
        folder="/pagalguy/articles",
        width=546,
        crop="limit"
    )
    img_tag['src'] = cloudinary_response.metadata.get("url")
    return img_tag
When img_src is an image blob str returned by another API, I passed it as the file param mentioned in the Cloudinary docs, in much the same way as an external image URL, e.g. https://media.licdn.com/mpr/mpr/shrinknp_400_400/AAEAAQAAAAAAAAIkAAAAJGRhNzJiYjY1LTUxOTctNDI4NC1hOGIwLWQ1OTVlNmZlZmVmYw.jpg
To check how generic upload flows work, such as boto for S3, I looked at the repo code below.
I also referred to https://github.com/boto/boto/blob/develop/boto/vendored/six.py#L633.
Error Log:
Invalid URL for upload
Traceback (most recent call last):
File "/base/data/home/apps/s~pagalguy-staging/namita:v1.397698162588746989/articleslib/article_util.py", line 68, in upload_images_n_publish
tag = image_util.upload_image_to_cloudinary(tag)
File "/base/data/home/apps/s~pagalguy-staging/namita:v1.397698162588746989/api/image_util.py", line 133, in upload_image_to_cloudinary
crop="limit"
File "/base/data/home/apps/s~pagalguy-staging/namita:v1.397698162588746989/libs/cloudinary/uploader.py", line 23, in upload_image
result = upload(file, **options)
File "/base/data/home/apps/s~pagalguy-staging/namita:v1.397698162588746989/libs/cloudinary/uploader.py", line 17, in upload
return call_api("upload", params, file = file, **options)
File "/base/data/home/apps/s~pagalguy-staging/namita:v1.397698162588746989/libs/cloudinary/uploader.py", line 226, in call_api
raise Error(result["error"]["message"])
Error: Invalid URL for upload
In the end, I don't know the correct way to upload image bytes to Cloudinary.
Your img_src parameter, which represents the file, should be populated with either a byte array buffer (bytearray) or a Base64 URI. You can try something like:
# Note: open() expects a file path. If you already hold the raw bytes,
# skip the open() step and pass bytearray(those_bytes) instead.
with open(img_src_str, "rb") as imageFile:
    f = imageFile.read()
    img_src = bytearray(f)
cloudinary_response = cloudinary.uploader.upload(
    img_src,
    ...
)
That API can upload bytes, so if you have an _io.BytesIO object, you can just call its .getvalue() method, e.g. uploader.upload(image_stream.getvalue(), public_id=filename).
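To make the two answers above concrete, here is a minimal Python 3 sketch of turning whatever the blob API returns into either raw bytes or a Base64 data URI. The helper names to_upload_bytes and to_data_uri and the image/jpeg default are assumptions made for illustration; they are not part of the Cloudinary SDK, and the upload call itself is only shown in a comment.

```python
import base64
from io import BytesIO


def to_upload_bytes(source):
    """Return raw bytes from bytes, a bytearray, or a binary file-like object."""
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)
    if hasattr(source, "getvalue"):  # e.g. io.BytesIO
        return source.getvalue()
    return source.read()  # any other binary file-like object


def to_data_uri(image_bytes, mime_type="image/jpeg"):
    """Build a Base64 data URI from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:{};base64,{}".format(mime_type, encoded)


raw = to_upload_bytes(BytesIO(b"hi"))
uri = to_data_uri(raw, "image/png")
# Either value could then be handed to the uploader as the file argument,
# for example: cloudinary.uploader.upload(raw) or cloudinary.uploader.upload(uri)
```

The point of the helper is that a "/blob" response never reaches the uploader as a plain str that could be mistaken for a URL.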
Below is the API response I'm getting:
{"contentType":"image/jpeg","createdTime":"2021-10-10T11:00:47.000Z","fileName":"Passport_Chris J Passport Color - pp.jpg","id":10144,"size":105499,"updatedTime":"2021-10-10T11:00:47.000Z","links":[{"rel":"self","href":"https://dafzprod.custhelp.com/services/rest/connect/v1.4/CompanyRegd.ManagerDetails/43/FileAttachments/10144?download="},{"rel":"canonical","href":"https://dafzprod.custhelp.com/services/rest/connect/v1.4/CompanyRegd.ManagerDetails/43/FileAttachments/10144"},{"rel":"describedby","href":"https://dafzprod.custhelp.com/services/rest/connect/v1.4/metadata-catalog/CompanyRegd.ManagerDetails/FileAttachments","mediaType":"application/schema+json"}]}
I need to save this file locally as a JPG. Could you please provide a solution in Python?
You might have to decode the JSON-string (if not already done):
import json
json_decoded = json.loads(json_string)
Afterwards you can get the URL to retrieve and the filename from this JSON structure:
url = json_decoded['links'][0]['href']
local_filename = json_decoded['fileName']
Now you can download the file and save it (as seen in "How to save an image locally using Python whose URL address I already know?"):
import urllib.request
urllib.request.urlretrieve(url, local_filename)
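Indexing links[0] relies on the "self" link coming first. A slightly more defensive sketch picks the link by its rel value instead. The helper names pick_link and save_attachment and the example.com URLs are made up for illustration, and the download function is defined but not called here because it performs a real HTTP request.

```python
import json
import urllib.request


def pick_link(attachment, rel="self"):
    """Return the href of the first link whose rel matches, or None."""
    for link in attachment.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def save_attachment(attachment, rel="self"):
    """Download the attachment and save it under its own fileName."""
    url = pick_link(attachment, rel)
    if url is None:
        raise ValueError("no link with rel=%r" % rel)
    # Performs a real HTTP request; the server may require authentication.
    local_filename, _headers = urllib.request.urlretrieve(url, attachment["fileName"])
    return local_filename


# Trimmed-down, made-up response in the same shape as the one in the question.
sample = json.loads(
    '{"fileName": "photo.jpg", "links": ['
    '{"rel": "canonical", "href": "https://example.com/files/1"}, '
    '{"rel": "self", "href": "https://example.com/files/1?download="}]}'
)
download_url = pick_link(sample)
```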
During a file upload, I decided to read the file and save it as base64 until S3 becomes available to our team. I use the code below to convert the file to base64.
import base64

def upload_file_handler(file):
    """file -> uploaded bytestream from django"""
    bs4 = base64.b64encode(file.read())
    return {'binary': bs4, 'name': file.name}
I store the base64 output from the above as a str in a database. Now the challenge is getting the file back and uploading it to S3.
I attempted to run base64.b64decode on the string from the database and write the result to a file, but when I open the file it seems broken. I have tried this without a breakthrough.
q = Report.objects.first()
data = q.report_binary
f = base64.b64decode(data)
content_file = ContentFile(f, name="hello.docx")
instance = TemporaryFile(image=content_file)
instance.save()
This is one of the files i am trying to recreate from the binary.
https://gist.github.com/saviour123/38300b3ff2c7a0d1a01c15332c583e20
How can I generate the file from the base64 binary?
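For reference, a minimal round-trip sketch in Python 3, under the assumption that the Base64 value is stored as plain ASCII text. The helper names encode_for_db and decode_from_db are hypothetical. One possible cause of a broken file (a guess, not confirmed by the question) is storing str(bytes_value), which in Python 3 yields the literal text "b'...'" rather than the Base64 characters themselves.

```python
import base64
import os
import tempfile


def encode_for_db(raw_bytes):
    """Encode file bytes as an ASCII str that can go into a text column."""
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_from_db(stored_text):
    """Turn the stored Base64 text back into the original file bytes."""
    return base64.b64decode(stored_text)


original = b"PK\x03\x04 pretend this is a .docx file"
stored = encode_for_db(original)    # what would go into the database
restored = decode_from_db(stored)   # what comes back out

# Write in binary mode ("wb") because the decoded data is bytes, not text.
out_path = os.path.join(tempfile.mkdtemp(), "hello.docx")
with open(out_path, "wb") as fh:
    fh.write(restored)
```

If the restored bytes match the original here but not in the application, the storage step between encoding and decoding is the place to look.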
I've used the PyMuPDF library to parse the content of a specific page of a local PDF file, and it works. However, when I apply the same logic to a PDF file that is available online, I encounter an error.
The following script works (local PDF):
import fitz
path = r'C:\Users\WCS\Desktop\pymupdf\Regular Expressions Cookbook.pdf'
doc = fitz.open(path)
page1 = doc.loadPage(5)
page1text = page1.getText("text")
print(page1text)
The script below throws an error (pdf that is available online):
import fitz
import requests
URL = 'https://buildmedia.readthedocs.org/media/pdf/pdfminer-docs/latest/pdfminer-docs.pdf'
res = requests.get(URL)
doc = fitz.open(res.content)
page1 = doc.loadPage(5)
page1text = page1.getText("text")
print(page1text)
Error that the script encounters:
Traceback (most recent call last):
File "C:\Users\WCS\AppData\Local\Programs\Python\Python37-32\general_demo.py", line 8, in <module>
doc = fitz.open(res.content)
File "C:\Users\WCS\AppData\Local\Programs\Python\Python37-32\lib\site-packages\fitz\fitz.py", line 2010, in __init__
_fitz.Document_swiginit(self, _fitz.new_Document(filename, stream, filetype, rect, width, height, fontsize))
RuntimeError: cannot open b'%PDF-1.5\n%\xd0\xd4\xc5\xd8\n1 0 obj\n<<\n/Length 843 \n/Filter /FlateDecode\n>>\nstream\nx\xdamUMo\xe20\x10\xbd\xe7Wx\x0f\x95\xda\x03\xc5N\xc8W\x85\x90\x9c\x84H\x1c\xb6\xad\nZ\xed\x95&\xa6\x8bT\x12\x14\xe0\xd0\x7f\xbf~3\x13\xda\xae\xf
How can I read the content directly from the online PDF?
Looks like you need to initialize the object with stream:
>>> # from memory
>>> doc = fitz.open(stream=mem_area, filetype="pdf")
mem_area has the data of the document.
https://pymupdf.readthedocs.io/en/latest/document.html#Document
I think you were missing the read() call: it reads the file into bytes, which PyMuPDF can then consume through the stream argument.
with fitz.open(stream=uploaded_pdf.read(), filetype="pdf") as doc:
    text = ""
    for page in doc:
        text += page.getText()
    print(text)
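Applied to the online case from the question, a rough sketch could look like this. The function names looks_like_pdf and page_text_from_bytes are assumptions made for illustration, and the header check is just an optional sanity test before the bytes are handed to PyMuPDF. The download itself is left in comments because it needs network access and the requests package.

```python
def looks_like_pdf(data):
    """Cheap sanity check: PDF files normally start with the bytes %PDF-."""
    return data[:5] == b"%PDF-"


def page_text_from_bytes(data, page_number):
    """Open an in-memory PDF with PyMuPDF and return one page's text."""
    import fitz  # PyMuPDF, third-party; imported lazily so the check above works without it

    if not looks_like_pdf(data):
        raise ValueError("downloaded data does not look like a PDF")
    with fitz.open(stream=data, filetype="pdf") as doc:
        # Newer PyMuPDF releases name these load_page/get_text;
        # older ones use loadPage/getText as in the question.
        return doc.load_page(page_number).get_text("text")


# Usage (needs network access plus the requests and PyMuPDF packages):
# import requests
# res = requests.get(URL)
# print(page_text_from_bytes(res.content, 5))

header_ok = looks_like_pdf(b"%PDF-1.5\n%\xd0\xd4\xc5\xd8\n")
```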
I have a similar issue to "How to upload a bytes image on Google Cloud Storage from a Python script".
I tried this:
from google.cloud import storage
import cv2
from tempfile import TemporaryFile
import google.auth
credentials, project = google.auth.default()
client = storage.Client()
# https://console.cloud.google.com/storage/browser/[bucket-id]/
bucket = client.get_bucket('document')
# Then do other things...
image = cv2.imread('/Users/santhoshdc/Documents/Realtest/15.jpg')
with TemporaryFile() as gcs_image:
    image.tofile(gcs_image)
    blob = bucket.get_blob(gcs_image)
    print(blob.download_as_string())
    blob.upload_from_string('New contents!')
    blob2 = bucket.blob('document/operations/15.png')
    blob2.upload_from_filename(filename='gcs_image')
This is the error that pops up:
> Traceback (most recent call last):
>   File "/Users/santhoshdc/Documents/ImageShapeSize/imageGcloudStorageUpload.py", line 13, in <module>
>     blob = bucket.get_blob(gcs_image)
>   File "/Users/santhoshdc/.virtualenvs/test/lib/python3.6/site-packages/google/cloud/storage/bucket.py", line 388, in get_blob
>     **kwargs)
>   File "/Users/santhoshdc/.virtualenvs/test/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 151, in __init__
>     name = _bytes_to_unicode(name)
>   File "/Users/santhoshdc/.virtualenvs/test/lib/python3.6/site-packages/google/cloud/_helpers.py", line 377, in _bytes_to_unicode
>     raise ValueError('%r could not be converted to unicode' % (value,))
> ValueError: <_io.BufferedRandom name=7> could not be converted to unicode
Can anyone tell me what's going wrong or what I'm doing incorrectly?
As suggested by @A.Queue (link gets deleted after 29 days):
from google.cloud import storage
import cv2
from tempfile import TemporaryFile
client = storage.Client()
bucket = client.get_bucket('test-bucket')
image = cv2.imread('example.jpg')
with TemporaryFile() as gcs_image:
    image.tofile(gcs_image)
    gcs_image.seek(0)
    blob = bucket.blob('example.jpg')
    blob.upload_from_file(gcs_image)
The file got uploaded, but a raw NumPy ndarray does not get saved as an image file on Google Cloud Storage.
PS:
The NumPy array has to be converted into an image format before saving.
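One way to do that conversion entirely in memory is sketched below, under stated assumptions: encode_jpeg wraps OpenCV's cv2.imencode, and upload_image_bytes passes the encoded bytes to the blob's upload_from_string with a content type. Both function names are hypothetical helpers, not part of either library, and the bucket object is expected to come from storage.Client().get_bucket(...) as in the code above.

```python
def encode_jpeg(image):
    """Encode a NumPy image array (as returned by cv2.imread) to JPEG bytes."""
    import cv2  # third-party (opencv-python); imported lazily

    ok, buffer = cv2.imencode(".jpg", image)
    if not ok:
        raise ValueError("OpenCV could not encode the image as JPEG")
    return buffer.tobytes()


def upload_image_bytes(bucket, blob_name, data, content_type="image/jpeg"):
    """Upload already-encoded image bytes to the given bucket."""
    blob = bucket.blob(blob_name)
    blob.upload_from_string(data, content_type=content_type)
    return blob


# Usage with the names from the question (not executed here):
# image = cv2.imread('example.jpg')
# upload_image_bytes(bucket, 'example.jpg', encode_jpeg(image))
```

Compared with image.tofile(), which dumps the raw pixel array, the encoded bytes form a real JPEG file that image viewers can open.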
This is fairly simple: use a temporary file to store the image. Here's the code.
from tempfile import NamedTemporaryFile

with NamedTemporaryFile() as temp:
    # Build a .jpg file name from the temp file's name
    # (note: this is a separate file from temp and is not removed automatically)
    iName = "".join([str(temp.name), ".jpg"])
    # Save the image to that file
    cv2.imwrite(iName, duplicate_image)
    # Store the image file inside the bucket
    blob = bucket.blob('ImageTest/Example1.jpg')
    blob.upload_from_filename(iName, content_type='image/jpeg')
    # Get the public_url of the saved image
    url = blob.public_url
You are calling blob = bucket.get_blob(gcs_image), which makes no sense. get_blob() expects a string argument, namely the name of the blob you want to get, but you are passing a file object.
I propose this code:
with TemporaryFile() as gcs_image:
    image.tofile(gcs_image)
    gcs_image.seek(0)
    blob = bucket.blob('documentation-screenshots/operations/15.png')
    blob.upload_from_file(gcs_image)
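The gcs_image.seek(0) line is easy to overlook. After writing, the file position sits at the end of the data, so anything that reads from the current position gets nothing. A small standard-library sketch of that effect, with a made-up payload:

```python
from tempfile import TemporaryFile

payload = b"fake image bytes"

with TemporaryFile() as tmp:
    tmp.write(payload)
    # The file position is now at the end, so reading yields nothing.
    without_seek = tmp.read()
    # Rewind to the start, as gcs_image.seek(0) does above.
    tmp.seek(0)
    with_seek = tmp.read()
```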
I have been trying to put an image into a PDF file using PyMuPDF / Fitz. Everywhere I look on the internet I find the same syntax, but when I use it I get a runtime error.
>>> doc = fitz.open("NewPDF.pdf")
>>> page = doc[1]
>>> rect = fitz.Rect(0,0,880,1080)
>>> page.insertImage(rect, filename = "Image01.jpg")
error: object is not a stream
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\fitz\fitz.py", line 1225, in insertImage
return _fitz.Page_insertImage(self, rect, filename, pixmap, overlay)
RuntimeError: object is not a stream
>>> page
page 1 of NewPDF.pdf
I've tried a few different variations on this, with pixmap and without, with overlay value set, and without. The PDF file exists and can be opened with Adobe Acrobat Reader, and the image file exists - I have tried PNG and JPG.
Thank you in advance for any help.
Just some hints to try:
Ensure that your "Image01.jpg" file can be opened and use the full path.
from PIL import Image  # assumption: Image here is Pillow's Image module
image_path = "/full/path/to/Image01.jpg"
image_file = Image.open(
    open(image_path, 'rb'))
# side-note: generally it is better to use the open with syntax, see link below
# https://stackoverflow.com/questions/9282967/how-to-open-a-file-using-the-open-with-statement
To make sure you are actually on the PDF page you expect, try this. This code inserts the image only on the first page:
for page in doc:
    page.insertImage(rect, filename=image_path)
    break  # Without this, the image will appear on each page of your pdf
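The "use the full path" hint can be checked up front with the standard library. The sketch below resolves a path and fails early if the file is missing; resolve_existing_file is a hypothetical helper name, and the temporary file only exists so the example has something to resolve.

```python
import tempfile
from pathlib import Path


def resolve_existing_file(path):
    """Return the absolute path as a str, or raise if the file is missing."""
    full_path = Path(path).expanduser().resolve()
    if not full_path.is_file():
        raise FileNotFoundError("no such file: {}".format(full_path))
    return str(full_path)


# Demonstration with a throwaway file standing in for Image01.jpg.
with tempfile.NamedTemporaryFile(suffix=".jpg") as tmp:
    resolved = resolve_existing_file(tmp.name)

# Usage with the names from the question (not executed here):
# image_path = resolve_existing_file("Image01.jpg")
# page.insertImage(rect, filename=image_path)
```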