I am using gfx to convert a particular page in a PDF to a .png image, but the resulting image is of very poor quality. I need to use gfx and can't use any other module. The code I used is:
import gfx
pdf_loc = r"C:\new.pdf"  # raw string, so "\n" is not read as a newline
page_number = 12
doc = gfx.open("pdf", pdf_loc)
page = doc.getPage(page_number)
img = gfx.ImageList()
img.setparameter("antialise", "1")  # turn on antialiasing
img.startpage(page.width, page.height)
page.render(img)
img.endpage()
output_loc = r"C:\newimg.png"
img.save(output_loc)
You can use swfrender, or add this line to your script:
gfx.setparameter("zoom", "400")
You can learn more at http://wiki.swftools.org/wiki/Python_gfx_tutorial
Related
I have two PDFs, file.pdf and BACKGROUND.pdf. I want to use BACKGROUND.pdf as the background for file.pdf with Python.
I don't know where to start; I'm a beginner Python developer.
Maybe you could convert your first PDF to an image using this :
https://www.geeksforgeeks.org/convert-pdf-to-image-using-python/
# import module
from pdf2image import convert_from_path
# Convert each page of the PDF to a PIL image
images = convert_from_path('example.pdf')
for i, image in enumerate(images):
    # Save each page as a JPEG file
    image.save('page' + str(i) + '.jpg', 'JPEG')
Then use the image as background for the second file.
Or use another utility:
How to programmatically add a background to a pdf?
qpdf --underlay "background.pdf" -- file.pdf output.pdf
Should work for most cases.
Python users can use https://github.com/pikepdf/pikepdf, a wrapper around qpdf.
The documentation is at https://pikepdf.readthedocs.io/en/latest/
For this case, see especially the section "Overlays, underlays, watermarks, n-up".
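Since the question asks for Python, the qpdf command above can also be driven from a script. This is only a minimal sketch: the function names and the `repeat_first_page` switch are hypothetical, it assumes the qpdf executable is on the PATH, and the `--repeat=1` option (meant to reuse a one-page background under every page) should be checked against the qpdf manual for your version.

```python
import subprocess


def build_underlay_command(content_pdf, background_pdf, output_pdf,
                           repeat_first_page=False):
    """Build the qpdf argument list that places background_pdf under content_pdf."""
    cmd = ["qpdf", "--underlay", background_pdf]
    if repeat_first_page:
        # Assumption: reuse page 1 of the background for every page of the content.
        cmd.append("--repeat=1")
    cmd += ["--", content_pdf, output_pdf]
    return cmd


def add_background(content_pdf, background_pdf, output_pdf, repeat_first_page=False):
    """Run qpdf; raises CalledProcessError if qpdf exits with a non-zero status."""
    cmd = build_underlay_command(content_pdf, background_pdf, output_pdf,
                                 repeat_first_page)
    subprocess.run(cmd, check=True)
```

Calling `add_background("file.pdf", "BACKGROUND.pdf", "output.pdf")` would run the same command as shown above.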
I have attempted to use PyMuPDF to convert a PDF document to an image so that I can use it in OpenCV. However, I get an attribute error when I try to save the image, and I'm not sure how to get around it.
import fitz
pdf = fitz.open('cornwall.pdf')
page = pdf.load_page(0)
pix = page.get_pixmap()
pix.writeImage("cornwall_output.png")
AttributeError: 'Pixmap' object has no attribute 'writeImage'
Use the pil_save method instead:
https://pymupdf.readthedocs.io/en/latest/pixmap.html#Pixmap.pil_save
import fitz
pdf = fitz.open('cornwall.pdf')
page = pdf.load_page(0)
pix = page.get_pixmap()
pix.pil_save("cornwall_output.png")
# optional arg in this method:
# optimize=True
There is a standard way to save a PyMuPDF Pixmap: pix.save(). There is a handful of possible image formats available in this case: PNG, PSD (Adobe Photoshop), PS (Postscript) and the less popular PAM, PBM, PGM, PNM, PPM. Use pix.pil_save() instead only if you need more alternatives (e.g. JPEG) or special features offered by Pillow.
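Because the original goal was to use the page in OpenCV, saving to disk may not be needed at all. The sketch below is one possible approach, not something from the answers above: it assumes NumPy is installed and that the pixmap is 8-bit RGB or RGBA, and the helper name `samples_to_bgr` is hypothetical. It takes the raw values of `pix.samples`, `pix.width`, `pix.height` and `pix.n` and returns an array in the BGR channel order that OpenCV expects.

```python
import numpy as np


def samples_to_bgr(samples, width, height, n):
    """Turn raw pixmap bytes (n channels per pixel, RGB first) into a BGR array."""
    if n < 3:
        raise ValueError("expected at least 3 channels (RGB)")
    # One row per image line, one column per pixel, n values per pixel.
    pixels = np.frombuffer(samples, dtype=np.uint8).reshape(height, width, n)
    # Drop any alpha channel, then reverse RGB to BGR.
    bgr = pixels[..., :3][..., ::-1]
    # Copy into a contiguous, writable array.
    return np.ascontiguousarray(bgr)
```

With a pixmap from the code above, the call would be `samples_to_bgr(pix.samples, pix.width, pix.height, pix.n)`.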
I have a set of many songs, some of which have png images in metadata, and I need to convert these to jpg.
I know how to convert png images to jpg in general, but I am currently accessing metadata using eyed3, which returns ImageFrame objects, and I don't know how to manipulate these. I can, for instance, access the image type with
print(img.mime_type)
which returns
image/png
but I don't know how to progress from here. Very naively I tried loading the image with OpenCV, but it is either not a compatible format or I didn't do it properly. And anyway I wouldn't know how to update the old image with the new one either!
Note: While I am currently working with eyed3, it is perfectly fine if I can solve this any other way.
I was finally able to solve this, although in a not very elegant way.
The first step is to load the image. For some reason I could not make this work with eyed3, but TinyTag does the job:
import io
from PIL import Image
from tinytag import TinyTag
tag = TinyTag.get(mp3_path, image=True)
image_data = tag.get_image()
img_bytes = io.BytesIO(image_data)  # wrap the raw bytes in a file-like object
photo = Image.open(img_bytes)
Then I manipulate it. For example we may resize it and save it as jpg. Because we are using Pillow (PIL) for these operations, we actually need to save the image and finally load it back to get the binary data (this detail is probably what should be improved in the process).
photo = photo.resize((500, 500)) # suppose we want 500 x 500 pixels
rgb_photo = photo.convert("RGB")
rgb_photo.save(temp_file_path, format="JPEG")
The last step is to load the image and set it as metadata. There are more details about this step in this answer:
import eyed3
audio_file = eyed3.load(mp3_path)  # this may already have been loaded earlier
with open(temp_file_path, "rb") as f:
    audio_file.tag.images.set(3, f.read(), "image/jpeg")  # 3 = front cover
audio_file.tag.save()
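The temporary file is probably avoidable, because Pillow can save into an in-memory buffer. The following is a minimal sketch of that idea; the helper name `to_jpeg_bytes` and its `size` and `quality` defaults are assumptions for illustration only.

```python
import io

from PIL import Image


def to_jpeg_bytes(image_data, size=(500, 500), quality=90):
    """Convert raw image bytes (e.g. PNG) into resized JPEG bytes, all in memory."""
    with Image.open(io.BytesIO(image_data)) as photo:
        # JPEG has no alpha channel, so convert to RGB before saving.
        rgb_photo = photo.convert("RGB").resize(size)
    buffer = io.BytesIO()
    rgb_photo.save(buffer, format="JPEG", quality=quality)
    return buffer.getvalue()
```

The returned bytes could then be passed to `audio_file.tag.images.set(3, jpeg_bytes, "image/jpeg")` in place of the file contents.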
I wrote code that takes a screenshot, which I want to paste into a Word document using docx. So far I have to save the image as a PNG file first. The relevant part of my code is:
from docx import Document
import pyautogui
doc = Document()
images = []
img = pyautogui.screenshot(region=(left, top, width, height))  # placeholder for the region to capture
images.append(img)
img.save("imagepath.png")
doc.add_picture("imagepath.png")
I would like to be able to add the image without saving it. Is it possible to do this using docx?
Yes, according to add_picture — Document objects — python-docx 0.8.10 documentation, add_picture can import data from a stream as well.
As per Screenshot Functions — PyAutoGUI 1.0.0 documentation, screenshot() produces a PIL/Pillow image object which can be save()'d with a BytesIO() as destination to produce a compressed image data stream in memory.
So that'll be:
import io
imdata = io.BytesIO()
img.save(imdata, format='png')
imdata.seek(0)
doc.add_picture(imdata)
del imdata # cannot reuse it for other pictures, you need a clean buffer each time
# can use .truncate(0) then .seek(0) instead but this is probably easier
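For completeness, here is roughly what the `.truncate(0)` then `.seek(0)` alternative mentioned in the comment looks like when one buffer is reused for several pictures. It uses only the standard library; the helper name `reset_buffer` is hypothetical.

```python
import io


def reset_buffer(buffer):
    """Empty a BytesIO object so that it can be reused for the next image."""
    buffer.truncate(0)  # discard the old contents
    buffer.seek(0)      # move the write position back to the start
    return buffer


buf = io.BytesIO()
buf.write(b"first image data")
reset_buffer(buf)
buf.write(b"second image data")
buf.seek(0)  # rewind before handing the buffer to a reader
```

Both calls are needed: `truncate(0)` alone leaves the position where the previous write ended, so the next write would start at that old offset instead of at byte 0.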
I'm trying to convert a PDF's first page to an image. However, the PDF is coming straight from the database in a base64 format. I then convert it to a blob. I want to know if it's possible to convert the first page of the PDF to an image within my Python code.
I'm familiar with being able to use filename in the Image object:
with Image(filename="test.pdf[0]") as img:
The issue I'm facing is there is not an actual filename, just a blob. This is what I have so far, any suggestions would be appreciated.
x = object['file']
fileBlob = base64.b64decode(x)  # decode the variable x, not the literal string 'x'
with Image(**what do I put here for pdf blob?**) as img:
    more code
This works for me:
all_pages = Image(blob=blob_pdf) # PDF will have several pages.
single_image = all_pages.sequence[0] # Just work on first page
with Image(single_image) as i:
    ...
The documentation mentions a blob parameter.
So it should be:
with Image(blob=fileBlob):
    # etc.
I didn't test that, but I think this is what you are after.
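Before handing the blob to the image library, it may help to confirm that the base64 decoding step really produced PDF data. This is a small standard-library sketch; the helper name `decode_pdf_blob` is hypothetical, and the `%PDF` check is only a rough sanity test, not full validation.

```python
import base64


def decode_pdf_blob(encoded):
    """Decode a base64 string and check that the result looks like a PDF."""
    blob = base64.b64decode(encoded)
    # PDF files normally begin with the marker "%PDF".
    if not blob.startswith(b"%PDF"):
        raise ValueError("decoded data does not look like a PDF")
    return blob
```

The result can then be passed as `Image(blob=...)`, as suggested in the answers above.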