Embed one pdf into another pdf using PyMuPDF - python

I need help embedding one PDF file in another, so that when I open the attachments section of the second file I can see and open the first. I would like to do this with PyMuPDF. I found the embeddedFileAdd method for this, but I am not sure how to use it.

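Just solved it with this code:
import fitz  # PyMuPDF
pdf1 = r'C:\Users\Amit PC\Desktop\pdf1.pdf'  # host document
pdf2 = r'C:\Users\Amit PC\Desktop\pdf2.pdf'  # file to attach
outfile = r'C:\Users\Amit PC\Desktop\test2.pdf'
with open(pdf2, 'rb') as f:
    attachment = bytearray(f.read())  # raw bytes of the PDF to embed
doc1 = fitz.open(pdf1)
doc1.embeddedFileAdd(attachment, 'attach.pdf')  # listed as 'attach.pdf' among the attachments
doc1.save(outfile, deflate=True)
doc1.close()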

Related

How to read and print the contents of a ttf file?

Is there any way that I can open, read and write a ttf file?
Example:
with open('xyz.ttf') as f:
    content = f.readline()
print(content)
A bit more:
If I open a .ttf (font) file with windows font viewer we see the following image
From this I like to extract following lines as text, with proper style.
What exactly is inside this *.ttf file? I think you need to add more details about the input and the expected output. If you are referring to a font file, you must first find a module/package that can open and read it, since *.ttf isn't a normal text file.
Read the given links and install the required packages first:
https://pypi.python.org/pypi/FontTools
Then, as suggested:
from fontTools.ttLib import TTFont
font = TTFont('/path/to/font.ttf')
print(font)
<fontTools.ttLib.TTFont object at 0x10c34ed50>
If you need help with something else, try posting the input and the expected output.
Other links:
http://www.starrhorne.com/2012/01/18/how-to-extract-font-names-from-ttf-files-using-python-and-our-old-friend-the-command-line.html
Here is another useful Python script:
https://gist.github.com/pklaus/dce37521579513c574d0
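To illustrate why a .ttf file cannot be read as text, here is a minimal sketch that parses the binary table directory at the start of a TrueType file using only the standard library. The function name read_table_directory is made up for this example, and the sketch covers only the directory, not the contents of the tables; fontTools remains the practical choice for real work.

```python
import struct


def read_table_directory(data):
    """Return (sfnt_version, {tag: (offset, length)}) from raw font bytes.

    Hypothetical helper for illustration: it reads only the table
    directory and does not validate checksums or table contents.
    """
    # Offset table: 4-byte version, then a 2-byte table count (big-endian).
    sfnt_version, num_tables = struct.unpack(">IH", data[:6])
    tables = {}
    for index in range(num_tables):
        # Table records are 16 bytes each and start after the 12-byte header.
        start = 12 + index * 16
        tag, _checksum, offset, length = struct.unpack(
            ">4sIII", data[start:start + 16]
        )
        tables[tag.decode("latin-1")] = (offset, length)
    return sfnt_version, tables
```

With a real font, the bytes would come from a file opened in binary mode, for example open('xyz.ttf', 'rb').read(). The human-readable names shown by a font viewer live in the table tagged "name".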

Django the powerpoint generated using python-pptx library has error message

I use python-pptx v0.6.2 to generate a PowerPoint file. I read an existing presentation into BytesIO, make some modifications, and save it. The file downloads successfully, and I'm sure the content is written to it. But when I open it, PowerPoint pops up an error message: "Powerpoint found a problem with content in foo.pptx. Powerpoint can attempt to repair the presentation." I then have to click the "Repair" button, and the presentation opens in "repaired" mode. My Python version is 3.5.2 and my Django version is 1.10. Below is my code:
with open('foo.pptx', 'rb') as f:
    source_stream = BytesIO(f.read())
prs = Presentation(source_stream)
first_slide = prs.slides[0]
title = first_slide.shapes.title
subtitle = first_slide.placeholders[1]
title.text = 'Title'
subtitle.text = "Subtitle"
response = HttpResponse(content_type='application/vnd.ms-powerpoint')
response['Content-Disposition'] = 'attachment; filename="sample.pptx"'
prs.save(source_stream)
ppt = source_stream.getvalue()
source_stream.close()
response.write(ppt)
return response
Any help is appreciated, thanks in advance!
It looks like you've got problems with the IO.
The first three lines can be replaced by:
prs = Presentation('foo.pptx')
Placing the file into a memory-based stream just uses unnecessary resources.
On the writing side, you're writing to that original (unnecessary) stream, which is dicey. I suspect that because you didn't seek(0), you're appending onto the end of it. Reusing the stream is also conceptually more complicated to deal with.
If you use a fresh BytesIO buffer for the save I think you'll get the proper behavior. It's also better practice because it decouples the open, modify, and save, which you can then factor into separate methods later.
If you eliminate the first BytesIO you should just need the one for the save in order to get the .pptx "file" into the HTTP response.
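The effect of reusing the stream can be sketched with the standard library alone. A .pptx file is a zip package, so this example uses zipfile as a stand-in for the save step; it assumes python-pptx writes the package in a comparable way, and the helper write_package and the member name slide1.xml are made up for illustration.

```python
import io
import zipfile


def write_package(buffer, member, payload):
    """Write a one-member zip archive into buffer at its current position."""
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(member, payload)


# Stand-in for the source presentation loaded into memory.
source_stream = io.BytesIO()
write_package(source_stream, "slide1.xml", b"x" * 1000)
original_size = len(source_stream.getvalue())

# Reading leaves the stream position somewhere inside the old data.
with zipfile.ZipFile(source_stream) as archive:
    archive.read("slide1.xml")

# Saving into the same stream: the old bytes are never truncated.
write_package(source_stream, "slide1.xml", b"y")
reused_size = len(source_stream.getvalue())

# Saving into a fresh buffer: only the new package is present.
fresh_stream = io.BytesIO()
write_package(fresh_stream, "slide1.xml", b"y")
fresh_size = len(fresh_stream.getvalue())
```

The reused stream still carries bytes from the original package, while the fresh buffer holds only the new one. In the view, the same idea means saving the presentation into a new BytesIO and writing that buffer's getvalue() to the response.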

post text from files into a web page form using python

I have a set of text files. I need to input them one after the other to a web server. I know how to input text using mechanize but have no idea how to extract text from files stored on computer and input them one after the other. In other words, say I have 10 files on my hard disk, I need to post text from one file, submit, then post another file and the process should go on until all the files are posted. Please help me with suggestions.
Thank you.
First make a for loop to iterate through your files. Then read the files and encode them into a POST request with urllib and urllib2. You need to change the url, filename pattern, and form fields accordingly.
import glob
import urllib
import urllib2
url = "http://www.example.com/form"
for filename in glob.glob("file*.txt"):
    filedata = open(filename).read()
    data = urllib.urlencode({'data' : filedata})
    urllib2.urlopen(url=url, data=data)
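The code above targets Python 2. In Python 3 the same functionality lives in urllib.request and urllib.parse, so a rough equivalent might look like the sketch below. The helper names build_post_request and post_files, the UTF-8 encoding, and the default form field name "data" are assumptions for illustration; adjust them to match the real form.

```python
import glob
import urllib.parse
import urllib.request


def build_post_request(url, text, field="data"):
    """Build a form-encoded POST request carrying text in one form field."""
    body = urllib.parse.urlencode({field: text}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def post_files(url, pattern):
    """Post the contents of every file matching pattern, one request per file."""
    for filename in sorted(glob.glob(pattern)):
        # Assumes the text files are UTF-8 encoded.
        with open(filename, encoding="utf-8") as handle:
            text = handle.read()
        with urllib.request.urlopen(build_post_request(url, text)) as response:
            response.read()  # drain the reply before moving to the next file
```

Calling post_files("http://www.example.com/form", "file*.txt") would then submit the files one after the other, in sorted name order.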

How to insert annotation into pdf with Python

I want to add text or annotations to an existing PDF file to explain some keywords.
At first I tried pyPdf and reportlab to merge the original PDF file with a newly generated interpretation PDF file, but it doesn't work: the original file covers all the words of the interpretation PDF, so the new content is invisible. I don't know why. If I merge two newly generated interpretation PDF files into one, it works well.
So I am thinking of trying another way: inserting just an annotation into the existing PDF file with Python. Can anybody with related experience give me a suggestion? Thanks!
Adding a watermark to existing pdf using PyPDF certainly works for me:
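from pyPdf import PdfFileReader, PdfFileWriter
template = PdfFileReader(open("template.pdf", "rb"))  # template pdf
new = PdfFileReader(open("new.pdf", "rb"))  # pdf to overlay; file name is a placeholder
output = PdfFileWriter()  # writer for the merged pdf
for i in range(new.getNumPages()):
    page = template.getPage(i)
    page.mergePage(new.getPage(i))  # draw the new page on top of the template page
    output.addPage(page)
with open("merged.pdf", "wb") as out:  # output file name is a placeholder
    output.write(out)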
Read my other SO answer for reference.
Read my complete article to know more about pdf generation and merging in python.

how to insert a string to pdf using pypdf?

Sorry, I'm a noob in Python.
I need to create a PDF file without using an existing PDF file (purely create a new one).
I have been googling, and most results are about merging two PDFs or creating a new file from a particular page of another file. What I want to achieve is a report page (with charts), but as a simple first step: how do I insert a string into my PDF file (maybe "hello world")?
This is my code to make a new PDF file with a single blank page:
from pyPdf import PdfFileReader, PdfFileWriter
op = PdfFileWriter()
# here to add blank page
op.addBlankPage(200,200)
#how to add string here, and insert it to my blank page ?
ops = file("document-output.pdf", "wb")
op.write(ops)
ops.close()
You want "pisa" or "reportlab" for generating arbitrary PDF documents, not "pypdf".
http://www.xhtml2pdf.com/doc/pisa-en.html
http://www.reportlab.org
Also check out the pyfpdf library. I've used the php port of this library for a few years and it's quite flexible, allowing you to work with flowable text, lines, rectangles, and images.
http://code.google.com/p/pyfpdf
