I am using PyPDF2 to take an input PDF of any paper size and convert it to a PDF of A4 size with the input PDF scaled and fit in the centre of the output pdf.
Here's an example of an input (convert to pdf with imagemagick convert image.png input.pdf), which can be of any dimensions:
And the expected output is:
I'm not a developer and my knowledge of Python is basic. I have been trying to figure this out from the documentation, but haven't had much success.
My latest attempt is as follows:
from pypdf import PdfReader, PdfWriter, Transformation, PageObject
from pypdf import PaperSize
pdf_reader = PdfReader("input.pdf")
page = pdf_reader.pages[0]
writer = PdfWriter()
A4_w = PaperSize.A4.width
A4_h = PaperSize.A4.height
# resize page2 to fit *inside* A4
h = float(page.mediabox.height)
w = float(page.mediabox.width)
print(A4_h, h, A4_w, w)
scale_factor = min(A4_h / h, A4_w / w)
print(scale_factor)
transform = Transformation().scale(scale_factor, scale_factor).translate(0, A4_h / 3)
print(transform.ctm)
# page.scale_by(scale_factor)
page.add_transformation(transform)
# merge the pages to fit inside A4
# prepare A4 blank page
page_A4 = PageObject.create_blank_page(width=A4_w, height=A4_h)
page_A4.merge_page(page)
print(page_A4.mediabox)
writer.add_page(page_A4)
writer.write("output.pdf")
Which gives this output:
I could be completely off track with my approach, and it may be an inefficient way of doing it.
I was hoping there would be a simple function in the package where I can define the output paper size and the scaling factor, similar to this.
You almost got it!
The transformations are applied only to the content, but not to the boxes (mediabox/trimbox/cropbox/artbox/bleedbox).
You need to adjust the cropbox:
from pypdf.generic import RectangleObject
page.cropbox = RectangleObject((0, 0, A4_w, A4_h))
Full script
from pypdf import PdfReader, PdfWriter, Transformation, PageObject, PaperSize
from pypdf.generic import RectangleObject
reader = PdfReader("input.pdf")
page = reader.pages[0]
writer = PdfWriter()
A4_w = PaperSize.A4.width
A4_h = PaperSize.A4.height
# resize page to fit *inside* A4
h = float(page.mediabox.height)
w = float(page.mediabox.width)
scale_factor = min(A4_h/h, A4_w/w)
transform = Transformation().scale(scale_factor,scale_factor).translate(0, A4_h/3)
page.add_transformation(transform)
page.cropbox = RectangleObject((0, 0, A4_w, A4_h))
# merge the pages to fit inside A4
# prepare A4 blank page
page_A4 = PageObject.create_blank_page(width = A4_w, height = A4_h)
page.mediabox = page_A4.mediabox
page_A4.merge_page(page)
writer.add_page(page_A4)
writer.write('output.pdf')
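The fixed offset `translate(0, A4_h/3)` in the script above suits one particular input size. For an input of any dimensions, the offsets that centre the scaled page can be computed from the page and target sizes. The following is a minimal sketch in plain Python: `fit_and_centre` is a hypothetical helper (not part of pypdf), the A4 size of 595 x 842 points is an assumed default, and the input mediabox is assumed to start at (0, 0).

```python
def fit_and_centre(w, h, target_w=595.0, target_h=842.0):
    """Return (scale, tx, ty) that fits a w x h page inside the target
    and centres it. Hypothetical helper; all sizes are in PDF points."""
    # Largest uniform scale that keeps both dimensions inside the target
    scale = min(target_w / w, target_h / h)
    # Split the leftover space equally on both sides
    tx = (target_w - w * scale) / 2
    ty = (target_h - h * scale) / 2
    return scale, tx, ty


# Example: a square 100 x 100 pt page placed on A4
scale, tx, ty = fit_and_centre(100, 100)
```

The result would then replace the fixed values, for example `Transformation().scale(scale, scale).translate(tx, ty)`, assuming the chained calls apply the scale first and the translation afterwards.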
I'm trying to create a PDF with Python and I want to place text in the PDF character by character.
I can't work out how to do it: when the output PDF is saved, all of the characters are drawn on top of each other.
This is my code snippet:
from fpdf import FPDF
pdf = FPDF('P', 'mm', (100,100))
# Add a page
pdf.add_page()
# set style and size of font
# that you want in the pdf
pdf.add_font('ariblk', '', "ArialBlack.ttf", uni=True)
pdf.set_font("ariblk",size = int(50*0.8))
text = [['a','b','c','d','e','w','q'],['f','g','h','i','j','k','l']]
print("creating pdf...")
line = 0
for w in range(0,len(text)):
    for h in range(0,len(text[w])):
        # create a cell
        r = int (50)
        g = int (100)
        b = int (10)
        pdf.set_text_color(r, g, b)
        text_out = text[w][h]
        pdf.cell(0,line, txt = text_out, ln = 2)
# save the pdf with name .pdf
pdf.output(name = "img/output.pdf", dest='F')
print("pdf created!")
and this is what my code output is:
(this is copy-paste from the output pdf): iljfbeqaghdckw
(this is a screenshot of the output):
I don't know the fpdf module, but I think your problem comes only from the fact that you never change the X, Y coordinates at which each character is printed.
You have to use `pdf.set_xy()` to set the X and Y coordinates of each of your characters.
I made small changes to the font and colors for my tests.
from fpdf import FPDF
import random
pdf = FPDF('P', 'mm', (100,100))
# Add a page
pdf.add_page()
# set style and size of font
# that you want in the pdf
#pdf.add_font('ariblk', '', "ArialBlack.ttf", uni=True)
pdf.set_font("Arial",size = int(24))
text = [['a','b','c','d','e','w','q'],['f','g','h','i','j','k','l']]
print("creating pdf...")
line = 10
for w in range(len(text)):
    for h in range(len(text[w])):
        # create a cell
        r = random.randint(1, 255)
        g = random.randint(1, 255)
        b = random.randint(1, 255)
        pdf.set_text_color(r, g, b)
        text_out = text[w][h]
        pdf.set_xy(10*w, 10*h)
        pdf.cell(10, 10, txt=text_out, ln=0, align='C')
# save the pdf with name .pdf
pdf.output(name = "output.pdf", dest='F')
print("pdf created!")
Then you have to adapt the X and/or Y offsets according to the layout you want to obtain in print.
Remark: as you don't change the values of r, g and b inside your for loops, it is best to move the assignment of r, g and b to before the loops.
Output in the PDF:
a f
b g
c h
d i
e j
w k
q l
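Adapting the offsets comes down to choosing which loop index drives X and which drives Y. The sketch below computes the coordinates only, without touching fpdf; `grid_positions` and its `cell` and `by_rows` parameters are hypothetical names introduced for illustration.

```python
def grid_positions(text, cell=10, by_rows=False):
    """Yield (x, y, char) for each character of a list of lists.

    Hypothetical helper. With by_rows=False each inner list becomes a
    column (as in the output above); with by_rows=True it becomes a row.
    """
    for w, group in enumerate(text):
        for h, char in enumerate(group):
            if by_rows:
                yield cell * h, cell * w, char
            else:
                yield cell * w, cell * h, char


text = [['a', 'b', 'c'], ['f', 'g', 'h']]
columns = list(grid_positions(text))               # inner lists as columns
rows = list(grid_positions(text, by_rows=True))    # inner lists as rows
```

Each tuple would then be passed to `pdf.set_xy(x, y)` before the `pdf.cell(...)` call for that character.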
I try to reproduce a data augmentation method, which comes from the paper:
Qinwei Xu, Ruipeng Zhang, Ya Zhang, Yanfeng Wang and Qi Tian "A Fourier-based Framework for Domain Generalization" (CVPR 2021).
It is mentioned in the paper that they set the real part to a constant (the constant in the paper is 20000) to eliminate the amplitude and realize the reconstruction of the image relying only on the phase.
Below is my code:
img = process_img("./data/house.jpg", 128)
img_fft = torch.fft.fft2(img, dim=(-2, -1))
amp = torch.full(img_fft.shape, 200000)
img_fft.real = amp
img_ifft = torch.fft.ifft2(img_fft, dim=(-2, -1))
img_ifft = img_ifft.squeeze(0)
img_ifft = img_ifft.transpose(2, 0)
img_ifft = np.array(img_ifft)
cv2.imshow("", img_ifft.real)
Among them, the process_img function is only used to convert ndarray to tensor, as shown below:
loader = transforms.Compose([transforms.ToTensor()])
def process_img(img_path, img_size):
    img = cv2.imread(img_path)
    img = cv2.resize(img, (img_size, img_size))
    img = img.astype(np.float32) / 255.0
    img = loader(img)
    img = img.unsqueeze(0)
    return img
The first is the original image, the second is the image provided by the paper, and the third is the image generated by my code:
It can be seen that the images generated by my method are very different from those provided in the paper, and there are some artifacts. Why is there such a result?
You are confusing "real"/"imaginary" parts of complex numbers with "amplitude"/"phase" representation.
Here's the quick guide:
A complex number z can be expressed by either a sum of its real part x and its imaginary part y:
z = x + j y
Alternatively, one can express the same complex number z as a rotated vector with amplitude r and an angle phi:
z = r exp(j phi)
where r = sqrt(x^2 + y^2) and phi = atan2(y, x).
This image (from Wikipedia) explains this visually:
In your code, you replace the "real" part, but in the paper, they suggest replacing the "amplitude".
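The difference can be seen on a single complex number. The following sketch uses only the standard library `cmath` module; the value 3 + 4j and the constant 10 are arbitrary choices for illustration, not values from the paper.

```python
import cmath

z = 3 + 4j
r = abs(z)              # amplitude: sqrt(3^2 + 4^2)
phi = cmath.phase(z)    # phase: atan2(4, 3)

const = 10.0
# Replacing the real part keeps the imaginary part, so the phase changes
z_real_replaced = complex(const, z.imag)
# Replacing the amplitude keeps the phase and only rescales the vector
z_amp_replaced = const * cmath.exp(1j * phi)

print(cmath.phase(z_real_replaced), cmath.phase(z_amp_replaced), phi)
```

Only the second variant preserves the phase, which is the information a phase-only reconstruction relies on.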
If you want to replace the amplitude:
const_amp = ... # whatever the constant amplitude you want
new_fft = const_amp * torch.exp(1j * img_fft.angle())
# reconstruct the new image from the modulated Fourier:
img_ifft = torch.fft.ifft2(new_fft, dim=(-2, -1))
This results in the following image:
After reading this and taking the courses, I am struggling to solve the second problem in assignment 1 (notMnist):
Let's verify that the data still looks good. Displaying a sample of the labels and images from the ndarray. Hint: you can use matplotlib.pyplot.
Here is what I tried:
import random
rand_smpl = [ train_datasets[i] for i in sorted(random.sample(xrange(len(train_datasets)), 1)) ]
print(rand_smpl)
filename = rand_smpl[0]
import pickle
loaded_pickle = pickle.load( open( filename, "r" ) )
image_size = 28 # Pixel width and height.
import numpy as np
dataset = np.ndarray(shape=(len(loaded_pickle), image_size, image_size),
dtype=np.float32)
import matplotlib.pyplot as plt
plt.plot(dataset[2])
plt.ylabel('some numbers')
plt.show()
but this is what I get:
which doesn't make much sense. To be honest, my code may not make much sense either, since I am not really sure how to tackle that problem!
The pickles are created like this:
image_size = 28 # Pixel width and height.
pixel_depth = 255.0 # Number of levels per pixel.
def load_letter(folder, min_num_images):
    """Load the data for a single letter label."""
    image_files = os.listdir(folder)
    dataset = np.ndarray(shape=(len(image_files), image_size, image_size),
                         dtype=np.float32)
    print(folder)
    num_images = 0
    for image in image_files:
        image_file = os.path.join(folder, image)
        try:
            image_data = (ndimage.imread(image_file).astype(float) -
                          pixel_depth / 2) / pixel_depth
            if image_data.shape != (image_size, image_size):
                raise Exception('Unexpected image shape: %s' % str(image_data.shape))
            dataset[num_images, :, :] = image_data
            num_images = num_images + 1
        except IOError as e:
            print('Could not read:', image_file, ':', e, '- it\'s ok, skipping.')
    dataset = dataset[0:num_images, :, :]
    if num_images < min_num_images:
        raise Exception('Many fewer images than expected: %d < %d' %
                        (num_images, min_num_images))
    print('Full dataset tensor:', dataset.shape)
    print('Mean:', np.mean(dataset))
    print('Standard deviation:', np.std(dataset))
    return dataset
where that function is called like this:
dataset = load_letter(folder, min_num_images_per_class)
try:
    with open(set_filename, 'wb') as f:
        pickle.dump(dataset, f, pickle.HIGHEST_PROTOCOL)
The idea here is:
Now let's load the data in a more manageable format. Since, depending on your computer setup you might not be able to fit it all in memory, we'll load each class into a separate dataset, store them on disk and curate them independently. Later we'll merge them into a single dataset of manageable size.
We'll convert the entire dataset into a 3D array (image index, x, y) of floating point values, normalized to have approximately zero mean and standard deviation ~0.5 to make training easier down the road.
Do this as below:
#define a function to convert label to letter
def letter(i):
    return 'abcdefghij'[i]
# you need a matplotlib inline to be able to show images in python notebook
%matplotlib inline
#some random number in range 0 - length of dataset
sample_idx = np.random.randint(0, len(train_dataset))
#now we show it
plt.imshow(train_dataset[sample_idx])
plt.title("Char " + letter(train_labels[sample_idx]))
Your code actually changes what dataset is: it is not an ndarray of size (220000, 28, 28).
In general, a pickle is a file that holds some objects, not the array itself. You should use the object loaded from the pickle directly to get your train dataset (using the notation from your code snippet):
#will give you train_dataset and labels
train_dataset = loaded_pickle['train_dataset']
train_labels = loaded_pickle['train_labels']
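Whether indexing by key applies depends on what was dumped into the file. In the question, the per-letter files are written with `pickle.dump(dataset, f, ...)`, so they would hold the array itself, whereas the keys above assume a pickle that stores a dictionary. The sketch below illustrates the two cases with in-memory buffers and plain lists as stand-ins for the real arrays; `roundtrip` is a hypothetical helper.

```python
import io
import pickle


def roundtrip(obj):
    """Pickle obj into an in-memory buffer and load it back."""
    buffer = io.BytesIO()
    pickle.dump(obj, buffer, pickle.HIGHEST_PROTOCOL)
    buffer.seek(0)
    return pickle.load(buffer)


# Stand-in for a per-letter pickle: the array-like object itself
per_letter = roundtrip([[0.0, 0.5], [0.5, 0.0]])
# Stand-in for a merged pickle: a dict of named objects
merged = roundtrip({'train_dataset': [[0.0, 0.5]], 'train_labels': [0]})

print(type(per_letter).__name__, type(merged).__name__)
```

Checking `type(loaded_pickle)` right after loading is a quick way to tell which case applies before indexing.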
UPDATED:
Per request from #gsarmas, the link to my solution for the whole of Assignment1 is here.
The code is commented and mostly self-explanatory, but if you have any questions feel free to contact me in any way you prefer on GitHub.
Please check with this code
pickle_file = train_datasets[0]
with open(pickle_file, 'rb') as f:
    # unpickle
    letter_set = pickle.load(f)
    # pick a random image index
    sample_idx = np.random.randint(len(letter_set))
    # extract a 2D slice
    sample_image = letter_set[sample_idx, :, :]
    plt.figure()
    # display it
    plt.imshow(sample_image)
Use this code:
#random select a letter
i = np.random.randint( len(train_datasets) )
plt.title( "abcdefghij"[i] )
#read the file of selected letter (the with block closes the file afterwards)
with open( train_datasets[i], "rb" ) as f:
    letter_set = pickle.load(f)
#random select an image in the file
j = np.random.randint( len(letter_set) )
#show image
plt.axis('off')
img = plt.imshow( letter_set[ j, :, : ] )
I have a large number of screenshots that need to be cropped. All the images look similar: there is a rectangular window with a blue border, containing some graphical elements. This window is contained inside another one, but I need to crop only the inner window. The dimensions of the inner window and its content differ across images. The content in most cases includes elements with a rectangular form and sometimes a blue border, the same border as the inner window. I mention this because I am thinking of the following flow:
A script that goes through all images in the target directory. For each of them:
Find the area to be cropped (inner window)
Crop the area
Save the file
How can this be done? Python is not compulsory; any other language would do too.
It's not straightforward but this is a possible recipe:
import matplotlib.pyplot as plt
import numpy as np
def synthimage():
    w,h = 300,200
    im = np.random.randint(0,255,(w,h,3))/255
    xa = np.random.randint(50,w-60)
    xb = xa + np.random.randint(50,90)
    ya = np.random.randint(50,h-60)
    yb = ya + np.random.randint(20,50)
    im[xa:xb,ya] = np.array([1,0,0])
    im[xa:xb,yb] = np.array([1,0,0])
    im[xa,ya:yb] = np.array([1,0,0])
    im[xb,ya:yb] = np.array([1,0,0])
    return im
def getRectPoints(im):
    x,y = [],[]
    for i in range(im.shape[0]):
        for j in range(im.shape[1]):
            if (im[i,j]-np.array([1,0,0])).sum()==0:
                x.append(i)
                y.append(j)
    return np.array(x),np.array(y)
def denoise(x,y):
    nx,ny = [],[]
    for i in range(x.shape[0]):
        d = np.sqrt((x[i]-x)**2+(y[i]-y)**2)
        m = d<2
        if len(m.nonzero()[0])>2:
            nx.append(x[i])
            ny.append(y[i])
    return np.array(nx),np.array(ny)
im = synthimage()
plt.imshow(np.swapaxes(im,0,1),origin='lower',interpolation='nearest')
plt.show()
x,y = getRectPoints(im)
plt.scatter(x,y,c='red')
plt.xlim(0,300)
plt.ylim(0,200)
plt.show()
nx,ny = denoise(x,y)
plt.scatter(nx,ny,c='red')
plt.xlim(0,300)
plt.ylim(0,200)
plt.show()
#Assuming rectangle has no rotation (otherwise check SciPy ConvexHull)
xmi = nx.min()
xma = nx.max()
ymi = ny.min()
yma = ny.max()
new = np.ones(im.shape)
new[xmi:xma,ymi:yma] = im[xmi:xma,ymi:yma]
plt.imshow(np.swapaxes(new,0,1),origin='lower',interpolation='nearest')
plt.show()
The names of the functions should be self-explanatory. Synthetic data was generated for the purpose of this exercise. The results are (in order):
Obviously each of these steps can be changed depending on the requirements, but this would be a functional solution for the majority of case studies.
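As one possible variation on the pixel-by-pixel search, the border pixels can be located with a boolean mask. This is a sketch under assumptions, not a drop-in replacement: `border_bbox` is a hypothetical helper, it requires an exact match on all three channels (stricter than the sum-of-differences test above), and it skips the denoising step, so stray pixels of the same colour would still widen the box.

```python
import numpy as np


def border_bbox(im, colour):
    """Return (xmin, xmax, ymin, ymax) of the pixels equal to colour.

    Hypothetical helper; assumes the border colour matches exactly.
    """
    # True where all channels of a pixel equal the border colour
    mask = np.all(im == np.asarray(colour), axis=-1)
    xs, ys = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("no pixel matches the border colour")
    return xs.min(), xs.max(), ys.min(), ys.max()


# Small synthetic image with a red rectangle outline
im = np.zeros((30, 20, 3))
im[5:16, 4] = im[5:16, 12] = (1, 0, 0)
im[5, 4:13] = im[15, 4:13] = (1, 0, 0)

xmin, xmax, ymin, ymax = border_bbox(im, (1, 0, 0))
crop = im[xmin:xmax + 1, ymin:ymax + 1]  # +1 keeps the far border
```

For real screenshots the blue border colour would have to be read from an image first, and a tolerance may be needed if the colour is not perfectly uniform.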
I've been fighting with pyplot for a few days now. I want to produce a PDF report with 4 samples on each page, and 4 inline subplots for each sample: text with the name and some statistics, and 3 graphs of values vs time. I found a tutorial online and tried it (see below), but it produces nothing: the PDF is empty. I can't find where it goes wrong.
Thank you in advance !
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt
t=[n*5 for n in range(len(ratio))]
y_list_ratio=[[x*100/l[3]for x in l[2]]for l in hit_ratio]
props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
pp = PdfPages('multipage.pdf')
# Generate the pages
nb_plots = len(hit_ratio)
nb_plots_per_page = 4
nb_pages = int(numpy.ceil(nb_plots / float(nb_plots_per_page)))
grid_size = (nb_plots_per_page, 4)
for i, elt in enumerate(hit_ratio):
    # Create a figure instance (ie. a new page) if needed
    if i % nb_plots_per_page == 0:
        plt = plt.figure(figsize=(8.27, 11.69), dpi=100)
    # Plot stuffs !
    plt.subplot2grid(grid_size, (i % nb_plots_per_page, 0))
    plt.text(0.5,0.5,"Puit Hauteur_pic Digitonine\n"+ \
        str(elt[-1])+" "+str(elt[5])+" "+str(elt[6]),horizontalalignment='center',verticalalignment='center', bbox=props)
    plt.subplot2grid(grid_size, (i % nb_plots_per_page, 1))
    plt.plot(t,hit_norm[i][0])
    plt.subplot2grid(grid_size, (i % nb_plots_per_page, 2))
    plt.plot(t,y_list_ratio[i])
    plt.subplot2grid(grid_size, (i % nb_plots_per_page, 3))
    plt.plot(t,elt[7])
    plt.plot(t,elt[8])
    # Close the page if needed
    if (i + 1) % nb_plots_per_page == 0 or (i + 1) == nb_plots:
        fig2.tight_layout()
        pp.savefig(fig2)
# Write the PDF document to the disk
pp.close()
Since you don't have any answers yet, I have an alternate suggestion:
Try ReportLab.
from reportlab.lib import colors, utils
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter, landscape
from reportlab.lib.units import inch
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Image, PageBreak, KeepTogether
from reportlab.lib.styles import ParagraphStyle as PS
from reportlab.lib.enums import TA_CENTER
from reportlab.platypus.paragraph import Paragraph
landscape_pic_width = 8
letter_pic_width = 6
....
def get_image(path, width=1*inch):
    #'''returns an image for adding to report'''
    img = utils.ImageReader(path)
    iw, ih = img.getSize()
    aspect = ih / float(iw)
    return Image(path, width=width, height=(width * aspect))
def plot_stuff():
    #whatever you want to plot, finish with saving the figure
    pass  # placeholder so the stub is valid Python
elements = [] #this is a list that will contain the items to be included in the report
report_title = str(report_title)
c_table = Table(Conditions_Table_data)
c_table.setStyle(TableStyle([('ALIGN', (0,0),(-1,-1),'CENTER'), ('INNERGRID', (0,0), (-1,-1), 0.25, colors.black), ('BOX', (0,0),(-1,-1), 2, colors.blueviolet), ('SIZE', (0,0),(-1,-1), 10), ('SPAN',(-3,3),(-1,3))]))
#tells it how to format the table
#puts in the logo, the assigned report title, the entered report title, the conditions table, and the setup picture
elements.append(get_image(logo_picture, width = 9*inch))
elements.append(Paragraph(document_title, PS(name='Heading1', spaceAfter = 22, fontName = 'Times-Bold', fontSize = 18, alignment = TA_CENTER)))
#you can append items to the list "elements" with a for loop.
doc = SimpleDocTemplate(path_plus_title + ".pdf", pagesize=landscape(letter))
#creates the report. Will throw an error if the report exists and is already open. Otherwise, will generate a report
#this WILL overwrite an existing report with the same name. Timestamps being forced into the data file names help.
doc.build(elements)
There are definitely sections missing from this code, but these are the items I import ("inch", for instance, is a value for sizing that you multiply by the number of inches you want for that dimension).
You basically build a list of the items that go into your report PDF, in the order they go in. For each element, there's some style setting that takes place. You can include text (Paragraphs), tables (a "list of lists", with each list being a row), and images.
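Returning to the matplotlib approach from the question: two things in the loop look suspicious. `plt = plt.figure(...)` rebinds the name `plt` to a Figure object, and the page is saved through `fig2`, which is never defined. A possible corrected structure is sketched below. The `samples` data is made up for illustration, only two columns are drawn instead of four, and the file is written to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical stand-in data: 6 samples, each a short series
samples = [[(i + 1) * t for t in range(10)] for i in range(6)]
per_page = 4
grid = (per_page, 2)
out_path = os.path.join(tempfile.mkdtemp(), "multipage.pdf")

with PdfPages(out_path) as pp:
    fig = None
    for i, series in enumerate(samples):
        row = i % per_page
        if row == 0:
            # New page: keep the Figure in its own variable, not in plt
            fig = plt.figure(figsize=(8.27, 11.69))
        ax_text = plt.subplot2grid(grid, (row, 0), fig=fig)
        ax_text.axis("off")
        ax_text.text(0.5, 0.5, f"sample {i}", ha="center", va="center")
        ax_plot = plt.subplot2grid(grid, (row, 1), fig=fig)
        ax_plot.plot(series)
        # Save the page when it is full or when the data runs out
        if row == per_page - 1 or i == len(samples) - 1:
            fig.tight_layout()
            pp.savefig(fig)
            plt.close(fig)
    page_count = pp.get_pagecount()
```

With 6 samples and 4 per page this writes two pages; the `with` block closes the PDF even if plotting raises an error part-way through.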