Search/Update Graphics in a PDF - python

Is there any way to search for a graphic in a PDF page?
For example, I have used the PyFPDF library to generate a PDF with a few rectangles and circles (code below). Now I would like to:
get a list of the ellipses printed on each page, together with their locations, and
update the color of all the circles to blue.
Is there any library that allows this?
PDF generation with a few rectangles and circles (the sample PDF was generated with this code):
from fpdf import FPDF
# Prepare PDF generator
pdf = FPDF(orientation = 'L', unit = 'in', format = 'A4')
pdf.add_page()
pdf.set_fill_color(0, 0, 0)
# Draw the rectangle
pdf.rect(x = 1, y = 1, w = 2, h = 2, style = 'S')
pdf.rect(x = 1.5, y = 1.5, w = 5, h = 1, style = 'S')
pdf.ellipse(x = 1, y = 4, w = 1, h = 1, style = 'S')
pdf.ellipse(x = 4, y = 5, w = 1, h = 1, style = 'S')
pdf.ellipse(x = 9, y = 2, w = 1, h = 1, style = 'F')
pdf.rect(x = 6, y = 5, w = 2, h = 1, style = 'F')
# Write to file
pdf.output("test.pdf")

This is not a full answer, just a visual comment.
It has not been easy to dissect and rebuild the file by hand, so I used some shortcuts. The supplied PDF stores all of its vectors in a single, black-encoded stream object that looks something like this:
4 0 obj
<</Filter /FlateDecode /Length 251>>
stream
(251 bytes of Flate-compressed binary data, not printable as text)
endstream
endobj
So, at the simplest level, everything could be prefixed as blue in one go. But once the stream is decoded, it is easier to inject colours per part of the stroked stream. Here I marked the stack targets in green and injected 0 0 1 rg for blue (at the end of the square rect and for one group of the chords), plus 0 g to stop the graphics change in between.
Thus, as with many a PDF question of the form "Can I reverse engineer a PDF to do this, that and the other?", the answer is: do it at the source.
$pdf->SetDrawColor(0,0,0);
$pdf->Circle(100,50,30,'D');
$pdf->SetFillColor(0,0,255);
$pdf->Circle(110,47,7,'F');
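The decode-and-inject idea described above could be sketched in Python roughly as follows. This is only an illustration under several assumptions: the stream is Flate-compressed, it contains vector paths only (no text strings), the shapes were originally drawn in black, and circles are the only paths built from Bezier curve (c) operators. The function name recolor_curved_paths and the sample stream are made up for the example.

```python
import zlib

# Operators that paint (or end) the current path in a PDF content stream.
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n"}


def recolor_curved_paths(compressed, rgb=(0, 0, 1)):
    """Return a new Flate-compressed stream in which every path that
    contains a Bezier curve operator (c) is painted in `rgb`."""
    tokens = zlib.decompress(compressed).decode("latin-1").split()
    set_colour = "{0} {1} {2} rg {0} {1} {2} RG".format(*rgb).split()
    reset = ["0", "g", "0", "G"]  # assumption: the original drawing was black

    out, path = [], []
    for tok in tokens:
        path.append(tok)
        if tok not in PAINT_OPS:
            continue
        if "c" in path:
            # Colour operators may not appear inside a path, so insert them
            # just before the operands of the first path-construction operator.
            start = 0
            for i, t in enumerate(path):
                if t == "m":
                    start = max(i - 2, 0)
                    break
                if t == "re":
                    start = max(i - 4, 0)
                    break
            path = path[:start] + set_colour + path[start:] + reset
        out.append(" ".join(path))
        path = []
    if path:  # trailing operators that are not followed by a paint operator
        out.append(" ".join(path))
    return zlib.compress("\n".join(out).encode("latin-1"))


# Tiny hand-written stream: one stroked rectangle and one filled curve.
sample = b"0 g\n72 72 144 144 re S\n108 324 m 108 343 92 360 72 360 c f\n"
result = zlib.decompress(recolor_curved_paths(zlib.compress(sample))).decode("latin-1")
print(result)
```

Writing the result back into the file also means updating the object's /Length entry, which a PDF library would normally take care of.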

Related

Dynamically positioning of elements in pyFPDF

I am using pyFPDF to automate reports (or trying to, at least), and I am having problems dynamically adjusting the positions of multi_cells and graphs after the first page.
I tried two things:
I was using fixed distances between each graph and text, using something like:
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.set_font('Arial', style = 'B', size = 12)
position_y = 100
position_x = (A4_size_x)/2  # A4_size_x (page width), cpf and fator are defined elsewhere
text = 80  # vertical distance between consecutive blocks
var = [0,1,2,3,4,5]
for i in var:
    pdf.image(f'1. Gráficos/{cpf}-{fator}.png', #Graphs that were dynamically generated
              x = position_x - 25,
              y = position_y + (text * i),
              w = 50,
              h = 50)
    pdf.set_xy(position_x - 50, ((position_y + (text * i)) - 45))
    pdf.multi_cell(w = 100,
                   h = 5,
                   txt = 'Some Dynamic Text',
                   align='C')
pdf.output(f'2. Reports/{cpf}.pdf', 'F')
The problem here is that when it reaches a page break, the position values are calculated relative to the new page, which breaks the whole document.
I also tried disabling page breaks:
pdf.set_auto_page_break(auto = False, margin = 0.0)
But then there is only one page in the document.
The question is: how do I keep adding objects at the same intervals across multiple pages?
Thank you!
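One possible way to think about this is to work out the page index and the Y offset of every block before drawing anything. The sketch below is a minimal illustration of that bookkeeping, not pyFPDF API: the function name layout_blocks, the 20 mm margins and the 80 mm block height are assumptions, and 297 is the height of an A4 page in millimetres.

```python
def layout_blocks(n_blocks, block_height, page_height, top=20, bottom=20):
    """Return a (page_index, y) pair for each block, starting a new page
    whenever the next block would cross the bottom margin."""
    positions = []
    page, y = 0, top
    for _ in range(n_blocks):
        if y + block_height > page_height - bottom:
            page += 1   # the block does not fit: move to the next page
            y = top     # and restart from the top margin
        positions.append((page, y))
        y += block_height
    return positions


positions = layout_blocks(n_blocks=6, block_height=80, page_height=297)
print(positions)
# [(0, 20), (0, 100), (0, 180), (1, 20), (1, 100), (1, 180)]
```

With automatic page breaks disabled, the drawing loop could then call pdf.add_page() each time the page index changes and use the returned y for pdf.image() and pdf.set_xy().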

How to obtain surfaces tag in Gmsh Python api?

I am trying to generate geometries and meshes with the Python API of Gmsh, planning to use them in FEniCS.
I started creating my geometry following the steps reported here: https://jsdokken.com/src/tutorial_gmsh.html
The author first creates the volume and then retrieves the surfaces with the command:
surfaces = gmsh.model.occ.getEntities(dim=2)
Finally, he relates each surface to its tag simply by finding the center of mass (com). He uses the command gmsh.model.occ.getCenterOfMass(dim,tag) and compares the result with the known com position of his surfaces, like this:
inlet_marker, outlet_marker, wall_marker, obstacle_marker = 1, 3, 5, 7
walls = []
obstacles = []
for surface in surfaces:
    com = gmsh.model.occ.getCenterOfMass(surface[0], surface[1])
    if np.allclose(com, [0, B/2, H/2]):
        gmsh.model.addPhysicalGroup(surface[0], [surface[1]], inlet_marker)
        inlet = surface[1]
        gmsh.model.setPhysicalName(surface[0], inlet_marker, "Fluid inlet")
    elif np.allclose(com, [L, B/2, H/2]):
        gmsh.model.addPhysicalGroup(surface[0], [surface[1]], outlet_marker)
        gmsh.model.setPhysicalName(surface[0], outlet_marker, "Fluid outlet")
    elif np.isclose(com[2], 0) or np.isclose(com[1], B) or np.isclose(com[2], H) or np.isclose(com[1],0):
        walls.append(surface[1])
    else:
        obstacles.append(surface[1])
Now, my problem is that this cannot work if two or more surfaces share the same com, as with two concentric cylinders.
How can I discriminate between them in such a situation?
For example, in the case of a hollow cylinder, I would like to have a tag for each surface in order to apply different boundary conditions in FEniCS.
Thanks in advance!
You can make use of gmsh.model.getAdjacencies(dim,tag), where dim and tag are the dimension and tag of your entity of interest. This function returns two lists, up and down. The first one gives you the tags of all entities of dimension dim + 1 that are adjacent (i.e. neighbouring) to the entity of interest. The second list gives you the tags of all adjacent entities of dimension dim - 1.
In 3D (i.e. dim = 3) the up list will be empty because there are no 4D structures in gmsh. The down list will contain the tags of all surfaces that make up the boundary of the volume.
Below is some example code. Part 1 is straightforward, and in Part 2 I added a function that sorts the surface tags by the x-coordinate of their center of mass.
import gmsh
gmsh.initialize()
## PART 1:
tag_cylinder_1 = gmsh.model.occ.addCylinder(0, 0, 0, 1, 0, 0, 0.1)
tag_cylinder_2 = gmsh.model.occ.addCylinder(0, 0, 0, 1, 0, 0, 0.2)
gmsh.model.occ.synchronize()
up_cyl_1, down_cyl_1 = gmsh.model.getAdjacencies(3,tag_cylinder_1)
up_cyl_2, down_cyl_2 = gmsh.model.getAdjacencies(3,tag_cylinder_2)
com_1 = gmsh.model.occ.getCenterOfMass(2, down_cyl_1[0])
com_2 = gmsh.model.occ.getCenterOfMass(2, down_cyl_1[1])
com_3 = gmsh.model.occ.getCenterOfMass(2, down_cyl_1[2])
## PART 2:
def calcBoxVolume(box):
    # box = (xmin, ymin, zmin, xmax, ymax, zmax)
    dx = box[3] - box[0]
    dy = box[4] - box[1]
    dz = box[5] - box[2]
    return dx*dy*dz
def getOrderedTags(cylTag):
    # surface tags of the volume, sorted by the x-coordinate of their center of mass
    up, down = gmsh.model.getAdjacencies(3,cylTag)
    surf_COM = []
    for surface in down:
        com = [surface] + list(gmsh.model.occ.getCenterOfMass(2, surface))
        surf_COM.append(com)
    orderedSurfaces = sorted(surf_COM,key = lambda x: x[1])
    orderedSurfaceTags = [item[0] for item in orderedSurfaces]
    return orderedSurfaceTags
def setPhysicalTags(name,cylTag):
    # a solid cylinder has only three surfaces: inlet, tube, outlet
    orderedSurfaces = getOrderedTags(cylTag)
    gmsh.model.addPhysicalGroup(2, [orderedSurfaces[0]],name="inlet_"+name)
    gmsh.model.addPhysicalGroup(2, [orderedSurfaces[1]],name="tube_"+name)
    gmsh.model.addPhysicalGroup(2, [orderedSurfaces[2]],name="outlet_"+name)
def setPhysicalTagsCylDiff(name,cylTag):
    # a hollow cylinder has four surfaces: inlet, inner tube, outer tube, outlet
    orderedSurfaces = getOrderedTags(cylTag)
    tag_A = orderedSurfaces[1]
    tag_B = orderedSurfaces[2]
    box_tube_A = gmsh.model.getBoundingBox(2,tag_A)
    box_tube_B = gmsh.model.getBoundingBox(2,tag_B)
    volBoxA = calcBoxVolume(box_tube_A)
    volBoxB = calcBoxVolume(box_tube_B)
    # the tube with the larger bounding box is the outer one
    if volBoxA > volBoxB:
        innerTag = tag_B
        outerTag = tag_A
    else:
        innerTag = tag_A
        outerTag = tag_B
    gmsh.model.addPhysicalGroup(2, [orderedSurfaces[0]],name="inlet_"+name)
    gmsh.model.addPhysicalGroup(2, [innerTag],name="tube_inner_"+name)
    gmsh.model.addPhysicalGroup(2, [outerTag],name="tube_outer_"+name)
    gmsh.model.addPhysicalGroup(2, [orderedSurfaces[3]],name="outlet_"+name)
# setPhysicalTags("Cylinder_1",tag_cylinder_1)
# setPhysicalTags("Cylinder_2",tag_cylinder_2)
outDimTags, outDimTagsMap = gmsh.model.occ.cut([(3,tag_cylinder_2)],[(3,tag_cylinder_1)])
cylDiffTag = outDimTags[0][1]
gmsh.model.occ.synchronize()
setPhysicalTagsCylDiff("CylDiff",cylDiffTag)
gmsh.model.mesh.generate(2)
gmsh.fltk.run()
gmsh.finalize()
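The bounding-box comparison used above for two tubes might be extended to any number of concentric surfaces by sorting on the box volume. The sketch below is a plain-Python illustration of that idea only: the tags 5, 7 and 9 and their boxes are made-up values, and order_by_size is a hypothetical helper. In practice each box would come from gmsh.model.getBoundingBox(2, tag), which returns xmin, ymin, zmin, xmax, ymax, zmax.

```python
def box_volume(box):
    """Volume of an axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax)."""
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return (xmax - xmin) * (ymax - ymin) * (zmax - zmin)


def order_by_size(boxes):
    """Given {surface_tag: bounding_box}, return the tags innermost first."""
    return sorted(boxes, key=lambda tag: box_volume(boxes[tag]))


# Hypothetical tags and boxes for three concentric tubes along the x-axis
# with radii 0.2, 0.1 and 0.3.
boxes = {
    7: (0.0, -0.2, -0.2, 1.0, 0.2, 0.2),
    5: (0.0, -0.1, -0.1, 1.0, 0.1, 0.1),
    9: (0.0, -0.3, -0.3, 1.0, 0.3, 0.3),
}
print(order_by_size(boxes))  # [5, 7, 9]
```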

visualize a two-dimensional point set using Python

I'm new to Python and want to perform a rather simple task. I've got a two-dimensional point set, which is stored as binary data (i.e. (x, y)-coordinates) in a file, which I want to visualize. The output should look as in the picture below.
However, I'm somehow overwhelmed by the amount of google results on this topic. And many of them seem to be for three-dimensional point cloud visualization and/or a massive amount of data points. So, if anyone could point me to a suitable solution for my problem, I would be really thankful.
EDIT: The point set is contained in a file which is formatted as follows:
0.000000000000000 0.000000000000000
1.000000000000000 1.000000000000000
1
0.020375738732779 0.026169010160356
0.050815740313746 0.023209931647163
0.072530406907906 0.023975230642589
The first data vector is the one in the line below the single "1"; i.e. (0.020375738732779, 0.026169010160356). How do I read this into a vector in python? I can open the file using f = open("pointset file")
Install and import matplotlib and pyplot:
import matplotlib.pyplot as plt
Assuming this is your data:
x = [1, 2, 5, 1, 5, 7, 8, 3, 2, 6]
y = [6, 7, 1, 2, 6, 2, 1, 6, 3, 1]
If you need to, you can use a comprehension to split the coordinates into separate lists:
x = [p[0] for p in points]
y = [p[1] for p in points]
Plotting is as simple as:
plt.scatter(x=x, y=y)
Result:
Many customizations are possible.
EDIT: following question edit
In order to read the file:
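x = []
y = []
with open('pointset_file.txt', 'r') as f:
    for _ in range(3):
        next(f)  # skip the two leading lines and the line holding the single "1"
    for line in f:
        coords = line.split()
        if len(coords) < 2:
            continue  # ignore blank or malformed lines
        x.append(float(coords[0]))
        y.append(float(coords[1]))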
You could read your data as follows and plot it using a scatter plot. This method is intended for a small amount of data in exactly the format you have presented, not for CSV.
import matplotlib.pyplot as plt
with open("pointset file") as fid:
    lines = fid.read().split("\n")
# lines[:2] looks like the bounds for each axis, if yes use it in plot
data = [[float(d) for d in line.split(" ") if d] for line in lines[3:] if line.strip()]
# transpose the list of [x, y] pairs into one sequence per axis
xs, ys = zip(*data)
plt.scatter(xs, ys)
plt.show()
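If the first two lines really are the lower and upper bounds of the axes (this is a guess, as noted in the comment above), a small parser could return them alongside the points. The following is a minimal sketch under that assumption. The function name parse_point_set is made up, and the sample text is the excerpt from the question.

```python
def parse_point_set(text):
    """Split the file content into (lower, upper, points).

    Assumes line 1 holds the lower bounds, line 2 the upper bounds,
    line 3 a single number, and every following line one "x y" pair.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    lower = tuple(float(v) for v in rows[0])
    upper = tuple(float(v) for v in rows[1])
    points = [(float(a), float(b)) for a, b in rows[3:]]
    return lower, upper, points


sample = """0.000000000000000 0.000000000000000
1.000000000000000 1.000000000000000
1
0.020375738732779 0.026169010160356
0.050815740313746 0.023209931647163
0.072530406907906 0.023975230642589
"""
lower, upper, points = parse_point_set(sample)
print(lower, upper, len(points))
```

The bounds could then be passed to plt.xlim(lower[0], upper[0]) and plt.ylim(lower[1], upper[1]) before calling plt.show().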
Assuming you want a plot looking pretty much exactly like the sample image you give, and you want the plot to display the data with both axes in equal proportion, one could use a general purpose multimedia library like pygame to achieve this:
#!/usr/bin/env python3
import sys
import pygame
# windows will never be larger than this in their largest dimension
MAX_WINDOW_SIZE = 400
BG_COLOUR = (255, 255, 255,)
FG_COLOUR = (0, 0, 0,)
DATA_POINT_SIZE = 2
pygame.init()
if len(sys.argv) < 2:
    print('Error: need filename to read data from')
    pygame.quit()
    sys.exit(1)
else:
    data_points = []
    # read in data points from file first
    with open(sys.argv[1], 'r') as file:
        [next(file) for _ in range(3)] # discard first 3 lines of file
        # now the rest of the file contains actual data to process
        data_points.extend(tuple(float(x) for x in line.split()) for line in file)
    # file read complete. now let's find the min and max bounds of the data
    top_left = [float('+Inf'), float('+Inf')]
    bottom_right = [float('-Inf'), float('-Inf')]
    for datum in data_points:
        if datum[0] < top_left[0]:
            top_left[0] = datum[0]
        if datum[1] < top_left[1]:
            top_left[1] = datum[1]
        if datum[0] > bottom_right[0]:
            bottom_right[0] = datum[0]
        if datum[1] > bottom_right[1]:
            bottom_right[1] = datum[1]
    # calculate space dimensions
    space_dimensions = (bottom_right[0] - top_left[0], bottom_right[1] - top_left[1])
    # take the biggest of the X or Y dimensions of the point space and scale it
    # up to our maximum window size
    biggest = max(space_dimensions)
    scale_factor = MAX_WINDOW_SIZE / biggest # all points will be scaled up by this factor
    # screen dimensions (whole pixels)
    screen_dimensions = tuple(int(sd * scale_factor) for sd in space_dimensions)
    # basic init and draw all points to screen
    display = pygame.display.set_mode(screen_dimensions)
    display.fill(BG_COLOUR)
    for point in data_points:
        # translate and scale each point
        x = point[0] * scale_factor - top_left[0] * scale_factor
        y = point[1] * scale_factor - top_left[1] * scale_factor
        pygame.draw.circle(display, FG_COLOUR, (int(x), int(y)), DATA_POINT_SIZE)
    pygame.display.update()
    while True:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                pygame.quit()
                sys.exit(0)
        pygame.time.wait(50)
Execute this script and pass the name of the file which holds your data in as the first argument. It will spawn a window with the data points displayed.
I generated a bunch of uniformly distributed random x,y points to test it, with:
from random import random
for _ in range(1000):
print(random(), random())
This produces a window looking like the following:
If the space your data points are within is not of square size, the window shape will change to reflect this. The largest dimension of the window, either width or height, will always stay at a specified size (I used 400px as a default in my demo).
Admittedly, this is not the most elegant or concise solution, and reinvents the wheel a little bit, however it gives you the most control on how to display your data points, and it also deals with both the reading in of the file data and the display of it.
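To make the window-sizing rule concrete, here is a small stand-alone sketch of the same arithmetic without pygame. The function name window_size and the two tiny point sets are made up for illustration. It assumes the points do not all coincide, so the largest extent is non-zero.

```python
def window_size(points, max_size=400):
    """Scale the bounding box of the points so that its larger side
    becomes max_size pixels while the aspect ratio is preserved."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    scale = max_size / max(width, height)
    return int(width * scale), int(height * scale)


print(window_size([(0, 0), (2, 1)]))  # (400, 200): wide data, wide window
print(window_size([(0, 0), (1, 4)]))  # (100, 400): tall data, tall window
```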
To read your file:
import pandas as pd
import numpy as np
df = pd.read_csv('your_file',
                 sep=r'\s+',   # raw string: split on any run of whitespace
                 header=None,
                 skiprows=3,
                 names=['x','y'])
For now I've created a random dataset
import random
df = pd.DataFrame({'x':[random.uniform(0, 1) for n in range(100)],
                   'y':[random.uniform(0, 1) for n in range(100)]})
I prefer Plotly for any kind of figure
import plotly.express as px
fig = px.scatter(df,
                 x='x',
                 y='y')
fig.show()
From here you can easily update labels, colors, etc.

Creating PDF with Python using FPDF char by char

I'm trying to create a PDF with Python and I want to put text into it character by character.
I can't work out how to do it: when the output PDF is saved, all of the characters are printed on top of each other.
This is my code snippet:
from fpdf import FPDF
pdf = FPDF('P', 'mm', (100,100))
# Add a page
pdf.add_page()
# set style and size of font
# that you want in the pdf
pdf.add_font('ariblk', '', "ArialBlack.ttf", uni=True)
pdf.set_font("ariblk",size = int(50*0.8))
text = [['a','b','c','d','e','w','q'],['f','g','h','i','j','k','l']]
print("creating pdf...")
line = 0
for w in range(0,len(text)):
    for h in range(0,len(text[w])):
        # create a cell
        r = int (50)
        g = int (100)
        b = int (10)
        pdf.set_text_color(r, g, b)
        text_out = text[w][h]
        pdf.cell(0,line, txt = text_out, ln = 2)
# save the pdf with name .pdf
pdf.output(name = "img/output.pdf", dest='F')
print("pdf created!")
and this is what my code output is:
(this is copy-paste from the output pdf): iljfbeqaghdckw
(this is a screenshot of the output):
I don't know the fpdf module well, but I think your problem comes only from the fact that you don't change the X, Y coordinates at which each character is printed.
You have to use `pdf.set_xy()` to set the X and Y coordinates of each of your characters.
I made small changes to the font and colors for my tests.
from fpdf import FPDF
import random
pdf = FPDF('P', 'mm', (100,100))
# Add a page
pdf.add_page()
# set style and size of font
# that you want in the pdf
#pdf.add_font('ariblk', '', "ArialBlack.ttf", uni=True)
pdf.set_font("Arial",size = int(24))
text = [['a','b','c','d','e','w','q'],['f','g','h','i','j','k','l']]
print("creating pdf...")
line = 10
for w in range(len(text)):
    for h in range(len(text[w])):
        # create a cell
        r = random.randint(1, 255)
        g = random.randint(1, 255)
        b = random.randint(1, 255)
        pdf.set_text_color(r, g, b)
        text_out = text[w][h]
        pdf.set_xy(10*w, 10*h)
        pdf.cell(10, 10, txt=text_out, ln=0, align='C')
# save the pdf with name .pdf
pdf.output(name = "output.pdf", dest='F')
print("pdf created!")
Then you have to adapt the X and/or Y offset according to the layout you want in the output.
Remark: since you don't change the values of r, g and b inside your for loops, it is best to move the assignment of r, g and b before the loops.
Output in the PDF:
a f
b g
c h
d i
e j
w k
q l
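Adapting the offsets can be tried out separately from the PDF code. The sketch below only computes the coordinates and does not call fpdf. The helper name char_positions, the 10 mm cell size and the origin parameter are assumptions for illustration. As in the output above, each inner list becomes one column.

```python
def char_positions(text, cell=10, origin=(0, 0)):
    """Return (character, x, y) triples, one column per inner list."""
    placed = []
    for col, chars in enumerate(text):
        for row, ch in enumerate(chars):
            x = origin[0] + cell * col
            y = origin[1] + cell * row
            placed.append((ch, x, y))
    return placed


for ch, x, y in char_positions([['a', 'b'], ['f', 'g']], origin=(5, 5)):
    print(ch, x, y)
```

Each triple could then be passed to pdf.set_xy(x, y) followed by pdf.cell(cell, cell, txt=ch).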

Find average colour of each section of an image

I am looking for the best way to achieve the following using Python:
Import an image.
Add a grid of n sections (4 shown in this example below).
For each section find the dominant colour.
Desired output
Output an array, list, dict or similar capturing these dominant colour values.
Maybe even a Matplotlib graph showing the colours (like pixel art).
What have I tried?
The image could be sliced using image slicer:
import image_slicer
image_slicer.slice('image_so_grid.png', 4)
I could then potentially use something like this to get the average colour, but I'm sure there are better ways to do this.
What are the best ways to do this with Python?
This works for 4 sections, but you'll need to figure out how to make it work for 'n' sections:
import cv2
img = cv2.imread('image.png')
def fourSectionAvgColor(image):
    rows, cols, ch = image.shape
    colsMid = int(cols/2)
    rowsMid = int(rows/2)
    numSections = 4
    section0 = image[0:rowsMid, 0:colsMid]
    section1 = image[0:rowsMid, colsMid:cols]
    section2 = image[rowsMid: rows, 0:colsMid]
    section3 = image[rowsMid:rows, colsMid:cols]
    sectionsList = [section0, section1, section2, section3]
    sectionAvgColorList = []
    for i in sectionsList:
        pixelSum = 0
        yRows, xCols, chs = i.shape
        pixelCount = yRows*xCols
        totRed = 0
        totBlue = 0
        totGreen = 0
        for x in range(xCols):
            for y in range(yRows):
                bgr = i[y,x]
                # cast to int so the running totals cannot overflow uint8
                b = int(bgr[0])
                g = int(bgr[1])
                r = int(bgr[2])
                totBlue = totBlue+b
                totGreen = totGreen+g
                totRed = totRed+r
        avgBlue = int(totBlue/pixelCount)
        avgGreen = int(totGreen/pixelCount)
        avgRed = int(totRed/pixelCount)
        avgPixel = (avgBlue, avgGreen, avgRed)
        sectionAvgColorList.append(avgPixel)
    return sectionAvgColorList
print(fourSectionAvgColor(img))
cv2.waitKey(0)
cv2.destroyAllWindows()
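One way the four-section version might be generalised to an arbitrary grid is sketched below. To keep it free of third-party packages, the image is represented as plain nested lists of (r, g, b) tuples. That representation, the function name grid_average_colors and the tiny test image are assumptions for illustration. The image is assumed to have at least n_rows rows and n_cols columns, so no section is empty.

```python
def grid_average_colors(image, n_rows, n_cols):
    """Average colour of each cell of an n_rows x n_cols grid.

    image is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    averages = []
    for i in range(n_rows):
        # integer bounds of this band of rows
        y0, y1 = i * height // n_rows, (i + 1) * height // n_rows
        row_out = []
        for j in range(n_cols):
            x0, x1 = j * width // n_cols, (j + 1) * width // n_cols
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # integer mean of each of the three channels
            row_out.append(tuple(sum(p[c] for p in pixels) // len(pixels)
                                 for c in range(3)))
        averages.append(row_out)
    return averages


# 2x4 test image: left half red, right half blue
red, blue = (255, 0, 0), (0, 0, 255)
image = [[red, red, blue, blue],
         [red, red, blue, blue]]
print(grid_average_colors(image, 1, 2))  # [[(255, 0, 0), (0, 0, 255)]]
```

With a NumPy array such as the one returned by cv2.imread, the inner list comprehension could be replaced by image[y0:y1, x0:x1].mean(axis=(0, 1)).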
You can use scikit-image's view_as_blocks together with numpy.mean. You specify the block size instead of the number of blocks:
import numpy as np
from skimage import data, util
import matplotlib.pyplot as plt
astro = data.astronaut()
blocks = util.view_as_blocks(astro, (8, 8, 3))
print(astro.shape)
print(blocks.shape)
mean_color = np.mean(blocks, axis=(2, 3, 4))
fig, ax = plt.subplots()
ax.imshow(mean_color.astype(np.uint8))
Output:
(512, 512, 3)
(64, 64, 1, 8, 8, 3)
Don't forget the cast to uint8 because matplotlib and scikit-image expect floating point images to be in [0, 1], not [0, 255]. See the scikit-image documentation on data types for more info.
