How to crop empty space from SVG? - python

How do you crop all empty space from an SVG file, either from the command line or Python?
I have several SVG files formatted to the standard A2 letter document size, but they are mostly empty, and I need to bulk crop them so that their view box matches the minimum bounding box of their contents.
I can do this in Inkscape using the "Resize Page to Selection" option, but I don't see any way to access this function from the command-line. I thought it might be a call like:
inkscape -z --verb=FitCanvasToDrawing --verb=FileSave --verb=FileClose file.svg
as suggested elsewhere, but that has no effect.
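For what it's worth, here is a minimal sketch of one thing that could be tried from Python, assuming Inkscape 1.0 or later is installed and on the PATH. It relies on the `--export-area-drawing` and `--export-filename` options; as far as I can tell, exporting with the drawing area should produce an SVG whose page is the bounding box of the contents, but I have not verified that on every version. The function names `build_crop_command` and `crop_svg` are made up for this example.

```python
import subprocess
from typing import List


def build_crop_command(src: str, dst: str, inkscape: str = "inkscape") -> List[str]:
    """Build an Inkscape 1.x command line that exports only the drawing area.

    Assumption: Inkscape >= 1.0, where --export-area-drawing and
    --export-filename are available.
    """
    return [
        inkscape,
        "--export-area-drawing",       # export the bounding box of all objects, not the page
        f"--export-filename={dst}",    # output type is inferred from the .svg extension
        src,
    ]


def crop_svg(src: str, dst: str) -> None:
    """Run Inkscape on one file; raises CalledProcessError if Inkscape fails."""
    subprocess.run(build_crop_command(src, dst), check=True)


# Show the command that would be run, without actually invoking Inkscape.
print(" ".join(build_crop_command("file.svg", "file-cropped.svg")))
```

For bulk cropping, `crop_svg` could be called in a loop over `pathlib.Path(".").glob("*.svg")`, writing each result to a new file name so the originals are kept.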

Related

How to remove images in a pdf file?

https://babel.hathitrust.org/cgi/pt?id=mdp.35112102781921&view=1up&seq=11
I want to use a program or a command (the latter preferred) to remove images that sit at the same position on each page (e.g., the "Digitized by Google" mark) in PDF files like the one above.
Could you show me a convenient way to do so?

How to update all colors in an editable PDF document?

I have a pdf that I can edit using Adobe Acrobat Pro.
For each color found in the document (solid filled rectangles, each pixel in an image, font color, ...) I need to update it to a new color. This new color is determined using a method that takes the old color (RGB or Hex) as input.
I can do this manually but to save time (100+ page doc) I'm looking for a solution to do this using Python.
Using Fitz I've been able to extract images and save the modified images, or cycle through the rectangles but I have not been able to find a way to update the pdf document and save a copy of it.
The only thing I need to change is the color; everything else needs to stay as is.
Is there a way to do this in python?
Edit Clarification:
I only need to update non-black and non-white objects. I only need to apply f(colour) = newColour for coloured objects. Note: f(black) = black and f(white) = white so I don't need to explicitly ignore these objects.

Extract a label from several single page PDF files and align them to fill an A4 page (to save paper)

I receive a single-page A4 PDF file from my shipping courier. It contains a small label in the top left corner, and the rest of the page is blank. As I usually have to print several a day, I end up wasting a lot of paper. I could drop all the PDFs into Inkscape (it imports the label as a grouped object) and manually align them to fit an A4 page, but that would become tedious really fast and waste time.
Can you point me in the right direction as to what to look for in order to write a python script to do this?
You would need to determine the size of the label in PDF units (1 point = 1/72 inch).
Then determine how many labels you can fit onto one page, i.e. how many columns and rows you can have (taking the needed printing margin into account).
The script could take the PDF pages as command line arguments, import each page as Form XObject and place the labels into the row/column raster.
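The row/column calculation described above can be sketched as follows. This only covers the arithmetic, not the PDF import itself; the label size, the 18 pt margin, and the name `label_grid` are assumptions for illustration. Positions are given in PDF coordinates, where the origin is the bottom-left corner of the page.

```python
from typing import List, Tuple

A4_WIDTH_PT = 595.28   # 210 mm expressed in points (1 pt = 1/72 inch)
A4_HEIGHT_PT = 841.89  # 297 mm expressed in points


def label_grid(label_w: float, label_h: float,
               page_w: float = A4_WIDTH_PT, page_h: float = A4_HEIGHT_PT,
               margin: float = 18.0) -> Tuple[int, int, List[Tuple[float, float]]]:
    """Return (columns, rows, positions) for labels packed onto one page.

    Each position is the lower-left corner of a label slot, listed
    row by row starting from the top of the page.
    """
    usable_w = page_w - 2 * margin
    usable_h = page_h - 2 * margin
    cols = int(usable_w // label_w)
    rows = int(usable_h // label_h)
    positions = []
    for row in range(rows):
        for col in range(cols):
            x = margin + col * label_w
            # PDF y grows upward, so the first row sits just below the top margin.
            y = page_h - margin - (row + 1) * label_h
            positions.append((x, y))
    return cols, rows, positions


# Hypothetical label of 250 x 180 pt (about 88 x 64 mm).
cols, rows, slots = label_grid(label_w=250.0, label_h=180.0)
print(f"{cols} columns x {rows} rows = {len(slots)} labels per page")
```

Each `(x, y)` pair is where the imported page content would be translated to, after clipping it to the label's size.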
I would do the following:
Once:
Create a file with a different linked image for each label that fits on the page (with correct proportions)
For each label file (Inkscape command line):
Open with Inkscape
Create an object of the size of the label and clip the contents to that object (if necessary, e.g. because there's more in the file than just that single group object)
Resize page to contents
Save
Then:
create a CSV file from your current set of label file names, suitable for https://gitlab.com/Moini/nextgenerator - as many as fit on one page per line
use the extension (see documentation)
Note that the extension can also be used from the command line, if needed.
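The CSV step could be scripted roughly like this. It is only a sketch: the column names `label1`, `label2`, ... are hypothetical and would have to match whatever placeholders the template file uses, and padding the last line with empty cells is an assumption that may not suit the extension (check its documentation).

```python
import csv
import io
from typing import List, TextIO


def rows_for_pages(label_files: List[str], per_page: int) -> List[List[str]]:
    """Split the file names into one list per output page."""
    rows = []
    for start in range(0, len(label_files), per_page):
        chunk = label_files[start:start + per_page]
        # Pad the last page so every CSV line has the same number of cells.
        chunk += [""] * (per_page - len(chunk))
        rows.append(chunk)
    return rows


def write_label_csv(label_files: List[str], per_page: int, out: TextIO) -> None:
    """Write a header line plus one line per page of labels."""
    writer = csv.writer(out)
    # Hypothetical column names; adjust to the variables used in the template.
    writer.writerow([f"label{i + 1}" for i in range(per_page)])
    writer.writerows(rows_for_pages(label_files, per_page))


# Example with five label files and two labels per page.
buffer = io.StringIO()
write_label_csv(["a.svg", "b.svg", "c.svg", "d.svg", "e.svg"], per_page=2, out=buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass `open("labels.csv", "w", newline="")` as `out`.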

How to change image position and text wrapping using python-docx?

After adding the image using add_picture(), I want to change the image position and text wrapping properties.
I want to set the text wrapping to "In front of text" and change the position properties as follows:
Horizontal: alignment: right, relative to: margin
Vertical: absolute position: 2.15 cm, below: page
This is how I change it manually in Word, but I want to do it using python-docx.
Is there any way to get it done?
The short answer is "No."
There are two ways images can be placed in Word, inline images and floating images.
An inline image is placed in a run and is essentially treated as a big character. The height of the line it appears on is adjusted upward to fit the image and the paragraph it is in flows between pages depending on the text before it, just like any other paragraph.
A floating image lies on the drawing layer, like a clear plastic sheet above the document layer where the text lives. It is given an absolute position and in general does not flow with the text (although it can be anchored to part of the text). Text can be set to wrap around the image, wherever it ends up on the page.
python-docx currently only supports inline images. There is no existing API support for floating images (and the text wrapping they allow).

Tools to extract glyph data from bitmap font image

I have all the characters of a font rendered in a PNG. I want to use the PNG for texture-mapped font rendering in OpenGL, but I need to extract the glyph information - character position and size, etc.
Are there any tools out there to assist in that process? Most tools I see generate the image file and glyph data together. I haven't found anything to help me extract from an existing image.
I use GIMP as my primary image editor, and I've considered writing a plugin to assist me in the process of identifying the bounding box for each character. I know Python but haven't written any GIMP plugins before, so that would be a chore. I'm hoping something already exists...
Generally, the way this works is you use a tool to generate the glyph images. That tool will also generate metric information for those glyphs (how big they are, etc). You shouldn't be analyzing the image to find the glyphs; you should have additional information alongside your glyphs that tell where they are, how big they should be, etc.
Consider the letter "i". Depending on the font, there will be some space to the left and right of it. Even if you have a tool that can identify the glyph "i", you would need to know how many pixels of space the font put to the left and right of the glyph. This is pretty much impossible to do accurately for all letters without some very advanced computer vision algorithms. And since the tool that generated those glyphs already knew how much spacing they were supposed to get, you would be better off changing the tool to write the glyph info as well.
You can use PIL to help you automate the process.
Assuming there is at least one row/column of background colour separating lines/characters, you can use the Image.crop method to check each row (and then each column of the row) if it contains only the background colour; thus you get the borders of each character.
Ping if you need further assistance.
