R package Bio3D tutorial reproducing - python

I started working with the R package Bio3D
(http://thegrantlab.org/bio3d/index.php)
and ran into a problem while reproducing the examples from the "Protein Structure Networks with Bio3D" tutorial
(http://thegrantlab.org/bio3d/tutorials/protein-structure-networks).
Here is the fragment I am trying to reproduce:
"
The code snippet below first sets the file paths for the example HIVpr starting structure (pdbfile) and trajectory data (dcdfile), then reads these files (producing the objects dcd and pdb).
dcdfile <- system.file("examples/hivp.dcd", package = "bio3d")
pdbfile <- system.file("examples/hivp.pdb", package = "bio3d")
# Read MD data
dcd <- read.dcd(dcdfile)
pdb <- read.pdb(pdbfile)
inds <- atom.select(pdb, resno = c(24:27, 85:90), elety = "CA")
trj <- fit.xyz(fixed = pdb$xyz, mobile = dcd,
fixed.inds = inds$xyz, mobile.inds = inds$xyz)
Once we have the superposed trajectory frames we can assess the extent to which the atomic fluctuations of individual residues (in this very short example simulation) are correlated with one another and build a network from this data:
cij <- dccm(trj)
net <- cna(cij)
plot(net, pdb)
"
Up to this point everything works well.
# View the correlations in pymol
view.dccm(cij, pdb, launch = FALSE)
Here I open the generated PDB file corr.inpcrd with PyMOL.
But instead of a nice 3D cartoon model I see only the amino acid residues represented as dots.
I tried to solve the problem in PyMOL using the settings for cartoons, ribbons, colors, and transparency, and the show command, but nothing changed.
I would be grateful for your suggestions!
I do not have enough reputation to illustrate the expected and obtained outcomes with images, but I can probably send them directly if necessary.
Thank you!

Typically this will work if pymol is in your path for executables (see here: http://tinyurl.com/lzhpz3w for more about where bio3d expects to find pymol and muscle).
view.dccm(cij, pdb, launch = FALSE)
I don't use Windows myself, but if you post this question on the bio3d Bitbucket issues page https://bitbucket.org/Grantlab/bio3d/issues you will get help from experienced Windows bio3d users, including the author of this function.

Try setting launch=TRUE in your call to the view.dccm() function to have both PDB and pymol script loaded for you.


Extruding an STL from a binary png image with python

I have been working on code for a project at work that will (hopefully) take in images from a scanning electron microscope and generate 3D STL files of the structures we're imaging. I'm at the stage where I'm trying to generate a 3D structure from a 'coloured in' binary image I've made with some edge detection code I wrote. I came across the post "How can i extrude a stl with python", which does basically exactly what I need (generating a meshed 3D structure from a binary image). I've tried using and adapting the code in the answer to that post (see below), but I keep running into the following error: polyline2 = mr.distanceMapTo2DIsoPolyline(dm.value(), isoValue=127) RuntimeError: Bad expected access. I can't find anything online about why this is happening, and I'm no expert in Python, so I have no idea myself. If anyone has an idea, I'd really appreciate it!
Code from answer to above post:
import meshlib.mrmeshpy as mr
# load image as Distance Map object:
dm = mr.loadDistanceMapFromImage(mr.Path("your-image.png"), 0)
# find boundary contour of the letter:
polyline2 = mr.distanceMapTo2DIsoPolyline(dm.value(), isoValue=127)
# triangulate the contour
mesh = mr.triangulateContours(polyline2.contours2())
# extrude itself:
mr.addBaseToPlanarMesh(mesh, zOffset=30)
# export the result:
mr.saveMesh(mesh, mr.Path("output-mesh.stl"))
I have tried the following:
Reconfiguring the MeshLib package that this command uses. Package docs here: https://meshinspector.github.io/MeshLib/html/index.html#PythonIntegration
Updating Visual Studio, Python, and MeshLib.
In older versions of the meshlib Python module, RuntimeError: Bad expected access indicated that mr.loadDistanceMapFromImage had failed. You should have checked the result like this:
import meshlib.mrmeshpy as mr
# load image as Distance Map object:
dm = mr.loadDistanceMapFromImage(mr.Path("your-image.png"), 0)
# check dm
if not dm.has_value():
    raise Exception(dm.error())
# find boundary contour of the letter:
polyline2 = mr.distanceMapTo2DIsoPolyline(dm.value(), isoValue=127)
# triangulate the contour
mesh = mr.triangulateContours(polyline2.contours2())
# extrude itself:
mr.addBaseToPlanarMesh(mesh, zOffset=30)
# export the result:
mr.saveMesh(mesh, mr.Path("output-mesh.stl"))
In the current release, however, your code will raise an exception with the real error.
Please make sure that the path is correct. If that doesn't help, please provide more information, such as the PNG file, the Python version, the MeshLib version, and anything else you find relevant.
P.S. If there is a real problem with MeshLib, it is better to open an issue on GitHub.
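Since the first suggestion is to confirm that the path is correct, a small pre-check can rule that out before MeshLib is involved at all. This is only a sketch using the standard library; the helper name check_image_path is hypothetical and not part of MeshLib.

```python
from pathlib import Path


def check_image_path(image_path):
    """Return the resolved path, or raise a descriptive error if it is unusable."""
    path = Path(image_path).expanduser().resolve()
    if not path.is_file():
        # A wrong working directory is a common cause of "file not found".
        raise FileNotFoundError(f"No image file at {path}")
    if path.suffix.lower() != ".png":
        raise ValueError(f"Expected a .png file, got {path.suffix or 'no extension'}")
    return path


# Hypothetical usage with the code from the question:
# png = check_image_path("your-image.png")
# dm = mr.loadDistanceMapFromImage(mr.Path(str(png)), 0)
```

If this check passes and the error persists, the problem is more likely in the image content or the MeshLib version than in the path.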

Increase graph size in plantUML from python?

MWE
To generate PlantUML diagrams in (sub)folder: /Diagrams/ I use the following python script:
from plantuml import PlantUML
import os
from os.path import abspath
from shutil import copyfile
os.environ['PLANTUML_LIMIT_SIZE'] = str(4096 * 4) # set max width to 4 times the default (16,384)
server = PlantUML(url='http://www.plantuml.com/plantuml/img/',
basic_auth={},
form_auth={}, http_opts={}, request_opts={})
diagram_dir = "./Diagrams"
#directory = os.fsencode()
for file in os.listdir(diagram_dir):
    filename = os.fsdecode(file)
    if filename.endswith(".txt"):
        server.processes_file(abspath(f'./Diagrams/{filename}'))
It is used to generate for example the following test.txt file:
@startuml
'Enforce straight lines
skinparam linetype ortho
' Set direction of graph hierarchy
Left to Right direction
' create work package data
rectangle "something something something" as ffd0
rectangle "something something something" as ffd1
rectangle "something something something something something" as ffd2
rectangle "something something something something" as ffd3
rectangle "something something somethingsomethingsomething" as ffd4
rectangle "something something something something something something" as ffd5
rectangle "something something something something" as ffd6
rectangle "something something something " as ffd7
' Implement graph hierarchy
ffd0-->ffd1
ffd1-->ffd2
ffd2-->ffd3
ffd3-->ffd4
ffd4-->ffd5
ffd5-->ffd6
ffd6-->ffd7
@enduml
Expected behavior
Because I set the PLANTUML_LIMIT_SIZE variable to 16384 (pixels) as the FAQ suggests, I would expect this to fill up the picture of the diagram with all the blocks connected side by side up to a max width of 4096 * 4 pixels.
To test whether perhaps setting it from the python script was implemented incorrectly I also tried to set it manually with: set PLANTUML_LIMIT_SIZE=16384 to expect the same behavior as explained in the above paragraph (a picture filled up till 16384 pixels).
Observed behavior
Instead PlantUML cuts off the picture at 2000 pixels horizontally, as shown in the figure below:
Question
How can I ensure, from a Python script, that PlantUML does not cut off diagrams at n pixels (height or width)?
The best way I've found to prevent diagrams from being cut off, without trying to guess at the size or picking some arbitrarily large limit, is to select SVG output.
Note that setting PLANTUML_LIMIT_SIZE is only going to have an effect if you're running PlantUML locally, but it appears the Python interface you're using sends the diagram to the online service. I don't know the internals of that interface, but per the documentation you should be able to get SVG output by using http://www.plantuml.com/plantuml/svg/ as the service URL.
If you need the final image in PNG format, you will need to convert it with another tool.
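A minimal sketch of the SVG route with the same plantuml package as in the question. It assumes that processes_file accepts an outfile argument; without it the result may be saved with a .png extension even though the content is SVG. The helper names svg_jobs and render_all are illustrative only.

```python
from pathlib import Path

# Service URL taken from the answer above.
SVG_SERVICE_URL = "http://www.plantuml.com/plantuml/svg/"


def svg_jobs(diagram_dir):
    """Pair every .txt diagram source with a .svg output path next to it."""
    sources = sorted(Path(diagram_dir).glob("*.txt"))
    return [(src, src.with_suffix(".svg")) for src in sources]


def render_all(diagram_dir):
    """Send each diagram to the online service and save the SVG response."""
    from plantuml import PlantUML  # third-party package used in the question

    server = PlantUML(url=SVG_SERVICE_URL)
    for src, out in svg_jobs(diagram_dir):
        # Assumption: `outfile` is supported by processes_file.
        server.processes_file(str(src), outfile=str(out))


# Hypothetical usage: render_all("./Diagrams")
```

Because SVG is a vector format, the pixel limit that truncates PNG output should not come into play here.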
Approach 1:
To prevent the diagram from being cut off I followed the following steps:
Downloaded the plantuml.jar from this location http://sourceforge.net/projects/plantuml/files/plantuml.jar/download
Put the diagram which I wrote in a someLargeDiagram.txt file, in the same directory as the plantuml.jar file.
Opened terminal on Ubuntu 20.04 in that same directory and ran:
java -jar plantuml.jar -verbose someLargeDiagram.txt
That successfully generated the diagram as .png file, which was not cut off.
Approach 2:
After I created even larger graphs, they got cut off again, and PlantUML printed a message suggesting to increase PLANTUML_LIMIT_SIZE. I tried passing the size as an argument on the command line using: java -jar plantuml.jar -verbose -PLANTUML_LIMIT_SIZE=8192 Diagrams/latest.uml; however, that did not work, nor did ..-PLANTUML_LIMIT_SIZE 8192... This link suggested one could set it as an environment variable, so I did that in Ubuntu 20.04 using the command: export PLANTUML_LIMIT_SIZE=8192, after which I successfully created a larger diagram that was not cut off with the command:
java -jar plantuml.jar -verbose Diagrams/latest.uml
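Since the question asks for a way to do this from a Python script, the same environment-variable approach can be sketched with subprocess. This assumes java is on the PATH and plantuml.jar is in the working directory; the function names are illustrative.

```python
import os
import subprocess


def plantuml_command(diagram, jar="plantuml.jar", limit=8192):
    """Build the java command and an environment carrying the size limit."""
    # Copy the current environment and add the limit for the child process only.
    env = dict(os.environ, PLANTUML_LIMIT_SIZE=str(limit))
    cmd = ["java", "-jar", jar, "-verbose", diagram]
    return cmd, env


def render(diagram, **kwargs):
    """Run the local PlantUML jar; raises CalledProcessError on failure."""
    cmd, env = plantuml_command(diagram, **kwargs)
    subprocess.run(cmd, env=env, check=True)


# Hypothetical usage: render("Diagrams/latest.uml", limit=16384)
```

Passing env to subprocess.run keeps the setting local to that one call instead of changing the environment of the whole script.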

Can Matplotlib save a plot and re-open it after in its own GUI, without losing information? [duplicate]

Is there a way to save a Matplotlib figure such that it can be re-opened and have typical interaction restored? (Like the .fig format in MATLAB?)
I find myself running the same scripts many times to generate these interactive figures. Or I'm sending my colleagues multiple static PNG files to show different aspects of a plot. I'd rather send the figure object and have them interact with it themselves.
I just found out how to do this. The "experimental pickle support" mentioned by @pelson works quite well.
Try this:
# Plot something
import matplotlib.pyplot as plt
fig,ax = plt.subplots()
ax.plot([1,2,3],[10,-10,30])
After your interactive tweaking, save the figure object as a binary file:
import pickle
pickle.dump(fig, open('FigureObject.fig.pickle', 'wb')) # This is for Python 3 - py2 may need `file` instead of `open`
Later, open the figure and the tweaks should be saved and GUI interactivity should be present:
import pickle
figx = pickle.load(open('FigureObject.fig.pickle', 'rb'))
figx.show() # Show the figure, edit it, etc.!
You can even extract the data from the plots:
data = figx.axes[0].lines[0].get_data()
(It works for lines, pcolor & imshow - pcolormesh works with some tricks to reconstruct the flattened data.)
I got the excellent tip from Saving Matplotlib Figures Using Pickle.
As of Matplotlib 1.2, we now have experimental pickle support. Give that a go and see if it works well for your case. If you have any issues, please let us know on the Matplotlib mailing list or by opening an issue on github.com/matplotlib/matplotlib.
This would be a great feature, but AFAIK it isn't implemented in Matplotlib and likely would be difficult to implement yourself due to the way figures are stored.
I'd suggest either (a) separating the data processing from the figure generation, so that one script saves the processed data under a unique name and a second, figure-generating script loads a specified data file and can be edited as you see fit, or (b) saving in PDF/SVG/PostScript format and editing in a figure editor such as Adobe Illustrator (or Inkscape).
EDIT post Fall 2012: As others pointed out below (though mentioning here as this is the accepted answer), Matplotlib has allowed you to pickle figures since version 1.2. As the release notes state, it is an experimental feature and does not support saving a figure in one matplotlib version and opening it in another. It's also generally insecure to restore a pickle from an untrusted source.
For sharing/later editing plots (that require significant data processing first and may need to be tweaked months later say during peer review for a scientific publication), I still recommend the workflow of (1) have a data processing script that before generating a plot saves the processed data (that goes into your plot) into a file, and (2) have a separate plot generation script (that you adjust as necessary) to recreate the plot. This way for each plot you can quickly run a script and re-generate it (and quickly copy over your plot settings with new data). That said, pickling a figure could be convenient for short term/interactive/exploratory data analysis.
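The two-script workflow described above could look roughly like the sketch below. JSON is used here only as an example storage format, and the function names and dictionary keys are hypothetical.

```python
import json


def save_plot_data(path, t, s, labels):
    """Step 1 (processing script): store the processed data that goes into the plot."""
    payload = {
        "t": [float(v) for v in t],  # plain floats so NumPy values serialize too
        "s": [float(v) for v in s],
        "labels": labels,
    }
    with open(path, "w") as f:
        json.dump(payload, f)


def load_plot_data(path):
    """Step 2 (plot script): read the data back."""
    with open(path) as f:
        return json.load(f)


def make_plot(path):
    """Rebuild the figure from the saved data; tweak styling here as needed."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting

    data = load_plot_data(path)
    fig, ax = plt.subplots()
    ax.plot(data["t"], data["s"])
    ax.set_xlabel(data["labels"]["x"])
    ax.set_ylabel(data["labels"]["y"])
    return fig
```

Unlike a pickled figure, the saved data file does not depend on the Matplotlib version and is safe to open when it comes from someone else.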
Why not just send the Python script? MATLAB's .fig files require the recipient to have MATLAB to display them, so that's about equivalent to sending a Python script that requires Matplotlib to display.
Alternatively (disclaimer: I haven't tried this yet), you could try pickling the figure:
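import pickle
import matplotlib.pyplot as plt
# gcf() returns the current figure; the with block closes the file for us
with open('interactive figure.pickle', 'wb') as output:
    pickle.dump(plt.gcf(), output)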
Good question. Here is the doc text from pylab.save:
pylab no longer provides a save function, though the old pylab
function is still available as matplotlib.mlab.save (you can still
refer to it in pylab as "mlab.save"). However, for plain text
files, we recommend numpy.savetxt. For saving numpy arrays,
we recommend numpy.save, and its analog numpy.load, which are
available in pylab as np.save and np.load.
I figured out a relatively simple way (yet slightly unconventional) to save my matplotlib figures. It works like this:
import sys
import libscript
import matplotlib.pyplot as plt
import numpy as np
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2*np.pi*t)
#<plot>
plt.plot(t, s)
plt.xlabel('time (s)')
plt.ylabel('voltage (mV)')
plt.title('About as simple as it gets, folks')
plt.grid(True)
plt.show()
#</plot>
save_plot(fileName='plot_01.py',obj=sys.argv[0],sel='plot',ctx=libscript.get_ctx(ctx_global=globals(),ctx_local=locals()))
with function save_plot defined like this (simple version to understand the logic):
def save_plot(fileName='',obj=None,sel='',ctx={}):
    """
    Save a matplotlib plot to a stand-alone python script containing all the data and configuration instructions to regenerate the interactive matplotlib figure.
    Parameters
    ----------
    fileName : [string] Path of the python script file to be created.
    obj : [object] Function or python object containing the lines of code to create and configure the plot to be saved.
    sel : [string] Name of the tag enclosing the lines of code to create and configure the plot to be saved.
    ctx : [dict] Dictionary containing the execution context. Values for variables not defined in the lines of code for the plot will be fetched from the context.
    Returns
    -------
    Return ``'done'`` once the plot has been saved to a python script file. This file contains all the input data and configuration to re-create the original interactive matplotlib figure.
    """
    import os
    import libscript
    N_indent=4
    src=libscript.get_src(obj=obj,sel=sel)
    src=libscript.prepend_ctx(src=src,ctx=ctx,debug=False)
    src='\n'.join([' '*N_indent+line for line in src.split('\n')])
    if(os.path.isfile(fileName)): os.remove(fileName)
    with open(fileName,'w') as f:
        f.write('import sys\n')
        f.write('sys.dont_write_bytecode=True\n')
        f.write('def main():\n')
        f.write(src+'\n')
        f.write('if(__name__=="__main__"):\n')
        f.write(' '*N_indent+'main()\n')
    return 'done'
or defining function save_plot like this (better version using zip compression to produce lighter figure files):
def save_plot(fileName='',obj=None,sel='',ctx={}):
    import os
    import json
    import zlib
    import base64
    import libscript
    N_indent=4
    level=9#0 to 9, default: 6
    src=libscript.get_src(obj=obj,sel=sel)
    obj=libscript.load_obj(src=src,ctx=ctx,debug=False)
    # Python 3: zlib needs bytes, and the base64 payload is written out as text
    bin=base64.b64encode(zlib.compress(json.dumps(obj).encode('utf-8'),level)).decode('ascii')
    if(os.path.isfile(fileName)): os.remove(fileName)
    with open(fileName,'w') as f:
        f.write('import sys\n')
        f.write('sys.dont_write_bytecode=True\n')
        f.write('def main():\n')
        f.write(' '*N_indent+'import base64\n')
        f.write(' '*N_indent+'import zlib\n')
        f.write(' '*N_indent+'import json\n')
        f.write(' '*N_indent+'import libscript\n')
        f.write(' '*N_indent+'bin="'+str(bin)+'"\n')
        f.write(' '*N_indent+'obj=json.loads(zlib.decompress(base64.b64decode(bin)))\n')
        f.write(' '*N_indent+'libscript.exec_obj(obj=obj,tempfile=False)\n')
        f.write('if(__name__=="__main__"):\n')
        f.write(' '*N_indent+'main()\n')
    return 'done'
This makes use of a module of my own, libscript, which mostly relies on the modules inspect and ast. I can try to share it on GitHub if interest is expressed (it would first require some cleanup, and I would need to get started with GitHub).
The idea behind this save_plot function and libscript module is to fetch the python instructions that create the figure (using module inspect), analyze them (using module ast) to extract all variables, functions and modules import it relies on, extract these from the execution context and serialize them as python instructions (code for variables will be like t=[0.0,2.0,0.01] ... and code for modules will be like import matplotlib.pyplot as plt ...) prepended to the figure instructions. The resulting python instructions are saved as a python script whose execution will re-build the original matplotlib figure.
As you can imagine, this works well for most (if not all) matplotlib figures.
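The libscript module itself is not shown, so the following is only a rough illustration of the analysis step described above, not the author's implementation. It uses ast to list the names a plot snippet reads but never assigns, which are the values that would have to be fetched from the execution context. The function name is hypothetical.

```python
import ast


def names_needed_from_context(source):
    """List names that the snippet reads but does not assign itself."""
    tree = ast.parse(source)
    assigned, loaded = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    # Simplification: statement order is ignored and builtins are not filtered out.
    return sorted(loaded - assigned)


snippet = "plt.plot(t, s)\nplt.grid(True)"
print(names_needed_from_context(snippet))  # ['plt', 's', 't']
```

A real implementation would also need to separate modules from data and serialize each appropriately, as the description above notes.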
If you are looking to save python plots as an interactive figure to modify and share with others like MATLAB .fig file then you can try to use the following code. Here z_data.values is just a numpy ndarray and so you can use the same code to plot and save your own data. No need of using pandas then.
The file generated here can be opened and interactively modified by anyone with or without python just by clicking on it and opening in browsers like Chrome/Firefox/Edge etc.
import plotly.graph_objects as go
import pandas as pd
z_data=pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/api_docs/mt_bruno_elevation.csv')
fig = go.Figure(data=[go.Surface(z=z_data.values)])
fig.update_layout(title='Mt Bruno Elevation', autosize=False,
width=500, height=500,
margin=dict(l=65, r=50, b=65, t=90))
fig.show()
fig.write_html("testfile.html")

How to use lattice in Rpy2 and save result to pdf?

I am following the documentation for rpy2 here (http://rpy.sourceforge.net/rpy2/doc-2.1/html/graphics.html?highlight=lattice). I can successfully plot interactively using lattice from rpy2, e.g.:
iris = r('iris')
p = lattice.xyplot(Formula("Petal.Length ~ Petal.Width"),
data=iris)
rprint = robj.globalenv.get("print")
rprint(p)
rprint displays the graph. However, when I try to save the graph to pdf by first doing:
r.pdf("myfile.pdf")
and then my lattice calls, it does not work and instead results in an empty pdf. If I do the same (call r.pdf, then plot) with ggplot2 or with the R base, then I get a working pdf. Does lattice require anything special from within Rpy2 to save the results to a PDF file? The following does not work either:
iris = r('iris')
r.pdf("myfile.pdf")
grdevices = importr('grDevices')
p = lattice.xyplot(Formula("Petal.Length ~ Petal.Width"),
data=iris)
rprint = robj.globalenv.get("print")
rprint(p)
grdevices.dev_off()
Thank you.
You need some equivalent of dev.off() after the print command.
That is, in order to save your graphs to pdf, the general outline is:
pdf(...)
print(....)
dev.off()
Failing to call dev.off() will result in an empty pdf file.
From this source, it appears that the equivalent in rpy2 might be
grdevices.dev_off()
The solution is to use:
robjects.r["dev.off"]()
For some reason the other variants do not do the trick.

How to get the diff of two PDF files using Python?

I need to find the difference between two PDF files. Does anybody know of any Python-related tool which has a feature that directly gives the diff of the two PDFs?
What do you mean by "difference"? A difference in the text of the PDF, or some layout change (e.g. an embedded graphic was resized)? The first is easy to detect; the second is almost impossible to get (PDF is a VERY complicated file format that offers endless formatting capabilities).
If you want to get the text diff, just run a PDF-to-text utility on the two PDFs and then use Python's built-in difflib module to get the difference of the converted texts.
This question deals with pdf to text conversion in python: Python module for converting PDF to text.
The reliability of this method depends on the PDF Generators you are using. If you use e.g. Adobe Acrobat and some Ghostscript-based PDF-Creator to make two PDFs from the SAME word document, you might still get a diff although the source document was identical.
This is because there are dozens of ways to encode the information of the source document to a PDF and each converter uses a different approach. Often the pdf to text converter can't figure out the correct text flow, especially with complex layouts or tables.
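As a sketch of the text-diff step, assuming the text of both PDFs has already been extracted into strings by whatever converter you choose (the file names are placeholders):

```python
import difflib


def text_diff(old_text, new_text, old_name="old.pdf", new_name="new.pdf"):
    """Return unified-diff lines between two extracted PDF texts."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # input lines carry no newline, so headers should not either
        )
    )


for line in text_diff("Invoice 1\nTotal: 10", "Invoice 1\nTotal: 12"):
    print(line)
```

An empty list means the extracted texts are identical, which, given the caveats above, does not guarantee that the PDFs look the same.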
I do not know your use case, but for regression tests of a script that generates PDFs using reportlab, I diff the PDFs by
Converting each page to an image using Ghostscript
Diffing each page against the page image of a reference PDF, using PIL
e.g.
from PIL import Image, ImageChops
im1 = Image.open(imagePath1)
im2 = Image.open(imagePath2)
imDiff = ImageChops.difference(im1, im2)
This works in my case for flagging any changes introduced due to code changes.
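To turn the difference image into a pass/fail signal, one option is to ask Pillow for the bounding box of the non-zero pixels. This is a sketch, not necessarily how the answer's author does it; the function name is made up, and both images are converted to RGB first so that an alpha channel does not affect the bounding box.

```python
from PIL import Image, ImageChops


def changed_region(image_a, image_b):
    """Return the bounding box of differing pixels, or None if the images match."""
    a = image_a.convert("RGB")
    b = image_b.convert("RGB")
    if a.size != b.size:
        raise ValueError("page images must have the same size")
    # difference() is zero wherever the pages agree; getbbox() is None if all zero.
    return ImageChops.difference(a, b).getbbox()


# Hypothetical usage with two rendered page images:
# box = changed_region(Image.open(imagePath1), Image.open(imagePath2))
# if box is not None:
#     print("pages differ inside", box)
```

The returned box is (left, upper, right, lower), which can also be used to crop the changed area for inspection.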
I ran into the same question in a unit test for an encrypted PDF; neither pdfminer nor pyPdf worked well for me.
Two commands, pdftocairo and pdftotext, worked perfectly in my test. (Ubuntu install: apt-get install poppler-utils)
You can get the PDF content with:
from subprocess import Popen, PIPE

def get_formatted_content(pdf_content):
    cmd = 'pdftocairo -pdf - -' # you can replace "pdftocairo -pdf" with "pdftotext" if you want to get diff info
    ps = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    stdout, stderr = ps.communicate(input=pdf_content)
    if ps.returncode != 0:
        raise OSError(ps.returncode, cmd, stderr)
    return stdout
It seems that pdftocairo can redraw PDF files, while pdftotext can extract all the text.
And then you can compare two pdf files:
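with open('f1.pdf', 'rb') as f1, open('f2.pdf', 'rb') as f2: # read the PDFs as bytes
    c1 = get_formatted_content(f1.read())
    c2 = get_formatted_content(f2.read())
print(c1 == c2) # binary compare: True if identical (cmp() no longer exists in Python 3)
# import difflib
# print(list(difflib.unified_diff(c1.decode().splitlines(), c2.decode().splitlines()))) # for text compare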
Even though this question is quite old, my guess is that I can contribute to the topic.
We have several applications generating tons of PDFs. One of these apps is written in Python and recently I wanted to write integration tests to check if the PDF generation was working correctly.
Testing PDF generation is HARD, because the PDF spec is very complicated and the output is non-deterministic. Two PDFs generated with exactly the same input data will be different files, so direct file comparison is ruled out.
The solution: we have to test how they look (because THAT should be deterministic!).
In our case, the PDFs are being generated with the reportlab package, but this doesn't matter from the test perspective, we just need a filename or the PDF blob (bytes) from the generator. We also need an expectation file containing a "good" PDF to compare with the one coming from the generator.
The PDFs are converted to images and then compared. This can be done in multiple ways, but we decided to use ImageMagick, because it is extremely versatile and very mature, with bindings for almost every programming language out there. For Python 3, the bindings are offered by the Wand package.
The test looks something like the following. Specific details of our implementation were removed and the example was simplified:
import os
from unittest import TestCase
from wand.image import Image
from app.generators.pdf import PdfGenerator

DIR = os.path.dirname(__file__)

class PdfGeneratorTest(TestCase):
    def test_generated_pdf_should_match_expectation(self):
        # `pdf` is the blob of the generated PDF
        # If using reportlab, this is what you get calling `getpdfdata()`
        # on a Canvas instance, after all the drawing is complete
        pdf = PdfGenerator().generate()
        # PDFs are vectorial, so we need to set a resolution when
        # converting to an image
        actual_img = Image(blob=pdf, resolution=150)
        filename = os.path.join(DIR, 'expected.pdf')
        # Make sure to use the same resolution as above
        with Image(filename=filename, resolution=150) as expected:
            # compare() returns a (difference image, metric value) pair
            diff = actual_img.compare(expected, metric='root_mean_square')
            self.assertLess(diff[1], 0.01)
The 0.01 threshold is the largest difference we tolerate. Considering that diff[1] ranges from 0 to 1 with the root_mean_square metric, we are accepting a difference of up to 1% across all channels compared with the expected sample file.
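To get a feel for what a threshold of 0.01 means, here is a plain-Python illustration of a root-mean-square difference normalized to the 0 to 1 range. It is a simplified model over a flat list of 8-bit values and is not claimed to match ImageMagick's computation exactly.

```python
import math


def normalized_rmse(pixels_a, pixels_b, max_value=255):
    """Root-mean-square difference of two equal-length value lists, scaled to 0..1."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("inputs must have the same length")
    if not pixels_a:
        return 0.0
    squared = sum(((a - b) / max_value) ** 2 for a, b in zip(pixels_a, pixels_b))
    return math.sqrt(squared / len(pixels_a))


# One fully inverted value among 10,000 otherwise identical ones sits right at 0.01.
reference = [255] * 10_000
changed = [0] + [255] * 9_999
print(normalized_rmse(reference, changed))
```

Under this model a few strongly different pixels can stay below the threshold just as a faint overall shift can, so it is worth tuning the value against known-bad samples.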
Check this out, it can be useful: pypdf
