Problems using Matlab.Engine from Python to read Arrays

I have the following problem. I want to read a .txt file in Python and use the variables in MATLAB afterwards. I've written a Python script that reads the file line by line and extracts the values that follow the names I tell it to search for.
At this point, the script stores the values as str. To transfer a value to my MATLAB workspace via matlab.engine, I define it, for example, as
workspace['A'] = float(A)
and this works well. The problem I'm facing is the handling of vectors, which my script also stores as str. In the file they are stored as {1, 2, 3}. I replace the { with [, but in the end I'm not able to set them up correctly so that they appear as an [a x b] double variable in my workspace.
None of the conversions I have tried so far has worked. Maybe one of you has already had this kind of problem.
I have tried converting the vectors in several ways, e.g. using numpy.
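One possible approach, sketched below, is to parse the brace notation into nested lists of floats first and only then hand the result to MATLAB. The function names parse_braced_vector and to_workspace are made up for illustration, and the sketch assumes that eng is an engine object obtained from matlab.engine.start_matlab() and that matlab.double accepts a list of rows.

```python
import ast


def parse_braced_vector(text):
    """Turn a string such as '{1, 2, 3}' or '{{1, 2}, {3, 4}}' into rows of floats."""
    # Swap the braces for brackets so the text becomes a Python list literal.
    literal = text.strip().replace("{", "[").replace("}", "]")
    values = ast.literal_eval(literal)
    # A flat list is treated as a single row, giving a 1 x N matrix.
    if not values or not isinstance(values[0], list):
        values = [values]
    return [[float(v) for v in row] for row in values]


def to_workspace(eng, name, text):
    """Store the parsed values in the MATLAB workspace as a double array."""
    # Assumption: the MATLAB Engine API for Python is installed.
    import matlab

    eng.workspace[name] = matlab.double(parse_braced_vector(text))


rows = parse_braced_vector("{1, 2, 3}")
print(rows)  # [[1.0, 2.0, 3.0]]
```

Keeping the parsing separate from the engine call makes it possible to check the parsed rows in plain Python before anything is sent to MATLAB.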

Related

How can I make a list that contains various types of elements in Python?

I have the following parameters in a Python file that is used to send commands pertaining to boundary conditions to Abaqus:
u1=0.0,
u2=0.0,
u3=0.0,
ur1=UNSET,
ur2=0.0,
ur3=UNSET
I would like to place these values inside a list and print that list to a .txt file. I figured I should convert all contents to strings:
List = [str(u1), str(u2), str(u3), str(ur1), str(ur2), str(ur3)]
This works only as long as the list does not contain UNSET, which is a command used by Abaqus and is neither an int nor a str. Any ideas how to deal with that? Many thanks!

UNSET is an Abaqus/cae defined symbolic constant. It has a member name that returns the string representation, so you might do something like this:
def tostring(v):
    try:
        return v.name
    except AttributeError:
        return str(v)
then do for example
bc= [0.,1,UNSET]
print "u1=%s u2=%s u3=%s\n"%tuple([tostring(b) for b in bc])
u1=0.0 u2=1 u3=UNSET
EDIT simpler than that. After doing things the hard way I realize the symbolic constant is handled properly by the string conversion so you can just do this:
print "u1=%s u2=%s u3=%s\n"%tuple(['%s'%b for b in bc])
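Writing the values to a .txt file, as the question asks, could look like the sketch below. Outside Abaqus the UNSET constant is not available, so the class FakeSymbolicConstant is a hypothetical stand-in that only mimics the name attribute mentioned above; inside Abaqus/CAE you would use the real constant instead. The helper names and the file name are likewise illustrative.

```python
class FakeSymbolicConstant(object):
    """Hypothetical stand-in for an Abaqus symbolic constant such as UNSET."""

    def __init__(self, name):
        self.name = name


def to_text(value):
    # Use the constant's name when there is one, otherwise plain str().
    return getattr(value, "name", str(value))


def format_line(labels, values):
    """Return a single 'label=value' line for the given boundary conditions."""
    return " ".join("%s=%s" % (label, to_text(v)) for label, v in zip(labels, values))


def write_line(path, line):
    """Write one line of text to the file at path."""
    with open(path, "w") as handle:
        handle.write(line + "\n")


UNSET = FakeSymbolicConstant("UNSET")  # assumption: the real constant inside Abaqus
labels = ["u1", "u2", "u3", "ur1", "ur2", "ur3"]
values = [0.0, 0.0, 0.0, UNSET, 0.0, UNSET]
line = format_line(labels, values)
print(line)  # u1=0.0 u2=0.0 u3=0.0 ur1=UNSET ur2=0.0 ur3=UNSET
# To save it: write_line("boundary_conditions.txt", line)
```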

How to write back to a PDB file after doing Superimposer for atoms of a protein in PDB.BIO python

I read and extracted atom information from a PDB file and used Superimposer() to align a mutant to the wild type. How can I write the aligned atom coordinates back to a PDB file? I tried to use the PDBIO() class, but it doesn't work since it doesn't accept a list as input. Does anyone have an idea how to do it?
from Bio.PDB import PDBParser, Superimposer
mutantAtoms = []
mutantStructure = PDBParser().get_structure("name", pdbFile)
mutantChain = mutantStructure[0]["B"]
# Extract the atoms of every residue in the chain
for residue in mutantChain:
    for atom in residue:
        mutantAtoms.append(atom)
# Do alignment (wildtypeAtoms is built the same way from the wild-type file)
si = Superimposer()
si.set_atoms(wildtypeAtoms, mutantAtoms)
si.apply(mutantAtoms)
Now mutantAtoms holds the atoms aligned to the wild-type atoms. I need to write this information to a PDB file. My question is how to convert the list of aligned atoms to a structure and use PDBIO(), or some other way, to write it to a PDB file.

The example for PDBIO in the Biopython documentation looks like this:
from Bio.PDB import PDBParser, PDBIO
p = PDBParser()
s = p.get_structure("1fat", "1fat.pdb")
io = PDBIO()
io.set_structure(s)
io.save("out.pdb")
It seems that PDBIO needs an object of class Structure to work, which is in principle what I understand Superimposer works with. When you say it does not accept a list, do you mean you have a list of structures? In that case you could simply iterate through the structures, saving each one under its own file name so that the files do not overwrite each other:
for i, s in enumerate(my_results_list):
    io.set_structure(s)
    io.save("out_%d.pdb" % i)
If what you have is a list of atoms, I guess you could create a Structure object from it and then pass it to PDBIO.
However, it is difficult to say more without knowing more about your problem. You could add to your question the lines of code where the problem occurs.
Edit: Now I understand better what you want to do. The Biopython Structural Bioinformatics FAQ has some information about the Structure class, which is apparently a little complex. At first sight, I do not see a very easy way to create Structure objects from scratch, but what you could do is modify the structure you get from PDBParser, substituting its atoms with the result you get from Superimposer, and then write the .pdb file from that modified structure. So you could try to put your mutantAtoms list back into the mutantStructure object you already have.
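A possible shortcut is sketched below. As far as I know, Superimposer.apply() transforms the Atom objects in place, so if the atoms come directly from the parsed structure, that structure already carries the aligned coordinates and can be handed to PDBIO as it is. The function name and its parameters are made up for illustration, and the sketch assumes that both atom lists have the same length.

```python
def save_aligned_structure(moving_structure, fixed_atoms, moving_atoms, out_path):
    """Superimpose moving_atoms onto fixed_atoms and write moving_structure to out_path.

    fixed_atoms and moving_atoms must be equally long lists of Bio.PDB Atom
    objects; moving_atoms should belong to moving_structure.
    """
    # Imported here so that Biopython is only required when the function is called.
    from Bio.PDB import PDBIO, Superimposer

    si = Superimposer()
    si.set_atoms(fixed_atoms, moving_atoms)  # computes rotation and translation
    # Move every atom of the structure, not only the ones used for the fit.
    si.apply(list(moving_structure.get_atoms()))

    io = PDBIO()
    io.set_structure(moving_structure)  # the structure now holds the new coordinates
    io.save(out_path)
    return si.rms  # RMSD of the superposition
```

With the names from the question, the call would be save_aligned_structure(mutantStructure, wildtypeAtoms, mutantAtoms, "aligned.pdb").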

Reading multidimensional array data into Python

I have data in the form of a 10000x500 matrix contained in a .txt file. In each row, data points are separated by a single whitespace, and each row ends with a newline.
Normally I am able to read this kind of multidimensional array data into Python with the following snippet:
import numpy as np
with open("position.txt") as f:
    data = [line.split() for line in f]
# Get the data and convert to floats
ytemp = np.array(data)
y = ytemp.astype(float)
This code has worked until now. When I use the exact same code with another data set formatted in the same way, I get the following error:
setting an array element with a sequence.
When I try to get the 'shape' of ytemp, it gives me the following:
(10001,)
So it converts the rows to array, but not the columns.
I can't think of any other information to include. Basically I'm trying to convert my data from a .txt file to a multidimensional array in Python. The code worked before, but now, for some reason that is unclear to me, it doesn't. I tried to compare the data; of course it's huge, but the working and non-working data sets look quite similar.
I would be more than happy to provide any other information you may need. Thanks in advance.

Use numpy's built-in function:
import numpy
data = numpy.loadtxt('position.txt')
Check out the documentation to explore other available options.
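The shape (10001,) in the question suggests that the rows do not all have the same number of columns, for example because of an extra or truncated line; in that case numpy cannot build a 2-D array, and numpy.loadtxt would probably complain as well. A minimal sketch for locating such lines follows. The function name is made up for illustration, and treating the most frequent column count as the expected one is an assumption.

```python
from collections import Counter


def find_ragged_rows(lines):
    """Return (line_number, column_count) for rows whose column count is unusual."""
    counts = [len(line.split()) for line in lines]
    if not counts:
        return []
    # Treat the most frequent column count as the expected row width.
    expected = Counter(counts).most_common(1)[0][0]
    return [(number, n) for number, n in enumerate(counts, start=1) if n != expected]


sample = ["1 2 3", "4 5 6", "7 8", "", "9 10 11"]
print(find_ragged_rows(sample))  # [(3, 2), (4, 0)]

# For the file from the question:
# with open("position.txt") as f:
#     print(find_ragged_rows(f))
```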

error with gdalbuildvrt, in Python

I am new to python/GDAL and am running into perhaps a trivial issue. This may stem from the fact that I don't really understand how to use GDAL properly in python, or something careless, but even though I think I am following the help doc, I keep getting a syntax error when trying to use "gdalbuildvrt".
What I want to do is take several (amount varies for each set, call it N) geotagged 1-band binary rasters [all values are either 0 or 1] of different sizes (each raster in the set overlaps for the most part though), and "stack" them on top of each other so that they are aligned properly according to their coordinate information. I want this "stack" simply so I can sum the values and produce a 'total' tiff that has an extent to match the exclusive extent (meaning not just the overlap region) of all the original rasters. The resulting tiff would have values ranging from 0 to N, to represent the total number of "hits" the pixel in that location received over the course of the N rasters.
I was led to gdalbuildvrt [http://www.gdal.org/gdalbuildvrt.html] and after reading about it, it seemed that by using the keyword -separate, I would be able to achieve what I need. However, each time I try to run my program, I get a syntax error. The following shows two of the several different ways I tried calling gdalbuildvrt:
gdalbuildvrt -separate -input_file_list stack.vrt inputlist.txt
gdalbuildvrt -separate stack.vrt inclassfiles
Where inputlist.txt is a text file with a path to the tif on every line, just like the help doc specifies. And inclassfiles is a python list of the pathnames. Every single time, no matter which way I call it, I get a syntax error on the first word after the keywords (i.e. 'inputlist' in inputlist.txt, or 'stack' in stack.vrt).
Could someone please shed some light on what I might be doing wrong? Alternatively, does anyone know how else I could use python to get what I need?
Thanks so much.

gdalbuildvrt is a GDAL command-line utility. From your example it's a bit unclear how you actually run it, but when running it from within Python you should execute it as a subprocess.
Also, in your first line the .vrt and the .txt are in the wrong order. The text file containing the list of files should follow directly after -input_file_list.
From within Python you can call gdalbuildvrt like:
import os
os.system('gdalbuildvrt -separate -input_file_list inputlist.txt stack.vrt')
Note that the command is provided as a string. Using a Python list with the files can be done with something like:
os.system('gdalbuildvrt -separate stack.vrt %s' % ' '.join(data))
The ' '.join(data) part converts the list to a string with a space between the items.
Depending on how your GDAL is built, it's sometimes possible to use wildcards as well:
os.system('gdalbuildvrt -separate stack.vrt *.tif')
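An alternative to os.system is the subprocess module, which takes the arguments as a list and so avoids building the command string by hand. The sketch below assumes Python 3.5 or later and that gdalbuildvrt is on the PATH; the function names are made up for illustration.

```python
import subprocess


def build_vrt_command(vrt_path, input_files):
    """Return the gdalbuildvrt argument list for stacking rasters as separate bands."""
    # Passing arguments as a list avoids shell quoting problems with
    # paths that contain spaces.
    return ["gdalbuildvrt", "-separate", vrt_path] + list(input_files)


def build_vrt(vrt_path, input_files):
    """Run gdalbuildvrt; raises CalledProcessError if the tool reports failure."""
    subprocess.run(build_vrt_command(vrt_path, input_files), check=True)


print(build_vrt_command("stack.vrt", ["a.tif", "b.tif"]))
# ['gdalbuildvrt', '-separate', 'stack.vrt', 'a.tif', 'b.tif']
```

With the list from the question, the call would be build_vrt("stack.vrt", inclassfiles).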

reading a binary file in python

I have to read a binary file in python. This is first written by a Fortran 90 program in this way:
open(unit=10,file=filename,form='unformatted')
write(10)table%n1,table%n2
write(10)table%nH
write(10)table%T2
write(10)table%cool
write(10)table%heat
write(10)table%cool_com
write(10)table%heat_com
write(10)table%metal
write(10)table%cool_prime
write(10)table%heat_prime
write(10)table%cool_com_prime
write(10)table%heat_com_prime
write(10)table%metal_prime
write(10)table%mu
if (if_species_abundances) write(10)table%n_spec
close(10)
I can easily read this binary file with the following IDL code:
n1=161L
n2=101L
openr,1,file,/f77_unformatted
readu,1,n1,n2
print,n1,n2
spec=dblarr(n1,n2,6)
metal=dblarr(n1,n2)
cool=dblarr(n1,n2)
heat=dblarr(n1,n2)
metal_prime=dblarr(n1,n2)
cool_prime=dblarr(n1,n2)
heat_prime=dblarr(n1,n2)
mu =dblarr(n1,n2)
n =dblarr(n1)
T =dblarr(n2)
Teq =dblarr(n1)
readu,1,n
readu,1,T
readu,1,Teq
readu,1,cool
readu,1,heat
readu,1,metal
readu,1,cool_prime
readu,1,heat_prime
readu,1,metal_prime
readu,1,mu
readu,1,spec
print,spec
close,1
What I want to do is read this binary file with Python, but there are some problems.
First of all, here is my attempt to read the file:
import numpy
from numpy import *
import struct
file='name_of_my_file'
with open(file,mode='rb') as lines:
    c=lines.read()
I try to read the first two variables:
dummy, n1, n2, dummy = struct.unpack('iiii',c[:16])
But as you can see, I had to add two dummy variables because, somehow, the Fortran program adds the integer 8 in those positions.
The problem now arises when trying to read the remaining bytes. I don't get the same result as the IDL program.
Here is my attempt to read the array n:
double = 8
end = 16+n1*double
nH = struct.unpack('d'*n1,c[16:end])
However, when I print this array I get nonsense values. I can read the file with the IDL code above, so I know what to expect. So my question is: how can I read this file when I don't know its exact structure? Why is it so simple to read with IDL? I need to read this data set with Python.

What you're looking for is the struct module.
This module allows you to unpack data from strings, treating it like binary data.
You supply a format string and your file contents, and it consumes the data, returning Python values.
For example, using your variables:
import struct
f = open('name_of_my_file', 'rb')
content = f.read() #I'm not sure why in a binary file you were using "readlines",
                   #but if this is too much data, you can supply a size to read()
n, T, Teq, cool = struct.unpack("dddd",content[:32])
This will make n, T, Teq, and cool hold the first four doubles in your binary file. Of course, this is just a demonstration. Your example looks like it wants lists of doubles - conveniently struct.unpack returns a tuple, which I take for your case will still work fine (if not, you can listify them). Keep in mind that struct.unpack needs to consume the whole string passed into it - otherwise you'll get a struct.error. So, either slice your input string, or only read the number of characters you'll use, like I said above in my comment.
For example,
n_content = f.read(8*number_of_ns) #8, because doubles are 8 bytes
n = struct.unpack("d"*number_of_ns,n_content)
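The two "dummy" integers in the question look like the record markers that Fortran compilers commonly write around every record of an unformatted sequential file: a byte count before the payload and the same count after it. The value 8 fits the first record (two 4-byte integers). If that is the case here, the payload of the second record starts at byte 20 rather than 16, which would explain the nonsense values. The sketch below reads one record at a time with struct. It assumes 4-byte markers, 4-byte integers and native byte order, none of which is guaranteed by the Fortran standard, and the function names are made up for illustration.

```python
import io
import struct


def read_record(f):
    """Read one Fortran unformatted sequential record and return its payload bytes."""
    head = f.read(4)
    if len(head) < 4:
        raise EOFError("no more records")
    (nbytes,) = struct.unpack("i", head)      # leading marker: payload size in bytes
    payload = f.read(nbytes)
    (tail,) = struct.unpack("i", f.read(4))   # trailing marker repeats the size
    if tail != nbytes:
        raise ValueError("record markers differ: %d != %d" % (nbytes, tail))
    return payload


def read_doubles(f):
    """Read one record and interpret it as a flat tuple of 8-byte reals."""
    payload = read_record(f)
    return struct.unpack("%dd" % (len(payload) // 8), payload)


def make_record(payload):
    """Wrap payload in record markers; only used here to create demo data."""
    marker = struct.pack("i", len(payload))
    return marker + payload + marker


# Demo on an in-memory file: one record with n1, n2 and one with three doubles.
demo = io.BytesIO(
    make_record(struct.pack("ii", 3, 2))
    + make_record(struct.pack("3d", 0.5, 1.5, 2.5))
)
n1, n2 = struct.unpack("ii", read_record(demo))
nH = read_doubles(demo)
print(n1, n2, nH)  # 3 2 (0.5, 1.5, 2.5)
```

For the real file you would open it with open(filename, 'rb') and call read_record once per write statement in the Fortran code. Two-dimensional tables come back as flat sequences in Fortran (column-major) order. If SciPy is available, scipy.io.FortranFile offers similar record-wise reading.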

Did you give scipy.io.readsav a try?
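Simply read your file like this:
import scipy.io
mydict = scipy.io.readsav('name_of_file')
Note that readsav reads IDL save files, so the data would first have to be saved from IDL with the SAVE command.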

It looks like you are trying to read the cooling_0000x.out file generated by RAMSES.
Note that the first two integers (n1, n2) provide the dimensions of the two-dimensional tables (arrays) that follow in the body of the file, so you need to process those two integers first before you know how much real*8 data is in the rest of the file.
scipy should be of help -- it lets you read arbitrary dimensioned binary data:
http://wiki.scipy.org/Cookbook/InputOutput#head-e35c7736718209eea00ebf37a7e1dfb91df696e1
If you already have this python code, please let me know as I was going to write it today (17Sep2014).
Rick
