Writing to .ply files using Python

I am using the PlyFile library (https://pypi.org/project/plyfile) in Python.
I used vertex = plydata['vertex'] to get the list of vertex coordinates. The datatype is as follows:
PlyElement('vertex', (PlyProperty('x', 'float'), PlyProperty('y', 'float'), PlyProperty('z', 'float')), count=2500086, comments=[])
After that I generated a list of all the x-axis values and put them into a numpy array and performed some operations on them. Then I replaced the original x-axis values in plydata['vertex'] with these new ones.
Now I want to write these values into a .ply file and create a new mesh. How would I go about it? I tried going through the docs but the code is quite messy.
Any insights would help, Thanks!

Saving a .ply file:
ply_data: PlyData          # an existing PlyData instance
ply_data.text = True       # True for ASCII format, False for binary
ply_data.write(path)
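A minimal sketch of the full round trip described in the question (the input/output file names are illustrative; reading, per-property access via .data, and writing follow the plyfile documentation):
import numpy as np
from plyfile import PlyData

plydata = PlyData.read('input.ply')              # read the existing mesh

x = np.asarray(plydata['vertex'].data['x'])      # all x values as a numpy array
x = x * 2.0                                      # stand-in for whatever operation you performed
plydata['vertex'].data['x'] = x                  # put the modified values back into the element

plydata.text = True                              # True -> ASCII output, False -> binary
plydata.write('output.ply')                      # write the new mesh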


How do you save textnets (python) to gml / gexf or access dataframe of graph?

I have been using textnets (python) to analyse a corpus. I need to export the resulting graph for further analysis / layout editing in Gephi. Having read the docs, I am still confused about how to either save the resulting igraph Graph in an appropriate format or access the pandas DataFrame, which could then be exported. For example, using the tutorial from the docs:
from textnets import Corpus, Textnet
from textnets import examples
corpus = Corpus(examples.moon_landing)
tn = Textnet(corpus.tokenized(), min_docs=1)
print(tn)
I had thought I could return a pandas DataFrame by calling 'tn', but this returns a 'Textnet' object.
I had also thought I could get an igraph.Graph object and then use Graph.write_gml(), with something like tn.project(node_type='doc').write_gml('test.gml'), to save the file in an appropriate format, but this returns a ProjectedTextnet.
Any advice would be most welcome.
For the second part of your question, you can convert the textnet object to an igraph:
g = tn.graph
Then save as gml:
g.write_gml("test.gml")
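Putting that together with the tutorial snippet from the question (output file names are illustrative; the projection step assumes a ProjectedTextnet exposes the same .graph attribute):
from textnets import Corpus, Textnet
from textnets import examples

corpus = Corpus(examples.moon_landing)
tn = Textnet(corpus.tokenized(), min_docs=1)

g = tn.graph                       # underlying igraph.Graph
g.write_gml("moon_landing.gml")    # igraph writes GML directly

# Document projection, saved the same way (assumes .graph exists on the projection)
docs = tn.project(node_type="doc")
docs.graph.write_gml("moon_landing_docs.gml")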

I have a .csv file that contains the edges of a graph (Name1, Name2). How do I convert this into an adjacency matrix?

I'm doing a machine learning project in Python that requires me to process some data and convert it into an adjacency matrix. The data is saved in csv files like this:
114787,375519
114787,285613
114787,7448
114787,4914
114787,51343
.
.
The problem is, these numbers do not represent indices but just names. There are only 19000 nodes or so, and the numbers in the .csv file are basically random names given to the various nodes of a graph.
I need to put all these into an adjacency matrix while retaining the information about which index is represented by which name and I cannot for the life of me figure out how to do so. Would really appreciate some help.
I think the approach below will work.
import networkx as nx

# The file is comma-separated, so pass delimiter="," (read_edgelist splits on whitespace by default).
with open("edge_lst.csv", "rb") as f:
    G = nx.read_edgelist(f, delimiter=",")
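To go from the graph to an adjacency matrix while keeping track of which name ended up at which index, a sketch along these lines should work (the file name is the one from the answer above):
import networkx as nx

G = nx.read_edgelist("edge_lst.csv", delimiter=",")

nodes = list(G.nodes())                                   # fix an ordering of the ~19000 names
name_to_index = {name: i for i, name in enumerate(nodes)}

A = nx.to_numpy_array(G, nodelist=nodes)                  # dense adjacency matrix in that ordering
# For a large, sparse graph, nx.adjacency_matrix(G, nodelist=nodes) returns a scipy sparse matrix instead.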

How can I partially read 2d satellite file in Python? (binary, fromfile)

I have a lot of satellite data that consists of two-dimensional arrays.
(I converted the H5 files to 2d array data that does not include latitude information;
I made the Lat/Lon information data separately.)
I know the real Lat/Lon coordinates and the grid coordinates for each dataset.
How can I partially read a 2d satellite file in Python?
numpy.fromfile is usually used to read a binary file.
If I use the count option in numpy.fromfile, I can read a binary file partially.
However, I want to skip the leading records in the data to save memory.
For example, I have 3x3 2d data as follows:
a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
I just want to read the single element at row 3, column 1 (a[2][0] in Python; result = 7).
When I read the file in Fortran, I used "recl" and "rec":
open(1, file='example.bin', access='direct', recl=4) ! recl=4 means a 4-byte record
read(1, rec=lat*x-lon) value
close(1)
lat means the position of latitude in the data (lat = 3 in the above example; numbering starts at 1 in Fortran).
lon means the position of longitude in the data (lon = 1 in the above example; numbering starts at 1 in Fortran).
x is the number of rows (x = 3 in the above example; the array is 3x3).
This reads the file using only 4 bytes of memory.
I want to know a similar method in Python.
Please give me some pointers to save time and memory.
Thank you for reading my question.
2016.10.28.
Solution
The file name holds the data [1, 2, 3, 4, 5, 6, 7, 8, 9] stored as int8:
import numpy as np
a = np.memmap(name, dtype='int8', mode='r', shape=(1,), offset=6)
print(a[0])
# result: 7
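The same memmap idea also works without computing the byte offset by hand: map the whole file as a 2d array and index it. A sketch, assuming name is the int8 example file from the solution above:
import numpy as np

a2d = np.memmap(name, dtype='int8', mode='r', shape=(3, 3))
print(a2d[2, 0])   # 7; the OS only pages in the parts of the file you touch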
To read .h5 files :
import h5py
ds = h5py.File(filename, "r")
variable = ds['variable_name']
It's hard to follow your description. Some proper code indentation would help overcome the English language problems.
So you have data in an H5 file. The simplest approach is to use h5py to load it into a Python/numpy session, and select the necessary data from those arrays.
But it sounds as though you have written a portion of this data to a 'plain' binary file. It might help to know how you did it. Also in what way is this 2d?
np.fromfile reads a file as though it was 1d. Can you read this file, up to some count? And with a correct dtype?
np.fromfile accepts an open file. So I think you can open the file, use seek to skip forward, and then read count items from there. But I haven't tested that idea.
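A minimal sketch of that seek-then-read idea, assuming the same row-major 3x3 int8 layout as the example above (the file name is illustrative):
import numpy as np

np.arange(1, 10, dtype=np.int8).tofile("example.bin")    # write the 3x3 example row by row

row, col, ncols = 2, 0, 3                                 # 0-based position of the wanted element
itemsize = np.dtype(np.int8).itemsize

with open("example.bin", "rb") as f:
    f.seek((row * ncols + col) * itemsize)                # skip everything before the element
    value = np.fromfile(f, dtype=np.int8, count=1)[0]

print(value)   # 7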

Optimal data structure to store millions of pixels in Python?

I have several images, and after some basic processing and contour detection I want to store the detected pixel locations and their adjacent neighbours' values in a Python data structure. I settled on numpy.array.
The pixel locations from each Image are retrieved using:
locationsPx = cv2.findNonZero(SomeBWImage)
which will return an array of the shape (NumberOfPixels,1L,2L) with :
print(locationsPx[0]) : array([[1649, 4]])
for example.
My question is: is it possible to store this double array on a single column in another array? Or should I use a list and drop the array all together?
Note: the dataset of images might grow, so the dimensions of my chosen data structure will not only be huge, but also variable.
EDIT: or maybe numpy.array is not a good idea and a pandas DataFrame is better suited? I am open to suggestions from those who have more experience.
Numpy arrays are great for computation. They are not great for storing data if the size of the data keeps changing. As ali_m pointed out, all forms of array concatenation in numpy are inherently slow. Better to store the arrays in a plain-old python list:
coordlist = []
coordlist.append(locationsPx[0])
Alternatively, if your images have names, it might be attractive to use a dict with the image names as keys:
coorddict = {}
coorddict[image_name] = locationsPx[0]
Either way, you can readily iterate over the contents of the list:
for coords in coordlist:
or
for image_name, coords in coorddict.items():
And pickle is a convenient way to store your results in a file:
import pickle
with open("filename.pkl", "wb") as f:
pickle.dump(coordlist, f, pickle.HIGHEST_PROTOCOL)
(or same with coorddict instead of coordlist).
Reloading is trivially easy as well:
with open("filename.pkl", "rb") as f:
coordlist = pickle.load(f)
There are some security concerns with pickle, but if you only load files you have created yourself, those don't apply.
If you find yourself frequently adding to a previously pickled file, you might be better off with an alternative back end, such as sqlite.
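A minimal sketch of that sqlite route, using only the standard-library sqlite3 module (the table name and column layout are illustrative assumptions):
import sqlite3

conn = sqlite3.connect("coords.db")
conn.execute("CREATE TABLE IF NOT EXISTS coords (image TEXT, x INTEGER, y INTEGER)")

def add_coords(image_name, locations):
    # locations is the (N, 1, 2) array returned by cv2.findNonZero
    rows = [(image_name, int(x), int(y)) for x, y in locations.reshape(-1, 2)]
    conn.executemany("INSERT INTO coords VALUES (?, ?, ?)", rows)
    conn.commit()

This way each new image appends a few rows instead of rewriting one ever-growing pickle file.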

Import Multiple Text files (Large Number) using numpy and Post Processing

This forum has been extremely helpful for a Python novice like me to improve my knowledge. I have generated a large number of raw data files in text format from my CFD simulation. My objective is to import these text files into Python and do some post-processing on them. This is the code I currently have:
import numpy as np
from matplotlib import pyplot as plt
import os
filename = np.array(['v1-0520.txt', 'v1-0878.txt', 'v1-1592.txt', 'v1-3020.txt', 'v1-5878.txt'])
for i in filename:
    format_name = i
    path = 'E:/Fall2015/Research/CFDSimulations_Fall2015/ddn310/Autoexport/v1'
    data = os.path.join(path, format_name)
    # X and Y are the coordinates; U, V, T, Tr are the dependent variables
    X, Y, U, V, T, Tr = np.loadtxt(data, usecols=(1, 2, 3, 4, 5, 6), skiprows=1, unpack=True)
    plt.figure(1)
    plt.plot(T, Y)
    plt.legend(['vt1a', 'vtb', 'vtc', 'vtd', 'vte', 'vtf'])
    plt.grid(b=True)
Is there a better way to do this, like importing all the text files (~10000 files) at once into python and then accessing whichever files I need for post processing (maybe indexing). All the text files will have the same number of columns and rows.
I am just a beginner at Python. I will be grateful if someone can help me or point me in the right direction.
Your post needs to be edited to show proper indentation.
Based on a quick read, I think you are:
reading a file, making a small edit, and writing it back,
then loading it into a numpy array and plotting it.
Presumably the purpose of your edit is to correct some header or value.
You don't need to write the file back. You can use content directly in loadtxt.
content = content.replace("nodenumber", "#nodenumber")  # ignore the node number column
data1 = np.loadtxt(content.splitlines())
Y = data1[:, 2]
temp = data1[:, 5]
loadtxt accepts any thing that feeds it line by line. content.splitlines() makes a list of lines, which loadtxt can use.
the load could be more compact with:
Y, temp = np.loadtxt(content.splitlines(), usecols=(2,5), unpack=True)
With usecols you might not even need the replace step. You haven't given us a sample file to test.
I don't understand your multiple-file needs. One way or another you need to open and read each file, one by one. And it would be best to close one before going on to the next. The with open(name) as f: syntax is great for ensuring that a file is closed.
You could collect the loaded data in larger lists or arrays. If Y and temp are identical in size for all files, they can be collected into larger dimensional array, e.g. YY[i,:] = Y for the ith file, where YY is preallocated. If they can vary in size, it is better to collect them in lists.
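A minimal sketch of that collection pattern, assuming all ~10000 files share the column layout from the question (the directory is taken from the question; the glob pattern and column indices are assumptions based on the snippets above):
import os
import glob
import numpy as np

path = 'E:/Fall2015/Research/CFDSimulations_Fall2015/ddn310/Autoexport/v1'
files = sorted(glob.glob(os.path.join(path, 'v1-*.txt')))

Ys, temps = [], []
for fname in files:
    with open(fname) as f:                                # closed automatically before the next file
        Y, temp = np.loadtxt(f, usecols=(2, 5), skiprows=1, unpack=True)
    Ys.append(Y)
    temps.append(temp)

# If every file has the same number of rows, stack into one 2d array:
# YY[i, :] is Y from the i-th file.
if len({len(y) for y in Ys}) == 1:
    YY = np.vstack(Ys)
    TT = np.vstack(temps)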
