I have a stack of CT-scan images. After processing one image from that stack in MATLAB, I saved the XY coordinates of each boundary region to a separate Excel sheet, as follows:
I = imread('myCTscan.jpeg');
BW = im2bw(I);
[coords, labeledImg] = bwboundaries(BW, 4, 'holes');
sheet = 1;
for n = 1:length(coords)
    xlswrite('fig.xlsx', coords{n,1}, sheet, 'A1');
    sheet = sheet + 1;
end
The next step is then to import this set of coordinates and plot it into Abaqus CAE Sketch for finite element analysis.
I figure that my workflow should be something like this:
1. Import the Excel workbook
2. For each sheet in the workbook:
   2.1. For each row: read both columns to get the xy coordinates (each row has two columns, the x and y coordinates)
   2.2. Put each xy coordinate pair in a list
   2.3. From the list, sketch using the spline method
3. Repeat step 2 for the other sheets in the workbook
I searched for a while and found something like this:
from abaqus import *
lines= open('fig.xlsx', 'r').readlines()
pointList= []
for line in lines:
    pointList.append(eval('(%s)' %line.strip()))
s1= mdb.models['Model-1'].ConstrainedSketch(name='mySketch', sheetSize=500.0)
s1.Spline(points= pointList)
But this reads XY coordinates from only one sheet, and I'm stuck at step 3 above. My question is: how do I read these coordinates from the different sheets using an Abaqus/Python script (Abaqus 6.14, Python 2.7)?
I'm new to Python programming. I can read and understand the syntax but can't write it very well (I'm still struggling with how to import a Python module in Abaqus). Manually typing each coordinate (as in Abaqus' modelAExample.py tutorial) is practically impossible, since each of my CT-scan images can have 100+ boundary regions and 10k+ points.
I'm using:
Windows 7 x64
Abaqus 6.14 (with built in Python 2.7)
Excel 2013
Matlab 2016a with Image Processing Toolbox
You are attempting to read excel files as comma separated files. CSV files by definition can not have more than one tab. Your read command is interpreting the file as a csv and not allowing you to iterate over the tabs in your file (though it begs the question how your file is opening properly in the first place as you are saving an xlsx and reading a csv).
There are numerous python libraries that will parse and process XLS/XLSX files.
Take a look at openpyxl and use it to read your file in.
You would likely use something like
from openpyxl import load_workbook
# open the workbook written by the MATLAB script
wb = load_workbook('fig.xlsx')
listofnames = wb.sheetnames
for k in listofnames:
    ws = wb[k]  # worksheet holding one boundary region
and then input your remaining commands.
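A minimal sketch of the remaining steps, assuming openpyxl (2.6 or later, for `values_only`) can be imported by the interpreter that runs the script and that each sheet holds two numeric columns. The helper names `rows_to_points` and `read_boundaries` are made up for illustration; the Abaqus calls in the trailing comment are the ones from the question.

```python
def rows_to_points(rows):
    """Convert an iterable of spreadsheet rows into a list of (x, y) float tuples.

    Rows with fewer than two cells, or with an empty first or second cell, are skipped.
    """
    points = []
    for row in rows:
        if row is None or len(row) < 2 or row[0] is None or row[1] is None:
            continue
        points.append((float(row[0]), float(row[1])))
    return points


def read_boundaries(path):
    """Return a list of (sheet_title, points) pairs, one per worksheet, in workbook order."""
    # Third-party import kept local so rows_to_points stays usable without openpyxl.
    from openpyxl import load_workbook

    wb = load_workbook(path, read_only=True, data_only=True)
    boundaries = []
    for ws in wb.worksheets:
        rows = ws.iter_rows(values_only=True)
        boundaries.append((ws.title, rows_to_points(rows)))
    return boundaries


# Hypothetical use inside an Abaqus script (not run here):
# for title, pts in read_boundaries('fig.xlsx'):
#     s = mdb.models['Model-1'].ConstrainedSketch(name=title, sheetSize=500.0)
#     s.Spline(points=pts)
```

Note that bwboundaries stores each point as (row, column), so the two columns may need swapping before sketching if x is expected first.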
I have data in an array (B=[1,2,3,4,5]) that I took from the DataTable. I tried to write it to an Excel file with a Python for loop, using this code:
def Cells(a,b):
    return str(chr(b+96) + str(a))
import clr
clr.AddReference("Microsoft.Office.Interop.Excel")
import Microsoft.Office.Interop.Excel as Excel
ex = Excel.ApplicationClass()
ex.Visible = True
workbook = ex.Workbooks.Open(r"F:\Programming\Excel\Plot data.xlsx")
worksheet=workbook.worksheets("Sheet1")
Adding_Max_Principal_Stress=Model.Analyses[0].Solution.AddMaximumPrincipalStress()
Model.Analyses[0].Solution.EvaluateAllResults()
A=Adding_Max_Principal_Stress.PlotData
B=A.Values[1]
C=B.Count
for i in range(C):
    E=B[i]
    worksheet.range(Cells(1+i,1)).Value=E
In this code, B contains a list of data like B=[1,2,3,4,5,...]. B has 100,000 values, and it takes 5 hours to write them to Excel. Is there any way I can speed up this process?
I'm guessing IronPython doesn't support many Python libraries? In which case, the code needs to be almost pure Python?
Have you used a line profiler to see what the slowest part of the code is?
It seems like in your .range call you are only setting one cell at a time. Have you tried setting all of the values at once? Each time the code has to interact with Excel, it is very slow. If you can set all of the cell values at once, it will be much faster. I know how to do this in VBA code,
Do you have to save as an excel file? Can you save as a CSV file instead? Creating large excel files is slow.
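A minimal sketch of the CSV route, assuming a plain list of values and CPython 3 (under IronPython 2.7 the file would be opened in 'wb' mode without the `newline` argument). The function name `write_column_csv` and the file name are hypothetical.

```python
import csv


def write_column_csv(values, path):
    """Write one value per row to a CSV file in a single pass, then return the path."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # One call writes every row; there is no per-cell round trip to Excel.
        writer.writerows([value] for value in values)
    return path


# Example: write_column_csv(B, "plot_data.csv"), where B is the list from the question.
```

Excel can open the resulting file directly, and the write cost no longer depends on COM calls.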
I would like to draw a line between the centers of two ( non-adjacent ) cells in an Excel work sheet using openpyxl.
Using openpyxl I have created a fairly large lookup table. Many of the points in the lookup table are interpolated from a handful of known points.
I would like to draw lines between the cells that were created using the known points. These lines would sort of circle the areas that are interpolated.
Expected Result:
(This is the actual Excel generated spread sheet, The lines were added by hand in Excel. I want to automate the line drawing. )
In this case the white cells are known data points. Greenish cells are inside the bounding triangles. Reddish-bluish cells are outside.
All of data on this sheet was populated via a new sheet using openpyxl.
The openpyxl documentation hints that this is possible, but I do not understand how.
Something along the lines of:
ws.Line['A1':'P17'].style['heavy','black']
I think is what I am looking for.
[A bit more data ]
Using Excel and win32com I can automate drawing these lines.
line = ws.Shapes.AddLine(3,4,70,80).Line
However, due to other limitations in Excel I have to create this offline using openpyxl. Other_Limitations
So to re-phrase my question:
Can openpyxl even draw lines?
I am beginning to think that I will have to create the spreadsheet with openpyxl then open the newly created workbook with Excel and draw the lines with Excel.
I really don't think Excel is suitable for this. The drawing subsystem uses a completely different coordinate system to the worksheet itself. Thus, although it is possible to "anchor" a drawing between two cells, the proportions will be extremely hard to calculate.
I'm sure matplotlib, seaborn or other graphics libraries have tools more suitable for this job.
So my original Question was if this can be done with Openpyxl.
That question still stands.
However here is my solution to draw lines with win32com / Python in Excel directly.
This is not ideal for my situation but it works.
def Drawline(Sheet,Start,End):
    StartCell = Sheet.Cells(Start[0],Start[1])
    StartAdjacent = Sheet.Cells(Start[0]+1,Start[1]+1)
    EndCell = Sheet.Cells(End[0],End[1])
    EndAdjacent = Sheet.Cells(End[0]+1,End[1]+1)
    Y1 = ( StartCell.Top + StartAdjacent.Top ) / 2
    X1 = ( StartCell.Left + StartAdjacent.Left ) / 2
    Y2 = ( EndCell.Top + EndAdjacent.Top ) / 2
    X2 = ( EndCell.Left + EndAdjacent.Left ) / 2
    Sheet.Shapes.AddLine(X1,Y1,X2,Y2)
This will draw a line from the center of Start to the center of End on Sheet.
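The same center can also be computed from a single cell, since an Excel Range exposes Left, Top, Width and Height in points. A small sketch of that arithmetic, with hypothetical helper names and plain numbers standing in for the COM properties:

```python
def cell_center(left, top, width, height):
    """Return the (x, y) center of a cell given its Left, Top, Width and Height in points."""
    return (left + width / 2.0, top + height / 2.0)


def line_endpoints(start_box, end_box):
    """Return (x1, y1, x2, y2) for a line joining the centers of two cells.

    Each box is a (left, top, width, height) tuple.
    """
    x1, y1 = cell_center(*start_box)
    x2, y2 = cell_center(*end_box)
    return (x1, y1, x2, y2)


# Hypothetical use with win32com (not run here):
# box = lambda c: (c.Left, c.Top, c.Width, c.Height)
# Sheet.Shapes.AddLine(*line_endpoints(box(StartCell), box(EndCell)))
```

This avoids looking up the diagonal neighbour of each cell, at the cost of reading two more properties per cell.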
I have a recording of eye-tracking data in .edf format (SR Research EyeLink). I want to convert it to ASC/CSV format. I have the GUI application, but I want to do it programmatically in Python.
I found the package pyEDFlib but couldn't find an example of how to convert the eye-tracking .edf file to .asc or .csv.
What would be the best way to do it?
Thanks
If I trust the page here: http://pyedflib.readthedocs.io/en/latest, you can run through all the signals in the file this way:
import pyedflib
import numpy as np
f = pyedflib.EdfReader("data/test_generator.edf")
n = f.signals_in_file
signal_labels = f.getSignalLabels()
sigbufs = np.zeros((n, f.getNSamples()[0]))
for i in np.arange(n):
    sigbufs[i, :] = f.readSignal(i)
The pyEDFlib library simply reads the file into an EdfReader object.
Then you just need to go through the signals and make a row for each.
I assume that signal_labels (in the code above) will be an array with all the labels, so make a comma-separated string out of them:
signal_labels_row = ",".join(signal_labels)
Then do the same for each signal: one comma-separated string for each.
Then simply write them in a file.
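A minimal sketch of that last step using the standard csv module, assuming the labels and signals have already been read as in the snippet above. The name `signals_to_csv` is hypothetical, and the layout (one column per signal, one row per sample) is a design choice rather than something the answer specifies; it also assumes all signals have the same length, since `zip` stops at the shortest one.

```python
import csv
import io


def signals_to_csv(labels, signals, out):
    """Write a header row of labels, then one row per sample, to a text file object."""
    writer = csv.writer(out)
    writer.writerow(labels)
    # zip(*signals) transposes "one sequence per signal" into "one tuple per sample".
    for sample in zip(*signals):
        writer.writerow(sample)


# Demonstration with made-up data instead of a real recording.
buffer = io.StringIO()
signals_to_csv(["sig_a", "sig_b"], [[1, 2], [3, 4]], buffer)
```

With real data, `out` would be a file opened for writing, and `signals` would be the rows of `sigbufs`.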
I can see they provide an example of how to read a file and extract all the data you need here
https://github.com/holgern/pyedflib/blob/master/demo/readEDFFile.py
Based on your answers I have created this Python 3 script to export all signals to multiple .csv files: https://github.com/folkien/pyEdfToCsv
I am working on a large script that takes a CSV file, removes rows and columns from it, and edits the headers. I need to create one big shapefile for the entire CSV file and then create individual shapefiles for the units under one of the headers. I thought the best way to do this would be to use arcpy.MakeXYEventLayer_management(). I saw in an ArcGIS sample script that you then call arcpy.GetCount_management() on the output of the XY event layer, followed by arcpy.SaveToLayerFile_management() and arcpy.FeatureClassToShapefile_conversion(). But when I run the script, only my CSV file gets edited and no layer appears in the output. Is there a step I am missing, or should this be producing my shapefile?
These are the few lines of code I use, after all of the CSV file editing, to do what is described above:
outLyr = sys.argv[3] # shapefile layer output name
XYLyr.newLyr(csvOut, lyrOutFile, spRef, sys.argv[4], sys.argv[5]) # x coordinate column; y coordinate column
print arcpy.GetCount_management(lyrOutFile)
csv2LYR.saveLYR(lyrOutFile, curDir)
arcpy.SaveToLayerFile_management does not save data to a shapefile or any other kind of feature class. It only creates a .lyr file, which points to a data source and renders it with saved symbology, etc. You can use arcpy.FeatureClassToShapefile_conversion to create the shapefile from the in-memory feature layer created with arcpy.MakeXYEventLayer_management. Help for that tool is here.
I have to port an algorithm from an Excel sheet to python code but I have to reverse engineer the algorithm from the Excel file.
The Excel sheet is quite complicated, it contains many cells in which there are formulas that refer to other cells (that can also contains a formula or a constant).
My idea is to analyze with a python script the sheet building a sort of table of dependencies between cells, that is:
A1 depends on B4,C5,E7 formula: "=sqrt(B4)+C5*E7"
A2 depends on B5,C6 formula: "=sin(B5)*C6"
...
The xlrd Python module can read an XLS workbook, but at the moment I can only access the value of a cell, not its formula.
For example, with the following code I can get simply the value of a cell:
import xlrd
#open the .xls file
xlsname="test.xls"
book = xlrd.open_workbook(xlsname)
#build a dictionary of the names->sheets of the book
sd={}
for s in book.sheets():
    sd[s.name]=s
#obtain Sheet "Foglio 1" from sheet names dictionary
sheet=sd["Foglio 1"]
#print value of the cell J141 (xlrd indices are 0-based: row 140, column 9)
print sheet.cell(140,9)
However, there seems to be no way to get the formula from the Cell object returned by the .cell(...) method.
The documentation says that it is possible to get a string version of the formula (in English, because no information about function name translation is stored in the Excel file). It mentions formulas (expressions) in the Name and Operand classes, but I cannot work out how to get instances of these classes from the Cell instance that should contain them.
Could you suggest a code snippet that gets the formula text from a cell?
[Dis]claimer: I'm the author/maintainer of xlrd.
The documentation references to formula text are about "name" formulas; read the section "Named references, constants, formulas, and macros" near the start of the docs. These formulas are associated sheet-wide or book-wide to a name; they are not associated with individual cells. Examples: PI maps to =22/7, SALES maps to =Mktng!$A$2:$Z$99. The name-formula decompiler was written to support inspection of the simpler and/or commonly found usages of defined names.
Formulas in general are of several kinds: cell, shared, and array (all associated with a cell, directly or indirectly), name, data validation, and conditional formatting.
Decompiling general formulas from bytecode to text is a "work-in-progress", slowly. Note that supposing it were available, you would then need to parse the text formula to extract the cell references. Parsing Excel formulas correctly is not an easy job; as with HTML, using regexes looks easy but doesn't work. It would be better to extract the references directly from the formula bytecode.
Also note that cell-based formulas can refer to names, and name formulas can refer both to cells and to other names. So it would be necessary to extract both cell and name references from both cell-based and name formulas. It may be useful to you to have info on shared formulas available; otherwise having parsed the following:
B2 =A2
B3 =A3+B2
B4 =A4+B3
B5 =A5+B4
...
B60 =A60+B59
you would need to deduce the similarity between the B3:B60 formulas yourself.
In any case, none of the above is likely to be available any time soon -- xlrd priorities lie elsewhere.
Update: I have gone and implemented a little library to do exactly what you describe: extracting the cells & dependencies from an Excel spreadsheet and converting them to python code. Code is on github, patches welcome :)
Just to add that you can always interact with excel using win32com (not very fast but it works). This does allow you to get the formula. A tutorial can be found here [cached copy] and details can be found in this chapter [cached copy].
Essentially you just do:
app.ActiveWorkbook.ActiveSheet.Cells(r,c).Formula
As for building a table of cell dependencies, a tricky thing is parsing the excel expressions. If I remember correctly the Trace code you mentioned does not always do this correctly. The best I have seen is the algorithm by E. W. Bachtal, of which a python implementation is available which works well.
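To make the dependency-table idea concrete, here is a deliberately naive sketch that assumes the formula strings have already been extracted (for example via `.Formula`). The regular expression is only a stand-in for a real formula parser such as the one mentioned above: it ignores sheet qualifiers, treats a range like B9:B10 as its two endpoints, and can mistake function names such as LOG10 for cell references. The names `cell_references` and `dependency_table` are hypothetical.

```python
import re

# Naive A1-style reference: optional $ markers, 1-3 capital letters, then a row number.
CELL_REF = re.compile(r"\$?([A-Z]{1,3})\$?(\d+)")


def cell_references(formula):
    """Return the distinct A1-style references in a formula, in order of appearance."""
    refs = []
    for column, row in CELL_REF.findall(formula):
        ref = column + row
        if ref not in refs:
            refs.append(ref)
    return refs


def dependency_table(formulas):
    """Map each cell name to the list of cells its formula refers to."""
    return {cell: cell_references(text) for cell, text in formulas.items()}


# The two formulas used as examples in the question.
table = dependency_table({"A1": "=sqrt(B4)+C5*E7", "A2": "=sin(B5)*C6"})
```

For anything beyond simple sheets, a proper tokenizer is the safer choice, for the reasons given in the answers above.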
So I know this is a very old post, but I found a decent way of getting the formulas from all the sheets in a workbook as well as having the newly created workbook retain all the formatting.
First step is to save a copy of your .xlsx file as .xls
-- Use the .xls as the filename in the code below
Using Python 2.7
from lxml import etree
from StringIO import StringIO
import xlsxwriter
import subprocess
from xlrd import open_workbook
from xlutils.copy import copy
from xlsxwriter.utility import xl_cell_to_rowcol
import os
file_name = '<YOUR-FILE-HERE>'
dir_path = os.path.dirname(os.path.realpath(file_name))
subprocess.call(["unzip",str(file_name+"x"),"-d","file_xml"])
xml_sheet_names = dict()
with open_workbook(file_name,formatting_info=True) as rb:
    wb = copy(rb)
    workbook_names_list = rb.sheet_names()
    for i,name in enumerate(workbook_names_list):
        xml_sheet_names[name] = "sheet"+str(i+1)
sheet_formulas = dict()
for i, k in enumerate(workbook_names_list):
    xmlFile = os.path.join(dir_path,"file_xml/xl/worksheets/{}.xml".format(xml_sheet_names[k]))
    with open(xmlFile) as f:
        xml = f.read()
    tree = etree.parse(StringIO(xml))
    context = etree.iterparse(StringIO(xml))
    sheet_formulas[k] = dict()
    for _, elem in context:
        if elem.tag.split("}")[1]=='f':
            cell_key = elem.getparent().get(key="r")
            cell_formula = elem.text
            sheet_formulas[k][cell_key] = str("="+cell_formula)
sheet_formulas
Structure of Dictionary 'sheet_formulas'
{'Worksheet_Name': {'A1_cell_reference':'cell_formula'}}
Example results:
{u'CY16': {'A1': '=Data!B5',
'B1': '=Data!B1',
'B10': '=IFERROR(Data!B12,"")',
'B11': '=IFERROR(SUM(B9:B10),"")',
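The same idea can be sketched with only the standard library, since an .xlsx file is a zip archive of XML parts. This is an illustrative alternative, not the code above: it assumes Python 3, keys the result by archive part name (such as xl/worksheets/sheet1.xml) rather than by sheet title, and skips cells whose formula element is empty, which is how the dependent cells of a shared formula are stored. The function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def formulas_from_sheet_xml(xml_bytes):
    """Return {cell_reference: formula_text} for one worksheet XML part."""
    root = ET.fromstring(xml_bytes)
    formulas = {}
    for cell in root.iter(NS + "c"):
        formula = cell.find(NS + "f")
        if formula is not None and formula.text:
            formulas[cell.get("r")] = "=" + formula.text
    return formulas


def formulas_from_xlsx(path_or_file):
    """Return {worksheet_part_name: {cell_reference: formula_text}} for a workbook."""
    result = {}
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                result[name] = formulas_from_sheet_xml(archive.read(name))
    return result
```

Mapping part names back to sheet titles would require reading xl/workbook.xml and its relationships, which this sketch leaves out.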
It seems that it is impossible now to do what you want with xlrd. You can have a look at this post for the detailed description of why it is so difficult to implement the functionality you need.
Note that the development team does a great job of providing support at the python-excel Google group.
I know this post is a little late but there's one suggestion that hasn't been covered here. Cut all the entries from the worksheet and paste using paste special (OpenOffice). This will convert the formulas to numbers so there's no need for additional programming and this is a reasonable solution for small workbooks.
Yes! With win32com it works for me.
import win32com.client
Excel = win32com.client.Dispatch("Excel.Application")
# python -m pip install pywin32
file=r'path Excel file'
wb = Excel.Workbooks.Open(file)
sheet = wb.ActiveSheet
#Get value
val = sheet.Cells(1,1).value
# Get Formula
sheet.Cells(6,2).Formula