Convert an XMI file to an XML file using Python

I need to convert an activity diagram in XMI format to XML format. Is this conversion possible using Python? Are there any tools to convert XMI files to XML?

Converting XML to XML is usually called XML transformation. For Python you can use libxsltmod to perform XML transformations by using XSLT 'stylesheets'.

As Ignacio says, the problem may not be that the target tool expects XML, but that it probably expects a different XMI format.
Unfortunately, each tool follows its own interpretation of the XMI standard, so two modeling tools will most likely generate two incompatible XMI files for the same model. See an example in this "model once open anywhere not true" post.

You can get the information you need (classes, attributes, ...) from any file.xmi; this doc may help:
from xml.dom import minidom
xmldoc = minidom.parse('file.xmi')
for element in xmldoc.getElementsByTagName("UML:Class"):
    print(" -> UML:Class ", element.getAttribute('name'))
    for a in element.getElementsByTagName("UML:Attribute"):
        print(" -> UML:Attr : ", a.getAttribute('name'))
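Since an XMI file is already XML, "converting" it usually means reshaping it into the layout the target tool expects. Here is a minimal sketch of that idea using only the standard library: it reads the same UML:Class and UML:Attribute elements as the snippet above and copies their names into a simpler XML tree. The sample document, the namespace URI `org.omg.xmi.namespace.UML` (seen in some XMI 1.x exports) and the output element names `model`, `class` and `attribute` are assumptions for illustration; your tool's export may look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical XMI 1.x fragment; real exports differ between tools.
SAMPLE_XMI = """<XMI xmi.version="1.2" xmlns:UML="org.omg.xmi.namespace.UML">
  <XMI.content>
    <UML:Class name="Order">
      <UML:Attribute name="id"/>
      <UML:Attribute name="total"/>
    </UML:Class>
  </XMI.content>
</XMI>"""

# Prefix-to-URI map used by the iterfind() calls below (assumed URI).
NS = {"UML": "org.omg.xmi.namespace.UML"}


def xmi_to_simple_xml(xmi_text):
    """Copy class and attribute names into a flat, tool-neutral XML tree."""
    source = ET.fromstring(xmi_text)
    target = ET.Element("model")
    for cls in source.iterfind(".//UML:Class", NS):
        cls_out = ET.SubElement(target, "class", name=cls.get("name", ""))
        for attr in cls.iterfind(".//UML:Attribute", NS):
            ET.SubElement(cls_out, "attribute", name=attr.get("name", ""))
    return ET.tostring(target, encoding="unicode")


result = xmi_to_simple_xml(SAMPLE_XMI)
print(result)
```

For anything beyond a simple reshaping like this, an XSLT stylesheet is probably the better fit.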

Related

How to extract some text from json file without loading it?

Python's lxml can be used to extract text (e.g., with xpath) from XML files without having to fully parse the XML. For example, I can do the following, which is faster than BeautifulSoup, especially for large input. I'd like to have some equivalent code for JSON.
from lxml import etree
tree = etree.XML('<foo><bar>abc</bar></foo>')
print(type(tree))
r = tree.xpath('/foo/bar')
print([x.tag for x in r])
I see http://goessner.net/articles/JsonPath/, but I don't see any example Python code that extracts text from a JSON file without having to use json.load(). Could anybody show me an example? Thanks.
I'm assuming you don't want to load the entire JSON for performance reasons.
If that's the case, perhaps ijson is what you need. I used it to search huge JSON files (>8 GB) and it works well.
However, you will have to implement the search code yourself.

Accessing non tree structured xml data in python

I have several XML files that I want to parse in Python. I am aware of the ElementTree package in Python; however, my XML files aren't stored in a tree-like structure. Below is an example:
<tag1 attribute1="at1" attribute2="at2">My files are text that I annotated with a tool
to create these xml files.</tag1>
Some parts of the text are enclosed in an xml tag, whereas others are not.
<tag1 attribute1="at1" attribute2="at2"><tag2 attribute3="at3" attribute4="at4">Some
are even enclosed in multiple tags.</tag1></tag2>
And some have overlapping tags:
<tag1 attribute1="at1" attribute2="at2">This is an example sentence
<tag3 attribute5="at5">containing a nested example sentence</tag3></tag1>
Whenever I use an ElementTree-like function to parse the file, I can only access the very first tag. I am looking for a way to parse all the tags and don't want a tree-like structure. Any help is greatly appreciated.
If you have one XML fragment per line, just parse each line individually.
import xml.etree.ElementTree as ET

for line in some_file:
    root = ET.fromstring(line)  # parse each line on its own; fromstring returns the root element
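The per-line approach could be fleshed out roughly as follows. This is a sketch under a few assumptions that are not in the answer: each line is wrapped in a dummy `<line>` root so that lines with bare text or several sibling tags still parse, lines that are not well-formed (such as the mis-nested `</tag1></tag2>` example in the question) are skipped, and the `LINES` sample and the `parse_lines` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one annotated fragment per line.
LINES = [
    '<tag1 attribute1="at1" attribute2="at2">Annotated text.</tag1>',
    'Some plain text without any tag.',
    '<tag1 attribute1="at1"><tag3 attribute5="at5">nested</tag3> tail</tag1>',
]


def parse_lines(lines):
    """Return (tag, attributes, text) for every element found on each line."""
    found = []
    for line in lines:
        # A dummy root lets lines with bare text or several sibling tags
        # parse as one well-formed document.
        try:
            root = ET.fromstring("<line>" + line.strip() + "</line>")
        except ET.ParseError:
            continue  # skip lines that are not well-formed XML
        for elem in root.iter():
            if elem is not root:
                text = "".join(elem.itertext())
                found.append((elem.tag, dict(elem.attrib), text))
    return found


records = parse_lines(LINES)
for record in records:
    print(record)
```

The result is a flat list of tags rather than a tree, which is what the question asks for. Fragments that span several lines would need to be joined before parsing.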

Python 3.x: parse ATOM XML and convert to dict

I'm struggling to parse an ATOM XML file, coming from an API, into a common data structure such as a dict, a pandas DataFrame or JSON.
I understand that XML files are more complex than JSON files, and hence there won't be a very simple, generic solution to this. I hope the fact that I'm dealing with an ATOM structure helps in parsing the file into a more general data structure.
The structure of the XML data: http://opendata.cbs.nl/ODataFeed/OData/70266ned/TypedDataSet
And similar for JSON here: http://opendata.cbs.nl/ODataFeed/OData/70266ned/TypedDataSet
The reason I can't use the JSON file is that it is often not available.
I played around with libraries like xml.etree, xmltodict, lxml, xmljson and feedparser, but I keep getting errors.
For example, using xml.etree:
import requests
from xml.etree import ElementTree
r = requests.get('http://opendata.cbs.nl/ODataFeed/OData/70266ned/TypedDataSet')
tree = ElementTree.fromstring(r.content)
This yields the error:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
Help would be highly appreciated!
I don't know if you have solved it yet, but have you tried using r.text instead?
tree = ElementTree.fromstring(r.text)
r.content returns the content in bytes, while r.text returns it decoded to a string (see: http://docs.python-requests.org/en/master/api/#requests.Response).
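Once the response parses, turning the entries into dicts is mostly a matter of handling the namespaces. Here is a minimal sketch, assuming the feed follows the usual OData Atom layout (each `entry` holding an `m:properties` block of `d:` elements). The sample feed and the property names `ID` and `Period` are hypothetical stand-ins for the real response, and the namespace URIs are the ones commonly used by OData v2/v3 feeds, so check them against the actual document.

```python
import xml.etree.ElementTree as ET

# Namespaces commonly used by OData Atom feeds (assumed for this feed).
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
}

# Hypothetical, shortened stand-in for the text of the API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
  xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
  xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">
  <entry>
    <content type="application/xml">
      <m:properties><d:ID>0</d:ID><d:Period>2010JJ00</d:Period></m:properties>
    </content>
  </entry>
  <entry>
    <content type="application/xml">
      <m:properties><d:ID>1</d:ID><d:Period>2011JJ00</d:Period></m:properties>
    </content>
  </entry>
</feed>"""


def entries_to_dicts(feed_xml):
    """Return one dict per Atom entry, keyed by property name."""
    root = ET.fromstring(feed_xml)
    rows = []
    for props in root.iterfind("atom:entry/atom:content/m:properties", NS):
        # ElementTree stores tags as "{namespace}name"; keep only the name.
        rows.append({child.tag.rpartition("}")[2]: child.text for child in props})
    return rows


rows = entries_to_dicts(SAMPLE_FEED)
print(rows)
```

All values come back as strings (or None for empty elements), so numeric columns would still need converting. A list of dicts like this can be passed directly to the pandas DataFrame constructor or to json.dumps.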

Python pickle to xml

How can I convert a pickled object to an XML document?
For example, I have a pickle like this:
cpyplusplus_test
Coordinate
p0
(I23
I-11
tp1
Rp2
.
I want to get something like:
<Coordinate>
<x>23</x>
<y>-11</y>
</Coordinate>
The Coordinate class has x and y attributes, of course. I can supply an XML schema for the conversion.
I tried the gnosis.xml module. It can objectify XML documents into Python objects, but it cannot serialize objects to XML documents like the one above.
Any suggestion?
Thanks.
gnosis.xml does support pickling to XML:
import gnosis.xml.pickle
xml_str = gnosis.xml.pickle.dumps(obj)
To deserialize the XML, use loads:
o2 = gnosis.xml.pickle.loads(xml_str)
Of course, this will not directly convert existing pickles to XML; you first have to deserialize them into live objects and then dump them to XML.
Having said that, I must warn you that gnosis.xml is quite slow, somewhat fragile, and most likely unmaintained (last release was over six years ago). It is also very bloated, containing a huge number of subpackages with lots and lots of features that you not only won't need, but that are also untested and buggy. We tried to use it for our development and, after a lot of effort wasted on trying to debug and improve it, ended up writing a simple XML pickler running at ~500 lines of code, and never looked back.
First you need to unpickle the data with pickle.load or pickle.loads. Then generate the XML snippet. If you have the pickle in a tmpStr variable, simply do this:
import pickle
c = pickle.loads(tmpStr)
print('<Coordinate>\n<x>%d</x>\n<y>%d</y>\n</Coordinate>' % (c.x, c.y))
Writing to file is left as an exercise to the reader.
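The same two steps (unpickle, then emit XML) can be written without hard-coding the element names. This is a minimal sketch using only the standard library. The `Coordinate` class here is a stand-in for the asker's class, `object_to_xml` is a hypothetical helper, and it assumes every instance attribute is a simple value that `str()` renders sensibly.

```python
import pickle
import xml.etree.ElementTree as ET


class Coordinate:
    """Stand-in for the asker's class; assumed to have x and y attributes."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def object_to_xml(obj):
    """Build <ClassName><attr>value</attr>...</ClassName> from instance attributes."""
    root = ET.Element(type(obj).__name__)
    for name, value in vars(obj).items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


pickled = pickle.dumps(Coordinate(23, -11))  # stands in for the existing pickle
restored = pickle.loads(pickled)             # step 1: unpickle into a live object
xml_text = object_to_xml(restored)           # step 2: serialize that object to XML
print(xml_text)
```

Unpickling needs the original class to be importable, and pickle.loads should only be used on data you trust. Nested objects or containers would need a recursive version of the helper.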

Parsing N-Triples Via Streaming

I was fairly confused about this for some time but I finally learned how to parse a large N-Triples RDF store (.nt) using Raptor and the Redland Python Extensions.
A common example is to do the following:
import RDF
parser=RDF.Parser(name="ntriples")
model=RDF.Model()
stream=parser.parse_into_model(model,"file:./mybigfile.nt")
for triple in model:
    print("%s %s %s" % (triple.subject, triple.predicate, triple.object))
parse_into_model() by default loads the object into memory, so if you are parsing a big file you could consider using a HashStorage as your model and serializing it that way.
But what if you want to just read the file and, say, add it to MongoDB without loading it into a Model or anything complicated like that?
import RDF
parser=RDF.NTriplesParser()
for triple in parser.parse_as_stream("file:./mybigNTfile.nt"):
    print("%s %s %s" % (triple.subject, triple.predicate, triple.object))
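When pushing a stream like this into a database, it usually helps to insert in batches rather than one triple at a time. Here is a minimal sketch of that batching step using only the standard library. `fake_stream` is a hypothetical stand-in for the iterator returned by `parse_as_stream(...)`, the `s`/`p`/`o` document layout is an assumption, and the `insert_many` call mentioned in the comment refers to pymongo's collection method and is not executed here.

```python
from itertools import islice


def batches(iterable, size):
    """Yield lists of up to `size` items without materializing the whole input."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def fake_stream(count):
    """Hypothetical stand-in for the triples yielded by parse_as_stream()."""
    for i in range(count):
        yield ("http://example.org/s%d" % i, "http://example.org/p", "value %d" % i)


inserted = []
for chunk in batches(fake_stream(5), 2):
    docs = [{"s": s, "p": p, "o": o} for s, p, o in chunk]
    # With pymongo this would be: collection.insert_many(docs)
    inserted.append(len(docs))

print(inserted)
```

Only one batch is held in memory at a time, so the memory footprint stays bounded regardless of file size.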
