I have two SVGs that were the same, but I changed some values inside. For instance:
SVG1
<desc
id="desc20622">[Visualization]
name=H131B1;
</desc>
and the other is:
SVG2
<desc
id="desc20622">[Visualization]
name=R131C2;
</desc>
Now I have relocated a lot of elements in one SVG and I would like to replicate these changes in the other SVG. What is the simplest way to read both SVGs, compare the ids, copy the values from SVG2 to SVG1, and save a new SVG file?
I'm familiar with a bunch of programming languages but I was taking a look at Python to do this job using minidom or xml.etree.ElementTree.
Could someone help me with that? Thanks in advance.
I figured out how to do it on my own with Python.
import xml.etree.ElementTree as ET
tree1 = ET.parse('SVG1.svg')
root1 = tree1.getroot()
tree2 = ET.parse('SVG2.svg')
root2 = tree2.getroot()
for child1 in root1.iter('desc'):
    for child2 in root2.iter('desc'):
        if child1.attrib == child2.attrib:
            child1.text = child2.text
            break
tree1.write('output.svg')
You just have to parse both SVGs, iterate over every desc element, compare the ids, and copy the text!
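A possible variant of the same idea is sketched below. Instead of comparing every pair of desc elements, it indexes the desc elements of SVG2 by id first. It also matches on the namespaced tag, on the assumption that the files declare the standard SVG namespace (in that case a plain 'desc' lookup finds nothing). The helper name copy_desc_text and the inline sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
DESC = "{%s}desc" % SVG_NS

# Keep the default SVG namespace on output instead of generated ns0: prefixes.
ET.register_namespace("", SVG_NS)


def copy_desc_text(target_root, source_root):
    """Copy desc text from source to target wherever the id matches (hypothetical helper)."""
    source_by_id = {d.get("id"): d for d in source_root.iter(DESC) if d.get("id")}
    copied = 0
    for desc in target_root.iter(DESC):
        match = source_by_id.get(desc.get("id"))
        if match is not None:
            desc.text = match.text
            copied += 1
    return copied


# Inline stand-ins for SVG1.svg and SVG2.svg; with real files, use ET.parse(...).getroot().
svg1 = ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<desc id="desc20622">[Visualization]\nname=H131B1;\n</desc></svg>'
)
svg2 = ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<desc id="desc20622">[Visualization]\nname=R131C2;\n</desc></svg>'
)

copied = copy_desc_text(svg1, svg2)
print(copied, svg1.find(DESC).text)
```

With files on disk, the result could then be saved with ET.ElementTree(svg1).write('output.svg').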
How can I create XML files like this?
<?xml version="1.0" encoding="utf-8"?>
<data>
<li class= 'playlistItem' data-type='local' data-mp3='PATH' >
<a class='playlistNonSelected' href='#'>NAME</a>
</li>
...
</data>
I'd create this dynamically and for each item I have, I'd fill in the PATH and NAME variables with the values I need.
I'm trying to use lxml. This is what I've come up with so far, but I don't think it's correct:
from lxml import etree
for item in my_list:
    root = etree.Element('li', class = 'playlistItem', data-type = 'local', data-mp3 = PATH)
    child = etree.Element('a', class = 'playlistNonSelected', href ='#')
    child.text = NAME
Even if the above was correct, I'm lost at this point, because if I have 20 items in the list, how can I do this for each of them and then write it all to an XML file? I've tried looking at other answers but most of the replies are to generate XML like this:
<root>
<child/>
<child>some text</child>
</root>
And I can't figure out how to generate the kind I need. Sorry if I've made obvious mistakes. I appreciate any help. Thank you!
You are on the right track, save for a few minor syntax and usage issues:
class is a Python keyword, so you can't use it as a keyword argument name (which is essentially what class = 'playlistItem' is doing).
data-type is not a valid identifier in Python; it is parsed as data MINUS type. Consider using something like dataType or data_type. There are ways around this but, IMHO, they would make the code unnecessarily complicated without adding any value (please see Edit #1 on how to do this).
That being said, the following code snippet should give you something usable and you can move on from there. Please feel free to let me know if you need any additional help:
from lxml import etree
data_el = etree.Element('data')
# You can do this in a loop and keep adding new elements
# Note: SubElement creates a fresh element on every call; a deepcopy is only
# required if you want to reuse an existing element object for subsequent items
li_el = etree.SubElement(data_el, "li", class_name='playlistItem', data_type="local", data_mp3="PATH")
a_el = etree.SubElement(li_el, "a", class_name='playlistNotSelected', href='#')
# tostring() returns bytes when an encoding is given, so decode before printing
print(etree.tostring(data_el, encoding='utf-8', xml_declaration=True, pretty_print=True).decode('utf-8'))
This will generate the following output (which you can write to a file):
<?xml version='1.0' encoding='utf-8'?>
<data>
  <li class_name="playlistItem" data_mp3="PATH" data_type="local">
    <a class_name="playlistNotSelected" href="#"/>
  </li>
</data>
Edit #0:
Alternatively, you can also write to a file by converting it to an ElementTree first, e.g.
# Replace sys.stdout.buffer with a file object opened in binary mode for your output file
# (needs "import sys"; on Python 2, pass sys.stdout directly):
etree.ElementTree(data_el).write(sys.stdout.buffer, encoding='utf-8', xml_declaration=True, pretty_print=True)
Edit #1:
Since element attributes behave like dictionaries, you can use set to specify arbitrary attribute names without these restrictions, e.g.
li_el.set('class', 'playlistItem')
li_el.set('data-type', 'local')
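Putting the pieces together for a whole list of items could look like the sketch below. It uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages; the Element, SubElement and tostring calls used here also exist in lxml.etree. The items list, the build_playlist helper and the playlist.xml filename are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_playlist(items):
    """Build a <data> element with one <li><a>NAME</a></li> per (path, name) pair."""
    data_el = ET.Element("data")
    for path, name in items:
        # Passing a dict avoids the keyword problems with 'class' and 'data-type'.
        li_el = ET.SubElement(
            data_el,
            "li",
            {"class": "playlistItem", "data-type": "local", "data-mp3": path},
        )
        a_el = ET.SubElement(li_el, "a", {"class": "playlistNonSelected", "href": "#"})
        a_el.text = name
    return data_el


# Hypothetical input: one (PATH, NAME) pair per playlist entry.
items = [("songs/one.mp3", "First song"), ("songs/two.mp3", "Second song")]
data_el = build_playlist(items)
xml_bytes = ET.tostring(data_el, encoding="utf-8")
print(xml_bytes.decode("utf-8"))

# To write a file instead of printing:
# ET.ElementTree(data_el).write("playlist.xml", encoding="utf-8", xml_declaration=True)
```

Because SubElement creates a new element on every call, the loop needs no copying: each iteration appends a fresh li with its own a child.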
I am using Python to parse an .xml file that is quite complicated because it has a lot of nested children; accessing some of the values in it is annoying because the code quickly becomes ugly.
Let me first present the .xml file:
<?xml version="1.0" encoding="utf-8"?>
<Start>
  <step1 stepA="5" stepB="6" />
  <step2>
    <GOAL1>11111</GOAL1>
    <stepB>
      <stepBB>
        <stepBBB stepBBB1="pinco">1</stepBBB>
      </stepBB>
      <stepBC>
        <stepBCA>
          <GOAL2>22222</GOAL2>
        </stepBCA>
      </stepBC>
      <stepBD>-NO WOMAN NO CRY
      -I SHOT THE SHERIF
      -WHO LET THE DOGS OUT
      </stepBD>
    </stepB>
  </step2>
  <step3>
    <GOAL3 GOAL3_NAME="GIOVANNI" GOAL3_ID="GIO">
      <stepB stepB1="12" stepB2="13" />
      <stepC>XXX</stepC>
      <stepC>
        <stepCC>
          <stepCC GOAL4="saf12">33333</stepCC>
        </stepCC>
      </stepC>
    </GOAL3>
  </step3>
  <step3>
    <GOAL3 GOAL3_NAME="ANDREA" GOAL3_ID="DRW">
      <stepB stepB1="14" stepB2="15" />
      <stepC>YYY</stepC>
      <stepC>
        <stepCC>
          <stepCC GOAL4="fwe34">44444</stepCC>
        </stepCC>
      </stepC>
    </GOAL3>
  </step3>
</Start>
My goal is to access the values contained in the children named "GOAL" in a nicer way than the one in my sample code below. Furthermore, I would like an automated way to find GOAL values that sit under the same kind of tag or attribute but belong to different children with the same name.
Example: GIOVANNI and ANDREA are both stored under the same attribute (GOAL3_NAME) but belong to different children that have the same name (<step3>).
Here is the code that I wrote:
import xml.etree.ElementTree as ET
data = ET.parse('test.xml').getroot()
GOAL1 = data.getchildren()[1].getchildren()[0].text
print(GOAL1)
GOAL2 = data.getchildren()[1].getchildren()[1].getchildren()[1].getchildren()[0].getchildren()[0].text
print(GOAL2)
GOAL3 = data.getchildren()[2].getchildren()[0].text
print(GOAL3)
GOAL4_A = data.getchildren()[2].getchildren()[0].getchildren()[2].getchildren()[0].getchildren()[0].text
print(GOAL4_A)
GOAL4_B = data.getchildren()[3].getchildren()[0].getchildren()[2].getchildren()[0].getchildren()[0].text
print(GOAL4_B)
and the output that I get is the following:
11111
22222
33333
44444
The output that I would like should be like this:
11111
22222
GIOVANNI
33333
ANDREA
44444
As you can see, I am able to read GOAL1 and GOAL2, but I am looking for a cleaner way to access those values, since my code seems too long and hard to read and understand.
The second thing I would like to do is get GOAL3 and GOAL4 in an automated way, so that I do not have to repeat similar lines of code and the result is more readable.
Note: as you can see I was not able to read GOAL3. If possible I would like to get both the GOAL3_NAME and GOAL3_ID
In order to make the .xml file structure more understandable I post an image of what it looks like:
The highlighted elements are what I am looking for.
Here is a simple example that walks the tree from head to tail with a recursive function; you can then collect the information you need from that. (On Python 2, cElementTree was 15-20x faster; since Python 3.3 the plain ElementTree module uses the C accelerator automatically, and cElementTree was removed in Python 3.9.)
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
def get_tail(root):
    for child in root:
        print(child.text)
        get_tail(child)
get_tail(root)
import xml.etree.ElementTree as ET
data = ET.parse('test.xml')
for d in data.iter():
    if d.tag in ["GOAL1", "GOAL2", "stepCC"]:
        print(d.text)
    elif d.tag in ["GOAL3", "GOAL4"]:
        # attrib is a dict; wrap values() in list() so it can be indexed on Python 3
        print(list(d.attrib.values())[0])
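As an alternative to indexing children by position, ElementTree's find, findtext, findall and iter accept tag names and a limited XPath subset. The sketch below shows one way to collect the GOAL values with them. The inline XML is a trimmed copy of the sample file, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document; with the real file use
# root = ET.parse('test.xml').getroot()
XML = """<Start>
  <step2>
    <GOAL1>11111</GOAL1>
    <stepB><stepBC><stepBCA><GOAL2>22222</GOAL2></stepBCA></stepBC></stepB>
  </step2>
  <step3>
    <GOAL3 GOAL3_NAME="GIOVANNI" GOAL3_ID="GIO">
      <stepC><stepCC><stepCC GOAL4="saf12">33333</stepCC></stepCC></stepC>
    </GOAL3>
  </step3>
  <step3>
    <GOAL3 GOAL3_NAME="ANDREA" GOAL3_ID="DRW">
      <stepC><stepCC><stepCC GOAL4="fwe34">44444</stepCC></stepCC></stepC>
    </GOAL3>
  </step3>
</Start>"""

root = ET.fromstring(XML)

# './/TAG' searches all descendants, so the nesting depth does not matter.
results = [root.findtext(".//GOAL1"), root.findtext(".//GOAL2")]

# iter() visits every GOAL3 element, however many <step3> blocks exist.
for goal3 in root.iter("GOAL3"):
    results.append((goal3.get("GOAL3_NAME"), goal3.get("GOAL3_ID")))
    # '[@GOAL4]' keeps only the stepCC element that carries a GOAL4 attribute.
    inner = goal3.find(".//stepCC[@GOAL4]")
    results.append(inner.text)

for value in results:
    print(value)
```

Looking elements up by name rather than by index means the code keeps working if sibling elements are added or reordered.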
I have a nested XML that looks like this:
<data>foo <data1>hello</data1> bar</data>
I am using minidom, but no matter how I try to get the values inside "data", I only get "foo" and not "bar".
It is even worse if the XML is like this:
<data><data1>hello</data1> bar</data>
I only get "None", which is correct according to the logic above. So I came across this: http://levdev.wordpress.com/2011/07/29/get-xml-element-value-in-python-using-minidom and concluded that this is due to a limitation of minidom?
So I used the method in that blog and I now get
foo <data1>hello</data1> bar
and
<data1>hello</data1> bar
which is acceptable. However, if I try to create a new node (createTextNode) using the output above as the node value, the markup gets escaped and the XML becomes:
<data>foo &lt;data1&gt;hello&lt;/data1&gt; bar</data>
and
<data>&lt;data1&gt;hello&lt;/data1&gt; bar</data>
Is there any way that I can create it so that it looks like the original? Thank you.
You can use ElementTree for XML; it is efficient for both retrieving and creating nodes.
Have a look at the ElementTree tutorials and the documentation on mixed-content XML.
Some examples of creating nodes:
import xml.etree.ElementTree as ET
data = ET.Element('data')
data1= ET.SubElement(data, 'data1',attr="value")
data1.text="hello"
data.text="bar"
data1.tail="some code"
ET.dump(data)
output :<data>bar<data1 attr="value">hello</data1>some code</data>
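The same text and tail attributes are how ElementTree exposes mixed content when parsing. A minimal sketch using the XML from the question (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

data = ET.fromstring("<data>foo <data1>hello</data1> bar</data>")
data1 = data.find("data1")

print(repr(data.text))   # text before the first child element
print(repr(data1.text))  # text inside the child element
print(repr(data1.tail))  # text that follows the child's closing tag

# Serializing again reproduces the mixed content without escaping the child.
serialized = ET.tostring(data, encoding="unicode")
print(serialized)
```

The text after a child element belongs to that child's tail, not to the parent, which is why reading only the parent's text misses it.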
First of all, use the following function to prettify your XML so it is a lot easier to read:
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET
def prettify(elem):
    """Return a pretty-printed XML string for the Element. Props go
    to Maxime from Stack Overflow for this code."""
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")
That makes stepping through the tree visually a lot simpler.
Next, I would suggest a modification to your XML that I think will make your life a whole lot easier.
Instead of:
<data>foo
  <data1>hello</data1>
bar
</data>
which is mixed content (valid XML, but awkward to process), I would save your 'foo' and 'bar' as attributes of data, so that
it looks like this:
<data var1='foo' var2='bar'>
<data1>hello</data1>
</data>
to do this using xml.etree.ElementTree:
import xml.etree.ElementTree as ET
data = ET.Element('data', {'var1': 'foo', 'var2': 'bar'})
data1 = ET.SubElement(data, 'data1')
data1.text = 'hello'
print(prettify(data))
So, as @pandubear pointed out, the XML:
<data>foo <data1>hello</data1> bar</data>
does have two text nodes, containing "foo " and " bar", so what can be done is to iterate through all the child nodes of data and collect their values.
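A minimal sketch of that iteration with minidom is shown below. It also shows one possible way to rebuild the element so that it looks like the original: copy the child nodes with importNode instead of passing markup to createTextNode, which escapes it. The variable names are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString("<data>foo <data1>hello</data1> bar</data>")
data = doc.documentElement

# Collect the text nodes that are direct children of <data>.
texts = [node.data for node in data.childNodes if node.nodeType == node.TEXT_NODE]
print(texts)

# Rebuild <data> in a new document by copying nodes rather than markup strings.
new_doc = minidom.Document()
new_data = new_doc.createElement("data")
new_doc.appendChild(new_data)
for node in data.childNodes:
    # importNode(node, True) makes a deep copy owned by new_doc.
    new_data.appendChild(new_doc.importNode(node, True))

rebuilt = new_data.toxml()
print(rebuilt)
```

Copying nodes keeps data1 as a real element, so nothing is escaped on output.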
My current code is
xml_obj = lxml.objectify.Element('root_name')
xml_obj[root_name] = str('text')
lxml.etree.tostring(xml_obj)
but this creates the following xml:
<root_name><root_name>text</root_name></root_name>
In the application I am using this for I could easily use text substitution to solve this problem, but it would be nice to know how to do it using the library.
I'm not that familiar with objectify, but I don't think that's the way it's intended to be used. The way it represents objects is that a node at any given level is, say, a class name, and the subnodes are field names (with types) and values. The normal way to use it would be something more like this:
xml_obj = lxml.objectify.Element('root_name')
xml_obj.root_name = 'text'
etree.dump(xml_obj)
<root_name xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" py:pytype="TREE">
<root_name py:pytype="str">text</root_name>
</root_name>
What you want would be way easier to do with etree:
xml_obj = lxml.etree.Element('root_path')
xml_obj.text = 'text'
etree.dump(xml_obj)
<root_path>text</root_path>
If you really need it to be in objectify, it looks like while you shouldn't mix directly, you can use tostring to generate XML, then objectify.fromstring to bring it back. But probably, if this is what you want, you should just use etree to generate it.
I don't think you can write data into the root element. You may need to create a child element like this:
xml_obj = lxml.objectify.Element('root_name')
xml_obj.child_name = str('text')
lxml.etree.tostring(xml_obj)
I am a bit stuck on a project I am doing which uses Python, which I am very new to. I have been told to use ElementTree to get specified data out of an incoming XML file. It sounds simple, but I am not great at programming. Below is a (very!) tiny example of an incoming file along with the code I am trying to use.
I would like any tips or places to go next with this. I have tried searching and following what other people have done but I can't seem to get the same results. My aim is to get the information contained in the "Active", "Room" and "Direction" but later on I will need to get much more information.
I have tried using XPaths but it does not work too well, especially with the namespaces the XML uses and the fact that an XPath for everything I need would become too large. I have simplified the example so I can understand the principle, as after this it must be extended to gather more information from an "AssetEquipment" and from multiple instances of them. The end goal is for all information from one equipment to be saved to a dictionary so I can manipulate it later, with each new equipment in its own separate dictionary.
Example XML:
<AssetData>
  <Equipment>
    <AssetEquipment ID="3" name="PC960">
      <Active>Yes</Active>
      <Location>
        <RoomLocation>
          <Room>23</Room>
          <Area>
            <X-Area>-1</X-Area>
            <Y-Area>2.4</Y-Area>
          </Area>
        </RoomLocation>
      </Location>
      <Direction>Positive</Direction>
      <AssetSupport>12</AssetSupport>
    </AssetEquipment>
  </Equipment>
Example Code:
tree = ET.parse('C:\Temp\Example.xml')
root = tree.getroot()
ns = "{http://namespace.co.uk}"
for equipment in root.findall(ns + "Equipment//"):
    tagname = re.sub(r'\{.*?\}','',equipment.tag)
    name = equipment.get('name')
    if tagname == 'AssetEquipment':
        print "\tName: " + repr(name)
        for attributes in root.findall(ns + "Equipment/" + ns + "AssetEquipment//"):
            attname = re.sub(r'\{.*?\}','',attributes.tag)
            if tagname == 'Room': #This does not work but I need it to be found while
                                  #in this instance of "AssetEquipment" so it does not
                                  #call information from another asset instead.
                room = equipment.text
                print "\t\tRoom:", repr(room)
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
for elem in tree.iter():
    if elem.tag=='{http://www.namespace.co.uk}AssetEquipment':
        # one dictionary per AssetEquipment element
        output={}
        for elem1 in list(elem):
            if elem1.tag=='{http://www.namespace.co.uk}Active':
                output['Active']=elem1.text
            if elem1.tag=='{http://www.namespace.co.uk}Direction':
                output['Direction']=elem1.text
            if elem1.tag=='{http://www.namespace.co.uk}Location':
                for elem2 in list(elem1):
                    if elem2.tag=='{http://www.namespace.co.uk}RoomLocation':
                        for elem3 in list(elem2):
                            if elem3.tag=='{http://www.namespace.co.uk}Room':
                                output['Room']=elem3.text
        print(output)
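The nested loops above can also be written with relative paths and a namespace prefix map, which ElementTree's findall and findtext accept. The sketch below is one way to build one dictionary per AssetEquipment. It assumes the document's default namespace is http://namespace.co.uk, as in the question's code; the prefix "a", the read_equipment helper and the inline sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI (taken from the question's code) and an arbitrary prefix.
NS = {"a": "http://namespace.co.uk"}

# Inline stand-in for the incoming file; with a real file use ET.parse(path).getroot().
XML = """<AssetData xmlns="http://namespace.co.uk">
  <Equipment>
    <AssetEquipment ID="3" name="PC960">
      <Active>Yes</Active>
      <Location><RoomLocation><Room>23</Room></RoomLocation></Location>
      <Direction>Positive</Direction>
    </AssetEquipment>
  </Equipment>
</AssetData>"""


def read_equipment(root):
    """Return one dict per AssetEquipment element (hypothetical helper)."""
    assets = []
    for eq in root.findall("a:Equipment/a:AssetEquipment", NS):
        # Paths are relative to eq, so values never leak in from another asset.
        assets.append({
            "name": eq.get("name"),
            "Active": eq.findtext("a:Active", namespaces=NS),
            "Room": eq.findtext("a:Location/a:RoomLocation/a:Room", namespaces=NS),
            "Direction": eq.findtext("a:Direction", namespaces=NS),
        })
    return assets


assets = read_equipment(ET.fromstring(XML))
print(assets)
```

Searching from each AssetEquipment element rather than from the root is what keeps the Room value tied to the right asset.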