How can I create KML files from XML files using Python?
I have a lot of XML files, and I have already parsed their data using the SAX parser.
Now I want to create KML files from the parsed data.
Is there any way, apart from xml.dom.minidom, to write the KML files?
I am currently thinking of creating a template KML file, then copying it and filling in the data.
Can anybody suggest a better way?
My main concern is maintainability (code that writes the data using minidom is pretty confusing to read).
Try xml.etree.ElementTree. Here's a short example creating a couple of points in a KML file:
from xml.etree import ElementTree as et

class Kml(object):
    def __init__(self):
        # Setting xmlns as a plain attribute declares the KML 2.2 default namespace.
        self.root = et.Element('kml', xmlns='http://www.opengis.net/kml/2.2')
        self.doc = et.SubElement(self.root, 'Document')

    def add_placemark(self, name, desc, lon, lat, alt):
        pm = et.SubElement(self.doc, 'Placemark')
        et.SubElement(pm, 'name').text = name
        et.SubElement(pm, 'description').text = desc
        pt = et.SubElement(pm, 'Point')
        # KML expects coordinates in longitude,latitude,altitude order.
        et.SubElement(pt, 'coordinates').text = '{},{},{}'.format(lon, lat, alt)

    def write(self, filename):
        tree = et.ElementTree(self.root)
        tree.write(filename, encoding='UTF-8', xml_declaration=True)

kml = Kml()
kml.add_placemark('Location1', 'Description1', -120, 45, 0)
kml.add_placemark('Location2', 'Description2', 60, -45, 0)
kml.write('out.kml')
I tried reading the XML file into the following variable so that I could then work on it like a normal XML file, but this is not working. How can I approach this? I need to edit the XML without modifying the original file and without creating a new XML file. Is that possible?
comment_2 = open("cool.xml").read()
Thanks and Regards
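You can parse the file with xml.etree.ElementTree and keep the whole document in a variable. Changes made to that tree stay in memory until you explicitly write them out, so the original file is untouched:
import xml.etree.ElementTree as ET
tree = ET.parse('cool.xml')  # read the file once
root = tree.getroot()        # root element of the in-memory tree
xml_text = ET.tostring(root, encoding='unicode')  # the document as a string, if you need one
Edit root (or any element under it) as needed; nothing is saved to disk unless you call tree.write().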
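If the XML has already been read into a string, as in the question, the same idea can be sketched with ET.fromstring and ET.tostring. This is only an illustration: the function name edit_in_memory, the tag item, and the attribute checked are made up for the example and are not taken from cool.xml.

```python
import xml.etree.ElementTree as ET

def edit_in_memory(xml_text):
    """Parse XML text, change it in memory, and return the edited text."""
    root = ET.fromstring(xml_text)       # builds the tree from a string; no file is touched
    for item in root.iter('item'):       # 'item' is a hypothetical tag name
        item.set('checked', 'yes')       # hypothetical edit
    return ET.tostring(root, encoding='unicode')

sample = '<list><item>a</item><item>b</item></list>'
edited = edit_in_memory(sample)
print(edited)
```

The original string (and any file it came from) is left as it was; only the returned text carries the edits.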
I have an XML input file which I need to split into multiple files based on MAPPING and WORKFLOW tags.
Since I have two MAPPING tags in my input XML and one WORKFLOW tag, I need to generate three files:
m_demo_trans_agg.XML
m_demo_trans_exp.XML
wf_m_demo_trans_agg_exp.XML
So, my mapping file (starting with m_) will have tags SOURCE, TARGET, and MAPPINGS. The workflow file will have tags WORKFLOW and CONFIG.
Please let me know how I can create the mapping XML.
I started with workflow XML creation.
My code looks like:
import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
root = tree.getroot()
target_node_first_parent = 'FOLDER'
target_nodes = ['SOURCE', 'TARGET', 'MAPPING']
for node in root.iter(target_node_first_parent):
    for subnode in node.iter():
        if subnode.tag in target_nodes:
            print(subnode.tag)
            node.remove(subnode)
out_tree = ET.ElementTree(root)
out_tree.write('output.xml')
I am getting the TARGET tags in my output.xml.
I am open to using any libraries apart from xml.etree.ElementTree.
Please assist.
Thanks
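A likely reason some tags survive is that elements are removed from a node while that same node is being iterated, which tends to skip siblings. One possible approach, sketched below, is to work on a deep copy per output file and iterate over a snapshot (list(folder)) of each FOLDER's children. The sketch assumes that SOURCE, TARGET, MAPPING, WORKFLOW and CONFIG are direct children of FOLDER and that each MAPPING and WORKFLOW carries a NAME attribute used for the file name; the EXPORT wrapper element and the helper names pruned_copy and split_export are invented for the example.

```python
import copy
import xml.etree.ElementTree as ET

MAPPING_SIDE = {'SOURCE', 'TARGET', 'MAPPING'}
WORKFLOW_SIDE = {'WORKFLOW', 'CONFIG'}

def pruned_copy(root, keep):
    """Deep-copy root, then drop every FOLDER child for which keep() is false."""
    new_root = copy.deepcopy(root)
    for folder in list(new_root.iter('FOLDER')):
        for child in list(folder):       # snapshot, so removal does not skip siblings
            if not keep(child):
                folder.remove(child)
    return new_root

def split_export(root):
    """Return {file name: root element} with one entry per MAPPING and WORKFLOW."""
    outputs = {}
    for own_tag, dropped in (('MAPPING', WORKFLOW_SIDE), ('WORKFLOW', MAPPING_SIDE)):
        for element in root.iter(own_tag):
            name = element.get('NAME')   # assumed attribute holding the object name

            def keep(child, own_tag=own_tag, name=name, dropped=dropped):
                if child.tag == own_tag:
                    return child.get('NAME') == name   # keep only this mapping/workflow
                return child.tag not in dropped

            outputs[name + '.XML'] = pruned_copy(root, keep)
    return outputs

# Hypothetical input with the tags named in the question.
sample = ET.fromstring(
    '<EXPORT><FOLDER>'
    '<SOURCE NAME="src"/><TARGET NAME="tgt"/>'
    '<MAPPING NAME="m_demo_trans_agg"/><MAPPING NAME="m_demo_trans_exp"/>'
    '<CONFIG NAME="cfg"/><WORKFLOW NAME="wf_m_demo_trans_agg_exp"/>'
    '</FOLDER></EXPORT>'
)
files = split_export(sample)
for filename, new_root in sorted(files.items()):
    print(filename, [child.tag for child in new_root.find('FOLDER')])
```

Each value in files can then be written with ET.ElementTree(new_root).write(filename). If the real export nests these tags differently, the keep() logic would need adjusting.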
I have a directory of XML files and I'm trying to merge them all into one big XML file:
import glob
import os
import xml.etree.ElementTree as ET

# path and xmlp (the XML parser) are set up elsewhere; not shown here
full = ET.Element('dataset')
for filename in glob.glob(os.path.join(path, '*.xml')):
    tree = ET.parse(filename, parser=xmlp)
    root = tree.getroot()
    for pair in root:  # root.iter('pair'):
        full.append(pair)
I tried the above code and get this error:
ParseError: parsing finished: line 330, column 0
The problem is that only the first file is appended to the new XML document. How can I avoid this? Or is there a better way of merging? (The structures are identical.)
Edit: they have this structure:
<dataset>
  <pair>
    <t1></t1>
    <t2></t2>
  </pair>
  ...
</dataset>
Update: I tried XML Copy Editor, which could not open the file and reported an unknown encoding, MS932, even though the file is in ISO-8859-1. I got the same error when opening it with lxml instead of xml in Python. I ended up manually recreating the XML file; not really a solution, but oh well.
Thanks
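One possible cause of "parsing finished", assuming xmlp is a single XMLParser object created once before the loop: a parser instance cannot be reused after it has finished a document, so the second ET.parse call fails and only the first file gets merged. A minimal sketch that builds a fresh parser for each file follows; the function name merge_pairs and the demo file names are made up for the example.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET

def merge_pairs(path):
    """Collect every <pair> from the *.xml files in path under one <dataset> root."""
    full = ET.Element('dataset')
    for filename in sorted(glob.glob(os.path.join(path, '*.xml'))):
        parser = ET.XMLParser()                   # new parser per file; any parser options go here
        root = ET.parse(filename, parser=parser).getroot()
        full.extend(root.findall('pair'))         # direct <pair> children of <dataset>
    return full

# Small self-contained demo using two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(2):
        with open(os.path.join(tmp, 'part{}.xml'.format(i)), 'w') as fh:
            fh.write('<dataset><pair><t1>a{0}</t1><t2>b{0}</t2></pair></dataset>'.format(i))
    merged = merge_pairs(tmp)

print(len(merged))  # number of <pair> elements collected
```

The merged tree can be saved with ET.ElementTree(merged).write('merged.xml'). This does not address the encoding problem mentioned in the update, which may be a separate issue.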
I am new to etree. I wanted to read an etree and write that information out in another file format such as HTML or XML, and I can now do that. But what about the other way around? That is, how do I read some other file format and generate or write an etree from it? Please give me some suggestions or an example to proceed with.
Suppose you want to write an XML file test.xml like the following:
<?xml version='1.0' encoding='ASCII'?>
<document category="locations">
  <name>Timbuktu</name>
  <name>Eldorado</name>
</document>
The corresponding code would be:
from lxml import etree

root = etree.Element("document", {"category": "locations"})
for location in ["Timbuktu", "Eldorado"]:
    name = etree.SubElement(root, "name")
    name.text = location
tree = etree.ElementTree(element=root, file=None, parser=None)
tree.write('test.xml', pretty_print=True, xml_declaration=True)
If you want to add further sub-elements under name, nest another for loop and create the sub-elements under the name element.
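The nested loop could look roughly like this. The sketch uses the standard library's xml.etree.ElementTree, whose Element and SubElement calls work the same way as the lxml ones used above, so that it runs without extra packages. The landmark data, the landmark tag and the value attribute are invented for the example.

```python
import xml.etree.ElementTree as etree  # lxml.etree offers the same Element/SubElement calls

# Hypothetical data: each location name maps to a list of landmarks.
locations = {"Timbuktu": ["mosque", "market"], "Eldorado": ["mine"]}

root = etree.Element("document", {"category": "locations"})
for location, landmarks in locations.items():
    name = etree.SubElement(root, "name")
    name.set("value", location)              # store the location as an attribute
    for landmark in landmarks:                # inner loop: children of <name>
        etree.SubElement(name, "landmark").text = landmark

xml_text = etree.tostring(root, encoding="unicode")
print(xml_text)
```

The same pattern applies when the data comes from another file format: read the records first, then loop over them to build elements.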
Currently, my code takes the name of an XML file as a parameter, parses some of that file's content, and uses it to rename the file. What I want instead is to run my program once and have it search for every XML file in the directory (even one inside a zip), renaming each file by the same rules. That is the part I am having problems with.
#encoding:utf-8
import os, re
from sys import argv
script, nombre_de_archivo = argv
regexFecha = r'\d{4}-\d{2}-\d{2}'
regexLocalidad = r'localidad=\"[\w\s.,-_]*\"'
regexNombre = r'nombre=\"[\w\s.,-_]*\"'
regexTotal = r'total=\"\d+.?\d+\"'
fechas = []; localidades = []; nombres = []; totales = []
# The with block closes the file before it is renamed below.
with open(nombre_de_archivo) as archivo:
    for linea in archivo:
        fechas.append(re.findall(regexFecha, linea))
        localidades.append(re.findall(regexLocalidad, linea))
        nombres.append(re.findall(regexNombre, linea))
        totales.append(re.findall(regexTotal, linea))
fecha = str(fechas[1][0])
# Split on the quotes: str.strip() removes a set of characters, not a prefix.
localidad = str(localidades[1][0]).split('"')[1]
nombre = str(nombres[1][0]).split('"')[1]
total = str(totales[1][0]).split('"')[1]
nombre_nuevo_archivo = fecha+"_"+localidad+"_"+nombre+"_"+total+".xml"
os.rename(nombre_de_archivo, nombre_nuevo_archivo)
EDIT: Here is an example.
The directory contains only three files, as well as the program:
silly.xml amusing.zip feisty.txt
You run the program, and it ignores feisty because it is a .txt file. It reads silly.xml, parses "fechas, localidad, nombre, total", concatenates them, and uses the result as the new file name for silly.xml. Then the program checks whether the zip contains an XML file; if it does, it does the same thing with that file.
So in the end we would have:
20141211_sonora_walmart_2033.xml 20141008_sonora_starbucks_102.xml feisty.txt amusing.zip
Your question is not clear, and the code you posted is too broad.
I can't debug regular expressions with the power of my sight, but there are a number of things you can do to simplify the code. Simple code means fewer errors and an easier time debugging.
To locate your target files, use glob.glob:
import glob
files = glob.glob('dir/*.xml')
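Since the question also mentions XML files inside zip archives, the search could be extended with the standard zipfile module. This is a rough sketch; the function name find_xml_sources and the demo file inner.xml are made up, and it only looks at the top level of the directory.

```python
import glob
import os
import tempfile
import zipfile

def find_xml_sources(directory):
    """Yield (container, member) pairs: member is None for a plain .xml file,
    or the name of an .xml entry inside a .zip archive."""
    for path in sorted(glob.glob(os.path.join(directory, '*'))):
        lower = path.lower()
        if lower.endswith('.xml'):
            yield path, None
        elif lower.endswith('.zip') and zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                for member in archive.namelist():
                    if member.lower().endswith('.xml'):
                        yield path, member

# Demo directory mirroring the example in the question.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'silly.xml'), 'w') as fh:
        fh.write('<a/>')
    with open(os.path.join(tmp, 'feisty.txt'), 'w') as fh:
        fh.write('not xml')
    with zipfile.ZipFile(os.path.join(tmp, 'amusing.zip'), 'w') as zf:
        zf.writestr('inner.xml', '<b/>')
    found = [(os.path.basename(c), m) for c, m in find_xml_sources(tmp)]

print(found)
```

An entry inside an archive can be read with ZipFile.read(member) and, if needed, extracted before renaming; how the renamed file should be stored relative to the zip is left open here.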
To parse them, ditch regular expressions and use the ElementTree API.
import xml.etree.ElementTree as ET
tree = ET.parse('target.xml')
root = tree.getroot()
There are also modules for navigating XML files with CSS selectors and XPath. Extracting fields from the filename using regex is okay, but check out named groups.
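As a rough illustration of named groups applied to the attributes from the question: the capture groups return only the quoted values, so no strip() calls are needed afterwards. The venta element and the attribute layout of the sample line are assumptions made for the example, as is the helper name new_name.

```python
import re

# One pattern for all three attributes; the groups are addressed by name.
ATTR = re.compile(r'\b(?P<key>localidad|nombre|total)="(?P<value>[^"]*)"')
FECHA = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

def new_name(line):
    """Build the new file name from one line of XML text."""
    fields = {m.group('key'): m.group('value') for m in ATTR.finditer(line)}
    date = FECHA.search(line)
    fecha = date.group('year') + date.group('month') + date.group('day')
    return '{}_{}_{}_{}.xml'.format(
        fecha, fields['localidad'], fields['nombre'], fields['total'])

# Hypothetical line; the real files may lay out these attributes differently.
line = '<venta fecha="2014-12-11" localidad="sonora" nombre="walmart" total="2033"/>'
print(new_name(line))  # 20141211_sonora_walmart_2033.xml
```

If the values really are XML attributes, reading them with element.get('localidad') after ET.parse would avoid regular expressions altogether.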