Python text file to XML

I have a question about transforming a text file to XML. I have already done a basic conversion of the text file, and each record looks like this:
Program: 5 Start: 2013-09-11 05:30:00 Duration 06:15:00 Title: INFOCANALE
and I would like the XML output to look like this:
<data>
  <eg>
    <program>Program 5</program>
    <start>2013-09-11 05:30:00</start>
    <duration>06:15:00</duration>
    <title>INFOCANALE</title>
  </eg>
</data>
Can Python convert a text file to XML? Can you help me with some advice, or some code?

I think the easiest way would be to change your file into a CSV file like this:
Program,Start,Duration,Title
5,2013-09-11 05:30:00,06:15:00,INFOCANALE
And then convert it like:
from lxml import etree
import csv

root = etree.Element('data')
with open("your file name here") as f:
    rdr = csv.reader(f)
    header = next(rdr)              # the first row holds the element names
    for row in rdr:
        eg = etree.SubElement(root, 'eg')
        for h, v in zip(header, row):
            etree.SubElement(eg, h).text = v

# etree.tostring() returns bytes, so write in binary mode
with open(r"C:\temp\data2.xml", "wb") as f:
    f.write(etree.tostring(root))
# you can also use
# etree.ElementTree(root).write(r"C:\temp\data2.xml")
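If you would rather skip the intermediate CSV file, here is a minimal sketch that parses each line of the original text directly with the standard library (this assumes every record sits on one line in exactly the format shown in the question; programs.txt and data.xml are placeholder file names):
import re
import xml.etree.ElementTree as ET

# pattern for lines like:
# Program: 5 Start: 2013-09-11 05:30:00 Duration 06:15:00 Title: INFOCANALE
LINE_RE = re.compile(
    r"Program:\s*(?P<program>\S+)\s+"
    r"Start:\s*(?P<start>\S+ \S+)\s+"
    r"Duration:?\s*(?P<duration>\S+)\s+"
    r"Title:\s*(?P<title>.+)"
)

root = ET.Element('data')
with open('programs.txt') as f:            # placeholder input file name
    for line in f:
        m = LINE_RE.match(line.strip())
        if not m:
            continue                       # skip lines that don't match the pattern
        eg = ET.SubElement(root, 'eg')
        for tag in ('program', 'start', 'duration', 'title'):
            ET.SubElement(eg, tag).text = m.group(tag)

ET.ElementTree(root).write('data.xml', encoding='utf-8', xml_declaration=True)
Note that this stores only the value (e.g. 5) in <program>, not the literal text "Program 5" shown in the desired output; adjust to taste.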

Related

create a condition that separates text.tag when parsing xml with python

I have an XML file (file.xml) that I download from a source. Inside its Details element there are two OrderDetail elements. What I do with it is decode each one and write a new XML file, which I then parse to get some information from.
<root>
<Details>
<OrderDetail ParentLineID="">H4sIAAAAAAAEAOy963LbyJbn+/
lMxLwDwtO7qnaMYeF+8d7VHZJolV0lWypRLu/u6g4HCC
QljClCmwTLdn+aFzkvd57k4EJSJAGIyJWQ8E+Ve3qqLd7
XSlxW/jLzl3//ty83E+UPNpvHyfTHZ/oL7dm//ev//B9/P
06m4/hqMQvS7==
</OrderDetail>
<OrderDetail ParentLineID="">H4sIAAAAAAAEAOy963LbyJbn+/
lMxLwDwtO7qnaMYeF+8d7VHZJolV0lWypRLu/u6g4HCC
QljClCmwTLdn+aFzkvd57k4EJSJAGIyJWQ8E+Ve3qqLd7
XSlxW/jLzl3//ty83E+UPNpvHyfTHZ/oL7dm//ev//B9/P
06m4/hqMQvS7==
</OrderDetail>
</Details>
</root>
import base64
import zlib
import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()
DEST_FILE_NAME = "XMLparser\\decompresed.xml"

def translate_to_file():
    for child in root.iter('OrderDetail'):
        child.get('ParentLineID')
        result = zlib.decompress(base64.b64decode(child.text), 16 + zlib.MAX_WBITS).decode('utf-8')
        # note: the file is reopened in "w" mode for every OrderDetail,
        # so each iteration overwrites the previous result
        with open(DEST_FILE_NAME, "w") as file:
            file.write(result)

def read_file():
    with open(DEST_FILE_NAME) as file:
        return file.readlines()

def clean_file(lines):
    with open(DEST_FILE_NAME, 'w') as file:
        lines = filter(lambda x: x.strip(), lines)
        file.writelines(lines)

def main():
    translate_to_file()
    lines = read_file()
    clean_file(lines)

main()
When this payload is decoded it creates an XML file. How can I create two separate XML files, one for each OrderDetail? That is, take the first base64 payload, decompress it and create an XML file, then take the other base64 payload, decompress it and create a separate XML file.
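One possible approach (just a sketch, reusing the decode/decompress call from the code above; the output file names are placeholders) is to number the output files while iterating over the OrderDetail elements:
import base64
import zlib
import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()

# write each decoded OrderDetail to its own file:
# XMLparser\decompresed_0.xml, XMLparser\decompresed_1.xml, ...
for index, child in enumerate(root.iter('OrderDetail')):
    result = zlib.decompress(base64.b64decode(child.text),
                             16 + zlib.MAX_WBITS).decode('utf-8')
    with open("XMLparser\\decompresed_%d.xml" % index, "w") as out:
        out.write(result)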

xml file to csv file python script

I need a Python script to extract data from an XML file. I have an XML file as shown below:
<software>
  <name>Update Image</name>
  <Build>22.02</Build>
  <description>Firmware for Delta-M Series </description>
  <CommonImages> </CommonImages>
  <ModelBasedImages>
    <ULT>
      <CNTRL_0>
        <file type="UI_APP" ver="2.35" crc="1234"/>
        <file type="MainFW" ver="5.01" crc="5678"/>
        <SIZE300>
          <file type="ParamTableDB" ver="1.1.4" crc="9101"/>
        </SIZE300>
      </CNTRL_0>
      <CNTRL_2>
        <file type="UI_APP" ver="2.35" crc="1234"/>
        <file type="MainFW" ver="5.01" crc="9158"/>
      </CNTRL_2>
    </ULT>
  </ModelBasedImages>
</software>
I want the data in table format like:
type          ver    crc
UI_APP        2.35   1234
MainFW        5.01   5678
ParamTableDB  1.1.4  9101
UI_APP        2.35   1234
MainFW        5.01   9158
The extracted data can go into any type of file: CSV, doc, ...
I tried this code:
import xml.etree.ElementTree as ET
import csv

tree = ET.parse("Build_40.01 (copy).xml")
root = tree.getroot()

# open a file for writing
Resident_data = open('ResidentData.csv', 'w')
# create the csv writer object
csvwriter = csv.writer(Resident_data)
resident_head = []
count = 0
for member in root.findall('file'):
    resident = []
    address_list = []
    if count == 0:
        name = member.find('type').tag
        resident_head.append(name)
        ver = member.find('ver').tag
        resident_head.append(ver)
        crc = member.find('crc').tag
        resident_head.append(crc)
        csvwriter.writerow(resident_head)
        count = count + 1
    name = member.find('type').text
    resident.append(name)
    ver = member.find('ver').text
    resident.append(ver)
    crc = member.find('crc').text
    resident.append(crc)
    csvwriter.writerow(resident)
Resident_data.close()
Thanks in advance
Edit: XML code updated.
Use the xpath expression .//file to find all <file> elements in the XML document, and then use each element's attributes to populate the CSV file through a csv.DictWriter:
import csv
import xml.etree.ElementTree as ET

tree = ET.parse("Build_40.01 (copy).xml")
root = tree.getroot()

with open('ResidentData.csv', 'w') as f:
    w = csv.DictWriter(f, fieldnames=('type', 'ver', 'crc'))
    w.writeheader()
    w.writerows(e.attrib for e in root.findall('.//file'))
For your sample input the output CSV file will look like this:
type,ver,crc
UI_APP,2.35,1234
MainFW,5.01,5678
ParamTableDB,1.1.4,9101
UI_APP,2.35,1234
MainFW,5.01,9158
which uses the default delimiter (a comma) for a CSV file. You can change the delimiter using the delimiter=' ' option to DictWriter(); however, you will not be able to obtain the same formatting as your sample output, which appears to use fixed-width fields (but you might get away with using a tab as the delimiter).
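For example, a tab-separated variant only needs the delimiter argument changed (a sketch; the input file name is taken from the question, ResidentData.tsv is a placeholder output name):
import csv
import xml.etree.ElementTree as ET

tree = ET.parse("Build_40.01 (copy).xml")
root = tree.getroot()

with open('ResidentData.tsv', 'w', newline='') as f:
    # tab-separated output instead of the default comma
    w = csv.DictWriter(f, fieldnames=('type', 'ver', 'crc'), delimiter='\t')
    w.writeheader()
    w.writerows(e.attrib for e in root.findall('.//file'))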

Use content of a file in the path to a second file

I want to insert the content of a txt file into a path.
Example:
I have a txt file in ./path/date.txt with the content
08122016
How do I put the content (08122016) into the path of a second file?
Something like this:
s = open('/erp/date/**date.txt content**').read()
Use os.path.join:
import os

with open(r'./path/date.txt', 'rt') as input_file:
    data = input_file.read().strip()   # strip the trailing newline, if any

with open(os.path.join('/erp/date', data), 'rt') as input_file2:
    data2 = input_file2.read()
# open the date file
f = open("./path/date.txt", 'r')
# read the content
content = f.read()
# close the file
f.close()
# insert the date into the path (strip the trailing newline first)
s = open("/erp/date/" + content.strip()).read()
You can insert strings into other strings like this (Python 2.7.12):
path = 'home/user/path/%s' % content
The %s in the string will be replaced by the content variable.

Python read in file: ERROR: line contains NULL byte

I would like to parse a .ubx file (my input file). This file contains many different NMEA sentences as well as raw receiver data. The output file should only contain information from the GGA sentences. This works fine as long as the .ubx file does not contain any raw messages. However, if it contains raw data I get the following error:
Traceback (most recent call last):
  File "C:...myParser.py", line 25, in <module>
    for row in reader:
Error: line contains NULL byte
My code looks like this:
import csv
from datetime import datetime
import math

# adapt this to your file
INPUT_FILENAME = 'Rover.ubx'
OUTPUT_FILENAME = 'out2.csv'

# open the input file in read mode
with open(INPUT_FILENAME, 'r') as input_file:
    # open the output file in write mode
    with open(OUTPUT_FILENAME, 'wt') as output_file:
        # create a csv reader object from the input file (nmea files are basically csv)
        reader = csv.reader(input_file)
        # create a csv writer object for the output file
        writer = csv.writer(output_file, delimiter=',', lineterminator='\n')
        # write the header line to the csv file
        writer.writerow(['Time', 'Longitude', 'Latitude', 'Altitude', 'Quality', 'Number of Sat.', 'HDOP', 'Geoid seperation', 'diffAge'])
        # iterate over all the rows in the nmea file
        for row in reader:
            if row[0].startswith('$GNGGA'):
                time = row[1]
                # merge the time and date columns into one Python datetime object (usually more convenient than having both separately)
                date_and_time = datetime.strptime(time, '%H%M%S.%f')
                date_and_time = date_and_time.strftime('%H:%M:%S.%f')[:-6]
                writer.writerow([date_and_time])
My .ubx file looks like this:
$GNGSA,A,3,16,25,29,20,31,26,05,21,,,,,1.30,0.70,1.10*10
$GNGSA,A,3,88,79,78,81,82,80,72,,,,,,1.30,0.70,1.10*16
$GPGSV,4,1,13,02,08,040,17,04,,,47,05,18,071,44,09,02,348,24*49
$GPGSV,4,2,13,12,03,118,24,16,12,298,36,20,15,118,30,21,44,179,51*74
$GPGSV,4,3,13,23,06,324,35,25,37,121,47,26,40,299,48,29,60,061,49*73
$GPGSV,4,4,13,31,52,239,51*42
$GLGSV,3,1,10,65,07,076,24,70,01,085,,71,04,342,34,72,13,029,35*64
$GLGSV,3,2,10,78,35,164,41,79,75,214,48,80,34,322,46,81,79,269,49*64
$GLGSV,3,3,10,82,28,235,52,88,39,043,43*6D
$GNGLL,4951.69412,N,00839.03672,E,124610.00,A,D*71
$GNGST,124610.00,12,,,,0.010,0.010,0.010*4B
$GNZDA,124610.00,03,07,2016,00,00*79
µb<  ¸½¸Abð½ . SB éF é v.¥ # 1 f =•Iè ,
Ïÿÿ£Ëÿÿd¡ ¬M 0+ùÿÿ³øÿÿµj #ª ² -K*
,¨ , éºJU /) ++ f 5 .lG NL C8G /{; „> é óK 3 — Bòl . "¿ 2 bm¡
4âH ÐM X cRˆ 35 »7 Óo‡ž "*ßÿÿØÜÿÿUhQ`
3ŒðÿÿÂïÿÿþþûù ÂÈÿÿñÅÿÿJX ES
$²I uM N:w (YÃÿÿV¿ÿÿ> =ìî 1¥éÿÿèÿÿmk³m /?ÔÿÿÒÿÿšz+Ú ­Ïÿÿ6ÍÿÿêwÇ\ ? ]? ˜B Aÿƒ y µbÐD‹lçtæ#p3,}ßœŒ-vAh
¿M"A‚UE ôû JQý
'wA´üát¸jžAÀ‚"Å
)DÂï–ŽtAöÙüñÅ›A|$Å ôû/ Ìcd§ÇørA†áãì˜AØY–Ä ôû1 /Áƒ´zsAc5+_’ô™AìéNÅ ôû( ¶y(,wvAFøÈV§ƒA˜ÝwE ôû$ _S R‰wAhÙ]‘ÑëžAÇ9Å vwAòܧsAŒöƒd§Ò™AÜOÄ ôû3 kœÕ}vA;D.ž‡žAÒûàÄ #ˆ" ϬŸ ntAfˆÞ3ךA~Y2E ôû3 :GVtAæ93l)ÆšAß yE ôû4 Uþy.TwA<âƒ' ¦žAhmëC ôû" ¯4Çï ›wAþ‰Ì½6ŸAŠû¶D ~~xI]tA<ÞÿrÁšAmHE ôû/ ÖÆ#ÈgŸsAXnþ‚†4šA'0tE ôû. ·ÈO:’
sA¢B†i™Aë%
E ôû/ >Þ,À8vA°‚9êœA>ÇD ôû, ø(¼+çŠuAÆOÁ לAÈΆD
ôû# ¨Ä-_c¯qAuÓ?]> —AÐкà ôû0 ÆUV¨ØZsA]ðÛñß™AÛ'Å ôû, ™mv7žqAYÐ:›Ä‘—AdWxD ôû1 ûö>%vA}„
ëV˜A.êbE
AÝ$GNRMC,124611.00,A,4951.69413,N,00839.03672,E,0.009,,030716,,,D*62
$GNVTG,,T,,M,0.009,N,0.016,K,D*36
$GNGNS,124611.00,4951.69413,N,00839.03672,E,RR,15,0.70,162.5,47.6,1.0,0000*42
$GNGGA,124611.00,4951.69413,N,00839.03672,E,4,12,0.70,162.5,M,47.6,M,1.0,0000*6A
$GNGSA,A,3,16,25,29,20,31,26,05,21,,,,,1.31,0.70,1.10*11
$GNGSA,A,3,88,79,78,81,82,80,72,,,,,,1.31,0.70,1.10*17
$GPGSV,4,1,13,02,08,040,18,04,,,47,05,18,071,44,09,02,348,21*43
$GPGSV,4,2,13,12,03,118,24,16,
I already searched for similar problems; however, I was not able to find a solution that works for me. I ended up with code like this:
import csv

def unfussy_reader(csv_reader):
    while True:
        try:
            yield next(csv_reader)
        except csv.Error:
            # log the problem or whatever
            print("Problem with some row")
            continue

if __name__ == '__main__':
    #
    # Generate malformed csv file for
    # demonstration purposes
    #
    with open("temp.csv", "w") as fout:
        fout.write("abc,def\nghi\x00,klm\n123,456")
    #
    # Open the malformed file for reading, fire up a
    # conventional CSV reader over it, wrap that reader
    # in our "unfussy" generator and enumerate over that
    # generator.
    #
    with open("Rover.ubx") as fin:
        reader = unfussy_reader(csv.reader(fin))
        for n, row in enumerate(reader):
            fout.write(row[0])
However, with the above code I was not able to simply write a file containing all the rows read in through the unfussy_reader wrapper.
Would be glad if you could help me.
Here is an image of how the .ubx file looks in Notepad++.
Thanks!
I am not quite sure, but your file looks pretty binary. You should try to open it as such:
with open(INPUT_FILENAME, 'rb') as input_file:
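Building on that hint, a minimal sketch (not tested against real uBlox data; the file name and the printed field are assumptions) would read the file as bytes, keep only the ASCII NMEA sentences, and feed those to the CSV reader:
import csv

nmea_lines = []
with open('Rover.ubx', 'rb') as input_file:
    for raw in input_file:
        # NMEA sentences start with '$'; everything else is binary UBX data
        if raw.startswith(b'$'):
            nmea_lines.append(raw.decode('ascii', errors='ignore').strip())

# csv.reader accepts any iterable of strings, so the filtered list works directly
for row in csv.reader(nmea_lines):
    if row and row[0].startswith('$GNGGA'):
        print(row[1])  # the GGA time field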
It seems like you did not open the file with the correct encoding, so the raw messages cannot be read correctly. If it is encoded as UTF-8, you need to open the file with the encoding option:
with open(INPUT_FILENAME, 'r', newline='', encoding='utf8') as input_file
Hey, if anyone else has this problem of reading NMEA sentences out of uBlox .ubx files, this Python code worked for me:
def read_in():
    # the .ubx file is opened in binary mode, so compare against byte strings
    # and open the output files in binary mode as well
    with open('GNGGA.txt', 'wb') as GNGGA:
        with open('GNRMC.txt', 'wb') as GNRMC:
            with open('rover.ubx', 'rb') as f:
                for line in f:
                    # print(line)
                    if line.startswith(b'$GNGGA'):
                        GNGGA.write(line)
                    if line.startswith(b'$GNRMC'):
                        GNRMC.write(line)

read_in()
You could also use the gnssdump command line utility which is installed with the PyGPSClient and pygnssutils Python packages.
e.g.
gnssdump filename=Rover.ubx msgfilter=GNGGA
See gnssdump -h for help.
Alternatively, if you want a simple Python script, you could use the pyubx2 Python package, e.g.
from pyubx2 import UBXReader

with open("Rover.ubx", "rb") as stream:
    ubr = UBXReader(stream)
    for (_, parsed_data) in ubr.iterate():
        if parsed_data.identity in ("GNGGA", "GNRMC"):
            print(parsed_data)

extract values and construct new file

I have "vtu" format file (for paraview) as a text. The format is like below:
<?xml version="1.0"?>
<VTKFile type="UnstructuredGrid" version="0.1" byte_order="LittleEndian" >
  <UnstructuredGrid>
    <Piece NumberOfPoints="21" NumberOfCells="20" >
      <Points>
        <DataArray type="Float64" Name="coordinates" NumberOfComponents="3" format="ascii" >
          -3.3333333333e-01 1.1111111111e-01 0.0000000000e+00
          -2.7777777778e-01 1.1111111111e-01 0.0000000000e+00
          -1.1111111111e-01 4.4444444445e-01 0.0000000000e+00
        </DataArray>
      </Points>
      <Cells>
        <DataArray type="UInt64" Name="connectivity" NumberOfComponents="1" format="ascii" >
          0 1
          2 3
          5 4
It represents a mesh. I would like to extract the value of NumberOfPoints and also the first two coordinates of each point, and store them in another file as follows:
21
-3.3333333333e-01
1.1111111111e-01
-2.7777777778e-01
1.1111111111e-01
-1.1111111111e-01
4.4444444445e-01
I am not familiar with Python; I could only read the file line by line, but I don't know how to construct the above file. What I have learnt so far is very simple. For the first file, I am able to detect the line where NumberOfPoints is included by:
import xml.etree.ElementTree as ET

tree = ET.parse('read.vtu')
root = tree.getroot()
for Piece in root.iter('Piece'):
    print Piece.attrib
    nr = Piece.get('NumberOfPoints')
    print nr
With this I can get 21 :) The next step is to add the coordinates, but I don't know how to parse them, since I cannot find any node connected to them.
Try this:
import xml.etree.ElementTree as ET
try:
    from cStringIO import StringIO
except ImportError:
    from StringIO import StringIO

o = file('out.txt', 'w')

tree = ET.parse('read.vtu')
root = tree.getroot()
for Piece in root.iter('Piece'):
    nr = Piece.get('NumberOfPoints')
    o.write(nr + '\n')

piece = root.iter('Piece')
piece = piece.next()
point = piece.getchildren()[0]   # the <Points> element
dataArr = point.getchildren()
data = dataArr[0]                # the coordinates <DataArray>

# Writing to a buffer
output = StringIO()
output.write(data.text)

# Rewind the buffer and read the values back line by line
output.seek(0)
for l in output:
    ls = l.split()
    if len(ls) >= 2:             # skip blank/whitespace-only lines
        o.write(ls[0] + '\n')
        o.write(ls[1] + '\n')
output.close()
o.close()
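A shorter variant of the same idea (just a sketch, assuming the first DataArray under Points always holds the coordinates; it keeps the out.txt name used above):
import xml.etree.ElementTree as ET

tree = ET.parse('read.vtu')
piece = tree.getroot().find('.//Piece')

with open('out.txt', 'w') as out:
    # number of points from the Piece attribute
    out.write(piece.get('NumberOfPoints') + '\n')
    # the coordinate block is the first DataArray under Points
    coords = piece.find('./Points/DataArray').text
    for line in coords.strip().splitlines():
        fields = line.split()
        if len(fields) >= 2:
            out.write(fields[0] + '\n')    # x component
            out.write(fields[1] + '\n')    # y component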
meshio (a project of mine) knows the VTU format, so you could simply
pip install meshio
and then
import meshio
points, cells, _, _, _ = meshio.read('file.vtu')
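With the points array in hand, writing the output asked for in the question is then a few lines; a sketch, assuming the tuple-returning meshio call shown above and a placeholder output name out.txt:
# points is an (N, 3) array of coordinates
with open('out.txt', 'w') as out:
    out.write('%d\n' % len(points))
    for x, y, _z in points:
        # first two components of each point, one value per line
        out.write('%s\n%s\n' % (x, y))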
