XML writing - new file - Python

I am using Python to pick a specific set of values from my XML:
# getchildren() was removed in Python 3.9; list(element) is equivalent
children = list(root[2])
for child in children:
    ET.dump(child)
This prints exactly what I need from my XML, and I can change the root index to access different data. I want to export the selected data as a new, separate XML file. However, when I use:
tree.write('new.xml')
it exports the entire XML, as before, not just the part I selected above.
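One possible way to get only the selected part into a file is to wrap the selected element in its own ElementTree and write that tree instead of the original one. The sketch below assumes the whole root[2] subtree is what should be exported (its children need a single enclosing element to form a valid XML file); the input string is a made-up stand-in for the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical input standing in for the real XML file.
source = "<root><a/><b/><c><item>1</item><item>2</item></c></root>"
root = ET.fromstring(source)

# Pick the subtree to export (same index as in the question).
selected = root[2]

# tree.write() always writes the tree's own root, so build a new
# tree whose root is the selected element and write that instead.
ET.ElementTree(selected).write("new.xml")
```

After this runs, new.xml should contain only the c element and its children, while the original tree is left unchanged.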

Related

How to get the text of drop down values in squish tool?

I have to verify the text of the elements in a drop-down list. How can I do this with a Python script in Squish?
Naive approach:
Record (then replay) selecting each of the entries. Use exception handling to log failures when accessing individual entries, so that the test script can continue executing.
More flexible approach:
Record selecting one of the entries. This gives you the script code that opens the drop-down and the object name of the drop-down list. Then use object.children() to get all child elements of the drop-down list object.
Pseudo example:
drop_down_list = waitForObject(...)
children = object.children(drop_down_list)
# test.compare() checks two values for equality; test.verify() expects a boolean condition
test.compare(children[0].text, "Entry 1")
(You have to check the properties of the children to see which actual property contains the text or whatever else you want to verify.)

Modifying xml attributes through python

I have an XML file that contains the following:
<PHYSICAL_TLINE>
<Traces general_diff="0" z_array="0" s_array="0" w_array="0" etch_factor="0.35" TS_track2track="0" TS_DQS="0" TW_DQS="0" TS_byte2dqs="0" TS_byte2byte="0" TS_DQ="0" TW_DQ="0" dsl_offset="0" D="20" TS="7" TW="5"/>
</PHYSICAL_TLINE>
Is there a way to set the values of these attributes through Python? For example, what if I want to change the value of s_array to 5 instead of 0?
I know that xml.etree has a set method, but I'm not sure how to use it to set the attributes of the child element.
child.attrib["s_array"] = '5'
This assumes that child is the <Traces/> node.
Edit:
The value needs to be a string.
This documentation may be helpful for you:
https://docs.python.org/2/library/xml.etree.elementtree.html
See section 19.7.1.4, "Modifying an XML File".
Adapting code like this should achieve the desired result:
for rank in root.iter('rank'):
    rank.set('updated', 'yes')
tree.write('output.xml')
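Applied to the XML from the question, the same idea might look like the sketch below. The XML string is a shortened, made-up version of the asker's file (only a few of the attributes are kept), and the result is serialized to a string rather than written to disk.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the file in the question.
xml_text = (
    '<PHYSICAL_TLINE>'
    '<Traces s_array="0" w_array="0" TS="7" TW="5"/>'
    '</PHYSICAL_TLINE>'
)
root = ET.fromstring(xml_text)

# Locate the child element and set the attribute.
# Attribute values must be strings, so pass "5" rather than 5.
traces = root.find("Traces")
traces.set("s_array", "5")

# Serialize the modified tree to check the change.
result = ET.tostring(root, encoding="unicode")
```

With a file on disk, ET.parse() followed by tree.write() would take the place of fromstring() and tostring().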

Python XML 'TypeError: must be xml.etree.ElementTree.Element, not str'

I am trying to build an XML file from a CSV file. My code reads the CSV file into a data structure and then starts creating the XML from that data.
CSV Example:
Element,XMLFile
SubElement,XMLName,XMLFile
SubElement,XMLDate,XMLName
SubElement,XMLInformation,XMLDate
SubElement,XMLTime,XMLName
Expected Output:
<XMLFile>
<XMLName>
<XMLDate>
<XMLInformation />
</XMLDate>
<XMLTime />
</XMLName>
</XMLFile>
Currently my code attempts to look at the CSV to see what the parent is for the new subelement:
# Defines main element
# xmlElement = xml.Element(XMLFile)
xmlElement = xml.Element(csvData[rowNumber][columnNumber])
# Should Define desired parent (FAIL) and SubElement name (PASS)
# xmlSubElement = xml.SubElement(XMLFile, XMLName)
xmlSubElement = xml.SubElement(csvData[rowNumber][columnNumber + 2], csvData[rowNumber][columnNumber + 1])
When the code attempts to use the CSV source string as the parent parameter, Python 3.5 generates the following error:
TypeError: must be xml.etree.ElementTree.Element, not str
The known cause of the error is that the parent parameter is passed as a string, when it is expected to be an Element or SubElement.
Is it possible to recall the stored value from the CSV and have it reference the Element or SubElement, instead of a string? The goal is to allow the code to read the CSV file and assign any SubElement to the parent listed in the CSV.
I cannot tell for sure, but it looks like you are doing:
ElementTree.SubElement(str, str)
when you should be doing:
ElementTree.SubElement(Element, str)
It also seems like you already know this. The real question, then, is how are you going to reference the parent object when you only know its tag string? You could search for Elements in the ElementTree with that particular tag string, but this is generally not a good idea as XML allows multiple instances of similar elements.
I would suggest you either:
Find a strategy to store references to parent elements
See if there is a way to uniquely identify the parent element using XPath
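The first suggestion could be sketched as follows: keep a dictionary that maps each tag name to the Element created for it, and look the parent up in that dictionary instead of passing the string. This assumes every tag name in the CSV is unique and that each parent row appears before its children; the names elements and csv_text are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# CSV content from the question, inlined to keep the example self-contained.
csv_text = """Element,XMLFile
SubElement,XMLName,XMLFile
SubElement,XMLDate,XMLName
SubElement,XMLInformation,XMLDate
SubElement,XMLTime,XMLName
"""

elements = {}  # tag name -> Element object created for that tag
root = None

for row in csv.reader(io.StringIO(csv_text)):
    if not row:
        continue  # skip blank lines
    if row[0] == "Element":
        # Row layout: Element,<tag>
        root = ET.Element(row[1])
        elements[row[1]] = root
    elif row[0] == "SubElement":
        # Row layout: SubElement,<tag>,<parent tag>
        parent = elements[row[2]]  # an Element, not a string
        elements[row[1]] = ET.SubElement(parent, row[1])

result = ET.tostring(root, encoding="unicode")
```

If tag names can repeat, the dictionary key would need something more specific than the tag, for example a path or an explicit ID column in the CSV.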

Conditional Selecting of child elements in pdfquery

I am using pdfquery to extract data from a PDF.
My PDF tree XML looks like the following:
<LTTextLineHorizontal>
<LTTextBoxHorizontal>Address</LTTextBoxHorizontal>
</LTTextLineHorizontal>
<LTTextBoxHorizontal>
<LTTextBoxHorizontal>
First-Name
</LTTextBoxHorizontal>
<LTTextBoxHorizontal>
Last-Name
</LTTextBoxHorizontal>
</LTTextBoxHorizontal>
The idea is to build the string "Address First-Name Last-Name".
I therefore need to select child elements depending on whether they exist, and I am at a loss as to how to do it.
You'll want to use .extract with the with_parent keyword in order to extract the children. The documentation gives a decent example:
pdf.extract([
    ('with_parent', 'LTPage[page_index="1"]'),
    ('last_name', ':in_bbox("315,680,395,700")')
])
In this case, you're simply limiting the search to page 1 of the document. However, you can also pass in the result of previous selections with the with_parent keyword.
For example, if Address in your example had children (street, city, zipcode), you could first find the address section, store the element in a variable, and then use .extract to pull out the children. How you store and structure the resulting data will depend on your ultimate needs.
address = pdf.pq('LTTextBoxHorizontal:contains("Address")')
pdf.extract([
    ('with_parent', address),
    (..., ...)])
In many cases, the children are not necessarily nested within the xml tree and you need to resort to bbox based approach. What I do in that case is construct a bbox using the "parent" as the top boundary and the next known non-child as the bottom boundary and then pass that in to .extract. Just remember that bboxes are constructed with the bottom-left, top-right coordinates.
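The bbox construction described above could be sketched with a small helper like the one below. It is plain string building and does not call pdfquery itself; the function name bbox_between and the sample coordinates are made up, and taking the widest left and right edges of the two elements is an assumption, not something the library prescribes. It relies on PDF coordinates having their origin at the bottom left, so an element higher on the page has larger y values.

```python
def bbox_between(parent_bbox, next_bbox):
    """Build an :in_bbox selector for the area between two elements.

    Both arguments are (x0, y0, x1, y1) tuples, i.e. bottom-left and
    top-right corners. parent_bbox is the element above the region,
    next_bbox is the first known non-child element below it.
    """
    px0, py0, px1, py1 = parent_bbox
    nx0, ny0, nx1, ny1 = next_bbox
    left = min(px0, nx0)    # widest left edge of the two (assumption)
    right = max(px1, nx1)   # widest right edge of the two (assumption)
    bottom = ny1            # top edge of the next element
    top = py0               # bottom edge of the parent
    return ':in_bbox("{},{},{},{}")'.format(left, bottom, right, top)


# Hypothetical coordinates: the parent sits above the next element.
selector = bbox_between((315, 700, 395, 720), (315, 640, 395, 660))
```

The resulting selector string can then be used as the second item of a tuple passed to .extract, in the same position as the ':in_bbox(...)' string in the documentation example.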

Data structure to use to keep tags in their original order

I have an input XML file that contains tags in a certain order. I want to open the file, read the tags, process them in their order of appearance, and generate a new XML file that contains the same information as the input file plus the results of processing each tag, in the same order as in the input file.
To create the output XML file I use a dictionary of the form { tag: information after processing, tag: information after processing }, which I pass to a function that generates the output file.
My problem is that the dictionary does not keep the tags in their original order in the output file. I thought about creating a class for each tag, holding the tag and its information after processing, but I do not know whether a list of such objects would have the same problem and leave the tags in the wrong order. If it would, which data structure should I use instead?
In summary, will replacing the dictionary with a list of class instances guarantee that the tags keep their original order of appearance in my output XML file?
Thanks
Sounds like you want an OrderedDict (https://docs.python.org/2/library/collections.html#collections.OrderedDict).
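A minimal sketch of that suggestion, assuming each tag appears only once in the input: fill the OrderedDict while iterating over the input in document order, then build the output by iterating over it again. The input string and the upper-casing step are placeholders for the real file and the real processing.

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET

# Hypothetical input; the tags are deliberately not in alphabetical order.
source = "<doc><zeta>one</zeta><alpha>two</alpha><mid>three</mid></doc>"
root = ET.fromstring(source)

# Iterating over an element yields its children in document order,
# and OrderedDict remembers the order in which keys were inserted.
results = OrderedDict()
for child in root:
    results[child.tag] = child.text.upper()  # placeholder processing

# Build the output in the same order.
out_root = ET.Element("doc")
for tag, value in results.items():
    ET.SubElement(out_root, tag).text = value

output = ET.tostring(out_root, encoding="unicode")
```

A plain list also preserves order, so a list of (tag, value) tuples or of objects would work too, and it copes with repeated tags, which a dictionary cannot. From Python 3.7 onward the built-in dict preserves insertion order as well.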
