I have an xml file that I need to parse without harcoding the attributes given:
<test_machines name="Mac1">
<ip_address>192.168.0.0</ip_address>
<operating_system type="OS X" version="10.10.5">OS X version=10.10.5</operating_system>
<path_to>path</path_to>
</test_maschines>
I was able to parse the file but hadrcoded the names of nodes like ip_machines and also hardcoding the attribute names like type="OS X".
I need to find a way to parse the file without doing any of that stuff.
How do I achieve this?
Thans to all in advance!
Python has both builtin functions and very powerful third party libraries like lxml with same API to parse xml files and much more. Check them out by clicking on the links.
Related
I am in a project in which I have the task to convert a text file to xml file.
But the restriction is, the file names should be same. I use element tree
module of python for this but while writing to the tree I use tree.write()
method in which I have to explicitly define / hardcode the xml filename
myself.
I want to know is there any procedure to automatically create the .xml file
with same name as the text file
the sh module provides great flexibility and may help you for what you want to do if I understand the question correctly. I have shown an example in the following:
import sh
name = "filename.xml"
new_name = name[-3] + "txt"
sh.touch(new_name)
from there you can open the created file and write directly to it.
Is there any standard for python configuration files? What I would like to have is a seperate document to my script which would have the options of my script in it. For example...
Test Options
Random_AOI = 1
Random_ReadMode = 1
These would then become a list within the python script, such as...
test_options(random_aoi, random_readmode)
Would I have to use regular expressions and scan the document or is there an easier way of performing this action?
Yes - there's ConfigParser, which parses .ini files.
There's a handy introduction to it here.
The "standard" config file formats for Python are INI (which you parse/write with the ConfigParser module -- http://docs.python.org/2/library/configparser.html) and JSON (http://docs.python.org/2/library/json.html).
I ran the Python REPL tool and imported a Python Module. Can I can dump the contents of that Module into a file? Is this possible? Thanks in advance.
In what format do you want to write the file? If you want exactly the same format that got imported, that's not hard -- but basically it's done with a file-to-file copy. For example, if the module in question is called blah, you can do:
>>> import shutil
>>> shutil.copy(blah.__file__, '/tmp/blahblah.pyc')
Do you mean something like this?
http://www.datamech.com/devan/trypython/trypython.py
I don't think it is possible, as this is a very restricted environment.
The __file__ attribute is faked, so doesn't map to a real file
You might get a start by getting a reference to your module object:
modobject = __import__("modulename")
Unfortunately those aren't pickleable. You might be able to iterate over dir(modobject) and get some good info out catching errors along the way... or is a string representation of dir(modobject) itself what you want?
I have this xml model.
link text
So I have to add some node (see the text commented) to this file.
How I can do it?
I have writed this partial code but it doesn't work:
xmldoc=minidom.parse(directory)
child = xmldoc.createElement("map")
for node in xmldoc.getElementsByTagName("Environment"):
node.appendChild(child)
Thanks in advance.
I downloaded your sample xml file and your code works fine. Your problem is most likely with the line: xmldoc=minidom.parse(directory), should this not be the path to the file you are trying to parse not to a directory? The parse() function parses an XML file it does not automatically parse all the XML files in a given directory.
If you change your code to something like below this should work fine:
xmldoc=minidom.parse("directory/model_template.xml")
child = xmldoc.createElement("map")
for node in xmldoc.getElementsByTagName("Environment"):
node.appendChild(child)
If you then execute the statement: print xmldoc.toxml() you will see that the map element has indeed been added to the Environment element: <Environment><map/></Environment>.
I have over a million text files compressed into 40 zip files. I also have a list of about 500 model names of phones. I want to find out the number of times a particular model was mentioned in the text files.
Is there any python module which can do a regex match on the files without unzipping it. Is there a simple way to solve this problem without unzipping?
There's nothing that will automatically do what you want.
However, there is a python zipfile module that will make this easy to do. Here's how to iterate over the lines in the file.
#!/usr/bin/python
import zipfile
f = zipfile.ZipFile('myfile.zip')
for subfile in f.namelist():
print subfile
data = f.read(subfile)
for line in data.split('\n'):
print line
You could loop through the zip files, reading individual files using the zipfile module and running your regex on those, eliminating to unzip all the files at once.
I'm fairly certain that you can't run a regex over the zipped data, at least not meaningfully.
To access the contents of a zip file you have to unzip it, although the zipfile package makes this fairly easy, as you can unzip each file within an archive individually.
Python zipfile module
Isn't it (at least theoretically) possible, to read in the ZIP's Huffman coding and then translate the regexp into the Huffman code? Might this be more efficient than first de-compressing the data, then running the regexp?
(Note: I know it wouldn't be quite that simple: you'd also have to deal with other aspects of the ZIP coding—file layout, block structures, back-references—but one imagines this could be fairly lightweight.)
EDIT: Also note that it's probably much more sensible to just use the zipfile solution.