I have a document which uses an XML namespace for which I want to increase /group/house/dogs by one: (the file is called houses.xml)
<?xml version="1.0"?>
<group xmlns="http://dogs.house.local">
<house>
<id>2821</id>
<dogs>2</dogs>
</house>
</group>
My current result using the code below is: (the created file is called houses2.xml)
<ns0:group xmlns:ns0="http://dogs.house.local">
<ns0:house>
<ns0:id>2821</ns0:id>
<ns0:dogs>3</ns0:dogs>
</ns0:house>
</ns0:group>
I would like to fix two things (if it is possible using ElementTree. If it isn´t, I´d be greatful for a suggestion as to what I should use instead):
I want to keep the <?xml version="1.0"?> line.
I do not want to prefix all tags, I´d like to keep it as is.
In conclusion, I don´t want to mess with the document more than I absolutely have to.
My current code (which works except for the above mentioned flaws) generating the above result follows.
I have made a utility function which loads an XML file using ElementTree and returns the elementTree and the namespace (as I do not want to hard code the namespace, and am willing to take the risk it implies):
def elementTreeRootAndNamespace(xml_file):
from xml.etree import ElementTree
import re
element_tree = ElementTree.parse(xml_file)
# Search for a namespace on the root tag
namespace_search = re.search('^({\S+})', element_tree.getroot().tag)
# Keep the namespace empty if none exists, if a namespace exists set
# namespace to {namespacename}
namespace = ''
if namespace_search:
namespace = namespace_search.group(1)
return element_tree, namespace
This is my code to update the number of dogs and save it to the new file houses2.xml:
elementTree, namespace = elementTreeRootAndNamespace('houses.xml')
# Insert the namespace before each tag when when finding current number of dogs,
# as ElementTree requires the namespace to be prefixed within {...} when a
# namespace is used in the document.
dogs = elementTree.find('{ns}house/{ns}dogs'.format(ns = namespace))
# Increase the number of dogs by one
dogs.text = str(int(dogs.text) + 1)
# Write the result to the new file houses2.xml.
elementTree.write('houses2.xml')
An XML based solution to this problem is to write a helper class for ElementTree which:
Grabs the XML-declaration line before parsing as ElementTree at the time of writing is unable to write an XML-declaration line without also writing an encoding attribute(I checked the source).
Parses the input file once, grabs the namespace of the root element. Registers that namespace with ElementTree as having the empty string as prefix. When that is done the source file is parsed using ElementTree again, with that new setting.
It has one major drawback:
XML-comments are lost. Which I have learned is not acceptable for this situation(I initially didn´t think the input data had any comments, but it turns out it has).
My helper class with example:
from xml.etree import ElementTree as ET
import re
class ElementTreeHelper():
def __init__(self, xml_file_name):
xml_file = open(xml_file_name, "rb")
self.__parse_xml_declaration(xml_file)
self.element_tree = ET.parse(xml_file)
xml_file.seek(0)
root_tag_namespace = self.__root_tag_namespace(self.element_tree)
self.namespace = None
if root_tag_namespace is not None:
self.namespace = '{' + root_tag_namespace + '}'
# Register the root tag namespace as having an empty prefix, as
# this has to be done before parsing xml_file we re-parse.
ET.register_namespace('', root_tag_namespace)
self.element_tree = ET.parse(xml_file)
def find(self, xpath_query):
return self.element_tree.find(xpath_query)
def write(self, xml_file_name):
xml_file = open(xml_file_name, "wb")
if self.xml_declaration_line is not None:
xml_file.write(self.xml_declaration_line + '\n')
return self.element_tree.write(xml_file)
def __parse_xml_declaration(self, xml_file):
first_line = xml_file.readline().strip()
if first_line.startswith('<?xml') and first_line.endswith('?>'):
self.xml_declaration_line = first_line
else:
self.xml_declaration_line = None
xml_file.seek(0)
def __root_tag_namespace(self, element_tree):
namespace_search = re.search('^{(\S+)}', element_tree.getroot().tag)
if namespace_search is not None:
return namespace_search.group(1)
else:
return None
def __main():
el_tree_hlp = ElementTreeHelper('houses.xml')
dogs_tag = el_tree_hlp.element_tree.getroot().find(
'{ns}house/{ns}dogs'.format(
ns=el_tree_hlp.namespace))
one_dog_added = int(dogs_tag.text.strip()) + 1
dogs_tag.text = str(one_dog_added)
el_tree_hlp.write('hejsan.xml')
if __name__ == '__main__':
__main()
The output:
<?xml version="1.0"?>
<group xmlns="http://dogs.house.local">
<house>
<id>2821</id>
<dogs>3</dogs>
</house>
</group>
If someone has an improvement to this solution please don´t hesitate to grab the code and improve it.
Round-tripping, unfortunately, isn't a trivial problem. With XML, it's generally not possible to preserve the original document unless you use a special parser (like DecentXML but that's for Java).
Depending on your needs, you have the following options:
If you control the source and you can secure your code with unit tests, you can write your own, simple parser. This parser doesn't accept XML but only a limited subset. You can, for example, read the whole document as a string and then use Python's string operations to locate <dogs> and replace anything up to the next <. Hack? Yes.
You can filter the output. XML allows the string <ns0: only in one place, so you can search&replace it with < and then the same with <group xmlns:ns0=" → <group xmlns=". This is pretty safe unless you can have CDATA in your XML.
You can write your own, simple XML parser. Read the input as a string and then create Elements for each pair of <> plus their positions in the input. That allows you to take the input apart quickly but only works for small inputs.
when Save xml add default_namespace argument is easy to avoid ns0, on my code
key code: xmltree.write(xmlfiile,"utf-8",default_namespace=xmlnamespace)
if os.path.isfile(xmlfiile):
xmltree = ET.parse(xmlfiile)
root = xmltree.getroot()
xmlnamespace = root.tag.split('{')[1].split('}')[0] //get namespace
initwin=xmltree.find("./{"+ xmlnamespace +"}test")
initwin.find("./{"+ xmlnamespace +"}content").text = "aaa"
xmltree.write(xmlfiile,"utf-8",default_namespace=xmlnamespace)
etree from lxml provides this feature.
elementTree.write('houses2.xml',encoding = "UTF-8",xml_declaration = True) helps you in not omitting the declaration
While writing into the file it does not change the namespaces.
http://lxml.de/parsing.html is the link for its tutorial.
P.S : lxml should be installed separately.
Given the following format (.properties or .ini):
propertyName1=propertyValue1
propertyName2=propertyValue2
...
propertyNameN=propertyValueN
For Java there is the Properties class that offers functionality to parse / interact with the above format.
Is there something similar in python's standard library (2.x) ?
If not, what other alternatives do I have ?
I was able to get this to work with ConfigParser, no one showed any examples on how to do this, so here is a simple python reader of a property file and example of the property file. Note that the extension is still .properties, but I had to add a section header similar to what you see in .ini files... a bit of a bastardization, but it works.
The python file: PythonPropertyReader.py
#!/usr/bin/python
import ConfigParser
config = ConfigParser.RawConfigParser()
config.read('ConfigFile.properties')
print config.get('DatabaseSection', 'database.dbname');
The property file: ConfigFile.properties
[DatabaseSection]
database.dbname=unitTest
database.user=root
database.password=
For more functionality, read: https://docs.python.org/2/library/configparser.html
For .ini files there is the configparser module that provides a format compatible with .ini files.
Anyway there's nothing available for parsing complete .properties files, when I have to do that I simply use jython (I'm talking about scripting).
I know that this is a very old question, but I need it just now and I decided to implement my own solution, a pure python solution, that covers most uses cases (not all):
def load_properties(filepath, sep='=', comment_char='#'):
"""
Read the file passed as parameter as a properties file.
"""
props = {}
with open(filepath, "rt") as f:
for line in f:
l = line.strip()
if l and not l.startswith(comment_char):
key_value = l.split(sep)
key = key_value[0].strip()
value = sep.join(key_value[1:]).strip().strip('"')
props[key] = value
return props
You can change the sep to ':' to parse files with format:
key : value
The code parses correctly lines like:
url = "http://my-host.com"
name = Paul = Pablo
# This comment line will be ignored
You'll get a dict with:
{"url": "http://my-host.com", "name": "Paul = Pablo" }
A java properties file is often valid python code as well. You could rename your myconfig.properties file to myconfig.py. Then just import your file, like this
import myconfig
and access the properties directly
print myconfig.propertyName1
if you don't have multi line properties and a very simple need, a few lines of code can solve it for you:
File t.properties:
a=b
c=d
e=f
Python code:
with open("t.properties") as f:
l = [line.split("=") for line in f.readlines()]
d = {key.strip(): value.strip() for key, value in l}
If you have an option of file formats I suggest using .ini and Python's ConfigParser as mentioned. If you need compatibility with Java .properties files I have written a library for it called jprops. We were using pyjavaproperties, but after encountering various limitations I ended up implementing my own. It has full support for the .properties format, including unicode support and better support for escape sequences. Jprops can also parse any file-like object while pyjavaproperties only works with real files on disk.
This is not exactly properties but Python does have a nice library for parsing configuration files. Also see this recipe: A python replacement for java.util.Properties.
i have used this, this library is very useful
from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print(p)
print(p.items())
print(p['name3'])
p['name3'] = 'changed = value'
Here is link to my project: https://sourceforge.net/projects/pyproperties/. It is a library with methods for working with *.properties files for Python 3.x.
But it is not based on java.util.Properties
This is a one-to-one replacement of java.util.Propeties
From the doc:
def __parse(self, lines):
""" Parse a list of lines and create
an internal property dictionary """
# Every line in the file must consist of either a comment
# or a key-value pair. A key-value pair is a line consisting
# of a key which is a combination of non-white space characters
# The separator character between key-value pairs is a '=',
# ':' or a whitespace character not including the newline.
# If the '=' or ':' characters are found, in the line, even
# keys containing whitespace chars are allowed.
# A line with only a key according to the rules above is also
# fine. In such case, the value is considered as the empty string.
# In order to include characters '=' or ':' in a key or value,
# they have to be properly escaped using the backslash character.
# Some examples of valid key-value pairs:
#
# key value
# key=value
# key:value
# key value1,value2,value3
# key value1,value2,value3 \
# value4, value5
# key
# This key= this value
# key = value1 value2 value3
# Any line that starts with a '#' is considerered a comment
# and skipped. Also any trailing or preceding whitespaces
# are removed from the key/value.
# This is a line parser. It parses the
# contents like by line.
You can use a file-like object in ConfigParser.RawConfigParser.readfp defined here -> https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.readfp
Define a class that overrides readline that adds a section name before the actual contents of your properties file.
I've packaged it into the class that returns a dict of all the properties defined.
import ConfigParser
class PropertiesReader(object):
def __init__(self, properties_file_name):
self.name = properties_file_name
self.main_section = 'main'
# Add dummy section on top
self.lines = [ '[%s]\n' % self.main_section ]
with open(properties_file_name) as f:
self.lines.extend(f.readlines())
# This makes sure that iterator in readfp stops
self.lines.append('')
def readline(self):
return self.lines.pop(0)
def read_properties(self):
config = ConfigParser.RawConfigParser()
# Without next line the property names will be lowercased
config.optionxform = str
config.readfp(self)
return dict(config.items(self.main_section))
if __name__ == '__main__':
print PropertiesReader('/path/to/file.properties').read_properties()
If you need to read all values from a section in properties file in a simple manner:
Your config.properties file layout :
[SECTION_NAME]
key1 = value1
key2 = value2
You code:
import configparser
config = configparser.RawConfigParser()
config.read('path_to_config.properties file')
details_dict = dict(config.items('SECTION_NAME'))
This will give you a dictionary where keys are same as in config file and their corresponding values.
details_dict is :
{'key1':'value1', 'key2':'value2'}
Now to get key1's value :
details_dict['key1']
Putting it all in a method which reads that section from config file only once(the first time the method is called during a program run).
def get_config_dict():
if not hasattr(get_config_dict, 'config_dict'):
get_config_dict.config_dict = dict(config.items('SECTION_NAME'))
return get_config_dict.config_dict
Now call the above function and get the required key's value :
config_details = get_config_dict()
key_1_value = config_details['key1']
-------------------------------------------------------------
Extending the approach mentioned above, reading section by section automatically and then accessing by section name followed by key name.
def get_config_section():
if not hasattr(get_config_section, 'section_dict'):
get_config_section.section_dict = dict()
for section in config.sections():
get_config_section.section_dict[section] =
dict(config.items(section))
return get_config_section.section_dict
To access:
config_dict = get_config_section()
port = config_dict['DB']['port']
(here 'DB' is a section name in config file
and 'port' is a key under section 'DB'.)
create a dictionary in your python module and store everything into it and access it, for example:
dict = {
'portalPath' : 'www.xyx.com',
'elementID': 'submit'}
Now to access it you can simply do:
submitButton = driver.find_element_by_id(dict['elementID'])
My Java ini files didn't have section headers and I wanted a dict as a result. So i simply injected an "[ini]" section and let the default config library do its job.
Take a version.ini fie of the eclipse IDE .metadata directory as an example:
#Mon Dec 20 07:35:29 CET 2021
org.eclipse.core.runtime=2
org.eclipse.platform=4.19.0.v20210303-1800
# 'injected' ini section
[ini]
#Mon Dec 20 07:35:29 CET 2021
org.eclipse.core.runtime=2
org.eclipse.platform=4.19.0.v20210303-1800
The result is converted to a dict:
from configparser import ConfigParser
#staticmethod
def readPropertyFile(path):
# https://stackoverflow.com/questions/3595363/properties-file-in-python-similar-to-java-properties
config = ConfigParser()
s_config= open(path, 'r').read()
s_config="[ini]\n%s" % s_config
# https://stackoverflow.com/a/36841741/1497139
config.read_string(s_config)
items=config.items('ini')
itemDict={}
for key,value in items:
itemDict[key]=value
return itemDict
This is what I'm doing in my project: I just create another .py file called properties.py which includes all common variables/properties I used in the project, and in any file need to refer to these variables, put
from properties import *(or anything you need)
Used this method to keep svn peace when I was changing dev locations frequently and some common variables were quite relative to local environment. Works fine for me but not sure this method would be suggested for formal dev environment etc.
import json
f=open('test.json')
x=json.load(f)
f.close()
print(x)
Contents of test.json:
{"host": "127.0.0.1", "user": "jms"}
I have created a python module that is almost similar to the Properties class of Java ( Actually it is like the PropertyPlaceholderConfigurer in spring which lets you use ${variable-reference} to refer to already defined property )
EDIT : You may install this package by running the command(currently tested for python 3).
pip install property
The project is hosted on GitHub
Example : ( Detailed documentation can be found here )
Let's say you have the following properties defined in my_file.properties file
foo = I am awesome
bar = ${chocolate}-bar
chocolate = fudge
Code to load the above properties
from properties.p import Property
prop = Property()
# Simply load it into a dictionary
dic_prop = prop.load_property_files('my_file.properties')
Below 2 lines of code shows how to use Python List Comprehension to load 'java style' property file.
split_properties=[line.split("=") for line in open('/<path_to_property_file>)]
properties={key: value for key,value in split_properties }
Please have a look at below post for details
https://ilearnonlinesite.wordpress.com/2017/07/24/reading-property-file-in-python-using-comprehension-and-generators/
you can use parameter "fromfile_prefix_chars" with argparse to read from config file as below---
temp.py
parser = argparse.ArgumentParser(fromfile_prefix_chars='#')
parser.add_argument('--a')
parser.add_argument('--b')
args = parser.parse_args()
print(args.a)
print(args.b)
config file
--a
hello
--b
hello dear
Run command
python temp.py "#config"
You could use - https://pypi.org/project/property/
eg - my_file.properties
foo = I am awesome
bar = ${chocolate}-bar
chocolate = fudge
long = a very long property that is described in the property file which takes up \
multiple lines can be defined by the escape character as it is done here
url=example.com/api?auth_token=xyz
user_dir=${HOME}/test
unresolved = ${HOME}/files/${id}/${bar}/
fname_template = /opt/myapp/{arch}/ext/{objid}.dat
Code
from properties.p import Property
## set use_env to evaluate properties from shell / os environment
prop = Property(use_env = True)
dic_prop = prop.load_property_files('my_file.properties')
## Read multiple files
## dic_prop = prop.load_property_files('file1', 'file2')
print(dic_prop)
# Output
# {'foo': 'I am awesome', 'bar': 'fudge-bar', 'chocolate': 'fudge',
# 'long': 'a very long property that is described in the property file which takes up multiple lines
# can be defined by the escape character as it is done here', 'url': 'example.com/api?auth_token=xyz',
# 'user_dir': '/home/user/test',
# 'unresolved': '/home/user/files/${id}/fudge-bar/',
# 'fname_template': '/opt/myapp/{arch}/ext/{objid}.dat'}
I did this using ConfigParser as follows. The code assumes that there is a file called config.prop in the same directory where BaseTest is placed:
config.prop
[CredentialSection]
app.name=MyAppName
BaseTest.py:
import unittest
import ConfigParser
class BaseTest(unittest.TestCase):
def setUp(self):
__SECTION = 'CredentialSection'
config = ConfigParser.ConfigParser()
config.readfp(open('config.prop'))
self.__app_name = config.get(__SECTION, 'app.name')
def test1(self):
print self.__app_name % This should print: MyAppName
This is what i had written to parse file and set it as env variables which skips comments and non key value lines added switches to specify
hg:d
-h or --help print usage summary
-c Specify char that identifies comment
-s Separator between key and value in prop file
and specify properties file that needs to be parsed eg : python
EnvParamSet.py -c # -s = env.properties
import pipes
import sys , getopt
import os.path
class Parsing :
def __init__(self , seprator , commentChar , propFile):
self.seprator = seprator
self.commentChar = commentChar
self.propFile = propFile
def parseProp(self):
prop = open(self.propFile,'rU')
for line in prop :
if line.startswith(self.commentChar)==False and line.find(self.seprator) != -1 :
keyValue = line.split(self.seprator)
key = keyValue[0].strip()
value = keyValue[1].strip()
print("export %s=%s" % (str (key),pipes.quote(str(value))))
class EnvParamSet:
def main (argv):
seprator = '='
comment = '#'
if len(argv) is 0:
print "Please Specify properties file to be parsed "
sys.exit()
propFile=argv[-1]
try :
opts, args = getopt.getopt(argv, "hs:c:f:", ["help", "seprator=","comment=", "file="])
except getopt.GetoptError,e:
print str(e)
print " possible arguments -s <key value sperator > -c < comment char > <file> \n Try -h or --help "
sys.exit(2)
if os.path.isfile(args[0])==False:
print "File doesnt exist "
sys.exit()
for opt , arg in opts :
if opt in ("-h" , "--help"):
print " hg:d \n -h or --help print usage summary \n -c Specify char that idetifes comment \n -s Sperator between key and value in prop file \n specify file "
sys.exit()
elif opt in ("-s" , "--seprator"):
seprator = arg
elif opt in ("-c" , "--comment"):
comment = arg
p = Parsing( seprator, comment , propFile)
p.parseProp()
if __name__ == "__main__":
main(sys.argv[1:])
Lightbend has released the Typesafe Config library, which parses properties files and also some JSON-based extensions. Lightbend's library is only for the JVM, but it seems to be widely adopted and there are now ports in many languages, including Python: https://github.com/chimpler/pyhocon
You can use the following function, which is the modified code of #mvallebr. It respects the properties file comments, ignores empty new lines, and allows retrieving a single key value.
def getProperties(propertiesFile ="/home/memin/.config/customMemin/conf.properties", key=''):
"""
Reads a .properties file and returns the key value pairs as dictionary.
if key value is specified, then it will return its value alone.
"""
with open(propertiesFile) as f:
l = [line.strip().split("=") for line in f.readlines() if not line.startswith('#') and line.strip()]
d = {key.strip(): value.strip() for key, value in l}
if key:
return d[key]
else:
return d
this works for me.
from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print p
print p.items()
print p['name3']
I followed configparser approach and it worked quite well for me. Created one PropertyReader file and used config parser there to ready property to corresponding to each section.
**Used Python 2.7
Content of PropertyReader.py file:
#!/usr/bin/python
import ConfigParser
class PropertyReader:
def readProperty(self, strSection, strKey):
config = ConfigParser.RawConfigParser()
config.read('ConfigFile.properties')
strValue = config.get(strSection,strKey);
print "Value captured for "+strKey+" :"+strValue
return strValue
Content of read schema file:
from PropertyReader import *
class ReadSchema:
print PropertyReader().readProperty('source1_section','source_name1')
print PropertyReader().readProperty('source2_section','sn2_sc1_tb')
Content of .properties file:
[source1_section]
source_name1:module1
sn1_schema:schema1,schema2,schema3
sn1_sc1_tb:employee,department,location
sn1_sc2_tb:student,college,country
[source2_section]
source_name1:module2
sn2_schema:schema4,schema5,schema6
sn2_sc1_tb:employee,department,location
sn2_sc2_tb:student,college,country
You can try the python-dotenv library. This library reads key-value pairs from a .env (so not exactly a .properties file though) file and can set them as environment variables.
Here's a sample usage from the official documentation:
from dotenv import load_dotenv
load_dotenv() # take environment variables from .env.
# Code of your application, which uses environment variables (e.g. from `os.environ` or
# `os.getenv`) as if they came from the actual environment.