Adding subtitles to video with moviepy requiring encoding - python

I have a srt file named subtitles.srt that is non-English, and I followed the instructions of the documentation and source code of the moviepy package (https://moviepy.readthedocs.io/en/latest/_modules/moviepy/video/tools/subtitles.html):
from moviepy.video.tools.subtitles import SubtitlesClip
from moviepy.video.io.VideoFileClip import VideoFileClip
generator = lambda txt: TextClip(txt, font='Georgia-Regular', fontsize=24, color='white')
sub = SubtitlesClip("subtitles.srt", generator, encoding='utf-8')
And this gives the error TypeError: __init__() got an unexpected keyword argument 'encoding'.
In the source code, the class SubtitlesClip does have a keyword argument encoding. Does that mean the version of the source code is outdated or something? And what can I do about this? I even attempted to copy the source code for moviepy.video.tools.subtitles with the encoding keyword argument directly to my code, yet it led to more errors like at the line:
from moviepy.decorators import convert_path_to_string
it failed to import the decorator convert_path_to_string.
The source code does not seem to agree with what I have installed. Anyway to fix it? If not, are there any good alternatives of Python libraries for inserting subtitles or video editing in general?
Edit: My current solution is to create a child class of SubtitlesClip and override the constructor of the parent class:
from moviepy.video.tools.subtitles import SubtitlesClip
from moviepy.video.VideoClip import TextClip, VideoClip
from moviepy.video.compositing.CompositeVideoClip import CompositeVideoClip
import re
from moviepy.tools import cvsecs
def file_to_subtitles_with_encoding(filename):
""" Converts a srt file into subtitles.
The returned list is of the form ``[((ta,tb),'some text'),...]``
and can be fed to SubtitlesClip.
Only works for '.srt' format for the moment.
"""
times_texts = []
current_times = None
current_text = ""
with open(filename,'r',encoding='utf-8') as f:
for line in f:
times = re.findall("([0-9]*:[0-9]*:[0-9]*,[0-9]*)", line)
if times:
current_times = [cvsecs(t) for t in times]
elif line.strip() == '':
times_texts.append((current_times, current_text.strip('\n')))
current_times, current_text = None, ""
elif current_times:
current_text += line
return times_texts
class SubtitlesClipUTF8(SubtitlesClip):
def __init__(self, subtitles, make_textclip=None):
VideoClip.__init__(self, has_constant_size=False)
if isinstance(subtitles, str):
subtitles = file_to_subtitles_with_encoding(subtitles)
#subtitles = [(map(cvsecs, tt),txt) for tt, txt in subtitles]
self.subtitles = subtitles
self.textclips = dict()
if make_textclip is None:
make_textclip = lambda txt: TextClip(txt, font='Georgia-Bold',
fontsize=24, color='white',
stroke_color='black', stroke_width=0.5)
self.make_textclip = make_textclip
self.start=0
self.duration = max([tb for ((ta,tb), txt) in self.subtitles])
self.end=self.duration
def add_textclip_if_none(t):
""" Will generate a textclip if it hasn't been generated asked
to generate it yet. If there is no subtitle to show at t, return
false. """
sub =[((ta,tb),txt) for ((ta,tb),txt) in self.textclips.keys()
if (ta<=t<tb)]
if not sub:
sub = [((ta,tb),txt) for ((ta,tb),txt) in self.subtitles if
(ta<=t<tb)]
if not sub:
return False
sub = sub[0]
if sub not in self.textclips.keys():
self.textclips[sub] = self.make_textclip(sub[1])
return sub
def make_frame(t):
sub = add_textclip_if_none(t)
return (self.textclips[sub].get_frame(t) if sub
else np.array([[[0,0,0]]]))
def make_mask_frame(t):
sub = add_textclip_if_none(t)
return (self.textclips[sub].mask.get_frame(t) if sub
else np.array([[0]]))
self.make_frame = make_frame
hasmask = bool(self.make_textclip('T').mask)
self.mask = VideoClip(make_mask_frame, ismask=True) if hasmask else None
I actually only changed two lines, but I have to create a new class and redefine the whole thing, so I doubt whether it's really necessary. Any better solution than this?

The latest version in the documentation (the one you are looking at) corresponds to a dev version 2.x, which is not released to PyPI yet. The version you have installed through pip is most likely 1.0.3, which is the latest on PyPI, and it doesn't allow an encoding parameter.
From the PR where the feature was introduced, you can see that it's only been tagged for release in 2.x versions.
Copying only that file to your source code will most likely not work, because it will depend on changes that happened in between the two versions. However, if you feel adventurous, you can install the dev version of the package, by following the Method by hand section in moviepy's docs.

Related

Python unit testing with PyTest, stuck on more advanced testing

I have been using PyTest for some time now to write some simple tests (like the ones you find in tutorials and youtube video's) and I thought now it was time to start writing actual test for our python scripts. The scripts are way more advanced than any shown in tutorials so I am getting a bit stuck. I do not want the entire correct answer, but rather a nudge in the right direction if possible. Here is my issue:
We have a script that reads a .md text file and converts it to a pdf file based on an external template. Part of the script is here below (I removed most of it because I first just want to have 1 running test)
class DocumentationEngine:
def __init__(self, title, subtitle, series, style='TIIStyle_Digital_Aug_2020', templateFile='template.docet', tableOfContents=True, listOfFigures=False, listOfTables=False):
self.title = title
self.subtitle = subtitle
self.series = series
self.style = style
self.template = {}
self.hasTOC = tableOfContents
self.hasLOF = listOfFigures
self.hasLOT = listOfTables
self.loadTemplate(templateFile)
def loadTemplate(self, file='template.docet'):
with open(file, "r") as templatefile:
lines = templatefile.readlines()
key = "dummy"
value = ""
for line in lines:
line = line.strip()
if line.startswith('[') and line.endswith(']'):
self.template[key] = value
key = line[1:-1]
value = ""
else:
value += line + '\n'
def build(self, versions=[], content='', filename='Documenter\\_Autogenerated'):
document = self.template["doc"]
document = document.replace("%%style%%", self.style)
document = document.replace("%%body%%",
self.buildFirstPage() +
self.buildTableOfContents() +
self.buildListOfFigures() +
self.buildListOfTables() +
self.buildVersionTable(versions, filename) +
self.buildContentPages(content=content) +
self.buildLastPage()
)
return document
def buildLastPage(self):
return self.template["last_page"]
I am trying to write a simple unit test for the buildLastPage method and have been stuck for several days now.
I am not sure whether or not I need to mock the template file, use a fixture and/or if I can actually test only that method with all dependencies.
I started with the following:
from doceng import DocumentationEngine
import pytest
class Test:
def test_buildLastPage(self):
build_last_page = DocumentationEngine()
assert build_last_page.template(1) == 1
which gives me an error regarding 3 required arguments. When adding the arguments like this:
from doceng import DocumentationEngine
import pytest
class Test:
def test_buildLastPage(self, title, subtitle, series):
build_last_page = DocumentationEngine()
assert build_last_page.template(1) == 1
which gives me an error that the fixture is not found.
I added a fixture in conftest.py file like this:
import pytest
from doceng import DocumentationEngine
#pytest.fixture
def title(title):
return title("test")
which will get me another error, recursive dependency involving fixture 'title' detected
I'm quite stuck so any nudge in the right direction for a newbie would be highly appreciated
The error of the fixtures is regarding your test function test_buildLastPage. The way you are using it, it only needs the self argument.
A test function in pytest without any decorators always expects to find fixtures, that have the same name as the arguments. You did not define any fixtures and also do not use the arguments in your function. Therefore, you can remove them safely.
The actual error points DocumentationEngine(). The class expect 3 arguments when initializing the object. You set no arguments. Check your __init__ function again to find the proper arguments.

Using python class with spark DataFrame to parse URL's

I'm trying to process URL's in a pyspark dataframe using a class that I've written and a udf. I'm aware of urllib and other url parsing libraries but for this case I need to use my own code.
In order to get the tld of a url I cross check it against the iana public suffix list.
Here's a simplification of my code
class Parser:
# list of available public suffixes for extracting top level domains
file = open("public_suffix_list.txt", 'r')
data = []
for line in file:
if line.startswith("//") or line == '\n':
pass
else:
data.append(line.strip('\n'))
def __init__(self, url):
self.url = url
#the code here extracts port,protocol,query etc.
#I think this bit below is causing the error
matches = [r for r in self.data if r in self.hostname]
#extra functionality in my actual class
i = matches.index(self.string)
try:
self.tld = matches[i]
# logic to find tld if no match
The class works in pure python so for example I can run
import Parser
x = Parser("www.google.com")
x.tld #returns ".com"
However when I try to do
import Parser
from pyspark.sql.functions import udf
parse = udf(lambda x: Parser(x).url)
df = sqlContext.table("tablename").select(parse("column"))
When I call an action I get
File "<stdin>", line 3, in <lambda>
File "<stdin>", line 27, in __init__
TypeError: 'in <string>' requires string as left operand
So my guess is that it's failing to interpret the data as a list of strings?
I've also tried to use
file = sc.textFile("my_file.txt")\
.filter(lambda x: not x.startswith("//") or != "")\
.collect()
data = sc.broadcast(file)
to open my file instead, but that causes
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Any ideas?
Thanks in advance
EDIT: Apologies, I didn't have my code to hand so my test code didn't explain very well the problems I was having. The error I initially reported was a result of the test data I was using.
I've updated my question to be more reflective of the challenge I'm facing.
Why do you need a class in this case (the code for defining your class is incorrect, you never declared self.data before using it in the init method) the only relevant line that affects the output you want is self.string=string, so you are basically passing the identity function as udf.
The UnicodeDecodeError is due to an encoding issue in your file, it has nothing to do with your definition of the class.
The second error is in the line sc.broadcast(file) , details of which can be found here : Spark: Broadcast variables: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion
EDIT 1
I would redefine your class structure as follows. You basically need to create the instance self.data by calling self.data = data before you can use it. Also anything that you write before the init method is executed irrespective of whether you call that class or not. So moving out the file parsing part will not have any effect.
# list of available public suffixes for extracting top level domains
file = open("public_suffix_list.txt", 'r')
data = []
for line in file:
if line.startswith("//") or line == '\n':
pass
else:
data.append(line.strip('\n'))
class Parser:
def __init__(self, url):
self.url = url
self.data = data
#the code here extracts port,protocol,query etc.
#I think this bit below is causing the error
matches = [r for r in self.data if r in self.hostname]
#extra functionality in my actual class
i = matches.index(self.string)
try:
self.tld = matches[i]
# logic to find tld if no match

embedding resources in python scripts

I'd like to figure out how to embed binary content in a python script. For instance, I don't want to have any external files around (images, sound, ... ), I want all this content living inside of my python scripts.
Little example to clarify, let's say I got this small snippet:
from StringIO import StringIO
from PIL import Image, ImageFilter
embedded_resource = StringIO(open("Lenna.png", "rb").read())
im = Image.open(embedded_resource)
im.show()
im_sharp = im.filter(ImageFilter.SHARPEN)
im_sharp.show()
As you can see, the example is reading the external file 'Lenna.png'
Question
How to proceed to embed "Lenna.png" as a resource (variable) into my python script. What's the fastest way to achieve this simple task using python?
You might find the following class rather useful for embedding resources in your program. To use it, call the package method with paths to the files that you want to embed. The class will print out a DATA attribute that should be used to replace the one already found in the class. If you want to add files to your pre-built data, use the add method instead. To use the class in your program, make calls to the load method using context manager syntax. The returned value is a Path object that can be used as a filename argument to other functions or for the purpose of directly loading the reconstituted file. See this SMTP Client for example usage.
import base64
import contextlib
import pathlib
import pickle
import pickletools
import sys
import zlib
class Resource:
"""Manager for resources that would normally be held externally."""
WIDTH = 76
__CACHE = None
DATA = b''
#classmethod
def package(cls, *paths):
"""Creates a resource string to be copied into the class."""
cls.__generate_data(paths, {})
#classmethod
def add(cls, *paths):
"""Include paths in the pre-generated DATA block up above."""
cls.__preload()
cls.__generate_data(paths, cls.__CACHE.copy())
#classmethod
def __generate_data(cls, paths, buffer):
"""Load paths into buffer and output DATA code for the class."""
for path in map(pathlib.Path, paths):
if not path.is_file():
raise ValueError('{!r} is not a file'.format(path))
key = path.name
if key in buffer:
raise KeyError('{!r} has already been included'.format(key))
with path.open('rb') as file:
buffer[key] = file.read()
pickled = pickle.dumps(buffer, pickle.HIGHEST_PROTOCOL)
optimized = pickletools.optimize(pickled)
compressed = zlib.compress(optimized, zlib.Z_BEST_COMPRESSION)
encoded = base64.b85encode(compressed)
cls.__print(" DATA = b'''")
for offset in range(0, len(encoded), cls.WIDTH):
cls.__print("\\\n" + encoded[
slice(offset, offset + cls.WIDTH)].decode('ascii'))
cls.__print("'''")
#staticmethod
def __print(line):
"""Provides alternative printing interface for simplicity."""
sys.stdout.write(line)
sys.stdout.flush()
#classmethod
#contextlib.contextmanager
def load(cls, name, delete=True):
"""Dynamically loads resources and makes them usable while needed."""
cls.__preload()
if name not in cls.__CACHE:
raise KeyError('{!r} cannot be found'.format(name))
path = pathlib.Path(name)
with path.open('wb') as file:
file.write(cls.__CACHE[name])
yield path
if delete:
path.unlink()
#classmethod
def __preload(cls):
"""Warm up the cache if it does not exist in a ready state yet."""
if cls.__CACHE is None:
decoded = base64.b85decode(cls.DATA)
decompressed = zlib.decompress(decoded)
cls.__CACHE = pickle.loads(decompressed)
def __init__(self):
"""Creates an error explaining class was used improperly."""
raise NotImplementedError('class was not designed for instantiation')
The best way to go about this is converting your picture into a python string, and have it in a separate file called something like resources.py, then you simply parse it.
If you are looking to embed the whole thing inside a single binary, then you're looking at something like py2exe. Here is an example embedding external files
In the first scenario, you could even use base64 to (de)code the picture, something like this:
import base64
file = open('yourImage.png');
encoded = base64.b64encode(file.read())
data = base64.b64decode(encoded) # Don't forget to file.close() !

How Do I Suppress or Disable Warnings in reSTructuredText?

I'm working on a CMS in Python that uses reStructuredText (via docutils) to format content. Alot of my content is imported from other sources and usually comes in the form of unformatted text documents. reST works great for this because it makes everything look pretty sane by default.
One problem I am having, however, is that I get warnings dumped to stderr on my webserver and injected into my page content. For example, I get warnings like the following on my web page:
System Message: WARNING/2 (, line 296); backlink
My question is: How do I suppress, disable, or otherwise re-direct these warnings?
Ideally, I'd love to write these out to a log file, but if someone can just tell me how to turn off the warnings from being injected into my content then that would be perfect.
The code that's responsible for parsing the reST into HTML:
from docutils import core
import reSTpygments
def reST2HTML( str ):
parts = core.publish_parts(
source = str,
writer_name = 'html')
return parts['body_pre_docinfo'] + parts['fragment']
def reST2HTML( str ):
parts = core.publish_parts(
source = str,
writer_name = 'html',
settings_overrides={'report_level':'quiet'},
)
return parts['body_pre_docinfo'] + parts['fragment']
It seems the report_level accept string is an old version. Now, the below is work for me.
import docutils.core
import docutils.utils
from pathlib import Path
shut_up_level = docutils.utils.Reporter.SEVERE_LEVEL + 1
docutils.core.publish_file(
source_path=Path(...), destination_path=Path(...),
settings_overrides={'report_level': shut_up_level},
writer_name='html')
about level
# docutils.utils.__init__.py
class Reporter(object):
# system message level constants:
(DEBUG_LEVEL,
INFO_LEVEL,
WARNING_LEVEL,
ERROR_LEVEL,
SEVERE_LEVEL) = range(5)
...
def system_message(self, level, message, *children, **kwargs):
...
if self.stream and (level >= self.report_level # self.report_level was set by you. (for example, shut_up_level)
or self.debug_flag and level == self.DEBUG_LEVEL
or level >= self.halt_level):
self.stream.write(msg.astext() + '\n')
...
return msg
According to the above code, you know that you can assign the self.report_level (i.e. settings_overrides={'report_level': ...}) let the warning not show.
and I set it to SERVER_LEVEL+1, so it will not show any error. (you can set it according to your demand.)

Properties file in python (similar to Java Properties)

Given the following format (.properties or .ini):
propertyName1=propertyValue1
propertyName2=propertyValue2
...
propertyNameN=propertyValueN
For Java there is the Properties class that offers functionality to parse / interact with the above format.
Is there something similar in python's standard library (2.x) ?
If not, what other alternatives do I have ?
I was able to get this to work with ConfigParser, no one showed any examples on how to do this, so here is a simple python reader of a property file and example of the property file. Note that the extension is still .properties, but I had to add a section header similar to what you see in .ini files... a bit of a bastardization, but it works.
The python file: PythonPropertyReader.py
#!/usr/bin/python
import ConfigParser
config = ConfigParser.RawConfigParser()
config.read('ConfigFile.properties')
print config.get('DatabaseSection', 'database.dbname');
The property file: ConfigFile.properties
[DatabaseSection]
database.dbname=unitTest
database.user=root
database.password=
For more functionality, read: https://docs.python.org/2/library/configparser.html
For .ini files there is the configparser module that provides a format compatible with .ini files.
Anyway there's nothing available for parsing complete .properties files, when I have to do that I simply use jython (I'm talking about scripting).
I know that this is a very old question, but I need it just now and I decided to implement my own solution, a pure python solution, that covers most uses cases (not all):
def load_properties(filepath, sep='=', comment_char='#'):
"""
Read the file passed as parameter as a properties file.
"""
props = {}
with open(filepath, "rt") as f:
for line in f:
l = line.strip()
if l and not l.startswith(comment_char):
key_value = l.split(sep)
key = key_value[0].strip()
value = sep.join(key_value[1:]).strip().strip('"')
props[key] = value
return props
You can change the sep to ':' to parse files with format:
key : value
The code parses correctly lines like:
url = "http://my-host.com"
name = Paul = Pablo
# This comment line will be ignored
You'll get a dict with:
{"url": "http://my-host.com", "name": "Paul = Pablo" }
A java properties file is often valid python code as well. You could rename your myconfig.properties file to myconfig.py. Then just import your file, like this
import myconfig
and access the properties directly
print myconfig.propertyName1
if you don't have multi line properties and a very simple need, a few lines of code can solve it for you:
File t.properties:
a=b
c=d
e=f
Python code:
with open("t.properties") as f:
l = [line.split("=") for line in f.readlines()]
d = {key.strip(): value.strip() for key, value in l}
If you have an option of file formats I suggest using .ini and Python's ConfigParser as mentioned. If you need compatibility with Java .properties files I have written a library for it called jprops. We were using pyjavaproperties, but after encountering various limitations I ended up implementing my own. It has full support for the .properties format, including unicode support and better support for escape sequences. Jprops can also parse any file-like object while pyjavaproperties only works with real files on disk.
This is not exactly properties but Python does have a nice library for parsing configuration files. Also see this recipe: A python replacement for java.util.Properties.
i have used this, this library is very useful
from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print(p)
print(p.items())
print(p['name3'])
p['name3'] = 'changed = value'
Here is link to my project: https://sourceforge.net/projects/pyproperties/. It is a library with methods for working with *.properties files for Python 3.x.
But it is not based on java.util.Properties
This is a one-to-one replacement of java.util.Propeties
From the doc:
def __parse(self, lines):
""" Parse a list of lines and create
an internal property dictionary """
# Every line in the file must consist of either a comment
# or a key-value pair. A key-value pair is a line consisting
# of a key which is a combination of non-white space characters
# The separator character between key-value pairs is a '=',
# ':' or a whitespace character not including the newline.
# If the '=' or ':' characters are found, in the line, even
# keys containing whitespace chars are allowed.
# A line with only a key according to the rules above is also
# fine. In such case, the value is considered as the empty string.
# In order to include characters '=' or ':' in a key or value,
# they have to be properly escaped using the backslash character.
# Some examples of valid key-value pairs:
#
# key value
# key=value
# key:value
# key value1,value2,value3
# key value1,value2,value3 \
# value4, value5
# key
# This key= this value
# key = value1 value2 value3
# Any line that starts with a '#' is considerered a comment
# and skipped. Also any trailing or preceding whitespaces
# are removed from the key/value.
# This is a line parser. It parses the
# contents like by line.
You can use a file-like object in ConfigParser.RawConfigParser.readfp defined here -> https://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.readfp
Define a class that overrides readline that adds a section name before the actual contents of your properties file.
I've packaged it into the class that returns a dict of all the properties defined.
import ConfigParser
class PropertiesReader(object):
def __init__(self, properties_file_name):
self.name = properties_file_name
self.main_section = 'main'
# Add dummy section on top
self.lines = [ '[%s]\n' % self.main_section ]
with open(properties_file_name) as f:
self.lines.extend(f.readlines())
# This makes sure that iterator in readfp stops
self.lines.append('')
def readline(self):
return self.lines.pop(0)
def read_properties(self):
config = ConfigParser.RawConfigParser()
# Without next line the property names will be lowercased
config.optionxform = str
config.readfp(self)
return dict(config.items(self.main_section))
if __name__ == '__main__':
print PropertiesReader('/path/to/file.properties').read_properties()
If you need to read all values from a section in properties file in a simple manner:
Your config.properties file layout :
[SECTION_NAME]
key1 = value1
key2 = value2
You code:
import configparser
config = configparser.RawConfigParser()
config.read('path_to_config.properties file')
details_dict = dict(config.items('SECTION_NAME'))
This will give you a dictionary where keys are same as in config file and their corresponding values.
details_dict is :
{'key1':'value1', 'key2':'value2'}
Now to get key1's value :
details_dict['key1']
Putting it all in a method which reads that section from config file only once(the first time the method is called during a program run).
def get_config_dict():
if not hasattr(get_config_dict, 'config_dict'):
get_config_dict.config_dict = dict(config.items('SECTION_NAME'))
return get_config_dict.config_dict
Now call the above function and get the required key's value :
config_details = get_config_dict()
key_1_value = config_details['key1']
-------------------------------------------------------------
Extending the approach mentioned above, reading section by section automatically and then accessing by section name followed by key name.
def get_config_section():
if not hasattr(get_config_section, 'section_dict'):
get_config_section.section_dict = dict()
for section in config.sections():
get_config_section.section_dict[section] =
dict(config.items(section))
return get_config_section.section_dict
To access:
config_dict = get_config_section()
port = config_dict['DB']['port']
(here 'DB' is a section name in config file
and 'port' is a key under section 'DB'.)
create a dictionary in your python module and store everything into it and access it, for example:
dict = {
'portalPath' : 'www.xyx.com',
'elementID': 'submit'}
Now to access it you can simply do:
submitButton = driver.find_element_by_id(dict['elementID'])
My Java ini files didn't have section headers and I wanted a dict as a result. So i simply injected an "[ini]" section and let the default config library do its job.
Take a version.ini fie of the eclipse IDE .metadata directory as an example:
#Mon Dec 20 07:35:29 CET 2021
org.eclipse.core.runtime=2
org.eclipse.platform=4.19.0.v20210303-1800
# 'injected' ini section
[ini]
#Mon Dec 20 07:35:29 CET 2021
org.eclipse.core.runtime=2
org.eclipse.platform=4.19.0.v20210303-1800
The result is converted to a dict:
from configparser import ConfigParser
#staticmethod
def readPropertyFile(path):
# https://stackoverflow.com/questions/3595363/properties-file-in-python-similar-to-java-properties
config = ConfigParser()
s_config= open(path, 'r').read()
s_config="[ini]\n%s" % s_config
# https://stackoverflow.com/a/36841741/1497139
config.read_string(s_config)
items=config.items('ini')
itemDict={}
for key,value in items:
itemDict[key]=value
return itemDict
This is what I'm doing in my project: I just create another .py file called properties.py which includes all common variables/properties I used in the project, and in any file need to refer to these variables, put
from properties import *(or anything you need)
Used this method to keep svn peace when I was changing dev locations frequently and some common variables were quite relative to local environment. Works fine for me but not sure this method would be suggested for formal dev environment etc.
import json
f=open('test.json')
x=json.load(f)
f.close()
print(x)
Contents of test.json:
{"host": "127.0.0.1", "user": "jms"}
I have created a python module that is almost similar to the Properties class of Java ( Actually it is like the PropertyPlaceholderConfigurer in spring which lets you use ${variable-reference} to refer to already defined property )
EDIT : You may install this package by running the command(currently tested for python 3).
pip install property
The project is hosted on GitHub
Example : ( Detailed documentation can be found here )
Let's say you have the following properties defined in my_file.properties file
foo = I am awesome
bar = ${chocolate}-bar
chocolate = fudge
Code to load the above properties
from properties.p import Property
prop = Property()
# Simply load it into a dictionary
dic_prop = prop.load_property_files('my_file.properties')
Below 2 lines of code shows how to use Python List Comprehension to load 'java style' property file.
split_properties=[line.split("=") for line in open('/<path_to_property_file>)]
properties={key: value for key,value in split_properties }
Please have a look at below post for details
https://ilearnonlinesite.wordpress.com/2017/07/24/reading-property-file-in-python-using-comprehension-and-generators/
you can use parameter "fromfile_prefix_chars" with argparse to read from config file as below---
temp.py
parser = argparse.ArgumentParser(fromfile_prefix_chars='#')
parser.add_argument('--a')
parser.add_argument('--b')
args = parser.parse_args()
print(args.a)
print(args.b)
config file
--a
hello
--b
hello dear
Run command
python temp.py "#config"
You could use - https://pypi.org/project/property/
eg - my_file.properties
foo = I am awesome
bar = ${chocolate}-bar
chocolate = fudge
long = a very long property that is described in the property file which takes up \
multiple lines can be defined by the escape character as it is done here
url=example.com/api?auth_token=xyz
user_dir=${HOME}/test
unresolved = ${HOME}/files/${id}/${bar}/
fname_template = /opt/myapp/{arch}/ext/{objid}.dat
Code
from properties.p import Property
## set use_env to evaluate properties from shell / os environment
prop = Property(use_env = True)
dic_prop = prop.load_property_files('my_file.properties')
## Read multiple files
## dic_prop = prop.load_property_files('file1', 'file2')
print(dic_prop)
# Output
# {'foo': 'I am awesome', 'bar': 'fudge-bar', 'chocolate': 'fudge',
# 'long': 'a very long property that is described in the property file which takes up multiple lines
# can be defined by the escape character as it is done here', 'url': 'example.com/api?auth_token=xyz',
# 'user_dir': '/home/user/test',
# 'unresolved': '/home/user/files/${id}/fudge-bar/',
# 'fname_template': '/opt/myapp/{arch}/ext/{objid}.dat'}
I did this using ConfigParser as follows. The code assumes that there is a file called config.prop in the same directory where BaseTest is placed:
config.prop
[CredentialSection]
app.name=MyAppName
BaseTest.py:
import unittest
import ConfigParser
class BaseTest(unittest.TestCase):
def setUp(self):
__SECTION = 'CredentialSection'
config = ConfigParser.ConfigParser()
config.readfp(open('config.prop'))
self.__app_name = config.get(__SECTION, 'app.name')
def test1(self):
print self.__app_name % This should print: MyAppName
This is what i had written to parse file and set it as env variables which skips comments and non key value lines added switches to specify
hg:d
-h or --help print usage summary
-c Specify char that identifies comment
-s Separator between key and value in prop file
and specify properties file that needs to be parsed eg : python
EnvParamSet.py -c # -s = env.properties
import pipes
import sys , getopt
import os.path
class Parsing :
def __init__(self , seprator , commentChar , propFile):
self.seprator = seprator
self.commentChar = commentChar
self.propFile = propFile
def parseProp(self):
prop = open(self.propFile,'rU')
for line in prop :
if line.startswith(self.commentChar)==False and line.find(self.seprator) != -1 :
keyValue = line.split(self.seprator)
key = keyValue[0].strip()
value = keyValue[1].strip()
print("export %s=%s" % (str (key),pipes.quote(str(value))))
class EnvParamSet:
def main (argv):
seprator = '='
comment = '#'
if len(argv) is 0:
print "Please Specify properties file to be parsed "
sys.exit()
propFile=argv[-1]
try :
opts, args = getopt.getopt(argv, "hs:c:f:", ["help", "seprator=","comment=", "file="])
except getopt.GetoptError,e:
print str(e)
print " possible arguments -s <key value sperator > -c < comment char > <file> \n Try -h or --help "
sys.exit(2)
if os.path.isfile(args[0])==False:
print "File doesnt exist "
sys.exit()
for opt , arg in opts :
if opt in ("-h" , "--help"):
print " hg:d \n -h or --help print usage summary \n -c Specify char that idetifes comment \n -s Sperator between key and value in prop file \n specify file "
sys.exit()
elif opt in ("-s" , "--seprator"):
seprator = arg
elif opt in ("-c" , "--comment"):
comment = arg
p = Parsing( seprator, comment , propFile)
p.parseProp()
if __name__ == "__main__":
main(sys.argv[1:])
Lightbend has released the Typesafe Config library, which parses properties files and also some JSON-based extensions. Lightbend's library is only for the JVM, but it seems to be widely adopted and there are now ports in many languages, including Python: https://github.com/chimpler/pyhocon
You can use the following function, which is the modified code of #mvallebr. It respects the properties file comments, ignores empty new lines, and allows retrieving a single key value.
def getProperties(propertiesFile ="/home/memin/.config/customMemin/conf.properties", key=''):
"""
Reads a .properties file and returns the key value pairs as dictionary.
if key value is specified, then it will return its value alone.
"""
with open(propertiesFile) as f:
l = [line.strip().split("=") for line in f.readlines() if not line.startswith('#') and line.strip()]
d = {key.strip(): value.strip() for key, value in l}
if key:
return d[key]
else:
return d
this works for me.
from pyjavaproperties import Properties
p = Properties()
p.load(open('test.properties'))
p.list()
print p
print p.items()
print p['name3']
I followed configparser approach and it worked quite well for me. Created one PropertyReader file and used config parser there to ready property to corresponding to each section.
**Used Python 2.7
Content of PropertyReader.py file:
#!/usr/bin/python
import ConfigParser
class PropertyReader:
def readProperty(self, strSection, strKey):
config = ConfigParser.RawConfigParser()
config.read('ConfigFile.properties')
strValue = config.get(strSection,strKey);
print "Value captured for "+strKey+" :"+strValue
return strValue
Content of read schema file:
from PropertyReader import *
class ReadSchema:
print PropertyReader().readProperty('source1_section','source_name1')
print PropertyReader().readProperty('source2_section','sn2_sc1_tb')
Content of .properties file:
[source1_section]
source_name1:module1
sn1_schema:schema1,schema2,schema3
sn1_sc1_tb:employee,department,location
sn1_sc2_tb:student,college,country
[source2_section]
source_name1:module2
sn2_schema:schema4,schema5,schema6
sn2_sc1_tb:employee,department,location
sn2_sc2_tb:student,college,country
You can try the python-dotenv library. This library reads key-value pairs from a .env (so not exactly a .properties file though) file and can set them as environment variables.
Here's a sample usage from the official documentation:
from dotenv import load_dotenv
load_dotenv() # take environment variables from .env.
# Code of your application, which uses environment variables (e.g. from `os.environ` or
# `os.getenv`) as if they came from the actual environment.

Categories

Resources