Q: Is it possible to create a format string using Python 3.5's string formatting syntax to left truncate?
Basically what I want to do is take a git SHA:
"c1e33f6717b9d0125b53688d315aff9cf8dd9977"
And, using only a format string, display only the rightmost 8 characters:
"f8dd9977"
Things I've tried:
Invalid Syntax
>>> "{foo[-8:]}".format(foo="c1e33f6717b9d0125b53688d315aff9cf8dd9977")
>>> "{foo[-8]}".format(foo="c1e33f6717b9d0125b53688d315aff9cf8dd9977")
>>> "{:8.-8}".format("c1e33f6717b9d0125b53688d315aff9cf8dd9977")
Wrong Result
### Results in first 8 not last 8.
>>> "{:8.8}".format("c1e33f6717b9d0125b53688d315aff9cf8dd9977")
Works but inflexible and cumbersome
### solution requires that bar is always length of 40.
>>> bar="c1e33f6717b9d0125b53688d315aff9cf8dd9977"
>>> "{foo[32]}{foo[33]}{foo[34]}{foo[35]}{foo[36]}{foo[37]}{foo[38]}{foo[39]}".format(foo=bar)
A similar question was asked, but never answered. Mine differs, however, in that I am limited to using only a format string; I don't have the ability to change the range of the input parameter. This means that the following is an unacceptable solution:
>>> bar="c1e33f6717b9d0125b53688d315aff9cf8dd9977"
>>> "{0}".format(bar[-8:])
One more aspect I should clarify... the above explains the simplest form of the problem. In actual context, the problem is expressed more correctly as:
>>> import os
>>> "foo {git_sha}".format(**os.environ)
Where I want to left-truncate the "git_sha" environment variable. Admittedly this is a tad more complex than the simplest form, but if I can solve the simplest, I can find a way to solve the more complex.
So here is my solution, with thanks to @JacquesGaudin and folks on #Python for providing much guidance...
class MyStr(object):
    """Additional format string options."""
    def __init__(self, obj):
        super(MyStr, self).__init__()
        self.obj = obj
    def __format__(self, spec):
        if spec.startswith("ltrunc."):
            offset = int(spec[7:])
            return self.obj[offset:]
        else:
            return self.obj.__format__(spec)
So this works when doing this:
>>> f = {k: MyStr(v) for k, v in os.environ.items()}
>>> "{PATH:ltrunc.-8}".format(**f)
Subclassing str and overriding the __format__ method is an option:
class CustomStr(str):
    def __format__(self, spec):
        if spec == 'trunc_left':
            return self[-8:]
        else:
            return super().__format__(spec)
git_sha = 'c1e33f6717b9d0125b53688d315aff9cf8dd9977'
s = CustomStr(git_sha)
print('{:trunc_left}'.format(s))
Better though, you can create a custom Formatter which inherits from string.Formatter and will provide a format method. By doing this, you can override a number of methods used in the process of formatting strings. In your case, you want to override format_field:
from string import Formatter
class CustomFormatter(Formatter):
    def format_field(self, value, format_spec):
        if format_spec.startswith('trunc_left.'):
            char_number = int(format_spec[len('trunc_left.'):])
            return value[-char_number:]
        return super().format_field(value, format_spec)
environ = {'git_sha': 'c1e33f6717b9d0125b53688d315aff9cf8dd9977'}
fmt = CustomFormatter()
print(fmt.format('{git_sha:trunc_left.8}', **environ))
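One edge case in the formatter above is worth keeping in mind: value[-0:] returns the whole string, so a spec of trunc_left.0 would not truncate anything. Below is a minimal sketch of one way to guard against that, assuming the same trunc_left.N spec. The class name SafeTruncFormatter is made up for this illustration.

```python
from string import Formatter


class SafeTruncFormatter(Formatter):
    """Hypothetical variant of the formatter above that also handles N == 0."""

    PREFIX = 'trunc_left.'

    def format_field(self, value, format_spec):
        if format_spec.startswith(self.PREFIX):
            count = int(format_spec[len(self.PREFIX):])
            text = str(value)
            # Slice from an explicit start index; text[-0:] would keep everything.
            start = max(len(text) - count, 0)
            return text[start:]
        # Any other spec falls through to the standard behaviour.
        return super().format_field(value, format_spec)


fmt = SafeTruncFormatter()
sha = 'c1e33f6717b9d0125b53688d315aff9cf8dd9977'
print(fmt.format('{git_sha:trunc_left.8}', git_sha=sha))  # f8dd9977
print(repr(fmt.format('{git_sha:trunc_left.0}', git_sha=sha)))  # ''
```

A count larger than the string length simply returns the whole string.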
Depending on the usage, you could put this in a context manager and temporarily shadow the builtin format function:
from string import Formatter
class CustomFormat:
    class CustomFormatter(Formatter):
        def format_field(self, value, format_spec):
            if format_spec.startswith('trunc_left.'):
                char_number = int(format_spec[len('trunc_left.'):])
                return value[-char_number:]
            return super().format_field(value, format_spec)
    def __init__(self):
        self.custom_formatter = self.CustomFormatter()
    def __enter__(self):
        self.builtin_format = format
        return self.custom_formatter.format
    def __exit__(self, exc_type, exc_value, traceback):
        # make sure global format is set back to the original
        global format
        format = self.builtin_format
environ = {'git_sha': 'c1e33f6717b9d0125b53688d315aff9cf8dd9977'}
with CustomFormat() as format:
    # Inside this context, format is our custom formatter's method
    print(format('{git_sha:trunc_left.8}', **environ))
print(format) # checking that format is now the builtin function
I was writing a test case for a function that accepts typing.BinaryIO that comes from fastapi.UploadFile.file.
def upload_binary(data: typing.BinaryIO):
    ...
I was confused about what kind of object I should create so that it passes the type check. I tried io.StringIO and io.BytesIO, and the only way to check which one would be accepted as typing.BinaryIO was to use the IDE's highlighting. It didn't accept StringIO but accepted BytesIO.
So my question: is there a way in Python to manually check whether an object will be validated against a given type hint?
For example some function like
file1 = StringIO("text")
file2 = BytesIO(b"text")
typing_check(file1, typing.BinaryIO) # >>> False
typing_check(file2, typing.BinaryIO) # >>> True
UPD
Looking at starlette/datastructures.py we have
class UploadFile:
    ...
    file: typing.BinaryIO
    def __init__(...):
        if self.file is None:
            self.file = tempfile.SpooledTemporaryFile(...)
And if you try to test it
s = tempfile.SpooledTemporaryFile()
isinstance(s, typing.BinaryIO) # >>> False
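At run time typing.BinaryIO is not a base class of the concrete io classes, so isinstance against it is False even for io.BytesIO. A rough heuristic is sketched below, assuming it is enough to tell byte streams from text streams. The helper name looks_like_binary_io is hypothetical, and this is not the check a static type checker performs.

```python
import io
import tempfile


def looks_like_binary_io(obj):
    """Heuristic runtime check: does obj behave like a binary stream?"""
    if isinstance(obj, io.TextIOBase):
        return False
    if isinstance(obj, (io.RawIOBase, io.BufferedIOBase)):
        return True
    # Fallback for wrappers such as tempfile.SpooledTemporaryFile,
    # which expose the mode they were opened with.
    mode = getattr(obj, "mode", "")
    return isinstance(mode, str) and "b" in mode


print(looks_like_binary_io(io.StringIO("text")))             # False
print(looks_like_binary_io(io.BytesIO(b"text")))             # True
print(looks_like_binary_io(tempfile.SpooledTemporaryFile())) # True
```

For a test case, passing io.BytesIO is usually the simplest object that both satisfies the annotation statically and behaves like an uploaded file.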
You could try building a basic type-checking decorator using annotations and the inspect module. Despite the name used below, the check happens at run time, when the function is called.
inspect.signature(fn) reads the annotations of the function's parameters, and you can compare each argument against them with isinstance.
import inspect
def static_type_checker(fn):
    spec = inspect.signature(fn)
    params = spec.parameters
    def inner_fn(*args, **kwargs):
        for arg, (name, param) in zip(args, params.items()):
            assert isinstance(arg, param.annotation)
        for key, value in kwargs.items():
            assert isinstance(value, params[key].annotation)
        return fn(*args, **kwargs)
    return inner_fn
@static_type_checker
def sample(a: int):
    print(a)
sample(1) # prints 1
sample("a") # AssertionError
Note that this is not a perfect solution. It has many limitations, for example with variable-length arguments. I'm just suggesting the basic idea.
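Some of those limitations can be reduced by letting inspect do the argument matching. The sketch below is one possible variant, not a full solution: it uses Signature.bind to pair arguments with parameters, skips *args/**kwargs and unannotated parameters, and only checks annotations that are plain classes. The names check_types and scale are illustrative.

```python
import functools
import inspect


def check_types(fn):
    """Raise TypeError when an argument does not match a plain-class annotation."""
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue  # *args / **kwargs are not checked in this sketch
            annotation = param.annotation
            if annotation is param.empty or not isinstance(annotation, type):
                continue  # unannotated, or not a plain class (e.g. typing.List[int])
            if not isinstance(value, annotation):
                raise TypeError(
                    "%s must be %s, got %s"
                    % (name, annotation.__name__, type(value).__name__)
                )
        return fn(*args, **kwargs)

    return wrapper


@check_types
def scale(value: int, factor: float = 2.0, *rest, label=None):
    return value * factor


print(scale(2))                    # 4.0
print(scale(value=3, factor=1.5))  # 4.5
```

Calling scale("a") raises TypeError, whether the argument is passed positionally or by keyword.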
Update 2
Alright, my answer to this question is not a complete solution to what I originally wanted, but it's OK for simpler things like filename templating (what I originally intended to use this for). I have yet to come up with a solution for recursive templating. It might not matter to me, though, as I have reevaluated what I really need. It's possible I'll need bigger guns in the future, but then I'll probably just choose a more advanced templating engine instead of reinventing the wheel.
Update
Ok I realize now string.Template probably is the better way to do this. I'll answer my own question when I have a working example.
I want to accomplish formatting strings by grouping keys and arbitrary text together in a nesting manner, like so
# conversions (!):
# u = upper case
# l = lower case
# c = capital case
# t = title case
fmt = RecursiveNamespaceFormatter(globals())
greeting = 'hello'
person = 'foreName surName'
world = 'WORLD'
sample = 'WELL {greeting!u} {super {person!t}, {tHiS iS tHe {world!t}!l}!c}!'
print(fmt.format(sample))
# output: WELL HELLO Super Forename Surname, this is the World!
I've subclassed string.Formatter to populate the nested fields, which I retrieve with a regex, and it works fine, except that fields with a conversion type don't get converted.
import re
from string import Formatter
class RecursiveNamespaceFormatter(Formatter):
    def __init__(self, namespace=None):
        Formatter.__init__(self)
        # avoid sharing one mutable default dict between instances
        self.namespace = {} if namespace is None else namespace
    def vformat(self, format_string, *args, **kwargs):
        def func(i):
            i = i.group().strip('{}')
            return self.get_value(i,(),{})
        format_string = re.sub(r'\{(?:[^}{]*)\}', func, format_string)
        try:
            return super().vformat(format_string, args, kwargs)
        except ValueError:
            return self.vformat(format_string)
    def get_value(self, key, args, kwds):
        if isinstance(key, str):
            try:
                # Check explicitly passed arguments first
                return kwds[key]
            except KeyError:
                return self.namespace.get(key, key) # return key if not found (e.g. key == "this is the World")
        else:
            return super().get_value(key, args, kwds)
    def convert_field(self, value, conversion):
        if conversion == "u":
            return str(value).upper()
        elif conversion == "l":
            return str(value).lower()
        elif conversion == "c":
            return str(value).capitalize()
        elif conversion == "t":
            return str(value).title()
        # Do the default conversion or raise error if no matching conversion found
        return super().convert_field(value, conversion)
# output: WELL hello!u super foreName surName!t, tHiS iS tHe WORLD!t!l!c!
What am I missing? Is there a better way to do this?
Recursion is a complicated thing here, especially given the limitations of Python's re module. Before I settled on string.Template, I experimented with looping through the string and stacking all relevant indexes to order each nested field in a hierarchy. Maybe a combination of the two could work; I'm not sure.
Here is, however, a working, non-recursive example:
from collections import ChainMap
from string import Template, _sentinel_dict
class MyTemplate(Template):
    delimiter = '$'
    pattern = r'\$(?:(?P<escaped>\$)|\{(?P<braced>[\w]+)(?:\.(?P<braced_func>\w+)\(\))*\}|(?P<named>(?:[\w]+))(?:\.(?P<named_func>\w+)\(\))*|(?P<invalid>))'
    def substitute(self, mapping=_sentinel_dict, **kws):
        if mapping is _sentinel_dict:
            mapping = kws
        elif kws:
            mapping = ChainMap(kws, mapping)
        def convert(mo):
            named = mapping.get(mo.group('named'), mapping.get(mo.group('braced')))
            func = mo.group('named_func') or mo.group('braced_func') # i.e. $var.func() or ${var.func()}
            if named is not None:
                if func is not None:
                    # if named doesn't contain func, convert it to str and try again.
                    callable_named = getattr(named, func, getattr(str(named), func, None))
                    if callable_named:
                        return str(callable_named())
                return str(named)
            if mo.group('escaped') is not None:
                return self.delimiter
            if mo.group('invalid') is not None:
                self._invalid(mo)
            if named is not None:
                raise ValueError('Unrecognized named group in pattern',
                                 self.pattern)
        return self.pattern.sub(convert, self.template)
sample1 = 'WELL $greeting.upper() super$person.title(), tHiS iS tHe $world.title().lower().capitalize()!'
S = MyTemplate(sample1)
print(S.substitute(**{'greeting': 'hello', 'person': 'foreName surName', 'world': 'world'}))
# output: WELL HELLO super Forename Surname, tHiS iS tHe World!
sample2 = 'testing${äää.capitalize()}.upper()ing $NOT_DECLARED.upper() $greeting '
sample2 += '$NOT_DECLARED_EITHER ASDF$world.upper().lower()ASDF'
S = MyTemplate(sample2)
print(S.substitute(**{
'some_var': 'some_value',
'äää': 'TEST',
'greeting': 'talofa',
'person': 'foreName surName',
'world': 'världen'
}))
# output: testingTest.upper()ing talofa ASDFvärldenASDF
sample3 = 'a=$a.upper() b=$b.bit_length() c=$c.bit_length() d=$d.upper()'
S = MyTemplate(sample3)
print(S.substitute(**{'a':1, 'b':'two', 'c': 3, 'd': 'four'}))
# output: a=1 b=two c=2 d=FOUR
As you can see, $var and ${var} work as expected, but the fields can also handle type methods. If the method is not found, it converts the value to str and checks again.
The methods can't take any arguments though. It also only catches the last method, so chaining doesn't work either, which I believe is because re does not allow multiple groups to use the same name (the regex module does, however).
With some tweaking of the regex pattern and some extra logic in convert both these things should be easily fixed.
MyTemplate.substitute works like MyTemplate.safe_substitute by not throwing exceptions on missing keys or fields.
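As a rough illustration of the chaining tweak mentioned above, here is a standalone sketch that captures the whole method chain in one group and applies the calls one by one. It is an assumption-laden simplification, not a drop-in replacement for MyTemplate: the function name render is made up, only the $name form is handled (no ${name} and no $$ escape), and methods still take no arguments.

```python
import re

# $name followed by zero or more no-argument calls, e.g. $world.title().lower()
_FIELD = re.compile(r'\$(\w+)((?:\.\w+\(\))*)')
_CALL = re.compile(r'\.(\w+)\(\)')


def render(template, mapping):
    def convert(match):
        name, chain = match.group(1), match.group(2)
        if name not in mapping:
            return ''  # unknown names are dropped silently, as in the class above
        value = mapping[name]
        for method_name in _CALL.findall(chain):
            method = getattr(value, method_name, None)
            if method is None:
                # Not available on the value itself: retry on its str() form.
                method = getattr(str(value), method_name, None)
            if callable(method):
                value = method()
        return str(value)

    return _FIELD.sub(convert, template)


print(render('tHiS iS tHe $world.title().lower().capitalize()!', {'world': 'WORLD'}))
# tHiS iS tHe World!
```

Because this calls whatever zero-argument method the template names, it should only be used with templates you trust.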
I'm wondering how one could create a program that dynamically detects the following cases in code, where a variable is compared to hardcoded values instead of an enumeration?
class AccountType:
    BBAN = '000'
    IBAN = '001'
    UBAN = '002'
    LBAN = '003'
I would like the code to report (drop a warning into the log) in the following case:
payee_account_type = self.get_payee_account_type(rc) # '001' for ex.
if payee_account_type in ('001', '002'): # Report on unsafe lookup
    print('okay, but not sure about the codes, man')
To encourage people to use the following approach:
payee_account_type = self.get_payee_account_type(rc)
if payee_account_type in (AccountType.IBAN, AccountType.UBAN):
    print('do this for sure')
Which is much safer.
It's not a problem to verify the == and != checks like below:
if payee_account_type == '001':
    print('codes again')
By wrapping payee_account_type into a class, with the following __eq__ implemented:
class Variant:
    def __init__(self, value):
        self._value = value
    def get_value(self):
        return self._value
class AccountType:
    BBAN = Variant('000')
    IBAN = Variant('001')
    UBAN = Variant('002')
    LBAN = Variant('003')
class AccountTypeWrapper(object):
    def __init__(self, account_type):
        self._account_type = account_type
    def __eq__(self, other):
        if isinstance(other, Variant):
            # Safe usage
            return self._account_type == other.get_value()
        # The value is hardcoded
        log.warning('Unsafe comparison. Use proper enumeration object')
        return self._account_type == other
But what to do with tuple lookups?
I know, I could create a convenience method wrapping the lookup, where the check can be done:
if IbanUtils.account_type_in(account_type, AccountType.IBAN, AccountType.UBAN):
    pass
class IbanUtils(object):
    @staticmethod  # called on the class, so no self parameter
    def account_type_in(account_type, *types_to_check):
        for candidate in types_to_check:
            if not isinstance(candidate, Variant):
                log.warning('Unsafe usage')
        return account_type in types_to_check
But it's not an option for me, because I have a lot of legacy code I cannot touch, but still need to report on.
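One way to report on code without touching it is to scan the source statically rather than intercept comparisons at run time. The sketch below uses the standard ast module to flag comparisons against known literal codes. It assumes the source can be parsed by the Python version running the scan (Python 2 only syntax such as print statements would need a different parser), and the function name find_literal_comparisons is hypothetical.

```python
import ast


def find_literal_comparisons(source, known_codes):
    """Return sorted (line, literal) pairs for comparisons against known codes."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        for operand in [node.left] + node.comparators:
            # Unpack ('001', '002') style containers; anything else is one operand.
            if isinstance(operand, (ast.Tuple, ast.List, ast.Set)):
                elements = operand.elts
            else:
                elements = [operand]
            for element in elements:
                if isinstance(element, ast.Constant) and element.value in known_codes:
                    findings.append((node.lineno, element.value))
    return sorted(findings)


legacy_source = '''
if payee_account_type in ('001', '002'):
    pass
if payee_account_type == '001':
    pass
if payee_account_type in (AccountType.IBAN, AccountType.UBAN):
    pass
'''

codes = {'000', '001', '002', '003'}
for line, literal in find_literal_comparisons(legacy_source, codes):
    print('line %d: hard-coded account type %r' % (line, literal))
```

This flags every comparison that involves one of the codes, so unrelated uses of the same string values would show up as false positives and need manual review.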
I would like to use Flask to handle URLs of the type:
http://localhost/aaa/bbb/ccc;x=1;y=10;z=11/ddd/
where x, y, z could have sensible defaults applied if they are absent (as would be possible with ddd for example).
One possible approach is to receive all the path, then split and handle manually:
@app.route('/')
@app.route('/<path:varargs>')
def hello(varargs=None):
    if varargs:
        print(varargs)
    else:
        print("Hello World")
Is there a more graceful approach to solving this problem?
No, Flask doesn't support such URLs out of the box.
However, Flask uses the Werkzeug library for routing, and that library supports creating custom converters for path elements. It should not be too hard to provide a converter. This would need to be applied to specific path elements:
@app.route('/aaa/bbb/ccc<matrix(x=1, y=10, z=11):matrix_params>/ddd/')
Note the ability to pass in arguments; here x=1, y=10, z=11 are arguments to the converter, specifying default values for the matrix parameters.
The converter could be:
from werkzeug.routing import BaseConverter, ValidationError
class MatrixConverter(BaseConverter):
    def __init__(self, url_map, **defaults):
        super(MatrixConverter, self).__init__(url_map)
        self.defaults = {k: str(v) for k, v in defaults.items()}
    def to_python(self, value):
        if not value.startswith(';'):
            raise ValidationError()
        value = value[1:]
        parts = value.split(';')
        result = self.defaults.copy()
        for part in parts:
            try:
                key, val = part.split('=')
            except ValueError:
                raise ValidationError()
            result[key.strip()] = val.strip()
        return result
    def to_url(self, value):
        return ';' + ';'.join('{}={}'.format(*item) for item in value.items())
To add a custom converter to Flask, add it to the app.url_map.converters dictionary:
app.url_map.converters['matrix'] = MatrixConverter
before you add any routes that rely on this converter.
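The parsing step can also be tried out without Flask or Werkzeug. Here is a small standalone sketch of the same idea: split a path segment into its name and its matrix parameters, with defaults applied first. The function name parse_matrix_params and the choice to raise ValueError are assumptions for illustration only.

```python
def parse_matrix_params(segment, defaults=None):
    """Split 'ccc;x=1;y=10' into ('ccc', {'x': '1', 'y': '10'})."""
    name, _, raw = segment.partition(';')
    params = dict(defaults or {})
    # filter(None, ...) skips empty pieces, e.g. when there are no parameters
    for part in filter(None, raw.split(';')):
        key, sep, value = part.partition('=')
        if not sep:
            raise ValueError('malformed matrix parameter: %r' % part)
        params[key.strip()] = value.strip()
    return name, params


print(parse_matrix_params('ccc;x=1;y=10', {'x': '0', 'z': '11'}))
# ('ccc', {'x': '1', 'z': '11', 'y': '10'})
```

Unlike the converter above, this version accepts a segment with no parameters at all and simply returns the defaults.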
Let's say I have a class that has a member called data which is a list.
I want to be able to initialize the class with, for example, a filename (which contains data to initialize the list) or with an actual list.
What's your technique for doing this?
Do you just check the type by looking at __class__?
Is there some trick I might be missing?
I'm used to C++ where overloading by argument type is easy.
A much neater way to get 'alternate constructors' is to use classmethods. For instance:
>>> class MyData:
...     def __init__(self, data):
...         "Initialize MyData from a sequence"
...         self.data = data
...
...     @classmethod
...     def fromfilename(cls, filename):
...         "Initialize MyData from a file"
...         data = open(filename).readlines()
...         return cls(data)
...
...     @classmethod
...     def fromdict(cls, datadict):
...         "Initialize MyData from a dict's items"
...         return cls(list(datadict.items()))
...
>>> MyData([1, 2, 3]).data
[1, 2, 3]
>>> MyData.fromfilename("/tmp/foobar").data
['foo\n', 'bar\n', 'baz\n']
>>> MyData.fromdict({"spam": "ham"}).data
[('spam', 'ham')]
The reason it's neater is that there is no doubt about what type is expected, and you aren't forced to guess at what the caller intended for you to do with the datatype it gave you. The problem with isinstance(x, basestring) is that there is no way for the caller to tell you, for instance, that even though the type is not a basestring, you should treat it as a string (and not another sequence.) And perhaps the caller would like to use the same type for different purposes, sometimes as a single item, and sometimes as a sequence of items. Being explicit takes all doubt away and leads to more robust and clearer code.
Excellent question. I've tackled this problem as well, and while I agree that "factories" (class-method constructors) are a good method, I would like to suggest another, which I've also found very useful:
Here's a sample (this is a read method and not a constructor, but the idea is the same):
def read(self, str=None, filename=None, addr=0):
    """ Read binary data and return a store object. The data
        store is also saved in the internal 'data' attribute.
        The data can either be taken from a string (str
        argument) or a file (provide a filename, which will
        be read in binary mode). If both are provided, the str
        will be used. If neither is provided, an ArgumentError
        is raised.
    """
    if str is None:
        if filename is None:
            raise ArgumentError('Please supply a string or a filename')
        file = open(filename, 'rb')
        str = file.read()
        file.close()
    ...
    ... # rest of code
The key idea here is using Python's excellent support for named arguments to implement this. Now, if I want to read the data from a file, I say:
obj.read(filename="blob.txt")
And to read it from a string, I say:
obj.read(str="\x34\x55")
This way the user has just a single method to call. Handling it inside, as you saw, is not overly complex.
With Python 3, you can use the "Implementing Multiple Dispatch with Function Annotations" recipe from the Python Cookbook:
import time
# MultipleMeta is the metaclass defined in the Cookbook recipe (not shown here)
class Date(metaclass=MultipleMeta):
    def __init__(self, year:int, month:int, day:int):
        self.year = year
        self.month = month
        self.day = day
    def __init__(self):
        t = time.localtime()
        self.__init__(t.tm_year, t.tm_mon, t.tm_mday)
and it works like:
>>> d = Date(2012, 12, 21)
>>> d.year
2012
>>> e = Date()
>>> e.year
2018
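The standard library offers a related tool, functools.singledispatchmethod (Python 3.8+), which dispatches on the type of the first non-self argument only. Below is a minimal sketch applied to the original question. The MyData class and its loading logic are illustrative assumptions, not the Cookbook recipe.

```python
from functools import singledispatchmethod


class MyData:
    @singledispatchmethod
    def __init__(self, data):
        # Fallback: treat the argument as an iterable of items.
        self.data = list(data)

    @__init__.register(str)
    def _from_filename(self, filename):
        # A str argument is interpreted as a path whose lines become the data.
        with open(filename) as handle:
            self.data = handle.read().splitlines()


print(MyData([1, 2, 3]).data)  # [1, 2, 3]
print(MyData((4, 5)).data)     # [4, 5]
```

The trade-off is the one raised earlier in this thread: a caller can no longer pass a string and have it treated as a plain sequence, which the explicit classmethod constructors avoid.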
Quick and dirty fix
class MyData:
    def __init__(self, string=None, list=None):
        if string is not None:
            pass  # do stuff
        elif list is not None:
            pass  # do other stuff
        else:
            pass  # make data empty
Then you can call it with
MyData(astring)
MyData(None, alist)
MyData()
A better way would be to use isinstance and type conversion. If I'm understanding you right, you want this:
def __init__(self, filename):
    if isinstance(filename, basestring):
        # filename is a string
        pass
    else:
        # try to convert to a list
        self.path = list(filename)
You should use isinstance
isinstance(...)
    isinstance(object, class-or-type-or-tuple) -> bool
    Return whether an object is an instance of a class or of a subclass thereof.
    With a type as second argument, return whether that is the object's type.
    The form using a tuple, isinstance(x, (A, B, ...)), is a shortcut for
    isinstance(x, A) or isinstance(x, B) or ... (etc.).
You probably want the isinstance builtin function:
self.data = data if isinstance(data, list) else self.parse(data)
OK, great. I just tossed together this example with a tuple, not a filename, but that's easy. Thanks all.
class MyData:
    def __init__(self, data):
        self.myList = []
        if isinstance(data, tuple):
            for i in data:
                self.myList.append(i)
        else:
            self.myList = data
    def GetData(self):
        print(self.myList)
a = [1,2]
b = (2,3)
c = MyData(a)
d = MyData(b)
c.GetData()
d.GetData()
[1, 2]
[2, 3]
My preferred solution is:
class MyClass:
    _data = []
    def __init__(self, data=None):
        # do init stuff
        if not data: return
        self._data = list(data) # list() copies the list, instead of pointing to it.
Then invoke it with either MyClass() or MyClass([1,2,3]).
Hope that helps. Happy Coding!
Why don't you go even more pythonic?
class AutoList:
    def __init__(self, inp):
        try: ## Assume an opened-file...
            self.data = inp.read()
        except AttributeError:
            try: ## Assume an existent filename...
                with open(inp, 'r') as fd:
                    self.data = fd.read()
            except:
                self.data = inp ## Who cares what that might be?