Is there a way to create a dict object with a space in the key?
# This way works
>>> d = {'a b': 1}
>>> d
{'a b': 1}
# Is it possible to create the same using this method?
>>> d = dict('a b'=1)
File "<stdin>", line 1
SyntaxError: expression cannot contain assignment, perhaps you meant "=="?
No, this manner of constructing a dictionary cannot handle a key with a space.
https://docs.python.org/3/library/stdtypes.html#dict shows numerous methods for constructing dictionaries. The first one is what you're trying to do:
a = dict(one=1, two=2, three=3)
But the following note says:
Providing keyword arguments as in the first example only works for
keys that are valid Python identifiers.
A quoted string, as in your attempt, is not a valid identifier. An identifier also cannot include spaces, so a b without quotes will not work either.
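The other construction forms listed on that documentation page do accept such a key. A minimal sketch (the variable names are made up for illustration) that checks the identifier rule with str.isidentifier() and builds the same dictionary four ways:

```python
key = "a b"

# The keyword form dict(name=value) needs a valid identifier; this one is not.
print(key.isidentifier())

from_literal = {"a b": 1}              # dictionary display
from_pairs = dict([("a b", 1)])        # iterable of (key, value) pairs
from_zip = dict(zip(["a b"], [1]))     # keys and values zipped together
from_unpacking = dict(**{"a b": 1})    # unpacking an existing mapping

print(from_literal == from_pairs == from_zip == from_unpacking)
```

The last form only works because the key is still a string; it simply moves the literal inside the call.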
When I try to create this
self.cmds = {
    'help' : self.cmdsHelp,
    'write-start' : self.startWriter,
    'write' : self.writeTo,
    'read-start' : self.startReader,
    'read' : self.readFrom
}
with the built-in dict() function... i.e.
self.cmds = dict(
    help = self.cmdsHelp,
    write-start = self.startWriter,
    write = self.writeTo,
    read-start = self.startReader,
    read = self.readFrom
)
... I get this error:
write-start = self.startWriter,
^
SyntaxError: keyword can't be an expression
The dictionary with the curly brackets ({}) -- whatever special name that is -- works, but I cannot fathom why the "newer version" (the dict() form) does not work. Is there something that I am missing, or do you just have to use the curly braces?
For clarity:
Each value in the dictionary is a function (and yes I did remove self., and I also tried to do both self.function() and function() so that when I called it I didn't have to do self.cmds[<input>]() but could rather do self.cmds[<input>])
Keyword arguments must be valid python identifiers. You cannot use - in valid identifiers (you are trying to subtract two identifiers instead). dict() is just a callable, and keyword arguments passed to it are no exception.
Use the {} literal dict syntax instead:
self.cmds = {
    'help': self.cmdsHelp,
    'write-start': self.startWriter,
    'write': self.writeTo,
    'read-start': self.startReader,
    'read': self.readFrom,
}
because then you can use any valid immutable & hashable value as keys.
Alternatively, use valid identifiers; replace - with _, a character that is allowed in identifiers:
self.cmds = dict(
    help=self.cmdsHelp,
    write_start=self.startWriter,
    write=self.writeTo,
    read_start=self.startReader,
    read=self.readFrom,
)
Any other alternatives get ugly real fast; you could use the dict literal syntax to produce a **kwargs double-splat keyword argument mapping:
self.cmds = dict(
    help=self.cmdsHelp,
    write=self.writeTo,
    read=self.readFrom,
    **{
        'read-start': self.startReader,
        'write-start': self.startWriter,
    }
)
but that's not any more readable, is it.
You can set those pesky non-identifier keys after the fact:
self.cmds = dict(
    help=self.cmdsHelp,
    write=self.writeTo,
    read=self.readFrom,
)
self.cmds['read-start'] = self.startReader
self.cmds['write-start'] = self.startWriter
but that's more ugly still.
Note that dictionary displays (the official term for the syntax) are faster for the interpreter to process than are dict() calls, as fewer bytecode instructions are used to build one and no function call is involved.
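One way to see the difference is to disassemble both forms. This is a rough sketch for CPython only; opcode names differ between versions, so it merely checks whether any call instruction appears (the helper names opnames and uses_call are made up for illustration):

```python
import dis


def opnames(expression):
    """Return the bytecode instruction names for a single expression."""
    code = compile(expression, "<demo>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


def uses_call(names):
    # Call opcodes are named differently across CPython versions
    # (CALL, CALL_FUNCTION_KW, CALL_KW, ...), so match on the substring.
    return any("CALL" in name for name in names)


display_ops = opnames("{'help': 1, 'write': 2}")
call_ops = opnames("dict(help=1, write=2)")

print(display_ops)
print(call_ops)
print(uses_call(display_ops), uses_call(call_ops))
```

The display compiles to a map-building instruction, while the dict() form has to look up the name dict and then call it.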
Is it possible to capitalize a word using string formatting? For example,
"{user} did such and such.".format(user="foobar")
should return "Foobar did such and such."
Note that I'm well aware of .capitalize(); however, here's a (very simplified version of) code I'm using:
printme = random.choice(["On {date}, {user} did la-dee-dah. ",
"{user} did la-dee-dah on {date}. "
])
output = printme.format(user=x,date=y)
As you can see, just defining user as x.capitalize() in the .format() doesn't work, since then it would also be applied (incorrectly) to the first scenario. And since I can't predict fate, there's no way of knowing which random.choice would be selected in advance. What can I do?
Addt'l note: Just doing output = random.choice(['xyz'.format(),'lmn'.format()]) (in other words, formatting each string individually, and then using .capitalize() for the ones that need it) isn't a viable option, since printme is actually choosing from ~40+ strings.
As @IgnacioVazquez-Abrams said, creating a subclass of string.Formatter allows you to extend or change how format strings are processed.
In your case, you have to override the convert_field method:
from string import Formatter


class ExtendedFormatter(Formatter):
    """An extended format string formatter

    Formatter with extended conversion symbols
    """

    def convert_field(self, value, conversion):
        """Extend the conversion symbols

        The following additional symbols have been added:
        * l: convert to string and lower case
        * u: convert to string and upper case

        The defaults are:
        * s: convert with str()
        * r: convert with repr()
        * a: convert with ascii()
        """
        if conversion == "u":
            return str(value).upper()
        elif conversion == "l":
            return str(value).lower()
        # Do the default conversion or raise error if no matching conversion found
        return super(ExtendedFormatter, self).convert_field(value, conversion)


# Test this code
myformatter = ExtendedFormatter()
template_str = "normal:{test}, upcase:{test!u}, lowcase:{test!l}"
output = myformatter.format(template_str, test="DiDaDoDu")
print(output)
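The same hook can cover the capitalization asked about in the question. A minimal sketch, assuming a hypothetical "c" conversion symbol and a made-up class name; the templates are adapted from the question:

```python
import random
from string import Formatter


class CapitalizingFormatter(Formatter):
    """Formatter with one extra, hypothetical conversion: !c capitalizes."""

    def convert_field(self, value, conversion):
        if conversion == "c":
            # str.capitalize() also lowercases the rest of the string.
            return str(value).capitalize()
        return super().convert_field(value, conversion)


templates = [
    "On {date}, {user} did la-dee-dah.",
    "{user!c} did la-dee-dah on {date}.",
]
formatter = CapitalizingFormatter()
print(formatter.format(random.choice(templates), user="foobar", date="Monday"))
```

Only the templates that start with the user name need the !c marker, so the call site stays the same whichever template is picked. Note that str.capitalize() would turn a name like "McDonald" into "Mcdonald".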
You can pass extra values and just not use them, as in this lightweight option:
printme = random.choice(["On {date}, {user} did la-dee-dah. ",
                         "{User} did la-dee-dah on {date}. "
                         ])
output = printme.format(user=x, date=y, User=x.capitalize())
The best choice probably depends on whether you are doing this often enough to need your own full-blown Formatter.
You can create your own subclass of string.Formatter which will allow you to recognize a custom conversion that you can use to recase your strings.
myformatter.format('{user!u} did la-dee-dah on {date}, and {pronoun!l} liked it. ',
user=x, date=y, pronoun=z)
In Python 3.6+ you can use f-strings: https://realpython.com/python-f-strings/
>>> txt = 'aBcD'
>>> f'{txt.upper()}'
'ABCD'
I want to achieve the following with str.format:
x,y = 1234,5678
print str(x)[2:] + str(y)[:2]
The only way I was able to do it was:
print '{0}{1}'.format(str(x)[2:],str(y)[:2])
Now, this is an example; what I really have is a long and messy string, so I want to put the slicing inside the {}. I've studied the docs, but I can't figure out the correct syntax. My question is: is it possible to slice strings inside a replacement field?
No, you can't apply slicing to strings inside the replacement field.
You'll need to refer to the Format Specification Mini-Language; it defines what is possible. This mini language defines how you format the referenced value (the part after the : in the replacement field syntax).
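One thing the mini-language can do is keep only a leading part of a string: for string values, the precision in the format spec caps the number of characters. That covers the str(y)[:2] half of the question, though not str(x)[2:], which still has to be sliced outside. A small sketch using the values from the question (the !s conversion is needed because a precision is not allowed for integers):

```python
x, y = 1234, 5678

# "{0!s:.2}" converts the argument with str() and keeps at most 2 characters.
prefix = "{0!s:.2}".format(y)
print(prefix)

# The suffix slice has no format-spec equivalent, so it stays in the argument.
combined = "{0}{1!s:.2}".format(str(x)[2:], y)
print(combined)  # 3456
```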
You could do something like this.
NOTE
This is a rough example and should not be considered complete and tested. But I think it shows you a way to start getting where you want to be.
import string


class SliceFormatter(string.Formatter):

    def get_value(self, key, args, kwds):
        # Plain positional keys such as {0} arrive as ints, so only inspect strings.
        if isinstance(key, str) and '|' in key:
            try:
                key, indexes = key.split('|')
                indexes = map(int, indexes.split(','))
                if key.isdigit():
                    return args[int(key)][slice(*indexes)]
                return kwds[key][slice(*indexes)]
            except KeyError:
                return kwds.get(key, 'Missing')
        return super(SliceFormatter, self).get_value(key, args, kwds)


phrase = "Hello {name|0,5}, nice to meet you. I am {name|6,9}. That is {0|0,4}."
fmt = SliceFormatter()
print(fmt.format(phrase, "JeffJeffJeff", name="Larry Bob"))
OUTPUT
Hello Larry, nice to meet you. I am Bob. That is Jeff.
NOTE 2
There is no support for slicing like [:5] or [6:], but I think that would be easy enough to implement as well. Also there is no error checking for slice indexes out of range, etc.
You can use a run-time evaluated "f" string. Python f-strings support slicing and don't use a "mini-language" like the formatter. The full power of a python expression is available within each curly-brace of an f-string. Unfortunately there is no string.feval() function ... imo there should be (languages should not have magic abilities that are not provided to the user).
You also can't add one to the string type, because the built-in python types cannot be modified/expanded.
See https://stackoverflow.com/a/49884004/627042 for an example of a run-time evaluated f-string.
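For the ordinary case where the template is written in the source rather than evaluated at run time, a short sketch with the values from the question shows the slicing inside the braces (Python 3.6+):

```python
x, y = 1234, 5678

# Any expression is allowed inside the braces of an f-string, including slices.
result = f"{str(x)[2:]}{str(y)[:2]}"
print(result)  # 3456
```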
To answer your question directly: no, slicing is not supported by built-in str formatting. There is a workaround, though, in case f-strings (runtime evaluated) don't fit your needs.
Workaround
The previous answers to extend string.Formatter are not completely right, since overloading get_value is not the correct way to add the slicing mechanism to string.Formatter.
import string


def transform_to_slice_index(val: str):
    if val == "_":
        return None
    else:
        return int(val)


class SliceFormatter(string.Formatter):

    def get_field(self, field_name, args, kwargs):
        slice_operator = None
        if isinstance(field_name, str) and '|' in field_name:
            field_name, slice_indexes = field_name.split('|')
            slice_indexes = map(transform_to_slice_index,
                                slice_indexes.split(','))
            slice_operator = slice(*slice_indexes)

        obj, first = super().get_field(field_name, args, kwargs)
        if slice_operator is not None:
            obj = obj[slice_operator]

        return obj, first
Explanation
get_value is called inside get_field and it is used ONLY to access the args and kwargs from vformat(). attr and item accessing is done in get_field. Thus, the slice access should be done after super().get_field returned the desired obj.
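To make that concrete, here is a condensed restatement of the get_field approach with a usage example. The foo object and the sample strings are made up for illustration; "_" stands for an omitted bound, as above:

```python
import string
from types import SimpleNamespace


class SliceFormatter(string.Formatter):
    """Condensed sketch: slice after attribute and item access are resolved."""

    def get_field(self, field_name, args, kwargs):
        slice_operator = None
        if "|" in field_name:
            field_name, spec = field_name.split("|")
            bounds = [None if part == "_" else int(part)
                      for part in spec.split(",")]
            slice_operator = slice(*bounds)
        # The parent class resolves "foo.bar[0]" to the final object first.
        obj, first = super().get_field(field_name, args, kwargs)
        if slice_operator is not None:
            obj = obj[slice_operator]
        return obj, first


foo = SimpleNamespace(bar=["abcdef"])  # hypothetical nested data
fmt = SliceFormatter()
nested = fmt.format("{foo.bar[0]|1,3}", foo=foo)
open_ended = fmt.format("{name|_,5}", name="Larry Bob")
print(nested)      # bc
print(open_ended)  # Larry
```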
With this said, overloading get_value gives you the problem that the formatter would not work for slicing after the object is traversed. You can see the error in this example:
WrongSliceFormatter().format("{foo.bar[0]|1,3}", foo=foo)
>> ValueError: "Only '.' or '[' may follow ']' in format field specifier"
This is a nice solution and solved my slicing problem quite nicely. However, I also wanted to do value eliding as well. For example 'AVeryLongStringValue' that I might want to stuff in a 10 character field, might be truncated to '...ngValue'. So I extended your example to support slicing, eliding, and normal formatting all in one. This is what I came up with.
import string
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


class SliceElideFormatter(string.Formatter):
    """An extended string formatter that provides key specifiers that allow
    string values to be sliced and elided if they exceed a length limit. The
    additional formats are optional and can be combined with normal python
    formatting. So the whole syntax looks like:

        key[|slice-options][$elide-options][:normal-options]

    Where slice options consist of a '|' character to begin a slice request,
    followed by slice indexes separated by commas. Thus {FOO|5,} requests
    everything after the 5th element.

    The elide options consist of a '$' character followed by an integer max
    field value, followed by '<', '^', or '>' for pre, centered, or post
    eliding, followed by the eliding string. Thus {FOO$10<-} would display
    the last 9 characters of a string longer than 10 characters with a '-'
    prefix.

    Slicing and eliding can be combined. For example, given a dict of
    {'FOO': 'centeredtextvalue'} and a format string of
    '{FOO|1,-1$11^%2E%2E%2E}', the result would be 'ente...valu'. The slice
    spec removes the first and last characters, and the elide spec center
    elides the remaining value with '...'. The '...' value must be encoded in
    URL format since . is an existing special format character.
    """

    def get_value(self, key, args, kwds):
        """Called by string.Formatter for each format key found in the format
        string. The key is checked for the presence of a slice or elide
        introducer character. If one or both are found, the slice and/or
        elide spec is extracted, parsed and applied to the value found with
        the remaining key string.

        Arguments:
          key, A format key string possibly containing slice or elide specs
          args, Format values list tuple
          kwds, Format values key word dictionary
        """
        sspec = espec = None
        # Plain positional keys such as {0} arrive as ints; leave them alone.
        if isinstance(key, str):
            if '|' in key:
                key, sspec = key.split('|')
                if '$' in sspec:
                    sspec, espec = sspec.split('$')
            elif '$' in key:
                key, espec = key.split('$')
        if not (sspec or espec):
            return super(SliceElideFormatter, self).get_value(key, args, kwds)
        value = args[int(key)] if key.isdigit() else kwds[key]
        if sspec:
            sindices = [int(sdx) if sdx else None
                        for sdx in sspec.split(',')]
            value = value[slice(*sindices)]
        if espec:
            espec = unquote(espec)
            if '<' in espec:
                value = self._prefix_elide_value(espec, value)
            elif '>' in espec:
                value = self._postfix_elide_value(espec, value)
            elif '^' in espec:
                value = self._center_elide_value(espec, value)
            else:
                raise ValueError('invalid eliding option %r' % espec)
        return value

    def _center_elide_value(self, elidespec, value):
        """Return center elide value if it exceeds the elide length.

        Arguments:
          elidespec, The elide spec field extracted from key
          value, Value obtained from remaining key to maybe be elided
        """
        elidelen, elidetxt = elidespec.split('^')
        elen, vlen = int(elidelen), len(value)
        if vlen > elen:
            tlen = len(elidetxt)
            return value[:(elen-tlen)//2] + elidetxt + value[-(elen-tlen)//2:]
        return value

    def _postfix_elide_value(self, elidespec, value):
        """Return postfix elided value if it exceeds the elide length.

        Arguments:
          elidespec, The elide spec field extracted from key
          value, Value obtained from remaining key to maybe be elided
        """
        elidelen, elidetxt = elidespec.split('>')
        elen, vlen = int(elidelen), len(value)
        if vlen > elen:
            tlen = len(elidetxt)
            return value[:(elen-tlen)] + elidetxt
        return value

    def _prefix_elide_value(self, elidespec, value):
        """Return prefix elided value if it exceeds the elide length.

        Arguments:
          elidespec, The elide spec field extracted from key
          value, Value obtained from remaining key to maybe be elided
        """
        elidelen, elidetxt = elidespec.split('<')
        elen, vlen = int(elidelen), len(value)
        if vlen > elen:
            tlen = len(elidetxt)
            return elidetxt + value[-(elen-tlen):]
        return value
As an example all three format specs can be combined to clip the values first and last characters, center elide the slice to a 10 char value, and finally right justify it in a 12 char field as follows:
sefmtr = SliceElideFormatter()
data = {'CNT': 'centeredtextvalue'}
fmt = '{CNT|1,-1$10^**:>12}'
print('%r' % sefmtr.format(fmt, **data))
Outputs: '  ente**valu' (the 10-character elided value, right-justified in a 12-character field). For anyone else who may be interested. Thanks much.
I tried this in Python 3.9 and it works well, with the slice applied to the argument rather than inside the replacement field:
x = "nowpossible"
print("slicing is possible {}".format(x[0:3]))
Output:
slicing is possible now
Does Python have something like an empty string variable where you can do:
if myString == string.empty:
Regardless, what's the most elegant way to check for empty string values? I find hard coding "" every time for checking an empty string not as good.
Empty strings are "falsy" (python 2 or python 3 reference), which means they are considered false in a Boolean context, so you can just do this:
if not myString:
This is the preferred way if you know that your variable is a string. If your variable could also be some other type then you should use:
if myString == "":
See the documentation on Truth Value Testing for other values that are false in Boolean contexts.
From PEP 8, in the “Programming Recommendations” section:
For sequences, (strings, lists, tuples), use the fact that empty sequences are false.
So you should use:
if not some_string:
or:
if some_string:
Just to clarify, sequences are evaluated to False or True in a Boolean context if they are empty or not. They are not equal to False or True.
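A short check of that distinction (the variable names are made up for illustration):

```python
empty = ""
filled = "text"

# Truth-value testing: an empty string counts as false, a non-empty one as true.
print(bool(empty), bool(filled))

# Equality is a different question: neither string is equal to False or True.
print(empty == False, filled == True)
```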
The most elegant way would probably be to simply check whether it is truthy or falsy, e.g.:
if not my_string:
However, you may want to strip white space because:
>>> bool("")
False
>>> bool(" ")
True
>>> bool(" ".strip())
False
You should probably be a bit more explicit in this however, unless you know for sure that this string has passed some kind of validation and is a string that can be tested this way.
I would test noneness before stripping. Also, I would use the fact that empty strings are False (or Falsy). This approach is similar to Apache's StringUtils.isBlank or Guava's Strings.isNullOrEmpty
This is what I would use to test if a string is either None OR Empty OR Blank:
def isBlank(myString):
    return not (myString and myString.strip())
And, the exact opposite to test if a string is not None NOR Empty NOR Blank:
def isNotBlank(myString):
    return bool(myString and myString.strip())
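A quick run over the interesting inputs, as a sketch; the snake_case name is_blank is a made-up stand-in for the function above:

```python
def is_blank(value):
    """True for None, the empty string, and whitespace-only strings."""
    return not (value and value.strip())


for candidate in (None, "", "   ", "\t\n", "text", "  text  "):
    print(repr(candidate), is_blank(candidate))
```

The short-circuiting `and` is what makes None safe here: strip() is never called on it.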
I once wrote something similar to Bartek's answer, inspired by JavaScript:
def is_not_blank(s):
    return bool(s and not s.isspace())
Test:
print(is_not_blank(""))    # False
print(is_not_blank(" "))   # False
print(is_not_blank("ok"))  # True
print(is_not_blank(None))  # False
The only really solid way of doing this is the following:
if "".__eq__(myString):
All other solutions have possible problems and edge cases where the check can fail.
len(myString) == 0 can fail if myString is an object of a class that inherits from str and overrides the __len__() method.
myString == "" and myString.__eq__("") can fail if myString overrides __eq__() and __ne__().
"" == myString also gets fooled if myString overrides __eq__().
myString is "" and "" is myString are equivalent. They will both fail if myString is not actually a string but a subclass of string (both will return False). Also, since they are identity checks, the only reason they work is that Python uses string interning (also called string pooling), which reuses the same instance of a string if it is interned (see here: Why does comparing strings using either '==' or 'is' sometimes produce a different result?). And "" is interned from the start in CPython.
The big problem with the identity check is that, as far as I could find, it is not standardised which strings are interned. That means that, theoretically, "" is not necessarily interned; that is implementation dependent.
Also, comparing strings using is in general is a pretty evil trap since it will work correctly sometimes, but not at other times, since string pooling follows pretty strange rules.
Relying on the falsyness of a string may not work if the object overrides __bool__().
The only way of doing this that really cannot be fooled is the one mentioned in the beginning: "".__eq__(myString). Since this explicitly calls the __eq__() method of the empty string it cannot be fooled by overriding any methods in myString and solidly works with subclasses of str.
This is not only theoretical work but might actually be relevant in real usage since I have seen frameworks and libraries subclassing str before and using myString is "" might return a wrong output there.
That said, in most cases all of the mentioned solutions will work correctly. This post is mostly academic.
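One caveat worth checking before relying on this: when the argument is not a str at all, str.__eq__ returns NotImplemented rather than False. NotImplemented is truthy on older Python versions, and using it in a boolean context has been deprecated since Python 3.9, so a bare if "".__eq__(x): is not safe for arbitrary objects. A sketch with a hypothetical misbehaving subclass and a hypothetical helper that compares the result against True explicitly:

```python
class Sneaky(str):
    """Hypothetical str subclass that claims to equal everything."""

    def __eq__(self, other):
        return True

    __hash__ = str.__hash__


sneaky = Sneaky("not empty")

print("" == sneaky)         # the overridden __eq__ gets to answer
print("".__eq__(sneaky))    # str's own comparison, not fooled
print("".__eq__(None))      # NotImplemented, not False


def is_empty_str(value):
    # Only an actual True from str.__eq__ counts; NotImplemented does not.
    return "".__eq__(value) is True


print(is_empty_str(""), is_empty_str(sneaky), is_empty_str(None))
```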
Test empty or blank string (shorter way):
if myString.strip():
    print("it's not an empty or blank string")
else:
    print("it's an empty or blank string")
If you want to differentiate between empty and null (None) strings, I would suggest using if len(string), which raises a TypeError for None instead of treating it as empty; otherwise, I'd suggest simply using if string, as others have said. The caveat about strings full of whitespace still applies, though, so don't forget to strip.
if stringname: evaluates to False when the string is empty. I guess it can't be simpler than this.
I find hardcoding(sic) "" every time for checking an empty string not as good.
Clean code approach
Doing this: foo == "" is very bad practice. "" is a magic value. You should never check against magic values (more commonly known as magic numbers).
What you should do is compare to a descriptive variable name.
Descriptive variable names
One may think that "empty_string" is a descriptive variable name. It isn't.
Before you go and do empty_string = "" and think you have a great variable name to compare to: this is not what "descriptive variable name" means.
A good descriptive variable name is based on its context.
You have to think about what the empty string is.
Where does it come from?
Why is it there?
Why do you need to check for it?
Simple form field example
You are building a form where a user can enter values. You want to check if the user wrote something or not.
A good variable name may be not_filled_in
This makes the code very readable
if formfields.name == not_filled_in:
    raise ValueError("We need your name")
Thorough CSV parsing example
You are parsing CSV files and want the empty string to be parsed as None
(Since CSV is entirely text based, it cannot represent None without using predefined keywords)
A good variable name may be CSV_NONE
This makes the code easy to change and adapt if you have a new CSV file that represents None with another string than ""
if csvfield == CSV_NONE:
    csvfield = None
There are no questions about if this piece of code is correct. It is pretty clear that it does what it should do.
Compare this to
if csvfield == EMPTY_STRING:
    csvfield = None
The first question here is, Why does the empty string deserve special treatment?
This would tell future coders that an empty string should always be considered as None.
This is because it mixes business logic (What CSV value should be None) with code implementation (What are we actually comparing to)
There needs to be a separation of concerns between the two.
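A minimal sketch of the CSV case, assuming a hypothetical parse_field helper and made-up sample data. The named constant is the only place that would change if another file used a different marker for missing values:

```python
import csv
import io

# Business rule for this (hypothetical) file format: this text means "no value".
CSV_NONE = ""


def parse_field(raw):
    """Map the file's missing-value marker to None; pass other text through."""
    return None if raw == CSV_NONE else raw


sample = io.StringIO("name,nickname\nAda,\nGrace,Amazing\n")
rows = [[parse_field(cell) for cell in row] for row in csv.reader(sample)]
print(rows)
```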
How about this? Perhaps it's not "the most elegant", but it seems pretty complete and clear:
if (s is None) or (str(s).strip() == ""):
    ...  # string s is "empty"
If you want to check whether a string is completely empty:
if not mystring:
This works because empty strings are falsy. A string that contains only whitespace is truthy, though, so you might want:
if not mystring.strip():
Responding to @1290. Sorry, no way to format blocks in comments. The None value is not an empty string in Python, and neither is (spaces). The answer from Andrew Clark is the correct one: if not myString. The answer from @rouble is application-specific and does not answer the OP's question. You will get in trouble if you adopt a peculiar definition of what a "blank" string is. In particular, the standard behavior is that str(None) produces 'None', a non-blank string.
However if you must treat None and (spaces) as "blank" strings, here is a better way:
class weirdstr(str):
    def __new__(cls, content):
        return str.__new__(cls, content if content is not None else '')
    def __nonzero__(self):  # truth-value hook in Python 2
        return bool(self.strip())
    __bool__ = __nonzero__  # Python 3 looks up __bool__ instead
Examples:
>>> normal = weirdstr('word')
>>> print normal, bool(normal)
word True
>>> spaces = weirdstr(' ')
>>> print spaces, bool(spaces)
False
>>> blank = weirdstr('')
>>> print blank, bool(blank)
False
>>> none = weirdstr(None)
>>> print none, bool(none)
False
>>> if not spaces:
... print 'This is a so-called blank string'
...
This is a so-called blank string
Meets the @rouble requirements while not breaking the expected bool behavior of strings.
I did some experimentation with strings like '', ' ', '\n', etc. I want isNotWhitespace to be True if and only if the variable foo is a string with at least one non-whitespace character. I'm using Python 3.6. Here's what I ended up with:
isWhitespace = str is type(foo) and not foo.strip()
isNotWhitespace = str is type(foo) and not not foo.strip()
Wrap this in a method definition if desired.
a = ''
b = ' '
a.isspace() -> False
b.isspace() -> True
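Since isspace() is False for the empty string, it cannot serve as an "empty or blank" test on its own. A small comparison against the strip() idiom, with made-up sample values:

```python
samples = ["", " ", "\t\n", " a "]

for s in samples:
    # isspace() needs at least one character; "not s.strip()" also covers "".
    print(repr(s), s.isspace(), not s.strip())
```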
The clearest approach is:
if s == "":
Benefits:
Additional indication to the programmer what the type of s should be.
"" is not "hard-coding" a magic value any more than x == 0 is.
Some values are fundamental and do not need a named constant; e.g. x % 2 to check for even numbers.
Cannot incorrectly indicate that any falsy value (e.g. []) is an empty string.
Consider how one checks if an integer is 0:
if x == 0:
One certainly should not do:
if not x:
Both integers and strings are primitive value types. Why treat them differently?
not str(myString)
This expression is True for strings that are empty. Non-empty strings, None and non-string objects will all produce False, with the caveat that objects may override __str__ to thwart this logic by returning a falsy value.
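A quick sketch of that behaviour, including the caveat; the Quiet class is a made-up example of an object whose __str__ returns an empty string:

```python
class Quiet:
    """Hypothetical object whose string form is empty."""

    def __str__(self):
        return ""


results = {
    "empty string": not str(""),
    "non-empty string": not str("text"),
    "None": not str(None),       # str(None) is 'None'
    "empty list": not str([]),   # str([]) is '[]'
    "Quiet()": not str(Quiet()),
}
for label, outcome in results.items():
    print(label, outcome)
```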
You may have a look at this: Assigning empty value or string in Python
This is about comparing strings that are empty. So instead of testing for emptiness with not, you may test whether your string is equal to the empty string, "".
For those who expect behaviour like Apache's StringUtils.isBlank or Guava's Strings.isNullOrEmpty:
if mystring and mystring.strip():
    print("not blank string")
else:
    print("blank string")
When you are reading a file line by line and want to determine which lines are empty, make sure you use .strip(), because an "empty" line still contains a newline character:
with open("my_file.log", "r") as log_file:
    for line in log_file:
        if not line.strip():
            continue
        # your code for non-empty lines
If you are not totally sure that your input is really a string, I would recommend also using isinstance(object, classinfo), as shown in the example.
Otherwise, a non-empty list or a True bool value would also be evaluated as True.
<script type="text/javascript" src="//cdn.datacamp.com/dcl-react.js.gz"></script>
<div data-datacamp-exercise data-lang="python">
<code data-type="sample-code">
def test_string(my_string):
    if isinstance(my_string, str) and my_string:
        print("It's a me, String! -> " + my_string)
    else:
        print("Nope. No, String")

def not_fully_test_string(my_string):
    if my_string:
        print("It's a me, String??? -> " + str(my_string))
    else:
        print("Nope. No, String")

print("Testing String:")
test_string("")
test_string(True)
test_string(["string1", "string2"])
test_string("My String")
test_string(" ")

print("\nTesting String or not?")
not_fully_test_string("")
not_fully_test_string(True)
not_fully_test_string(["string1", "string2"])
not_fully_test_string("My String")
not_fully_test_string(" ")
</code>
</div>
If you just use
not var1
it is not possible to distinguish a variable that is the boolean False from an empty string '':
var1 = ''
not var1
> True
var1 = False
not var1
> True
However, if you add a simple condition to your script, the difference is made:
var1 = False
not var1 and var1 != ''
> True
var1 = ''
not var1 and var1 != ''
> False
In case this is useful to someone, here is a quick function I built to replace blank strings with N/A's in lists of lists (Python 2).
y = [["1","2",""],["1","4",""]]
def replace_blank_strings_in_lists_of_lists(list_of_lists):
    new_list = []
    for one_list in list_of_lists:
        new_one_list = []
        for element in one_list:
            if element:
                new_one_list.append(element)
            else:
                new_one_list.append("N/A")
        new_list.append(new_one_list)
    return new_list
x = replace_blank_strings_in_lists_of_lists(y)
print(x)
This is useful for posting lists of lists to a mysql database that does not accept blanks for certain fields (fields marked as NN in schema. in my case, this was due to a composite primary key).
Below is an elegant solution for any number of spaces.
def str_empty(s: str) -> bool:
    """Strip white space and count remaining characters."""
    return len(s.strip()) < 1
>>> str_empty(' ')
True
As prmatta posted above, but with mistake.
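def isNoneOrEmptyOrBlankString(myString):
    if myString:
        # Non-empty: blank only if nothing is left after stripping whitespace.
        if not myString.strip():
            return True
        else:
            return False
    # None and the empty string both end up here.
    return True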