I love using the expression
if 'MICHAEL89' in USERNAMES:
...
where USERNAMES is a list.
Is there any way to match items with case insensitivity or do I need to use a custom method? Just wondering if there is a need to write extra code for this.
username = 'MICHAEL89'
if username.upper() in (name.upper() for name in USERNAMES):
...
Alternatively:
if username.upper() in map(str.upper, USERNAMES):
...
Or, yes, you can make a custom method.
str.casefold is recommended for case-insensitive string matching. @nmichaels's solution can trivially be adapted.
Use either:
if 'MICHAEL89'.casefold() in (name.casefold() for name in USERNAMES):
Or:
if 'MICHAEL89'.casefold() in map(str.casefold, USERNAMES):
As per the docs:
Casefolding is similar to lowercasing but more aggressive because it
is intended to remove all case distinctions in a string. For example,
the German lowercase letter 'ß' is equivalent to "ss". Since it is
already lowercase, lower() would do nothing to 'ß'; casefold()
converts it to "ss".
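To make the difference quoted above concrete, here is a minimal sketch. The helper name contains_casefold and the sample list are made up for illustration; only str.casefold and str.lower come from the answers.

```python
def contains_casefold(needle, names):
    """Return True if needle equals any name, ignoring case via casefold()."""
    target = needle.casefold()
    return any(name.casefold() == target for name in names)

USERNAMES = ['michael89', 'Straße']  # made-up sample data

print(contains_casefold('MICHAEL89', USERNAMES))  # True
print(contains_casefold('STRASSE', USERNAMES))    # True: 'ß' folds to 'ss'
# lower() leaves 'ß' untouched, so the same lookup misses:
print('STRASSE'.lower() in (name.lower() for name in USERNAMES))  # False
```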
I would make a wrapper so you can be non-invasive. Minimally, for example...:
class CaseInsensitively(object):
    def __init__(self, s):
        self.__s = s.lower()
    def __hash__(self):
        return hash(self.__s)
    def __eq__(self, other):
        # ensure proper comparison between instances of this class
        try:
            other = other.__s
        except (TypeError, AttributeError):
            try:
                other = other.lower()
            except:
                pass
        return self.__s == other
Now, if CaseInsensitively('MICHAEL89') in whatever: should behave as required (whether the right-hand side is a list, dict, or set). (It may require more effort to achieve similar results for string inclusion, avoid warnings in some cases involving unicode, etc).
Usually (in oop at least) you shape your object to behave the way you want. name in USERNAMES is not case insensitive, so USERNAMES needs to change:
class NameList(object):
    def __init__(self, names):
        self.names = names
    def __contains__(self, name): # implements `in`
        return name.lower() in (n.lower() for n in self.names)
    def add(self, name):
        self.names.append(name)
# now this works
usernames = NameList(USERNAMES)
print someone in usernames
The great thing about this is that it opens the path for many improvements, without having to change any code outside the class. For example, you could change the self.names to a set for faster lookups, or compute the (n.lower() for n in self.names) only once and store it on the class and so on ...
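One of the improvements mentioned above could look roughly like this. It is a sketch, not the answer's code: the class name FoldedNameList and the _lowered attribute are hypothetical, and it assumes names are only ever added, never removed.

```python
class FoldedNameList:
    """Keeps the original names plus a set of lowercased names for fast `in`."""

    def __init__(self, names=()):
        self.names = list(names)
        # Lowercase once up front instead of on every lookup.
        self._lowered = {n.lower() for n in self.names}

    def __contains__(self, name):  # implements `in`
        return name.lower() in self._lowered

    def add(self, name):
        self.names.append(name)
        self._lowered.add(name.lower())

usernames = FoldedNameList(['Michael89', 'jsmith'])
usernames.add('Anna')
print('MICHAEL89' in usernames)  # True
print('anna' in usernames)       # True
print('bob' in usernames)        # False
```

Code outside the class still only uses `in` and add, so it does not need to change.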
Here's one way:
if string1.lower() in string2.lower():
...
For this to work, both string1 and string2 objects must be of type string.
I think you have to write some extra code. For example:
if 'MICHAEL89' in map(lambda name: name.upper(), USERNAMES):
...
In this case we are forming a new list with all entries in USERNAMES converted to upper case and then comparing against this new list.
Update
As @viraptor says, it is even better to use a generator instead of map. See @Nathon's answer.
You could do
matcher = re.compile('MICHAEL89', re.IGNORECASE)
filter(matcher.match, USERNAMES)
Update: played around a bit and am thinking you could get a better short-circuit type approach using
matcher = re.compile('MICHAEL89', re.IGNORECASE)
if any( ifilter( matcher.match, USERNAMES ) ):
#your code here
The ifilter function is from itertools, one of my favorite modules within Python. It's faster than a generator but only creates the next item of the list when called upon.
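The snippets above are Python 2; in Python 3 itertools.ifilter no longer exists and the built-in filter is already lazy. A rough Python 3 sketch follows. Using re.escape and fullmatch is my own choice here, because match only anchors at the start of the string and would also accept longer names; the sample list is made up.

```python
import re

USERNAMES = ['jsmith', 'Michael89', 'michael89_old']  # made-up sample data

# re.escape guards against regex metacharacters in the name;
# fullmatch requires the whole string to match, not just a prefix.
matcher = re.compile(re.escape('MICHAEL89'), re.IGNORECASE)

exact = any(matcher.fullmatch(name) for name in USERNAMES)
prefix_hits = [name for name in USERNAMES if matcher.match(name)]

print(exact)        # True
print(prefix_hits)  # ['Michael89', 'michael89_old']
```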
To have it in one line, this is what I did:
if any(([True if 'MICHAEL89' in username.upper() else False for username in USERNAMES])):
print('username exists in list')
I didn't test it time-wise though. I am not sure how fast/efficient it is.
Example from this tutorial:
list1 = ["Apple", "Lenovo", "HP", "Samsung", "ASUS"]
s = "lenovo"
s_lower = s.lower()
res = s_lower in (string.lower() for string in list1)
print(res)
My 5 (wrong) cents
'a' in "".join(['A']).lower()
UPDATE
Ouch, totally agree @jpp, I'll keep as an example of bad practice :(
I needed this for a dictionary instead of list, Jochen solution was the most elegant for that case so I modded it a bit:
class CaseInsensitiveDict(dict):
    ''' requests special dicts are case insensitive when using the in operator,
    this implements a similar behaviour'''
    def __contains__(self, name): # implements `in`
        return name.casefold() in (n.casefold() for n in self.keys())
Now you can convert a dictionary like so: USERNAMESDICT = CaseInsensitiveDict(USERNAMESDICT), and then use if 'MICHAEL89' in USERNAMESDICT:
Related
Often I use string constants such as:
DICT_KEY1 = 'DICT_KEY1'
DICT_KEY2 = 'DICT_KEY2'
...
Many of those times I don't mind what the actual literals are, as long as they're unique and understandable to the human reader.
This way it is easier to refactor and change a literal across the project.
So my question is, is there a standard way to make these string constants declarations simpler? I don't want to repeat writing the literal 'DICT_KEYn'.
For example, something like this could work:
@string_consts
class DictKeys:
    DICT_KEY1: str
    DICT_KEY2: str
    ...
assert DictKeys.DICT_KEY1 == 'DICT_KEY1'
If it helps, the implementation for your string_consts decorator would be
def string_consts(cls):
    for key, value in cls.__annotations__.items():
        if value is str:
            setattr(cls, key, key)
    return cls
This makes
@string_consts
class DictKeys:
    DICT_KEY1: str
    DICT_KEY2: str
assert DictKeys.DICT_KEY1 == 'DICT_KEY1'
as in your example work OOTB.
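For comparison, the standard library's enum module can also derive each value from the member name. This is an alternative that none of the answers here propose; the base class name AutoName is arbitrary, and the sketch relies on the _generate_next_value_ hook that auto() uses.

```python
from enum import Enum, auto

class AutoName(str, Enum):
    # Hook used by auto() to pick a value; here the value is the member name.
    def _generate_next_value_(name, start, count, last_values):
        return name

class DictKeys(AutoName):
    DICT_KEY1 = auto()
    DICT_KEY2 = auto()

print(DictKeys.DICT_KEY1 == 'DICT_KEY1')  # True
print(DictKeys.DICT_KEY2.value)           # DICT_KEY2
```

Because the members also subclass str, they compare equal to the plain literals, so each name is written only once.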
When at global level, you could do something like this:
for i in range(100):
globals()['DICT_KEY'+str(i)] = 'DICT_KEY'+str(i)
At class level, setattr could be used in the same way:
for i in range(100):
setattr(self, 'DICT_KEY'+str(i), 'DICT_KEY'+str(i))
If your keys are not a numeric list as suggested by your question, you could still do this:
for key in ['KEY', 'OTHERKEY', 'SOMEOTHERKEY']:
globals()[key] = key
In my opinion, this is way easier to understand than using the decorator solution - and does the job.
I want to know if there is a more efficient way than making it ask: is it this? no? okay then is it this? no? okay then is it this? etc. I want it so that I can just say it is this so do that
if this = this:
do this
elif this = that:
do that
elif this = these
do these
elif this = those
do those
I want to be more efficient.
Use a dictionary instead, assuming that this, that, these and those are functions:
def this():
    return "this"
def that():
    return "that"
def these():
    return "these"
def those():
    return "those"
d = {"this": this,
     "that": that,
     "these": these,
     "those": those
     }
this = "that"
r = d.get(this, None)
print(r())
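Note that d.get(this, None) returns None for an unknown key, and calling None raises TypeError. A minimal sketch with a fallback handler, where dispatch, handlers and unknown are hypothetical names:

```python
def this():
    return "this"

def that():
    return "that"

def unknown():
    # Hypothetical fallback, playing the role of a final `else` branch.
    return "unknown"

handlers = {"this": this, "that": that}

def dispatch(key):
    # dict.get returns the fallback function when key is missing,
    # so the call below never hits None.
    return handlers.get(key, unknown)()

print(dispatch("that"))   # that
print(dispatch("other"))  # unknown
```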
You can create functions, store their names as values in a dictionary, with key corresponding to possible values your variable can take. Keys can be integers as well, here I've used string keys.
def mango(quantity):
    print("You selected "+str(quantity)+" mango(es).")
def banana(quantity):
    print("You selected "+str(quantity)+" banana(s).")
def apple():
    print("Here, have an apple")
fruits = {"m":mango, "b":banana} #key->function name
fruit = "m"
quantity = 1 #e.g. of parameters you might want to supply to a function
if fruit in fruits: #with if-else you can mimic 'default' case
    fruits[fruit](quantity)
else:
    apple()
The most efficient option really depends on what it is you're actually going for. Another option here would be ternary operators, which can be chained up
this() if this else that() if that else those() if those else these() if these else None
Depending on your code and use, you might be able to refactor it to use the shorthand ternary operator as well
this or that
...which will do the first thing that evaluates to true, but doesn't leave room for a separate condition. However, you can add a separate condition with
test and this or that
such that test and this both need to evaluate to true or else 'that' is evaluated. If 'this' and 'that' are both truthy expressions, 'test' behaves like your case.
If you'd like, you can also use truthiness to index into a tuple....
(do_if_false, do_if_true)[test]
This one, to me, is less readable and more voodoo, but 'test' effectively evaluates to 0 or 1, returning the expression at that index. However, this will also evaluate all of the expressions, unless you take the extra step of wrapping each one in a lambda and calling only the selected one:
(lambda: do_if_false, lambda: do_if_true)[test]()
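The difference between the eager and the deferred form can be seen with a small sketch. Here do_if_true and do_if_false are stand-in functions that record when they run.

```python
calls = []

def do_if_true():
    calls.append("true branch")
    return "yes"

def do_if_false():
    calls.append("false branch")
    return "no"

test = True

# Eager: both functions run while the tuple is built.
eager = (do_if_false(), do_if_true())[test]
print(eager, calls)  # yes ['false branch', 'true branch']

calls.clear()

# Deferred: the tuple holds the functions themselves; only the selected one is called.
lazy = (do_if_false, do_if_true)[test]()
print(lazy, calls)  # yes ['true branch']
```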
I'd like to see if it's possible to run through a list of functions in a function. The closest thing I could find is looping through an entire module. I only want to use a pre-selected list of functions.
Here's my original problem:
Given a string, check each letter to see if any of the 5 tests fulfill.
If a minimum of 1 letter passes a check, return True.
If all letters in the string fails the check, return False.
For each letter in the string, we will check these functions: isalnum(), isalpha(), isdigit(), islower(), isupper()
The result of each test should print to different lines.
Sample Input
qA2
Sample Output (must print to separate lines, True if at least one letter passes, or false is all letters fail each test):
True
True
True
True
True
I wrote this for one test. Of course I could just write 5 different sets of code but that seems ugly. Then I started wondering if I could just loop through all the tests they're asking for.
Code for just one test:
raw = 'asdfaa3fa'
counter = 0
for i in xrange(len(raw)):
    if raw[i].isdigit() == True: ## This line is where I'd loop in diff func's
        counter = 1
        print True
        break
if counter == 0:
    print False
My fail attempt to run a loop with all the tests:
raw = 'asdfaa3fa'
lst = [raw[i].isalnum(),raw[i].isalpha(),raw[i].isdigit(),raw[i].islower(),raw[i].isupper()]
counter = 0
for f in range(0,5):
for i in xrange(len(raw)):
if lst[f] == True: ## loop through f, which then loops through i
print lst[f]
counter = 1
print True
break
if counter == 0:
print False
So how do I fix this code to fulfill all the rules up there?
Using info from all the comments - this code fulfills the rules stated above, looping through each method dynamically as well.
raw = 'ABC'
functions = [str.isalnum, str.isalpha, str.isdigit, str.islower, str.isupper]
for func in functions:
print any(func(letter) for letter in raw)
getattr approach (I think this is called introspection method?)
raw = 'ABC'
meths = ['isalnum', 'isalpha', 'isdigit', 'islower', 'isupper']
for m in meths:
print any(getattr(c,m)() for c in raw)
List comprehension approach:
from __future__ import print_function ## Changing to Python 3 to use print in list comp
raw = 'ABC'
functions = [str.isalnum, str.isalpha, str.isdigit, str.islower, str.isupper]
solution = [print(func(raw)) for func in functions]
The way you are looping through a list of functions is slightly off. This would be a valid way to do it. The functions you need to store in the list are the generic string functions given by str.funcname. Once you have that list of functions, you can loop through it using a for loop, and call each one just like a normal function!
raw = 'asdfaa3fa'
functions = [str.isalnum, str.isalpha, str.isdigit, str.islower, str.isupper] # list of functions
for fn in functions: # iterate over list of functions, where the current function in the list is referred to as fn
    for ch in raw: # for each character in the string raw
        if fn(ch):
            print(True)
            break
Sample outputs:
Input Output
===================================
"qA2" -----> True True True True True
"asdfaa3fa" -----> True True True True
Also, I notice you seem to use indexing for iteration, which makes me think you might be coming from a language like C/C++. The for ... in loop construct is really powerful in Python, so I would read up on it.
Above is a more pythonic way to do this but just as a learning tool, I wrote a working version that matches how you tried to do it as much as possible to show you where you went wrong specifically. Here it is with comments:
raw = 'asdfaa3fa'
lst = [str.isalnum, str.isalpha, str.isdigit, str.islower, str.isupper] # notice youre treating the functions just like variables and aren't actually calling them. That is, you're writing str.isalpha instead of str.isalpha()
for f in range(0,5):
    counter = 0
    for i in xrange(len(raw)):
        if lst[f](raw[i]) == True: # In your attempt, you were checking if lst[f]==True; lst[f] is a function so you are checking if a function == True. Instead, you need to pass an argument to lst[f](), in this case the ith character of raw, and check whether what that function evaluates to is true
            print lst[f]
            counter = 1
            print True
            break
    if counter == 0:
        print False
Okay, so the first question is easy enough. The simple way to do it is just do
def foo(raw):
    for c in raw:
        if c.isalpha(): return True
        if c.isdigit(): return True
        # the other cases
    return False
Never neglect the simplest thing that could work.
Now, if you want to do it dynamically -- which is the magic keyword you probably needed, you want to apply something like this (cribbed from another question):
meths = ['isalnum', 'isalpha', 'isdigit', 'islower', 'isupper']
for c in raw:
    for m in meths:
        getattr(c, m)()
Warning, this is untested code meant to give you the idea. The key notion here is that the methods of an object are attributes just like anything else, so, for example getattr("a", "isalpha")() does the following:
Uses getattr to search the attributes dictionary of "a" for a method named isalpha
Returns that method itself -- <function isalpha>
then invokes that method using the () which is the function application operator in Python.
See this example:
In [11]: getattr('a', 'isalpha')()
Out[11]: True
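Putting the getattr idea together with the rules from the question (one line per test, True if any character passes) might look like this. The function name any_char_passes is made up for the sketch.

```python
def any_char_passes(raw, method_names):
    """For each method name, report whether any character of raw passes it."""
    return [any(getattr(ch, name)() for ch in raw) for name in method_names]

meths = ['isalnum', 'isalpha', 'isdigit', 'islower', 'isupper']

for result in any_char_passes('qA2', meths):
    print(result)  # True, five times
```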
All the other answers are correct, but since you're a beginner, I want to point out the problem in your code:
lst = [raw[i].isalnum(),raw[i].isalpha(),raw[i].isdigit(),raw[i].islower(),raw[i].isupper()]
First: Not sure which value i currently has in your code snippet, but it seems to point somewhere in the string - which results in single characters being evaluated, not the whole string raw.
Second: When you build your list, you are already calling the methods you want to insert, which has the effect that not the functions themselves get inserted, but their return values (that's why you're seeing all those True values in your print statement).
Try changing your code as follows:
lst = [raw.isalnum, raw.isalpha, raw.isdigit, raw.islower, raw.isupper]
I'm going to guess that you're validating password complexity, and I'm also going to say that software which takes an input and says "False" and there's no indication why is user-hostile, so the most important thing is not "how to loop over nested char function code wizardry (*)" but "give good feedback", and suggest something more like:
raw = 'asdfaa3fa'
import re
def validate_password(password):
    """ This function takes a password string, and validates it
    against the complexity requirements from {wherever}
    and returns True if it's complex enough, otherwise False """
    if not re.search(r'\d', password):
        print("Error: password needs to include at least one number")
        return False
    elif not re.search('[a-z]', password):
        print("Error: password must include at least one lowercase letter")
        return False
    elif not re.search('[A-Z]', password):
        print("Error: password must include at least one uppercase letter")
        return False
    print("Password is OK")
    return True
validate_password(raw)
And the regex searching checks ranges of characters and digits in one call, which is neater than a loop over characters.
(PS. your functions overlap; a string which has characters matching 'isupper', 'islower' and 'isdigit' already has 'isalpha' and 'isalnum' covered. More interesting would be to handle characters like ! which are not upper, lower, digits or alnum.)
(*) function wizardry like the other answers is normally exactly what I would answer, but there's so much of that already answered that I may as well answer the other way instead :P
To answer the original question:
raw = 'asdfa3fa'
functions = [str.isalnum, str.isalpha, str.isdigit, str.islower, str.isupper]
isanything = [func(raw) for func in functions]
print repr(isanything)
Since you are looping through a list of simple items and checking whether every function returns True for at least one character, you can simply define the list of functions you want to call on the input and return the combined result. Here is a rather pythonic example of what you are trying to achieve:
def checker(checks, value):
return all(any(check(r) for r in value) for check in checks)
Test it out:
>>> def checker(checks, value):
... return all(any(check(r) for r in value) for check in checks)
...
>>> checks = [str.isalnum, str.isalpha, str.isdigit, str.islower, str.isupper]
>>> checker(checks, 'abcdef123ABC')
True
>>> checker(checks, 'abcdef123')
False
>>>
You can use introspection to loop through all of an object's attributes, whether they be functions or some other type.
However you probably don't want to do that here, because str has lots of function attributes, and you're only interested in five of them. It's probably better to do as you did and just make a list of the five you want.
Also, you don't need to loop over each character of the string if you don't want to; those functions already look at the whole string.
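Keep in mind that calling the method on the whole string answers a different question from the per-character check in the original problem. A short sketch of the difference, using the sample input from the question:

```python
raw = 'qA2'

whole = raw.isalpha()                       # every character must be a letter
any_char = any(ch.isalpha() for ch in raw)  # one letter is enough

print(whole)     # False: '2' is not a letter
print(any_char)  # True: 'q' and 'A' are letters
```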
Check out this one-line solution for your problem. That problem is from HackerRank. I loop through a list of functions using the built-in getattr function.
s='qA2'
[print(bool(list(filter(lambda x : getattr(x, func)(),s)))) for func in ['isalnum','isalpha','isdigit','islower','isupper']]
I have a bunch of strings that are of the form:
'foo.bar.baz.spam.spam.spam...etc'
In all likelihood they have three or more multi-letter substrings separated by .'s. There might be ill formed strings with less than two .'s, and I want the original string in that case.
The first thing that comes to mind is the str.partition method, which I would use if I were after everything after the first .:
'foo.bar.baz.boink.a.b.c'.partition('.')[2]
returns
'bar.baz.boink.a.b.c'
This could be repeated:
def secondpartition(s):
return s.partition('.')[2].partition('.')[2] or s
But is this efficient? It doesn't seem efficient to call a method twice and use a subscript twice. It is certainly inelegant. Is there a better way?
The main question is:
How do you drop everything from the beginning up to the second instance of the . character, so that 'foo.bar.baz.spam.spam.spam' becomes 'baz.spam.spam.spam'? What would be the best/most efficient way to do that?
Using str.split with maxsplit argument:
>>> 'foo.bar.baz.spam.spam.spam'.split('.', 2)[-1]
'baz.spam.spam.spam'
UPDATE
To handle string with less than two .s:
def secondpartition(s):
    parts = s.split('.', 2)
    if len(parts) <= 2:
        return s
    return parts[-1]
Summary: This is the most performant approach (generalized to n characters):
def maxsplittwoexcept(s, n, c):
    '''
    given string s, return the string after the nth character c
    if less than n c's, return the whole string s.
    '''
    try:
        return s.split(c, n)[n]
    except IndexError:
        return s
but I show other approaches for comparison.
There are various ways of doing this with string methods and regular expressions. I'll ensure you can follow along with an interpreter by being able to cut and paste everything in order.
First imports:
import re
import timeit
from itertools import islice
Different approaches: string methods
The way mentioned in the question is to partition twice, but I discounted it because it seems rather inelegant and unnecessarily repetitive:
def secondpartition(s):
return s.partition('.')[2].partition('.')[2] or s
The second way that came to mind to do this is to split on the .'s, slice from the second on, and join with .'s. This struck me as fairly elegant and I assumed it would be rather efficient.
def splitslicejoin(s):
return '.'.join(s.split('.')[2:]) or s
But slices create an unnecessary extra list. However, islice from the itertools module provides an iterable that doesn't! So I expected this to do even better:
def splitislicejoin(s):
return '.'.join(islice(s.split('.'), 2, None)) or s
Different approaches: regular expressions
Now regular expressions. First way that came to mind with regular expressions was to find and substitute with an empty string up to the second ..
dot2 = re.compile(r'.*?\..*?\.')
def redot2(s):
    # count=1: only strip through the second dot, not every later pair of dots
    return dot2.sub('', s, count=1)
But it occurred to me that it might be better to use a non-capturing group, and return a match on the end:
dot2match = re.compile(r'(?:.*?\..*?\.)(.*)')
def redot2match(s):
    match = dot2match.match(s)
    if match is not None:
        return match.group(1)
    else:
        return s
Finally, I could use a regular expression search to find the end of the second . and then use that index to slice the string, which would use a lot more code, but might still be fast and memory efficient.
dot = re.compile(r'\.')
def find2nddot(s):
    for i, found_dot in enumerate(dot.finditer(s)):
        if i == 1:
            return s[found_dot.end():] or s
    return s
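The same find-the-index-then-slice idea can also be sketched without regular expressions, using str.find. This variant is not one of the functions timed below, and the name after_nth_find and its parameters are hypothetical.

```python
def after_nth_find(s, n=2, sep='.'):
    """Return the part of s after the nth sep, or s itself if there are fewer."""
    start = 0
    for _ in range(n):
        pos = s.find(sep, start)
        if pos == -1:
            return s  # fewer than n separators: keep the original string
        start = pos + len(sep)
    # Like find2nddot, fall back to s when nothing follows the nth separator.
    return s[start:] or s

print(after_nth_find('foo.bar.baz.spam.spam.spam'))  # baz.spam.spam.spam
print(after_nth_find('foo.baz'))                     # foo.baz
```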
Update: falsetru suggests str.split's maxsplit argument, which had completely slipped my mind. My thoughts are that it may be the most straightforward approach, but the assignment and extra checking might hurt it.
def maxsplittwo(s):
    parts = s.split('.', 2)
    if len(parts) <= 2:
        return s
    return parts[-1]
And JonClements suggests using an except referencing https://stackoverflow.com/a/27989577/541136 which would look like this:
def maxsplittwoexcept(s):
    try:
        return s.split('.', 2)[2]
    except IndexError:
        return s
which would be totally appropriate since not having enough .s would be exceptional.
Testing
Now let's test our functions. First, let's assert that they actually work (not a best practice in production code, which should use unittests, but useful for fast validation on StackOverflow):
functions = ('secondpartition', 'redot2match', 'redot2',
             'splitslicejoin', 'splitislicejoin', 'find2nddot',
             'maxsplittwo', 'maxsplittwoexcept')
for function in functions:
    assert globals()[function]('foo.baz') == 'foo.baz'
    assert globals()[function]('foo.baz.bar') == 'bar'
    assert globals()[function]('foo.baz.bar.boink') == 'bar.boink'
The asserts don't raise AssertionErrors, so now let's time them to see how they perform:
Performance
setup = 'from __main__ import ' + ', '.join(functions)
perfs = {}
for func in functions:
    perfs[func] = min(timeit.repeat(func + '("foo.bar.baz.a.b.c")', setup))
for func in sorted(perfs, key=lambda x: perfs[x]):
    print('{0}: {1}'.format(func, perfs[func]))
Results
Update Best performer is falsetru's maxsplittwo, which slightly edges out the secondpartition function. Congratulations to falsetru. It makes sense since it is a very direct approach. And JonClements's modification is even better...
maxsplittwoexcept: 1.01329493523
maxsplittwo: 1.08345508575
secondpartition: 1.1336209774
splitslicejoin: 1.49500417709
redot2match: 2.22423219681
splitislicejoin: 3.4605550766
find2nddot: 3.77172589302
redot2: 4.69134306908
Older run and analysis without falsetru's maxsplittwo and JonClements' maxsplittwoexcept:
secondpartition: 0.636116637553
splitslicejoin: 1.05499717616
redot2match: 1.10188927335
redot2: 1.6313087087
find2nddot: 1.65386564664
splitislicejoin: 3.13693511439
It turns out that the most performant approach is to partition twice, even though my intuition didn't like it.
Also, it turns out my intuition on using islice was wrong in this case, it is much less performant, and so the extra list from the regular slice is probably worth the tradeoff if faced with a similar bit of code.
Of the regular expressions, the match approach for my desired string is the best performer here, nearly tied with splitslicejoin.
I've got this block of code in a real Django function. If certain conditions are met, items are added to the list.
ret = []
if self.taken():
    ret.append('taken')
if self.suggested():
    ret.append('suggested')
#.... many more conditions and appends...
return ret
It's very functional. You know what it does, and that's great...
But I've learned to appreciate the beauty of list and dict comprehensions.
Is there a more Pythonic way of phrasing this construct, perhaps that initialises and populates the array in one blow?
Create a mapping dictionary:
self.map_dict = {'taken': self.taken,
'suggested': self.suggested,
'foo' : self.bar}
[x for x in ['taken', 'suggested', 'foo'] if self.map_dict.get(x, lambda:False)()]
Related: Most efficient way of making an if-elif-elif-else statement when the else is done the most?
Not a big improvement, but I'll mention it:
def populate():
    if self.taken():
        yield 'taken'
    if self.suggested():
        yield 'suggested'
ret = list(populate())
Can we do better? I'm skeptical. Clearly there's a need of using another syntax than a list literal, because we no longer have the "1 expression = 1 element in result" invariant.
Edit:
There's a pattern to our data, and it's a list of (condition, value) pairs. We might try to exploit it using:
[value
for condition, value
in [(self.taken(), 'taken'),
(self.suggested(), 'suggested')]
if condition]
but this still is a restriction for how you describe your logic, still has the nasty side effect of evaluating all values no matter the condition (unless you throw in a ton of lambdas), and I can't really see it as an improvement over what we've started with.
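For what it's worth, the "ton of lambdas" variant could be sketched as below, so that a value is only computed when its condition holds. The names collect and expensive are made up for illustration.

```python
def collect(pairs):
    """pairs: iterable of (check, make_value) callables.
    make_value() runs only for pairs whose check() is truthy."""
    return [make_value() for check, make_value in pairs if check()]

built = []

def expensive(label):
    built.append(label)  # record which values were actually computed
    return label

result = collect([
    (lambda: True,  lambda: expensive('taken')),
    (lambda: False, lambda: expensive('suggested')),
])
print(result)  # ['taken']
print(built)   # ['taken']
```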
For this very specific example, I could do:
return [x for x in ['taken', 'suggested', ...] if getattr(self, x)()]
But again, this only works where the item and method it calls to check have the same name, ie for my exact code. It could be adapted but it's a bit crusty. I'm very open to other solutions!
I don't know why we are appending strings that match the function names, but if this is a general pattern, we can use that. Functions have a __name__ attribute and I think it always contains what you want in the list.
So how about:
return [fn.__name__ for fn in (self.taken, self.suggested, foo, bar, baz) if fn()]
If I understand the problem correctly, this works just as well for non-member functions as for member functions.
EDIT:
Okay, let's add a mapping dictionary. And split out the function names into a tuple or list.
fns_to_check = (self.taken, self.suggested, foo, bar, baz)
# This holds only the exceptions; if a function isn't in here,
# we will use the .__name__ attribute.
fn_name_map = {foo:'alternate', bar:'other'}
def fn_name(fn):
    """Return name from exceptions map, or .__name__ if not in map"""
    return fn_name_map.get(fn, fn.__name__)
return [fn_name(fn) for fn in fns_to_check if fn()]
You could also just use @hcwhsa's mapping dictionary answer. The main difference here is I'm suggesting just mapping the exceptions.
In another instance (where a value will be defined but might be None - a Django model's fields in my case), I've found that just adding them and filtering works:
return filter(None, [self.user, self.partner])
If either of those is None, they'll be removed from the list. It's a little more intensive than just checking, but still a fairly easy way of cleaning the output without writing a book.
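Two details are worth checking before relying on this: in Python 3 filter returns a lazy iterator rather than a list, and filter(None, ...) drops every falsy item, not only None. A small sketch with made-up sample values:

```python
values = ['alice', None, 0, '', 'bob']  # made-up sample data

# filter(None, ...) keeps only truthy items, so 0 and '' are dropped too.
truthy_only = list(filter(None, values))

# An explicit test keeps everything except None.
not_none = [v for v in values if v is not None]

print(truthy_only)  # ['alice', 'bob']
print(not_none)     # ['alice', 0, '', 'bob']
```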
One option is to have a "sentinel"-style object to take the place of list entries that fail the corresponding condition. Then a function can be defined to filter out the missing items:
# sentinel indicating a list element that should be skipped
Skip = object()
def drop_missing(itr):
    """returns an iterator yielding all but Skip objects from the given itr"""
    return filter(lambda v: v is not Skip, itr)
With this simple machinery, we come reasonably close to list-comprehension style syntax:
return drop_missing([
    'taken' if self.taken() else Skip,
    'suggested' if self.suggested() else Skip,
    100 if self.full else Skip,
    # many other values and conditions
])
ret = [
*('taken' for _i in range(1) if self.taken()),
*('suggested' for _i in range(1) if self.suggested()),
]
The idea is to use the list comprehension syntax to construct either a single element list with item 'taken', if self.taken() is True, or an empty list, if self.taken() is False, and then unpack it.
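A runnable sketch of the same trick, with plain booleans standing in for self.taken() and self.suggested(); the function name labels is hypothetical.

```python
def labels(taken, suggested):
    # Each starred generator yields its label once if the condition holds,
    # and nothing otherwise; * splices the result into the list.
    return [
        *('taken' for _i in range(1) if taken),
        *('suggested' for _i in range(1) if suggested),
    ]

print(labels(True, False))   # ['taken']
print(labels(False, False))  # []
```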