What is the most pythonic and/or efficient way to count the number of characters in a string that are lowercase?
Here's the first thing that came to mind:
def n_lower_chars(string):
    return sum([int(c.islower()) for c in string])
Clever trick of yours! However, I find it more readable to filter the lower chars, adding 1 for each one.
def n_lower_chars(string):
    return sum(1 for c in string if c.islower())
Also, we do not need to create a new list for that, so removing the [] will make sum() work over an iterator, which consumes less memory.
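To make the memory point concrete, here is a small sketch (the sample text and variable names are made up for illustration) that compares the size of the intermediate list with the size of the equivalent generator object. Both feed sum() the same values, so the counts agree.

```python
import sys

# Hypothetical sample input: 11,000 characters, 8,000 of them lowercase.
text = "Hello World" * 1000

as_list = [1 for c in text if c.islower()]   # materializes every element
as_gen = (1 for c in text if c.islower())    # produces elements on demand

list_size = sys.getsizeof(as_list)  # grows with the number of matches
gen_size = sys.getsizeof(as_gen)    # small and independent of the input length

count_from_list = sum(as_list)
count_from_gen = sum(as_gen)
```

The exact byte counts depend on the interpreter, so none are quoted here; the point is only that the generator object stays small however long the input is.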
def n_lower_chars(string):
    # list() is needed on Python 3, where filter() returns a lazy iterator
    return len(list(filter(str.islower, string)))
def n_lower_chars(string):
    return sum(map(str.islower, string))
If you want to divide things a little more finely:
from collections import Counter
text = "ABC abc 123"
print(Counter("lower" if c.islower() else
              "upper" if c.isupper() else
              "neither" for c in text))
Newbie here...Trying to write a function that takes a string and replaces all the characters with their respective dictionary values.
Here is what I have:
def alphabet_position(text):
    dict = {'a':'1','b':'2','c':'3','d':'4','e':'5','f':'6','g':'7','h':'8','i':'9','j':'10','k':'11','l':'12','m':'13','n':'14','o':'15','p':'16','q':'17','r':'18','s':'19','t':'20','u':'21','v':'22','w':'23','x':'24','y':'25','z':'26'}
    text = text.lower()
    for i in text:
        if i in dict:
            new_text = text.replace(i, dict[i])
    print (new_text)
But when I run:
alphabet_position("The sunset sets at twelve o' clock.")
I get:
the sunset sets at twelve o' cloc11.
meaning it only changes the last character in the string. Any ideas? Any input is greatly appreciated.
Following your logic, you need to create a new_text string and then iteratively replace its letters. Your code replaces one letter at a time but starts from scratch with the original string on every iteration, so only the last replacement survives:
def alphabet_position(text):
    dict = {'a':'1','b':'2','c':'3','d':'4','e':'5','f':'6','g':'7','h':'8','i':'9','j':'10','k':'11','l':'12','m':'13','n':'14','o':'15','p':'16','q':'17','r':'18','s':'19','t':'20','u':'21','v':'22','w':'23','x':'24','y':'25','z':'26'}
    new_text = text.lower()
    for i in new_text:
        if i in dict:
            new_text = new_text.replace(i, dict[i])
    print (new_text)
And as suggested by Kevin, you can optimize a bit using set. (adding his comment here since he deleted it: for i in set(new_text):) Note that this might be beneficial only for large inputs though...
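For reference, the set-based variant might look like the sketch below. It is not the answerer's code: the mapping is built programmatically instead of being typed out, the positions name is made up here, and the function returns the result rather than printing it so that it is easier to check.

```python
import string

def alphabet_position(text):
    # Same letter-to-position mapping as the hand-written dict, built in one line.
    positions = {letter: str(i) for i, letter in enumerate(string.ascii_lowercase, 1)}
    new_text = text.lower()
    # Iterate over the distinct characters only, so each letter
    # triggers at most one replace() call.
    for ch in set(new_text):
        if ch in positions:
            new_text = new_text.replace(ch, positions[ch])
    return new_text
```

The iteration order of a set is arbitrary, but that does not matter here: replacements only ever introduce digits, and digits are never keys in the mapping.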
As your question is generally asking about "Alphabet position in python", I thought I could complement the already accepted answer with a different approach. You can take advantage of Python's string lib, char to int conversion and list comprehension to do the following:
import string
def alphabet_position(text):
    alphabet = string.ascii_lowercase
    return ''.join([str(ord(char)-96) if char in alphabet else char for char in text])
Your approach is not very efficient. You are recreating the string for every character.
There are 5 e characters in your string. This means replace is called 5 times, even though it only actually needs to do anything the first time.
There is another approach that might be more efficient. We can't use Python 2's str.translate, unfortunately, as its remit is one-to-one replacements.
We just iterate the input and produce a new string character by character.
def alphabet_position2(text):
    d = {L: str(i) for i, L in enumerate('abcdefghijklmnopqrstuvwxyz', 1)}
    result = ''
    for t in text.lower():
        result += d.get(t, t)
    return result
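As an aside, the one-to-one restriction applies to Python 2's str.translate. On Python 3, a translation table may map a character to a replacement string of any length, so a translate-based version is possible. The following is a sketch under that assumption; the names _POSITIONS and alphabet_position_translate are invented for this example.

```python
import string

# Map each lowercase letter to its 1-based position as a string.
# str.maketrans accepts a dict whose values may be strings of any length.
_POSITIONS = str.maketrans(
    {letter: str(i) for i, letter in enumerate(string.ascii_lowercase, 1)}
)

def alphabet_position_translate(text):
    # Characters without an entry in the table are left untouched.
    return text.lower().translate(_POSITIONS)
```

The table is built once at import time, so each call is a single pass over the input.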
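This is a pretty simple approach using comprehensions.
Build the letter-to-position mapping from the string module, swapping the (index, letter) pairs that enumerate yields so that the letters become the keys (b: 2 instead of 2: b).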
import string
def alphabet_position(text):
    alphabeths = {v: k for k, v in enumerate(string.ascii_lowercase, start=1)}
    return " ".join(str(alphabeths.get(char)) for char in text.lower() if char in alphabeths.keys())
I am trying to make a Python program check whether a string contains at least one letter.
import string
s = ('hello')
if string.ascii_lowercase in s:
    print('yes')
else:
    print('no')
It always just prints no
Well, string.ascii_lowercase is equal to 'abcdefghijklmnopqrstuvwxyz'. That doesn't look like it's contained in hello, right?
What you should do instead is to go over the letters in ascii_lowercase and check if any of them are in your string s.
import string
s = ('hello')
if any([letter in s for letter in string.ascii_lowercase]):
    print('yes')
else:
    print('no')
Wonderfully smart people in the comments have pointed out that you can drop the [ ] brackets that would usually create a list, turning our list comprehension into something called a generator expression. any() can then stop as soon as it finds a match instead of needing every single letter in ascii_lowercase checked first, which makes our code a little bit faster; as it stands, the whole list is generated and then checked. With the generator expression, the letters are checked only up to e, as that's in 'hello'.
I was able to shave off a whole nanosecond this way! Still, straight up going through the whole list should be fine as well for most cases and is certainly simpler.
An efficient way to check if some string s contains any character from some alphabet:
import string
alphabet = frozenset(string.ascii_lowercase)
any(letter in alphabet for letter in s)
Key points:
Avoid linear search by storing the alphabet in a set rather than a general iterable that lacks fast (O(1)) membership checks.
Loop over the input, not the target alphabet: the alphabet is probably a small set of constant size, so this handles even very large inputs efficiently, without linear searching or the extra memory of putting the input in a set.
Avoid unnecessary list creation (and wasted memory) by using a generator expression.
Here are some inferior alternatives.
Linear search over string.ascii_lowercase:
any(letter in string.ascii_lowercase for letter in s)
Linear search over string.ascii_lowercase, and a useless list creation:
any([letter in string.ascii_lowercase for letter in s])
Linear search over the input, very poor performance in the worst case when the input is very long and does not contain any character from the alphabet:
any(letter in s for letter in string.ascii_lowercase)
Currently you are checking whether the whole string string.ascii_lowercase is in s.
You have to check every single character of string.ascii_lowercase instead.
The naive solution would look like this:
>>> import string
>>> s = 'hello'
>>> for letter in string.ascii_lowercase:
...     if letter in s:
...         print('yes')
...         break
... else:
...     print('no')
...
yes
Here, the else block will only execute if the loop was not broken by the break statement.
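Since for/else is easy to misread, here is a minimal sketch of the same construct wrapped in a function. The function name first_lowercase and its return convention are made up for this illustration.

```python
def first_lowercase(s):
    for ch in s:
        if ch.islower():
            found = ch
            break          # leaving via break skips the else block
    else:
        # Reached only when the loop ran to completion (or s was empty).
        found = None
    return found
```

For 'HELLo' the loop breaks on 'o', so the else block is skipped; for 'ABC' or an empty string the loop finishes normally and the else block sets the result to None.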
A shorthand for the for loop would be to use the any builtin paired with a generator-expression:
>>> contained = any(letter in s for letter in string.ascii_lowercase)
>>> print('yes' if contained else 'no')
yes
Finally, you can improve the runtime of both implementations by using the set of characters from s, i.e. s = set(s). This will ensure that every in check is performed in constant time rather than iterating over s for every letter that is searched.
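A sketch of that last suggestion, applied to the any() version (the function name contains_lowercase is hypothetical):

```python
import string

def contains_lowercase(s):
    chars = set(s)  # built once; each membership test below is O(1) on average
    return any(letter in chars for letter in string.ascii_lowercase)
```

Building the set costs one pass over s, after which at most 26 constant-time lookups are needed.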
edit: Here's another short one:
>>> if set(s).intersection(string.ascii_lowercase):
...     print('yes')
... else:
...     print('no')
...
yes
This uses the fact that an empty set (the possible result of the intersection) will be treated as False in the if check.
(It has the slight drawback that the computation of the intersection does not stop once a single shared letter is found.)
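If that drawback matters, one possible alternative is set.isdisjoint, which answers the yes/no question without building the intersection. This is a sketch, not part of the original answer; the names LOWERCASE and has_lowercase_ascii are made up, and the early exit on the first shared character is, as far as I know, CPython behaviour rather than a documented guarantee.

```python
import string

LOWERCASE = frozenset(string.ascii_lowercase)

def has_lowercase_ascii(s):
    # isdisjoint accepts any iterable, so s is scanned directly
    # and no second set is created.
    return not LOWERCASE.isdisjoint(s)
```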
Make a set of each string and check the size of their intersection
def share_letter(s1, s2):
    return bool(set(s1).intersection(s2))
string.ascii_lowercase is a string that contains all the lowercase letters, i.e. abcdefghijklmnopqrstuvwxyz.
So, in the if condition, if string.ascii_lowercase in s you are checking if the string contains a substring abcdefghijklmnopqrstuvwxyz.
You can try this,
if any(e in string.ascii_lowercase for e in s):
    ...
The expression inside any is a generator expression, so it stops checking at the first match.
Another way to do this is,
if any(e.islower() for e in s):
    ...
This is another option:
import string
s = ('hello')
alpha = string.ascii_lowercase
if any(i in alpha for i in s):
    print('yes')
else:
    print('no')
Or maybe quicker:
import string
s = ('hello')
alpha = string.ascii_lowercase
for l in s:
    if l in alpha:
        print("yes")
        break
else:
    # runs only if the loop finished without hitting break
    print("no")
Using Python 3.
I have a string such as 128kb/s, 5mb/s, or something as simple as 42!. There's no space between the numeric characters and its postfix, so I can't just invoke int(text) directly.
And I just want to capture the values 128, 5, and 42 as integers.
At the moment, I just wrote a helper function that accumulates all the numbers into a string and breaks on the first non-numeric character.
def read_int_from_string(text):
    s = ""
    val = 0
    for c in text:
        if (c >= '0') and (c <= '9'):
            s += c
        else:
            break
    if s:
        val = int(s)
    return val
The above works fine, but is there a more pythonic way to do this?
This is one of those scenarios where a regex seems reasonable:
import re
leadingdigits = re.compile(r'^\d+')
def read_int_from_string(text):
    return int(leadingdigits.match(text).group(0))
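One difference from the helper in the question: when the text has no leading digits, match() returns None and the call above raises AttributeError, whereas the original loop returned 0. A sketch that keeps the fallback might look like this (the default parameter and the _LEADING_DIGITS name are additions made for illustration):

```python
import re

# match() only matches at the start of the string, so no ^ anchor is needed.
_LEADING_DIGITS = re.compile(r'\d+')

def read_int_from_string(text, default=0):
    m = _LEADING_DIGITS.match(text)
    # Fall back to the default when the text does not start with a digit.
    return int(m.group(0)) if m else default
```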
If you hate regex, you can do this to basically push your original loop's logic to the C layer, though it's likely to be slower:
from itertools import takewhile
def read_int_from_string(text):
    return int(''.join(takewhile(str.isdigit, text)))
You can use str.isdigit. How about this one (Python 2)?
>>> int(filter(str.isdigit, '128kb/s'))
128
On Python 3, filter returns an iterator rather than a string, so join the result first:
int(''.join(filter(str.isdigit, '128kb/s')))
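Note that filter keeps every digit in the string, not just the leading run, so the two approaches differ as soon as digits reappear after the suffix. A small sketch of the difference (the function names and the sample input '12abc34' are invented for this comparison):

```python
from itertools import takewhile

def all_digits(text):
    # Collects every digit anywhere in the text.
    return int(''.join(filter(str.isdigit, text)))

def leading_digits(text):
    # Stops at the first non-digit character.
    return int(''.join(takewhile(str.isdigit, text)))
```

For '128kb/s' both return 128, but for '12abc34' the first returns 1234 and the second returns 12. Both raise ValueError when there are no digits to join.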
Is it possible without regex in python to print the first n integers from a string containing both integers and characters?
For instance:
string1 = 'test120202test34234e23424'
string2 = 'ex120202test34234e23424'
foo(string1,6) => 120202
foo(string2,6) => 120202
Anything's possible without a regex. Most things are preferable without a regex.
One easy way is:
>>> str = 'test120202test34234e23424'
>>> str2 = 'ex120202test34234e23424'
>>> ''.join(c for c in str if c.isdigit())[:6]
'120202'
>>> ''.join(c for c in str2 if c.isdigit())[:6]
'120202'
You might want to handle your corner cases some specific way -- it all depends on what you know your code should do.
>>> str3 = "hello 4 world"
>>> ''.join(c for c in str3 if c.isdigit())[:6]
'4'
And don't name your strings str!
You can remove all the letters from your string with str.translate and then slice up to the number of digits you want, like this:
import string
def foo(input_string, num):
    return input_string.translate(None, string.letters)[:num]
print foo('test120202test34234e23424', 6) # 120202
print foo('ex120202test34234e23424', 6) # 120202
Note: This simple technique works only in Python 2.x
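A rough Python 3 counterpart, for comparison: there, str.translate takes a table built with str.maketrans, whose third argument lists the characters to delete. This sketch uses string.ascii_letters, which covers ASCII letters only (an assumption that fits the sample inputs), and the _DELETE_LETTERS name is made up.

```python
import string

# Third argument of str.maketrans: characters mapped to None, i.e. removed.
_DELETE_LETTERS = str.maketrans("", "", string.ascii_letters)

def foo(input_string, num):
    return input_string.translate(_DELETE_LETTERS)[:num]
```

Like the Python 2 version, this only strips letters, so spaces or punctuation in the input would survive into the result.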
But the most efficient way is to use itertools.islice:
from itertools import islice
def foo(input_string, num):
    return "".join(islice((char for char in input_string if char.isdigit()), num))
This is the most efficient way because it doesn't have to process the entire string before returning the result.
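One way to see the early exit is to feed the same pipeline an endless stream of characters: it still returns, because islice stops pulling from the generator after num digits. The first_digits name and the cycle-based input are made up for this sketch.

```python
from itertools import cycle, islice

def first_digits(chars, num):
    # Lazily filter digits; nothing is consumed until islice asks for it.
    digits = (c for c in chars if c.isdigit())
    return "".join(islice(digits, num))

# cycle() repeats 'a', 'b', '1', '2' forever, so eager filtering would never finish.
endless = cycle("ab12")
result = first_digits(endless, 6)
```

Here result is '121212', obtained after reading only the first twelve characters of the endless stream.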
If you don't want to process the whole string (not a problem with the length of the strings you give as examples), you could try:
import itertools
"".join(itertools.islice((c for c in str2 if c.isdigit()), 0, 6))
Hello I am fairly new at programming,
I would like to know whether there is a function or method that tells us how many letters have been changed in a string.
example:
input:
"Cold"
output:
"Hold"
Hence only 1 letter was changed
or the example:
input:
"Deer"
output:
"Dial"
Hence 3 letters were changed
I spoke too soon. First result googling:
https://pypi.python.org/pypi/python-Levenshtein/
This should be able to measure the minimum number of changes needed to get from one string to another.
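To show what that measurement involves without installing anything, here is a pure-Python sketch of the classic dynamic-programming edit distance. It is not the linked package's implementation, and the function name edit_distance is made up; it only illustrates the quantity being computed.

```python
def edit_distance(a, b):
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]
```

With the examples from the question, edit_distance("Cold", "Hold") gives 1 and edit_distance("Deer", "Dial") gives 3.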
If you don't need to consider character insertions or deletions, the problem is reduced to simply counting the number of characters that are different between the strings.
Since you're new to programming, an imperative-style program would be:
def differences(string1,string2):
    different=0
    for i in range(len(string1)):
        if string1[i]!=string2[i]:
            different= different+1
    return different
Something slightly more pythonic would be:
def differences(string1,string2):
    different=0
    for a,b in zip(string1,string2):
        if a!=b:
            different+= 1
    return different
or, if you want to go fully functional:
def differences(string1,string2):
    # tuple-unpacking lambdas such as lambda (x,y): ... are Python 2 only
    return sum(map(lambda pair: pair[0] != pair[1], zip(string1,string2)))
which, as @DSM suggested, is equivalent to the more readable generator expression:
def differences(string1,string2):
    return sum(x != y for x,y in zip(string1, string2))
Use the itertools library as follows (Python 3.x)
from itertools import zip_longest
def change_count(string1, string2):
    count = 0
    # zip_longest pads the shorter string with None, so extra characters count as changes
    for char1, char2 in zip_longest(string1, string2):
        if char1 != char2:
            count = count + 1
    return count
string1 = input("Enter one string: ")
string2 = input("Enter another string: ")
changed = change_count(string1, string2)
print("Times changed: ", changed)
Check out the difflib library, particularly the ndiff function. Note: this is kind of overkill for the required job, but it is really great for seeing the differences between two files (you can see which lines are new, which are changed, etc.).
import difflib
word1 = "Cold"
word2 = "Waldo"
i = 0
differences = difflib.ndiff(word1, word2)
for line in differences:
    # each ndiff line starts with a marker; a space means the character is unchanged
    if line[0] != " ":
        i += 1
print(i)