What is the lexicographically smallest string in Python? - python

What is the lexicographically smallest string in Python? In other words, what is the string x such that x < y is always True (where y is a string)?
For example, in C++ the empty string is the smallest one (see Smallest lexicographical value of a string). Can anyone confirm that the same answer holds for Python?
This is what I've tried so far:
import string
for x in string.printable:
    assert "" < x

With x as the empty string, x < y holds for every non-empty string y in Python (the only exception is y == "", since a string is never less than itself).
We can confirm this:
>>> all('' < x for x in string.printable)
True
all() returns True if all elements of the iterable are true (or if the iterable is empty). So the empty string compares less than (<) every character in string.printable.
This is true for non-printable characters as well.
Code points range from 0 to 1,114,111 (0x10FFFF in base 16) (thanks to @AlexHall in the comments).
>>> all('' < chr(i) for i in range(0x110000))
True
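As a small sanity check (a sketch, not a proof; the sample list below is made up for illustration), the empty string sorts before every other string, and the only string it is not strictly less than is itself:

```python
# Hypothetical sample values chosen for illustration, including the
# lowest and highest code points.
samples = ["", "\x00", " ", "0", "A", "a", "\U0010ffff"]

# min() uses the same ordering as the < operator.
smallest = min(samples)

# The empty string is strictly below every non-empty sample.
strictly_below_all_nonempty = all("" < s for s in samples if s)

# A string is never strictly less than itself.
less_than_itself = "" < ""

print(repr(smallest))
print(strictly_below_all_nonempty)
print(less_than_itself)
```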

"" is the smallest string you can get, since its length is 0 (that's also the string returned by str())
However, I did not find anything in the documentation to explicitly confirm that...
https://docs.python.org/3.7/library/stdtypes.html#comparisons
https://docs.python.org/3.7/reference/datamodel.html#object.__lt__

Related

How to differentiate between a number and a letter or sign?

If I have a digit within a string I can just do:
x = "2"
x.isdigit()
and I get True. But when I do this:
isinstance(x, str)
By my understanding this also results in True.
My question is now how can I tell if it is a character or a number?
Use isalpha() for this:
x = "2"
x.isalpha()
Returns False
The isdigit method checks every character in the string and returns True only if the string is non-empty and every character is a digit,
while isinstance primarily checks the data type of the value you pass.
x = '2'
isinstance(x, int)
This returns False. Since x itself is a string, isinstance(x, str) returns True.
So, to find out whether a string holds a number or letters, just use x.isdigit(); it returns True if every character is a digit and False otherwise.
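Putting the two checks together, a minimal sketch could look like this (the function name classify and its return labels are made up for illustration):

```python
def classify(text):
    """Label a string as 'digits', 'letters' or 'other'.

    classify and its labels are hypothetical names for this sketch;
    only str.isdigit() and str.isalpha() come from Python itself.
    """
    if text.isdigit():
        return "digits"
    if text.isalpha():
        return "letters"
    # Empty strings, signs, and mixed content all end up here.
    return "other"


print(classify("2"))
print(classify("a"))
print(classify("+"))
```

Note that both methods return False for an empty string, so "" falls into the last branch.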

What's the easiest way to validate a string is however many digits it needs to be and is all integers?

I have to validate that the string is either 4 or 6 digits. The string cannot contain any characters, only integers. Return true if it meets the condition else false.
I tried to create a list with acceptable digits and loop through the string and compare. If any part of the string is not in the acceptable list I will exit the loop and return false. If the running total is equal to 4 or 6 then it should be true.
python code:
def validate(pin):
    count = 0
    valid_list = list(range(10))
    for digit in pin:
        if digit not in valid_list:
            return False
        count += 1
I'm not sure why something like 1234 is being returned as False.
How about with regex?
import re
def validate(pin):
    # fullmatch anchors the pattern at both ends, so "1234567" is rejected
    pattern = "[0-9]{4,6}"
    return re.fullmatch(pattern, pin) is not None
validate("03506")  # True
This matches strings of 4 to 6 digits. If you mean that the string must be exactly 4 or exactly 6 digits long, you can try
import re
def validate(pin):
    # fullmatch rejects strings with extra characters after the digits
    pattern1 = "[0-9]{4}"
    pattern2 = "[0-9]{6}"
    return bool(re.fullmatch(pattern1, pin) or re.fullmatch(pattern2, pin))
validate("03506")  # False, because it has 5 digits
I'm not sure why something like 1234 is being returned as False.
Python never implicitly converts between integers and strings, and an integer never compares equal to a string (1 == "1" is False).
"valid_list" is a list of integers, but "digit" is a string, so you will never find anything in your list.
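One way to repair the loop from the question, as a sketch: compare each character against a string of digit characters instead of a list of integers. The final return line is an assumption based on the description in the question ("4 or 6 digits"), since the posted code has no final return.

```python
def validate(pin):
    # Digits as characters, so the membership test compares str with str.
    valid_digits = "0123456789"
    count = 0
    for digit in pin:
        if digit not in valid_digits:
            return False
        count += 1
    # Assumed from the question text: only lengths 4 and 6 are valid.
    return count in (4, 6)


print(validate("1234"))
print(validate("12345"))
print(validate("12a4"))
```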

What does 'if some_variable:' (without any condition) mean?

def string_to_int_list(s):
    L1=[]
    for i in s.split(','):
        if i:#what does this line mean?
            L1.append(int(i))
        return L1
I want to convert a string to a list, and if I delete 'if i', I get ValueError: invalid literal for int() with base 10: ''
if i tests the truthiness of i: a non-empty string is true, while an empty string "" (or None) is false. For strings it is the same as i != "", not i != None.
I also tested that when split returns an empty string "", it does not pass the if condition.
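To see what a bare if does with different values, here is a short sketch (the candidate values are picked for illustration). bool(v) gives the same result that if v: would act on:

```python
# Illustrative values; bool(v) is what `if v:` evaluates.
candidates = ["", "0", "12", None, [], {}, 0]
truthiness = {repr(v): bool(v) for v in candidates}

# Splitting on "," produces empty strings between consecutive commas.
pieces = "1,2,3,4,,,5".split(",")

# `if p` keeps only the non-empty pieces.
kept = [p for p in pieces if p]

print(truthiness)
print(pieces)
print(kept)
```

Note that the string "0" is true because it is non-empty, even though the integer 0 is false.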
To check whether a string is numeric (0 - 9) you can use str.isdigit()
str.isdigit()
Return true if all characters in the string are digits and there is at least one character, false otherwise.
For 8-bit strings, this method is locale-dependent.
This code works:
def string_to_int_list(s):
    L1 = []
    for i in s.split(','):
        # skip empty pieces and anything that is not all digits
        if i and i.isdigit():
            L1.append(int(i))
    return L1
a = "1,2,3,q,43,hello"
b = string_to_int_list(a)
print(b)
It will print [1, 2, 3, 43]
Note that I dedented the return statement, because it makes no sense inside the loop.
What you are doing here is splitting your string on "," and then converting each piece to an integer if it is non-empty.
Let's say your string is "1,2,3,4,,,5".
The script then returns the list [1, 2, 3, 4, 5].
You should try what the if condition gives for an empty string, None, an empty list [] or an empty dict {}.
This script will fail with a ValueError if you have "abc,2,3,4,5".
You can also use a functional style, like
list(map(int, filter(None, a.split(","))))

How to check if the string has digits without a try/except or str.isdigit?

NUMBERS = "123456789"
def digit_checker(x):
    for t in x:
        if t in NUMBERS:
            y = True
        else:
            y = False
    return y
sentence = input("Enter a string to check if its all digits: ")
checker = digit_checker(sentence)
print(checker)
As the title states, how would I find if the string has all digits without using the str.isdigit or a try/except. The code keeps checking the first character and not all. How do I fix that?
NUMBERS = "1234567890"
def digit_checker(x):
    y = True
    for t in x:
        if t not in NUMBERS:
            y = False
    return y
You can use all and a generator expression:
NUMBERS = "1234567890"
def digit_checker(x):
    return all(t in NUMBERS for t in x)
This will go through each character in x and see if it is in NUMBERS. If not, all will immediately stop checking and return False. Otherwise, it will return True after it has confirmed that every character is a digit. Below is a demonstration:
>>> NUMBERS = "1234567890"
>>> def digit_checker(x):
...     return all(t in NUMBERS for t in x)
...
>>> digit_checker('12345')
True
>>> digit_checker('12345a')
False
>>>
Note too that it would be more efficient if you made NUMBERS a set:
NUMBERS = set("1234567890")
That way, t in NUMBERS will perform an O(1) (constant) hash lookup rather than an O(n) (linear) search through the string. Granted, with only ten digits to search, the cost of the linear search is not too worrisome. However, that quickly changes when the collection being searched is larger.
Actually, it looks like it is checking every character, but because it sets y for every character, it is the "numberness" of the last character that determines the value returned, regardless of what the other characters are.
Instead, you should initialize y to True, and only set it to False if you ever find a non-number. In fact, when that happens, you can immediately return.
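A minimal sketch of that early-return idea, assuming NUMBERS is meant to include "0" (the question's constant leaves it out):

```python
NUMBERS = "0123456789"  # assumption: "0" should count as a digit too


def digit_checker(x):
    for t in x:
        if t not in NUMBERS:
            # The first non-digit settles the answer, so stop here.
            return False
    # The loop finished without finding a non-digit.
    return True


print(digit_checker("12345"))
print(digit_checker("12345a"))
```

One design point to be aware of: for an empty string the loop body never runs, so this returns True, whereas "".isdigit() returns False. Add an explicit length check if empty input should be rejected.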
If you are checking that all the characters are digits, you can try the following expression:
def digit_checker(x):
    return all(t in NUMBERS for t in x)
It is the same as your code except that it takes every character of x into account, which is what your code was missing: there, the return value is always overwritten by the last check. all stops at the first false element of the generator expression, so this is equivalent to a loop that breaks out early:
NUMBERS = "1234567890" # Did you miss the 0?
def digit_checker(x):
    all_digits = True
    for t in x:
        if t not in NUMBERS:
            all_digits = False
            break
    return all_digits
A different approach would be to check them as sets:
def digit_checker(x):
    return set(x).issubset(NUMBERS)
Perhaps if you clear up the variable names the problem will be more clear:
def digit_checker(sentence):
    is_number = True
    for character in sentence:
        if character in NUMBERS:
            is_number = True
        else:
            is_number = False
    return is_number
As you can see, you are evaluating if it is a number for each character, and changing the is_number variable each time. So, only the last character will result in proper evaluation.
You probably want to just return False when a non-digit is first detected. Try this: if character is not in NUMBERS, set is_number to False and break out of the loop.
There are some good answers already. Here's another way to do it: collect the characters that are not digits in a list; the length of this list should be zero.
NUMBERS = "1234567890"
def digit_checker(x):
    return len([t for t in x if t not in NUMBERS]) == 0
print(digit_checker('123'))  # True

how does string comparison work on python? [duplicate]

>>> "spam" < "bacon"
False
>>> "spam" < "SPAM"
False
>>> "spam" < "spamalot"
True
>>> "Spam" < "eggs"
True
How are equal length strings compared? Why is "Spam" less than "eggs"? What if the strings are not the same length?
Lexicographically.
The first characters are compared: if the ordinal value of the first string's character is less than that of the second's, the first string is lesser. If it's more, it's greater. If they are the same, the next pair of characters is tried. If it's all ties and one string is longer, the shorter one is lesser.
>>> "a" < "zzz"
True
>>> "aaa" < "z"
True
>>> "b" < "a"
False
>>> "abc" < "abcd"
True
>>> "abcd" < "abce"
True
>>> "A" < "a"
True
>>> ord("A")
65
>>> ord("a")
97
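The rule described above can be written out by hand. This is only an illustrative sketch of the ordering (the function name is made up), not how Python implements string comparison internally:

```python
def lexicographic_less_than(a, b):
    """Return True if a sorts before b under the rule described above.

    Hypothetical helper for illustration; in practice just write a < b.
    """
    # zip stops at the end of the shorter string.
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            # The first differing position decides the result.
            return ord(ch_a) < ord(ch_b)
    # All compared positions tie: the shorter string is the lesser one.
    return len(a) < len(b)


pairs = [("a", "zzz"), ("aaa", "z"), ("b", "a"), ("abc", "abcd"),
         ("abcd", "abce"), ("A", "a"), ("Spam", "eggs"), ("", "")]
for left, right in pairs:
    # The hand-written rule agrees with the built-in operator.
    print(left, right, lexicographic_less_than(left, right), left < right)
```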
Since A comes before a in the ASCII table, S in Spam is considered smaller than e in eggs.
>>> "A" < "a"
True
>>> "S" < "e"
True
>>> "S" < "eggs"
True
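Note that string length is not what decides the comparison. Rather, ordinal values are compared character by character starting from the first, as rightly pointed out by @MikeGraham in the comments below. Length only matters when one string is a prefix of the other, in which case the shorter string is the smaller one.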
And as soon as mismatch is found, the comparison stops, and comparison value is returned, as in the last example.
From the docs - Comparing Sequences and Other Types: -
The comparison uses lexicographical ordering: first the first two
items are compared, and if they differ this determines the outcome of
the comparison; if they are equal, the next two items are compared,
and so on, until either sequence is exhausted.
Also further in the same paragraph: -
Lexicographical ordering for strings uses the ASCII ordering for
individual characters
Strings in Python are lexicographically ordered so that they can be logically sorted:
>>> print(sorted(['spam','bacon','SPAM','spamalot','Spam','eggs']))
['SPAM', 'Spam', 'bacon', 'eggs', 'spam', 'spamalot']
There are compromises with this, primarily with unicode. The letter é will be sorted after the letter z for example:
>>> 'e' < 'z'
True
>>> 'é' < 'z'
False
Luckily, you can use a sort key function, the locale module, or a subclass of str to have strings sorted any way you wish.
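As one possible sketch of the sort-function route, a key function can strip accents and ignore case before comparing. The helper name and the word list are made up for illustration, and this is a simplification rather than real locale-aware collation:

```python
import unicodedata


def accent_insensitive_key(s):
    """Sort key that ignores accents and case (hypothetical helper)."""
    # NFD splits 'é' into 'e' plus a combining accent character.
    decomposed = unicodedata.normalize("NFD", s)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


words = ["zebra", "été", "eggs", "Spam"]
default_order = sorted(words)
custom_order = sorted(words, key=accent_insensitive_key)

print(default_order)
print(custom_order)
```

With the default ordering, "été" lands after "zebra" because é has a higher code point than z; with the key function it is compared as "ete".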
It is a lexicographical comparison.
