Python Regex validation - python

I am brand new to Python.
I'm trying to ensure a username contains ONLY alpha characters (only a-z). I have the below code. If I type digits only (e.g. 7777) it correctly throws the error. If I type numbers and letters mix, but I START with a number, it also rejects. But if I start with a letter (a-z) and then have numbers in the string as well, it accepts it as correct. Why?
def register():
    uf = open("user.txt","r")
    un = re.compile(r'[a-z]')
    up = re.compile(r'[a-zA-Z0-9()$%_/.]*$')
    print("Register new user:\n")
    new_user = input("Please enter a username:\n-->")
    if len(new_user) > 10:
        print("That username is too long. Max 10 characters please.\n")
        register()
    #elif not un.match(new_user):
    elif not re.match('[a-z]',new_user):
        print("That username is invalid. Only letters allowed, no numbers or special characters.\n")
        register()
    else:
        print(f"Thanks {new_user}")

Why don't you use isalpha()?
string = '333'
print(string.isalpha()) # False
string = 'a33'
print(string.isalpha()) # False
string = 'aWWff'
print(string.isalpha()) # True
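One caveat: str.isalpha() follows Unicode, so accented and non-Latin letters also count as letters. If the requirement really is a-z only, a minimal sketch could combine it with str.isascii() (available from Python 3.7). The helper name is_plain_lowercase is made up for this illustration.

```python
def is_plain_lowercase(name):
    """Return True only for non-empty strings made of the letters a-z."""
    # isascii() rules out letters such as 'é', isalpha() rules out digits
    # and symbols (and the empty string), islower() rules out A-Z.
    return name.isascii() and name.isalpha() and name.islower()


print(is_plain_lowercase("abc"))   # True
print(is_plain_lowercase("a33"))   # False
print(is_plain_lowercase("café"))  # False, although "café".isalpha() is True
```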

In your code, uf, un and up are unused variables.
The only place where you validate anything is the line elif not re.match('[a-z]',new_user):, and re.match only tests the beginning of the string, so this just checks that the first character is a lowercase letter.
To ensure that a variable contains only letters, use: elif not re.match('^[a-zA-Z]{1,10}$',new_user):
in the regex ^[a-zA-Z]{1,10}$ you find:
^ : matches the start of the string
[a-zA-Z] : matches one character between a and z or between A and Z
{1,10} : ensures that the preceding element (a letter) is repeated between 1 and 10 times. As LhasaDad suggests in the comments, you may want to increase the minimum number of characters, e.g. to 4: {4,10}. We don't know what this username is for, but 1 character seems too low in any case.
$ : matches the end of the string
Since you were looking for a RegEx, I've produced and explained one, but Guy's answer is more pythonic.
IMPORTANT:
You're not asking about this, but you may run into an error you're not expecting: since register() calls itself, it is a recursive function. If the user enters an invalid username too many times in a row (about 1000 with the default recursion limit), you'll get a RecursionError.
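A loop avoids the recursion entirely. The following is only a sketch of that idea: the name ask_username and its read parameter are hypothetical (read defaults to input and exists so the function can be exercised without a keyboard), and the 1 to 10 lowercase letters rule is taken from the question.

```python
import re

USERNAME_RE = re.compile(r"[a-z]{1,10}")


def ask_username(read=input):
    """Keep asking until the answer is 1 to 10 lowercase letters."""
    while True:
        candidate = read("Please enter a username:\n-->")
        # fullmatch() succeeds only if the whole string fits the pattern
        if USERNAME_RE.fullmatch(candidate):
            return candidate
        print("That username is invalid. Use 1 to 10 letters (a-z) only.\n")
```

Calling ask_username() with no argument prompts on the console, and however many wrong answers are given, the call stack never grows.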

As the re.match docs say:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.
That's exactly what's happening in your case: a letter in the beginning of the string will satisfy the match. Try the expression [a-z]+$ which will make sure that the match expands till the end of the string.
You can check the length on the same go: [a-z]{1,10}$.
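To see the difference between the unanchored and the anchored patterns, here is a small comparison. The function name compare is only for illustration. It also shows re.fullmatch, which is slightly stricter than a trailing $ because $ still matches just before a final newline.

```python
import re


def compare(name):
    """Return (prefix_only, anchored, full) results for one candidate name."""
    prefix_only = bool(re.match(r"[a-z]", name))       # first character only
    anchored = bool(re.match(r"[a-z]{1,10}$", name))   # must run to the end
    full = bool(re.fullmatch(r"[a-z]{1,10}", name))    # whole string, no $ needed
    return prefix_only, anchored, full


for name in ("abc", "a33", "7abc", "abc\n"):
    print(repr(name), compare(name))
```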

Related

Regex Sequence Strong Password [duplicate]

I would like to test whether the input value meets these criteria:
at least one lower case letter
at least one upper case letter
at least one digit
at least one character that is non \w
It seems the regex I programmed only follows this specific order like:
abCD99$%
But if I shuffled the sequence, the regex doesn't work anymore:
CD99ab$%
Anyone knows what the problem is please? Cheers in advance.
import re
# Asks user for an input
print('Please enter a password for checking its strength:')
pwd = input('> ')
#Test the input to see if it is more than 8 characters
if not len(pwd) < 8:
    pwdRegex = re.compile(r'([a-z]+)([A-Z]+)([0-9]+)(\W+)') #order problem
    if not pwdRegex.search(pwd) == None:
        print('Password OK.')
    else:
        print('Please make sure password fulfills requirements!')
else:
    print('Characters must not be less than 8 characters!')
You need to make use of lookaheads to verify your requirements:
(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*\W)^.+$
(?=.*[a-z]) - make sure we have a lowercase char somewhere
(?=.*[A-Z]) - make sure we have an uppercase char somewhere
(?=.*[0-9]) - make sure we have a digit somewhere
(?=.*\W) - make sure we have a non-\w somewhere
^.+$ - all the aforementioned requirements were met, so let's capture the entire line
This piece can be omitted if you're just doing a pass/fail test and don't need to capture anything
https://regex101.com/r/HdfVXp/1/
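In Python, the lookahead pattern could be used roughly like this. It is a sketch, not a complete password policy: the name is_strong is made up, and the minimum length of 8 from the question is folded into the pattern as .{8,}.

```python
import re

# Each (?=...) looks ahead from the start without consuming anything,
# so the four requirements can be satisfied in any order.
STRONG = re.compile(r"(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*\W).{8,}")


def is_strong(pwd):
    """True if pwd has a lowercase, an uppercase, a digit, a non-\\w and 8+ chars."""
    return STRONG.fullmatch(pwd) is not None


print(is_strong("abCD99$%"))  # True
print(is_strong("CD99ab$%"))  # True, order no longer matters
print(is_strong("abcd99$%"))  # False, no uppercase letter
```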
In your case, I think it could be a good solution to make your validation without regex:
def isGoodPassword(password):
    # contains upper and lower, digit and special char
    # (any() per class; "not islower() and not isupper()" would also
    # accept strings that contain no letters at all, e.g. '1234!')
    return (any(c.islower() for c in password)
            and any(c.isupper() for c in password)
            and any(c.isdigit() for c in password)
            and any(not c.isalnum() for c in password))
isGoodPassword('hello') # False
isGoodPassword('Hello!') # False
isGoodPassword('Hello!1') # True
The https://pypi.org/project/password-strength/ package contains code to do this. If you are interested in how this is done rather than actually doing it you could read the source code.

Input Name with space with validation

I am using .isalpha() to validate a name entered by the user. It works, but whenever I put a space in the name, for example a full name like John Doe, it is rejected.
What I've tried so far:
while not name.isalpha():
    print('Entered Name is invalid')
    name = input('Please Enter Your Name Sir: ')
    if name.isalpha() or name.isspace():
        print('Hello Mr.' + name)
        select_mmenu('main-menu.txt')
I've tried combining .isalpha and .isspace, but it doesn't seem to work. I need the simplest way to solve this.
isalpha tests that each member of the string is a letter. isspace tests that each member of the string is a whitespace character. Neither of those is what you want.
Instead you could do:
if all(lett.isalpha() or lett.isspace() for lett in name):
which will pass if every character is EITHER a letter or a space. Alternatively you can match a regular expression:
import re # at the top of your module
if re.match(r"[\s\w]+$", name):
which is arguably cleaner, and certainly more powerful. The square brackets denote a character class, \s matches any whitespace character and \w matches any word character (letters, digits and the underscore, so this pattern also accepts a name such as John_2), the + means "matches 1 or more times," and the $ is the end of string. [\s\w]+$ then means "one or more characters that are either whitespace or word characters, and nothing afterwards."
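The two checks are not equivalent, which a quick side-by-side makes visible. The function names by_chars and by_regex are only for illustration. Note that both accept a string made of spaces alone, so a real form would probably need an extra check.

```python
import re


def by_chars(name):
    """Every character is a letter or whitespace (and the string is not empty)."""
    return bool(name) and all(ch.isalpha() or ch.isspace() for ch in name)


def by_regex(name):
    """Every character is whitespace or a \\w character (letter, digit, underscore)."""
    return re.fullmatch(r"[\s\w]+", name) is not None


for name in ("John Doe", "John_Doe2", "   "):
    print(repr(name), by_chars(name), by_regex(name))
```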
The input is rejected because isalpha() checks whether the string consists of alphabetic characters only. A space is not alphabetic, so for a name containing a space the method returns False instead of True and your loop treats the name as invalid.
Thank you for the answers. I solved it without using the all() function, just with basic Python loops.
Thank you, Adam Smith; your answer gave me the idea for this approach.
con = False
while con!=True:
    l=0
    strs = input('Enter your Name: ')
    for i in strs:
        if i.isalpha() or i.isspace():
            l += 1
    if l == len(strs):
        con = True
        break
    else:
        print('Wrong Input')
if con==True:
    print(strs)
This code counts the characters in the input that are letters or spaces and compares that count with the length of the input. If the two match, the input is accepted; otherwise the while loop continues.

Where have I gone wrong in the following Python code?

To do:
Convert python to ythonpay (i.e. take the first letter, move it to the end and add 'ay', and make sure the user has not entered a number or an alphanumeric word)
def check(word):
    if word.isalnum() or word.isdigit():
        print ("Enter valid word!")
    else:
        print ("Thank You for Entering a valid word!")
        first_letter = word[0]
        new_word = word.strip(word[0])
        PygLatin = new_word+first_letter+"ay"
        print (PygLatin)
word= input("enter a word:").lower()
result = check(word)
result I got:
1>> enter a word -> python
2>> Enter valid word!
There are two fundamental issues with your code (and one stylistic issue).
Usually you want functions to return something. For example your intention is to take a word, move the first letter to the end of the word, and add "ay" ... in other words to render it in "Pig Latin."
But you're print-ing the results rather than return-ing them. You might think that using print returns a value (in the sense that it "returned" something to your screen or terminal). But that's not what "return" means in computer programming. The Python return statement is how your function returns a result to the rest of the program following any particular invocation of (or "call into") your function.
Here's the simplest function that would work:
def pigify(word):
    return word[1:]+word[0].lower()+'ay'
... that will take a "slice" of the word from a one character offset into the string all the way to the end of the string. That's what [1:] means ... it describes a range of characters, how far to the start of the range and then how far to go to get up to (but not including) the end. Then it adds the first character (which is "zero characters" from the beginning of the string), converts that to lower case (which is harmless for all characters, and only affects capital letters) and then it adds the literal string "ay" ... and it takes all of that and returns it.
pig_latin = pigify("Python")
print(pig_latin)
# ---> prints "ythonpay"
The other issue with your code is that you're calling string methods in a confused way. word.isalnum() will return True only if all the characters are alphanumeric and word.isdigit() will return True only if all of the characters are numeric. Combining them with or is the same as just calling word.isalnum(), since digits are a proper subset of the alphanumeric character set. In other words, the only strings that reach your else branch are those that are empty or contain at least one non-alphanumeric character; clearly not what you intended.
You probably would prefer to check that the string consists entirely of alphabetic characters:
def pigify(word):
    if word.isalpha():
        return word[1:]+word[0].lower()+'ay'
    # or else? ....
This leaves you with the question of what you should do with an invalid argument (value passed to your function's "word" parameter by the caller).
You could print an error message. But that's considered poor programming practice. You could return some special value such as Python's None value; but then code that calls your function must either check the results every time, or results can cause some other error far removed from where your function was called (where this non-string value was returned).
You can also raise an exception:
def pigify(word):
    if word.isalpha():
        return word[1:]+word[0].lower()+'ay'
    raise ValueError("Invalid input for pigify()")
... note that I don't need to explicitly spell out else in this case; the control flow only reaches that statement if I didn't return a value, only when it's an error. Any other time the control flow returns to the calling code (the part of the program that called my pigify() function).
Another thing I could do is decide that pigify() simply returns anything that doesn't look like a "word" exactly as it was passed:
def pigify(word):
    if word.isalpha():
        return word[1:]+word[0].lower()+'ay'
    else:
        return word
... here I could just return word without the else:, as I did before with the raise statement. But I personally think that looks wrong; so I've explicitly spelled out the else: clause purely for stylistic reasons.
Mostly you want your program to be composed of functions (or objects with methods) that work with (manipulate) the data, and then a smaller body of code (possibly functions or objects and their methods) which "renders" the results of those manipulations. Any time you're writing a function which manipulates or transforms data and also writes those results to the screen or into a web page, or out to a file or database, you should pause and reconsider your design. The transformations and computations might be reusable, while the code which writes results is typically quite specific. So interleaving one with the other is usually a bad decision.
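As a rough sketch of that separation: pigify() below only transforms, while a second function decides how results and errors are presented. The name render and the wording of the messages are made up for the example.

```python
def pigify(word):
    """Pure transformation: return the Pig Latin form or raise ValueError."""
    if not word.isalpha():
        raise ValueError(f"not a plain word: {word!r}")
    return word[1:] + word[0].lower() + "ay"


def render(words):
    """Presentation only: build the lines that will be shown to the user."""
    lines = []
    for word in words:
        try:
            lines.append(f"{word} -> {pigify(word)}")
        except ValueError as err:
            lines.append(f"skipped ({err})")
    return lines


for line in render(["Python", "abc123"]):
    print(line)
```

Because pigify() never prints, it can be reused unchanged in a web page, a test, or a batch job.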
The str.isdigit() and str.isalnum() methods only return true if all of the characters match the criteria. Your test is written so that you want to detect whether any of the characters match the criteria:
>>> word = 'abc123'
>>> word.isdigit()
False
>>> any(map(str.isdigit, word))
True
So you can amend the code to start with something like this:
def check(word):
    if not word.isalnum() or any(map(str.isdigit, word)):
        print ("Enter valid word!")
    else:
        print ("Thank You for Entering a valid word!")
        ...
FWIW, str.isalpha() would be easier to use because digits are already excluded.
In your code, the problem is isalnum(), which returns True if the string contains only letters, only digits, or a mix of both. Instead, check that the string contains only letters and then run your code as follows:
def check(word):
    if word.isalpha(): # process only if word contains only alphabets
        print("Thank You for Entering a valid word : {}!".format(word))
        print(word[1:] + word[0] + "ay") # slicing is better choice in your case
    else:
        print("Enter valid word!!!")
word = input("enter a word:")
result = check(word.lower())

String Format Checking in Python

While preparing for my AS-Level Computer Science exam I came across a question in the pre-release material:
Prompt the user to input a User ID and check if the format of the ID corresponds with pre-defined formatting rules and output accordingly.
The Format (In order):
One upper case letter
Two Lower case letters
Three numeric characters (digits)
Example: "Abc123"
I came up with a solution using my language of choice(Python), however, I was wondering if there is a more elegant or better way to solve this. Especially the third check.
Here is my code:
#Task 2.2
u_id = input("Input User ID: ") #DECLARE u_id : string
numbers = [str(num) for num in range(10)]
#Checking if final 3 characters of User ID (u_id) are digits
for i in list(u_id[3::]):
    if i not in numbers:
        digit_check = False #DECLARE digit_check : bool
        break
    else:
        digit_check = True
#User ID format check
if (u_id[0:1].isupper() == True) and (u_id[1:3] == u_id[1:3].lower()) and (digit_check == True):
    print ("Correct Format")
else:
    print ("Wrong Format")
Ignore the DECLARATION comments. They are an exam requirement.
Thanks
If you are allowed to import re:
import re
u_id = input("Input User ID: ") #DECLARE u_id : string
rex = re.compile("^[A-Z][a-z]{2}[0-9]{3}$")
if rex.match(u_id):
    print("Correct format")
else:
    print("Incorrect")
Explanation of expression:
^ represents the beginning of a string.
[A-Z] is a range, containing all uppercase letters (in the English alphabet).
[a-z] is a range, containing all lowercase letters.
[0-9] is a range, containing all digits.
{n} specifies that n items (items are whatever is before the curly brackets) will be matched.
$ represents the end of the string.
Also, you can see more detailed explanations and test arbitrary strings against this regular expression here.
If you want to solve it without regular expressions (mind you, in this case they are the right tool!), you could do something like this:
id_format = [
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ", # or string.ascii_uppercase etc.
    "abcdefghijklmnopqrstuvwxyz",
    "abcdefghijklmnopqrstuvwxyz",
    "0123456789",
    "0123456789",
    "0123456789",
]
def check(input):
    # check for same length
    if len(input) != len(id_format):
        return False
    for test, valid in zip(input, id_format): # itertools.zip_longest can make
        if test not in valid:                 # the length check unnecessary
            return False
    return True
check("Abc123") # True
check("abc123") # False
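For comparison, the same format can also be checked with plain string methods. This is only a sketch (the name is_valid_id is made up). One detail to watch: islower() ignores characters that have no case, so "b1".islower() is True. That is why the middle slice is also tested with isalpha(); a comparison against .lower() alone, as in the question, would appear to accept an ID such as "A12345".

```python
def is_valid_id(u_id):
    """One uppercase letter, two lowercase letters, three digits (ASCII only)."""
    middle = u_id[1:3]
    return (
        len(u_id) == 6
        and u_id.isascii()            # Python 3.7+, keeps the check to A-Z, a-z, 0-9
        and u_id[0].isupper()
        and middle.isalpha() and middle.islower()
        and u_id[3:].isdigit()
    )


print(is_valid_id("Abc123"))  # True
print(is_valid_id("abc123"))  # False
print(is_valid_id("A12345"))  # False
```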

I want to check if password and username contain at least one symbol?

I want this function to work in my is_password_good function.
def is_ascii(some_string) :
    for each_letter in some_string:
        if ord(each_letter) < 128:
            return False
    return True
The is_good_password function makes certain the user's password is at least 10 characters long and that at least one uppercase and lowercase exists.
How can I pass my ASCII function to check if the user creates a password using at least one symbol by ASCII standards?
def is_good_password(password):
    count_upper, count_lower = 0, 0
    for characters in password:
        if characters.isupper():
            count_upper += 1
        if characters.islower():
            count_lower += 1
    is_password_good = True
    if len(password) <= 10:
        print("Password is too weak, must be more than 10 characters long!")
        is_password_good = False
    if count_upper < 1 or count_lower < 1:
        print("Password must contain at least one uppercase and one lowercase character!")
        is_password_good = False
    create_user(database)
    print("Welcome! Username & Password successfully created!")
    return is_password_good
You can check whether any character from string.punctuation occurs in the string.
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
import re
def getmix(password):
    Upper=len(set(re.findall(r'[A-Z]',password)))
    Lower=len(set(re.findall(r'[a-z]',password)))
    Nums=len(set(re.findall(r'[0-9]',password)))
    # the hyphen goes last so that it is a literal, not a range operator
    Symb=len(set(re.findall(r'[~!@#$%^&*()_+=`-]',password)))
    return (Upper, Lower, Nums, Symb)
Should give you a good starting point.
The function somestring.isalnum() will return False unless every character in the string is a letter or a digit.
The precise definition of these categories are locale-dependent; make sure you know which locale you are using.
By the by, ASCII is only defined up to character code 127. If you go above 127, you need to know which character set and encoding you are dealing with. However, characters like # and ! are indeed defined in ASCII, and have character codes in the 30-something range. You are better off using library functions which abstract away the precise character codes, anyway.
has_symbol = False
for c in '~!@#$%^&*()_+=-`':
    if c in password:
        has_symbol = True
        break
if not has_symbol:
    print("Password must contain at least one symbol!")
    is_password_good = False
Always use the builtins, don't roll your own, so in that spirit, use the string module for a canonical list of symbols:
import string
symbols = string.punctuation
and printing symbols shows us these characters:
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
You can pass to any the "one iterable in another" construction:
if any(char in symbols for char in some_string):
    print('password good')
However, I actually prefer the set methods instead of the above construction:
if set(symbols).intersection(some_string):
    print('password good')
but Triplee's advice on isalnum is just as potent, doesn't require importing the string module, and is quite a bit shorter.
if not some_string.isalnum():
    print('password good')
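Putting the pieces together, one possible shape for the whole check is a function that collects every problem instead of printing as it goes. This is a sketch in Python 3; the name password_problems and the message texts are invented, while the rules (more than 10 characters, one uppercase, one lowercase, one symbol from string.punctuation) come from the question and answers above.

```python
import string


def password_problems(password):
    """Return a list of rule violations; an empty list means the password is fine."""
    problems = []
    if len(password) <= 10:
        problems.append("must be more than 10 characters long")
    if not any(c.isupper() for c in password):
        problems.append("must contain at least one uppercase character")
    if not any(c.islower() for c in password):
        problems.append("must contain at least one lowercase character")
    if not any(c in string.punctuation for c in password):
        problems.append("must contain at least one symbol")
    return problems


print(password_problems("NoSymbolsHere1"))  # ['must contain at least one symbol']
print(password_problems("Tr0ub4dor&3xyz"))  # []
```

The caller can then decide whether to print the list, show it in a form, or create the user when the list is empty.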
