How to validate a string? - python

I was wondering how to write code that validates a string. For example, a user inputs a postal code (a string), and I have to make sure it follows the format L#L#L#, where L represents a letter only and # represents a single digit (not a decimal number). If the input does not match, the user should be asked to enter it again.

String methods
For your example, you could slice the string with a step of 2 and check that every other character is a letter or a digit:
.isdecimal() checks that every character is a decimal digit (0-9, plus decimal digits from other scripts).
.isalpha() checks that every character is a letter (upper or lower case, including non-ASCII letters).
test_good = 'L5L5L5'
test_bad = 'LLLLLL'
def check_string(test):
    # Exactly six characters: letters at even positions, digits at odd positions.
    return len(test) == 6 and test[0::2].isalpha() and test[1::2].isdecimal()
Test it out:
check_string(test_good)
>>>True
Negative test:
check_string(test_bad)
>>>False
Regex
Regular expressions do pattern matching (and a lot more). In the example below I compile the pattern ahead of time so that it looks clean and can be reused if needed.
I also use re.fullmatch(), which requires the entire string to match, not just one part of it. It returns either None or a match object, so I check whether a match exists and return True, or return False otherwise.
import re
# Compile once so the pattern can be reused.
re_pattern = re.compile('[A-Z][0-9][A-Z][0-9][A-Z][0-9]')
def match(test):
    # fullmatch returns a match object or None; turn that into a bool.
    return re_pattern.fullmatch(test) is not None
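The question also asks to prompt again when the input is invalid. A minimal sketch of that loop, assuming the regex approach above, might look like this. The names POSTAL_CODE, is_postal_code and ask_postal_code are hypothetical, and accepting lower-case letters is an assumption rather than something the question states.

```python
import re

# Hypothetical pattern name; accepting lower-case letters is an assumption.
POSTAL_CODE = re.compile(r'[A-Za-z][0-9][A-Za-z][0-9][A-Za-z][0-9]')

def is_postal_code(text):
    """Return True if text has the form L#L#L# (letter, digit, repeated)."""
    return POSTAL_CODE.fullmatch(text) is not None

def ask_postal_code(read=input):
    """Keep asking until a valid postal code is entered.

    `read` defaults to input() but can be replaced, which makes the loop
    easy to exercise without a keyboard.
    """
    while True:
        answer = read("Postal code (L#L#L#): ").strip()
        if is_postal_code(answer):
            return answer.upper()
        print("Invalid format, please try again.")
```

Calling ask_postal_code() with no arguments reads from the keyboard. A loop is used rather than having the function call itself, so repeated bad input cannot exhaust the call stack.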

Related

Python Regex validation

I am brand new to Python.
I'm trying to ensure a username contains ONLY alpha characters (only a-z). I have the below code. If I type digits only (e.g. 7777) it correctly throws the error. If I type numbers and letters mix, but I START with a number, it also rejects. But if I start with a letter (a-z) and then have numbers in the string as well, it accepts it as correct. Why?
def register():
    uf = open("user.txt","r")
    un = re.compile(r'[a-z]')
    up = re.compile(r'[a-zA-Z0-9()$%_/.]*$')
    print("Register new user:\n")
    new_user = input("Please enter a username:\n-->")
    if len(new_user) > 10:
        print("That username is too long. Max 10 characters please.\n")
        register()
    #elif not un.match(new_user):
    elif not re.match('[a-z]',new_user):
        print("That username is invalid. Only letters allowed, no numbers or special characters.\n")
        register()
    else:
        print(f"Thanks {new_user}")
Why don't you use isalpha()?
string = '333'
print(string.isalpha()) # False
string = 'a33'
print(string.isalpha()) # False
string = 'aWWff'
print(string.isalpha()) # True
In your code, uf, un and up are unused variables.
The only place where you validate anything is the line elif not re.match('[a-z]',new_user):, and it only checks that the first character is a lowercase letter.
To ensure that a variable contains only letters, use: elif not re.match('^[a-zA-Z]{1,10}$',new_user):
in the regex ^[a-zA-Z]{1,10}$ you find:
^ : matches the start of the string
[a-zA-Z] : matches a single character between a and z or between A and Z
{1,10} : requires the preceding element (a letter) to be repeated between 1 and 10 times. As LhasaDad suggests in the comments, you may want to increase the minimum number of characters, e.g. to 4: {4,10}. We don't know what this username is for, but 1 character seems too low in any case.
$ : matches the end of the string (or just before a trailing newline)
Since you were looking for a RegEx, I've produced and explained one, but Guy's answer is more pythonic.
IMPORTANT:
You didn't ask about this, but you may run into an error you aren't expecting: because register() calls itself, it is a recursive function. If the user enters an invalid username too many times (about 1000 with the default recursion limit), you'll get a RecursionError.
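One way to avoid the recursion is a while loop. The following is only a sketch: it assumes the rule is 1 to 10 ASCII letters, as in the regex explained above, and the `read` parameter and the names USERNAME and is_valid_username are hypothetical additions so that the loop can be tried without typing.

```python
import re

# Assumed rule, taken from the regex discussed above: 1 to 10 ASCII letters.
USERNAME = re.compile(r'[a-zA-Z]{1,10}')

def is_valid_username(name):
    # fullmatch anchors both ends, so no ^ or $ is needed in the pattern.
    return USERNAME.fullmatch(name) is not None

def register(read=input):
    """Ask for a username until a valid one is entered, without recursion."""
    print("Register new user:\n")
    while True:
        new_user = read("Please enter a username:\n-->")
        if is_valid_username(new_user):
            print(f"Thanks {new_user}")
            return new_user
        print("That username is invalid. Use 1 to 10 letters, no numbers or special characters.\n")
```

Because the length limit is part of the pattern, the separate len() check is no longer needed.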
As the re.match docs say:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.
That's exactly what's happening in your case: a letter in the beginning of the string will satisfy the match. Try the expression [a-z]+$ which will make sure that the match expands till the end of the string.
You can check the length on the same go: [a-z]{1,10}$.
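To see the difference side by side, here is a small sketch comparing the original check with a fully anchored one. The function names are made up for this illustration.

```python
import re

def starts_with_letter(name):
    # Same test as the original code: only the first character is checked.
    return re.match('[a-z]', name) is not None

def only_letters(name):
    # fullmatch anchors both ends, so every character must be a-z.
    return re.fullmatch('[a-z]{1,10}', name) is not None

for name in ['7777', '7abc', 'a33', 'abc']:
    print(name, starts_with_letter(name), only_letters(name))
# 7777 False False
# 7abc False False
# a33 True False
# abc True True
```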

A check text numbers, alphabets, letters upper case and lower case except symbols

I need your help, guys.
I can't work out whether to use a list or a set. A list is more efficient, and a dictionary also needs an index. But my problem is that the text has to be a string, so the variable must hold the text as a string; I can't use D=['a','b','c'].
Comparing the text gives me an error, because it can only compare individual characters, not all of them at once. I need to build a word such as abc or _success and confirm that everything in it is in the list for the result to be True.
This is my code so far. The problem is that it now accepts numbers, letters and symbols. Symbols such as !##$% should return False.
Checking this in its own function works, but I need it in the if statement.
return text.isalnum() doesn't work in the if statement. That's my problem: symbols should give False.
def check(text):
    if text== '':
        return False
    if text.isalpha() == text.isdigit():
        return True
    else:
        return text.isalnum()
def main():
    text = str(raw_input("Enter text: "))
    print(check(text))
main()
Output problem:
Enter text: _
False
_ is supposed to be one of the accepted symbols and give True. For example, _success123 is True.
!##$% is supposed to be False, but the output shows True. Another example is !##A123, whose output is False.
The code above does accept underscores, letters and numbers
output:
_success123
but the problem is that it also accepts !##$ as True.
return text.isalnum() does reject the symbols, but it's not working in the if statement.
It's overkill, but you can use a regex. It's easy to add new characters (e.g. symbols) to the allowed set:
import re
def check(text):
    # re.match returns a match object or None; convert it to a bool.
    return bool(re.match('^[a-zA-Z0-9_!]*$', text))
text = str(raw_input("Enter text: "))
print(check(text))
If you want to avoid a regular expression, you could use Python sets:
allowed = set('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_0123456789')
def check(text):
    return not len(set(text) - allowed)
for text in ['_success123', '!##$%']:
    print(text, check(text))
This converts your text into a set of characters and removes all the characters that are allowed. If any characters remain then you know it is invalid. For two examples, this gives:
_success123 True
!##$% False
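Putting the requirements from the question together (letters, digits and underscore allowed, the empty string and all other symbols rejected), a minimal sketch could look like the following. It is written for Python 3, and the exact set of allowed characters is an assumption based on the examples in the question.

```python
import re

# Assumed rule: one or more ASCII letters, digits or underscores.
ALLOWED = re.compile(r'[A-Za-z0-9_]+')

def check(text):
    # "+" rejects the empty string; fullmatch rejects any other symbol.
    return ALLOWED.fullmatch(text) is not None

for text in ['_success123', '_', '!##$%', '!##A123', '']:
    print(repr(text), check(text))
# '_success123' True
# '_' True
# '!##$%' False
# '!##A123' False
# '' False
```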

how to check if each letter in a password is part of a list of allowed characters [duplicate]

Alright so for this problem I am meant to be writing a function that returns True if a given string contains only characters from another given string. So if I input "bird" as the first string and "irbd" as the second, it would return True, but if I used "birds" as the first string and "irdb" as the second it would return False. My code so far looks like this:
def only_uses_letters_from(string1,string2):
    """Takes two strings and returns true if the first string only contains characters also in the second string.
    string,string -> string"""
    if string1 in string2:
        return True
    else:
        return False
When I try to run the script it only returns True if the strings are in the exact same order or if I input only one letter ("bird" or "b" and "bird" versus "bird" and "irdb").
This is a perfect use case of sets. The following code will solve your problem:
def only_uses_letters_from(string1, string2):
    """Check if the first string only contains characters also in the second string."""
    return set(string1) <= set(string2)
Sets are fine, but they aren't required (and may be less efficient depending on your string lengths). You could also simply do:
s1 = "bird"
s2 = "irbd"
print(all(ch in s2 for ch in s1)) # True
Note that this stops as soon as a letter in s1 isn't found in s2 and returns False.
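Wrapped into the function from the question, the same idea might be sketched as follows. Building a set from the second string is an optional choice made here to keep each membership test cheap.

```python
def only_uses_letters_from(string1, string2):
    """Return True if every character of string1 also occurs in string2."""
    allowed = set(string2)  # optional: set lookups are fast for long strings
    return all(ch in allowed for ch in string1)

print(only_uses_letters_from("bird", "irbd"))   # True
print(only_uses_letters_from("birds", "irdb"))  # False
print(only_uses_letters_from("b", "bird"))      # True
```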

where have i gone wrong in the following python code :

To do:
Convert python to ythonpay (i.e. take the first letter, put it at the end and add 'ay'), and make sure the user has not entered numbers or an alphanumeric word.
def check(word):
    if word.isalnum() or word.isdigit():
        print ("Enter valid word!")
    else:
        print ("Thank You for Entering a valid word!")
        first_letter = word[0]
        new_word = word.strip(word[0])
        PygLatin = new_word+first_letter+"ay"
        print (PygLatin)
word= input("enter a word:").lower()
result = check(word)
result I got:
1>> enter a word -> python
2>> Enter valid word!
There are two fundamental issues with your code (and one stylistic issue).
Usually you want functions to return something. For example your intention is to take a word, move the first letter to the end of the word, and add "ay" ... in other words to render it in "Pig Latin."
But you're print-ing the results rather than return-ing them. You might think that using print returns a value (in the sense that it "returned" something to your screen or terminal). But that's not what "return" means in computer programming. The Python return statement is how your function returns a result to the rest of the program following any particular invocation of (or "call into") your function.
Here's the simplest function that would work:
def pigify(word):
    return word[1:]+word[0].lower()+'ay'
... that will take a "slice" of the word from a one character offset into the string all the way to the end of the string. That's what [1:] means ... it describes a range of characters, how far to the start of the range and then how far to go to get up to (but not including) the end. Then it adds the first character (which is "zero characters" from the beginning of the string), converts that to lower case (which is harmless for all characters, and only affects capital letters) and then it adds the literal string "ay" ... and it takes all of that and returns it.
pig_latin = pigify("Python")
print(pig_latin)
# ---> prints "ythonpay"
The other issue with your code is that you're calling the string methods in a confused way. word.isalnum() returns True only if all the characters are alphanumeric, and word.isdigit() returns True only if all of the characters are digits. Since digits are a proper subset of the alphanumeric characters, word.isalnum() or word.isdigit() is the same as just calling word.isalnum(). In other words, the only strings that reach your else branch are those containing at least one character that is neither a letter nor a digit (or the empty string), so an ordinary word like python is rejected; clearly not what you intended.
You probably would prefer to check that the string consists entirely of alphabetic characters:
def pigify(word):
    if word.isalpha():
        return word[1:]+word[0].lower()+'ay'
    # or else? ....
This leaves you with the question of what you should do with an invalid argument (value passed to your function's "word" parameter by the caller).
You could print an error message. But that's considered poor programming practice. You could return some special value such as Python's None value; but then code that calls your function must either check the results every time, or results can cause some other error far removed from where your function was called (where this non-string value was returned).
You can also raise an exception:
def pigify(word):
    if word.isalpha():
        return word[1:]+word[0].lower()+'ay'
    raise ValueError("Invalid input for pigify()")
... note that I don't need to explicitly spell out else in this case; the control flow only reaches that statement if I didn't return a value, only when it's an error. Any other time the control flow returns to the calling code (the part of the program that called my pigify() function).
Another thing I could do is decide that pigify() simply returns anything that doesn't look like a "word" exactly as it was passed:
def pigify(word):
    if word.isalpha():
        return word[1:]+word[0].lower()+'ay'
    else:
        return word
... here I could just return word without the else:, as I did before with the raise statement. But I personally think that looks wrong, so I've explicitly spelled out the else: clause purely for stylistic reasons.
Mostly you want your program to be composed of functions (or objects with methods) that work with (manipulate) the data, and then a smaller body of code (possibly functions, or objects and their methods) which "renders" the results of those manipulations. Any time you're writing a function which manipulates or transforms data and also writes those results to the screen, into a web page, or out to a file or database, you should pause and reconsider your design. The transformations and computations might be reusable, while the code which writes results is typically quite specific. So interleaving one with the other is usually a bad decision.
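As a rough sketch of that separation, pigify() below only transforms data, while a second function decides what message to show. The render() helper and its error text are illustrative choices, not part of the original code.

```python
def pigify(word):
    """Return word in Pig Latin; raise ValueError if it is not purely alphabetic."""
    if not word.isalpha():
        raise ValueError("Invalid input for pigify(): {!r}".format(word))
    return word[1:] + word[0].lower() + 'ay'

def render(word):
    """Build the text to display; the caller decides where it is written."""
    try:
        return pigify(word)
    except ValueError:
        return "Enter valid word!"

print(render("Python"))   # ythonpay
print(render("abc123"))   # Enter valid word!
```

Since the empty string is not alphabetic, pigify("") raises ValueError instead of failing on word[0].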
The str.isdigit() and str.isalnum() methods only return true if all of the characters match the criteria. Your test is written so that you want to detect whether any of the characters match the criteria:
>>> word = 'abc123'
>>> word.isdigit()
False
>>> any(map(str.isdigit, word))
True
So you can amend the code to start with something like this:
def check(word):
    if not word.isalnum() or any(map(str.isdigit, word)):
        print ("Enter valid word!")
    else:
        print ("Thank You for Entering a valid word!")
        ...
FWIW, str.isalpha() would be easier to use because digits are already excluded.
In your code, the problem is isalnum(), which returns True if the string contains only letters, only digits, or a mix of both. Instead, check that the string contains only letters and then run your code as follows:
def check(word):
    if word.isalpha(): # process only if word contains only letters
        print("Thank You for Entering a valid word : {}!".format(word))
        print(word[1:] + word[0] + "ay") # slicing is a better choice in your case
    else:
        print("Enter valid word!!!")
word = input("enter a word:")
result = check(word.lower())

Python Regex for string matching

There are 2 rules that I am trying to match with regex. I've tried testing on various cases, giving me unwanted results.
Rules are as follows:
Find all strings that are numbers (integer, decimal, and negative values included)
Find all strings that have no numeric value. This is referring to special characters like !##$%^&*()
So in my attempt to match these rules, I got this:
def rule(word):
    if re.match("\W", word):
        return True
    elif re.match("[-.\d]", word):
        return True
    else:
        return False
Input: output tests are as follows
word = '972.2' : True
word = '-88.2' : True
word = '8fdsf' : True
word = '86fdsf' : True I want this to be False
word = '&^(' : True
There were some more tests, but I just wanted to show the one I want to return False. It seems like it's matching just the first character, so I tried changing the regex expressions, but that made things worse.
As the documentation says, re.match will return a MatchObject which always evaluates to True whenever the start of the string is matched to the regex.
Thus, you need to use anchors in regex to make sure only whole string match counts, e.g. [-.\d]$ (note the dollar sign).
EDIT: plus what Max said - use + so your regex won't just match a single letter.
Your regexes (both of them) only look at the first character of your string. Change them to add +$ at the end in order to make sure your string is only made of the target characters.
As I understand it, you need to reject everything that falls outside rules 1 and 2.
Try this:
import re
def rule(word):
    # True when the word contains no letters or underscores.
    return re.search(r"[^\W\d\-\.]+", word) is None
Results on provided samples:
972.2: True
-88.2: True
8fdsf: False
86fdsf: False
&^(: True
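A stricter variant can anchor each rule with re.fullmatch, so that a string such as 1.2.3 is not accepted as a number. This is only a sketch: what counts as a number (optional minus sign, digits, optional fractional part) and what counts as a symbol-only string are assumptions, not definitions given in the question.

```python
import re

# Assumed definitions:
# NUMBER  - optional minus sign, digits, optional fractional part
# SYMBOLS - only characters that are neither word characters nor whitespace
NUMBER = re.compile(r'-?\d+(\.\d+)?')
SYMBOLS = re.compile(r'[^\w\s]+')

def rule(word):
    # fullmatch requires the whole string to fit one of the two patterns.
    return bool(NUMBER.fullmatch(word) or SYMBOLS.fullmatch(word))

for word in ['972.2', '-88.2', '8fdsf', '86fdsf', '&^(', '1.2.3']:
    print(word, rule(word))
# 972.2 True
# -88.2 True
# 8fdsf False
# 86fdsf False
# &^( True
# 1.2.3 False
```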
