Checking if name of a person is valid or not - python

I am working on a project in which I have to figure out if the name of a person is valid or not. One case of invalidity is a single character name.
In English, it is straightforward to figure out by checking the length.
if len(name) < 2:
    return 0
I am not sure if checking the length will work for other languages too, for names like 玺. I am not sure if this is one character or something else.
Can someone help me to solve this issue?
Dataset info:
countries: 125
total names: 11 Million

While I can't vouch for all other languages, running a Python script on the character you provided confirms that Python still treats it as a single character.
print(len("玺"))
if len("玺") < 2:
    print("Single char name")
Another potential solution would be to check ord(char) (29626 for the given char) to see whether it falls outside the standard Latin alphabet, and then perform additional conditional checks.
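The ord-based idea can be sketched with the standard unicodedata module, which exposes each character's Unicode name. This is only a sketch under assumptions that are not in the question: the helper names are hypothetical, and treating "a single Latin letter" as the invalid case is one possible rule, not one known to hold for every country in the dataset.

```python
import unicodedata


def is_single_latin_letter(name):
    """Return True if name is exactly one character from a Latin script."""
    if len(name) != 1:
        return False
    # unicodedata.name() raises ValueError for unnamed code points,
    # so an empty default is supplied.
    return unicodedata.name(name, "").startswith("LATIN")


def looks_invalid(name):
    """Hypothetical rule: empty names and single Latin letters are invalid."""
    # NFC merges a base letter and a combining accent into one code point
    # where a precomposed form exists, so "e" + U+0301 counts as one character.
    name = unicodedata.normalize("NFC", name.strip())
    return len(name) == 0 or is_single_latin_letter(name)
```

With this rule, "A" is flagged while "玺" and "Li" are not, because a single CJK ideograph can be a complete name.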

Maybe use a dictionary:
language_dict = {
    'english': 2,
    'chinese': 1
}
if len(name) < language_dict['english']:
    return 0
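The dictionary idea might be wrapped in a function so that the language is a parameter instead of a hard-coded key. This is a minimal sketch: the language labels, the default minimum, and the function name are assumptions, and the dataset would need a language or country field for each name.

```python
# Hypothetical minimum name length per language; extend as needed.
MIN_NAME_LENGTH = {
    "english": 2,
    "chinese": 1,
}
DEFAULT_MIN_LENGTH = 2  # assumed fallback for languages not listed


def is_valid_name(name, language):
    """Return True if name meets the minimum length for its language."""
    minimum = MIN_NAME_LENGTH.get(language, DEFAULT_MIN_LENGTH)
    return len(name.strip()) >= minimum
```

Using .get() with a default avoids a KeyError for the many languages that will not be listed explicitly.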

Related

Determining validity of key

New to Python and trying to think up how to approach having a list of valid characters allowed for a key within my dictionary. The value of this key can be any combination of those characters, all the way down to a single character or empty.
For example:
allowedWalkingDirection = ['N','n','S','s','E','e','W','w']
def isRotateValid(path):
    if allowedWalkingDirection in path['walk']:
        return path
    return False
So if I try, say, {'rotate':'WeNsE'}, my function says it isn't valid.
I'm sorry if this isn't very clear and concise. In short, my goal is to allow the valid walking directions to be entered any number of times within my key's value, but it currently only allows a single character in the string.
OK, upon further brain melting and relentless internet perusing, I found some help from "Valid characters in a String". I then thought of implementing something like:
def isRotateValid(path):
    for i in range(0, len(path['walk'])):
        if path['walk'][i] not in allowedWalkingDirection:
            return False
    # every character was allowed (an empty string is also accepted)
    return path
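The same per-character check can be written more compactly with all() and a set. This is a sketch, not the asker's final code: the snake_case names are hypothetical, and the 'walk' key is taken from the function above.

```python
# Set membership is tested once per character of the input.
ALLOWED_DIRECTIONS = set("NnSsEeWw")


def is_rotate_valid(path, key="walk"):
    """Return True if every character of path[key] is an allowed direction.

    A missing key or an empty string counts as valid, since all() of an
    empty sequence is True.
    """
    return all(ch in ALLOWED_DIRECTIONS for ch in path.get(key, ""))
```

For example, is_rotate_valid({'walk': 'WeNsE'}) is True, while a value containing any other character gives False.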

How to include character in new string if x is a multiple of ord(ch)?

The problem: I need to take a string and, for each character whose ord value (ord(ch)) is a multiple of x, add that character to a new string. I was just wondering if I've done this correctly?
def test(string, x):
    r = []
    for ch in string:
        if ord(ch) % x == 0:
            r.append(ch)
    return ''.join(r)
Have I done this correctly? If not, any pointers? This was a question on a test I failed, and I don't know whether I got it right or not.
Perhaps they were looking for a comprehension (strictly speaking, this one is a generator expression):
return "".join(ch for ch in string if ord(ch) % x == 0)
That looks correct. You might want to post the exact instructions for the problem, as it's possible you slightly misread the question. It's very hard to write (and read) unambiguous instructions for coding challenges. Nearly everybody who learned programming has made such a mistake at least once in their life.
I was gonna post a list comprehension version, but Alain T. beat me to it while I formulated this response.
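One detail the versions above leave open is x == 0, where the modulo raises ZeroDivisionError. A sketch that makes that case explicit follows; the function name and the choice to raise ValueError are assumptions, not part of the original exercise.

```python
def keep_ord_multiples(text, x):
    """Return the characters of text whose code point is a multiple of x."""
    if x == 0:
        # ord(ch) % 0 would raise ZeroDivisionError; fail with a clearer message.
        raise ValueError("x must be non-zero")
    return "".join(ch for ch in text if ord(ch) % x == 0)


# ord("A") is 65, "B" is 66, "C" is 67, "D" is 68, so only B and D remain.
print(keep_ord_multiples("ABCD", 2))  # BD
```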

Using difflib to detect missing characters

I've created a program to take in a password and tell the user by how many characters the password is wrong. To do this I've used difflib.Differ().
However, I'm not sure how to add another loop so that it can also tell me by how many characters the password is wrong when the input is missing characters.
This is the checking function itself.
import difflib
def check_text(correctPass, inputPass):
    count = 0
    difference = 0
    d = difflib.Differ()
    count_difference = list(d.compare(correctPass, inputPass))
    while count < len(count_difference):
        if '+' in count_difference[count]:
            difference += 1
        count += 1
    return difference
At the moment the function can only pick up mistakes if the mistakes are extra characters (correct password in this case is 'rusty').
My understanding of difflib is quite poor. Do I create a new loop, swap the < for a > and the +s for -s? Or do I just change the conditions already in the while loop?
EDIT:
example input/output: rusty33 –> wrong by 2 characters
rsty –> wrong by 0 characters
I'm mainly trying to make the function detect if there are characters missing or extra, not too fussed about order of characters for the moment.
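One possible approach, sketched here rather than given as the definitive fix, is to look at the two-character prefix that Differ.compare() puts on every output line: "+ " marks an item found only in the second sequence and "- " marks an item found only in the first. The function name and the tuple return value are assumptions.

```python
import difflib


def count_differences(correct, attempt):
    """Return (extra, missing) character counts of attempt relative to correct."""
    extra = 0
    missing = 0
    for line in difflib.Differ().compare(correct, attempt):
        # Checking the prefix avoids miscounting a password that itself
        # contains a literal "+" or "-" character.
        if line.startswith("+ "):
            extra += 1
        elif line.startswith("- "):
            missing += 1
    return extra, missing


print(count_differences("rusty", "rusty33"))  # (2, 0)
print(count_differences("rusty", "rsty"))     # (0, 1)
```

A single loop covers both cases, so no second loop or swapped comparison is needed; adding the two counts gives the total number of wrong characters.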

Strict validation for latin character set (ISO 8859)

I'm looking to validate user input to ensure that all characters in the input string fall within the western-latin character set.
Background
I'm specifically working in Python, but I'm looking more to understand the ISO-8859 character set than receive actual Python code.
For a simpler example of what I'm looking for, if I was looking to ensure that user input was entirely ASCII, I could easily do so by checking that each character's numeric value falls in the range [0-126]:
def is_ascii(s):
    for c in s:
        if not (0 <= ord(c) <= 126):
            return False
    return True
Simple enough! But now I want to validate for ISO-8859 (western latin character set).
Question
Is this a simple case of changing the upper bound for the value of ord(c)?
If so, what value should I replace 126 with?
If not, how do I perform this validation?
Note
I'm expecting to receive characters that are certainly outside of ISO-8859, for example emotes entered from a mobile device's keyboard.
Edit
After some further research, it looks like replacing 126 with 255 would potentially be a valid solution, but I would appreciate it if anyone could confirm this.
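For context, ASCII covers code points 0 to 127 (127 is the DEL control character), and the first 256 Unicode code points coincide with ISO 8859-1, so an upper bound of 255 matches what Python's "latin-1" codec accepts. The sketch below assumes ISO 8859-1 (Latin-1) is the part of ISO 8859 that is meant; the function names are hypothetical. The second function is a stricter variant that also rejects control characters, which may or may not be wanted.

```python
def is_latin1(s):
    """Return True if every character of s has a code point in 0..255."""
    try:
        s.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


def is_printable_latin1(s):
    """Stricter variant: only the printable ranges 0x20-0x7E and 0xA0-0xFF."""
    return all(0x20 <= ord(c) <= 0x7E or 0xA0 <= ord(c) <= 0xFF for c in s)
```

Emoji and characters such as € (U+20AC) fall outside this range and are rejected by both functions.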

How do I compare two strings using a for loop?

I'm building a simple email verifier. I need to compare the current letter of the local-part to a list of valid characters. So essentially I'm asking how to check whether the current letter of the local-part matches any letter in the ENTIRE list of valid chars. If it is a valid character, the check moves on to the next letter of the local-part and goes through the list of valid characters again, and so on until it reaches the @ symbol, unless it hits an invalid character first.
I'm fairly new to Python, so I don't know how nested for loops work.
for ch in local:
    for ch in valChar:
        if(ch ==ch) <----problem
This is what I currently have written for the loops. Is "ch" a variable or some type of syntax to represent char?
You don't need a nested loop in this case, thanks to the in operator:
for c in local:
    if c in valChar:
        performvalidaction(c)
    else:
        denoteasinvalid(c)
Which identifier you use (c, ch, or anything else) doesn't much matter. I tend to use single-character identifiers for loop variables, but there's no rule saying that you must.
If you did have to use two nested loops, you'd just use different loop variables for the two loops.
In fact, you don't even need one loop here, much less two (you could instead work with Python's sets, for example), but I guess using one loop is OK if it's clearer for you.
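The set-based alternative mentioned here could look like the sketch below. The contents of VALID_CHARS and both function names are assumptions, since the question does not show what valChar holds.

```python
import string

# Hypothetical set of characters allowed in the local-part.
VALID_CHARS = set(string.ascii_letters + string.digits + "._%+-")


def invalid_chars(local):
    """Return the set of characters in local that are not allowed."""
    return set(local) - VALID_CHARS


def is_local_part_valid(local):
    """Return True if local is non-empty and contains only allowed characters."""
    return bool(local) and not invalid_chars(local)
```

The set difference reports every offending character at once, which can be handy for error messages.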
Let me explain the for loop for you:
for eachitem in file:
    do something
eachitem is a variable that holds one item of the file, dictionary, etc. on each pass through the loop.
ch is a variable; you can replace it with any valid identifier:
for local_ch in local:
    for valChar_ch in valChar:
        if(local_ch == valChar_ch): <----No problem
To validate an email address, I would use a regular expression:
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}\b
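A sketch of applying such a pattern with Python's re module follows. It assumes the separator in the pattern is @ and that the dot before the top-level domain is escaped. Because the character classes list only A-Z, re.IGNORECASE is needed for lowercase input. The function name is hypothetical, and a pattern like this is a rough filter, not a full implementation of the email address grammar.

```python
import re

EMAIL_PATTERN = re.compile(
    r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}\b",
    re.IGNORECASE,
)


def looks_like_email(text):
    """Return True if the whole string matches the rough email pattern."""
    return EMAIL_PATTERN.fullmatch(text) is not None
```

fullmatch() requires the entire string to match; search() would instead find an address embedded in longer text.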
