python UUID based on email

How do I generate a UUID based on an email ID?
I have read the docs.
I prefer to use the UUID module.

Without knowing exactly what the namespace thing is about, I'd try this:
>>> import uuid
>>> mail = "foo@bar.example"
>>> uuid.uuid5(uuid.NAMESPACE_URL, mail)
UUID('45348e31-1ca5-57f3-ad95-cb80bf6ad145')
If all you need is a unique hash, you can also use the hashlib module. (On Python 3, update() takes bytes, so encode the string first.)
>>> import hashlib
>>> m = hashlib.sha1()
>>> m.update(mail.encode("utf-8"))
>>> m.hexdigest()
'edb13b9a276142c6dcb93534a21f497fec4b93f8'

You need to generate a "version 3 UUID / UUID3" or a "version 5 UUID / UUID5" to solve your problem.
Both are name-based: the UUID is derived by hashing a namespace UUID together with a name (MD5 for version 3, SHA-1 for version 5), so the same inputs always give the same UUID. For example, a version 3 UUID using the DNS namespace:
>>> import uuid
>>> uuid.NAMESPACE_DNS
UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'YOU EMAIL ID')
UUID('3d813cbb-47fb-32ba-91df-831e1593ac29')
A UUID5 can be generated the same way with uuid.uuid5.
You can also use uuid.NAMESPACE_URL as the namespace for either UUID3 or UUID5.

As others have told you, you have to use uuid3 or uuid5. (Which one doesn't really matter if you don't care about cryptography; I'll use uuid3 in this example.) Now you have to decide on a namespace.
DNS doesn't make sense, since it is meant for FQDNs, which an email address surely is not. X.500 can theoretically be used if you are in LDAP, but it's still more complicated than necessary. The OID tree, as far as I know, doesn't have an arc for emails, and rightly so, since they are trying to build a permanent registry, and email addresses are not really permanent.
So, that leaves URIs. Are email addresses URIs? Fortunately, yes. [Formally, it's for URLs only, but fortunately, email addresses are URLs, too. :]] URIs have a syntax described in this Wikipedia article. So you have to find a scheme, and then fit your data into it. IANA gives you a list of schemes, where you can find "mailto" listed as a "Permanent" scheme with the description "Electronic mail address". Seems like exactly what we want.
You also get the link to the RFC, in this case RFC 6068, which tells you exactly how you should format your email address. The possible problem is that you speak about an "email id", which could mean just the "local-part" of it (the "username", as it's usually called). Of course, that won't do, since it isn't globally unique.
[The only way you could make it work is to somehow restrict the namespace to your mail server. You can do it with MX records and DNS, but much simpler is to just code the domain into the whole email address.]
def email_uuid(email_id, domain='your.domain.example.com'):
    from uuid import uuid3, NAMESPACE_URL
    # A bare local part is not globally unique, so append the default domain.
    if '@' not in email_id:
        email_id += '@' + domain
    return uuid3(NAMESPACE_URL, 'mailto:' + email_id)
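For comparison, here is a minimal sketch of the same idea using uuid5 (SHA-1) instead of uuid3. The function name email_uuid5, the default domain, and the choice to lowercase only the domain part are assumptions made for illustration; none of them comes from the answers above.

```python
import uuid


def email_uuid5(address, default_domain="your.domain.example.com"):
    """Derive a stable version 5 UUID from an email address (sketch)."""
    address = address.strip()
    # A bare local part is not globally unique, so append a default domain.
    if "@" not in address:
        address += "@" + default_domain
    local, _, domain = address.rpartition("@")
    # Assumption: domains compare case-insensitively, local parts may not,
    # so only the domain is lowercased before hashing.
    normalized = local + "@" + domain.lower()
    return uuid.uuid5(uuid.NAMESPACE_URL, "mailto:" + normalized)
```

Because the result depends only on the namespace and the normalized address, calling the function twice with the same address gives the same UUID, which is the property the question asks for.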

Related

Is there a builtin library in Python that can parse out the domain part (if any) of an email address?

I know that I can use email.utils.parseaddr to parse out an email address properly, even a tricksy one:
>>> parseaddr('Bad Horse <bad.horse@example(no one expects the @-ish inquisition!).com')
('Bad Horse (no one expects the @-ish inquisition!)', 'bad.horse@example.com')
(So good, Python! So good!)
However, that's just a string:
>>> type(parseaddr('Bad Horse <bad.horse@example(no one expects the @-ish inquisition!).com')[-1])
<class 'str'>
In the typical case I can just do .rsplit('@', maxsplit=1)[-1] to get the domain. But what if I'm just sending local mail without a domain?
>>> parseaddr('Wayne <wayne>')[-1].rsplit('@', maxsplit=1)[-1]
'wayne'
'wayne'
That's not quite what I want - I'd prefer maybe None or 'localhost'.
Does anything like that come in Python's included batteries?

I haven't been able to find anything yet, so my current approach is to make a slight adjustment:
from email.utils import parseaddr
try:
    domain = parseaddr('Wayne <wayne>')[-1].rsplit('@', maxsplit=1)[1]
except IndexError:
    # There was no '@' in the email address
    domain = None  # or 'localhost'
In the absence of a better way, this works and gets me what I need.
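The same adjustment can be wrapped in a small helper. This is only a sketch: the name domain_of and its default parameter are hypothetical, and it uses str.rpartition rather than the try/except shown above.

```python
from email.utils import parseaddr


def domain_of(address, default=None):
    """Return the domain part of an address, or `default` if there is none."""
    _name, addr = parseaddr(address)
    # rpartition returns an empty separator when '@' is absent.
    _local, sep, domain = addr.rpartition("@")
    return domain if sep and domain else default
```

Passing default='localhost' gives the other behaviour mentioned in the question for addresses without a domain.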

IMAP search for address is equal not contains

I need to get all messages in an email inbox that were sent from a specific address.
For that I use this command:
self.server.search(None, '(HEADER FROM "test@gmail.com")')
and it works, but when I search for messages from st@gmail.com I get the same results. I know that this criterion matches every message whose header CONTAINS the given string, but to me test@gmail.com and st@gmail.com are different addresses. How can I search for addresses that are EQUAL to the string rather than ones that CONTAIN it?
import imaplib
self.server = imaplib.IMAP4(self.imap_ssl_host, self.imap_ssl_port)

You can try searching for <test@gmail.com> instead of test@gmail.com.
A message from test@gmail.com usually says From: Firstname Lastname <test@gmail.com>, which contains the substring <test@, and most IMAP searches are substring searches, including FROM. If this hack is enough for you and whatever server you're using, good for you; otherwise you need to do client-side filtering to remove the false positives.
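The client-side filtering step could look roughly like this. It is a sketch that works on the raw value of a From header that has already been fetched; the function name is_exact_sender is hypothetical, and comparing addresses case-insensitively is an assumption that suits Gmail but not necessarily every mail system.

```python
from email.utils import getaddresses


def is_exact_sender(from_header, wanted):
    """True if any address in a raw From header value equals `wanted`."""
    wanted = wanted.strip().lower()
    # getaddresses understands 'Name <addr>' forms and multiple addresses.
    return any(addr.lower() == wanted
               for _name, addr in getaddresses([from_header]))
```

Messages returned by the substring search whose From header fails this check are the false positives to discard.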

Django or python manipulate email addresses and reason about domains

I want to be able to parse email addresses to isolate the domain part, and test if an email address is part of a given domain.
The email module doesn't, as far as I can tell, do that. Is there anything worth using to do this other than the usual string handling and regex routines?
Note: I know how to deal with python strings. I don't need basic recipes, although awesome recipes are welcome.
The problem here is essentially that email addresses have the format (schematically) userpart@sub\.domain\.[sld]+\.tld.
Stripping the part before the @ is easy; the hard part is parsing the domain to work out which parts are subdomains on a larger organisation's domain, rather than generic second-level (or, I guess even higher order) public domains.
Imagine parsing user@mail.organisation.co.uk to find that the organisation's domain name is organisation.co.uk and so be able to match both mail.organisation.co.uk and finance.organisation.co.uk as subdomains of organisation.co.uk.
There are basically two possible (non-dns-based) approaches: build a finite automaton that knows about all generic slds and their relation to the tld (including popular 'fake' slds like uk.com), or try to guess, based on the knowledge that there must be a tld, and assuming that if there are three (or more) elements, the second-level domain is generic if it has fewer than three/four characters. The relative drawbacks of each approach should be obvious.
The alternative is to look through DNS entries to work out what is a registered domain, which has its own drawbacks.
In any case, I would rather piggyback on the work of others.

As per @dm03514's comment, there is a Python library that does exactly this, tldextract:
>>> import tldextract
>>> tldextract.extract('foo@bar.baz.org.uk')
ExtractResult(subdomain='bar', domain='baz', tld='org.uk')
(Newer tldextract releases name the last field suffix rather than tld.)
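To illustrate what such a library does internally, here is a rough sketch of the suffix-list approach using only the standard library. The tiny PUBLIC_SUFFIXES set and the function name registered_domain are made up for illustration; a real implementation would load the full Public Suffix List, which also has wildcard and exception rules that this sketch ignores.

```python
# Illustrative only: the real Public Suffix List has thousands of entries.
PUBLIC_SUFFIXES = {"com", "org", "uk", "co.uk", "org.uk"}


def registered_domain(address):
    """Return the registrable domain of an email address, or None."""
    host = address.rpartition("@")[2].lower().rstrip(".")
    labels = host.split(".")
    # Try the longest candidate suffix first, so 'co.uk' wins over 'uk'.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            # Keep exactly one label in front of the public suffix.
            return ".".join(labels[i - 1:])
    return None
```

Two addresses then belong to the same organisation when their registered domains are equal.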

With this simple script, we replace @ with @. so that the domain part always starts with a dot; endswith then cannot match a different domain that merely ends with the same text.
def address_in_domain(address, domain):
    return address.replace('@', '@.').endswith('.' + domain)
if __name__ == '__main__':
    addresses = [
        'user1@domain.com',
        'user1@anotherdomain.com',
        'user2@org.domain.com',
    ]
    print([a for a in addresses if address_in_domain(a, 'domain.com')])
    # Prints: ['user1@domain.com', 'user2@org.domain.com']

MX Record lookup and check

I need to create a tool that will check a domain's live MX records against what is expected (we have had issues with some of our staff fiddling with them and causing all incoming mail to be redirected into the void).
Now I won't lie, I'm not a competent programmer in the slightest! I'm about 40 pages into "dive into python" and can read and understand the most basic code. But I'm willing to learn rather than just being given an answer.
So would anyone be able to suggest which language I should be using?
I was thinking of using Python and starting with something along the lines of os.system() to run (dig +nocmd domain.com mx +noall +answer) and pull up the records; I then get a bit confused about how to compare this to an existing set of records.
Sorry if that all sounds like nonsense!
Thanks
Chris

With dnspython module (not built-in, you must pip install it):
>>> import dns.resolver
>>> domain = 'hotmail.com'
>>> for x in dns.resolver.resolve(domain, 'MX'):
...     print(x.to_text())
...
5 mx3.hotmail.com.
5 mx4.hotmail.com.
5 mx1.hotmail.com.
5 mx2.hotmail.com.

Take a look at dnspython, a module that should do the lookups for you just fine without needing to resort to system calls.
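For the comparison step the question asks about, one possible sketch is to treat both the expected and the live records as sets of (preference, host) pairs and take set differences. The names fetch_mx and diff_mx are hypothetical; fetch_mx needs the third-party dnspython package and network access, while diff_mx is plain Python.

```python
def fetch_mx(domain):
    """Return the live MX records of `domain` as a set of (preference, host)."""
    import dns.resolver  # third-party: pip install dnspython
    return {(r.preference, r.exchange.to_text().lower())
            for r in dns.resolver.resolve(domain, "MX")}


def diff_mx(expected, actual):
    """Compare two sets of (preference, host); return (missing, unexpected)."""
    return sorted(expected - actual), sorted(actual - expected)


# Example use (performs a DNS query, so it is left commented out):
# expected = {(10, "mx1.example.com."), (20, "mx2.example.com.")}
# missing, unexpected = diff_mx(expected, fetch_mx("example.com"))
```

If both returned lists are empty, the live records match what is expected; anything else is worth an alert.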

The above solutions are correct; here are a few additions and updates.
dnspython now supports Python 3 and has superseded the dnspython3 library, so using dnspython is recommended.
Pass the bare domain and nothing else: for example, dnspython.org rather than www.dnspython.org.
Here's a function if you want to get the mail servers for a domain.
from dns import resolver
def get_mx_server(domain: str = "dnspython.org") -> str:
    # Collect the unique MX host names and join them with commas.
    mail_servers = resolver.resolve(domain, 'MX')
    mail_servers = list(set([data.exchange.to_text()
                             for data in mail_servers]))
    return ",".join(mail_servers)

django, python and link encryption

I need to arrange some kind of encryption for generating user-specific links. Users will click such a link, and in some other view the encrypted string in the link will be decrypted and the result returned.
For this, I need an encryption function that takes a number (or a string), namely the primary key of the selected item bound to the user account, plus some kind of seed, and generates an encrypted code that will be decrypted on some other page.
So, something like this:
my_items_pk = 36  # primary key of an item
seed = "rsdjk324j23423j4j2"  # some string used as the key
encrypted_string = encrypt(my_items_pk, seed)
# generates some encrypted string such as "dsaj2j213jasas452k41k"
and on another page:
decrypt_input = decrypt(encrypted_string, seed)
print(decrypt_input)
# gives 36
I want my "seed" to be a plain value (not some class) for this purpose, i.e. some number or string.
How can I achieve this with Python and Django?

Python's standard library has no encryption algorithms, per se. However, you might want to look at the Python Cryptography Toolkit (PyCrypto). I've only tinkered with it, but it's referenced in Python's documentation on cryptographic services. Here's an example of how you could encrypt a string with AES using PyCrypto (written for Python 3; its maintained fork PyCryptodome keeps the same Crypto namespace):
from Crypto.Cipher import AES
from urllib.parse import quote
# Note that for AES the key length must be either 16, 24, or 32 bytes
key = b'abcdefghijklmnop'
# ECB was PyCrypto's default mode; it is named explicitly here
encryption_obj = AES.new(key, AES.MODE_ECB)
plain = b"Testing"
# The plaintext must be a multiple of 16 bytes (for AES), so here we pad it
# with spaces if necessary.
mismatch = len(plain) % 16
if mismatch != 0:
    padding = (16 - mismatch) * b' '
    plain += padding
ciph = encryption_obj.encrypt(plain)
# Finally, to make the encrypted string safe to use in a URL we quote it
quoted_ciph = quote(ciph)
You would then make this part of your URL, perhaps as part of a GET request.
To decrypt, just reverse the process; assuming that you've retrieved the relevant part of the URL and create a cipher object with the same key and mode, this would do it:
from urllib.parse import unquote_to_bytes
# Use a cipher object created exactly as shown above
decryption_obj = AES.new(key, AES.MODE_ECB)
ciph = unquote_to_bytes(quoted_ciph)
plain = decryption_obj.decrypt(ciph).rstrip(b' ')  # drop the space padding
You also might consider a different approach: one simple method would be to hash the primary key (with a salt, if you wish) and store the hash and pk in your database. Give the user the hash as part of their link, and when they return and present the hash, look up the corresponding pk and return the appropriate object. (If you want to go this route, check out the built-in library hashlib.)
As an example, you'd have something like this defined in models.py:
from django.db import models
class Pk_lookup(models.Model):
    # a SHA-256 hex digest is 64 characters long
    hashed_pk = models.CharField(primary_key=True, max_length=64)
    key = models.IntegerField()
And you'd generate the hash in a view using something like the following:
import hashlib
from .models import Pk_lookup  # assuming the view lives in the same app
hash = hashlib.sha256()
hash.update(str(pk).encode('utf-8'))  # pk has been defined previously
pk_digest = hash.hexdigest()
lookup = Pk_lookup(hashed_pk=pk_digest, key=pk)
lookup.save()
Note that hexdigest() returns only hexadecimal characters, so the result can go into a URL without quoting. The raw digest() is only 32 bytes, but it is binary data rather than text, so it would need quoting in a URL and is awkward to store in a CharField.

Django has features for this now. See https://docs.djangoproject.com/en/dev/topics/signing/
Quoting that page:
"Django provides both a low-level API for signing values and a high-level API for setting and reading signed cookies, one of the most common uses of signing in Web applications.
You may also find signing useful for the following:
Generating “recover my account” URLs for sending to users who have lost their password.
Ensuring data stored in hidden form fields has not been tampered with.
Generating one-time secret URLs for allowing temporary access to a protected resource, for example a downloadable file that a user has paid for."
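To show what signing means in practice, here is a standard-library sketch built on HMAC. It is not Django's API (in a Django project the django.core.signing module would be the natural choice); the names SECRET_KEY, sign_pk and unsign_pk are hypothetical, and the sketch assumes integer primary keys. Note that signing does not hide the primary key: it only lets the server detect that a link has been tampered with.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical key


def _signature(value):
    """Hex HMAC-SHA256 of `value` under SECRET_KEY."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_pk(pk):
    """Return 'pk:signature', suitable for embedding in a URL."""
    value = str(pk)
    return value + ":" + _signature(value)


def unsign_pk(token):
    """Return the primary key if the signature is valid, else None."""
    value, sep, sig = token.rpartition(":")
    if not sep:
        return None
    # Compare as bytes in constant time to avoid leaking timing information.
    if not hmac.compare_digest(sig.encode("utf-8"),
                               _signature(value).encode("utf-8")):
        return None
    return int(value)
```

A view would put sign_pk(item.pk) into the link and call unsign_pk on the value it receives back, rejecting the request when the result is None.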
