I have been given this hypothetical problem:
"Osama returns from the dead and wants revenge. He now wants to communicate with his sleeper cells around the world and plan an attack. But he has to make sure know one else gets it and hence, will like to send it in an encrypted form. He's recruited you for the job. Design a system with encryption and decryption modules for the text message."
I am currently considering following scheme for encryption/decryption:
Now I want to know about the best PKC and SKC and Hash function for implementing above scheme.I did a bit of research over net on best algorithm and narrowed my algorithm choices to following:
Hash:MD5
PKC:RSA or Diffie-Hellman
SKC:DSA
Can you please suggest if there is something that I am missing or any better/new algorithem available.
I am planning to implement this in python.
EDIT:
After reading replies I think I should go with followings:
Hash:SHA-2
PKC:ECC
SKC:AES
Any advice on python library that will provide these algorithm.
The short answer is: it cannot be secure if you do it yourself.
Cryptography provides some basic tools, such as symmetric encryption or digital signatures. Assembling those tools into a communication protocol is devilishly difficult, and when I invoke the name of the Devil I mean it: it looks easy, but there are many details, and it is known that the Devil hides in the details.
Your problem here is akin to "secure emailing", and there are two main protocols for that: OpenPGP and CMS (as used in S/MIME). Look them up: you will see that the solution to your problem is not easy. For the implementation, just use an existing library, e.g. M2Crypto.
MD5 has been known to have weaknesses since 1996, and is considered to be utterly broken since 2004. You should try to get some more up-to-date sources.
If you really want "better" algorithms:
Hash: any of SHA-2 family (sha-224/256/384/512)
PKC: ECC (Elliptic Curve Cryptography)
Other related info: Elliptic curve Diffie–Hellman, Elliptic Curve DSA
Also read Applied Cryptography and other books by Bruce Schneier
For your SKC, AES is a more modern standard than DSA, though DSA isn't broken in the same way that MD5 is.
Strictly speaking, the question does not require sender authentication or nonrepudiation, so the digital signature is not necessary either.
Alot of asymmetric cryptography is vulnerable to quantum cryptography. If you're Osama your're going up against the NSA and it's a safe bet they've got a quantum computer sitting around somewhere.
The only system that is able to provide perfect secrecy is the One Time Pad and it is this reason that satalites use use it.
It is essentially a chunk of random data that you XOR a message with to produce a ciphertext with and XOR with the key key again to produce the plaintext.
**Pros:**Eliptic
Perfect secrecy
Cons:
Key is the length of the message.
Unable to re-use a key without revealing the plaintext of both messages
Use a fallback of AES in feedback mode to encrypt your messageselliptic curve crypto for document signing and HMAC to ensure message integrity.
Pycrypto is a python library with everything you would need to implement this yourself.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I've been learning, working, and playing with Python for a year and a half now. As a biologist slowly making the turn to bio-informatics, this language has been at the very core of all the major contributions I have made in the lab. I more or less fell in love with the way Python permits me to express beautiful solutions and also with the semantics of the language that allows such a natural flow from thoughts to workable code.
What I would like to know is your answer to a kind of question I have seldom seen in this or other forums. This question seems central to me for anyone on the path to Python improvement but who wonders what his next steps should be.
Let me sum up what I do NOT want to ask first ;)
I don't want to know how to QUICKLY learn Python
Nor do I want to find out the best way to get acquainted with the language
Finally, I don't want to know a 'one trick that does it all' approach.
What I do want to know your opinion about, is:
What are the steps YOU would recommend to a Python journeyman, from apprenticeship to guru status (feel free to stop wherever your expertise dictates it), in order that one IMPROVES CONSTANTLY, becoming a better and better Python coder, one step at a time. Some of the people on SO almost seem worthy of worship for their Python prowess, please enlighten us :)
The kind of answers I would enjoy (but feel free to surprise the readership :P ), is formatted more or less like this:
Read this (eg: python tutorial), pay attention to that kind of details
Code for so manytime/problems/lines of code
Then, read this (eg: this or that book), but this time, pay attention to this
Tackle a few real-life problems
Then, proceed to reading Y.
Be sure to grasp these concepts
Code for X time
Come back to such and such basics or move further to...
(you get the point :)
I really care about knowing your opinion on what exactly one should pay attention to, at various stages, in order to progress CONSTANTLY (with due efforts, of course). If you come from a specific field of expertise, discuss the path you see as appropriate in this field.
EDIT: Thanks to your great input, I'm back on the Python improvement track! I really appreciate!
I thought the process of Python mastery went something like:
Discover list comprehensions
Discover generators
Incorporate map, reduce, filter, iter, range, xrange often into your code
Discover Decorators
Write recursive functions, a lot
Discover itertools and functools
Read Real World Haskell (read free online)
Rewrite all your old Python code with tons of higher order functions, recursion, and whatnot.
Annoy your cubicle mates every time they present you with a Python class. Claim it could be "better" implemented as a dictionary plus some functions. Embrace functional programming.
Rediscover the Strategy pattern and then all those things from imperative code you tried so hard to forget after Haskell.
Find a balance.
One good way to further your Python knowledge is to dig into the source code of the libraries, platforms, and frameworks you use already.
For example if you're building a site on Django, many questions that might stump you can be answered by looking at how Django implements the feature in question.
This way you'll continue to pick up new idioms, coding styles, and Python tricks. (Some will be good and some will be bad.)
And when you see something Pythony that you don't understand in the source, hop over to the #python IRC channel and you'll find plenty of "language lawyers" happy to explain.
An accumulation of these little clarifications over years leads to a much deeper understanding of the language and all of its ins and outs.
Understand (more deeply) Python's data types and their roles with regards to memory mgmt
As some of you in the community are aware, I teach Python courses, the most popular ones being the comprehensive Intro+Intermediate course as well as an "advanced" course which introduces a variety of areas of application development.
Quite often, I get asked a question quite similar to, "Should I take your intro or advanced course? I've already been programming Python for 1-2 years, and I think the intro one is too simple for me so I'd like to jump straight to the advanced... which course would you recommend?"
To answer their question, I probe to see how strong they are in this area -- not that it's really the best way to measure whether they're ready for any advanced course, but to see how well their basic knowledge is of Python's objects and memory model, which is a cause of many Python bugs written by those who are not only beginners but those who have gone beyond that.
To do this, I point them at this simple 2-part quiz question:
Many times, they are able to get the output, but the why is more difficult and much more important of an response... I would weigh the output as 20% of the answer while the "why" gets 80% credit. If they can't get the why, regardless how Python experience they have, I will always steer people to the comprehensive intro+intermediate course because I spend one lecture on objects and memory management to the point where you should be able to answer with the output and the why with sufficient confidence. (Just because you know Python's syntax after 1-2 years doesn't make you ready to move beyond a "beginner" label until you have a much better understanding as far as how Python works under the covers.)
A succeeding inquiry requiring a similar answer is even tougher, e.g.,
Example 3
x = ['foo', [1,2,3], 10.4]
y = list(x) # or x[:]
y[0] = 'fooooooo'
y[1][0] = 4
print x
print y
The next topics I recommend are to understanding reference counting well, learning what "interning" means (but not necessarily using it), learning about shallow and deep copies (as in Example 3 above), and finally, the interrelationships between the various types and constructs in the language, i.e. lists vs. tuples, dicts vs. sets, list comprehensions vs. generator expressions, iterators vs. generators, etc.; however all those other suggestions are another post for another time. Hope this helps in the meantime! :-)
ps. I agree with the other responses for getting more intimate with introspection as well as studying other projects' source code and add a strong "+1" to both suggestions!
pps. Great question BTW. I wish I was smart enough in the beginning to have asked something like this, but that was a long time ago, and now I'm trying to help others with my many years of full-time Python programming!!
Check out Peter Norvig's essay on becoming a master programmer in 10 years: http://norvig.com/21-days.html. I'd wager it holds true for any language.
Understand Introspection
write a dir() equivalent
write a type() equivalent
figure out how to "monkey-patch"
use the dis module to see how various language constructs work
Doing these things will
give you some good theoretical knowledge about how python is implemented
give you some good practical experience in lower-level programming
give you a good intuitive feel for python data structures
def apprentice():
read(diveintopython)
experiment(interpreter)
read(python_tutorial)
experiment(interpreter, modules/files)
watch(pycon)
def master():
refer(python-essential-reference)
refer(PEPs/language reference)
experiment()
read(good_python_code) # Eg. twisted, other libraries
write(basic_library) # reinvent wheel and compare to existing wheels
if have_interesting_ideas:
give_talk(pycon)
def guru():
pass # Not qualified to comment. Fix the GIL perhaps?
I'll give you the simplest and most effective piece of advice I think anybody could give you: code.
You can only be better at using a language (which implies understanding it) by coding. You have to actively enjoy coding, be inspired, ask questions, and find answers by yourself.
Got a an hour to spare? Write code that will reverse a string, and find out the most optimum solution. A free evening? Why not try some web-scraping. Read other peoples code. See how they do things. Ask yourself what you would do.
When I'm bored at my computer, I open my IDE and code-storm. I jot down ideas that sound interesting, and challenging. An URL shortener? Sure, I can do that. Oh, I learnt how to convert numbers from one base to another as a side effect!
This is valid whatever your skill level. You never stop learning. By actively coding in your spare time you will, with little additional effort, come to understand the language, and ultimately, become a guru. You will build up knowledge and reusable code and memorise idioms.
If you're in and using python for science (which it seems you are) part of that will be learning and understanding scientific libraries, for me these would be
numpy
scipy
matplotlib
mayavi/mlab
chaco
Cython
knowing how to use the right libraries and vectorize your code is essential for scientific computing.
I wanted to add that, handling large numeric datasets in common pythonic ways(object oriented approaches, lists, iterators) can be extremely inefficient. In scientific computing, it can be necessary to structure your code in ways that differ drastically from how most conventional python coders approach data.
Google just recently released an online Python class ("class" as in "a course of study").
http://code.google.com/edu/languages/google-python-class/
I know this doesn't answer your full question, but I think it's a great place to start!
Download Twisted and look at the source code. They employ some pretty advanced techniques.
Thoroughly Understand All Data Types and Structures
For every type and structure, write a series of demo programs that exercise every aspect of the type or data structure. If you do this, it might be worthwhile to blog notes on each one... it might be useful to lots of people!
I learned python first by myself over a summer just by doing the tutorial on the python site (sadly, I don't seem to be able to find that anymore, so I can't post a link).
Later, python was taught to me in one of my first year courses at university. In the summer that followed, I practiced with PythonChallenge and with problems from Google Code Jam.
Solving these problems help from an algorithmic perspective as well as from the perspective of learning what Python can do as well as how to manipulate it to get the fullest out of python.
For similar reasons, I have heard that code golf works as well, but i have never tried it for myself.
Learning algorithms/maths/file IO/Pythonic optimisation
This won't get you guru-hood but to start out, try working through the Project Euler problems
The first 50 or so shouldn't tax you if you have decent high-school mathematics and know how to Google. When you solve one you get into the forum where you can look through other people's solutions which will teach you even more. Be decent though and don't post up your solutions as the idea is to encourage people to work it out for themselves.
Forcing yourself to work in Python will be unforgiving if you use brute-force algorithms.
This will teach you how to lay out large datasets in memory and access them efficiently with the fast language features such as dictionaries.
From doing this myself I learnt:
File IO
Algorithms and techniques such as Dynamic Programming
Python data layout
Dictionaries/hashmaps
Lists
Tuples
Various combinations thereof, e.g. dictionaries to lists of tuples
Generators
Recursive functions
Developing Python libraries
Filesystem layout
Reloading them during an interpreter session
And also very importantly
When to give up and use C or C++!
All of this should be relevant to Bioinformatics
Admittedly I didn't learn about the OOP features of Python from that experience.
Have you seen the book "Bioinformatics Programming using Python"? Looks like you're an exact member of its focus group.
You already have a lot of reading material, but if you can handle more, I recommend you
learn about the evolution of python by reading the Python Enhancement Proposals, especially the "Finished" PEPs and the "Deferred, Abandoned, Withdrawn, and Rejected" PEPs.
By seeing how the language has changed, the decisions that were made and their rationales, you will absorb the philosophy of Python and understand how "idiomatic Python" comes about.
http://www.python.org/dev/peps/
Attempt http://challenge.greplin.com/ using Python
Teaching to someone else who is starting to learn Python is always a great way to get your ideas clear and sometimes, I usually get a lot of neat questions from students that have me to re-think conceptual things about Python.
Not precisely what you're asking for, but I think it's good advice.
Learn another language, doesn't matter too much which. Each language has it's own ideas and conventions that you can learn from. Learn about the differences in the languages and more importantly why they're different. Try a purely functional language like Haskell and see some of the benefits (and challenges) of functions free of side-effects. See how you can apply some of the things you learn from other languages to Python.
I recommend starting with something that forces you to explore the expressive power of the syntax. Python allows many different ways of writing the same functionality, but there is often a single most elegant and fastest approach. If you're used to the idioms of other languages, you might never otherwise find or accept these better ways. I spent a weekend trudging through the first 20 or so Project Euler problems and made a simple webapp with Django on Google App Engine. This will only take you from apprentice to novice, maybe, but you can then continue to making somewhat more advanced webapps and solve more advanced Project Euler problems. After a few months I went back and solved the first 20 PE problems from scratch in an hour instead of a weekend.
The authentication system for an application we're using right now uses a two-way hash that's basically little more than a glorified caesar cypher. Without going into too much detail about what's going on with it, I'd like to replace it with a more secure encryption algorithm (and it needs to be done server-side). Unfortunately, it needs to be two-way and the algorithms in hashlib are all one-way.
What are some good encryption libraries that will include algorithms for this kind of thing?
I assume you want an encryption algorithm, not a hash. The PyCrypto library offers a pretty wide range of options. It's in the middle of moving over to a new maintainer, so the docs are a little disorganized, but this is roughly where you want to start looking. I usually use AES for stuff like this.
If it's two-way, it's not really a "hash". It's encryption (and from the sounds of things this is really more of a 'salt' or 'cypher', not real encryption.) A hash is one-way by definition. So rather than something like MD5 or SHA1 you need to look for something more like PGP.
Secondly, can you explain the reasoning behind the 2-way requirement? That's not generally considered good practice for authentication systems any more.
PyCrypto supports AES, DES, IDEA, RSA, ElGamal, etc.
I've found the documentation here.