AFAIK, the use of shared_ptr is often discouraged because of potential bugs caused by careless usage of them (unless you have a really good explanation for significant benefit and carefully checked design).
On the other hand, Python objects seem to be essentially shared_ptrs (ref_count and garbage collection).
I am wondering what makes them work nicely in Python but potentially dangerous in C++. In other words, what are the differences between Python and C++ in dealing with shared_ptr that makes their usage discouraged in C++ but not causing similar problems in Python?
I know e.g. Python automatically detects cycles between objects which prevents memory leaks that dangling cyclic shared_ptrs can cause in C++.
"I know e.g. Python automatically detects cycles" -- that's what makes them work nicely, at least so far as the "potential bugs" relate to memory leaks.
Besides which, C++ programs are more commonly written under tight performance constraints than Python programs (which IMO is a combination of different genuine requirements with some fairly bogus differences in rules-of-thumb, but that's another story). A fairly high proportion of the Python objects I use don't strictly need reference counting, they have exactly one owner and a unique_ptr would be fine (or for that matter a data member of class type). In C++ it's considered (by the people writing the advice you're reading) worth taking the performance advantage and the explicitly simplified design. In Python it's usually not considered a problem: you pay the performance cost and you keep the flexibility to decide later that the object is shared after all, without any code change required (other than actually taking the additional references that outlive the original, I mean).
Btw in any language, shared mutable objects have "potential bugs" associated with them, if you lose track of what objects will or won't change when you're not looking at them. I don't just mean race conditions: even in a single-threaded program you need to be aware that C++ Predicates shouldn't change anything and that you (often) can't mutate a container while iterating over it. I don't see this as a difference between C++ and Python, though. Rather, to some extent you should be slightly wary of shared objects in Python too, and when you proliferate references to an object at least understand why you're doing it.
So, on to the list of issues in the question you link to:
cyclic references -- as mentioned, Python rolls its sleeves up, finds them and frees them (a small pure-Python illustration follows this list). For reasons to do with the design and specific uses of the languages, cycle-breaking garbage collection is rather difficult to implement in C++, although not impossible.
creating multiple unrelated shared_ptrs to the same object -- no analog is possible in Python, since the reference-counter isn't open to the user to mess up.
Constructing an anonymous temporary shared pointer -- doesn't arise in Python; there's no risk of a memory leak that way, since there's no "gap" in which the object exists but is not yet subject to collection if it becomes unreferenced.
Calling the get() function to get the raw pointer and use it after the pointed-to object goes out of scope -- well, you can mess this up if you're writing Python/C, but not in pure Python.
Passing a reference of or a raw pointer to a shared_ptr should be dangerous too, since it won't increment the internal count -- there's no means in Python to add a reference without the language taking care of the refcount.
we passed 'this' to some thread workers instead of 'shared_from_this' -- in other words, forgot to create a shared_ptr when needed. Can't do this in Python.
most of the predicates you know and love from <functional> don't play nicely with shared_ptr -- Python refcounting is so built in to the runtime (or I suppose to be precise I should say: garbage collection is so built in to the language design) that there are no libraries that fail to cope with it.
Using shared_ptr for really small objects (like char or short) could be an overhead -- the issue exists in Python, and Python programmers generally don't sweat it. If you need an array of "primitive type" then you can use numpy to reduce the overhead. Sometimes Python programs run out of memory and you need to do something about it, that's life ;-)
Giving out a shared_ptr<T> to this inside a class definition is also dangerous; use enable_shared_from_this instead -- it may not be obvious, but this is "don't create multiple unrelated shared_ptrs to the same object" again.
You need to be careful when you use shared_ptr in multithreaded code -- it's possible to create race conditions in Python too; this is part of "shared mutable objects are tricksy".
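To make the first few points concrete, here is a small pure-Python illustration (standard library only; the Node class is hypothetical) of the reference counter and the cycle collector doing their jobs:

import gc
import sys

class Node:
    def __init__(self):
        self.other = None

a, b = Node(), Node()
a.other = b   # the runtime maintains the refcounts itself;
b.other = a   # there is no user-visible step to get wrong
print(sys.getrefcount(a))   # you can observe the count, but not corrupt it

del a, b             # the pair is now unreachable, but each still references the other
print(gc.collect())  # the cycle collector finds and frees the cycle; prints a positive number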
Most of this is to do with the fact that in C++ you have to explicitly do something to get refcounting, and you don't get it if you don't ask for it. This provides several opportunities for error that Python doesn't make available to the programmer because it just does it for you. If you use shared_ptr correctly then apart from the existence of libraries that don't co-operate with it, none of these problems comes up in C++ either. Those who are cautious of using it for these reasons are basically saying they're afraid they'll use it incorrectly, or at any rate more afraid than that they'll misuse some alternative. Much of C++ programming is trading different potential bugs off against each other until you come up with a design that you consider yourself competent to execute. Furthermore it has "don't pay for what you don't need" as a design philosophy. Between these two factors, you don't do anything without a really good explanation, a significant benefit, and a carefully checked design. shared_ptr is no different ;-)
"AFAIK, the use of shared_ptr is often discouraged because of potential bugs caused by careless usage of them (unless you have a really good explanation for significant benefit and carefully checked design)."
I wouldn't agree. The tendency goes towards generally using these smart pointers unless you have a very good reason not to do so.
"...shared_ptr that makes their usage discouraged in C++ but not causing similar problems in Python?"
Well, I don't know about your favourite largish signal processing framework ecosystem, but GNU Radio uses shared_ptrs for all its blocks, which are the core elements of the GNU Radio architecture. In fact, blocks are classes with private constructors, which are only accessible by a friend make function that returns a shared_ptr. We haven't had problems with this -- and GNU Radio had good reason to adopt such a model: there isn't a single place where users can try to use a deallocated block object, and not a single block is leaked. Nice!
Also, we use SWIG and a gateway class for a few C++ types that can't just be represented well as Python types. All this works very well on both sides, C++ and Python. In fact, it works so very well, that we can use Python classes as blocks in the C++ runtime, wrapped in shared_ptr.
Also, we have never had performance problems. GNU Radio is a high-rate, highly optimized, heavily multithreaded framework.
Related
I am looking into the CPython implementation and got to learn about how Python tackles operator overloading (for example comparison operators) using something like the richcmpfunc tp_richcompare; field in the _typeobject struct, where the type is defined as typedef PyObject *(*richcmpfunc) (PyObject *, PyObject *, int);. So whenever a PyObject needs to be operated on by one of these operators, it tries to call the tp_richcompare function.
My doubt is that in Python we use magic methods like __gt__ etc. to override these operators. So how does Python code get wired into a tp_richcompare that is then used everywhere a comparison operator on a PyObject is interpreted?
My second doubt is a more general version of this: how does code in a particular language (here Python) that overrides things (operators, hash, etc.) which are interpreted in another language (C, in the case of CPython) end up calling the functions defined in the first language (Python)? As far as I know, when bytecode is generated it's a low-level, instruction-based representation (essentially an array of uint8_t).
Another example of this is __hash__, which would be defined in Python but is needed in the C-based implementation of the dictionary, in lookdict. Again, the C function type typedef Py_hash_t (*hashfunc)(PyObject *); is used everywhere a hash of a PyObject is needed, but the translation of __hash__ to this C function is mysterious to me.
Python code is not transformed into C code. It is interpreted by C code (in CPython), but that's a completely different concept.
There are many ways to interpret a Python program, and the language reference does not specify any particular mechanism. CPython does it by transforming each Python function into a list of virtual machine instructions, which can then be interpreted with a virtual machine emulator. That's one approach. Another one would be to just build the AST and then define a (recursive) evaluate method on each AST node.
Of course, it would also be possible to transform the program into C code and compile the C code for future execution. (Here, "C" is not important. It could be any compiled language which seems convenient.) However, there's not much benefit to doing that, and lots of disadvantages. One problem, which I guess is the one behind your question, is that Python types don't correspond to any C primitive type. The only way to represent a Python object in C is to use a structure, such as CPython's PyObject, which is effectively a low-level mechanism for defining classes (a concept foreign to C) by including a pointer to a type object that contains a virtual method table, which in turn contains pointers to the functions used to implement the various operations on objects of that type. In effect, that will end up calling the same functions as the interpreter would call to implement each operation; the only purpose of the compiled C code is to sequence the calls without having to walk through an interpretable structure (VM list or AST or whatever). That might be slightly faster, since it avoids a switch statement on each AST node or VM operation, but it's also a lot bulkier, because a function call occupies a lot more space in memory than a single opcode byte.
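You can actually watch that virtual-method-table machinery from pure Python. When you attach __gt__ to a class, CPython updates the type's tp_richcompare slot behind the scenes (via its slot-update machinery), and the > operator starts dispatching to your function. A minimal sketch, assuming nothing beyond the standard language:

class Box:
    def __init__(self, size):
        self.size = size

a, b = Box(2), Box(1)
try:
    a > b   # no __gt__ yet: the default comparison fails
except TypeError as err:
    print(err)

Box.__gt__ = lambda self, other: self.size > other.size
print(a > b)   # True: the slot now dispatches to our Python-level __gt__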
An intermediate possibility, in common use these days, is to dynamically compile descriptions of programs (ASTs or VM lists or whatever) into actual machine code at runtime, taking into account what can be discovered about the actual dynamic types and values of the referenced variables and functions. That's called "just-in-time (JIT) compilation", and it can produce huge speedups at runtime, if it's implemented well. On the other hand, it's very hard to get it right, and discussing how to do it is well beyond the scope of a SO answer.
As a postscript, I understand from a different question that you are reading Robert Nystrom's book, Crafting Interpreters. That's probably a good way of learning these concepts, although I'm personally partial to a much older but still very current textbook, also freely available on the internet, The Structure and Interpretation of Computer Programs, by Gerald Sussman, Hal Abelson, and Julie Sussman. The books are not really comparable, but both attempt to explain what it means to "interpret a program", and that's an extremely important concept, which probably cannot be communicated in four paragraphs (the size of this answer).
Whichever textbook you use, it's important to not just read the words. You must do the exercises, which is the only way to actually understand the underlying concepts. That's a lot more time-consuming, but it's also a lot more rewarding. One of the weaknesses of Nystrom's book (although I would still recommend it) is that it lays out a complete implementation for you. That's great if you understand the concepts and are looking for something which you can tweak into a rapid prototype, but it leaves open the temptation of skipping over the didactic material, which is the most important part for someone interested in learning how computer languages work.
In my Python C Extension I am performing actions on an iterable of strings. So in a first step I call PySequence_Fast to convert it to a list and then iterate over the elements. For each string I use PyUnicode_DATA and then compare the strings using some criteria. So I only read from PyObjects, but never modify them.
Now I would like to process the list in parallel, which would require me to release the GIL. However I do not know which effects this has on my use case. Here are my current thoughts:
I can still use those APIs, since they are only macros that directly read from the PyObjects without modifying them.
I have to use the APIs beforehand and store an array of structs that hold the kind, length and data pointer of each string
I have to use the APIs beforehand and store a copy of the strings in an array
Case 1 would be the most performant and memory-efficient. However, it is stated that without holding the GIL it is not allowed to operate on Python objects (does this include read access?) or use Python/C API functions.
Case 2 would be the next most efficient, since at least I do not have to copy all the strings. However, if I am not allowed to read from Python objects while the GIL is released, I wonder whether I would even be allowed to use a pointer to the data inside a PyObject.
Case 3 would require me to copy all the strings. In my case this might make the multithreaded solution slower than a sequential solution.
I hope someone can help me understand what I am allowed to do while the GIL is released.
I think the official answer is that you should not use method 1 and should use methods 2 or 3. And while it might work now, it could change in the future and break. This is especially important if you want to support things like PyPy's C-API wrapper (which might well use a different representation internally than CPython does). There are increasing moves to try to hide implementation details that you slightly risk getting caught out by.
Practically I think method 1 would work fine provided you only use the macro forms with no error checking - the GIL is mainly about stopping simultaneous writes putting Python objects in an undefined state, and you aren't doing this. Where I'd be slightly careful is if you ever have (deprecated) "non-canonical" unicode objects - things that look "macro-y" like PyUnicode_READY can cause them to be modified to the canonical state. Again, be especially wary of alternative (non-CPython) implementations of the C-API.
One alternative to consider would be to use the buffer protocol instead. Although I can't find it explicitly stated in the docs, the idea is that PyObject_GetBuffer and PyBuffer_Release require the GIL but reading/writing to the buffer doesn't. Here I have two sub-suggestions:
can you have a single object, like a Numpy array, that exposes all your strings as a buffer? (A rough pure-Python sketch of this follows the list.)
you can also get a buffer from a unicode object (as a UTF-8 C string) -- the thing to do would be to create all the buffers with the GIL held, do your parallel processing without it, and then free them with the GIL held again. It's possible that the overhead of this makes it inefficient. This is basically an "official" version of method 2.
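As a rough pure-Python sketch of the first sub-suggestion (the variable names are my assumptions; numpy comes from the suggestion itself): pack the strings into one fixed-width array up front, so a single buffer covers all of them.

import numpy as np

strings = ["alpha", "beta", "gamma"]   # hypothetical input
width = max(len(s.encode("utf-8")) for s in strings)
arr = np.array([s.encode("utf-8") for s in strings], dtype=f"S{width}")

view = memoryview(arr)   # buffer protocol: one contiguous block of bytes
# In a C extension you would call PyObject_GetBuffer here (with the GIL),
# read the memory in worker threads (without the GIL),
# and call PyBuffer_Release afterwards (with the GIL again).
print(view.nbytes)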
In short, you'd probably get away with it, but if it ever breaks I doubt that a bug report to Python would be well received (since it's technically wrong).
Does python threading expose issues of memory visibility and statement reordering as Java does? Since I can't find any reference to a "Python Memory Model" or anything like that, despite the fact that lots of people are writing multithreaded Python code, I'm guessing that these gotchas don't exist here. No volatile keyword, for instance. But it doesn't seem to be stated explicitly anywhere that, for instance, a change in a variable in one thread is immediately visible to all other threads.
Maybe this stuff is all very obvious to Python programmers, but as a fearful Java programmer, I require a little extra reassurance :)
There is no formal model for Python's threading (hey, after all, there wasn't one for Java's for years... hopefully, one will also eventually be written for Python).
In practice, no Python implementation performs any advanced optimization such as statement reordering or temporarily treating shared variables as thread-local ones -- and you can count on these semantics constraints even though they are not formally assured.
CPython in particular, as Rawheiser mentions, uses a global interpreter lock; other implementations (PyPy, IronPython, Jython, ...) do not (so they can use multiple cores effectively with a threading model, while CPython requires multiprocessing for the same purpose), so you should not count on the GIL if you want to write code that's portable across Python implementations. (In particular, you shouldn't count on the "atomicity" of operations that only happen to be atomic in CPython because of the GIL, such as dictionary accesses -- in other Python implementations, multiple threads might be modifying a dict at once and cause errors unless you protect the dict with a lock or the like.)
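For example, even under CPython's GIL a plain read-modify-write is not atomic, because the interpreter may switch threads between the bytecodes of counter += 1. Whether you actually observe lost updates depends on the interpreter version, which is exactly the point: nothing guarantees it either way. A classic demonstration:

import threading

counter = 0

def bump(n=100_000):
    global counter
    for _ in range(n):
        counter += 1   # LOAD, ADD, STORE: another thread can run in between

threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # not guaranteed to be 400000: the increment is not atomic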
What are the technical reasons why languages like Python and Ruby are interpreted (out of the box) instead of compiled? It seems to me like it should not be too hard for people knowledgeable in this domain to make these languages not be interpreted like they are today, and we would see significant performance gains. So certainly I am missing something.
Several reasons:
faster development loop, write-test vs write-compile-link-test
easier to arrange for dynamic behavior (reflection, metaprogramming)
makes the whole system portable (just recompile the underlying C code and you are good to go on a new platform)
Think of what would happen if the system was not interpreted. Say you used translation-to-C as the mechanism. The compiled code would periodically have to check if it had been superseded by metaprogramming. A similar situation arises with eval()-type functions. In those cases, it would have to run the compiler again, an outrageously slow process, or it would have to also have the interpreter around at run-time anyway.
The only alternative here is a JIT compiler. These systems are highly complex and sophisticated and have even bigger run-time footprints than all the other alternatives. They start up very slowly, making them impractical for scripting. Ever seen a Java script? I haven't.
So, you have two choices:
all the disadvantages of both a compiler and an interpreter
just the disadvantages of an interpreter
It's not surprising that generally the primary implementation just goes with the second choice. It's quite possible that some day we may see secondary implementations like compilers appearing. Ruby 1.9 and Python have bytecode VMs; those are ½-way there. A compiler might target just non-dynamic code, or it might have various levels of language support declarable as options. But since such a thing can't be the primary implementation, it represents a lot of work for a very marginal benefit. Ruby already has 200,000 lines of C in it...
I suppose I should add that one can always add a compiled C (or, with some effort, any other language) extension. So, say you have a slow numerical operation. If you add, say Array#newOp with a C implementation then you get the speedup, the program stays in Ruby (or whatever) and your environment gets a new instance method. Everybody wins! So this reduces the need for a problematic secondary implementation.
Exactly like (in the typical implementation of) Java or C#, Python gets first compiled into some form of bytecode, depending on the implementation (CPython uses a specialized form of its own, Jython uses JVM just like a typical Java, IronPython uses CLR just like a typical C#, and so forth) -- that bytecode then gets further processed for execution by a virtual machine (AKA interpreter), which may also generate machine code "just in time" -- known as JIT -- if and when warranted (CLR and JVM implementations often do, CPython's own virtual machine typically doesn't but can be made to do so e.g. with psyco or Unladen Swallow).
JIT may pay for itself for sufficiently long-running programs (if memory's way cheaper than CPU cycles), but it may not (due to slower startup times and larger memory footprint), especially when the types also have to be inferred or specialized as part of the code generation. Generating machine code without type inference or specialization is easy if that's what you want, e.g. freeze does it for you, but it really doesn't present the advantages that "machine code fetishists" attribute to it. E.g., you get an executable binary of 1.5 to 2 MB in lieu of a tiny "hello world" .pyc -- not much point!-). That executable is stand-alone and distributable as such, but it will only work on a very specific narrow range of operating systems and CPU architectures, so the tradeoffs are quite iffy in most cases. And, the time it takes to prepare the executable is quite long indeed, so it would be a crazy choice to make that mode of operation the default one.
Merely replacing an interpreter with a compiler won't give you as big a performance boost as you might think for a language like Python. When most time is actually spent doing symbolic lookups of object members in dictionaries, it doesn't really matter whether the call to the function performing such a lookup is interpreted or is native machine code -- the difference, while not quite negligible, will be dwarfed by the lookup overhead.
To really improve performance, you need optimizing compilers. And optimization techniques here are very different from what you have with C++, or even Java JIT - an optimizing compiler for a dynamically typed / duck typed language such as Python needs to do some very creative type inference (including probabilistic - i.e. "90% chance of it being T" and then generating efficient machine code for that case with a check/branch before it) and escape analysis. This is hard.
I think the biggest reason for these languages being interpreted is portability. As a programmer you can write code that will run on an interpreter, not on a specific OS, so your programs behave more uniformly across platforms (more so than with compiled languages). Another advantage I can think of is that it's easier to have a dynamic type system in an interpreted language. I think the creators of the language were thinking that a language where programmers can be more productive, thanks to automatic memory management, a dynamic type system and metaprogramming, wins over any performance loss due to the language being interpreted. If you are concerned about performance you can always compile the language to native machine code employing a technique like JIT compilation.
Today, there is no longer a strong distinction between "compiled" and "interpreted" languages. Python is in fact compiled just as much as Java is, the only differences are:
The Python compiler is much faster than the Java compiler
Python automatically compiles source code as it is executed, there is no separate "compile" step required
Python bytecode is different from JVM bytecode
Python even has a function called compile() which is an interface to the compiler.
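You can see both halves of this from the interactive prompt: compile() returns a code object, and the dis module disassembles its bytecode. A small demonstration:

import dis

code = compile("x + 1", "<example>", "eval")   # source -> code object
dis.dis(code)                  # prints opcodes such as LOAD_NAME and RETURN_VALUE
print(eval(code, {"x": 41}))   # 42: the VM executes the compiled bytecode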
It sounds like the distinction you are making is between "dynamically typed" and "statically typed" languages. In dynamic languages such as Python, you can write code like:
def fn(x, y):
return x.foo(y)
Notice that the types of x and y are not specified. At runtime, this function will look at x to see whether it has a member function named foo, and if so will call it with y. If not, it will throw a runtime error that indicates no such function was found. This sort of runtime lookup is much easier to represent using an intermediate representation like bytecode, where a runtime VM does the lookup instead of having to generate machine code to do the lookup itself (or, call a function to do the lookup which is what the bytecode will do anyway).
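For instance (restating fn so the snippet stands alone), any object with a suitable foo works, and anything else fails only when the lookup actually happens:

def fn(x, y):
    return x.foo(y)

class Greeter:
    def foo(self, name):
        return "hello " + name

print(fn(Greeter(), "world"))   # 'hello world': foo is found at runtime
fn(42, "world")                 # AttributeError: 'int' object has no attribute 'foo'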
Python has projects such as Psyco, PyPy, and Unladen Swallow that take various approaches to compiling Python object code into something closer to native code. There is active research in this area but there is not (as yet) a simple answer.
The effort required to create a good compiler to generate native code for a new language is staggering. Small research groups typically take 5 to 10 years (examples: SML/NJ, Haskell, Clean, Cecil, lcc, Objective Caml, MLton, and many others). And when the language in question requires type checking and other decisions to be made at run time, a compiler writer has to work much harder to get good native-code performance (for an excellent example, see work by Craig Chambers and later Urs Hoelzle on Self). The performance gains you might hope for are harder to realize than you might think. This phenomenon partly explains why so many dynamically typed languages are interpreted.
As noted, a decent interpreter is also instantly portable, while porting compilers to new machine architectures takes substantial effort (and is a problem I personally have been working on for over 20 years, with some time off for good behavior). So an interpreter is a way to reach a wide audience quickly.
Finally, although fast compilers and slow interpreters exist, it's usually easier to make the edit-translate-go cycle faster by using an interpreter. (For some nice examples of fast compilers see the aforementioned lcc as well as Ken Thompson's go compiler. For an example of a relatively slow interpreter see GHCi.)
Well, isn't one of the strengths of these languages that they are so easily scriptable? They wouldn't be if they were compiled. And on the other hand, dynamic languages are easier to interpret than to compile.
In a compiled language, the loop you get into when making software is
Make a change
Compile changes
Test changes
goto 1
Interpreted languages tend to be faster to make stuff in because you get to cut out step two of that process (and when you're dealing with a large system where compile times can be upwards of two minutes, step two can add a significant amount of time).
This isn't necessarily the reason python|ruby designers thought of, but keep in mind that "How efficiently does the machine run this?" is only half the software development problem.
It also seems like it would be easier to compile code in a language that's interpreted naturally than it would be to add an interpreter to a language that's compiled by default.
REPL. Don't knock it 'till you've tried it. :)
By design.
The authors wanted something where they can write scripts into.
Python gets compiled the first time it is executed though
Compiling Ruby at least is notoriously hard. I'm working on one, and as part of that I wrote a blog post enumerating some of the issues here.
Specifically, Ruby is suffering from a very unclear (i.e. non-existent) boundary between the "read" and "execute" phase of the program that makes it hard to compile efficiently. You could just emulate what the interpreter does, but then you're not going to see much speed up, so it wouldn't be worth the effort. If you want to compile it efficiently you then face a lot of additional complications to handle the extreme level of dynamism in Ruby.
The good news is that there are techniques for overcoming this. Self, Smalltalk and Lisp/Scheme have dealt quite successfully with most of the same issues. But it takes time to sift through it and figure out how to make it work with Ruby. It also doesn't help that Ruby has a very convoluted grammar.
Raw compute performance is probably not a goal of most interpreted languages. Interpreted languages are typically more concerned about programmer productivity than raw speed. In most cases these languages are plenty fast enough for the tasks the languages were designed to tackle.
Given that, and that just about the only advantages of a compiler are type checking (difficult to do in a dynamic language) and speed, there's not much incentive to write compilers for most interpreted languages.
I would be interested to learn about large scale development in Python and especially in how do you maintain a large code base?
When you make incompatible changes to the signature of a method, how do you find all the places where that method is being called? In C++/Java the compiler will find them for you; how do you do it in Python?
When you make changes deep inside the code, how do you find out what operations an instance provides, since you don't have a static type to lookup?
How do you handle/prevent typing errors (typos)?
Are unit tests used as a substitute for static type checking?
As you can guess, I have almost only worked with statically typed languages (C++/Java), but I would like to try my hand at Python for larger programs. However, I had a very bad experience, a long time ago, with the Clipper (dBase) language, which was also dynamically typed.
Don't use a screwdriver as a hammer
Python is not a statically typed language, so don't try to use it that way.
When you use a specific tool, you use it for what it has been built. For Python, it means:
Duck typing: no type checking, only behavior matters. Therefore your code must be designed to use this feature. A good design means generic signatures, no dependencies between components, high abstraction levels. So if you change anything, you won't have to change the rest of the code. Python will not complain either; that's what it was built for. Types are not an issue. (A small sketch of this follows below.)
Huge standard library. You do not need to change all your calls in the program if you use standard features you haven't coded yourself. And Python comes with batteries included. I keep discovering them every day. I had no idea of the number of modules I could use when I started, and, like everybody, I tried to rewrite existing stuff. It's OK, you can't get it all right from the beginning.
You don't write Java, C++, Python, PHP, Erlang, or whatever the same way. There are good reasons why there is room for so many different languages: they do not do the same things.
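A tiny sketch of the duck-typing point above (the exporter classes are hypothetical): the function never names a type, so swapping in a new component requires no change to it.

import json

class CsvExporter:
    def export(self, rows):
        return "\n".join(",".join(map(str, r)) for r in rows)

class JsonExporter:
    def export(self, rows):
        return json.dumps(rows)

def save_report(exporter, rows):
    # no isinstance checks: anything with an export(rows) method will do
    return exporter.export(rows)

print(save_report(CsvExporter(), [[1, 2], [3, 4]]))
print(save_report(JsonExporter(), [[1, 2], [3, 4]]))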
Unit tests are not a substitute
Unit tests must be performed with any language. The most famous unit test library (JUnit) is from the Java world!
This has nothing to do with types. Again, you check behaviors. You avoid trouble with regressions. You assure your customer that you are on track.
Python for large scale projects
Languages, libraries and frameworks
don't scale. Architectures do.
If you design a solid architecture, and if you are able to make it evolve quickly, then it will scale. Unit tests help, as does automatic code checking. But they are just safety nets. And small ones.
Python is especially suitable for large projects because it enforces some good practices and has a lot of the usual design patterns built in. But again, do not use it for what it is not designed for. E.g.: Python is not a technology for CPU-intensive tasks.
In a huge project, you will most likely use several different technologies anyway. Such as a DBMS and a templating language, for example. Python is no exception.
You will probably want to use C/C++ for the part of your code you need to be fast. Or Java to fit in a Tomcat environment. Don't know, don't care. Python can play well with these.
As a conclusion
My answer may feel a bit rude, but don't get me wrong: this is a very good question.
A lot of people come to Python with old habits. I screwed myself up trying to write Java-style code in Python. You can, but you will never get the best of it.
If you have played / want to play with Python, it's great! It's a wonderful tool. But just a tool, really.
I had some experience with modifying "Frets On Fire", an open source python "Guitar Hero" clone.
As I see it, Python is not really suitable for a really large-scale project.
I found myself spending a large part of the development time debugging issues related to assignments of incompatible types, things that statically typed languages reveal effortlessly at compile time.
Also, since types are determined at run time, trying to understand existing code becomes harder, because you have no idea of the type of the parameter you are currently looking at.
In addition to that, calling functions by their name string with the getattr built-in function is generally more common in Python than in other programming languages, which makes it somewhat hard to get the call graph to a certain function (although you can call functions by their names in some statically typed languages as well).
I think that Python really shines in small-scale software, rapid prototype development, and gluing existing programs together, but I would not use it for large-scale software projects, since in those types of programs maintainability becomes the real issue, and in my opinion Python is relatively weak there.
Since nobody has pointed out pychecker, pylint and similar tools, I will: pychecker and pylint are tools that can help you find incorrect assumptions (about function signatures, object attributes, etc.). They won't find everything that a compiler might find in a statically typed language -- but they can also find problems that compilers for such languages can't.
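For instance, given the file below, pylint flags both calls without ever running the code (the message codes assume a reasonably default pylint configuration):

def area(width, height):
    return width * height

print(area(3))      # E1120: no value for argument 'height' in function call
print(aera(3, 4))   # E0602: undefined variable 'aera' -- a typo caught statically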
Python (and any dynamically typed language) is fundamentally different in terms of the errors you're likely to cause and how you would detect and fix them. It has definite downsides as well as upsides, but many (including me) would argue that in Python's case, the ease of writing code (and the ease of making it structurally sound) and of modifying code without breaking API compatibility (adding new optional arguments, providing different objects that have the same set of methods and attributes) make it suitable just fine for large codebases.
my 0.10 EUR:
I have several Python applications in a 'production' state. Our company uses Java, C++ and Python. We develop with the Eclipse IDE (PyDev for Python).
Unit tests are the key solution to the problem (also for C++ and Java).
The less secure world of dynamic typing will make you less careless about your code quality.
BY THE WAY:
Large-scale development doesn't mean that you use one single language!
Large-scale development often uses a handful of languages specific to the problem.
So I agree with the hammer problem :-)
PS: static-typing & python
Here are some items that have helped me maintain a fairly large system in python.
Structure your code in layers, i.e. separate business logic, presentation logic and your persistence layers. Invest a bit of time in defining these layers and make sure everyone on the project is bought in. For large systems, creating a framework that forces you into a certain way of development can be key as well.
Tests are key. Without unit tests you will likely end up with an unmanageable code base several times quicker than with other languages. Keep in mind that unit tests are often not sufficient; make sure to have several integration/acceptance tests you can run quickly after any major change.
Use the fail-fast principle. Add assertions for cases where you feel your code may be vulnerable (a small example follows this list).
Have standard logging/error handling that will help you quickly navigate to the issue.
Use an IDE (PyDev works for me) that provides type-ahead and pylint/pychecker integration to help you detect common typos right away and promote some coding standards.
Be careful about your imports: never do from x import *, and don't do relative imports without use of the leading dot.
Do refactor; a search/replace tool with regular expressions is often all you need for move-method/move-class style refactorings.
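As a small example of the fail-fast item above (apply_discount is a hypothetical function):

def apply_discount(price, rate):
    # fail fast: a rate outside [0, 1] is a bug upstream, so stop right here
    assert 0.0 <= rate <= 1.0, f"discount rate out of range: {rate!r}"
    return price * (1.0 - rate)

print(apply_discount(100.0, 0.25))   # 75.0
apply_discount(100.0, 25)            # AssertionError with a useful message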
Incompatible changes to the signature of a method. This doesn't happen as much in Python as it does in Java and C++.
Python has optional arguments, default values, and far more flexibility in defining method signatures. Also, duck typing means that -- for example -- you don't have to switch from some class to an interface as part of a significant software change. Things just aren't as complex.
How do you find all the places where that method is being called? grep works for dynamic languages. If you need to know every place a method is used, grep (or equivalent IDE-supported search) works great.
How do you find out what operations an instance provides, since you don't have a static type to lookup?
a. Look at the source. You don't have the Java/C++ problem of object libraries and jar files to contend with. You don't need all the elaborate aids and tools that those languages require.
b. An IDE can provide signature information under many common circumstances. You can, easily, defeat your IDE's reasoning powers. When that happens, you should probably review what you're doing to be sure it makes sense. If your IDE can't reason out your type information, perhaps it's too dynamic.
c. In Python, you often work through the interactive interpreter. Unlike Java and C++, you can explore your instances directly and interactively. You don't need a sophisticated IDE.
Example:
>>> x = SomeClass()
>>> dir(x)
How do you handle/prevent typing errors? The same as in static languages: you don't prevent them. You find and correct them. Java can only find a certain class of typos. If you have two similar class or variable names, you can wind up in deep trouble, even with static type checking.
Example:
class MyClass { }
class MyClassx extends MyClass { }
A typo with these two class names can cause havoc. ["But I wouldn't put myself in that position with Java," folks say. Agreed. I wouldn't put myself in that position with Python, either; you make classes that are profoundly different, and will fail early if they're misused.]
Are unit tests used as a substitute for static type checking? Here's the other point of view: static type checking is a substitute for clear, simple design.
I've worked with programmers who weren't sure why an application worked. They couldn't figure out why things didn't compile; they didn't know the difference between an abstract superclass and an interface, and they couldn't figure out why a change in one place made a bunch of other modules in a separate JAR file crash. The static type checking gave them false confidence in a flawed design.
Dynamic languages allow programs to be simple. Simplicity is a substitute for static type checking. Clarity is a substitute for static type checking.
My general rule of thumb is to use dynamic languages for small non-mission-critical projects and statically-typed languages for big projects. I find that code written in a dynamic language such as python gets "tangled" more quickly. Partly that is because it is much quicker to write code in a dynamic language and that leads to shortcuts and worse design, at least in my case. Partly it's because I have IntelliJ for quick and easy refactoring when I use Java, which I don't have for python.
The usual answer to that is testing testing testing. You're supposed to have an extensive unit test suite and run it often, particularly before a new version goes online.
Proponents of dynamically typed languages make the case that you have to test anyway because even in a statically typed language conformance to the crude rules of the type system covers only a small part of what can potentially go wrong.
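As a minimal sketch of that kind of test (the mean function is hypothetical code under test), note how the second test catches a type error that a static compiler would have flagged:

import unittest

def mean(values):
    return sum(values) / len(values)

class TestMean(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_rejects_non_numbers(self):
        with self.assertRaises(TypeError):
            mean(["1", "2"])   # sum() over strings raises TypeError

if __name__ == "__main__":
    unittest.main()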