How is string.find implemented in CPython?

I was wondering whether the 'find' method on strings is implemented with a linear search, or whether Python does something more sophisticated. The Python documentation doesn't discuss implementation details, so http://docs.python.org/library/stdtypes.html is of no help. Could someone please point me to the relevant source code?

The comment on the implementation has the following to say:
fast search/count implementation,
based on a mix between boyer-moore
and horspool, with a few more bells
and whistles on the top.
for some more background, see: http://effbot.org/zone/stringlib.htm
—https://github.com/python/cpython/blob/master/Objects/stringlib/fastsearch.h#L5

You should be able to find it in Objects/stringlib/find.h, although the real code is in fastsearch.h.

It looks like the algorithm used is derived from the Boyer-Moore-Horspool algorithm.
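To give a feel for the idea the answers refer to, here is a minimal sketch of the textbook Boyer-Moore-Horspool search in Python. It is not CPython's code: fastsearch.h is written in C and, per its own comment, adds "a few more bells and whistles". The function name `horspool_find` is made up for this illustration.

```python
def horspool_find(haystack: str, needle: str) -> int:
    """Return the lowest index of needle in haystack, or -1 (like str.find)."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Bad-character table: how far the window may slide when its last
    # character is c. Characters that do not occur in needle[:-1] are
    # missing from the table and allow a full shift of m.
    shift = {c: m - 1 - i for i, c in enumerate(needle[:-1])}
    pos = 0
    while pos <= n - m:
        if haystack[pos:pos + m] == needle:
            return pos
        # Slide by the shift of the character under the window's last slot.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1


print(horspool_find("hello world", "world"))  # 6
print(horspool_find("hello world", "xyz"))    # -1
```

The point of the shift table is that the search can skip several positions at once, so on typical input it inspects fewer characters than a naive scan that advances one position at a time.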

Related

Suggestion about Minhash implementation with n permutation

I'm trying to understand how LSH is implemented. I found this question on Stack Overflow:
Can you suggest a good minhash implementation?
and I tried to follow Duhaime's implementation.
In my case, I want to apply permutations to the MinHash (as the datasketch tool does), and I don't think this implementation suits that.
I am already starting from a sparse matrix.
Can someone give some suggestions about this technique? It isn't very widespread, so I can't find much material about implementing it in Python.
I hope you can help.
Don't just look for example code. Try to understand the math behind it.
Obviously, maxhash should work similarly. Or you could omit 0 values. But then you should double-check the math.
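As a starting point for the math, here is a minimal MinHash sketch with n "permutations". It assumes each document is a set of non-negative integer ids (for example, the column indices of the non-zero entries in one row of the sparse matrix) and uses random hash functions of the form (a*x + b) mod p as a stand-in for true permutations, which is a common simplification. All names here are hypothetical and not taken from datasketch or Duhaime's code.

```python
import random

PRIME = (1 << 61) - 1  # Mersenne prime, assumed larger than any item id


def make_permutations(n, seed=0):
    """Draw n random (a, b) pairs; each defines x -> (a*x + b) % PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def minhash_signature(items, perms):
    """Signature of a non-empty set of ints: the minimum hash per permutation."""
    return [min((a * x + b) % PRIME for x in items) for a, b in perms]


def estimate_jaccard(sig1, sig2):
    """Fraction of signature positions that agree."""
    return sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)


perms = make_permutations(128, seed=42)
doc_a = {1, 5, 9, 20, 33}
doc_b = {1, 5, 9, 21, 40}
print(estimate_jaccard(minhash_signature(doc_a, perms),
                       minhash_signature(doc_b, perms)))
```

The estimate works because, for a random permutation, the probability that two sets share the same minimum equals their Jaccard similarity; more permutations reduce the variance of the estimate.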

How is scipy.special.expi implemented?

I was using scipy.special.expn when I realized I could be using expi instead and it should be much faster, to judge from the Cephes code that I expected it would be based on. But switching from expn to expi made almost no difference in runtime.
This made me suspect that expi is implemented by an equivalent call to expn which does not take advantage of the simpler conditions in force for expi. But looking through the source code for scipy I am baffled as to how expi is implemented. I can find the C source for expn but not expi.
Can someone clarify how expi is implemented and/or where I can find the source for it?

A-star search in numpy or python

I tried searching Stack Overflow for the tags [a-star] [and] [python] and [a-star] [and] [numpy], but found nothing. I also googled it, but whether because of the tokenizing or because it doesn't exist, I got nothing.
It's not much harder to implement than your coding-interview tree traversals, but it would be nice to have a correct, efficient implementation for everyone.
Does numpy have A*?
Numpy doesn't have A*, but NetworkX has. See https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.algorithms.shortest_paths.astar.astar_path.html .
Because your question specifies numpy OR python: there is at least one A* solver in Python available on PyPI.
There also seem to be a few options on GitHub, one of which leverages numpy and C++ (hopefully efficiently).
No, there is no A* search in Numpy.
Gamedev libraries provide their own implementations as well. For example, libtcod (a roguelike engine) has it here, but it is only useful for libtcod's own grid.
A general-purpose A* "for everyone" is impossible because there are just too many things it can be applied to: all sorts of graphs, grids, and planes, and each of them can be implemented in a dozen ways with a dozen APIs each.
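For one specific representation, though, A* is short enough to write with the standard library alone. The sketch below assumes a grid given as a list of lists where 0 is a free cell and 1 is a wall, 4-connected moves of cost 1, and a Manhattan-distance heuristic; the name `astar_grid` is made up for this example, and other graph representations would need a different neighbour function and heuristic.

```python
import heapq


def astar_grid(grid, start, goal):
    """Return the shortest path as a list of (row, col) cells, or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance never overestimates for unit-cost 4-way moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None


grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(astar_grid(grid, (0, 0), (3, 0)))
```

For a numpy array the same code works if cells are read with `grid[nr][nc]`, but nothing here is vectorized; the priority queue is the part that does the work.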

AdaBoost ML algorithm python implementation

Is there anyone that has some ideas on how to implement the AdaBoost (Boostexter) algorithm in python?
Cheers!
It looks as if the sdpy project has an AdaBoost implementation. Specifically look at the sdpy/cs/ml/cla/boosting.py file.
Perhaps you can get some motivation from there.
Thanks a million, Steve! In fact, your suggestion had some compatibility issues with Mac OS X (a particular library was incompatible with the system), but it helped me find a more interesting package: icsi.boost.macosx. I am just noting that in case any Mac user finds it interesting!
Thank you again!
Tim
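For the original question about how to implement AdaBoost, the core loop is small. The sketch below is plain binary AdaBoost with decision stumps in pure Python, assuming labels in {-1, +1}; it is not BoosTexter (which handles multi-class, multi-label text) and is not taken from sdpy or icsi.boost. The names `train_adaboost` and `predict` are made up for this example.

```python
import math


def train_adaboost(X, y, rounds=10):
    """X: list of feature lists, y: labels in {-1, +1}.

    Returns a list of stumps (alpha, feature, threshold, polarity).
    """
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(len(X[0])):
            for thr in sorted({row[j] for row in X}):
                for pol in (1, -1):
                    preds = [pol if row[j] >= thr else -pol for row in X]
                    err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, preds)
        err, j, thr, pol, preds = best
        err = min(max(err, 1e-12), 1 - 1e-12)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if row[j] >= thr else -pol) for a, j, thr, pol in model)
    return 1 if score >= 0 else -1


X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]  # no single threshold separates these labels
model = train_adaboost(X, y, rounds=3)
print([predict(model, row) for row in X])
```

The example data cannot be classified by any single stump, which is the situation boosting is meant for: each round reweights the examples so the next stump concentrates on the ones the previous stumps got wrong.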

Is there a way to plan and diagram an architecture for dynamic scripting languages like groovy or python?

Say I want to write a large application in groovy, and take advantage of closures, categories and other concepts (that I regularly use to separate concerns). Is there a way to diagram or otherwise communicate in a simple way the architecture of some of this stuff? How do you detail (without verbose documentation) the things that a map of closures might do, for example? I understand that dynamic language features aren't usually recommended on a larger scale because they are seen as complex but does that have to be the case?
UML isn't too well equipped to handle such things, but you can still use it to communicate your design if you are willing to do some mental mapping. You can find an isomorphism between most dynamic concepts and UML's static object model.
For example you can think of a closure as an object implementing a one method interface. It's probably useful to model such interfaces as something a bit more specific than interface Callable { call(args[0..*]: Object) : Object }.
Duck typing can similarly be thought of as an interface. If you have a method that takes something that can quack, model it as taking an object that is a specialization of the interface interface Quackable { quack() }.
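In Python, the two mappings above can be written down directly with type hints, which may help when deciding what to draw. This is only a sketch: `Quackable` follows the answer's example, while `Duck`, `Robot`, `make_it_quack`, `Greeter`, and `make_greeter` are names invented for the illustration.

```python
from typing import Callable, Protocol


class Quackable(Protocol):
    """The interface a diagram would show for 'anything that can quack'."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack!"


class Robot:
    # No inheritance from Quackable: having the method is enough.
    def quack(self) -> str:
        return "Beep-quack"


def make_it_quack(thing: Quackable) -> str:
    return thing.quack()


# A closure modeled as a one-method interface: "takes a name, returns text".
Greeter = Callable[[str], str]


def make_greeter(greeting: str) -> Greeter:
    def greet(name: str) -> str:
        return f"{greeting}, {name}!"
    return greet


print(make_it_quack(Duck()), make_it_quack(Robot()))
print(make_greeter("Hello")("Ada"))
```

On a diagram, `Quackable` and `Greeter` would appear as interfaces, with `Duck`, `Robot`, and the closure returned by `make_greeter` shown as their realizations.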
You can use your imagination for other concepts. Keep in mind that the purpose of design diagrams is to communicate ideas, so don't get overly pedantic about modeling everything 100%. Think about what you want your diagrams to say, make sure that they say it, and eliminate any extraneous detail that would dilute the message. And if you use some concepts that aren't obvious to your target audience, explain them.
Also, if UML really can't handle what you want to say, try other ways to visualize your message. UML is only a good choice because it gives you a common vocabulary so you don't have to explain every concept on your diagram.
If you don't want to generate verbose documentation, a picture is worth a thousand words. I've found tools like FreeMind useful, both for clarifying my ideas and for communicating them to others. And if you are willing to invest in a medium (or at least higher) level of documentation, I would recommend Sphinx. It is pretty easy to use, and although it's oriented towards documentation of Python modules, it can generate completely generic documentation which looks professional and easy on the eye. Your documentation can contain diagrams such as are created using Graphviz.
