Suggestion about Minhash implementation with n permutation

I'm trying to understand how LSH is implemented. I found this question on Stack Overflow:
Can you suggest a good minhash implementation?
and I tried to follow Duhaime's implementation.
In my case, I want to apply permutations to the MinHash (as the datasketch tool does), and I don't think this implementation suits my needs.
I am already starting from a sparse matrix.
Can anyone offer suggestions about this technique? It isn't very widespread, so I haven't found much material on implementing it in Python.
I hope you can help.

Don't just look for example code. Try to understand the math behind it.
Obviously, maxhash should work similarly. Or you could omit 0 values, but then you should double-check the math.
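To make the math concrete, here is a minimal sketch of MinHash with n "permutations", where each permutation is simulated by a random linear hash h(x) = (a*x + b) mod p. This is only an illustration under stated assumptions, not datasketch's or Duhaime's code: the names make_permutations, minhash_signature and estimate_jaccard are hypothetical, items are assumed to be non-negative integer ids smaller than the prime p, and every set is assumed to be non-empty.

```python
import numpy as np

# Assumption: item ids are integers in [0, PRIME). 2**31 - 1 is prime,
# and a * x + b then stays below 2**63, so int64 arithmetic cannot overflow.
PRIME = 2_147_483_647


def make_permutations(num_perm, seed=0):
    """Draw the (a, b) coefficients of num_perm random linear hashes."""
    rng = np.random.default_rng(seed)
    a = rng.integers(1, PRIME, size=num_perm, dtype=np.int64)  # a != 0
    b = rng.integers(0, PRIME, size=num_perm, dtype=np.int64)
    return a, b


def minhash_signature(item_ids, a, b):
    """Return the minimum hash value of a non-empty set under each hash."""
    x = np.fromiter(item_ids, dtype=np.int64)
    # Shape (num_perm, number of items): every hash applied to every item.
    hashed = (a[:, None] * x[None, :] + b[:, None]) % PRIME
    return hashed.min(axis=1)


def estimate_jaccard(sig1, sig2):
    """Fraction of hashes whose minima agree; estimates Jaccard similarity."""
    return float(np.mean(sig1 == sig2))


a, b = make_permutations(num_perm=64, seed=1)
s1 = minhash_signature({1, 5, 9, 42}, a, b)
s2 = minhash_signature({42, 9, 5, 1}, a, b)
s3 = minhash_signature({2, 6, 10}, a, b)
print(estimate_jaccard(s1, s2), estimate_jaccard(s1, s3))
```

If the data is a SciPy CSR matrix m, the item ids of row i would be the column indices m.indices[m.indptr[i]:m.indptr[i + 1]], so zero entries are skipped automatically.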

Related

Python how to do ludecomposition and inverse

I'm trying to do a matrix operation. The original logic we have in Java looks like this:
inverseMatrix = LUDecomposition(matrix).getSolver().getInverse()
The LUDecomposition used here is: https://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/linear/LUDecomposition.html#getSolver()
I'm looking for a way to implement this in Python. Does anyone know a good way to do so?
Sorry, I'm not very familiar with matrix operations. Thanks a lot!
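One possible Python counterpart, offered as a sketch rather than an exact port of the Commons Math behaviour, is to factor the matrix with scipy.linalg.lu_factor and then solve against the identity with scipy.linalg.lu_solve. The example matrix is made up for illustration, and the input is assumed to be square and non-singular.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Hypothetical example input; any square, non-singular matrix works.
matrix = np.array([[4.0, 3.0],
                   [6.0, 3.0]])

# LU factorisation with partial pivoting.
lu, piv = lu_factor(matrix)

# Solving A X = I column by column yields the inverse of A.
inverse = lu_solve((lu, piv), np.eye(matrix.shape[0]))

print(inverse)
```

For a one-off inverse, np.linalg.inv(matrix) gives the same result. Keeping the factorisation around is mainly useful when several right-hand sides have to be solved against the same matrix.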

howto find e^matrix in c

How can I calculate exp([[1,2,3]]), as NumPy does in Python?
Is there something for this in TensorFlow or any other library?
print(np.exp(np.array([[1, 2, 3, 4], [2, 3, 4, 66]])))
I need a C++ solution.
Actual result (math):
https://www.symbolab.com/solver/matrix-exponential-calculator/e%5E%7B%5Cbegin%7Bpmatrix%7D1%260%260%5C%5C0%261%260%5C%5C0%260%261%5Cend%7Bpmatrix%7D%7D?or=ex

Possible duplicate; see Complex matrix exponential in C++.
Careful here: the matrix exponential is actually very difficult and a current topic of research for the general case! See https://en.wikipedia.org/wiki/Matrix_exponential. What Tortellini is describing is an element-wise exponentiation, which is not the same thing.
The numpy library computes the element-wise exponential; the math you linked seems to be the matrix exponential. For a diagonal matrix the two agree on the diagonal entries only: the off-diagonal zeros become exp(0) = 1 element-wise but stay 0 in the matrix exponential.
The two don't correspond in general (see for example: https://www.wolframalpha.com/input/?i=exp%28%7B%7B1%2C0%2C1%7D%2C%7B0%2C1%2C0%7D%2C%7B0%2C0%2C1%7D%7D%29), so please check which one you need.
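The difference is easy to see in Python, which can serve as a reference when checking a C++ implementation. This sketch uses the matrix from the Wolfram Alpha link and assumes SciPy is available: np.exp works element by element, while scipy.linalg.expm computes the matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

a = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Element-wise: every entry is exponentiated on its own, so zeros become 1.
elementwise = np.exp(a)

# Matrix exponential: sum over k of A**k / k!, using matrix products.
matrix_exp = expm(a)

print(elementwise)
print(matrix_exp)
```

Here a = I + N with N*N = 0, so the matrix exponential is e * (I + N) = e * a, whereas the element-wise result has 1 wherever a has 0.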

Computing log of integral in terms of log of integrand

This question may be half computational math, half programming.
I'm trying to estimate log[\int_0^\infty\int_0^\infty f(x,y)dxdy] [actually thousands of such integrals] in Python. The function f(x,y) involves some very large/very small numbers that are bound to cause overflow/underflow errors; so I'd really prefer to work with log[f(x,y)] instead of f(x,y).
Thus my question has two parts:
1) Is there a way to estimate log[\int_0^\infty\int_0^\infty f(x,y)dxdy] using the log of the function instead of the function itself?
2) Is there an implementation of this in Python?
Thanks

I would be surprised if the math and/or numpy libraries or perhaps some more specific third party libraries would not be able to solve a problem like this. Here are some of their log functions:
math.log(x[, base]), math.log1p(x), math.log2(x), math.log10(x) (https://docs.python.org/3.3/library/math.html)
numpy.log, numpy.log10, numpy.log2, numpy.log1p, numpy.logaddexp, numpy.logaddexp2 (https://numpy.org/doc/stable/reference/routines.math.html#exponents-and-logarithms)
Generally, just search for "logarithm python library" and look for similar Stack Overflow problems; that will help you find the right libraries and functions to try. Once you have done that, you can follow this guide so that someone can help you get from input to expected output: How to make good reproducible pandas examples
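One way to stay in log space, sketched here under assumptions rather than as a definitive method, is to replace the integral with a quadrature sum and evaluate that sum with the log-sum-exp trick (the same idea as numpy.logaddexp applied to many terms). The function name log_integral_2d is hypothetical, the infinite domain is assumed to be truncated to a finite grid where the integrand has become negligible, and plain trapezoid weights are used; log_f is assumed to be finite somewhere on the grid.

```python
import numpy as np


def log_integral_2d(log_f, x, y):
    """Approximate log of the double integral of f over the grid x by y.

    Only log_f is ever evaluated, so f itself may overflow or underflow.
    """
    def log_trapezoid_weights(t):
        # Trapezoid rule weights for a (possibly non-uniform) 1-D grid.
        w = np.empty_like(t, dtype=float)
        w[1:-1] = (t[2:] - t[:-2]) / 2.0
        w[0] = (t[1] - t[0]) / 2.0
        w[-1] = (t[-1] - t[-2]) / 2.0
        return np.log(w)

    X, Y = np.meshgrid(x, y, indexing="ij")
    # log of each quadrature term: log f(x_i, y_j) + log w_i + log w_j
    terms = (log_f(X, Y)
             + log_trapezoid_weights(x)[:, None]
             + log_trapezoid_weights(y)[None, :])
    # Log-sum-exp: factor out the largest term before exponentiating.
    m = terms.max()
    return m + np.log(np.exp(terms - m).sum())


# Example: f(x, y) = exp(1000 - x - y), whose exact log-integral is 1000,
# although exp(1000) itself overflows a float.
grid = np.linspace(0.0, 50.0, 2001)
result = log_integral_2d(lambda X, Y: 1000.0 - X - Y, grid, grid)
print(result)
```

The grid limits and resolution have to be chosen per problem; a more accurate quadrature rule can be swapped in by changing only the weights.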

How can I compute the inverse cross product of a vector in numpy?

How can I perform the inverse cross product in numpy?
That is, given two numpy arrays b and c, how can I find a such that
a.cross(b) == c
EDIT: Could whoever downvoted please let me know what they didn't like, so that I can learn from their opinion? I asked the question because I didn't easily find an answer anywhere. It turns out the question is mathematically ill-defined (as people pointed out), but from now on anyone who looks it up will find this answer and learn that quickly and easily.

A solution exists only if b and c are orthogonal, and the solution is not unique.
Then, a = np.cross(b,c)/np.dot(b,b)+t*b is a solution for all t.
See this question on Math SE:
https://math.stackexchange.com/questions/32600/whats-the-opposite-of-a-cross-product
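The formula can be checked numerically. The helper below is a small sketch (the name inverse_cross is hypothetical) that returns one solution for a chosen t and rejects inputs for which no solution exists, i.e. when b is zero or b and c are not orthogonal.

```python
import numpy as np


def inverse_cross(b, c, t=0.0):
    """Return one vector a with np.cross(a, b) equal to c.

    Every choice of t gives another valid solution, since cross(b, b) = 0.
    """
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    if not np.any(b):
        raise ValueError("b must be non-zero")
    if not np.isclose(np.dot(b, c), 0.0):
        raise ValueError("no solution: b and c are not orthogonal")
    return np.cross(b, c) / np.dot(b, b) + t * b


b = np.array([1.0, 2.0, 3.0])
c = np.cross([0.5, -1.0, 2.0], b)  # orthogonal to b by construction
a = inverse_cross(b, c)
print(a, np.cross(a, b))
```

Note that a is generally not the vector that produced c; it differs from it by a multiple of b.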

There is no function native to numpy that will arrive at the solution you're looking for. You may have better luck asking the question here.
There seems to be a problem with the question as well. From what I know of linear algebra, solving for 'a' wouldn't yield a unique solution unless certain conditions are met.
See this answer over at the math stack exchange for more information.

How is string.find implemented in CPython?

I was wondering if the 'find' method on strings was implemented with a linear search, or if python did something more sophisticated. The Python documentation doesn't discuss implementation details, so http://docs.python.org/library/stdtypes.html is of no help. Could someone please point me to the relevant source code?

The comment on the implementation has the following to say:
fast search/count implementation,
based on a mix between boyer-moore
and horspool, with a few more bells
and whistles on the top.
for some more background, see: http://effbot.org/zone/stringlib.htm
—https://github.com/python/cpython/blob/master/Objects/stringlib/fastsearch.h#L5

You should be able to find it in Objects/stringlib/find.h, although the real code is in fastsearch.h.

It looks like the algorithm used originates from the Boyer-Moore-Horspool algorithm.
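For readers who want to see the underlying idea, here is a plain Boyer-Moore-Horspool search in Python. It is a textbook sketch only, not CPython's actual code (fastsearch.h adds further refinements, as its comment says), and the function name horspool_find is hypothetical.

```python
def horspool_find(haystack: str, needle: str) -> int:
    """Return the lowest index of needle in haystack, or -1 if absent."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Bad-character table: for each character of the needle except the last,
    # the distance from its rightmost occurrence to the end of the needle.
    shift = {ch: m - 1 - i for i, ch in enumerate(needle[:-1])}

    pos = 0
    while pos <= n - m:
        if haystack[pos:pos + m] == needle:
            return pos
        # Skip ahead based on the haystack character aligned with the
        # needle's last position; unknown characters allow a full jump of m.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1


print(horspool_find("hello world", "world"))
```

The skip table is what lets the search jump over several positions at once instead of advancing one character at a time as a naive linear scan would.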
