Erroneous SciPy 1.7 source build - python

When installing SciPy 1.7.1 from source on Linux using
python setup.py build
python setup.py install
(along with environment and site.cfg hacking as needed) I end up with a broken build. My particular build recipe works for SciPy <= 1.6.
Once SciPy 1.7.1 is built, importing e.g. scipy.optimize or scipy.special results in errors such as
AttributeError: module 'scipy.special._ufuncs_cxx' has no attribute 'pyx_capi'
ImportError: cannot import name 'levinson' from 'scipy.linalg._solve_toeplitz'
ImportError: cannot import name 'csgraph_to_dense' from 'scipy.sparse.csgraph._tools'
What has changed, and how do I solve this?

Looking at the site-packages directory I see that SciPy 1.7 installs itself as a zipped Python egg, whereas previous versions used to install as directories (though still Python eggs). This behaviour can be chosen by specifying the zip_safe argument to setuptools.setup(), called from within setup.py. In SciPy 1.7 this is called as
setup(**metadata)
with metadata not including 'zip_safe', meaning that whether zipped eggs are safe to use is determined automatically. This might also be the case for older SciPy versions, but for whatever reason the process ends up declaring zipped eggs safe for 1.7 on my system, which does not seem to be correct.
Manually adding
metadata['zip_safe'] = False
above setup(**metadata) prior to executing setup.py results in the egg being a directory (as opposed to a zipped archive), and the build works.
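To confirm which form actually ended up in site-packages, a check along these lines may help. It is only a sketch: egg_layout is a hypothetical helper name (not part of setuptools), and the path you pass in is whatever .egg entry you find in your own site-packages.

```python
import os
import zipfile


def egg_layout(path):
    """Classify an .egg path as 'directory', 'zipped', or 'unknown'.

    egg_layout is a hypothetical helper, not part of setuptools.
    """
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path) and zipfile.is_zipfile(path):
        return "zipped"
    return "unknown"
```

A result of "zipped" for the SciPy egg would match the broken case described above, and "directory" the working one.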
To do the patching of setup.py programmatically, use e.g. (GNU sed)
sed -i "s/\(^ *\)\(setup *(.*\)$/\1metadata['zip_safe'] = False; \2/" setup.py
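If GNU sed is not available, roughly the same substitution can be done in Python. This is a sketch that mirrors the sed pattern above; force_unzipped_egg and SETUP_CALL are names made up for illustration, and the sample string only stands in for the real setup.py contents.

```python
import re

# Same pattern as the sed expression: optional leading spaces, then a setup(...) call.
SETUP_CALL = re.compile(r"^( *)(setup *\(.*)$", re.MULTILINE)


def force_unzipped_egg(source):
    """Prefix each line that starts with setup( with the zip_safe assignment."""
    return SETUP_CALL.sub(r"\1metadata['zip_safe'] = False; \2", source)


# Stand-in for the text of setup.py; the real file would be read from disk.
sample = "def setup_package():\n    setup(**metadata)\n"
print(force_unzipped_egg(sample))
```

To patch the real file you would read setup.py, pass its text through the function, and write the result back.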

Related

cython setuptools change output filename

I am using Cython to cross-compile an external Python module. I am using python3.6 on the host and python3.5 on the target. Also, I am compiling on x86_64 for an aarch64 target.
My setup.py looks like:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
from Cython.Distutils import build_ext
import builder_config
import os
os.environ["PATH"] = builder_config.PATH
os.environ["CC"] = builder_config.COMPILER
os.environ["LDSHARED"] = builder_config.COMPILER + " -lpython3.5m -shared"
os.environ["CFLAGS"] = builder_config.CFLAGS
os.environ["LDFLAGS"] = builder_config.LDFLAGS
os.environ["ARCH"] = "aarch64"
setup(
ext_modules = cythonize((Extension("my_ext", ["file1.pyx", "file2.pyx", "file3.pyx", "file4.pyx", "file5.pyx"]))),
)
When I run python3.6 setup.py build_ext -i I get a file named: my_ext.cpython-36m-x86_64-linux-gnu.so
My problem is that on the target the library will not be loaded unless the name is changed to:
my_ext.cpython-35m-aarch64-linux-gnu.so
How can I change the generated filename?
As stated in the comments, what you are trying to achieve is unsafe.
You can work around the architecture tag with the environment variable _PYTHON_HOST_PLATFORM (e.g. you can change it in your sitecustomize.py). But, if the modules are actually incompatible (and they most likely are), you will only get core dumps later on.
I don't think you can work around the major Python version.
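To see why the target refuses the file, it can help to print which extension-module suffixes each interpreter uses. The sketch below relies only on the standard library (sysconfig and importlib.machinery); extension_suffix_report is a hypothetical helper name. Run it with the host python3.6 and again with the target python3.5 and compare.

```python
import sysconfig
from importlib.machinery import EXTENSION_SUFFIXES


def extension_suffix_report():
    """Return the suffix this interpreter builds with and the ones it will import."""
    return {
        "build_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        "importable_suffixes": list(EXTENSION_SUFFIXES),
    }


print(extension_suffix_report())
```

A compiled module is only picked up if its file name ends in one of the importable suffixes of the interpreter doing the import, which is why the name produced on the host does not load on the target.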
To get back on safer ground, I would try to rely on portable solutions. For example, it doesn't look official, but there are some articles on the web about Conda and aarch64 (e.g. you can look for 'Archiconda'). Again, you wouldn't be able to simply copy the conda environments from one machine to another, but you can freeze these environments (via 'conda env export') and build similar ones on the target machine.
An option is to upgrade the target interpreter to v3.6 if that's possible for you.
Another option is to install v3.5 on the machine you're using to build with that interpreter. It's pretty uncomplicated to get several different versions of the python interpreter installed on the same machine. I don't know your specifics so I can't provide any links but I'm sure a quick search will get you what you need.

In a setup.py involving Cython, if install_requires, then how can from library import something?

This doesn't make sense to me. How can I use the setup.py to install Cython and then also use the setup.py to compile a library proxy?
import sys, imp, os, glob
from setuptools import setup
from Cython.Build import cythonize # this isn't installed yet
setup(
    name='mylib',
    version='1.0',
    package_dir={'mylib': 'mylib', 'mylib.tests': 'tests'},
    packages=['mylib', 'mylib.tests'],
    ext_modules = cythonize("mylib_proxy.pyx"), #how can we call cythonize here?
    install_requires=['cython'],
    test_suite='tests',
)
Later:
python setup.py build
Traceback (most recent call last):
File "setup.py", line 3, in <module>
from Cython.Build import cythonize
ImportError: No module named Cython.Build
It's because cython isn't installed yet.
What's odd is that a great many projects are written this way. A quick github search reveals as much: https://github.com/search?utf8=%E2%9C%93&q=install_requires+cython&type=Code
As I understand it, this is where PEP 518 comes in - also see some clarifications by one of its authors.
The idea is that you add yet another file to your Python project / package: pyproject.toml. It is supposed to contain information on build environment dependencies (among other stuff, long term). pip (or just any other package manager) could look into this file and before running setup.py (or any other build script) install the required build environment. A pyproject.toml could therefore look like this:
[build-system]
requires = ["setuptools", "wheel", "Cython"]
It is a fairly recent development and, as of yet (January 2019), it is not finalized / approved by the Python community, though (limited) support was added to pip in May 2017 / the 10.0 release.
One solution to this is to not make Cython a build requirement, and instead distribute the Cython generated C files with your package. I'm sure there is a simpler example somewhere, but this is what pandas does - it conditionally imports Cython, and if not present can be built from the c files.
https://github.com/pandas-dev/pandas/blob/3ff845b4e81d4dde403c29908f5a9bbfe4a87788/setup.py#L433
Edit: The doc link from @danny has an easier to follow example.
http://docs.cython.org/en/latest/src/reference/compilation.html#distributing-cython-modules
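The core of that approach can be sketched as follows. This is not the pandas code, just a minimal illustration: cython_available and extension_sources are hypothetical helper names, and it assumes the generated mylib_proxy.c is shipped next to mylib_proxy.pyx in the source distribution.

```python
def cython_available():
    """Report whether Cython can be imported in the current environment."""
    try:
        import Cython  # noqa: F401
    except ImportError:
        return False
    return True


def extension_sources(stem, use_cython):
    """Pick the .pyx file when Cython is usable, else the pre-generated .c file."""
    return [stem + (".pyx" if use_cython else ".c")]


# In a setup.py this could feed setuptools.Extension, for example:
#   sources = extension_sources("mylib_proxy", cython_available())
#   ext_modules = [Extension("mylib_proxy", sources)]
print(extension_sources("mylib_proxy", cython_available()))
```

With this split, users without Cython compile the shipped C file, while developers with Cython installed rebuild from the .pyx source.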
When you use setuptools, you should add cython to setup_requires (and also to install_requires if cython is needed by the installed package), i.e.
# don't import cython, it isn't yet there
from setuptools import setup, Extension
# use Extension, rather than cythonize (it is not yet available)
cy_extension = Extension(name="mylib_proxy", sources=["mylib_proxy.pyx"])
setup(
    name='mylib',
    ...
    ext_modules = [cy_extension],
    setup_requires=["cython"],
    ...
)
Cython isn't imported (it is not yet available when setup.py starts), but setuptools.Extension is used instead of cythonize to add the Cython extension to the setup.
It should work now. The reason: setuptools will try to import cython, after setup_requires are fulfilled:
...
try:
    # Attempt to use Cython for building extensions, if available
    from Cython.Distutils.build_ext import build_ext as _build_ext
    # Additionally, assert that the compiler module will load
    # also. Ref #1229.
    __import__('Cython.Compiler.Main')
except ImportError:
    _build_ext = _du_build_ext
...
It becomes more complicated if your Cython extension uses numpy, but this is also possible - see this SO post.
It doesn't make sense in general. It is, as you suspect, an attempt to use something that (possibly) has yet to be installed. If tested on a system that already has the dependency installed, you might not notice this defect. But run it on a system where your dependency is absent, and you will certainly notice.
There is another setup() keyword argument, setup_requires, that can appear to be parallel in form and use to install_requires, but this is an illusion. Whereas install_requires triggers a lovely ballet of automatic installation in environments that lack the dependencies it names, setup_requires is more documentation than automation. It won't auto-install, and certainly not magically jump back in time to auto-install modules that have already been called for in import statements.
There's more on this at the setuptools docs, but the quick answer is that you're right to be confused by a module that is trying to auto-install its own setup pre-requisites.
For a practical workaround, try installing cython separately, and then run this setup. While it won't fix the metaphysical illusions of this setup script, it will resolve the requirements and let you move on.
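If you go that route, a small pre-flight check can at least turn the bare ImportError into a clearer message. This is a sketch for Python 3 (importlib.util.find_spec is not available on Python 2); missing_modules is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "Cython" is the build-time dependency from the question.
missing = missing_modules(["Cython"])
if missing:
    print("Install these before running setup.py:", ", ".join(missing))
```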

Can't install datasets package via pip

I'm trying to run a script that requires the datasets python package. I've tried installing this unsuccessfully using pip by calling:
pip install datasets
I know this hasn't worked because when I run the script I get the message:
Traceback (most recent call last):
File "lda.py", line 2, in <module>
import lda
File "/Users/deepthought/lda.py", line 3, in <module>
import datasets
ImportError: No module named datasets
I've installed python via homebrew.
When I run pip install datasets I get the error:
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/ch/84cpkwc52zx0rsh4k5v4_7h40000gn/T/pip-build-gZWyT3/datasets/
I'm fairly new to scripting python or going under the hood of an OS X, so there's a risk I've missed something elementary.
I've been researching & trying to overcome this for about a week now including looking at similar questions on stackoverflow.com and haven't gotten past this stage for the duration. One of the tutorials I was working through told me to edit ~/.profile
This has been left like so:
# The orginal version is saved in .profile.pysave
#PATH="/Library/Frameworks/Python.framework/Versions/3.5/bin:${PATH}"
#export PATH
export PATH=/usr/local/bin:/usr/local/sbin:$PATH
/etc/paths contains:
/usr/local/bin
/usr/bin
/bin
/usr/sbin
/sbin
I'm running OS X El Capitan - 10.11.5 (15F34)
Python 2.7.11
Brew doctor flagged multiple items, but I've no idea whether it is worth fixing none/all of them:
Warning: Your XQuartz (2.7.7) is outdated
Please install XQuartz 2.7.9:
https://xquartz.macosforge.org
Warning: Python is installed at /Library/Frameworks/Python.framework
Homebrew only supports building against the System-provided Python or a
brewed Python. In particular, Pythons installed to /Library can interfere
with other software installs.
Warning: Unbrewed dylibs were found in /usr/local/lib.
If you didn't put them there on purpose they could cause problems when
building Homebrew formulae, and may need to be deleted.
Unexpected dylibs:
/usr/local/lib/libtcl8.6.dylib
/usr/local/lib/libtk8.6.dylib
Warning: Unbrewed header files were found in /usr/local/include.
If you didn't put them there on purpose they could cause problems when
building Homebrew formulae, and may need to be deleted.
Unexpected header files:
/usr/local/include/fakemysql.h
/usr/local/include/fakepq.h
/usr/local/include/fakesql.h
/usr/local/include/itcl.h
/usr/local/include/itcl2TclOO.h
/usr/local/include/itclDecls.h
/usr/local/include/itclInt.h
/usr/local/include/itclIntDecls.h
/usr/local/include/itclMigrate2TclCore.h
/usr/local/include/itclTclIntStubsFcn.h
/usr/local/include/mysqlStubs.h
/usr/local/include/node/ares.h
/usr/local/include/node/ares_version.h
/usr/local/include/node/nameser.h
/usr/local/include/node/node.h
/usr/local/include/node/node_buffer.h
/usr/local/include/node/node_internals.h
/usr/local/include/node/node_object_wrap.h
/usr/local/include/node/node_version.h
/usr/local/include/node/openssl/opensslconf.h
/usr/local/include/node/uv-private/ngx-queue.h
/usr/local/include/node/uv-private/stdint-msvc2008.h
/usr/local/include/node/uv-private/tree.h
/usr/local/include/node/uv-private/uv-bsd.h
/usr/local/include/node/uv-private/uv-darwin.h
/usr/local/include/node/uv-private/uv-linux.h
/usr/local/include/node/uv-private/uv-sunos.h
/usr/local/include/node/uv-private/uv-unix.h
/usr/local/include/node/uv-private/uv-win.h
/usr/local/include/node/uv.h
/usr/local/include/node/v8-debug.h
/usr/local/include/node/v8-preparser.h
/usr/local/include/node/v8-profiler.h
/usr/local/include/node/v8-testing.h
/usr/local/include/node/v8.h
/usr/local/include/node/v8stdint.h
/usr/local/include/node/zconf.h
/usr/local/include/node/zlib.h
/usr/local/include/odbcStubs.h
/usr/local/include/pqStubs.h
/usr/local/include/tcl.h
/usr/local/include/tclDecls.h
/usr/local/include/tclOO.h
/usr/local/include/tclOODecls.h
/usr/local/include/tclPlatDecls.h
/usr/local/include/tclThread.h
/usr/local/include/tclTomMath.h
/usr/local/include/tclTomMathDecls.h
/usr/local/include/tdbc.h
/usr/local/include/tdbcDecls.h
/usr/local/include/tdbcInt.h
/usr/local/include/tk.h
/usr/local/include/tkDecls.h
/usr/local/include/tkPlatDecls.h
Warning: Unbrewed .pc files were found in /usr/local/lib/pkgconfig.
If you didn't put them there on purpose they could cause problems when
building Homebrew formulae, and may need to be deleted.
Unexpected .pc files:
/usr/local/lib/pkgconfig/tcl.pc
/usr/local/lib/pkgconfig/tk.pc
Warning: Unbrewed static libraries were found in /usr/local/lib.
If you didn't put them there on purpose they could cause problems when
building Homebrew formulae, and may need to be deleted.
Unexpected static libraries:
/usr/local/lib/libtclstub8.6.a
/usr/local/lib/libtkstub8.6.a
Warning: You have unlinked kegs in your Cellar
Leaving kegs unlinked can lead to build-trouble and cause brews that depend on
those kegs to fail to run properly once built. Run `brew link` on these:
git
python3
Warning: Broken symlinks were found. Remove them with `brew prune`:
/usr/local/bin/github
/usr/local/lib/perl5/site_perl/Git/I18N.pm
/usr/local/lib/perl5/site_perl/Git/IndexInfo.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Editor.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Fetcher.pm
/usr/local/lib/perl5/site_perl/Git/SVN/GlobSpec.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Log.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Memoize/YAML.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Migration.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Prompt.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Ra.pm
/usr/local/lib/perl5/site_perl/Git/SVN/Utils.pm
/usr/local/lib/perl5/site_perl/Git/SVN.pm
/usr/local/lib/perl5/site_perl/Git.pm
/usr/local/share/git-core/templates/description
/usr/local/share/git-core/templates/hooks/applypatch-msg.sample
/usr/local/share/git-core/templates/hooks/commit-msg.sample
/usr/local/share/git-core/templates/hooks/post-update.sample
/usr/local/share/git-core/templates/hooks/pre-applypatch.sample
/usr/local/share/git-core/templates/hooks/pre-commit.sample
/usr/local/share/git-core/templates/hooks/pre-push.sample
/usr/local/share/git-core/templates/hooks/pre-rebase.sample
/usr/local/share/git-core/templates/hooks/prepare-commit-msg.sample
/usr/local/share/git-core/templates/hooks/update.sample
/usr/local/share/git-core/templates/info/exclude
/usr/local/share/man/man1/git-add.1
/usr/local/share/man/man1/git-am.1
/usr/local/share/man/man1/git-annotate.1
/usr/local/share/man/man1/git-apply.1
/usr/local/share/man/man1/git-archimport.1
/usr/local/share/man/man1/git-archive.1
/usr/local/share/man/man1/git-bisect.1
/usr/local/share/man/man1/git-blame.1
/usr/local/share/man/man1/git-branch.1
/usr/local/share/man/man1/git-bundle.1
/usr/local/share/man/man1/git-cat-file.1
/usr/local/share/man/man1/git-check-attr.1
/usr/local/share/man/man1/git-check-ignore.1
/usr/local/share/man/man1/git-check-mailmap.1
/usr/local/share/man/man1/git-check-ref-format.1
/usr/local/share/man/man1/git-checkout-index.1
/usr/local/share/man/man1/git-checkout.1
/usr/local/share/man/man1/git-cherry-pick.1
/usr/local/share/man/man1/git-cherry.1
/usr/local/share/man/man1/git-citool.1
/usr/local/share/man/man1/git-clean.1
/usr/local/share/man/man1/git-clone.1
/usr/local/share/man/man1/git-column.1
/usr/local/share/man/man1/git-commit-tree.1
/usr/local/share/man/man1/git-commit.1
/usr/local/share/man/man1/git-config.1
/usr/local/share/man/man1/git-count-objects.1
/usr/local/share/man/man1/git-credential-cache--daemon.1
/usr/local/share/man/man1/git-credential-cache.1
/usr/local/share/man/man1/git-credential-store.1
/usr/local/share/man/man1/git-credential.1
/usr/local/share/man/man1/git-cvsexportcommit.1
/usr/local/share/man/man1/git-cvsimport.1
/usr/local/share/man/man1/git-cvsserver.1
/usr/local/share/man/man1/git-daemon.1
/usr/local/share/man/man1/git-describe.1
/usr/local/share/man/man1/git-diff-files.1
/usr/local/share/man/man1/git-diff-index.1
/usr/local/share/man/man1/git-diff-tree.1
/usr/local/share/man/man1/git-diff.1
/usr/local/share/man/man1/git-difftool.1
/usr/local/share/man/man1/git-fast-export.1
/usr/local/share/man/man1/git-fast-import.1
/usr/local/share/man/man1/git-fetch-pack.1
/usr/local/share/man/man1/git-fetch.1
/usr/local/share/man/man1/git-filter-branch.1
/usr/local/share/man/man1/git-fmt-merge-msg.1
/usr/local/share/man/man1/git-for-each-ref.1
/usr/local/share/man/man1/git-format-patch.1
/usr/local/share/man/man1/git-fsck-objects.1
/usr/local/share/man/man1/git-fsck.1
/usr/local/share/man/man1/git-gc.1
/usr/local/share/man/man1/git-get-tar-commit-id.1
/usr/local/share/man/man1/git-grep.1
/usr/local/share/man/man1/git-gui.1
/usr/local/share/man/man1/git-hash-object.1
/usr/local/share/man/man1/git-help.1
/usr/local/share/man/man1/git-http-backend.1
/usr/local/share/man/man1/git-http-fetch.1
/usr/local/share/man/man1/git-http-push.1
/usr/local/share/man/man1/git-imap-send.1
/usr/local/share/man/man1/git-index-pack.1
/usr/local/share/man/man1/git-init-db.1
/usr/local/share/man/man1/git-init.1
/usr/local/share/man/man1/git-instaweb.1
/usr/local/share/man/man1/git-log.1
/usr/local/share/man/man1/git-lost-found.1
/usr/local/share/man/man1/git-ls-files.1
/usr/local/share/man/man1/git-ls-remote.1
/usr/local/share/man/man1/git-ls-tree.1
/usr/local/share/man/man1/git-mailinfo.1
/usr/local/share/man/man1/git-mailsplit.1
/usr/local/share/man/man1/git-merge-base.1
/usr/local/share/man/man1/git-merge-file.1
/usr/local/share/man/man1/git-merge-index.1
/usr/local/share/man/man1/git-merge-one-file.1
/usr/local/share/man/man1/git-merge-tree.1
/usr/local/share/man/man1/git-merge.1
/usr/local/share/man/man1/git-mergetool--lib.1
/usr/local/share/man/man1/git-mergetool.1
/usr/local/share/man/man1/git-mktag.1
/usr/local/share/man/man1/git-mktree.1
/usr/local/share/man/man1/git-mv.1
/usr/local/share/man/man1/git-name-rev.1
/usr/local/share/man/man1/git-notes.1
/usr/local/share/man/man1/git-p4.1
/usr/local/share/man/man1/git-pack-objects.1
/usr/local/share/man/man1/git-pack-redundant.1
/usr/local/share/man/man1/git-pack-refs.1
/usr/local/share/man/man1/git-parse-remote.1
/usr/local/share/man/man1/git-patch-id.1
/usr/local/share/man/man1/git-peek-remote.1
/usr/local/share/man/man1/git-prune-packed.1
/usr/local/share/man/man1/git-prune.1
/usr/local/share/man/man1/git-pull.1
/usr/local/share/man/man1/git-push.1
/usr/local/share/man/man1/git-quiltimport.1
/usr/local/share/man/man1/git-read-tree.1
/usr/local/share/man/man1/git-rebase.1
/usr/local/share/man/man1/git-receive-pack.1
/usr/local/share/man/man1/git-reflog.1
/usr/local/share/man/man1/git-relink.1
/usr/local/share/man/man1/git-remote-ext.1
/usr/local/share/man/man1/git-remote-fd.1
/usr/local/share/man/man1/git-remote-testgit.1
/usr/local/share/man/man1/git-remote.1
/usr/local/share/man/man1/git-repack.1
/usr/local/share/man/man1/git-replace.1
/usr/local/share/man/man1/git-repo-config.1
/usr/local/share/man/man1/git-request-pull.1
/usr/local/share/man/man1/git-rerere.1
/usr/local/share/man/man1/git-reset.1
/usr/local/share/man/man1/git-rev-list.1
/usr/local/share/man/man1/git-rev-parse.1
/usr/local/share/man/man1/git-revert.1
/usr/local/share/man/man1/git-rm.1
/usr/local/share/man/man1/git-send-email.1
/usr/local/share/man/man1/git-send-pack.1
/usr/local/share/man/man1/git-sh-i18n--envsubst.1
/usr/local/share/man/man1/git-sh-i18n.1
/usr/local/share/man/man1/git-sh-setup.1
/usr/local/share/man/man1/git-shell.1
/usr/local/share/man/man1/git-shortlog.1
/usr/local/share/man/man1/git-show-branch.1
/usr/local/share/man/man1/git-show-index.1
/usr/local/share/man/man1/git-show-ref.1
/usr/local/share/man/man1/git-show.1
/usr/local/share/man/man1/git-stage.1
/usr/local/share/man/man1/git-stash.1
/usr/local/share/man/man1/git-status.1
/usr/local/share/man/man1/git-stripspace.1
/usr/local/share/man/man1/git-submodule.1
/usr/local/share/man/man1/git-svn.1
/usr/local/share/man/man1/git-symbolic-ref.1
/usr/local/share/man/man1/git-tag.1
/usr/local/share/man/man1/git-tar-tree.1
/usr/local/share/man/man1/git-unpack-file.1
/usr/local/share/man/man1/git-unpack-objects.1
/usr/local/share/man/man1/git-update-index.1
/usr/local/share/man/man1/git-update-ref.1
/usr/local/share/man/man1/git-update-server-info.1
/usr/local/share/man/man1/git-upload-archive.1
/usr/local/share/man/man1/git-upload-pack.1
/usr/local/share/man/man1/git-var.1
/usr/local/share/man/man1/git-verify-pack.1
/usr/local/share/man/man1/git-verify-tag.1
/usr/local/share/man/man1/git-web--browse.1
/usr/local/share/man/man1/git-whatchanged.1
/usr/local/share/man/man1/git-write-tree.1
/usr/local/share/man/man1/git.1
/usr/local/share/man/man1/gitk.1
/usr/local/share/man/man1/gitremote-helpers.1
/usr/local/share/man/man1/gitweb.1
/usr/local/share/man/man3/Git.3pm
/usr/local/share/man/man3/Git::I18N.3pm
/usr/local/share/man/man3/Git::SVN::Editor.3pm
/usr/local/share/man/man3/Git::SVN::Fetcher.3pm
/usr/local/share/man/man3/Git::SVN::Memoize::YAML.3pm
/usr/local/share/man/man3/Git::SVN::Prompt.3pm
/usr/local/share/man/man3/Git::SVN::Ra.3pm
/usr/local/share/man/man3/Git::SVN::Utils.3pm
/usr/local/share/man/man5/gitattributes.5
/usr/local/share/man/man5/githooks.5
/usr/local/share/man/man5/gitignore.5
/usr/local/share/man/man5/gitmodules.5
/usr/local/share/man/man5/gitrepository-layout.5
/usr/local/share/man/man5/gitweb.conf.5
/usr/local/share/man/man7/gitcli.7
/usr/local/share/man/man7/gitcore-tutorial.7
/usr/local/share/man/man7/gitcredentials.7
/usr/local/share/man/man7/gitcvs-migration.7
/usr/local/share/man/man7/gitdiffcore.7
/usr/local/share/man/man7/gitglossary.7
/usr/local/share/man/man7/gitnamespaces.7
/usr/local/share/man/man7/gitrevisions.7
/usr/local/share/man/man7/gittutorial-2.7
/usr/local/share/man/man7/gittutorial.7
/usr/local/share/man/man7/gitworkflows.7
Warning: Your Homebrew is outdated.
You haven't updated for at least 24 hours. This is a long time in brewland!
To update Homebrew, run `brew update`.
How do I make progress in diagnosing the issue with the installation of the datasets package?
Update
Here is the script I'm trying to run:
import sys
egg_path = '/usr/local/lib/python2.7/site-packages/datasets-0.0.9-py2.7.egg'
sys.path.append(egg_path)
import numpy as np
import lda
import datasets
X = lda.datasets.load_reuters()
vocab = lda.datasets.load_reuters_vocab()
titles = lda.datasets.load_reuters_titles()
X.shape
(395, 4258)
X.sum()
84010
model = lda.LDA(n_topics=20, n_iter=1500, random_state=1)
model.fit(X) # model.fit_transform(X) is also available
topic_word = model.topic_word_ # model.components_ also works
n_top_words = 8
for i, topic_dist in enumerate(topic_word):
    topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n_top_words+1):-1]
    print('Topic {}: {}'.format(i, ' '.join(topic_words)))
Using pip install datasets I was also not able to properly install this package. It seems like there is a bug in this particular package.
The DESCRIBE.rst file is simply missing. To fix this, download the plain package from PyPI: https://pypi.python.org/pypi/datasets/0.0.9
Then adjust the setup.py file (remove the description).
Afterwards you need to install using python setup.py install. Don't forget to add the installed package to your Python path!
To do so, I would recommend that you add the following to your script.
import sys
egg_path = '__MODULE_PATH__/datasets-0.0.9-py3.5.egg'
sys.path.append(egg_path)
import datasets
Otherwise, you can also add your module using:
export PYTHONPATH=__MODULE_PATH__:$PYTHONPATH
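To check whether either approach worked, it may help to ask the interpreter that runs your script where it would load the module from. The sketch below is written for Python 3 (on the Python 2.7 setup from the question, importlib.util is not available); describe_import is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Collect the interpreter path and where (if anywhere) a module would load from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "python_version": "%d.%d" % sys.version_info[:2],
        "module_found": spec is not None,
        "module_origin": getattr(spec, "origin", None),
    }


print(describe_import("datasets"))
```

If the interpreter shown is not the one pip installed into, the package can be installed and still not importable from your script.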
Alternatively, you could also simply pull the source code from the Github repository and just include it in your project. https://github.com/realtimeweb/datasets
Hope this was helpful for your problem. If you have any further questions, just let me know.
I just hit the same issue on a Raspberry Pi and found out that the packaging bug had been fixed, but that the error there comes from the lack of RAM to extract the package properly.
You can work around this by disabling pip's cache with the parameter
--no-cache-dir
for example
pip2 install --user --no-cache-dir datasets
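When several Pythons are installed, it can also help to invoke pip through the interpreter you intend to use, so the package lands where that interpreter looks. A sketch, with pip_install_command as a hypothetical helper that only builds the command line and does not run it:

```python
import sys


def pip_install_command(package, user=False, no_cache=False):
    """Build a pip command line that targets the interpreter running this code."""
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")
    if no_cache:
        command.append("--no-cache-dir")
    command.append(package)
    return command


# Pass the list to subprocess.check_call(...) to actually run the install.
print(pip_install_command("datasets", user=True, no_cache=True))
```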

Python module development workflow - setup and build [duplicate]

I'm developing my own module in python 2.7. It resides in ~/Development/.../myModule instead of /usr/lib/python2.7/dist-packages or /usr/lib/python2.7/site-packages. The internal structure is:
/project-root-dir
    /server
        __init__.py
        service.py
        http.py
    /client
        __init__.py
        client.py
client/client.py includes PyCachedClient class. I'm having import problems:
project-root-dir$ python
Python 2.7.2+ (default, Jul 20 2012, 22:12:53)
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from server import http
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "server/http.py", line 9, in <module>
from client import PyCachedClient
ImportError: cannot import name PyCachedClient
I didn't set PYTHONPATH to include my project-root-dir, therefore when server.http tries to import client.PyCachedClient, it tries to load it from a relative path and fails. My question is: how should I set all paths/settings in a good, Pythonic way? I know I can run export PYTHONPATH=... in the shell each time I open a console and try to run my server, but I guess it's not the best way. If my module was installed via PyPI (or something similar), I'd have it installed in the /usr/lib/python... path and it'd be loaded automatically.
I'd appreciate tips on best practices in python module development.
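As a side note on the traceback itself, the layout above can be reproduced in a scratch directory to see where the class is actually reachable. This is a sketch for Python 3, and it assumes client/__init__.py is empty, which the question does not state.

```python
import importlib
import os
import sys
import tempfile

# Rebuild only the client/ part of the layout in a scratch directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "client"))
with open(os.path.join(root, "client", "__init__.py"), "w") as handle:
    handle.write("")  # assumption: the real __init__.py is empty
with open(os.path.join(root, "client", "client.py"), "w") as handle:
    handle.write("class PyCachedClient(object):\n    pass\n")

sys.path.insert(0, root)  # stands in for having project-root-dir importable

package = importlib.import_module("client")
module = importlib.import_module("client.client")

print(hasattr(package, "PyCachedClient"))
print(hasattr(module, "PyCachedClient"))
```

Under that assumption the class lives in the client.client module, so from client.client import PyCachedClient finds it, while from client import PyCachedClient only works if __init__.py re-exports the name.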
My Python development workflow
This is a basic process to develop Python packages that incorporates what I believe to be the best practices in the community. It's basic - if you're really serious about developing Python packages, there's still a bit more to it, and everyone has their own preferences, but it should serve as a template to get started and then learn more about the pieces involved. The basic steps are:
Use virtualenv for isolation
setuptools for creating an installable package and managing dependencies
python setup.py develop to install that package in development mode
virtualenv
First, I would recommend using virtualenv to get an isolated environment to develop your package(s) in. During development, you will need to install, upgrade, downgrade and uninstall dependencies of your package, and you don't want
your development dependencies to pollute your system-wide site-packages
your system-wide site-packages to influence your development environment
version conflicts
Polluting your system-wide site-packages is bad, because any package you install there will be available to all Python applications you installed that use the system Python, even though you just needed that dependency for your small project. And it was just installed in a new version that overrode the one in the system wide site-packages, and is incompatible with ${important_app} that depends on it. You get the idea.
Having your system wide site-packages influence your development environment is bad, because maybe your project depends on a module you already got in the system Python's site-packages. So you forget to properly declare that your project depends on that module, but everything works because it's always there on your local development box. Until you release your package and people try to install it, or push it to production, etc... Developing in a clean environment forces you to properly declare your dependencies.
So, a virtualenv is an isolated environment with its own Python interpreter and module search path. It's based on a Python installation you previously installed, but isolated from it.
To create a virtualenv, install the virtualenv package by installing it to your system wide Python using easy_install or pip:
sudo pip install virtualenv
Notice this will be the only time you install something as root (using sudo), into your global site-packages. Everything after this will happen inside the virtualenv you're about to create.
Now create a virtualenv for developing your package:
cd ~/pyprojects
virtualenv --no-site-packages foobar-env
This will create a directory tree ~/pyprojects/foobar-env, which is your virtualenv.
To activate the virtualenv, cd into it and source the bin/activate script:
~/pyprojects $ cd foobar-env/
~/pyprojects/foobar-env $ . bin/activate
(foobar-env) ~/pyprojects/foobar-env $
Note the leading dot (.), which is shorthand for the source shell command. Also note how the prompt changes: (foobar-env) means you're inside the activated virtualenv (and you will always need to be for the isolation to work). So activate your env every time you open a new terminal tab, SSH session, etc.
If you now run python in that activated env, it will actually use ~/pyprojects/foobar-env/bin/python as the interpreter, with its own site-packages and isolated module search path.
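If you are ever unsure whether the env is active, you can ask the interpreter itself. This is a sketch; in_virtualenv is a hypothetical helper name, and the attributes it inspects differ between virtualenv and Python versions, which is why it checks more than one.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(in_virtualenv(), sys.prefix)
```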
A setuptools package
Now for creating your package. Basically you'll want a setuptools package with a setup.py to properly declare your package's metadata and dependencies. You can do this on your own by following the setuptools documentation, or create a package skeleton using Paster templates. To use Paster templates, install PasteScript into your virtualenv:
pip install PasteScript
Let's create a source directory for our new package to keep things organized (maybe you'll want to split up your project into several packages, or later use dependencies from source):
mkdir src
cd src/
Now for creating your package, do
paster create -t basic_package foobar
and answer all the questions in the interactive interface. Most are optional and can simply be left at the default by pressing ENTER.
This will create a package (or more precisely, a setuptools distribution) called foobar. This is the name that
people will use to install your package using easy_install or pip install foobar
the name other packages will use to depend on yours in setup.py
what it will be called on PyPI
Inside, you almost always create a Python package (as in "a directory with an __init__.py") that's called the same. That's not required, the name of the top level Python package can be any valid package name, but it's a common convention to name it the same as the distribution. And that's why it's important, but not always easy, to keep the two apart. Because the top level Python package name is what
people (or you) will use to import your package using import foobar or from foobar import baz
So if you used the paster template, it will already have created that directory for you:
cd foobar/foobar/
Now create your code:
vim models.py
models.py
class Page(object):
    """A dumb object wrapping a webpage.
    """
    def __init__(self, content, url):
        self.content = content
        self.original_url = url
    def __repr__(self):
        return "<Page retrieved from '%s' (%s bytes)>" % (self.original_url, len(self.content))
And a client.py in the same directory that uses models.py:
client.py
import requests
from foobar.models import Page
url = 'http://www.stackoverflow.com'
response = requests.get(url)
page = Page(response.content, url)
print page
Declare the dependency on the requests module in setup.py:
install_requires=[
    # -*- Extra requirements: -*-
    'setuptools',
    'requests',
],
Version control
src/foobar/ is the directory you'll now want to put under version control:
cd src/foobar/
git init
vim .gitignore
.gitignore
*.egg-info
*.py[co]
git add .
git commit -m 'Create initial package structure.'
Installing your package as a development egg
Now it's time to install your package in development mode:
python setup.py develop
This will install the requests dependency and your package as a development egg. So it's linked into your virtualenv's site-packages, but still lives at src/foobar where you can make changes and have them be immediately active in the virtualenv without re-installing your package.
Now for your original question, importing using relative paths: My advice is, don't do it. Now that you've got a proper setuptools package, that's installed and importable, your current working directory shouldn't matter any more. Just do from foobar.models import Page or similar, declaring the fully qualified name where that object lives. That makes your source code much more readable and discoverable, for yourself and other people that read your code.
You can now run your code by doing python client.py from anywhere inside your activated virtualenv. python src/foobar/foobar/client.py works just as well: your package is properly installed and your working directory doesn't matter any more.
If you want to go one step further, you can even create a setuptools entry point for your CLI scripts. This will create a bin/something script in your virtualenv that you can run from the shell.
setuptools console_scripts entry point
setup.py
entry_points='''
# -*- Entry points: -*-
[console_scripts]
run-foo = foobar.main:run_foobar
''',
client.py
def run_client():
    # ...
main.py
from foobar.client import run_client

def run_foobar():
    run_client()
Re-install your package to activate the entry point:
python setup.py develop
And there you go, bin/run-foo.
Once you (or someone else) install your package for real, outside the virtualenv, the entry point will be in /usr/local/bin/run-foo or somewhere similar, where it will automatically be in $PATH.
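To make the console_scripts syntax less magical: a line of the form name = package.module:function tells setuptools to generate a script called name that imports the module and calls the function. The following is only a rough sketch of that lookup, not the code setuptools actually generates. The helper name resolve_entry_point is made up for illustration, and json:dumps stands in for a real entry point target so the example runs without any package installed.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = package.module:function' and import the target.

    Hypothetical helper for illustration; the real wrapper script
    generated by setuptools goes through package metadata instead.
    """
    name, _, target = spec.partition("=")
    module_name, _, attr_path = target.strip().partition(":")
    obj = importlib.import_module(module_name)
    # The part after the colon may be dotted (e.g. "Class.method").
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return name.strip(), obj


# A stdlib function stands in for something like foobar.main:run_foobar.
script_name, func = resolve_entry_point("dump = json:dumps")
print(script_name)
print(func([1, 2]))
```

The generated script does essentially this and then calls the function, passing its return value to sys.exit().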
Further steps
Creating a release of your package and uploading it to PyPI, for example using zest.releaser
Keeping a changelog and versioning your package
Learn about declaring dependencies
Learn about Differences between distribute, distutils, setuptools and distutils2
Suggested reading:
The Hitchhiker’s Guide to Packaging
The pip cookbook
So, you have two packages, the first with modules named:
server # server/__init__.py
server.service # server/service.py
server.http # server/http.py
The second with modules named:
client # client/__init__.py
client.client # client/client.py
If you want to assume both packages are in your import path (sys.path), and the class you want is in client/client.py, then in your server you have to do:
from client.client import PyCachedClient
You asked for a symbol out of client, not client.client, and from your description, that isn't where that symbol is defined.
I personally would consider making this one package (i.e., putting an __init__.py in the folder one level up, and giving it a suitable Python package name), and having client and server be sub-packages of that package. Then (a) you could do relative imports if you wanted to (from ..client.client import something), and (b) your project would be more suitable for redistribution, since it would not put two very generic package names at the top level of the Python module hierarchy.
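As a rough illustration of that single-package layout, the sketch below writes a throwaway project into a temporary directory and imports it. The top-level name myproject is a placeholder, and PyCachedClient is reduced to an empty class. Note that from myproject/server/service.py it takes two leading dots to reach the sibling client sub-package.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: one top-level package with two sub-packages.
FILES = {
    "myproject/__init__.py": "",
    "myproject/client/__init__.py": "",
    "myproject/client/client.py": "class PyCachedClient(object):\n    pass\n",
    "myproject/server/__init__.py": "",
    # Relative import: '..' climbs from myproject.server to myproject.
    "myproject/server/service.py": "from ..client.client import PyCachedClient\n",
}


def build_demo_project(root):
    """Write the demo package tree below root."""
    for rel_path, content in FILES.items():
        path = os.path.join(root, *rel_path.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write(content)


root = tempfile.mkdtemp()
build_demo_project(root)
sys.path.insert(0, root)  # stands in for installing the package
importlib.invalidate_caches()

service = importlib.import_module("myproject.server.service")
print(service.PyCachedClient.__module__)
```

The absolute form, from myproject.client.client import PyCachedClient, works from the same file as well and is usually easier to read.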

Usage of "provides" keyword-argument in python's setup.py

I am working on a fork of a Python project (tryton) which uses setuptools for packaging. I am trying to extend the server part of the project, and would like to be able to use the existing modules with my fork.
Those modules are distributed with setuptools packaging, and require the base project for installation.
I need a way to make it so that my fork is considered an acceptable requirement for those modules.
EDIT : Here is what I used in my setup.py :
from setuptools import setup
setup(
    ...
    provides=["trytond (2.8.2)"],
    ...
)
The modules I want to be able to install have those requirements :
from setuptools import setup
setup(
    ...
    install_requires=["trytond>=2.8"],
    ...
)
As it is, with my package installed, trying to install a module triggers the installation of the trytond package.
Don't use provides; it comes from a packaging specification (a metadata PEP) that is not implemented by any tool. The requirements in the install_requires argument are matched against the name (and version) declared in the other project's setup.py. In other words, replace your provides with setup(name='trytond', version='2.8.2').
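To see why the name matters, here is a deliberately simplified sketch of how a requirement such as trytond>=2.8 is checked against an installed distribution. It only handles the name>=version form with purely numeric versions; real installers follow the full version specifier rules, and the function names and the fork name my-trytond-fork are invented for this illustration.

```python
import re


def parse_version(text):
    """Turn '2.8.2' into (2, 8, 2); numeric versions only (simplification)."""
    return tuple(int(part) for part in text.split("."))


def satisfies(requirement, dist_name, dist_version):
    """Check a 'name>=version' requirement against one distribution."""
    match = re.match(r"^\s*([A-Za-z0-9_.-]+)\s*>=\s*([0-9.]+)\s*$", requirement)
    if match is None:
        raise ValueError("this sketch only handles 'name>=version'")
    required_name, minimum = match.groups()
    # The distribution name has to match first; 'provides' is never consulted.
    if required_name.lower() != dist_name.lower():
        return False
    return parse_version(dist_version) >= parse_version(minimum)


# A fork published under its own name does not satisfy the requirement...
print(satisfies("trytond>=2.8", "my-trytond-fork", "2.8.2"))
# ...but one declared with name='trytond', version='2.8.2' does.
print(satisfies("trytond>=2.8", "trytond", "2.8.2"))
```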
If you are building rpms, it is possible to use the setup.cfg as follows:
[bdist_rpm]
provides = your-package = 0.8
obsoletes = your-package
