How to update source files for pytest? - python

pytest appears to be using old source code and failing tests because of it. I'm not sure how to update it.
Test code:
from nba_stats import league
class TestLeaders():
def test_default():
leaders = league.Leaders()
print(leaders)
Source code (league.py):
from nba_stats.nba_api import NbaAPI
from nba_stats import constants
class Leaders:
...
When I run pytest on my parent directory, I get an error that refers to an old import statement.
_____________________________ ERROR collecting test/test_league.py ______________________________
ImportError while importing test module '/home/mfb/src/nba_stats/test/test_league.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test_league.py:1: in <module>
from nba_stats import league
../../../.virtualenvs/nba_stats_dev/lib/python3.6/site-packages/nba_stats/league.py:1: in <module>
from nba_stats import _api_scrape, _get_json
E ImportError: cannot import name '_api_scrape'
I tried resetting my virtualenvironment and also reinstalling my package via pip. What do I need to do to tell it to see the new import statement and why is this happening?
Edit: Deleting my virtual environment completely and then creating a new one seemed to fix it, but it seems to be a recurring issue with any further source code changes. Surely there must be a way to not have to reset my virtualenvironment each time?

Looks like you installed that package (possibly as a dependency through something else if not directly) and also have it cloned locally for development. You can look into local editable installs (https://pip.pypa.io/en/stable/reference/pip_install/#editable-installs), but personally, I prefer to make the test refer directly to the package under which it is being run, since then it can be used "as-is" after cloning it. Do this by modifying sys.path in your test_league.py. Ie., assuming it has a structure with the python code under python/nba_stats, in the parent directory of `test
sys.path = [os.path.join(os.pardir, 'python')] + sys.path
at the top of test_league.py. This puts your local package up front and import will consider it first.
EDIT:
Since you tried and it still did not work (please do make sure that the snippet above does point to the local python package as in the actual structure; the above is just a common one but you may have a different structure), here is how you can see which directories are considered in order, and which are eventually selected:
python -vv -m pytest -svx
You will be able to see if there are spurious directories in sys.path, whether the one tried (as in the snippet above) matches as expected or not, any left-over .pyc files that get picked up, etc.
EDITv2: Since you stated that python -m pytest works, but pytest not, have a look where that pytest executable is coming from with which pytest. Likely it's a system one that refers to a different python then the one in your virtualenv. To see which python it picks up, do:
cat `which python`
and look at the top line.
If that is not the same as what which python gives you (with your desired virtualenv activated), you may have to install pytest for that current virtualenv (python -m pip install pytest).

Related

Can I zip PySpark dependencies containing some setuptools.Extension?

I am attempting to include the dateparser package for a PySpark (v2.4.3) shell session by a short little zip build process pip install -r requirements.txt -t some_target && cd some_target && zip -r ../deps.zip . && cd .., after which I would, for example, pyspark --py-files deps.zip. When importing dateparser, though, I get an indirect ModuleNotFoundError from the regex library, whining that "No module named 'regex._regex'" (stack trace says this is referenced in /mnt/tmp/spark-some/long/path/deps.zip/regex/_regex_core.py line 21, which is of course referenced much farther up the stack by dateparser).
I attempted adding a flag to the dateparser line in requirements.txt like dateparser --no-binary=regex, but the error persisted. A normal python shell is able to import without issue, and other packages in this zip seem to be importable in PySpark shell without issue. This has led me down a number of rabbit holes, but I think/hope I have finally found the culprit: namely, that regex._regex is not a normal .py file, but rather a .so. My knowledge of python build process is limited, but it seems that regex library's setup.py uses the setuptools.Extension class to compile some C files into this shared object. I have seen suggestions to modify LD_LIBRARY_PATH environment variable in order to make those shared objects discoverable to python, but a number of comments also suggested this was dangerous and not a viable long-term solution. The fact that a normal python interactive session has no issue with the import also has me skeptical, since the LD_LIBRARY_PATH variable doesn't even exist in os.environ within that interactive shell. I'm thence left wondering if --py-files is insufficient for including packages that compile these Extension objects (seems unlikely, since there are a lot of people doing crazier things than my simple use case), or if this actually stems from neglecting some other setting.
Merci mille fois for any and all help :)
The error appears to stem from the import statements not being able to recognize binary (.so) files within a zip archive, i.e., the dependencies.zip that I pass with the --py-files parameter. I first tried pulling out regex dependency and building a .whl to include in --py-files, to discover that my version of PySpark (v2.4.3) predates wheel support. I was, however, able to build an .egg based on the source code, then set PYTHON_EGG_CACHE and PYTHON_EGG_DIR env variables for spark.executorEnv and spark.driverEnv... Not sure if the last step would be necessary for others; it seems to have stemmed from weird permissions issues that may just apply to my user/group/use case.

Importing self-made package

I am testing my own package, but I am struggling to import it. My package file structure is as follows:
(You can also alternatively view my Github repository here)
In PyAdventures is my init.py with the content of
name="pyadventures"
So my question is, when I import it in another python file, how come it doesn't work?
I run:
import pyadventures
But I get the following error:
No module named pyadventures
It'd be great if you could answer my question!
It's important to note that the package is in my Python package directory, not the test one I showed
New discovery! In my PyAdventures folder (the one Python actually uses) the folder only has a few files, not the ones in the screenshot above.
You can install this with pip install pyadventures
Ah, like others remark in comments and answer: The camelcase might be the problem. (Why not name it just all in lower case: pyadventures? - Else users will be as confused as its developer now :D .)
Before, I thought, it might be a problem that you want to use your module without having installed it (locally). And my following answer is for this case (using a not installed package from a local folder):
In the docs you can read:
The variable sys.path is a list of strings that determines the
interpreter’s search path for modules. It is initialized to a default
path taken from the environment variable PYTHONPATH, or from a
built-in default if PYTHONPATH is not set. You can modify it using
standard list operations:
import sys
sys.path.append('/ufs/guido/lib/python')
thus, before import do:
import sys
sys.path.append('/absolute/or/relative/path/to/your/module/pyadventures')
# Now, your module is "visible" for the module loader
import pyadventures
Alternatively, you could just place your pyadventures module folder locally in your working directory where you start python and then just do
import pyadventures
However, it is much easier to manage to keep only one version in one place and refer to this from other places/scripts. (Else you have multiple copies, and have difficulties to track changes on the multiple copies).
As far as I know, your __init__.py doesn't need to contain anything. It works if it is empty. It is there just to mark your directory as a module.
Simple. Even though the package listed on pip is namd pyadventures, in your code the directory is called PyAdventures. So that's what python knows it as. I ran import PyAdventures and it worked fine.

Python Pylearn2 package "ImportError: No module named pylearn2.utils"

I have recently tried to use pylearn2, a deep machin learning package for Python developed at University of Montreal.
I've just installed it and tried to run a simple example, but it did not work.
I have been using a pc with an Ubuntu 13.10 system, on which I found ipython installed.
I have installed Theano and later pylearn2, by following this webpage instructions:
http://deeplearning.net/software/pylearn2/
I have also modified the .bashrc file, as suggested
I thought that everything went well, and then I tried this Quick start example:
http://deeplearning.net/software/pylearn2/tutorial/index.html
I stopped at the first command:
python make_dataset.py
My terminal states:
Traceback (most recent call last): File "make_dataset.py", line 14,
in
Do you have any ideas on why it is not working?
Do you why these errors occur?
Thanks a lot
EDIT: the 14 line is the first non-commented line of the file. It states
from pylearn2.utils import serial
Without more information, I can only guess, but my first guess is…
You haven't actually installed pylearn2, because if you follow the linked docs to grab the git repo and add a PYLEARN2_DATA_PATH variable, nothing gets installed into site-packages (or dist-packages or anywhere else on sys.path).
This means that pylearn2 will only work when you start Python from within the top-level directory of the pylearn2 repo.
So, if you run a script like this:
$ cd /path/to/pylearn2
$ cd scripts/tutorials/grbm_smd/
$ python make_dataset.py
… it won't actually work.
It looks like there is a setup.py file in the repository. Does it work? I have no idea. Even though the docs don't mention using it, you might want to try. Either this:
$ pip install .
… or, if you don't have pip or it doesn't work on this package:
$ python setup.py install
Either way, of course, you may need sudo or a flag to install to your user site-packages instead of system, etc., as with any other Python package.
If that doesn't work, you might be able to just add /path/to/pylearn2 to your sys.path in some way. The most obvious way is by doing an export PYTHONPATH=/path/to/pylearn2:$PYTHONPATH in your ~/.bashrc.
Also, you will need to either source ~/.bashrc or create a new shell to get any effects of modifying the file.
If you're wondering why the instructions and the tutorial together don't give you enough information to make this work without a lot of hassle, I think that's covered in the very top of the documentation:
Pylearn2 is still undergoing rapid development. Don’t expect a clean road without bumps!
And the very fact that there is no PyPI download yet implies that this really is not ready for novices to use. If you don't know enough about using Python packages (and bash basics) to muddle through on your own, there's a good chance you won't be able to use this package.

Python installation, failing to find bz2 module

OK, I have an old Debian VM. Package managers are useless. No, I'm not going to update the OS.
I have the bzip2 lib and development headers installed correctly on my system (those actually came from a package).
I start with absolutely NO Python on the system. I removed everything manually. I downloaded the Python 2.7.5 source, and configured with ./configure --prefix=/usr. It configures fine. I run make, and it compiles fine. I try ./python -c "import bz2; print bz2.__doc__", it works, and says:
The python bz2 module provides a comprehensive interface for
the bz2 compression library. It implements a complete file
interface, one shot (de)compression functions, and types for
sequential (de)compression.
I then run make test and the whole test suite progresses fine, and notably the "test_bz2" test passes.
I then run make install, which installs my new Python binary into /usr/bin/ like I wanted.
I try /usr/bin/python -c "import bz2; print bz2.__doc__", and it fails with:
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named bz2
I've tried a bunch of different things, including building Python as --enable-shared and not, no luck. I've tried at least 10 times (each time totally cleaning out everything, running make distclean, etc.). No luck.
I tried: PYTHONPATH="/usr/lib/python2.7"; export PYTHONPATH. Still no luck.
HOWEVER, if I delete the symlink that make install creates for /usr/bin/python, and instead do: ln -s /path/to/my/python/compile/python python, NOW it magically works.
So, what the heck? Why is this Python binary I'm getting created only able to find stuff when the binary exists in the compile directory, and not when it's put into normal production install location? What am I missing?
I am root during the entire process, from configure to make to make install to trying to test the Python import call.
I have started from scratch again (this time compiling with --enable-shared btw), and verified that not only in the compile directory is there build/lib.linux-x86_64-2.7/bz2.so, but once I run make install, that file is put into /usr/lib/python2.7/lib-dynload/bz2.so.
I've tried to do some reading on lib-dynload, but haven't been able to determine if there's something else a Python program (like default configuration for the CLI or whatever) would need to be able to tell it to pull module imports from lib-dynload, or if there's some other place or option to tell the make install where it should be putting it instead of dynload.
Still I have no explanation why the /path/to/compilation/python binary can find and load bz2.so fine, but the /usr/bin/python binary can't find (or load) /usr/lib/python2.7/lib-dynload/bz2.so.
I thought maybe it was something to do with the fact that the installation doesn't create like a /usr/lib/python symlink to point at /usr/lib/python2.7 directory. But I created the symlink and still no go.
I am still lost here.
It would appear that a sort of non-answer answer was arrived at accidentally via a long string of Twitter conversation(s).
I've filed another Stack Overflow question here to ask WHY what we found was the solution to this problem: https://stackoverflow.com/questions/17662091/python-installation-prefix-not-being-persisted-in-config
For posterity sake, right now the solution is that I have to set the PYTHONHOME environment variable to /usr, and everything starts working. The puzzling part is that the documentation says PYTHONHOME should default to {prefix}, which I was clearly setting as default during configure to /usr. So why should I have to manually set it?
Running python-config --prefix reveals that the {prefix} default is in fact /usr/bin, NOT /usr like I specified, which leads to me needing to override the default back to the default, bizarrely.

easy_install 'develop' command fails to work in virtualenv

Update:
It turns out that the virtualenv was not properly initialized before running easy_install. Once this has been rectified, things started to work as intended. There's no solution to post, since the stated problem did not exist in the first place. The 'when I activate the virtualenv' step was not properly taken (don't ask), so the following malfunction was an illusion.
Case closed.
Original question:
I have a virtualenv. Inside it, sys.path looks like this:
[...,
'/<inside_virtualenv>/lib/python2.6/site-packages/foo-1.2.egg',
...
'/usr/local/lib/python2.6/dist-packages/foo-2.0.egg'
]
If I import foo from inside the virtualenv, I get foo-1.2 imported, as expected.
I have an egg; its setup file lists another egg as a dependency that has foo=1.2 in its dependencies.
When I activate the virtualenv and try to run python <my_egg>/setup.py develop, I get an error:
Processing dependencies for <my egg>
Installed distribution foo 2.0 conflicts with requirement foo==1.2
I even patched setuptools/command/easy_install.py to print sys.path right inside the try statement that raises this exception. The path is all right, listing foo-1.2 first and foo-2.0 distant second.
What am I doing wrong? Is there any way to make easy_install ignore the non-virtualenv foo-2.0 installation and accept foo-1.2 inside the virtualenv?
Removing the offending entry from sys.path inside my egg's setup.py does not help. While sys.path only contains the correct version of foo, the process fails with the same error.
There's another possible case where this can happen, beyond the one you were experiencing directly, but which is easily avoided:
When setting up a new virtualenv, use --no-site-packages to avoid including libraries from your system Python installation unless you're certain they don't (and will never) conflict.

Categories

Resources