Although this source provides a lot of information on caching within Azure Pipelines, it is not clear how to cache pip packages for a Python project.
How should one proceed to cache pip packages in an Azure Pipelines build?
According to this, pip caching may become enabled by default in the future; as far as I know, that is not yet the case.
I used the pre-commit documentation as inspiration:
https://pre-commit.com/#azure-pipelines-example
https://github.com/asottile/azure-pipeline-templates/blob/master/job--pre-commit.yml
and configured the following Python pipeline with Anaconda:
pool:
  vmImage: 'ubuntu-latest'

variables:
  CONDA_ENV: foobar-env
  CONDA_HOME: /usr/share/miniconda/envs/$(CONDA_ENV)/

steps:
- script: echo "##vso[task.prependpath]$CONDA/bin"
  displayName: Add conda to PATH

- task: Cache@2
  displayName: Use cached Anaconda environment
  inputs:
    key: conda | environment.yml
    path: $(CONDA_HOME)
    cacheHitVar: CONDA_CACHE_RESTORED

- script: conda env create --file environment.yml
  displayName: Create Anaconda environment (if not restored from cache)
  condition: eq(variables.CONDA_CACHE_RESTORED, 'false')

- script: |
    source activate $(CONDA_ENV)
    pytest
  displayName: Run unit tests
To cache a standard pip install, use this:
variables:
  # variables are automatically exported as environment variables,
  # so this will override pip's default cache dir
  - name: pip_cache_dir
    value: $(Pipeline.Workspace)/.pip

steps:
- task: Cache@2
  inputs:
    key: 'pip | "$(Agent.OS)" | requirements.txt'
    restoreKeys: |
      pip | "$(Agent.OS)"
    path: $(pip_cache_dir)
  displayName: Cache pip

- script: |
    pip install -r requirements.txt
  displayName: "pip install"
I wasn't very happy with the standard pip cache implementation mentioned in the official documentation. With it you still run a full install every time, which means pip performs lots of checks that take up time. Pip eventually finds the cached builds (*.whl, *.tar.gz), but it all adds up. You can opt to use venv or conda instead, but for me that led to buggy situations with unexpected behaviour. What I ended up doing instead was using pip download and pip install separately:
variables:
  pipDownloadDir: $(Pipeline.Workspace)/.pip

steps:
- task: Cache@2
  displayName: Load cache
  inputs:
    key: 'pip | "$(Agent.OS)" | requirements.txt'
    path: $(pipDownloadDir)
    cacheHitVar: cacheRestored

- script: pip download -r requirements.txt --dest=$(pipDownloadDir)
  displayName: "Download requirements"
  condition: eq(variables.cacheRestored, 'false')

- script: pip install -r requirements.txt --no-index --find-links=$(pipDownloadDir)
  displayName: "Install requirements"
Related
I'm trying to create a GitLab CI pipeline that will install Python packages via pip, which I can then cache and use in later stages without needing to reinstall them each time.
I have followed the CI docs on how to do this, but I'm facing an issue. They say to create a virtual environment and install via pip there; however, after creating and activating the venv, the packages aren't installed in the venv. Instead they're installed in
/builds//ca-cert-checker/venv/lib/python3.10/site-packages
Also, in a separate stage, when the cache has been downloaded, the stage is looking for a directory that doesn't exist:
venv/bin/python
gitlab-ci.yml
stages:
  - setup
  - before_test

variables:
  PYTHON_IMG: "python:3.10"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - .cache/pip
    - venv/

python_install:
  image: $PYTHON_IMG
  stage: setup
  script:
    - python3 --version
    - pip install virtualenv
    - virtualenv venv
    - source venv/bin/activate
    - pip install -r requirements.txt
    - pytest --version #debug message
    - pip show pytest #debug message

pytest_ver:
  stage: before_test
  image: $PYTHON_IMG
  script:
    - ls venv #debug message
    - ls .cache/pip #debug message
    - pytest --version #debug message
In the pipeline, this causes the setup stage to run and cache successfully.
Successful cache creation:
Creating cache default...
.cache/pip: found 185 matching files and directories
venv/: found 3292 matching files and directories
Output from pip show pytest:
Location: /builds//ca-cert-checker/venv/lib/python3.10/site-packages
So pip isn't installing within venv, which I think is the issue.
When I run the before_test stage I get the following output:
// Omitted for brevity
Checking cache for default...
Downloading cache.zip from https://storage.googleapis.com/gitlab-com-runners-cache/project/<project-number>/default
WARNING: venv/bin/python: chmod venv/bin/python: no such file or directory (suppressing repeats)
Successfully extracted cache
Executing "step_script" stage of the job script 00:00
Using docker image sha256:33ceb4320f06dbd22ca43809042a31851df207827b4fc45cd6c9323013dff7c7 for python:3.10 with digest python@sha256:b58c3f2846e201f5fc6b654e43f131f5a8702f8d568130302d77fbdfd9230362 ...
$ ls venv
bin
lib
lib64
pyvenv.cfg
share
$ ls .cache/pip
http
selfcheck
$ pytest --version
/bin/bash: line 128: pytest: command not found
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: exit code 1
Any help / advice on how to get the pip dependencies to install in the venv and cache properly would be appreciated!
I see that your installation is done in a different job (and even in two different steps), but each job runs in its own fresh instance, so the installation job does not retain anything for the next one.
You should do the installation in a before_script; pip will then find the cache:
image: python:3.10

stages:
  - setup
  - before_test

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - .cache/pip
    - venv/

before_script:
  - python3 --version
  - pip install virtualenv
  - virtualenv venv
  - source venv/bin/activate
  - pip install -r requirements.txt
  - pytest --version #debug message
  - pip show pytest #debug message

pytest_ver:
  stage: before_test
  script:
    - ls venv #debug message
    - ls .cache/pip #debug message
    - pytest --version #debug message
I'm building a staged pipeline (environment prep, tests, code). I've currently hit a blocker: each stage seems to be its own individual process. My requirements.txt is installed correctly, but then the test stage raises ModuleNotFoundError. I'd appreciate any hints on how to make it work :)
yaml:
trigger: none

parameters:
  - name: "input_files"
    type: object
    default: ['a-rg', 't-rg', 'd-rg', 'p-rg']

stages:
  - stage: 'Env_prep'
    jobs:
      - job: "install_requirements"
        steps:
          - script: |
              python -m pip install --upgrade pip
              python -m pip install -r requirements.txt

  - stage: 'Tests'
    jobs:
      - job: 'Run_tests'
        steps:
          - script: |
              python -m pytest -v tests/variableGroups_tests.py
Different jobs and stages can be executed on different agents in Azure Pipelines. In your case, installing the requirements is a direct prerequisite of running the tests, so everything should be in one job:
trigger: none

parameters:
  - name: "input_files"
    type: object
    default: ['a-rg', 't-rg', 'd-rg', 'p-rg']

stages:
  - stage: Test
    jobs:
      - job:
        steps:
          - script: |
              python -m pip install --upgrade pip
              python -m pip install -r requirements.txt
            displayName: Install Required Components
          - script: |
              python -m pytest -v tests/variableGroups_tests.py
            displayName: Run Tests
Breaking those into separate script steps isn't even necessary unless you want the log output to be separate in the console.
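For illustration, a combined single-step variant might look like this (a sketch using only the commands from the pipeline above):

- script: |
    python -m pip install --upgrade pip
    python -m pip install -r requirements.txt
    python -m pytest -v tests/variableGroups_tests.py
  displayName: Install and Test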
I am building a Python project -- potion. I want to use GitHub Actions to automate some linting & testing before merging a new branch to master.
To do that, I am using a slight modification of a GitHub-recommended Python Actions starter workflow -- Python Application.
During the "Install dependencies" step of the job, I am getting an error because pip is trying to install my local package potion and failing.
The code that is failing: if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
The corresponding error is:
ERROR: git+https@github.com:<github_username>/potion.git@82210990ac6190306ab1183d5e5b9962545f7714#egg=potion is not a valid editable requirement. It should either be a path to a local project or a VCS URL (beginning with bzr+http, bzr+https, bzr+ssh, bzr+sftp, bzr+ftp, bzr+lp, bzr+file, git+http, git+https, git+ssh, git+git, git+file, hg+file, hg+http, hg+https, hg+ssh, hg+static-http, svn+ssh, svn+http, svn+https, svn+svn, svn+file).
Error: Process completed with exit code 1.
Most likely, the job is not able to install the package potion because it cannot find it. I installed it on my own computer using pip install -e . and later used pip freeze > requirements.txt to create the requirements file.
Since I use this package for testing, I need to install it so that pytest can run its tests properly.
How can I install a local package (which is under active development) on GitHub Actions?
Here is part of the Github workflow file python-app.yml
...
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.8
  uses: actions/setup-python@v2
  with:
    python-version: 3.8
- name: Install dependencies
  run: |
    python -m pip install --upgrade pip
    pip install flake8 pytest
    if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
- name: Lint with flake8
...
Note 1: I have already tried changing from git+git@github.com:<github_username>... to git+git@github.com/<github_username>.... Pay attention to / instead of :.
Note 2: I have also tried using other protocols such as git+https, git+ssh, etc.
Note 3: I have also tried removing the alphanumeric @8221... after the git URL ...potion.git
The "package under test", potion in your case, should not be part of the requirements.txt. Instead, simply add your line
pip install -e .
after the line with pip install -r requirements.txt in it. That installs the already checked out package in development mode and makes it available locally for an import.
Alternatively, you could put that line at the latest needed point, i.e. right before you run pytest.
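Applied to the workflow above, the Install dependencies step might then look like this (a sketch; everything except the final two lines is from the question):

- name: Install dependencies
  run: |
    python -m pip install --upgrade pip
    pip install flake8 pytest
    if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    # install the checked-out package itself in development mode
    pip install -e .

If you regenerate the requirements file later, pip freeze --exclude-editable will leave out the editable install so it does not creep back into requirements.txt.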
What I tried:
placing 'pip install --user -r requirements.txt' in the second run command
placing 'pip install pytest' in the second run command, along with 'pip install pytest-html'
both followed by:
pytest --html=pytest_report.html
I am new to CircleCI and to pytest as well.
Here is the steps portion of the config.yml file
version: 2.1
jobs:
run_tests:
docker:
- image: circleci/python:3.9.1
steps:
- checkout
- run:
name: Install Python Dependencies
command:
echo 'export PATH=~$PATH:~/.local/bin' >> $BASH_ENV && source $BASH_ENV
pip install --user -r requirements.txt
- run:
name: Run Unit Tests
command:
pip install --user -r requirements.txt
pytest --html=pytest_report.html
- store_test_results:
path: test-reports
- store_artifacts:
path: test-reports
workflows:
build_test:
jobs:
- run_tests
--html is not one of the builtin options for pytest -- it likely comes from a plugin.
I believe you're looking for pytest-html -- make sure that's listed in your requirements.
It's also possible / likely that the pip install --user is installing another copy of pytest into the image, which will only be available at ~/.local/bin/pytest instead of whatever pytest comes with the CircleCI image.
Disclaimer: I'm one of the pytest core devs.
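Putting that together, a sketch of the corrected steps, assuming pytest and pytest-html are both listed in requirements.txt so the --user install provides the plugin and the pytest that ends up on PATH (the stray ~ before $PATH in the original export looks like a typo and is dropped here):

- run:
    name: Install Python Dependencies
    command: |
      # put --user-installed scripts such as pytest on PATH
      echo 'export PATH=$PATH:~/.local/bin' >> $BASH_ENV && source $BASH_ENV
      pip install --user -r requirements.txt
- run:
    name: Run Unit Tests
    command: |
      pytest --html=pytest_report.html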
I am trying to cache dependencies for a GitHub Actions workflow. I use Pipenv.
This is my config:
- uses: actions/cache@v1
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/Pipfile') }}
    restore-keys: |
      ${{ runner.os }}-pip-
I got this config from GitHub's own examples for using pip. I have only changed requirements.txt to Pipfile since we don't use a requirements.txt, but even with requirements.txt I get the same issue anyway.
The Cache Dependencies step always reports the same problem, both when restoring and again after running the tests: there's no error on the workflow and it finishes as normal, but it never seems to be able to find or update the dependency cache.
pipenv needed to be installed before the cache step...
- name: Install pipenv, libpq, and pandoc
  run: |
    sudo apt-get install libpq-dev -y
    pip install pipenv
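In context, the ordering would look something like this (a sketch assembled from the question's cache step and the fix above; the checkout step is an assumption, not shown in the original):

- uses: actions/checkout@v2   # assumed to come first
- name: Install pipenv, libpq, and pandoc
  run: |
    sudo apt-get install libpq-dev -y
    pip install pipenv
- uses: actions/cache@v1
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/Pipfile') }}
    restore-keys: |
      ${{ runner.os }}-pip-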