Is there already a way to integrate one of the Python lint programs (PyLint, PyChecker, PyFlakes, etc.) with the GitHub commit status API? That way a Python linter could be automatically called on pull requests to check the code and provide feedback on code (and style).
You could use something like Travis-CI, and run pylint as part of your tests, along the lines of:
language: python
install: "pip install nose pylint"
script: "nosetests && pylint"
Of course, that fails commits for minor stylistic violations - you'd probably want to disable certain messages, or use pylint --errors-only to make it less stringent.
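For instance (a sketch; the message IDs are just examples - C0103 is invalid-name and C0111 is missing-docstring):

pylint --errors-only yourpackage
pylint --disable=C0103,C0111 yourpackage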
I had the same question, and just found this blog post describing a project called pylint-server for doing something similar (though triggered on Travis CI build events, not pull requests).
From the README:
A small Flask app to keep track of pylint reports and ratings on a per-repository basis.
I haven't tried it yet, so I can't comment on its quality. If anyone tries it, please comment and let us know how you like it.
Related
I'm evaluating test framework, lint and code coverage options for a new Python project I'll be working on.
I've chosen pytest for the testing needs. After reading a bunch of resources, I'm confused about when to use SonarQube, SonarLint, pylint and coverage.py.
Are SonarLint and Pylint comparable? When would I use SonarQube?
I need to be able to use this in a Jenkins build. Thanks for helping!
SonarLint and pylint are comparable, in a way.
SonarLint is a code linter and pylint is too. I haven't used SonarLint, but it seems that it analyzes the code a bit more deeply than pylint does. From my experience, pylint only follows a set of rules (that you can modify, by the way), while SonarLint goes a bit further in analyzing the inner workings of your code. They are both static analysis tools, however.
SonarQube, on the other hand, does a bit more. SonarQube is a CI/CD tool that runs static linters, but it also shows you code smells and does a security analysis. All of what I'm saying is based purely on their website.
If you would like to run CI/CD workflows or scripts, you would use SonarQube, but for local coding, SonarLint is enough. Pylint is the traditional way, though.
Nicholas has a great summary of Pylint vs Sonarlint.
(Personally, I use SonarLint.)
Although the question is old, I thought I'd answer the other part of it in case anyone else has the same question; the internet being eternal and all.
Coverage.py, as it sounds, runs code coverage for your package. SonarQube then takes the report that coverage.py produces and formats it in the way the Sonar team decided was necessary. Coverage.py is needed if you want to use SonarQube for code coverage; however, if you just want the code smells from SonarQube, it is not needed.
You were also asking about when to use SonarQube, coverage.py, and Jenkins.
In Jenkins, you would create a pipeline with several stages. Something along the following lines:
Check out code (automatically done as the first step by Jenkins)
Build code as it is intended to be used by the user/developer
Run unit tests
Run coverage.py
Run SonarQube (see the sketch after this list)
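A minimal sketch of the last two stages as shell steps, assuming pytest, coverage.py, and a configured sonar-scanner are available (the property name sonar.python.coverage.reportPaths comes from SonarQube's Python analyzer; paths are illustrative):

coverage run -m pytest
coverage xml -o coverage.xml
sonar-scanner -Dsonar.python.coverage.reportPaths=coverage.xml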
Is it possible to run the PyCharm linter / code style checks from command line and get the warnings/errors?
Extension to that: Is it possible to integrate that in my Travis tests?
To put some of the comments and some further research into an answer:
PyCharm comes with a small command-line utility bin/inspect.sh, which is documented here. This tool is quite limited though, and has some problems, e.g. it cannot run while the PyCharm IDE is running, and it reports somewhat incorrectly / differently than the IDE. Related code can be seen e.g. here.
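For reference, a typical invocation looks like this (a sketch based on the JetBrains documentation; all paths are illustrative):

./bin/inspect.sh ~/myproject ~/myproject/.idea/inspectionProfiles/Project_Default.xml /tmp/inspection-results -v2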
PyCharm does not do PEP8 code style checks by itself but uses the (bundled) pycodestyle tool.
Maybe these shortcomings can be fixed upstream. See e.g. this report, or this, or this.
I'm using this in Travis and GitHub Actions now, via this script.
This also does the necessary setup of a project.
This also compensates for the inspect.sh missing warnings by also using the pycodestyle tool and thus exactly matching the PyCharm IDE warnings.
Alternatively, I'm thinking about writing an extended simple utility which basically does this. All the relevant PyCharm code is open source. I created a project page pychar-inspect for this. But this is just in the planning phase currently, and may become obsolete when this is addressed upstream.
As a long-time Python programmer, I wonder if a central aspect of Python culture has eluded me for a long time: what do we do instead of Makefiles?
Most Ruby projects I've seen (not just Rails) use Rake; shortly after Node.js became popular, there was Cake. In many other (compiled and non-compiled) languages there are classic Makefiles.
But in Python, no one seems to need such infrastructure. I randomly picked Python projects on GitHub, and they had no automation besides the installation provided by setup.py.
What's the reason behind this?
Is there nothing to automate? Do most programmers prefer to run style checks, tests, etc. manually?
Some examples:
dependencies sets up a virtualenv and installs the dependencies
check calls the pep8 and pylint command-line tools.
the test task depends on dependencies, enables the virtualenv, starts selenium-server for the integration tests, and calls nosetest
the coffeescript task compiles all coffeescripts to minified javascript
the runserver task depends on dependencies and coffeescript
the deploy task depends on check and test and deploys the project.
the docs task calls sphinx with the appropriate arguments
Some of them are just one- or two-liners, but IMHO, they add up. Thanks to the Makefile, I don't have to remember them.
To clarify: I'm not looking for a Python equivalent of Rake. I'm happy with Paver. I'm looking for the reasons.
Actually, automation is useful to Python developers too!
Invoke is probably the closest tool to what you have in mind, for automation of common repetitive Python tasks: https://github.com/pyinvoke/invoke
With invoke, you can create a tasks.py like this one (borrowed from the invoke docs):
from invoke import run, task

@task
def clean(docs=False, bytecode=False, extra=''):
    patterns = ['build']
    if docs:
        patterns.append('docs/_build')
    if bytecode:
        patterns.append('**/*.pyc')
    if extra:
        patterns.append(extra)
    for pattern in patterns:
        run("rm -rf %s" % pattern)

@task
def build(docs=False):
    run("python setup.py build")
    if docs:
        run("sphinx-build docs docs/_build")
You can then run the tasks at the command line, for example:
$ invoke clean
$ invoke build --docs
Another option is to simply use a Makefile. For example, a Python project's Makefile could look like this:
docs:
	$(MAKE) -C docs clean
	$(MAKE) -C docs html
	open docs/_build/html/index.html

release: clean
	python setup.py sdist upload

sdist: clean
	python setup.py sdist
	ls -l dist
Setuptools can automate a lot of things, and for things that aren't built-in, it's easily extensible.
To run unittests, you can use the setup.py test command after having added a test_suite argument to the setup() call. (documentation)
Dependencies (even if not available on PyPI) can be handled by adding a install_requires/extras_require/dependency_links argument to the setup() call. (documentation)
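Putting those two together, a minimal sketch of such a setup.py might look like this (all names and version pins are illustrative):

from setuptools import setup, find_packages

setup(
    name='myproject',
    packages=find_packages(),
    test_suite='tests',                  # enables: python setup.py test
    install_requires=['requests>=2.0'],  # hard dependencies
    extras_require={'docs': ['sphinx']}, # optional: pip install myproject[docs]
)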
To create a .deb package, you can use the stdeb module.
For everything else, you can add custom setup.py commands.
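A custom command is just a class registered through cmdclass; here is a rough sketch of a hypothetical lint command (the LintCommand name and the package being linted are made up):

import subprocess
from setuptools import setup, Command

class LintCommand(Command):
    """Run pylint via: python setup.py lint"""
    description = 'run pylint on the package'
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        # exit with pylint's return code so CI can fail the build
        raise SystemExit(subprocess.call(['pylint', 'myproject']))

setup(
    name='myproject',
    cmdclass={'lint': LintCommand},
)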
But I agree with S.Lott, most of the tasks you'd wish to automate (except dependencies handling maybe, it's the only one I find really useful) are tasks you don't run everyday, so there wouldn't be any real productivity improvement by automating them.
There are a number of options for automation in Python. I don't think there is a culture against automation; there is just not one dominant way of doing it. The common denominator is distutils.
The one which is closest to your description is buildout. This is mostly used in the Zope/Plone world.
I myself use a combination of the following: Distribute, pip and Fabric. I am mostly developing using Django that has manage.py for automation commands.
Packaging is also being actively worked on for Python 3.3.
Any decent test tool has a way of running the entire suite in a single command, and nothing is stopping you from using rake, make, or anything else, really.
There is little reason to invent a new way of doing things when existing methods work perfectly well - why re-invent something just because YOU didn't invent it? (NIH).
The make utility is an optimization tool which reduces the time spent building a software image. The reduction in time is obtained when all of the intermediate materials from a previous build are still available, and only a small change has been made to the inputs (such as source code). In this situation, make is able to perform an "incremental build": rebuild only a subset of the intermediate pieces that are impacted by the change to the inputs.
When a complete build takes place, all that make effectively does is to execute a set of scripting steps. These same steps could just be deposited into a flat script. The -n option of make will in fact print these steps, which makes this possible.
A Makefile isn't "automation"; it's "automation with a view toward optimized incremental rebuilds." Anything scripted with any scripting tool is automation.
So, why would Python projects eschew tools like make? Probably because Python projects don't struggle with long build times that they are eager to optimize. And, also, the compilation of a .py to a .pyc file does not have the same web of dependencies as a .c to a .o.
A C source file can #include hundreds of dependent files; a one-character change in any one of these files can mean that the source file must be recompiled. A properly written Makefile will detect when that is or is not the case.
A big C or C++ project without an incremental build system would mean that a developer has to wait hours for an executable image to pop out for testing. Fast, incremental builds are essential.
In the case of Python, probably all you have to worry about is when a .py file is newer than its corresponding .pyc, which can be handled by simple scripting: loop over all the files, and recompile anything newer than its byte code. Moreover, compilation is optional in the first place!
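In fact, the standard library already automates this; a minimal sketch (the directory name is illustrative):

import compileall

# By default, compile_dir skips .pyc files that are already up to date,
# i.e. it performs exactly the incremental rebuild described above.
compileall.compile_dir('src', quiet=1)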
So the reason Python projects tend not to use make is that their need to perform incremental rebuild optimization is low, and they use other tools for automation; tools that are more familiar to Python programmers, like Python itself.
The original PEP where this was raised can be found here. Distutils has become the standard method for distributing and installing Python modules.
Why? It just happens that python is a wonderful language to perform the installation of Python modules with.
Here are few examples of makefile usage with python:
https://blog.horejsek.com/makefile-with-python/
https://krzysztofzuraw.com/blog/2016/makefiles-in-python-projects.html
I think that most people are not aware of the "makefile for python" case. It could be useful, but the "sexiness ratio" is too small for it to propagate rapidly (just my personal point of view).
Is there nothing to automate?
Not really. All but two of the examples are one-line commands.
tl;dr Very little of this is really interesting or complex. Very little of this seems to benefit from "automation".
Due to documentation, I don't have to remember the commands to do this.
Do most programmers prefer to run stylechecks, tests, etc. manually?
Yes.
generating documentation,
the docs task calls sphinx with the appropriate arguments
It's one line of code. Automation doesn't help much.
sphinx-build -b html source build/html. That's a script. Written in Python.
We do this rarely. A few times a week. After "significant" changes.
running stylechecks (Pylint, Pyflakes and the pep8-cmdtool).
check calls the pep8 and pylint commandlinetools
We don't do this. We use unit testing instead of pylint.
You could automate that three-step process.
But I can see how SCons or make might help someone here.
tests
There might be room for "automation" here. It's two lines: the non-Django unit tests (python test/main.py) and the Django tests (manage.py test). Automation could be applied to run both lines, as in the one-liner below.
We do this dozens of times each day. We never knew we needed "automation".
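For what it's worth, chaining the two commands from the answer into one shell line (adding the python prefix to manage.py) looks like:

python test/main.py && python manage.py test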
dependencies sets up a virtualenv and installs the dependencies
Done so rarely that a simple list of steps is all that we've ever needed. We track our dependencies very, very carefully, so there are never any surprises.
We don't do this.
the test task depends on dependencies enables the virtualenv, starts selenium-server for the integration tests, and calls nosetest
The start server & run nosetest as a two-step "automation" makes some sense. It saves you from entering the two shell commands to run both steps.
the coffeescript task compiles all coffeescripts to minified javascript
This is something that's very rare for us. I suppose it's a good example of something to be automated. Automating the one-line script could be helpful.
I can see how SCons or make might help someone here.
the runserver task depends on dependencies and coffeescript
Except that the dependencies change so rarely that this seems like overkill. I suppose it can be a good idea if you're not tracking dependencies well in the first place.
the deploy task depends on check and test and deploys the project.
It's an svn co and python setup.py install on the server, followed by a bunch of customer-specific copies from the subversion area to the customer /www area. That's a script. Written in Python.
It's not a general make or SCons kind of thing. It has only one actor (a sysadmin) and one use case. We wouldn't ever mingle deployment with other development, QA or test tasks.
Does anybody know how to distinguish new errors (those found during the latest Pylint execution) from old errors (those already found during previous executions) in the Pylint report?
I'm using Pylint in one of my projects, and the project is pretty big. Pylint reports quite a few errors (even though I disabled lots of them in the rcfile). While I fix these errors over time, it is also important not to introduce new ones. But Pylint's HTML and "parseable" reports don't distinguish new errors from those that were identified previously, even though I run Pylint with the persistent=yes option.
For now, I compare old and new reports manually. What would be really nice, though, is if Pylint could somehow highlight those error messages which were found on the latest run but not on the previous one. Is it possible to do so using Pylint or existing tools? Because if not, it seems I will end up writing my own comparison and report generation.
Two basic approaches. Fix errors as they appear so that there will be no old ones. Or, if you have no intention of fixing certain types of lint errors, tell lint to stop reporting them.
If you have a lot of files, it would be a good idea to get a lint report for each file separately, commit the lint reports to revision control like svn, and then use the revision control system's diff utility to separate new lint errors from older pre-existing ones. The reason for separate reports for each .py file is to make it easier to read the diff output.
If you are on Linux, vim -d oldfile newfile is a nice way to read diff. If you are on Windows then just use the diff capability built into Tortoise SVN.
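If you'd rather script the comparison directly, here is a rough sketch of the idea, assuming two reports saved in pylint's "parseable" format (the report file names are illustrative; keying on file, message ID, and message text rather than line number keeps moved code from showing up as "new"):

import re

def message_keys(report_path):
    """Collect (file, message-id, text) keys from a pylint 'parseable' report."""
    keys = set()
    # parseable lines look roughly like: path.py:42: [C0103, some_func] message
    pattern = re.compile(r'^(.+?):\d+: (\[.+?\]) (.*)$')
    with open(report_path) as report:
        for line in report:
            match = pattern.match(line)
            if match:
                keys.add(match.groups())
    return keys

old_keys = message_keys('pylint_old.txt')
new_keys = message_keys('pylint_new.txt')
for key in sorted(new_keys - old_keys):
    print('NEW: %s %s %s' % key)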
Run pylint on the dev branch, get x errors
Run pylint on the master branch, get y errors
If x > y, you have introduced new errors
You can do the above in your CI process, before code is merged to master
I would very much like to integrate pylint into the build process for my python projects, but I have run into one show-stopper: one of the error types that I find extremely useful (E1101: %s %r has no %r member) constantly reports errors when using common Django fields, for example:
E1101:125:get_user_tags: Class 'Tag' has no 'objects' member
which is caused by this code:
def get_user_tags(username):
    """
    Gets all the tags that username has used.

    Returns a query set.
    """
    return Tag.objects.filter(  # This line triggers the error.
        tagownership__users__username__exact=username).distinct()

# Here is the Tag class; models.Model is provided by Django:
class Tag(models.Model):
    """
    Model for user-defined strings that help categorize Events
    on a per-user basis.
    """
    name = models.CharField(max_length=500, null=False, unique=True)

    def __unicode__(self):
        return self.name
How can I tune Pylint to properly take fields such as objects into account? (I've also looked into the Django source, and I have been unable to find the implementation of objects, so I suspect it is not "just" a class field. On the other hand, I'm fairly new to python, so I may very well have overlooked something.)
Edit: The only way I've found to tell pylint to not warn about these warnings is by blocking all errors of the type (E1101) which is not an acceptable solution, since that is (in my opinion) an extremely useful error. If there is another way, without augmenting the pylint source, please point me to specifics :)
See here for a summary of the problems I've had with pychecker and pyflakes - they've proven to be far too unstable for general use. (In pychecker's case, the crashes originated in the pychecker code, not the source it was loading/invoking.)
Do not disable or weaken Pylint functionality by adding ignores or generated-members.
Use an actively developed Pylint plugin that understands Django.
This Pylint plugin for Django works quite well:
pip install pylint-django
and when running pylint add the following flag to the command:
--load-plugins pylint_django
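If you'd rather not pass the flag on every run, the plugin can also be loaded from your pylintrc (a sketch, following the same section style as the pylintrc examples below):

[MASTER]
load-plugins=pylint_django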
Detailed blog post here.
I use the following: pylint --generated-members=objects
If you use Visual Studio Code do this:
pip install pylint-django
And add to VSC config:
"python.linting.pylintArgs": [
"--load-plugins=pylint_django"
],
My ~/.pylintrc contains
[TYPECHECK]
generated-members=REQUEST,acl_users,aq_parent,objects,_meta,id
the last two are specifically for Django.
Note that there is a bug in PyLint 0.21.1 which needs patching to make this work.
Edit: After messing around with this a little more, I decided to hack PyLint just a tiny bit to allow me to expand the above into:
[TYPECHECK]
generated-members=REQUEST,acl_users,aq_parent,objects,_meta,id,[a-zA-Z]+_set
I simply added:
import re
for pattern in self.config.generated_members:
    if re.match(pattern, node.attrname):
        return
after the fix mentioned in the bug report (i.e., at line 129).
Happy days!
django-lint is a nice tool which wraps pylint with Django-specific settings: http://chris-lamb.co.uk/projects/django-lint/
github project: https://github.com/lamby/django-lint
Because of how pylint works (it examines the source itself, without letting Python actually execute it) it's very hard for pylint to figure out how metaclasses and complex baseclasses actually affect a class and its instances. The 'pychecker' tool is a bit better in this regard, because it does actually let Python execute the code; it imports the modules and examines the resulting objects. However, that approach has other problems, because it does actually let Python execute the code :-)
You could extend pylint to teach it about the magic Django uses, or to make it understand metaclasses or complex baseclasses better, or to just ignore such cases after detecting one or more features it doesn't quite understand. I don't think it would be particularly easy. You can also just tell pylint to not warn about these things, through special comments in the source, command-line options or a .pylintrc file.
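For example, the special-comment route applied to the line from the question uses pylint's standard inline syntax:

return Tag.objects.filter(  # pylint: disable=E1101
    tagownership__users__username__exact=username).distinct()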
I gave up on using pylint/pychecker in favor of pyflakes with Django code - it just tries to import the module and reports any problem it finds, like unused imports or uninitialized local names.
This is not a solution, but you can add objects = models.Manager() to your Django models without changing any behavior.
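Applied to the Tag model from the question, that looks like this (behavior is unchanged because objects is exactly the default manager Django would create anyway):

class Tag(models.Model):
    objects = models.Manager()  # explicit default manager; silences E1101
    name = models.CharField(max_length=500, null=False, unique=True)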
I myself only use pyflakes, primarily due to some dumb defaults in pylint and laziness on my part (not wanting to look up how to change the defaults).
Try running pylint with
pylint --ignored-classes=Tag
If that works, add all the other Django classes - possibly using a script in, say, python :P
The documentation for --ignored-classes is:
--ignored-classes=<class names>
        List of class names for which member
        attributes should not be checked
        (useful for classes with attributes
        dynamically set). [current: %default]
I should add that this is not a particularly elegant solution in my view, but it should work.
For neovim & vim8, use w0rp's ale plugin. If you have installed everything correctly, including w0rp's ale, pylint & pylint-django, add the following line to your vimrc & have fun developing web apps using django.
let g:ale_python_pylint_options = '--load-plugins pylint_django'
Thanks.
The solution proposed in this other question is to simply add get_attr to your Tag class. Ugly, but it works.
So far I have found no real solution to that, only a workaround:
In our company we require a pylint score > 8. This allows coding practices pylint doesn't understand while ensuring that the code isn't too "unusual". So far we haven't seen any instance where E1101 kept us from reaching a score of 8 or higher.
Our 'make check' targets filter out "has no 'objects' member" messages to remove most of the distraction caused by pylint not understanding Django.
For heroku users, you can also use Tal Weiss's answer to this question using the following syntax to run pylint with the pylint-django plugin (replace timekeeping with your app/package):
# run on the entire timekeeping app/package
heroku local:run pylint --load-plugins pylint_django timekeeping
# run on the module timekeeping/report.py
heroku local:run pylint --load-plugins pylint_django timekeeping/report.py
# With temporary command line disables
heroku local:run pylint --disable=invalid-name,missing-function-docstring --load-plugins pylint_django timekeeping/report.py
Note: I was unable to run without specifying project/package directories.
If you have issues with E5110: Django was not configured, you can also invoke pylint as follows to try to work around that (again, change timekeeping to your app/package):
heroku local:run python manage.py shell -c 'from pylint import lint; lint.Run(args=["--load-plugins", "pylint_django", "timekeeping"])'
# With temporary command line disables, specific module
heroku local:run python manage.py shell -c 'from pylint import lint; lint.Run(args=["--load-plugins", "pylint_django", "--disable=invalid-name,missing-function-docstring", "timekeeping/report.py"])'