How to know what version of a GitHub Action to use - python

I've noticed in various GitHub Actions workflow examples that when calling a pre-defined action (with the uses: syntax), a particular version of that action is often specified. For example:
steps:
  - uses: actions/checkout@v2
  - name: Set up Python
    uses: actions/setup-python@v2
    with:
      python-version: '3.x'
The above workflow specifies @v2 for both actions/checkout and actions/setup-python.
The question is, how does one know that @v2 is the best version to use?
And how will I know when @v3 becomes available?
Even more confusing is the case of the action used to publish to PyPI, pypa/gh-action-pypi-publish. In examples I have looked at, I have seen at least four different versions specified:
pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
pypa/gh-action-pypi-publish@master
pypa/gh-action-pypi-publish@v1
pypa/gh-action-pypi-publish@release/v1
How do I know which one to use?
And in general, how do you know which ones are available, and what the differences are?

How to know which version to use?
When writing a workflow and including an action, I recommend looking at the Releases page of the action's GitHub repository. For actions/setup-python, that would be https://github.com/actions/setup-python/releases
On that page, you should see what versions there are and what the latest one is. You want to use the latest version, because that way you can be sure you're not falling behind and upgrading doesn't become too painful in the future.
How to reference a version?
By convention, actions are published with specific tags (e.g. v1.0.1) as well as a major tag (e.g. v1). This allows you to reference an action like so: actions/setup-python@v1. As soon as version v1.0.2 is published, you will automatically use that one. This means you benefit from bug fixes and new features, but you're prevented from pulling in breaking changes.
However, note that this is only by convention. Not every author of an action publishes a major tag and moves that along as new tags are published. Furthermore, an author might introduce a breaking change without bumping the major version.
When to use other formats
As you said, there are other ways to reference an action, such as a specific commit (e.g. actions/setup-python@27b31702a0e7fc50959f5ad993c78deac1bdfc29) and others.
In general, you want to stick to tags as described above. In particular, referencing @main or @master is dangerous, because you'll always get the latest changes, which might break your workflow. If an action advises you to reference its default branch and doesn't publish tags, I recommend opening an issue in its GitHub repository asking for tags to be published.
Using a git hash can be useful if you need a specific version. A use case could be that you want to test whether a particular version would fix a problem, or that the author of the action has pushed some new commits that are not tagged yet and you want to try that version of the action.
Security
From a security perspective, using a tag (e.g. @v1 or @v1.1.0) or a branch (e.g. @main) is problematic, because the author of that repository could change what it refers to. A malicious author of an action could add malicious code to that branch, or simply not be careful enough when reviewing a PR and thereby introduce a vulnerability (e.g. via a transitive dependency).
By using hashes (e.g. @27b31702a0e7fc50959f5ad993c78deac1bdfc29) you know exactly what you get, and it doesn't change unless you choose to change the version by updating the hash (at which point you can carefully review the changes).
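For illustration, a hash-pinned step could look like the sketch below. It reuses the commit hash from the question; the trailing comment is only a human-readable reminder of which tag the hash was taken from, and the hash-to-tag mapping shown here is assumed, not verified:
steps:
  # Pinned to an exact commit so it cannot change underneath you.
  - uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29  # release/v1 (assumed)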
As of early 2022, using hashes instead of tags is not widely adopted, but GitHub, for example, does this for their docs repository. As supply-chain security becomes more important, tools are being created to help with "pinning" (pointing to a specific version by hash rather than tag), such as sethvargo/ratchet. And even Dependabot (see below) should be able to update pinned hashes to the latest one.
How to know when there is a new version?
You can use Dependabot for that: Keeping your actions up to date with Dependabot. Dependabot is a tool that creates a pull request in your repository as soon as a new version of any of your actions is available, so that you can review the changes and keep your workflow up to date.
Here's a sample Dependabot configuration (placed in .github/dependabot.yml) that keeps your actions up to date by creating PRs:
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "daily"

People should get used to such tag-based release management (Docker image tags, for example, work the same way), as documented in various articles.
How does a user know which tag to use? Usually the action's documentation contains the recommended version to use, so 99% of users should follow that. You only need other tags if you want to live on the bleeding edge.

Related

Can I use Facebook translate service in my python NLP project?

I hope you are feeling good and safe.
I'm working on a natural language processing project for my master's degree, and I need to translate
my local dialect to English. I noticed that Facebook's translation engine does very well with my local dialect.
So my question is: is there any way to use Facebook's translation service in my project, like an API or a Python module that uses it?
Which language is your local language?
Facebook has many machine translation models, so it depends on how good it has to be and how much computing power you have. I am not sure if they offer their latest state-of-the-art ones that they use in their products as an independent translation tool as well.
First Option: Run full models locally
One way would be using one of their models on huggingface (see the "Generation" part):
https://huggingface.co/docs/transformers/model_doc/m2m_100#training-and-generation
They also have some easy-to-use pretrained models in their torch.hub module (but that probably doesn't cover your local language):
https://github.com/pytorch/fairseq/blob/main/examples/translation/README.md
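For the Hugging Face route, a minimal sketch adapted from the m2m_100 documentation linked above could look like this (the French example sentence and the 418M checkpoint are just illustrative choices):
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Load the smallest M2M-100 checkpoint; larger ones exist but need more memory.
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "fr"  # language code of the input text
encoded = tokenizer("La vie est belle.", return_tensors="pt")
# Force the decoder to start generating in the target language (English here).
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("en"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))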
Second Option: APIs
As I said it depends on what quality you need, you could try out some easy-to-use (non-facebook) APIs and see how far that gets you, as this is much easier and you can use them online:
e.g. https://libretranslate.com/
Or check out this comparison of APIs: https://rapidapi.com/collection/google-translate-api-alternatives
APIs are usually limited to a maximum number of characters/words/requests per month/day/minute so you'll have to see if that is enough for your case and if the quality is acceptable.
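As a rough sketch of what calling such an API looks like, here is a request against LibreTranslate's /translate endpoint (note: the public instance may require an api_key field and rate-limits requests, so check its documentation):
import requests

# Ask the server to detect the source language and translate to English.
resp = requests.post(
    "https://libretranslate.com/translate",
    json={"q": "La vie est belle.", "source": "auto", "target": "en", "format": "text"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["translatedText"])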
Third Option: pip packages which use APIs
For example check out: https://pypi.org/project/deep-translator/
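A minimal sketch with deep-translator, using its GoogleTranslator backend (the other backends it offers follow the same pattern):
from deep_translator import GoogleTranslator

# Auto-detect the source language and translate to English.
print(GoogleTranslator(source="auto", target="en").translate("La vie est belle."))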
Fourth Option: pip wrapper packages which run translation locally
A great package which actually has some pretty strong facebook MT models is: https://pypi.org/project/EasyNMT/ (it also has their strong m2m_100 models)
More lightweight but probably not as strong: https://github.com/argosopentech/argos-translate
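For EasyNMT, a sketch could look like this ("m2m_100_418M" is one of the model names it ships with; the weights are downloaded on first use and need a fair amount of RAM):
from easynmt import EasyNMT

# Load one of the Facebook m2m_100 models; 'opus-mt' is a lighter alternative.
model = EasyNMT("m2m_100_418M")
print(model.translate("La vie est belle.", target_lang="en"))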
Conclusion:
Since I assume your local language is not supported by that many models, I would first try the fourth option (start with the biggest models, and if they don't work, try smaller ones).
If that doesn't work out, you can see whether you can get the APIs to work for your case.
If you have a lot of computing power and want to go a bit deeper you can run the full model inference locally using huggingface or fairseq.

How to get all merged pull requests since last tag was added using github3.py?

I am trying to identify all merged pull-requests that happened since last release was made. A release always has a tag, so the logic is to find any pull-requests that happened after that tag was created.
Apparently the pull-request API does not allow filtering by tags, and neither does the commits one.
I guess that if I find a way to query all commits that happened after a particular tag, I may be able to detect which pull-requests produced them (I do not care about direct pushes).
Details:
commits(...)
pull_requests(...) -- gets them in reverse order of creation, which is perfect, but it seems to never stop; apparently passing head=mytag does not have the desired effect of making it stop at that tag
I want this in order to be able to produce some draft release notes, and all the data I need is the list of PRs that were merged.
I ended up using gitpython to perform a local query that returned the commits. Example at https://github.com/pycontribs/tender/blob/master/tender/__main__.py#L133-L145 but the main code looks like:
# self.git is the GitPython repository object; tag is the name of the last release tag
rev = f"{tag}..HEAD"  # every commit reachable from HEAD but not from the tag
for commit in self.git.iter_commits(rev=rev):
    result[commit.hexsha] = commit
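If you also need the pull-request numbers rather than the raw commits, one possible follow-up (not part of the original answer; it assumes PRs are merged on GitHub via merge commits or squash merges, and the tag name is hypothetical) is to parse the commit subjects:
import re
from git import Repo  # GitPython, as in the snippet above

repo = Repo(".")
tag = "v1.0.0"  # hypothetical: the tag of the last release

pr_numbers = set()
for commit in repo.iter_commits(rev=f"{tag}..HEAD"):
    subject = commit.message.splitlines()[0]
    # Merge commits look like "Merge pull request #123 from owner/branch",
    # while squash merges usually end the subject with "(#123)".
    match = re.search(r"Merge pull request #(\d+)|\(#(\d+)\)$", subject)
    if match:
        pr_numbers.add(int(match.group(1) or match.group(2)))

print(sorted(pr_numbers))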

What is the difference between Kinto and Cliquet?

Why is Kinto using Cliquet, and what is the difference between the two?
Disclaimer: I am one of the authors of both tools. Since this is frequently asked question, I thought it would be relevant to share a proper answer here :)
At Mozilla Services we regularly implement and deploy micro-services.
Since most services share the same production requirements (in terms of monitoring, REST protocols etc.), we decided to develop and package a reusable toolkit using Cornice.
Kinto is one of those services. It uses Cliquet as one of its core libraries.
The Kinto HTTP API is made of several REST endpoints that all share a set of common properties (filterable, sortable, etc.). The common code base for those REST resources is implemented as a reusable class in Cliquet.
We really like the name Cliquet. However, given the confusion of its scope, we will probably (some day) split it into two packages, called like cornice-mozprod and cornice-crud.
Kinto and Cliquet have now been merged together, and Cliquet is no longer a separate project.
See all the nifty details at https://mail.mozilla.org/pipermail/kinto/2016-May/000119.html

Zenoss - Device Access Control Lists for Customers

We're evaluating Zenoss and are interested in Device Access Control. We would like to set up the system so that our customers could access Zenoss and only see their devices and status. This feature apparently only exists in the enterprise version as can be seen here.
In the user configuration page there is an "Administered Objects" section, but in the community version it apparently has no practical effect. There is also a roles and permissions configuration page available at http://.../zport/manage_access but I haven't really figured out how to use it for this use case.
Can anyone give me some tips on how we could limit a certain user to certain devices or device groups? Would it require changing a lot of code in the Zenoss core? Can we do that with a ZenPack? Are there any examples on how to do this?
Thanks in advance!
I am working on this right now. Part of the issue is that there are a number of bugs around the Zenoss Administered Objects concept. I have posted some findings at the Zenoss forum at http://community.zenoss.org/message/59100#59100 . I have also opened a number of tickets with Zenoss (referenced in the previous url). If you can add extra supporting information to the tickets then it may get their priority raised. Meanwhile, I am working on my own code fixes / ZenPack workaround and almost have something ready for alpha testing if you are interested.
Cheers,
Jane

Building a wiki application?

I'm building this app in Python with Django.
I would like to give parts of the site wiki-like functionality, but I don't know how to go about reliability and security.
Make sure that good content is not ruined
Check for quality
Prevent spam from invading the site
The items requiring wiki-like functionality are just a few: a couple of text fields.
Can anyone help on this one?
Would be very much appreciated. :)
You could try using Django Wikiapp, which gives you most of the features you want in a wiki, including history and the ability to revert to older versions of an article. I have personally used this app and it's pretty self-explanatory; they also have a bit of documentation at http://code.google.com/p/django-wikiapp/source/browse/trunk/docs.
In terms of spam protection you can do one of two things, or both: password-protect the pages that have to do with editing the wiki, and use Akismet to filter for spam. I'm working on something similar and this is probably what we'll end up doing.
Assuming that there will be a community of users you can provide good tools for them to spot problems and easily undo damage. The most important of these is to provide a Recent Changes page that summarizes recent edits. Then each page that can be edited should retain prior versions of the page that can be used to replace any damaging edit. This makes it easier to undo damage than it is to damage things.
Then think about how you are going to handle either locking resources or handling simultaneous edits.
If you can tie edits to users you can provide some administrative functions for undoing all edits by a particular user, and banning that user.
Checking for quality would be tied to the particular data that your application is using.
Make sure that good content is not ruined = version each edit and allow roll-backs.
Check for quality = get people to help with that
Prevent spam from invading the site = get people to help with that, require login, add a captcha if need be, use nofollow for all links
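To make the "version each edit and allow roll-backs" point concrete, here is a minimal Django sketch (the model and field names are hypothetical, not taken from Django Wikiapp or any other package mentioned above): every edit creates a new revision, and reverting just copies an old revision forward.
from django.conf import settings
from django.db import models


class WikiSection(models.Model):
    # One of the few wiki-editable text fields mentioned in the question.
    name = models.CharField(max_length=200, unique=True)


class Revision(models.Model):
    # Every edit creates a new Revision instead of overwriting the old text,
    # so undoing damage is easier than causing it.
    section = models.ForeignKey(WikiSection, related_name="revisions", on_delete=models.CASCADE)
    text = models.TextField()
    author = models.ForeignKey(settings.AUTH_USER_MODEL, null=True, on_delete=models.SET_NULL)
    created = models.DateTimeField(auto_now_add=True)

    class Meta:
        ordering = ["-created"]  # latest first, handy for a Recent Changes page


def revert(section, revision_id):
    # Rolling back means creating a brand-new revision with an old revision's text.
    old = section.revisions.get(pk=revision_id)
    return Revision.objects.create(section=section, text=old.text, author=old.author)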
