I'm making an API that returns all the versions currently available. The versions are structured like this: 22.12a
22 is the year, 12 is the month, and the letter (a) goes up by one every time we launch another version that month, resetting each month.
My problem is that I need to sort the version so that they can be in the release order like this:
["22.12b","22.12a","22.11a","22.9a"]
But I have no idea how to do it.
You can use natsort.natsorted():
from natsort import natsorted
versions = ['22.12b', '22.12a', '22.11a', '22.9a']
natsorted(versions)
#['22.9a', '22.11a', '22.12a', '22.12b']
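natsorted sorts oldest-first by default; it also accepts reverse=True if you want the newest-first order shown in the question. A minimal sketch:
from natsort import natsorted
versions = ['22.12b', '22.12a', '22.11a', '22.9a']
print(natsorted(versions, reverse=True))
#['22.12b', '22.12a', '22.11a', '22.9a']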
It can also be done via packaging.version.parse():
from packaging.version import parse
versions = ['22.12b', '22.12a', '22.11a', '22.9a']
versions.sort(key=parse)
#['22.9a', '22.11a', '22.12a', '22.12b']
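If you'd rather avoid a third-party dependency, a plain key function works too. This is just a sketch, and it assumes the format is always YY.M followed by a single lowercase letter:
def version_key(v):
    # "22.12b" -> (22, 12, "b"); single letters compare alphabetically
    year, rest = v.split(".")
    return int(year), int(rest[:-1]), rest[-1]

versions = ["22.12b", "22.12a", "22.11a", "22.9a"]
print(sorted(versions, key=version_key, reverse=True))
#['22.12b', '22.12a', '22.11a', '22.9a']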
When / Why does setuptools-scm append .devXXX to its generated version?
In a couple of repos I maintain, setuptools-scm has started producing versions with .devXXX appended to the version number. This causes issues because such a version is invalid for upload to PyPI.
I created a workaround the first time this happened and assumed it was because I had done something improper in git. Now it has happened in a really simple project too, and it's really frustrating.
The workaround that I used before is to hijack the versioning via use_scm_version. This is less than ideal, and I'd like to understand the root cause.
Thanks in advance for any help you might be able to offer!
Documentation is here:
https://github.com/pypa/setuptools_scm/#importing-in-setuppy
# setup.py
import setuptools

def _clean_version():
    """
    This function was required because scm was generating developer versions on
    GitHub Actions.
    """
    def get_version(version):
        return str(version.tag)

    def empty(version):
        return ''

    return {'local_scheme': get_version, 'version_scheme': empty}

setuptools.setup(
    ...
    use_scm_version=_clean_version,
    ...
)
It does so because commits that are not tagged have their version computed like this:
X.Y.(Z+1)-devN-gSHA
where:
X.Y.Z is the most recent previously tagged commit that your current commit sits on top of,
N is the number of commits since that X.Y.Z tag,
and SHA is the SHA of your current commit.
-dev* versions are considered beta/pre-release versions of the version they lead up to,
so X.Y.(Z+1)-devN-gSHA is considered a pre-release of X.Y.(Z+1).
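You can see this ordering of .dev releases directly with the packaging library; a small illustration (not part of the original answer):
from packaging.version import Version

# A .devN release sorts after the previous release but before the release it leads up to
assert Version("1.2.3") < Version("1.2.4.dev4") < Version("1.2.4")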
I'm having trouble understanding an issue with calling packages after they have been imported using the __import__ function in Python. I'll note that the usual import x as y works fine, but this is a learning exercise for me. I am importing and then checking the version of multiple packages, and to learn a little more about Python, I wanted to automate this with a dictionary.
My dictionary looks something like this:
pack = {"numpy": ["np", "1.7.1"]}
and I then use this to load and check the modules:
for keys in pack.keys():
    pack[keys][0] = __import__(keys)
    print("%s version: %6.6s (need at least %s)" % (keys, pack[keys][0].__version__, pack[keys][1]))
This works fine, but when I later try to call the package, it does not recognize it: x = np.linspace(0,10,30)
produces an error saying np isn't recognized, but this works: x = pack[keys][0].linspace(0,10,30)
Since this is just a way for me to learn, I'd also be interested in any solutions that change how I've approached the problem. I went with dictionaries because I've at least heard of them, and I used the __import__ function because I was forced to use either quoted strings or numeric values in my dictionary values; the quoted strings created problems for the import x as y technique.
Although it isn't good practice, you can create variables dynamically using the built-in locals() function. (Note that assigning through locals() is only reliable at module level, where locals() and globals() are the same dictionary.)
So,
for module, data in pack.items():
    locals()[data[0]] = __import__(module)
    nickname = locals()[data[0]]
    print("%s version: %6.6s (need at least %s)" % (module, nickname.__version__, pack[module][1]))
Output:
numpy version: 1.12.1 (need at least 1.7.1)
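A variation that avoids touching locals() altogether is to keep the imported modules in a dictionary keyed by the alias, using importlib.import_module. A sketch (assuming numpy is installed):
import importlib

pack = {"numpy": ["np", "1.7.1"]}
modules = {}

for name, (alias, min_version) in pack.items():
    modules[alias] = importlib.import_module(name)
    print("%s version: %6.6s (need at least %s)" % (name, modules[alias].__version__, min_version))

x = modules["np"].linspace(0, 10, 30)  # call the module through its alias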
I am learning Spark. When I tried to load a JSON file as follows:
people=sqlContext.jsonFile("C:\wdchentxt\CustomerData.json")
I got the following error:
AttributeError: 'SQLContext' object has no attribute 'jsonFile'
I am running this on Windows 7 PC, with spark-2.1.0-bin-hadoop2.7, and Python 2.7.13 (Dec 17, 2016).
Thank you for any suggestions that you may have.
You probably forgot to import the implicits. This is what my solution looks like in Scala:
def loadJson(filename: String, sqlContext: SQLContext): Dataset[Row] = {
  import sqlContext._
  import sqlContext.implicits._
  val df = sqlContext.read.json(filename)
  df
}
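Since the question is about PySpark, here is a rough Python equivalent of the above (just a sketch; no implicits are needed on the Python side):
def load_json(filename, sqlContext):
    # Reads a JSON file into a DataFrame using the DataFrameReader API
    return sqlContext.read.json(filename)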
First, the more recent versions of Spark (like the one you are using) use .read.json(..) instead of the deprecated .jsonFile(..).
Second, you need to be sure that your SQLContext is set up correctly, as mentioned here: pyspark : NameError: name 'spark' is not defined. In my case, it's set up like this:
from pyspark.sql import SQLContext, Row
sqlContext = SQLContext(sc)
myObjects = sqlContext.read.json('file:///home/cloudera/Downloads/json_files/firehose-1-2018-08-24-17-27-47-7066324b')
Note that they have version-specific quick-start tutorials that can help with getting some of the basic operations right, as mentioned here: name spark is not defined
So, my point is: always check that, for whatever library or language you are using (and this applies across technologies in general), you are following the documentation that matches the version you are running, because breaking changes cause a lot of confusion when there is a version mismatch. If the feature you are trying to use is not well documented for your version, evaluate whether to upgrade to a more recent version or open a support ticket with the project's maintainers so you can help them better support their users.
You can find a guide on all of the version-specific changes of Spark here: https://spark.apache.org/docs/latest/sql-programming-guide.html#upgrading-from-spark-sql-16-to-20
You can also find version-specific documentation on Spark and PySpark here (e.g. for version 1.6.1): https://spark.apache.org/docs/1.6.1/sql-programming-guide.html
As mentioned before, .jsonFile(...) has been deprecated [1]; use this instead:
people = sqlContext.read.json(r"C:\wdchentxt\CustomerData.json").rdd
Source:
[1]: https://docs.databricks.com/spark/latest/data-sources/read-json.html
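For Spark 2.x and later, SparkSession is the preferred entry point and wraps SQLContext; a minimal sketch (the path is just the one from the question, switched to forward slashes to avoid backslash-escape surprises):
from pyspark.sql import SparkSession

# SparkSession replaces SQLContext as the entry point in Spark 2.x+
spark = SparkSession.builder.appName("read-json-example").getOrCreate()

people = spark.read.json("C:/wdchentxt/CustomerData.json")
people.printSchema()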
Simple code like this no longer works in my Python shell:
import pandas as pd
df=pd.read_csv("K:/01. Personal/04. Models/10. Location/output.csv",index_col=None)
df.sample(3000)
The error I get is:
AttributeError: 'DataFrame' object has no attribute 'sample'
DataFrames definitely have a sample function, and this used to work.
I recently had some trouble installing and then uninstalling another distribution of python. I don't know if this could be related.
I've previously had a similar problem when trying to execute a script that had the same name as a module I was importing. That is not the case here, and pandas.read_csv is actually working.
What could cause this?
As given in the documentation of DataFrame.sample -
DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)
Returns a random sample of items from an axis of object.
New in version 0.16.1.
(Emphasis mine).
DataFrame.sample was added in 0.16.1, so you can either:
Upgrade pandas to the latest version; you can use pip for that. Example:
pip install pandas --upgrade
Or, if you don't want to upgrade and just want to sample a few rows from the DataFrame, you can use random.sample(). Example:
import random
num = 100 #number of samples
sampleddata = df.loc[random.sample(list(df.index),num)]
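Since the question mentions installing and uninstalling another Python distribution, it's also worth checking which pandas the shell is actually importing; a quick diagnostic sketch:
import pandas as pd

print(pd.__version__)  # DataFrame.sample needs at least 0.16.1
print(pd.__file__)     # shows whether a stray installation is being picked up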
I am using pyusb, and according to the docs it runs on any one of three backends: libusb01, libusb10, and openusb. I have all three backends installed. How can I tell which backend it is using, and how can I switch to a different one?
I found the answer by looking inside the usb.core source file.
You do it by importing the backend module and then passing it via the backend parameter of usb.core.find(), like so:
import usb.core
import usb.backend.libusb1 as libusb1
import usb.backend.libusb0 as libusb0
import usb.backend.openusb as openusb
and then any one of:
devices = usb.core.find(find_all=1, backend=libusb1.get_backend() )
devices = usb.core.find(find_all=1, backend=libusb0.get_backend() )
devices = usb.core.find(find_all=1, backend=openusb.get_backend() )
This assumes you are using pyusb-1.0.0a3. For 1.0.0a2 the libs are called libusb10, libusb01 and openusb. Of course, you'd only need to import the one you want.
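As a quick check of which backend actually gets used, you can pick one explicitly and list the devices it sees. A minimal sketch (assumes pyusb 1.0+ with libusb-1.0 installed on the system):
import usb.core
import usb.backend.libusb1 as libusb1

backend = libusb1.get_backend()
for dev in usb.core.find(find_all=True, backend=backend):
    # idVendor/idProduct identify each device the chosen backend can enumerate
    print(hex(dev.idVendor), hex(dev.idProduct))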