How to install and use PyPy for my Python script? - python

Let me start by clarifying that I am relatively new to coding and a noob when it comes to everything that goes beyond Python coding. For a project of mine which involves simulation, I really need to decrease the running time. After some research I got the impression that using the PyPy interpreter would be a possible solution for my problem.
I use Spyder & Anaconda and I have been trying some stuff to implement PyPy, but I have found that my understanding is not sufficient and it has been rather time consuming without success. I have also installed VS Code, in which I did succeed in loading and using the PyPy interpreter. However, I need several packages that I use in my original Python script: pandas, numpy and scipy. If it is even possible to use these packages, I have no clue how to install them for PyPy.
I have read on this website that it is recommended to use conda-forge (right?) for these situations, but my Anaconda is often buggy for some reason. It would be amazing if someone could give a step-by-step guide or some advice on how to tackle this problem, either in VS Code or Spyder & Anaconda.
I have downloaded PyPy 7.3.9.
Thanks in advance.

The recommended way to get binary packages like NumPy and Pandas is to use the conda-forge packages via conda. There is a blog post about it; the short version is:
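# create an environment with PyPy from the conda-forge channel
conda create -c conda-forge -n my-pypy-env pypy python=3.8
# activate it
conda activate my-pypy-env
# install some things, also from conda-forge
conda install -c conda-forge numpy pandas
# run your script (inside this environment, python is PyPy)
python my_script.py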
That said, PyPy will be no faster than CPython when running scripts that make heavy use of NumPy and Pandas data structures, since those libraries are written in C. In fact, the hoops PyPy must jump through to use these data structures from Python mean that PyPy could be significantly slower. We have a plan for that called HPy, but it will take a while to happen.
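To confirm that the environment's python really is PyPy and that the packages ended up in it, a small check like the one below may help. It is only a sketch: the function names are invented for illustration, and the package list is simply the three named in the question.

```python
import importlib.util
import platform


def interpreter_name():
    """Return the implementation running this script, e.g. 'CPython' or 'PyPy'."""
    return platform.python_implementation()


def available_packages(names):
    """Map each top-level package name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Running on:", interpreter_name(), platform.python_version())
    for name, found in available_packages(["numpy", "pandas", "scipy"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it with the same python command you use for my_script.py, after activating the environment.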

Related

Installing Python 3.11

I want to try out Python 3.11 to find out how much faster this version is than what I'm currently using (3.7.3). I am using Anaconda and Spyder, but Anaconda does not yet support Python 3.11 and additionally I regularly have problems with updating in Anaconda.
Importantly, I want to keep my Anaconda and Spyder environments as they are and use Python 3.11 independently of them. Therefore, I was wondering whether simply downloading Python 3.11 from their website will mess up my environment, as there will then be two versions of Python installed on my PC. I would also like to know if I have to use a different IDE for this (or even no IDE).
Even though my question might be a bit vague, thanks in advance.
Try creating a new environment with Python 3.10 in Anaconda if Anaconda still doesn't have 3.11. The difference with 3.11 would be roughly +15% (I can't guarantee that, it's just a rumor), depending on the workload.
You can also build and install your own version from source:
build-python-from-source
This way you won't break anything, and you can delete Python 3.11 after your experiments.
You can also search for benchmark results comparing <your.version> with <any.other.version> to get a general idea of the average performance difference.
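Published benchmarks only give a rough idea, so it can be worth timing your own code under each interpreter. The sketch below uses the standard timeit module; `workload` is a made-up stand-in for the hot part of your script, not a standard benchmark, and `best_time` is a helper invented here.

```python
import sys
import timeit


def workload(n=200_000):
    """A pure-Python loop; replace this with the code you actually care about."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def best_time(func, repeat=5, number=3):
    """Return the best average time per call, in seconds, over several repeats."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    print(sys.version)
    print(f"best time: {best_time(workload):.4f} s per call")
```

Running the same file with each interpreter (for example your current 3.7 and a separately installed 3.11) prints the version next to the timing, which makes the two results easy to compare.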

Is it possible to use custom build cpython in Anaconda environment?

I built CPython locally (Windows) with a fix for a multiprocessing problem that I have, but I also need the data science stack of packages, such as numpy, pandas, scipy, matplotlib, statsmodels and a few others. When I try to install them, the process is quite cumbersome for many packages, and for scipy I wasn't able to resolve it after 3 days of trying.
I was thinking that it would be amazing if I could just replace python in my Anaconda env and use conda to install the packages I need. Is it possible to easily replace python with the binaries I have, or do I need to wait until my fix is released in a new Python version?
I was able to resolve my issue by replacing only the Python DLL in the conda environment with my own build, and the environment just worked with it.

Confusion between Python and Anaconda

Recently I have started programming in Python (Python 3.5) on my Linux OS. But I am confused about Anaconda. What is it actually? Is it a version of Python or something else? If I do not install Anaconda will there be any limitations?
Anaconda is a free and open-source Python distribution and a collection of hundreds of packages related to data science, scientific programming, development and more. Python is included in the Anaconda distribution. It is not an IDE (like PyCharm, which was mentioned in the comments), though it can be configured with most IDEs. I will note that the distribution includes an IDE called Spyder. It also comes with a platform-agnostic package manager called conda.
You can read more here: https://docs.continuum.io/anaconda/
Anaconda is a popular Python data science platform.
Anaconda is a commercial open source distribution of the Python and R programming languages for large-scale data processing, predictive analytics, and scientific computing, that aims to simplify package management and deployment.
You can install Anaconda on any operating system, e.g. Linux or Windows. It includes Navigator, which is useful for launching the available modules.
While installing, Anaconda asks which Python version to use.
Find out more about Anaconda at: Official Website, Anaconda Docs
The Anaconda distribution has been on my computer for the last 2 years, on and off, so I feel that I have some experience using it.
Anaconda tries to be a Swiss army knife, but the fact remains that everything available with Anaconda can be installed manually using pip.
If you're a beginner and don't intend to do anything comprehensive in the data science/ML field, I don't see any reason for you to install Anaconda. If you still want to have conda on your machine, go for it, but if you have Python pre-installed, remove it first and then use conda. (Otherwise you'll have to be specific and observant about where new Python packages are being installed on your computer.)
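One way to stay observant about which Python you are running, and where its packages go by default, is to ask the interpreter itself. The sketch below uses only the standard library; `describe_interpreter` is a name invented here for illustration.

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect the paths that identify the running Python installation."""
    return {
        # the binary that is executing this script
        "executable": sys.executable,
        # root directory of this installation or environment
        "prefix": sys.prefix,
        # default install location for pure-Python packages
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the paths point into an Anaconda folder, you are running Anaconda's Python; otherwise you are running a separately installed one.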
The conda distribution easily occupies 2-4 GB of disk space. (There is a lighter installer known as Miniconda, but it too goes on to consume considerable space.)
When you use the conda command to install a Python package, it usually pulls in additional packages (maybe unnecessary for a beginner) along with it, consuming more and more space on your device. So if your machine is slow and you are short on space, Anaconda is a big no-no for you.
Anaconda (IMHO) is a finely tuned hype in the internet space of beginner Python users.
And even if you have sufficient disk space and a capable device, I don't see why you should spend it on things that you may never use, unless you get a significant benefit from doing so, which could be more pronounced for those in a professional environment.
There are ways to bulk install everything you need using pip, and pip installs only what you request from the terminal (plus its required dependencies), nothing more unless you ask for it.
Also, keep in mind that if you want to do data science, ML or deep learning, go for the 64-bit version of Python so that every module you need can be installed without encountering errors.
Anaconda is simply a Python and R distribution. If you are working in the machine learning or data science field, you will find Anaconda very useful. Installing Anaconda also installs Python, conda (the package manager in Anaconda), a lot of third-party Python packages, an IDE (Spyder), and Jupyter Notebook (which is very helpful for writing code, visualising results and running code cell by cell). However, if you are just a beginner, installing only Python is enough. Python comes with its standard library, and when you need new packages, you can use pip to install them.
P.S. If you have little disk space and you are just beginning, Anaconda is a no-no, as it installs many packages by default that you might not use. Installing plain Python requires less space, and when you need a third-party library, you can use pip to install it.

Remove all previous versions of python

I have some experience with C++ and Fortran, and I want to start using python for my post-processing as I am starting to realise how inefficient MATLAB is for what I need to do (mostly involves plots with millions of points).
I already had a few versions of Python installed, one from every time I wanted to start using it. It has now become a mess. In /usr/local/bin/, here is what the command ls python* returns:
python python2.7 python3 python3.5 python3.5m pythonw-32
python-32 python2.7-32 python3-32 python3.5-32 python3.5m-config pythonw2.7
python-config python2.7-config python3-config python3.5-config pythonw pythonw2.7-32
I now want a clean slate. I want a safe way to remove all the previous versions of python, including all of their packages, so I can just install the latest version and import all the libraries I want like numpy and matplotlib smoothly (I had some issues with that).
EDIT:
I am running on OSX Yosemite 10.10.
Do not uninstall your system's Python interpreter (Python 2.7 most probably). You might consider uninstalling the other version (Python 3.5 most probably), but I do not think you really need to do that (it may not be a bad idea to keep a system-wide Python 3 interpreter... who knows!).
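Before removing anything, it may help to see which file each command name actually runs, since several names in a listing like the one above are often symlinks to the same interpreter. This is a minimal sketch; `resolve_commands` is a hypothetical helper, and the names passed to it are taken from the question's listing.

```python
import os
import shutil


def resolve_commands(names):
    """Map each command name to the real file it runs, or None if not on PATH."""
    resolved = {}
    for name in names:
        path = shutil.which(name)
        # realpath follows symlinks, so aliases of one interpreter show the same target
        resolved[name] = os.path.realpath(path) if path else None
    return resolved


if __name__ == "__main__":
    names = ["python", "python2.7", "python3", "python3.5"]
    for name, target in resolve_commands(names).items():
        print(f"{name}: {target or 'not found on PATH'}")
```

Names that resolve outside /usr/local/bin (for example into /usr/bin or /System) most likely belong to the system interpreter that should be left alone.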
If you want a clean slate, I would recommend using virtual environments from now on. You have two options:
Use virtualenv and pip to set up your virtual environments and packages. However, using pip can mean that you have to compile the packages that need compilation yourself when no prebuilt wheel is available (numpy, matplotlib and many other scientific Python packages that you may use for your "post-processing").
Use Conda (or Miniconda). This way you will be able to handle virtual environments without having to compile Python packages yourself. Conda also lets you handle different Python interpreters without needing to have them installed on your system (it will download them for you).
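To give a feel for the first option, the sketch below creates a throwaway virtual environment from Python. Note the substitution: it uses the standard library's venv module (Python 3.3+) rather than the virtualenv tool named above, because venv needs no extra install. `create_environment` and the demo-env directory name are invented for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(target_dir):
    """Create a bare virtual environment and return the path to its config file."""
    # with_pip=False keeps this sketch quick; use with_pip=True for real work.
    venv.EnvBuilder(with_pip=False).create(target_dir)
    return Path(target_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = create_environment(Path(tmp) / "demo-env")
        # pyvenv.cfg records which base interpreter the environment points at
        print(config.read_text())
```

Each environment is just a directory, so getting back to a clean slate later means deleting that directory rather than uninstalling Python.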
Also, you say you are feeling MATLAB is inefficient for plotting millions of points. I do not know your actual needs/constraints, but I find Matplotlib to be very inefficient for plotting large data and/or real-time data.
Just as a suggestion, consider using PyQtGraph. If you still feel that is not fast enough, consider using VisPy (probably less functional/convenient at the moment, but more efficient).

Anaconda and VirtualEnv

I have a virtualenv running python 2.7.7. It has a pretty extensive set of libraries which support a pretty complicated set of proprietary modules. In other words, the virtualenv needs to maintain its integrity. That is of course the whole point of virtualenv.
Recently, I encountered a number of problems that are very easily solved by using Anaconda. I tried it out in a test environment and it worked quite well. Now I'm tasked with incorporating this new configuration into production. It isn't clear to me how to incorporate Anaconda into a virtualenv, or whether this is even a good idea. In fact, it almost seems to me like I should use the Anaconda install as the new source and deconstruct the old virtualenv, merging the libraries it held into conda.
Does anyone have a recommendation as to the best approach? If merging the environments is called for, can anyone point to an explanation of how to go about it?
It doesn't really make sense to merge Anaconda and a virtualenv, as Anaconda is a completely independent installation of Python. You can do it, typically by setting your PYTHONPATH, but things have a good chance of breaking when you do this sort of thing, and I would recommend against it.
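To see whether an interpreter is already picking up packages from another installation (which is what setting PYTHONPATH does), you can look for sys.path entries that live outside the interpreter's own directories. This is a rough sketch with invented helper names; inside a virtual environment the standard library lives under sys.base_prefix, so both roots are checked.

```python
import os
import sys


def is_under(path, root):
    """Return True if `path` is located inside the directory `root`."""
    real_root = os.path.realpath(root)
    try:
        return os.path.commonpath([os.path.realpath(path), real_root]) == real_root
    except ValueError:
        # raised on Windows when the two paths are on different drives
        return False


def foreign_entries(paths, roots):
    """Return the entries of `paths` that are not inside any of `roots`."""
    return [p for p in paths if p and not any(is_under(p, r) for r in roots)]


if __name__ == "__main__":
    roots = [sys.prefix, sys.base_prefix]
    for entry in foreign_entries(sys.path, roots):
        print("outside this installation:", entry)
```

The script's own directory and user-level site-packages can legitimately show up here; entries pointing into a different Python installation are the ones likely to cause the breakage described above.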
If there are libraries in your virtualenv, you can use them with Anaconda by making conda packages for them. They may already have conda packages (search with conda search and search https://binstar.org/). Otherwise, you can build a package using a conda recipe. See http://conda.pydata.org/docs/build.html and https://github.com/conda/conda-recipes for some example recipes.
