I am trying to run a script which involves numpy through the terminal on WinSCP, but whenever I do, I get the following error:
import gensim
File "/data/work/worker/gensim/__init__.py", line 5, in <module>
from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils # noqa:F401
File "/data/work/worker/gensim/parsing/__init__.py", line 4, in <module>
from .preprocessing import (remove_stopwords, strip_punctuation, strip_punctuation2, # noqa:F401
File "/data/work/worker/gensim/parsing/preprocessing.py", line 42, in <module>
from gensim import utils
File "/data/work/worker/gensim/utils.py", line 38, in <module>
import numpy as np
File "/data/work/worker/numpy/__init__.py", line 142, in <module>
from . import core
File "/data/work/worker/numpy/core/__init__.py", line 50, in <module>
raise ImportError(msg)
ImportError:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
The Python version is: Python2.7 from "/usr/bin/python"
The NumPy version is: "1.18.5"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: No module named _multiarray_umath
I suspect that the issue is that the version of Python being quoted here is outdated; however, after some searching, I cannot find any documentation on how to update Python on WinSCP. I have Python 3.8 installed on my machine, and I have tried moving the installer and the .exe file into the WinSCP directory, to no avail. Is there any way to update Python directly in the terminal? Alternatively, does this issue have nothing to do with a stale version of Python at all?
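One way to test the suspicion about versions is to ask the interpreter which executable is running and where it would load numpy from. The following is a minimal sketch and not part of the original question. The helper name `describe_import` is hypothetical, and it assumes Python 3.4 or newer for `importlib.util.find_spec` (under the Python 2.7 shown in the traceback, printing `sys.executable` and `sys.version` still works).

```python
import importlib.util
import sys


def describe_import(module_name):
    """Return the file a top-level module would be loaded from, without importing it."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing by that name is visible on sys.path.
        return None
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    print("numpy would load from:", describe_import("numpy"))
```

If the reported location is a folder sitting next to the script (such as the /data/work/worker/numpy path in the traceback) and not the interpreter's own site-packages, that copy may have been built for a different Python version than the one running it.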
I'm attempting to figure out the mechanics of running Python scripts in Power BI for Reasons and I've hit a snag. I was running through the steps in this somewhat basic tutorial and I came to the section in which I am supposed to paste the script into the Python Script screen, which is Step 3 of the 'Run the Script and Import Data' section.
When I followed the steps, which are essentially to paste the example script into the window and hit Okay, I got this 'helpful' error:
`
Details: "ADO.NET: Python script error.
<pi>Traceback (most recent call last):
File "C:\Users\my.username\PythonScriptWrapper_aca634c7-30c3-4bf4-881b-d1e47bb0a919\PythonScriptWrapper.PY", line 2, in <module>
import os, pandas, matplotlib
File "C:\Users\my.username\Anaconda3\lib\site-packages\matplotlib\__init__.py", line 109, in <module>
from . import _api, _version, cbook, docstring, rcsetup
File "C:\Users\my.username\Anaconda3\lib\site-packages\matplotlib\rcsetup.py", line 27, in <module>
from matplotlib.colors import Colormap, is_color_like
File "C:\Users\my.username\Anaconda3\lib\site-packages\matplotlib\colors.py", line 51, in <module>
from PIL import Image
File "C:\Users\my.username\Anaconda3\lib\site-packages\PIL\Image.py", line 89, in <module>
from . import _imaging as core
ImportError: DLL load failed while importing _imaging: The specified module could not be found.
</pi>"
`
I have verified via pip list that the matplotlib and pandas packages are installed. os doesn't show up, which surprised me, but I know it's part of the standard library, so I'm not stressing about it unless someone thinks I should. Is anyone an expert in this? Is there a better way? Am I doomed to scream into the Power BI void for all eternity?
The somewhat basic tutorial is based on using standard python from python.org. However, you are using the Anaconda distribution, which basically requires the environment to be activated before any modules - especially pandas' C-library - can be accessed.
You can achieve that in the cmd shell by running
conda activate
C:\Users\my.username\AppData\Local\Microsoft\WindowsApps\PBIDesktopStore.exe
which assumes that you are using the Power BI Desktop version from the Microsoft store.
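To see whether a process was started from an activated environment, one option is to compare its PATH against the directories that activation normally adds. This is only a sketch: the function name `missing_conda_dirs` is hypothetical, and the list of expected directories is an assumption about what `conda activate` prepends on Windows, not something taken from the answer above.

```python
import ntpath
import os


def missing_conda_dirs(path_value, prefix):
    """List the directories under `prefix` that are absent from a Windows PATH string."""
    # Assumed set of directories that activating a conda environment adds on Windows.
    expected = [
        prefix,
        ntpath.join(prefix, "Library", "mingw-w64", "bin"),
        ntpath.join(prefix, "Library", "usr", "bin"),
        ntpath.join(prefix, "Library", "bin"),
        ntpath.join(prefix, "Scripts"),
    ]
    # Normalise case and separators so the comparison ignores cosmetic differences.
    present = {
        ntpath.normcase(ntpath.normpath(entry))
        for entry in path_value.split(";")
        if entry
    }
    return [d for d in expected if ntpath.normcase(ntpath.normpath(d)) not in present]


if __name__ == "__main__":
    prefix = r"C:\Users\my.username\Anaconda3"  # install location seen in the traceback
    print(missing_conda_dirs(os.environ.get("PATH", ""), prefix))
```

An empty list suggests the environment is active. If `Library\bin` is reported missing, that is one plausible reason for a "DLL load failed" error, since compiled packages may look for their DLLs there.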
I use Pydev and try to load numpy in my script. I installed Anaconda to be able to use it, following the Anaconda user guide concerning how to set it up as interpreter in Pydev.
I have tried installing Anaconda both with and without adding it to PATH (the recommended version is without). In both cases numpy shows up as an installed package in Pydev.
However, when I run my script I receive the following error:
File "Z:\Path\To\My\file.py", line 2, in <module>
import numpy as np
File "C:\Users\(username)\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\__init__.py", line 140, in <module>
from . import _distributor_init
File "C:\Users\(username)\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\_distributor_init.py", line 34, in <module>
from . import _mklinit
ImportError: DLL load failed: The specified module could not be found.
I also noticed that in the manual from anaconda, in the shown picture a lot more library folders are added than what I am presented with. I couldn't find a list of those, though. Running numpy from the Anaconda prompt works fine.
Any ideas would be appreciated!
I've managed to construct a simple app utilizing the Pico framework (https://github.com/fergalwalsh/pico). My frontend is connecting to my backend without any difficulties. Below is my Python file, which at the moment simply returns/renders a string, using a client-side input value, "name".
from __future__ import absolute_import
import sys
import pico
import numpy as np
# import sklearn
# import pandas as pd
from api2 import aloha
from pico import PicoApp
@pico.expose()
def hello(name):
    a = np.arange(15).reshape(3, 5)
    # a = np.arrange('data', 'field').reshape(3,5)
    return "hello %s, %s" % (name, a)
app = PicoApp()
app.register_module(__name__)
(It also returns a NumPy array, simply because I'm testing what I can import into the file.)
All my packages are installed just fine, via Anaconda in /site-packages, which is in the python3.6 directory.
Oddly, the app runs fine; it can import NumPy. It breaks, however, when I try to import Pandas or SKLearn. I've tried manually copying and pasting NumPy into /Library/Python/2.7/site-packages, which actually breaks the app. But NumPy works in the app when it is only located in Anaconda's /site-packages.
I've tried altering app.register_module(__name__) to app.register_module('api'), which is the name of the Python file (api.py), based on another Question/Answer here. I've also tried reinstalling Pandas with sudo -H pip install pandas, but all the requirements are already satisfied.
This is the error that is thrown when I try to include Pandas in api.py:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Library/Python/2.7/site-packages/pico/server.py", line 31, in <module>
app = import_string(module_name)
File "/Library/Python/2.7/site-packages/werkzeug/utils.py", line 443, in import_string
sys.exc_info()[2])
File "/Library/Python/2.7/site-packages/werkzeug/utils.py", line 431, in import_string
module = import_string(module_name)
File "/Library/Python/2.7/site-packages/werkzeug/utils.py", line 443, in import_string
sys.exc_info()[2])
File "/Library/Python/2.7/site-packages/werkzeug/utils.py", line 418, in import_string
__import__(import_name)
File "./api.py", line 6, in <module>
import pandas as pd
File "/Library/Python/2.7/site-packages/pandas/__init__.py", line 23, in <module>
from pandas.compat.numpy import *
File "/Library/Python/2.7/site-packages/pandas/compat/numpy/__init__.py", line 24, in <module>
'this pandas version'.format(_np_version))
werkzeug.utils.ImportStringError: import_string() failed for 'api.app'. Possible reasons are:
- missing __init__.py in a package;
- package or module path not included in sys.path;
- duplicated package or module name taking precedence in sys.path;
- missing module, class, function or variable;
Debugged import:
- 'api' not found.
Original exception:
ImportStringError: import_string() failed for 'api'. Possible reasons are:
- missing __init__.py in a package;
- package or module path not included in sys.path;
- duplicated package or module name taking precedence in sys.path;
- missing module, class, function or variable;
Debugged import:
- 'api' not found.
Original exception:
ImportError: this version of pandas is incompatible with numpy < 1.9.0
your numpy version is 1.8.0rc1.
Please upgrade numpy to >= 1.9.0 to use this pandas version
When I run which python, it points to /Users/richardscheiwe/anaconda3/bin/python. Also, I have NumPy v.1.15 installed, and I can't find any other NumPy folder(s). When I try moving a version of NumPy to Library/Python/2.7/site-packages, I get this error:
ImportError:
Importing the multiarray numpy extension module failed. Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control). Otherwise reinstall numpy.
Original error was: cannot import name multiarray
I guess I need to somehow point the app's Python to Anaconda's Python 3.6 version, but I don't know how to do that. Pico is also available in Anaconda's /site-packages directory, but it isn't pointing there.
Any help is greatly appreciated. I've scoured StackOverflow and GitHub.
You don't mention how you are starting the pico app, but I assume you are starting it like this:
python -m pico.server api
In this case it will simply use whatever python is in your path. If it is python3 in /Users/richardscheiwe/anaconda3/bin/python but you are getting errors referring to /Library/Python/2.7/ then there is some problem with your anaconda installation/paths in your environment.
Running pico is no different from running a plain Python script, but I suggest you create a simplified script without pico (literally just import pandas) to work out your environment issues with simpler error messages.
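A simplified script of that kind might look like the sketch below. It is an illustration and not code from pico or from the question: the helper name `probe` is hypothetical, and it uses only the standard library to report which interpreter is running and whether each module imports.

```python
import importlib
import sys


def probe(module_name):
    """Try to import `module_name`; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # report any import-time failure, not only ImportError
        return False, "{}: {}".format(type(exc).__name__, exc)
    version = getattr(module, "__version__", "unknown version")
    location = getattr(module, "__file__", "built-in")
    return True, "{} from {}".format(version, location)


if __name__ == "__main__":
    print("running under:", sys.executable)
    for name in ("numpy", "pandas"):
        print(name, probe(name))
```

Running it with the same `python` command used to start the server shows whether that command resolves to the Anaconda interpreter or to the system Python 2.7 that appears in the traceback.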
If I'm reading this correctly, the error seems to come from trying to use a version of NumPy built for Python 2.7 while your app is meant to run under Python 3.
Try removing NumPy with "sudo pip uninstall numpy", then reinstall it with "sudo -H pip install numpy" and see whether it correctly finds the Python 3 version of NumPy.
I am running Spark programs on a large cluster (for which, I do not have administrative privileges). numpy is not installed on the worker nodes. Hence, I bundled numpy with my program, but I get the following error:
Traceback (most recent call last):
File "/home/user/spark-script.py", line 12, in <module>
import numpy
File "/usr/local/lib/python2.7/dist-packages/numpy/__init__.py", line 170, in <module>
File "/usr/local/lib/python2.7/dist-packages/numpy/add_newdocs.py", line 13, in <module>
File "/usr/local/lib/python2.7/dist-packages/numpy/lib/__init__.py", line 8, in <module>
File "/usr/local/lib/python2.7/dist-packages/numpy/lib/type_check.py", line 11, in <module>
File "/usr/local/lib/python2.7/dist-packages/numpy/core/__init__.py", line 6, in <module>
ImportError: cannot import name multiarray
The script is actually quite simple:
from pyspark import SparkConf, SparkContext
sc = SparkContext()
sc.addPyFile('numpy.zip')
import numpy
a = sc.parallelize(numpy.array([12, 23, 34, 45, 56, 67, 78, 89, 90]))
print a.collect()
I understand that the error occurs because numpy dynamically loads its multiarray.so dependency, and even though my numpy.zip file includes multiarray.so, somehow the dynamic loading doesn't work with Apache Spark. Why is that? And how do you otherwise create a standalone numpy module with static linking?
Thanks.
There are at least two problems with your approach, and both can be reduced to the simple fact that NumPy is a heavyweight dependency.
First of all, Debian packages come with multiple dependencies, including libgfortran, libblas, liblapack and libquadmath. So you cannot simply copy a NumPy installation and expect that things will work (to be honest, you shouldn't do anything like this even if that weren't the case). Theoretically you could try to build it using static linking and ship it with all the dependencies that way, but then you hit the second issue.
NumPy is pretty large by itself. While 20MB doesn't look particularly impressive, and with all the dependencies it shouldn't be more than 40MB, it has to be shipped to the workers each time you start your job. The more workers you have, the worse it gets. If you decide you need SciPy or SciKit, it can get much worse.
Arguably this makes NumPy a really bad candidate for being shipped with the pyFile method.
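As an aside on why the zipped copy cannot work at all: Python's zip import mechanism loads pure-Python modules from an archive, but it does not load compiled extension modules (.so or .pyd files) from one. The sketch below shows the part that does work, assuming that a zip shipped this way simply ends up on sys.path. The helper name `import_from_zip` and the module name `zipped_demo` are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a fresh zip archive, put it on sys.path, and import it."""
    archive = os.path.join(tempfile.mkdtemp(), module_name + ".zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)
    # A zip archive on sys.path is searched much like a directory of .py files.
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# A pure-Python module imports fine from inside the archive.
demo = import_from_zip("zipped_demo", "ANSWER = 6 * 7\n")
print(demo.ANSWER, demo.__file__)
```

A package such as NumPy, whose core is a compiled extension, has nothing equivalent to fall back on, so importing it from a zip fails even when the .so file is present in the archive.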
If you didn't have direct access to the workers, but all the dependencies, including header files and a static library, were present, you could simply try to install NumPy in user space from the task itself (this assumes that pip is installed as well) with something like this:
try:
    import numpy as np
except ImportError:
    # pip.main() was removed in pip 10, so call pip through the running interpreter.
    import subprocess
    import sys
    subprocess.check_call([sys.executable, "-m", "pip", "install", "--user", "numpy"])
    import numpy as np
You'll find other variants of this method in How to install and import Python modules at runtime?
Since you have access to the workers, a much better solution is to create a separate Python environment. Probably the simplest approach is to use Anaconda, which can be used to package non-Python dependencies as well and doesn't depend on the system-wide libraries. You can easily automate this task using tools like Ansible or Fabric. It doesn't require administrative privileges, and all you really need is bash and some way to fetch basic installers (wget, curl, rsync, scp).
See also: shipping python modules in pyspark to other nodes?
I am trying to learn sklearn and I encounter the below error when I run import sklearn . However, when I run the exact same code using python 2.7, I do not encounter any errors.
import sklearn
File "/usr/local/lib/python3.2/dist-packages/sklearn/__init__.py", line 38, in <module>
from .base import clone
File "/usr/local/lib/python3.2/dist-packages/sklearn/base.py", line 10, in <module>
from scipy import sparse
File "/usr/lib/python3/dist-packages/scipy/__init__.py", line 124, in <module>
pkgload(verbose=SCIPY_IMPORT_VERBOSE,postpone=True)
File "/usr/local/lib/python3.2/dist-packages/numpy/_import_tools.py", line 177, in __call__
for package_name in self._get_sorted_names():
File "/usr/local/lib/python3.2/dist-packages/numpy/_import_tools.py", line 114, in _get_sorted_names
for name in depend_dict.keys():
RuntimeError: dictionary changed size during iteration
I did some googling, followed the instructions from the following link, and ran sudo pip3 install git+https://github.com/scikit-learn/scikit-learn.git. The installation went fine; however, I continue to get the error.
https://askubuntu.com/questions/449326/installation-error-in-sklearn-for-python3
How does one go about fixing this issue (other than working with Python 2.7)?
It's a bug that will be fixed in the next NumPy (v 1.9.0) release:
https://github.com/numpy/numpy/commit/5025c40965fa5fb2b591f07c152b966dc7b730f0
There is already a patch available on github, but it hasn't been bundled into a patch release yet. Your options:
Wait for the 1.9.0 release to fix this for Python 3, and use Python 2 in the meantime.
Apply the same changes to the two lines shown in the link I provided to your current version of NumPy.
Install the NumPy 1.9.0 beta.
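For context on the error message itself: in Python 3, `dict.keys()` returns a live view, so changing the dictionary's size while looping over it raises the RuntimeError seen in the traceback, whereas Python 2 returned a list copy. The sketch below reproduces that behaviour with a made-up function, `drain`. It assumes the usual remedy of iterating over a snapshot made with `list(...)` and is not a copy of the linked patch.

```python
def drain(depend_dict, snapshot):
    """Remove every key while looping over the dict; return the names visited."""
    # With snapshot=True we loop over a copy of the keys, so deletions are safe.
    keys = list(depend_dict.keys()) if snapshot else depend_dict.keys()
    visited = []
    for name in keys:
        visited.append(name)
        del depend_dict[name]  # mutating the dict inside the loop
    return visited


try:
    drain({"a": 1, "b": 2}, snapshot=False)
except RuntimeError as exc:
    print("live view failed:", exc)

print("snapshot worked:", drain({"a": 1, "b": 2}, snapshot=True))
```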