I'd seen previous errors importing from JAX from several years ago (https://github.com/google/jax/issues/372), but the post implied an update would fix it. I just installed JAX and am trying to get set up in a Jupyter notebook. Could you let me know what might be going wrong?
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Input In [1], in <cell line: 4>()
1 ########## JAX ON MNIST #####################
2 # Import some additional JAX and dataloader helpers
3 from jax.scipy.special import logsumexp
----> 4 from jax.experimental import optimizers
6 import torch
7 from torchvision import datasets, transforms
ImportError: cannot import name 'optimizers' from 'jax.experimental' (/Users/XXX/opt/anaconda3/lib/python3.9/site-packages/jax/experimental/__init__.py)
I saw that a similar error was reported in 2019, and that thread implied a version difference would fix it, but I did not know where to go from there.
According to the CHANGELOG
jax 0.3.16
Deprecations:
Removed jax.experimental.optimizers; it has long been a deprecated alias of jax.example_libraries.optimizers.
So it sounds like if you're using JAX version 0.3.16 or newer, you should do
from jax.example_libraries import optimizers
But as noted in the jax.example_libraries.optimizers documentation, this is not well-supported code and you'll probably have a better experience with something like Optax or JAXopt.
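For example, here is a minimal sketch of the Optax route (Optax is a separate package, installed with pip install optax; the parameters and loss below are placeholders, not from the original question):

import jax
import jax.numpy as jnp
import optax

params = {"w": jnp.zeros(3)}  # placeholder parameters
optimizer = optax.adam(learning_rate=1e-3)
opt_state = optimizer.init(params)

def loss_fn(p):
    return jnp.sum(p["w"] ** 2)  # placeholder loss

grads = jax.grad(loss_fn)(params)
updates, opt_state = optimizer.update(grads, opt_state)  # analogous to opt_update
params = optax.apply_updates(params, updates)            # analogous to get_params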
Related
Good morning,
I want to test HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) using a GPU, so I should use the RAPIDS framework.
When I tried to follow the steps described here https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&sandboxMode=true&scrollTo=EwaJSKuswsNi
taken from the RAPIDS website: https://rapids.ai/start.html
I get the following error when I run the cuDF code:
import cudf
import io, requests
# download CSV file from GitHub
url="https://github.com/plotly/datasets/raw/master/tips.csv"
content = requests.get(url).content.decode('utf-8')
# read CSV from memory
tips_df = cudf.read_csv(io.StringIO(content))
tips_df['tip_percentage'] = tips_df['tip']/tips_df['total_bill']*100
# display average tip by dining party size
print(tips_df.groupby('size').tip_percentage.mean())
ValueError Traceback (most recent call last)
<ipython-input-1-a95ca25217db> in <module>()
----> 1 import cudf
2 import io, requests
3
4 # download CSV file from GitHub
5 url="https://github.com/plotly/datasets/raw/master/tips.csv"
/usr/local/lib/python3.7/site-packages/cudf/_lib/__init__.py in <module>()
2 import numpy as np
3
----> 4 from . import (
5 avro,
6 binaryop,
cudf/_lib/avro.pyx in init cudf._lib.avro()
cudf/_lib/column.pyx in init cudf._lib.column()
cudf/_lib/scalar.pyx in init cudf._lib.scalar()
cudf/_lib/interop.pyx in init cudf._lib.interop()
ValueError: pyarrow.lib.Codec size changed, may indicate binary incompatibility. Expected 48 from C header, got 40 from PyObject
Could you please help me?
Thanks in advance.
Colab made some enhancements this week that affected the RAPIDS installation process. Work toward a resolution is active, and progress is being tracked in this issue (which includes a potential workaround).
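If you want to confirm the diagnosis yourself: this class of pyarrow error generally means the installed pyarrow was built against a different ABI than the one cudf was compiled for. A minimal diagnostic sketch (not the official fix):

# Check which pyarrow the environment actually imports; a version mismatch
# with the one cudf was built against produces exactly this
# "Codec size changed" binary-incompatibility error.
import pyarrow
print(pyarrow.__version__)
print(pyarrow.__file__)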
Please help, I get the error below when running a Jupyter notebook.
import numpy as np
import pandas as pd
from helper import boston_dataframe
np.set_printoptions(precision=3, suppress=True)
Error:
ImportError Traceback (most recent call last)
<ipython-input-3-a6117bd64450> in <module>
1 import numpy as np
2 import pandas as pd
----> 3 from helper import boston_dataframe
4
5
ImportError: cannot import name 'boston_dataframe' from 'helper' (/Users/irina/opt/anaconda3/lib/python3.8/site-packages/helper/__init__.py)
Since you did not say where you got the notebook, I have to guess that it is from the course Supervised Learning: Regression provided by IBM.
The zip folder in week 1 provides helper.py.
What you need to do is change the working directory to wherever this file is; see Change IPython/Jupyter notebook working directory.
Alternatively, you can load the Boston data from sklearn and then load it into a pandas DataFrame, as sketched below.
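A minimal sketch of both options (the path below is hypothetical; adjust it to wherever you extracted the course zip):

import os

# Option 1: point the notebook at the folder that contains the course's helper.py
os.chdir("/path/to/course/week1")  # hypothetical path
from helper import boston_dataframe

# Option 2: build the DataFrame from sklearn instead. Note that load_boston
# was deprecated in scikit-learn 1.0 and removed in 1.2, so this assumes an
# older scikit-learn.
import pandas as pd
from sklearn.datasets import load_boston

boston = load_boston()
boston_df = pd.DataFrame(boston.data, columns=boston.feature_names)
boston_df["MEDV"] = boston.target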
Advice for you:
Learn how to use Jupyter notebooks.
Learn how Python imports work.
Learn how to provide information in a question so that no one needs to guess.
This is my first attempt to use xgboost in PySpark, so my experience with Java and PySpark is still in the learning phase.
I saw an awesome article on Towards Data Science titled PySpark ML and XGBoost full integration tested on the Kaggle Titanic dataset, where the author goes through a use case of xgboost in PySpark.
I tried to follow the steps but was hit with an ImportError.
Installation
I have downloaded two jar files from Maven and put them in the same directory as my notebook:
xgboost4j version 0.72
xgboost4j-spark version 0.72
I have also downloaded the xgboost wrapper file sparkxgb.zip to the path ~/Softwares/sparkxgb.zip.
The first cell of my Jupyter notebook:
import xgboost
print(xgboost.__version__) # 1.2.0
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars xgboost4j-spark-0.72.jar,xgboost4j-0.72.jar pyspark-shell'
HOME = os.path.expanduser('~')
import findspark
findspark.init(HOME + "/Softwares/spark-3.0.0-bin-hadoop2.7")
import pyspark
from pyspark.sql.session import SparkSession
from pyspark.sql.types import *
from pyspark.ml.feature import StringIndexer, VectorAssembler
from pyspark.ml import Pipeline
from pyspark.sql.functions import col
spark = SparkSession\
    .builder\
    .appName("PySpark XGBOOST Titanic")\
    .getOrCreate()
spark.sparkContext.addPyFile(HOME + "/Softwares/sparkxgb.zip")
print(pyspark.__version__) # 3.0.0
# this does not give any error
# Computer: MacOS
This cell gives an error:
from sparkxgb import XGBoostEstimator
Error
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-7-cf2ff39c26f4> in <module>
----> 1 from sparkxgb import XGBoostEstimator
/private/var/folders/tb/7xdk9scs79j9hxzcl3l_s6k00000gn/T/spark-1cf282a4-f3f2-42b3-a064-6bbd8751489e/userFiles-abca5e59-5af3-4b3d-a3bc-edc2973e9995/sparkxgb.zip/sparkxgb/__init__.py in <module>
18
19 from sparkxgb.pipeline import XGBoostPipeline, XGBoostPipelineModel
---> 20 from sparkxgb.xgboost import XGBoostEstimator, XGBoostClassificationModel, XGBoostRegressionModel
21
22 __all__ = ["XGBoostEstimator", "XGBoostClassificationModel", "XGBoostRegressionModel",
/private/var/folders/tb/7xdk9scs79j9hxzcl3l_s6k00000gn/T/spark-1cf282a4-f3f2-42b3-a064-6bbd8751489e/userFiles-abca5e59-5af3-4b3d-a3bc-edc2973e9995/sparkxgb.zip/sparkxgb/xgboost.py in <module>
19 from pyspark.ml.param import Param
20 from pyspark.ml.param.shared import HasFeaturesCol, HasLabelCol, HasPredictionCol, HasWeightCol, HasCheckpointInterval
---> 21 from pyspark.ml.util import JavaMLWritable, JavaPredictionModel
22 from pyspark.ml.wrapper import JavaEstimator, JavaModel
23 from sparkxgb.util import XGBoostReadable
ImportError: cannot import name 'JavaPredictionModel' from 'pyspark.ml.util' (/Users/poudel/Softwares/spark-3.0.0-bin-hadoop2.7/python/pyspark/ml/util.py)
Questions
How can I fix the error and run xgboost in PySpark?
Maybe I have not placed the downloaded jar files in the correct path (I have them in my working directory, where my Jupyter notebook file is). Do I need to place these files somewhere else? I assume Jupyter automatically loads the path . and sees these jar files, but I may be wrong.
If any good samaritan has already run xgboost in PySpark, their help is much appreciated.
This question was asked almost 6 months ago, but no solution has been provided yet.
I was facing the same issue for the past few days and finally found a solution, so I would like to share it.
By now you might have found a solution yourself, but I thought it would be better to share so that you, or anyone in the future, can benefit from it.
You can get rid of this error in two ways:
JavaPredictionModel has been removed from the latest versions of PySpark, so you can downgrade PySpark to, say, version 2.4.0, and the error will be resolved.
But by doing this you have to follow the conventions of the old PySpark version; for example, OneHotEncoder cannot be applied to multiple features at the same time, so you have to encode them one by one (see the sketch after the install command below).
!pip install pyspark==2.4.0
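As a sketch of that API difference (the column names here are hypothetical, not from the question):

from pyspark.ml.feature import OneHotEncoder

# PySpark 2.4: one encoder per column (inputCol/outputCol only)
enc_a = OneHotEncoder(inputCol="a_idx", outputCol="a_vec")
enc_b = OneHotEncoder(inputCol="b_idx", outputCol="b_vec")

# PySpark 3.x: a single encoder can handle several columns at once
# enc = OneHotEncoder(inputCols=["a_idx", "b_idx"], outputCols=["a_vec", "b_vec"])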
The second, and better, solution is to modify the sparkxgb code: you can import JavaPredictionModel from pyspark.ml.wrapper instead, so you don't need to downgrade your PySpark.
from pyspark.ml.wrapper import JavaPredictionModel
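Concretely, that means editing sparkxgb/xgboost.py (inside sparkxgb.zip) at the import that fails in the traceback above. A sketch of the change, assuming PySpark 3.x, where JavaPredictionModel lives in pyspark.ml.wrapper:

# Before (fails on PySpark 3.x):
#   from pyspark.ml.util import JavaMLWritable, JavaPredictionModel
# After:
from pyspark.ml.util import JavaMLWritable
from pyspark.ml.wrapper import JavaPredictionModel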
P.S. Pardon me for not following the answer standards.
There is a problem with your versions. I know the solution to a similar problem for catboost_spark, since I had the same versions problem there (catboost_spark_version).
You need to go to https://catboost.ai/en/docs/installation/spark-installation-pyspark and:
Get the appropriate catboost_spark_version (see the available versions on Maven Central).
Choose the appropriate spark_compat_version (2.3, 2.4, or 3.0) and scala_compat_version (2.11 or 2.12).
Add the catboost-spark Maven artifact with the appropriate spark_compat_version, scala_compat_version, and catboost_spark_version to the spark.jars.packages Spark config parameter, and import the catboost_spark package:
So you go to https://search.maven.org/search?q=catboost-spark and
choose a version (for example, catboost-spark_3.3_2.12).
Then copy the "Artifact ID"; in this case it is "catboost-spark_3.3_2.12:1.1.1".
Then paste it into your config parameter,
and you will get something like this:
from pyspark.sql import SparkSession

sparkSession = (SparkSession.builder
                .master('local[*]')
                .config("spark.jars.packages", "ai.catboost:catboost-spark_3.3_2.12:1.1.1")
                .getOrCreate())

import catboost_spark
and it will work :)
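As a follow-up usage sketch (trainDf is a hypothetical DataFrame with a "features" vector column and a "label" column, following the catboost-spark quickstart):

trainPool = catboost_spark.Pool(trainDf)  # wrap the hypothetical DataFrame
classifier = catboost_spark.CatBoostClassifier()
model = classifier.fit(trainPool)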
I am working on an NLP task, trying to import normalize_corpus from the normalization module, and I am getting the error below.
from normalization import normalize_corpus
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-13-5919bba55473> in <module>()
----> 1 from normalization import normalize_corpus
ImportError: cannot import name 'normalize_corpus'
The normalization module you have installed (probably https://pypi.org/project/normalization/) does not correspond to the code you are trying to run (possibly from "Text Analytics with Python").
Uninstall the normalization module and track down the code from the book. (A place to start: https://github.com/dipanjanS/text-analytics-with-python)
For whoever lands here because they are reading "Text Analytics with Python": the author of the book is referring to a custom script they built, which is hosted in the first edition (chapter 4 or 5):
https://github.com/dipanjanS/text-analytics-with-python/blob/master/Chapter-4/normalization.py
https://github.com/dipanjanS/text-analytics-with-python/blob/b4f6cefc9dd96e2a3e74e01bda391019bd7fb053/Old-First-Edition/Ch05_Text_Summarization/normalization.py
I'm certainly missing something very obvious here, but why does this work:
a = [0.2635,0.654654,0.365,0.4545,1.5465,3.545]
import statsmodels.robust as rb
print rb.scale.mad(a)
0.356309343367
but this doesn't:
import statsmodels as sm
print sm.robust.scale.mad(a)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-1ce0c872b0be> in <module>()
----> 1 print statsmodels.robust.scale.mad(a)
AttributeError: 'module' object has no attribute 'robust'
Long answer: see http://www.statsmodels.org/stable/importpaths.html
Statsmodels intentionally has a mostly empty __init__.py, but it provides a parallel import collection through api.py.
The recommended import for interactive work, import statsmodels.api as sm, imports almost all of statsmodels, numpy, pandas, and patsy, plus large parts of scipy. This is slooow on a cold start.
If we want to use just a specific part of statsmodels, then we don't need to import all these extras. Having an empty __init__.py means that we can import just a single module (which, of course, imports the dependencies of that module).
e.g. from statsmodels.robust.scale import mad, or
import statsmodels.robust.scale as smscale
smscale.mad(...)
(Small caveat: Some of the very low level imports might not remain always backwards compatible if the internal structure changes. However, the general policy is to deprecate functions over one or two releases while maintaining the old access structure.)
You can, you just have to import robust as well:
import statsmodels as sm
import statsmodels.robust
Then:
>>> sm.robust.scale.mad(a)
0.35630934336679576
robust is a subpackage of statsmodels, and importing a package does not in general automatically import subpackages (unless the package is written to do so explicitly).
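As a minimal illustration of that rule, consider a hypothetical package pkg with an empty pkg/__init__.py and a module pkg/sub.py:

import pkg

# pkg.sub  ->  AttributeError, because the parent's __init__.py never imports it

import pkg.sub  # importing the submodule binds it onto the parent package

pkg.sub  # now resolves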