GAE, sqlite3.OperationalError: unable to open database file - python

OK, I have read a lot. Many people have the same issue, but none of the answers helped me.
I'm trying to follow this guide - http://googlecloudplatform.github.io/appengine-php-wordpress-starter-project/ , but each time I run the app, I get the same message:
> 2014-09-22 10:12:10 Running command: "['C:\\Python27\\pythonw.exe', 'C:\\Program Files (x86)\\Google\\google_appengine\\dev_appserver.py', '--skip_sdk_update_check=yes', '--port=8080', '--admin_port=8090', 'C:\\gae\\wp39']"
INFO 2014-09-22 10:12:12,089 devappserver2.py:725] Skipping SDK update check.
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\dev_appserver.py", line 82, in <module>
_run_file(__file__, globals())
File "C:\Program Files (x86)\Google\google_appengine\dev_appserver.py", line 78, in _run_file
execfile(_PATHS.script_file(script_name), globals_)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 970, in <module>
main()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 963, in main
dev_server.start(options)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 768, in start
request_data, storage_path, options, configuration)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 867, in _create_api_server
default_gcs_bucket_name=options.default_gcs_bucket_name)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\devappserver2\api_server.py", line 364, in setup_stubs
auto_id_policy=datastore_auto_id_policy)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\datastore\datastore_sqlite_stub.py", line 604, in __init__
factory=sql_conn)
sqlite3.OperationalError: unable to open database file
2014-09-22 10:12:12 (Process exited with code 1)
Windows 8, Python 2.7, GAE 1.9.11
I checked all permissions and started GAE as administrator.
I tried Compatibility Mode (XP, Me, Win7) - nothing.
I tried setting the TMP variable in app.yaml.
I searched the whole C: drive for "datastore.db" and found nothing.
I tried starting the app from CMD (as Administrator), like this:
C:\gae\wp39>dev_appserver.py C:\gae\wp39
C:\gae\wp39>dev_appserver.py --datastore_path C:\temp\data.db C:\gae\wp39
C:\gae\wp39>dev_appserver.py --clear_datastore=yes --datastore_path C:\temp\data.db C:\gae\wp39
Same result.
When I run the app from the console with the option "--datastore_path C:\temp\data.db", the system creates that file (about 9 KB) but still can't open the database.
The folder "C:\Users\\AppData\Local\Temp\appengine.levalult" exists but is empty. I don't know what else to do.
Thanks. I will be grateful for any advice.

Solution:
Change your username to one without Unicode (non-ASCII) characters, or
change the TMP and TEMP environment variable values to e:\ , or
do the following in the cmd prompt:
1: change the environment variable values
set temp=e:\
set tmp=e:\
2: run GAE
D:\Program Files (x86)\Google\google_appengine\launcher\GoogleAppEngineLauncher.exe
Reason:
In datastore_sqlite_stub.py, in def __init__, before the line self.__connection = sqlite3.connect, add the following code:
# log the datastore path and every environment variable
f = open('e:/tmp/a.log', 'w')
f.write(self.__datastore_file)
f.write('\n')
for name in os.environ.keys():
    f.write('\n')
    v = os.environ[name]
    f.write(name)
    f.write(' ')
    f.write(v)
f.close()
# force a datastore path without Unicode characters
self.__datastore_file = 'e:/tmp/datastore.db'
According to the logged output, the database file is located at:
c:\users\%username%\appdata\local\temp\appengine.xgogo\datastore.db
which is equivalent to:
%TEMP%\appengine.xgogo\datastore.db
where %TEMP% is the environment variable.
When the username contains Unicode characters, opening the database file fails.
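To check whether a machine is likely to be affected, a minimal sketch such as the following tests whether the temporary directory path contains non-ASCII characters. The helper name is_ascii_path is hypothetical, and the sketch assumes Python 3 (the dev_appserver in the question ran on Python 2.7); it only inspects the path and changes nothing.

```python
import tempfile


def is_ascii_path(path):
    """Return True if every character in the path is plain ASCII."""
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# tempfile.gettempdir() consults TMPDIR, TEMP and TMP before platform defaults
temp_dir = tempfile.gettempdir()
print(temp_dir, "ASCII only:", is_ascii_path(temp_dir))
```

If this reports a non-ASCII path, pointing TEMP and TMP at a plain directory, as described above, is the less invasive fix.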

Related

Why can't I create a folder when it doesn't exist?

I'm trying to write a for loop that browses the files in a specific directory and creates a folder if it doesn't exist. Here is the code:
import ftputil

host = ftputil.FTPHost('x.x.x.x', "x", "x")  # connect to the FTP server
mypathexist = './CameraOld'  # it is located at /opt/Camera/CameraOld
mypath = '.'  # this puts you in /opt/Camera (the configured default path)
host.chdir(mypath)
files = host.listdir(host.curdir)
for f in files:  # browse the entries in the folder
    if f == mypathexist:  # an entry is named CameraOld (it's a folder)
        isExist = True
        break
else:
    isExist = False  # no entry has that name
print(isExist)
if isExist == False:  # the folder doesn't exist
    host.mkdir(mypathexist)  # create the folder
else:
    print("ok")
The problem is that isExist is always False, so the script tries to create a folder that already exists, and I don't understand why.
Here's the output:
False #it's the print(isExist)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/ftputil/host.py", line 695, in command
self._session.mkd(path)
File "/usr/lib/python3.10/ftplib.py", line 637, in mkd
resp = self.voidcmd('MKD ' + dirname)
File "/usr/lib/python3.10/ftplib.py", line 286, in voidcmd
return self.voidresp()
File "/usr/lib/python3.10/ftplib.py", line 259, in voidresp
resp = self.getresp()
File "/usr/lib/python3.10/ftplib.py", line 254, in getresp
raise error_perm(resp)
ftplib.error_perm: 550 CameraOld: file exist
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/user/Bureau/try.py", line 16, in <module>
host.mkdir(mypathexist)
File "/usr/local/lib/python3.10/dist-packages/ftputil/host.py", line 697, in mkdir
self._robust_ftp_command(command, path)
File "/usr/local/lib/python3.10/dist-packages/ftputil/host.py", line 656, in _robust_ftp_command
return command(self, tail)
File "/usr/local/lib/python3.10/dist-packages/ftputil/host.py", line 694, in command
with ftputil.error.ftplib_error_to_ftp_os_error:
File "/usr/local/lib/python3.10/dist-packages/ftputil/error.py", line 195, in __exit__
raise PermanentError(
ftputil.error.PermanentError: 550 CameraOld: file exist
Debugging info: ftputil 5.0.4, Python 3.10.4 (linux)
I would bet your mypathexist is not correct. Or, the other way around, your file list doesn't hold the strings in the form you assume it does.
Check your condition by hand: print out f in your loop. Is it what you would expect it to be?
In the end, Python is simply comparing strings.
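The mismatch this answer hints at can be reproduced without an FTP server. The sketch below uses os.listdir on a temporary local directory, on the assumption that host.listdir returns bare entry names in the same way (ftputil is designed to mirror the os module). The helper folder_exists is hypothetical and only handles a single-level name.

```python
import os
import tempfile


def folder_exists(entries, wanted):
    """Compare bare entry names, ignoring a leading './' on the wanted name."""
    return os.path.basename(os.path.normpath(wanted)) in entries


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "CameraOld"))
    entries = os.listdir(root)  # bare names, no './' prefix
    naive = "./CameraOld" in entries  # the comparison from the question
    fixed = folder_exists(entries, "./CameraOld")

print(entries, naive, fixed)
```

With ftputil itself, host.path.isdir(mypathexist) should make the manual loop unnecessary, since it checks the path on the server directly.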

No such file or directory: 'GoogleNews-vectors-negative300.bin'

I have this code :
import gensim
filename = 'GoogleNews-vectors-negative300.bin'
model = gensim.models.KeyedVectors.load_word2vec_format(filename, binary=True)
and this is my folder layout:
[Image: my folder tree, showing that the .bin file is in the same directory as the file that loads it, ai_functions.py]
But I'm not sure why I get an error saying that it can't find the file. I checked, and I am sure the file is not corrupted. Any thoughts?
Full traceback :
File "/Users/Ile-Maurice/Desktop/Flask/flaskapp/run.py", line 1, in <module>
from serv import app
File "/Users/Ile-Maurice/Desktop/Flask/flaskapp/serv/__init__.py", line 13, in <module>
from serv import routes
File "/Users/Ile-Maurice/Desktop/Flask/flaskapp/serv/routes.py", line 7, in <module>
from serv.ai_functions import checkplagiarism
File "/Users/Ile-Maurice/Desktop/Flask/flaskapp/serv/ai_functions.py", line 31, in <module>
model = gensim.models.KeyedVectors.load_word2vec_format(filename, binary=True)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/gensim/models/keyedvectors.py", line 1629, in load_word2vec_format
return _load_word2vec_format(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/gensim/models/keyedvectors.py", line 1955, in _load_word2vec_format
with utils.open(fname, 'rb') as fin:
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/smart_open/smart_open_lib.py", line 188, in open
fobj = _shortcut_open(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/smart_open/smart_open_lib.py", line 361, in _shortcut_open
return _builtin_open(local_path, mode, buffering=buffering, **open_kwargs)
FileNotFoundError: [Errno 2] No such file or directory: 'GoogleNews-vectors-negative300.bin'
The 'current working directory' that the Python process will consider active, and thus will use as the expected location for your plain relative filename GoogleNews-vectors-negative300.bin, will depend on how you launched Flask.
You could print out the directory to be sure – see some ways at How do you properly determine the current script directory? – but I suspect it may just be the /Users/Ile-Maurice/Desktop/Flask/flaskapp/ directory.
If so, you could relatively-reference your file with the path relative to the above directory...
serv/GoogleNews-vectors-negative300.bin
...or you could use a full 'absolute' path...
/Users/Ile-Maurice/Desktop/Flask/flaskapp/serv/GoogleNews-vectors-negative300.bin
...or you could move the file up to its parent directory, so that it is alongside your Flask run.py.
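A further variant of the absolute-path option is to build the path from the location of the module itself, so it no longer depends on how Flask was launched. This is a minimal sketch: path_next_to is a hypothetical helper, and the commented lines show how it might be used inside serv/ai_functions.py.

```python
from pathlib import Path


def path_next_to(module_file, filename):
    """Return an absolute path to `filename` in the same directory as `module_file`."""
    return Path(module_file).resolve().parent / filename


# Inside serv/ai_functions.py this would be called with __file__:
# filename = path_next_to(__file__, 'GoogleNews-vectors-negative300.bin')
# model = gensim.models.KeyedVectors.load_word2vec_format(str(filename), binary=True)
example = path_next_to(
    "/Users/Ile-Maurice/Desktop/Flask/flaskapp/serv/ai_functions.py",
    "GoogleNews-vectors-negative300.bin",
)
print(example)
```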

PermissionError: [WinError 5] Access is denied: 'foldername/ Error while S3 subfolder file download window system

I am trying to download the files of an S3 folder to my Windows system, and I get a PermissionError when running my Python script.
Any help will be highly appreciated.
# creating folder but no data.
import boto3
import os
from pathlib import Path

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucketname')
key = 'foldername1'
objs = list(bucket.objects.filter(Prefix=key))
for obj in objs:
    # print(obj.key)
    # remove the file name from the object key
    obj_path = os.path.dirname(obj.key)
    # create nested directory structure
    Path(obj_path).mkdir(parents=True, exist_ok=True)
    # save file with full path locally
    bucket.download_file(obj.key, obj.key)
The error I am getting is below:
Traceback (most recent call last):
File "C:\MSA\EO projects\FEB 2022 WORKS\REMOTE AWZ\d6.py", line 23, in <module>
bucket.download_file(obj.key, obj.key)
File "C:\Program Files\Python37\lib\site-packages\boto3\s3\inject.py", line 246, in bucket_download_file
ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
File "C:\Program Files\Python37\lib\site-packages\boto3\s3\inject.py", line 172, in download_file
extra_args=ExtraArgs, callback=Callback)
File "C:\Program Files\Python37\lib\site-packages\boto3\s3\transfer.py", line 307, in download_file
future.result()
File "C:\Program Files\Python37\lib\site-packages\s3transfer\futures.py", line 106, in result
return self._coordinator.result()
File "C:\Program Files\Python37\lib\site-packages\s3transfer\futures.py", line 265, in result
raise self._exception
File "C:\Program Files\Python37\lib\site-packages\s3transfer\tasks.py", line 126, in __call__
return self._execute_main(kwargs)
File "C:\Program Files\Python37\lib\site-packages\s3transfer\tasks.py", line 150, in _execute_main
return_value = self._main(**kwargs)
File "C:\Program Files\Python37\lib\site-packages\s3transfer\download.py", line 601, in _main
osutil.rename_file(fileobj.name, final_filename)
File "C:\Program Files\Python37\lib\site-packages\s3transfer\utils.py", line 273, in rename_file
rename_file(current_filename, new_filename)
File "C:\Program Files\Python37\lib\site-packages\s3transfer\compat.py", line 25, in rename_file
os.remove(new_filename)
PermissionError: [WinError 5] Access is denied: 'foldername1/'
When the Create Folder button is used in the Amazon S3 console, it creates a 'folder'. However, Amazon S3 does not use folders. Instead, it creates a zero-length object with the name of the folder. In this case, it created an object called foldername1/.
When your code attempted to download this object as a file, your operating system refused to create a file with a name ending in a slash (/). In fact, you do not need to download this object at all, since the code already uses mkdir() to create the directory.
Therefore, the code can simply skip over such objects, like this:
for obj in objs:
    if not obj.key.endswith('/'):
        # Your existing code here
Alternatively, it could skip over zero-length objects with:
    if obj.size > 0:
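The skip logic can be tried out without AWS access. In this sketch, FakeObject is a hypothetical stand-in that models only the key and size attributes of the objects returned by bucket.objects.filter(), and the file names in the listing are made up for illustration.

```python
from collections import namedtuple

# Stand-in for the objects yielded by bucket.objects.filter(); only the two
# attributes used here (key and size) are modelled.
FakeObject = namedtuple("FakeObject", ["key", "size"])


def downloadable(objs):
    """Yield only real files, skipping zero-length 'folder' placeholder objects."""
    for obj in objs:
        if obj.key.endswith("/"):
            continue  # console-created folder marker, nothing to download
        yield obj


listing = [
    FakeObject("foldername1/", 0),
    FakeObject("foldername1/report.csv", 120),
    FakeObject("foldername1/sub/", 0),
    FakeObject("foldername1/sub/data.json", 48),
]
keys = [obj.key for obj in downloadable(listing)]
print(keys)
```

In the real script, the loop body from the question (dirname, mkdir, download_file) would run for each object that downloadable() yields.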

[Windows 10 / Python 3.9 / Pipenv]How to debug an error when using `pipenv install xxx` from command line

I'm having trouble installing things without error when using Pipenv and am unable to figure out how to fix the error. Just for some upfront clarification, I'm using Windows 10, Python 3.9 and the issue is when I install with Pipenv. My last name is a little odd and has an apostrophe in it; however, I usually leave the apostrophe out when registering for things due to potential issues with the apostrophe. For some reason, at work, the IT department created my account using the apostrophe. Since it is configured with Azure AD, I cannot adjust the username from my own PC, even though I have admin rights on this machine.
The error seems to come from opening a JSON file in my appdata directory, whose fully qualified path contains the string "t'alua" (my last name). Unfortunately, the path is enclosed in single quotes, so the apostrophe in my name causes a parsing issue. For the record, I have even uninstalled Python from my C drive, installed it on my D drive, and run the commands from my D drive (where there is no apostrophe in the working directory's path), but I still get the same issue since it is related to the appdata directory's absolute path. I have also asked the IT department to change my username to remove the apostrophe, but they said it's not feasible because of the many downstream changes that would be required to prevent bugs, which I understand.
Here is an example command with the error shown:
(corl_anl_coercion-sJyNKu6O) D:\mttf_mtbf_datasets\training_data\corl_anl_coercion>pipenv install numpy
Installing numpy...
[= ] Installing numpy...Failed to load paths: File "<string>", line 1
import sysconfig, distutils.sysconfig, io, json, sys; paths = {u'purelib': u'{0}'.format(distutils.sysconfig.get_python_lib(plat_specific=0)),u'stdlib': u'{0}'.format(sysconfig.get_path('stdlib')),u'platlib': u'{0}'.format(distutils.sysconfig.get_python_lib(plat_specific=1)),u'platstdlib': u'{0}'.format(sysconfig.get_path('platstdlib')),u'include': u'{0}'.format(distutils.sysconfig.get_python_inc(plat_specific=0)),u'platinclude': u'{0}'.format(distutils.sysconfig.get_python_inc(plat_specific=1)),u'scripts': u'{0}'.format(sysconfig.get_path('scripts')),u'py_version_short': u'{0}'.format(distutils.sysconfig.get_python_version()), }; value = u'{0}'.format(json.dumps(paths));fh = io.open('c:/users/t'alua~1/appdata/local/temp/tmp0m5ovbrb.json', 'w'); fh.write(value); fh.close()
^
SyntaxError: invalid syntax
Output:
[ ==] Installing numpy...Failed to load paths: File "<string>", line 1
import sysconfig, distutils.sysconfig, io, json, sys; paths = {u'purelib': u'{0}'.format(distutils.sysconfig.get_python_lib(plat_specific=0)),u'stdlib': u'{0}'.format(sysconfig.get_path('stdlib')),u'platlib': u'{0}'.format(distutils.sysconfig.get_python_lib(plat_specific=1)),u'platstdlib': u'{0}'.format(sysconfig.get_path('platstdlib')),u'include': u'{0}'.format(distutils.sysconfig.get_python_inc(plat_specific=0)),u'platinclude': u'{0}'.format(distutils.sysconfig.get_python_inc(plat_specific=1)),u'scripts': u'{0}'.format(sysconfig.get_path('scripts')),u'py_version_short': u'{0}'.format(distutils.sysconfig.get_python_version()), }; value = u'{0}'.format(json.dumps(paths));fh = io.open('c:/users/t'alua~1/appdata/local/temp/tmpywegc7gn.json', 'w'); fh.write(value); fh.close()
^
SyntaxError: invalid syntax
Output:
Adding numpy to Pipfile's [packages]...
Installation Succeeded
Pipfile.lock not found, creating...
Locking [dev-packages] dependencies...
Locking [packages] dependencies...
Resolving dependencies...
Locking Failed!
Traceback (most recent call last):
File "D:\installations\python\lib\site-packages\pipenv\resolver.py", line 764, in <module>
main()
File "D:\installations\python\lib\site-packages\pipenv\resolver.py", line 758, in main
_main(parsed.pre, parsed.clear, parsed.verbose, parsed.system, parsed.write,
File "D:\installations\python\lib\site-packages\pipenv\resolver.py", line 741, in _main
resolve_packages(pre, clear, verbose, system, write, requirements_dir, packages, dev)
File "D:\installations\python\lib\site-packages\pipenv\resolver.py", line 702, in resolve_packages
results, resolver = resolve(
File "D:\installations\python\lib\site-packages\pipenv\resolver.py", line 684, in resolve
return resolve_deps(
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 1406, in resolve_deps
results, hashes, markers_lookup, resolver, skipped = actually_resolve_deps(
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 1119, in actually_resolve_deps
resolver.resolve()
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 834, in resolve
results = self.resolver.resolve(max_rounds=environments.PIPENV_MAX_ROUNDS)
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 822, in resolver
self.get_resolver(clear=self.clear, pre=self.pre)
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 813, in get_resolver
constraints=self.parsed_constraints, repository=self.repository,
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 806, in parsed_constraints
self._parsed_constraints = [c for c in self.constraints]
File "d:\installations\python\lib\site-packages\pipenv\utils.py", line 806, in <listcomp>
self._parsed_constraints = [c for c in self.constraints]
File "d:\installations\python\lib\site-packages\pipenv\patched\notpip\_internal\req\req_file.py", line 137, in parse_requirements
for parsed_line in parser.parse(filename, constraint):
File "d:\installations\python\lib\site-packages\pipenv\patched\notpip\_internal\req\req_file.py", line 282, in parse
for line in self._parse_and_recurse(filename, constraint):
File "d:\installations\python\lib\site-packages\pipenv\patched\notpip\_internal\req\req_file.py", line 287, in _parse_and_recurse
for line in self._parse_file(filename, constraint):
File "d:\installations\python\lib\site-packages\pipenv\patched\notpip\_internal\req\req_file.py", line 329, in _parse_file
args_str, opts = self._line_parser(line)
File "d:\installations\python\lib\site-packages\pipenv\patched\notpip\_internal\req\req_file.py", line 365, in parse_line
shlex.split(options_str), defaults) # type: ignore
File "D:\installations\python\lib\shlex.py", line 315, in split
return list(lex)
File "D:\installations\python\lib\shlex.py", line 300, in __next__
token = self.get_token()
File "D:\installations\python\lib\shlex.py", line 109, in get_token
raw = self.read_token()
File "D:\installations\python\lib\shlex.py", line 191, in read_token
raise ValueError("No closing quotation")
ValueError: No closing quotation
I can see from the output above that the error is due to this line of python being executed:
fh = io.open('c:/users/t'alua~1/appdata/local/temp/tmp0m5ovbrb.json', 'w');
since the string is not terminated correctly.
Unfortunately, I'm not sure where this is coming from. I read through each of the Python files listed in the stack trace and can see that shlex.py raises the "No closing quotation" ValueError if the string ends while still in a quoted state. I couldn't find any line of code that actually pulls the appdata path, or why it is using appdata in the first place. I did find, in utils.py in my Python installation's site-packages directory:
D:\installations\python\Lib\site-packages\pipenv\utils.py
that there is an escape_grouped_arguments function. I thought that if the file path is being parsed with this function, I could modify it to check for variations of my last name and replace them with a good value, so I made the following change:
def escape_grouped_arguments(s):
    """Prepares a string for the shell (on Windows too!)

    Only for use on grouped arguments (passed as a string to Popen)
    """
    if s is None:
        return None
    ## This is the original code
    # Additional escaping for windows paths
    #if os.name == "nt":
    #    s = "{}".format(s.replace("\\", "\\\\"))
    #return '"' + s.replace("'", "'\\''") + '"'
    ## This is my modification
    if "t'alua" in s.lower():
        bad_names = ["T'Alua","t'alua","T'alua","T'ALUA"]
        good_name = "talua"
        for bad_name in bad_names:
            s = s.replace(bad_name, good_name)
    if os.name == "nt":
        s = "{}".format(s.replace("\\", "\\\\"))
    return '"' + s.replace("'", "'\\''") + '"'
But this change didn't seem to resolve the errors. Any and all guidance on how to correctly spot exactly where the error is coming from as well as how to best fix this particular issue would be greatly appreciated.
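Both failures in the output above can be reproduced in isolation, which may help when narrowing down where the path gets embedded. This is only a sketch of the mechanism and does not claim to locate the responsible line in pipenv; the path mirrors the one in the question, with a made-up file name.

```python
import shlex

# Path shaped like the one in the question; the file name is made up.
path = "c:/users/t'alua~1/appdata/local/temp/tmp.json"

# Naive interpolation into a single-quoted literal: the apostrophe ends the
# string early, so the generated source no longer parses.
naive_source = "fh = open('{0}', 'w')".format(path)
try:
    compile(naive_source, "<string>", "exec")
    naive_parses = True
except SyntaxError:
    naive_parses = False

# repr() picks a quoting style that keeps the literal valid.
safe_source = "fh = open({0!r}, 'w')".format(path)
compile(safe_source, "<string>", "exec")  # parses; nothing is executed

# shlex treats the apostrophe as an opening quote that is never closed.
try:
    shlex.split(path)
    shlex_error = None
except ValueError as exc:
    shlex_error = str(exc)

print(naive_parses, shlex_error)
```

Since both errors stem from the temp directory path rather than the working directory, any code that builds source text or option strings from that path without escaping it would be a candidate.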

spark-submit python file ‘home/.python-eggs’ permission denied

I have a problem when using spark-submit to run a Python file. When the 'map' code runs in an executor, I get this error:
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 151, in _run_module_as_main
mod_name, loader, code, fname = _get_module_details(mod_name)
File "/usr/lib64/python2.7/runpy.py", line 101, in _get_module_details
loader = get_loader(mod_name)
File "/usr/lib64/python2.7/pkgutil.py", line 464, in get_loader
return find_loader(fullname)
File "/usr/lib64/python2.7/pkgutil.py", line 474, in find_loader
for importer in iter_importers(fullname):
File "/usr/lib64/python2.7/pkgutil.py", line 430, in iter_importers
__import__(pkg)
File "/data8/yarn/local-dir/usercache/bo.feng/appcache/application_1448854352032_70810/container_1448854352032_70810_01_000002/pyspark.zip/pyspark/__init__.py", line 41, in <module>
File "/data8/yarn/local-dir/usercache/bo.feng/appcache/application_1448854352032_70810/container_1448854352032_70810_01_000002/pyspark.zip/pyspark/context.py", line 35, in <module>
File "/data8/yarn/local-dir/usercache/bo.feng/appcache/application_1448854352032_70810/container_1448854352032_70810_01_000002/pyspark.zip/pyspark/rdd.py", line 51, in <module>
File "/data8/yarn/local-dir/usercache/bo.feng/appcache/application_1448854352032_70810/container_1448854352032_70810_01_000002/pyspark.zip/pyspark/shuffle.py", line 33, in <module>
File "build/bdist.linux-x86_64/egg/psutil/__init__.py", line 89, in <module>
File "build/bdist.linux-x86_64/egg/psutil/_pslinux.py", line 24, in <module>
File "build/bdist.linux-x86_64/egg/_psutil_linux.py", line 7, in <module>
File "build/bdist.linux-x86_64/egg/_psutil_linux.py", line 4, in __bootstrap__
File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 945, in resource_filename
self, resource_name
File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1633, in get_resource_filename
self._extract_resource(manager, self._eager_to_zip(name))
File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1661, in _extract_resource
self.egg_name, self._parts(zip_path)
File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1025, in get_cache_path
self.extraction_error()
File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 991, in extraction_error
raise err
pkg_resources.ExtractionError: Can't extract file(s) to egg cache
The following error occurred while trying to extract file(s) to the Python egg
cache:
[Errno 13] Permission denied: '/home/.python-eggs'
The Python egg cache directory is currently set to:
/home/.python-eggs
Perhaps your account does not have write access to this directory? You can
change the cache directory by setting the PYTHON_EGG_CACHE environment
variable to point to an accessible directory.
I set the PYTHON_EGG_CACHE environment variable on every executor, and I also wrote os.environ['PYTHON_EGG_CACHE'] = "/tmp/" in my program, but the problem still happens.
My code:
import os,sys
print "env::::"+os.environ['PYTHON_EGG_CACHE']
from pyspark import SparkConf, SparkContext

# Load and parse the data
def parsePoint(line):
    import os
    print "env::::"+os.environ['PYTHON_EGG_CACHE']
    os.environ['PYTHON_EGG_CACHE'] = "/tmp/"
    values = [float(x) for x in line.split(' ')]
    return line

if __name__ == "__main__":
    os.environ['PYTHON_EGG_CACHE'] = "/tmp/"
    print "env::::"+os.environ['PYTHON_EGG_CACHE']
    conf = SparkConf()
    sc = SparkContext(conf = conf)
    data = sc.textFile(sys.argv[1])
    parsedData = data.map(parsePoint)
    parsedData.collect()
I ran this Python program in 'standalone' mode and it succeeded.
This is my submit command:
spark-submit --name test_py --master yarn-client testpy.py input/sample_svm_data.txt
Is this a YARN problem?
This is late, but it's the first result on Google I found for this problem. The previous answer is helpful (I wanted to know which environment variables I had to modify), but please DON'T edit the Spark sources; just change the environment variables using the proper tools. Add this to your spark.conf variables:
spark.executorEnv.PYTHON_EGG_CACHE="./.python-eggs/"
spark.executorEnv.PYTHON_EGG_DIR="./.python-eggs/"
spark.driverEnv.PYTHON_EGG_CACHE="./.python-eggs/"
spark.driverEnv.PYTHON_EGG_DIR="./.python-eggs/"
(I prefer . over /tmp/ because the working directory gets deleted after my job ends, so the eggs disappear too, which IMO they should.)
I solved this problem as follows:
Unzip pyspark.zip and find the rdd.py file.
Open this file and, under the "import os" line, add:
os.environ['PYTHON_EGG_CACHE'] = '/tmp/.python-eggs/'
os.environ['PYTHON_EGG_DIR']='/tmp/.python-eggs/'
Save the file and zip pyspark again.
I solved this problem with the help of BiS's answer. Adding the four configuration values when running spark-submit fixed the egg problem.
Here's an example of what adding the four parameters looks like with spark-submit:
spark-submit \
--conf spark.executorEnv.PYTHON_EGG_CACHE="./.python-eggs/" \
--conf spark.executorEnv.PYTHON_EGG_DIR="./.python-eggs/" \
--conf spark.driverEnv.PYTHON_EGG_CACHE="./.python-eggs/" \
--conf spark.driverEnv.PYTHON_EGG_DIR="./.python-eggs/" \
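If spark-submit is launched from a Python wrapper script, the same four settings can be generated instead of typed by hand. This is a minimal sketch: egg_cache_conf_args is a hypothetical helper, and the property names are taken from the answers above rather than checked against the Spark documentation.

```python
def egg_cache_conf_args(cache_dir="./.python-eggs/"):
    """Build the spark-submit --conf arguments for the four egg-cache settings."""
    prefixes = ("spark.executorEnv", "spark.driverEnv")
    names = ("PYTHON_EGG_CACHE", "PYTHON_EGG_DIR")
    args = []
    for prefix in prefixes:
        for name in names:
            args += ["--conf", "{0}.{1}={2}".format(prefix, name, cache_dir)]
    return args


print(" ".join(egg_cache_conf_args()))
```

The resulting list could be passed to subprocess together with "spark-submit" and the script arguments, which avoids shell quoting issues with the directory value.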
