I am writing a Python script that uses pyarrow to connect to a Hadoop server via fs.HadoopFileSystem(host=host_value, port=port_value), but every time I get this error:
self.parquet_writer = HDFSWriter(host_value='hdfs://10.110.8.239',port_value=9000)
File "/app/aerial_server.py", line 54, in __init__
self.hdfs_client = fs.HadoopFileSystem(host=host_value, port=port_value)
File "pyarrow/_hdfs.pyx", line 89, in pyarrow._hdfs.HadoopFileSystem.__init__
File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status
OSError: HDFS connection failed
Environment variables:
PYTHON_VERSION=3.7.13
HADOOP_OPTS=-Djava.library.path=/app/hadoop-3.3.2/lib/nativ
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
HADOOP_INSTALL=/app/hadoop-3.3.2
ARROW_LIBHDFS_DIR=/app/hadoop-3.3.2/lib/native
HADOOP_MAPRED_HOME=/app/hadoop-3.3.2
HADOOP_COMMON_HOME=/app/hadoop-3.3.2
HADOOP_HOME=/app/hadoop-3.3.2
HADOOP_HDFS_HOME=/app/hadoop-3.3.2
PYTHON_PIP_VERSION=22.0.4
CLASSPATH=/app/hadoop-3.3.2/bin/hdfs classpath --glob
HADOOP_COMMON_LIB_NATIVE_DIR=/app/hadoop-3.3.2/lib/native
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/app/hadoop-3.3.2/sbin:/app/hadoop-3.3.2/bin
_=/usr/bin/env
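For reference, a minimal, self-contained sketch of this kind of connection (the IP, port, and Hadoop install path come from the setup above; here the classpath is expanded at runtime and the host is passed without the hdfs:// scheme):

import os
import subprocess
from pyarrow import fs

# Fill CLASSPATH with the actual output of `hdfs classpath --glob`;
# libhdfs needs the expanded jar list rather than the command string itself.
os.environ["CLASSPATH"] = subprocess.check_output(
    ["/app/hadoop-3.3.2/bin/hdfs", "classpath", "--glob"], text=True
).strip()

# Connect with a bare host/IP and an explicit port.
hdfs_client = fs.HadoopFileSystem(host="10.110.8.239", port=9000)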
I am running an Ansible playbook with the AWS dynamic inventory script (ec2.py), and I just can't make sense of this error:
[WARNING]: * Failed to parse /var/opt/git/ansible/project-x/inventory/ec2.py with script plugin: Inventory script (/var/opt/git/ansible/project-x/inventory/ec2.py) had an execution error: Traceback (most recent call last): File
"/var/opt/git/ansible/project-x/inventory/ec2.py", line 1701, in <module> Ec2Inventory() File "/var/opt/git/ansible/project-x/inventory/ec2.py", line 272, in __init__ self.do_api_calls_update_cache() File
"/var/opt/git/ansible/project-x/inventory/ec2.py", line 538, in do_api_calls_update_cache self.get_instances_by_region(region) File "/var/opt/git/ansible/project-x/inventory/ec2.py", line 592, in get_instances_by_region
conn = self.connect(region) File "/var/opt/git/ansible/project-x/inventory/ec2.py", line 556, in connect conn = self.connect_to_aws(ec2, region) File "/var/opt/git/ansible/project-x/inventory/ec2.py", line 581, in
connect_to_aws conn = module.connect_to_region(region, **connect_args) File "/usr/local/lib/python2.7/dist-packages/boto/ec2/__init__.py", line 66, in connect_to_region connection_cls=EC2Connection, **kw_params) File
"/usr/local/lib/python2.7/dist-packages/boto/regioninfo.py", line 220, in connect return region.connect(**kw_params) File "/usr/local/lib/python2.7/dist-packages/boto/regioninfo.py", line 290, in connect return
self.connection_cls(region=self, **kw_params) File "/usr/local/lib/python2.7/dist-packages/boto/ec2/connection.py", line 103, in __init__ profile_name=profile_name) File "/usr/local/lib/python2.7/dist-
packages/boto/connection.py", line 1100, in __init__ provider=provider) File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 555, in __init__ profile_name) File "/usr/local/lib/python2.7/dist-
packages/boto/provider.py", line 201, in __init__ self.get_credentials(access_key, secret_key, security_token, profile_name) File "/usr/local/lib/python2.7/dist-packages/boto/provider.py", line 297, in get_credentials
profile_name) boto.provider.ProfileNotFoundError: Profile "project-x" not found!
[WARNING]: * Failed to parse /var/opt/git/ansible/project-x/inventory/ec2.py with ini plugin: /var/opt/git/ansible/project-x/inventory/ec2.py:3: Error parsing host definition ''''': No closing quotation
[WARNING]: Unable to parse /var/opt/git/ansible/project-x/inventory/ec2.py as an inventory source
[WARNING]: Unable to parse /var/opt/git/ansible/project-x/inventory as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
[WARNING]: Could not match supplied host pattern, ignoring: tag_Role_leader
PLAY [tag_Role_leader] **********************************************************************************************************************************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: tag_Role_manager
PLAY [tag_Role_manager] *********************************************************************************************************************************************************************************************************************
skipping: no hosts matched
PLAY RECAP **********************************************************************************************************************************************************************************************************************************
When my friend runs this, it works for him; the playbook executes fine against tag_Role_leader and tag_Role_manager. My AWS credentials are correct.
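For context, the ProfileNotFoundError above means boto could not find a named credentials profile called project-x. Such a profile normally looks roughly like the sketch below; the file location and key values are placeholders, following boto 2's documented profile conventions, and which file applies depends on how ec2.ini is configured:

# ~/.boto (boto 2 config style)
[profile project-x]
aws_access_key_id = PLACEHOLDER_KEY_ID
aws_secret_access_key = PLACEHOLDER_SECRET

# or in the shared ~/.aws/credentials file
[project-x]
aws_access_key_id = PLACEHOLDER_KEY_ID
aws_secret_access_key = PLACEHOLDER_SECRET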
I am following the installation steps below but have encountered a Python problem.
https://en.wikibooks.org/wiki/GNU_Health/Installation#Installing_GNU_Health_on_GNU/Linux_and_FreeBSD
At the step where the database instance is initialised, I encountered the following error after executing this command:
python3 ./trytond-admin --all --database=health
Error encountered:
Traceback (most recent call last):
File "./trytond-admin", line 21, in <module>
admin.run(options)
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-4.6.18/trytond/admin.py", line 24, in run
with Transaction().start(db_name, 0, _nocache=True):
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-4.6.18/trytond/transaction.py", line 88, in start
database = Database(database_name).connect()
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-4.6.18/trytond/backend/postgresql/database.py", line 97, in __new__
**cls._connection_params(name))
File "/home/gnuhealth/.local/lib/python3.6/site-packages/psycopg2/pool.py", line 161, in __init__
self, minconn, maxconn, *args, **kwargs)
File "/home/gnuhealth/.local/lib/python3.6/site-packages/psycopg2/pool.py", line 58, in __init__
self._connect()
File "/home/gnuhealth/.local/lib/python3.6/site-packages/psycopg2/pool.py", line 62, in _connect
conn = psycopg2.connect(*self._args, **self._kwargs)
File "/home/gnuhealth/.local/lib/python3.6/site-packages/psycopg2/__init__.py", line 126, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: fe_sendauth: no password supplied
Can anyone help me out with this error or tell me what I am missing?
Based on the error, I suspect the connection to the DB is failing because no password is supplied.
It seems that you did not configure the URI with the credentials to connect to the database. You can find the description of the configuration file at http://docs.tryton.org/projects/server/en/latest/topics/configuration.html#uri
Once you have a configuration file, you must run the command like this:
python3 ./trytond-admin --all --database=health -c /path/to/trytond.conf
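As an illustration, a minimal trytond.conf along the lines the documentation describes; the user name and password are placeholders for your actual PostgreSQL credentials:

[database]
uri = postgresql://gnuhealth:PASSWORD@localhost:5432/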
I am new to the Neo4j world. I have used it successfully on my MacBook. Now I am deploying it on a remote Linux machine with the same setup, but I keep getting this ProtocolError. What causes this issue, and how do I fix it? I have been banging my head against this error for days.
Traceback (most recent call last):
File "/root/dev/knowledgeGraphH/knowledge/media_entity_mapper.py", line 31, in <module>
main()
File "/root/dev/knowledgeGraphH/knowledge/media_entity_mapper.py", line 28, in main
map_media_to_entities()
File "/root/dev/knowledgeGraphH/knowledge/media_entity_mapper.py", line 7, in map_media_to_entities
data_manager = DataManager()
File "/root/dev/knowledgeGraphH/knowledge/data_manager/data_manager.py", line 13, in __init__
self.graphDB = Neo4jManager()
File "/root/dev/knowledgeGraphH/knowledge/neo4j_manager.py", line 10, in __init__
self.session = self.driver.session()
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/session.py", line 148, in session
session = Session(self)
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/session.py", line 461, in __init__
self.connection = connect(driver.host, driver.port, driver.ssl_context, **driver.config)
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 465, in connect
return Connection(s, der_encoded_server_certificate=der_encoded_server_certificate, **config)
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 237, in __init__
self.fetch()
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 326, in fetch
self.acknowledge_failure()
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 273, in acknowledge_failure
fetch()
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 311, in fetch
raw.writelines(self.channel.chunk_reader())
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 169, in chunk_reader
chunk_header = self._recv(2)
File "/root/dev/knowledgeGraphH/env/lib/python2.7/site-packages/neo4j/v1/connection.py", line 152, in _recv
raise ProtocolError("Server closed connection")
neo4j.v1.exceptions.ProtocolError: Server closed connection
This seems to be a port issue. Is the Bolt port open, and do you have access to it?
Check the output of the following command:
lsof -i tcp:7687
Change the port number if you have changed the Bolt port.
It turns out it was because I used the wrong credentials for this connection.
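For anyone hitting the same thing, a minimal connection sketch with explicit credentials, assuming a 1.x driver version that accepts an auth token (the host and password are placeholders):

from neo4j.v1 import GraphDatabase, basic_auth

# Connect over Bolt with explicit credentials; a wrong user/password can
# surface as the server closing the connection during the handshake.
driver = GraphDatabase.driver(
    "bolt://remote-host:7687",
    auth=basic_auth("neo4j", "your-password"),
)
session = driver.session()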
I'm following the instructions at https://github.com/jorilallo/celery-flower-heroku to deploy the Flower Celery monitoring app to Heroku.
After configuring and deploying my app, I see the following in the Heroku logs:
Traceback (most recent call last):
File "/app/.heroku/python/bin/flower", line 9, in <module>
load_entry_point('flower==0.7.0', 'console_scripts', 'flower')()
File "/app/.heroku/python/lib/python2.7/site-packages/flower/__main__.py", line 11, in main
flower.execute_from_commandline()
File "/app/.heroku/python/lib/python2.7/site-packages/celery/bin/base.py", line 306, in execute_from_commandline
return self.handle_argv(self.prog_name, argv[1:])
File "/app/.heroku/python/lib/python2.7/site-packages/flower/command.py", line 99, in handle_argv
return self.run_from_argv(prog_name, argv)
File "/app/.heroku/python/lib/python2.7/site-packages/flower/command.py", line 75, in run_from_argv
**app_settings)
File "/app/.heroku/python/lib/python2.7/site-packages/flower/app.py", line 40, in __init__
max_tasks_in_memory=max_tasks)
File "/app/.heroku/python/lib/python2.7/site-packages/flower/events.py", line 60, in __init__
state = shelve.open(self._db)
File "/app/.heroku/python/lib/python2.7/shelve.py", line 239, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File "/app/.heroku/python/lib/python2.7/shelve.py", line 223, in __init__
Shelf.__init__(self, anydbm.open(filename, flag), protocol, writeback)
File "/app/.heroku/python/lib/python2.7/anydbm.py", line 85, in open
return mod.open(file, flag, mode)
File "/app/.heroku/python/lib/python2.7/dumbdbm.py", line 250, in open
return _Database(file, mode)
File "/app/.heroku/python/lib/python2.7/dumbdbm.py", line 71, in __init__
f = _open(self._datfile, 'w')
IOError: [Errno 2] No such file or directory: 'postgres://USERNAME:PASSWORD#ec2-HOST.compute-1.amazonaws.com:5432/DBNAME.dat'
Notice the .dat suffix there? I have no idea where it comes from; it's not present in my DATABASE_URL environment variable.
Furthermore, the error above is with Flower 0.7. I also tried installing 0.6, which gets further (the DB is correctly recognized and the connection established), but I then get the following warnings once Flower starts:
2014-06-19T15:14:02.464424+00:00 app[web.1]: [E 140619 15:14:02 state:138] Failed to inspect workers: '[Errno 104] Connection reset by peer', trying again in 128 seconds
2014-06-19T15:14:02.464844+00:00 app[web.1]: [E 140619 15:14:02 events:103] Failed to capture events: '[Errno 104] Connection reset by peer', trying again in 128 seconds.
Loading flower in my browser does show a few tabs of stuff, but there is no data.
How do I resolve these issues?
Flower doesn't support database persistence. It saves its state to file(s) using the shelve module, which is why the DATABASE_URL you passed is being treated as a filename and getting a .dat suffix appended.
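A small sketch of the shelve behaviour behind that traceback (the filename here is arbitrary; on the Python 2 dumbdbm backend shown in your logs it creates a .dat file next to whatever name you pass):

import shelve

# shelve treats its argument as a local file path, not a URL.
state = shelve.open("flower_state")  # dumbdbm writes flower_state.dat (plus .dir/.bak)
state["tasks"] = {}
state.close()

# shelve.open("postgres://user:pass@host:5432/dbname") would try to create a
# file literally named like that URL with ".dat" appended -- the IOError above.

If your Flower version exposes the --db option (the attribute in the traceback suggests it does), pointing it at a plain local filename rather than DATABASE_URL avoids this particular error.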
I am using Django 1.2 and Python 2.6 with a MySQL server.
After working for a while (selecting and updating records), I got this error:
Exception in thread Thread-269:
Traceback (most recent call last):
File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
File "dispatcher.py", line 42, in run
File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 80, in __len__
File "/usr/lib/python2.6/site-packages/django/db/models/query.py", line 271, in iterator
File "/usr/lib/python2.6/site-packages/django/db/models/sql/compiler.py", line 677, in results_iter
File "/usr/lib/python2.6/site-packages/django/db/models/sql/compiler.py", line 731, in execute_sql
File "/usr/lib/python2.6/site-packages/django/db/backends/__init__.py", line 75, in cursor
File "/usr/lib/python2.6/site-packages/django/db/backends/mysql/base.py", line 297, in _cursor
File "/usr/lib64/python2.6/site-packages/MySQLdb/__init__.py", line 81, in Connect
File "/usr/lib64/python2.6/site-packages/MySQLdb/connections.py", line 187, in __init__
OperationalError: (2001, "Can't create UNIX socket (24)")
Here are lines 41 and 42 of my dispatcher.py:
dataList = Mydata.objects.filter(date__isnull=True)[:chunkSize]
print '%s - DB worker finished reading %s entrys' % (datetime.now(),len(dataList))
Any clue why I get this error?
I tried googling but could not find an answer.
I am connecting to the DB through Django, on localhost.
On my machine, errno 24 is defined as:
#define EMFILE 24 /* Too many open files */
This means you are running out of file descriptors. Your app is "leaking" file descriptors by opening them (and not closing them) again and again.
Or perhaps you are not forgetting to close anything, but simply have too many files open at the same time.
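A quick way to see the limit you are hitting, sketched in Python (the connection detail in the comment is illustrative: each thread that opens its own MySQL connection also consumes a socket, which counts as a file descriptor):

import resource

# Per-process limit on open file descriptors (sockets count against it).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open-file limit: soft=%d, hard=%d" % (soft, hard))

# Django opens one DB connection per thread, so many concurrent threads
# (the traceback shows Thread-269) can exhaust a typical soft limit of 1024
# unless connections and files are closed when each thread finishes.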