Inserting data into an Impala table using Ibis (Python)

I'm trying to insert a DataFrame into a partitioned Impala table created with Ibis. I am running this on a remote kernel using Spyder 3.2.4 on a Windows 10 machine, with Python 3.6.2 on an edge node running CentOS.
I get following error:
Writing DataFrame to temporary file
Writing CSV to: /tmp/ibis/pandas_0032f9dd1916426da62c8b4d8f4dfb92/0.csv
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
insert = target_table.insert(df3)
File "/usr/local/lib/python3.6/site-packages/ibis/impala/client.py", line 1674, in insert
writer, expr = write_temp_dataframe(self._client, obj)
File "/usr/local/lib/python3.6/site-packages/ibis/impala/pandas_interop.py", line 225, in write_temp_dataframe
return writer, writer.delimited_table(path)
File "/usr/local/lib/python3.6/site-packages/ibis/impala/pandas_interop.py", line 188, in delimited_table
schema = self.get_schema()
File "/usr/local/lib/python3.6/site-packages/ibis/impala/pandas_interop.py", line 184, in get_schema
return pandas_to_ibis_schema(self.df)
File "/usr/local/lib/python3.6/site-packages/ibis/impala/pandas_interop.py", line 219, in pandas_to_ibis_schema
return schema(pairs)
File "/usr/local/lib/python3.6/site-packages/ibis/expr/api.py", line 105, in schema
return Schema.from_tuples(pairs)
File "/usr/local/lib/python3.6/site-packages/ibis/expr/datatypes.py", line 109, in from_tuples
return Schema(names, types)
File "/usr/local/lib/python3.6/site-packages/ibis/expr/datatypes.py", line 55, in init
self.types = [validate_type(typ) for typ in types]
File "/usr/local/lib/python3.6/site-packages/ibis/expr/datatypes.py", line 55, in
self.types = [validate_type(typ) for typ in types]
File "/usr/local/lib/python3.6/site-packages/ibis/expr/datatypes.py", line 1040, in validate_type
return TypeParser(t).parse()
File "/usr/local/lib/python3.6/site-packages/ibis/expr/datatypes.py", line 901, in parse
t = self.type()
File "/usr/local/lib/python3.6/site-packages/ibis/expr/datatypes.py", line 1033, in type
raise SyntaxError('Type cannot be parsed: {}'.format(self.text))
File "", line unknown
SyntaxError: Type cannot be parsed: integer

The error was caused by the structure and security of our Hadoop system. The Ibis package tries to create a temporary database and a temporary HDFS location in __ibis_tmp and /tmp/ibis/ respectively. Since the default locations on our system are not open to any user other than root/system admin, the insert command was erroring out while moving data from /tmp/ibis/ into the actual database (still not entirely clear, but possibly via the __ibis_tmp database). Once we edited the config_init.py file of the ibis package to point to an allowed temp location/database, it worked like a charm.

Instead of editing config_init.py as mentioned in https://stackoverflow.com/a/47543691/5485370, it is easier to assign the temp database and path using ibis.options:
ibis.options.impala.temp_db = 'your_temp_db'
ibis.options.impala.temp_hdfs_path = 'your_temp_hdfs_path'
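For context, here is a minimal sketch of where these options fit into an insert workflow. The host names, database, and table names below are placeholders, and the connection calls assume the older ibis.impala.connect / ibis.hdfs_connect interface that was current at the time of this question:
import ibis
import pandas as pd

# Placeholder cluster details -- replace with your own hosts and ports.
hdfs = ibis.hdfs_connect(host='namenode.example.com', port=50070)
con = ibis.impala.connect(host='impala.example.com', port=21050, hdfs_client=hdfs)

# Point Ibis at a temp database and HDFS path your user is allowed to write to,
# instead of the defaults (__ibis_tmp and /tmp/ibis).
ibis.options.impala.temp_db = 'your_temp_db'
ibis.options.impala.temp_hdfs_path = '/user/your_user/tmp/ibis'

df3 = pd.DataFrame({'id': [1, 2], 'value': ['a', 'b']})
target_table = con.table('your_partitioned_table', database='your_db')
target_table.insert(df3)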

Related

How can I read Minecraft .mca files so that I can extract individual blocks in Python?

I can't find a way of reading Minecraft world files that I could use in Python.
I've looked around the internet but can find no tutorials, and the few libraries that claim to do this never actually work.
from nbt import *
nbtfile = nbt.NBTFile("r.0.0.mca",'rb')
I expected this to work, but instead I got errors about the file not being compressed, or something of the sort.
Full error:
Traceback (most recent call last):
File "C:\Users\rober\Desktop\MinePy\MinecraftWorldReader.py", line 2, in <module>
nbtfile = nbt.NBTFile("r.0.0.mca",'rb')
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 628, in __init__
self.parse_file()
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 652, in parse_file
type = TAG_Byte(buffer=self.file)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 99, in __init__
self._parse_buffer(buffer)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 105, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 276, in read
return self._buffer.read(size)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 463, in read
if not self._read_gzip_header():
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 411, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x00\x00')
Use the anvil-parser package (install it with pip install anvil-parser).
Reading
import anvil
region = anvil.Region.from_file('r.0.0.mca')
# You can also provide the region file name instead of the object
chunk = anvil.Chunk.from_region(region, 0, 0)
# If `section` is not provided, will get it from the y coords
# and assume it's global
block = chunk.get_block(0, 0, 0)
print(block) # <Block(minecraft:air)>
print(block.id) # air
print(block.properties) # {}
https://pypi.org/project/anvil-parser/
According to this page, an .mca file is not exactly an NBT file. It begins with an 8 KiB header which includes the offsets of the chunks in the region file itself and the timestamps of their last updates.
I recommend looking at the official announcement and this page for more information.
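To make the "not a gzipped file" error concrete, here is a rough sketch of reading that header by hand, assuming the standard Anvil region layout (1024 four-byte location entries followed by 1024 four-byte timestamps); anvil-parser does all of this for you, this is only an illustration:
import struct
import zlib

with open('r.0.0.mca', 'rb') as f:
    header = f.read(8192)  # 4 KiB of chunk locations + 4 KiB of timestamps

    # Location entry for chunk (0, 0): 3-byte sector offset + 1-byte sector count.
    sector_offset = int.from_bytes(header[0:3], 'big')
    sector_count = header[3]
    timestamp = struct.unpack('>I', header[4096:4100])[0]

    if sector_offset:
        # The chunk payload starts at sector_offset * 4096: a 4-byte length,
        # a 1-byte compression type (2 = zlib), then the compressed NBT data.
        f.seek(sector_offset * 4096)
        length = struct.unpack('>I', f.read(4))[0]
        compression = f.read(1)[0]
        raw = f.read(length - 1)
        nbt_bytes = zlib.decompress(raw) if compression == 2 else raw
        print(len(nbt_bytes), 'bytes of uncompressed NBT for chunk (0, 0)')
This also explains the original error: nbt.NBTFile expects a gzipped NBT stream, but the region file starts with raw header bytes.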

Cannot create simple table using Happybase in Python

I am trying to create a table using Happybase. To start, I enter the following commands to get HBase and Thrift running:
start-hbase.sh
hbase thrift start &
Once this is running I open Python's command prompt and type the following:
import happybase as hb
connection = hb.Connection()
connection.open()
However when I try to create a table:
connection.create_table(
    'mytable',
    {'cf1': dict(max_versions=10),
     'cf2': dict(max_versions=1, block_cache_enabled=False),
     'cf3': dict(),  # use defaults
    }
)
I get the following error that I just don't understand.
Traceback (most recent call last):
File "<stdin>", line 5, in <module>
File "/usr/local/lib/python2.7/dist-packages/happybase/connection.py", line 309, in create_table
self.client.createTable(name, column_descriptors)
File "/usr/local/lib/python2.7/dist-packages/thriftpy/thrift.py", line 198, in _req
return self._recv(_api)
File "/usr/local/lib/python2.7/dist-packages/thriftpy/thrift.py", line 210, in _recv
fname, mtype, rseqid = self._iprot.read_message_begin()
File "thriftpy/protocol/cybin/cybin.pyx", line 429, in cybin.TCyBinaryProtocol.read_message_begin (thriftpy/protocol/cybin/cybin.c:6325)
File "thriftpy/protocol/cybin/cybin.pyx", line 60, in cybin.read_i32 (thriftpy/protocol/cybin/cybin.c:1546)
File "thriftpy/transport/buffered/cybuffered.pyx", line 65, in thriftpy.transport.buffered.cybuffered.TCyBufferedTransport.c_read (thriftpy/transport/buffered/cybuffered.c:1881)
File "thriftpy/transport/buffered/cybuffered.pyx", line 69, in thriftpy.transport.buffered.cybuffered.TCyBufferedTransport.read_trans (thriftpy/transport/buffered/cybuffered.c:1948)
File "thriftpy/transport/cybase.pyx", line 61, in thriftpy.transport.cybase.TCyBuffer.read_trans (thriftpy/transport/cybase.c:1472)
File "/usr/local/lib/python2.7/dist-packages/thriftpy/transport/socket.py", line 125, in read
message='TSocket read 0 bytes')
thriftpy.transport.TTransportException: TTransportException(message='TSocket read 0 bytes', type=4)
You need to specify the server address, and possibly port:
connection = hb.Connection(SERVER, PORT)
You can probably omit the port, as the default will most likely match, but check which port your Thrift server is listening on and pass that as a numeric value just in case.
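Putting it together, a minimal sketch (the host name is made up, and 9090 is the usual Thrift default):
import happybase

# Hypothetical host; check which port `hbase thrift start` actually reports.
connection = happybase.Connection('hbase-master.example.com', 9090)
connection.open()

connection.create_table(
    'mytable',
    {'cf1': dict(max_versions=10),
     'cf2': dict(max_versions=1, block_cache_enabled=False),
     'cf3': dict(),  # use defaults
    }
)
print(connection.tables())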

Searching for a string with Web.py

I'm trying to build a python function with web.py and SQLite that will allow users to search for a given string within a description field and will return all matching results.
Right now I've gotten to the below function, which works but only if the input is an exact match.
def getItem(params, max_display):
    query_string = 'SELECT * FROM items WHERE 1=1'
    description = params['description']
    if description:
        query_string = query_string + ' AND description LIKE $description'
    result = query(query_string, {
        'description': description,
    })
I've tried to implement this feature with LIKE "%$description%", however I keep getting the web.py error below.
Traceback (most recent call last):
File "lib/web/wsgiserver/__init__.py", line 1245, in communicate
req.respond()
File "lib/web/wsgiserver/__init__.py", line 775, in respond
self.server.gateway(self).respond()
File "lib/web/wsgiserver/__init__.py", line 2018, in respond
response = self.req.server.wsgi_app(self.env, self.start_response)
File "lib/web/httpserver.py", line 306, in __call__
return self.app(environ, xstart_response)
File "lib/web/httpserver.py", line 274, in __call__
return self.app(environ, start_response)
File "lib/web/application.py", line 279, in wsgi
result = self.handle_with_processors()
File "lib/web/application.py", line 249, in handle_with_processors
return process(self.processors)
File "lib/web/application.py", line 246, in process
raise self.internalerror()
File "lib/web/application.py", line 478, in internalerror
return debugerror.debugerror()
File "lib/web/debugerror.py", line 305, in debugerror
return web._InternalError(djangoerror())
File "lib/web/debugerror.py", line 290, in djangoerror
djangoerror_r = Template(djangoerror_t, filename=__file__, filter=websafe)
File "lib/web/template.py", line 846, in __init__
code = self.compile_template(text, filename)
File "lib/web/template.py", line 926, in compile_template
ast = compiler.parse(code)
File "/Users/sokeefe/homebrew/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/compiler/transformer.py", line 51, in parse
return Transformer().parsesuite(buf)
File "/Users/sokeefe/homebrew/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/compiler/transformer.py", line 128, in parsesuite
return self.transform(parser.suite(text))
AttributeError: 'module' object has no attribute 'suite'
Any thoughts on what might be going wrong with this function?
Thanks in advance!
What do you think is going on with parser.py?
Here is the relevant portion of the error message:
File "/Users/sokeefe/homebrew/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/compiler/transformer.py", line 128, in parsesuite
    return self.transform(parser.suite(text))
AttributeError: 'module' object has no attribute 'suite'
So, somewhere there is a module called parser that defines a function called suite(), and that module is used by library code that runs when your program executes. But because you named one of your own files parser.py, Python found your file first when that library did import parser, and there is no function named suite() in your file.
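A minimal way to reproduce the shadowing (file names here are hypothetical): put an empty file named parser.py next to the script below and run it with Python 2.7.
import compiler  # compiler.transformer does `import parser` internally

# With a local parser.py shadowing the stdlib module, the next line raises
# AttributeError: 'module' object has no attribute 'suite'; after renaming
# your file (and deleting any stale parser.pyc) it returns a Module AST node.
print(compiler.parse("x = 1"))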

Create HDF5 file using pytables with table format and data columns

I want to read an h5 file previously created with PyTables.
The file is read using Pandas, and with some conditions, like this:
pd.read_hdf('myH5file.h5', 'anyTable', where='some_conditions')
From another question, I have been told that, in order for an h5 file to be "queryable" with read_hdf's where argument, it must be written in table format and, in addition, some columns must be declared as data columns.
I cannot find anything about this in the PyTables documentation.
The documentation on PyTables' create_table method does not mention it either.
So, right now, if I try something like that on my h5 file created with PyTables I get the following:
>>> d = pd.read_hdf('test_file.h5','basic_data', where='operation==1')
C:\Python27\lib\site-packages\pandas\io\pytables.py:3070: IncompatibilityWarning:
where criteria is being ignored as this version [0.0.0] is too old (or
not-defined), read the file in and write it out to a new file to upgrade (with
the copy_to method)
warnings.warn(ws, IncompatibilityWarning)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 323, in read_hdf
return f(store, True)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 305, in <lambda>
key, auto_close=auto_close, **kwargs)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 665, in select
return it.get_result()
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 1359, in get_result
results = self.func(self.start, self.stop, where)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 658, in func
columns=columns, **kwargs)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 3968, in read
if not self.read_axes(where=where, **kwargs):
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 3196, in read_axes
values = self.selection.select()
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 4482, in select
start=self.start, stop=self.stop)
File "C:\Python27\lib\site-packages\tables\table.py", line 1567, in read_where
self._where(condition, condvars, start, stop, step)]
File "C:\Python27\lib\site-packages\tables\table.py", line 1528, in _where
compiled = self._compile_condition(condition, condvars)
File "C:\Python27\lib\site-packages\tables\table.py", line 1366, in _compile_condition
compiled = compile_condition(condition, typemap, indexedcols)
File "C:\Python27\lib\site-packages\tables\conditions.py", line 430, in compile_condition
raise _unsupported_operation_error(nie)
NotImplementedError: unsupported operand types for *eq*: int, bytes
EDIT:
The traceback mentions something about IncompatibilityWarning and version [0.0.0], however if I check my versions of Pandas and Tables I get:
>>> import pandas
>>> pandas.__version__
'0.15.2'
>>> import tables
>>> tables.__version__
'3.1.1'
So, I am totally confused.
I had the same issue, and this is what I did:
1. Create an HDF5 file with PyTables;
2. Read this HDF5 file with pandas.read_hdf, using parameters like where=where_string, columns=selected_columns.
I got a warning message like the one below, along with other error messages:
D:\Program Files\Anaconda3\lib\site-packages\pandas\io\pytables.py:3065: IncompatibilityWarning: where criteria is being ignored as this version [0.0.0] is too old (or not-defined), read the file in and write it out to a new file to upgrade (with the copy_to method)
warnings.warn(ws, IncompatibilityWarning)
I tried commands like this:
hdf5_store = pd.HDFStore(hdf5_file, mode = 'r')
h5cpt_store_new = hdf5_store.copy(hdf5_new_file, complevel=9, complib='blosc')
h5cpt_store_new.close()
Then I ran the command exactly as in step 2, and it worked.
>>> pandas.__version__
'0.17.1'
>>> tables.__version__
'3.2.2'
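For completeness, here is a small sketch (file and column names are made up) of writing a queryable file directly from pandas, which gives you format='table' plus data columns so that where= works without the copy step:
import pandas as pd

df = pd.DataFrame({'operation': [1, 2, 1], 'value': [10.0, 20.0, 30.0]})

# format='table' stores the frame in PyTables' queryable "table" format,
# and data_columns makes those columns usable in a where= expression.
df.to_hdf('queryable.h5', 'basic_data', format='table', data_columns=['operation'])

subset = pd.read_hdf('queryable.h5', 'basic_data', where='operation == 1')
print(subset)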

Cannot open my client

I am trying to open my Tryton client but it is not working.
The text of the error is:
File "./tryton", line 66, in <module>
tryton.client.TrytonClient().run()
File "/home/ghassen/work/tryton/tryton/client.py", line 101, in run
main.sig_login()
File "/home/ghassen/work/tryton/tryton/gui/main.py", line 910, in sig_login
res = DBLogin().run()
File "/home/ghassen/work/tryton/tryton/gui/window/dblogin.py", line 579, in run
if (self.profiles.get(profile_name, sectionname)
File "/usr/lib/python2.7/ConfigParser.py", line 618, in get
raise NoOptionError(option, section)
Tryton has a profile file where it saves the known connections, and from the error it seems that file is corrupted. You can find it under ~/.config/tryton/x.y/profiles.cfg, where x.y corresponds to your version number.
If you don't have any saved profiles, you can remove this file and the client will recreate it the next time it starts.
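If you want to check whether the file is readable before deleting it, a quick sketch (the x.y path component is the placeholder from above; the Python 2 module matches the traceback):
import os
from ConfigParser import ConfigParser  # Python 2, as in the traceback

path = os.path.expanduser('~/.config/tryton/x.y/profiles.cfg')
config = ConfigParser()
config.read(path)
for section in config.sections():
    print section, dict(config.items(section))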
