I'm running a small Python script to create a pandas DataFrame from BigQuery table results. When I run the code I see the error below. db-dtypes is already installed; I'm not sure what other dependencies I need to add. Any help is appreciated.
Here is the code:
import pandas
from google.cloud import bigquery
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(
    '/Users/kar/Downloads/data-4045ff698b4f.json')
project_id = 'data-platform'
client = bigquery.Client(credentials=credentials, project=project_id)
sql = """SELECT * FROM `data-platform.airbnb.raw_hosts` LIMIT 1"""
query_job = client.query(sql)
df = query_job.to_dataframe()
Error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/ka/PycharmProjects/pythonProject4/main.py", line 17, in <module>
df = query_job.to_dataframe()
File "/Users/ka/PycharmProjects/pythonProject4/venv/lib/python3.7/site-packages/google/cloud/bigquery/job/query.py", line 1689, in to_dataframe
geography_as_object=geography_as_object,
File "/Users/ka/PycharmProjects/pythonProject4/venv/lib/python3.7/site-packages/google/cloud/bigquery/table.py", line 1965, in to_dataframe
_pandas_helpers.verify_pandas_imports()
File "/Users/ka/PycharmProjects/pythonProject4/venv/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 991, in verify_pandas_imports
raise ValueError(_NO_DB_TYPES_ERROR) from db_dtypes_import_exception
ValueError: Please install the 'db-dtypes' package to use this function.
Process finished with exit code 1
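A common cause, assuming db-dtypes really is installed somewhere, is that it lives in a different environment than the one PyCharm runs the script with. A minimal diagnostic sketch for checking the active interpreter:

import importlib.util
import sys

# Which interpreter is actually running? It should point inside .../pythonProject4/venv.
print(sys.executable)

# None means db-dtypes is not installed in *this* environment; if so, install it
# with that same interpreter: <path printed above> -m pip install db-dtypes
print(importlib.util.find_spec("db_dtypes"))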
I am trying to read data from Excel into a pandas DataFrame and then write the DataFrame to a Snowflake table. Code as below.
The connection is established and the Excel read works fine, but the write to the Snowflake table is not working. I am getting the below error. Requesting help to resolve it.
snowflake.connector.errors.MissingDependencyError: Missing optional dependency: pandas
Process finished with exit code 1
import pandas as pd
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
from snowflake.connector.pandas_tools import pd_writer
url = URL(
    account='',
    user='',
    schema='TMP',
    database='TMP',
    warehouse='DATABRICKS',
    role='',
    authenticator='externalbrowser',
)
engine = create_engine(url)
con = engine.connect()
df = pd.read_excel("C:\\Final.xlsx")
df.columns = df.columns.astype(str)
table_name = 'test_connect'
if_exists = 'replace'
df.to_sql(name=table_name.lower(), con=con, index=False, if_exists=if_exists, method=pd_writer)
Detailed error info below:
Traceback (most recent call last):
File "C:\Users\XYZ\AppData\Roaming\JetBrains\DataSpell2022.2\scratches\scratch.py", line 32, in <module>
df.to_sql(name=table_name.lower(), con=con,index= False, if_exists=if_exists, method=pd_writer)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\core\generic.py", line 2963, in to_sql
return sql.to_sql(
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 697, in to_sql
return pandas_sql.to_sql(
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 1739, in to_sql
total_inserted = sql_engine.insert_records(
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 1322, in insert_records
return table.insert(chunksize=chunksize, method=method)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 950, in insert
num_inserted = exec_insert(conn, keys, chunk_iter)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\snowflake\connector\pandas_tools.py", line 320, in pd_writer
df = pandas.DataFrame(data_iter, columns=keys)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\snowflake\connector\options.py", line 36, in __getattr__
raise MissingDependencyError(self._dep_name)
snowflake.connector.errors.MissingDependencyError: Missing optional dependency: pandas
Process finished with exit code 1
I believe the following dependency install step has not been completed: https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation
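Per that page, pandas support is an optional extra of the connector. A minimal sketch, installing the extra into the same interpreter that runs the script (in case pip on PATH points at a different environment):

import subprocess
import sys

# Install the connector's pandas extra into this exact interpreter's environment;
# it also pulls in a compatible pyarrow.
subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "snowflake-connector-python[pandas]"]
)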
I'm starting on a project using Clarifai. However, when I define the app, I'm getting a key error:
from clarifai.rest import ClarifaiApp
from clarifai.rest import Image as ClImage
import os
from glob import glob
api_key = 'my api key'
app = ClarifaiApp(api_key=api_key) # Error occurs here
model_id = 'model id'
concepts = ['concept1', 'concept2', 'concept3']
Traceback (most recent call last):
File "C:\Users\crayo\uShoe\main.py", line 6, in <module>
app = ClarifaiApp(api_key=api_key)
File "C:\Users\user\project\venv\lib\site-packages\clarifai\rest\client.py", line 124, in __init__
self.models = Models(self.api, self.solutions) # type: Models
File "C:\Users\user\project\venv\lib\site-packages\clarifai\rest\client.py", line 1068, in __init__
self.model_id_cache = self.init_model_cache()
File "C:\Users\user\project\venv\lib\site-packages\clarifai\rest\client.py", line 1088, in init_model_cache
model_type = m.output_info['type']
KeyError: 'type'
I'm not sure what's causing this error, so if someone could provide input I'd appreciate it! Thanks!
You are using the deprecated Python REST package: https://github.com/Clarifai/clarifai-python. Please replace your code with the new & updated Python gRPC client: https://github.com/Clarifai/clarifai-python-grpc
Make sure to uninstall the REST package to avoid conflicts.
You can find our API docs & code snippets here: https://docs.clarifai.com
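For reference, a minimal predict call with the gRPC client, following the patterns in those docs; the API key, model ID, and image URL below are placeholders:

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

# Open a channel and authenticate each call with the API key.
channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)
metadata = (('authorization', 'Key my_api_key'),)  # placeholder key

request = service_pb2.PostModelOutputsRequest(
    model_id='model_id',  # placeholder model ID
    inputs=[
        resources_pb2.Input(
            data=resources_pb2.Data(
                image=resources_pb2.Image(url='https://example.com/image.jpg')  # placeholder
            )
        )
    ],
)
response = stub.PostModelOutputs(request, metadata=metadata)

if response.status.code != status_code_pb2.SUCCESS:
    raise RuntimeError(response.status.description)
for concept in response.outputs[0].data.concepts:
    print(concept.name, concept.value)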
I'm trying to create an exe file which takes advantage of teradataml. I'm trying to create a table in Teradata and import the data from a pandas DataFrame.
Here is my code:
import pandas as pd
from sqlalchemy import create_engine
from teradataml.context.context import *
from sqlalchemy import *
from teradataml.dataframe.copy_to import copy_to_sql
from sqlalchemy.dialects import registry
from teradatasqlalchemy import dialect

# Register the Teradata dialect so create_engine can resolve 'teradata://'.
registry.register('teradata', 'teradatasqlalchemy', 'dialect')

user = 'dbc'
pasw = user
host = '192.168.1.7'

td_engine = create_engine('teradata://' + user + ':' + pasw + '@' + host)
create_context(tdsqlengine=td_engine)

df = pd.read_csv(r"C:/krishna/data/FL_insurance_sample1.csv", delimiter=',')
copy_to_sql(df=df, table_name="Insurece_sample", primary_index="InsurenceID", if_exists="replace")
remove_context()
Initially I was getting the below error; however, I fixed that one:
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:teradata
The pyinstaller command which I tried:
pyinstaller --add-binary "C:\Users\krishna\AppData\Local\Programs\Python\Python38\Lib\site-packages\teradatasql\teradatasql.dll;teradatasql" -F pyinstalletest.py
The error which I'm getting now:
Traceback (most recent call last):
File "pyinstalletest.py", line 18, in <module>
File "teradataml\context\context.py", line 459, in create_context
File "teradataml\context\context.py", line 751, in _load_function_aliases
File "teradataml\common\utils.py", line 1591, in _check_alias_config_file_exists
teradataml.common.exceptions.TeradataMlException: [Teradata][teradataml](TDML_2069) Alias config file 'C:\Users\krishna\AppData\Local\Temp\_MEI63962\teradataml\config\mlengine_alias_definitions_v1.0' is not defined for the current Vantage version 'vantage1.0'. Please add the config file.
[1660] Failed to execute script pyinstalletest
Please help me to resolve the error.
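The traceback points at teradataml's bundled config directory missing from the one-file build. A hedged variant of the command above that also packs that directory in via PyInstaller's --add-data flag (the teradataml path is an assumption, mirroring the --add-binary path):

pyinstaller --add-binary "C:\Users\krishna\AppData\Local\Programs\Python\Python38\Lib\site-packages\teradatasql\teradatasql.dll;teradatasql" --add-data "C:\Users\krishna\AppData\Local\Programs\Python\Python38\Lib\site-packages\teradataml\config;teradataml\config" -F pyinstalletest.py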
While connecting to Hive2 using Python with the below code:
import pyhs2

with pyhs2.connect(host='localhost',
                   port=10000,
                   authMechanism="PLAIN",
                   user='root',
                   password='test',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Show databases
        print(cur.getDatabases())

        # Execute query
        cur.execute("select * from table")

        # Return column info from query
        print(cur.getSchema())

        # Fetch table results
        for i in cur.fetch():
            print(i)
I am getting the below error:
File "C:\Users\vinbhask\AppData\Roaming\Python\Python36\site-packages\pyhs2-0.6.0-py3.6.egg\pyhs2\connections.py", line 7, in <module>
    from cloudera.thrift_sasl import TSaslClientTransport
ModuleNotFoundError: No module named 'cloudera'
I have tried here and here but the issue wasn't resolved.
Here are the packages installed till now:
bitarray 0.8.1, certifi 2017.7.27.1, chardet 3.0.4, cm-api 16.0.0, cx-Oracle 6.0.1, future 0.16.0, idna 2.6, impyla 0.14.0, JayDeBeApi 1.1.1, JPype1 0.6.2, ply 3.10, pure-sasl 0.4.0, PyHive 0.4.0, pyhs2 0.6.0, pyodbc 4.0.17, requests 2.18.4, sasl 0.2.1, six 1.10.0, teradata 15.10.0.21, thrift 0.10.0, thrift-sasl 0.2.1, thriftpy 0.3.9, urllib3 1.22
Error while using Impyla:
Traceback (most recent call last):
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\Scripts\HiveConnTester4.py", line 1, in <module>
from impala.dbapi import connect
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\impala\dbapi.py", line 28, in <module>
import impala.hiveserver2 as hs2
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\impala\hiveserver2.py", line 33, in <module>
from impala._thrift_api import (
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\impala\_thrift_api.py", line 74, in <module>
include_dirs=[thrift_dir])
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thriftpy\parser\__init__.py", line 30, in load
include_dir=include_dir)
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thriftpy\parser\parser.py", line 496, in parse
url_scheme))
thriftpy.parser.exc.ThriftParserError: ThriftPy does not support generating module with path in protocol 'c'
thrift_sasl.py is trying to import cStringIO, which is no longer available in Python 3. Try with Python 2?
You may need to install an unreleased version of thrift_sasl. Try:
pip install git+https://github.com/cloudera/thrift_sasl
If you're comfortable learning PySpark, then you just need to set the hive.metastore.uris property to point at the Hive metastore address, and you're ready to go.
The easiest way to do that would be to export the hive-site.xml from your cluster, then pass --files hive-site.xml during spark-submit.
(I haven't tried running standalone PySpark, so YMMV.)
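A minimal sketch of that approach; the metastore URI is a placeholder, and when hive-site.xml is passed via --files the explicit .config line can be dropped:

from pyspark.sql import SparkSession

# Point the session at the Hive metastore and enable Hive support.
spark = (
    SparkSession.builder
    .appName('hive-query')
    .config('hive.metastore.uris', 'thrift://metastore-host:9083')  # placeholder address
    .enableHiveSupport()
    .getOrCreate()
)

spark.sql('select * from table').show()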
I'm pretty new to Python and I'm trying to connect to Smartsheet with the API.
I have run "pip install smartsheet-python-sdk" and it installed smartsheet, as I can find it under "lib".
This is code I found that is supposed to work (I replaced the token with my token):
# Import.
import smartsheet

# Instantiate smartsheet and specify access token value.
smartsheet = smartsheet.Smartsheet('Token_here')

# Get all columns.
action = smartsheet.Sheets.get_columns('Template for Bram', include_all=True)
columns = action.data

# For each column, print Id and Title.
for col in columns:
    print(col.id)
    print(col.title)
    print('')
It shows this error:
Traceback (most recent call last):
File "C:\Users\bram\Desktop\smartsheet.py", line 2, in <module>
import smartsheet
File "C:\Users\bram\Desktop\smartsheet.py", line 5, in <module>
smartsheet = smartsheet.Smartsheet('token_here')
AttributeError: 'module' object has no attribute 'Smartsheet'
Now I'm not sure what my next step is. I think I have followed all of the appropriate steps. When I run import smartsheet by itself it won't error out.
What am I doing wrong?
Thank you
Update: After using the code from the GitHub page and implementing my token and sheet ID, I get this error:
Traceback (most recent call last):
File "C:\Users\bvanhout\Desktop\test23.py", line 58, in <module>
sheet = ss.Sheets.get_sheet(sheet_id)
File "C:\Python27\lib\site-packages\smartsheet\sheets.py", line 460, in get_sheet
response = self._base.request(prepped_request, expected, _op)
File "C:\Python27\lib\site-packages\smartsheet\smartsheet.py", line 178, in request
res = self.request_with_retry(prepped_request, operation)
File "C:\Python27\lib\site-packages\smartsheet\smartsheet.py", line 242, in request_with_retry
return self._request(prepped_request, operation)
File "C:\Python27\lib\site-packages\smartsheet\smartsheet.py", line 210, in _request
raise UnexpectedRequestError(rex.request, rex.response)
UnexpectedRequestError: (<PreparedRequest [GET]>, None)
# TODO: Update this with the ID of your sheet to update
sheet_id = 48568543424234
I printed ss and ss.Sheets and both do not reflect the actual token or sheet_id
>>> print (ss.Sheets)
<smartsheet.sheets.Sheets object at 0x0000000003874438>
I suspect the problem is that you are using a local variable with the same name as the module ('smartsheet').
Please take a look at the sample here: https://github.com/smartsheet-samples/python-read-write-sheet
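A minimal sketch of the fix: rename the script so it no longer shadows the package (the first traceback shows it is saved as Desktop\smartsheet.py, so import smartsheet resolves to the script itself) and keep the client in a differently named variable. The token and sheet ID are placeholders:

# Saved as e.g. smartsheet_test.py (not smartsheet.py) so that
# "import smartsheet" resolves to the SDK rather than this file.
import smartsheet

ss = smartsheet.Smartsheet('Token_here')  # placeholder token

sheet = ss.Sheets.get_sheet(48568543424234)  # placeholder sheet ID
for col in ss.Sheets.get_columns(sheet.id, include_all=True).data:
    print(col.id, col.title)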