I am working with a Teradata table that has a TIMESTAMP(6) column, with data that looks like this:
2/14/2019 13:09:51.210000
Currently I have a Python time variable that I want to send into the Teradata table via SQL; it looks like this:
from datetime import datetime
time = datetime.now().strftime("%m/%d/%Y %H:%M:%S")
02/14/2019 13:23:24
How can I reformat that so it inserts correctly? It errors out with:
teradata.api.DatabaseError: (6760, '[22008] [Teradata][ODBC Teradata Driver][Teradata Database](-6760)Invalid timestamp.')
I tried using the same format the Teradata timestamp column uses:
time = datetime.now().strftime("%mm/%dd/%YYYY %HH24:%MI:%SS")
Same error message
Thanks
Figured it out. It turned out to be unrelated to the timestamp format; I had to change the data type of the DataFrame column it was being read from, which fixed it:
final_result_set['RECORD_INSERTED'] = pd.to_datetime(final_result_set['RECORD_INSERTED'])
Now when looping through and inserting via SQL, the following worked fine for populating 'RECORD_INSERTED':
time = datetime.now().strftime("%m/%d/%Y %H:%M:%S")
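Putting the pieces together, a minimal sketch of the flow that worked (session is assumed to be an open teradata connection, and the table name my_table is a placeholder):

import pandas as pd
from datetime import datetime

# cast the source column to real datetimes; this was the actual fix
final_result_set['RECORD_INSERTED'] = pd.to_datetime(
    final_result_set['RECORD_INSERTED'])

# the same strftime format as before now inserts cleanly
time = datetime.now().strftime("%m/%d/%Y %H:%M:%S")

# parameterized insert via the PyTd session
session.execute(
    "INSERT INTO my_table (RECORD_INSERTED) VALUES (?)", (time,))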
Sorry for the confusion
Related
I have a requirement to export a pandas DataFrame into a Teradata temp table.
I usually connect to Teradata with udaExec.
The temp table has to be created on the fly while loading the data, since the DataFrame might have 100 columns today and 200 columns tomorrow due to the instability of the arriving data, so I'm afraid I can't create a DDL up front and then load.
Please suggest.
You can use the pandas to_sql function (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html).
You'll need to pass a SQLAlchemy connection and the name of your SQL table,
something like this:
my_pandas_df.to_sql(
    name="my_table_in_teradata",
    con=my_connection_to_teradata,
    if_exists="replace",
)
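For the connection itself, a hedged sketch using the teradatasqlalchemy dialect (the host and credentials are placeholders; udaExec's ODBC session is not a SQLAlchemy connection, so an engine like this would be needed):

from sqlalchemy import create_engine

# placeholder host and credentials, assuming the teradatasqlalchemy dialect is installed
my_connection_to_teradata = create_engine(
    "teradatasql://username:password@tdhost")

my_pandas_df.to_sql(
    name="my_table_in_teradata",
    con=my_connection_to_teradata,
    if_exists="replace",  # drop and recreate, so a changed column count is absorbed
)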
I read from an API the following data into a pandas dataframe:
Now, I want to write this data into a MySQL-DB-table, using pandas to_sql:
In MySQL, the column is set up correctly, but the values have not been written:
Then I looked at the DataFrame in the debugger:
I thought it might be a formatting issue, so I added the following lines:
In the debugger, it now looks fine:
But now the database wants to write the index column as text
... and interrupts the execution with an error:
Is there a way to get this working, i.e. to write the DataFrame's index as dates into a MySQL DB using pandas to_sql with a SQLAlchemy engine?
Edit:
Table schema:
DataFrame Header:
It seems you are using the Date column as the primary key. I would suggest not using it alone as the primary key; instead, use Date + Ticker as a composite primary key.
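On the index question itself, a hedged sketch (df, engine, and the table name stand in for your objects): convert the index to datetimes before writing, and pin its SQL type through to_sql's dtype argument:

import pandas as pd
import sqlalchemy

# make sure the index holds datetimes, not strings
df.index = pd.to_datetime(df.index)
df.index.name = "Date"

df.to_sql(
    "my_table",            # placeholder table name
    con=engine,            # your SQLAlchemy engine
    if_exists="append",
    index=True,
    index_label="Date",
    dtype={"Date": sqlalchemy.Date},  # force a DATE column instead of TEXT
)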
Issue:
I stored some dates as text in my Postgres table, and I want to convert them to actual dates, again in Postgres.
I'm not sure if there is a better way to do this or what I'm doing wrong. I have pulled a bunch of data into a PostgreSQL database in plain text format, and as a result I need to go back through and clean it up. I am running into issues with the date data: I need to convert it into a format that PostgreSQL can use. I went about pulling it back into Python, converting it, and kicking it back. Is this the best way to do this? I am also having an issue with datetime.strptime; I believe I've got the directive correct, but no go. :/
import psycopg2
from datetime import datetime

# connect to the PostgreSQL database
conn = psycopg2.connect(
    "dbname='postgres' user='postgres' host=10.0.75.1 password='mysecretpassword'")

# create a new cursor
cur = conn.cursor()
cur.execute("""SELECT "Hash","Date" FROM nas """)

# fetch all rows
myDate = cur.fetchall()
for rows in myDate:
    target = rows[1]
    datetime.strptime(target, '%B %d, %Y, %H:%M:%S %p %Z')
Here is a Postgres query which can convert your strings into actual timestamps:
select
    ts_col,
    to_timestamp(ts_col, 'Month DD, YYYY HH:MI:SS PM')::timestamp with time zone
from your_table;
For a full solution, you might take the following steps:
create a new timestamp column ts_col_new in your table
update that column using the logic from the above query
then delete the old column containing text
The update might look something like this:
update your_table
set ts_col_new = to_timestamp(ts_col, 'Month DD, YYYY HH:MI:SS PM')::timestamp with time zone;
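Driving the same steps from Python with psycopg2 might look like the sketch below (table and column names follow the question above; treat them as placeholders):

import psycopg2

conn = psycopg2.connect(
    "dbname='postgres' user='postgres' host=10.0.75.1 password='mysecretpassword'")
cur = conn.cursor()

# 1. add the new timestamp column
cur.execute("ALTER TABLE nas ADD COLUMN ts_col_new timestamp with time zone")

# 2. fill it by parsing the existing text column
cur.execute("""UPDATE nas
               SET ts_col_new = to_timestamp("Date", 'Month DD, YYYY HH:MI:SS PM')""")

# 3. drop the old text column and take over its name
cur.execute('ALTER TABLE nas DROP COLUMN "Date"')
cur.execute('ALTER TABLE nas RENAME COLUMN ts_col_new TO "Date"')

conn.commit()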
I'm using the PyTd teradata module to query data from Teradata and want to read it into a Pandas DataFrame
import teradata
import pandas as pd
# teradata connection
udaExec = teradata.UdaExec(appName="Example", version="1.0",
                           logConsole=False)
session = udaExec.connect(method="odbc", system="", username="", password="")
# Create empty dataframe with column names
query = session.execute("SELECT TOP 1 * FROM table")
cols = [str(d[0]) for d in query.description]
df = pd.DataFrame(columns=cols)
# Read data into dataframe
for row in session.execute("SELECT * FROM table"):
    print type(row)
    df.append(row)
row is of the teradata.util.Row class and can't be appended to the DataFrame. I tried converting it to a list, but the format gets messed up.
How can I read my data into a dataframe from Teradata using the teradata module? I'm not able to use the pyodbc module for this.
Is there a better way to create the empty dataframe with column names matching those in the database?
You can use pandas.read_sql :)
import teradata
import pandas as pd
# teradata connection
udaExec = teradata.UdaExec(appName="Example", version="1.0",
                           logConsole=False)
with udaExec.connect(method="odbc", system="", username="", password="") as session:
    query = "SELECT * FROM table"
    df = pd.read_sql(query, session)
Using 'with' ensures the session is closed after the query. I hope that helped :)
I know it's a little late, but I'm putting a note here nevertheless.
There are a few questions here.
How can I read my data into a dataframe from Teradata using the teradata module?
At the end of the day, a teradata.util.Row behaves like a plain list, so a simple list operation should help you get things out of a Row.
','.join(str(item) for item in row)
kinda thing.
Pushing that into a pandas dataframe should be a list to df conversion exercise.
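For example, a rough sketch of that conversion, reusing the cols list built from query.description above:

# collect the Row objects as plain lists, then build the DataFrame in one go
rows = [list(row) for row in session.execute("SELECT * FROM table")]
df = pd.DataFrame(rows, columns=cols)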
I'm not able to use the pyodbc module for this.
I used Teradata's Python module to do LDAP auth. All worked fine, so I didn't have this requirement. Sorry.
Is there a better way to create the empty dataframe with column names matching those in the database?
I assume, given a table name, you can query the database to figure out its schema (column names), convert that to a list, and create your pandas df?
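Something along those lines might work by reading Teradata's DBC.ColumnsV dictionary view (the table name is a placeholder, and ColumnName comes back padded, hence the strip):

# fetch the column names for the table from the data dictionary
cols = [r[0].strip() for r in session.execute(
    "SELECT ColumnName FROM DBC.ColumnsV "
    "WHERE TableName = 'table' ORDER BY ColumnId")]
df = pd.DataFrame(columns=cols)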
I know this is very late.
You can use read_sql() from the pandas module. It returns a pandas DataFrame.
Here is the reference:
http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.read_sql.html
Given an Oracle table created using the following:
CREATE TABLE Log(WhenAdded TIMESTAMP(6) WITH TIME ZONE);
Using the Python ODBC module from its Win32 extensions (from the win32all package), I tried the following:
import dbi, odbc
connection = odbc.odbc("Driver=Oracle in OraHome92;Dbq=SERVER;Uid=USER;Pwd=PASSWD")
cursor = connection.cursor()
cursor.execute("SELECT WhenAdded FROM Log")
results = cursor.fetchall()
When I run this, I get the following:
Traceback (most recent call last):
...
results = cursor.fetchall()
dbi.operation-error: [Oracle][ODBC][Ora]ORA-00932: inconsistent datatypes: expected %s got %s
in FETCH
The other data types I've tried (VARCHAR2, BLOB) do not cause this problem. Is there a way of retrieving timestamps?
I believe this is a bug in the Oracle ODBC driver. Basically, the Oracle ODBC driver does not support the TIMESTAMP WITH (LOCAL) TIME ZONE data types, only the TIMESTAMP data type. As you have discovered, one workaround is in fact to use the TO_CHAR method.
In your example you are not actually reading the time zone information. If you have control of the table you could convert it to a straight TIMESTAMP column. If you don't have control over the table, another solution may be to create a view that converts from TIMESTAMP WITH TIME ZONE to TIMESTAMP via a string - sorry, I don't know if there is a way to convert directly from TIMESTAMP WITH TIME ZONE to TIMESTAMP.
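A hedged sketch of that view-based workaround, round-tripping through a string with TO_CHAR and TO_TIMESTAMP (the view name and format mask are assumptions):

# create a view exposing the column as a plain TIMESTAMP,
# since the driver chokes on TIMESTAMP WITH TIME ZONE
cursor.execute("""
    CREATE VIEW LogView AS
    SELECT TO_TIMESTAMP(TO_CHAR(WhenAdded, 'YYYY-MM-DD HH24:MI:SS.FF6'),
                        'YYYY-MM-DD HH24:MI:SS.FF6') AS WhenAdded
    FROM Log
""")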
My solution to this, which I hope can be improved upon, is to use Oracle to explicitly convert the TIMESTAMP into a string:
cursor.execute("SELECT TO_CHAR(WhenAdded, 'YYYY-MM-DD HH:MI:SSAM') FROM Log")
This works, but isn't portable. I'd like to use the same Python script against a SQL Server database, so an Oracle-specific solution (such as TO_CHAR) won't work.