Batch unload using teradatasql python module

I recently installed the teradatasql Python module. When I run a batch select against a table, it does not return all of the output. How can I select multiple records at a time?
import teradatasql

with teradatasql.connect('{"host":"whomooz","user":"guest","password":"please"}') as con:
    with con.cursor() as cur:
        cur.fast_executemany = True
        cur.execute("select * from table where userid=? and username=?", [
            [1, "abc"],
            [2, "def"],
            [3, "ghi"]])
        print(cur.fetchall())

It sounds like you are trying to specify multiple possible column values to match, and you want a single output result set that contains all matching rows. That kind of query would look like the following, and it would produce a single result set:
select * from table
where userid=1 and username='abc'
or userid=2 and username='def'
or userid=3 and username='ghi'
In contrast, when you bind multiple rows of parameter values to question-mark parameter markers, then you will get multiple output result sets. That is how the database works. Each row of bound parameter values is treated as a separate SQL statement that is executed by the database.
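If a single result set is what you want, one way to keep using parameter markers is to build the OR-ed condition for however many pairs you have and pass the values as one flat row. This is only a sketch: the helper name build_pair_filter_query is made up for illustration, and "table" is the placeholder table name from the question.

```python
def build_pair_filter_query(pairs):
    """Return (sql, params) matching any of the given (userid, username) pairs."""
    if not pairs:
        raise ValueError("at least one (userid, username) pair is required")
    # One parenthesized condition per pair, OR-ed together into a single statement.
    condition = "(userid=? and username=?)"
    where_clause = " or ".join([condition] * len(pairs))
    sql = "select * from table where " + where_clause
    # Flatten the pairs into ONE row of parameter values, so the database
    # runs one statement and returns one result set.
    params = [value for pair in pairs for value in pair]
    return sql, params


sql, params = build_pair_filter_query([(1, "abc"), (2, "def"), (3, "ghi")])
# With an open cursor this would be run as: cur.execute(sql, params)
```

Because params is a single flat list rather than a list of lists, it counts as one row of bound values, so one fetchall() should return every matching row.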

Related

How do I add timestamp (using GETDATE()) to my insert statement?

I'm trying to figure out how to add a timestamp to my database table. df2 doesn't include any column for time, so I'm trying to create the value either in values_ or when I execute the SQL. I want to use the Redshift GETDATE() function.
values_ = ', '.join([f"('{str(i.columnA)}','{str(i.columnB)}','{str(i.columnC)}','{str(i.columnD)}', 'GETDATE()')" for i in df2.itertuples()])
sqlexecute(f'''insert into table.table2 (columnA, columnB, columnC, columnD, time_)
values
({values_})
;
''')
This is one of several errors I get depending on where I put GETDATE()
FeatureNotSupported: ROW expression, implicit or explicit, is not supported in target list

The "INSERT ... VALUES (...)" construct is for inserting literals into a table and getdate() is not a literal. However, there are a number of ways to get this to work. A couple of easy ways are:
You can make the default value of the column 'time_' be getdate() and then just use the keyword default in the insert values statement. This tells Redshift to use the default for the column (getdate()).
insert into table values ('A', 'B', 3, default)
You could switch to a "INSERT ... SELECT ..." construct which will allow you to have a mix of literals and function calls.
insert into table (select 'A', 'B', 3, getdate())
NOTE: inserting row by row into a table in Redshift can be slow and can make a mess of the table if the number of rows being inserted is large. This is compounded if auto-commit is on, because each insert is committed separately and has to work its way through the commit queue. If you are inserting a large amount of data, you should write it to an S3 object and COPY it into Redshift, or at least bundle 100+ rows of data into a single insert statement (with auto-commit off and an explicit commit at the end).
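To make the bundling advice concrete, here is a rough sketch that splits rows into chunks and builds one multi-row insert per chunk, using the default keyword for the timestamp column. It assumes the time_ column was created with getdate() as its default and that your driver uses %s parameter markers (as psycopg2 does). The helper names chunked and build_batch_insert are hypothetical.

```python
def chunked(rows, size):
    """Yield consecutive slices of rows, each with at most size items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_batch_insert(table, columns, time_column, rows):
    """Return (sql, params) for one multi-row insert.

    The time column is filled with the keyword default, so the database
    applies the column default (assumed here to be getdate()).
    """
    row_marker = "(" + ", ".join(["%s"] * len(columns)) + ", default)"
    sql = "insert into {} ({}, {}) values {}".format(
        table,
        ", ".join(columns),
        time_column,
        ", ".join([row_marker] * len(rows)),
    )
    # Flatten the rows so the values line up with the %s markers in order.
    params = [value for row in rows for value in row]
    return sql, params


rows = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
statements = [
    build_batch_insert("table2", ["columnA", "columnB"], "time_", chunk)
    for chunk in chunked(rows, 2)
]
```

Each (sql, params) pair would then be passed to cursor.execute with auto-commit off, followed by a single commit at the end. Passing the values as parameters also avoids the quoting problems of building the values string by hand.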

When I created the table I added a time_log column using timestamp.
drop table if exists table1;
create table table1(
column1 varchar (255),
column2 varchar(255),
time_log timestamp
);
The issue was that I had parentheses around the values in my insert statement. Remove those, so that the statement uses {values_} rather than ({values_}), and it will work.
sqlexecute(f'''insert into table.table2 (columnA, columnB, time_log)
values
{values_}
;
''')

Pass BOTH single bind variable and list variable to SQL query cx_Oracle Python

I have an Oracle SQL query:
SELECT * from table1 WHERE deliveredDate = ? AND productID IN (?,?,?,...);
I would like to pass a single variable to deliveredDate and a list of unknown length to productID, using cx_Oracle and Python.
From the Oracle Using Bind guide (https://cx-oracle.readthedocs.io/en/latest/user_guide/bind.html) I understand that you can bind either the single variable or list of items, but I'm not sure if we can bind both.
Please help me with this issue.
Thank you.

Of course you can, but convert the notation for bind variables from ? to integers preceded by a colon, such as
import cx_Oracle
import datetime
conn = cx_Oracle.connect('un/pwd@ip:port/db')
cur = conn.cursor()
sql = """
SELECT *
FROM table1
WHERE deliveredDate = :0 AND productID IN (:1,:2)
"""
cur.execute(sql,[datetime.datetime(2022, 5, 3),1,2])
res = cur.fetchall()
print(res)

The key part of your question was the 'unknown length' for the IN clause. The cx_Oracle documentation Binding Multiple Values to a SQL WHERE IN Clause shows various solutions each with some pros & cons depending on size of the list and the number of times the statement will be executed. For most cases you will not want to bind to a single placeholder in your statement IN list because of performance implications. If there is an upper bound on the size of the IN list, then put that many placeholders and bind None for all unknown values. The doc example explains it better:
cursor.execute("""
        select employee_id, first_name, last_name
        from employees
        where last_name in (:name1, :name2, :name3, :name4, :name5)""",
        name1="Smith", name2="Taylor", name3=None, name4=None, name5=None)
for row in cursor:
    print(row)
(This uses keyword parameters to match the bind placeholders, but you can use a list instead).
Other solutions are shown in that doc link.
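Applied to the query in the question, the padding approach might look roughly like this. It is a sketch only: the upper bound of 5 is an arbitrary assumption, and build_query and pad_in_list are hypothetical helper names.

```python
import datetime


def build_query(upper_bound):
    """Build the statement with :1 for the date and upper_bound IN placeholders."""
    markers = ", ".join(":{}".format(i) for i in range(2, upper_bound + 2))
    return (
        "SELECT * FROM table1 "
        "WHERE deliveredDate = :1 AND productID IN ({})".format(markers)
    )


def pad_in_list(values, upper_bound):
    """Pad values with None so there is one bind value per IN placeholder."""
    if len(values) > upper_bound:
        raise ValueError("more values than placeholders")
    return list(values) + [None] * (upper_bound - len(values))


UPPER_BOUND = 5  # assumed maximum number of product IDs
sql = build_query(UPPER_BOUND)
binds = [datetime.datetime(2022, 5, 3)] + pad_in_list([1, 2], UPPER_BOUND)
# With an open cursor this would be run as: cur.execute(sql, binds)
```

Since the statement text stays the same for every list length up to the bound, the database can reuse the parsed statement instead of parsing a new one each time. The None placeholders bind as NULL, which never matches in an IN list.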

Inputting a Python list into Teradata SQL

I'm having a problem executing this SQL statement with a Python list injection. I'm new to Teradata SQL, and I'm not sure if this is the appropriate syntax for injecting a list into the where clause.
conn = teradatasql.connect(host='PROD', user='1234', password='1234', logmech='LDAP')
l = ["Comp-EN Routing", "Comp-COLLABORATION"]
l2 = ["PEO", "TEP"]
l3 = ["TCV"]
crsr = conn.cursor()
query = """SELECT SOURCE_ORDER_NUMBER
FROM DL_.BV_DETAIL
WHERE (LEVEL_1 IN ? AND LEVEL_2 IN ?) or LEVEL_3 IN ?"""
crsr.executemany(query, [l,l2,l3])
conn.autocommit = True
I keep getting this error
Version 17.0.0.2] [Session 308831600] [Teradata Database] [Error 3939] There is a mismatch between the number of parameters specified and the number of parameters required.

Late to answer this, but if I found the question someone else will in the future too.
executemany in teradatasql requires that second parameter to be a "sequence of sequences". The most common type of sequence we generally use in Python is a list. Essentially you need a list that contains, for each element in the list, a list.
In your case this may look like:
myListOfLists=[['level1valueA','level2valueA','level3valueA'],['level1valueB','level2valueB','level3valueB']]
Your SQL statement will be executed twice, once for each list in your list.
In your case, though, I suspect you want to find any combination of the values stored in your three lists. That is an entirely different ball of wax and is going to take some creativity: either generate a list of lists with all possible combinations and submit it to executemany, or construct a SQL statement that can take in multiple comma-delimited lists of values, form a Cartesian product, and test for hits.
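A simpler variant of the second idea, if all you need is the IN semantics from the original query, is to expand each IN ? into one question mark per list element and bind all the values as a single flat row. This is a sketch of that idea, not something taken from the teradatasql documentation. The helper name expand_in_markers is hypothetical.

```python
def expand_in_markers(sql_template, value_lists):
    """Fill each {} in sql_template with a parenthesized list of ? markers.

    Returns (sql, params), where params is the flattened values in the same
    order as the markers.
    """
    if any(len(values) == 0 for values in value_lists):
        raise ValueError("every IN list needs at least one value")
    markers = ["(" + ", ".join("?" * len(values)) + ")" for values in value_lists]
    sql = sql_template.format(*markers)
    params = [value for values in value_lists for value in values]
    return sql, params


l = ["Comp-EN Routing", "Comp-COLLABORATION"]
l2 = ["PEO", "TEP"]
l3 = ["TCV"]
template = (
    "SELECT SOURCE_ORDER_NUMBER FROM DL_.BV_DETAIL "
    "WHERE (LEVEL_1 IN {} AND LEVEL_2 IN {}) OR LEVEL_3 IN {}"
)
sql, params = expand_in_markers(template, [l, l2, l3])
# With an open cursor this would be run as: crsr.execute(sql, params)
```

Here the number of question marks always equals the number of bound values, which is the mismatch that error 3939 complains about.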

I want to add something regarding SELECT statements and the executemany method: to retrieve all records returned by your query, you need to call .nextset() followed by .fetchall() repeatedly until .nextset() returns a false value. The first .fetchall() gives you only the first result set (for the first list of parameters specified).
...
with teradatasql.connect(connectionstring) as conn:
    with conn.cursor() as cur:
        cur.executemany("SELECT COL1 FROM THEDATABASE.THETABLE WHERE COL1 = ?;", [['A'], ['B']])
        result = cur.fetchall()  # only the rows matching 'A'
        if cur.nextset():
            result2 = cur.fetchall()  # the rows matching 'B'
...
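For an arbitrary number of parameter rows, the same pattern can be wrapped in a loop. The sketch below relies only on the standard DB API fetchall() and nextset() methods. FakeCursor is a hypothetical stand-in so the loop can be run without a database.

```python
def fetch_all_result_sets(cur):
    """Collect the rows of every remaining result set on a DB API cursor."""
    result_sets = [cur.fetchall()]
    # nextset() returns a true value while another result set is available.
    while cur.nextset():
        result_sets.append(cur.fetchall())
    return result_sets


class FakeCursor:
    """Hypothetical stand-in for a real cursor, used only for demonstration."""

    def __init__(self, result_sets):
        self._sets = list(result_sets)
        self._pos = 0

    def fetchall(self):
        return list(self._sets[self._pos])

    def nextset(self):
        if self._pos + 1 < len(self._sets):
            self._pos += 1
            return True
        return None


demo = fetch_all_result_sets(FakeCursor([[("A",)], [("B",)]]))
```

With a real cursor you would call fetch_all_result_sets(cur) right after cur.executemany(...), and get back one list of rows per row of bound parameters.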

Bulk insert with cx_Oracle: inconsistent data types

I'm trying to load a large number of records into an Oracle DB using Python and cx_Oracle. The consensus seems to be that you should prepare a cursor and executemany against a list of rows (per this post). So my code looks like:
stmt = "INSERT INTO table (address, shape) VALUES (:1, :2)"
cursor.prepare(stmt)
rows = []
# Make huge list of rows
cursor.executemany(None, rows)
The values I'm passing in look like this:
['1234 MARKET ST', "SDE.ST_Geometry('POINT (0 0)', 2272)"]
The problem is that the SDE.ST_Geometry() database function is being treated as a literal string rather than being evaluated, so I get a cx_Oracle.DatabaseError: ORA-00932: inconsistent datatypes: expected SDE.ST_GEOMETRY got CHAR.
Is it not possible to pass in database functions to a prepared cursor with cx_Oracle?

The brief answer is that since you are passing a string, it is treated as a string. Bind values are only ever treated as data.
But look at the not-yet-released cx_Oracle main line at https://bitbucket.org/anthony_tuininga/cx_oracle. It has new object support.
And look at this commit "Added example for creating SDO_GEOMETRY.":
https://bitbucket.org/anthony_tuininga/cx_oracle/commits/2672c799d987a8901ac1c4917e87ae4101a1d605

In the end I got around this by embedding the function call in the prepared statement:
stmt = "INSERT INTO table (address, shape) VALUES (:1, ST_Geometry(:2, 2272))"
Hat tip to PatrickMarchand for the hint.

Wildcards in column name for MySQL

I am trying to select multiple columns, but not all of the columns, from the database. All of the columns I want to select are going to start with "word".
So in pseudocode I'd like to do this:
SELECT "word%" from searchterms where onstate = 1;
More or less. I am not finding any documentation on how to do this - is it possible in MySQL? Basically, I am trying to store a list of words in a single row, with an identifier, and I want to associate all of the words with that identifier when I pull the records. All of the words are going to be joined as a string and passed to another function in an array/dictionary with their identifier.
I am trying to make as FEW database calls as possible to keep speedy code.
Ok, here's another question for you guys:
There are going to be a variable number of columns with the name "word" in them. Would it be faster to do a separate database call for each row, with a generated Python query per row, or would it be faster to simply SELECT *, and only use the columns I needed? Is it possible to say SELECT * NOT XYZ?

No, SQL doesn't provide you with any syntax to do such a select.
What you can do is ask MySQL for a list of column names first, then generate the SQL query from that information.
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'your_table'
AND column_name LIKE 'word%'
lets you select the column names. Then you can do, in Python:
"SELECT * FROM your_table WHERE " + ' AND '.join(['%s = 1' % name for name in columns])
Rather than using string concatenation, I would recommend using SQLAlchemy to do the SQL generating for you.
However, if all you are doing is limiting the number of columns, there is no need to do a dynamic query like this at all. The hard work for the database is selecting the rows; it makes little difference to send you 5 columns out of 10, or all 10.
In that case just use a "SELECT * FROM ..." and use Python to pick out the columns from the result set.
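Picking the columns out in Python could look something like the sketch below. It assumes a DB API cursor, whose description attribute is a sequence of tuples with the column name first. The helper name pick_columns and the sample data are made up for illustration.

```python
def pick_columns(description, rows, prefix):
    """Keep only the columns whose name starts with prefix.

    description follows the DB API cursor.description layout, where the
    first item of each entry is the column name.
    """
    names = [column[0] for column in description]
    wanted = [index for index, name in enumerate(names) if name.startswith(prefix)]
    return [{names[index]: row[index] for index in wanted} for row in rows]


# Hypothetical data, shaped like cursor.description and cursor.fetchall().
description = [("id",), ("word1",), ("word2",), ("onstate",)]
rows = [(7, "alpha", "beta", 1), (8, "gamma", None, 1)]
picked = pick_columns(description, rows, "word")
```

With a real cursor you would call pick_columns(cur.description, cur.fetchall(), "word") after executing the SELECT * query.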

No, you cannot dynamically produce the list of columns to be selected. It will have to be hardcoded in your final query.
Your current query would produce a result set with one column and the value of that column would be the string "word%" in all rows that satisfy the condition.

You can generate the list of column names first by using
SHOW COLUMNS IN tblname LIKE "word%"
Then loop through the cursor and generate a SQL statement that uses all the columns from the query above.
"SELECT {0} FROM searchterms WHERE onstate = 1".format(', '.join(columns))
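Since column names cannot be passed as bound parameters, it is worth checking them before formatting them into the statement. Here is a small sketch of that, assuming the column names contain only letters, digits, and underscores. The helper name build_select is hypothetical.

```python
import re

# Assumption: valid column and table names use only letters, digits, underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def build_select(columns, table="searchterms"):
    """Build the SELECT statement from a list of validated column names."""
    if not columns:
        raise ValueError("no matching columns")
    for name in list(columns) + [table]:
        if not _IDENTIFIER.match(name):
            raise ValueError("unexpected identifier: {!r}".format(name))
    # Backticks are MySQL's identifier quoting.
    quoted = ", ".join("`{}`".format(name) for name in columns)
    return "SELECT {} FROM `{}` WHERE onstate = 1".format(quoted, table)


query = build_select(["word1", "word2", "word3"])
```

The columns list would come from the first field of each row returned by the SHOW COLUMNS query.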

This could be helpful: MySQL wildcard in select
In conclusion it is not possible in MySQL directly.
What you could do as a dirty workaround is get all the column names from the table with an initial query (http://dev.mysql.com/doc/refman/5.0/en/show-columns.html) and then check in Python whether each name matches your pattern. Afterwards you can run the MySQL select statement with the matching column names, like this:
SELECT word1, word2, word3 from searchterms where onstate = 1;
