Python3 + beatbox : Not able to queryMore - python

I have logged into my SFDC org using the instructions provided here: http://tomhayden3.com/2013/08/04/salesforce-python/. However, I am not able to implement the queryMore part of it; it just does nothing. When I print(query_locator) it prints an ID with a -500 suffix. Can someone please look at this code and highlight what I am doing wrong?
#!/usr/bin/env python3
import beatbox

# Connecting to SFDC
sf = beatbox._tPartnerNS
service = beatbox.Client()
service.serverUrl = 'https://test.salesforce.com/services/Soap/u/38.0'
service.login('my-username', 'my-password')

query_result = service.query("SELECT id, Name, Department FROM User")
records = query_result['records']  # dictionary of results!
total_records = query_result['size']  # full size of results
query_locator = query_result['queryLocator']  # get the mystical queryLocator

# loop through, pulling the next 500 and appending it to your records dict
while query_result['done'] is False and len(records) < total_records:
    query_result = self._service.queryMore(query_locator)
    query_locator = query_result['queryLocator']  # get the updated queryLocator
    records = records + query_result['records']  # append to records dictionary

print(records['id'])  # This should print all IDs??? But it is not.

The examples here resolved the issue for me.
https://github.com/superfell/Beatbox/blob/master/examples/export.py
#!/usr/bin/env python3
import beatbox
import sqlalchemy

engine_str = 'mysql+mysqlconnector://db-username:db-pass@localhost/db-name'
engine = sqlalchemy.create_engine(engine_str, echo=False, encoding='utf-8')
connection = engine.connect()

sf = beatbox._tPartnerNS
service = beatbox.Client()
service.serverUrl = 'https://test.salesforce.com/services/Soap/u/38.0'  # hard-coded since I was only testing against the sandbox

def export(objectSOQL):
    service.login('sfdc-username', 'sfdc-pass')
    query_result = service.query(objectSOQL)
    while True:
        for row in query_result[sf.records:]:
            SQL_query = 'INSERT INTO user(' \
                        'id, ' \
                        'name, ' \
                        'department) ' \
                        'VALUES(' \
                        '"{}","{}","{}")' \
                        .format(
                            row[2],
                            row[3],
                            row[4]
                        )
            try:
                connection.execute(SQL_query)
            except Exception as e:
                print(e)
        # This is the key part: it keeps pulling records beyond the first 500
        # until sf.done becomes true, which means the query has completed.
        if str(query_result[sf.done]) == 'true':
            break
        query_result = service.queryMore(str(query_result[sf.queryLocator]))

SOQL = 'SELECT id, Name, Department FROM User'
export(SOQL)
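
For reference, here is the paging logic on its own, without the database inserts; a minimal sketch that reuses the same Client and sf namespace as above, with the SOQL from the question:

# Minimal queryMore paging sketch (same service/login as above).
query_result = service.query('SELECT id, Name, Department FROM User')
records = []
while True:
    records.extend(query_result[sf.records:])  # collect this batch (500 rows by default)
    # sf.done is the string 'true' once Salesforce has returned everything
    if str(query_result[sf.done]) == 'true':
        break
    # hand the locator back to fetch the next batch
    query_result = service.queryMore(str(query_result[sf.queryLocator]))
print(len(records))  # total rows pulled across all batches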

Related

How to query Postgres asynchronously with AWS Lambda in Python?

In my case I use the psycopg2 client and I need to create a table, but it gives me a timeout error; this is obviously because the table takes a long time to build and exceeds the 15-minute Lambda limit.
For my small purposes I found the following documentation very helpful: the psycopg docs.
I will leave the small implementation below. Note that I have separated the connection as aconn because it works differently from a normal connection; for example, it does not use commit.
The key detail is async_=True in the connection line.
import select
import psycopg2

def wait(conn):
    while True:
        state = conn.poll()
        if state == psycopg2.extensions.POLL_OK:
            break
        elif state == psycopg2.extensions.POLL_WRITE:
            select.select([], [conn.fileno()], [])
        elif state == psycopg2.extensions.POLL_READ:
            select.select([conn.fileno()], [], [])
        else:
            raise psycopg2.OperationalError("poll() returned %s" % state)

db_host = db_secret["host"]
db_name = db_secret["dbname"]
db_user = db_secret["username"]
db_pass = db_secret["password"]

stringConn = "dbname='%s' user='%s' host='%s' password='%s'" % (db_name, db_user, db_host, db_pass)
aconn = psycopg2.connect(stringConn, async_=True)  # async_=True makes this an asynchronous connection
wait(aconn)

acursor = aconn.cursor()
query = "CREATE TABLE SCHEMA.TABLE AS SELECT * FROM BLA"
acursor.execute(query, params={})
wait(acursor.connection)

aconn.close()
# END AND EXIT
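
The db_secret dict above is not defined in the snippet; in a Lambda it would typically come from AWS Secrets Manager. A minimal sketch of fetching it (the secret name here is hypothetical):

import json
import boto3

# Hypothetical secret name; the original post does not show where db_secret comes from.
sm = boto3.client("secretsmanager")
db_secret = json.loads(sm.get_secret_value(SecretId="my-db-secret")["SecretString"])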

MySQL update or insert based on fetchall results in Python

I need to set some user meta in my WordPress database through a local Python script, hence I can't use the WP update_user_meta for it; it has to be done manually.
import mysql.connector as mysql
cnx = mysql.connect(host=HOST, database=DATABASE, user=USER, password=PASSWORD)
cursor = cnx.cursor()
get_meta = ("SELECT * FROM `ff_usermeta` WHERE `user_id`= 1 AND (`meta_key`='nickname' OR `meta_key`='info' OR `meta_key`='bg' OR `meta_key`='avatar' OR `meta_key`='profile_updated')")
cursor.execute(get_meta)
meta = cursor.fetchall()
#some processing of the result
cursor.execute(q, (...))
cnx.commit()
cursor.close()
cnx.close()
Now I need to check whether the result has meta with each of the keys.
If the key already exists for this user, it needs to run an UPDATE for this meta.
If this user has no meta with this key yet, it has to INSERT a new row.
if (there's no 'nickname' in meta_key on either of 5 or fewer rows):
    q = ("INSERT INTO `ff_usermeta` ...")
else:
    q = ("UPDATE `ff_usermeta` ...")
...and 4 more times like that?.. Seems like a good place for a loop, but I don't really like the idea of making it 5x queries, especially since there might be more fields in the future.
I was thinking along the lines of searching the fetchall result for matches in meta_key and, if found, adding the required data to one array; if not, to another. Then just run one UPDATE and one INSERT at the end, assuming both are not empty. If I were to write it in semi-PHP style, it would look roughly like this:
if(in_array("nickname", meta))
for_update .= "`nickname`='"+data[0]+"', "
else:
fields .= "`nickname`, "
vals .= "'"+data[0]+"', "
if(in_array("bg", meta)):
for_update .= "`bg`='"+data[1]+"', "
else:
fields .= "`bg`, "
vals .= "'"+data[1]+"', "
if(for_update):
update = ("UPDATE `ff_usermeta` SET "+for_update+" WHERE 1")
if(fields):
insert = ("INSERT INTO `ff_usermeta`("+fields+") VALUES ("+vals+")")
But I have absolutely no clue how to translate it correctly to Python; I had to google things like "why dot not working to add one string to another". Any advice? Or perhaps there is a better way? Thanks!
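
For reference, a rough Python translation of the sketch above: Python concatenates strings with += rather than .=. This just mirrors the pseudocode (parameterized queries would be safer), and it assumes meta_key is the third column of the SELECT * rows in meta and that data holds the new values, as in the pseudocode:

keys = [m[2] for m in meta]  # assuming meta_key is the 3rd column of SELECT *
for_update, fields, vals = "", "", ""

for i, key in enumerate(["nickname", "bg"]):  # extend with more keys as needed
    if key in keys:
        for_update += "`" + key + "`='" + data[i] + "', "
    else:
        fields += "`" + key + "`, "
        vals += "'" + data[i] + "', "

if for_update:
    update = "UPDATE `ff_usermeta` SET " + for_update.rstrip(", ") + " WHERE user_id = 1"
if fields:
    insert = ("INSERT INTO `ff_usermeta` (" + fields.rstrip(", ") +
              ") VALUES (" + vals.rstrip(", ") + ")")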
The following is not complete; you cannot update your rows that way. But it gives you a starting point for building your query.
The first SELECT gets exactly one row if the user_id exists.
(user_id doesn't seem like the right choice for this, but it is enough to show what you can do.)
If the query doesn't return an entry, the code inserts data that you get from elsewhere.
Both the UPDATE and the INSERT are, in this form, wrong, as you have to insert 5 new rows or update at most 5 rows; that part is left for you to program.
import mysql.connector as mysql

HOST = "localhost"
DATABASE = ""
USER = "root"
PASSWORD = "mypassword"

cnx = mysql.connect(host=HOST, database=DATABASE, user=USER, password=PASSWORD)
cursor = cnx.cursor(dictionary=True)  # dictionary=True so rows can be addressed by column name
user_id = 1
get_meta = ("""SELECT umeta_id, user_id,
    MAX(IF(`meta_key`='nickname', meta_value, '')) AS 'nickname',
    MAX(IF(`meta_key`='info', meta_value, '')) AS 'info',
    MAX(IF(`meta_key`='bg', meta_value, '')) AS 'bg',
    MAX(IF(`meta_key`='avatar', meta_value, '')) AS 'avatar',
    MAX(IF(`meta_key`='profile_updated', meta_value, '')) AS 'profile_updated'
    FROM `ff_usermeta` WHERE `user_id` = %s GROUP BY umeta_id, user_id""")
cursor.execute(get_meta, (user_id,))
data = cursor.fetchone()  # execute() returns None in mysql.connector; test the fetched row instead
if data:
    for_update = ""
    # some processing of the result
    if data["nickname"]:
        for_update += "`nickname`='" + data["nickname"] + "', "
    if data["bg"]:
        for_update += "`bg`='" + data["bg"] + "', "
    query = ("UPDATE `ff_usermeta` SET " + for_update.rstrip(", ") +
             " WHERE user_id = " + str(user_id))
else:
    # no data to gather, as there is no such user_id present; add a new user
    nickname = ""
    bg = ""
    info = ""
    avatar = ""
    profile_updated = ""
    fields = "`nickname`, `info`, `bg`, `avatar`, `profile_updated`"
    vals = ("'" + nickname + "', '" + info + "', '" + bg + "', '" +
            avatar + "', '" + profile_updated + "'")
    query = ("INSERT INTO `ff_usermeta`(" + fields + ") VALUES (" + vals + ")")
cursor.execute(query)
cnx.commit()
cursor.close()
cnx.close()
I tried my best to adapt the suggestion above but couldn't figure out how to make it work. Eventually I went another way, and it seems to work, so I'll post the full code in case anyone finds it useful.
What it does: checks the queue in a table with validation requests, then parses a page (separate function) and updates the user profile accordingly.
import mysql.connector as mysql
import time
from datetime import datetime

cnx = mysql.connect(host=HOST, database=DATABASE, user=USER, password=PASSWORD)

while True:  # endless loop as a temporary scheduler
    cursor = cnx.cursor()
    # getting first request in the queue - 0: id, 1: url, 2: parse, 3: status, 4: user, 5: user_page, 6: req_date, 7: action
    cursor.execute("SELECT * FROM `ff_qq` WHERE status = 0 LIMIT 1")
    row = cursor.fetchone()
    if row:
        status = 1  # processed
        if row[7] == "verify":
            get_user = ("SELECT * FROM `ff_users` WHERE ID = %s LIMIT 1")
            cursor.execute(get_user, (row[4],))
            user = cursor.fetchone()  # 0 - ID, 5 - user_url, 8 - user_status, 9 - display_name
            # separate function that returns data to insert into mysql
            udata = verify(row)  # dict with keys: nickname, fb_av, fb_bg, fb_info, owned
            ustat = row[1].split("/authors/")
            if udata['owned'] or user[8] == ustat[1]:
                update_user = ("UPDATE `ff_users` SET user_status = %s, display_name = %s, user_url = %s WHERE ID = %s LIMIT 1")
                cursor.execute(update_user, (ustat[1], udata['nickname'], row[1], user[0]))
                status = 2  # success
                get = ("SELECT `meta_value` FROM `ff_usermeta` WHERE `user_id`= %s AND `meta_key`='ff_capabilities' LIMIT 1")
                cursor.execute(get, (row[4],))
                rights = cursor.fetchone()
                # fetchone returns a tuple, so compare its first field
                if rights and rights[0] == 'a:1:{s:10:"subscriber";b:1;}':
                    promote = ("UPDATE `ff_usermeta` SET `meta_value` = 'a:1:{s:6:\"author\";b:1;}' "
                               "WHERE `user_id` = %s AND `meta_key`='ff_capabilities' LIMIT 1")
                    cursor.execute(promote, (row[4],))  # row[4] is the user id (row[0] is the queue id)
                # list of meta_key values in same order as returned data
                ff = ['nickname', 'fb_av', 'fb_bg', 'fb_info']
                for x in range(len(ff)):  # goes through each one of the above list
                    if udata[ff[x]]:  # yes this actually works, who would've thought?..
                        # current meta_key added directly into the string
                        get = ("SELECT `meta_value` FROM `ff_usermeta` WHERE `user_id`= %s AND `meta_key`='" + ff[x] + "' LIMIT 1")
                        cursor.execute(get, (row[4],))
                        meta = cursor.fetchone()
                        if meta:  # update if it exists, otherwise insert a new row
                            qq = ("UPDATE `ff_usermeta` SET `meta_value` = %s "
                                  "WHERE `user_id` = %s AND `meta_key`='" + ff[x] + "' LIMIT 1")
                        else:
                            qq = ("INSERT INTO `ff_usermeta`(`meta_value`, `user_id`, `meta_key`) "
                                  "VALUES (%s, %s, '" + ff[x] + "')")
                        cursor.execute(qq, (udata[ff[x]], row[4]))  # same execute works for both
            else:
                status = 3  # verification failed
        # update queue to reflect its status
        update = ("UPDATE `ff_qq` SET status = %s WHERE id = %s LIMIT 1")
        cursor.execute(update, (status, row[0]))
        cnx.commit()
    cursor.close()
    now = datetime.now()
    print(now.strftime("%d.%m.%Y %H:%M:%S"))
    time.sleep(180)  # sleep until it's time to re-check the queue

cnx.close()
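
As an aside, MySQL can make the update-or-insert decision itself with INSERT ... ON DUPLICATE KEY UPDATE, which would collapse the SELECT-then-branch above into a single statement. Note this needs a UNIQUE index on (user_id, meta_key), which the stock WordPress usermeta table does not have, so it would have to be added first; a sketch:

# Upsert sketch: assumes ALTER TABLE `ff_usermeta` ADD UNIQUE KEY (`user_id`, `meta_key`)
upsert = ("INSERT INTO `ff_usermeta` (`user_id`, `meta_key`, `meta_value`) "
          "VALUES (%s, %s, %s) "
          "ON DUPLICATE KEY UPDATE `meta_value` = VALUES(`meta_value`)")
cursor.execute(upsert, (row[4], ff[x], udata[ff[x]]))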

Python Database update from XML only updates last record

I have written a script to update stock in my MySQL db, but only the last record from the XML gets updated. I'm looking through a tunnel right now and would love a fresh pair of eyes on it.
import requests
import xml.etree.ElementTree as ET
from lxml import etree
from getpass import getpass
from mysql.connector import connect, Error

r = requests.get('http://api.edc.nl/xml/eg_xml_feed_stock.xml')
root = ET.fromstring(r.content)
for x in root.iter('product'):
    id = x.find('productid').text
    qty = x.find('qty').text
try:
    with connect(
        host="my host",
        user=input("Enter username: "),
        database="my database",
        password=getpass("Enter password: "),
    ) as connection:
        query = "UPDATE `ps_stock_available` SET `quantity` = " + \
            qty + " WHERE `id_product` = " + id + ";"
        with connection.cursor() as cursor:
            cursor.execute(query)
            connection.commit()
            #result = cursor.fetchall()
            #for row in result:
            print(query)
except Error as e:
    print(e)
The code that updates the table needs to be inside the for loop. Otherwise it only runs once, after the loop completes, and uses the last values of the variables.
query = "UPDATE `ps_stock_available` SET `quantity` = %s WHERE `id_product` = %s"
try:
with connect(
host="my host",
user=input("Enter username: "),
database="my database",
password=getpass("Enter password: "),
) as connection:
with connection.cursor() as cursor:
for x in root.iter('product'):
prod_id = x.find('productid').text
qty = x.find('qty').text
cursor.execute(query, (qty, prod_id))
connection.commit()
except Error as e:
print(e)
Don't use id as a variable, it's the name of a built-in function.
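
If the feed is large, the per-row execute calls can also be batched with executemany; a sketch using the same query and connection as above:

# Batch variant: collect all (qty, id) pairs first, then send them in one call.
params = [(x.find('qty').text, x.find('productid').text) for x in root.iter('product')]
with connection.cursor() as cursor:
    cursor.executemany(query, params)
connection.commit()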

Python code not creating tables on the database but able to query the results postgres

My use case is to create a temp table in the Postgres database, fetch records from it, and insert them into a different table.
The code I used is:
from __future__ import print_function
import psycopg2
import sys
import pprint
from os.path import join, dirname, abspath
import xlrd
import os.path

newlist = []
itemidlist = []

def main():
    conn_string = "host='prod-dump.cvv9i14mrv4k.us-east-1.rds.amazonaws.com' dbname='ebdb' user='ebroot' password='*********'"
    # print the connection string we will use to connect
    # print("Connecting to database %s" % conn_string)
    # get a connection; if a connection cannot be made an exception will be raised here
    conn = psycopg2.connect(conn_string)
    # conn.cursor will return a cursor object; you can use this cursor to perform queries
    cursor = conn.cursor()
    dealer_id = input("Please enter dealer_id: ")
    group_id = input("Please enter group_id: ")
    scriptpath = os.path.dirname('__file__')
    filename = os.path.join(scriptpath, 'Winco - Gusti.xlsx')
    xl_workbook = xlrd.open_workbook(filename, "rb")
    xl_sheet = xl_workbook.sheet_by_index(0)
    print('Sheet Name: %s' % xl_sheet.name)
    row = xl_sheet.row(0)
    from xlrd.sheet import ctype_text
    print('(Column #) type:value')
    for idx, cell_obj in enumerate(row):
        cell_type_str = ctype_text.get(cell_obj.ctype, 'unknown type')
        #print('(%s) %s %s' % (idx, cell_type_str, cell_obj.value))
    num_cols = xl_sheet.ncols
    for row_idx in range(0, xl_sheet.nrows):  # Iterate through rows
        num_cols = xl_sheet.ncols
        id_obj = xl_sheet.cell(row_idx, 1)  # Get cell object by row, col
        itemid = id_obj.value
        #if itemid not in itemidlist:
        itemidlist.append(itemid)
    # execute our Query
    '''
    cursor.execute("""
    if not exists(SELECT 1 FROM model_enable AS c WHERE c.name = %s);
    BEGIN;
    INSERT INTO model_enable (name) VALUES (%s)
    END;
    """ % (itemid, itemid))
    '''
    cursor.execute("drop table temp_mbp1")
    try:
        cursor.execute("SELECT p.model_no, pc.id as PCid, g.id AS GROUPid into public.temp_mbp1 FROM products p, \
            model_enable me, products_clients pc, groups g WHERE p.model_no = me.name \
            and p.id = pc.product_id and pc.client_id = %s and pc.client_id = g.client_id and g.id = %s" \
            % (dealer_id, group_id))
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    cursor.execute("select count(*) from public.temp_mbp1")
    # retrieve the records from the database
    records = cursor.fetchall()
    # print out the records using pretty print
    # note that the NAMES of the columns are not shown, instead just indexes.
    # for most people this isn't very useful so we'll show you how to return
    # columns as a dictionary (hash) in the next example.
    pprint.pprint(records)

if __name__ == "__main__":
    main()
The try/except block in the middle of the program is not throwing any error, but the table is not getting created in the Postgres database, as I can see in the data admin.
The output shown is:
Please enter dealer_id: 90
Please enter group_id: 13
Sheet Name: Winco Full 8_15_17
(Column #) type:value
[(3263,)]
Thanks,
Santosh
You didn't commit the changes, so they aren't saved in the database. Add to the bottom, just below the pprint statement:
conn.commit()
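
Alternatively, a psycopg2 connection used as a context manager commits the transaction when the block exits without an error (the connection itself stays open afterwards); a minimal sketch:

# Commit-on-success via the connection's context manager.
with conn:
    with conn.cursor() as cursor:
        cursor.execute("select count(*) from public.temp_mbp1")
        print(cursor.fetchone())
# leaving the with-block committed the transaction; conn is still open here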

Process very large 900M row MySQL table line by line with Python

I often need to process several hundred million rows of a MySQL table on a line-by-line basis using Python. I want a script that is robust and does not need to be monitored.
Below I pasted a script that classifies the language of the message field in my rows. It utilizes the sqlalchemy and MySQLdb.cursors.SSCursor modules. Unfortunately, this script consistently throws a 'Lost connection to MySQL server during query' error, after 4840 rows when I run it remotely and 42000 rows when I run it locally.
Also, I have checked and max_allowed_packet = 32M in my MySQL server's /etc/mysql/my.cnf file, as per the answers to this Stack Overflow question: Lost connection to MySQL server during query.
Any advice for either fixing this error or using another approach to process very large MySQL tables robustly with Python would be much appreciated!
import sqlalchemy
import MySQLdb.cursors
import langid

schema = "twitterstuff"
table = "messages_en"  # 900M row table
engine_url = "mysql://myserver/{}?charset=utf8mb4&read_default_file=~/.my.cnf".format(schema)
db_eng = sqlalchemy.create_engine(engine_url, connect_args={'cursorclass': MySQLdb.cursors.SSCursor})
langid.set_languages(['fr', 'de'])

print "Executing input query..."
data_iter = db_eng.execute("SELECT message_id, message FROM {} WHERE langid_lang IS NULL LIMIT 10000".format(table))

def process(inp_iter):
    for item in inp_iter:
        item = dict(item)
        (item['langid_lang'], item['langid_conf']) = langid.classify(item['message'])
        yield item

def update_table(update_iter):
    count = 0
    for item in update_iter:
        count += 1
        if count % 10 == 0:
            print "{} rows processed".format(count)
        lang = item['langid_lang']
        conf = item['langid_conf']
        message_id = item['message_id']
        db_eng.execute("UPDATE {} SET langid_lang = '{}', langid_conf = {} WHERE message_id = {}".format(table, lang, conf, message_id))

data_iter_upd = process(data_iter)

print "Begin processing..."
update_table(data_iter_upd)
According to MySQLdb developer Andy Dustman, "[When using SSCursor,] no new queries can be issued on the connection until the entire result set has been fetched."
That post says that if you issue another query you will get a "commands out of sequence" error, which is not the error you are seeing. So I am not sure that the following will necessarily fix your problem. Nevertheless, it might be worth trying to remove SSCursor from your code and use the simpler default Cursor just to test if that is the source of the problem.
You could, for example, use LIMIT chunksize OFFSET n in your SELECT statement
to loop through the data set in chunks:
import sqlalchemy
import MySQLdb.cursors
import langid
import itertools as IT

chunksize = 1000

def process(inp_iter):
    for item in inp_iter:
        item = dict(item)
        (item['langid_lang'], item['langid_conf']) = langid.classify(item['message'])
        yield item

def update_table(update_iter, engine):
    for count, item in enumerate(update_iter):
        if count % 10 == 0:
            print "{} rows processed".format(count)
        lang = item['langid_lang']
        conf = item['langid_conf']
        message_id = item['message_id']
        engine.execute(
            "UPDATE {} SET langid_lang = '{}', langid_conf = {} WHERE message_id = {}"
            .format(table, lang, conf, message_id))

schema = "twitterstuff"
table = "messages_en"  # 900M row table
engine_url = ("mysql://myserver/{}?charset=utf8mb4&read_default_file=~/.my.cnf"
              .format(schema))
db_eng = sqlalchemy.create_engine(engine_url)
langid.set_languages(['fr', 'de'])

for offset in IT.count(start=0, step=chunksize):
    print "Executing input query..."
    result = db_eng.execute(
        "SELECT message_id, message FROM {} WHERE langid_lang IS NULL LIMIT {} OFFSET {}"
        .format(table, chunksize, offset))
    result = list(result)
    if not result:
        break
    data_iter_upd = process(result)
    print "Begin processing..."
    update_table(data_iter_upd, db_eng)
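
One more note on the chunking above: because each pass sets langid_lang, the WHERE langid_lang IS NULL filter keeps shrinking while OFFSET keeps advancing, so rows can be skipped, and large offsets also get slower as MySQL rescans the skipped rows. Seeking on the primary key avoids both problems; a sketch under the assumption that message_id is an indexed numeric key:

last_id = 0
while True:
    # fetch the next chunk strictly after the last id already processed
    result = list(db_eng.execute(
        "SELECT message_id, message FROM {} "
        "WHERE langid_lang IS NULL AND message_id > {} "
        "ORDER BY message_id LIMIT {}"
        .format(table, last_id, chunksize)))
    if not result:
        break
    last_id = result[-1][0]  # resume point: highest message_id in this chunk
    update_table(process(result), db_eng)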
