I am working with some code that uses MySQL. I am very new to MySQL, so I would be thankful for any help. My starting data is a huge dump file of Wikipedia pages in XML bz2 format. The program's input is a set of text files extracted from that XML file, in this format:
<doc id="12" url="https://en.wikipedia.org/wiki?curid=12" title="Anarchism"> text... </doc>
The only parts that connect the program to SQL are as follows:
def read_in_STOP_CATS(f_n = "/media/sscepano/Data/Wiki2015/STOPCAT/STOP_CATS.txt"):
    s = []
    f = open(f_n, "r")
    for line in f:
        s.append(line.rstrip().lower())
    return s
def connect_2_db():
    try:
        cnx = mysql.connector.connect(user='test', password='test',
                                      host='127.0.0.1',
                                      database='wiki_category_links')
    except mysql.connector.Error as err:
        if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
            print("Something is wrong with your user name or password")
        elif err.errno == errorcode.ER_BAD_DB_ERROR:
            print("Database does not exist")
        else:
            print(err)
    return cnx
def articles_selected(aid):
    global cnx
    global STOP_CATS
    cursor = cnx.cursor(buffered=True)
    cursor.execute("SELECT * FROM categorylinks where cl_from = " + str(aid))
    row = cursor.fetchone()
    while row is not None:
        #print(row)
        cat = row[1].lower()
        #print cat
        for el in STOP_CATS:
            if el in cat:
                return False
        row = cursor.fetchone()
    return True
cnx = connect_2_db()
STOP_CATS = read_in_STOP_CATS()
TITLE_WEIGHT = 4
My problem is that I do not know how I should connect to MySQL to be able to run the code. The main problem is that I do not know what categorylinks is in the code. Is that the name of my SQL table? Does it mean that I need to create an SQL table with this name and import all my text files into this one table?
Also, what does 'where' mean in the SELECT statement?
As RiggsFolly said, you need to get something like WHERE cl_from = 'some string'
You could do it this way:
cursor.execute("SELECT * FROM categorylinks where cl_from ='" + str(aid)+"'")
But it is better to pass the value as a query parameter, like this:
select_stmt = "SELECT * FROM categorylinks where cl_from = %(aid)s"
cursor.execute(select_stmt, { 'aid':str(aid) })
So in your code you have:
A database named wiki_category_links
In that database you have a table called categorylinks
And the SELECT you have means that you are going to get, from the table categorylinks, all rows whose cl_from column is equal to the value of the aid variable. The WHERE clause is what restricts the result to those rows.
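To see what the WHERE clause does without setting up a MySQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. The two-column table layout, the sample rows and the stop-category list are assumptions made up for illustration; only the names categorylinks, cl_from, STOP_CATS and articles_selected come from the question. Note that sqlite3 uses ? as the placeholder, whereas mysql.connector uses %s.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
cnx = sqlite3.connect(":memory:")
# Assumed layout: cl_from is the article id, cl_to the category name.
cnx.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
cnx.executemany(
    "INSERT INTO categorylinks (cl_from, cl_to) VALUES (?, ?)",
    [
        (12, "Anarchism"),
        (12, "Political_ideologies"),
        (25, "Disambiguation_pages"),  # made-up rows for illustration
    ],
)

STOP_CATS = ["disambiguation"]  # hypothetical stop-category list


def articles_selected(aid):
    """Return False if any category of article `aid` contains a stop word."""
    # WHERE keeps only the rows whose cl_from equals the bound value.
    cursor = cnx.execute(
        "SELECT cl_from, cl_to FROM categorylinks WHERE cl_from = ?", (aid,)
    )
    for row in cursor:
        cat = row[1].lower()
        if any(el in cat for el in STOP_CATS):
            return False
    return True


print(articles_selected(12))  # article 12 has no stop category
print(articles_selected(25))  # article 25 is in a stop category
```

An article id with no rows at all is also reported as selected, which matches the behaviour of the loop in the question.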
I am new to Python and started off with SQLite.
I have two CSV files, transactions.csv and users.csv, which I read and write to an SQLite database. Below is the snippet:
import csv
import sqlite3 as db
def readCSV_users():
    with open('users.csv',mode='r') as data:
        dr = csv.DictReader(data, delimiter=',')
        users_data = [(i['user_id'], i['is_active']) for i in dr if i['is_active']=='True']
        #---------------------
    return users_data
def readCSV_transactions():
    with open('transactions.csv',mode='r') as d:
        dr = csv.DictReader(d, delimiter=',')
        trans_data = [(i['user_id'], i['is_blocked'],i['transaction_amount'],i['transaction_category_id']) for i in dr if i['is_blocked']=='False']
        #---------------------
    return trans_data
def SQLite_connection(database):
    try:
        # connect to the database
        conn = db.connect(database)
        print("Database connection is established successfully!")
        conn = db.connect(':memory:')
        print("Established database connection to a database\
        that resides in the memory!")
        cur = conn.cursor()
        return cur,conn
    except exception as Err:
        print(Err)
def dbQuery(users_data,trans_data,cur,conn):
    try:
        cur.executescript(""" CREATE TABLE if not exists users(user_id text,is_active text);
        CREATE TABLE if not exists transactions(user_id text,is_blocked text,transaction_amount text,transaction_category_id text);
        INSERT INTO users VALUES (?,?),users_data;
        INSERT INTO transactions VALUES (?,?,?,?),trans_data""")
        conn.commit()
        a=[]
        rows = curr.execute("SELECT * FROM users").fetchall()
        for r in rows:
            a.append(r)
        return a
    except Err:
        print(Err)
    finally:
        conn.close()
if __name__ == "__main__":
    database='uit'
    users_data=readCSV_users()
    trans_data=readCSV_transactions()
    curr,conn=SQLite_connection(database)
    print(dbQuery(users_data,trans_data,curr,conn))
But I am facing the error below. I believe the ? is causing the error in executescript:
cur.executescript(""" CREATE TABLE if not exists users(user_id text,is_active text);
sqlite3.OperationalError: near "users_data": syntax error
Any pointers to resolve this?
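Putting users_data directly in the query text is wrong: inside the string it is just the literal word users_data, not the Python variable, which is why SQLite reports a syntax error near it.
Also, executescript() cannot take parameters.
You would have to put the values directly in place of the ?.
Or, since users_data and trans_data are lists of rows, use executemany() for the inserts:
cur.executemany("INSERT INTO users VALUES (?,?);", users_data)
cur.executemany("INSERT INTO transactions VALUES (?,?,?,?)", trans_data)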
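Putting the two pieces together, a minimal sketch could keep executescript() for the fixed CREATE TABLE statements and do the inserts separately with bound parameters. The sample rows below are hypothetical stand-ins for what the CSV readers return, and the database lives in memory so the snippet runs on its own.

```python
import sqlite3

# Hypothetical rows standing in for the filtered CSV contents.
users_data = [("u1", "True"), ("u2", "True")]
trans_data = [("u1", "False", "10.5", "3"), ("u2", "False", "7.0", "1")]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# executescript() takes no parameters, so keep it for the fixed DDL only.
cur.executescript("""
    CREATE TABLE IF NOT EXISTS users(user_id TEXT, is_active TEXT);
    CREATE TABLE IF NOT EXISTS transactions(
        user_id TEXT, is_blocked TEXT,
        transaction_amount TEXT, transaction_category_id TEXT);
""")

# executemany() runs the statement once per tuple in the list.
cur.executemany("INSERT INTO users VALUES (?, ?)", users_data)
cur.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", trans_data)
conn.commit()

users_rows = cur.execute("SELECT * FROM users").fetchall()
trans_count = cur.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
print(users_rows)
print(trans_count)
conn.close()
```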
I used Python and the sqlite3 package to create a table and insert data into it. However, there is nothing in the table after I run the code. Can anyone help me figure it out? Thanks.
def conSqlite():
    conn = sqlite3.connect('C:\\Users\jet.cai\Documents\Logsitic.db')
    json_path = r'C:\Users\jet.cai\PycharmProjects\VJSF\txtToJson.json'
    try:
        create_table = ('''
        CREATE TABLE IF NOT EXISTS CODE2
        (Delivery TEXT,
        Customer_Name NCHAR(50),
        Shipment_Priority TEXT
        )''')
        conn.execute(create_table)
    except:
        print("Table Failed")
        return False
    with open(json_path, 'r') as jsonf:
        lines = json.load(jsonf)
        for line in lines:
            sql = "insert into CODE2(Delivery,Customer_Name,Shipment_Priority) values('%s','%s','%s')"%(line['Delivery'],line['Customer Name'],line['Shipment Priority'])
            conn.execute(sql)
    # No results can be selected out
    df = pd.read_sql("select Delivery from CODE2", conn)
    print(df)
I have some Python code that gets data from one database (SQL Server) and inserts it into another database (MySQL). I am trying to add a WHERE NOT EXISTS to the INSERT query so that only new rows are inserted, but I need to use one of the values in each SageResult tuple a second time for the primary key.
Code:
import mysql.connector
import pyodbc
def insert_VPS(SageResult):
    query = """
    INSERT INTO SOPOrderReturn(SOPOrderReturnID, DocumentTypeID, DocumentNo, DocumentDate, CustomerID, CustomerTypeID, CurrencyID, SubtotalGoodsValue, TotalNetValue, TotalTaxValue, TotalGrossValue, SourceTypeID, SourceDocumentNo)
    VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
    WHERE NOT EXISTS (SELECT * FROM SOPOrderReturn WHERE SOPOrderReturnID = %1$s)"""
    try:
        mydbVPS = mysql.connector.connect(
            host="address",
            user="user",
            passwd="password",
            database="database"
        )
        VPScursor = mydbVPS.cursor()
        #print(SageResult)
        VPScursor.executemany(query, SageResult)
        mydbVPS.commit()
    except Exception as e:
        print('InsertError:', e)
    finally:
        VPScursor.close()
        mydbVPS.close()
def main():
    selectQuery = """
    SELECT TOP 51 [SOPOrderReturnID]
    ,[DocumentTypeID]
    ,[DocumentNo]
    ,[DocumentDate]
    ,[CustomerID]
    ,[CustomerTypeID]
    ,[CurrencyID]
    ,[SubtotalGoodsValue]
    ,[TotalNetValue]
    ,[TotalTaxValue]
    ,[TotalGrossValue]
    ,[SourceTypeID]
    ,[SourceDocumentNo]
    FROM [Live].[dbo].[SOPOrderReturn]
    """
    try:
        mydbSage = pyodbc.connect('Driver={SQL Server};'
                                  'Server=CRMTEST;'
                                  'Database=Live;'
                                  'UID=sa;'
                                  'PWD=password;')
        Sagecursor = mydbSage.cursor()
        Sagecursor.execute(selectQuery)
        #SageResult = tuple(Sagecursor.fetchall())
        SageResult = []
        while True:
            row = Sagecursor.fetchone()
            if row:
                SageResult.append(tuple(row))
            else:
                break
        #SageResult = Sagecursor.fetchall()
        mydbSage.commit()
    except Exception as e:
        print('MainError:', e)
    finally:
        Sagecursor.close()
        mydbSage.close()
    insert_VPS(SageResult)
if __name__ == '__main__':
    main()
Output:
D:\xampp\htdocs\stripe\group\beta>sql-sync.py
InsertError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'WHERE NOT EXISTS (SELECT * FROM SOPOrderReturn WHERE SOPOrderReturnID = %1$s),(1' at line 3
The part in question is the query string variable. Everything else in here works fine. I basically need to use the SOPOrderReturnID value from the tuple a second time where I currently have %1$s
What is the issue with the query syntax? Is my use of %1$s correct?
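As far as I know, INSERT ... VALUES does not accept a WHERE clause, and the %s placeholders of mysql.connector are filled strictly in order, with no positional form such as %1$s. One possible way around both is to insert from a SELECT and to repeat the key at the end of each parameter tuple. The sketch below shows that idea on an in-memory SQLite database as a stand-in, so the placeholder is ? rather than %s; the table is cut down to two columns and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SOPOrderReturn (SOPOrderReturnID INTEGER, DocumentNo TEXT)"
)
conn.execute("INSERT INTO SOPOrderReturn VALUES (1, 'already there')")

# INSERT ... SELECT accepts a WHERE clause, unlike INSERT ... VALUES.
query = """
    INSERT INTO SOPOrderReturn (SOPOrderReturnID, DocumentNo)
    SELECT ?, ?
    WHERE NOT EXISTS (
        SELECT 1 FROM SOPOrderReturn WHERE SOPOrderReturnID = ?
    )
"""

sage_result = [(1, "A001"), (2, "A002"), (3, "A003")]  # made-up source rows

# Append the key (first element) to each row so it fills the third placeholder.
params = [row + (row[0],) for row in sage_result]
conn.executemany(query, params)
conn.commit()

rows = conn.execute(
    "SELECT SOPOrderReturnID, DocumentNo FROM SOPOrderReturn "
    "ORDER BY SOPOrderReturnID"
).fetchall()
print(rows)  # row 1 keeps its original DocumentNo; rows 2 and 3 are new
```

On MySQL or MariaDB the same shape would use %s placeholders, and some versions may need FROM DUAL after the SELECT list. If SOPOrderReturnID is the primary key, INSERT IGNORE might be a simpler alternative.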
I know there are some other posts out there, but I was not able to find the specific question I had in mind.
I'm using the US_baby_names CSV file and want to import it line by line into sqlite3 as a table.
I'm able to create the table called storage.
I'm then trying to read lines in the csv file and put it into that table, but I must be doing something wrong.
import sqlite3 as sql
from sqlite3 import Error
import csv
def CreateConnection ( dbFileName ):
    try:
        conn = sql.connect(dbFileName)
        return conn
    except Error as e:
        print(e)
    return None
def CreateNew( dbConnection, new):
    sql = """INSERT INTO storage (dat, Id, Name, Year, group, subgroup, Count)
    VALUES (?,?,?,?,?,?,?)"""
    try:
        cursor = dbConnection.cursor()
        cursor.execute(sql, new)
        return cursor.lastrowid
    except Error as e:
        print(e)
def Main():
    database = "storage.db"
    dbConnection = CreateConnection(database)
    with open('storage.csv', 'rb') as fin:
        dr = csv.DictReader(fin)
        to_db = [(i['dat'], i['Id'], i['Name'], i['Year'], i['group'], i['subgroup'], i['Count']) \
                 for i in dr]
    cursor.executemany(CreateNew(sql, to_db))
    dbConnection.close()
if __name__ == "__main__":
    Main()
I believe my cursor.executemany is wrong, but I'm not able to figure out what else to do..
Thanks
You are almost right with much of your code, but:
In cursor.execute(sql, new) you pass a list of rows, new, to Cursor.execute(), which runs one statement with one set of parameters; a list of rows needs Cursor.executemany() instead.
Moreover, the result of CreateNew() is an integer, lastrowid, and you pass that result to executemany(), which expects an SQL string and a sequence of rows.
You must call Connection.commit() to save the changes to the database, or Connection.rollback() to discard them.
You must open the file for csv.DictReader as a text file, in r or rt mode.
Finally, remember that sqlite3.Connection is a context manager, so you can use it in a with statement.
This should be your desired outcome:
import sqlite3 as sql
from sqlite3 import Error
import csv
def create_table(conn):
    sql = "CREATE TABLE IF NOT EXISTS baby_names("\
          "dat TEXT,"\
          "Id INTEGER PRIMARY KEY,"\
          "Name TEXT NOT NULL,"\
          "Year INTEGER NOT NULL,"\
          "Gender TEXT NOT NULL,"\
          "State TEXT NOT NULL,"\
          "Count INTEGER)"
    conn.execute(sql)
    conn.execute("DELETE FROM baby_names")
def select_all(conn):
    for r in conn.execute("SELECT * FROM baby_names").fetchall():
        print(r)
def execute_sql_statement(conn, data):
    sql = "INSERT INTO baby_names "\
          "(dat, Id, Name, Year, Gender, State, Count) "\
          "VALUES (?,?,?,?,?,?,?)"
    try:
        cursor = conn.executemany(sql, data)
    except Error as e:
        print(e)
        conn.rollback()
        return None
    else:
        conn.commit()
        return cursor.lastrowid
def main():
    with sql.connect('baby_names.db') as conn, open('US_Baby_Names_right.csv', 'r') as fin:
        create_table(conn)
        dr = csv.DictReader(fin)
        data = [(i['dat'], i['Id'], i['Name'], i['Year'], i['Gender'], i['State'], i['Count']) for i in dr ]
        lastrowid = execute_sql_statement(conn, data)
        select_all(conn)
main()
I added a create_table() function just to test my code. I also made up a sample test file as follows:
dat,Id,Name,Year,Gender,State,Count
1,1,John,1998,M,Washington,2
2,2,Luke,2000,M,Arkansas,10
3,3,Carrie,1999,F,Texas,3
The output of the select_all() function is:
('1', 1, 'John', 1998, 'M', 'Washington', 2)
('2', 2, 'Luke', 2000, 'M', 'Arkansas', 10)
('3', 3, 'Carrie', 1999, 'F', 'Texas', 3)
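The commit and rollback points above can also be left to the connection's with block. Here is a small sketch of that variant, assuming an inline CSV string in place of the real file and an in-memory database; it also feeds the DictReader rows straight into named placeholders, which is a design choice of this sketch and not something the code above does.

```python
import csv
import io
import sqlite3

# Inline CSV text standing in for the file on disk (same columns as the sample).
CSV_TEXT = """dat,Id,Name,Year,Gender,State,Count
1,1,John,1998,M,Washington,2
2,2,Luke,2000,M,Arkansas,10
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baby_names (dat TEXT, Id INTEGER PRIMARY KEY, Name TEXT, "
    "Year INTEGER, Gender TEXT, State TEXT, Count INTEGER)"
)

insert_sql = (
    "INSERT INTO baby_names (dat, Id, Name, Year, Gender, State, Count) "
    "VALUES (:dat, :Id, :Name, :Year, :Gender, :State, :Count)"
)

# DictReader rows are dicts, so they can feed named placeholders directly.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# "with conn" commits if the block succeeds and rolls back if it raises.
with conn:
    conn.executemany(insert_sql, rows)

# Re-inserting the same Ids violates the primary key; the batch is rolled back.
try:
    with conn:
        conn.executemany(insert_sql, rows)
except sqlite3.IntegrityError as exc:
    print("rolled back:", exc)

count = conn.execute("SELECT COUNT(*) FROM baby_names").fetchone()[0]
print(count)
conn.close()
```

Note that the with block only manages the transaction; it does not close the connection, so close() is still called at the end.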
Hello
I have a question about SQLite functions, maybe.
So, question:
How can I check whether a name I set in Python is present in a certain column?
Example:
name = 'John'
Table name = my_table
Column name = users
Code details:
C = conn.cursor()
Please
Use a parameter in the query and bind the name to it. See the example below for a better understanding.
Sample SQLite code for searching value in tables
import sqlite3 as sqlite
conn = sqlite.connect("test.db")
def insert_single_row(name, age):
    try:
        with conn:
            cursor = conn.cursor()
            cursor.execute("CREATE TABLE IF NOT EXISTS USER_TABLE(NAME TEXT, AGE INTEGER);")
            # Bind the values as parameters instead of concatenating them into the SQL
            cursor.execute("INSERT INTO USER_TABLE(NAME, AGE) VALUES (?, ?)", (name, age))
            return cursor.lastrowid
    except sqlite.Error as err:
        raise ValueError('Error occurred in insert_single_row(name, age)') from err
def get_parameterized_row(name):
    try:
        with conn:
            cursor = conn.cursor()
            # :NAME is a named placeholder filled from the dictionary
            cursor.execute("SELECT * FROM USER_TABLE WHERE NAME = :NAME",
                           {"NAME":name})
            return cursor.fetchall()
    except sqlite.Error as err:
        raise ValueError('Error occurred in get_parameterized_row(name)') from err
if __name__ == '__main__':
    try:
        return_id = insert_single_row("Shovon", 24)
        return_id = insert_single_row("Shovon", 23)
        return_id = insert_single_row("Sho", 24)
        all_row = get_parameterized_row("Shovon")
        for row in all_row:
            print(row)
    except Exception as e:
        print(str(e))
Output:
('Shovon', 24)
('Shovon', 23)
Here I have created a table called USER_TABLE with two attributes: NAME and AGE. Then I inserted several values in the table and searched for a specific NAME. Hope it gives a way to start using SQLite in the project.
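Applied to the names from the question (table my_table, column users), a yes/no check could look like the sketch below. The helper name name_exists, the in-memory database and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (users TEXT)")
conn.executemany("INSERT INTO my_table (users) VALUES (?)", [("John",), ("Mary",)])


def name_exists(conn, name):
    """Return True if `name` occurs in the users column of my_table."""
    # LIMIT 1 lets SQLite stop at the first match; fetchone() gives None if none.
    row = conn.execute(
        "SELECT 1 FROM my_table WHERE users = ? LIMIT 1", (name,)
    ).fetchone()
    return row is not None


print(name_exists(conn, "John"))
print(name_exists(conn, "Alice"))
```

The comparison is exact and case-sensitive with SQLite's default collation, so 'john' would not match 'John' here.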