What should I refactor in my Python class?

I'm fairly new to Python. And this is my first class:
import config  # Configuration file
import twitter
import random
import sqlite3
import time
import bitly_api  # https://github.com/bitly/bitly-api-python
class TwitterC:
    def logToDatabase(self, tweet, timestamp):
        # Will log to the database
        database = sqlite3.connect('database.db')  # Create a database file
        cursor = database.cursor()  # Create a cursor
        cursor.execute("CREATE TABLE IF NOT EXISTS twitter(id_tweet INTEGER AUTO_INCREMENT PRIMARY KEY, tweet TEXT, timestamp TEXT);")  # Make a table
        # Assign the values for the insert into
        msg_ins = tweet
        timestamp_ins = timestamp
        values = [msg_ins, timestamp_ins]
        # Insert data into the table
        cursor.execute("INSERT INTO twitter(tweet, timestamp) VALUES(?, ?)", values)
        database.commit()  # Save our changes
        database.close()  # Close the connection to the database
    def shortUrl(self, url):
        bit = bitly_api.Connection(config.bitly_username, config.bitly_key)  # Instantiate the API
        return bit.shorten(url)  # Shorten the URL
    def updateTwitterStatus(self, update):
        short = self.shortUrl(update["url"])  # Shorten the URL
        update = update["msg"] + short['url']
        # Will post to twitter and print the posted text
        api = twitter.Api(consumer_key=config.consumer_key,
                          consumer_secret=config.consumer_secret,
                          access_token_key=config.access_token_key,
                          access_token_secret=config.access_token_secret)
        status = api.PostUpdate(update)  # Post the update
        msg = status.text  # Store the sent text in the variable 'msg'
        # Save it to the database
        self.logToDatabase(msg, time.time())
        print msg  # Just to show the sent text. Comment this line out in the future.
x = TwitterC()
x.updateTwitterStatus({"url": "http://xxxx.com/?cat=49", "msg": "Searching for some ....? "})
My question: what should I refactor in this (I think) ugly code?
For example, when I try to post a duplicate Twitter update, I get this error:
Traceback (most recent call last):
  File "C:\Users\anlopes\workspace\redes_sociais\src\twitterC.py", line 42, in <module>
    x.updateTwitterStatus({"url": "http://xxx.com/?cat=49", "msg": "Searching for some ...? "})
  File "C:\Users\anlopes\workspace\redes_sociais\src\twitterC.py", line 35, in updateTwitterStatus
    status = api.PostUpdate(update) # Fazer o update
  File "C:\home_python\python_virtualenv\lib\site-packages\twitter.py", line 2549, in PostUpdate
    self._CheckForTwitterError(data)
  File "C:\home_python\python_virtualenv\lib\site-packages\twitter.py", line 3484, in _CheckForTwitterError
    raise TwitterError(data['error'])
twitter.TwitterError: Status is a duplicate.
How can I for example catch this error in Python?
Some clues needed.
Best Regards,

As the output states clearly, your code is raising a twitter.TwitterError exception. You catch it like this:
try:
    pass  # yadda yadda
except twitter.TwitterError:
    pass  # exception code
else:
    pass  # happy flow code, optionally.
finally:
    pass  # must-run code, optionally
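To make the skeleton above concrete, here is a minimal sketch of the same pattern. It does not call the real Twitter library: the TwitterError class and the post_update function below are hypothetical stand-ins, defined only so the example runs on its own. In the real code you would catch twitter.TwitterError around api.PostUpdate.

```python
class TwitterError(Exception):
    """Hypothetical stand-in for twitter.TwitterError (not the real class)."""


def post_update(text, already_posted):
    """Pretend to post a status; raise on duplicates, as in the traceback."""
    if text in already_posted:
        raise TwitterError("Status is a duplicate.")
    already_posted.add(text)
    return text


def safe_post(text, already_posted):
    """Return (posted_text, error_message); exactly one of them is None."""
    try:
        posted = post_update(text, already_posted)
    except TwitterError as exc:
        # The duplicate is reported to the caller instead of crashing the script.
        return None, str(exc)
    else:
        # Runs only when no exception was raised.
        return posted, None


seen = set()
first = safe_post("hello", seen)
second = safe_post("hello", seen)
print(first)
print(second)
```

The second call hits the except branch, so the script keeps running and the caller can decide what to do with the error message.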
Now, when you are writing your first class and don't know how to catch exceptions in a language, you don't try to fetch twitter updates and save them in a database. You print "Hello World!". Go do a tutorial :D.

One possibility would be to write a function that connects to and disconnects from the database and during the connection time does some stuff. It could look something like this:
class DBFactory(object):
    def DBConnection(self, Func, args):
        database = sqlite3.connect('database.db') # Create a database file
        cursor = database.cursor() # Create a cursor
        Func(cursor, args)
        database.commit() # Save our changes
        database.close() # Close the connection to the database
Now the Func and args parameters do the actual interaction with the database. For example, something like this:
def CreateTable(cursor, args):
    cursor.execute("CREATE TABLE IF NOT EXISTS {0};".format(args)) # Make a table
Now if you wish to create a table, you simply have to make this call:
f = DBFactory()
f.DBConnection(CreateTable, "twitter(id_tweet INTEGER AUTO_INCREMENT PRIMARY KEY, tweet TEXT, timestamp TEXT)")
You can proceed similarly with other database interactions, for example inserting or deleting entries, each time calling the DBConnection method. This should modularize your class a little better, at least in my opinion.
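Here is a runnable sketch of the same idea, written as a plain function rather than a class. The names run_in_connection, create_table, insert_tweet and count_tweets are made up for this example, and the database lives in a temporary file so nothing on disk is touched. It also uses SQLite's own spelling, INTEGER PRIMARY KEY AUTOINCREMENT, instead of the MySQL-style AUTO_INCREMENT.

```python
import os
import sqlite3
import tempfile


def run_in_connection(db_path, func, *args):
    """Open a connection, run func(cursor, *args), then commit and close."""
    database = sqlite3.connect(db_path)
    try:
        cursor = database.cursor()
        result = func(cursor, *args)
        database.commit()  # Save our changes
        return result
    finally:
        database.close()  # Always close, even if func raised


def create_table(cursor):
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS twitter("
        "id_tweet INTEGER PRIMARY KEY AUTOINCREMENT, tweet TEXT, timestamp TEXT)"
    )


def insert_tweet(cursor, tweet, timestamp):
    cursor.execute(
        "INSERT INTO twitter(tweet, timestamp) VALUES (?, ?)", (tweet, timestamp)
    )
    return cursor.lastrowid


def count_tweets(cursor):
    cursor.execute("SELECT COUNT(*) FROM twitter")
    return cursor.fetchone()[0]


db_path = os.path.join(tempfile.mkdtemp(), "example.db")
run_in_connection(db_path, create_table)
first_id = run_in_connection(db_path, insert_tweet, "hello", "1300000000.0")
total = run_in_connection(db_path, count_tweets)
print(first_id, total)
```

Passing values as query parameters (the ? placeholders) keeps the insert safe; only the fixed table definition is written directly into the SQL.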
Please note that I did not try the code above, so there might be a typo in there, but I hope you get the idea. I hope this helps.
Cheerio
Woltan

The first thing to refactor is to get this code out of a class. It has absolutely no need to be in one. This should be a module, with standalone functions.
Edit to add more explanation: In Python, most code is grouped naturally into modules. Classes are mainly useful when you need discrete instances, each with their own data. This is not the case here: you are just using the class as a placeholder for related code. That's what modules are for.
If, for example, you wanted to model an individual Tweet, which knew about its own content and how to save itself into a database, that would indeed be a good use of OOP. But "stuff that's related to Twitter" is not a class, it's a module.
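As a rough illustration of that last point, here is what a Tweet class with its own data might look like. This is only a sketch: the class name, its attributes and the save method are invented for the example, and it uses an in-memory SQLite database so it runs anywhere.

```python
import sqlite3
import time


class Tweet:
    """One tweet: holds its own text and timestamp, and can save itself."""

    def __init__(self, text, timestamp=None):
        self.text = text
        self.timestamp = time.time() if timestamp is None else timestamp

    def save(self, connection):
        # "with connection" commits on success and rolls back on an exception.
        with connection:
            connection.execute(
                "INSERT INTO twitter(tweet, timestamp) VALUES (?, ?)",
                (self.text, str(self.timestamp)),
            )


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE twitter(id_tweet INTEGER PRIMARY KEY, tweet TEXT, timestamp TEXT)"
)
Tweet("first", 1.0).save(connection)
Tweet("second", 2.0).save(connection)
rows = connection.execute(
    "SELECT tweet, timestamp FROM twitter ORDER BY id_tweet"
).fetchall()
print(rows)
```

Each instance carries different data, which is what makes a class worthwhile here; the connection handling itself could still live in plain module-level functions.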

Related

error when using loop to input data to sqlite3 database

I have already created a database using DB Browser, and I want to store all the search results I get from PubMed in it, including the title and article. I am able to retrieve them, but when I insert them into the database I keep getting an error.
Here is my code:
import requests
import re
num=[]
page=1
for i in range(1,page+1):
    try:
        html=requests.get(f"https://pubmed.ncbi.nlm.nih.gov/?term=covid19&page={i}").text
        num.extend((re.findall('class="docsum-title"\s+href="(.*?)"',html)))
    except:
        continue
listToStr = ' '.join([str(elem) for elem in num])
numbers = re.findall(r'\d+', listToStr)
from pubmed_lookup import PubMedLookup
from pubmed_lookup import Publication
import sqlite3
for i in numbers:
    email = 'litsunchak#gmail.com'
    url = 'http://www.ncbi.nlm.nih.gov/pubmed/'+i
    lookup = PubMedLookup(url, email)
    publication = Publication(lookup)
    #define connection
    db = sqlite3.connect('pubMed.db')
    #create cursor to execute your request
    c = db.cursor()
    c.execute(' CREATE TABLE IF NOT EXISTS(pubmed_data)')
    c.execute('insert into pubmed_data(title,article) values(?,? )', (publication.title, repr(publication.abstract)))
    db.commit()
    db.close()
    print('Insert ok')
this is the error I get,
OperationalError: near "(": syntax error
really need some help
Your CREATE TABLE statement is wrong.
The correct way to do it is like this:
CREATE TABLE IF NOT EXISTS pubmed_data (title TEXT, article TEXT)
The name of the table must be written before the parentheses and inside the parentheses you list all the columns of the new table.
See this: CREATE TABLE
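Here is a small sketch that shows both statements side by side against an in-memory SQLite database. The two sample rows are placeholders standing in for publication.title and publication.abstract; nothing is fetched from PubMed.

```python
import sqlite3

db = sqlite3.connect(":memory:")
c = db.cursor()

# The statement from the question: no table name before the parentheses.
try:
    c.execute("CREATE TABLE IF NOT EXISTS(pubmed_data)")
    bad_statement_failed = False
except sqlite3.OperationalError:
    bad_statement_failed = True

# Corrected statement: table name first, then the column list.
c.execute("CREATE TABLE IF NOT EXISTS pubmed_data (title TEXT, article TEXT)")

# Placeholder rows (assumed sample data, not real PubMed results).
records = [("Title A", "Abstract A"), ("Title B", "Abstract B")]
c.executemany("INSERT INTO pubmed_data(title, article) VALUES (?, ?)", records)
db.commit()

row_count = c.execute("SELECT COUNT(*) FROM pubmed_data").fetchone()[0]
db.close()
print(bad_statement_failed, row_count)
```

It is usually enough to open the connection and create the table once, before the loop, and to close the connection after the last insert.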

Python code not deleting record from database

This is the code I have to delete a record from two tables in my database that share the same ID code and I'm not too sure where I've gone wrong. Anything missing? I've checked this a million times
def deletePhoto(photoID):
    """
    Middleware function to delete a photo post
    """
    #connect to the database
    conn, cursor = getConnectionAndCursor()
    #create sql to delete from the ratings table
    sql = """
    DELETE
    FROM ratings
    WHERE photoID= %s
    """
    #set the parameters
    parameters = (photoID)
    #execute the sql
    cursor.execute(sql, parameters)
    #create sql to delete from the photo table
    sql = """
    DELETE
    FROM photo
    WHERE photoID = %s
    """
    #set the parameters
    parameters = (photoID)
    #execute the sql
    cursor.execute(sql, parameters)
    #fetch the data
    data = cursor.rowcount
    #clean up
    conn.commit()
    cursor.close()
    conn.close()
You might try adding a sleeper after your executes.
It can take some time for the server to process your query.
import time
time.sleep(x)
x in seconds
You need to pass in a sequence for the second argument. Using just parentheses does not create a sequence. To top this off, if photoID is a string, then that is a sequence too, one that consists of individual characters.
To create a tuple, you need to use a comma. Parentheses are optional here:
parameters = photoID,
or
parameters = (photoID,)
If you find it easier to avoid mistakes here, you could also make it a list:
parameters = [photoID]
You only have to do this once.
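A quick way to see the difference is to inspect the values directly. The photo_id value below is an arbitrary example.

```python
photo_id = "42"

not_a_tuple = (photo_id)   # still the string "42"; the parentheses only group
one_tuple = (photo_id,)    # the trailing comma makes a one-element tuple
also_tuple = photo_id,     # the parentheses are optional
one_list = [photo_id]      # a one-element list works as well

print(type(not_a_tuple).__name__, len(not_a_tuple))
print(type(one_tuple).__name__, len(one_tuple))
# Iterating over the bare string yields its characters, not one parameter.
print(list(not_a_tuple))
```

With the bare string, a database driver sees two separate parameters ("4" and "2") for a query that has only one placeholder.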
As a side note, you can use the MySQLdb connection object, as well as the cursor, as context managers:
with connection, cursor:
    ratings_delete = """
        DELETE FROM ratings
        WHERE photoID= %s
    """
    cursor.execute(ratings_delete, (photoID,))
    photo_delete = """
        DELETE FROM photo
        WHERE photoID = %s
    """
    cursor.execute(photo_delete, (photoID,))
The with statement will then take care of closing the cursor and connection for you, and if nothing has gone wrong in the block (no exceptions were raised), will also commit the transaction for you.

How to use my class 'database' in another class?

I'm new to OOP in Python, and have been trying for a while to use my database class, database, within another class.
How can I do so?
class database(object):
    def connect_db(self):
        try:
            import sqlite3 as sqli
            connection = sqli.connect('pw.db')
            cur = connection.cursor()
        except:
            print("There was an error connecting to the database.")
I've been trying it like this, but it doesn't work:
import db_helper as dbh
class account_settings(object):
    def create_account(self):
        setting_username = input('\nUsername?\n')
        setting_password = input('\nPassword?\n')
        cur = db.connect_db()
        with cur:
            cur.execute('''
                CREATE TABLE IF NOT EXISTS my_passwords(
                id INTEGER PRIMARY KEY AUTOINCREMENT NULL,
                password text UNIQUE
                )
            ''')
            try:
                cur.execute('INSERT INTO my_passwords(password) VALUES(?)', (self.create_password(),) )
            except:
                print("Error occurred trying to insert password into db. Please retry")
c = account_settings()
c.create_account()
new error:
  File "settings.py", line 30, in <module>
    c.create_account()
  File "settings.py", line 15, in create_account
    with cur:
AttributeError: __exit__
You need to learn about variable scope. db.connect_db() creates a cursor connection with the name cur, but does not do anything with it; when that method finishes, the object is destroyed. In particular, it never makes it back to the create_account method.
There is a simple way to solve this: return the object back to the method, and use it there.
def connect_db(self):
    ...
    cur = connection.cursor()
    return cur
...
def create_account(self):
    cur = db.connect_db()
    with cur:
or even better:
    with db.connect_db():
Note that really, neither of these should be classes. Classes in Python are really only useful when you're keeping some kind of state, which isn't happening here. connect_db and create_account should just be standalone functions in their respective modules.
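One caveat worth knowing: sqlite3 cursor objects do not support the with statement, which is what an AttributeError: __exit__ on a cursor indicates. The connection object does, and contextlib.closing can wrap the cursor. Below is a sketch of the standalone-function version under those assumptions; the path parameter and the password argument are additions for this example, and an in-memory database is used so it runs without creating pw.db.

```python
import sqlite3
from contextlib import closing


def connect_db(path="pw.db"):
    """Return the connection so the caller can actually use it."""
    return sqlite3.connect(path)


def create_account(connection, password):
    # "with connection" commits on success and rolls back on an exception.
    # closing(...) closes the cursor, which is not a context manager itself.
    with connection, closing(connection.cursor()) as cur:
        cur.execute(
            "CREATE TABLE IF NOT EXISTS my_passwords("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, password TEXT UNIQUE)"
        )
        cur.execute("INSERT INTO my_passwords(password) VALUES (?)", (password,))


connection = connect_db(":memory:")
create_account(connection, "s3cret")
stored = connection.execute("SELECT password FROM my_passwords").fetchall()
connection.close()
print(stored)
```

Note that "with connection" manages the transaction only; it does not close the connection, so close() is still called explicitly at the end.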

Django: Insert new row with 'order' value of the next highest value avoiding race condition

Say I have some models:
from django.db import models
class List(models.Model):
    name = models.CharField(max_length=32)
class ListElement(models.Model):
    lst = models.ForeignKey(List)
    name = models.CharField(max_length=32)
    the_order = models.PositiveSmallIntegerField()
    class Meta:
        unique_together = (("lst", "the_order"),)
and I want to append a new ListElement on to a List with the next-highest the_order value. How do I do this without creating a race condition whereby another ListElement is inserted between looking up the highest the_order and inserting new one?
I have looked into select_for_update() but that won't stop a new INSERT from taking place, just stop the existing elements from being changed. I have also thought about using transactions, but that will simply fail if another thread gets there before us, and I don't want to loop until we succeed.
What I was thinking is along the lines of the following MySQL query
INSERT INTO list_elements (name, lists_id, the_order) VALUES ("another element", 1, (SELECT MAX(the_order)+1 FROM list_elements WHERE lists_id = 1));
however, even this is invalid SQL since you're not able to SELECT from the table you're INSERTing into.
Perhaps there is a way using Django's F() expressions, but I haven't been able to get anything working with it.
AUTO_INCREMENT won't help here either since it's table-wide and not per foreign key.
EDIT:
This SQL does seem to do the trick, however, there doesn't appear to be a way to use the INSERT ... SELECT function from Django's ORM.
INSERT INTO list_elements (name, lists_id, the_order) SELECT "another element", 1, MAX(the_order)+1 FROM list_elements WHERE lists_id = 1;
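For anyone who wants to try the shape of that INSERT ... SELECT statement, here is a small sketch. It runs against an in-memory SQLite database rather than MySQL and bypasses the Django ORM entirely, so it only illustrates the statement itself and says nothing about locking behaviour under concurrent writers. The append_element helper and the COALESCE for the empty-list case are additions for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE list_elements ("
    "name TEXT, lists_id INTEGER, the_order INTEGER, "
    "UNIQUE (lists_id, the_order))"
)


def append_element(db, name, list_id):
    """Insert a row whose the_order is one more than the list's current max."""
    # COALESCE covers the empty-list case, where MAX() returns NULL.
    with db:
        db.execute(
            "INSERT INTO list_elements (name, lists_id, the_order) "
            "SELECT ?, ?, COALESCE(MAX(the_order), 0) + 1 "
            "FROM list_elements WHERE lists_id = ?",
            (name, list_id, list_id),
        )


for name in ("a", "b", "c"):
    append_element(db, name, 1)
append_element(db, "x", 2)

rows = db.execute(
    "SELECT name, lists_id, the_order FROM list_elements "
    "ORDER BY lists_id, the_order"
).fetchall()
print(rows)
```

In Django, a statement like this would have to be sent through a raw cursor (django.db.connection.cursor()), and the unique constraint remains the safety net if two writers still collide.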
For concurrency problems in Django and relational databases, you could use a table lock to make the operation atomic. I came across this problem and found this great code snippet at http://shiningpanda.com/mysql-table-lock-django.html. I'm not sure whether copying and pasting the code directly here would offend anybody, but since SO discourages link-only answers, I will cite it anyway (thanks to ShiningPanda.com for this):
#-*- coding: utf-8 -*-
import contextlib
from django.db import connection
@contextlib.contextmanager
def acquire_table_lock(read, write):
    '''Acquire read & write locks on tables.
    Usage example:
    from polls.models import Poll, Choice
    with acquire_table_lock(read=[Poll], write=[Choice]):
        pass
    '''
    cursor = lock_table(read, write)
    try:
        yield cursor
    finally:
        unlock_table(cursor)
def lock_table(read, write):
    '''Acquire read & write locks on tables.'''
    # MySQL
    if connection.settings_dict['ENGINE'] == 'django.db.backends.mysql':
        # Get the actual table names
        write_tables = [model._meta.db_table for model in write]
        read_tables = [model._meta.db_table for model in read]
        # Statements
        write_statement = ', '.join(['%s WRITE' % table for table in write_tables])
        read_statement = ', '.join(['%s READ' % table for table in read_tables])
        statement = 'LOCK TABLES %s' % ', '.join([write_statement, read_statement])
        # Acquire the lock
        cursor = connection.cursor()
        cursor.execute(statement)
        return cursor
    # Other databases: not supported
    else:
        raise Exception('This backend is not supported: %s' %
                        connection.settings_dict['ENGINE'])
def unlock_table(cursor):
    '''Release all acquired locks.'''
    # MySQL
    if connection.settings_dict['ENGINE'] == 'django.db.backends.mysql':
        cursor.execute("UNLOCK TABLES")
    # Other databases: not supported
    else:
        raise Exception('This backend is not supported: %s' %
                        connection.settings_dict['ENGINE'])
It works with the models declared in your django application, by simply providing two lists: the list of models to lock for read purposes, and the list of models to lock for write purposes. For instance, using django tutorial's models, you would just call the context manager like this:
with acquire_table_lock(read=[models.Poll], write=[models.Choice]):
    # Do something here
    pass
It basically creates a Python context manager that wraps your ORM statements and issues LOCK TABLES and UNLOCK TABLES upon entering and exiting the context.

"IndexError: list index out of range" while charging MySQL DB

I get the following error code while executing my Code. The error does not occur immediately - it occurs randomly after 2-7 hours. Until the error occurs there is no problem to stream the online feeds and write them in a DB.
Error message:
Traceback (most recent call last):
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 78, in <module>
    main()
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 63, in main
    feed_iii = feed_load_iii(feed_url_iii)
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 44, in feed_load_iii
    in feedparser.parse(feed_iii).entries]
IndexError: list index out of range
Here you can find my Code:
import feedparser
import MySQLdb
import time
from cookielib import CookieJar
db = MySQLdb.connect(host="localhost", # your host, usually localhost
                     user="root", # your username - SELECT * FROM mysql.user
                     passwd="****", # your password
                     db="sentimentanalysis_unicode", # name of the data base
                     charset="utf8")
cur = db.cursor()
cur.execute("SET NAMES utf8")
cur.execute("SET CHARACTER SET utf8")
cur.execute("SET character_set_connection=utf8")
cur.execute("DROP TABLE IF EXISTS feeddata_iii")
sql_iii = """CREATE TABLE feeddata_iii(III_ID INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(III_ID),III_UnixTimesstamp integer,III_Timestamp varchar(255),III_Source varchar(255),III_Title varchar(255),III_Text TEXT,III_Link varchar(255),III_Epic varchar(255),III_CommentNr integer,III_Author varchar(255))"""
cur.execute(sql_iii)
def feed_load_iii(feed_iii):
    return [(time.time(),
             entry.published,
             'iii',
             entry.title,
             entry.summary,
             entry.link,
             (entry.link.split('=cotn:')[1]).split('.L&id=')[0],
             (entry.link.split('.L&id=')[1]).split('&display=')[0],
             entry.author)
            for entry
            in feedparser.parse(feed_iii).entries]
def main():
    feed_url_iii = "http://www.iii.co.uk/site_wide_discussions/site_wide_rss2.epl"
    feed_iii = feed_load_iii(feed_url_iii)
    print feed_iii[1][1]
    for item in feed_iii:
        cur.execute("""INSERT INTO feeddata_iii(III_UnixTimesstamp, III_Timestamp, III_Source, III_Title, III_Text, III_Link, III_Epic, III_CommentNr, III_Author) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)""",item)
    db.commit()
if __name__ == "__main__":
    while True:
        main()
        time.sleep(240)
If you need further information - please feel free to ask. I need your help!
Thanks and Regards from London!
In essence, your program is insufficiently resilient to poorly-formatted data.
Your code makes very explicit assumptions about the structure of the data, and is unable to cope if the data is not so structured. You need to detect the cases where the data is incorrectly formatted and take some other action then.
A rather sloppy way to do this would be to simply trap the exception that's currently being raised, which you could do with something like:
try:
    feed_iii = feed_load_iii(feed_url_iii)
except IndexError:
    # do something to report or handle the data format problem
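A less sloppy alternative is to validate each link before taking it apart, so that one malformed entry can be skipped or logged without losing the whole batch. The sketch below assumes the link shape implied by the split() calls in the question (...=cotn:<epic>.L&id=<comment number>&display=...); the parse_link name and the example URLs are made up for illustration.

```python
import re

# Assumed link shape, inferred from the split() calls in the question.
LINK_PATTERN = re.compile(
    r"=cotn:(?P<epic>.+?)\.L&id=(?P<comment_nr>.+?)(?:&display=|$)"
)


def parse_link(link):
    """Return (epic, comment_nr), or None when the link does not match."""
    match = LINK_PATTERN.search(link)
    if match is None:
        return None
    return match.group("epic"), match.group("comment_nr")


good = parse_link(
    "http://example.invalid/view?code=cotn:ABC.L&id=12345&display=discussion"
)
bad = parse_link("http://example.invalid/view?something=else")
print(good, bad)
```

Inside feed_load_iii you could then call parse_link(entry.link) for each entry and skip, or log, the entries for which it returns None.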
