Parsing of XML to MySQL database - Python

In my app, I am including a function for XML parsing. I am trying to read the data from an XML file and save it to a MySQL database.
I coded this with help from Google, but the data is not being saved in the database as required. I can run the app without any error.
Please see my code below.
views.py
def goodsdetails(request):
    path = "{0}shop.xml".format(settings.PROJECT_ROOT)
    xmlDoc = open(path, 'r')
    xmlDocData = xmlDoc.read()
    xmlDocTree = etree.XML(xmlDocData)
    for items in xmlDocTree.iter('item'):
        item_id = items[0].text
        customername = items[1].text
        itemname = items[2].text
        location = items[3].text
        rate = items[4].text
        shop = Shop.objects.create(item_id=item_id, customername=customername,
                                   itemname=itemname, location=location, rate=rate)
        shop.save()  # redundant: create() has already saved the row
    shops = Shop.objects.all()
    context = {'shops': shops}
    return render(request, 'index.html', context)
I am using the above logic to save the data from the XML file into the database. I am not getting any error, but it is not saving into the database.
Expected answers are most welcome.
*Update:* I updated the code, and the XML data does get saved in the DB, but while displaying it I get the following traceback:
IntegrityError at /
(1062, "Duplicate entry '101' for key 'PRIMARY'")
Thanks

Shop.objects.get loads data from the database. You want to create data, which you do by just calling Shop(item_id=item_id, customername=customername, itemname=itemname, location=location, rate=rate) and then shop.save().
If you want to update the data, you need to do something like this:
shop = Shop.objects.get(item_id=item_id)
shop.customername = customername
...etc...
shop.save()
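As a side note, the positional lookups in the question's loop (items[0].text, items[1].text, ...) silently break if the child order in the XML ever changes. Here is a self-contained sketch of the parsing half that looks children up by tag name instead; the tag names are assumptions based on the model fields, not taken from the actual shop.xml:

```python
# Sketch: parse <item> elements by tag name rather than position.
# Tag names are assumed from the Shop model fields in the question.
import xml.etree.ElementTree as ET

SAMPLE = """
<shop>
  <item>
    <item_id>101</item_id>
    <customername>Alice</customername>
    <itemname>Soap</itemname>
    <location>Delhi</location>
    <rate>25</rate>
  </item>
</shop>
"""

def parse_items(xml_text):
    tree = ET.fromstring(xml_text)
    rows = []
    for item in tree.iter('item'):
        # findtext looks the child up by tag, so reordering the XML
        # does not silently swap fields the way items[0].text can.
        rows.append({
            'item_id': item.findtext('item_id'),
            'customername': item.findtext('customername'),
            'itemname': item.findtext('itemname'),
            'location': item.findtext('location'),
            'rate': item.findtext('rate'),
        })
    return rows

print(parse_items(SAMPLE))
```

With Django, each returned dict could then be passed to Shop.objects.update_or_create(item_id=..., defaults=...), which also avoids the duplicate-key IntegrityError when the view runs more than once.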

Related

Update database when CSV file is updated

I have data in a CSV file that I have imported into the DB, and I display the data in my index.html as an HTML table.
The CSV is updated frequently.
Is there any way to update the data in the DB from the CSV file every hour or day, or even every time the file is updated?
PS: As I'm new to Django, what I do now is delete the whole DB, then migrate and import the file again, and I don't think that is a good way to do it.
According to your requirements, you can write a management command and schedule a job with cron every hour or every day.
Here is a link to read more: How to create custom django-admin commands
def dummyfunc(parameter, file_name):
    tests = Test.objects.filter(parameter=parameter)
    if tests.exists():
        test = tests[0]
        test.field = file_name
        test.save()
    else:
        # create() saves the new row immediately
        test = Test.objects.create(field=file_name)
    return test
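A runnable sketch of the sync idea, using sqlite3 here so it works anywhere; in Django the same loop would live in a management command's handle() method and call update_or_create per row. The CSV columns are illustrative, not from the question:

```python
# Sketch: sync a CSV file into a table with an upsert so re-runs are safe.
# sqlite3 stands in for the project database; columns are illustrative.
import csv
import io
import sqlite3

CSV_DATA = "id,name\n1,alpha\n2,beta\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

def sync(conn, csv_text):
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # INSERT OR REPLACE updates an existing id instead of failing,
        # so re-running the sync never hits a duplicate-key error.
        conn.execute(
            "INSERT OR REPLACE INTO items (id, name) VALUES (?, ?)",
            (row["id"], row["name"]),
        )
    conn.commit()

sync(conn, CSV_DATA)
sync(conn, CSV_DATA)  # idempotent: a second run leaves the table unchanged
print(conn.execute("SELECT COUNT(*) FROM items").fetchone()[0])
```

The command could then be scheduled from cron, e.g. with an hourly entry that runs `python manage.py <your_command>` (the command name is whatever you register).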

sqlalchemy, can I get around creating the table classes?

I recently found out about SQLAlchemy in Python. I'd like to use it for data science rather than web applications.
I've been reading about it and I like that you can translate SQL queries into Python.
The main thing I'm confused about is this:
Since I'm reading data from an already well-established schema, I wish I didn't have to create the corresponding models myself.
I am able to get around that by reading the metadata for the tables and then just querying the tables and columns.
The problem is that when I want to join other tables, this metadata reading takes too long each time, so I'm wondering if it makes sense to pickle-cache it in an object, or if there's a built-in method for that.
Edit: Include code.
Noticed that the waiting time was due to an error in the loading function, rather than how to use the engine. Still leaving the code in case people comment something useful. Cheers.
The code I'm using is the following:
def reflect_engine(engine, update):
    store = f'cache/meta_{engine.logging_name}.pkl'
    if update or not os.path.isfile(store):
        meta = alq.MetaData()
        meta.reflect(bind=engine)
        with open(store, "wb") as opened:
            pkl.dump(meta, opened)
    else:
        with open(store, "rb") as opened:  # "rb", not "r": pickle data is bytes
            meta = pkl.load(opened)
    return meta

def begin_session(engine):
    session = alq.orm.sessionmaker(bind=engine)
    return session()
Then I use the metadata object to get my queries...
def get_some_cars(engine, metadata):
    session = begin_session(engine)
    Cars = metadata.tables['Cars']
    Makes = metadata.tables['CarManufacturers']
    cars_cols = [getattr(Cars.c, each_one) for each_one in [
        'car_id',
        'car_selling_status',
        'car_purchased_date',
        'car_purchase_price_car']] + [
        Makes.c.car_manufacturer_name]
    statuses = {
        'selling':  ['AVAILABLE', 'RESERVED'],
        'physical': ['ATOURLOCATION']}
    inventory_conditions = alq.and_(
        Cars.c.purchase_channel == "Inspection",
        Cars.c.car_selling_status.in_(statuses['selling']),
        Cars.c.car_physical_status.in_(statuses['physical']),)
    the_query = (session.query(*cars_cols).
                 join(Makes, Cars.c.car_manufacturer_id == Makes.c.car_manufacturer_id).
                 filter(inventory_conditions).
                 statement)
    the_inventory = pd.read_sql(the_query, engine)
    return the_inventory
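The pickle-cache pattern in reflect_engine can be sketched in isolation, with the expensive reflection step replaced by a stub so the sketch runs without SQLAlchemy; the file name and build function here are illustrative:

```python
# Sketch of the cache-to-pickle pattern: build once, then serve from disk.
# The "expensive" step is stubbed; with SQLAlchemy it would be meta.reflect().
import os
import pickle
import tempfile

def cached(path, build, refresh=False):
    # Rebuild when asked to, or when no cache file exists yet.
    if refresh or not os.path.isfile(path):
        obj = build()
        with open(path, "wb") as fh:   # pickle requires binary mode
            pickle.dump(obj, fh)
        return obj
    with open(path, "rb") as fh:       # "rb", not "r": pickled data is bytes
        return pickle.load(fh)

calls = []
def expensive_build():
    calls.append(1)                    # count how often we really "reflect"
    return {"tables": ["Cars", "CarManufacturers"]}

path = os.path.join(tempfile.mkdtemp(), "meta.pkl")
first = cached(path, expensive_build)   # builds and writes the cache file
second = cached(path, expensive_build)  # served from the pickle, no rebuild
print(len(calls))
```

Note that pickling a real SQLAlchemy MetaData object works but ties the cache file to the library version; refreshing the cache after upgrades is a safe habit.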

Django Querying MongoDB ObjectIds in views from json object

I am currently working on querying MongoDB objects in Python Django and had no trouble creating queries when it's the other attributes I need.
However, I need to modify my queries to filter specifically by ObjectId, returning one object or none.
From my JavaScript I am passing JSON data to my Django views.py; here's how it currently looks:
def update(request):
    # AJAX data
    line = json.loads(request.body)
    _id = line['_id']
    print("OBJECT_ID: %s" % (_id))
    another_id = line['another_id']
    print("ANOTHER_ID: %s" % (another_id))
*Don't be confused by another_id: there are objects that share the same another_id and unfortunately it has to remain like that. That's why I can't query on it for the update, since it would update all the duplicates. This is the reason I need the ObjectId.
For checking here's what it prints out:
{u'$oid': u'582fc95bb7abe7943f1a45b2'}
ANOTHER_ID: LTJ1277
Therefore I appended the query in views.py like this:
try:
    Line.objects(_id=_id).update(set__geometry=geometry, set__properties=properties)
    print("Edited: " + another_id)
except:
    print("Unedited.")
But it didn't return any object.
So I was wondering if the query itself can't recognize the $oid in the JSON body as "_id" : ObjectId("582fc95bb7abe7943f1a45b2")?
*Edit:
from bson.objectid import ObjectId
where I edited my views.py with:
_id = line['_id']
print("VALUES: %s" % (_id.get('$oid')))
try:
    Line.objects(_id=ObjectId(_id.get('$oid'))).update(set__geometry=geometry, set__properties=properties)
Output:
VALUES: 582fc95bb7abe7943f1a498c
No luck. Still not querying/not found.
According to this Using MongoDB with Django reference site:
Notice that to access the unique object ID, you use "id" rather than "_id".
I tried revising the code from:
Line.objects(_id=ObjectId(_id.get('$oid'))).update(set__geometry=geometry, set__properties=properties)
to
Line.objects(id=ObjectId(_id.get('$oid'))).update(set__geometry=geometry, set__properties=properties)
...And it now works fine. Keeping this question for others who might need this.
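For reference, the {u'$oid': u'...'} shape printed above is MongoDB Extended JSON. A stdlib-only sketch of unwrapping it (using the hex string from the question) before handing it to ObjectId:

```python
# Sketch: unwrap the Extended JSON {"$oid": "..."} wrapper from an AJAX body.
# The hex string is the one printed in the question.
import json

body = '{"_id": {"$oid": "582fc95bb7abe7943f1a45b2"}, "another_id": "LTJ1277"}'
line = json.loads(body)

oid_hex = line["_id"]["$oid"]   # unwrap the Extended JSON wrapper
assert len(oid_hex) == 24        # an ObjectId is 12 bytes = 24 hex characters
print(oid_hex)
# With pymongo/mongoengine installed, the query then becomes:
#   from bson.objectid import ObjectId
#   Line.objects(id=ObjectId(oid_hex)).update(...)
```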

Looping SQL query in python

I am writing a Python script which queries the database for a URL string. Below is my snippet.
db.execute('select sitevideobaseurl,videositestring '
           'from site, video '
           'where siteID =1 and site.SiteID=video.VideoSiteID limit 1')
result = db.fetchall()
filename = '/home/Site_info'
output = open(filename, "w")
for row in result:
    videosite = row[0:2]
    link = videosite[0].format(videosite[1])
    full_link = link.replace("http://", "https://")
    print full_link
    output.write("%s\n" % str(full_link))
output.close()
The query basically gives a URL link. It gives me the base URL from one table and the video site string from another table.
output: https://www.youtube.com/watch?v=uqcSJR_7fOc
SiteID is the primary key, which is an int and not in sequence.
I wish to loop this SQL query to pick a new SiteID on every execution so that I get a unique site URL every time, and write all the results to a file.
desired output: https://www.youtube.com/watch?v=uqcSJR_7fOc
https://www.dailymotion.com/video/hdfchsldf0f
There are about 1178 records.
Thanks for your time and help in advance.
I'm not sure I completely understand what you're trying to do. I think your goal is to get a list of all links to videos. You get a link to a video by joining the sitevideobaseurl from site and the videositestring from video.
From my experience it's much easier to let the database do the heavy lifting; it's built for that. It should be more efficient to join the tables, return all the results, and then loop through them, instead of making a separate query to the database for each row.
The code should look something like this: (Be careful, I didn't test this.)
query = """
select s.sitevideobaseurl,
       v.videositestring
from video as v
join site as s
  on s.siteID = v.VideoSiteID
"""
db.execute(query)
result = db.fetchall()
filename = '/home/Site_info'
output = open(filename, "w")
for row in result:
    link = "%s%s" % (row[0], row[1])
    full_link = link.replace("http://", "https://")
    print full_link
    output.write("%s\n" % str(full_link))
output.close()
If you have other reasons for wanting to fetch these one by one, an idea might be to fetch a list of all SiteIDs and store them in a list. Afterwards you loop over that list and insert each id into the query via a parameterized query.
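That one-by-one variant might look like the following sketch, using an in-memory sqlite3 database so it runs anywhere; the table contents are simplified stand-ins for the site/video tables in the question:

```python
# Sketch: fetch all SiteIDs first, then run one parameterized query per id.
# sqlite3 stands in for MySQL; rows mirror the question's desired output.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site  (SiteID INTEGER PRIMARY KEY, base TEXT);
    CREATE TABLE video (VideoSiteID INTEGER, slug TEXT);
    INSERT INTO site  VALUES (1, 'http://www.youtube.com/watch?v={0}'),
                             (7, 'http://www.dailymotion.com/video/{0}');
    INSERT INTO video VALUES (1, 'uqcSJR_7fOc'), (7, 'hdfchsldf0f');
""")

cur = conn.cursor()
cur.execute("SELECT SiteID FROM site")           # fetch the ids first
site_ids = [row[0] for row in cur.fetchall()]

links = []
for site_id in site_ids:
    # The ? placeholder lets the driver escape the value; never build
    # the WHERE clause with string formatting.
    cur.execute("SELECT s.base, v.slug FROM site s "
                "JOIN video v ON v.VideoSiteID = s.SiteID "
                "WHERE s.SiteID = ?", (site_id,))
    base, slug = cur.fetchone()
    links.append(base.format(slug).replace("http://", "https://"))

print(links)
```

With MySQLdb the placeholder syntax is `%s` rather than `?`, but the idea is identical.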

Reports With Django

I'm trying to create a fairly advanced query within Django and I'm having problems doing so. I can use the basic:
for obj in Invoice.objects.filter():
but if I try to move this into a raw PostgreSQL query I get an error telling me that the relation does not exist. Am I doing something wrong? I am following Performing raw SQL queries in the Django documentation, but I keep getting the same error.
Full code:
def csv_report(request):
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    writer = csv.writer(response, csv.excel)
    response.write(u'\ufeff'.encode('utf8'))
    writer.writerow([
        smart_str(u"ID"),
        smart_str(u"value"),
        smart_str(u"workitem content type"),
        smart_str(u"created date"),
        smart_str(u"workitem.id"),
        smart_str(u"workitem"),
        smart_str(u"workitem_content_type"),
    ])
    for obj in Invoice.objects.raw('SELECT * from twm_Invoice'):
        writer.writerow([
            smart_str(obj.pk),
            smart_str(obj.value),
            smart_str(obj.workitem_content_type),
            smart_str(obj.created_date),
            smart_str(obj.workitem_id),
            smart_str(obj.workitem),
            smart_str(obj.workitem_content_type),
        ])
    return response
I have tried using the app name in front of the model name and without it; neither seems to work.
Thanks, J
Try running your raw SQL directly in the database. My guess is that your table name is not correct; Django's default table names are lowercase (e.g. twm_invoice rather than twm_Invoice).
BTW, I hope you have a very good reason for using raw SQL queries and not the awesome ORM ;)
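Independently of the table-name issue, the CSV-writing half of the view can be exercised without Django, since csv.writer only needs a file-like object; here a StringIO stands in for the HttpResponse, and the row values are made up:

```python
# Sketch: csv.writer against a StringIO, mimicking writing into HttpResponse.
# Header names follow the question's view; the data row is illustrative.
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, csv.excel)
writer.writerow(["ID", "value", "workitem content type"])
writer.writerow([1, "42.00", "invoice"])
print(buf.getvalue())
```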
