Don't create duplicate objects. Django, Python

I wrote a management command that is supposed to avoid creating duplicate objects, but it still creates them: when I run the command 3 times, every job is created 3 times. What is wrong with my code?
from django.core.management.base import BaseCommand
from jobs.models import Job
import json
from datetime import datetime
import dateparser
class Command(BaseCommand):
    help = 'Set up the database'
    def handle(self, *args: str, **options: str):
        with open('static/newdata.json', 'r') as handle:
            big_json = json.loads(handle.read())
        for item in big_json:
            if len(item['description']) == 0:
                print('Not created. Description empty')
                continue
            dt = dateparser.parse(item['publication_date'])
            existing_job = Job.objects.filter(
                job_title = item['job_title'],
                company = item['company'],
                company_url = item['company_url'],
                description = item['description'],
                publication_date = dt,
                salary = item['salary'],
                city = item['city'],
                district = item['district'],
                job_url = item['job_url'],
                job_type = item['job_type'],
            )
            if existing_job.exists() is True:
                print('This Job already exist')
            else:
                Job.objects.create(
                    job_title = item['job_title'],
                    company = item['company'],
                    company_url = item['company_url'],
                    description = item['description'],
                    publication_date = dt,
                    salary = item['salary'],
                    city = item['city'],
                    district = item['district'],
                    job_url = item['job_url'],
                    job_type = item['job_type'],
                )
        self.stdout.write(self.style.SUCCESS('added jobs!'))

Have you tried using built-in field validation unique=True?
https://docs.djangoproject.com/en/3.1/ref/models/fields/#unique

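Try
if existing_job.exists():
instead of
if existing_job.exists() is True:
because .exists() already returns a boolean. Note that this is only a style change: both conditions behave the same, so it will not stop the duplicates by itself.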

Have you tried using unique_together without the publication_date field?
# models.py
from django.db import models
class Job(models.Model):
    # Your fields here...
    class Meta:
        unique_together = [[
            'job_title',
            'company',
            'company_url',
            'description',
            'salary',
            'city',
            'district',
            'job_url',
            'job_type'
        ]]
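The idea behind leaving publication_date out of the uniqueness check can be tried without a database. The following is only a sketch: KEY_FIELDS, natural_key and split_new_and_duplicate are hypothetical names, and the items are assumed to be the dictionaries loaded from the JSON file.

```python
# Hypothetical field list: every Job field except publication_date.
KEY_FIELDS = (
    "job_title", "company", "company_url", "description",
    "salary", "city", "district", "job_url", "job_type",
)


def natural_key(item):
    """Build a hashable identity for a scraped item from KEY_FIELDS only."""
    return tuple(item.get(field) for field in KEY_FIELDS)


def split_new_and_duplicate(items, seen=None):
    """Return (new_items, duplicates), comparing items by natural_key."""
    seen = set() if seen is None else set(seen)
    new_items, duplicates = [], []
    for item in items:
        key = natural_key(item)
        if key in seen:
            duplicates.append(item)
        else:
            seen.add(key)
            new_items.append(item)
    return new_items, duplicates
```

Two items that differ only in publication_date produce the same key, so the second one ends up in duplicates.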

from datetime import date
dt = dateparser.parse(item['publication_date'])
new_date = date(dt.year, dt.month, dt.day)
This was the problem. I scrape the publication date in a relative format (e.g. '1 Week Ago') and then convert it to a date and time. When I run the script again, the conversion happens at a different time, so the resulting value is different. The filter then no longer matches the stored job, and the job is created again. Keeping only the date part, as above, fixes it.
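The effect can be reproduced with the standard library alone. This is a sketch under assumptions: parse_weeks_ago is a hypothetical stand-in for the relative-date parsing (it is not dateparser's API), and the two run times are made up.

```python
from datetime import datetime, timedelta


def parse_weeks_ago(weeks, now):
    """Stand-in for a relative-date parser: 'N weeks ago' measured from `now`."""
    return now - timedelta(weeks=weeks)


# Two runs of the same import command, a few minutes apart (made-up times).
first_run = datetime(2021, 3, 10, 9, 15, 2)
second_run = datetime(2021, 3, 10, 9, 21, 47)

dt_first = parse_weeks_ago(1, first_run)
dt_second = parse_weeks_ago(1, second_run)

# The full datetimes differ, so a filter on publication_date finds no match.
datetimes_match = dt_first == dt_second

# The calendar dates agree, so a lookup on the date is stable between runs.
dates_match = dt_first.date() == dt_second.date()
```

Runs on different days would still give different dates for the same '1 Week Ago' text, so truncating to the date only stabilizes the lookup within one day.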

Related

Is there a way to pass data from a python list into particular fields of an sqlite database using a django command?

I have data that has been scraped from a website, parsed, and cleaned into the pieces I need. The data is stored in a two dimensional list as such [[address1, name1, ip1], [address2, name2, ip2]]. The scraping and storing of the aforementioned data is done through a django command, and I would like to update my model with the same command as it validates the data. I also have a model with the following fields and attributes:
class MonitoredData(models.Model):
    external_id = models.UUIDField(
        primary_key = True,
        default = uuid.uuid4,
        editable = False)
    mac_address = models.CharField(max_length=12)
    ipv4_address = models.CharField(max_length=200)
    interface_name = models.CharField(max_length=200)
    created_at = models.DateTimeField(auto_now_add=True)
    update_at = models.DateTimeField(auto_now=True)
address1 needs to go in the mac_address field, name1 needs to go into the interface_name field, and ip1 needs to go into ipv4_address field. The other fields need to auto-fill according to their attributes.
The django command that grabs and parses the data is:
class Command(BaseCommand):
    def handle(self, *args, **options):
        url1 = 'https://next.json-generator.com/api/json/get/41wV8bj_O'
        url2 = 'https://next.json-generator.com/api/json/get/Nk48cbjdO'
        res1 = requests.get(url1)
        data1 = str(res1.content)
        res2 = requests.get(url2)
        data2 = str(res2.content)
        parsedData1 = parse1(data1)
        goodMac1 = []
        badMac1 = []
        for k in parsedData1:
            if len(k[0]) == 12:
                if match(k[0]):
                    goodMac1.append(k)
                else:
                    badMac1.append(k)
        parsedData2 = parse2(data2)
        goodMac2 = []
        badMac2 = []
        for j in parsedData2:
            if len(j[0]) == 12:
                if match(j[0]):
                    goodMac2.append(j)
                else:
                    badMac2.append(j)
I'd like to store the data into the database instead of appending to the goodMac list in the nested if statement.
Any help with this would be greatly appreciated, I am using Python 3.7.5 and Django 3.0.5
I figured it out! I hope this saves someone the time and trouble I went through. The solution, as I suspected, was fairly trivial once I found it: import your model, instantiate an object of it, set its fields, and call save(). Here is the fixed code.
import requests
from django.core.management.base import BaseCommand, CommandError
from Monitor.models import *
from Monitor.parse1 import parse1
from Monitor.parse2 import parse2
from Monitor.matcher import match
class Command(BaseCommand):
    def handle(self, *args, **options):
        url1 = 'https://next.json-generator.com/api/json/get/41wV8bj_O'
        url2 = 'https://next.json-generator.com/api/json/get/Nk48cbjdO'
        res1 = requests.get(url1)
        data1 = str(res1.content)
        res2 = requests.get(url2)
        data2 = str(res2.content)
        parsedData1 = parse1(data1)
        goodMac1 = []
        badMac1 = []
        for k in parsedData1:
            if len(k[0]) == 12:
                if match(k[0]):
                    monInter = MonitoredData()
                    monInter.mac_address = k[0]
                    monInter.interface_name = k[1]
                    monInter.ipv4_address = k[2]
                    monInter.save()
                    goodMac1.append(k)
                else:
                    badMac1.append(k)
        parsedData2 = parse2(data2)
        goodMac2 = []
        badMac2 = []
        for j in parsedData2:
            if len(j[0]) == 12:
                if match(j[0]):
                    goodMac2.append(j)
                else:
                    badMac2.append(j)
Here are links to the documentation I ultimately ended up using:
https://docs.djangoproject.com/en/3.0/ref/models/instances/#django.db.models.Model.save
https://docs.djangoproject.com/en/3.0/topics/db/models/
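The mapping from list positions to model fields can also be separated from the saving step. The sketch below makes some assumptions: is_valid_mac is a hypothetical replacement for the match helper (which is not shown in the question), and a valid MAC is taken to be 12 hexadecimal characters without separators.

```python
import re

# Assumption: a valid MAC here is 12 hexadecimal characters with no separators.
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{12}")


def is_valid_mac(value):
    """Hypothetical stand-in for the question's match() helper."""
    return MAC_PATTERN.fullmatch(value) is not None


def rows_to_field_values(rows):
    """Split [address, name, ip] rows into model keyword dicts and rejects."""
    good, bad = [], []
    for address, name, ip in rows:
        if is_valid_mac(address):
            good.append({
                "mac_address": address,
                "interface_name": name,
                "ipv4_address": ip,
            })
        else:
            bad.append([address, name, ip])
    return good, bad
```

Inside the command, each dictionary in good could then be passed to MonitoredData.objects.create(**fields), which leaves external_id, created_at and update_at to fill themselves in.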

Django import export edit queryset before export

I'm trying to calculate the pending amount via models and export the result to CSV, but the CSV shows an empty column for AmountPending.
class FinancePendingResource(resources.ModelResource):
    invoiceNumber = Field(attribute='invoiceNumber', column_name='Invoice Number')
    student = Field(attribute='student', column_name='Student')
    Schedule = Field(attribute='Schedule', column_name='Schedule')
    TotalAmount = Field(attribute='TotalAmount', column_name='Total Value(PKR ₨)')
    issueDate = Field(attribute='issueDate', column_name='Issue Date')
    dueDate = Field(attribute='dueDate', column_name='Due Date')
    amountPaid = Field(attribute='amountPaid', column_name='Amount Paid (PKR ₨)')
    class Meta:
        model = FinancePending
        import_id_fields = ('invoiceNumber',)
        fields = ('invoiceNumber', 'student', 'amountPaid', 'issueDate', 'dueDate', 'Schedule', 'TotalAmount',
                  'AmountPending',)
        exclude = ('id',)
        skip_unchanged = True
        report_skipped = True
    def before_export(self, queryset, *args, **kwargs):
        amount_paid = FinancePending.objects.values_list('amountPaid', flat=True)
        amount_paid = list(amount_paid)
        total_amount = FinancePending.objects.values_list('TotalAmount', flat=True)
        total_amount = list(total_amount)
        # total - paid
        TotalFee = [float(s.replace(',', '')) for s in total_amount]
        AmountPaid = [float(s.replace(',', '')) for s in amount_paid]
        def Diff(li1, li2):
            return (list(set(li1) - set(li2)))
        amount_pending = Diff(TotalFee, AmountPaid)
        finance_pending = FinancePending()
        i = 1
        while i <= len(amount_pending):
            FinancePending.objects.filter(invoiceNumber=i).update(AmountPending=str(amount_pending[i]))
            i = i + 1
        queryset.refresh_from_db()
Assuming that you already have the data to compute AmountPending in the dataset, perhaps you don't need to read from the DB: you could calculate the amount by processing the dataset in memory. This could be done in after_export(). Then you can add the computed column to the dataset.
Perhaps tablib's dynamic columns can assist in adding the amountPending column:
import decimal
import tablib
headers = ('invoiceNumber', 'amountPaid', 'totalAmount')
rows = [
    ('inv100', '100.00', "500.00"),
    ('inv101', '200.00', "250.00")
]
def amount_pending(row):
    return decimal.Decimal(row[2]) - decimal.Decimal(row[1])
data = tablib.Dataset(*rows, headers=headers)
data.append_col(amount_pending, header="amountPending")
print(data)
This will produce the following:
invoiceNumber|amountPaid|totalAmount|amountPending
-------------|----------|-----------|-------------
inv100 |100.00 |500.00 |400.00
inv101 |200.00 |250.00 |50.00
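The amounts in the question are stored as strings with thousands separators, and the Diff helper there takes a set difference, which drops values instead of subtracting row by row. A per-invoice calculation might look like the sketch below; to_decimal, pending_amount and the sample invoices are made up for illustration.

```python
from decimal import Decimal


def to_decimal(amount):
    """Parse an amount string such as '1,500.00' into a Decimal."""
    return Decimal(amount.replace(",", ""))


def pending_amount(total, paid):
    """Amount still owed on one invoice: total minus paid."""
    return to_decimal(total) - to_decimal(paid)


# Made-up invoices: (invoiceNumber, TotalAmount, amountPaid)
invoices = [
    ("inv100", "1,500.00", "500.00"),
    ("inv101", "250.00", "250.00"),
]
pending = {number: pending_amount(total, paid) for number, total, paid in invoices}
```

Decimal is used instead of float so that currency values are not affected by binary rounding.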

How to iterate over Query List of django model

I am reading a json file from a website and if the record is not in my Customers queryset I want to create a new Customer for that record. What is happening is when I iterate over the queryset, Django is trying to create a new Customer even when it is already in the queryset.
Please see my code below:
from rest_framework import generics
from customer.models import Customers
from .serializers import CustomersSerializer
import json
import urllib.request
class CustomerAPIView(generics.ListAPIView):
    j = urllib.request.urlopen("https://web.njit.edu/~jsd38/json/customer.json")
    customer_data = json.load(j)
    queryset1 = Customers.objects.values_list('CustomerId', flat=True)
    for customer in customer_data:
        if customer["#Id"] not in queryset1.iterator():
            CustomerId = customer["#Id"]
            Name = customer["Name"]
            PhoneNumber = customer["PhoneNumber"]
            EmailAddress = customer["EmailAddress"]
            StreetLine = customer["Address"]["StreetLine1"]
            City = customer["Address"]["City"]
            StateCode = customer["Address"]["StateCode"]
            PostalCode = customer["Address"]["PostalCode"]
            cus = Customers()
            cus.CustomerId = CustomerId
            cus.Name = Name
            cus.PhoneNumber = PhoneNumber
            cus.EmailAddress = EmailAddress
            cus.StreetLine = StreetLine
            cus.City = City
            cus.StateCode = StateCode
            cus.PostalCode = PostalCode
            cus.save()
    queryset = Customers.objects.all()
    serializer_class = CustomersSerializer
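Your JSON returns strings for the "#Id" key, and I'm assuming the CustomerId field on your Customers model is an integer. A string never compares equal to an integer, so the membership test always fails.
Convert one side so that both have the same type, for example:
if int(customer["#Id"]) not in queryset1:
    ...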
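The type mismatch is easy to see with plain Python values. In this sketch existing_ids stands in for the values_list result and the incoming records are made up.

```python
existing_ids = {1, 2, 3}                   # stand-in for the stored CustomerId values
incoming = [{"#Id": "2"}, {"#Id": "7"}]    # made-up JSON records with string ids

# A string never equals an integer, so the unconverted check
# treats every record as new, including the one that already exists.
naive_new = [c["#Id"] for c in incoming if c["#Id"] not in existing_ids]

# Converting first makes the membership test compare like with like.
actual_new = [int(c["#Id"]) for c in incoming if int(c["#Id"]) not in existing_ids]
```

Here naive_new contains both ids, while actual_new contains only 7.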

Create error message datefield

I want to create an error message for following form:
class ExaminationCreateForm(forms.ModelForm):
    class Meta:
        model = Examination
        fields = ['patient', 'number_of_examination', 'date_of_examination']
Models:
class Patient(models.Model):
    patientID = models.CharField(max_length=200, unique=True, help_text='Insert PatientID')
    birth_date = models.DateField(auto_now=False, auto_now_add=False, help_text='YYYY-MM-DD')
    gender = models.CharField(max_length=200,choices=Gender_Choice, default='UNDEFINED')
class Examination(models.Model):
    number_of_examination = models.IntegerField(choices=EXA_Choices)
    patient = models.ForeignKey(Patient, on_delete=models.CASCADE)
    date_of_examination = models.DateField(auto_now=False, auto_now_add=False, help_text='YYYY-MM-DD')
Every Patient has 2 Examinations (number of examination = Choices 1 or 2) and the error message should be activated when the date of the second examination < date of the first examination. Something like this:
Solution:
def clean_date_of_examination(self):
    new_exam = self.cleaned_data.get('date_of_examination')
    try:
        old_exam = Examination.objects.get(patient=self.cleaned_data.get('patient'))
    except Examination.DoesNotExist:
        return new_exam
    if old_exam:
        if old_exam.date_of_examination > new_exam:
            raise forms.ValidationError("Second examination should take place after first examination")
    return new_exam
def clean_date_of_examination(self):
    new_exam = self.cleaned_data.get('date_of_examination')
    old_exam = Examination.objects.get(patient = self.cleaned_data.get('Patient'))
    if old_exam:
        if old_exam.date_of_examination > new_exam.date_of_examination:
            raise forms.ValidationError("Second examination should take place after first examination")
    return data
def clean_date_of_examination(self):
    # Where is 'data' used? Return the cleaned value instead.
    date_of_exam = self.cleaned_data['date_of_examination']
    patient = self.cleaned_data.get('patient')
    try:
        # The dates live on Examination, not on Patient, so query Examination.
        first_exam = Examination.objects.get(patient=patient, number_of_examination=1)
    except Examination.DoesNotExist:
        # No first examination for this patient yet, so there is nothing to compare against.
        return date_of_exam
    if date_of_exam < first_exam.date_of_examination:
        raise forms.ValidationError("Second examination should take place after first examination")
    return date_of_exam
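The comparison itself can be checked separately from the form. This is a minimal sketch: check_examination_order is a hypothetical helper, and it raises ValueError where a form's clean method would raise forms.ValidationError.

```python
from datetime import date


def check_examination_order(first_exam_date, second_exam_date):
    """Return second_exam_date, or raise ValueError if it predates the first exam.

    first_exam_date may be None when no first examination exists yet.
    """
    if first_exam_date is not None and second_exam_date < first_exam_date:
        raise ValueError("Second examination should take place after first examination")
    return second_exam_date
```

Passing None for first_exam_date corresponds to the DoesNotExist branch: with nothing to compare against, the new date is accepted.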

Queryset from a ManyToMany relation

I'm creating a little calendar app in Django. I have two model classes; Calendar and Event. An event can be in multiple calendars. Because of this I'm using a ManyToMany relation.
This is my model
from django.db import models
class Calendar(models.Model):
    title = models.CharField(max_length = 255)
    def __unicode__(self):
        return self.title
class Event(models.Model):
    title = models.CharField(max_length = 255)
    start_date = models.DateField()
    end_date = models.DateField(blank = True, null = True)
    location = models.CharField(blank = True, max_length = 255)
    description = models.TextField(blank = True)
    important = models.BooleanField(default = False)
    calendar = models.ManyToManyField(Calendar)
How can I get a queryset with all events from a specific calendar?
You would use the .event_set attribute on an instance of a Calendar record. Like this:
# create two calendars
one = models.Calendar.objects.create(title='calendar one')
two = models.Calendar.objects.create(title='calendar two')
# attach event 1 to both calendars
event = models.Event.objects.create(title='event 1', start_date='2011-11-11')
one.event_set.add(event)
two.event_set.add(event)
# attach event 2 to calendar 2
two.event_set.add(models.Event.objects.create(title='event 2', start_date='2011-11-11'))
# get and print all events from calendar one
events_one = models.Calendar.objects.get(title='calendar one').event_set.all()
print([event.title for event in events_one])
# will print: ['event 1']
# get and print all events from calendar two
events_two = models.Calendar.objects.get(title='calendar two').event_set.all()
print([event.title for event in events_two])
# will print: ['event 1', 'event 2']
models.Calendar.objects.get(title='two').event_set.all()
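Django automatically provides a way to access the related objects in a ManyToMany relationship. With the models above, where the field does not set related_name, the reverse accessor is event_set:
events = my_calendar.event_set.all()
If you declare the field with related_name='events', you can write my_calendar.events.all() instead.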
See the docs on many-to-many relationships.
If you don't already have a calendar instance, but just an ID or name, you can do the whole thing in one query:
events = Event.objects.filter(calendar__id=my_id)
mycalendar = Calendar.objects.get(id=1)
events = mycalendar.event_set.all()
Taken and modified from: http://docs.djangoproject.com/en/dev/topics/db/queries/#many-to-many-relationships
