How to get values from different DB tables faster? - python

Our DB was designed very badly: there are no foreign keys linking the tables.
I need to fetch the complete information and export it to CSV. The challenge is that the information has to be queried from multiple tables (for example, the user table only stores a sectionid; to get the section details I have to query the section table and match it against the sectionid taken from the user table).
So I did this using a serializer, because there are multiple such fields.
The problem with my current method is that it is very slow, because for each object in the queryset it has to run extra queries to match it against the other tables by uuid/userid/any other id.
This is my view:
class FileDownloaderSerializer(APIView):
    def get(self, request, **kwargs):
        filename = "All-users.csv"
        f = open(filename, 'w')
        datas = Userstable.objects.using(dbname).all()
        serializer = UserSerializer(datas, context={'sector': sector}, many=True)
        df = pd.DataFrame(serializer.data)  # serializer.data is a list of dicts, not a DataFrame
        df.to_csv(f, index=False, header=False)
        f.close()
        wrapper = FileWrapper(open(filename))
        response = HttpResponse(wrapper, content_type='text/csv')
        response['Content-Length'] = os.path.getsize(filename)
        response['Content-Disposition'] = "attachment; filename=%s" % filename
        return response
Note that I need a single exported file, which is a .csv.
This is my serializer:
class UserSerializer(serializers.ModelSerializer):
    section = serializers.SerializerMethodField()
    department = serializers.SerializerMethodField()

    class Meta:
        model = Userstable
        fields = '__all__'

    def get_section(self, obj):
        return section.objects.using(dbname).get(pk=obj.sectionid).sectionname

    def get_department(self, obj):
        return section.objects.using(dbname).get(pk=obj.deptid).deptname
I'm showing only two tables here, but in my code I have a total of 5 different tables.
I tried limiting it to 100 rows and that worked, but fetching 300,000 rows took 3 hours to download the CSV, which is certainly not efficient. How can I solve this?
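A minimal sketch of one common way to cut the per-row lookups is to fetch each lookup table once and pass plain dictionaries to the serializer through its context (the department model name and the 'sections'/'departments' context keys below are illustrative, not taken from the original code):

class UserSerializer(serializers.ModelSerializer):
    section = serializers.SerializerMethodField()
    department = serializers.SerializerMethodField()

    class Meta:
        model = Userstable
        fields = '__all__'

    def get_section(self, obj):
        # In-memory dict lookup instead of one DB query per row.
        sec = self.context['sections'].get(obj.sectionid)
        return sec.sectionname if sec else None

    def get_department(self, obj):
        # 'department' is a stand-in for whichever model holds deptname.
        dept = self.context['departments'].get(obj.deptid)
        return dept.deptname if dept else None


# In the view: one query per lookup table, then serialize.
users = Userstable.objects.using(dbname).all()
sections = {s.pk: s for s in section.objects.using(dbname).all()}
departments = {d.pk: d for d in department.objects.using(dbname).all()}
serializer = UserSerializer(
    users, many=True,
    context={'sections': sections, 'departments': departments},
)
df = pd.DataFrame(serializer.data)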

Related

Save excel file from main directory to database model in Django

How do I save a file generated from a pd.DataFrame to a certain database record?
This is the view:
@csrf_exempt
def Data_Communication(request):
    if request.method == 'POST':
        data_sets_number = len(request.POST) - 1
        Data_Sets_asNestedList = []
        Data_set_id = request.POST.get('id')
        Data_instance = Data_Sets.objects.get(pk=Data_set_id)
        # Collect the POSTed lists Data1, Data2, ... DataN.
        for i in range(1, data_sets_number + 1):
            Data_Sets_asNestedList.append(request.POST.getlist('Data' + str(i)))
        pd.DataFrame(Data_Sets_asNestedList).to_excel('output.xlsx', header=False, index=False)
        print(Data_Sets_asNestedList)
        return HttpResponse('1')
If you're looking to associate the generated Excel file with the model Data_Sets, then you'd probably want to add a FileField to that model:
class Data_Sets(models.Model):
    excel_file = models.FileField()
Once you've created the Excel file in your view, you can then associate it with the new field:
from django.core.files import File

@csrf_exempt
def Data_Communication(request):
    if request.method == 'POST':
        data_sets_number = len(request.POST) - 1
        Data_Sets_asNestedList = []
        Data_set_id = request.POST.get('id')
        Data_instance = Data_Sets.objects.get(pk=Data_set_id)
        for i in range(1, data_sets_number + 1):
            Data_Sets_asNestedList.append(request.POST.getlist('Data' + str(i)))
        pd.DataFrame(Data_Sets_asNestedList).to_excel('output.xlsx', header=False, index=False)
        # Associate the Excel file with the model
        with open('output.xlsx', 'rb') as excel:
            Data_instance.excel_file.save('output.xlsx', File(excel))
        print(Data_Sets_asNestedList)
        return HttpResponse('1')
The excel file itself will be saved into the folder specified by the MEDIA_ROOT setting in your settings.py, and the model will point to that file via the excel_file attribute.
Note that you may want to generate a unique filename for output.xlsx to avoid requests from treading on each other.
Additional info on saving a file can be found here.
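As a rough sketch of that unique-filename suggestion (uuid4 here is just one convenient option, not something the original answer prescribes):

import uuid

# Build a per-request filename so concurrent requests don't overwrite each other.
filename = 'output-%s.xlsx' % uuid.uuid4().hex
pd.DataFrame(Data_Sets_asNestedList).to_excel(filename, header=False, index=False)
with open(filename, 'rb') as excel:
    Data_instance.excel_file.save(filename, File(excel))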
Don't insert your data into the database blindly; use Django's validation system to validate the data first.
Check the bulk_create API for storing large chunks of records; a sketch follows below.
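A minimal sketch of bulk_create, assuming a made-up DataRow model (the model and its fields are only for illustration; adapt them to your schema):

# Hypothetical model used only to illustrate bulk_create.
class DataRow(models.Model):
    data_set = models.ForeignKey(Data_Sets, on_delete=models.CASCADE)
    value = models.CharField(max_length=255)

# Build unsaved instances in memory, then insert them in batched queries
# instead of one INSERT per row.
rows = [
    DataRow(data_set=Data_instance, value=value)
    for values in Data_Sets_asNestedList
    for value in values
]
DataRow.objects.bulk_create(rows, batch_size=1000)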

Searching in multiple models' tables in Django Rest Framework

I have 3 tables
PC(ID, PcNAME, Brand)
CellPhoness(ID, CellPhoneName, Brand)
Printers(ID, PrinterName, Brand).
There are no relationships between the 3 tables. I would like to run a query where the user inputs a search string and the program searches the 3 models for matching records, returning the ID, name, and brand as a JSON response.
You can do something like this:
Get query text from query params
Filter based on it
Return serializer data
from django.http import JsonResponse

def view(request):
    query = request.GET.get("query", None)
    pcs = PC.objects.all()
    cell_phones = CellPhone.objects.all()
    printers = Printer.objects.all()
    if query:
        # Adjust 'name' to your actual field names (PcNAME, CellPhoneName, PrinterName).
        pcs = pcs.filter(name__icontains=query)
        cell_phones = cell_phones.filter(name__icontains=query)
        printers = printers.filter(name__icontains=query)
    return JsonResponse({
        "pcs": PCSerializer(pcs, many=True).data,
        "cell_phones": CellPhoneSerializer(cell_phones, many=True).data,
        "printers": PrinterSerializer(printers, many=True).data,
    })
You'll need to create a serializer for each model; please have a look at the serializer documentation. A minimal example is sketched below.
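For instance, a ModelSerializer for PC might look like this (a sketch assuming the field names listed in the question; CellPhoneSerializer and PrinterSerializer follow the same pattern):

from rest_framework import serializers

class PCSerializer(serializers.ModelSerializer):
    class Meta:
        model = PC
        # Field names assumed from the schema in the question.
        fields = ['id', 'PcNAME', 'Brand']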

Django REST framework without model

I want to use Django REST framework to create an API that calls different methods. I read the django-rest-framework guide, but I still have some questions.
I have no model; I get my data from an external database. I want to try something simple first:
Get the list of all projects
Get the data for one project
For that I created a new app, included it in the settings file, and in my views.py I have this for the first case:
def connect_database():
    db = MySQLdb.connect(host='...', port=..., user='...', passwd='...', db='...')
    try:
        cursor = db.cursor()
        cursor.execute('SELECT * FROM proj_cpus')
        columns = [column[0] for column in cursor.description]
        # all_rows = cursor.fetchall()
        all_rows = []
        for row in iter_row(cursor):
            all_rows.append(dict(zip(columns, row)))
    finally:
        db.close()
    return all_rows


def iter_row(cursor, size=1000):
    while True:
        results = cursor.fetchmany(size)
        if not results:
            break
        for item_result in results:
            yield item_result
class cpuProjectsViewSet(viewsets.ViewSet):
    serializer_class = serializers.cpuProjectsSerializer

    def list(self, request):
        all_rows = connect_database()
        name_project = []
        for item_row in all_rows:
            name_project.append(item_row['project'])
        name_project = list(sorted(set(name_project)))
        serializer = serializers.cpuProjectsSerializer(instance=name_project, many=False)
        return Response(serializer.data)
In my serializers file I have this:
class cpuProjectsSerializer(serializers.Serializer):
    project = serializers.CharField(max_length=256)

    def update(self, instance, validated_data):
        instance.project = validated_data.get('project', instance.project)
        return instance
Now when I request http://127.0.0.1:8000/hpcAPI
I get this error:
Got AttributeError when attempting to get a value for field `project` on serializer `cpuProjectsSerializer`.
The serializer field might be named incorrectly and not match any attribute or key on the `list` instance.
Original exception text was: 'list' object has no attribute 'project'.
I searched on Google and changed
serializers.cpuProjectsSerializer(instance=name_project, many=False)
to
serializers.cpuProjectsListSerializer(instance=name_project, many=False)
but I get the same error.
Any idea what is wrong? Thanks in advance.
From the docs here: you don't have to have a model to create a Serializer class; you can define serializer fields and use them directly. Also, you should not both import CPUProjectsViewSet and define it again below the import:
from mysite.hpcAPI.serializers import CPUProjectsViewSet

class CPUProjectsViewSet(viewsets.ViewSet):
    """
    return all project names
    """
    all_rows = connect_database()
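The AttributeError itself comes from handing the serializer a plain list of strings: with many=False it tries to read a project attribute off the list object. One way around this, sketched here under the assumption that the cpuProjectsSerializer above is kept as-is, is to wrap each name in a dict keyed by project and serialize with many=True:

def list(self, request):
    all_rows = connect_database()
    # Unique, sorted project names, each wrapped as a dict so the
    # serializer can find a 'project' key on every item.
    names = sorted({row['project'] for row in all_rows})
    data = [{'project': name} for name in names]
    serializer = serializers.cpuProjectsSerializer(instance=data, many=True)
    return Response(serializer.data)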

How do I make a Django database query that has multiple filters?

I have a database of artists and paintings, and I want to query based on artist name and painting title. The titles are in a json file (the artist name comes from ajax) so I tried a loop.
def rest(request):
    data = json.loads(request.body)
    artistname = data['artiste']
    with open('/static/top_paintings.json', 'r') as fb:
        top_paintings_dict = json.load(fb)
    response_data = []
    for painting in top_paintings_dict[artistname]:
        filterargs = {'artist__contains': artistname, 'title__contains': painting}
        response_data.append(serializers.serialize('json', Art.objects.filter(**filterargs)))
    return HttpResponse(json.dumps(response_data), content_type="application/json")
It does not return a list of objects like I need, just some ugly double-serialized json data that does no good for anyone.
["[{\"fields\": {\"artist\": \"Leonardo da Vinci\", \"link\": \"https://trove2.storage.googleapis.com/leonardo-da-vinci/the-madonna-of-the-carnation.jpg\", \"title\": \"The Madonna of the Carnation\"}, \"model\": \"serve.art\", \"pk\": 63091}]",
This handler works and returns every painting I have for an artist.
def rest(request):
    data = json.loads(request.body)
    artistname = data['artiste']
    response_data = serializers.serialize("json", Art.objects.filter(artist__contains=artistname))
    return HttpResponse(json.dumps(response_data), content_type="application/json")
I just need to filter my query by title as well as by artist.
Your problem is that you are serializing the data to JSON twice: once with serializers.serialize and then once more with json.dumps.
I don't know the specifics of your application, but you can chain filters in Django. So I would go with your second approach and just replace the line
response_data = serializers.serialize("json", Art.objects.filter(artist__contains=artistname))
with
response_data = serializers.serialize("json", Art.objects.filter(artist__contains=artistname).filter(title__in=paintings))
where paintings is the list of titles for that artist from your JSON file.
Check the queryset documentation.
The most efficient way to do this for a __contains search on painting title would be to use Q objects to or together all your possible painting names:
from functools import reduce
from operator import or_

from django.db.models import Q

def rest(request):
    data = json.loads(request.body)
    artist_name = data['artiste']
    with open('/static/top_paintings.json', 'r') as fb:
        top_paintings_dict = json.load(fb)
    title_filters = reduce(or_, (Q(title__contains=painting) for painting in top_paintings_dict[artist_name]))
    paintings = Art.objects.filter(title_filters, artist__contains=artist_name)
That'll get you a queryset of paintings. I suspect your double serialization is not correct, but it seems you're happy with it in the single artist name case so I'll leave that up to you.
The reduce call here is a way to build up the result of |ing together multiple Q objects - operator.or_ is a functional handle for |, and then I'm using a generator expression to create a Q object for each painting name.
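For example (a small illustration only; the two titles are made up), the reduce builds exactly what you would get by writing the | operator out by hand:

from functools import reduce
from operator import or_

from django.db.models import Q

titles = ['Mona Lisa', 'The Last Supper']  # hypothetical painting titles
combined = reduce(or_, (Q(title__contains=t) for t in titles))
# Equivalent to: Q(title__contains='Mona Lisa') | Q(title__contains='The Last Supper')
matches = Art.objects.filter(combined, artist__contains='Leonardo da Vinci')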

How would I write a CSV file populated with my sqlite3 db?

I'm a little confused about how I would populate the following CSV function with the information in my models.py for a given user. Can anyone point me in the right direction? Do I need to process the information in a separate .py file, or can I do it in my views?
My view to download the info:
def download(request):
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename=UserData.csv'
    writer = csv.writer(response)
    writer.writerow(['Date', 'HighBGL', 'LowBGL', 'Diet', 'Weight', 'Height', 'Etc'])
    writer.writerow(['Info pertaining to date 1'])
    writer.writerow(['info pertaining to date 2'])
    return response
One of the models whose info I'm interested in saving:
class DailyVital(models.Model):
    user = models.ForeignKey(User)
    entered_at = models.DateTimeField()
    high_BGL = models.IntegerField()
    low_BGL = models.IntegerField()
    height = models.IntegerField(blank=True, null=True)
    weight = models.IntegerField(blank=True, null=True)
First you need to query your django model, something like: DailyVital.objects.all() or DailyVital.objects.filter(user=request.user)
Then you can either transform the objects manually into tuples, or you can use Django QuerySet's values_list method with a list of field names to return tuples instead of objects. Something like:
def download(request):
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename=UserData.csv'
    writer = csv.writer(response)
    writer.writerow(['Date', 'HighBGL', 'LowBGL', 'Weight', 'Height'])
    query = DailyVital.objects.filter(user=request.user)
    for row in query.values_list('entered_at', 'high_BGL', 'low_BGL', 'weight', 'height'):
        writer.writerow(row)
    return response
If you didn't need it in Django, you might also consider the sqlite3 command line program's -csv option.
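The line above points at the sqlite3 command-line tool; for completeness, here is a hedged sketch of doing the same export from plain Python with the sqlite3 and csv modules (the database path and the myapp_dailyvital table name are guesses at Django's defaults, not taken from the question):

import csv
import sqlite3

# Assumed paths/names; adjust to your project.
conn = sqlite3.connect('db.sqlite3')
cursor = conn.execute(
    'SELECT entered_at, high_BGL, low_BGL, weight, height FROM myapp_dailyvital'
)
with open('UserData.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Date', 'HighBGL', 'LowBGL', 'Weight', 'Height'])
    writer.writerows(cursor)
conn.close()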
An easy way to do this would be to convert your models into a list of lists.
First you need an object to list function:
def object2list(obj, attr_list):
    """Return values (or None) for the object's attributes in attr_list."""
    return [getattr(obj, attr, None) for attr in attr_list]
Then you just pass that to the csvwriter with a list comprehension (given some list_of_objects that you've queried)
attr_list = ['date', 'high_BGL', 'low_BGL', 'diet', 'weight', 'height']
writer.writerows([object2list(obj, attr_list) for obj in list_of_objects])
