How would I write a CSV file populated with my sqlite3 db? - python

I'm a little confused about how to populate the following CSV function with the information in my models.py for a given user. Can anyone point me in the right direction? Do I need to process the information in a separate .py file, or can I do it in my views?
My view to download the info
def download(request):
    response = HttpResponse(mimetype='text/csv')
    response['Content-Disposition'] = 'attachment; filename=UserData.csv'
    writer = csv.writer(response)
    writer.writerow(['Date', 'HighBGL', 'LowBGL', 'Diet', 'Weight', 'Height', 'Etc'])
    writer.writerow(['Info pertaining to date 1'])
    writer.writerow(['info pertaining to date 2'])
    return response
One of the models whose info I'm interested in saving
class DailyVital(models.Model):
    user = models.ForeignKey(User)
    entered_at = models.DateTimeField()
    high_BGL = models.IntegerField()
    low_BGL = models.IntegerField()
    height = models.IntegerField(blank = True, null = True)
    weight = models.IntegerField(blank = True, null = True)

First you need to query your Django model, with something like DailyVital.objects.all() or DailyVital.objects.filter(user=request.user).
Then you can either transform the objects manually into tuples, or you can use Django QuerySet's values_list method with a list of field names to return tuples instead of objects. Something like:
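import csv

from django.http import HttpResponse

from .models import DailyVital  # assumes the view lives next to models.py

def download(request):
    # content_type replaces the mimetype argument, which was removed in Django 1.7
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename=UserData.csv'
    writer = csv.writer(response)
    writer.writerow(['Date', 'HighBGL', 'LowBGL', 'Weight', 'Height'])
    query = DailyVital.objects.filter(user=request.user)
    # values_list returns one tuple per row, in the order of the field names given
    for row in query.values_list('entered_at', 'high_BGL', 'low_BGL', 'weight', 'height'):
        writer.writerow(row)
    return response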
If you didn't need it in Django, you might also consider the sqlite3 command line program's -csv option.
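Outside Django, the same kind of export can be sketched with Python's standard sqlite3 and csv modules. The table name daily_vital and its columns below are hypothetical stand-ins for whatever table Django created, and an in-memory database is used so the example runs on its own:

```python
import csv
import io
import sqlite3


def dump_query_to_csv(conn, query, out):
    """Write the result of `query` to `out` as CSV, header row first."""
    cursor = conn.execute(query)
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


# Hypothetical table, built in memory so the sketch is self-contained
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_vital (entered_at TEXT, high_bgl INTEGER, low_bgl INTEGER)")
conn.execute("INSERT INTO daily_vital VALUES ('2012-01-01', 140, 80)")

buffer = io.StringIO()
dump_query_to_csv(conn, "SELECT entered_at, high_bgl, low_bgl FROM daily_vital", buffer)
csv_text = buffer.getvalue()
```

To write a real file instead, pass a file opened with open(path, "w", newline="") in place of the StringIO buffer.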

An easy way to do this would be to convert your models into a list of lists.
First you need an object to list function:
def object2list(obj, attr_list):
    "returns values (or None) for the object's attributes in attr_list"
    return [getattr(obj, attr, None) for attr in attr_list]
Then you just pass that to the csv writer with a list comprehension (given some list_of_objects that you've queried):
# names must match the model's attributes; unknown names silently yield None
attr_list = ['entered_at', 'high_BGL', 'low_BGL', 'diet', 'weight', 'height']
writer.writerows([object2list(obj, attr_list) for obj in list_of_objects])
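As a minimal, self-contained sketch of the approach above, the following uses SimpleNamespace objects as hypothetical stand-ins for queried model instances and writes to an in-memory buffer rather than an HTTP response:

```python
import csv
import io
from types import SimpleNamespace


def object2list(obj, attr_list):
    """Return the value (or None) of each attribute named in attr_list."""
    return [getattr(obj, attr, None) for attr in attr_list]


# Hypothetical records standing in for queried model instances
records = [
    SimpleNamespace(high_BGL=140, low_BGL=80, weight=70),
    SimpleNamespace(high_BGL=150, low_BGL=85),  # no weight recorded
]
attr_list = ["high_BGL", "low_BGL", "weight"]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(attr_list)
writer.writerows(object2list(rec, attr_list) for rec in records)
csv_text = buffer.getvalue()
```

A missing attribute comes back as None, which the csv module writes as an empty field.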

Related

Xlwt Excel Export Foreign Key By Actual Values / Django

I export products in Excel format using xlwt, but foreign key fields are exported as IDs.
How can I export foreign key fields with their actual values?
I want to export brand_id and author fields with their actual values.
Here is my product model :
class Product(models.Model):
    id = models.AutoField(primary_key=True)
    author = models.ForeignKey(User,on_delete= models.CASCADE, verbose_name='Product Author', null=True)
    brand_id = models.ForeignKey(Brand,on_delete=models.CASCADE, verbose_name="Brand Names")
    name = models.CharField(max_length=255, verbose_name="Product Name")
    barcode = models.CharField(max_length=255, verbose_name="Barcode")
    unit = models.CharField(max_length=255,verbose_name="Product Unit")

    def __str__(self):
        return self.name
Here is my export view:
def export_excel(request):
    response = HttpResponse(content_type='application/ms-excel')
    response['Content-Disposition'] = "attachment; filename=Products-" + str(datetime.datetime.now().date())+".xls"
    wb = xlwt.Workbook(encoding="utf-8")
    ws = wb.add_sheet('Products')
    row_num = 0
    # bold style for the header row
    font_style = xlwt.XFStyle()
    font_style.font.bold = True
    columns = ["Product Id","Product Author","Product Brand","Product Name","Product Barcode","Product Unit"]
    for col_num in range(len(columns)):
        ws.write(row_num,col_num,columns[col_num],font_style)
    # default style for the data rows
    font_style = xlwt.XFStyle()
    rows = Product.objects.filter(author = request.user).values_list("id","author","brand_id","name","barcode","unit")
    for row in rows:
        row_num +=1
        for col_num in range(len(row)):
            ws.write(row_num,col_num,str(row[col_num]), font_style)
    wb.save(response)
    return response  # the view must return the response it built
Thanks for your help. Kind regards
You could use django-import-export to export the data from a model to an Excel file. This library also supports other formats (CSV, JSON, and so on) in case you need them in the future.
As described in the documentation of django-import-export you can create a resource, which can then be used to both import and export data into a model. Start by creating a resource:
from import_export import resources
from import_export.fields import Field
from .models import Product

class ProductResource(resources.ModelResource):
    author = Field()    # for fields with foreign keys you need to add them here
    brand_id = Field()  # for fields with foreign keys you need to add them here

    class Meta:
        # field options of a ModelResource belong in its inner Meta class
        model = Product
        fields = ["id", "author", "brand_id", "name", "barcode", "unit"]
        export_order = ["id", "author", "brand_id", "name", "barcode", "unit"]

    def dehydrate_author(self, product: Product) -> str:
        return f"{product.author.name}"  # probably need to adapt the name of the field

    def dehydrate_brand_id(self, product: Product) -> str:
        return f"{product.brand_id.brand}"  # probably need to adapt the name of the field
This is also documented here: django-import-export advanced manipulation
Now you can use this ModelResource to export your data to any supported format, in your case an Excel file. Import the resource you created earlier; all you need to do to return it from your view is the following:
from django.http import HttpResponse
from .resource import ProductResource
#... other code in your view
product_resource = ProductResource()
dataset = product_resource.export()
response = HttpResponse(dataset.xlsx, content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
response["Content-Disposition"] = 'attachment; filename="products_export.xlsx"'

How to get values from different tables from DB faster?

Our DB was designed very badly: no foreign keys are used to link the tables.
I need to fetch complete information and export it to CSV. The challenge is that the information has to be queried from multiple tables (for example, usertable only stores sectionid, so to get the section details I have to query the section table and match it against the sectionid taken from usertable).
I did this using a serializer, because there are multiple fields.
The problem with my current method is that it is very slow, because for each object in the queryset it has to query the other tables to match on uuid/userid/some other id.
This is my view:
class FileDownloaderSerializer(APIView):
    def get(self, request, **kwargs):
        filename = "All-users.csv"
        f = open(filename, 'w')
        datas = Userstable.objects.using(dbname).all()
        serializer = UserSerializer(datas, context={'sector': sector}, many=True)
        # serializer.data is a list of dicts and has no to_csv(); wrap it in a
        # DataFrame first (assumes pandas is imported as pd)
        df = pd.DataFrame(serializer.data)
        df.to_csv(f, index=False, header=False)
        f.close()
        wrapper = FileWrapper(open(filename))
        response = HttpResponse(wrapper, content_type='text/csv')
        response['Content-Length'] = os.path.getsize(filename)
        response['Content-Disposition'] = "attachment; filename=%s" % filename
        return response
Notice that I need a single exported file, a .csv.
This is my serializer:
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = Userstable
        fields = '__all__'

    section = serializers.SerializerMethodField()

    def get_section(self, obj):
        # one extra query per serialized row
        return section.objects.using(dbname).get(pk=obj.sectionid).sectionname

    department = serializers.SerializerMethodField()

    def get_department(self, obj):
        # one extra query per serialized row
        return section.objects.using(dbname).get(pk=obj.deptid).deptname
I'm showing only two tables here, but in my code I have a total of 5 different tables.
I tried limiting the export to 100 rows and it worked; I tried to fetch 300000 and it took 3 hours to download the CSV, which is certainly not efficient. How can I solve this?
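One common way to avoid a query per row, sketched here with plain Python data rather than the real tables, is to load each lookup table once into a dict keyed by id and then resolve names in memory while writing rows. The table contents and field names below are hypothetical; in Django the dict could be built with something like dict(Section.objects.values_list('pk', 'sectionname')), assuming a Section model with those fields.

```python
import csv
import io

# Hypothetical rows standing in for the user and section tables
users = [
    {"userid": 1, "name": "alice", "sectionid": 10},
    {"userid": 2, "name": "bob", "sectionid": 20},
    {"userid": 3, "name": "carol", "sectionid": 99},  # no matching section
]
sections = [
    {"id": 10, "sectionname": "Finance"},
    {"id": 20, "sectionname": "Engineering"},
]

# One pass over the lookup table instead of one query per user
section_names = {s["id"]: s["sectionname"] for s in sections}


def user_rows(users, section_names):
    """Yield CSV rows with the section name resolved from the in-memory dict."""
    for user in users:
        yield [user["userid"], user["name"], section_names.get(user["sectionid"], "")]


buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["userid", "name", "section"])
writer.writerows(user_rows(users, section_names))
csv_text = buffer.getvalue()
```

With five lookup tables, this pattern would need roughly one query per table up front plus one for the users, regardless of how many rows are exported.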

Adding rows manually to StreamingHttpResponse (Django)

I am using Django's StreamingHttpResponse to stream a large CSV file on the fly. According to the docs, an iterator is passed to the response's streaming_content parameter:
import csv
from django.http import StreamingHttpResponse

def get_headers():
    return ['field1', 'field2', 'field3']

def get_data(item):
    return {
        'field1': item.field1,
        'field2': item.field2,
        'field3': item.field3,
    }

# StreamingHttpResponse requires a File-like class that has a 'write' method
class Echo(object):
    def write(self, value):
        return value

def get_response(queryset):
    writer = csv.DictWriter(Echo(), fieldnames=get_headers())
    writer.writeheader() # this line does not work
    response = StreamingHttpResponse(
        # the iterator
        streaming_content=(writer.writerow(get_data(item)) for item in queryset),
        content_type='text/csv',
    )
    response['Content-Disposition'] = 'attachment;filename=items.csv'
    return response
My question is: how can I manually write a row with the CSV writer? Manually calling writer.writerow(data) or writer.writeheader() (which also calls writerow() internally) does not seem to write to the output, and only the data streamed from streaming_content ends up in the downloaded file.
The answer is to yield results from a generator function instead of computing them inline (within StreamingHttpResponse's streaming_content argument), and to use the pseudo buffer we created (the Echo class) to write a row to the response:
import csv
from django.http import StreamingHttpResponse

def get_headers():
    return ['field1', 'field2', 'field3']

def get_data(item):
    return {
        'field1': item.field1,
        'field2': item.field2,
        'field3': item.field3,
    }

# StreamingHttpResponse requires a File-like class that has a 'write' method
class Echo(object):
    def write(self, value):
        return value

def iter_items(items, pseudo_buffer):
    writer = csv.DictWriter(pseudo_buffer, fieldnames=get_headers())
    yield pseudo_buffer.write(get_headers())
    for item in items:
        yield writer.writerow(get_data(item))

def get_response(queryset):
    response = StreamingHttpResponse(
        streaming_content=(iter_items(queryset, Echo())),
        content_type='text/csv',
    )
    response['Content-Disposition'] = 'attachment;filename=items.csv'
    return response
The proposed solution can actually lead to incorrect or mismatched CSVs (a header that does not match the data), because pseudo_buffer.write(get_headers()) yields the raw list of field names rather than a formatted CSV line. You'd want to replace that line with something like:
header = dict(zip(fieldnames, fieldnames))
yield writer.writerow(header)
This is taken from the implementation of writeheader: https://github.com/python/cpython/blob/08045391a7aa87d4fbd3e8ef4c852c2fa4e81a8a/Lib/csv.py#L141:L143
Calling writeheader() directly does not work with yield on older Python versions, because it discards the value returned by writerow().
Hope this helps someone in the future :)
Also note that no fix is needed on Python 3.8+, where writeheader() returns that value; see https://bugs.python.org/issue27497
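The header fix discussed above can be tried without Django. The sketch below keeps the Echo pseudo buffer and the dict(zip(...)) header row; the iter_csv name and the sample row are made up for illustration:

```python
import csv


class Echo:
    """File-like object whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


def iter_csv(rows, fieldnames):
    """Yield one formatted CSV line per row, preceded by a header line."""
    writer = csv.DictWriter(Echo(), fieldnames=fieldnames)
    # Writing a {name: name} row produces the header line on any Python 3 version
    yield writer.writerow(dict(zip(fieldnames, fieldnames)))
    for row in rows:
        yield writer.writerow(row)


# Hypothetical data in place of a queryset
rows = [{"field1": 1, "field2": "a", "field3": "x,y"}]
chunks = list(iter_csv(rows, ["field1", "field2", "field3"]))
```

In a view, the generator returned by iter_csv(...) would be passed as streaming_content to StreamingHttpResponse.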
You can chain generators using itertools to put a header row in front of the queryset rows.
Here is how you do it:
import csv
import itertools

from django.http import StreamingHttpResponse

# Echo is the pseudo-buffer class shown earlier (its write() returns the value)

def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Build the header and data rows lazily, then chain them into one stream.
    headers = [["title 1", "title 2"], ]
    row_titles = (header for header in headers) # title generator
    items = Item.objects.all()
    rows = (["Row {}".format(item.pk), str(item.pk)] for item in items)
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    rows = itertools.chain(row_titles, rows) # merge 2 generators
    return StreamingHttpResponse(
        (writer.writerow(row) for row in rows),
        content_type="text/csv",
        headers={'Content-Disposition': 'attachment; filename="somefilename.csv"'},
    )
and you will get a CSV with the title row followed by the queryset rows (shown here for items with primary keys 1 and 2):
title 1,title 2
Row 1,1
Row 2,2
...
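The chaining step can also be checked in isolation. This is a minimal sketch with a hypothetical list of primary keys in place of Item.objects.all() and a stream_csv helper name invented for the example:

```python
import csv
import itertools


class Echo:
    """File-like object whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


def stream_csv(header, rows):
    """Lazily yield formatted CSV lines: the header first, then each data row."""
    writer = csv.writer(Echo())
    for row in itertools.chain([header], rows):
        yield writer.writerow(row)


# Hypothetical primary keys standing in for Item.objects.all()
pks = [1, 2]
data_rows = (["Row {}".format(pk), str(pk)] for pk in pks)
lines = list(stream_csv(["title 1", "title 2"], data_rows))
```

Because both inputs are consumed lazily, no more than one row needs to be held in memory at a time.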

Conversion of SQL to Django

Below are my django models
class SourceFile(models.Model):
    full_path = models.TextField(unique = False)
    project_name = models.TextField(blank = True)

    def __str__(self):
        return self.full_path

class Coverage(models.Model):
    line_pct = models.IntegerField(default = 0, blank = True)
    source_file = models.ForeignKey(SourceFile, related_name = 'coverage', null = True)
    date_generated = models.DateTimeField(default = timezone.now, blank = True)

    def source_file_full_path(self):
        return self.source_file.full_path
Now I want the count of distinct source file ids present in the coverage table, grouped by project_name.
I wrote a SQL query for this but have been unable to write the Django equivalent.
select count(distinct(sf.id)), sf.project_name from coverage c inner join sourcefile sf on c.source_file_id = sf.id group by sf.project_name;
Please help with this.
You should take a look at this:
https://docs.djangoproject.com/en/1.11/topics/db/queries/#related-objects
You can use raw SQL in Django, but it is much easier to use the Django model manager.
I'm sorry I can't provide more specific help but I am not entirely sure what exactly you need.
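For reference, the SQL from the question can be checked against a small in-memory SQLite database, as sketched below. The table and column names are simplified stand-ins for the ones Django generates, and the sample rows are made up. The comment shows an ORM expression that should produce the same grouping, assuming the models above; it is an untested assumption, not a verified translation.

```python
import sqlite3

# A Django ORM expression that should be equivalent (assumption, not run here):
#   from django.db.models import Count
#   (Coverage.objects
#        .filter(source_file__isnull=False)
#        .values('source_file__project_name')
#        .annotate(file_count=Count('source_file', distinct=True))
#        .order_by())

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sourcefile (id INTEGER PRIMARY KEY, full_path TEXT, project_name TEXT);
    CREATE TABLE coverage (id INTEGER PRIMARY KEY, line_pct INTEGER, source_file_id INTEGER);
    INSERT INTO sourcefile VALUES
        (1, 'a.py', 'alpha'), (2, 'b.py', 'alpha'), (3, 'c.py', 'beta'), (4, 'd.py', 'beta');
    -- file 1 has two coverage rows, file 4 has none
    INSERT INTO coverage VALUES (1, 90, 1), (2, 95, 1), (3, 50, 2), (4, 70, 3);
""")

query = """
    SELECT COUNT(DISTINCT sf.id), sf.project_name
    FROM coverage c
    INNER JOIN sourcefile sf ON c.source_file_id = sf.id
    GROUP BY sf.project_name
"""
counts = conn.execute(query).fetchall()
```

Each result row is a (count, project_name) pair; files without any coverage row are not counted because of the inner join.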

How to convert results of Model queries into xlsx using pandas in python?

I have a standard collection of models in my project, like this one:
class Dimension(models.Model):
    dimension_id = MyCharField(max_length=1024, primary_key=True)
    name = MyCharField(max_length=255, null = False, unique=True)
    external_flg = models.BooleanField(default = False)
    ext_owner = MyCharField(max_length=30, null = True)
    ext_table_name = MyCharField(max_length=30, null = True)
    ext_start_date_column_name = MyCharField(max_length=30, null = True)
    ext_end_date_column_name = MyCharField(max_length=30, null = True)
    ext_id_column_name = MyCharField(max_length=30, null = True)
    ext_name_column_name = MyCharField(max_length=30, null = True)
    ext_where_codition = MyCharField(max_length=512, null = True)

    def save(self):
        cnt =self.__class__.objects.filter(name=self.name).count()
        if cnt==0:
            if self.pk:
                super(Dimension, self).save()
            else:
                self.dimension_id = getUid()
                super(Dimension, self).save()
        else:
            raise DimensionUniqueError(self.name)
At the moment, I have to implement a button that exports data from our models to xlsx files and downloads them automatically on the client side.
We're planning to use pandas for the SQL-to-xlsx conversion, but I can't work out how to make pandas interact with the models. For now I have implemented it this way:
import pandas as ps

class Excel:
    def __init__(self, model_name):
        self.model_name = model_name

    def sql_to_xlsx():
        elements = self.model_name.objects.all()
        filter = self.request.GET.get('filter', None)
        if filter is not None:
            elements = elements.filter(filter_field=filter)
        columns = [desc[0] for desc in elements]
        data = [desc[1:] for desc in elements]
        df = ps.DataFrame(list(data), columns)
        writer = ps.ExcelWriter('converted.xlsx')
        df.to_excel(writer, sheet_name='converted')
        return writer.save()

    def get(self, request, *args, **kwargs):
        document = self.sql_to_xlsx()
        _file = open(document.filename, 'r')
        response = HttpResponse(_file, content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename=%s' % document.filename.split('/')[-1] # Here will return a full path, that's why probably you will need a split to get only the filename
        add_never_cache_headers(response=response) # To avoid download the same file with out of date data.
        return response
but this is not the logic I expected to end up with, and I have a feeling it's not the correct way to do it.
Could you please help me work out how to implement the needed logic for our models?
Thank you!
I don't think you need pandas to write to Excel.
I would use openpyxl for writing to Excel. Just following the example on their page should be good enough for you. You just have to loop through your model instances and use openpyxl to write to the specific rows and columns that you want.
