How to send data into a Word document with Django? - python

I'm using Django and I want to send some data from my database into a Word document. I'm using python-docx to create Word documents. I have a class ExportDocx that can generate a static Word file, but I want to place some dynamic data into the document (e.g. product id=5, name=".."), basically all the details of the "product".
class ExportDocx(APIView):
    def get(self, request, *args, **kwargs):
        queryset = Products.objects.all()
        # create an empty document object
        document = Document()
        document = self.build_document()
        # save document info
        buffer = io.BytesIO()
        document.save(buffer)  # save your memory stream
        buffer.seek(0)  # rewind the stream
        # put them to streaming content response
        # within docx content_type
        response = StreamingHttpResponse(
            streaming_content=buffer,  # use the stream's content
            content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
        )
        response['Content-Disposition'] = 'attachment;filename=Test.docx'
        response["Content-Encoding"] = 'UTF-8'
        return response
    def build_document(self, *args, **kwargs):
        document = Document()
        sections = document.sections
        for section in sections:
            section.top_margin = Inches(0.95)
            section.bottom_margin = Inches(0.95)
            section.left_margin = Inches(0.79)
            section.right_margin = Inches(0.79)

        # add a heading
        document.add_heading("This is a header")
        # add a paragraph
        document.add_paragraph("This is a normal style paragraph")
        # add a paragraph with an italic run, then a line break
        paragraph = document.add_paragraph()
        run = paragraph.add_run()
        run.italic = True
        run.add_text("text will have italic style")
        run.add_break()
        return document
This is the urls.py entry:
path('<int:pk>/testt/', ExportDocx.as_view() , name='generate-testt'),
How can I generate it? I think I need to turn the data into strings so it can work with python-docx.
Here is the python-docx documentation: http://python-docx.readthedocs.io/

For a product record like record = {"product_id": 5, "name": "Foobar"}, you can add it to the document in your build_document() method like:
document.add_paragraph(
    "Product id: %d, Product name: %s"
    % (record["product_id"], record["name"])
)
There are other, more modern methods for interpolating strings, although this printf-style formatting works just fine for most cases. This resource is maybe not a bad place to start.
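For instance, an f-string (Python 3.6+) expresses the same thing a little more readably; the record dict here is just the hypothetical example from above:
record = {"product_id": 5, "name": "Foobar"}
document.add_paragraph(
    f"Product id: {record['product_id']}, Product name: {record['name']}"
)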

So I found out that I needed to pass the model instance. I was already doing that in another version of the code but forgot to add it here. Basically, I just had to add these lines of code; hope this helps whoever is reading this.
def get(self, request, pk, *args, **kwargs):
    # create an empty document object
    document = Document()
    product = Product.objects.get(id=pk)
    document = self.build_document(product)
And inside build_document() we just need to turn each field into a string, simply by using f'{queryset.xxxx}':
def build_document(self, queryset):
    document = Document()
    document.add_heading(f'{queryset.first_name}')
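For completeness, a minimal sketch of what the full method could look like; the field names used here (name, price, description) are only placeholders for whatever your Product model actually defines:
def build_document(self, product):
    document = Document()
    # heading with the product name (placeholder field name)
    document.add_heading(f'{product.name}')
    # one paragraph per detail you want in the document
    document.add_paragraph(f'Product id: {product.id}')
    document.add_paragraph(f'Price: {product.price}')
    document.add_paragraph(f'Description: {product.description}')
    return document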

Related

Google Translate API - detect language + translate document (xlsx, csv)

I'm trying to use the Google Cloud Translation API to translate an Excel (or CSV) document that includes text in multiple languages, and my target language is English.
I would like to use the "Translate text in batches (Advanced edition only)" code sample (link here: https://cloud.google.com/translate/docs/samples/translate-v3-batch-translate-text), but that sample has a line that defines the source language, so there can only be one source language.
However, I need to detect the language in the document first and then translate the text to English. There is a code sample for detecting the language of a simple text string, "Detecting languages (Advanced)" (link: https://cloud.google.com/translate/docs/advanced/detecting-language-v3), but I need to combine the first code sample that translates documents (but only has one source language defined) with the ability to detect the language instead of defining a single source language.
Is there this type of code sample in the resources? How could this be solved?
Here is the sample code in question:
from google.cloud import translate


def batch_translate_text(
    input_uri="gs://YOUR_BUCKET_ID/path/to/your/file.txt",
    output_uri="gs://YOUR_BUCKET_ID/path/to/save/results/",
    project_id="YOUR_PROJECT_ID",
    timeout=180,
):
    """Translates a batch of texts on GCS and stores the result in a GCS location."""
    client = translate.TranslationServiceClient()
    location = "us-central1"
    # Supported file types: https://cloud.google.com/translate/docs/supported-formats
    gcs_source = {"input_uri": input_uri}
    input_configs_element = {
        "gcs_source": gcs_source,
        "mime_type": "text/plain",  # Can be "text/plain" or "text/html".
    }
    gcs_destination = {"output_uri_prefix": output_uri}
    output_config = {"gcs_destination": gcs_destination}
    parent = f"projects/{project_id}/locations/{location}"
    # Supported language codes: https://cloud.google.com/translate/docs/language
    operation = client.batch_translate_text(
        request={
            "parent": parent,
            "source_language_code": "en",
            "target_language_codes": ["ja"],  # Up to 10 language codes here.
            "input_configs": [input_configs_element],
            "output_config": output_config,
        }
    )

    print("Waiting for operation to complete...")
    response = operation.result(timeout)

    print("Total Characters: {}".format(response.total_characters))
    print("Translated Characters: {}".format(response.translated_characters))
Unfortunately it is not possible to pass an array of values to the field source_language_code using batchTranslateText. What I could suggest is to perform detectLanguage and translateText per file.
What the code below does is:
1. Extract the content to be translated. For testing purposes the CSV files used only have one column; the content of sample1.csv is in tl (Tagalog) and sample2.csv is in es (Spanish).
2. Pass the extracted content to detect_language() to get the detected language code.
3. Pass all the required parameters to translate_text() to translate.
NOTE: The code below is only tested using CSV files with one column. Edit the code in main() to match the columns you would like to extract data from.
from google.cloud import translate
import csv


def listToString(s):
    """Transform list to string."""
    str1 = " "
    return (str1.join(s))


def detect_language(project_id, content):
    """Detecting the language of a text string."""
    client = translate.TranslationServiceClient()
    location = "global"
    parent = f"projects/{project_id}/locations/{location}"

    response = client.detect_language(
        content=content,
        parent=parent,
        mime_type="text/plain",  # mime types: text/plain, text/html
    )

    for language in response.languages:
        return language.language_code


def translate_text(text, project_id, source_lang):
    """Translating Text."""
    client = translate.TranslationServiceClient()
    location = "global"
    parent = f"projects/{project_id}/locations/{location}"

    # Detail on supported types can be found here:
    # https://cloud.google.com/translate/docs/supported-formats
    response = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",  # mime types: text/plain, text/html
            "source_language_code": source_lang,
            "target_language_code": "en-US",
        }
    )

    # Display the translation for each input text provided
    for translation in response.translations:
        print("Translated text: {}".format(translation.translated_text))


def main():
    project_id = "your-project-id"
    csv_files = ["sample1.csv", "sample2.csv"]
    # Perform your content extraction here if you have a different file format #
    for csv_file in csv_files:
        csv_file = open(csv_file)
        read_csv = csv.reader(csv_file)
        content_csv = []
        for row in read_csv:
            content_csv.extend(row)
        content = listToString(content_csv)  # convert list to string
        detect = detect_language(project_id=project_id, content=content)
        translate_text(text=content, project_id=project_id, source_lang=detect)


if __name__ == "__main__":
    main()
sample1.csv:
kamusta
ayos
sample2.csv:
cómo estás
okey
Output using the code above:
Translated text: how are you okay
Translated text: how are you ok

Running multiple queries on YouTube API by looping through title columns of CSV - python

I am using the YouTube API to get comment data from a list of music videos. The way I have it working right now is by manually typing in my query, writing the data to a CSV file, and repeating for each song, like so:
query = "song title"
query_results = service.search().list(
    part='snippet',
    q=query,
    order='relevance',  # You can consider using viewCount
    maxResults=20,
    type='video',  # Channels might appear in search results
    relevanceLanguage='en',
    safeSearch='moderate',
).execute()
What I would like to do is use the title and artist columns from a CSV file I have containing the song titles I am gathering data for, so I can run the program once without having to manually type in each song.
A friend suggested using something like this
import pandas as pd

data = pd.read_csv("metadata.csv")

def songtitle():
    for i in data.index:
        title = data.loc[i, 'title']
        title = '\"' + title + '\"'
        artist = data.loc[i, 'artist']
    return(artist, title)
But I am not sure how I would make this work, because when I run this it only returns the final row of data, and even if it did run correctly, I'm not sure how I would get the entire program to repeat itself for every new song.
You can save the song titles and artists to lists, then loop over the titles to run the search for each one.
def get_songTitles():
    data = pd.read_csv("metadata.csv")
    return data['artist'].tolist(), data['title'].tolist()

artist, song_titles = get_songTitles()

for song in song_titles:
    query_results = service.search().list(
        part='snippet',
        q=song,
        order='relevance',  # You can consider using viewCount
        maxResults=20,
        type='video',  # Channels might appear in search results
        relevanceLanguage='en',
        safeSearch='moderate',
    ).execute()
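If you also want the artist in each query (the question mentions both columns), one option is to zip the two lists together; combining artist and title into one search string, as below, is just an assumption about how you might want to phrase the query:
artists, song_titles = get_songTitles()

for artist, song in zip(artists, song_titles):
    query_results = service.search().list(
        part='snippet',
        q=f'{artist} {song}',  # combine artist and title into one search query
        order='relevance',
        maxResults=20,
        type='video',
        relevanceLanguage='en',
        safeSearch='moderate',
    ).execute()
    # process or save query_results here before moving to the next song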

reportlab: how to set initial/default view?

I've managed to successfully generate an empty PDF, but it doesn't set the initial view. I'd like to set the initial view to "full view", i.e. the end user sees one page fitting the PDF reader (an A4 page fits the screen).
def render_to_response(self, context, **response_kwargs):
    response = HttpResponse(content_type='application/pdf; charset=utf-8')
    response['Content-Disposition'] = 'attachment; filename=""'
    p = canvas.Canvas(response, pagesize=A4, )
    p.showPage()
    p.save()
    return response
How to set the default zoom view (if it's possible) with reportlab?
In short
Add this line:
p.setViewerPreference("FitWindow", "true")
Explanation
Set a viewer preference with:
def setViewerPreference(self, pref, value):
    """set one of the allowed entries in the document's viewer preferences"""
The available pref names and value types are:
class ViewerPreferencesPDFDictionary(CheckedPDFDictionary):
    validate = dict(
        HideToolbar=checkPDFBoolean,
        HideMenubar=checkPDFBoolean,
        HideWindowUI=checkPDFBoolean,
        FitWindow=checkPDFBoolean,
        CenterWindow=checkPDFBoolean,
        DisplayDocTitle=checkPDFBoolean,  # contributed by mark Erbaugh
        NonFullScreenPageMode=checkPDFNames(*'UseNone UseOutlines UseThumbs UseOC'.split()),
        Direction=checkPDFNames(*'L2R R2L'.split()),
        ViewArea=checkPDFNames(*'MediaBox CropBox BleedBox TrimBox ArtBox'.split()),
        ViewClip=checkPDFNames(*'MediaBox CropBox BleedBox TrimBox ArtBox'.split()),
        PrintArea=checkPDFNames(*'MediaBox CropBox BleedBox TrimBox ArtBox'.split()),
        PrintClip=checkPDFNames(*'MediaBox CropBox BleedBox TrimBox ArtBox'.split()),
        PrintScaling=checkPDFNames(*'None AppDefault'.split()),
    )
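Applied to the view from the question, a minimal sketch (assuming the same HttpResponse, canvas, and A4 imports as above) would be:
def render_to_response(self, context, **response_kwargs):
    response = HttpResponse(content_type='application/pdf; charset=utf-8')
    response['Content-Disposition'] = 'attachment; filename=""'
    p = canvas.Canvas(response, pagesize=A4)
    # ask PDF viewers to fit the whole page in the window on open
    p.setViewerPreference("FitWindow", "true")
    p.showPage()
    p.save()
    return response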
Reference
ReportLab API Reference
ViewerPreferencesPDFDictionary

Extract OFS Image Pdata object and save it as Namedfile on Dexterity

I have an Archetypes content type that has a field called file, which is a MultiFileField (from archetypes.multifile.MultiFileField). The schema is something like:
MultiFileField('file',
    primary=True,
    languageIndependent=True,
    widget=MultiFileWidget(
        label="File Uploads",
        show_content_type=False,
    ),
)
And I have a Dexterity content type with a field of the same name, file, and I want to create a script that extracts the stored uploaded objects from the Archetypes content and passes them on to the Dexterity custom content type. The schema for the Dexterity custom content type is:
form.widget(file=MultiFileFieldWidget)
file = schema.List(
    title=_(u"File Attachment"),
    required=False,
    value_type=NamedFile(),
)
I observed that Archetypes' MultiFileField stores the uploaded object as OFS Image Pdata, while the Dexterity field stores it as a plone.namedfile.file.NamedFile object. Is there a way to convert the OFS object into a NamedFile object?
Update:
I have found a solution but I am not sure if it's the right thing.
for field in prev_obj.Schema().fields():
    key = field.getName()
    objects_list = []
    value = field.getRaw(prev_obj)
    for f in value:
        data = str(f['file'].data)
        filename = unicode(f['filename'])
        contentType = f['content_type']
        fileData = NamedFile(data=data, contentType=contentType, filename=filename)
        objects_list.append(fileData)

new_obj.file = copy.copy(objects_list)
First off, you may want to use NamedBlobFile instead.
Then, have you tried something like this to convert the data?
from plone.namedfile.file import NamedBlobFile
new_obj.file = [
    NamedBlobFile(str(fdata), contentType=fdata.content_type, filename=fdata.filename)
    for fdata in previous_obj.getFile()
]
Assuming you have both previous_obj and new_obj available.

How do I make a Django database query that has multiple filters?

I have a database of artists and paintings, and I want to query based on artist name and painting title. The titles are in a json file (the artist name comes from ajax) so I tried a loop.
def rest(request):
    data = json.loads(request.body)
    artistname = data['artiste']
    with open('/static/top_paintings.json', 'r') as fb:
        top_paintings_dict = json.load(fb)

    response_data = []
    for painting in top_paintings_dict[artistname]:
        filterargs = {'artist__contains': artistname, 'title__contains': painting}
        response_data.append(serializers.serialize('json', Art.objects.filter(**filterargs)))

    return HttpResponse(json.dumps(response_data), content_type="application/json")
It does not return a list of objects like I need, just some ugly double-serialized json data that does no good for anyone.
["[{\"fields\": {\"artist\": \"Leonardo da Vinci\", \"link\": \"https://trove2.storage.googleapis.com/leonardo-da-vinci/the-madonna-of-the-carnation.jpg\", \"title\": \"The Madonna of the Carnation\"}, \"model\": \"serve.art\", \"pk\": 63091}]",
This handler works and returns every painting I have for an artist.
def rest(request):
data = json.loads(request.body)
artistname = data['artiste']
response_data = serializers.serialize("json", Art.objects.filter(artist__contains=artistname))
return HttpResponse(json.dumps(response_data), content_type="application/json")
I just need to filter my query by title as well as by artist.
Your problem is that you are serializing the data to JSON twice: once with serializers.serialize and then once more with json.dumps.
I don't know the specifics of your application, but you can chain filters in Django. So I would go with your second approach and just replace the line
response_data = serializers.serialize("json", Art.objects.filter(artist__contains=artistname))
with
response_data = serializers.serialize("json", Art.objects.filter(artist__contains=artistname).filter(title__in=paintings))
Check the queryset documentation.
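On the serialization point, serializers.serialize already returns a JSON string, so you can pass it straight to the response instead of running it through json.dumps again; a minimal sketch, where paintings is assumed to be the list of titles loaded from your JSON file:
def rest(request):
    data = json.loads(request.body)
    artistname = data['artiste']

    # paintings: list of titles loaded from top_paintings.json (assumed)
    response_data = serializers.serialize(
        "json",
        Art.objects.filter(artist__contains=artistname).filter(title__in=paintings),
    )
    # already a JSON string, so no json.dumps here
    return HttpResponse(response_data, content_type="application/json")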
The most efficient way to do this for a __contains search on painting title would be to use Q objects to or together all your possible painting names:
from functools import reduce
from operator import or_

from django.db.models import Q

def rest(request):
    data = json.loads(request.body)
    artistname = data['artiste']
    with open('/static/top_paintings.json', 'r') as fb:
        top_paintings_dict = json.load(fb)

    title_filters = reduce(
        or_,
        (Q(title__contains=painting) for painting in top_paintings_dict[artistname])
    )
    paintings = Art.objects.filter(title_filters, artist__contains=artistname)
That'll get you a queryset of paintings. I suspect your double serialization is not correct, but it seems you're happy with it in the single artist name case so I'll leave that up to you.
The reduce call here is a way to build up the result of |ing together multiple Q objects - operator.or_ is a functional handle for |, and then I'm using a generator expression to create a Q object for each painting name.
