I have mp3 files stored in Amazon S3 and a MySQL database with a table called Songs. I want to run a Python script that updates the database by going to Amazon S3, retrieving details of the mp3 files (using ID3 tags, for example) and then filling the Songs table. I'm using Django. Is there any way to run this script with a simple click on an "update library" button, for example through the Django admin panel? Also, is it possible to run it on a schedule?
P.S. I'm new to both Django and Amazon S3.
EDIT:
I wrote a small script that grabs the meta tags from mp3 files on my local machine. Here is the code for it:
import os

import eyeD3


class Track():
    def __init__(self, audioFile):
        self.title = audioFile.getTag().getTitle()
        self.artist = audioFile.getTag().getArtist()
        self.year = audioFile.getTag().getYear()
        self.genre = audioFile.getTag().getGenre()
        self.length = audioFile.getPlayTimeString()
        self.album = audioFile.getTag().getAlbum()


def main():
    for root, dirs, files in os.walk('.'):
        for f in files:
            path = os.path.join(root, f)
            if eyeD3.isMp3File(path):
                audioFile = eyeD3.Mp3AudioFile(path)
                t = Track(audioFile)
                print t.artist, " ", t.title, " ", t.length, " ", t.album, " ", t.genre


if __name__ == '__main__':
    main()
I would like to find a way to run this script from Django, even if it's only locally for now. I hope my point is clearer.
Thanks in advance!
You need to have a look at Boto and also django-storages for ideas on how to do what you'd like. django-storages makes it dead simple to replace Django's FileStorage mechanism so you can upload images/files directly to your bucket(s) at S3.
Reading from S3 and updating your database objects is just the opposite workflow, but Boto makes it simple to connect to the bucket(s) and read the information.
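To make that concrete, here is a minimal sketch of the reading direction, not a definitive implementation: it assumes a hypothetical Song model, placeholder credentials and bucket names, and the same old eyeD3 API you already use.

# Sketch only: list the bucket, download each mp3 to a temp file, read its
# ID3 tags with eyeD3 and upsert a row in the Songs table.
# Credentials, bucket name and the Song model/fields are placeholders.
import os
import tempfile

import boto
import eyeD3

from myapp.models import Song  # hypothetical model


def update_library():
    conn = boto.connect_s3('YOUR_ACCESS_KEY', 'YOUR_SECRET_KEY')
    bucket = conn.get_bucket('your-music-bucket')
    for key in bucket.list():
        if not key.name.lower().endswith('.mp3'):
            continue
        # Download to a temporary file so eyeD3 can read the tags from disk.
        tmp = tempfile.NamedTemporaryFile(suffix='.mp3', delete=False)
        key.get_contents_to_file(tmp)
        tmp.close()
        audioFile = eyeD3.Mp3AudioFile(tmp.name)
        tag = audioFile.getTag()
        if tag is not None:
            Song.objects.get_or_create(
                s3_key=key.name,
                defaults={
                    'title': tag.getTitle(),
                    'artist': tag.getArtist(),
                    'album': tag.getAlbum(),
                    'length': audioFile.getPlayTimeString(),
                },
            )
        os.unlink(tmp.name)

Wrapping update_library() in a custom management command would let cron trigger it on a schedule, and an admin action that calls the same function could serve as the "update library" button.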
Hope that helps you out.
I'm working with the Python hug API and would like to create a GET endpoint for the frontend, so that the frontend can download a generated Word document, e.g. via a download button. However, after going through the documentation, I still cannot figure out a way to do it.
Here is my working script so far:
import hug
from docx import Document


@hug.get("/download_submission_document")
def download_submission_document():
    file_name = 'example.docx'
    document = Document()
    document.add_heading('Test header', level=2)
    document.add_paragraph('Test paragraph')
    document.save(file_name)
    # TO DO: send the created file to the frontend
I'm not sure if we can send the object right away or if we have to save it somewhere first before sending it to the frontend (requirements: hug, python-docx).
I'm trying to use something like
@hug.get("/download_submission_document", output=hug.output_format.file)
but I'm not sure how to return a file.
Alright, I found a solution which is easier than I thought. Just do the following:
#hug.get("/download_submission_document", output=hug.output_format.file)
def download_submission_document():
file_name = 'example.docx'
document = Document()
document.add_heading('Test header', level=2)
document.add_paragraph('Test paragraph')
document.save(file_name)
return file_name
Returning file_name already downloads the docx file.
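Not from the original answer, but as a quick sanity check you can fetch the file from a small client script; the host and port below are assumptions based on hug's default development server.

# Hypothetical client-side check that the endpoint returns the .docx bytes.
import requests

resp = requests.get("http://localhost:8000/download_submission_document")
with open("downloaded_example.docx", "wb") as fh:
    fh.write(resp.content)
print(len(resp.content), "bytes written")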
In my Flask application, I am using a function to upload files to Amazon S3, using Boto.
It's working fine in most cases, but sometimes it uploads files as zero-byte files with no extension.
Why is it failing sometimes?
I am validating the user's image file in the form:
FileField('Your photo',validators=[FileAllowed(['jpg', 'png'], 'Images only!')])
My image upload function:
def upload_image_to_s3(image_from_form):
    # Upload pic to Amazon S3.
    source_file_name_photo = secure_filename(image_from_form.filename)
    source_extension = os.path.splitext(source_file_name_photo)[1]
    destination_file_name_photo = uuid4().hex + source_extension
    s3_file_name = destination_file_name_photo
    # Connect to S3.
    conn = boto.connect_s3('ASJHjgjkhSDJJHKJKLSDH', 'GKLJHASDJGFAKSJDGJHASDKJKJHbbvhjcKJHSD')
    b = conn.get_bucket('mybucket')
    # Create the key and upload the file contents.
    sml = b.new_key("/".join(["myfolder", destination_file_name_photo]))
    sml.set_contents_from_string(image_from_form.read())
    acl = 'public-read'
    sml.set_acl(acl)
    return s3_file_name
How large are your assets? If the upload is too large, you may have to multipart/chunk it, otherwise it will time out.
bucketObject.initiate_multipart_upload('/local/object/as/file.ext')
This means you will not be using set_contents_from_string but rather storing the file and uploading it in parts. You may have to use something to chunk the file, like FileChunkIO.
An example is here if this applies to you: http://www.bogotobogo.com/DevOps/AWS/aws_S3_uploading_large_file.php
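If that is your situation, here is a minimal sketch of the chunked upload, assuming the same bucket object b from your function, a file already saved to disk at a placeholder path, and the separate FileChunkIO package:

# Sketch only: split the local file into parts and upload each one.
import math
import os

from filechunkio import FileChunkIO

source_path = '/local/object/as/file.ext'  # placeholder path
source_size = os.stat(source_path).st_size
chunk_size = 50 * 1024 * 1024  # 50 MB parts

mp = b.initiate_multipart_upload("myfolder/" + os.path.basename(source_path))
chunk_count = int(math.ceil(source_size / float(chunk_size)))
for i in range(chunk_count):
    offset = chunk_size * i
    part_bytes = min(chunk_size, source_size - offset)
    with FileChunkIO(source_path, 'r', offset=offset, bytes=part_bytes) as fp:
        mp.upload_part_from_file(fp, part_num=i + 1)
mp.complete_upload()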
Also, you may want to edit your post above and alter your AWS keys.
In my Django project I use django-storages to save media files in my Amazon S3 bucket.
I followed this tutorial (I also use Django REST Framework). This works well for me: I can upload images and I can see them in my S3 storage.
But if I try to remove an instance of my model (which contains an ImageField), this does not remove the corresponding file in S3. Is that the expected behaviour? I need to remove the resource in S3 as well.
Deleting a record will not automatically delete the file in the S3 Bucket. In order to delete the S3 resource you need to call the following method on your file field:
model.filefield.delete(save=False) # delete file in S3 storage
You can perform this either in:
- the delete() method of your model
- a pre_delete signal (see the sketch after the example below)
Here is an example of how you can achieve this in the model's delete() method:
def delete(self):
    self.filefield.delete(save=False)
    super().delete()
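And here is a minimal sketch of the pre_delete signal variant; the app and model names are placeholders.

# Sketch only: delete the S3 file just before the row is removed.
from django.db.models.signals import pre_delete
from django.dispatch import receiver

from myapp.models import MyModel  # hypothetical model with a FileField


@receiver(pre_delete, sender=MyModel)
def delete_s3_file(sender, instance, **kwargs):
    # save=False so the row (about to be deleted) is not written back.
    instance.filefield.delete(save=False)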
You can delete S3 files by passing their id (the filename in the S3 storage), using the following code:
import boto
from boto.s3.key import Key
from django.conf import settings


def s3_delete(id):
    s3conn = boto.connect_s3(settings.AWS_ACCESS_KEY,
                             settings.AWS_SECRET_ACCESS_KEY)
    bucket = s3conn.get_bucket(settings.S3_BUCKET)
    k = Key(bucket)
    k.key = str(id)
    k.delete()
Make sure that you set up the S3 variables correctly in settings.py, including AWS_ACCESS_KEY, AWS_SECRET_ACCESS_KEY, and S3_BUCKET.
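For example, something like this in settings.py (the values are placeholders):

# Placeholder values; the names match what s3_delete() reads above.
AWS_ACCESS_KEY = 'your-access-key-id'
AWS_SECRET_ACCESS_KEY = 'your-secret-access-key'
S3_BUCKET = 'your-bucket-name'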
This works for me with AWS S3, hope it helps:
import os

from django.db import models
from django.dispatch import receiver


@receiver(models.signals.post_delete, sender=YourModelName)
def auto_delete_file_on_delete(sender, instance, **kwargs):
    if instance.image:
        instance.image.delete(save=False)  # use for AWS S3
        # if os.path.isfile(instance.image.path):  # use this in development
        #     os.remove(instance.image.path)
I ended up adding an action to the Django admin panel, since in my case I don't remove files frequently.
If you want to delete files through the API, you may write your own destroy()/perform_destroy() in your view or viewset.
import os

import boto3
from django.contrib import admin

BUCKET_NAME = os.environ.get("AWS_STORAGE_BUCKET_NAME")
s3 = boto3.client('s3')


class UserFileAdmin(admin.ModelAdmin):
    list_display = ('file',)
    actions = ['delete_completely']

    def delete_completely(self, request, queryset):
        for filemodel in queryset:
            s3.delete_object(Bucket=BUCKET_NAME, Key=str(filemodel.file))
            filemodel.delete()
    delete_completely.short_description = 'Delete pointer and real file together'
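If you go the API route instead, here is a minimal sketch of overriding the delete step in a DRF viewset (DRF's hook for this is perform_destroy(); the model and serializer names are placeholders):

# Sketch only: remove the S3 object, then the database row, on API delete.
import os

import boto3
from rest_framework import viewsets

from myapp.models import UserFile                 # hypothetical model
from myapp.serializers import UserFileSerializer  # hypothetical serializer

BUCKET_NAME = os.environ.get("AWS_STORAGE_BUCKET_NAME")
s3 = boto3.client('s3')


class UserFileViewSet(viewsets.ModelViewSet):
    queryset = UserFile.objects.all()
    serializer_class = UserFileSerializer

    # DRF calls perform_destroy() when handling DELETE requests.
    def perform_destroy(self, instance):
        s3.delete_object(Bucket=BUCKET_NAME, Key=str(instance.file))
        instance.delete()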
I am learning Python/Django and my pet project is a photo sharing website. I would like to give users the ability to upload their photos using an email address, like Posterous or Tumblr do. Research has led me to believe I need to use the following:
-- cron job
-- python mail parser
-- cURL or libcurl
-- something that updates my database
How all these parts will work together is still where I need clarification. I know cron will run a script that parses the email (it sounds simple when reading about it), but how to get started with all these things is daunting. Any help pointing me in the right direction, tutorials, or an answer would be greatly appreciated.
This reads messages from a Maildir. It's not optimized, but it shows how you can parse emails. Of course, you should store the information about files and users in your database: import your models into this code and do the appropriate inserts.
import email
import mailbox
import mimetypes
import os
import sys

mdir = mailbox.Maildir(sys.argv[1], email.message_from_file)

for mdir_msg in mdir:
    counter = 1
    msg = email.message_from_string(str(mdir_msg))
    for part in msg.walk():
        # multipart/* are just containers
        if part.get_content_maintype() == 'multipart':
            continue
        # Applications should really sanitize the given filename so that an
        # email message can't be used to overwrite important files
        filename = part.get_filename()
        if not filename:
            ext = mimetypes.guess_extension(part.get_content_type())
            if not ext:
                # Use a generic bag-of-bits extension
                ext = '.bin'
            filename = 'part-%03d%s' % (counter, ext)
        counter += 1
        fp = open(os.path.join('kupa', filename), 'wb')
        fp.write(part.get_payload(decode=True))
        fp.close()
        # PhotoModel imported from yourapp.models
        photo = PhotoModel()
        photo.name = os.path.join('kupa', filename)
        photo.email = ....
        photo.save()
Not sure what you need cURL for in that list - what's it supposed to be doing?
You don't really say where you're having trouble. It seems to me you can do all of this in a Django management command, which can be triggered by a regular cron job. The standard Python library contains everything you need to access the mailbox (poplib or imaplib) and to parse the message to get the image (email and email.message). The script can then simply save the image file to the relevant place on disk and create a matching entry in the database via the normal Django ORM.
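To illustrate that suggestion, here is a hedged sketch of such a management command using imaplib (Python 3 syntax), not a definitive implementation: the settings names, the Photo model and its fields, and the app layout are all assumptions.

# Hypothetical file: photos/management/commands/fetch_photo_mail.py
import email
import imaplib
from email.utils import parseaddr

from django.conf import settings
from django.core.files.base import ContentFile
from django.core.management.base import BaseCommand

from photos.models import Photo  # hypothetical model with image/email fields


class Command(BaseCommand):
    help = "Poll the upload mailbox and save attached photos"

    def handle(self, *args, **options):
        conn = imaplib.IMAP4_SSL(settings.UPLOAD_IMAP_HOST)
        conn.login(settings.UPLOAD_IMAP_USER, settings.UPLOAD_IMAP_PASSWORD)
        conn.select("INBOX")
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            _, msg_data = conn.fetch(num, "(RFC822)")
            msg = email.message_from_bytes(msg_data[0][1])
            sender = parseaddr(msg.get("From"))[1]
            for part in msg.walk():
                # Only keep image attachments.
                if part.get_content_maintype() != "image":
                    continue
                filename = part.get_filename() or "upload.jpg"
                photo = Photo(email=sender)
                photo.image.save(filename,
                                 ContentFile(part.get_payload(decode=True)),
                                 save=False)
                photo.save()
        conn.logout()

A crontab entry that runs python manage.py fetch_photo_mail every few minutes would then cover the scheduling part.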
I want to upload a document or file to Google Docs from Google App Engine (Python).
Any code or links would be appreciated.
See the documentation, but you might try something like:
ms = gdata.MediaSource(file_path='/path/to/your/test.doc', content_type=gdata.docs.service.SUPPORTED_FILETYPES['DOC'])
entry = gd_client.Upload(ms, 'MyDocTitle')
print 'Document now accessible online at:', entry.GetAlternateLink().href
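The snippet assumes you already have an authenticated gd_client; with the old gdata client library that could be set up roughly like this (the credentials are placeholders):

# Rough sketch of creating the gd_client used above (old gdata ClientLogin API).
import gdata.docs.service

gd_client = gdata.docs.service.DocsService()
gd_client.ClientLogin('you@example.com', 'your-password')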
The solution involves uploading the file's contents directly. You need to read the data using the code below in Python:
# Function to read the file size.
def getSize(self, fileobject):
    fileobject.seek(0, 2)  # move the cursor to the end of the file
    size = fileobject.tell()
    return size

f = self.request.POST.get('fname').file
# ext is assumed to be defined elsewhere (a key of SUPPORTED_FILETYPES, e.g. 'DOC').
media = gdata.data.MediaSource(file_handle=f.read(),
                               content_type=gdata.docs.service.SUPPORTED_FILETYPES[ext],
                               content_length=self.getSize(self.request.POST.get('fname').file))
You also need to modify Google's gdata Python library to achieve this:
In client.py, inside def upload_file, replace:
while not entry:
    entry = self.upload_chunk(start_byte, self.file_handle.read(self.chunk_size))
    start_byte += self.chunk_size
With:
while not entry:
    entry = self.upload_chunk(start_byte, self.file_handle)
    start_byte += self.chunk_size
Then you can upload files directly to Google Docs.