Data upload server with user management and resumable uploads - Python

I'm looking to build a web-based data upload server for a citizen science project and am wondering whether there are out-of-the-box solutions available, or whether there are useful Python packages or libraries that would make the job easier?
I don’t really want to reinvent the wheel and it seems that something like this should already exist. Maybe I’m just looking in the wrong place. 
The brief is that our volunteers make audio recordings to monitor threatened species, then upload their data for archiving and automated processing. I’d like a server that has the following: 
Simple web-based user interface - many of our participants have limited confidence with computers;
No client-side software to install; 
User management: registration to approved email addresses only (or similar, maybe a manual admin approval process); 
Data files are 1 to 40 MB in size, but there are lots of them: ~1000 files and ~10 GB in total. If a user loses their network connection, the upload should be recoverable, with the server able to resume where it left off. That's quite important;
Live progress and status updates to the user.
I have access to a web hosting server. Maybe a Django or Flask implementation already exists, or there's something similar I could adapt. I've looked at things like Dropbox shared directories but they don't quite fit.
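For the resume requirement, I imagine the server side looking roughly like the Flask sketch below (route names, the offset header, and the storage path are just placeholders, and authentication is omitted), though I'd much rather use something off the shelf, such as the tus resumable-upload protocol, which seems to have ready-made server and browser implementations:

```python
# A minimal sketch of the server side of a chunked, resumable upload in Flask.
# The route names, the X-Upload-Offset header, and the storage path are all
# assumptions for illustration; authentication and validation are omitted.
import os
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)
UPLOAD_DIR = "/var/data/uploads"  # assumed storage location

@app.route("/upload/status/<filename>")
def upload_status(filename):
    """Report how many bytes are already stored, so the client can resume."""
    path = os.path.join(UPLOAD_DIR, secure_filename(filename))
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    return jsonify(offset=offset)

@app.route("/upload/<filename>", methods=["PUT"])
def upload_chunk(filename):
    """Append one chunk; the client sends its current offset as a sanity check."""
    path = os.path.join(UPLOAD_DIR, secure_filename(filename))
    expected = int(request.headers.get("X-Upload-Offset", 0))
    current = os.path.getsize(path) if os.path.exists(path) else 0
    if expected != current:
        # Client and server disagree about progress; client re-queries the status URL.
        return jsonify(offset=current), 409
    with open(path, "ab") as f:
        f.write(request.get_data())
    return jsonify(offset=os.path.getsize(path))
```

The matching browser JavaScript would first ask /upload/status/&lt;filename&gt; for the current offset, then PUT the remaining chunks from there, updating a progress bar as it goes.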

Related

Designing a website architecture that gets data from a monitoring computer

I'm designing a website that will display data interactively for users. This data comes from various monitoring computers and will be loaded into a database on the same host as the website. The website will then display this data. What's the best (most efficient / most secure / most logical) way to design this? My architecture is Python (Flask) and MySQL, hosted on AWS. I can think of three possibilities:
HTTP POST request: the monitoring computer has a script that posts the data to an unlinked page served by the host, entering the data into some form fields along with a password used for verification. After the host receives the data, it loads it into the database. My only concern here is that someone might find this endpoint and DDoS it, seeing as a CAPTCHA is out of the question. (A rough sketch of this option follows the list.)
File transfer: the monitoring computer periodically transfers files to the website host, and the host then loads the new data into the website. This seems like the most straightforward method, but it feels insecure to me. Are my fears founded?
Direct database access (though I'm almost certain this one is out of the question): the monitoring computer writes directly to the MySQL database on the host. I can't imagine that giving public access to the DB is secure at all.
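For reference, here's roughly how I picture option 1 working, with the monitoring computer authenticating via a shared-secret header instead of a form password (the route, header name, and token handling are just placeholders):

```python
# A rough sketch of the HTTP POST option: the monitoring computer POSTs JSON to
# a private endpoint and authenticates with a shared-secret header. The route,
# header name, and token handling are illustrative; use HTTPS in practice.
import hmac
from flask import Flask, request, abort, jsonify

app = Flask(__name__)
API_TOKEN = "change-me"  # in practice, load from config or an environment variable

@app.route("/ingest", methods=["POST"])
def ingest():
    token = request.headers.get("X-Api-Token", "")
    if not hmac.compare_digest(token, API_TOKEN):
        abort(403)  # reject anything that doesn't carry the shared secret
    reading = request.get_json(force=True)
    # ...validate `reading` and insert it into the MySQL database here...
    return jsonify(status="ok")
```

Since the endpoint only accepts requests bearing the secret, a stray visitor finding the URL just gets a 403; rate limiting and HTTPS would still be sensible additions.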

How do I work with large data in a web application?

I am wrapping up a personal project that involved Flask, Python, and PythonAnywhere. I learned a lot, and now I have some new ideas for personal projects.
My next project involves processing video files and converting them into other file types, for example JPGs. When I drafted how my system could work, I quickly realized that the platform I am currently using for web application hosting, PythonAnywhere, will be too expensive and perhaps too slow, since I will be working with large files.
I searched around and found AWS S3 for file storage, but I am having trouble finding out how I can operate on that data to do my conversions in Python. I definitely don't want to download from S3, operate on the data in PythonAnywhere, and then re-upload the converted files to a bucket. The project will be available for use on the internet, so I am trying to make it as robust and scalable as possible.
I found it hard to even word this question, as I am not sure I am asking the right questions. I guess I am looking for a way to manipulate large data files, preferably in Python, without having to work with the data locally, if that makes any sense.
I am open to learning new technologies if that is what it takes, and I am looking for some direction on how I might achieve this personal project.
Have you looked into AWS Elastic Transcoder?
Amazon Elastic Transcoder lets you convert media files that you have stored in Amazon Simple Storage Service (Amazon S3) into media files in the formats required by consumer playback devices. For example, you can convert large, high-quality digital media files into formats that users can play back on mobile devices, tablets, web browsers, and connected televisions.
Like all things AWS, there are SDKs (e.g. the Python SDK, boto3) that allow you to access the service programmatically.
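To make that concrete, here's a hedged sketch of submitting a transcoding job with boto3; the pipeline ID, object keys, and preset are placeholders you would create in the AWS console first:

```python
# A hedged sketch of submitting an Elastic Transcoder job with boto3 so the
# conversion runs inside AWS rather than on the web host. The pipeline ID,
# object keys, and preset ID below are placeholders created in the AWS console.
import boto3

transcoder = boto3.client("elastictranscoder", region_name="us-east-1")

response = transcoder.create_job(
    PipelineId="1111111111111-abcde1",          # placeholder: your pipeline (defines the S3 in/out buckets)
    Input={"Key": "uploads/source-video.mp4"},  # object key in the pipeline's input bucket
    Output={
        "Key": "converted/source-video.mp4",
        "PresetId": "1351620000001-000010",     # a system preset; pick one matching your target format
        "ThumbnailPattern": "thumbs/source-video-{count}",  # still images, as configured by the preset
    },
)

# Jobs run asynchronously inside AWS; poll (or subscribe to SNS notifications) for completion.
job_id = response["Job"]["Id"]
status = transcoder.read_job(Id=job_id)["Job"]["Status"]
print(job_id, status)
```

That way the heavy lifting never touches your web host: the app only hands S3 keys to the service and records the resulting output keys.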

Setting up a home server to retrieve mp3 urls

I am developing an Amazon Alexa skill to stream our music from the web.
I would like to have a database of mp3s that is accessible by my app, but the mp3s should not be publicly downloadable, only streamable. Thus Spotify or SoundCloud don't seem like good options to me.
I was thinking of setting up a home server to physically store the mp3s, which would give me easy access to their URLs to pass to Alexa's AudioPlayer, but I have little experience setting up servers.
Do you have any alternative workaround suggestions for my issue?
If the home server seems like a reasonable option to you, where can I start learning how to set that up?
(I code mainly in Python and I know very little of Django)
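In case it helps, here is roughly what I imagine the home-server option looking like: a small Flask app serving the files. The token check is just a stand-in for real access control, the paths are made up, and I understand Alexa's AudioPlayer expects stream URLs over HTTPS, so TLS (e.g. a reverse proxy) would need to sit in front:

```python
# A tiny sketch of serving mp3s from a home server with Flask. The token check
# is a stand-in for real access control and the paths are made up; Alexa's
# AudioPlayer expects stream URLs over HTTPS, so put TLS (a reverse proxy) in front.
from flask import Flask, request, send_from_directory, abort

app = Flask(__name__)
MUSIC_DIR = "/home/me/music"   # assumed location of the mp3 collection
STREAM_TOKEN = "change-me"     # shared secret embedded in the URLs the skill hands out

@app.route("/stream/<path:filename>")
def stream(filename):
    if request.args.get("token") != STREAM_TOKEN:
        abort(403)  # keeps the files from being casually downloadable
    return send_from_directory(MUSIC_DIR, filename, mimetype="audio/mpeg")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```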

Implementing mBaaS in Python

I am a web backend developer. In the past, I've used a lot of Python, and specifically Django, to create custom APIs that serve data, in JSON for instance, to web frontends.
Now, I am facing the task of developing a mobile backend that needs to provide services such as push notifications, geolocation, etc. I am aware of the existing mBaaS providers, which could definitely address a lot of the issues with the task at hand; however, the project requires a lot of custom backend code, async tasks, algorithms that perform calculations on the data and in response trigger additional behavior, as well as an extensive back office.
Looking at the features of the popular mBaaS providers, I feel they are not able to meet all my needs; however, it would be nice to use some of their features, such as push notifications, instead of developing my own. Am I completely mistaken about mBaaS providers? Is this sort of hybrid approach even possible?
Thanks!
There are a ton of options out there. Personally, I'm still looking for the holy grail of mBaaS providers. I've tried Parse, DreamFactory, and most recently Azure Mobility Services.
All three are great for getting started, from PoC to v1, but the devil is always in the details. There are a few details to watch out for:
You sacrifice control for simplicity. Stay in the lanes and things should work. The moment you want to do something else is when complexity creeps in.
You are at the mercy of their infrastructure. Yes -- even Amazon and Azure go down from time to time. Note -- DreamFactory is a self-hosted solution.
You are locked into their platform. Any extra code customizations you make with their hooks (i.e. Parse's "CloudCode" and Azure's API scripts) will most likely not port to another platform.
Given the learning curve and tradeoffs involved, I think you should just play the strong hand you already have. Why not host a Django app on Heroku? Add Django REST Framework and you can basically get an mBaaS up and running in less than a day.
Heroku has plenty of third-party add-on providers for things like push notifications, authentication mechanisms, and even search engines (Elasticsearch).
All that is required is to drop the right "pip install" packages in, wire them into your controllers, and you are off and running.
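As a rough illustration of how little code that takes, a minimal Django REST Framework setup might look like this (model and field names are invented for the example, and in a real project the pieces would live in their usual separate modules):

```python
# A minimal Django REST Framework sketch of the kind of JSON API described above;
# model and field names are invented, and in a real project the model, serializer,
# viewset, and router would live in their usual separate modules.
from django.db import models
from rest_framework import routers, serializers, viewsets

class Device(models.Model):
    owner = models.CharField(max_length=100)
    push_token = models.CharField(max_length=255, blank=True)
    created = models.DateTimeField(auto_now_add=True)

class DeviceSerializer(serializers.ModelSerializer):
    class Meta:
        model = Device
        fields = ["id", "owner", "push_token", "created"]

class DeviceViewSet(viewsets.ModelViewSet):
    queryset = Device.objects.all()
    serializer_class = DeviceSerializer

# urls.py: the router generates the list/detail endpoints (/devices/, /devices/<pk>/)
router = routers.DefaultRouter()
router.register(r"devices", DeviceViewSet)
urlpatterns = router.urls
```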

JasperReports Server?

I would like to generate reports in PDF format with the following scenario: people enter information on a website, and after submitting, the data is transferred to JasperReports Server and a PDF is created.
Python would be the language of choice for my task.
Is this scenario plausible with the current JasperReports software (open source or similar), could it be done, and what would be the steps in the right direction?
Is this scenario plausible with the current JasperReports software (open source or similar),
Yes.
could it be done
Yes.
and what would be the steps in the right direction?
Write a web server in Python. Your web server will allow a user to enter information on a website; after submitting, the data is transferred to JasperReports Server and a PDF is created. Your web server then provides the PDF back to the user.
You need to pick a framework, install the components, write the unit tests, write the code, debug the code and transition the code to production.
It's hard (given the question) to determine what part of this you actually need help with.
Write the interface for the user in the language of your choice. Then, having the data from the user, make a request to JasperReports Server's API requesting the report.
Make sure to account for the time the report may need to be generated if you want to make it synchronous.
Otherwise, the API allows you to generate a report and poll for its completion. When it's done, just send the file to the user.
If you use the second approach, don't point the client-side AJAX polling mechanism at the JasperReports Server, as you might not want it to be directly accessible from the internet. Do the polling in the backend of your app.
More information about the REST web services for Jasper Server here: https://community.jaspersoft.com/documentation/jasperreports-server-web-services-guide/v550/rest-web-services-overview
Good luck! :)
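For the synchronous variant, the call from your Python backend can be a single authenticated GET against the rest_v2/reports endpoint; in this hedged sketch the server URL, report path, credentials, and parameter name are all placeholders:

```python
# A hedged sketch of the synchronous approach: one authenticated GET against
# JasperReports Server's rest_v2/reports endpoint returns the rendered PDF.
# The server URL, report path, credentials, and parameter name are placeholders.
import requests

JASPER_URL = "http://reports.internal:8080/jasperserver"   # placeholder host
REPORT_PATH = "/reports/my_org/submission_summary"          # placeholder report unit

def fetch_pdf(submission_id: str) -> bytes:
    resp = requests.get(
        f"{JASPER_URL}/rest_v2/reports{REPORT_PATH}.pdf",
        params={"submission_id": submission_id},  # report input control, if the report takes one
        auth=("jasperadmin", "jasperadmin"),       # replace with real credentials
        timeout=120,                               # report generation can take a while
    )
    resp.raise_for_status()
    return resp.content  # the PDF bytes, ready to hand back to the user

# In the web app's view, call fetch_pdf(...) and return the bytes as an HTTP
# response with Content-Type: application/pdf.
```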
Use JasperReports Server to publish the report and use its REST interface to produce the output. See Render HTML to PDF in Django site, which shows a practical implementation of a Python REST client.
