I'm building a simple CRUD app in Django which is basically enhanced reporting for commerce-related software. I pull information on behalf of a user from another software's API, do a bunch of aggregation/calculations on the data, and then display a report to the user.
I'm having a hard time figuring out how to make this all happen quickly. While I'm certain there are optimizations that can be made in my Python code, I'd like to find some way to be able to make multiple calculations on reasonably large sets of objects without making the user wait like 10 seconds.
The problem is that the data can change in a given moment. If the user makes changes to their data in the other software, there is no way for me to know that without hitting the API again for that info. So I don't see a way that I can cache information or pre-fetch it without making a ridiculous number of requests to the API.
In a perfect world the API would have webhooks I could use to watch for data changes but that's not an option.
Is my best bet to just optimize the factors I control to the best of my ability and hope my users can live with it?
Related
I'm a beginner. Now that I've finished all the views and HTML, I'm at the stage of connecting them to the backend, and I would like to know which approach is most commonly used so that I do it correctly.
PS: this will have a large number of users
It depends on what you expect to do and how you want to operate your application. There is no absolute answer. With an API, your application may be more robust and easier to evolve, but it takes time to implement.
However, if you are learning Django, it cannot hurt to learn to make a proper API. Here is a nice tutorial: https://medium.com/swlh/build-your-first-rest-api-with-django-rest-framework-e394e39a482c
I've set up a fairly simple Django web app on Google Cloud to handle POSTed API calls that import JSON data into my database. I currently use the standard DRF models to handle serialization as well as validation. However, I've noticed that some of the POSTs with larger JSONs time out after 60 seconds on upload. I am guessing I can just increase the timeout settings on the Google Cloud instance, but I was wondering what the best step would be from the perspective of Django. This is a learning project, as I'm new to Django, so I had several ideas on how to proceed. I was wondering if anyone had advice on this topic.
Use the Django FileUpload parser. However, I'm not sure this would avoid the timeout issue, and I would have to write my own custom file validation script.
Some sort of asynchronous processing where the file is uploaded and some ID is returned to the user, which can be used with a different endpoint to see when the upload has been processed and validated successfully. The downside of this is that, to my knowledge, Django has no out-of-the-box solution for async support yet, so it might be out of scope for a relative beginner.
Switch to using Flask. If I'm largely learning on my own, should I even be using Django? I imagine Flask might be more efficient for handling large amounts of data, but I don't know. So far my current solution has worked well, but if it's best to switch I'd rather do that sooner than later.
I have a server with a database. The server listens for HTTP requests and uses JSON for data transfer.
Currently, what my server code (Python) mainly does is read the JSON file, convert it to SQL, and make some modifications to the database. The function of the server, as I see it, is only that of a converter between JSON and SQL. Is this the usual procedure?
Then I came up with another idea: I can define a class for user information, so that every user's information in the database is an instance of that class. When I get the JSON file, I first put it into an instance, do some operations, and then put it into the database. In my understanding, this adds a language layer between the HTTP request and the database.
The question is, what do people usually do?
The answer is: people usually do whatever they need to do. The layer between the database and the client normally provides a higher-level API, to make the request independent of the actual database. But what this higher level looks like depends on the application you have.
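As a rough illustration of such a layer, here is a minimal sketch using only the standard library. The `User` class, its `name` and `email` fields, and the `users` table are all hypothetical, and SQLite stands in for whatever database the server actually uses.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical fields, chosen only for illustration.
    name: str
    email: str

    @classmethod
    def from_json(cls, payload):
        """Build a User from a JSON string; a missing key raises KeyError."""
        data = json.loads(payload)
        # The "do some operation" step: normalise the email before storing it.
        return cls(name=data["name"], email=data["email"].strip().lower())

    def save(self, conn):
        """Insert or update this user, using bound parameters instead of string formatting."""
        conn.execute(
            "INSERT OR REPLACE INTO users (email, name) VALUES (?, ?)",
            (self.email, self.name),
        )
        conn.commit()


def create_schema(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )


# Usage: JSON from a request body goes through the class, never straight into SQL.
conn = sqlite3.connect(":memory:")
create_schema(conn)
User.from_json('{"name": "Ada", "email": " Ada@Example.com "}').save(conn)
```

The request handler only ever talks to `User`, so the table layout or even the database engine can change without touching the code that parses requests.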
People usually make use of a Web framework, instead of implementing the basic machinery themselves as you are doing.
That is: Python is a great language that easily allows one to translate "json into sql" with a small amount of code, and it is great for learning. If you are doing this for educational purposes, it is a nice project to continue fiddling with, and maybe you can find some nice ideas in this answer and in others.
But for "real world" usage, the apparently simple plan runs into real-world issues. Tens or even hundreds of them. How to properly separate HTML/CSS templates from content and from core logic, how to deal with many, many aspects of security, etc...
Then the frameworks come into play: a web framework project is a software project that has, over the years, and sometimes with hundreds of hours of work from several contributors, thought about and addressed all of the issues a real web application can and will face.
So, it is OK if one wants to do everything from scratch if they believe they can come up with a framework that has distinguished capabilities not available in any existing project. And it is OK to make small projects for learning purposes. It is not OK to try to come up with something from scratch for real server production without having a deep knowledge of all the issues involved, and knowing at least three or four frameworks well.
So, now that you've got a basic understanding of how one arrives at a framework, it is time to learn some of the frameworks themselves. Try, for example, Bottle and Flask (microframeworks), and Django (a fully featured framework for web application development), maybe Tornado (an HTTP server, but with enough of a web framework in it to be usable, and to be very instructive) - just reading the documentation on "how to get started" with these projects, to get to a "hello world" page, will lead you to lots of concepts you probably had not thought about yet.
I have earlier worked on Java+Spring to create a web-app.
I have to build a new web-app now.
It will have one centralized db.
There will be two different types of web-app instances.
Web-App 1:
a) It has no UI to render: no HTML, JS, etc.
b) All it needs to provide is a set of REST APIs which will
b.1) create some new entries in the DB
b.2) modify some entries in the DB
b.3) retrieve some DB records in JSON format.
Some frontend code (which doesn't belong to this app) will periodically fetch these details.
c) It will be used by at most 100,000 people, but at any given point in time we can expect about 1000 users logged in and doing what is described in b).
Web-App2 :
a) It will have some dashboards
b) 90% of DB operations would be read operations
c) 10% of DB operations would be write/modify
d) There will be a few thousand users of this system, and at any given point in time only about 50-1000 people will be accessing it.
I am thinking of the following.
Have Web-App 1 created in Python+Django and Web-App 2 created in RoR.
I am planning to use DynamoDB and memcache.
Why two different frameworks?
1) So that I get to learn both of them
2) There have been concerns about scalability in RoR (though I also know people claim the problem doesn't exist), and Web-App 1 may have scaling needs in the future.
My question is: do you see any problem with this combination?
For example, would Active Record want you to use a specific naming format for your database tables? Are there any other concerns similar to this?
Has anyone else used a similar technology stack?
Both frameworks are full-stack frameworks and provide MVC, templating, unit testing, security, DB migration, caching, and ORMs.
For my startup, we also needed to put out a fully fledged website along with an API. We are also using DynamoDB for storing most of the data and are only using MySQL for session info.
I opted to use Ruby on Rails for the web app and Sinatra for the API. If your criterion is simply learning as many new things as possible, then it would make sense to opt for relatively different stacks (Django/Python and RoR). In our case, we went with Sinatra because it's essentially a very lightweight wrapper around Rack and perfect for an API which essentially receives requests, calls one or more services or does some processing, and hands out a formatted response. While I don't see any problem with using Python/Django instead of Sinatra, in our case the benefit was having to spend less time working with a different language.
Also, scalability in Rails is a bit of an iffy subject. In the end, it's about how you use it. We've had no issues scaling Rails with Unicorn and nginx. Our business logic is all in the API service, and the Rails server also uses the API for most of the work. This means we don't use Active Record on Rails, and the website is just another consumer of our API, which does all the heavy lifting whether the request comes from an app or the website. Using MySQL for the session store ensures we can route requests to any of the application servers without having to worry about always routing requests from the same client to the same server. This allows us to ramp up and down easily, considering only the amount of traffic we're getting.
At the time we started working on this, there wasn't an ORM for dynamo db which looked and felt just like active record, so we ended up writing a few high level classes of our own to handle storage and retrieval of models on DynamoDb. Considering DynamoDB is not tailored for scans or joins, this didn't take a lot of effort since we were almost always doing lookups based on keys and ranges. This meant we didn't really need a replacement for active record since the real strength of active record is being able to intuitively do joins, etc. by convention.
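The access pattern described above (exact lookups by key, plus range queries under a single key) can be mimicked with a small in-memory stand-in. This is not the DynamoDB API and not the classes the answer refers to; `KeyRangeTable` and `Order` are hypothetical names used only to show why such a mapper needs little code when no joins or scans are involved.

```python
import bisect
from collections import defaultdict


class KeyRangeTable:
    """In-memory stand-in for a table addressed by (hash key, range key).

    This only mimics the access pattern; it is not a DynamoDB client.
    """

    def __init__(self):
        # hash key -> sorted list of range keys
        self._range_keys = defaultdict(list)
        # (hash key, range key) -> stored item
        self._items = {}

    def put(self, hash_key, range_key, item):
        if (hash_key, range_key) not in self._items:
            bisect.insort(self._range_keys[hash_key], range_key)
        self._items[(hash_key, range_key)] = item

    def get(self, hash_key, range_key):
        """Exact lookup by full key; returns None when absent."""
        return self._items.get((hash_key, range_key))

    def query(self, hash_key, low, high):
        """All items under one hash key with low <= range key <= high, in key order."""
        keys = self._range_keys.get(hash_key, [])
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [self._items[(hash_key, k)] for k in keys[start:end]]


class Order:
    """Hypothetical model: customer ID is the hash key, creation date the range key."""

    table = KeyRangeTable()

    def __init__(self, customer_id, created_at, total):
        self.customer_id = customer_id
        self.created_at = created_at  # ISO date string, so string order is date order
        self.total = total

    def save(self):
        Order.table.put(self.customer_id, self.created_at, self)

    @classmethod
    def find(cls, customer_id, created_at):
        return cls.table.get(customer_id, created_at)

    @classmethod
    def for_customer_between(cls, customer_id, start, end):
        return cls.table.query(customer_id, start, end)
```

Everything the model offers is a key lookup or a range query under one key, which is the kind of operation the answer says covered almost all of their needs.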
DynamoDB does have its limitations though, and you might find yourself in situations where you need to scan a large number of records. In our case, we also use CloudSearch to index some important info and use it as a fallback for cases when we need to do text-based searches which would otherwise have to scan all our data.
I'm trying to make a website that requires users to enter information about themselves. In order to check to see if this information is correct, it needs to enter the information on another website (that has an entire database of these types of users). It will then return the results found. How do I do such a thing? Where do I start? I tried googling but I couldn't even think of what this would be called?
Not really sure what you're looking to do as it doesn't make much sense. But you need to validate data provided by users on your site against data available in another database that isn't accessible to your app.
This means you need to send the data your users provide to the other service that performs the validation. Perhaps this other service provides an API to do this, or perhaps it just provides a form you can post the data to (with Python's urllib2, which became urllib.request in Python 3).
Without having a lot more information on what you're looking to do, I can't even venture a guess as to whether either of these two things is feasible.
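If the other service does accept HTTP POSTs, the request could look roughly like the sketch below, written for Python 3's `urllib.request`. The URL, the JSON payload, and the JSON reply are all assumptions made for illustration; the real service may expect form-encoded data or something else entirely.

```python
import json
import urllib.request

# Hypothetical endpoint; the real service's URL and payload format are unknown.
VALIDATION_URL = "https://validator.example.com/check"


def build_validation_request(user_data, url=VALIDATION_URL):
    """Prepare a JSON POST carrying the user's data to the external service."""
    body = json.dumps(user_data).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def validate_user(user_data, url=VALIDATION_URL, timeout=10):
    """Send the request and return the decoded reply (assumed to be JSON)."""
    request = build_validation_request(user_data, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to check what would be sent without contacting the other site.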