I'm currently building a self-hosted Vue.js web app (with account logins).
The web app needs to be a GUI for a Python web scraper, where the user has control over the scraper.
So, for example, the user fills in an endpoint, starts the scraper, views the results, triggers a new, deeper scrape, and so on.
I have the Python scripts for scraping.
I have decided to go with Vue.js + AWS Cognito + Hasura for the frontend.
I'm having difficulty understanding how to trigger the Python scripts, dump the results into the database, and show them to the frontend.
I do like the 3factor approach.
The data from my scrapers can be many database entries, so I'd rather not insert them via GraphQL mutations.
Do I have to create Flask endpoints so Hasura can trigger these webhooks?
I'm not familiar with serverless.
How do I make my Python scraper scripts serverless?
Or can I just use SQLAlchemy to dump the scraper results into the database?
But how do I notify my frontend user that the data is ready?
There are a lot of questions in this one, and the answer(s) will be somewhat opinionated, so it might not be the greatest fit for Stack Overflow.
That being said, after reading through your post, I'd recommend that for your first attempt at this you use SQLAlchemy to store the results of your scraper jobs directly in the database. It sounds like you have the most familiarity with this approach.
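For example, a minimal sketch of that approach with SQLAlchemy; the table and column names (scrape_results, job_id, payload) and the connection string are assumptions, not something from your schema:

```python
# Minimal sketch: bulk-insert scraper results with SQLAlchemy.
# Table/column names and the connection string are assumptions.
from sqlalchemy import create_engine, Column, Integer, String, JSON
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class ScrapeResult(Base):
    __tablename__ = "scrape_results"
    id = Column(Integer, primary_key=True)
    job_id = Column(String, index=True)   # lets the frontend filter/subscribe per job
    url = Column(String)
    payload = Column(JSON)                # whatever the scraper produced for that URL

engine = create_engine("postgresql://user:pass@localhost/scraperdb")  # the same Postgres Hasura tracks
Base.metadata.create_all(engine)

def store_results(job_id, results):
    """Dump a whole batch of scraper results in one transaction."""
    with Session(engine) as session:
        session.add_all(
            [ScrapeResult(job_id=job_id, url=r["url"], payload=r) for r in results]
        )
        session.commit()
```

Because this writes to the same Postgres database that Hasura tracks, the inserted rows are immediately visible to Hasura without going through mutations.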
With Hasura, you can simply have a subscription to the job's results that you query in your frontend, so the UI will automatically update on the Vue side as soon as the results become available.
You'll have to decide how you want to kick off the scraping jobs; you have a few different options:
Expose an API endpoint from your Python app and let the UI trigger it
Use Hasura Actions
Build a GQL server in Python and attach it to Hasura using Remote Schemas
Have your app insert a record into the database using a GraphQL mutation that includes information about the scrape job, and then let Hasura call a webhook endpoint in your Python app using Hasura Event Triggers (see the sketch below)
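For that last option, a rough sketch of what the Flask webhook could look like. The payload shape follows Hasura's event trigger format; the scrape_jobs table, its columns, and run_scraper are assumptions:

```python
# Rough sketch of a Flask endpoint that a Hasura Event Trigger calls
# whenever a row is inserted into a (hypothetical) scrape_jobs table.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/hasura/scrape-job", methods=["POST"])
def handle_scrape_job():
    body = request.get_json()
    new_row = body["event"]["data"]["new"]   # Hasura puts the inserted row here
    job_id = new_row["id"]                   # column names are assumptions
    endpoint = new_row["endpoint"]

    run_scraper(job_id, endpoint)            # your existing scraper; ideally kicked off asynchronously

    return jsonify({"status": "started", "job_id": job_id})
```

The scraper then writes its results back with SQLAlchemy as above, and the frontend's subscription picks them up.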
Hasura doesn't care how the data gets into the database; it provides a ton of functionality and value even if you're using a different database access layer in another part of your stack.
Related
I'm building a web app using Flask. I've made a completely separate Python program and I want to run it in the backend of the web app. It is basically a program that takes in inputs, makes some calculations about scheduling, and sends the user an email.
I don't know where or how to integrate it into the web app.
Normally one would implement this with a message queue that passes the work to a backend worker, or with a database that persists these calculation jobs for another backend program to pick up.
If you want to stay with Python, there is a task queue called Celery that integrates well with Flask.
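A minimal sketch of that setup, assuming a Redis broker; run_scheduling_calculation and send_email stand in for whatever your separate program already does:

```python
# Minimal sketch: hand the scheduling calculation to a Celery worker so the
# Flask request returns immediately. Broker URL and helper functions are assumptions.
from celery import Celery
from flask import Flask, request, jsonify

app = Flask(__name__)
celery = Celery(app.name, broker="redis://localhost:6379/0")

@celery.task
def schedule_and_email(user_email, inputs):
    result = run_scheduling_calculation(inputs)   # your separate program's logic
    send_email(user_email, result)                # however you already send the mail

@app.route("/schedule", methods=["POST"])
def schedule():
    data = request.get_json()
    schedule_and_email.delay(data["email"], data["inputs"])  # queued, not run inline
    return jsonify({"status": "queued"})
```

You then run a worker process alongside the web app (for example with celery -A yourmodule.celery worker) to actually execute the queued tasks.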
TL;DR: Is it feasible to create a web tool/app that relies on server-side Selenium automation?
I created a local script that automates form filling for car insurance quote websites and returns the cost to insure, i.e. you fill in one form and it auto-fills every other provider's quote form and returns the quotes.
But now I want to extend that functionality to others via some sort of web app [Flask/Django?] that handles a client's requests server-side, fetching that information and returning it to the client based on their inputs.
What I'm struggling with is that Selenium is limited to 5 web drivers (locally), I believe, and is resource-intensive, so to me that means you can handle at most 5 website requests at once?
The short answer is YES.
The idea to solve the problem is as follows:
Create a web app which can fetch the user's inputs and return something back to the user, just like most common websites.
Create a service in the web app. The service handles what you want using Selenium, such as filling in the car insurance form and getting the cost to insure.
Create a webpage or API in the web app that calls the service mentioned above. When the user uses the webpage/API, Selenium performs the automation.
So, it's done.
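A rough sketch of that shape with Flask and headless Chrome; the URL, form field names, and element IDs are placeholders, not a real quote site:

```python
# Rough sketch: a Flask endpoint that runs a Selenium job per request.
# The URL and element locators are placeholders.
from flask import Flask, request, jsonify
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

app = Flask(__name__)

def fetch_quote(details):
    options = Options()
    options.add_argument("--headless=new")     # no display available on a server
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://example-insurer.test/quote")
        driver.find_element(By.NAME, "postcode").send_keys(details["postcode"])
        driver.find_element(By.NAME, "submit").click()
        return driver.find_element(By.ID, "quote-total").text
    finally:
        driver.quit()                          # always release the browser

@app.route("/quote", methods=["POST"])
def quote():
    return jsonify({"quote": fetch_quote(request.get_json())})
```

Each request here starts its own browser, which is where the resource cost comes from, so in practice you would limit how many of these run at the same time.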
I need some direction as to how to achieve the following functionality using Django.
I want my application to enable multiple users to submit jobs to make calls to an API.
Each user job will require multiple API calls and will store the results in a db or a file.
Each user should be able to submit multiple jobs.
In case of a failure, such as the network being blocked or the API not returning results, I want the application to pause for a while and then resume completing that job.
Basically, I want the application to pick up from where it left off.
Any ideas on how I could implement this, any technologies such as Celery I should be looking at, or even an open-source project where I can learn how to do this would be a great help.
You can do this with RabbitMQ and Celery.
This post might be helpful.
https://medium.com/@ffreitasalves/executing-time-consuming-tasks-asynchronously-with-django-and-celery-8578eebab356
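For the pause-and-resume part specifically, here is a sketch of how a Celery task can retry with a delay after a failed API call; already_done and save_result are placeholders for your own bookkeeping:

```python
# Sketch: a Celery task that retries with a delay when an API call fails,
# then continues from where it left off. already_done/save_result are placeholders.
import requests
from celery import shared_task

@shared_task(bind=True, max_retries=10, default_retry_delay=300)  # wait 5 minutes between retries
def run_user_job(self, job_id, urls):
    for url in urls:
        if already_done(job_id, url):        # skip work finished before an earlier failure
            continue
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
        except requests.RequestException as exc:
            raise self.retry(exc=exc)        # pause this job; Celery re-runs it later
        save_result(job_id, url, response.json())
```

Because the results of completed calls are stored in the database (or a file) as they arrive, the retried task can skip them and effectively pick up where it left off.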
I have the bulk of my web application in React (front-end) and Node (server), and am trying to use Python for certain computations. My intent is to send data from my Node application to a Python web service in JSON format, do the calculations in my Python web service, and send the data back to my Node application.
Flask looks like a good option, but I do not intend to have any front-end usage for my Python web service. Would appreciate any thoughts on how to do this.
In terms of thoughts:
1) You can build a REST interface to your Python code using Flask, and make REST calls to it from your Node.js app.
2) You have to decide if your client will wait synchronously for the result. If it takes a relatively long time, you can use a webhook as a callback for the result.
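For option 1, the Python side can be very small; compute() below is just a placeholder for your actual calculations, and the port is an assumption:

```python
# Sketch: a small Flask service that accepts JSON from Node, runs the
# computation, and returns JSON. compute() stands in for the real logic.
from flask import Flask, request, jsonify

app = Flask(__name__)

def compute(data):
    # placeholder for the actual calculations
    return {"sum": sum(data.get("values", []))}

@app.route("/calculate", methods=["POST"])
def calculate():
    return jsonify(compute(request.get_json()))

if __name__ == "__main__":
    app.run(port=5000)
```

From Node you would then POST the JSON to http://localhost:5000/calculate and await the response; no front end is needed on the Python side.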
I'll briefly explain what I'm trying to achieve: We have a lot of servers behind ipvsadm VIPs (LVS load balancing) and we regularly move servers in/out of VIPs manually. To reduce risk (junior ops make mistakes...) I'd like to abstract it behind a web interface.
I have a Python daemon which repeatedly runs "ipvsadm -l" to get a list of servers and statistics, then creates JSON from this output. What I'd now like to do is serve this JSON and have a web interface that can pass commands. For example, selecting a server in the web UI and pressing remove triggers an ipvsadm -d <server>... command. I'd also like the web UI to update every 10 seconds or so with the statistics from the list command.
My current Python daemon just outputs to a file. Should I somehow have this daemon also be a web server that serves its file and accepts POST requests with command identifiers/arguments? Or a second daemon for the web UI? My only front-end experience is with basic Bootstrap and jQuery, usually backed by Laravel, so I'm not sure if there's a better way of doing this with sockets and some fancy JS modernism.
If there is a more appropriate place for this post, please move it if possible or let me know where to re-post.
You don't need a fancy JS application. To take the path of least resistance, I would create a small extra application: if you like Python, I recommend Flask for this job; if you prefer PHP, then how about Slim?
In your web application, if you want to make it fast and easy, you can even implement an AJAX mechanism that fetches results on an interval to refresh the servers' data every 10 seconds. It will fetch the JSON served by the independent, already existing daemon.
Running the commands clicked in the web UI can also be done by your web application.
Your web application is something extra, and I find it nice to keep it separated from the daemon that fetches data about the servers and saves it as JSON. You can turn the page off at any time, but all statistics will still be collected and available to console users in JSON format.
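A minimal sketch of that separate Flask app; the path to the daemon's JSON file and the exact ipvsadm arguments are assumptions about your setup:

```python
# Sketch: a small Flask app that serves the daemon's JSON file and accepts
# "remove server" requests. The file path and command arguments are assumptions.
import json
import subprocess
from flask import Flask, request, jsonify

app = Flask(__name__)
STATS_FILE = "/var/run/ipvs_stats.json"    # wherever your daemon writes its output

@app.route("/stats")
def stats():
    with open(STATS_FILE) as f:
        return jsonify(json.load(f))        # the UI polls this every ~10 seconds

@app.route("/remove", methods=["POST"])
def remove_server():
    body = request.get_json()
    # validate body["vip"] and body["server"] against the stats file before trusting them
    subprocess.run(
        ["ipvsadm", "-d", "-t", body["vip"], "-r", body["server"]],
        check=True,
    )
    return jsonify({"status": "removed"})
```

Junior ops then only ever click buttons in the UI, while the commands that actually run are the fixed ones coded into the endpoint.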