So I am fairly new to the whole Django / Python environment. I have successfully installed the social-registration app in my Django application. Users are able to sign in with Facebook, and it creates a record for the user in the auth_user table and in my app_customuser table, but it does not save the email, first name, last name, etc.
What I was wondering is where in the structure of the application should I be looking to place the code that takes the facebook information and saves the data into the database.
social-registration seems to be a bit simplistic; its documentation definitely is. You could possibly subclass its backends, or create your own backends based on them, to store the additional fields, but that feels clunky even saying it.
It's probably not what you want to hear, but django-social-auth is much more widely used and better documented. It provides signals out of the box that you can hook into to save additional user data: http://django-social-auth.readthedocs.org/en/latest/signals.html
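For example, a rough sketch of hooking one of those signals to copy the extra fields onto the user could look like the following; the signal name and the receiver arguments are taken from that documentation, so double-check them against the version you actually install:

from social_auth.signals import socialauth_registered

def save_facebook_details(sender, user, response, details, **kwargs):
    # 'details' normally carries the provider's first_name/last_name/email
    user.first_name = details.get('first_name', '')
    user.last_name = details.get('last_name', '')
    user.email = details.get('email', '')
    user.save()

socialauth_registered.connect(save_facebook_details)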
Django Facebook sounds exactly like what you are looking for
https://github.com/tschellenbach/Django-facebook
It stores username, email, gender, birthday, about me, website, and optionally likes and friends.
It's a very mature app; it handles thousands of registrations daily on our site.
I am currently building a tool in Django for managing the design information within an engineering department. The idea is to have a common catalogue of items accessible to all projects. However, the projects would be restricted based on user groups.
For each project, you can import items from the catalogue and change them within the project. There is a requirement that each project must be linked to a different database.
I am not entirely sure how to approach this problem. From what I have read, the solution I came up with is to have multiple Django apps: one representing the common catalogue of items (linked to its own database), and one app per project (each of which can read and write its own database, but can additionally read from the common catalogue database). In this way, I can restrict which users can access which database/project. However, the problem with this solution is that it is not DRY. All projects look the same: same models, same forms, same templates. They are just linked to different databases, and I do not know how to do this in a smart way (without copy-pasting entire files, because I think managing that would be a pain).
I was thinking that this could be avoided by switching the database alias when doing queries (with the using() method) depending on the group of the authenticated user. The problem with this is that a user can have access to multiple projects. So, I am again at a loss.
It looks to me like all you need is a single application that manages its access properly.
If the requirement is to have separate DBs then I will not argue with that, but... there is always a small chance that separate tables in one DB is what they will accept.
Django apps don't segregate objects; they are a way of structuring your code base. The idea is that an app can be re-used in other projects. Having separate apps for your catalogue of items and your projects is a good idea, but having them together in one app is not a problem if you have a small codebase.
If I have understood your post correctly, what you want is for the databases of different departments to be separate. This is essentially a multi-tenancy question which is a big topic in itself, there are a few options:
Code separation - all of your projects/departments exist in a single database and schema but are separated in code, by filtering on the department of whoever the end user is (literally by using Django's .filter()). This is easy to do, but there is a risk that data could be leaked to the wrong user if you get your code wrong. I would recommend this one for your use case (see the sketch after this list).
Schema separation - you are still using a single database, but each department has its own schema. You would need to use PostgreSQL for this, but once a schema has been set up, there is far less chance that data will be visible to the wrong user. There are Django libraries such as django-tenants that can do a lot of the heavy lifting.
Database separation - each department has its own database. There is even less of a chance that data will be leaked, but you have to manage multiple databases and it is more difficult to scale. You can manage this through Django, as it has multi-database support.
Application separation - each department not only has its own database but its own application instance. The separation is absolute, but again you need to manage multiple applications on a host like Heroku, which is even less scalable.
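If it helps, here is a minimal sketch of the code-separation option; the model and field names are made up for illustration rather than taken from your project:

from django.contrib.auth.models import Group
from django.db import models

class Project(models.Model):
    name = models.CharField(max_length=100)
    # which user groups may see this project
    allowed_groups = models.ManyToManyField(Group, related_name='projects')

class Item(models.Model):
    project = models.ForeignKey(Project, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)

def items_visible_to(user):
    # The single place where access is enforced; route every project query through it.
    return Item.objects.filter(
        project__allowed_groups__in=user.groups.all()
    ).distinct()

Keeping every queryset behind one helper like this is what keeps the "leaked to the wrong user" risk manageable.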
I am somewhat of a beginner at Python, and I am currently brainstorming and planning a project to simplify the tedious task of filling out a product order form for a friend's business. I want to create a program with an interface that takes input and writes it into an already existing PDF version of the physical order form. I would also like to be able to email that form to a coworker, and to keep accessible information about previous orders from the customer currently ordering.
I am most curious about how to write to an already existing PDF form and just fill in the blanks, potentially using a PDF library. Also, for product catalog data and customer order history, would using dictionaries suffice, or would it be better to use an actual database?
I know I could figure out how to do each of those tasks individually, but I don't know how to tie it all together properly and distribute it: should I make a web app, or build an executable from the script and use tkinter or another GUI library for the interface, or is there a more obvious and convenient option?
If I were to do a web app, what would be my best option? I've seen options such as Anvil, Flask, and Django, and I just don't know what would be best for what I'm trying to accomplish.
I know this is a longshot and vague but any advice or guidance in the right direction would be much appreciated!
I'd go with a web app here. Personally, I prefer Django, and I don't think that it would be too difficult to learn it well enough for your purposes!
Regarding saving product catalog data and customer order history: I would definitely recommend using a database for your backend, which Django helps you do. Database integration is almost ridiculously easy with Django.
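Purely as an illustration (the model and field names below are invented, not anything Django prescribes), the order-history backend could be as small as:

from django.db import models

class Customer(models.Model):
    name = models.CharField(max_length=200)
    email = models.EmailField(blank=True)

class Product(models.Model):
    name = models.CharField(max_length=200)
    price = models.DecimalField(max_digits=8, decimal_places=2)

class Order(models.Model):
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    product = models.ForeignKey(Product, on_delete=models.CASCADE)
    quantity = models.PositiveIntegerField(default=1)
    placed_at = models.DateTimeField(auto_now_add=True)  # when the order was taken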
Here are some resources for the specific things you asked about:
Django vs. Flask
Working with HTML forms (Django) (this will help with taking user input)
filling an existing PDF form in Python (this will help with using user input to fill out the pdfs)
How to send email attachments in Python? (this will help you email your coworker the filled out pdf)
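To go with the PDF-filling link above, here is a minimal sketch using the pypdf library; the file paths and form field names are placeholders you would replace with your own (print the form's real field names first):

from pypdf import PdfReader, PdfWriter

reader = PdfReader("order_form.pdf")   # the existing blank order form
writer = PdfWriter()
writer.append(reader)                  # copy the form's pages into the writer

print(reader.get_fields().keys())      # discover the actual field names

writer.update_page_form_field_values(
    writer.pages[0],
    {"customer_name": "Jane Doe", "product": "Widget", "quantity": "3"},
)

with open("filled_order.pdf", "wb") as f:
    writer.write(f)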
If you need more help/guidance, definitely feel free to follow up!
The problem is that I've ended up with the extra role of having to run these queries for different users every day, because no one in our factory knows how to use SQL and our IT wouldn't grant them database access anyway.
I've consulted some tech-savvier friends, who suggested that I create a GUI (using Python) that does the querying for me. That sounds ideal, since all my end users would have to do is key something into an input field on the interface and the query would run, versus them querying the data directly and potentially messing up my SQL script.
I've also read online about the possibility of doing something like that in Excel instead. Right now I'm just really lost as to where to start. I would probably need to do a lot of reading up anyway, but it would at least be nice to know what a good direction to start in is. Any suggestions would be appreciated.
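For what it's worth, the GUI your friends described can be very small. Below is a rough sketch with Tkinter; it uses SQLite and an invented orders table purely for illustration, so you would swap in whatever driver and query your factory database actually needs:

import sqlite3
import tkinter as tk

QUERY = "SELECT * FROM orders WHERE order_id = ?"   # placeholder query

def run_query():
    # Parameterized query: the user's input never edits the SQL itself.
    conn = sqlite3.connect("factory.db")             # placeholder database
    rows = conn.execute(QUERY, (entry.get(),)).fetchall()
    conn.close()
    output.delete("1.0", tk.END)
    output.insert(tk.END, "\n".join(str(r) for r in rows))

root = tk.Tk()
root.title("Order lookup")
entry = tk.Entry(root)
entry.pack()
tk.Button(root, text="Run query", command=run_query).pack()
output = tk.Text(root, height=15, width=80)
output.pack()
root.mainloop()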
This could be overkill for you, but Metabase is basically a GUI you can pretty easily set up on top of your SQL database.
Its target audience is "business analytics", so it enables people to create "questions" (i.e. queries) using a GUI rather than writing SQL (although it allows raw SQL too). More relevant to you, you probably just want to preconfigure some "questions" your users can log in and run in Metabase.
Without writing any code at all you could set up Metabase, write the query you want using either raw SQL or their GUI, and parameterize the query with some user_id or something the users can fill in. From there, you could give the users access to your Metabase (via the web), where they can go in and use the GUI to fill in whatever they need to and then run the query you've created for them.
Metabase also has a lot of customization, so you can make sure they only have permissions to see certain data, etc., and they don't need DB credentials. You, the Metabase admin, connect the database when you set up Metabase and give other users Metabase logins instead.
I want to add a 'check username available' functionality to my signup page using AJAX. I have a few doubts about the way I should implement it.
1. With which event should I register my AJAX requests? We can send the requests when the user focuses out of the 'username' input field (blur event) or as he types (keyup event). Which provides a better user experience?

2. On the server side, a simple way of dealing with requests would be to query my main 'Accounts' database. But this could lead to a lot of requests hitting my database (even more if we POST on the keyup event). Should I maintain a separate model for registered usernames only and use that to get better results?

3. Is it possible to use memcache in this case? For example, initializing the cache with every username as a key, updating it as we register users, and using a random key to check whether the cache is actually initialized or whether to pass the queries directly to the DB.
Answers:

1. Do the check on blur. If you do it on keyup, you will hammer your server with unnecessary queries, annoy the user who is not yet done typing, and likely lag the typing anyway.

2. If your Account entity is very large, you may want to create a separate AccountName entity, and create a matching AccountName whenever you create a real Account (but this is probably an unnecessary optimization). When you create the Account (or AccountName), be sure to assign id=name. Then you can do an AccountName.get_by_id(name) to quickly see if the name has already been taken, and it will automatically be pulled from memcache if it has been recently dealt with.

3. By default, GAE NDB will automatically populate memcache for you when you put or get entities. If you follow my advice in step 2, things will be very fast and you won't have to mess around with pre-populating memcache.

4. If you are concerned about 2 people simultaneously requesting the same username, put your create method in a transaction:
@classmethod
@ndb.transactional()
def create_account(cls, name, **other_params):
    # Using the username as the entity id makes the availability check a fast
    # get_by_id(), and the transaction guards against duplicate registrations.
    acct = Account.get_by_id(name)
    if not acct:
        acct = Account(id=name, **other_params)
        acct.put()
    return acct
I would recommend the blur event of the username field, combined with some sort of inline error/warning display.
I would also suggest maintaining a memcache of registered usernames, to reduce DB hits and improve the user experience - although probably not populating it with a warm-up, but instead only as requests are made. This is sometimes called a "Repository" pattern.
BUT, you can only populate the cache with USED usernames - you should not store the "available" usernames there (or if you do, use a much lower timeout).
You should always check directly against the DB/Datastore when actually performing the registration. And ideally in some sort of transactional method so that you don't have race conditions with multiple people registering.
BUT, all of this work is dependent on several things, including how busy your app is and what data storage tech you are using!
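A rough sketch of that lazily-populated cache on App Engine might look like the following; the Account model and the key prefix are assumptions for illustration:

from google.appengine.api import memcache
from google.appengine.ext import ndb

class Account(ndb.Model):
    pass  # the username is stored as the entity id

def username_taken(name):
    key = 'username-taken:' + name
    if memcache.get(key):                        # only taken names are cached
        return True
    taken = Account.get_by_id(name) is not None
    if taken:
        memcache.set(key, True)                  # never cache "available", per the note above
    return taken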
Note: Scroll down to the Background section for useful details. In the following illustration, assume the project uses Python/Django and South.
What's the best way to import the following CSV
"john","doe","savings","personal"
"john","doe","savings","business"
"john","doe","checking","personal"
"john","doe","checking","business"
"jemma","donut","checking","personal"
Into a PostgreSQL database with the related tables Person, Account, and AccountType considering:
Admin users can change the database model and CSV import-representation in real-time via a custom UI
The saved CSV-to-Database table/field mappings are used when regular users import CSV files
So far two approaches have been considered
ETL-API Approach: providing an ETL API with a spreadsheet, my CSV-to-Database table/field mappings, and connection info for the target database. The API would then load the spreadsheet and populate the target database tables. Looking at pygrametl, I don't think what I'm aiming for is possible. In fact, I'm not sure any ETL APIs do this.
Row-level Insert Approach: Parsing the CSV-to-Database table/field mappings, parsing the spreadsheet, and generating SQL inserts in "join-order".
I implemented the second approach but am struggling with algorithm defects and code complexity. Is there a Python ETL API out there that does what I want? Or an approach that doesn't involve reinventing the wheel?
Background
The company I work at is looking to move hundreds of project-specific design spreadsheets hosted in SharePoint into databases. We're close to completing a web application that meets the need by allowing an administrator to define/model a database for each project, store spreadsheets in it, and define the browse experience. At this stage of completion, transitioning to a commercial tool isn't an option. Think of the web application as a django-admin alternative (though it isn't one), with a DB modeling UI, CSV import/export functionality, customizable browsing, and modularized code to address project-specific customizations.
The implemented CSV import interface is cumbersome and buggy, so I'm trying to get feedback and find alternative approaches.
How about separating the problem into two separate problems?
Create a Person class which represents a person in the database. This could use Django's ORM, or extend it, or you could do it yourself.
Now you have two issues:
Create a Person instance from a row in the CSV.
Save a Person instance to the database.
Now, instead of just CSV-to-Database, you have CSV-to-Person and Person-to-Database. I think this is conceptually cleaner. When the admins change the schema, that changes the Person-to-Database side. When the admins change the CSV format, they're changing the CSV-to-Person side. Now you can deal with each separately.
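A bare-bones sketch of that split might look like this; all the names here are illustrative, and the database half is shown with the raw DB-API for brevity (in your project it would sit behind the ORM):

import csv
from dataclasses import dataclass

@dataclass
class Person:
    first_name: str
    last_name: str
    account: str
    account_type: str

def rows_to_people(path):
    # CSV-to-Person: the only code that knows the CSV column order.
    with open(path, newline='') as f:
        for row in csv.reader(f):
            yield Person(*row)

def save_person(person, cursor):
    # Person-to-Database: the only code that knows the table layout.
    cursor.execute(
        "INSERT INTO person (first_name, last_name) VALUES (%s, %s)",
        (person.first_name, person.last_name),
    )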
Does that help any?
I write import sub-systems almost every month at work, and since I do that kind of task so often, I wrote django-data-importer some time ago. This importer works like a Django form and has readers for CSV, XLS and XLSX files that give you lists of dicts.
With the data_importer readers you can read a file into lists of dicts, iterate over it with a for loop, and save the lines to the DB.
With the importer you can do the same, but with the bonus of validating each field of a line, logging errors and actions, and saving everything at the end.
Please take a look at https://github.com/chronossc/django-data-importer. I'm pretty sure it will solve your problem and will help you process any kind of CSV file from now on :)
To solve your problem I suggest using data-importer with Celery tasks. You upload the file and fire the import task via a simple interface. The Celery task will send the file to the importer, and you can validate the lines, save them, and log errors. With some effort you can even show the progress of the task to the users who uploaded the sheet.
I ended up taking a few steps back and, per Occam's razor, addressed this problem with updatable SQL views. It meant a few sacrifices:
Removing: South.DB-dependent real-time schema administration API, dynamic model loading, and dynamic ORM syncing
Defining models.py and an initial south migration by hand.
This allows for a simple approach to importing flat datasets (CSV/Excel) into a normalized database (a rough sketch follows the list):
Define unmanaged models in models.py for each spreadsheet
Map those to updatable SQL views (INSERT/UPDATE-INSTEAD SQL RULEs) in the initial South migration, adhering to the spreadsheet field layout
Iterate through the CSV/Excel spreadsheet rows and perform an INSERT INTO <VIEW> (<COLUMNS>) VALUES (<CSV-ROW-FIELDS>);
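A rough sketch of that last step is below; the view name and columns are placeholders matching the sample CSV above rather than the real project schema:

import csv
from django.db import connection

def import_sheet(path):
    # One INSERT per spreadsheet row, aimed at the updatable view rather than the
    # normalized tables; the view's INSTEAD rules do the fan-out into Person,
    # Account and AccountType.
    with open(path, newline='') as f:
        for row in csv.reader(f):
            with connection.cursor() as cur:
                cur.execute(
                    "INSERT INTO account_sheet_v "
                    "(first_name, last_name, account, account_type) "
                    "VALUES (%s, %s, %s, %s)",
                    row,
                )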
Here is another approach that I found on GitHub. Basically it detects the schema and allows overrides. Its whole goal is just to generate raw SQL to be executed by psql or whatever driver.
https://github.com/nmccready/csv2psql
% python setup.py install
% csv2psql --schema=public --key=student_id,class_id example/enrolled.csv > enrolled.sql
% psql -f enrolled.sql
There are also a bunch of options for doing ALTERs (e.g. creating primary keys from several existing columns) and for merging/dumps.