I am new to Python and I am trying to figure out how to integrate Python into my workflow. I have a set of tables within SSMS 2014 that I want to track on a daily basis. If there are any changes to the data elements within each table, I want to detect that and log that a change happened. I want this to run automatically every morning without me having to run the script manually. I figured that creating some sort of script/function in Python would be the best way to run all the tables I want against it. Are there any packages, templates, etc. out there that work best for this kind of situation? Thanks in advance.
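One common starting point, sketched here with pyodbc (server, database and table names are placeholders, and a checksum comparison can in rare cases miss a change due to hash collisions), is to compute a per-table checksum every morning and compare it with the previous run:

import datetime
import json

import pyodbc  # pip install pyodbc

# Tables to watch -- placeholders, replace with your own.
TABLES = ["dbo.Customers", "dbo.Orders"]
STATE_FILE = "table_checksums.json"

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
)
cursor = conn.cursor()

# Load the checksums recorded by the previous run, if any.
try:
    with open(STATE_FILE) as f:
        previous = json.load(f)
except FileNotFoundError:
    previous = {}

current = {}
for table in TABLES:
    # CHECKSUM_AGG(CHECKSUM(*)) yields a single value that changes
    # whenever any row in the table changes.
    cursor.execute(f"SELECT CHECKSUM_AGG(CHECKSUM(*)) FROM {table}")
    current[table] = cursor.fetchone()[0]
    if table in previous and previous[table] != current[table]:
        print(f"{datetime.date.today()}: data changed in {table}")

with open(STATE_FILE, "w") as f:
    json.dump(current, f)

Scheduling a script like this with Windows Task Scheduler (or SQL Server Agent) covers the "run every morning without touching it" part.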
Related
I don't have much knowledge of databases, but I wanted to know whether there is any technique by which, when I update or insert a specific entry in a table, my Python application gets notified, so it can then see what was updated and refresh that particular row in the data held in the session or some temporary storage.
I need to handle filter and sort calls again and again, and I don't want to fetch the whole dataset from SQL every time, so I decided to keep it local and process it from there. But I was worried that the database might be updated in the meantime, and I would keep passing the same old data to the filter requests.
Any suggestions?
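If the database happens to be PostgreSQL, one technique that matches this description is LISTEN/NOTIFY: a trigger on the table calls pg_notify('table_changed', ...) on INSERT/UPDATE, and the Python side listens for those events. A minimal sketch with psycopg2 (the channel name and connection details are assumptions):

import select

import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=mydb user=me")
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

cur = conn.cursor()
cur.execute("LISTEN table_changed;")  # channel name must match the trigger's pg_notify call

while True:
    # Wait up to 5 seconds for the connection's socket to become readable.
    if select.select([conn], [], [], 5) == ([], [], []):
        continue
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        print("row changed:", notify.payload)  # e.g. the id of the updated row

Other databases have their own mechanisms (SQL Server has Service Broker / query notifications, for instance), so which route applies depends on what you are running.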
If the RDBMS is only ever updated through your own program's methods or functions, you can simply print to the console or write a log inside those.
If you want to track what was updated, modified or deleted by anything else, you have to build another program that is able to track the logs of the RDBMS.
Thanks.
I have made several tables in a Postgres database in order to acquire time-stamped data and do automatic calculations, so that compiled values are directly available. Everything is done using triggers that update the right table whenever values are modified.
For example, if I update or insert a value measured at 2017-11-06 08:00, the trigger will detect this and do the update for the daily calculations; another one will do the update for the monthly calculations, and so on.
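A stripped-down sketch of the kind of trigger involved (not the exact ones I use; the measure_hourly/measure_daily tables and columns here are made up, and ON CONFLICT assumes PostgreSQL 9.5+ with a unique constraint on measure_daily.day), installed from Python with psycopg2:

import psycopg2

DDL = """
CREATE OR REPLACE FUNCTION refresh_daily() RETURNS trigger AS $$
BEGIN
    -- Recompute the daily total for the day touched by the inserted/updated row.
    INSERT INTO measure_daily (day, total)
    SELECT date_trunc('day', NEW.measured_at), sum(value)
      FROM measure_hourly
     WHERE date_trunc('day', measured_at) = date_trunc('day', NEW.measured_at)
    ON CONFLICT (day) DO UPDATE SET total = EXCLUDED.total;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS measure_hourly_to_daily ON measure_hourly;
CREATE TRIGGER measure_hourly_to_daily
AFTER INSERT OR UPDATE ON measure_hourly
FOR EACH ROW EXECUTE PROCEDURE refresh_daily();
"""

conn = psycopg2.connect("dbname=measures user=django")
with conn, conn.cursor() as cur:  # the connection context manager commits on success
    cur.execute(DDL)
conn.close()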
Right now, everything is working well. Data acquisition is done in python/Qt to update the measured values using pure SQL instructions (INSERT/UPDATE/DELETE), and the automatic calculations are working.
Everything is working well too when I use an interface like pgAdmin III to change values.
My problem comes with development in django to display and modify the data. Up to now, I did not have any problem as I just displayed data without trying to modify them. But now I don't understand what's going on...
If I insert a new value using model.save(), everything works: the hourly measure is written, and the daily, monthly and yearly calculations are done.
But if I update an existing value, the triggers seem not to see the modification: the hourly measure is updated (so model.save() does its job), but the daily calculation trigger does not appear to fire, as the corresponding table is not updated. As said previously, manually updating the same value with pgAdmin III works: the hourly value is updated and the daily calculation is done.
I do not understand why the update process of django seems to disable my triggers...
I have tried to use the old save algorithm (select_on_save = True), but without success.
The django database account owns all the tables, triggers and functions, and it has execute permission on all the triggers and functions. And again, inserting an item through django works, using the same triggers and functions.
My solution for the moment is to use direct SQL instructions with python/Qt to do the job, but it feels a bit frustrating not to be able to use only the django API...
Does anybody have an idea of how to debug or solve this issue?
The problem turned out to be caused by a time zone management error.
I am trying to build a website that displays stock information and I have a file called populate_stocks.py that populates the database with a given set of stocks. Since these stocks change almost every minute, I need to make sure I update the database with new information by running populate_stocks.py again.
I was wondering if there is any way to let my django application automatically call this file to update the stock information. I searched around and found another person using crontab which seems a bit complicated and was wondering if there is another solution.
Though I agree that crontab is probably the easiest solution to this problem, a possible alternative is to write a management command that runs alongside your server.
In its absolute simplest form, you would make a basic loop in the form of a Django NoArgsCommand.
import time

from django.core.management.base import NoArgsCommand

from populate_stocks import yourmysticalfunctionofupdating


class Command(NoArgsCommand):
    help = "This runs the loop of glory, that does as it is told."

    def handle_noargs(self, **options):
        # Loop forever, refreshing the stock data on each pass.
        while True:
            yourmysticalfunctionofupdating()
            time.sleep(60)  # pause so the loop doesn't hammer the database
You would need to put this into a management/commands folder inside your app and name the Python file whatever you want the command to be called (imagine it is updatify.py in this example).
You could then run the following command to run your watchdog.
./manage.py updatify
Though this may be overkill for your particular problem I have found it very helpful for trickier issues, and I hope it saves someone some time.
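If the crontab route ends up being acceptable after all, it boils down to a single line once you have a one-shot management command (hypothetically named update_stocks here) that refreshes the data once and exits, for example run every minute:

* * * * * /path/to/virtualenv/bin/python /path/to/project/manage.py update_stocks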
I have installed chart of accounts A for company1. This chart was used for a couple of months of accounting. How can I convert to chart of accounts B and keep the old data for the accounts (debit, credit, etc.)? In other words, is it possible to migrate data from one chart of accounts to another? The solution can be programmatic or through the web client interface (not important). Virtual charts of accounts can't be used. Chart of accounts B must become the main chart, with the old data.
Any advice will help me a lot. Thanks
I don't know of any way to install another chart of accounts after you've run the initial configuration wizard on a new database. However, if all you want to do is change the account numbers, names, and parents to match a different chart of accounts, then you should be able to do that with a bunch of database updates. Either manually edit each account if there aren't too many accounts, or write a SQL or Python script to update all the accounts. To do that, you'll need to map each old account to a new account code, name, and parent, then use that map to generate a script.
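A rough sketch of what such a script could look like, assuming OpenERP's account_account table (code, name, parent_id columns) and a hand-made CSV mapping file; treat it as a starting point and run it against a copy of the database first:

import csv

import psycopg2  # assuming OpenERP running on PostgreSQL

# mapping.csv columns: old_code,new_code,new_name,new_parent_code
conn = psycopg2.connect("dbname=openerp_db user=openerp")
cur = conn.cursor()

with open("mapping.csv") as f:
    for row in csv.DictReader(f):
        # Look up the internal id of the new parent account by its code.
        cur.execute("SELECT id FROM account_account WHERE code = %s",
                    (row["new_parent_code"],))
        parent = cur.fetchone()
        cur.execute(
            "UPDATE account_account "
            "SET code = %s, name = %s, parent_id = %s "
            "WHERE code = %s",
            (row["new_code"], row["new_name"],
             parent[0] if parent else None, row["old_code"]),
        )

conn.commit()
conn.close()

The order of the rows in the mapping matters if the old and new code ranges overlap, so in that case it is safer to map everything to temporary codes first.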
IMO it's very difficult; we are currently migrating some data and it's proving to be difficult.
I would advise you to pick a date in the future and tell everyone to just use another db with the correct chart of accounts.
Your finance dept will be the one to suggest which date is perfect. How about when a period starts?
I needed to do something similar. It is possible to massage the chart from one form to another, but I found in the end that creating a new database, bringing in the modules, assigning the new chart and then importing all the critical elements was the best and safest path.
If you have a lot of transactions, that will make the import more difficult. If that is the case, then massage your chart from one form to the other.
I am sure there will be some way to do an active migration sometime in the future. You definitely don't want to live with a bad chart or without your history if you can help it.
The fastest way to do so is to use an ETL tool like Talend or Pentaho (provided there is a logic as to which account maps to which other one during the process). If not, you will have to do it by hand.
In case there is such a logic, you would export the data to a format you can transform and re-import, uninstall your chart of accounts and install the new one, then import all the data that you formatted using those tools.
What would be the best way to import multi-million record CSV files into django?
Currently, using the Python csv module, it takes 2-4 days to process a 1 million record file. It does some checking of whether the record already exists, and a few other things.
Can this process be made to finish in a few hours?
Can memcache be used somehow?
Update: There are django ManyToManyField fields that get processed as well. How will these be handled with a direct load?
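For the ManyToManyField part specifically, one commonly used trick — sketched here with hypothetical Book/Author models and assuming a Django version that has bulk_create — is to skip the relation helpers and insert into the auto-generated through table directly:

import csv

from myapp.models import Book  # hypothetical app and model names

BookAuthor = Book.authors.through  # the auto-generated join-table model

# book_authors.csv holds pre-resolved primary key pairs: book_id,author_id
with open("book_authors.csv") as f:
    links = [BookAuthor(book_id=int(b), author_id=int(a))
             for b, a in csv.reader(f)]

# One bulk INSERT per batch instead of one query per pair.
BookAuthor.objects.bulk_create(links, batch_size=1000)

The same pattern works with raw SQL against the join table if the ORM is bypassed entirely.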
I'm not sure about your case, but we had a similar scenario with Django where ~30 million records took more than one day to import.
Since our customer was totally unsatisfied (with the danger of losing the project), after several failed optimization attempts with Python we took a radical strategy change and did the import (and only the import) with Java and JDBC (+ some mysql tuning), and got the import time down to ~45 minutes (with Java it was very easy to optimize because of the very good IDE and profiler support).
I would suggest using the MySQL Python driver directly. Also, you might want to take some multi-threading options into consideration.
Depending upon the data format (you said CSV) and the database, you'll probably be better off loading the data directly into the database (either directly into the Django-managed tables, or into temp tables). As an example, Oracle and SQL Server provide custom tools for loading large amounts of data. In the case of MySQL, there are a lot of tricks that you can do. As an example, you can write a perl/python script to read the CSV file and create a SQL script with insert statements, and then feed the SQL script directly to MySQL.
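A rough sketch of that last approach (the file and table names are made up, and every field is quoted as a string; numeric columns would not need the quoting):

import csv

# Stream the CSV into a SQL script that MySQL can then ingest with:
#   mysql mydb < load_records.sql
with open("records.csv", newline="") as src, open("load_records.sql", "w") as dst:
    dst.write("SET autocommit=0;\n")
    for i, row in enumerate(csv.reader(src), start=1):
        # Quote every field and escape embedded single quotes.
        values = ", ".join("'%s'" % field.replace("'", "''") for field in row)
        dst.write("INSERT INTO my_table VALUES (%s);\n" % values)
        if i % 1000 == 0:
            dst.write("COMMIT;\n")  # commit in batches, as suggested below
    dst.write("COMMIT;\n")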
As others have said, always drop your indexes and triggers before loading large amounts of data, and then add them back afterwards -- rebuilding indexes after every insert is a major processing hit.
If you're using transactions, either turn them off or batch your inserts to keep the transactions from being too large (the definition of too large varies, but if you're doing 1 million rows of data, breaking that into 1 thousand transactions is probably about right).
And most importantly, BACK UP YOUR DATABASE FIRST! The only thing worse than having to restore your database from a backup because of an import screwup is not having a current backup to restore from.
As mentioned you want to bypass the ORM and go directly to the database. Depending on what type of database you're using you'll probably find good options for loading the CSV data directly. With Oracle you can use External Tables for very high speed data loading, and for mysql you can use the LOAD command. I'm sure there's something similar for Postgres as well.
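For the MySQL route specifically, a minimal, hedged example of driving that LOAD command from Python (file, table and credentials are placeholders, and both server and client need local_infile enabled):

import MySQLdb  # pip install mysqlclient

conn = MySQLdb.connect(db="mydb", user="me", passwd="secret", local_infile=1)
cur = conn.cursor()
cur.execute("""
    LOAD DATA LOCAL INFILE 'records.csv'
    INTO TABLE my_table
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
""")
conn.commit()
conn.close()

This skips the ORM entirely, so any model-level validation has to be done separately (or beforehand on the CSV itself).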
Loading several million records shouldn't take anywhere near 2-4 days; I routinely load a database with several million rows into mysql running on a very low-end machine in minutes using mysqldump.
Like Craig said, you'd better fill the db directly first.
It implies creating django models that just fit the CSV columns (you can then create better models and scripts to move the data).
Then, to feed the db: a tool of choice for doing this is Navicat; you can grab a functional 30-day demo from their site. It allows you to import CSV into MySQL, save the import profile as XML...
Then I would launch the data control scripts from within Django and, when you're done, migrate your model with South to get what you want or, like I said earlier, create another set of models within your project and use scripts to convert/copy the data.