Data storage for standalone python application

Data storage for standalone python application - python

I want to make a python program (with a PyQt GUI, but I don't know whether that is relevant) that has to save some information that I want to store even when the program closes. Example for information I want to store:
The user can search for a file in a file dialog window. I want to start the file dialog window in the previously used directory, even if the program is closed in between file searches.
The user can enter their own categories to sort items, building up on some of my predefined categories. These new categories should be available the next time the program starts.
Now I'm wondering what the proper way to store such information is. Should I use pickle? A proper database (I know a tiny bit of sqlite3, but would have to read up on that)? A simple text file that I parse myself? One thing for data like in example 1., another for data like in example 2.?
Also, whatever way to store it I use, where would I put that file?
I'm asking in the context that I might want to later make my program available to others as a standalone application (using py2app, py2exe or PyInstaller).
Right now I'm just saving a pickle file in the directory that my .py file is in, like this answer reconmends, but the answer also specifically mentions:
for a personal project it might be enough.
(emphasis mine)
Is using pickle also the "proper, professional" way, if I want to make the program available to other people as a standalone application?

Choice depends on your approach to data you store, which is yours?:
user should be able to alter it without usage of my program
user should be prevented from altering it with program other than my program
If first you might consider deploying JSON open-standard file format, for which Python has ready library called json. In effect you get text (which you can save to file) which is human-readable and can be edited in text editor. Also there exist JSON file viewers and editors which made viewing/editing of JSON files easier.

I think SQLite3 is the better solution in this case as Moldovan commented.
There is a problem in pickle, sometimes pickling format can be change across python versions and there are greater advantages of using sqlite3.

Related

How to automate SAS enterprise guide reports with Python Script?

I tried with SASpy but it's not working. I am able to open the SAS .egp file but not able to run the multiple scripts within in sequence.
import os, sys, subprocess
def OpenProject(sas_exe, egp_path):
sasExe = sas_exe
sasEGpath = egp_path
subprocess.call([sasExe, sasEGpath])
sas_exe = path\path\
egp_path = path\path\path\
OpenProject(sas_exe, egp_path)

This depends a bit on exactly what the workflow is. A few side notes, then the full solution.
First: EGP is not really intended to store production processes, in my opinion. EGP should really be used for development, then production is done with .sas (text) files. EGP can directly store the nodes as .sas files; ask a new question about that if you want to know more, but it's pretty easy to figure out. Best practice is to have EGP save the code modules as .sas files, then run those - SASPy will easily do that for you.
Second: If you use SAS's built-in Git connectivity, then you can do this a bit more easily I suspect. Consider doing that if you already use Git for your other processes. Again, then you end up with a .sas file, and can directly run that via SASPy.
So: how can you do this in Python, with the assumption you do have to use the .egp itself, without too many different moving parts? The key here is the .egp format. EGP is a container file, which is actually a .zip format container that has in it, among other things, all of the SAS code you want to run, as text. Text in xml format, but still, text.
You can write a python program that opens the .egp as a .zip file, using the zipfile library, and then use xml.etree.ElementTree to parse the project.xml file inside that project. Exactly what you do from there depends on your particular details, and is well out of scope for a Stack Overflow answer, but if you do better visually you can simply rename the .egp to .zip and then open in unzip program of your choice, then browse project.xml in your text editor, and find the nodes and code related to those nodes.
You can then extract the .sas code as text, and submit it directly via SASPy, or extract it to a .sas file and then submit that however you prefer (SASPy or something else).
I do something similar to this for a project - I don't actually run code from it, I'm just parsing it to verify that the correct programs were synced from the EGP to production - but it would be trivial to actually submit the code from what I've written, which is about 50 lines of code total. I may write a SGF paper this year or next year on this topic, in which case I'll try and remember to submit it here - or you can head over to my github page and see if it's there (in the future!).

How to create a dynamic form with python using translated text as input?

I have an original text that I want to translate. I normally do it manually but I know I could save a lot of time translating automatically the most frequent words and expressions.
I will find out how to translate simple words, the problem is not here. I have read some books on python and I think using string manipulations can be done.
But I am lost about how to create the output file.
The output file will contain:
short empty forms ready to be filled wherever there is text that has not been translated
the translated words wherever they were in the original file
In the output file I will fill manually the empty forms, after pressing Tab the cursor should jump to the next exmpty form
I am lost here, I know how to do forms on html but the language I am used to is Python.
I would like to know what modules from Python I could use. I need some guidance on this.
Can you recommend me a book or a tool that explains how to do something similar to this?
This is what I want to do, assuming I have managed to create a simple database to translate colors from Spanish to English.
The first step contains the original file.
The second step contains the automatic translation.
In the third step I complete the manual translation.
After finishing everything is grouped into a normal txt file ready to be used.
I think it is quite clear. I don't expect people to tell me the code to do this, I just need to know what tools could be used to achieve my goal.
Thanks for editing.

To create an interface that works with a web browser, Flask for Python is a good method for creating webforms. There are tutorials available.
One method for storing data would be an SQLite file. That may be more than you need, so I'd recommend starting with a CSV file. Libraries exist in Python for both CSVs and SQLite.

Generating correct excel xls format

I created a little script in python to generate an excel compatible xml file (saved with xls extension). The file is generated from a part database so I can place an order with the extracted data.
On the website for ordering the parts, you can import the excel file so the order fills automatically. The problem here is that each time I want to make an order, I have to open excel and save the file with xls extension of type MS Excel 97-2003 to get the import working.
The excel document then looks exactly the same, but when opened with notepad, we cannot see the xml anymore, only binary dump.
Is there a way to automate this process, by running a bat file or maybe adding some line to my python script so it is converted in the proper format?
(I know that question has been asked before, but it never has been answered)

There are two basic approaches to this.
You asked about the first: Automating Excel to open and save the file. There are in fact two ways to do that. The second is to use Python tools that can create the file directly in Python without Excel's help. So:
1a: Automating Excel through its automation interface.
Excel is designed to be controlled by external apps, through COM automation. Python has a great COM-automation interface inside of pywin32. Unfortunately, the documentation on pywin32 is not that great, and all of the documentation on Excel's COM automation interface is written for JScript, VB, .NET, or raw COM in C. Fortunately, there are a number of questions on this site about using win32com to drive Excel, such as this one, so you can probably figure it out yourself. It would look something like this:
import win32com.client
excel = win32com.client.Dispatch('Excel.Application')
spreadsheet = excel.Workbooks.Open('C:/path/to/spreadsheet.xml')
spreadsheet.SaveAs('C:/path/to/spreadsheet.xls', fileformat=excel.xlExcel8)
That isn't tested in any way, because I don't have a Windows box with Excel handy. And I vaguely remember having problems getting access to the fileformat names from win32com and just punting and looking up the equivalent numbers (a quick google for "fileformat xlExcel8" shows that the numerical equivalent is 56, and confirms that's the right format for 97-2003 binary xls).
Of course if you don't need to do it in Python, MSDN is full of great examples in JScript, VBA, etc.
The documentation you need is all on MSDN (since the Office Developer Network for Excel was merged into MSDN, and then apparently became a 404 page). The top-level page for Excel is Welcome to the Excel 2013 developer reference (if you want a different version, click on "Office client development" in the navigation thingy above and pick a different version), and what you mostly care about is the Object model reference. You can also find the same documentation (often links to the exact same webpages) in Excel's built-in help. For example, that's where you find out that the Application object has a Workbooks property, which is a Workbooks object, which has Open and Add methods that return a Workbook object, which has a SaveAs method, which takes an optional FileFormat parameter of type XlFileFormat, which has a value xlExcel8 = 56.
As I implied earlier, you may not be able to access enumeration values like xlExcel8 for some reason which I no longer remember, but you can look the value up on MSDN (or just Google it) and put the number 56 instead.
The other documentation (both here and elsewhere within MSDN) is usually either stuff you can guess yourself, or stuff that isn't relevant from win32com. Unfortunately, the already-sparse win32com documentation expects you to have read that documentation—but fortunately, the examples are enough to muddle your way through almost everything but the object model.
1b: Automating Excel via its GUI.
Automating a GUI on Windows is a huge pain, but there are a number of tools that make it a whole lot easier, such as pywinauto. You may be able to just use swapy to write the pywinauto script for you.
If you don't need to do it in Python, separate scripting systems like AutoIt have an even larger user base and even more examples to make your life easier.
2: Doing it all in Python.
xlutils, part of python-excel, may be able to do what you want, without touching Excel at all.

Managing python program settings through browser

I am looking for a good way to manage the settings of a program I am writing. The program will be a service, so the user shouldn't have to interact with it at all. However, there will be a large number of settings to deal with (including a whole dirwalk), and I am looking for the best way to present that to the user.
Ideally, I would like something that would basically be a nice gui to an XML file I will build separately. I was thinking of doing it with Django, but I am trying to make the installation of this whole thing as simple as possible.
TL;DR: I dont know if it exists, but is there something that will cleanly present XML and allow for saving new settings to the XML document?

Windows Application Programming & wxPython

Developing a project of mine I realize I have a need for some level of persistence across sessions, for example when a user executes the application, changes some preferences and then closes the app. The next time the user executes the app, be it after a reboot or 15 minutes, I would like to be able to retain the preferences that had been changed.
My question relates to this persistence. Whether programming an application using the win32 API or the MFC Framework .. or using the newer tools for higher level languages such as wxPython or wxRuby, how does one maintain the type of persistence I refer to? Is it done as a temporary file written to the disk? Is it saved into some registry setting? Is there some other layer it is stored in that I am unaware of?

I would advice to do it in two steps.
First step is to save your prefs. as
string, for that you can
a)
Use any xml lib or output xml by
hand to output string and read
similarly from string
b) Just use pickle module to dump your prefs object as a string
c) Somehow generate a string from prefs which you can read back as prefs e.g. use yaml, config , JSON etc actually JSON is a good option when simplejson makes it so easy.
Once you have your methods to convert to and from string are ready, you just need to store it somewhere where it is persisted and you can read back next time, for that you can
a) Use wx.Config which save to registry in windows and to other places depending on platform so you don't have to worry where it saves, you can just read back values in platform independent way. But if you wish you can just use wx.Config for directly saving reading prefs.
b) Directly save prefs. string to a file in a folder assigned by OS to your app e.g. app data folder in windows.
Benefit of saving to a string and than using wx.Config to save it, is that you can easily change where data is saved in future e.g. in future if there is a need to upload prefs. you can just upload prefs. string.

There are different methods to do this that have evolved over the years.
These methods include (but not limited to):
Registry entries.
INI files.
XML Files
Simple binary/text files
Databases
Nowadays, most people do this kind of thing with XML files residing in the user specific AppData folders. It is your choice how you do it. For example, for simple things, databases can be overkill and for huge persisted objects, registry would not be appropriate. You have to see what you are doing and do it accordingly.
Here is a very good discussion on this topic

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.