How do I 'professionally' store small data in Python? [duplicate]

I need to store basic data about customers, the cars they bought, and the payment schedules for those cars. The data comes from a GUI written in Python. I don't have enough experience to use a database system like SQL, so I want to store my data in a file as plain text. It doesn't have to be online.
To be able to search and filter the data, I first convert it (lists of lists) to a string, and when I need it again I convert the string back to regular Python list syntax. I know this is a very brute-force approach, but is it safe to do it like that, or can you advise another way?

It is never really safe to keep your database in a text format (or in pickle, or whatever). If something goes wrong while the data is being saved, the file may end up corrupted, not to mention the risk of your data being stolen.
As your dataset grows, there may also be a performance hit.
Have a look at SQLite (the sqlite3 module in Python), which is small and easier to manage than MySQL, unless you have a very small dataset that will comfortably fit in a text file.
P.S. Using Berkeley DB in Python is also simple, and you don't have to learn all the database concepts: just import bsddb. (Note that bsddb is a Python 2 module; it was removed from the standard library in Python 3.)
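To give an idea of what the sqlite3 route could look like, here is a minimal sketch using only the standard library. The table names, column names and sample values are assumptions made up for the example, and the in-memory database is used only to keep it self-contained.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a filename such as
# "shop.db" (hypothetical) to persist the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE cars ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id), "
    "model TEXT NOT NULL, "
    "price REAL NOT NULL)"
)

# "with conn" wraps the statements in a transaction: either both rows are
# committed or, if an exception is raised, neither is.
with conn:
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", ("Alice",))
    conn.execute(
        "INSERT INTO cars (customer_id, model, price) VALUES (?, ?, ?)",
        (cur.lastrowid, "Example Model", 12500.0),
    )

# Searching and filtering become queries instead of string parsing.
rows = conn.execute(
    "SELECT customers.name, cars.model, cars.price "
    "FROM cars JOIN customers ON customers.id = cars.customer_id "
    "WHERE cars.price > ?",
    (10000,),
).fetchall()
print(rows)
```

The transaction block is the part that addresses the corruption concern: a failed save leaves the previous state intact rather than a half-written file.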

The answer to use pickle is good, but I personally prefer shelve. It allows you to keep variables in the same state they were in between launches and I find it easier to use than pickle directly. http://docs.python.org/library/shelve.html
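A small sketch of how shelve might be used for this, assuming records keyed by customer name. The key, the record layout and the temporary path are all made up for the illustration.

```python
import os
import shelve
import tempfile

# A temporary directory keeps the sketch self-contained; a real program
# would use a fixed path (the name "customers" is hypothetical).
path = os.path.join(tempfile.mkdtemp(), "customers")

# First "launch": store ordinary Python objects under string keys.
with shelve.open(path) as db:
    db["alice"] = {"cars": ["Example Model"], "payments": [500, 500]}

# Later "launch": the objects come back as they were stored.
with shelve.open(path) as db:
    record = db["alice"]
    # Mutating db["alice"] in place is not saved unless writeback=True;
    # reassigning the key is the simple, explicit way to update.
    record["payments"].append(500)
    db["alice"] = record

with shelve.open(path) as db:
    payments = db["alice"]["payments"]
print(payments)
```

Since shelve is built on pickle, the same caution applies: only open shelf files that come from a source you trust.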

I agree with the others that serious and important data would be more secure in some kind of lightweight database, but I also sympathize with the wish to keep things simple and transparent.
So, instead of inventing your own text-based data format, I would suggest you use YAML.
The format is human-readable, for example:
List of things:
- Alice
- Bob
- Evan
You load the file like this (safe_load only builds plain Python objects, which is what you want for data files):
>>> import yaml
>>> with open('test.yaml', 'r') as f:
...     data = yaml.safe_load(f)
And data will look like this:
{'List of things': ['Alice', 'Bob', 'Evan']}
Of course you can do the reverse too and save data to YAML; the docs will help you with that.
At least it is another alternative to consider :)

very simple and basic (more at http://pastebin.com/A12w9SVd):
import json
import os
db_name = 'udb.db'
def check_db(name=db_name):
    # Create an empty database file if none exists yet.
    if not os.path.isfile(name):
        print('no db\ncreating..')
        with open(name, 'w'):
            pass
def read_db():
    # Return the stored dict, or {} if the file is empty or not valid JSON.
    check_db()
    try:
        with open(db_name, 'r') as udb:
            return json.load(udb)
    except ValueError:
        return {}
def update_db(newdata):
    # Merge newdata into the stored dict and write everything back.
    data = read_db()
    data.update(newdata)
    with open(db_name, 'w') as udb:
        json.dump(data, udb)
using:
def adduser():
    print('add user:')
    name = input('name > ')
    password = input('password > ')
    update_db({name: password})

You can use this lib to write an object into a file http://docs.python.org/library/pickle.html
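For a list of lists like the one in the question, a pickle round trip could look roughly like this. The sample records and the file name are invented for the example.

```python
import os
import pickle
import tempfile

# Made-up sample data in the "list of lists" shape the question describes.
customers = [["Alice", "Example Model", [500, 500]], ["Bob", "Other Model", [750]]]

path = os.path.join(tempfile.mkdtemp(), "customers.pickle")  # hypothetical file name

# Pickle files are binary, so open them with "wb" / "rb".
with open(path, "wb") as f:
    pickle.dump(customers, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == customers)
```

Keep in mind that pickle.load can execute arbitrary code from a crafted file, so this is only appropriate for files your own program wrote.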

Writing data to a plain file isn't a safe way to store it. Better to use a simple database library like SQLAlchemy. It is an ORM that makes working with a database easy...

You can also keep simple data in a plain text file. You then get little support, however, for checking the consistency of the data, duplicate values and so on.
Here is my simple 'card file' style snippet for data in a text file. It uses namedtuple so that you can access values not only by index within the line but also by their header name:
# text based data input with data accessible
# with named fields or indexing
from __future__ import print_function ## Python 3 style printing
from collections import namedtuple
import string
filein = open("sample.dat")
datadict = {}
headerline = filein.readline().lower() ## lowercase field names Python style
## first non-letter and non-number is taken to be the separator
separator = headerline.strip(string.ascii_lowercase + string.digits)[0]
print("Separator is '%s'" % separator)
headerline = [field.strip() for field in headerline.split(separator)]
Dataline = namedtuple('Dataline', headerline)
print('Fields are:', Dataline._fields, '\n')
for data in filein:
    data = [f.strip() for f in data.split(separator)]
    d = Dataline(*data)
    datadict[d.id] = d ## do hash of id values for fast lookup (key field)
filein.close()
## examples based on sample.dat file example
key = '123'
print('Email of record with key %s by field name is: %s' %
      (key, datadict[key].email))
## by number
print('Address of record with key %s by field number is: %s' %
      (key, datadict[key][3]))
## print the dictionary in separate lines for clarity
for key, value in datadict.items():
    print('%s: %s' % (key, value))
input('Ready') ## let the output be seen when run directly
""" Output:
Separator is ';'
Fields are: ('id', 'name', 'email', 'homeaddress')
Email of record with key 123 by field name is: gishi@mymail.com
Address of record with key 123 by field number is: 456 happy st.
345: Dataline(id='345', name='tony', email='tony.veijalainen@somewhere.com', homeaddress='Espoo Finland')
123: Dataline(id='123', name='gishi', email='gishi@mymail.com', homeaddress='456 happy st.')
Ready
"""

Related

How to merge few lines with filtering some text

I have a text file with the following format.
The first line contains the USERID (12345678), and the other lines contain the user groups for each application.
For example:
The user with T-number T12345 has WRITE access to APP1 and APP2 and READ-ONLY access to APP1.
The T-number is just another kind of ID.
00001, 00002 and so on are sequence numbers and can be ignored.
T12345;;USERID;00001;12345678;
T12345;APPLICATION;WRITE;00001;APP1
T12345;APPLICATION;WRITE;00002;APP2
T12345;APPLICATION;READ-ONLY;00001;APP1
I need to do some filtering and merge the line containing USERID with all the lines containing user groups, replacing the T-number with the matching user ID (T12345 = 12345678).
So the output should look like this:
12345678;APPLICATION;WRITE;APP1
12345678;APPLICATION;WRITE;APP2
12345678;APPLICATION;READ-ONLY;APP1
Should I use Python's csv module to accomplish this?

I do not see any advantage in using the csv module for reading and parsing the input text file. The number of fields varies: 6 fields in the USERID line, with 2 of them empty, but 5 non-empty fields in the other lines. The fields look very simple, so there is no need for csv's handling of the separator character hidden away in quotes and the like. There is no header line as in a csv file, but rather many headers sprinkled in among the data lines.
A simple routine that reads each line, splits it on the semicolon character, parses the fields, and combines related lines would suffice.
The output file is another matter. The lines have the same format, with the same number of fields. So creating that output may be a good use for csv. However, the format is so simple that the file could also be created without csv.
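The routine described above could be sketched roughly as follows. It assumes every T-number has one USERID line, which may come before or after its APPLICATION lines; the function name merge_lines is made up for the example.

```python
def merge_lines(lines):
    """Replace the T-number in APPLICATION lines with the matching USERID."""
    user_ids = {}   # T-number -> user ID
    rights = []     # (T-number, record type, access right, application)

    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        if fields[2] == "USERID":
            user_ids[fields[0]] = fields[4]
        else:
            # fields[3] is the sequence number, which is dropped.
            rights.append((fields[0], fields[1], fields[2], fields[4]))

    # Lines whose T-number has no USERID line are left out (an assumption).
    return [
        ";".join([user_ids[t_number], kind, right, app])
        for t_number, kind, right, app in rights
        if t_number in user_ids
    ]


sample = [
    "T12345;;USERID;00001;12345678;",
    "T12345;APPLICATION;WRITE;00001;APP1",
    "T12345;APPLICATION;WRITE;00002;APP2",
    "T12345;APPLICATION;READ-ONLY;00001;APP1",
]
merged = merge_lines(sample)
print("\n".join(merged))
```

Collecting the USERID lines first and joining afterwards means the order of lines in the input file does not matter.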

I am not so sure you should use the csv module here: the file has mixed data, possibly more than just users and user group rights? For a user declaration you only need to retrieve its group and ID, while for the application rights you need to extract the group, app name and right. The more varied the data, the more issues you will encounter, whereas with manual parsing you can always just skip to the next line when certain criteria are met.
So far I would say you are better off parsing the file manually, line by line, structuring the data into something meaningful, and then outputting it. For instance:
from io import StringIO
from pprint import pprint
feed = """T12345;;USERID;00001;12345678;
T12345;;USERID;00001;2345678;
T12345;;USERID;00002;345678;
T12345;;USERID;00002;45678;
T12345;APPLICATION;WRITE;00001;APP1
T12345;APPLICATION;WRITE;00002;APP2
T12345;APPLICATION;READ-ONLY;00001;APP1
T12345;APPLICATION;WRITE;00002;APP1
T12345;APPLICATION;WRITE;00002;APP2"""
buf = StringIO(feed)
groups = {}
# Read all data into a dict of dicts, keyed by the fourth field
for line in buf:
    values = line.strip().split(";")
    if values[3] not in groups:
        groups[values[3]] = {"users": [], "apps": {}}
    if values[2] == "USERID":
        groups[values[3]]['users'].append(values[4])
        continue
    if values[1] == "APPLICATION":
        if values[4] not in groups[values[3]]["apps"]:
            groups[values[3]]["apps"][values[4]] = []
        groups[values[3]]["apps"][values[4]].append(values[2])
print("Structured data with group as root")
pprint(groups)
print("Output data")
for group_id, group in groups.items():
    # Order by user, app
    for user in group["users"]:
        for app_name, rights in group["apps"].items():
            for right in rights:
                print(";".join([user, "APPLICATION", right, app_name]))

How do I import text file dictionary into python dictionary

I am creating a tournament scoring system application.
There are many sections to this app, for example, Participants, Team, Events and Award points.
Right now I am working on the team section. I was able to create something that will allow the user to create teams.
The code looks like this.
teamname = input("Team Name: ").title()
First_member = input("First Member Full Name: ").title()
if any(i.isdigit() for i in First_member):
    print("Invalid Entry")
else:
There can only be 5 members in each team.
This is how the data is saved
combine = '\''+teamname+'\' : \''+First_member+'\', \''+Second_member+'\', \''+Third_member+'\', \''+Forth_member+'\', \''+Fifth_member+'\'',
myfile = open("teams.txt", "a+")
myfile.writelines(combine)
myfile.writelines('\n')
myfile.close()
Now, if I want to remove a team, how do I do that?
Apologies if you feel like I am wasting your time, but thanks for stopping by anyway.
If you want to see everything, please check out this link:
https://repl.it/@DaxitMahendra/pythoncode

If you control how the text file is created, then you should simply write it in proper YAML format. Your format is already fairly similar to YAML.
Once you have YAML you can use PyYAML: https://pyyaml.org/wiki/PyYAMLDocumentation
The answer to "YAML parsing and Python?" has an example of the format and shows how to parse it as well.

As a quick fix, change the format of your "combine" variable to a valid dict literal; then you can parse it with the "ast" module. Here is an example:
# placeholder items
teamname, First_member, Second_member, Third_member, Forth_member, Fifth_member = [str(temp_name).zfill(3) for temp_name in range(6)]
# add "{" and "}" at the start/end and put your values into a list: "key": ["value1", "value2", ...]
combine = '{\''+teamname+'\' : [\''+First_member+'\', \''+Second_member+'\', \''+Third_member+'\', \''+Forth_member+'\', \''+Fifth_member+'\']}'
import ast
team = ast.literal_eval(combine)
print(list(team.keys()), type(team))
>> ['000'] <class 'dict'>
For your case,
# change the "combine" variable
combine = '{\''+teamname+'\' : [\''+First_member+'\', \''+Second_member+'\', \''+Third_member+'\', \''+Forth_member+'\', \''+Fifth_member+'\']}'
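Once every line of teams.txt is a dict literal like the one above, removing a team can be sketched as "read all lines, drop the matching one, write the rest back". The helper name remove_team and the sample teams are assumptions for the illustration, not part of the original program.

```python
import ast
import os
import tempfile


def remove_team(path, team_to_remove):
    """Rewrite the file without the line whose dict has team_to_remove as key."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]

    # Each line is expected to be a dict literal such as {'Red': ['Ann', 'Bo']}.
    kept = [line for line in lines if team_to_remove not in ast.literal_eval(line)]

    with open(path, "w") as f:
        for line in kept:
            f.write(line + "\n")
    return len(lines) - len(kept)  # number of teams removed


# Demo with made-up teams in a temporary file.
path = os.path.join(tempfile.mkdtemp(), "teams.txt")
with open(path, "w") as f:
    f.write("{'Red': ['Ann', 'Bo', 'Cy', 'Di', 'Ed']}\n")
    f.write("{'Blue': ['Fay', 'Gus', 'Hal', 'Ivy', 'Jo']}\n")

removed = remove_team(path, "Red")
with open(path) as f:
    remaining = f.read()
print(removed, remaining)
```

Rewriting the whole file is fine for a handful of teams; it would only become a concern for very large files.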

Comparing two documents and writing output to a third [Python?]

I am looking for advice, possibly in the form of a script (Python?), on how to do the following.
I basically have two documents, taken from a DB:
document one contains :
hash / related username.
example:
fb4aa888c283428482370 username1
fb4aa888c283328862370 username2
fb4aa888c283422482370 username3
fb4aa885djsjsfjsdf370 username4
fb4aa888c283466662370 username5
document two contains:
hash : plaintext
example:
fb4aa888c283428482370:plaintext
fb4aa888c283328862370:plaintext2
fb4aa888c283422482370:plaintext4
fb4aa885djsjsfjsdf370:plaintextetc
fb4aa888c283466662370:plaintextetc
Can anyone think of an easy way for me to match up the hashes in document two with the relevant username from document one, write the result into a new document (say, document three), and add the plaintext, so that it looks like the following...
Hash : Relevant Username : plaintext
This would save me a lot of time having to cross reference two files, find the relevant hash manually and the user it belongs to.
I've never actually used python before, so some examples would be great!
Thanks in advance

I don't have any code for you but a very basic way to do this would be to whip up a script that does the following:
Read the first doc into a dictionary with the hashes as keys.
Read the second doc into a dictionary with the hashes as keys.
Iterate through both dictionaries, by key, in the same loop, writing out the info you want into the third doc.
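A rough sketch of those three steps, using the "Hash : Relevant Username : plaintext" layout from the question. The two documents are inlined as strings here to keep the example self-contained; with real files you would iterate over the open file objects instead.

```python
doc_one = """\
fb4aa888c283428482370 username1
fb4aa888c283328862370 username2
"""
doc_two = """\
fb4aa888c283428482370:plaintext
fb4aa888c283328862370:plaintext2
"""

# Step 1: hash -> username (fields separated by whitespace).
usernames = {}
for line in doc_one.splitlines():
    if line.strip():
        hash_value, user = line.split(None, 1)
        usernames[hash_value] = user.strip()

# Step 2: hash -> plaintext (split on the first colon only, in case the
# plaintext itself contains one).
plaintexts = {}
for line in doc_two.splitlines():
    if line.strip():
        hash_value, text = line.split(":", 1)
        plaintexts[hash_value] = text.strip()

# Step 3: combine entries whose hash appears in both documents.
combined = [
    "{} : {} : {}".format(h, usernames[h], plaintexts[h])
    for h in usernames
    if h in plaintexts
]
print("\n".join(combined))
```

Hashes that appear in only one of the two documents are silently skipped in this sketch; whether that is the right behaviour depends on your data.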

You didn't really specify how you wanted the output, but this should get you close enough to modify to your liking. There are guys out there good enough to shorten this into a few lines of code, but I think the readability of keeping it long may be helpful to you since you are just getting started.
Btw, I would probably avoid this altogether and do the join in SQL before creating the file, but that wasn't really your question : )
usernames = dict()
plaintext = dict()
result = dict()
with open('username.txt') as un:
    for line in un:
        arry = line.split() # Turns the line into a list of two parts
        hash_value, user = arry[0], arry[1]
        usernames[hash_value] = user # add to dictionary
with open('plaintext.txt') as pt:
    for line in pt:
        arry = line.split(':', 1) # split on the first colon only
        hash_value, txt = arry[0], arry[1]
        plaintext[hash_value] = txt.strip()
for key, val in usernames.items():
    if key in plaintext: # skip hashes that have no plaintext
        result[val] = plaintext[key]
with open("dict.txt", "w") as w:
    for name, txt in result.items():
        w.write('{0} = {1}\n'.format(name, txt))
print(usernames) #{'fb4aa888c283466662370': 'username5', 'fb4aa888c283422482370': 'username3' ...................
print(plaintext) #{'fb4aa888c283466662370': 'plaintextetc', 'fb4aa888c283422482370': 'plaintext4' ................
print(result) #{'username1': 'plaintext', 'username3': 'plaintext4', .....................

Python: Transform a string into a variable and its value

I am relatively new to Python.
The program I am writing reads an XML file line by line using a while loop. The data read is split, so the information I get is something like:
datas = ['Name="Date"', 'Tag="0x03442333"', 'Level="Acquisition"', 'Type="String"']
Inside my program, I want to assign the information after the = sign in each of these strings to a variable named exactly like the word before the = sign. I will then pass them as attributes to a class (this already works).
What I have done so far is:
Name = ''
Tag = ''
Level = ''
Type = ''
for i in datas:
    exec(i)
It works fine that way. However, I do not want to use the exec function. Is there any other way of doing that?
Thank you

exec is generally the way to go about this. You could also add it to the globals() dictionary directly, but this would be slightly dangerous sometimes.
for pair in datas:
    name, value = pair.split("=")
    globals()[name] = eval(value)

You are right that you should avoid exec for security reasons, and you should probably keep the field values in a dict or similar structure. It's better to let a Python library do the whole parsing. For example, using ElementTree:
import xml.etree.ElementTree as ET
tree = ET.parse('myfile.xml')
root = tree.getroot()
and then iterate over root and its children, depending on what exactly your XML data looks like.
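Since the question does not show the XML itself, the following is only a sketch under the assumption that each record is an element carrying Name, Tag, Level and Type as attributes. The element names Fields and Field are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: the real file's element names are not shown in the
# question, so <Fields> and <Field> are assumptions.
xml_text = """
<Fields>
  <Field Name="Date" Tag="0x03442333" Level="Acquisition" Type="String"/>
  <Field Name="Time" Tag="0x03442334" Level="Acquisition" Type="String"/>
</Fields>
"""

root = ET.fromstring(xml_text)  # for a file: ET.parse('myfile.xml').getroot()

# element.attrib is already a dict of attribute names to string values.
records = [dict(field.attrib) for field in root.iter("Field")]
print(records[0]["Name"], records[0]["Tag"])
```

With the attributes already in a dict per element, there is no need for exec or for splitting strings on "=" by hand.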

Converting a string to a variable name, which is essentially what you are asking for, has been discussed elsewhere.
But what you should ideally do is create a dictionary. Like this:
d = {}
for i in datas:
    (key, value) = i.split("=")
    d[key] = eval(value)
NOTE: You should still avoid using eval.
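One way to follow that note is the standard library's ast.literal_eval, which only accepts Python literals. This is a sketch using the sample list from the question.

```python
import ast

datas = ['Name="Date"', 'Tag="0x03442333"', 'Level="Acquisition"', 'Type="String"']

d = {}
for item in datas:
    # Split on the first "=" only, in case the value contains one.
    key, value = item.split("=", 1)
    # literal_eval accepts only Python literals (strings, numbers, lists, ...),
    # so '"Date"' becomes 'Date', while anything else raises an exception
    # instead of being executed.
    d[key] = ast.literal_eval(value)
print(d)
```

If the values are always quoted strings, value.strip('"') would do the same job without any evaluation at all.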

As Pelle Nilsson says, you should use a proper XML parser for this. However, if the data is simple and the format of your XML file is stable, you can do it by hand.
Do you have a particular reason to put this data into a class? A dictionary may be all you need:
datas = ['Name="Date"', 'Tag="0x03442333"', 'Level="Acquisition"', 'Type="String"']
datadict = {}
for s in datas:
    key, val = s.split('=')
    datadict[key] = val.strip('"')
print(datadict)
output
{'Tag': '0x03442333', 'Type': 'String', 'Name': 'Date', 'Level': 'Acquisition'}
You can pass such a dictionary to a class, if you want:
class MyClass(object):
    def __init__(self, data):
        self.__dict__ = data
    def __repr__(self):
        s = ', '.join('{0}={1!r}'.format(k,v) for k, v in self.__dict__.items())
        return 'Myclass({0})'.format(s)
a = MyClass(datadict)
print(a)
print(a.Name, a.Tag)
output
Myclass(Tag='0x03442333', Type='String', Name='Date', Level='Acquisition')
Date 0x03442333
All of the code in this answer should work correctly on any recent version of Python 2 as well as on Python 3, although if you run it on Python 2 you should put
from __future__ import print_function
at the top of the script.

Python insert YAML in MongoDB

Folks,
Having a problem with inserting the following yaml document into MongoDB:
does not work:
---
URLs:
  - "http://www.yahoo.com":
      intensity: 5
      port: 80
works:
---
URLs:
  - "foo":
      intensity: 5
      port: 80
The only difference is the url. Here is the python code:
stream = open(options.filename, 'r')
yamlData = yaml.load(stream)
jsonData = json.dumps(yamlData)
io = StringIO(jsonData)
me = json.load(io)
... calling classes, etc, then
self.appCollection.insert(me)
err:
bson.errors.InvalidDocument: key 'http://yahoo.com' must not contain '.'
So, what is the correct way to transform this YML file? :)
Thanks!

You cannot use "." in field names (i.e. keys). If you must, replace occurrences of "." with its full-width Unicode equivalent, "\uff0E".
Hope this helps.

As the error says, the problem is in your key. MongoDB uses the dot to address keys in nested documents, so a key cannot itself contain a dot.
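If you go with the replacement suggested above, the keys have to be rewritten at every nesting level before the insert. A minimal sketch in plain Python, where the helper name sanitize_keys is made up for the example:

```python
def sanitize_keys(value, replacement="\uff0e"):
    """Return a copy of value with '.' in dict keys replaced."""
    if isinstance(value, dict):
        return {
            str(key).replace(".", replacement): sanitize_keys(item, replacement)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize_keys(item, replacement) for item in value]
    return value  # scalars are returned unchanged


# The structure that the failing YAML document loads into.
doc = {"URLs": [{"http://www.yahoo.com": {"intensity": 5, "port": 80}}]}
clean = sanitize_keys(doc)
print(clean)
```

An alternative that avoids the substitution altogether is to store the URL as a value, for example under a key such as "url" (a hypothetical field name), next to intensity and port.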
