Named Structured Data Storage in Python

I am writing an application that calculates various pandas DataFrames over various time periods. Each of these DataFrames has additional data that needs to be stored with it.
I can quite easily define a structure using lists or dicts to carry the data, but I would prefer something more clearly structured.
I have tried namedtuples. They are great because they simplify the syntax for accessing the information a lot. The problem with tuples is, of course, that they are immutable.
I have gotten around this either by doing all the calculations ahead of time and living with not being able to change them (without jumping through a few hoops), or with the following code:
import pandas
from collections import namedtuple
m = namedtuple("Month", 'df StartDate EndDate DaysInMonth')
m.Month = 2
m.df = pandas.DataFrame()
etc....
This seems to work, but I am actually misusing the namedtuple class: m in the above code is a type, not an instance. Although it works and I can now assign to it, I will probably run into problems later on.
type(m)
>>> type
Any suggestions on whether I could carry on with this structure, or whether I should rather create my own class for the data structure?

Setting m.Month to 2 works because of something all ordinary classes and their instances can do: they store their attributes in a dictionary, so you can attach new ones at any time.
class Month():
    pass
a = Month()
a.df = 2
This works without doing anything special. If you look inside a's __dict__ attribute
print(a.__dict__)
you'll see the following:
{'df': 2}
I would probably use the empty class instead of the namedtuple if you want to change the values at a later time. All the namedtuple machinery in the background gets you nothing for your use case.
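If you want the empty-class approach but with the field names declared up front, one possible sketch uses the standard library's dataclasses module (Python 3.7 or later, which is an assumption about your environment). The field names come from the question; the dates and the list of rows standing in for the DataFrame are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any


@dataclass
class Month:
    """Mutable record for one period; field names follow the question."""
    df: Any              # would hold the pandas DataFrame; typed loosely here
    StartDate: date
    EndDate: date
    DaysInMonth: int


# A list of rows stands in for the DataFrame so the sketch has no dependencies.
m = Month(df=[[1, 2], [3, 4]],
          StartDate=date(2016, 2, 1),
          EndDate=date(2016, 2, 28),
          DaysInMonth=28)

# Unlike a namedtuple instance, the fields can be reassigned later.
m.EndDate = date(2016, 2, 29)
m.DaysInMonth = 29
print(m.DaysInMonth, type(m).__name__)
```

Here m is an instance rather than a type, so each period gets its own independent record.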

Related

How to use %s to name a list

I'm trying to create a system in Python that lets me create a list called (user)total, where 'user' is the name of the user. The name cannot be known in advance, because an account with any username could be created within my program.
I have tried to use
%stotal = [''] %user
However, this raises a syntax error. How can I do this?

You can't do that kind of metaprogramming in Python, at least not with the syntax you posted.
Instead, you can create a dictionary of lists indexed by the user name:
total = {}
total['username1'] = ['']  # list for this username's total
total['username2'] = ['']
etc.
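As a rough sketch of the dictionary approach when usernames are only known at runtime, collections.defaultdict can create each user's list on first use. The usernames, the values, and the add_score helper are made up for illustration.

```python
from collections import defaultdict

# One list per user, created automatically the first time a name is used.
totals = defaultdict(list)


def add_score(user, value):
    """Append a value to the list that belongs to `user` (hypothetical helper)."""
    totals[user].append(value)


add_score("alice", 10)   # "alice" and "bob" are invented usernames
add_score("bob", 7)
add_score("alice", 5)

print(totals["alice"], sum(totals["alice"]))
print(sorted(totals))
```

The username is ordinary data here, so no variable names have to be generated.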

It is possible. Hopefully seeing how will help to illustrate why, as Hyperboreus says, it's not a good idea.
If you call dir() in an interactive Python session, you'll get a list of the names that are available in your current scope. There will always be one called __builtins__, which exposes all of the functions and constants in the builtins module. These are the names Python falls back on when a lookup finds nothing in the local or global namespace, so they are available right from the start of your session even though the individual functions are not listed by globals().
In accordance with the Python data model, most Python objects, including modules, have an attribute named __dict__ that's a dictionary whose keys are attribute names. If obj is such an object in the current scope, obj.__dict__["keyname"] will access the same attribute that you could get to more simply through obj.keyname.
So putting this together, in an interactive CPython session (where __builtins__ is the builtins module itself) you can set key/value pairs in __builtins__.__dict__ directly:
>>> __builtins__.__dict__["testvarname"] = "testval"
>>> print(testvarname)
testval
Whew! Getting pretty abstract pretty quick here. This might be useful for defining behavior based on user input or something else that you might not know until runtime... but you can probably see how you're working through a lot of complexity to get there and sort of circumventing the normal rules that Python sets out to try to help you keep your programs organized and easy to understand. xndrme's answer is likely to be the more straightforward way to solve the bigger problem you're facing.

How to change a variable in two instances of the same imported module

frmEnv = __import__(conf)
frmEnv.SCHEMA='abc'
toEnv = __import__(conf)
toEnv.SCHEMA='def'
print(frmEnv.SCHEMA, toEnv.SCHEMA)
Output:
('def', 'def')
I want the two values to be different.
Is there a way to make a variable's value unchangeable (constant or static)?
I don't want the value of frmEnv.SCHEMA to ever change once a value has been assigned to it.

What you're trying to do won't work, as other people have explained. But if I try to guess what you have in mind, maybe what you want is a copy of the first module?
You might then try the copy module and change the copy of your object:
import copy
frmEnv = __import__(conf)
frmEnv.SCHEMA='abc'
toEnv = copy.deepcopy(frmEnv) # or copy.copy() depending on what are the members of frmEnv...
toEnv.SCHEMA='def'
print(frmEnv.SCHEMA, toEnv.SCHEMA)
Output:
('abc', 'def')
Be aware that in Python 3 copy.copy() and copy.deepcopy() raise a TypeError when given a module object, so you should only expect this output if frmEnv is an ordinary object holding the settings rather than the module itself.
You may also want to load the module from its file name using the imp module (importlib in current Python versions) and give it two different names in the current environment, so that it is actually loaded twice. This should have the same effect as a copy, but it depends much more on where the files sit in the filesystem and is therefore a lot less elegant (which is why I'm not giving an example). It would also be a lot harder for the reader to understand why you're doing that.
HTH

A Python module is only imported once; the second import just returns the already initialized module, and its code is not executed a second time. This means that your frmEnv and toEnv are two references to the same object.
If you explain what (concrete task) you're trying to accomplish, someone can tell you how to do it. This is not the way.
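A small sketch can show both the caching behaviour and one possible way to get two independent sets of settings. The conf module is simulated with types.ModuleType so the example is self-contained, and the snapshot helper is a hypothetical name, not something the asker's code contains.

```python
import sys
import types

# Stand-in for the asker's configuration module (hypothetical contents).
conf = types.ModuleType("conf")
conf.SCHEMA = "default"
sys.modules["conf"] = conf  # register it so that __import__ can find it

first = __import__("conf")
second = __import__("conf")
same_object = first is second  # the import system returns the cached module


def snapshot(module):
    """Copy a module's public attributes into an independent namespace."""
    public = {k: v for k, v in vars(module).items() if not k.startswith("_")}
    return types.SimpleNamespace(**public)


frm_env = snapshot(first)
to_env = snapshot(first)
frm_env.SCHEMA = "abc"
to_env.SCHEMA = "def"
print(same_object, frm_env.SCHEMA, to_env.SCHEMA)
```

The snapshot is shallow: rebinding an attribute on one namespace leaves the other untouched, but mutable values such as lists would still be shared between them.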

Referencing Python list elements without specifying indexes

How can I store values in a list without specifying index numbers?
For example:
outcomeHornFive = 5
someList = []
someList.append(outcomeHornFive)
Instead of doing this,
someList[0] # to reference horn five outcome
how can I do something like the following? There are many items that I need to reference within the list, and I find it really inconvenient to keep track of which index is which.
someList.hornFive

You can use another data structure if you'd like to reference things by attribute access (or otherwise via a name).
You can put them in a dict, or create a class, or do something else. It depends what kind of other interaction you want to have with that object.
(P.S., we call those lists, not arrays).

Instead of using a list, you can use a dictionary.
See the data types section of the Python documentation.
A dictionary allows you to look up a value using a key:
my_dict = {}
my_dict["HornFive"] = 20

You cannot and you shouldn't. If you could do that, how would you refer to the list itself? And you will need to refer to the list itself.
"The reason is there are many items that I need to reference within the list and I just think it's really inconvenient to keep track of which index is what."
You'll need to do something of that ilk anyway, no matter how you organize your data. If you had separate variables, you'd need to know which variable stores what. If you had your way with this, you'd still need to know that a bare someList refers to "horn five" and not to, say, "horn six".
One advantage of lists and dicts is that you can factor out this knowledge and write generic code. A dictionary, or even a custom class (if there is a finite number of semantically distinct attributes, and you'd never have to use it as a collection), may help with the readability by giving it an actual name instead of a numeric index.
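To illustrate the point about named access combined with generic code, here is a minimal sketch using a dictionary. The outcome names and values are invented, loosely following the naming in the question.

```python
# Outcome names and values are invented for illustration.
outcomes = {
    "hornFive": 5,
    "hornSix": 6,
    "hornSeven": 7,
}

# Named access replaces positional access such as someList[0].
horn_five = outcomes["hornFive"]

# Generic code can still treat the values as a collection.
total = sum(outcomes.values())
largest = max(outcomes, key=outcomes.get)

print(horn_five, total, largest)
```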

referenced from http://parand.com/say/index.php/2008/10/13/access-python-dictionary-keys-as-properties/
Say you want to access the values of your dictionary via dot notation instead of the dictionary syntax. That is, you have:
d = {'name':'Joe', 'mood':'grumpy'}
And you want to get at “name” and “mood” via
d.name
d.mood
instead of the usual
d['name']
d['mood']
Why would you want to do this? Maybe you’re fond of the Javascript Way. Or you find it more aesthetic. In my case I need to have the same piece of code deal with items that are either instances of Django models or plain dictionaries, so I need to provide a uniform way of getting at the attributes.
Turns out it’s pretty simple:
class DictObj(object):
    def __init__(self, d):
        self.d = d
    def __getattr__(self, m):
        # Only called when normal attribute lookup fails; missing keys give None.
        return self.d.get(m, None)
d = DictObj(d)
d.name
# gives 'Joe'
d.mood
# gives 'grumpy'

Persistent references in Python

I would like my program to store data for later use. So far, no problem: there are many ways of doing this in Python.
Things get a little more complicated because I want to keep references between instances. If a list X is a list Y (they have the same ID, so modifying one modifies the other), this should still be true the next time I load the data (in another session of the program, which has stopped in the meantime).
I know one solution: the pickle module keeps track of references and will remember that my X and Y lists are exactly the same object (not only their contents, but their references).
Still, the problem with pickle is that it only works if you dump all the data into a single file, which is not really clever if you have a large amount of data.
Do you know another way to handle this problem?

The simplest thing to do is probably to wrap up all the state you wish to save in a dictionary (keyed by variable name, perhaps, or some other unique but predictable identifier), then pickle and unpickle that dictionary. The objects within the dictionary will share references with one another, as you want:
>>> import pickle
>>> class X(object):
...     # just some object to be pickled
...     pass
...
>>> l1 = [X(), X(), X()]
>>> l2 = [l1[0], X(), l1[2]]
>>> state = {'l1': l1, 'l2': l2}
>>> saved = pickle.dumps(state)
>>> restored = pickle.loads(saved)
>>> restored['l1'][0] is restored['l2'][0]
True
>>> restored['l1'][1] is restored['l2'][1]
False

I would recommend using shelve over pickle. It offers higher-level functionality and is simpler to use.
http://docs.python.org/library/shelve.html
If you have performance issues because you manipulate very large amounts of data, you may try other libraries like PyTables:
http://www.pytables.org/moin
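One thing worth checking before relying on shelve for this question: each value is pickled separately under its key, so shared references survive only within a single stored value. The sketch below demonstrates that behaviour with a temporary directory; the file name and key names are arbitrary choices for illustration.

```python
import os
import shelve
import tempfile

shared = [1, 2, 3]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state")  # arbitrary file name for the shelf

    with shelve.open(path) as db:
        # Stored together under one key: pickled as a single object graph.
        db["together"] = {"x": shared, "y": shared}
        # Stored under two keys: each value is pickled on its own.
        db["x"] = shared
        db["y"] = shared

    with shelve.open(path) as db:
        together = db["together"]
        same_within_key = together["x"] is together["y"]
        same_across_keys = db["x"] is db["y"]

print(same_within_key, same_across_keys)
```

So objects that must stay identical after reloading need to be stored under the same key.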

ZODB was developed to store persistent Python objects and all their references. Just inherit your class from Persistent and have fun. http://www.zodb.org/

GetAttr Function Problems (Python 3)

I have the following in a Python script:
setattr(stringRESULTS, "b", b)
Which gives me the following error:
AttributeError: 'str' object has no attribute 'b'
Can anyone tell me what the problem is here?

Don't do this. To quote the inestimable Greg Hewgill,
"If you ever find yourself using quoted names to refer to variables,
there's usually a better way to do whatever you're trying to do."
[Here you're one level up and using a string variable for the name, but it's the same underlying issue.] Or as S. Lott followed up with in the same thread:
"90% of the time, you should be using a dictionary. The other 10% of
the time, you need to stop what you're doing entirely."
If you're using the contents of stringRESULTS as a pointer to some object fred which you want to setattr, then these objects you want to target must already exist somewhere, and a dictionary is the natural data structure to store them. In fact, depending on your use case, you might be able to use dictionary key/value pairs instead of attributes in the first place.
In other words, my version of what (I'm guessing) you're trying to do would probably look like
d[stringRESULTS].b = b
or
d[stringRESULTS]["b"] = b
depending on whether I needed to work with an object instance or whether a dictionary would suffice.
(P.S. relatively few people subscribe to the python-3.x tag. You'll usually get more attention by adding the bare 'python' tag as well.)
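The dictionary-based approach from the answer above might look like the following sketch. The Result class, the key "fred", and the values are hypothetical; only the names stringRESULTS and b come from the question.

```python
class Result:
    """Plain object; instances accept arbitrary attributes."""


# The target objects live in a dictionary, keyed by name.
results = {"fred": Result()}

stringRESULTS = "fred"   # the string selects which object to modify
b = 42                   # placeholder value

# Attribute stored on the selected object.
results[stringRESULTS].b = b
# Equivalent, with the attribute name also given as a string.
setattr(results[stringRESULTS], "c", b + 1)

# Dictionary-only variant: nested dicts instead of attributes.
plain = {"fred": {}}
plain[stringRESULTS]["b"] = b

print(results["fred"].b, results["fred"].c, plain["fred"])
```

setattr works here because it is applied to the Result instance, not to the string that names it.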

Since str is a built-in type whose instances have no __dict__, you can't set arbitrary attributes on it. You probably need either a dict or a subclass of str:
class StringResult(str):
    pass
which should behave as you expect:
my_string_result = StringResult("spam_and_eggs")
my_string_result.b = b
EDIT:
If you're trying to do what DSM suggests, i.e. modify an attribute on a variable whose name is the value of the stringRESULTS variable, then this should do the trick:
locals()[stringRESULTS].b = b
Please note that this is an extremely dangerous operation and can wreak all kinds of havoc on your app if you aren't careful.
