I have a number of chemicals, with corresponding data, held within a database. How do I go about returning a specific chemical and its data via its formula, e.g. o2?
class SourceNotDefinedException(Exception):
    def __init__(self, message):
        super(SourceNotDefinedException, self).__init__(message)
class TVorEchoObject(object):
    """The class stores a pair of objects, "tv" objects, and "echo" objects. They are accessed
    simply by doing .tv, or .echo. If it does not exist, it will fall back to the other variable.
    If neither are present, it returns None."""
    def __init__(self, echo=None, tv=None):
        self.tv = tv
        self.echo = echo
    def __repr__(self):
        return str({"echo": self.echo, "tv": self.tv}) # Returns the respective strings
    def __getattribute__(self, item):
        """Altered __getattribute__() function to return the alternative of .echo / .tv if the requested
        attribute is None."""
        if item in ["echo", "tv"]:
            if object.__getattribute__(self,"echo") is None: # Echo data not present
                return object.__getattribute__(self,"tv") # Select TV data
            elif object.__getattribute__(self,"tv") is None: # TV data not present
                return object.__getattribute__(self,"echo") # Select Echo data
            else:
                return object.__getattribute__(self,item) # Return all data
        else:
            return object.__getattribute__(self,item) # Return all data
class Chemical(object):
    def __init__(self, inputLine, sourceType=None):
        self.chemicalName = TVorEchoObject()
        self.mass = TVorEchoObject()
        self.charge = TVorEchoObject()
        self.readIn(inputLine, sourceType=sourceType)
    def readIn(self, inputLine, sourceType=None):
        if sourceType.lower() == "echo": # Parsed chemical line for Echo format
            chemicalName = inputLine.split(":")[0].strip()
            mass = inputLine.split(":")[1].split(";")[0].strip()
            charge = inputLine.split(";")[1].split("]")[0].strip()
            # Store the objects
            self.chemicalName.echo = chemicalName
            self.mass.echo = mass
            self.charge.echo = charge
        elif sourceType.lower() == "tv": # Parsed chemical line for TV format
            chemicalName = inputLine.split(":")[0].strip()
            charge = inputLine.split(":")[1].split(";")[0].strip()
            mass = inputLine.split(";")[1].split("&")[0].strip()
            # Store the objects
            self.chemicalName.tv = chemicalName
            self.charge.tv = charge
            self.mass.tv = mass
        else:
            raise SourceNotDefinedException(sourceType + " is not a valid `sourceType`") # Reject unknown source types
    def toDict(self, priority="echo"):
        """Returns a dictionary of all the variables, in the form {"mass":<>, "charge":<>, ...}.
        Design used is to be passed into the Echo and TV style line format statements."""
        if priority in ["echo", "tv"]:
            # Build the dictionary with a comprehension, to avoid repeated text
            return dict([(attributeName, self.__getattribute__(attributeName).__getattribute__(priority))
                         for attributeName in ["chemicalName", "mass", "charge"]])
        else:
            raise SourceNotDefinedException("{0} source type not recognised.".format(priority)) # Reject unknown source types
from ParseClasses import Chemical
allChemical = []
chemicalFiles = ("/home/temp.txt")
for fileName in chemicalFiles:
    with open(fileName) as sourceFile:
        for line in sourceFile:
            allChemical.append(Chemical(line, sourceType=sourceType))
for chemical in allChemical:
    print chemical.chemicalName #Prints all chemicals and their data in list format
for chemical in allChemical(["o2"]):
    print chemical.chemicalName
The last loop outputs the following error, which I have tried to remedy with no luck:
TypeError: 'list' object is not callable
The issue is the two lines
for chemical in allChemical(["o2"]):
    print chemical.chemicalName
allChemical is a list, and you can't call a list like a function with a_list(). It looks like you're trying to find either ['o2'] or just 'o2' in the list. To do that, you can get the index of the item and then use that index to retrieve it from the list. Note that index() only finds an item that compares equal to "o2".
allChemical[allChemical.index("o2")]
Try this function:
def chemByString(chemName,chemicals,priority="echo"):
    # Return the first chemical whose name matches chemName, or None
    for chemical in chemicals:
        chemDict = chemical.toDict(priority)
        if chemDict["chemicalName"] == chemName:
            return chemical
    return None
This function is using the toDict() method found in the Chemical class. The code you pasted from the Chemical class explains that this method returns a dictionary from the chemical object:
def toDict(self, priority="echo"):
    """Returns a dictionary of all the variables, in the form {"mass":<>, "charge":<>, ...}.
    Design used is to be passed into the Echo and TV style line format statements."""
    if priority in ["echo", "tv"]:
        # Build the dictionary with a comprehension, to avoid repeated text
        return dict([(attributeName, self.__getattribute__(attributeName).__getattribute__(priority))
                     for attributeName in ["chemicalName", "mass", "charge"]])
    else:
        raise SourceNotDefinedException("{0} source type not recognised.".format(priority)) # Reject unknown source types
This dictionary looks like this:
"chemicalName" : <the chemical name>
"mass" : <the mass>
"charge" : <the charge>
What the function above does is iterate through all of the chemicals in the list, find the first one with a name equal to "o2", and return that chemical. Here's how to use it:
chemByString("o2",allChemical).chemicalName
If the above does not work, you may want to try using the alternative priority ("tv"), though I'm unsure if this will have any effect:
chemByString("o2",allChemical,"tv").chemicalName
If the chemical isn't found, the function returns None, so check the result before accessing .chemicalName:
chemByString("myPretendChemical",allChemical) # returns None
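If you need many lookups, one option is to build a dictionary keyed by name once, instead of scanning the list each time. The sketch below is only an illustration: SimpleChemical is a hypothetical stand-in for the Chemical class, with plain string attributes instead of TVorEchoObject values, and the sample masses and charges are made up.

```python
class SimpleChemical(object):
    """Hypothetical stand-in for Chemical, holding plain strings."""
    def __init__(self, chemicalName, mass, charge):
        self.chemicalName = chemicalName
        self.mass = mass
        self.charge = charge

# Made-up sample data, for illustration only.
allChemical = [
    SimpleChemical("o2", "32.0", "0"),
    SimpleChemical("n2", "28.0", "0"),
]

# Build the index once; each later lookup is a single dictionary access.
chemicalsByName = {chem.chemicalName: chem for chem in allChemical}

oxygen = chemicalsByName.get("o2")         # the matching object
missing = chemicalsByName.get("unknown")   # None when there is no match
if oxygen is not None:
    print(oxygen.chemicalName, oxygen.mass, oxygen.charge)
```

Using .get() rather than [] keeps the "not found" case explicit, mirroring the None returned by chemByString.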
EDIT: See my new answer. Leaving this one here since it might still be helpful info.
In python, a list object is a structure holding other objects with an index for each object it contains. Like this:
Index Object
0 "hello"
1 "world"
2 "spam"
If you want to get to one of those objects, you have to know its index:
objList[0] #returns "hello" string object
If you don't know the index, you can find it using the index method:
objList.index("hello") #returns 0
Then you can get the object out of the list using the found index:
objList[objList.index("hello")]
However this is kind of silly, since you can just do:
"hello"
Which in this case will produce the same result.
Your allChemical object is a list. It looks like the line chemicalFiles = ("/home/temp.txt") is filling your list with some type of object. In order to answer your question, you have to provide more information about the objects which the list contains. I assume that information is in the ParseClasses module you are using.
If you can provide more information about the Chemical object you are importing, that may go a long way to helping solve your problem.
IF the objects contained in your list are subclassed from str, this MAY work:
allChemical[allChemical.index("o2")].chemicalName
"o2" is a str object, so index is going to look for a str object (or an object subclassed from str) in your list to find its index. However, if the object isn't a string, it will not find it.
As a learning exercise, try this:
class Chemical(str):
    '''A class which is a subclass of string but has additional attributes such as chemicalName'''
    def __init__(self,chemicalName):
        self.chemicalName = chemicalName
someChemicals = [Chemical('o2'),Chemical('n2'),Chemical('h2')]
for chemical in someChemicals:
    print(chemical.chemicalName)
#prints all the chemical names
print(someChemicals[0].chemicalName)
#prints "o2"; notice you have to know the index ahead of time
print(someChemicals[someChemicals.index("o2")].chemicalName)
#prints "o2" again; this time index found it for you, but
#you already knew the object ahead of time anyway, so it's a little silly
This works because index is able to find what you are looking for. If the object isn't a string, index can't find it. So if you don't know which index 'o2' is at and you want to get to a specific chemical in your list, you're going to have to learn more about those objects.
I am reading data from nested json with this code:
data = json.loads(json_file.json)
for nodesUni in data["data"]["queryUnits"]['nodes']:
    try:
        tm = (nodesUni['sql']['busData'][0]['engine']['engType'])
    except:
        tm = ''
    try:
        to = (nodesUni['sql']['carData'][0]['engineData']['producer']['engName'])
    except:
        to = ''
    json_output_for_one_GU_owner = {
        "EngineType": tm,
        "EngineName": to,
    }
I am running into a NoneType error. For example, nodesUni['sql']['busData'][0]['engine']['engType'] doesn't exist at all when there is no data, so I am using try/except. But my code is more complex, and having a try/except for every value is crazy. Is there any other way to deal with this?
Error: "TypeError: 'NoneType' object is not subscriptable"
This is non-trivial as your requirement is to traverse the dictionaries without errors, and get an empty string value in the end, all that in a very simple expression like cascading the [] operators.
First method
My approach is to add a hook when loading the json file, so it creates default dictionaries in an infinite way
import collections,json
def superdefaultdict():
    return collections.defaultdict(superdefaultdict)
def hook(s):
    c = superdefaultdict()
    c.update(s)
    return(c)
data = json.loads('{"foo":"bar"}',object_hook=hook)
print(data["x"][0]["zzz"]) # doesn't exist
print(data["foo"]) # exists
prints:
defaultdict(<function superdefaultdict at 0x000001ECEFA47160>, {})
bar
When accessing some combination of keys that doesn't exist (at any level), superdefaultdict recursively creates a defaultdict of itself (this is a nice pattern, you can read more about it in "Is there a standard class for an infinitely nested defaultdict?"), allowing any number of non-existing key levels.
Now the only drawback is that it returns a defaultdict(<function superdefaultdict at 0x000001ECEFA47160>, {}) which is ugly. So
print(data["x"][0]["zzz"] or "")
prints empty string if the dictionary is empty. That should suffice for your purpose.
Use like that in your context:
import collections,json
def superdefaultdict():
    return collections.defaultdict(superdefaultdict)
def hook(s):
    c = superdefaultdict()
    c.update(s)
    return(c)
data = json.loads(json_file.json,object_hook=hook)
for nodesUni in data["data"]["queryUnits"]['nodes']:
    tm = nodesUni['sql']['busData'][0]['engine']['engType'] or ""
    to = nodesUni['sql']['carData'][0]['engineData']['producer']['engName'] or ""
Drawbacks:
It creates a lot of empty dictionaries in your data object. Shouldn't be a problem (except if you're very low in memory) as the object isn't dumped to a file afterwards (where the non-existent values would appear)
If a value already exists, trying to access it as a dictionary crashes the program
Also, if some value is 0 or an empty list, the or operator will pick "". This can be worked around with another wrapper that tests whether the object is an empty superdefaultdict instead. Less elegant but doable.
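The wrapper mentioned in the last drawback could look like the sketch below. value_or_blank is a hypothetical helper name, not part of any library; it returns "" only when the value is an empty defaultdict, so 0 and [] pass through unchanged. One caveat with this approach: a JSON object that is genuinely empty ({}) is also converted to an empty defaultdict by the hook, so it is reported as "" too.

```python
import collections
import json

def superdefaultdict():
    return collections.defaultdict(superdefaultdict)

def hook(obj):
    c = superdefaultdict()
    c.update(obj)
    return c

def value_or_blank(value):
    """Hypothetical helper: blank out only empty auto-created dictionaries."""
    if isinstance(value, collections.defaultdict) and not value:
        return ""
    return value

data = json.loads('{"count": 0, "tags": []}', object_hook=hook)
print(value_or_blank(data["count"]))           # falsy, but a real value
print(value_or_blank(data["tags"]))            # falsy, but a real value
print(repr(value_or_blank(data["x"][0]["zzz"])))  # missing path
```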
Second method
Express the chain of successive lookups as a string (for instance, just double-quote your expression, like "['sql']['busData'][0]['engine']['engType']"), parse it, and loop over the keys to get the data. If there's an exception, stop and return an empty string.
import json,re
def get(key,data):
    # Split "['a'][0]" into ['a', 0]: quoted parts are dict keys, bare parts are list indices
    key_parts = [x.strip("'") if x.startswith("'") else int(x) for x in re.findall(r"\[([^\]]*)\]",key)]
    try:
        for k in key_parts:
            data = data[k]
        return data
    except (KeyError,IndexError,TypeError):
        return ""
testing with some simple data:
data = json.loads('{"foo":"bar","hello":{"a":12}}')
print(get("['sql']['busData'][0]['engine']['engType']",data))
print(get("['hello']['a']",data))
print(get("['hello']['a']['e']",data))
We get an empty string (some keys are missing), 12 (the path is valid), and an empty string (we tried to traverse an existing value that is not a dict).
The syntax could be simplified (e.g. "sql"."busData".0."engine"."engType") but would still have to retain a way to differentiate keys (strings) from indices (integers).
The second approach is probably the most flexible one.
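One way to keep keys and indices distinct without parsing a string is to pass them as separate arguments, so Python's own types (str vs int) carry the distinction. A minimal sketch, where get_path and its default parameter are hypothetical names chosen for illustration and the sample data is made up:

```python
def get_path(data, *keys, default=""):
    """Follow keys/indices one level at a time; return default if any step fails."""
    for key in keys:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            return default
    return data

# Made-up sample data, for illustration only.
sample = {"hello": {"a": 12}, "items": [{"n": 1}]}

print(get_path(sample, "hello", "a"))
print(get_path(sample, "items", 0, "n"))
print(repr(get_path(sample, "hello", "a", "e")))   # traverses a non-dict value
print(repr(get_path(sample, "items", 5, "n")))     # index out of range
```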
I want to:
Take a list of lists
Make a frequency table in a dictionary
Do things with the resulting dictionary
The class works, the code works, the frequency table is correct.
I want to get a class that returns a dictionary, but I actually get a class that returns a class type.
I can see that it has the right content in there, but I just can't get it out.
Can someone show me how to turn the output of the class to a dictionary type?
I am working with HN post data. Columns, a few thousand rows.
freq_pph = {}
freq_cph = {}
freq_uph = {}
# Creates a binned frequency table:
# - key is bin_minutes (size of bin in minutes).
# - value is freq_value which sums/counts the number of things in that column.
class BinFreq:
    def __init__(self, dataset, bin_minutes, freq_value, dict_name):
        self.dataset = dataset
        self.bin_minutes = bin_minutes
        self.freq_value = freq_value
        self.dict_name = dict_name
    def make_table(self):
        # Sets bin size
        # Counts the number of posts in that timedelta
        if (self.bin_minutes == 60) and (self.freq_value == "None"):
            for post in self.dataset:
                hour_dt = post[-1]
                hour_str = hour_dt.strftime("%H")
                if hour_str in self.dict_name:
                    self.dict_name[hour_str] += 1
                else:
                    self.dict_name[hour_str] = 1
        # Sets bin size
        # Sums the values of a given index/column
        if (self.bin_minutes == 60) and (self.freq_value != "None"):
            for post in self.dataset:
                hour_dt = post[-1]
                hour_str = hour_dt.strftime("%H")
                if hour_str in self.dict_name:
                    self.dict_name[hour_str] += int(row[self.freq_value])
                else:
                    self.dict_name[hour_str] = int(row[self.freq_value])
Instantiate:
pph = BinFreq(ask_posts, 60, "None", freq_pph)
pph.make_table()
How can pph be turned into a real dictionary?
If you want the make_table function to return a dictionary, then you have to add a return statement at the end of it, for example: return self.dict_name.
If you then want to use it outside of the class, you have to assign it to a variable, so in the second snippet, do: my_dict = pph.make_table().
Classes can't return things – functions in classes could. However, the function in your class doesn't; it just modifies self.dict_name (which is a misnomer; it's really just a reference to a dict, not a name (which one might imagine is a string)), which the caller then reads (or should, anyway).
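To illustrate the point about references, here is a small sketch with a hypothetical Tally class (not the BinFreq class from the question): the dictionary handed to the constructor is the very same object the caller holds, so the caller can read the results from it directly.

```python
class Tally:
    """Hypothetical example class that counts into a caller-supplied dict."""
    def __init__(self, target):
        self.target = target   # a reference to the caller's dict, not a copy

    def add(self, key):
        self.target[key] = self.target.get(key, 0) + 1

counts = {}
tally = Tally(counts)
tally.add("09")
tally.add("09")

print(counts)                  # the caller's dict was filled in place
print(tally.target is counts)  # same object, two names
```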
In addition, there seems to be a bug: the second if block (which is never reached in your example anyway) refers to row, an undefined name.
Anyway, your class doesn't need to be a class at all, and is easiest implemented with the built-in collections.Counter() class:
from collections import Counter
def bin_by_hour(dataset, value_key=None):
    counter = Counter()
    for post in dataset:
        hour = post[-1].hour # assuming it's a `datetime` object
        if value_key: # count using `post[value_key]`
            counter[hour] += post[value_key]
        else: # just count
            counter[hour] += 1
    return dict(counter.items()) # make the Counter a regular dict
freq_pph = bin_by_hour(ask_posts)
freq_cph = bin_by_hour(ask_posts, value_key="num_comments") # or whatever
I am new to Python and currently searching for some internship or a job. I am currently working on a program in Python which reads a file that contains data in this shape:
Id;name;surname;age;gender;friends;
Id and age are the positive integers,
gender can be "male" or "female",
and friends is an array of numbers, separated by comma, which represent the Id's of persons who are friends with the current person. If Person1 is a friend to a Person2, it must work vice versa.
As you can see in the above example, attributes of a "Person" are separated by semicolons, and the trick is that not every person has every attribute, and of course, they differ by the number of friends. So, the first part of the task is to make a program which reads a file and creates a structure which represents a list of persons with the attributes mentioned above. I have to be able to search for those persons by Id.
The second part is to make a function with two arguments (Id1, Id2) which returns True if a person with Id2 is a friend to a person with Id1. Otherwise, it returns False.
I have some ideas in mind, but I am not sure how to realize this, since I don't know enough about Python yet. I guess the best structure for this would be a dictionary, but I am not sure how to load a file into it, since the attributes of all persons are different. I would be grateful for any help you can offer me.
Here is my attempt to write the code:
people = open(r"data.txt")
class People:
    id = None
    name = ''
    surname = ''
    age = None
    gender = ['male', 'female']
    friends = []
    #def people(self):
    #    person = {'id': None,
    #              'name': '',
    #              'surname': '',
    #              'age': None,
    #              'gender': ['male', 'female'],
    #              'friends': []
    #              }
    #    return person
    def community(self):
        comm = [People()]
        return comm
def is_friend(id1, id2):
    if (id1 in People.friends) & (id2 in People.friends):
        return True
people.close()
Your question is too broad imho, but I'll give you a few hints:
The simplest data structure for O(1) key access is indeed a dict. Note that a dict needs hashable (typically immutable) values as keys (that's fine since your Ids are integers), but it can take anything as values. That only works for (relatively) small datasets, though, since it's all in memory. If you need bigger datasets and/or persistence, you want a database (key:value, relational, document, the choice is up to you).
Python has classes and computed attributes
In Python, the absence of a value is the None object
there's a csv files parser in the standard lib.
Now you just have to read the doc and start coding.
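To make the hints concrete, here is one possible sketch, not a definitive solution. It assumes the Id field is always present while the other fields may be empty; the sample lines, load_people and are_friends are all made up for illustration. It uses csv.reader with delimiter=";" and stores each person in a dict keyed by Id.

```python
import csv
import io

# Made-up sample in the "Id;name;surname;age;gender;friends;" layout.
SAMPLE = """1;John;Smith;30;male;2,3;
2;Jane;;25;female;1;
3;Bob;Jones;;male;1;
"""

def load_people(lines):
    """Return {id: {...}} with None standing in for missing attributes."""
    people = {}
    for row in csv.reader(lines, delimiter=";"):
        if not row:
            continue
        # Pad short rows and drop the empty field after the trailing ';'.
        row = (row + [""] * 6)[:6]
        id_, name, surname, age, gender, friends = row
        people[int(id_)] = {
            "name": name or None,
            "surname": surname or None,
            "age": int(age) if age else None,
            "gender": gender or None,
            "friends": {int(f) for f in friends.split(",") if f.strip()},
        }
    return people

def are_friends(people, id1, id2):
    """True if the person with id2 is listed as a friend of the person with id1."""
    return id1 in people and id2 in people[id1]["friends"]

people = load_people(io.StringIO(SAMPLE))
print(people[2])
print(are_friends(people, 1, 2))
```

With a real file, the same function could be called as load_people(open("data.txt", newline="")), since csv.reader accepts any iterable of lines.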
[edit] wrt/ your code snippet
class People:
    id = None
    name = ''
    surname = ''
    age = None
    gender = ['male', 'female']
    friends = []
Python is not Java or PHP. What you defined above are class attributes (shared by all instances of the class); you want instance attributes (defined in the __init__() method). You should really read the FineManual.
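The difference matters most for mutable values such as lists. A small sketch with two hypothetical classes (SharedFriends and OwnFriends are names invented for this illustration):

```python
class SharedFriends(object):
    friends = []            # class attribute: one list shared by every instance

class OwnFriends(object):
    def __init__(self):
        self.friends = []   # instance attribute: a new list per instance

a, b = SharedFriends(), SharedFriends()
a.friends.append(2)
print(b.friends)            # [2]: b sees a's friend, because the list is shared

c, d = OwnFriends(), OwnFriends()
c.friends.append(2)
print(d.friends)            # []: d has its own, still empty, list
```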
Also if you're using Python 2.7.x, you want your classes to inherit from object (historical reasons).
So your Person class should look something like this:
class Person(object):
    def __init__(self, id, name, surname, age, gender, friends=None):
        self.id = id
        self.name = name
        self.surname = surname
        self.age = age
        self.gender = gender
        self.friends = friends or []
And then to create a Person instance:
person = Person(42, "John Cleese", "Archie Leach", 77, "male", [11, 1337])
def is_friend(id1, id2):
    if (id1 in People.friends) & (id2 in People.friends):
        return True
A few points here:
First: you either want to rename this function are_friends or make it a method of the Person class and then only pass a (single) Person instance (not an 'id') as argument.
Second: in Python, & is the bitwise AND operator. The logical "and" operator is spelled, well, and.
Third: an expression has a truth value by itself, so your if statement is redundant. Whenever you see something like:
def func():
    if <some expression>:
        return True
    else:
        return False
you can just rewrite it as :
def func():
    return <some expression>
Or if you want to ensure func returns a proper boolean (True or False):
def func():
    return bool(<some expression>)
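Putting the three points together, a minimal sketch might look like this. The trimmed-down Person class and the is_friend_of method name are assumptions made for the illustration, and the sample people are made up.

```python
class Person(object):
    def __init__(self, id, name, friends=None):
        self.id = id
        self.name = name
        self.friends = friends or []

    def is_friend_of(self, other):
        # Logical `and`, returned directly; friendship must hold both ways.
        return other.id in self.friends and self.id in other.friends

alice = Person(1, "Alice", [2])
bob = Person(2, "Bob", [1])
carol = Person(3, "Carol", [1])   # lists Alice, but Alice does not list Carol

print(alice.is_friend_of(bob))
print(alice.is_friend_of(carol))
```

Because both operands are `in` tests, the expression already evaluates to True or False, so no bool() call is needed here.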
I'll stop here because I don't intend to teach you how to program. You obviously need to do at least the full official Python tutorial, and possibly some complete beginner tutorial too.
I'm loading data about phone calls into a list of namedtuples called 'records'. Each phone call has information on the length of the call in the variable 'call_duration'. However, some have the variable set to None. I would like to replace None with zero in all of the records, but the following code doesn't seem to work:
for r in records:
if r.call_duration is None:
r = r._replace(call_duration=0)
How can I replace the value in the list? I guess the problem is that the new 'r' isn't stored in the list. What would be the best way to capture the change in the list?
You can replace the old record by using its index in the records list. You can get that index using enumerate():
for i, rec in enumerate(records):
    if rec.call_duration is None:
        records[i] = rec._replace(call_duration=0)
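An alternative is to rebuild the list with a list comprehension, which avoids index bookkeeping. This is only a sketch: the Record namedtuple, its field names other than call_duration, and the sample values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical record type; only call_duration comes from the question.
Record = namedtuple("Record", ["number", "call_duration"])
records = [Record("555-0100", 42), Record("555-0101", None)]

# _replace returns a new tuple, so collect the results in a new list.
records = [
    rec._replace(call_duration=0) if rec.call_duration is None else rec
    for rec in records
]
print(records)
```

Note that this binds records to a new list; use the enumerate() loop if other code holds a reference to the original list and must see the change.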
I suggest you create your own class, it will benefit you in the future as far as object management goes. When you want to create methods later on for a record, you'll be able to easily do so in a class:
class Record:
    def __init__(self, number = None, length = None):
        self.number = number
        self.length = length
    def replace(self, **kwargs):
        self.__dict__.update(kwargs)
Now you can easily manage your records and replace object attributes as you deem necessary.
for r in records:
    if r.length is None:
        r.replace(length = 0)
Is there some easy way to access an object in a list, without using an index or iterating through the list?
In brief:
I'm reading in lines from a text file, splitting up the lines, and creating objects from the info. I do not know what information will be in the text file. So for example:
roomsfile.txt
0\bedroom\A bedroom with king size bed.\A door to the east.
1\kitchen\A modern kitchen with steel and chrome.\A door to the west.
2\familyRoom\A huge family room with a tv and couch.\A door to the south.
Some Python Code:
class Rooms:
    def __init__(self, roomNum, roomName, roomDesc, roomExits):
        self.roomNum = roomNum
        self.roomName = roomName
        self.roomDesc = roomDesc
        self.roomExits = roomExits
    def getRoomNum(self):
        return self.roomNum
    def getRoomName(self):
        return self.roomName
    def getRoomDesc(self):
        return self.roomDesc
    def getRoomExits(self):
        return self.roomExits
def roomSetup():
    roomsfile = "roomsfile.txt"
    infile = open(roomsfile, 'r')
    rooms = []
    for line in infile:
        rooms.append(makeRooms(line))
    infile.close()
    return rooms
def makeRooms(infoStr):
    # The backslash separator must be escaped inside a string literal
    roomNum, roomName, roomDesc, roomExits = infoStr.split("\\")
    return Rooms(roomNum, roomName, roomDesc, roomExits)
When I want to know what exits the bedroom has, I have to iterate through the list with something like the below (where "noun" is passed along by the user as "bedroom"):
def printRoomExits(rooms, noun):
    numRooms = len(rooms)
    for n in range(numRooms):
        checkRoom = rooms[n].getRoomName()
        if checkRoom == noun:
            print(rooms[n].getRoomExits())
        else:
            pass
This works, but it feels like I am missing some easier approach...especially since I have a piece of the puzzle (ie, "bedroom" in this case)...and especially since the rooms list could have thousands of objects in it.
I could create an assignment:
bedroom = makeRooms(0, bedroom, etc, etc)
and then do:
bedroom.getRoomExits()
but again, I won't know what info will be in the text file, and don't know what assignments to make. This StackOverFlow answer argues against "dynamically created variables", and argues in favor of using a dictionary. I tried this approach, but I could not find a way to access the methods (and thus the info) of the named objects I added to the dictionary.
So in sum: am I missing something dumb?
Thanks in advance! And sorry for the book-length post - I wanted to give enough details.
chris
At least one dictionary is the right answer here. The way you want to set it up is at least to index by name:
def roomSetup():
    roomsfile = "roomsfile.txt"
    infile = open(roomsfile, 'r')
    rooms = {}
    for line in infile:
        newroom = makeRooms(line)
        rooms[newroom.roomName] = newroom
    infile.close()
    return rooms
Then, given a name, you can access the Rooms instance directly:
exits = rooms['bedroom'].roomExits
There is a reason I'm not using your getRoomName and getRoomExits methods - getter and setter methods are unnecessary in Python. You can just track your instance data directly, and if you later need to change the implementation refactor them into properties. It gives you all the flexibility of getters and setters without needing the boilerplate code up front.
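As a rough sketch of that refactoring, assume you later decide the exits text should be cleaned up before use. The Room class below is a cut-down, hypothetical version of the class from the question; callers keep writing room.roomExits while the attribute quietly becomes a property.

```python
class Room:
    def __init__(self, roomName, roomExits):
        self.roomName = roomName
        self._roomExits = roomExits   # raw text as read from the file

    @property
    def roomExits(self):
        # Same attribute-style access for callers; the cleanup happens here.
        return self._roomExits.strip()

bedroom = Room("bedroom", "A door to the east.\n")
print(bedroom.roomExits)
```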
Depending on what information is present in your definitions file and what your needs are, you can get fancier - for instance, I would probably want to have my exits information stored in a dictionary mapping a canonical name for each exit (probably starting with 'east', 'west', 'north' and 'south', and expanding to things like 'up', 'down' and 'dennis' as necessary) to a tuple of a longer description and the related Rooms instance.
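One possible shape for that exits mapping is sketched below, under the assumption that each room is linked up after all rooms have been created. The Room class here is a simplified, hypothetical one, and the descriptions are taken from the sample file in the question.

```python
class Room:
    def __init__(self, name):
        self.name = name
        self.exits = {}   # direction -> (description, destination Room)

bedroom = Room("bedroom")
kitchen = Room("kitchen")

# Link the rooms in both directions.
bedroom.exits["east"] = ("A door to the east.", kitchen)
kitchen.exits["west"] = ("A door to the west.", bedroom)

description, destination = bedroom.exits["east"]
print(description, "->", destination.name)
```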
I would also name the class Room rather than Rooms, but that's a style issue rather than important behavior.
You can use in to check for membership (literally, if something is in a container). This works for lists, strings, and other iterables.
>>> li = ['a','b','c']
>>> 'a' in li
True
>>> 'x' in li
False
After you've read your rooms, you can create a dictionary:
rooms = roomSetup()
exits_of_each_room = {}
for room in rooms:
    exits_of_each_room[room.getRoomName()] = room.getRoomExits()
Then your function is simply:
def printRoomExits(exits_of_each_room, noun):
    print(exits_of_each_room[noun])