Populate dictionaries from text file - python

I have a text file with the details of a set of restaurants, given one after the other. The details are the name, rating, price and cuisine types of each restaurant. The contents of the text file are given below.
George Porgie
87%
$$$
Canadian, Pub Food
Queen St. Cafe
82%
$
Malaysian, Thai
Dumpling R Us
71%
$
Chinese
Mexican Grill
85%
$$
Mexican
Deep Fried Everything
52%
$
Pub Food
I want to create a set of dictionaries as given below:
Restaurant name to rating:
# dict of {str : int}
name_to_rating = {'George Porgie' : 87,
'Queen St. Cafe' : 82,
'Dumpling R Us' : 71,
'Mexican Grill' : 85,
'Deep Fried Everything' : 52}
Price to list of restaurant names:
# dict of {str : list of str }
price_to_names = {'$' : ['Queen St. Cafe', 'Dumpling R Us', 'Deep Fried Everything'],
'$$' : ['Mexican Grill'],
'$$$' : ['George Porgie'],
'$$$$' : [ ]}
Cuisine to list of restaurant name:
# dict of {str : list of str }
cuisine_to_names = {'Canadian' : ['George Porgie'],
'Pub Food' : ['George Porgie', 'Deep Fried Everything'],
'Malaysian' : ['Queen St. Cafe'],
'Thai' : ['Queen St. Cafe'],
'Chinese' : ['Dumpling R Us'],
'Mexican' : ['Mexican Grill']}
What is the best way in Python to populate the above dictionaries?

Initialise some containers:
import collections
name_to_rating = {}
price_to_names = collections.defaultdict(list)
cuisine_to_names = collections.defaultdict(list)
Read your file into a temporary string:
with open('/path/to/your/file.txt') as f:
    spam = f.read().strip()
Assuming the structure is consistent (i.e. chunks of 4 lines separated by double newlines), iterate through the chunks and populate your containers:
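restaurants = [chunk.split('\n') for chunk in spam.split('\n\n')]
for name, rating, price, cuisines in restaurants:
    # '87%' -> 87, since the target dict maps str to int
    name_to_rating[name] = int(rating.rstrip('%'))
    price_to_names[price].append(name)
    for cuisine in cuisines.split(', '):
        cuisine_to_names[cuisine].append(name)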

For the main reading loop, you can use enumerate and modulo to work out which field a given line holds:
# data is assumed to hold the whole file contents as one string
for lineNb, line in enumerate(data.splitlines()):
    print(lineNb, lineNb % 4, line)
For the price_to_names and cuisine_to_names dictionaries, you could use a defaultdict:
from collections import defaultdict
price_to_names = defaultdict(list)
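Putting the enumerate/modulo idea and the defaultdict together could look roughly like the sketch below. It assumes the file has exactly four lines per restaurant with no blank lines in between; SAMPLE and build_dicts are names made up for this illustration, and the text is inlined rather than read from a file.

```python
from collections import defaultdict

# Sample data in the same four-lines-per-restaurant layout as the question.
SAMPLE = """George Porgie
87%
$$$
Canadian, Pub Food
Queen St. Cafe
82%
$
Malaysian, Thai
Dumpling R Us
71%
$
Chinese"""


def build_dicts(text):
    """Return (name_to_rating, price_to_names, cuisine_to_names)."""
    name_to_rating = {}
    price_to_names = defaultdict(list)
    cuisine_to_names = defaultdict(list)
    name = None
    for line_nb, line in enumerate(text.splitlines()):
        field = line_nb % 4  # 0 = name, 1 = rating, 2 = price, 3 = cuisines
        line = line.strip()
        if field == 0:
            name = line
        elif field == 1:
            name_to_rating[name] = int(line.rstrip('%'))  # '87%' -> 87
        elif field == 2:
            price_to_names[line].append(name)
        else:
            for cuisine in line.split(','):
                cuisine_to_names[cuisine.strip()].append(name)
    # Convert back to plain dicts so missing keys raise KeyError again.
    return name_to_rating, dict(price_to_names), dict(cuisine_to_names)


ratings, prices, cuisines = build_dicts(SAMPLE)
print(ratings)
print(prices)
print(cuisines)
```

Note that a defaultdict only creates a key when it is first used, so a price such as '$$$$' with no restaurants would have to be added explicitly if the empty list is wanted.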

Related

Is there a way to find an element in a list with a unique criteria Python?

For some reason, .index and .find are not working in my program.
Basically, I want the program to find the index of the list element that matches a given criterion.
For example this list:
['Synonyms: Pocket Monsters, Indigo League, Adventures on the Orange Islands, The Johto Journeys, Johto League Champions, Master Quest', 'Japanese: ポケットモンスター', 'Type: TV', 'Episodes: 276', 'Status: Finished Airing', 'Aired: Apr 1, 1997 to Nov 14, 2002', 'Premiered: Spring 1997', 'Broadcast: Thursdays at 19:00 (JST)', 'Producers: TV Tokyo, TV Tokyo Music, Studio Jack', 'Licensors: VIZ Media, 4Kids Entertainment', 'Studios: OLM', 'Source: Game', 'Genres: Action, Adventure, Comedy, Kids, Fantasy', 'Duration: 24 min. per ep.', 'Rating: PG - Children', 'Score: 7.341 (scored by 291,570 users)', 'Ranked: #21572', 'Popularity: #287', 'Members: 504,076', 'Favorites: 4,076', '']
I would like to find the index of the element containing "Genres". I tried .index("Genres:"), but that raised an error instead of returning the index.
I need to find the index because, on other pages of this website, the "genre" is in a different position.
This is what I tried; it just returned an error:
GIndex = Information.index("Genres")
print (Information[GIndex])
Genre = (Information[GIndex])
You could also use a list comprehension:
word = 'Genres'
results = [i for i, l in enumerate(lst) if word in l]
For index to work, the argument would need to be an exact match for a list element. The loop below checks whether 'Genres' is contained within any string in the list and prints its index:
word = 'Genres'
for i, item in enumerate(lst):
    if word in item:
        print(i, item)
You can easily turn this into a function that returns i, which is the index.
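As a rough sketch of that function: find_index is a hypothetical name, the list is a shortened stand-in for the one in the question, and returning -1 when nothing matches is just one possible convention.

```python
def find_index(items, word):
    """Return the index of the first string containing word, or -1 if none does."""
    for i, item in enumerate(items):
        if word in item:
            return i
    return -1


# Shortened stand-in for the list in the question.
info = ['Type: TV', 'Episodes: 276', 'Genres: Action, Adventure', 'Duration: 24 min. per ep.']
print(find_index(info, 'Genres'))
```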
You can use enumerate():
l=['Synonyms: Pocket Monsters, Indigo League, Adventures on the Orange Islands, The Johto Journeys, Johto League Champions, Master Quest', 'Japanese: ポケットモンスター', 'Type: TV', 'Episodes: 276', 'Status: Finished Airing', 'Aired: Apr 1, 1997 to Nov 14, 2002', 'Premiered: Spring 1997', 'Broadcast: Thursdays at 19:00 (JST)', 'Producers: TV Tokyo, TV Tokyo Music, Studio Jack', 'Licensors: VIZ Media, 4Kids Entertainment', 'Studios: OLM', 'Source: Game', 'Genres: Action, Adventure, Comedy, Kids, Fantasy', 'Duration: 24 min. per ep.', 'Rating: PG - Children', 'Score: 7.341 (scored by 291,570 users)', 'Ranked: #21572', 'Popularity: #287', 'Members: 504,076', 'Favorites: 4,076', '']
for i, s in enumerate(l):
    if "Genres" in s:
        print(i)
>>>12
In your list there is no element whose value is "Genres". "Genres" and "Genres: Action, Adventure, Comedy, Kids, Fantasy" are not equal. If you want to find the element which starts with "Genres", you can write a very simple for loop like this:
for index, item in enumerate(information_list):
    if item.startswith('Genres'):
        print(index, item)
        break
A better way, as #9769953 suggested, is to use a dictionary. Lookup in a dict takes constant time on average, and your data will be neatly organized, with keys corresponding to property names and values corresponding to property values.
A dictionary would look like this:
information_dict = {
"Synonyms": "Pocket Monsters, Indigo League, Adventures on the Orange Islands, The Johto Journeys, Johto League Champions, Master Quest",
"Japanese": "ポケットモンスター",
"Type": "TV",
"Episodes": "276",
"Status": "Finished Airing",
"Aired": "Apr 1, 1997 to Nov 14, 2002",
"Premiered": "Spring 1997",
"Broadcast": "Thursdays at 19:00 (JST)",
"Producers": "TV Tokyo, TV Tokyo Music, Studio Jack",
"Licensors": "VIZ Media, 4Kids Entertainment",
"Studios": "OLM",
"Source": "Game",
"Genres": "Action, Adventure, Comedy, Kids, Fantasy",
"Duration": "24 min. per ep.",
"Rating": "PG - Children",
"Score": "7.341 (scored by 291,570 users)",
"Ranked": "#21572",
"Popularity": "#287",
"Members": "504,076",
"Favorites": "4,076"
}
And by doing information_dict["Genres"] you can get the value "Action, Adventure, Comedy, Kids, Fantasy".
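If the data arrives as the list of "Key: value" strings shown in the question, the dictionary does not have to be typed out by hand. A minimal sketch, assuming every useful entry contains ": " exactly where the key ends (the list here is a shortened stand-in):

```python
# Shortened stand-in for the scraped list, including the trailing empty string.
info = [
    'Type: TV',
    'Episodes: 276',
    'Aired: Apr 1, 1997 to Nov 14, 2002',
    'Genres: Action, Adventure, Comedy, Kids, Fantasy',
    '',
]

information_dict = {}
for item in info:
    # partition splits at the first ': ' only, so values may contain ': ' themselves.
    key, sep, value = item.partition(': ')
    if sep:  # skip entries such as '' that have no "key: value" shape
        information_dict[key] = value

print(information_dict['Genres'])
```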

'DataFrame' object is not callable PYTHON

I have code that should write information to Excel using Selenium. I have one list with some information and need to write all of it to Excel, and I have a solution. But when I tried to use it, I got 'DataFrame' object is not callable. How can I solve it?
All of this code runs inside a loop:
for schools in List: #in the List i have data from excel file with Name of schools
    data = pd.DataFrame()
    data({
        "School Name":School_list_result[0::17],
        "Principal":School_list_result[1::17],
        "Principal's E-mail":School_list_result[2::17],
        "Type":School_list_result[8::17],
        "Grade Span": School_list_result[3::17],
        "Address":School_list_result[4::17],
        "Phone":School_list_result[14::17],
        "Website":School_list_result[13::17],
        "Associations/Communities":School_list_result[5::17],
        "GreatSchools Summary Rating":School_list_result[6::17],
        "U.S.News Rankings":School_list_result[12::17],
        "Total # Students":School_list_result[15::17],
        "Full-Time Teachers":School_list_result[16::17],
        "Student/Teacher Ratio":School_list_result[17::17],
        "Charter":School_list_result[9::17],
        "Enrollment by Race/Ethnicity": School_list_result[7::17],
        "Enrollment by Gender":School_list_result[10::17],
        "Enrollment by Grade":School_list_result[11::17],
    })
    data.to_excel("D:\Schools.xlsx")
In School_list_result i have this data:
'Cape Elizabeth High School',
'Mr. Jeffrey Shedd',
'No data.',
'9-12',
'345 Ocean House Road, Cape Elizabeth, ME 04107',
'Cape Elizabeth Public Schools',
'8/10',
'White\n91%\nAsian\n3%\nTwo or more races\n3%\nHispanic\n3%\nBlack\n1%',
'Regular school',
'No',
' Male Female\n Students 281 252',
' 9 10 11 12\n Students 139 135 117 142',
'#5,667 in National Rankings',
'https://cehs.cape.k12.me.us/',
'Tel: (207)799-3309',
'516 students',
'47 teachers',
'11:1',
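The error occurs because data = pd.DataFrame() creates an empty DataFrame, and data({...}) then tries to call that object like a function. Pass the dictionary to the DataFrame constructor instead, following the documented syntax:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
So your code should be modified as follows: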
for schools in List: #in the List i have data from excel file with Name of schools
    data = pd.DataFrame(data={
        "School Name": School_list_result[0::17],
        "Principal": School_list_result[1::17],
        "Principal's E-mail": School_list_result[2::17],
        "Type": School_list_result[8::17],
        "Grade Span": School_list_result[3::17],
        "Address": School_list_result[4::17],
        "Phone": School_list_result[14::17],
        "Website": School_list_result[13::17],
        "Associations/Communities": School_list_result[5::17],
        "GreatSchools Summary Rating": School_list_result[6::17],
        "U.S.News Rankings": School_list_result[12::17],
        "Total # Students": School_list_result[15::17],
        "Full-Time Teachers": School_list_result[16::17],
        "Student/Teacher Ratio": School_list_result[17::17],
        "Charter": School_list_result[9::17],
        "Enrollment by Race/Ethnicity": School_list_result[7::17],
        "Enrollment by Gender": School_list_result[10::17],
        "Enrollment by Grade": School_list_result[11::17],
    })
Do you want to add to an existing xlsx file?
First, create the dictionary and then call the DataFrame method, like this:
r = {"column1":["data"], "column2":["data"]}
data = pd.DataFrame(r)
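One thing worth checking when the columns are cut out of a flat list with slices such as [0::17]: the step has to equal the number of values stored per record, otherwise the columns end up with different lengths. The sample in the question lists 18 values for one school, so a step of 18 may be what is needed there; that is an inference from the sample, not something the question states. The sketch below uses plain Python with a made-up three-field layout (FIELDS, flat and the second record are hypothetical); the resulting dict of equal-length lists is the kind of object pd.DataFrame accepts.

```python
# Hypothetical layout: 3 values per record, stored one after another in a flat list.
FIELDS = ["School Name", "Principal", "Phone"]
flat = [
    "Cape Elizabeth High School", "Mr. Jeffrey Shedd", "Tel: (207)799-3309",
    "Example High School", "Ms. Example", "Tel: (000)000-0000",  # made-up record
]

n = len(FIELDS)  # the slice step must equal the number of values per record
columns = {name: flat[i::n] for i, name in enumerate(FIELDS)}

# Every column must have the same length before the dict goes into a DataFrame.
assert len({len(col) for col in columns.values()}) == 1
print(columns)
```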

Can not use .json information

Just a heads up: I'm completely new to the coding scene and I'm having some issues using a JSON file.
I've got the JSON to open using
json_queue = json.load(open('customer.json'))
but I just can't find the right code that lets me make use of the info in the JSON. I think it's because the JSON is an array, not an object (I'm probably completely wrong). My JSON currently looks like this:
[
["James", "VW"],
["Katherine", "BMW"],
["Deborah", "renault"],
["Marguerite", "ford"],
["Kenneth", "VW"],
["Ronald", "Mercedes"],
["Donald", "BMW"],
["Al", "vauxhall"],
["Max", "porsche"],
["Carlos", "BMW"],
["Barry", "ford"],
["Donald", "renault"]
]
What I'm trying to do is take the person's name and the car type they are looking for and compare it to another JSON file that has the stock of cars in a shop, but I'm currently stuck on how to get Python to actually use the information in that JSON.
I think I might have over-explained my problem. My issue is that I am just starting a project using .json files. I can get Python to open the file, but I am unsure how to get Python to read that "James" wants a "VW" and then check the stock JSON to see whether it is in stock. The stock JSON looks like this:
{
"VW": 4,
"BMW": 2,
"renault": 0,
"ford": 1,
"mercedes": 2,
"vauxhall": 1,
"porsche": 0,
}
What you have after the json.load() call is a plain python list of lists:
whishlist = [
["James", "VW"],
["Katherine", "BMW"],
["Deborah", "renault"],
["Marguerite", "ford"],
["Kenneth", "VW"],
["Ronald", "Mercedes"],
["Donald", "BMW"],
["Al", "vauxhall"],
["Max", "porsche"],
["Carlos", "BMW"],
["Barry", "ford"],
["Donald", "renault"]
]
where each sublist is a (name, car) pair. You can iterate over this list:
for name, car in whishlist:
    print("name : {} - car : {}".format(name, car))
Now with your "other json file", what you have is a dict:
stock = {
"VW": 4,
"BMW": 2,
"renault": 0,
"ford": 1,
"mercedes": 2,
"vauxhall": 1,
"porsche": 0,
}
so all you have to do is iterate over the whishlist list, check whether the car is in stock, and print the result (or do anything else with it):
for name, car in whishlist:
    in_stock = stock.get(car, 0)
    print("for {} : car {} in stock : {}".format(name, car, in_stock))
for James : car VW in stock : 4
for Katherine : car BMW in stock : 2
for Deborah : car renault in stock : 0
for Marguerite : car ford in stock : 1
for Kenneth : car VW in stock : 4
for Ronald : car Mercedes in stock : 0
for Donald : car BMW in stock : 2
for Al : car vauxhall in stock : 1
for Max : car porsche in stock : 0
for Carlos : car BMW in stock : 2
for Barry : car ford in stock : 1
for Donald : car renault in stock : 0
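In the output above, Ronald's "Mercedes" is reported as 0 even though the stock lists 2 under "mercedes": dictionary keys are case-sensitive, so stock.get falls back to the default. One possible way around this, sketched below under the assumption that car names should be compared ignoring case, is to lower-case both sides. The inline JSON strings are shortened stand-ins for the two files, and stock_by_lower and results are names made up for this example. Note also that strict JSON does not allow a trailing comma before the closing brace, so the stock file may need that comma removed before json.load accepts it.

```python
import json

# Inline JSON stands in for the two files from the question.
wishlist = json.loads('[["James", "VW"], ["Ronald", "Mercedes"], ["Max", "porsche"]]')
stock = json.loads('{"VW": 4, "mercedes": 2, "porsche": 0}')

# Normalise the stock keys once so that lookups ignore case.
stock_by_lower = {car.lower(): count for car, count in stock.items()}

results = {}
for name, car in wishlist:
    results[name] = stock_by_lower.get(car.lower(), 0)
    print("for {} : car {} in stock : {}".format(name, car, results[name]))
```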

Printing just the dict value

I'm trying to figure out why my loop is printing everything in the dict and not just the values
films = {
"2005": ["Munich", "Steven Spielberg"],
"2006": [["The Prestige", "Christopher Nolan"], ["The Departed", "Martin Scorsese"]]
}
for year in films:
    print (year)
    for x in films[year]:
        print (films[year])
I would like it to print like this
2005
Munich, Steven Spielberg
2006
The prestige, Christopher Nolan
the Departed, Martin Scorsese
But instead it's printing like this, with brackets and apostrophes:
2005
['Munich', 'Steven Spielberg']
You're not using x (I suggest a better name).
This code works in both Python 2.7 and Python 3.x:
films = {
"2005": ["Munich", "Steven Spielberg"],
"2006": [["The Prestige", "Christopher Nolan"], ["The Departed", "Martin Scorsese"]]
}
for year in sorted(films.keys()):
    print(year)
    if isinstance(films[year][0], list):
        films_list = films[year]
    else:
        films_list = [films[year]]
    for film in films_list:
        print(", ".join(film))
    print("")
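The isinstance check above is only needed because "2005" stores one film as a flat list while "2006" stores a list of films. If the data can be changed (an assumption; the question does not say so), storing every year as a list of [title, director] pairs makes the loop uniform. A small sketch, collecting the lines in a list called lines purely for illustration:

```python
# Every year maps to a list of [title, director] pairs, even with a single film.
films = {
    "2005": [["Munich", "Steven Spielberg"]],
    "2006": [["The Prestige", "Christopher Nolan"], ["The Departed", "Martin Scorsese"]],
}

lines = []
for year in sorted(films):
    lines.append(year)
    for film in films[year]:
        lines.append(", ".join(film))

print("\n".join(lines))
```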
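Instead of
print (films[year])
use
print (", ".join(x))
This joins the members of the list x (the loop variable, not films[year]) using the string ", " (comma and a space) as the separator between individual members. It relies on each entry of films[year] being a list itself, as under "2006"; under "2005" the entries are plain strings, so that year's value needs to be wrapped in a list first.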

How do I work with a nested dictionary's name?

I'm writing a program using dictionaries nested within a list. I want to print the name of each dictionary when looping through the list, but don't know how to do that without calling the entire contents of the dictionary. Here is my code:
sam = {
'food' : 'tortas',
'country' : 'mexico',
'song' : 'Dream On',
}
dave = {
'food' : 'spaghetti',
'country' : 'USA',
'song' : 'Sweet Home Alabama',
}
people = [sam, dave]
for person in people:
    for key, value in sorted(person.items()):
        print( #person's name +
            "'s favorite " + key + " is " + value + ".")
Here is the output:
's favorite country is mexico.
's favorite food is tortas.
's favorite song is Dream On.
's favorite country is USA.
's favorite food is spaghetti.
's favorite song is Sweet Home Alabama.
Everything works, I just need the names of my dictionaries to print. What's the solution?
The (more) correct way of doing this is to construct a dict of dicts instead, such as:
people = {'sam': {'food' : 'tortas',
'country' : 'mexico',
'song' : 'Dream On',
},
'dave': {'food' : 'spaghetti',
'country' : 'USA',
'song' : 'Sweet Home Alabama',
}
}
Then you can simply do the following:
for name, person in people.items():
    for key, value in sorted(person.items()):
        print(name + "'s favorite " + key + " is " + value + ".")
This will print the following (on Python 3.7+, where dicts preserve insertion order, sam comes first):
sam's favorite country is mexico.
sam's favorite food is tortas.
sam's favorite song is Dream On.
dave's favorite country is USA.
dave's favorite food is spaghetti.
dave's favorite song is Sweet Home Alabama.
As a side note, it is more readable to use string formatting in your print statement:
print("{0}'s favorite {1} is {2}".format(name, key, value))
What you are basically trying to do is print the name of a variable. This is not recommended. If you really want to do this, you should take a look at this post:
How can you print a variable name in python?
What I would do is store the name of each dictionary inside the list. You could do this by changing 'people = [sam, dave]' to 'people = [["sam", sam], ["dave", dave]]'. This way, person[0] is the name of the person, and person[1] contains the information.
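A small sketch of that idea, using tuples instead of inner lists so each pair can be unpacked directly in the for statement (the sentences list is only there to make the result easy to inspect):

```python
sam = {'food': 'tortas', 'country': 'mexico', 'song': 'Dream On'}
dave = {'food': 'spaghetti', 'country': 'USA', 'song': 'Sweet Home Alabama'}

# Each entry pairs a name with the dictionary it describes.
people = [("sam", sam), ("dave", dave)]

sentences = []
for name, person in people:
    for key, value in sorted(person.items()):
        sentences.append(name + "'s favorite " + key + " is " + value + ".")

print("\n".join(sentences))
```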
The simplest way is to store the name as a string that maps to the matching variable identifier:
people = {'sam':sam, 'dave':dave}
for name, person in people.items():
    for key, value in sorted(person.items()):
        print(name + "'s favorite " + key + " is " + value + ".")
If you really don't like the idea of typing each name twice, you could 'inline' the dictionaries:
people = {
'sam':{
'food' : 'tortas',
'country' : 'mexico',
'song' : 'Dream On',
},
'dave':{
'food' : 'spaghetti',
'country' : 'USA',
'song' : 'Sweet Home Alabama',
}
}
Finally, if you can rely on those variables being in the global namespace and are more concerned with just making it work than purity of practice, you can find them this way:
people = ['sam', 'dave']
for name in people:
    person = globals()[name]
    for key, value in sorted(person.items()):
        print(name + "'s favorite " + key + " is " + value + ".")
Values in a list aren't really variables any more. They aren't referred to by a name in some namespace, but by an integer indicating their offsets from the front of the list (0, 1, ...).
If you want to associate each dict of data with some name, you have to do it explicitly. There are two general options, depending on what's responsible for tracking the name: the collection of people, or each person in the collection.
The first and easiest is collections.OrderedDict, which is guaranteed to preserve the order of the people in your list. (A plain dict only guarantees insertion order from Python 3.7 onwards.)
from collections import OrderedDict
sam = {
'food': 'tortas',
'country': 'Mexico',
'song': 'Dream On',
}
dave = {
'food': 'spaghetti',
'country': 'USA',
'song': 'Sweet Home Alabama',
}
# The OrderedDict stores each person's name.
people = OrderedDict([('Sam', sam), ('Dave', dave)])
for name, data in people.items():
    # Name is a key in the OrderedDict.
    print('Name: ' + name)
    for key, value in sorted(data.items()):
        print(' {0}: {1}'.format(key.title(), value))
Alternatively, you can store each person's name in his or her own dict... assuming you're allowed to change the contents of those dictionaries. (Also, you wouldn't want to add anything to the data dictionary that would require you to change / update the data more than you already do. Since most people change their favorite food or song much more often than they change their name, this is probably safe.)
sam = {
# Each dict has a new key: 'name'.
'name': 'Sam',
'food': 'tortas',
'country': 'Mexico',
'song': 'Dream On',
}
dave = {
'name': 'Dave',
'food': 'spaghetti',
'country': 'USA',
'song': 'Sweet Home Alabama',
}
people = [sam, dave]
for data in people:
    # Name is a value in the dict.
    print('Name: ' + data['name'])
    for key, value in sorted(data.items()):
        # Have to avoid printing the name again.
        if 'name' != key:
            print(' {0}: {1}'.format(key.title(), value))
Note that how you print the data depends on whether you store the name in the collection (OrderedDict variant), or in each person's dict (list variant).
Thanks for the great input. This program is for a practice example in "Python Crash Course" by Eric Matthes, so the inefficient "dictionaries inside list" format is intentional. That said, I got a lot out of your comments, and altered my code to get the desired output:
sam = {
#Added a 'name' key-value pair.
'name' : 'sam',
'food' : 'tortas',
'country' : 'mexico',
'song' : 'Dream On',
}
dave = {
'name' : 'dave',
'food' : 'spaghetti',
'country' : 'USA',
'song' : 'Sweet Home Alabama',
}
people = [sam, dave]
for person in people:
    for key, value in sorted(person.items()):
        #Added if statement to prevent printing the name.
        if key != 'name':
            print(person['name'].title() + "'s favorite " + key + " is " + value + ".")
    #Added a blank line at the end of each for loop.
    print('\n')
Here is the output:
Sam's favorite country is mexico.
Sam's favorite food is tortas.
Sam's favorite song is Dream On.
Dave's favorite country is USA.
Dave's favorite food is spaghetti.
Dave's favorite song is Sweet Home Alabama.
Thanks again, all who provided insightful answers.
