Hello, I have a text file containing a list of countries and their capitals.
I want to type in a country name and have the program print its capital, and this is where I get confused.
country.txt
malaysia, vietnam, myanmar, china, sri lanka, japan, brazil, usa, australia, thailand, russia, uk
kuala lumpur, hanoi, yangon, beijing, colombo, tokyo, rio, washington, canberra, bangkok, moscow, london
That's the text file with the countries and capitals.
f = open(r'C:\Users\User\Desktop\country.txt')
count = 0
for line in f:
    line = line.rstrip('\n')
    rec = line.split(',')
    count = count + 1
ctry = input('\nEnter country name: ')
ctry = ctry.lower()
for i in country:
    if ctry == country[i]:
        print ('country:', ctry)
        print ('capital:', capital[i])
        break
else:
    print ('country not in the list')
This is where I don't know what to do to make it work.
I want the output to look like this:
Enter country name: vietnam
Country: vietnam
Capital: hanoi
and when the country is not in the list:
Enter country name: france
Country not in the list
First of all, here are some reasons why your code isn't working, along with some suggestions:
You never defined the variables country or capital, so the second loop raises a NameError before it can compare anything.
If you open a file, you should close it after working with it; it's better to use with, which closes the file for you.
Please read up on how for loops work: you were trying to index country[i], but i is an element of the list, not an integer index.
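To illustrate that last point, here is a small sketch (the two short lists are made up for the example) of what a for loop actually hands you, and how enumerate can supply the index when one is needed:

```python
countries = ['malaysia', 'vietnam', 'myanmar']
capitals = ['kuala lumpur', 'hanoi', 'yangon']

# A plain for loop yields the elements themselves, not their positions.
elements = [c for c in countries]

# enumerate() yields (index, element) pairs, so the index can be used
# to look up the matching entry in a parallel list.
pairs = []
for i, name in enumerate(countries):
    pairs.append((name, capitals[i]))

print(elements)
print(pairs)
```

Indexing with the element itself, as in countries['vietnam'], raises a TypeError because list indices must be integers.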
Nonetheless, you could try this:
with open(r'C:/Users/User/Desktop/country.txt') as f:
    lines = f.read().splitlines()
# The first line holds the countries, the second the capitals, in the same order.
countries = lines[0].strip().split(', ')
cities = lines[1].strip().split(', ')
print(countries)
print(cities)
ctry = input('\nEnter country name: ')
ctry = ctry.lower()
for i in countries:
    if i == ctry:
        print('country:', ctry)
        print('capital:', cities[countries.index(i)])
        break
else:
    # for/else: this runs only if the loop finished without hitting break.
    print('country not in the list')
Output:
countries
>>>['malaysia', 'vietnam', 'myanmar', 'china', 'sri lanka', 'japan', 'brazil', 'usa', 'australia', 'thailand', 'russia', 'uk']
cities
>>>['kuala lumpur', 'hanoi', 'yangon', 'beijing', 'colombo', 'tokyo', 'rio', 'washington', 'canberra', 'bangkok', 'moscow', 'london']
>>>'\nEnter country name: ' uk
>>>country: uk
>>>capital: london
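As an alternative to searching two parallel lists, the pairs could be stored in a dictionary. This is only a sketch: build_lookup and find_capital are hypothetical helper names, and the two sample lines stand in for the contents of country.txt.

```python
def build_lookup(country_line, capital_line):
    """Pair each country with its capital from two comma-separated lines."""
    countries = [c.strip() for c in country_line.split(',')]
    capitals = [c.strip() for c in capital_line.split(',')]
    return dict(zip(countries, capitals))


def find_capital(lookup, name):
    """Return the capital for name, or None if the country is not listed."""
    return lookup.get(name.strip().lower())


# Sample data standing in for the two lines of country.txt.
lookup = build_lookup('malaysia, vietnam, uk', 'kuala lumpur, hanoi, london')

for query in ('Vietnam', 'france'):
    capital = find_capital(lookup, query)
    if capital is None:
        print('country not in the list')
    else:
        print('country:', query.lower())
        print('capital:', capital)
```

A dictionary lookup avoids both the explicit loop and the extra countries.index() call.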
I am trying to do the following:
Read through a specific portion of a text file (there is a known starting point and ending point)
While reading through these lines, check to see if a word matches a word that I have included in a list
If a match is detected, then add that specific word to a new list
I have been able to read through the text and grab other data from it that I need, but I've been unable to do the above mentioned thus far.
I have tried to implement the following example: Python - Search Text File For Any String In a List
But I have failed to make it read correctly.
I have also tried to adapt the following: https://www.geeksforgeeks.org/python-finding-strings-with-given-substring-in-list/
But I was equally unsuccessful.
Here is some of my code:
import re
from itertools import islice
import os
# list of all countries
oneCountries = "Afghanistan, Albania, Algeria, Andorra, Angola, Antigua & Deps, Argentina, Armenia, Australia, Austria, Azerbaijan, Bahamas, Bahrain, Bangladesh, Barbados, Belarus, Belgium, Belize, Benin, Bhutan, Bolivia, Bosnia Herzegovina, Botswana, Brazil, Brunei, Bulgaria, Burkina, Burma, Burundi, Cambodia, Cameroon, Canada, Cape Verde, Central African Rep, Chad, Chile, China, Republic of China, Colombia, Comoros, Democratic Republic of the Congo, Republic of the Congo, Costa Rica,, Croatia, Cuba, Cyprus, Czech Republic, Danzig, Denmark, Djibouti, Dominica, Dominican Republic, East Timor, Ecuador, Egypt, El Salvador, Equatorial Guinea, Eritrea, Estonia, Ethiopia, Fiji, Finland, France, Gabon, Gaza Strip, The Gambia, Georgia, Germany, Ghana, Greece, Grenada, Guatemala, Guinea, Guinea-Bissau, Guyana, Haiti, Holy Roman Empire, Honduras, Hungary, Iceland, India, Indonesia, Iran, Iraq, Republic of Ireland, Israel, Italy, Ivory Coast, Jamaica, Japan, Jonathanland, Jordan, Kazakhstan, Kenya, Kiribati, North Korea, South Korea, Kosovo, Kuwait, Kyrgyzstan, Laos, Latvia, Lebanon, Lesotho, Liberia, Libya, Liechtenstein, Lithuania, Luxembourg, Macedonia, Madagascar, Malawi, Malaysia, Maldives, Mali, Malta, Marshall Islands, Mauritania, Mauritius, Mexico, Micronesia, Moldova, Monaco, Mongolia, Montenegro, Morocco, Mount Athos, Mozambique, Namibia, Nauru, Nepal, Newfoundland, Netherlands, New Zealand, Nicaragua, Niger, Nigeria, Norway, Oman, Ottoman Empire, Pakistan, Palau, Panama, Papua New Guinea, Paraguay, Peru, Philippines, Poland, Portugal, Prussia, Qatar, Romania, Rome, Russian Federation, Rwanda, St Kitts & Nevis, St Lucia, Saint Vincent & the Grenadines, Samoa, San Marino, Sao Tome & Principe, Saudi Arabia, Senegal, Serbia, Seychelles, Sierra Leone, Singapore, Slovakia, Slovenia, Solomon Islands, Somalia, South Africa, Spain, Sri Lanka, Sudan, Suriname, Swaziland, Sweden, Switzerland, Syria, Tajikistan, Tanzania, Thailand, Togo, Tonga, Trinidad & Tobago, Tunisia, 
Turkey, Turkmenistan, Tuvalu, Uganda, Ukraine, United Arab Emirates, United Kingdom, United States, Uruguay, Uzbekistan, Vanuatu, Vatican City, Venezuela, Vietnam, Yemen, Zambia, Zimbabwe"
countries = oneCountries.split(",")
path = "C:/Users/me/Desktop/read.txt"
thefile = open(path, errors='ignore')
countryParsing = False
for line in thefile:
    line = line.strip()
# if line.startswith("Submitting Author:"):
# if re.match(r"Submitting Author:", line):
# print("blahblah1")
# countryParsing = True
# if countryParsing == True:
# print("blahblah2")
#
# res = [x for x in line if re.search(countries, x)]
# print("blah blah3: " + str(res))
# elif re.match(r"Running Head:", line):
# countryParsing = False
# if countryParsing == True:
# res = [x for x in line if re.search(countries, x)]
# print("blah blah4: " + str(res))
# for x in countries:
# if x in thefile:
# print("a country is: " + x)
# if any(s in line for s in countries):
# listOfAuthorCountries = listOfAuthorCountries + s + ", "
# if re.match(f"Submitting Author:, line"):
The #commented out lines are versions of the code that I've tried and failed to make work properly.
As requested, this is an example of the text file that I'm trying to grab the data from. I've modified it to remove sensitive information, but in this particular case, the "new list" should be appended with a certain number of "France" entries:
txt above....
Submitting Author:
asdf, asdf (proxy)
France
asdfasdf
blah blah
asdfasdf
asdf, Provence-Alpes-Côte d'Azu 13354
France
blah blah
France
asdf
Running Head:
...more text below
Based on the three points you stated on what you want to accomplish and what I understand from your code (which may not be what you intended), I propose:
# list of all countries
countries = "Afghanistan, Albania, Algeria, Andorra, Angola, Antigua & Deps, Argentina, Armenia, Australia, Austria, Azerbaijan, Bahamas, Bahrain, Bangladesh, Barbados, Belarus, Belgium, Belize, Benin, Bhutan, Bolivia, Bosnia Herzegovina, Botswana, Brazil, Brunei, Bulgaria, Burkina, Burma, Burundi, Cambodia, Cameroon, Canada, Cape Verde, Central African Rep, Chad, Chile, China, Republic of China, Colombia, Comoros, Democratic Republic of the Congo, Republic of the Congo, Costa Rica, Croatia, Cuba, Cyprus, Czech Republic, Danzig, Denmark, Djibouti, Dominica, Dominican Republic, East Timor, Ecuador, Egypt, El Salvador, Equatorial Guinea, Eritrea, Estonia, Ethiopia, Fiji, Finland, France, Gabon, Gaza Strip, The Gambia, Georgia, Germany, Ghana, Greece, Grenada, Guatemala, Guinea, Guinea-Bissau, Guyana, Haiti, Holy Roman Empire, Honduras, Hungary, Iceland, India, Indonesia, Iran, Iraq, Republic of Ireland, Israel, Italy, Ivory Coast, Jamaica, Japan, Jonathanland, Jordan, Kazakhstan, Kenya, Kiribati, North Korea, South Korea, Kosovo, Kuwait, Kyrgyzstan, Laos, Latvia, Lebanon, Lesotho, Liberia, Libya, Liechtenstein, Lithuania, Luxembourg, Macedonia, Madagascar, Malawi, Malaysia, Maldives, Mali, Malta, Marshall Islands, Mauritania, Mauritius, Mexico, Micronesia, Moldova, Monaco, Mongolia, Montenegro, Morocco, Mount Athos, Mozambique, Namibia, Nauru, Nepal, Newfoundland, Netherlands, New Zealand, Nicaragua, Niger, Nigeria, Norway, Oman, Ottoman Empire, Pakistan, Palau, Panama, Papua New Guinea, Paraguay, Peru, Philippines, Poland, Portugal, Prussia, Qatar, Romania, Rome, Russian Federation, Rwanda, St Kitts & Nevis, St Lucia, Saint Vincent & the Grenadines, Samoa, San Marino, Sao Tome & Principe, Saudi Arabia, Senegal, Serbia, Seychelles, Sierra Leone, Singapore, Slovakia, Slovenia, Solomon Islands, Somalia, South Africa, Spain, Sri Lanka, Sudan, Suriname, Swaziland, Sweden, Switzerland, Syria, Tajikistan, Tanzania, Thailand, Togo, Tonga, Trinidad & Tobago, Tunisia, Turkey, 
Turkmenistan, Tuvalu, Uganda, Ukraine, United Arab Emirates, United Kingdom, United States, Uruguay, Uzbekistan, Vanuatu, Vatican City, Venezuela, Vietnam, Yemen, Zambia, Zimbabwe"
countries = countries.split(",")
# Remove the space (or line break) left in front of each name by the split.
countries = [c.strip() for c in countries]
filename = "read.txt"
my_other_list = []
toParse = False
# The with block closes the file once the loop is done.
with open(filename, errors='ignore') as filehandle:
    for line in filehandle:
        line = line.strip()
        if line.startswith("Submitting Author:"):
            toParse = True
            continue
        elif line.startswith("Running Head:"):
            toParse = False
            continue
        elif toParse:
            # Substring test: record every country name found on this line.
            for c in countries:
                if c in line:
                    my_other_list.append(c)
EDIT SUMMARY
Adapted code to work on the text sample provided.
Fixed the list of countries (originally there were two commas after Costa Rica).
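One caveat with the substring test c in line: a short name such as "Niger" also matches inside "Nigeria". If that matters for your data, a whole-word match with re is one option. The sketch below is an assumption-laden variant, not the code above: countries_in_section is a hypothetical helper, and the sample lines are invented to mimic your file.

```python
import re


def countries_in_section(lines, countries,
                         start="Submitting Author:", end="Running Head:"):
    """Collect whole-word country matches between the start and end markers."""
    # Longest names first, so "Guinea-Bissau" is preferred over "Guinea".
    names = sorted(countries, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(n) for n in names) + r")(?!\w)"
    )
    found = []
    parsing = False
    for line in lines:
        line = line.strip()
        if line.startswith(start):
            parsing = True
        elif line.startswith(end):
            parsing = False
        elif parsing:
            found.extend(pattern.findall(line))
    return found


sample = [
    "France",              # before the section: ignored
    "Submitting Author:",
    "asdf, asdf (proxy)",
    "France",
    "Lagos, Nigeria",
    "France",
    "Running Head:",
    "Niger",               # after the section: ignored
]
result = countries_in_section(sample, ["France", "Niger", "Nigeria"])
print(result)
```

With the sample above, "Nigeria" is recorded once and "Niger" not at all, whereas the plain substring test would record both for the same line.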
I think your main problem is that, in oneCountries, the country-names are separated by comma+space, but you're only splitting on comma, so for instance the second entry in countries is " Albania", with a space in front. You need to change:
oneCountries.split(",")
to:
oneCountries.split(", ")
After that, it looks like there's enough useful stuff in your commented-out code to achieve what you want.
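A quick illustration of the difference, using a shortened, made-up version of the string:

```python
names = "Afghanistan, Albania, Algeria"

by_comma = names.split(",")
by_comma_space = names.split(", ")
# Splitting on "," and stripping each piece gives the same clean result.
stripped = [n.strip() for n in names.split(",")]

print(by_comma)
print(by_comma_space)
print("Albania" in by_comma, "Albania" in by_comma_space)
```

Splitting on the bare comma leaves a leading space on every name after the first, which is why exact comparisons against "Albania" fail.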
I am currently trying to extract countries from the rows of a data frame. Here is the code that I currently have:
import pandas as pd

# Each long string is split into adjacent literals, which Python joins into one.
l = [
    ['[Aydemir, Deniz\', \' Gunduz, Gokhan\', \' Asik, Nejla] Bartin '
     'Univ, Fac Forestry, Dept Forest Ind Engn, TR-74100 Bartin, '
     'Turkey\', \' [Wang, Alice] Lulea Univ Technol, Wood Technol, '
     'Skelleftea, Sweden', 1990],
    ['[Fang, Qun\', \' Cui, Hui-Wang] Zhejiang A&F Univ, Sch Engn, Linan '
     '311300, Peoples R China\', \' [Du, Guan-Ben] Southwest Forestry '
     'Univ, Kunming 650224, Yunnan, Peoples R China', 2005],
    ['[Blumentritt, Melanie\', \' Gardner, Douglas J.\', \' Shaler '
     'Stephen M.] Univ Maine, Sch Resources, Orono, ME USA\', \' [Cole, '
     'Barbara J. W.] Univ Maine, Dept Chem, Orono, ME 04469 USA', 2012],
    ['[Kyvelou, Pinelopi; Gardner, Leroy; Nethercot, David A.] Univ '
     'London Imperial Coll Sci Technol & Med, London SW7 2AZ, '
     'England', 1998]]
dataf = pd.DataFrame(l, columns=['Authors', 'Year'])
This is the data frame. And here is the code:
df = (dataf['Authors']
      .replace(r"\bUSA\b", "United States", regex=True)
      .apply(lambda x: geotext.GeoText(x).countries))
The problem was that GeoText didn't recognize "USA", but now I also saw that I need to change "England", "Scotland", "Wales" and "Northern Ireland" to "United Kingdom".
How can I extend .replace to achieve this?
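You can pass a dictionary of replacements to Series.replace together with regex=True. (The translate method of the Series.str accessor will not work here: like str.translate, it maps single characters, not whole words.)
dataf['Authors'].replace({
    r'\bUSA\b': 'United States',
    r'\bEngland\b': 'United Kingdom',
    r'\bScotland\b': 'United Kingdom',
    r'\bWales\b': 'United Kingdom',
    r'\bNorthern Ireland\b': 'United Kingdom'
}, regex=True)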
This worked for me. Here is the code:
replace_list = ['England', 'Scotland', 'Wales', 'Northern Ireland']
for check in replace_list:
    dataf['Authors'] = dataf['Authors'].str.replace(check, 'United Kingdom', regex=True)
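If the list of replacements grows, one pattern built from a dictionary may be easier to maintain than a loop. The sketch below uses only the standard re module; REPLACEMENTS and normalize_countries are hypothetical names, and the function could presumably be applied per row with dataf['Authors'].apply(normalize_countries).

```python
import re

# Mapping taken from the question; extend it as needed.
REPLACEMENTS = {
    "USA": "United States",
    "England": "United Kingdom",
    "Scotland": "United Kingdom",
    "Wales": "United Kingdom",
    "Northern Ireland": "United Kingdom",
}

# One pattern that matches any key as a whole word (longest keys first).
_keys = sorted(REPLACEMENTS, key=len, reverse=True)
_pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in _keys) + r")\b")


def normalize_countries(text):
    """Replace every key of REPLACEMENTS found in text with its value."""
    return _pattern.sub(lambda m: REPLACEMENTS[m.group(0)], text)


print(normalize_countries("Univ Maine, Dept Chem, Orono, ME 04469 USA"))
print(normalize_countries("London SW7 2AZ, England"))
```

The \b word boundaries keep "USA" from being replaced inside a longer token such as "USAID".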
Initially I had to create a function that receives a person's attributes and returns a structure that looks like this:
Team:
  Name: Real Madrid
  President:
    Name: Florentino Perez
    Age: 70
    Country: Spain
    Office: 001
  Coach:
    Name: Carlo Ancelotti
    Age: 55
    Country: Italy
    Office: 006
    Coach License: 456789545678
  Players:
    - Name: Cristiano Ronaldo
      Age: 30
      Country: Portugal
      Number: 7
      Position: Forward
      Golden Balls: 1
    - Name: Chicharito
      Age: 28
      Country: Mexico
      Number: 14
      Position: Forward
    - Name: James Rodriguez
      Age: 22
      Country: Colombia
      Number: 10
      Position: Midfielder
    - Name: Lucas Modric
      Age: 28
      Country: Croatia
      Number: 19
      Position: Midfielder
This structure also contains info about other clubs. I managed to do this with the following function:
def create_person(name, age, country, **kwargs):
    info = {"Name": name, "Age": age, "Country": country}
    # Merge any extra attributes (Office, Number, Position, ...) into the record.
    info.update(kwargs)
    return info
I used this function to create a list of nested dictionaries and display the right structure for each team. Example:
teams = [
    {
        "Club Name": "Real Madrid",
        "Club President": create_person("Florentino Perez", 70, "Spain", Office="001"),
        "Club's Coach": create_person("Carlo Ancelotti", 60, "Italy", Office="006", CoachLicense="456789545678"),
        "Players": {
            "Real_Player1": create_person("Cristiano Ronaldo", 30, "Portugal", Number="7", Position="Forward", GoldenBalls="1"),
            "Real_Player2": create_person("Chicharito", 28, "Mexico", Number="14", Position="Forward"),
            "Real_Player3": create_person("James Rodriguez", 22, "Colombia", Number="10", Position="Midfielder"),
            "Real_Player4": create_person("Lucas Modric", 28, "Croatia", Number="19", Position="Midfielder")
        }
    },
    {
        "Club Name": "Barcelona",
        "Club President": create_person("Josep Maria Bartolomeu", 60, "Spain", Office="B123"),
        "Club's Coach": create_person("Luis Enrique Martinez", 43, "Spain", Office="B405", CoachLicense="22282321231"),
        "Players": {
            "Barcelona_Player1": create_person("Lionel Messi", 28, "Argentina", Number="10", Position="Forward", GoldenBalls="3"),
            "Barcelona_Player2": create_person("Xavi Hernandez", 34, "Spain", Number="6", Position="Midfielder"),
            "Barcelona_Player3": create_person("Dani Alvez", 28, "Brasil", Number="22", Position="Defender"),
            "Barcelona_Player4": create_person("Gerard Pique", 29, "Spain", Number="22", Position="Defender")
        }
    }
]
Everything fine so far.
The part where I got stuck is this: create a function print_president that receives the team name and prints the following output:
Team: Real Madrid
President: Florentino Perez
Age: 70
Country: Spain
Office: 001
I could use a variable to display this, but I need a function and I don't know how to work around this. Please help!
When you're trying to solve a problem (or ask a question) first simplify as much as you can. Your print_president() function takes a team name and then prints various pieces of information about the team. Each team is a dictionary with various attributes. So a simplified version of the problem might look like this:
teams = [
    {
        'name': 'Real Madrid',
        'pres': 'Florentino',
    },
    {
        'name': 'Barcelona',
        'pres': 'Josep',
    },
]

def print_president(team_name):
    for t in teams:
        # Now, you finish the rest. What should we check here?
        ...

print_president('Barcelona')
I can't think of a way to do this with just a team name, as you will have to know which dict to look at. I think something like this:
def print_president(team):
    # A single parenthesized argument makes this print work on Python 2 and 3.
    print('Team: {team} President: {president} Age: {age} Country: {country} Office: {office}'.format(
        team=team['Club Name'],
        president=team['Club President']['Name'],
        age=team['Club President']['Age'],
        country=team['Club President']['Country'],
        office=team['Club President']['Office']
    ))
If you are thinking of looking through all the teams in the list, then pass in two arguments: teams_list and team_name:
def print_president(teams_list, team_name):
    for team in teams_list:
        # Match the team whose values include the requested name.
        if team_name in team.values():
            print('Team: {team} President: {president} Age: {age} Country: {country} Office: {office}'.format(
                team=team['Club Name'],
                president=team['Club President']['Name'],
                age=team['Club President']['Age'],
                country=team['Club President']['Country'],
                office=team['Club President']['Office']
            ))
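To get the multi-line layout shown in the question, the lookup and the formatting can be separated. This is only a sketch: format_president is a hypothetical helper, the "Team not found" message is an assumption about what you want for unknown names, and the teams list is trimmed to one club for brevity.

```python
def format_president(teams_list, team_name):
    """Return the president block for team_name, or None if no team matches."""
    for team in teams_list:
        if team["Club Name"] == team_name:
            pres = team["Club President"]
            return (
                "Team: {team}\n"
                "President: {name}\n"
                "Age: {age}\n"
                "Country: {country}\n"
                "Office: {office}"
            ).format(
                team=team["Club Name"],
                name=pres["Name"],
                age=pres["Age"],
                country=pres["Country"],
                office=pres["Office"],
            )
    return None


def print_president(teams_list, team_name):
    text = format_president(teams_list, team_name)
    print(text if text is not None else "Team not found: " + team_name)


# Trimmed sample data in the same shape as the teams list in the question.
teams = [
    {
        "Club Name": "Real Madrid",
        "Club President": {"Name": "Florentino Perez", "Age": 70,
                           "Country": "Spain", "Office": "001"},
    },
]

print_president(teams, "Real Madrid")
print_president(teams, "Valencia")
```

Comparing against team["Club Name"] directly, rather than searching team.values(), avoids accidental matches on other fields.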