Making acronym for every airport in Python list? - python

How do I create a new column, and write acronyms for each respective airport record using Python on a csv file?
I have a csv file of airports and I want the names of the airports to be in acronym form so that I can display them on a map more compactly, with the airport symbol showing what it is.
An example would be this sample list:
['Bradley Sky Ranch', 'Fire Island Airport', 'Palmer Municipal Airport']
into this: ['B.S.R', 'F.I.A.', 'P.M.A.']
Next, how would you put the '.' period punctuation between each acronym letter?
I think it would be + "." + or something with ".".join?
Lastly, it would be a bonus if there were a way to get rid of the word 'Airport', so that not every acronym ends with 'A'.
For example, something like .strip('Airport')... but it's not the main goal.
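To illustrate the idea, here is a small sketch of what I have in mind (the `drop` tuple is just my guess at which words should be filtered out):
```python
# Sketch: drop the word 'Airport' before building the acronym.
def acronym(name, drop=('Airport',)):
    words = [w for w in name.split() if w not in drop]
    return '.'.join(w[0].upper() for w in words)

names = ['Bradley Sky Ranch', 'Fire Island Airport', 'Palmer Municipal Airport']
print([acronym(n) for n in names])  # ['B.S.R', 'F.I', 'P.M']
```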
The numbered list below shows examples of code I have, but no coherent solution. Please take only what makes sense, and where it doesn't, I would like to learn more effective syntax!
[The original airport data is from the ESRI Living Atlas.] I have a new field/column called 'NameAbbrev' which I want to write the acronyms into, but I did this in ArcPro which contains essentially a black-box interface for calculating new fields.
Sidenote: Why am I posting to SO and not GeoNet if this is map-related? Please note that my goal is to use Python; I am not asking about ArcPy. I think the underlying principle is Python-based for operating on a CSV file (whereas ArcPy would operate on a feature class and you would have to use ESRI-designated functions). And SO reaches a wider audience of Python experts.
1) So far, I have come across how to turn a string into an acronym, which works great on a single string, not a list:
Creating acronyms in Python
acronym = "".join(word[0] for word in test.upper().split())
2) and I attempted to split the items in a list, or do readlines on a csv file, based on an example (not mine): AttributeError: 'list' object has no attribute 'split'
def getQuakeData():
    filename = input("Please enter the quake file: ")
    # Use with to make sure the file gets closed
    with open(filename, "r") as readfile:
        # no need for readlines; the file is already an iterable of lines
        # also, using generator expressions means no extra copies
        types = (line.split(",") for line in readfile)
        # iterate tuples, instead of two separate iterables, so no need for zip
        xys = ((type[1], type[2]) for type in types)
        for x, y in xys:
            print(x, y)

getQuakeData()
3) Also, I have been able to use pandas to print out just the column of airport names into a list:
import pandas
colnames = ['OBJECTID', 'POLYGON_ID', 'POLYGON_NM', 'NM_LANGCD', 'FEAT_TYPE', 'DETAIL_CTY', 'FEAT_COD', 'NAME_FIX', 'ORIG_FID', 'NameAbbrev']
data = pandas.read_csv(r'C:\Users\...\AZ_Airports_table.csv', names=colnames)
names = data.NAME_FIX.tolist()
print(names)
#Here is a sample of the list of airport names/print result.
#If you want a sample to demo guidance you could use these names:
#['NAME_FIX', 'Bradley Sky Ranch', 'Fire Island Airport', 'Palmer Municipal Airport', 'Kodiak Airport', 'Nome Airport', 'Kenai Municipal Airport', 'Iliamna Airport', 'Sitka Airport', 'Wrangell Airport', 'Sand Point Airport', 'Unalaska Airport', 'Adak Airport', 'Homer Airport', 'Cold Bay Airport']
4) I've also been able to use search cursor and writerow in the past, but I don't know how exactly to apply these methods. (unrelated example):
with open(outCsv, 'wb') as outputCsv:
    writer = csv.writer(outputCsv)
    writer.writerow(fields)  # writes header containing list of fields
    rows = arcpy.da.SearchCursor(fc, field_names=fields)
    for row in rows:
        writer.writerow(row)  # writes fc contents to output csv
    del rows
5) So, I have pieces, but I don't know how to put them all together or if they even fit together. This is my Frankenstein monster of a solution, but it is wrong because it is trying to look at each column!
def getAcronym():
    filename = r'C:\Users\...\AZ_Airports_table.csv'
    # Use with to make sure the file gets closed
    with open(filename, "r") as readfile:
        # no need for readlines; the file is already an iterable of lines
        # also, using generator expressions means no extra copies
        airport = (line.split(",") for line in readfile)
        # iterate tuples, instead of two separate iterables, so no need for zip
        abbreviation = "".join(word[0] for word in airport.upper().split())
        # could also try filter(str.isupper, line)
        print(abbreviation)

getAcronym()
Is there a simpler way to combine these ideas and get the acronym column I want? Or is there an alternative way?

This can be done quite simply using a list comprehension, str.join, and filter:
>>> data = ['Bradley Sky Ranch', 'Fire Island Airport', 'Palmer Municipal Airport']
>>> ['.'.join(filter(str.isupper, name)) for name in data]
['B.S.R', 'F.I.A', 'P.M.A']
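Since the end goal is a NameAbbrev column, the same one-liner can be applied per row with pandas; a sketch (the NAME_FIX column name is taken from the question, the data here is made up, and in practice you would `read_csv`/`to_csv` your own file):
```python
import pandas as pd

# Build the NameAbbrev column from NAME_FIX.
# In practice: df = pd.read_csv(r'AZ_Airports_table.csv')
df = pd.DataFrame({'NAME_FIX': ['Bradley Sky Ranch', 'Fire Island Airport']})
df['NameAbbrev'] = df['NAME_FIX'].apply(
    lambda name: '.'.join(filter(str.isupper, name)))
# Then save it back out: df.to_csv(r'AZ_Airports_table.csv', index=False)
print(df['NameAbbrev'].tolist())  # ['B.S.R', 'F.I.A']
```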

Shortest answer
You can iterate over each string in the list with a for loop, then add each result to a new list. It could be turned into a function if you desire.
airports = ['Bradley Sky Ranch', 'Fire Island Airport', 'Palmer Municipal Airport']
air_acronyms = []
for airport in airports:
    words = airport.split()
    letters = [word[0] for word in words]
    air_acronyms.append(".".join(letters))
print(air_acronyms)
output
['B.S.R', 'F.I.A', 'P.M.A']

I'm not sure I have properly understood what you want, but as far as I can tell you want to generate the acronym for each string in your list from the first character of every word. So what about my solution below, with a couple of loops? You can use a list comprehension, filter, or other cool Python functions to take this further. Let me know if I missed anything.
names = ['Bradley Sky Ranch', 'Fire Island Airport', 'Palmer Municipal Airport']
output = []
for name in names:
    res = ''
    for word in name.split(' '):
        res += word[0] + '.'
    output.append(res)
print(output)
Output:
['B.S.R.', 'F.I.A.', 'P.M.A.']

Related

Python: How to convert CSV file to lists of dictionaries without importing reader or external libraries

I need to convert a CSV file into a list of dictionaries without importing CSV or other external libraries for a project I am doing for class.
Attempt
I am able to get the keys from the header line, but when I try to extract the values it goes row by row instead of column by column, and starts in the wrong place. When I append it to the list, though, it goes back to starting at the right place. I am unsure how to connect the keys to the correct column in the list.
CSV file
This is the CSV file I am using, I am only using the descriptions portion up to the first comma.
I tried using a for loop in order to cycle through each key, but it seems to go row by row and I don't know how to change it.
If anybody could steer me in the right direction it would be very appreciated.
CSV sample - sample is not saving correctly but it has the three headers on top and then the three matching information below and so on.
(Code,Name,State)\n
(ACAD,Acadia National Park,ME)\n
(ARCH,Arches National Park,UT)\n
(BADL, Badlands National Park,SD)\n
I read your question, and I am posting code based on what I understood from it. You should learn to post your code in the question; it is an essential skill. Always open a file using a "with" block. I made a demo CSV file with two rows of records. The following code fetches all the rows as a list of dictionaries.
def readParksFile(fileName="national_parks.csv"):
    with open(fileName) as infile:
        column_names = infile.readline()
        keys = column_names.split(",")
        number_of_columns = len(keys)
        data = infile.readlines()
    # no infile.close() needed; the "with" block closes the file
    list_of_rows = []
    for row in data:
        list_of_rows.append(row.split(","))
    list_of_dictionaries = []
    for item in list_of_rows:
        row_as_a_dictionary = {}
        for i in range(number_of_columns):
            row_as_a_dictionary[keys[i]] = item[i]
        list_of_dictionaries.append(row_as_a_dictionary)
    for i in range(len(list_of_dictionaries)):
        print(list_of_dictionaries[i])
Output:
{'Code': 'cell1', 'Name': 'cell2', 'State': 'cell3', 'Acres': 'cell4', 'Latitude': 'cell5', 'Longitude': 'cell6', 'Date': 'cell7', 'Description\n': 'cell8\n'}
{'Code': 'cell11', 'Name': 'cell12', 'State': 'cell13', 'Acres': 'cell14', 'Latitude': 'cell15', 'Longitude': 'cell16', 'Date': 'cell17', 'Description\n': 'cell18'}
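The index loop above can also be compressed with zip, still with no imports; a minimal sketch (sample data inline, assuming no quoted fields containing commas):
```python
# Sketch: CSV text -> list of dicts with no imports, using zip
# to pair each header key with the matching cell in every row.
def read_csv_as_dicts(text):
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    keys = lines[0].split(',')
    return [dict(zip(keys, row.split(','))) for row in lines[1:]]

sample = "Code,Name,State\nACAD,Acadia National Park,ME\nARCH,Arches National Park,UT"
print(read_csv_as_dicts(sample))
```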
I would create a class whose constructor takes the fields from the first row of the CSV as properties. Then create an empty list to store the results. Open the file (open is built in, so I assume you can use it) and read it line by line, storing each line as a string. Use the split method with a comma as the delimiter and store the resulting list in a variable. Call your class's constructor for each line, using the indexes of that list, and append the new object to your list before reading the next line. This is probably not the easiest way to do it, but it doesn't use any external libraries (although, as others have mentioned, there is a built-in csv module).
Code:
#Class with constructor
class Park:
    def __init__(self, code, name, state):
        self.code = code
        self.name = name
        self.state = state

#Empty list for storing the park objects
parks = []

#Open file
parks_csv = open("parks.csv")

#Skip first line
lines = parks_csv.readlines()[1:]

#Read the rest of the lines
for line in lines:
    parkProperties = line.strip().split(",")
    newPark = Park(parkProperties[0], parkProperties[1], parkProperties[2])
    parks.append(newPark)

#Print the parks
#It would be easier to output this using the JSON library
#But since you said you can't use any libraries
for park in parks:
    print(f'{{code: {park.code}, name: {park.name}, state: {park.state}}}')

#Don't forget to close the file
parks_csv.close()
Output:
{code: ACAD, name: Acadia National Park, state: ME}
{code: ARCH, name: Arches National Park, state: UT}
{code: BADL, name: Badlands National Park, state: SD}

How do I read from a file consists of city names and coordinates/Populations and create functions to get the coordinates and population?

I'm using Python, and I have a file which has city names and information such as names, coordinates of the city and population of the city:
Youngstown, OH[4110,8065]115436
Yankton, SD[4288,9739]12011
966
Yakima, WA[4660,12051]49826
1513 2410
Worcester, MA[4227,7180]161799
2964 1520 604
Wisconsin Dells, WI[4363,8977]2521
1149 1817 481 595
How can I create a function to take the city name and return a list containing the latitude and longitude of the given city?
fin = open("miles.dat", "r")

def getCoordinates
    cities = []
    for line in fin:
        cities.append(line.rstrip())
    for word in line:
        print line.split()
That's what I have tried so far. How could I get the coordinates of a city by giving its name, and how can I return the words of each line rather than the letters?
Any help will be much appreciated, thanks all.
I am feeling generous since you responded to my comment and made an effort to provide more info....
Your code example isn't even runnable right now, but from a purely pseudocode standpoint, you have at least the basic concept of the first part right. Normally I would want to parse out the information using a regex, but I think giving you an answer with a regex is beyond what you already know and won't really help you learn anything at this stage. So I will try and keep this example within the realm of the tools with which you seem to already be familiar.
def getCoordinates(filename):
    '''
    Pass in a filename.
    Return a parsed dictionary in the form of:
    {
        city: [lat, lon]
    }
    '''
    cities = {}
    with open(filename, "r") as fin:
        for line in fin:
            # the distance-only lines have no '[', so skip them
            if '[' not in line:
                continue
            # this is going to split on the comma, and
            # only once, so you get the city, and the rest
            # of the line
            city, extra = line.split(',', 1)
            # we could do a regex, but again, I dont think
            # you know what a regex is and you seem to already
            # understand split. so lets just stick with that
            # this splits on the '[' and we take the right side
            part = extra.split('[')[1]
            # now take the remaining string and split off the left
            # of the ']'
            part = part.split(']')[0]
            # we end up with something like: '4660, 12051'
            # so split that string on the comma into a list
            latLon = part.split(',')
            # associate the city with the latlon in the dictionary
            cities[city] = latLon
    return cities
Even though I have provided a full code solution for you, I am hoping that it will be more of a learning experience with the added comments. Eventually you should learn to do this using the re module and a regex pattern.
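For comparison, the same parsing with the re module might look like the sketch below (the pattern assumes lines shaped exactly like the sample above, and it keys on the city name alone, as the split-based version does):
```python
import re

# City line: 'Youngstown, OH[4110,8065]115436'
# -> city name, then lat and lon inside the brackets.
LINE_RE = re.compile(r'^(?P<city>[^,\[]+),[^\[]*\[(?P<lat>\d+),(?P<lon>\d+)\]')

def get_coordinates(lines):
    cities = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # distance-only lines like '966' don't match
            cities[m.group('city')] = [int(m.group('lat')), int(m.group('lon'))]
    return cities

sample = ['Youngstown, OH[4110,8065]115436',
          '966',
          'Yankton, SD[4288,9739]12011']
print(get_coordinates(sample))
```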

Python: Parsing Multiple .txt Files into a Single .csv File?

I'm not very experienced with complicated large-scale parsing in Python, do you guys have any tips or guides on how to easily parse multiple text files with different formats, and combining them into a single .csv file and ultimately entering them into a database?
An example of the text files is as follows:
general.txt (Name -- Department (DEPT) Room # [Age]
John Doe -- Management (MANG) 205 [Age: 40]
Equipment: Laptop, Desktop, Printer, Stapler
Experience: Python, Java, HTML
Description: Hardworking, awesome
Mary Smith -- Public Relations (PR) 605 [Age: 24]
Equipment: Mac, PC
Experience: Social Skills
Description: fun to be around
Scott Lee -- Programmer (PG) 403 [Age: 25]
Equipment: Personal Computer
Experience: HTML, CSS, JS
Description: super-hacker
Susan Kim -- Programmer (PG) 504 [Age: 21]
Equipment: Desktop
Experience: Social Skills
Descriptions: fun to be around
Bob Simon -- Programmer (PG) 101 [Age: 29]
Equipment: Pure Brain Power
Experience: C++, C, Java
Description: never comes out of his room
cars.txt (a list of people who own cars by their department/room #)
Programmer: PG 403, PG 101
Management: MANG 205
house.txt
Programmer: PG 504
The final csv should preferably tabulate to something like:
Name | Division | Division Abbrevation | Equipment | Room | Age | Car? | House? |
Scott Lee Programming PG PC 403 25 YES NO
Mary Smith Public Rel. PR Mac, PC 605 24 NO NO
The ultimate goal is to have a database, where searching "PR" would return every row where a person's Department is "PR," etc. There's maybe 30 text files total, each representing one or more columns in a database. Some columns are short paragraphs, which include commas. Around 10,000 rows total. I know Python has built in csv, but I'm not sure where to start, and how to end with just 1 csv. Any help?
It looks like you're looking for someone who will solve a whole problem for you. Here I am :)
General idea is to parse general info to dict (using regular expressions), then append additional fields to it and finally write to CSV. Here's Python 3.x solution (I think Python 2.7+ should suffice):
import csv
import re

def read_general(fname):
    # Read general info to dict with 'PR 123'-like keys
    # Regexp that will split a row into a ready-to-use dict
    re_name = re.compile(r'''
        (?P<Name>.+)
        \ --\               # Separator + space
        (?P<Division>.+)
        \                   # Space
        \(
        (?P<Division_Abbreviation>.*)
        \)
        \                   # Space
        (?P<Id>\d+)
        \                   # Space
        \[Age:\             # Space at the end
        (?P<Age>\d+)
        \]
        ''', re.X)
    general = {}
    with open(fname, 'rt') as f:
        for line in f:
            line = line.strip()
            m = re_name.match(line)
            if m:
                # Name line, start new man
                man = m.groupdict()
                key = '%s %s' % (m.group('Division_Abbreviation'), m.group('Id'))
                general[key] = man
            elif line:
                # Non-empty lines: add values to the current man's dict
                key, value = line.split(': ', 1)
                man[key] = value
    return general

def add_bool_criteria(fname, field, general):
    # Append a field with YES/NO value
    with open(fname, 'rt') as f:
        yes_keys = set()
        # Phase one, gather all keys
        for line in f:
            line = line.strip()
            _, keys = line.split(': ', 1)
            yes_keys.update(keys.split(', '))
    # Fill data
    for key, man in general.items():  # iteritems() will be faster in Python 2.x
        man[field] = 'YES' if key in yes_keys else 'NO'

def save_csv(fname, general):
    with open(fname, 'wt') as f:
        # Gather field names
        all_fields = set()
        for value in general.values():
            all_fields.update(value.keys())
        # Write to csv
        w = csv.DictWriter(f, all_fields)
        w.writeheader()
        w.writerows(general.values())

def main():
    general = read_general('general.txt')
    add_bool_criteria('cars.txt', 'Car?', general)
    add_bool_criteria('house.txt', 'House?', general)
    from pprint import pprint
    pprint(general)
    save_csv('result.csv', general)

if __name__ == '__main__':
    main()
I wish you lot of $$$ for this ;)
Side note
CSV is history; you could use JSON for storage and further use, because it's simpler, more flexible, and human-readable.
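For instance, a minimal sketch of that idea, using the merged-dict shape from the answer above with made-up values:
```python
import json

# Dump the merged data to JSON instead of CSV (sample data made up).
general = {'PG 403': {'Name': 'Scott Lee', 'Age': '25', 'Car?': 'YES'}}
with open('result.json', 'w') as f:
    json.dump(general, f, indent=2)

# Reading it back is just as simple:
with open('result.json') as f:
    print(json.load(f))
```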
You just have a function which parses one file, and returns a list of dictionaries containing {'name': 'Bob Simon', 'age': 29, ...} etc. Then call this on each of your files, extending a master list. Then write this master list of dicts as a CSV file.
More elaborately:
First you need to parse the input files, you'd have a function which takes a file, and returns a list of "things".
def parse_txt(fname):
    f = open(fname)
    people = []
    # Here, parse f. Maybe using a while loop, calling
    # f.readline() until there is an empty line. Construct a
    # dictionary from each person's block, and append it to people
    return people
This returns something like:
people = [
    {'name': 'Bob Simon', 'age': 29},
    {'name': 'Susan Kim', 'age': 21},
]
Then, loop over each of your input files (maybe by using os.listdir, or optparse to get a list of args):
allpeople = []
for curfile in args:
    people = parse_txt(fname=curfile)
    allpeople.extend(people)
So allpeople is a long list of all the people from all files.
Finally you can write this to a CSV file using the csv module (this bit usually involves another function to reorganise the data into a format more compatible with the csv writer).
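That last step might look like this minimal sketch (field names and file name are made up):
```python
import csv

# Write the master list of dicts to a single CSV file.
allpeople = [
    {'name': 'Bob Simon', 'age': 29},
    {'name': 'Susan Kim', 'age': 21},
]
with open('combined.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'age'])
    writer.writeheader()
    writer.writerows(allpeople)
```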
I'll do it backwards, I'll start by loading all those house.txt and cars.txt each one into a dict, that could look like:
cars = {'MANG': [205], 'PG': [403, 101]}
Since you said to have like 30 of them, you could easily use a nested dict without making things too complicated:
data = {'house': {'PG': [504]}, 'cars': {...}}
Once the data dict will be complete, load general.txt and while building the dict for each employee (or whatever they are) do a dict look-up see if they have a house or not, or a car, etc..
For example for John Doe you'll have to check:
if 205 in data['cars'].get('MANG', []):
    # ...
and update his dict accordingly. Obviously you don't have to hard code all the possible look-ups, just build a couple of lists of the ['house', 'cars', ...] or something like that and iterate over it.
At the end you should have a big list of dict with all the info merged, so just write each one of them to a csv file.
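A hedged sketch of that merge step, with made-up sample data and field names:
```python
# Nested lookup dicts -> YES/NO flags per person (sample data made up).
data = {
    'cars': {'MANG': [205], 'PG': [403, 101]},
    'house': {'PG': [504]},
}
people = [
    {'name': 'John Doe', 'dept': 'MANG', 'room': 205},
    {'name': 'Susan Kim', 'dept': 'PG', 'room': 504},
]
# Iterate over the lookup names instead of hard-coding each one
for person in people:
    for field in ('cars', 'house'):
        owned = person['room'] in data[field].get(person['dept'], [])
        person[field] = 'YES' if owned else 'NO'
print(people)
```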
Best possible advise: Don't do that.
Your cars and house relations are, ummmm, interesting. Owning a house or a car is an attribute of a person or other entity (company, partnership, joint tenancy, tenancy in common, etc.). It is NOT an attribute of a ("division", room) combination. The first fact in your cars file is "A programmer in room 403 owns a car". What happens in the not unlikely event that there are 2 or more programmers in the same room?
The equipment shouldn't be in a list.
Don't record age, record date or year of birth.
You need multiple tables in a database, not 1 CSV file. You need to study a book on elementary database design.

Finding a small list of strings in a large list of strings (Python)

Hi I'm new to Python, so this may come across as a simple problem but I've been searching through Google many times and I can't seem to find a way to overcome it.
Basically I have a list of strings, taken from a CSV file. And I have another list of strings in a text file. My job is to see if the words from my text file are in the CSV file.
Let's say this is what the CSV file looks like (it's made up):
name,author,genre,year
Private Series,Kate Brian,Romance,2003
Mockingbird,George Orwell,Romance,1956
Goosebumps,Mary Door,Horror,1990
Geisha,Mary Door,Romance,2003
And let's say the text file looks like this:
Romance
2003
What I'm trying to do is, create a function which returns the names of a book which have the words "Romance" and "2003" in them. So in this case, it should return "Private Series" and "Geisha" but not "Mockingbird". But my problem is, it doesn't seem to return them. However when I change my input to "Romance" it returns all three books with Romance in them. I assume it's because "Romance 2003" aren't together because if I change my input to "Mary Door" both "Goosebumps" and "Geisha" show up. So how can I overcome this?
Also, how do I make my function case insensitive?
Any help would be much appreciated :)
import csv

def read_input(filename):
    f = open(filename)
    return csv.DictReader(f, delimiter=',')

def search_filter(src, term):
    term = term.lower()
    for s in src:
        if term in map(str.lower, s.values()):
            yield s

def query(src, terms):
    terms = terms.split()
    for t in terms:
        src = search_filter(src, t)
    return src

def print_query(q):
    for row in q:
        print row
I tried to split the logic into small, re-usable functions.
First, we have read_input which takes a filename and returns the lines of a CSV file as an iterable of dicts.
The search_filter filters a stream of results with the given term. Both the search term and the row values are changed to lowercase for the comparison to achieve case-independent matching.
The query function takes a query string, splits it into search terms and then makes a chain of filters based on the terms and returns the final, filtered iterable.
>>> src = read_input("input.csv")
>>> q = query(src, "Romance 2003")
>>> print_query(q)
{'genre': 'Romance', 'year': '2003', 'name': 'Private Series', 'author': 'Kate Brian'}
{'genre': 'Romance', 'year': '2003', 'name': 'Geisha', 'author': 'Mary Door'}
Note that the above solution only returns full-field matches. If you want partial matches as well, e.g. so that the search query "Roman 2003" still returns the rows above, you can use this alternative version of search_filter:
def search_filter(src, term):
    term = term.lower()
    for s in src:
        if any(term in v.lower() for v in s.values()):
            yield s

Searching CSV Files (Python)

I've made this CSV file up to play with.. From what I've been told before, I'm pretty sure this CSV file is valid and can be used in this example.
Basically I have this CSV file 'book_list.csv':
name,author,year
Lord of the Rings: The Fellowship of the Ring,J. R. R. Tolkien,1954
Nineteen Eighty-Four,George Orwell,1984
Lord of the Rings: The Return of the King,J. R. R. Tolkien,1954
Animal Farm,George Orwell,1945
Lord of the Rings: The Two Towers, J. R. R. Tolkien, 1954
And I also have this text file 'search_query.txt', whereby I put in keywords or search terms I want to search for in the CSV file:
Lord
Rings
Animal
I've currently come up with some code (with the help of stuff I've read) that allows me to count the number of matching entries. I then have the program write a separate CSV file 'results.csv' which just returns either 'Matching' or ' '.
The program then takes this 'results.csv' file and counts how many 'Matching' results I have and it prints the count.
import csv
import collections

f1 = file('book_list.csv', 'r')
f2 = file('search_query.txt', 'r')
f3 = file('results.csv', 'w')

c1 = csv.reader(f1)
c2 = csv.reader(f2)
c3 = csv.writer(f3)

input = [row for row in c2]

for booklist_row in c1:
    row = 1
    found = False
    for input_row in input:
        results_row = []
        if input_row[0] in booklist_row[0]:
            results_row.append('Matching')
            found = True
            break
        row = row + 1
    if not found:
        results_row.append('')
    c3.writerow(results_row)

f1.close()
f2.close()
f3.close()
d = collections.defaultdict(int)
with open("results.csv", "rb") as info:
    reader = csv.reader(info)
    for row in reader:
        for matches in row:
            matches = matches.strip()
            if matches:
                d[matches] += 1

results = [(matches, count) for matches, count in d.iteritems() if count >= 1]
results.sort(key=lambda x: x[1], reverse=True)
for matches, count in results:
    print 'There are', count, 'matching results'+'.'
In this case, my output returns:
There are 4 matching results.
I'm sure there is a better way of doing this and avoiding writing a completely separate CSV file.. but this was easier for me to get my head around.
My question is, this code that I've put together only returns how many matching results there are.. how do I modify it in order to return the ACTUAL results as well?
i.e. I want my output to return:
There are 4 matching results.
Lord of the Rings: The Fellowship of the Ring
Lord of the Rings: The Return of the King
Animal Farm
Lord of the Rings: The Two Towers
As I said, I'm sure there's a much easier way to do what I already have.. so some insight would be helpful. :)
Cheers!
EDIT: I just realized that if my keywords were in lower case, it won't work.. is there a way to avoid case-sensitivity?
Throw away the query file and get your search terms from sys.argv[1:] instead.
Throw away your output file and use sys.stdout instead.
Append matched booklist titles to a result_list. The result_row that you currently have has a rather misleading name. The count that you want is len(result_list). Print that. Then print the contents of result_list.
Convert your query words to lowercase once (before you start reading the input file). As you read each book_list row, convert its title to lowercase. Do your matching with the lowercase query words and the lowercase title.
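Putting that plan together might look like the sketch below (written for Python 3, unlike the Python 2 code in the question, and taking any iterable of CSV lines so the sample data can live inline; in a script you would pass an open file and get the terms from sys.argv[1:]):
```python
import csv

def search_books(lines, terms):
    # Lowercase the query words once, up front
    terms = [t.lower() for t in terms]
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    # Collect matched titles in a result list; its length is the count
    return [row[0] for row in reader
            if any(t in row[0].lower() for t in terms)]

sample = ['name,author,year',
          'Animal Farm,George Orwell,1945',
          'Lord of the Rings: The Two Towers,J. R. R. Tolkien,1954']
matches = search_books(sample, ['lord', 'animal'])
print('There are', len(matches), 'matching results.')
for title in matches:
    print(title)
```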
Overall plan:
Read in the entire book list csv into a dictionary of {title: info}.
Read in the questions csv. For each keyword, filter the dictionary:
[key for key, value in books.items() if "Lord" in key]
say. Do what you will with the results.
If you want, put the results in another csv.
If you want to deal with casing issues, try turning all the titles to lowercase ("FOO".lower()) when you store them in the dictionary.
