Loop and store coordinates - python

I have a copy of a dataframe that looks like this:
heatmap_df = test['coords'].copy()
heatmap_df
0 [(Manhattanville, Manhattan, Manhattan Communi...
1 [(Mainz, Rheinland-Pfalz, 55116, Deutschland, ...
2 [(Ithaca, Ithaca Town, Tompkins County, New Yo...
3 [(Starr Hill, Charlottesville, Virginia, 22903...
4 [(Neuchâtel, District de Neuchâtel, Neuchâtel,...
5 [(Newark, Licking County, Ohio, 43055, United ...
6 [(Mae, Cass County, Minnesota, United States o...
7 [(Columbus, Franklin County, Ohio, 43210, Unit...
8 [(Canaanville, Athens County, Ohio, 45701, Uni...
9 [(Arizona, United States of America, (34.39534...
10 [(Enschede, Overijssel, Nederland, (52.2233632...
11 [(Gent, Oost-Vlaanderen, Vlaanderen, België - ...
12 [(Reno, Washoe County, Nevada, 89557, United S...
13 [(Grenoble, Isère, Auvergne-Rhône-Alpes, Franc...
14 [(Columbus, Franklin County, Ohio, 43210, Unit...
Each row has this format with some coordinates:
heatmap_df[2]
[Location(Ithaca, Ithaca Town, Tompkins County, New York, 14853, United States of America, (42.44770298533052, -76.48085858627931, 0.0)),
Location(Chapel Hill, Orange County, North Carolina, 27515, United States of America, (35.916920469999994, -79.05664845999999, 0.0))]
I want to pull the latitude and longitudes from each row and store them as separate columns in the dataframe heatmap_df. I have this so far, but I suck at writing loops. My loop is not working recursively, it only prints out the last coordinates.
x = np.arange(start=0, stop=3, step=1)
for i in x:
point_i = (heatmap_df[i][0].latitude, heatmap_df[i][0].longitude)
i = i+1
point_i
(42.44770298533052, -76.48085858627931)
I am trying to make a heat map with all the coordinates using Folium. Can someone help please? Thank you

Python doesn't know what you are trying to do it's assuming you want to store the tuple value of (heatmap_df[i][0].latitude, heatmap_df[i][0].longitude) in the variable point_i for every iteration. So what happens is it is overwritten every time. You want to declare a list outside then loop the append a lists of the Lat and Long to it creating a List of List which can easily be a DF. Also, your loop in the example isn't recursive, Check this out for recursion
Try this:
x = np.arange(start=0, stop=3, step=1)
points = []
for i in x:
points.append([heatmap_df[i][0].latitude, heatmap_df[i][0].longitude])
i = i+1
print(points)

Related

select random pairs from remaining unique values in a list

Updated: Not sure I explained it well first time.
I have a scheduling problem, or more accurately, a "first come first served" problem. A list of available assets are assigned a set of spaces, available in pairs (think cars:parking spots, diners:tables, teams:games). I need a rough simulation (random) that chooses the first two to arrive from available pairs, then chooses the next two from remaining available pairs, and so on, until all spaces are filled.
Started using teams:games to cut my teeth. The first pair is easy enough. How do I then whittle it down to fill the next two spots from among the remaining available entities? Tried a bunch of different things, but coming up short. Help appreciated.
import itertools
import numpy as np
import pandas as pd
a = ['Georgia','Oregon','Florida','Texas'], ['Georgia','Oregon','Florida','Texas']
b = [(x,y) for x,y in itertools.product(*a) if x != y]
c = pd.DataFrame(b)
c.columns = ['home', 'away']
print(c)
d = c.sample(n = 2, replace = False)
print(d)
The first results is all possible combinations. But, once the first slots are filled, there can be no repeats. in example below, once Oregon and Georgia are slated in, the only remaining options to choose from are Forlida:Texas or Texas:Florida. Obviously just the sample function alone produces duplicates frequently. I will need this to scale up to dozens, then hundreds of entities:slots. Many thanks in advance!
home away
0 Georgia Oregon
1 Georgia Florida
2 Georgia Texas
3 Oregon Georgia
4 Oregon Florida
5 Oregon Texas
6 Florida Georgia
7 Florida Oregon
8 Florida Texas
9 Texas Georgia
10 Texas Oregon
11 Texas Florida
home away
3 Oregon Georgia
5 Oregon Texas
Not exactly sure what you are trying to do. But if you want to randomly pair your unique entities you can simply randomly order them and then place them in a 2-columns dataframe. I wrote this with all the US states minus one (Wyomi):
states = ['Alaska','Alabama','Arkansas','Arizona','California',
'Colorado','Connecticut','District of Columbia','Delaware',
'Florida','Georgia','Hawaii','Iowa','Idaho','Illinois',
'Indiana','Kansas','Kentucky','Louisiana','Massachusetts',
'Maryland','Maine','Michigan','Minnesota','Missouri',
'Mississippi','Montana','North Carolina','North Dakota',
'Nebraska','New Hampshire','New Jersey','New Mexico',
'Nevada','New York','Ohio','Oklahoma','Oregon',
'Pennsylvania','Rhode Island','South Carolina',
'South Dakota','Tennessee','Texas','Utah','Virginia',
'Vermont','Washington','Wisconsin','West Virginia']
a=states.copy()
random.shuffle(states)
c = pd.DataFrame({'home':a[::2],'away':a[1::2]})
print(c)
#Output
home away
0 West Virginia Minnesota
1 New Hampshire Louisiana
2 Nevada Florida
3 Alabama Indiana
4 Delaware North Dakota
5 Georgia Rhode Island
6 Oregon Pennsylvania
7 New York South Dakota
8 Maryland Kansas
9 Ohio Hawaii
10 Colorado Wisconsin
11 Iowa Idaho
12 Illinois Missouri
13 Arizona Mississippi
14 Connecticut Montana
15 District of Columbia Vermont
16 Tennessee Kentucky
17 Alaska Washington
18 California Michigan
19 Arkansas New Jersey
20 Massachusetts Utah
21 Oklahoma New Mexico
22 Virginia South Carolina
23 North Carolina Maine
24 Texas Nebraska
Not sure if this is exactly what you were asking for though.
If you need to schedule all the fixtures of the season, you can check this answer --> League fixture generator in python

Getting a word from a set in a dataframe?

I have a dataframe column 'address' with values like this in each row:
3466B, Jerome Avenue, The Bronx, Bronx County, New York, 10467, United States, (40.881836199999995, -73.88176324294639)
Jackson Heights 74th Street - Roosevelt Avenue (7), 75th Street, Queens, Queens County, New York, 11372, United States, (40.74691655, -73.8914737373454)
I need only to keep the value Bronx / Queens / Manhattan / Staten Island from each row.
Is there any way to do this?
Thanks in advance.
One option is this, assuming the values are always in the same place. Using .split(', ')[2]
"3466B, Jerome Avenue, The Bronx, Bronx County, New York, 10467, United States, (40.881836199999995, -73.88176324294639)".split(', ')[2]
If the source file is a CSV (Comma-separated values), I would have a look at pandas and pandas.read_csv('filename.csv') and leverage all the nice features that are in pandas.
If the values are not at the same position and you need only a is in set of values or not:
import pandas as pd
df = pd.DataFrame(["The Bronx", "Queens", "Man"])
df.isin(["Queens", "The Bronx"])
You could add a column, let's call it 'district' and then populate it like this.
import pandas as pd
df = pd.DataFrame({'address':["3466B, Jerome Avenue, The Bronx, Bronx County, New York, 10467, United States, (40.881836199999995, -73.88176324294639)",
"Jackson Heights 74th Street - Roosevelt Avenue (7), 75th Street, Queens, Queens County, New York, 11372, United States, (40.74691655, -73.8914737373454)"]})
districts = ['Bronx','Queens','Manhattan', 'Staten Island']
df['district'] = ''
for district in districts:
df.loc[df['address'].str.contains(district) , 'district'] = district
print(df)

Merge is not working on two dataframes of multi level index

First DataFrame : housing, This data Frame contains MultiIndex (State, RegionName) and some relevant values in other 3 columns.
State RegionName 2008q3 2009q2 Ratio
New York New York 499766.666667 465833.333333 1.072844
California Los Angeles 469500.000000 413900.000000 1.134332
Illinois Chicago 232000.000000 219700.000000 1.055985
Pennsylvania Philadelphia 116933.333333 116166.666667 1.006600
Arizona Phoenix 193766.666667 168233.333333 1.151773
Second DataFrame : list_of_university_towns, Contains the names of States and Some regions and has default numeric index
State RegionName
1 Alabama Auburn
2 Alabama Florence
3 Alabama Jacksonville
4 Arizona Phoenix
5 Illinois Chicago
Now the inner join of the two dataframes :
uniHousingData = pd.merge(list_of_university_towns,housing,how="inner",on=["State","RegionName"])
This gives no values in the resultant uniHousingData dataframe, while it should have the bottom two values (index#4 and 5 from list_of_university_towns)
What am I doing wrong?
I found the issue. There was space at the end of the string in the RegionName column of the second dataframe. used Strip() method to remove the space and it worked like a charm.

Pandas merging/filtering dataframes

I have two data frames,
the first is a list of the cities in europe that belong to the EU and
which country they're in:
cities_in_eu
country city
0 sweden stockholm
1 germany berlin
2 germany frankfurt
3 spain barcelona
4 spain madrid
5 france paris
...
assume the data goes on like this for many observations, with potentially
many observations of cities for each country.
the next data frame is all cities in europe, not exclusive
to belonging in the EU.
This data frame has information on the cities population:
cities_in_europe
country city population(100million)
sweden stockholm 2
germany berlin 8
germany frankfurt 5
spain barcelona 6
spain madrid 3
france paris 8
switzerland bern 1
russia moscow 6
...
(the numbers here are made up)
basically i want to test the difference in population between
EU cities and non-EU cities by filtering the data to only see
cities in/not in the EU.
Using only the data frame list of cities_in_eu, how would i
achieve this?
You could try this:
First, you will create a list of EU cities based on the cities_in_eu
EUcities = list(set(cities_in_eu.city))
Then you will create a table which contains all the population information of EU cities:
#create a list of booleans
filter = []
for city in cities_in_europe.city:
filter.append(True if city in EUcities else False)
filtered = pd.Series(filter)
#this one will remain only cities in EU
df_eu = cities_in_europe[filtered]
nonEU_filter = [not i for i in filter]
nonEU_filtered = pd.Series(nonEU_filter)
df_non_eu = cities_in_europe[nonEU_filtered]
There you go, now you have 2 df of EU cities with population and non-EU cities with population. Then you can do other stuff on these two

Robot FrameWork Collections - List comparison Issue

I am trying to compare two identical lists in Robot Framework . The code I am using is :
List Test
Lists Should Be Equal #{List_Of_States_USA} #{List_Of_States_USA-Temp}
and the lists are identical with the following values :
#{List_Of_States_USA} Alabama Alaska American Samoa Arizona Arkansas California Colorado
... Connecticut Delaware District of Columbia Florida Georgia Guam Hawaii
... Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana
... Maine Maryland Massachusetts Michigan Minnesota Mississippi Missouri
... Montana National Nebraska Nevada New Hampshire New Jersey New Mexico
... New York North Carolina North Dakota Northern Mariana Islands Ohio Oklahoma Oregon
... Pennsylvania Puerto Rico Rhode Island South Carolina South Dakota Tennessee Texas
... Utah Vermont Virgin Islands Virginia Washington West Virginia Wisconsin
... Wyoming
This test fails with the following error:
FAIL Keyword 'Collections.Lists Should Be Equal' expected 2 to 5 arguments, got 114.
I have searched SO and other sites for a solution, but could not figure out why this happened. Thanks in advance for support
You need to use a $ not #. When you use #, robot expands the lists into multiple arguments.
From the robot framework user's guide:
When a variable is used as a scalar like ${EXAMPLE}, its value will be used as-is. If a variable value is a list or list-like, it is also possible to use as a list variable like #{EXAMPLE}. In this case individual list items are passed in as arguments separately.
Consider the case of #{foo} being a list with the values "one", "two" and "three". In such as case the following two are identical:
some keyword #{foo}
some keyword one two three
You need to change your statement to this:
Lists Should Be Equal ${List_Of_States_USA} ${List_Of_States_USA-Temp}
So, As suggested by Bryan-Oakley above, I modified the test as follows:
${L1} Create List #{List_Of_States_USA}
${L2} Create List #{List_Of_States_USA-Temp}
Lists Should Be Equal ${L1} ${L2}
Now the test passed. Thanks Again # Brian

Categories

Resources