Pyspark AnalysisException Py4JJavaError on transformation withColumn() - python

I'm working with PySpark, using withColumn() to do a basic transformation on a DataFrame, namely updating the value of a column. I'm looking for some debugging assistance while I also study the problem.
PySpark raises a Py4JJavaError with an AnalysisException when I call withColumn().
The column I reference in withColumn('EVENT_NARRATIVE', ...) shows up in the Spark DataFrame as _c49='EVENT_NARRATIVE'.
from pyspark.sql.functions import *
from pyspark.sql.types import *
df = df.withColumn('EVENT_NARRATIVE', lower(col('EVENT_NARRATIVE')))
Py4JJavaError: An error occurred while calling o100.withColumn.
: org.apache.spark.sql.AnalysisException: cannot resolve '`EVENT_NARRATIVE`' given input columns: [_c3, _c17, _c40, _c21, _c48, _c12, _c39, _c18, _c31, _c10, _c45, _c26, _c5, _c43, _c24, _c33, _c9, _c14, _c1, _c16, _c47, _c20, _c46, _c32, _c22, _c7, _c2, _c42, _c37, _c36, _c30, _c8, _c38, _c23, _c25, _c13, _c29, _c41, _c19, _c44, _c11, _c28, _c6, _c50, _c49, _c0, _c15, _c4, _c34, _c27, _c35];;
'Project [_c0#604, _c1#605, _c2#606, _c3#607, _c4#608, _c5#609, _c6#610, _c7#611, _c8#612, _c9#613, _c10#614, _c11#615, _c12#616, _c13#617, _c14#618, _c15#619, _c16#620, _c17#621, _c18#622, _c19#623, _c20#624, _c21#625, _c22#626, _c23#627, ... 28 more fields]
+- Relation[_c0#604,_c1#605,_c2#606,_c3#607,_c4#608,_c5#609,_c6#610,_c7#611,_c8#612,_c9#613,_c10#614,_c11#615,_c12#616,_c13#617,_c14#618,_c15#619,_c16#620,_c17#621,_c18#622,_c19#623,_c20#624,_c21#625,_c22#626,_c23#627,... 27 more fields] csv
1 row of sample data from df.head():
[Row(_c0='BEGIN_YEARMONTH', _c1='BEGIN_DAY', _c2='BEGIN_TIME', _c3='END_YEARMONTH', _c4='END_DAY', _c5='END_TIME', _c6='EPISODE_ID', _c7='EVENT_ID', _c8='STATE', _c9='STATE_FIPS', _c10='YEAR', _c11='MONTH_NAME', _c12='EVENT_TYPE', _c13='CZ_TYPE', _c14='CZ_FIPS', _c15='CZ_NAME', _c16='WFO', _c17='BEGIN_DATE_TIME', _c18='CZ_TIMEZONE', _c19='END_DATE_TIME', _c20='INJURIES_DIRECT', _c21='INJURIES_INDIRECT', _c22='DEATHS_DIRECT', _c23='DEATHS_INDIRECT', _c24='DAMAGE_PROPERTY', _c25='DAMAGE_CROPS', _c26='SOURCE', _c27='MAGNITUDE', _c28='MAGNITUDE_TYPE', _c29='FLOOD_CAUSE', _c30='CATEGORY', _c31='TOR_F_SCALE', _c32='TOR_LENGTH', _c33='TOR_WIDTH', _c34='TOR_OTHER_WFO', _c35='TOR_OTHER_CZ_STATE', _c36='TOR_OTHER_CZ_FIPS', _c37='TOR_OTHER_CZ_NAME', _c38='BEGIN_RANGE', _c39='BEGIN_AZIMUTH', _c40='BEGIN_LOCATION', _c41='END_RANGE', _c42='END_AZIMUTH', _c43='END_LOCATION', _c44='BEGIN_LAT', _c45='BEGIN_LON', _c46='END_LAT', _c47='END_LON', _c48='EPISODE_NARRATIVE', _c49='EVENT_NARRATIVE', _c50='DATA_SOURCE'),
Row(_c0='201210', _c1='29', _c2='1600', _c3='201210', _c4='29', _c5='1922', _c6='68680', _c7='416744', _c8='NEW HAMPSHIRE', _c9='33', _c10='2012', _c11='October', _c12='High Wind', _c13='Z', _c14='12', _c15='EASTERN HILLSBOROUGH', _c16='BOX', _c17='29-OCT-12 16:00:00', _c18='EST-5', _c19='29-OCT-12 19:22:00', _c20='0', _c21='0', _c22='0', _c23='0', _c24='109.60K', _c25='0.00K', _c26='ASOS', _c27='55.00', _c28='MG', _c29=None, _c30=None, _c31=None, _c32=None, _c33=None, _c34=None, _c35=None, _c36=None, _c37=None, _c38=None, _c39=None, _c40=None, _c41=None, _c42=None, _c43=None, _c44=None, _c45=None, _c46=None, _c47=None, _c48='Sandy, a hybrid storm with both tropical and extra-tropical characteristics, brought high winds and coastal flooding to southern New England. Easterly winds gusted to 50 to 60 mph for interior southern New England; 55 to 65 mph along the eastern Massachusetts coast and along the I-95 corridor in southeast Massachusetts and Rhode Island; and 70 to 80 mph along the southeast Massachusetts and Rhode Island coasts. A few higher higher gusts occurred along the Rhode Island coast. A severe thunderstorm embedded in an outer band associated with Sandy produced wind gusts to 90 mph and concentrated damage in Wareham early Tuesday evening, |a day after the center of Sandy had moved into New Jersey. In general, moderate coastal flooding occurred along the Massachusetts coastline, and major coastal flooding impacted the Rhode Island coastline. The storm surge was generally 2.5 to 4.5 feet along the east coast of Massachusetts, but peaked late Monday afternoon in between high tide cycles. Seas built to between 20 and 25 feet Monday afternoon and evening just off the Massachusetts east coast. Along the south coast, the storm surge was 4 to 6 feet and seas from 30 to a little over 35 feet were observed in the outer coastal waters. 
The very large waves on top of the storm surge caused destructive coastal flooding along stretches of the Rhode Island exposed south coast. ||Sandy grew into a hurricane over the southwest Caribbean and then headed north across Jamaica, Cuba, and the Bahamas. As Sandy headed north of the Bahamas, the storm interacted with a vigorous weather system moving west to east across the United States and began to take on a hybrid structure. Strong high pressure over southeast Canada helped with the expansion of the strong winds well north of the center of Sandy. In essence, Sandy retained the structure of a hurricane near its center (until shortly before landfall) while taking on more of an extra-tropical cyclone configuration well away from the center. Sandy’s track was unusual. The storm headed northeast and then north across the western Atlantic and then sharply turned to the west to make landfall near Atlantic City, NJ during Monday evening. Sandy subsequently weakened and moved west across southern Pennsylvania on Tuesday before turning north and heading across western New York state into Quebec during Tuesday night and Wednesday.', _c49='The Automated Surface Observing System at Manchester-Boston Regional Airport (KMHT) recorded sustained wind speeds of 38 mph and gusts to 63 mph. In Manchester, a tree was downed on Harrison Street. In Hudson, a tree was downed on Lawrence Road, bringing down wires that sparked a fire that damaged a house. In Merrimack, a tree was downed, taking down wires and closing Amherst Road from Meetinghouse Road to Riverside Drive. In Nashua, a tree was downed onto a house on Broad Street, near the Hollils line. No structural damage was found. Numerous trees were downed, blocking roads.', _c50='CSV')

The column names are in the form of _c followed by a number, presumably because you did not specify header=True when reading the input file. You can do
df = spark.read.csv('filepath', header=True)
So that the column names will be BEGIN_YEARMONTH, BEGIN_DAY, ... etc, instead of _c0, _c1, ..., and then your withColumn code should work.
You can also consider adding inferSchema=True to ensure that the data types are suitable.
Of course, you can also stick with your current code, and do
df2 = df.withColumn('_c49', lower(col('_c49')))
But that's not a good long-term solution. Column names should be sensible, and you also don't want the header to be one of the rows in your dataframe.

Related

Selecting random row from dataframe in python based on certain conditions

Hi this is probably a very basic fix but I am just completely stuck and don't know enough about Python to figure out how to go about this myself. I made a dictionary of restaurants in my city and created a data frame of them. The whole program is just supposed to pick a random restaurant out of the dataframe. However, I want it to be able to select random restaurants based on certain things. For instance, "Cuisine" is a category and I want it to be able to select a random restaurant(row) based on cuisine being Mexican. I hope that makes sense because I am very lost.
my code is also below but there is not much to it
import pandas as pd
# Define a dictionary containing restaurant data
data = {'Restaurant':['August Henrys Burger Bar', 'Bridges & Bourbon', 'The Capital Grille', 'Chinatown Inn', 'Chipotle','Condado Tacos','Crafted North','Cristos Mediterranean Grille','Five Guys','Forbes Tavern','Freshii','Genoa Pizza & Bar','Giovannis Pizza & Pasta','Hello Bistro','Joe and Pie Cafe & Pizzeria','Las Velas','Mandarin Gourmet','McCormick and Schmick','Moes Southwest Grill','Nickys Thai Kitchen','Noodles & Company','The Original Oyster House','Pizza Parma','Primanti Bros','Siam Thai Restaurant','The Simple Greek','SlyFox Taphouse','SoFresh','Villa Reale Pizzeria & Restaurant','The Warren','The Yard'],
'Cuisine':['American', 'American', 'American','Asian', 'Mexican','Mexican','American','Mediterranean','American','American','American','Italian','Italian','American','Italian','Mexican','Asian','American','Mexican','Asian','American','American','Italian','American','Asian','Mediterranean','American','American','Italian','American','American'],
'Address':['946 Penn Avenue 412-765-3270', '930 Penn Avenue 412-586-4287', '301 Fifth Avenue 412-338-9100', '522 Third Avenue 412-261-1291', '211 Forbes Avenue 412-224-5586',',971 Liberty Avenue 412-281-9111','Marriott City Center 412-471-4000','130 6th Street 412-261-6442','Three PPH PLace 412-227-0206','310 Forbes Avenue 412-281-1999','501 Grant Street 412-430-0318','111 Market Street 412-281-6100','123 6th Street 412-281-7060','292 Forbes Avenue 412-434-0100','955 Liberty Avenue 412-738-0603','21 Market Square 412-251-0031','305 Wood Street 412-261-6151','301 Fifth Avenue 412-201-6992','210 Forbes Avenue 412-224-4422','903 Penn Avenue 412-471-8424','476 McMasters Way 412-562-2191','20 Market Square 412-566-7925','963 Liberty Avenue 412-577-7300','2 Market Square 412-261-1599','410 First Avenue 412-281-1122','4313 Market Street 412-261-4976','300 Liberty Avenue 412-586-7474','Five PPG Place Suite 100 412-586-7240','628 Smithfield Street 412-391-3963','245 7th Street 412-201-5888','100 Fifth Avenue 412-291-8182'],
'Operation':['Local', 'Local', 'Franchise', 'Local', 'Franchise','Franchise','Franchise','Local','Franchise','Local','Franchise','Local','Local','Franchise','Franchise','Local','Local','Franchise','Franchise','Local','Franchise','Franchise','Franchise','Franchise','Local','Local','Franchise','Local','Local','Local','Franchise']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
You can first filter by Cuisine and then use sample to pick a random row:
df.loc[df.Cuisine=='Mexican'].sample(1)
              Restaurant  Cuisine                         Address  Operation
18  Moes Southwest Grill  Mexican  210 Forbes Avenue 412-224-4422  Franchise
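The same filter-then-sample idea extends to several conditions at once. The following is only a sketch: pick_random is a hypothetical helper name, and the small DataFrame reuses a few rows from the question rather than the full data.

```python
import pandas as pd

def pick_random(df, random_state=None, **filters):
    """Return one random row whose columns equal the given values.

    pick_random is a hypothetical helper, not part of pandas.
    """
    # Start with every row selected, then narrow down one condition at a time.
    mask = pd.Series(True, index=df.index)
    for column, value in filters.items():
        mask &= df[column] == value
    matches = df[mask]
    if matches.empty:
        raise ValueError(f"no rows match {filters}")
    # sample(n=1) picks one row at random; random_state makes it repeatable.
    return matches.sample(n=1, random_state=random_state)

# A few rows taken from the question's data.
restaurants = pd.DataFrame({
    "Restaurant": ["Chipotle", "Condado Tacos", "Five Guys", "Las Velas"],
    "Cuisine": ["Mexican", "Mexican", "American", "Mexican"],
    "Operation": ["Franchise", "Franchise", "Franchise", "Local"],
})

# Only one row is both Mexican and Local, so this choice is deterministic.
choice = pick_random(restaurants, Cuisine="Mexican", Operation="Local")
print(choice)
```

Calling pick_random(restaurants, Cuisine="Mexican") instead returns any one of the three Mexican rows, and raising on an empty match avoids the less obvious error that sample gives when there is nothing to sample from.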

Pandas : Create a new dataframe from 2 different dataframes using fuzzy matching [duplicate]

I have two data frames with each having a different number of rows. Below is a couple rows from each data set
df1 =
Company                              City       State  ZIP
FREDDIE LEES AMERICAN GOURMET SAUCE  St. Louis  MO     63101
CITYARCHRIVER 2015 FOUNDATION        St. Louis  MO     63102
GLAXOSMITHKLINE CONSUMER HEALTHCARE  St. Louis  MO     63102
LACKEY SHEET METAL                   St. Louis  MO     63102
and
df2 =
FDA Company                    FDA City    FDA State  FDA ZIP
LACKEY SHEET METAL             St. Louis   MO         63102
PRIMUS STERILIZER COMPANY LLC  Great Bend  KS         67530
HELGET GAS PRODUCTS INC        Omaha       NE         68127
ORTHOQUEST LLC                 La Vista    NE         68128
I joined them side by side using combined_data = pandas.concat([df1, df2], axis = 1). My next goal is to compare each string in df1['Company'] to each string in df2['FDA Company'] using several different matching functions from the fuzzywuzzy module, and to return the best match and its score. I want to store that in a new column. For example, if I ran fuzz.ratio and fuzz.token_sort_ratio on LACKEY SHEET METAL in df1['Company'] against df2['FDA Company'], it would return that the best match was LACKEY SHEET METAL with a score of 100, and this would then be saved in a new column in combined_data. The result would look like
combined_data =
Company City State ZIP FDA Company FDA City FDA State FDA ZIP fuzzy.token_sort_ratio match fuzzy.ratio match
FREDDIE LEES AMERICAN GOURMET SAUCE St. Louis MO 63101 LACKEY SHEET METAL St. Louis MO 63102 LACKEY SHEET METAL 100 LACKEY SHEET METAL 100
CITYARCHRIVER 2015 FOUNDATION St. Louis MO 63102 PRIMUS STERILIZER COMPANY LLC Great Bend KS 67530
GLAXOSMITHKLINE CONSUMER HEALTHCARE St. Louis MO 63102 HELGET GAS PRODUCTS INC Omaha NE 68127
LACKEY SHEET METAL St. Louis MO 63102 ORTHOQUEST LLC La Vista NE 68128
I tried doing
combined_data['name_ratio'] = combined_data.apply(lambda x: fuzz.ratio(x['Company'], x['FDA Company']), axis = 1)
But got an error because the lengths of the columns are different.
I am stumped. How I can accomplish this?
I couldn't tell what you were doing. This is how I would do it.
import pandas as pd
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
Create a series of tuples to compare:
compare = pd.MultiIndex.from_product([df1['Company'],
                                      df2['FDA Company']]).to_series()
Create a special function to calculate fuzzy metrics and return a series.
def metrics(tup):
    return pd.Series([fuzz.ratio(*tup),
                      fuzz.token_sort_ratio(*tup)],
                     ['ratio', 'token'])
Apply metrics to the compare series
compare.apply(metrics)
There are a bunch of ways to do this next part:
Get closest matches to each row of df1
compare.apply(metrics).unstack().idxmax().unstack(0)
Get closest matches to each row of df2
compare.apply(metrics).unstack(0).idxmax().unstack(0)
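For readers who want to see the "best match per company" step spelled out without pandas reshaping, here is a rough sketch. It swaps in difflib.SequenceMatcher from the standard library as a stand-in for fuzz.ratio, so the scores may not be identical to fuzzywuzzy's; similarity and best_match are hypothetical helper names, and the lists reuse the company names from the question.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity of two strings as an integer score from 0 to 100.

    Stand-in for fuzz.ratio; the exact scores may differ from fuzzywuzzy.
    """
    return round(100 * SequenceMatcher(None, a, b).ratio())

def best_match(name, candidates):
    """Return (candidate, score) for the candidate most similar to name."""
    return max(((c, similarity(name, c)) for c in candidates),
               key=lambda pair: pair[1])

# Company names copied from the question's df1 and df2.
companies = [
    "FREDDIE LEES AMERICAN GOURMET SAUCE",
    "CITYARCHRIVER 2015 FOUNDATION",
    "GLAXOSMITHKLINE CONSUMER HEALTHCARE",
    "LACKEY SHEET METAL",
]
fda_companies = [
    "LACKEY SHEET METAL",
    "PRIMUS STERILIZER COMPANY LLC",
    "HELGET GAS PRODUCTS INC",
    "ORTHOQUEST LLC",
]

# Compare every company against every FDA company and keep the best pair.
matches = {name: best_match(name, fda_companies) for name in companies}
for name, (match, score) in matches.items():
    print(f"{name} -> {match} ({score})")
```

Because every name in the first list gets exactly one best match, the result has the same length as df1 and could be assigned back as new columns there, which sidesteps the length mismatch from the side-by-side concat.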

How to convert text article with keywords to pandas data frame

I have about 5,000 text files similar to the one below, and I want to extract the article text into one df column and the keywords, as a list, into another df column. I need this to have more training data.
In the sample below, the article I want to extract is everything from 'Addis Abeba' to 'private bank', and the keywords are all the terms after 'SUBJECT', without the percentages in brackets.
Sample of the dataset:
Addis Fortune
February 2011
Declaration? AU Action Needed in Favour of Democracy [opinion]
LENGTH: 692 words
Addis Abeba has been hosting delegates and heads of state for the AU Summit. It
is encouraging to see leaders of Africa discussing issues of continental
importance that accelerate the process of integration and thereby put Africa in
a better bargaining position in its relations with the outside world.
Indeed, "United We Stand, Divided We Fall."
It is time that the AU took a bold step to ensure that the leaders of the
continent win the hearts and minds of their citizens. It should ensure the
existence of democratic governments, which, at a minimum, guarantee popular
participation based on an acceptance of political equality among all citizens,
respect for civil liberties, and meaningful checks and balances on the power of
the executive.
This is also indispensable to the realisation of the age-old dream of the
formation of the United States of Africa. Donor countries and organisations also
have moral obligations to extend much needed support in this aspect.
Dawit Haile is a loan officer at a private bank.
SUBJECT: HEADS OF STATE & GOVERNMENT (90%); ELECTIONS (90%); INTERNATIONAL
ASSISTANCE (89%); INTERNATIONAL RELATIONS (73%); GROSS DOMESTIC PRODUCT (70%);
ECONOMIC NEWS (70%); EMBEZZLEMENT (68%); ELECTION FRAUD (68%) Ethiopia;
International Organizations and Africa
GEOGRAPHIC: AFRICA (96%); EGYPT (93%); UNITED STATES (93%); CHINA (92%);
ETHIOPIA (79%); TUNISIA (79%); ISRAEL (79%) Africa
LOAD-DATE: February 8, 2011
LANGUAGE: ENGLISH
PUBLICATION-TYPE: Newspaper
Copyright 2011 AllAfrica Global Media.
All Rights Reserved
2 of 1352 DOCUMENTS
Addis Fortune
February 2011
Gebrekidan Beyene's Prosecutors Repeat Request for 25 Years
BYLINE: Eden Sahle
LENGTH: 815 words
During the appeals hearing last week of Gebrekidan Beyene, a.k.a. Morocco,
general manager and a shareholder of a private limited company by the same name,
prosecutors of the Ethiopian Revenues and Customs Authority (ERCA) requested
almost the same sentence they originally had, in August 2010: a maximum jail
term and confiscation of properties.
However, the lower court's decision to mitigate the sentence was correct and the
Appeals Bench should release Gebrekidan, either as a free man or on parole, the
defence argued. His good behaviour in prison and the investment he had made in
his country should be counted as mitigating circumstances, the lawyer claimed,
also counting the defendant's poor health in mitigation. The case was adjourned
for a verdict until May 2, 2011.
An alleged similar offence involving money laundering and loan sharking against
Ayalew Tesema, board chairman and major shareholder of Ayat Real Estate, is
underway at the Federal High Court.
SUBJECT: LITIGATION (91%); JUSTICE DEPARTMENTS (90%); BANKING & FINANCE (90%);
EXCISE & CUSTOMS (90%); LIMITED LIABILITY COMPANIES (90%); SENTENCING (90%);
APPEALS (89%); LAW COURTS & TRIBUNALS (89%); JAIL SENTENCING (89%); LAWYERS
(89%); VERDICTS (89%); SUPREME COURTS (89%); FINES & PENALTIES (89%);
SETTLEMENTS & DECISIONS (78%); CRIMINAL CONVICTIONS (78%); DECISIONS & RULINGS
(78%); PRISONS (77%); SUITS & CLAIMS (77%); VALUE ADDED TAX (77%); JUDGES (73%);
INCOME TAX (72%); MONEY LAUNDERING (69%); COUNTERFEITING (68%); INTEREST RATES
(55%); ECONOMIC NEWS (55%) Ethiopia; Legal and Judicial Affairs
GEOGRAPHIC: MOROCCO (90%)
LOAD-DATE: March 1, 2011
LANGUAGE: ENGLISH
PUBLICATION-TYPE: Newspaper
My expected result would be:
df
content keywords
1 'string article 1' [HEADS OF STATE & GOVERNMENT, ELECTIONS, ...]
2 'string article 2' [LITIGATION, JUSTICE DEPARTMENTS, ...]
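One possible way to approach this is with regular expressions. The sketch below rests on assumptions read off the sample only: each article body starts right after the "LENGTH: N words" line, the keyword block starts at "SUBJECT:" and runs until the next upper-case field label such as "GEOGRAPHIC:", and documents are separated by "N of M DOCUMENTS" lines. parse_documents is a hypothetical helper name, and the sample text is abridged from the excerpt above.

```python
import re

# Assumed document separator, e.g. "2 of 1352 DOCUMENTS".
DOC_SPLIT = re.compile(r"^\s*\d+ of \d+ DOCUMENTS\s*$", re.MULTILINE)
# Body sits between the LENGTH line and SUBJECT; the subject block ends at
# the next line that starts with an upper-case label like "GEOGRAPHIC:".
ARTICLE = re.compile(
    r"LENGTH:\s*\d+\s+words\s*\n(?P<content>.*?)\n\s*SUBJECT:\s*"
    r"(?P<subject>.*?)(?=\n\s*[A-Z][A-Z-]+:|\Z)",
    re.DOTALL,
)
# A keyword is the text directly in front of a "(NN%)" marker.
KEYWORD = re.compile(r"([^;()]+?)\s*\(\d+%\)")

def parse_documents(text):
    """Return a list of {'content': str, 'keywords': list} dicts."""
    rows = []
    for chunk in DOC_SPLIT.split(text):
        found = ARTICLE.search(chunk)
        if found is None:
            continue  # chunk without a LENGTH/SUBJECT pair
        # Collapse the hard line wraps into single spaces.
        content = " ".join(found.group("content").split())
        subject = " ".join(found.group("subject").split())
        keywords = [k.strip() for k in KEYWORD.findall(subject)]
        rows.append({"content": content, "keywords": keywords})
    return rows

sample = """Addis Fortune
February 2011
Declaration? AU Action Needed in Favour of Democracy [opinion]
LENGTH: 692 words
Addis Abeba has been hosting delegates and heads of state for the AU Summit.
Dawit Haile is a loan officer at a private bank.
SUBJECT: HEADS OF STATE & GOVERNMENT (90%); ELECTIONS (90%); INTERNATIONAL
ASSISTANCE (89%) Ethiopia;
International Organizations and Africa
GEOGRAPHIC: AFRICA (96%); EGYPT (93%)
LOAD-DATE: February 8, 2011
2 of 1352 DOCUMENTS
Addis Fortune
February 2011
Gebrekidan Beyene's Prosecutors Repeat Request for 25 Years
BYLINE: Eden Sahle
LENGTH: 815 words
The case was adjourned
for a verdict until May 2, 2011.
SUBJECT: LITIGATION (91%); JUSTICE DEPARTMENTS (90%); LAWYERS
(89%) Ethiopia; Legal and Judicial Affairs
GEOGRAPHIC: MOROCCO (90%)
"""

rows = parse_documents(sample)
for row in rows:
    print(row["keywords"])
# The list of dicts can be handed to pandas: pd.DataFrame(rows)
```

Terms without a percentage, such as "Ethiopia" or "Legal and Judicial Affairs", are dropped by this pattern; if they should count as keywords too, the subject block would need to be split differently.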

How to use python beautifulsoup to get image description from html?

I did not find an answer elsewhere, so I am asking for your help.
I have Python code that accesses http://news.yahoo.com/rss/entertainment
to get the titles and descriptions, but some of the description text is stored in image alt/title attributes.
This is my code:
for child in body_tag.contents[0].channel.children:
    if child.__class__ != NavigableString:
        if child.title is not None:
            print("------title----------")
            print(child.title.contents[0].encode('ascii','ignore'))
            print("-----description-class------------")
            mchild = child.find_next("description").contents[0]
            print(mchild.__class__)
            print("-------description---------")
            print(mchild.find_next("img"))
            print(mchild.encode('ascii','ignore'))
            print("-------end---------")
This is part of the output:
------title----------
University of Connecticut revokes Cosby's honorary degree
-----description-class------------
class 'bs4.element.NavigableString'
-------description---------
None
To display it here, I replaced "<" and ">" with "(" and ")":
(p) (a href="http://news.yahoo.com/university-connecticut-revokes-cosbys-honorary-degree-155552959.html")
(img src="http://l.yimg.com/bt/api/res/1.2/cjgCZP4YBj7M6SmdpoGj.Q--/YXBwaWQ9eW5ld3NfbGVnbztmaT1maWxsO2g9ODY7cT03NTt3PTEzMA--/http://media.zenfs.com/en_us/News/ap_webfeeds/7b35f971ec59428491aef6308db4567e.jpg" width="130" height="86" alt="FILE - In this May 24, 2016 file photo, Bill Cosby departs the Montgomery County Courthouse after a preliminary hearing, in Norristown, Pa. A 72-year-old New Hampshire woman who says Bill Cosby raped her in 1965 has withdrawn her civil defamation lawsuit against the comedian after a federal judge had allowed the case to move forward. (AP Photo/Matt Rourke, File)" align="left" title="FILE - In this May 24, 2016 file photo, Bill Cosby departs the Montgomery County Courthouse after a preliminary hearing, in Norristown, Pa. A 72-year-old New Hampshire woman who says Bill Cosby raped her in 1965 has withdrawn her civil defamation lawsuit against the comedian after a federal judge had allowed the case to move forward. (AP Photo/Matt Rourke, File)" border="0" /((/a)STORRS, Conn. (AP) The University of Connecticut on Wednesday revoked an honorary degree awarded to Bill Cosby, saying he engaged in conduct "incongruent" with the university's values.(/p((br) clear="all"/)
-------end---------
-------end---------
How can I get the title inside the img tag:
title="FILE - In this May 24, 2016 file photo,
I tried to find_next("img") and others but I couldn't get them.
So you want all the text from the description plus the title from any img tags. You can find all the description tags, turn each description's text into a BeautifulSoup object, then look for the img in that and try to pull either its title or alt attribute. To find the matching title, look up the previous title tag before the description tag:
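from bs4 import BeautifulSoup

# soup is the BeautifulSoup object for the downloaded RSS feed
for desc in soup.find_all("description"):
    # the description holds HTML, so parse it again
    d = BeautifulSoup(desc.text, "lxml")
    img = d.find("img")
    print("Title = {}".format(desc.find_previous("title").text))
    # prefer the img title, fall back to alt, or "" when there is no img
    img_text = img.get("title") or img.get("alt", "") if img else ""
    print("Decscription = {}\n".format(d.find(text=True) + img_text))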
Which gives you:
Title = Entertainment News Headlines — Yahoo! News
Decscription = Get the latest entertainment news headlines from Yahoo! News. Find breaking entertainment news, including analysis and opinion on top entertainment stories.
Title = Spotify's Top 10 most streamed tracks
Decscription = The following list represents the most streamed tracks on Spotify, based on the number of people who shared it divided by the number who listened to it, from Monday, Oct. 20 to Sunday Oct. 26 via Facebook, Tumblr, Twitter and Spotify.FILE - In this Sept. 7, 2012 file photo, musician Robin Thicke performs during Macy's Passport presents Glamorama 2012 at The Orpheum Theatre in Los Angeles. Thicke's "Blurred Lines (feat. T.I. & Pharrell)" was the top streamed tracks on Spotify from Monday, June 10, to Sunday, June 16, 2013. (Photo by Matt Sayles/Invision/AP, File)
Title = Who will win at the Tony Awards? AP predicts
Decscription = NEW YORK (AP) — The great comedian W.C. Fields is credited with the line, "Never work with children or animals." He would have had trouble on Broadway this season.This theater image released by The O+M Company shows the cast during a performance of the musical "Kinky Boots." The Cyndi Lauper-scored "Kinky Boots," based on the 2005 British movie about a real-life shoe factory that struggles until it finds new life in fetish footwear, is nominated for 13 Tony Award nominations. The awards will be broadcast on CBS from Radio City Music Hall on June 9. (AP Photo/The O+M Company, Matthew Murphy)
Title = The top iPhone and iPad apps on App Store
Decscription = App Store Official Charts for the week ending November 3, 2014:
Title = Fairey: 'Vindicated' by dismissal of Detroit tagging case
Decscription = DETROIT (AP) — Graffiti artist Shepard Fairey says he feels "relieved and vindicated" now that a malicious destruction of property case in Detroit has been dismissed.
Title = FBI seeks Rockwell painting on 40th anniversary of its theft
Decscription = CHERRY HILL, N.J. (AP) — Federal authorities are seeking the public's help in recovering a 1919 Norman Rockwell painting on the 40th anniversary of its theft from a New Jersey home.
Title = APNewsBreak: Union has deal with 4th Atlantic City casino
Decscription = ATLANTIC CITY, N.J. (AP) — Atlantic City's main casino workers union reached agreement Thursday with four of the five casinos it had been targeting for a strike this weekend.Union members cheer as they discuss preparations for a strike against as many as five of the city's eight casinos in Atlantic City, N.J. on Wednesday June 29, 2016. Local 54 of the Unite-HERE union says it will go on strike Friday if it can't reach new contracts with three casinos owned by Caesars Entertainment (Bally's, Caesars and Harrah's) and two casinos owned by billionaire investor Carl Icahn (the Tropicana and the Trump Taj Mahal). About 6,500 of the union's nearly 10,000 workers are at the five hotels. (AP Photo/Wayne Parry)
Title = The Latest: APNewsBreak: Union has deal with 4th casino
Decscription = The Latest on contract negotiations with casinos (all times local): 4:35 p.m. Atlantic City's main casino workers union has reached agreement with the fourth of five casinos it had been targeting for a ...Union members cheer as they discuss preparations for a strike against as many as five of the city's eight casinos in Atlantic City, N.J. on Wednesday June 29, 2016. Local 54 of the Unite-HERE union says it will go on strike Friday if it can't reach new contracts with three casinos owned by Caesars Entertainment (Bally's, Caesars and Harrah's) and two casinos owned by billionaire investor Carl Icahn (the Tropicana and the Trump Taj Mahal). About 6,500 of the union's nearly 10,000 workers are at the five hotels. (AP Photo/Wayne Parry)
Title = CBS reporter traveling to 59 parks in a year
Decscription = NEW YORK (AP) — Conor Knighton didn't take the easy route when he proposed a "CBS Sunday Morning" story on the National Park Service's centennial.
Title = Here come the virtual reality Olympics ... for Samsung users
Decscription = NEW YORK (AP) — Athletes in Rio will compete to be the fastest sprinter and highest jumper at the Olympics this August. But there's another test underway as well: How well can virtual reality capture sporting events?This photo provided by NBC and HD Studio shows NBC's daytime and late night set for the Rio Olympics located on Copacabana Beach in Rio. NBC says it will provide 85 hours of virtual reality programming during the Rio Olympics in August, but only to users of Samsung Galaxy smartphones and the Samsung Gear VR headset. (HD Studio/Courtesy of NBC via AP)
Title = Oscars timetable for 2017 revealed
Decscription = Movie buffs, mark your calendars: your 2017 Oscars party will be on Sunday, February 26. The Academy of Motion Picture Arts and Sciences announced the timetable for the 89th Oscars on Thursday, one day after it announced that it had invited a record number of artists to join the body, the majority of them women and people of color.A view of the Oscars logo at the 88th Annual Academy Awards nominee luncheon on February 8, 2016 in Beverly Hills, California
Title = Queen marks deadly Somme centenary at Westminster Abbey
Decscription = LONDON (AP) — Queen Elizabeth II attended a service at Westminster Abbey on Thursday, the eve of the centenary of the Battle of the Somme, one of the deadliest chapters of World War I.
Title = Rob Wasserman, accomplished bass player, dead at 64
Decscription = NEW YORK (AP) — Rob Wasserman, a highly respected bass player and composer who performed and recorded with Lou Reed, Neil Young, Brian Wilson and many other musicians, has died. He was 64.
Title = Documents filed by some Prince claimants to become public
Decscription = CHASKA, Minn. (AP) — A Minnesota judge overseeing the legal proceedings about Prince's estate will allow documents filed by some claimants to become public.
Title = The Latest: Oprah Winfrey to appear at Essence Festival
Decscription = NEW ORLEANS (AP) — The Latest on the annual Essence Festival held over the July 4th holiday in New Orleans (all times local):FILE - In this Jan. 20, 2009, file photo, Mariah Carey performs at the Neighborhood Inaugural Ball in Washington. Music is at the heart of the annual Essence Festival in New Orleans, and this year is no different. Fans will get to hear from first-timers Mariah Carey, Puff Daddy and Jeremih as well as from festival veterans Charlie Wilson, Maxwell, New Edition, Tyrese and Lalah Hathaway - all of whom are scheduled to perform inside the Superdome Friday, July 1, 2016, through Sunday. (AP Photo/Alex Brandon, File)
Title = Brad Paisley: West Virginia floods shocking, heartbreaking
Decscription = CHARLESTON, W.Va. (AP) — Brad Paisley said he's shocked and heartbroken by the destruction from deadly flooding in his home state of West Virginia.Principal Mike Kelley walks through a hallway that is filled with slick mud at Herbert Hoover High School in Clendenin, W.Va., Monday, June 27, 2016. The first floor hallways and rooms of the school are caked in 3-5 inches mud, which was left by over six feet of flood water that swamped the building late last week. (Sam Owens/Charleston Gazette-Mail via AP)
Title = Chechen leader Kadyrov seeks apprentice on reality TV show
Decscription = MOSCOW (AP) — Another powerful, controversial man is taking to reality TV to find an assistant — not Donald Trump but the leader of Chechnya.FILE - In this Wednesday March 23, 2016 file photo, Chechen regional leader Ramzan Kadyrov addresses a rally marking the 13th anniversary of the adoption of the Constitution of Russian region of Chechnya, in the regional capital of Grozny, Russia. Russian state television on Thursday is to broadcast the opening episode of "Live - The Team," in which participants compete to become an assistant to leader of Chechnya Ramzan Kadyrov. (AP Photo/Musa Sadulayev, File)
Title = With an eye to Tuscany, Debi Mazar plots culinary future
Decscription = NEW YORK (AP) — Debi Mazar and her brood spend at least a month in Tuscany each year, but if the "Younger" actress had her way, the region would be a far more permanent fixture in her life.FILE - In this Wednesday, Jan. 6, 2016 file photo, Debi Mazar speaks during the "Younger" panel at the TV Land 2016 Winter TCA in Pasadena, Calif. After the success of her award-winning cooking show "Extra Virgin," Mazar's creative juices are still flowing, as the actress talks about the possibility of another show and more of her culinary dreams. (Photo by Richard Shotwell/Invision/AP)
Title = Wisecracking De Niro touts Catskills with NY governor
Decscription = BETHEL, N.Y. (AP) — Robert De Niro is conjuring the legacy — and the stand-up jokes — of comedians like Rodney Dangerfield, Henny Youngman and Milton Berle while praising the natural beauty of New York's Catskills region.
Title = Music Review: Sara Watkins branches out
Decscription = Sara Watkins, "Young in All the Wrong Ways" (New West Records)FILE - In this July 29, 2012 file photo, Sara Watkins performs at the Newport Folk Festival in Newport, R.I. Watkins describes her latest venture as “a breakup album with myself,” but it seems like there might have been someone else involved. The songs on her new album, “Young in All the Wrong Ways,” have bite to them. There is anger here, a jarring departure from Watkins’ previous work. A couple of the songs push into hard-edged rock, her voice straining against a jagged electric guitar. (AP Photo/Joe Giblin)
Title = Disney Animation's 'Wreck-It Ralph 2' set for March 2018
Decscription = LOS ANGELES (AP) — "Wreck-It Ralph" is headed back to the arcade, and theaters, in a sequel planned for release on March 9, 2018. Co-directors Rich Moore and Phil Johnston announced the sequel to the 2012 animated film Thursday morning on Facebook Live.FILE - In this Oct. 29, 2012 file photo, Director Rich Moore arrives at the world premiere of "Wreck-It Ralph" at El Capitan Theatre in Los Angeles. “Wreck-It Ralph” is headed back to the arcade, and theaters, in a sequel planned for release on March 9, 2018. Co-directors Rich Moore and Phil Johnston announced the sequel to the 2012 animated film Thursday, June 30, 2016 on Facebook Live. (Photo by Jordan Strauss/Invision/AP)
Title = Scarlett Johansson ranked Hollywood's top-grossing actress
Decscription = Scarlett Johansson has taken the crown as Hollywood's highest-grossing actress ever.FILE - In this April 21, 2015, file photo, Scarlett Johansson poses for photographers upon arrival at the premiere for the film 'The Avengers Age of Ultron' in London. Box Office Mojo has crowned Johansson as Hollywood's highest grossing actress on a list updated June 29, 2016.(Photo by Joel Ryan/Invision/AP, File)
Title = HLN's Nancy Grace leaving her legal show
Decscription = NEW YORK (AP) — Tough-talking former prosecutor Nancy Grace is leaving her prime-time show on the HLN network in October.FILE - In this Friday, Oct. 21, 2014, file photo, television host Nancy Grace arrives at the 7th annual GLSEN Respect Awards in Beverly Hills, Calif. Grace is leaving her prime-time show on the HLN network in October 2016. The CNN sister station said Grace told her staff on Thursday, June 30, 2016 that her show would be ending after 12 years. An HLN spokeswoman said the network had no immediate announcement on what program would go in its place. (AP Photo/Matt Sayles, File)
Title = Moviegoers to Hollywood: It better be good
Decscription = NEW YORK (AP) — As Hollywood girds for a low-key Fourth of July box office weekend and watches its summer season dip 15 percent below last year's, an even more worrisome trend has taken shape: Moviegoers are growing pickier.FILE - This image released by Warner Bros. Entertainment shows Alexander Skarsgard from "The Legend of Tarzan." For films that aren’t “the movie to see,” moviegoers are increasingly staying home. With word-of-mouth traveling at the speed of Twitter, quality has become a more vital currency. (Jonathan Olley/Warner Bros. Entertainment via AP, File)
Title = 8 rescued after Oklahoma City roller coaster gets stuck
Decscription = OKLAHOMA CITY (AP) — No one was injured when a roller coaster at an Oklahoma City amusement park stalled out and stranded eight people, including seven children.
Title = Smallest national park? Kosciuszko, forgotten son of liberty
Decscription = PHILADELPHIA (AP) — If the hip-hop Broadway smash "Hamilton" can reignite interest in the first U.S. treasury secretary, what will it take to drum up interest in another forgotten hero from America's fight for independence?FILE - In this April 1, 2013 file photo a statue of Poland's General Thaddeus Kosciuszko is enveloped in the early morning fog in Lafayette Park across from the White House in Washington. Kosciuszko was a military engineer from Poland, Kosciuszko came to Philadelphia in August 1776 to offer his services in the fight against the British. (AP Photo/Jacquelyn Martin, File)
Title = Pregnant Alanis Morissette posts nude underwater photo
Decscription = Alanis Morissette has posted a nude photo of herself sporting a large baby bump while floating underwater.FILE - In this Nov. 22, 2015, file photo, Souleye, left, and Alanis Morissette arrive at the American Music Awards in Los Angeles. Morissette posted a nude photo of herself sporting a large baby bump while floating underwater on Instagram on June 28, 2016. (Photo by Jordan Strauss/Invision/AP, File)
Title = New Orleans ready to 'party with a purpose' at Essence Fest
Decscription = NEW ORLEANS (AP) — Music has always been at the heart of the annual Essence Festival, now in its 22nd year, and this year will be no different.FILE - In this Jan. 20, 2009, file photo, Mariah Carey performs at the Neighborhood Inaugural Ball in Washington. Music is at the heart of the annual Essence Festival in New Orleans, and this year is no different. Fans will get to hear from first-timers Mariah Carey, Puff Daddy and Jeremih as well as from festival veterans Charlie Wilson, Maxwell, New Edition, Tyrese and Lalah Hathaway - all of whom are scheduled to perform inside the Superdome Friday, July 1, 2016, through Sunday. (AP Photo/Alex Brandon, File)
Title = Alvin Toffler, author of 'Future Shock,' dead at 87
Decscription = NEW YORK (AP) — Alvin Toffler, a guru of the post-industrial age whose million-selling "Future Shock" and other books anticipated the disruptions and transformations brought about by the rise of digital technology, has died. He was 87.
Title = Theater shows R-rated comedy trailer with "Finding Dory"
Decscription = CONCORD, Calif. (AP) — The owner of a California movie theater is apologizing after a trailer for an R-rated upcoming Seth Rogen comedy was shown ahead of a screening of Disney's "Finding Dory."FILE - This undated file image released by Disney shows the character Dory, voiced by Ellen DeGeneres, in a scene from "Finding Dory." In its second week, “Finding Dory” easily remained on top with an estimated $73.2 million, according to studio estimates Sunday, June 26, 2016. (Pixar/Disney via AP, File)
Title = Christie's to sell contents of Reagans' LA home
Decscription = NEW YORK (AP) — A two-day auction of the contents of Ronald and Nancy Reagan's ranch-style home in California will include everything from personal mementos from heads of state and friends to objects the couple took with them to the White House.This undated photo provided by Christie's shows a needlepoint cushion given to Ronald Reagan for his 70th birthday in 1981. The pillow, which will be sold by Christie's New York during a two-day auction of the contents of Ronald and Nancy Reagan's ranch-style home in California, has a pre-sale estimate of $1,000-1,500. Christie’s announced Thursday, June 30, 2016, highlights of the Sept. 21-22 sale in New York City. (Christie's via AP) MANDATORY CREDIT
Title = Asian actors too busy to fret over Hollywood 'white-washing'
Decscription = TOKYO (AP) — The film world of Asia, known for producing Akira Kurosawa, Satyajit Ray, Brillante Mendoza and other greats, is too busy making movies of its own to fret much about the debate slamming Hollywood — the casting of white people in roles written for Asians.FILE - In this Sept. 5, 2007, file photo, Japanese actress Kaori Momoi poses during the photo call for the movie "Sukiyaki Western Django" at the 64th Venice Film Festival, in Venice, Italy. The film world of Asia is too busy making movies of its own to fret much about the debate slamming Hollywood - the casting of white people in roles written for Asians. Momoi, who appeared in “Memoirs of a Geisha,” as well as Russian filmmaker Aleksandr Sokurov’s “The Sun,” suggested acting was ultimately about individual talent, not skin color or nationality. (AP Photo/Andrew Medichini, File)
Title = Film academy invites 683 new members to join
Decscription = LOS ANGELES (AP) — Six months after announcing intentions to double the number of female and minority members in its ranks by 2020, the Academy of Motion Picture Arts and Sciences has invited 683 new members to join the organization.FILE - In this March 2, 2014 file photo, an Oscar statue is displayed at the Oscars at the Dolby Theatre in Los Angeles. Six months after announcing intentions to double the number of female and minority members in its ranks, the Academy of Motion Picture Arts and Sciences has invited 683 new members to join the organization. The academy says its invitees are 46 percent female, 41 percent minority and represent 59 countries.(Photo by Matt Sayles/Invision/AP, File)
Title = Miss Teen USA pageant replaces swimsuits with athletic wear
Decscription = LAS VEGAS (AP) — The Miss Teen USA pageant is dropping the swimsuit portion of its competition.
Title = YouTube personality charged with making false police report
Decscription = LOS ANGELES (AP) — A gay YouTube personality who said he was assaulted outside a West Hollywood club has been charged with filing a false police report and faking his injuries.This Wednesday, June 29, 2016, photo released by Los Angeles County Sheriff's Department shows Calum McSwiggan. The London-native gay YouTube personality who said he was assaulted outside a West Hollywood club has been charged with filing a false police report and faking his injuries. (Los Angeles County Sheriff's Department via AP) MANDATORY CREDIT
Title = Jesus Christ film coming to virtual reality
Decscription = LOS ANGELES (AP) — The story of Jesus Christ is coming to virtual reality for the first time.This undated photo provided by Autumn VR Inc. and VRWERX, LLC, shows a production still from "Jesus VR - The Story of Christ." The story of Jesus Christ is coming to virtual reality for the first time. Autumn Productions and VRWerx announced plans Wednesday, June 29, 2016, to release the live-action film on all major VR platforms this Christmas. (Autumn VR Inc. and VRWERX, LLC via AP)
Title = The Latest: Celebrities record tribute to nightclub victims
Decscription = ORLANDO, Fla. (AP) — The Latest on the mass shooting at a gay Orlando nightclub that left 49 people dead (all times local):
Title = The Latest: Golfer Bubba Watson plans to help flood victims
Decscription = CHARLESTON, W.Va. (AP) — The Latest on flooding that has devastated parts of West Virginia (all times local):
Title = Twitter dominated by tongue-in-cheek #HeterosexualPrideDay
Decscription = What appears to be a tongue-in-cheek social media movement to mark June 29 as a day to celebrate heterosexual pride has become one of the day's top online trends.
Title = Miss Teen USA axes 'outdated' bikini competition
Decscription = One of America's top beauty pageants has axed its swimsuit competition, ditching bikinis for sportswear to fend off years of complaints that parading in a bikini is sexist and demeaning. The Miss Universe Organization, which operates the pageant, said from now on contestants would be judged on athletic wear, in addition to the evening wear and personality competitions. "Miss Teen USA's transition to athletic wear reads as less exploitative and more focused on the importance of physical fitness for its younger participants," it said.Miss Teen USA 2016 Katherine Haik (R) congratulates Miss District of Columbia USA 2016 Deshauna Barber during the 2016 Miss USA pageant at T-Mobile Arena on June 5, 2016 in Las Vegas, Nevada
Title = Kayne West, Adidas expand partnership for Yeezy line
Decscription = LOS ANGELES (AP) — Rapper Kanye West and Adidas are expanding their partnership that began almost two years ago with retail hubs for his Yeezy products and additional sportswear designs.FILE - In this Aug. 30, 2015, file photo, Kanye West accepts the video vanguard award at the MTV Video Music Awards at the Microsoft Theater in Los Angeles. West and Adidas are expanding their partnership that began almost two years ago with retail hubs for his Yeezy products and additional sportswear designs. The sportswear company announced the collaboration on Wednesday, June 29, 2016, and described it as the most significant partnership between a non-athlete and an athletic brand. (Photo by Matt Sayles/Invision/AP, File)
You cannot simply match every title and then take the description that follows it: not every title has a description, although every description belongs to a title.
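One way to read that constraint, as a rough sketch rather than a definitive answer: walk the lines in order, start a new record at each `Title` line, and attach a description only when it directly follows a title that has none yet. The function name `pair_titles` and the three sample lines are hypothetical; the key `Decscription` is spelled the way it appears in the data above.

```python
def pair_titles(lines):
    """Pair each 'Title = ...' line with the description that follows it, if any."""
    records = []
    for line in lines:
        key, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip anything that is not a "key = value" line
        key = key.strip()
        if key == "Title":
            # Every title opens a new record; the description stays None until one is found.
            records.append({"title": value.strip(), "description": None})
        elif key == "Decscription" and records and records[-1]["description"] is None:
            # A description always belongs to the most recent title.
            records[-1]["description"] = value.strip()
    return records


# Hypothetical input: the first title has no description, the second one does.
sample = [
    "Title = Headline without text",
    "Title = Headline with text",
    "Decscription = Body of the second item.",
]
records = pair_titles(sample)
for record in records:
    print(record)
```

Titles without a description keep `None`, so they can be filtered out or kept, depending on what is needed downstream.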

Pandas: reshape table and plot different time series in one plot

I'm a rookie with pandas (and with Python, for that matter), and I'm working with a 311 dataset. The output I'm trying to get is a plot with 5 time series, one for each NYC borough, where each point represents the total number of complaints for each "Created Date" in that period. My data is as follows:
Created Date         Agency Name                      Complaint Type           Borough
2013-08-30 23:58:55  New York City Police Department  Noise - Vehicle          BROOKLYN
2013-08-30 23:58:28  New York City Police Department  Noise - Vehicle          QUEENS
2013-08-30 23:57:46  New York City Police Department  Noise - Street/Sidewalk  MANHATTAN
2013-08-30 23:55:07  New York City Police Department  Noise - Street/Sidewalk  QUEENS
2013-08-30 23:55:06  New York City Police Department  Noise - Commercial       MANHATTAN
X = Created Date, Y = total number of complaints.
My code so far (after looking at some Stack Overflow questions and library docs):
import sys
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv(sys.argv[1], parse_dates=True)
df.set_index("Created Date", inplace=True)
df2=df[["Borough","Complaint Type"]]
df3=df2.groupby("Complaint Type").count()
df3.plot()
plt.show()
Resulting plot: http://imgur.com/D9jrYLf
I made some changes, but it still doesn't work:
df=pd.read_csv(sys.argv[1], parse_dates=True)
df.set_index("Created Date", inplace=True)
df2=df[["Borough","Complaint Type"]]
df3=df[df2.groupby("Complaint Type")].count()
df3.plot()
I would really appreciate any help. :)
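A minimal sketch of one way to get there, assuming the CSV has the `Created Date` and `Borough` columns shown above and that daily totals are wanted: parse the dates, count rows per day and borough, then pivot so each borough becomes its own column. The names `complaints_per_day`, `plot_counts` and `SAMPLE_CSV` are made up for illustration; the inline rows are the five from the question.

```python
import io

import pandas as pd

# The five sample rows from the question, written out as CSV for a self-contained demo.
SAMPLE_CSV = """Created Date,Agency Name,Complaint Type,Borough
2013-08-30 23:58:55,New York City Police Department,Noise - Vehicle,BROOKLYN
2013-08-30 23:58:28,New York City Police Department,Noise - Vehicle,QUEENS
2013-08-30 23:57:46,New York City Police Department,Noise - Street/Sidewalk,MANHATTAN
2013-08-30 23:55:07,New York City Police Department,Noise - Street/Sidewalk,QUEENS
2013-08-30 23:55:06,New York City Police Department,Noise - Commercial,MANHATTAN
"""


def complaints_per_day(csv_source):
    """Return a table with one row per day and one column per borough."""
    # Name the date column explicitly so it is actually parsed as datetimes.
    df = pd.read_csv(csv_source, parse_dates=["Created Date"])
    # Truncate each timestamp to its day, then count rows per (day, borough).
    day = df["Created Date"].dt.floor("D")
    counts = df.groupby([day, "Borough"]).size()
    # Move the boroughs into columns; days with no complaints get 0.
    return counts.unstack(fill_value=0)


def plot_counts(counts):
    """Draw one line per borough (requires matplotlib)."""
    import matplotlib.pyplot as plt

    counts.plot()
    plt.xlabel("Created Date")
    plt.ylabel("Total number of complaints")
    plt.show()


counts = complaints_per_day(io.StringIO(SAMPLE_CSV))
print(counts)
```

With the real file, `complaints_per_day(sys.argv[1])` followed by `plot_counts(counts)` should give one line per borough. The sample above covers a single day, so it produces only one row and is not enough to draw visible lines.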
