How to decode this Ajax response in python? - python
How do I decode the following response from this url in python? https://www.scorespro.com/livescore/ajax0.php
1599071734^^~~##Wed 02 Sep 21:35 GMT +03^^~~##2361498-1##194837##0##2020-09-02 17:00:00##76##1##18032##17842##Club Friendly##0##Real Sociedad##Villarreal##CLB##un##FG##0-2##0-2##2 HF##Friendly Games##1599066000######2##99######real-sociedad-vs-villarreal/02-09-2020##friendly-games##club-friendly##2##LEAGUE##2020##round-1##0####0##1599066240##############0##0 2325164-1##196097##0##2020-09-02 17:00:00##71##1##105187##104946##Canadian Premier League - Premier League##0##Valour FC (5)##HFX Wanderers FC (6)##PL##ca##CAN##0-2##0-2##2 HF##Canada##1599066000######2##99######valour-fc-vs-hfx-wanderers-fc/30-05-2020##canada##premier-league##1##LEAGUE##2020##round-1##1##canadian-premier-league##0##1599066540##############0##0 2338959-1##197065##0##2020-09-02 17:00:00##81##4##39942##41961##Regionalliga Nordost##0##Germania Halberstadt (18)##Optik Rathenow (20)##N/E##de##GER##0-2##0-0##2 HF##Germany##1599066000######2##99######germania-halberstadt-vs-optik-rathenow/02-09-2020##germany##regionalliga-nordost##5##LEAGUE##2020-2021##round-4##1####0##1599065940##############0##0 2338955-1##197065##0##2020-09-02 17:00:00##81##4##44124##56097##Regionalliga Nordost##0##Viktoria Berlin (2)##VSG Altglienicke (1)##N/E##de##GER##2-1##1-1##2 HF##Germany##1599066000######2##99######viktoria-berlin-vs-vsg-altglienicke/02-09-2020##germany##regionalliga-nordost##5##LEAGUE##2020-2021##round-4##1####0##1599065940##############0##0 2338958-1##197065##0##2020-09-02 17:00:00##78##4##13847##3034##Regionalliga Nordost##0##SV Babelsberg 03 (12)##Chemnitzer FC (15)##N/E##de##GER##2-2##1-1##2 HF##Germany##1599066000######2##99######sv-babelsberg-03-vs-chemnitzer-fc/02-09-2020##germany##regionalliga-nordost##5##LEAGUE##2020-2021##round-4##1####0##1599066120##############0##0 2338954-1##197065##0##2020-09-02 17:00:00##79##4##21173##37508##Regionalliga Nordost##0##Hertha Berlin II (5)##Berliner AK 07 (16)##N/E##de##GER##2-5##1-2##2 
HF##Germany##1599066000######2##99######hertha-berlin-ii-vs-berliner-ak-07/02-09-2020##germany##regionalliga-nordost##5##LEAGUE##2020-2021##round-4##1####0##1599066060##############0##0 2361307-1##197664##0##2020-09-02 17:00:00##81##1##24981##21152##Landspokal##0##Slagelse##Dalum##CUP##dk##DEN##1-1##0-0##2 HF##Denmark##1599066000######2##99######slagelse-vs-dalum/01-09-2020##denmark##fa-cup##6##PHASE##2020-2021##round-1##0####0##1599065940##2.20##2.87##3.75########0##0 2338953-1##197065##0##2020-09-02 17:00:00##80##4##41959##2993##Regionalliga Nordost##0##VfB Auerbach (7)##Energie Cottbus (19)##N/E##de##GER##2-4##1-2##2 HF##Germany##1599066000######2##99######vfb-auerbach-vs-energie-cottbus/02-09-2020##germany##regionalliga-nordost##5##LEAGUE##2020-2021##round-4##1####0##1599066000##############0##0 2307988-1##195163##0##2020-09-02 17:15:00##62##10##34050##49891##Serie A - First Stage##0##CD Olmedo (16)##Delfin SC (10)##SA1##ec##ECU##2-0##2-0##2 HF##Ecuador##1599066900######2##99######cd-olmedo-vs-delfin-sc/10-05-2020##ecuador##first-stage##1##LEAGUE##2020##round-10##1##serie-a##0##1599067080##2.45##2.55##3.25########1##1 2338956-1##197065##0##2020-09-02 17:30:00##HT##4##41960##50882##Regionalliga Nordost##0##Lokomotive Leipzig (13)##FSV 63 Luckenwalde (8)##N/E##de##GER##1-0##1-0##H/T##Germany##1599067800######2##99######lokomotive-leipzig-vs-fsv-63-luckenwalde/02-09-2020##germany##regionalliga-nordost##5##LEAGUE##2020-2021##round-4##1####0##0##############0##0 2367153-1##194837##0##2020-09-02 17:30:00##HT##1##18022##4189##Club Friendly##0##Real Betis##Almeria##CLB##un##FG##1-0##1-0##H/T##Friendly Games##1599067800######2##99######real-betis-vs-almeria/02-09-2020##friendly-games##club-friendly##2##LEAGUE##2020##round-1##0####0##0##1.53##6.00##3.50########0##0 2313051-1##195400##0##2020-09-02 17:30:00##48##15##43773##103469##1. 
Deild##0##Magni (12)##Afturelding (8)##D2##is##ISL##2-0##2-0##2 HF##Iceland##1599067800######2##99######magni-vs-afturelding/29-07-2020##iceland##1-deild##2##LEAGUE##2020##round-15##1####0##1599067920##############0##0 2366633-1##194837##0##2020-09-02 17:30:00##47##1##18052##28704##Club Friendly##0##Levante##Cartagena##CLB##un##FG##1-1##1-1##2 HF##Friendly Games##1599067800######2##99######levante-vs-cartagena/02-09-2020##friendly-games##club-friendly##2##LEAGUE##2020##round-1##0####0##1599067980##1.40##5.25##4.20########0##1 2313052-1##195400##0##2020-09-02 17:30:00##47##15##52987##30414##1. Deild##0##Vestri (7)##Thor Akureyri (5)##D2##is##ISL##1-0##1-0##2 HF##Iceland##1599067800######2##99######vestri-vs-thor-akureyri/29-07-2020##iceland##1-deild##2##LEAGUE##2020##round-15##1####0##1599067980##############0##0 2313056-1##195400##0##2020-09-02 17:30:00##47##15##26363##32547##1. Deild##0##IBV Vestmannaeyjar (3)##Leiknir R. (4)##D2##is##ISL##0-2##0-2##2 HF##Iceland##1599067800######2##99######ibv-vestmannaeyjar-vs-leiknir-r/01-08-2020##iceland##1-deild##2##LEAGUE##2020##round-15##1####0##1599067980##############0##0 2363441-1##194837##0##2020-09-02 18:00:00##36##1##21281##3220##Club Friendly##0##Benfica##SC Braga##CLB##un##FG##0-0##-##1 HF##Friendly Games##1599069600######2##99######benfica-vs-sc-braga/02-09-2020##friendly-games##club-friendly##2##LEAGUE##2020##round-1##0####0##1599069540##1.55##5.50##3.50########0##0 2289461-1##193678##0##2020-09-02 18:30:00##4##24##40009##40019##Premier League##0##Smouha SC (6)##El Entag El Harby (14)##PL##eg##EGY##0-0##-##1 HF##Egypt##1599071400######2##99######smouha-sc-vs-el-entag-el-harby/02-03-2020##egypt##premier-league##1##LEAGUE##2019-2020##round-24##1####0##1599071460##2.15##4.00##2.70########0##0 2211667-1##190057##0##2020-09-02 18:30:00##1##1##23376##19092##U21 Championship - Qualifying Group Stage##0##San Marino U21 (6)##Czech Republic U21 (1)##QR##eu##UEF##0-0##-##1 HF##Europe 
(UEFA)##1599071400######2##8######san-marino-u21-vs-czech-republic-u21/02-09-2020##uefa##qualifying-group-stage##8##PHASE##2021-hungary-slovenia##round-1##1##u21-championship##1##1599071640##29.00##1.01##21.00########0##0 ^^##a##1599071731##1599071658^^~~##5333322-1##197760##0-1##Set 2##2##40089##32214##Pavic/Soares B.##Granollers-P M./Zeballos H. (5)##US OPEN##us##ATP##3-6|2-2|-|-|-##ATP Doubles##US Open##1599068700####H##atp-doubles##us-open##2020######195167######0####13##R32##3########################Set2##-2##z##1##- 5333324-1##197760##0-1##Set 2##2##41811##51180##Bambridge L./McLachlan B.##Eubanks C./Mcdonald M. (wc)##US OPEN##us##ATP##3-6|2-5|-|-|-##ATP Doubles##US Open##1599067500####A##atp-doubles##us-open##2020######195167######0####16##R32##3########################Set2##-2##z##0##- 5333325-1##197760##1-1##Set 3##2##27573##39167##Chardy J./Martin F.##Harrison C./Harrison R. (wc)##US OPEN##us##ATP##7-5|65-77|0-0|-|-##ATP Doubles##US Open##1599066000####H##atp-doubles##us-open##2020######195167######0####25##R32##3########################Set3##-2##z##1##30-30 5333330-1##197760##1-0##Set 2##2##143786##21596##Gille S./Vliegen J.##Kubot L./Melo M. 
(2)##US OPEN##us##ATP##6-2|4-3|-|-|-##ATP Doubles##US Open##1599067500####H##atp-doubles##us-open##2020######195167######0####15##R32##3########################Set2##-2##z##1##30-15 5334089-1##197805##1-1##Set 3##2##145891##145017##Carlos Alcaraz (se)##Juan Pablo Ficovich####it##CHM##4-6|6-3|5-4|-|-##Challenger Men Singles##Cordenons (Italy)##1599063000####H##challenger-men-singles##cordenons##2020##es##ar##195223##ESP##ARG######28##R32##5########################Set3##10##z##1##- 5334294-1##197761##0-1##Set 2##2##46027##43093##Gerasimov E.##Thompson J.##US OPEN##us##ATP##1-6|3-5|-|-|-##ATP Singles##US Open##1599067500####H##atp-singles##us-open##2020##by##au##195166##BLR##AUS##0####15##R64##1######gerasimov-e##thompson-j################Set2##-2##z##1##40-30 5334313-1##197761##0-0##Set 1##2##58519##52671##Davidovich Fokina A.##Hurkacz H. (24)##US OPEN##us##ATP##0-0|-|-|-|-##ATP Singles##US Open##1599071700####A##atp-singles##us-open##2020##es##pl##195166##ESP##POL##0####0##R64##1######davidovich-fokina-a##hurkacz-h################Set1##-2##z##1##40-15 5334316-1##197761##0-0##Set 1##2##21374##41813##Djokovic N. (1)##Edmund K.##US OPEN##us##ATP##1-2|-|-|-|-##ATP Singles##US Open##1599070500####H##atp-singles##us-open##2020##rs##gb-eng##195166##SRB##ENG##0####3##R64##1######novak-djokovic##edmund-k################Set1##-2##z##1##- 5334317-1##197761##0-1##Set 2##2##143584##44241##Nakashima B. (wc)##Zverev A. (5)##US OPEN##us##ATP##5-7|4-3|-|-|-##ATP Singles##US Open##1599066600####A##atp-singles##us-open##2020##us##de##195166##USA##GER##0####19##R64##1######nakashima-b##alexander-zverev################Set2##-2##z##1##15-40 5334319-1##197761##0-0##Set 1##2##55929##38842##Harris Ll.##Goffin D. (7)##US OPEN##us##ATP##4-3|-|-|-|-##ATP Singles##US Open##1599069300####A##atp-singles##us-open##2020##za##be##195166##RSA##BEL##0####7##R64##1######harris-ll##david-goffin################Set1##-2##z##1##30-15 5334322-1##197761##0-0##Set 1##2##32375##38073##Mannarino A. 
(32)##Sock J. (pr)##US OPEN##us##ATP##1-2|-|-|-|-##ATP Singles##US Open##1599070800####H##atp-singles##us-open##2020##fr##us##195166##FRA##USA##0####3##R64##1######adrian-mannarino##jack-sock################Set1##-2##z##1##30-15 5334325-1##197761##2-0##Set 3##2##31424##42596##Kukushkin M.##Garin C. (13)##US OPEN##us##ATP##6-2|6-1|2-5|-|-##ATP Singles##US Open##1599065100####H##atp-singles##us-open##2020##kz##cl##195166##KAZ##CHI##0####22##R64##1######mikhail-kukushkin##garin-c################Set3##-2##z##1##40-15 5334328-1##197761##0-0##Set 1##2##51475##43105##Mmoh M. (wc)##Struff J-L. (28)##US OPEN##us##ATP##2-5|-|-|-|-##ATP Singles##US Open##1599070200####A##atp-singles##us-open##2020##us##de##195166##USA##GER##0####7##R64##1######mmoh-m##jan-lennard-struff################Set1##-2##z##1##- 5334329-1##197763##0-1##Set 2##2##4337##41349##Flipkens K.##Pegula J. (wc)##US##us##WTA##61-77|0-0|-|-|-##WTA Singles##US Open##1599068100####H##wta-singles##us-open##2020##be##us##195168##BEL##USA##0####13##R64##2######kirsten-flipkens##pegula-j################Set2##10##z##1##15-0 ^^##a##1599071731##0^^~~##5320202-1##197316##68-62##Q4##2##47955##47954##TBV Start Lublin##Polski Cukier Torun##PLK-RS##pl##POL##21-16|15-15|18-21|14-10| - |36-31##Poland##Energa Basket Liga##1599066000######poland##energa-basket-liga##2020-2021######197315######1####1.21##4.25##########4Qrt##1##z##0 ^^##a##1599071661##0^^~~##5333286-3##197776##1-0##Set 2##2##8410##8414##Spor Toto (1)##Ziraat Bankasi (2)##GS##tr##TUR##25-18|8-5|-|-|-##Turkey##Turkish Cup - Group Stage##1599069600######turkey##national-cup##2020-2021######197447######1####56##############2S##3##z##0 ^^##a##1599071674##1599071126^^~~##5302817-3##196725##28-21##2H##2##140460##41158##Molde W##Larvik W##RS##no##NOR##13-9##Norway##REMA 1000-ligaen - Women##1599066900######norway##postenligaen-women##2020-2021######196717######1####49##1.12##7.50##12.00########2H##2##z##0 
5303101-3##196762##21-13##2H##2##8559##3172##Sonderjyske##Skjern##RS##dk##DEN##16-10##Denmark##Handbold Liagen##1599067800######denmark##handball-league##2020-2021######196756######1####34##3.40##1.55##8.50########2H##1##z##0 5303102-3##196762##1-0##1H##2##3517##3516##Skanderborg##Arhus GF##RS##dk##DEN##1-0##Denmark##Handbold Liagen##1599071400######denmark##handball-league##2020-2021######196756######1##1H##1##1.35##4.50##9.50########1H##1##z##0 5304740-3##196776##25-16##2H##2##3587##6865##Kadetten Schaffhausen##Amicitia Zurich##RS##ch##SUI##10-9##Switzerland##NLA##1599066000######switzerland##nla##2020-2021######196774######1####41##1.11##8.00##12.00########2H##1##z##0 5304741-3##196776##12-11##2H##2##10782##10780##HC Kriens##Wacker Thun##RS##ch##SUI##11-10##Switzerland##NLA##1599067800######switzerland##nla##2020-2021######196774######1##H##23##1.50##3.20##8.00########2H##1##z##0 5304742-3##196776##22-18##2H##2##10786##10783##Pfadi Winterthur##Bern Muri##RS##ch##SUI##19-13##Switzerland##NLA##1599067800######switzerland##nla##2020-2021######196774######1##A##40##1.25##5.00##10.00########2H##1##z##0 5304743-3##196776##15-15##2H##2##10777##10784##St. Otmar St. 
Gallen##Suhr Aarau##RS##ch##SUI##14-15##Switzerland##NLA##1599067800######switzerland##nla##2020-2021######196774######1####30##2.00##2.15##7.50########2H##1##z##0 5304744-3##196776##7-7##1H##2##12340##10778##Endingen##1879 Basel##RS##ch##SUI##7-7##Switzerland##NLA##1599069600######switzerland##nla##2020-2021######196774######1####14##1.67##2.60##7.50########1H##1##z##0 5312581-3##197132##11-17##HT##2##41099##3282##Oroshazi##Pick Szeged##RS##hu##HUN##11-17##Hungary##Liga 1##1599068700######hungary##liga-1##2020-2021######197130######1####28##67.00##1.00##50.00########H/T##1##z##0 5334268-3##197814##2-3##1H##2##9591##9590##Fivers WAT Margareten##Alpla Hard##CUP##at##AUT##2-3##Austria##Super Cup - Cup##1599070800######austria##super-cup##2020-2021######197234######0##H##5##1.67##2.60##7.50########1H##5##z##0 ^^##a##1599071731##1599071728^^~~##5321546-1##197388##2-2##P3##2##22308##22300##HC CSKA Moscow##AK Bars Kazan##KHL-RS##ru##RUS##1-0|0-1|1-1| - | - ##Russia##KHL##1599064200######russia##khl##2020-2021######197387######1####hc-cska-moscow-vs-ak-bars-kazan/02-09-2020##1.67##2.25##########3Per##1##z##1 ^^##a##1599071289##0^^~~##^^##a##1597635989##0^^~~##^^##a##1599057728##0^^~~##^^##a##0##0^^~~##^^##a##1599042879##1590074504^^~~##^^##a##1599045851##0^^~~##^^##a##1599071265##0^^~~##45
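The format does not appear to be documented, so what follows is only a sketch inferred from the delimiters visible in the sample above: `^^~~##` seems to separate top-level blocks, `##` seems to separate fields within a record, and each sport's block seems to end with a `^^##a##...` trailer. It also assumes that records inside a block are separated by newlines in the raw response (they show up as spaces in the paste). `parse_response` is a hypothetical helper name, and the sample string is a shortened, hand-made excerpt, not a real response.

```python
def parse_response(text):
    """Split a raw response into blocks; each block is a list of records,
    and each record is a list of string fields."""
    blocks = []
    for chunk in text.split("^^~~##"):
        # Drop the "^^##a##<ts>##<ts>" trailer that appears to close each block.
        body = chunk.split("^^##")[0]
        records = [
            line.strip().split("##")
            for line in body.splitlines()
            if line.strip()
        ]
        blocks.append(records)
    return blocks


# Shortened, hand-made sample (most fields omitted); records are assumed
# to be newline-separated in the raw response.
sample = (
    "1599071734^^~~##Wed 02 Sep 21:35 GMT +03^^~~##"
    "2361498-1##194837##0##Real Sociedad##Villarreal##0-2\n"
    "2325164-1##196097##0##Valour FC (5)##HFX Wanderers FC (6)##0-2\n"
    "^^##a##1599071731##1599071658^^~~##"
)
blocks = parse_response(sample)
for record in blocks[2]:
    print(record[3], "vs", record[4], record[5])
```

What each field position means (match id, league id, score and so on) has to be worked out by comparing the fields with what the site displays; the indices used in the loop apply only to the shortened sample.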
Related
How to export a lot of routes in a shpfile from OSMNX
I have trip data that includes lat/lng coordinates. I want to simulate the shortest path for each trip and export the paths to a shapefile. Then I'll run a line density analysis to discover changes in the trips. I don't know how to export all the paths to a shapefile at once. My sample data is below; you can save it as station.csv.

    ride_id,rideable_type,started_at,ended_at,start_station_name,start_station_id,end_station_name,end_station_id,start_lat,start_lng,end_lat,end_lng,member_casual
    A847FADBBC638E45,docked_bike,2020/4/26 17:45,2020/4/26 18:12,Eckhart Park,86,Lincoln Ave & Diversey Pkwy,152,41.8964,-87.661,41.9322,-87.6586,member
    5405B80E996FF60D,docked_bike,2020/4/17 17:08,2020/4/17 17:17,Drake Ave & Fullerton Ave,503,Kosciuszko Park,499,41.9244,-87.7154,41.9306,-87.7238,member
    5DD24A79A4E006F4,docked_bike,2020/4/1 17:54,2020/4/1 18:08,McClurg Ct & Erie St,142,Indiana Ave & Roosevelt Rd,255,41.8945,-87.6179,41.8679,-87.623,member
    2A59BBDF5CDBA725,docked_bike,2020/4/7 12:50,2020/4/7 13:02,California Ave & Division St,216,Wood St & Augusta Blvd,657,41.903,-87.6975,41.8992,-87.6722,member
    27AD306C119C6158,docked_bike,2020/4/18 10:22,2020/4/18 11:15,Rush St & Hubbard St,125,Sheridan Rd & Lawrence Ave,323,41.8902,-87.6262,41.9695,-87.6547,casual
    356216E875132F61,docked_bike,2020/4/30 17:55,2020/4/30 18:01,Mies van der Rohe Way & Chicago Ave,173,Streeter Dr & Grand Ave,35,41.8969,-87.6217,41.8923,-87.612,member
    A2759CB06A81F2BC,docked_bike,2020/4/2 14:47,2020/4/2 14:52,Streeter Dr & Grand Ave,35,Fairbanks St & Superior St,635,41.8923,-87.612,41.8957,-87.6201,member
    FC8BC2E2D54F35ED,docked_bike,2020/4/7 12:22,2020/4/7 13:38,Ogden Ave & Roosevelt Rd,434,Western Ave & Congress Pkwy,382,41.8665,-87.6847,41.8747,-87.6864,casual
    9EC5648678DE06E6,docked_bike,2020/4/15 10:30,2020/4/15 10:35,LaSalle Dr & Huron St,627,Larrabee St & Division St,359,41.8949,-87.6323,41.9035,-87.6434,casual

This is my code (it only produces a picture of the paths):

    import networkx as nx
    import osmnx as ox
    import geopandas as gpd
    import pandas as pd
    from shapely.geometry import LineString, MultiLineString

    ox.config(log_console=True, use_cache=True)

    # get a graph for some city
    startlat = []
    startlng = []
    endlat = []
    endlng = []
    data = pd.read_csv("station.csv")
    startlat = data['start_lat']
    startlng = data['start_lng']
    endlat = data['end_lat']
    endlng = data['end_lng']
    G2 = ox.graph_from_place('Chicago, Illinois', network_type='drive')
    route_list = []

    # get nodes and edges
    nodes, edges = ox.graph_to_gdfs(G2, nodes=True, edges=True)

    for i in range(len(startlng)):
        origin = (startlat[i], startlng[i])
        destination = (endlat[i], endlng[i])
        origin_node = ox.get_nearest_node(G2, origin)
        destination_node = ox.get_nearest_node(G2, destination)
        # exception handling, skipping points without path
        try:
            route = nx.shortest_path(G2, origin_node, destination_node, )
            route_nodes = nodes.loc[route]
            # Create a geometry for the shortest path
            route_line = MultiLineString(list(route_nodes.geometry.values))
            # Create a GeoDataFrame
            route_geom = gpd.GeoDataFrame([[route_line]], geometry='geometry', crs=edges.crs, columns=['geometry'])
        except:
            pass
        route_list.append(route)

    fig, ax = ox.plot_graph_routes(G2, route_list, node_size=0)
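One way to get every path into a single shapefile is to build one LineString per route and put them all in one GeoDataFrame. The sketch below makes several assumptions: `route_to_coords` and `export_routes` are hypothetical helper names, the graph is unprojected (so the node `x`/`y` attributes are longitude/latitude and the CRS is EPSG:4326), and straight segments between consecutive nodes are good enough, which ignores any curved edge geometry.

```python
def route_to_coords(G, route):
    """Turn a list of node ids into (x, y) = (lng, lat) pairs,
    read from the node attributes of the graph."""
    return [(G.nodes[n]["x"], G.nodes[n]["y"]) for n in route]


def export_routes(G, routes, path="routes.shp"):
    """Write one LineString per route into a single shapefile.

    Hypothetical helper; assumes an unprojected graph (EPSG:4326).
    """
    # Imported here so route_to_coords can be used without these packages.
    import geopandas as gpd
    from shapely.geometry import LineString

    ids, lines = [], []
    for i, route in enumerate(routes):
        coords = route_to_coords(G, route)
        if len(coords) < 2:
            # A LineString needs at least two points, e.g. when the
            # origin and destination snap to the same node.
            continue
        ids.append(i)
        lines.append(LineString(coords))

    gdf = gpd.GeoDataFrame({"route_id": ids}, geometry=lines, crs="EPSG:4326")
    gdf.to_file(path)  # the .shp extension selects the shapefile driver
    return gdf
```

With the code from the question, this would be called once after the loop, as `export_routes(G2, route_list)`. It probably also makes sense to append to `route_list` only inside the `try` block, so that a failed `shortest_path` call does not re-append the previous route.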
TypeError: unorderable types: int() < str()
An error occurs when I apply the 5W1H extractor (an open-source library on Git) to my JSON news dataset. It is raised in evaluate_location when it tries to run

    raw_locations.sort(key=lambda x: x[1], reverse=True)

and the console reports

    TypeError: unorderable types: int() < str()

My question is: does this mean something is wrong with my dataset format? If so, shouldn't the extractor treat all the news data as one long string when it works on this corpus? I'm eagerly looking for a solution to this problem. This is one of the JSON news items:

    {
    "title": "Football: Van Dijk, Ronaldo and Messi shortlisted for FIFA award",
    "body": "ROME: Liverpool centre-back Virgil van Dijk is on the shortlist to add FIFA's best player award to his UEFA Men's Player of the Year honour.The Dutch international denied Cristiano Ronaldo and Lionel Messi for the European title last week and the same trio are in the running for the FIFA accolade to be announced in Milan on September 23. Van Dijk starred in Liverpool's triumphant Champions League campaign.England full-back Lucy Bronze won UEFA's women's award and is on FIFA's shortlist with the United States' World Cup-winning duo Megan Rapinoe and Alex Morgan.Manchester City boss Pep Guardiola is up against Liverpool's Jurgen Klopp and Mauricio Pochettino of Tottenham for best men's coach.Phil Neville, who led England's women to a World Cup semi-final, is up for the women's coach award with the USA's Jill Ellis and Sarina Wiegman who guided European champions the Netherlands to the World Cup final. FIFA Best shortlistsMen's player:Cristiano Ronaldo (Juventus/Portugal), Lionel Messi (Barcelona/Argentina), Virgil van Dijk player:Lucy Bronze (Lyon/England), Alex Morgan (Orlando Pride/USA), Megan Rapinoe (Reign FC/USA)Men's coach:Pep Guardiola (Manchester City), Jurgen Klopp (Liverpool), Mauricio Pochettino (Tottenham)Women's coach:Jill Ellis (USA), Phil Neville (England), Sarina Wiegman (Netherlands)Women's goalkeeper:Christiane Endler (Paris St-Germain/Chile), Hedvig Lindahl (Wolfsburg/Sweden), Sari van Veenendaal (Atletico Madrid/Netherlands)Men's goalkeeper:Alisson (Liverpool/Brazil), Ederson (Manchester City/Brazil), Marc-Andre ter Stegen (Barcelona/Germany)Puskas award (for best goal):Lionel Messi (Barcelona v Real Betis), Juan Quintero (River Plate v Racing Club), Daniel Zsori (Debrecen v Ferencvaros)",
    "published_at": "2019-09-02",
    }

Code:

    json_file = open("./Labeled.json","r",encoding="utf-8")
    data = json.load(json_file)

    if __name__ == '__main__':
        # logger setup
        log = logging.getLogger('GiveMe5W')
        log.setLevel(logging.DEBUG)
        sh = logging.StreamHandler()
        sh.setLevel(logging.DEBUG)
        log.addHandler(sh)

        # giveme5w setup - with defaults
        extractor = MasterExtractor()
        Document()

        for i in range(0,1000):
            body = data[i]["body"]
            #print(body)
            #for line in body:
            #print(line[0:line.find('\n')])
            #head = re.sub("[^A-Z\d]", "", "")
            head = re.search("^[^\n]*", body).group(0)
            head = str(head)
            title = data[i]["title"]
            title = str(title)
            body = data[i]["body"]
            body = str(body)
            published_at = data[i]["published_at"]
            published_at = str(published_at)
            doc1 = Document(title,head,body,published_at)
            doc = extractor.parse(doc1)

Instead of returning the extracted time and location results, it gave me this error:

    Traceback (most recent call last):
      File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
        self.run()
      File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractor.py", line 20, in run
        extractor.process(document)
      File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractors/abs_extractor.py", line 41, in process
        self._evaluate_candidates(document)
      File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractors/environment_extractor.py", line 75, in _evaluate_candidates
        locations = self._evaluate_locations(document)
      File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractors/environment_extractor.py", line 224, in _evaluate_locations
        raw_locations.sort(key=lambda x: x[1], reverse=True)
    TypeError: unorderable types: int() < str()
The raw_locations list is built in the same file, on line 219:

    raw_locations.append([parts, location.raw['place_id'], location.point, bb, area, 0, 0, candidate, 0])

Thus, the sort function tries to sort the locations by their place_id. Check whether the place_id values in your data mix strings and numbers. If they do, you need to convert all entries to one type.
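To illustrate the point, here is a small self-contained sketch with made-up entries (the place names and ids are invented, and `place_id_key` is a hypothetical helper, not part of Giveme5W1H). Sorting on a key that is sometimes an int and sometimes a str raises TypeError in Python 3; coercing the key to a single type avoids it. The conversion with `int()` assumes every place_id is numeric, even when it arrives as a string.

```python
def place_id_key(entry):
    """Sort key that coerces the place_id (index 1) to int.

    Assumes every place_id is numeric, even when stored as a string.
    """
    return int(entry[1])


# Made-up entries: [name, place_id]; note the mixed int/str ids.
raw_locations = [["Rome", 123], ["Milan", "456"], ["Paris", 78]]

try:
    sorted(raw_locations, key=lambda x: x[1], reverse=True)
    mixed_sort_failed = False
except TypeError:
    # Python 3 refuses to compare int with str.
    mixed_sort_failed = True

# Sorting with a normalised key works regardless of the stored type.
raw_locations.sort(key=place_id_key, reverse=True)
print(mixed_sort_failed, [name for name, _ in raw_locations])
```

If some ids might not be numeric, using `str(entry[1])` as the key also gives a consistent type, at the cost of sorting lexicographically rather than numerically.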
Calculating average of data set, with text mixed [closed]
I am required to write a Python program that reads a file and calculates the average GDP for each country over the 10-year period. Basically, my desired result is: Australia: 1248467214849.1 Azerbaijan: 55506365440.0 Bangladesh: 139036345780.9 Brazil: 2057882976008.9 Brunei Darussalam: 14817756697.0 Burkina Faso: 10081729086.1 Cabo Verde: 1719693752.3 Cambodia: 13779735437.1 Chile: 229246627569.0 China: 7784747168448.6 Czech Republic: 207328405561.6 Dominica: 499171357.0 Egypt, Arab Rep.: 247614743339.3 France: 2702817149305.2 Germany: 3582562859622.3 Greece: 270091322197.4 Guam: 5115700000.0 India: 1726508317353.4 Iran, Islamic Rep.: 454617559842.3 Iraq: 169480789377.9 Japan: 5217301203153.5 Jordan: 29469864942.1 Kazakhstan: 168198946242.6 Kenya: 48807995178.8 Korea, Rep.: 1205755199135.1 Latvia: 28908355369.8 Lebanon: 40455121214.3 Lithuania: 42763449721.2 Madagascar: 9486935333.5 Malaysia: 274833978374.2 Mali: 11894695436.7 Mongolia: 9207583282.1 Mozambique: 12838623643.4 Myanmar: 50703575766.4 Nicaragua: 10212597587.4 Nigeria: 375494148527.7 Paraguay: 23250819867.6 Philippines: 231981575952.4 Qatar: 149455747118.1 Singapore: 257026873704.2 Spain: 1404296966483.9 Sweden: 519174481541.8 Tanzania: 36731725995.3 Tunisia: 44118349316.0 Turkmenistan: 29383204467.2 United Kingdom: 2736682446205.8 United States: 16108231800000.0 Vietnam: 144579453846.2 Zambia: 21393965950.9 Zimbabwe: 11907947332.3 and the provided text file is as: 853764622753 1055334825425 927168311000 1142876772659 1390557034408 1538194473087 1567178619062 1459597906913 1345383143356 1204616439828 Australia 33050343783 48852482960 44291490421 52902703376 65951627200 69684317719 74164435946 75244166773 53074370486 37847715736 Azerbaijan 79611888213 91631278239 102477791472 115279077465
128637938711 133355749482 149990451022 172885454931 195078665828 221415162446 Bangladesh 1397084381901 1695824517396 1667019605882 2208871646203 2616201578192 2465188674415 2472806919902 2455993200170 1803652649614 1796186586414 Brazil 12247694247 14393099069 10732366286 13707370737 18525319978 19048495519 18093829923 17098342541 12930394938 11400653732 Brunei Darussalam 6771277871 8369637065 8369175126 8979966766 10724063458 11166063467 11947176342 12377391463 10419303761 11693235542 Burkina Faso 1513934037 1789333749 1711817182 1664310770 1864824081 1751888562 1850951315 1858121723 1574288668 1617467436 Cabo Verde 8639235842 10351914093 10401851851 11242275199 12829541141 14038383450 15449630419 16777820333 18049954289 20016747754 Cambodia 173605968179 179638496279 172389498445 218537551220 252251992029 267122320057 278384332694 260990299051 242517905162 247027912574 Chile 3552182311653 4598206091384 5109953609257 6100620488868 7572553836875 8560547314679 9607224481533 10482372109962 11064666282626 11199145157649 China 189227050760 235718586901 206179982164 207477857919 227948349666 207376427021 209402444996 207818330724 186829940546 195305084919 Czech Republic 421375852 458190185 489074333 493824407 501025303 485997988 501979277 523666347 535095846 581484032 Dominica 130478960092 162818181818 188982374701 218888324505 236001858960 279372758362 288586231502 305529656458 332698041031 332791045964 Egypt, Arab Rep. 
2663112510266 2923465651091 2693827452070 2646837111795 2862680142625 2681416108537 2808511203185 2849305322685 2433562015516 2465453975282 France 3439953462907 3752365607148 3418005001389 3417094562649 3757698281118 3543983909148 3752513503278 3890606893347 3375611100742 3477796274497 Germany 318497936901 354460802549 330000252153 299361576558 287797822093 245670666639 239862011450 237029579261 195541761243 192690813127 Greece 4375000000 4621000000 4781000000 4895000000 4928000000 5199000000 5337000000 5531000000 5697000000 5793000000 Guam 1201111768409 1186952757636 1323940295875 1656617073124 1823049927771 1827637859136 1856722121395 2035393459979 2089865410868 2263792499341 India 349881601459 406070949554 414059094949 487069570464 583500357530 598853401276 467414852231 434474616832 385874474399 418976679729 Iran, Islamic Rep. 88840050497 131613661510 111660855043 138516722650 185749664444 218000986223 234648370497 234648370497 179640210726 171489001692 Iraq 4515264514431 5037908465114 5231382674594 5700098114744 6157459594824 6203213121334 5155717056271 4848733415524 4383076298082 4940158776617 Japan 17110587447 21972004086 23820230000 26425379437 28840263380 30937277606 33593843662 35826925775 37517410282 38654727746 Jordan 104849886826 133441612247 115308661143 148047348241 192626507972 207998568866 236634552078 221415572820 184388432149 137278320084 Kazakhstan 31958195182 35895153328 37021512049 39999659234 41953433591 50412754822 55097343448 61445345999 63767539357 70529014778 Kenya 1122679154632 1002219052968 901934953365 1094499338703 1202463682634 1222807284485 1305604981272 1411333926201 1382764027114 1411245589977 Korea, Rep. 
30901399261 35596016664 26169854045 23757368290 28223552825 28119996053 30314363219 31419072948 27009231911 27572698482 Latvia 24577114428 29227350570 35477118070 38419626628 40075674163 43868565282 46014226808 47833413749 49459296463 49598825982 Lebanon 39738180077 47850551149 37440673478 37120517694 43476878139 42847900766 46473646002 48545251796 41402022148 42738875963 Lithuania 7342923489 9413002921 8550363975 8729936136 9892702358 9919780071 10601690872 10673516673 9744243420 10001193420 Madagascar 193547824063 230813597938 202257586268 255016609233 297951960784 314443149443 323277158907 338061963396 296434003329 296535930381 Malaysia 8145694632 9750822511 10181021770 10678749467 12978107561 12442747897 13246412031 14388360064 13100058100 14034980334 Mali 4234999823 5623216449 4583850368 7189481824 10409797649 12292770631 12582122604 12226514722 11749620620 11183458131 Mongolia 9366742309 11494837053 10911698208 10154238250 13131168012 14534278446 16018848991 16961127046 14798439527 11014858592 Mozambique 20182477481 31862554102 36906181381 49540813342 59977326086 59937797559 60269734045 65446402659 59687373958 63225097051 Myanmar 7423377429 8496965842 8298695145 8758622329 9774316692 10532001130 10982972256 11880438824 12747741540 13230844687 Nicaragua 166451213396 208064753766 169481317540 369062464570 411743801712 460953836444 514966287207 568498937588 481066152889 404652720165 Nigeria 13794910634 18504130753 15929902138 20030528043 25099681461 24595319574 28965906502 30881166852 27282581336 27424071383 Paraguay 149359920006 174195135053 168334599538 199590775190 224143083707 250092093548 271836123724 284584522899 292774099014 304905406845 Philippines 79712087912 115270054945 97798351648 125122306346 167775274725 186833516484 198727747253 206224725275 164641483516 152451923077 Qatar 179981288567 192225881688 192408387762 236421782178 275599459374 289162118909 302510668904 308142766948 296840704102 296975678610 Singapore 1479341637011 1635015380108 
1499099749931 1431616749640 1488067258325 1336018949806 1361854206549 1376910811041 1197789902774 1237255019654 Spain 487816328342 513965650650 429657033108 488377689565 563109663291 543880647757 578742001488 573817719109 497918109302 514459972806 Sweden 21501741757 27368386358 28573777052 31407908612 33878631649 39087748240 44333456245 48197218327 45628320606 47340071107 Tanzania 38908069299 44856586316 43454935940 44050929160 45810626509 45044112939 46251061734 47587913059 43156708809 42062549395 Tunisia 12664165103 19271523179 20214385965 22583157895 29233333333 35164210526 39197543860 43524210526 35799628571 36179885714 Turkmenistan 3074359743898 2890564338235 2382825985356 2441173394730 2619700404733 2662085168499 2739818680930 3022827781881 2885570309161 2647898654635 United Kingdom 14477635000000 14718582000000 14418739000000 14964372000000 15517926000000 16155255000000 16691517000000 17393103000000 18120714000000 18624475000000 United States 77414425532 99130304099 106014659770 115931749697 135539438560 155820001920 171222025117 186204652922 193241108710 205276172135 Vietnam 14056957976 17910858638 15328342304 20265556274 23460098340 25503370699 28045460442 27150630607 21154394546 21063989683 Zambia 5291950100 4415702800 8621573608 10141859710 12098450749 14242490252 15451768659 15891049236 16304667807 16619960402 Zimbabwe So what I have thought of so far is: to use an aggregation loop that checks whether the current line is a GDP value or the name of a country: when it reaches the name of a country it should calculate the average and print out the result, then it should reset the per-country aggregation variables and continue looping to aggregate the next country's GDP values. And so to handle the mixed nature of the input file, I would either use the str.isnumeric() method or keep a counter to check when 10 GDP values have been read (since the next line would then be the name of the corresponding country). for value in open("10year-gdp.txt"):
Something like this in Python 3 may work:

    import statistics

    with open('10year-gdp.txt') as f:
        items = []
        for line in f.readlines():
            line = line.strip()
            if line.isdigit():
                items.append(float(line))
            else:
                print('{0}: {1}'.format(line, statistics.mean(items)))
                items = []
You can try this one too:

    with open("10year-gdp.txt", "r") as infile:
        content = infile.readlines()

    content = [content[i:i+11] for i in range(0,len(content),11)]
    results = [": ".join([c[10],str(sum(map(float,c[0:10]))/10)]).replace("\n","") for c in content]
    for result in results:
        print(result)

Output:

    Australia: 1248467214849.1
    Azerbaijan: 55506365440.0
    Bangladesh: 139036345780.9
    Brazil: 2057882976008.9
    Brunei Darussalam: 14817756697.0
    Burkina Faso: 10081729086.1
    Cabo Verde: 1719693752.3
    Cambodia: 13779735437.1
    Chile: 229246627569.0
    China: 7784747168448.6
    Czech Republic: 207328405561.6
    Dominica: 499171357.0
    Egypt, Arab Rep.: 247614743339.3
    France: 2702817149305.2
    Germany: 3582562859622.3
    Greece: 270091322197.4
    Guam: 5115700000.0
    India: 1726508317353.4
    Iran, Islamic Rep.: 454617559842.3
    Iraq: 169480789377.9
    Japan: 5217301203153.5
    Jordan: 29469864942.1
    Kazakhstan: 168198946242.6
    Kenya: 48807995178.8
    Korea, Rep.: 1205755199135.1
    Latvia: 28908355369.8
    Lebanon: 40455121214.3
    Lithuania: 42763449721.2
    Madagascar: 9486935333.5
    Malaysia: 274833978374.2
    Mali: 11894695436.7
    Mongolia: 9207583282.1
    Mozambique: 12838623643.4
    Myanmar: 50703575766.4
    Nicaragua: 10212597587.4
    Nigeria: 375494148527.7
    Paraguay: 23250819867.6
    Philippines: 231981575952.4
    Qatar: 149455747118.1
    Singapore: 257026873704.2
    Spain: 1404296966483.9
    Sweden: 519174481541.8
    Tanzania: 36731725995.3
    Tunisia: 44118349316.0
    Turkmenistan: 29383204467.2
    United Kingdom: 2736682446205.8
    United States: 16108231800000.0
    Vietnam: 144579453846.2
    Zambia: 21393965950.9
    Zimbabwe: 11907947332.3
    #!/usr/bin/env python
    from statistics import mean

    GDPGroup = []       # values collected for the country currently being read
    GDPDictionary = {}  # country name -> list of values, later replaced by the mean

    with open("10year-gdp.txt") as FileObject:
        lines = FileObject.readlines()

    for line in lines:
        line = line.strip()
        if not line.isdigit():
            # A country name closes the block of values that precedes it.
            GDPDictionary[line] = GDPGroup
            GDPGroup = []
        else:
            GDPGroup.append(float(line))

    # Replace each list of values with its average.
    for key in GDPDictionary:
        array = GDPDictionary[key]
        GDPDictionary[key] = mean(array)

    print(GDPDictionary)

Prints out:

    {'Guam': 5115700000.0, 'Lithuania': 42763449721.2, 'Azerbaijan': 55506365440.0, 'Bangladesh': 139036345780.9, 'Egypt, Arab Rep.': 247614743339.3, 'Burkina Faso': 10081729086.1, 'Chile': 229246627569.0, 'Mongolia': 9207583282.1, 'Nicaragua': 10212597587.4, 'Brazil': 2057882976008.9, 'Kenya': 48807995178.8, 'Dominica': 499171357.0, 'Japan': 5217301203153.5, 'India': 1726508317353.4, 'Cabo Verde': 1719693752.3, 'United States': 16108231800000.0, 'Greece': 270091322197.4, 'Myanmar': 50703575766.4, 'Madagascar': 9486935333.5, 'Tunisia': 44118349316.0, 'Mozambique': 12838623643.4, 'Cambodia': 13779735437.1, 'Iraq': 169480789377.9, 'Korea, Rep.': 1205755199135.1, 'Kazakhstan': 168198946242.6, 'Turkmenistan': 29383204467.2, 'Germany': 3582562859622.3, 'Iran, Islamic Rep.': 454617559842.3, 'France': 2702817149305.2, 'Paraguay': 23250819867.6, 'United Kingdom': 2736682446205.8, 'Malaysia': 274833978374.2, 'Philippines': 231981575952.4, 'Qatar': 149455747118.1, 'Lebanon': 40455121214.3, 'Jordan': 29469864942.1, 'Mali': 11894695436.7, 'Zambia': 21393965950.9, 'Australia': 1248467214849.1, 'Singapore': 257026873704.2, 'Zimbabwe': 11907947332.3, 'Sweden': 519174481541.8, 'Nigeria': 375494148527.7, 'China': 7784747168448.6, 'Tanzania': 36731725995.3, 'Czech Republic': 207328405561.6, 'Vietnam': 144579453846.2, 'Latvia': 28908355369.8, 'Spain': 1404296966483.9, 'Brunei Darussalam': 14817756697.0}
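For comparison, the same grouping idea can be written as a small generator that does not rely on every country having exactly ten values, and that also accepts decimal values (which `str.isdigit()` would reject). This is only a sketch: the name `average_by_country` and the inline sample data are made up for illustration, and it assumes, as in the file above, that each country name comes after its block of values.

```python
from statistics import mean


def average_by_country(lines):
    """Yield (country, mean) pairs from lines where each block of numeric
    values is followed by the name of the country it belongs to."""
    values = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        try:
            values.append(float(line))
        except ValueError:
            # Not a number, so this line is a country name closing the block.
            if values:
                yield line, mean(values)
            values = []


# Hypothetical sample data, not taken from the real file.
sample = ["10", "20", "30", "Atlantis", "1.5", "2.5", "El Dorado"]
averages = dict(average_by_country(sample))
print(averages)
```

With the real file, the generator can be fed the open file object directly, e.g. `for country, avg in average_by_country(open("10year-gdp.txt")): print(country, avg)`.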
BeautifulSoup, extract a table (from poorly designed site) and turn it into a CSV
I'm trying to extract this table in whole - any tips? I've tried the following code 8 different ways, to no avail. Thank you!

    data = []
    table = soup.find_all("tbody")
    rows = table.find_all("tr")
    for row in rows:
        cols = row.find_all("td")
        cols = [ele.text.strip() for ele in cols]
        data.append([ele for ele in cols if ele])
Code:

    import requests
    from bs4 import BeautifulSoup

    html = requests.get('http://www.boxofficemojo.com/alltime/adjusted.htm').text
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', cellspacing='1')
    f = open('data.csv','w')
    for row in table.find_all('tr'):
        print(''.join(row.findAll(text=True)).replace('\n', '|'))
        f.write(''.join(row.findAll(text=True)).replace('\n', '|') + '\n')
    f.close()

Output:

1|Gone with the Wind|MGM|$1,854,769,700|$198,676,459|1939^| 2|Star Wars|Fox|$1,635,137,900|$460,998,007|1977^| 3|The Sound of Music|Fox|$1,307,373,200|$158,671,368|1965| 4|E.T.: The Extra-Terrestrial|Uni.|$1,302,222,800|$435,110,554|1982^| 5|Titanic|Par.|$1,244,347,300|$659,363,944|1997^| 6|The Ten Commandments|Par.|$1,202,580,000|$65,500,000|1956| 7|Jaws|Uni.|$1,175,763,500|$260,000,000|1975| 8|Doctor Zhivago|MGM|$1,139,563,500|$111,721,910|1965| 9|The Exorcist|WB|$1,015,300,400|$232,906,145|1973^| 10|Snow White and the Seven Dwarfs|Dis.|$1,000,620,000|$184,925,486|1937^| 11|Star Wars: The Force Awakens|BV|$992,496,600|$936,662,225|2015| 12|101 Dalmatians|Dis.|$917,240,400|$144,880,014|1961^| 13|The Empire Strikes Back|Fox|$901,298,200|$290,475,067|1980^| 14|Ben-Hur|MGM|$899,640,000|$74,000,000|1959| 15|Avatar|Fox|$893,301,900|$760,507,625|2009^| 16|Return of the Jedi|Fox|$863,465,400|$309,306,177|1983^| 17|Jurassic Park|Uni.|$843,843,500|$402,453,882|1993^| 18|Star Wars: Episode I - The Phantom Menace|Fox|$829,064,800|$474,544,677|1999^| 19|The Lion King|BV|$818,364,200|$422,783,777|1994^| 20|The Sting|Uni.|$818,331,400|$156,000,000|1973| 21|Raiders of the Lost Ark|Par.|$812,675,900|$248,159,971|1981^| 22|The Graduate|AVCO|$785,595,300|$104,945,305|1967^| 23|Fantasia|Dis.|$762,339,100|$76,408,097|1941^| 24|Jurassic World|Uni.|$725,671,700|$652,270,625|2015| 25|The Godfather|Par.|$724,509,200|$134,966,411|1972^| 26|Forrest Gump|Par.|$721,682,300|$330,252,182|1994^| 27|Mary Poppins|Dis.|$717,709,100|$102,272,727|1964^|
28|Grease|Par.|$706,577,200|$188,755,690|1978^| 29|Marvel's The Avengers|BV|$705,769,500|$623,357,910|2012| 30|Thunderball|UA|$686,664,000|$63,595,658|1965| 31|The Dark Knight|WB|$683,575,000|$534,858,444|2008^| 32|The Jungle Book|Dis.|$676,381,600|$141,843,612|1967^| 33|Sleeping Beauty|Dis.|$667,166,200|$51,600,000|1959^| 34|Ghostbusters|Col.|$653,374,800|$242,212,467|1984^| 35|Shrek 2|DW|$652,247,500|$441,226,247|2004| 36|Butch Cassidy and the Sundance Kid|Fox|$647,721,100|$102,308,889|1969| 37|Love Story|Par.|$642,583,000|$106,397,186|1970| 38|Spider-Man|Sony|$637,870,000|$403,706,375|2002| 39|Independence Day|Fox|$635,888,300|$306,169,268|1996^| 40|Home Alone|Fox|$621,799,900|$285,761,243|1990| 41|Pinocchio|Dis.|$618,762,600|$84,254,167|1940^| 42|Cleopatra (1963)|Fox|$616,744,200|$57,777,778|1963| 43|Beverly Hills Cop|Par.|$616,437,200|$234,760,478|1984| 44|Star Wars: The Last Jedi|BV|$615,738,300|$615,738,279|2017| 45|Goldfinger|UA|$608,634,000|$51,081,062|1964| 46|Airport|Uni.|$606,901,600|$100,489,151|1970| 47|American Graffiti|Uni.|$603,257,100|$115,000,000|1973| 48|The Robe|Fox|$600,872,700|$36,000,000|1953| 49|Pirates of the Caribbean: Dead Man's Chest|BV|$593,288,400|$423,315,812|2006| 50|Around the World in 80 Days|UA|$593,169,200|$42,000,000|1956| 51|Bambi|RKO|$584,880,300|$102,247,150|1942^| 52|Blazing Saddles|WB|$580,539,700|$119,601,481|1974^| 53|Batman|WB|$577,923,400|$251,188,924|1989| 54|The Bells of St. 
Mary's|RKO|$576,000,000|$21,333,333|1945| 55|The Lord of the Rings: The Return of the King|NL|$565,852,400|$377,845,905|2003^| 56|Finding Nemo|BV|$565,364,200|$380,843,261|2003^| 57|The Towering Inferno|Fox|$563,428,600|$116,000,000|1974| 58|Rogue One: A Star Wars Story|BV|$554,854,100|$532,177,324|2016| 59|Cinderella (1950)|Dis.|$553,567,100|$93,141,149|1950^| 60|Spider-Man 2|Sony|$552,257,300|$373,585,825|2004| 61|My Fair Lady|WB|$550,800,000|$72,000,000|1964| 62|The Greatest Show on Earth|Par.|$550,800,000|$36,000,000|1952| 63|National Lampoon's Animal House|Uni.|$549,792,700|$141,600,000|1978^| 64|The Passion of the Christ|NM|$548,090,400|$370,782,930|2004^| 65|Star Wars: Episode III - Revenge of the Sith|Fox|$544,599,700|$380,270,577|2005^| 66|Back to the Future|Uni.|$542,085,000|$210,609,762|1985| 67|The Lord of the Rings: The Two Towers|NL|$529,918,100|$342,551,365|2002^| 68|The Dark Knight Rises|WB|$528,601,000|$448,139,099|2012| 69|The Sixth Sense|BV|$528,576,400|$293,506,292|1999| 70|Superman|WB|$526,547,600|$134,218,018|1978| 71|Tootsie|Col.|$522,378,200|$177,200,000|1982| 72|Smokey and the Bandit|Uni.|$521,726,300|$126,737,428|1977| 73|Beauty and the Beast (2017)|BV|$521,407,600|$504,014,165|2017| 74|Finding Dory|BV|$515,531,300|$486,295,561|2016| 75|West Side Story|MGM|$513,807,200|$43,656,822|1961| 76|Close Encounters of the Third Kind|Col.|$513,370,800|$135,189,114|1977^| 77|Harry Potter and the Sorcerer's Stone|WB|$513,281,200|$317,575,550|2001| 78|Lady and the Tramp|Dis.|$511,646,200|$93,602,326|1955^| 79|Lawrence of Arabia|Col.|$508,421,000|$44,824,144|1962^| 80|The Rocky Horror Picture Show|Fox|$505,537,300|$112,892,319|1975| 81|Rocky|UA|$505,267,000|$117,235,147|1976| 82|The Best Years of Our Lives|RKO|$504,900,000|$23,650,000|1946| 83|The Poseidon Adventure|Fox|$504,000,000|$84,563,118|1972| 84|The Lord of the Rings: The Fellowship of the Ring|NL|$503,057,400|$315,544,750|2001^| 85|Twister|WB|$502,037,000|$241,721,524|1996| 86|Men in 
Black|Sony|$501,381,100|$250,690,539|1997| 87|The Bridge on the River Kwai|Col.|$499,392,000|$27,200,000|1957| 88|Transformers: Revenge of the Fallen|P/DW|$494,810,500|$402,111,870|2009| 89|It's a Mad, Mad, Mad, Mad World|MGM|$494,576,300|$46,332,858|1963| 90|Swiss Family Robinson|Dis.|$493,957,400|$40,356,000|1960| 91|One Flew Over the Cuckoo's Nest|UA|$492,831,600|$108,981,275|1975| 92|M.A.S.H.|Fox|$492,821,000|$81,600,000|1970| 93|Indiana Jones and the Temple of Doom|Par.|$491,431,300|$179,870,271|1984| 94|Avengers: Age of Ultron|BV|$491,377,100|$459,005,868|2015| 95|Star Wars: Episode II - Attack of the Clones|Fox|$490,840,600|$310,676,740|2002^| 96|Toy Story 3|BV|$489,656,000|$415,004,880|2010| 97|Mrs. Doubtfire|Fox|$483,642,600|$219,195,243|1993| 98|Aladdin|BV|$481,420,700|$217,350,219|1992| 99|Ghost|Par.|$472,450,700|$217,631,306|1990| 100|The Hunger Games: Catching Fire|LGF|$469,232,400|$424,668,047|2013| 101|Duel in the Sun|Selz.|$468,367,300|$20,408,163|1946| 102|The Hunger Games|LGF|$466,924,700|$408,010,692|2012| 103|Pirates of the Caribbean: The Curse of the Black Pearl|BV|$464,956,900|$305,413,918|2003| 104|House of Wax|WB|$463,883,000|$23,750,000|1953| 105|Rear Window|Par.|$462,256,500|$36,764,313|1954^| 106|The Lost World: Jurassic Park|Uni.|$458,173,400|$229,086,679|1997| 107|Indiana Jones and the Last Crusade|Par.|$453,643,400|$197,171,806|1989| 108|Monsters, Inc.|BV|$453,061,600|$289,916,256|2001^| 109|Frozen|BV|$450,196,500|$400,738,009|2013| 110|Spider-Man 3|Sony|$449,033,200|$336,530,303|2007| 111|Iron Man 3|BV|$448,060,700|$409,013,994|2013| 112|Terminator 2: Judgment Day|TriS|$447,732,400|$205,881,154|1991^| 113|Sergeant York|WB|$441,770,900|$16,361,885|1941| 114|How the Grinch Stole Christmas|Uni.|$441,620,600|$260,044,825|2000| 115|Top Gun|Par.|$440,917,900|$179,800,601|1986^| 116|Harry Potter and the Deathly Hallows Part 2|WB|$440,547,300|$381,011,219|2011| 117|Toy Story 2|BV|$439,139,300|$245,852,179|1999^| 
118|Shrek|DW|$434,128,000|$267,665,011|2001| 119|Shrek the Third|P/DW|$430,606,000|$322,719,944|2007| 120|Despicable Me 2|Uni.|$430,487,800|$368,061,265|2013| 121|Captain America: Civil War|BV|$429,213,000|$408,084,349|2016| 122|The Matrix Reloaded|WB|$428,668,600|$281,576,461|2003| 123|Transformers|P/DW|$425,970,900|$319,246,193|2007| 124|Crocodile Dundee|Par.|$424,138,600|$174,803,506|1986| 125|Wonder Woman|WB|$423,340,500|$412,563,408|2017| 126|The Four Horsemen of the Apocalypse|MPC|$421,530,600|$9,183,673|1921| 127|Saving Private Ryan|DW|$419,958,100|$216,540,909|1998| 128|Young Frankenstein|Fox|$419,041,900|$86,273,333|1974| 129|Peter Pan|Dis.|$418,824,000|$87,404,651|1953^| 130|Gremlins|WB|$417,526,300|$153,083,102|1984^| 131|Beauty and the Beast|BV|$416,438,900|$218,967,620|1991^| 132|The Chronicles of Narnia: The Lion, the Witch and the Wardrobe|BV|$414,717,600|$291,710,957|2005| 133|Harry Potter and the Goblet of Fire|WB|$414,709,000|$290,013,036|2005| 134|Pirates of the Caribbean: At World's End|BV|$412,860,400|$309,420,425|2007| 135|Harry Potter and the Chamber of Secrets|WB|$412,327,800|$261,988,482|2002| 136|The Fugitive|WB|$407,567,300|$183,875,760|1993| 137|The Caine Mutiny|Col.|$407,479,600|$21,750,000|1954| 138|Iron Man|Par.|$407,095,000|$318,412,101|2008| 139|Transformers: Dark of the Moon|P/DW|$406,315,000|$352,390,543|2011| 140|Meet the Fockers|Uni.|$405,508,300|$279,261,160|2004| 141|Indiana Jones and the Kingdom of the Crystal Skull|Par.|$405,430,100|$317,101,119|2008| 142|Toy Story|BV|$402,711,200|$191,796,233|1995^| 143|Dances with Wolves|Orion|$401,159,500|$184,208,848|1990| 144|An Officer and a Gentleman|Par.|$400,769,900|$129,795,554|1982| 145|Guardians of the Galaxy Vol. 
2|BV|$399,848,900|$389,813,101|2017| 146|2001: A Space Odyssey|MGM|$397,829,200|$56,954,992|1968^| 147|Rain Man|MGM|$397,417,800|$172,825,435|1988| 148|The Secret Life of Pets|Uni.|$397,253,600|$368,384,330|2016| 149|Guess Who's Coming to Dinner|Col.|$397,099,200|$56,666,667|1967| 150|Inside Out|BV|$396,452,900|$356,461,711|2015| 151|American Sniper|WB|$395,474,400|$350,126,372|2014| 152|Kramer Vs. Kramer|Col.|$394,925,800|$106,260,000|1979| 153|Armageddon|BV|$394,560,300|$201,578,182|1998| 154|Psycho|Uni.|$391,680,100|$32,000,000|1960| 155|Rocky III|UA|$390,271,700|$125,049,125|1982^| 156|Harry Potter and the Order of the Phoenix|WB|$389,622,600|$292,004,738|2007| 157|Rambo: First Blood Part II|TriS|$388,961,600|$150,415,432|1985| 158|Batman Forever|WB|$388,369,100|$184,031,112|1995| 159|Deadpool|Fox|$388,249,600|$363,070,709|2016| 160|Pretty Woman|BV|$387,179,600|$178,406,268|1990| 161|Earthquake|Uni.|$386,952,300|$79,666,653|1974| 162|Alice in Wonderland (2010)|BV|$385,896,200|$334,191,110|2010| 163|The Incredibles|BV|$385,835,000|$261,441,092|2004| 164|Cast Away|Fox|$384,588,700|$233,632,142|2000| 165|Home Alone 2: Lost in New York|Fox|$384,179,200|$173,585,516|1992| 166|The Jungle Book (2016)|BV|$382,904,500|$364,001,123|2016| 167|Three Men and a Baby|BV|$382,840,700|$167,780,960|1987| 168|My Big Fat Greek Wedding|IFC|$380,230,800|$241,438,208|2002| 169|Guardians of the Galaxy|BV|$378,010,100|$333,176,600|2014| 170|Furious 7|Uni.|$376,598,400|$353,007,020|2015| 171|Mission: Impossible|Par.|$375,885,400|$180,981,856|1996| 172|The Hunger Games: Mockingjay - Part 1|LGF|$373,872,900|$337,135,885|2014| 173|Minions|Uni.|$373,756,800|$336,045,770|2015| 174|Saturday Night Fever|Par.|$372,751,500|$94,213,184|1977| 175|On Golden Pond|Uni.|$372,564,100|$119,285,432|1981| 176|Austin Powers: The Spy Who Shagged Me|NL|$372,332,300|$206,040,086|1999| 177|Harry Potter and the Half-Blood Prince|WB|$371,524,900|$301,959,197|2009| 178|Bruce 
Almighty|Uni.|$369,680,400|$242,829,261|2003| 179|Harry Potter and the Prisoner of Azkaban|WB|$368,886,800|$249,541,069|2004| 180|Funny Girl|Col.|$367,562,200|$52,223,306|1968^| 181|Mission: Impossible II|Par.|$366,876,200|$215,409,889|2000| 182|Rush Hour 2|NL|$366,817,700|$226,164,286|2001| 183|Apollo 13|Uni.|$365,894,000|$173,837,933|1995^| 184|Patton|Fox|$365,718,000|$61,749,765|1970| 185|Fatal Attraction|Par.|$364,269,300|$156,645,693|1987| 186|Zootopia|BV|$363,584,000|$341,268,248|2016| 187|Liar Liar|Uni.|$362,821,200|$181,410,615|1997| 188|Robin Hood: Prince of Thieves|WB|$360,863,200|$165,493,908|1991| 189|Beverly Hills Cop II|Par.|$360,778,800|$153,665,036|1987| 190|Iron Man 2|Par.|$360,772,100|$312,433,331|2010| 191|Up|BV|$360,533,300|$293,004,164|2009| 192|Batman Returns|WB|$360,191,600|$162,831,698|1992| 193|Signs|BV|$360,164,800|$227,966,634|2002| 194|Jumanji: Welcome to the Jungle|Sony|$358,036,900|$358,036,871|2017| 195|The Twilight Saga: Eclipse|Sum.|$357,823,200|$300,531,751|2010| 196|Superman II|WB|$357,246,300|$108,185,706|1981| 197|The Twilight Saga: New Moon|Sum.|$357,194,500|$296,623,634|2009| 198|What's Up, Doc?|WB|$356,400,000|$66,000,000|1972| 199|9 to 5|Fox|$352,493,200|$103,290,500|1980| 200|Batman v Superman: Dawn of Justice|WB|$351,232,600|$330,360,194|2016| 201|The Firm|Par.|$351,120,300|$158,348,367|1993| 202|Suicide Squad|WB|$350,483,800|$325,100,054|2016| 203|Who Framed Roger Rabbit|BV|$349,448,400|$156,452,370|1988| 204|Inception|WB|$348,133,400|$292,576,195|2010| 205|Skyfall|Sony|$347,389,600|$304,360,277|2012| 206|The Hobbit: An Unexpected Journey|WB (NL)|$347,313,400|$303,003,568|2012| 207|Porky's|Fox|$346,289,600|$111,289,673|1982^| 208|Air Force One|Sony|$345,835,200|$172,956,409|1997| 209|Stir Crazy|Col.|$345,700,400|$101,300,000|1980| 210|A Star Is Born (1976)|WB|$344,788,700|$80,000,000|1976| 211|There's Something About Mary|Fox|$344,053,800|$176,484,651|1998| 212|Spider-Man: Homecoming|Sony|$343,499,000|$334,201,140|2017| 
213|Cars|BV|$342,088,800|$244,082,982|2006| 214|The Hangover|WB|$341,182,900|$277,322,503|2009| 215|Lethal Weapon 2|WB|$340,501,700|$147,253,986|1989| 216|Night at the Museum|Fox|$340,041,900|$250,863,268|2006| 217|Harry Potter and the Deathly Hallows Part 1|WB|$339,560,700|$295,983,305|2010| 218|I Am Legend|WB|$337,126,200|$256,393,010|2007| 219|Austin Powers in Goldmember|NL|$337,033,800|$213,307,889|2002| 220|War of the Worlds|Par.|$335,521,600|$234,280,354|2005| 221|It|WB (NL)|$335,148,900|$327,481,748|2017| 222|Every Which Way But Loose|WB|$334,232,400|$85,196,485|1978| 223|The Twilight Saga: Breaking Dawn Part 2|LG/S|$333,495,700|$292,324,737|2012| 224|The Love Bug|Dis.|$331,410,900|$51,264,000|1969| 225|The Twilight Saga: Breaking Dawn Part 1|Sum.|$329,680,800|$281,287,133|2011| 226|You Only Live Twice|UA|$329,598,600|$43,084,787|1967| 227|X-Men: The Last Stand|Fox|$328,465,300|$234,362,462|2006| 228|The Mummy Returns|Uni.|$327,657,500|$202,019,785|2001| 229|X2: X-Men United|Fox|$327,236,800|$214,949,694|2003| 230|Platoon|Orion|$325,302,500|$138,530,565|1986| 231|Rocky IV|UA|$324,855,400|$127,873,716|1985| 232|Pearl Harbor|BV|$322,017,800|$198,542,554|2001| 233|True Lies|Fox|$321,261,400|$146,282,411|1994| 234|Heaven Can Wait (1978)|Par.|$320,281,100|$81,640,278|1978| 235|Lethal Weapon 3|WB|$320,153,100|$144,731,527|1992| 236|Look Who's Talking|TriS|$319,854,500|$140,088,813|1989| 237|Gladiator|DW|$319,592,900|$187,705,427|2000| 238|Man of Steel|WB|$318,830,300|$291,045,518|2013| 239|Jaws 2|Uni.|$318,717,900|$81,766,007|1978^| 240|Star Trek|Par.|$317,150,800|$257,730,019|2009| 241|The Santa Clause|BV|$316,776,400|$144,833,357|1994| 242|The Amityville Horror|AIP|$316,113,900|$86,432,000|1979| 243|Thor: Ragnarok|BV|$314,143,200|$314,143,225|2017| 244|The Waterboy|BV|$314,053,600|$161,491,646|1998| 245|A Bug's Life|BV|$313,363,900|$162,798,565|1998| 246|A Few Good Men|Col.|$313,069,200|$141,340,178|1992| 247|The Odd Couple|Par.|$312,030,500|$44,527,234|1968| 
248|Rocky II|UA|$311,542,700|$85,182,160|1979| 249|Jerry Maguire|Sony|$311,468,800|$153,952,592|1996| 250|The Perfect Storm|WB|$311,027,300|$182,618,434|2000| 251|King Kong|Uni.|$310,014,100|$218,080,025|2005| 252|The Matrix|WB|$309,879,100|$171,479,930|1999| 253|The Amazing Spider-Man|Sony|$309,163,500|$262,030,663|2012| 254|Tarzan|BV|$309,122,000|$171,091,819|1999| 255|Sister Act|BV|$308,813,300|$139,605,150|1992| 256|Hooper|WB|$306,000,000|$78,000,000|1978| 257|The Blind Side|WB|$305,701,600|$255,959,475|2009| 258|The Da Vinci Code|Sony|$304,882,700|$217,536,138|2006| 259|Monsters University|BV|$304,779,900|$268,492,764|2013| 260|All the President's Men|WB|$304,276,100|$70,600,000|1976| 261|What Women Want|Par.|$303,763,400|$182,811,707|2000| 262|The Bourne Ultimatum|Uni.|$303,515,200|$227,471,070|2007| 263|Gravity|WB|$302,369,300|$274,092,705|2013| 264|Honey, I Shrunk the Kids|BV|$302,279,100|$130,724,172|1989| 265|Terms of Endearment|Par.|$301,824,600|$108,423,489|1983| 266|Men in Black II|Sony|$300,868,300|$190,418,803|2002| 267|Star Trek: The Motion Picture|Par.|$300,849,700|$82,258,456|1979| 268|Wedding Crashers|NL|$299,683,200|$209,255,921|2005| 269|Despicable Me|Uni.|$299,217,100|$251,513,985|2010| 270|Pocahontas|BV|$298,782,100|$141,579,773|1995| 271|Arthur|WB|$298,725,900|$95,461,682|1981| 272|The Hunger Games: Mockingjay - Part 2|LGF|$297,446,700|$281,723,902|2015| 273|The LEGO Movie|WB|$296,654,200|$257,760,692|2014| 274|Batman Begins|WB|$295,860,600|$206,852,432|2005^| 275|Apocalypse Now|MGM|$295,789,400|$83,471,511|1979^| 276|Charlie and the Chocolate Factory|WB|$295,677,800|$206,459,076|2005| 277|Big Daddy|Sony|$295,422,100|$163,479,795|1999| 278|Ocean's Eleven|WB|$294,446,200|$183,417,150|2001| 279|Jurassic Park III|Uni.|$293,844,100|$181,171,875|2001| 280|Teenage Mutant Ninja Turtles|NL|$293,555,800|$135,265,915|1990| 281|Planet of the Apes (2001)|Fox|$291,948,200|$180,011,740|2001| 282|Alien|Fox|$291,755,600|$80,931,801|1979^| 
283|Hancock|Sony|$291,441,100|$227,946,274|2008| 284|As Good as It Gets|Sony|$290,776,100|$148,478,011|1997| 285|The Hangover Part II|WB|$289,972,400|$254,464,305|2011| 286|Midnight Cowboy|UA|$289,525,900|$44,785,053|1969| 287|The Hobbit: The Desolation of Smaug|WB (NL)|$289,308,500|$258,366,855|2013| 288|The French Connection|Fox|$287,640,000|$51,700,000|1971| 289|The Flintstones|Uni.|$286,669,000|$130,531,208|1994| 290|Captain America: The Winter Soldier|BV|$286,373,800|$259,766,572|2014| 291|Coming to America|Par.|$286,238,000|$128,152,301|1988| 292|National Treasure: Book of Secrets|BV|$286,164,000|$219,964,115|2007| 293|WALL-E|BV|$286,150,300|$223,808,164|2008| 294|The Hobbit: The Battle of the Five Armies|WB (NL)|$285,304,300|$255,119,788|2014| 295|The Silence of the Lambs|Orion|$285,087,900|$130,742,922|1991| 296|The Karate Kid Part II|Col.|$284,812,500|$115,103,979|1986| 297|Airplane!|Par.|$284,796,800|$83,453,539|1980| 298|Alvin and the Chipmunks|Fox|$284,128,700|$217,326,974|2007| 299|Meet the Parents|Uni.|$282,676,300|$166,244,045|2000| 300|Ransom|BV|$282,366,800|$136,492,681|1996|
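Joining cells with '|' works for this table, but it breaks as soon as a cell itself contains the separator, and the result is not really CSV. A minimal sketch of the writing step with the standard `csv` module follows. It assumes the rows have already been extracted into lists of cell strings (for example with `[td.get_text(strip=True) for td in row.find_all('td')]`); the helper name `rows_to_csv` is illustrative only, and an in-memory buffer stands in for the real file.

```python
import csv
import io


def rows_to_csv(rows, fileobj):
    """Write an iterable of rows (lists of cell strings) as CSV."""
    writer = csv.writer(fileobj)
    for row in rows:
        writer.writerow(row)


# Two rows copied from the output above, already split into cells.
rows = [
    ["1", "Gone with the Wind", "MGM", "$1,854,769,700", "$198,676,459", "1939^"],
    ["3", "The Sound of Music", "Fox", "$1,307,373,200", "$158,671,368", "1965"],
]

buffer = io.StringIO()  # stands in for open("data.csv", "w", newline="")
rows_to_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The writer quotes the dollar amounts automatically because they contain commas, so the file can be read back with `csv.reader` without losing the column boundaries.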
BeautifulSoup - how to arrange data and write to txt?
I'm new to Python and have a simple problem. I am pulling some data from Yahoo Fantasy Baseball into a text file, but my code doesn't work properly:

    from bs4 import BeautifulSoup
    import urllib2

    teams = ("http://baseball.fantasysports.yahoo.com/b1/2282/players?status=A&pos=B&cut_type=33&stat1=S_S_2015&myteam=0&sort=AR&sdir=1")
    page = urllib2.urlopen(teams)
    soup = BeautifulSoup(page, "html.parser")
    players = soup.findAll('div', {'class':'ysf-player-name Nowrap Grid-u Relative Lh-xs Ta-start'})
    playersLines = [span.get_text('\t',strip=True) for span in players]
    with open('output.txt', 'w') as f:
        for line in playersLines:
            line = playersLines[0]
            output = line.encode('utf-8')
            f.write(output)

The output file contains only one player, repeated 25 times. Any ideas how to get a result like this?

    Pedro Álvarez Pit - 1B,3B
    Kevin Pillar Tor - OF
    Melky Cabrera CWS - OF
    etc
Try removing this line, which overwrites the loop variable with the first player on every iteration:

    line = playersLines[0]

Also, append a newline character to the end of your output so that each player is written on a separate line of the output.txt file:

    from bs4 import BeautifulSoup
    import urllib2

    teams = ("http://baseball.fantasysports.yahoo.com/b1/2282/players?status=A&pos=B&cut_type=33&stat1=S_S_2015&myteam=0&sort=AR&sdir=1")
    page = urllib2.urlopen(teams)
    soup = BeautifulSoup(page, "html.parser")
    players = soup.findAll('div', {'class':'ysf-player-name Nowrap Grid-u Relative Lh-xs Ta-start'})
    playersLines = [span.get_text('\t',strip=True) for span in players]
    with open('output.txt', 'w') as f:
        for line in playersLines:
            output = line.encode('utf-8')
            f.write(output+'\n')

Results:

    Pedro Álvarez Pit - 1B,3B
    Kevin Pillar Tor - OF
    Melky Cabrera CWS - OF
    Ryan Howard Phi - 1B
    Michael A. Taylor Was - OF
    Joe Mauer Min - 1B
    Maikel Franco Phi - 3B
    Joc Pederson LAD - OF
    Yangervis Solarte SD - 1B,2B,3B
    César Hernández Phi - 2B,3B,SS
    Eddie Rosario Min - 2B,OF
    Austin Jackson Sea - OF
    Danny Espinosa Was - 1B,2B,3B,SS
    Danny Valencia Oak - 1B,3B,OF
    Freddy Galvis Phi - 3B,SS
    Jimmy Paredes Bal - 2B,3B
    Colby Rasmus Hou - OF
    Luis Valbuena Hou - 1B,2B,3B
    Chris Young NYY - OF
    Kevin Kiermaier TB - OF
    Steven Souza TB - OF
    Jace Peterson Atl - 2B,3B
    Juan Lagares NYM - OF
    A.J. Pierzynski Atl - C
    Khris Davis Mil - OF
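The code above is Python 2 (`urllib2`, writing encoded bytes to a text-mode file). On Python 3 the manual `encode` step is not needed if the file is opened with an explicit encoding. A small sketch of just the writing part, using three of the player lines as stand-in data; the scraping is omitted, and the temporary output path is chosen only for illustration:

```python
import os
import tempfile

# Stand-in for the scraped playersLines list.
players_lines = [
    "Pedro Álvarez\tPit - 1B,3B",
    "Kevin Pillar\tTor - OF",
    "Melky Cabrera\tCWS - OF",
]

out_path = os.path.join(tempfile.mkdtemp(), "output.txt")

# Text mode with encoding="utf-8" lets us write str objects directly.
with open(out_path, "w", encoding="utf-8") as f:
    for line in players_lines:
        f.write(line + "\n")

# Read the file back to confirm one player per line.
with open(out_path, encoding="utf-8") as f:
    written = f.read().splitlines()
print(written)
```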