import urllib
import json
import re
import csv
from bs4 import BeautifulSoup
game_code = open("/Users//Desktop/PYTHON/gc.txt").read()
game_code = game_code.split("\r")

for gc in game_code:
    htmltext = urllib.urlopen("http://cluster.leaguestat.com/feed/index.php?feed=gc&key=f109cf290fcf50d4&client_code=ohl&game_id=" + gc + "&lang_code=en&fmt=json&tab=pxpverbose")
    soup = BeautifulSoup(htmltext, "html.parser")
    j = json.loads(soup.text)
    summary = ['GC'], ['Pxpverbose']
    for event in summary:
        print gc, ["event"]
I cannot seem to access the list to print the proper headers and rows. I ultimately want to export specific rows to CSV. I downloaded Python two days ago, so I am very new; I just need this one data set for a project. Any advice or direction would be greatly appreciated.
Here are a few game codes if anyone wants to take a look. Thanks:
21127,20788,20922,20752,21094,21196,21295,21159,21128,20854,21057
Here are a few thoughts:
I'd like to point out the excellent requests library as an alternative to urllib for all your HTTP needs in Python (you may need to pip install requests).
requests comes with a built-in JSON decoder, so you don't need BeautifulSoup at all.
In fact, you have already imported a great module (csv) to print headers and rows of data. You can also use this module to write the data to a file.
Your data is returned as a dictionary (dict) in Python, a data structure indexed by keys. You can access the values (I think this is what you mean by "specific rows") in your data with these keys.
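As a quick illustration of key access on a nested dict — the key names below mirror the feed's structure, but the event values are made up:

```python
# A made-up stand-in for the parsed feed: a dict indexed by keys,
# where one of the values is a list of per-event dicts.
data = {"GC": {"Pxpverbose": [{"event": "goal", "id": "100"},
                              {"event": "shot", "id": "101"}]}}

# Drill down through the keys, then iterate the list of events.
for summary in data["GC"]["Pxpverbose"]:
    print(summary["event"])
```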
One of many possible ways to accomplish what you want:
import requests
import csv

game_code = open("/Users//Desktop/PYTHON/gc.txt").read()
game_code = game_code.split("\r")

for gc in game_code:
    r = requests.get("http://cluster.leaguestat.com/feed/index.php?feed=gc&key=f109cf290fcf50d4&client_code=ohl&game_id=" + gc + "&lang_code=en&fmt=json&tab=pxpverbose")
    data = r.json()
    with open("my_data.csv", "a") as csvfile:
        wr = csv.writer(csvfile, delimiter=',')
        for summary in data["GC"]["Pxpverbose"]:
            wr.writerow([gc, summary["event"]])
            # add keys to write additional values, e.g. summary["some-key"]:
            # wr.writerow([gc, summary["event"], summary["id"]])
You don't need Beautiful Soup for this; the data can be read directly from the URL and parsed as JSON.
import urllib, json
response = urllib.urlopen("http://cluster.leaguestat.com/feed/index.php?feed=gc&key=f109cf290fcf50d4&client_code=ohl&game_id=" + gc +"&lang_code=en&fmt=json&tab=pxpverbose")
data = json.loads(response.read())
At this point, data is the parsed JSON of your web page.
Excel can read CSV files, so the easiest route would be to export the data you want into a CSV file using the csv module.
This should be enough to get you started. Modify fieldnames to include specific event details in the columns of the csv file.
import csv

with open('my_games.csv', 'w') as csvfile:
    fieldnames = ['event', 'id']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames,
                            extrasaction='ignore')
    writer.writeheader()
    for event in data['GC']['Pxpverbose']:
        writer.writerow(event)
I think this is simple, but I am not finding an answer that works. The data import seems to work, but separating the values doesn't; my code is below. Thanks for the help.
import urllib.request
opener = urllib.request.FancyURLopener({})
url = "http://jse.amstat.org/v22n1/kopcso/BeefDemand.txt"
f = opener.open(url)
content = f.read()
# below are the 3 different ways I tried to separate the data
content.encode('string-escape').split("\\x")
content.split('\r')
content.split('\\')
I highly recommend Pandas for reading and analysing this kind of file. It supports reading directly from a url and also gives meaningful analysis ability.
import pandas
url = "http://jse.amstat.org/v22n1/kopcso/BeefDemand.txt"
df = pandas.read_table(url, sep="\t+", engine='python', index_col="Year")
Note that the file uses runs of repeated tabs as separators, which is handled by the sep="\t+". Because the separator is a regular expression, you also have to use the python engine.
Now that the file is read into a dataframe, we can do easy plotting for instance:
df[['ChickPrice', 'BeefPrice']].plot()
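The sep="\t+" behaviour can be checked offline against a small made-up sample in the same shape as the file (the prices here are invented, not real values from the data set):

```python
import pandas as pd
from io import StringIO

# A made-up sample in the same shape as the file: a header row, then
# data rows whose fields are separated by *runs* of tabs.
sample = ("Year\tChickPrice\tBeefPrice\n"
          "1990\t\t15.2\t\t61.3\n"
          "1991\t\t15.9\t\t62.1\n")

# sep="\t+" treats each run of tabs as a single separator; because it is a
# regular expression, pandas requires the python engine here.
df = pd.read_table(StringIO(sample), sep="\t+", engine="python", index_col="Year")
print(df.shape)
```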
Simply use a csv.reader or csv.DictReader to parse the contents. Make sure to set the delimiter to tabs, in this case:
import requests
import csv
import re
url = "http://jse.amstat.org/v22n1/kopcso/BeefDemand.txt"
response = requests.get(url)
response.raise_for_status()
text = re.sub("\t{1,}", "\t", response.text)
reader = csv.DictReader(text.splitlines(), delimiter="\t")
for row in reader:
    print(row)
I like csv.DictReader better in this case, because it consumes the header line for you and each "row" is a dictionary. Your specific text file sometimes separates fields with repeated tabs to make it look prettier, so you'll have to take that into account in some way. In my snippet, I used a regular expression to replace all tab-clusters with a single tab.
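The tab-collapsing step can be seen in isolation on one made-up line (the numbers are invented):

```python
import re

# A pretty-printed line: fields separated by varying runs of tabs.
line = "1990\t\t\t15.2\t\t61.3"

# Collapse every tab-cluster to a single tab before handing it to csv.
cleaned = re.sub("\t{1,}", "\t", line)
print(cleaned.split("\t"))
```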
I'm trying to automate an email service that sends a person's bus station to their email.
In order to do so I need to pull some data from a Hebrew website.
I need to append data from another tab/page of the same website to a list (I managed to pull the first page), and then write the data to a CSV file.
I managed to pull the first page to a data frame list:
import requests
import pandas as pd
import csv
url = 'http://yit.maya-tour.co.il/yit-pass/Drop_Report.aspx?client_code=2660&coordinator_code=2669'
html = requests.get(url).text
df_list = pd.read_html(html)
print(df_list)

myFile = open('my data.csv', 'wb')
wr = csv.writer(myFile, quoting=csv.QUOTE_ALL)
wr.writerow(df_list)
I can't find a way to write to the file, as I get this error:
TypeError: a bytes-like object is required, not 'str'
I expect to get the full list from multiple pages and then to write to CSV.
I think you should use 'w' instead of 'wb' as the mode in open(). The 'b' makes the file expect binary (bytes) objects, while the csv module writes strings. Like this:
myFile = open('my data.csv', 'w')
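Since pd.read_html returns a list of DataFrames, another option is to skip the csv module entirely and let pandas do the writing. A sketch with made-up stand-in tables (the stop names and times are invented):

```python
import pandas as pd

# Stand-ins for the tables pd.read_html(html) would return (invented data).
df_list = [pd.DataFrame({"stop": ["Central", "North"], "time": ["08:00", "08:10"]}),
           pd.DataFrame({"stop": ["South"], "time": ["08:20"]})]

# Stack the per-page tables into one frame and write a single CSV.
combined = pd.concat(df_list, ignore_index=True)
combined.to_csv("my data.csv", index=False)
```

The same pattern covers the multi-page case: append each page's tables to df_list before concatenating.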
I'd like advice on a Python question I am confused about. I have a CSV file which holds field values I need to search for on a website. I want to tell Python: if you find any records on the site matching a value from the CSV file, save those records into a new CSV file. What modules could I use to accomplish this task? Any assistance would be greatly appreciated.
import requests
import bs4
import csv
r = requests.get('https://etrakit.friscotexas.gov/Search/permit.aspx')

with open('C:/Users/Desktop/Programming/Addresses.csv') as f:
    for row in csv.reader(f):
        print row[1]
I'm trying to do some basic analysis on ether historical prices for a school project. My problem is quite a simple one, I think. I made a function that downloads the data from the URL, but the format is wrong: I get a dataframe whose size is (0, ~14k). So I can download the data, but I'm not sure how I should format it into a form I can use.
I see two possibilities: I can reformat the dataframe after downloading, which I will try to do, or I can download it in the correct format in the first place, which would be the better and more elegant solution.
My problem is that I don't know how to do the second, and I may not succeed at the first; that's why I'm making this post.
def get_stock_price_csv_from_poloniex():
    import requests
    from pandas import DataFrame
    from io import StringIO

    url = 'https://poloniex.com/public?command=returnChartData&currencyPair=USDT_ETH&start=1435699200&end=9999999999&period=14400'
    csv = requests.get(url)
    if csv.ok:
        return DataFrame.from_csv(StringIO(csv.text), sep=',')
    else:
        return None
The source data is not CSV; it's JSON. Luckily, pandas provides facilities for working with that as well.
import requests
from pandas.io.json import json_normalize
url = 'https://poloniex.com/public?command=returnChartData&currencyPair=USDT_ETH&start=1435699200&end=9999999999&period=14400'
resp = requests.get(url)
data_frame = json_normalize(resp.json())
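An offline sketch of the same flattening step, using made-up candles in roughly the shape a chart-data endpoint returns (the numbers are invented; newer pandas exposes the helper directly as pd.json_normalize):

```python
import pandas as pd

# Made-up candle records standing in for resp.json().
candles = [{"date": 1435712400, "high": 0.50, "low": 0.40, "close": 0.45},
           {"date": 1435726800, "high": 0.60, "low": 0.45, "close": 0.55}]

# Flatten the list of dicts into one row per record, one column per key.
df = pd.json_normalize(candles)
print(df.shape)
```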
I'm trying to read data from Google Fusion Tables API into Python using the csv library. It seems like querying the API returns CSV data, but when I try and use it with csv.reader, it seems to mangle the data and split it up on every character rather than just on the commas and newlines. Am I missing a step? Here's a sample I made to illustrate, using a public table:
#!/usr/bin/python
import csv
import urllib2, urllib
request_url = 'https://www.google.com/fusiontables/api/query'
query = 'SELECT * FROM 1140242 LIMIT 10'
url = "%s?%s" % (request_url, urllib.urlencode({'sql': query}))
serv_req = urllib2.Request(url=url)
serv_resp = urllib2.urlopen(serv_req)
reader = csv.reader(serv_resp.read())

for row in reader:
    print row  # prints out each character of each cell and the column headings
Ultimately I'd be using the csv.DictReader class, but the base reader shows the issue as well
csv.reader() takes in a file-like object (or any iterable of lines).
Change
reader = csv.reader(serv_resp.read())
to
reader = csv.reader(serv_resp)
Alternatively, you could do:
reader = csv.DictReader(serv_resp)
It's not the CSV module that's causing the problem. Take a look at the output from serv_resp.read(). Try using serv_resp.readlines() instead.
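A minimal demonstration of the mix-up: csv.reader iterates whatever it is given, and iterating a string yields one character at a time, so each character becomes its own "line":

```python
import csv

line = "a,b"

# Passing the string itself: each character is treated as a separate line.
print(list(csv.reader(line)))    # one row per character

# Wrapping it in a list (or passing a file object): one row, two fields.
print(list(csv.reader([line])))
```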