How to download CSV data from a website using Python - python

I'm trying to automatically download data from the following website; however, I just get the HTML and no data:
http://tcplus.com/GTN/OperationalCapacity#filter.GasDay=02/02/19&filter.CycleType=1&page=1&sort=LocationName&sort_direction=ascending
import csv
import urllib2

downloaded_data = urllib2.urlopen('http://tcplus.com/GTN/OperationalCapacity#filter.GasDay=02/02/19&filter.CycleType=1&page=1&sort=LocationName&sort_direction=ascending')
csv_data = csv.reader(downloaded_data)
for row in csv_data:
    print row

The code below will only fetch data for the provided URL, but if you tweak the parameters you can get other reports as well.
import requests

parameters = {'serviceTypeName': 'Ganesha.InfoPost.Service.OperationalCapacity.OperationalCapacityService, Ganesha.InfoPost.Service',
              'filterTypeName': 'Ganesha.InfoPost.ViewModels.GasDayAndCycleTypeFilterViewModel, Ganesha.InfoPost',
              'templateType': 6,
              'exportType': 1,
              'filter.GasDay': '02/02/19',
              'filter.CycleType': 1}

response = requests.post('http://tcplus.com/GTN/Export/Generate', data=parameters)
with open('result.csv', 'w') as f:
    f.write(response.text)
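The same POST can be wrapped with a little error handling so a failed export doesn't silently write an HTML error page into the .csv file. This is a minimal sketch assuming the same endpoint and parameter names as above (they come from the answer, not from any official API documentation):

```python
import requests

def save_report(url, params, out_path):
    """POST the export filter parameters and save the CSV body to out_path.

    Raises requests.HTTPError on a non-2xx response instead of silently
    writing an error page into the .csv file.
    """
    response = requests.post(url, data=params, timeout=30)
    response.raise_for_status()
    with open(out_path, 'w', newline='') as f:
        f.write(response.text)
    return out_path

# Parameter names copied from the answer above; adjust GasDay/CycleType as needed.
parameters = {
    'serviceTypeName': 'Ganesha.InfoPost.Service.OperationalCapacity.OperationalCapacityService, Ganesha.InfoPost.Service',
    'filterTypeName': 'Ganesha.InfoPost.ViewModels.GasDayAndCycleTypeFilterViewModel, Ganesha.InfoPost',
    'templateType': 6,
    'exportType': 1,
    'filter.GasDay': '02/02/19',
    'filter.CycleType': 1,
}

# save_report('http://tcplus.com/GTN/Export/Generate', parameters, 'result.csv')
```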

Related

How to convert url data to csv using python

I am trying to download the data from the following URL and trying to save it as CSV data, but the output I am getting is a text file. Can anyone please help with what I am doing wrong here? Also, is it possible to add multiple URLs in the same script and download multiple CSV files?
import csv
import pandas as pd
import requests
from datetime import datetime

CSV_URL = ('https://dsv-ops-toolkit.ihsmvals.com/ftp?config=fenics-bgc&file=IRSDATA_20211129_1700_Intra.csv&directory=%2FIRS%2FIntraday%2FDaily')

with requests.Session() as s:
    download = s.get(CSV_URL)
    decoded_content = download.content.decode('utf-8')
    cr = csv.reader(decoded_content.splitlines(), delimiter=',')
    date = datetime.now().strftime('%y%m%d')
    my_list = list(cr)
    df = pd.DataFrame(my_list)
    df.to_csv(f'RFR_{date}')
You can create a list of your necessary URLs like:
urls = ['http://url1.com', 'http://url2.com', 'http://url3.com']
Iterate through the list, keeping your request logic as it is for each URL:
for each_url in urls:
    with requests.Session() as s:
        # your_code_here
Hope you'll find this helpful.
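Putting the loop and the download together, one possible sketch looks like this (the URL list and the `RFR_<index>.csv` output naming are hypothetical; substitute your own):

```python
import requests

# Hypothetical list of report URLs; replace with the ones you need.
urls = ['http://url1.com', 'http://url2.com', 'http://url3.com']

def output_name(index):
    """Give each download its own filename so later files don't overwrite earlier ones."""
    return f'RFR_{index}.csv'

def download_all(url_list):
    """Fetch each URL in one session and save each body to its own CSV file."""
    saved = []
    with requests.Session() as s:
        for i, each_url in enumerate(url_list):
            download = s.get(each_url, timeout=30)
            download.raise_for_status()
            name = output_name(i)
            with open(name, 'w', newline='') as f:
                f.write(download.content.decode('utf-8'))
            saved.append(name)
    return saved

# download_all(urls)
```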

Using Python to download api data

Hi, I have created a piece of code that downloads data from an API endpoint and also loads in the API keys.
I am trying to download the API data into CSV files, each in its own folder, based on input.csv. I tried to achieve this by adding the following section at the end, but it does not download the file it should be receiving from the API endpoint.
Please assist?
with open('filepath/newfile.csv', 'w+') as f:
    f.write(r.text)
import csv
import sys
import requests

def query_api(business_id, api_key):
    headers = {
        "Authorization": api_key
    }
    r = requests.get('https://api.link.com', headers=headers)
    print(r.text)

# get filename from command line arguments
if len(sys.argv) < 2:
    print("input.csv")
    sys.exit(1)

csv_filename = sys.argv[1]
with open(csv_filename) as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for row in csv_reader:
        business_id = row['BusinessId']
        api_key = row['ApiKey']
        query_api(business_id, api_key)
        with open('filepath/newfile.csv', 'w+') as f:
            f.write(r.text)
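Note that in the code above, `r` only exists inside `query_api`, so the final `f.write(r.text)` cannot see the response. One possible restructuring is to return the text and write one file per business. This is a sketch: the endpoint URL and column names follow the question's code, and the per-business filename scheme is hypothetical.

```python
import csv
import requests

def query_api(business_id, api_key):
    """Fetch the report for one business and return the response body."""
    headers = {"Authorization": api_key}
    r = requests.get('https://api.link.com', headers=headers, timeout=30)
    r.raise_for_status()
    return r.text

def output_path(business_id):
    """One CSV per business, named after its id (hypothetical naming scheme)."""
    return f'{business_id}.csv'

def download_for_all(csv_filename):
    """Read BusinessId/ApiKey pairs from the input CSV and save each response."""
    written = []
    with open(csv_filename) as csv_file:
        for row in csv.DictReader(csv_file):
            text = query_api(row['BusinessId'], row['ApiKey'])
            path = output_path(row['BusinessId'])
            with open(path, 'w', newline='') as f:
                f.write(text)
            written.append(path)
    return written

# download_for_all('input.csv')
```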

Python-3 Trying to iterate through a csv and get http response codes

I am attempting to read a CSV file that contains a long list of URLs. I need to iterate through the list and collect the URLs that return a 301, 302, or 404 response. When testing the script I get "exited with code 0", so I know it is error-free, but it is not doing what I need it to. I am new to Python and to working with files; my experience has been primarily UI automation. Any suggestions would be gladly appreciated. Below is the code.
import csv
import requests
import responses
from urllib.request import urlopen
from bs4 import BeautifulSoup

f = open('redirect.csv', 'r')
contents = []
with open('redirect.csv', 'r') as csvf:  # Open file in read mode
    urls = csv.reader(csvf)
    for url in urls:
        contents.append(url)  # Add each url to list contents

def run():
    resp = urllib.request.urlopen(url)
    print(self.url, resp.getcode())

run()
print(run)
Given you have a CSV similar to the following (the heading is URL):
URL
https://duckduckgo.com
https://bing.com
You can do something like this using the requests library:
import csv
import requests

with open('urls.csv', newline='') as csvfile:
    errors = []
    reader = csv.DictReader(csvfile)
    # Iterate through each line of the csv file
    for row in reader:
        try:
            # Don't follow redirects; otherwise 301/302 are replaced
            # by the status code of the final destination page.
            r = requests.get(row['URL'], allow_redirects=False)
            if r.status_code in [301, 302, 404]:
                # print(f"{r.status_code}: {row['URL']}")
                errors.append([row['URL'], r.status_code])
        except requests.RequestException:
            pass
Uncomment the print statement if you want to see the results in the terminal. As written, the code appends a [URL, status code] pair to an errors list, which you can print or continue processing as you prefer.
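If you want the collected results on disk rather than just printed, the errors list can be written out with csv.writer. A small sketch, using made-up sample rows in place of real request results:

```python
import csv

# Sample data standing in for the `errors` list built in the loop above.
errors = [
    ['https://duckduckgo.com/old-page', 301],
    ['https://bing.com/missing', 404],
]

# Write a header row followed by one row per problem URL.
with open('errors.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['URL', 'status_code'])
    writer.writerows(errors)

# Read it back to confirm the layout.
with open('errors.csv', newline='') as f:
    rows = list(csv.reader(f))
```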

CSV results overwrites the values

I was able to write and execute a program using BeautifulSoup. The idea is to capture details from the HTML source by parsing multiple URLs supplied via a CSV file and to save the output as CSV.
The program executes well, but the CSV keeps overwriting the values in the first row.
The input file has three URLs to parse.
I want the output stored in three different rows.
Below is my code
import csv
import requests
import pandas
from bs4 import BeautifulSoup

with open("input.csv", "r") as f:
    reader = csv.reader(f)
    for row in reader:
        url = row[0]
        print(url)
        r = requests.get(url)
        c = r.content
        soup = BeautifulSoup(c, "html.parser")
        all = soup.find_all("div", {"class": "biz-country-us"})
        for br in soup.find_all("br"):
            br.replace_with("\n")
        l = []
        for item in all:
            d = {}
            name = item.find("h1", {"class": "biz-page-title embossed-text-white shortenough"})
            d["name"] = name.text.replace(" ", "").replace("\n", "")
            claim = item.find("div", {"class": "u-nowrap claim-status_teaser js-claim-status-hover"})
            d["claim"] = claim.text.replace(" ", "").replace("\n", "")
            reviews = item.find("span", {"class": "review-count rating-qualifier"})
            d["reviews"] = reviews.text.replace(" ", "").replace("\n", "")
            l.append(d)
        df = pandas.DataFrame(l)
        df.to_csv("output.csv")
Please kindly let me know if I'm not being clear in explaining anything.
Open the output file in append mode, as suggested in this post, with the modification that you write the header only the first time:
from os.path import isfile

if not isfile("output.csv"):
    df.to_csv("output.csv", header=True)
else:
    with open("output.csv", "a") as f:
        df.to_csv(f, header=False)
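Since the question already uses pandas, an alternative sketch is to let to_csv do the appending directly via mode='a', writing the header only when the file doesn't exist yet (demonstrated here with throwaway one-row frames):

```python
import os
from os.path import isfile

import pandas as pd

def append_rows(df, path):
    """Append df to path; write the header only on the very first write."""
    df.to_csv(path, mode='a', header=not isfile(path), index=False)

# Throwaway demonstration: two separate appends land in two rows.
demo = 'output_demo.csv'
if isfile(demo):
    os.remove(demo)
append_rows(pd.DataFrame([{'name': 'first'}]), demo)
append_rows(pd.DataFrame([{'name': 'second'}]), demo)
result = pd.read_csv(demo)
```

Calling `append_rows` once per scraped URL inside the loop keeps each page's rows instead of overwriting the file on every iteration.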

Python reading CSV file using URL giving error

I am trying to read a CSV file using the requests library but I am having issues.
import requests
import csv
url = 'https://storage.googleapis.com/sentiment-analysis-dataset/training_data.csv'
r = requests.get(url)
text = r.iter_lines()
reader = csv.reader(text, delimiter=',')
I then tried
for row in reader:
    print(row)
but it gave me this error:
Error: iterator should return strings, not bytes (did you open the file in text mode?)
How should I fix this?
What you probably want is:
text = r.iter_lines(decode_unicode=True)
This will return a strings-iterator instead of a bytes-iterator. (See here for documentation.)
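An equivalent alternative is to download the whole body once and feed the decoded text to csv.reader. This sketch uses an in-memory string standing in for r.text:

```python
import csv
import io

# Stand-in for r.text (requests decodes the response body to str for you).
sample_text = 'id,label\n1,foo\n2,bar\n'

# Parse the decoded text line by line, exactly as with a file object.
reader = csv.reader(io.StringIO(sample_text), delimiter=',')
rows = list(reader)
```

With a real response you would pass `io.StringIO(r.text)` instead of the sample string, or stick with `r.iter_lines(decode_unicode=True)` as above; both hand the csv module strings rather than bytes.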
