New York BBL to Latitude/Longitude API - python

I want to convert New York BBL (borough-block-lot) numbers to latitude/longitude values. The BBL values are in a CSV file. Is there a free API to convert them using Python?

LocateNYC (locatenyc.io) has a one-month free trial. If you sign up for a free account, this code works (I tried it out).
import pandas as pd
import requests

TOKEN = 'YOUR TOKEN'

def get_coord(bbl):
    # Query the geocoder with the BBL as the single-line input
    url = f'https://locatenyc.io/arcgis/rest/services/locateNYC/v1/GeocodeServer/findAddressCandidates?singleLine={bbl}&token={TOKEN}'
    resp = requests.get(url)
    data = resp.json()
    # Read the coordinates from the first candidate
    attrs = data['candidates'][0]['attributes']
    return attrs['longitudeInternalLabel'], attrs['latitudeInternalLabel']

# df is assumed to be a DataFrame already loaded from the CSV, with a 'bbl' column
df['coords'] = df['bbl'].apply(get_coord)
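The function above raises an IndexError when the service returns no candidates for a BBL. Below is a minimal sketch of a more defensive extraction step. It assumes the response keeps the shape used above (`candidates`, `attributes`, and the two `...InternalLabel` keys); `extract_coord` and the sample payloads are hypothetical and made up for illustration.

```python
def extract_coord(data):
    """Return (longitude, latitude) from a parsed response, or None if no match."""
    candidates = data.get('candidates') or []
    if not candidates:
        return None
    attrs = candidates[0].get('attributes', {})
    lon = attrs.get('longitudeInternalLabel')
    lat = attrs.get('latitudeInternalLabel')
    if lon is None or lat is None:
        return None
    return lon, lat

# Made-up payloads that mimic the response shape assumed above
hit = {'candidates': [{'attributes': {'longitudeInternalLabel': -73.98,
                                      'latitudeInternalLabel': 40.75}}]}
miss = {'candidates': []}

print(extract_coord(hit))
print(extract_coord(miss))
```

Inside `get_coord`, the last two lines could then become `return extract_coord(resp.json())`, so rows without a match end up as None instead of stopping the whole `apply`.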

Related

How do I get the one-year percent change with Bureau of Labor Statistics API public data?

I hope I can explain clearly what I'm trying to do. I can retrieve the series dates and values through the Bureau of Labor Statistics (BLS) API: https://www.bls.gov/
I now want to get the 1-year percent change for each value. I'd be grateful for any help. XXXXX is where I put my registration key, so I can access more than three years of data.
Here's the developer page: https://www.bls.gov/developers/api_signature_v2.htm#parameters
base_url = 'https://api.bls.gov/publicAPI/v2/timeseries/data/'
series = {'id': 'CUSR0000SA0',
          'name': 'Consumer Price Index - All Urban Consumers'}
data_url = '{}{}/?registrationkey=XXXXXXXXXXXXXXXXXX&startyear=2010&endyear=2022'.format(base_url, series['id'])
import requests
r = requests.get(data_url).json()
print('Status: ' + r['status'])
r = r['Results']['series'][0]['data']
print(r[0])
import pandas as pd
# M13 is the annual average, which I'm skipping since I only want months 1-12
dates = ['{}{}'.format(i['period'], i['year']) for i in r if i['period'] < 'M13']
index = pd.to_datetime(dates)
data = {series['id']: [float(i['value']) for i in r if i['period'] < 'M13']}
# The API returns newest first; reverse so the frame runs oldest to newest
df = pd.DataFrame(index=index, data=data).iloc[::-1]
I'm not sure how to retrieve the calculations in my code.
I tried "calculations: {[i['calculations'][0] for i in r if i['period'] < 'M13']}", but that is not right.
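One way to get the 1-year change without relying on the API's own calculations is to compare each month with the same month a year earlier. With the monthly, oldest-first frame built above and no missing months, `df.pct_change(periods=12) * 100` does this in pandas. The sketch below shows the same arithmetic in plain Python so the logic is visible; `one_year_pct_change` is a hypothetical helper, and the sample observations are made up and only mimic the `year`/`period`/`value` keys used in the question.

```python
def one_year_pct_change(observations):
    """Map "YYYY-MM" to the percent change versus the same month a year earlier."""
    # Keep monthly periods M01..M12 only (M13 is the annual average)
    monthly = {
        (int(o["year"]), int(o["period"][1:])): float(o["value"])
        for o in observations
        if "M01" <= o["period"] <= "M12"
    }
    changes = {}
    for (year, month), value in sorted(monthly.items()):
        previous = monthly.get((year - 1, month))
        if previous:  # skip months with no value a year earlier
            changes[f"{year}-{month:02d}"] = round((value / previous - 1) * 100, 1)
    return changes

# Made-up observations, newest first like the API response
sample = [
    {"year": "2021", "period": "M01", "value": "110.0"},
    {"year": "2020", "period": "M13", "value": "101.0"},
    {"year": "2020", "period": "M01", "value": "100.0"},
]
print(one_year_pct_change(sample))
```

Looking up the previous year by key, rather than by position, keeps the result correct even if a month is missing from the series.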

Alpaca API simple 5 min historical bar request returning null

I must be making a very simple mistake that I haven't been able to figure out for hours. I am following the Alpaca API docs. The code below tries to get 5-minute historical bars for TSLA as a test, and it fails.
import config, requests, json
# QUOTE_URL = 'https://data.alpaca.markets/v2/stocks/TSLA/quotes'
# LATESTBAR_URL = 'https://data.alpaca.markets/v2/stocks/tsla/bars/latest'
BARS_URL = 'https://data.alpaca.markets/v2/stocks/tsla/bars'
timeframe = '?timeframe=5Min'
BARS_URL = BARS_URL+timeframe
# r = requests.get(QUOTE_URL, headers=config.HEADERS)
# r2 = requests.get(LATESTBAR_URL, headers=config.HEADERS)
r3 = requests.get(BARS_URL, headers=config.HEADERS)
print(json.dumps(r3.json(), indent=4))
The result I get is shown below:
{
    "bars": null,
    "symbol": "TSLA",
    "next_page_token": null
}
Here is the solution that got me OHLC data for any symbol from the Alpaca API:
# imports
import alpaca_trade_api as tradeapi
from alpaca.data import TimeFrame

# create the Alpaca API object with the given credentials
api = tradeapi.REST(key_id="YOUR_API_KEY",
                    secret_key="YOUR_SECRET_KEY",
                    base_url='https://paper-api.alpaca.markets')

# Call the API to get OHLC data for TSLA and store it in a dataframe
data = api.get_bars(
    symbol='TSLA',
    timeframe=TimeFrame.Minute
).df

# Get some historical data for TSLA
historical_data = api.get_bars(
    symbol='TSLA',  # any symbol is acceptable if it can be found in the Alpaca API
    timeframe=TimeFrame.Minute,
    start="2018-01-01T00:00:00-00:00",
    end="2018-02-01T00:00:00-00:00"
).df

print(data)
print(historical_data)
The data will be stored in a DataFrame. If you want the results in raw format, just leave off the ".df".
I hope it helps!
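The raw request in the question passes only `timeframe`, while the working call above also passes `start` and `end`. A possible cause of the null result, stated here as an assumption and not confirmed in the thread, is that the request had no explicit time window. The sketch below builds the same bars URL with `start` and `end` added; `bars_url` is a hypothetical helper and the timestamps are arbitrary examples.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.alpaca.markets/v2/stocks/TSLA/bars"

def bars_url(timeframe, start, end):
    """Build the bars URL with an explicit time window (RFC 3339 timestamps)."""
    query = urlencode({"timeframe": timeframe, "start": start, "end": end})
    return f"{BASE_URL}?{query}"

url = bars_url("5Min", "2022-01-03T14:30:00Z", "2022-01-03T21:00:00Z")
print(url)
```

The resulting URL can be passed to `requests.get(url, headers=config.HEADERS)` exactly as in the question.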

Extracting chosen information from URL results into a dataframe

I would like to create a dataframe by pulling only certain information from this website.
https://www.stockrover.com/build/production/Research/tail.js?1644930560
I would like to pull all the entries like this one. ["0005.HK","HSBC HOLDINGS","",""]
Another problem: I only want roughly the first 20,000 lines, which are the stock information. There is other information after line 20,000 that I don't want included in the dataframe.
To summarize, could someone show me how to pull out just the information I'm trying to extract and create a dataframe from it, if this is possible?
A sample of the website results:
function getStocksLibraryArray(){return[["0005.HK","HSBC HOLDINGS","",""],["0006.HK","Power Assets Holdings Ltd","",""],["000660.KS","SK hynix","",""],["004370.KS","Nongshim","",""],["005930.KS","Samsung Electroni","",""],["0123.HK","YUEXIU PROPERTY","",""],["0336.HK","HUABAO INTL","",""],["0408.HK","YIP'S CHEMICAL","",""],["0522.HK","ASM PACIFIC","",""],["0688.HK","CHINA OVERSEAS","",""],["0700.HK","TENCENT","",""],["0762.HK","CHINA UNICOM","",""],["0808.HK","PROSPERITY REIT","",""],["0813.HK","SHIMAO PROPERTY",
Code to pull all lines, including the ones not wanted:
import requests
import pandas as pd

url = "https://www.stockrover.com/build/production/Research/tail.js?1644930560"
payload = {}
headers = {}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
Use a regex to extract the array, then literal_eval to convert the string to a Python object:
import re
from ast import literal_eval
import pandas as pd
import requests
url = "https://www.stockrover.com/build/production/Research/tail.js?1644930560"
response = requests.request("GET", url, headers={}, data={})
regex_ = re.compile(r"getStocksLibraryArray\(\)\{return(.+?)}", re.DOTALL)
print(pd.DataFrame(literal_eval(regex_.search(response.text).group(1))))
0 1 2 3
0 0005.HK HSBC HOLDINGS
1 0006.HK Power Assets Holdings Ltd
2 000660.KS SK hynix
3 004370.KS Nongshim
4 005930.KS Samsung Electroni
... ... ... ... ..
21426 ZZHGF ZhongAn Online P&C _INSUP
21427 ZZHGY ZhongAn Online P&C _INSUP
21428 ZZLL ZZLL Information Tech _INTEC
21429 ZZZ.TO Sleep Country Canada _SPECR
21430 ZZZOF Zinc One Resources _OTHEI
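To see why this also drops the unwanted content after the stock list, here is a small offline sketch of the same regex and literal_eval steps. The non-greedy capture stops at the first closing brace after `return`, so only the array returned by `getStocksLibraryArray` is parsed. The `sample_js` string is a shortened, made-up stand-in for the real file, and the second function in it is hypothetical.

```python
import re
from ast import literal_eval

# Shortened, made-up stand-in for the downloaded tail.js text
sample_js = (
    'function getStocksLibraryArray(){return[["0005.HK","HSBC HOLDINGS","",""],'
    '["0006.HK","Power Assets Holdings Ltd","",""],'
    '["0408.HK","YIP\'S CHEMICAL","",""]]}'
    'function somethingElse(){return[["not","wanted"]]}'
)

pattern = re.compile(r"getStocksLibraryArray\(\)\{return(.+?)}", re.DOTALL)
match = pattern.search(sample_js)
# literal_eval turns the captured text into a list of lists
rows = literal_eval(match.group(1)) if match else []

# Optional cap on the number of rows kept
first_rows = rows[:2]
print(len(rows), first_rows)
```

If a hard limit is still wanted, slicing the list (or `df.head(20000)` on the resulting dataframe) keeps only the first rows.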

Trouble Looping through JSON elements pulled using API

I am trying to pull search results data from an API on a website and put it into a pandas dataframe. I've been able to successfully pull the info from the API into a JSON format.
The next step I'm stuck on is how to loop through the search results on a particular page and then again for each page of results.
Here is what I've tried so far:
#Step 1: Connect to an API
import requests
import json
response_API = requests.get('https://www.federalregister.gov/api/v1/documents.json?conditions%5Bpublication_date%5D%5Bgte%5D=09%2F01%2F2021&conditions%5Bterm%5D=economy&order=relevant&page=1')
#200
#Step 2: Get the data from API
data = response_API.text
#Step 3: Parse the data into JSON format
parse_json = json.loads(data)
#Step 4: Extract data
title = parse_json['results'][0]['title']
pub_date = parse_json['results'][0]['publication_date']
agency = parse_json['results'][0]['agencies'][0]['name']
Here is where I've tried to put this all into a loop:
import numpy as np
import pandas as pd
df=[]
for page in np.arange(0,7):
    url = 'https://www.federalregister.gov/api/v1/documents.json?conditions%5Bpublication_date%5D%5Bgte%5D=09%2F01%2F2021&conditions%5Bterm%5D=economy&order=relevant&page={page}'.format(page=page)
    response_API = requests.get(url)
    print(response_API.status_code)
    data = response_API.text
    parse_json = json.loads(data)
    for i in parse_json:
        title = parse_json['results'][i]['title']
        pub_date = parse_json['results'][i]['publication_date']
        agency = parse_json['results'][i]['agencies'][0]['name']
        df.append([title,pub_date,agency])
cols = ["Title", "Date","Agency"]
df = pd.DataFrame(df,columns=cols)
I feel like I'm close to the correct answer, but I'm not sure how to move forward from here. I need to iterate through the results where I placed the i's when parsing the JSON data, but I get an error that reads "TypeError: list indices must be integers or slices, not str". I understand I can't put the i's in those spots, but how else am I supposed to iterate through the results?
Any help would be appreciated!
Thank you!
I think you are very close!
import numpy as np
import pandas as pd
import requests

BASE_URL = "https://www.federalregister.gov/api/v1/documents.json?conditions%5Bpublication_date%5D%5Bgte%5D=09%2F01%2F2021&conditions%5Bterm%5D=economy&order=relevant&page={page}"
results = []
for page in range(0, 7):
    response = requests.get(BASE_URL.format(page=page))
    if response.ok:
        resp_json = response.json()
        # Each item in "results" is one document (a dict)
        for res in resp_json["results"]:
            results.append(
                [
                    res["title"],
                    res["publication_date"],
                    [agency["name"] for agency in res["agencies"]]
                ]
            )
df = pd.DataFrame(results, columns=["Title", "Date", "Agencies"])
In this block of code, I used the requests library's built-in .json() method, which converts the response text to a Python dict (if it is valid JSON).
The if response.ok check is a less verbose way, provided by requests, to test that the status code is < 400. It prevents errors that might occur when parsing the response after a failed HTTP call.
Finally, I'm not sure exactly what data you need in your DataFrame, but each object in the "results" list on those pages has "agencies" as a list of agencies. I wasn't sure if you wanted to drop all that data, so I kept the names as a list.
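The TypeError in the question comes from looping over the parsed response itself, which is a dict: iterating a dict yields its keys (strings), and those strings were then used as list indices. A minimal offline sketch of the difference follows; the payload is made up and only mimics the top-level shape with a "results" list.

```python
import json

# Made-up payload with a "results" list, like the API response
parse_json = json.loads(
    '{"count": 2, "results": ['
    '{"title": "Doc A", "publication_date": "2021-09-01"},'
    '{"title": "Doc B", "publication_date": "2021-09-02"}]}'
)

# Iterating over a dict yields its keys (strings), not positions
keys = [i for i in parse_json]

# Iterating over the "results" list yields one document dict at a time
titles = [doc["title"] for doc in parse_json["results"]]

print(keys)
print(titles)
```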
*Edit:
In case the response objects don't contain the expected keys, we can use the .get() method of Python dictionaries.
# ...snip
for res in resp_json["results"]:
    results.append(
        [
            res.get("title"),  # This will return `None` as a default, instead of causing a KeyError
            res.get("publication_date"),
            [
                # Fall back to 'raw_name' (or None) in case the 'name' key doesn't exist
                agency.get("name", agency.get("raw_name"))
                for agency in res.get("agencies", [])
            ]
        ]
    )
Slightly different approach: rather than iterating through the response, read each page into a dataframe and then keep what you need. This saves a single agency name per row.
import json
import numpy as np
import pandas as pd
import requests

df_list=[]
for page in np.arange(0,7):
    url = 'https://www.federalregister.gov/api/v1/documents.json?conditions%5Bpublication_date%5D%5Bgte%5D=09%2F01%2F2021&conditions%5Bterm%5D=economy&order=relevant&page={page}'.format(page=page)
    response_API = requests.get(url)
    # print(response_API.status_code)
    data = response_API.text
    parse_json = json.loads(data)
    df = pd.json_normalize(parse_json['results'])
    # Note: this takes the first agency of the page's first row and assigns it
    # to every row of that page, which is why the output repeats one agency per page
    df['Agency'] = df['agencies'][0][0]['raw_name']
    df_list.append(df[['title', 'publication_date', 'Agency']])
df_final = pd.concat(df_list)
df_final
title publication_date Agency
0 Determination of the Promotion of Economy and ... 2021-09-28 OFFICE OF MANAGEMENT AND BUDGET
1 Corporate Average Fuel Economy Standards for M... 2021-09-03 OFFICE OF MANAGEMENT AND BUDGET
2 Public Hearing for Corporate Average Fuel Econ... 2021-09-14 OFFICE OF MANAGEMENT AND BUDGET
3 Investigation of Urea Ammonium Nitrate Solutio... 2021-09-08 OFFICE OF MANAGEMENT AND BUDGET
4 Call for Nominations To Serve on the National ... 2021-09-08 OFFICE OF MANAGEMENT AND BUDGET
.. ... ... ...
15 Energy Conservation Program: Test Procedure fo... 2021-09-14 DEPARTMENT OF COMMERCE
16 Self-Regulatory Organizations; The Nasdaq Stoc... 2021-09-09 DEPARTMENT OF COMMERCE
17 Regulations To Improve Administration and Enfo... 2021-09-20 DEPARTMENT OF COMMERCE
18 Towing Vessel Firefighting Training 2021-09-01 DEPARTMENT OF COMMERCE
19 Patient Protection and Affordable Care Act; Up... 2021-09-27 DEPARTMENT OF COMMERCE
[140 rows x 3 columns]
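In the output above, every row of a page carries the same agency. If the goal is the first agency of each document, the lookup has to happen per row. The sketch below does this in plain Python; `first_agency` is a hypothetical helper, the documents are made up, and the `raw_name`/`name` keys are taken from the answers above.

```python
def first_agency(doc):
    """Return the first agency name of one result, preferring 'raw_name'."""
    agencies = doc.get("agencies") or []
    if not agencies:
        return None
    return agencies[0].get("raw_name") or agencies[0].get("name")

# Made-up documents that mimic the shape of items in "results"
docs = [
    {"title": "Doc A", "publication_date": "2021-09-01",
     "agencies": [{"raw_name": "AGENCY ONE"}, {"raw_name": "AGENCY TWO"}]},
    {"title": "Doc B", "publication_date": "2021-09-02",
     "agencies": [{"name": "Agency Three"}]},
    {"title": "Doc C", "publication_date": "2021-09-03", "agencies": []},
]

rows = [[d.get("title"), d.get("publication_date"), first_agency(d)] for d in docs]
for row in rows:
    print(row)
```

The `rows` list can be passed to `pd.DataFrame(rows, columns=["title", "publication_date", "Agency"])` to get the same three columns as above.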

Need a 'for loop' to get dividend data for a stock portfolio, from their respective api urls

I am trying to automate parsing of dividend data for a stock portfolio and collect the per-stock dividend values in a single dataframe.
The data for each stock in the portfolio is available at a separate API URL.
The portfolio IDs (for the stocks ITC, Britannia, and Sanofi) are [500875, 500825, 500674].
First, I would like to run a for loop to build each URL (for example, https://api.bseindia.com/BseIndiaAPI/api/CorporateAction/w?scripcode=500674); the six digits at the end of the URL are the company ID.
Then I would like to use each URL to get the first row of the respective dividend table into a single dataframe. The code I used to get the individual dividend data and the final dataframe I need are shown in the attached image.
Basically, I would like to run a for loop that gets the first row of 'Table2' for each stock ID and stores the rows in a single dataframe.
PS: The code I used to get the dividend data for a single stock is below:
url = 'https://api.bseindia.com/BseIndiaAPI/api/CorporateAction/w?scripcode=500674'
jsondata = requests.get(url, headers= {'User-Agent': 'Mozilla/5.0'}).json()
df = pd.DataFrame(jsondata['Table2'])
If you need a for loop, you should write one and show the code along with the problem it gives you.
A single for loop can do all the work.
You can use string formatting to build the URL from the code and read the data from the server. Next, take the first row (even without creating a DataFrame) and append it to a list of all rows. After the loop, convert this list to a DataFrame.
import requests
import pandas as pd

# --- before loop ---
headers = {'User-Agent': 'Mozilla/5.0'}
all_rows = []

# --- loop ---
for code in [500875, 500825, 500674]:
    # use an `f-string` or string `.format()` to create the url
    #url = f'https://api.bseindia.com/BseIndiaAPI/api/CorporateAction/w?scripcode={code}'
    url = 'https://api.bseindia.com/BseIndiaAPI/api/CorporateAction/w?scripcode={}'.format(code)
    r = requests.get(url, headers=headers)
    #print(r.text) # to check error message
    #print(r.status_code)
    data = r.json()
    first_row = data['Table2'][0] # no need to use DataFrame
    #df = pd.DataFrame(data['Table2'])
    #first_row = df.iloc[0]
    #print(first_row)
    all_rows.append(first_row)

# --- after loop ---
df_result = pd.DataFrame(all_rows)
print(df_result)
Result:
scrip_code sLongName ... Details PAYMENT_DATE
0 500875 ITC LTD. ... 10.1500 2020-09-08T00:00:00
1 500825 BRITANNIA INDUSTRIES LTD. ... 83.0000 2020-09-16T00:00:00
2 500674 Sanofi India Ltd ... 106.0000 2020-08-06T00:00:00
[3 rows x 9 columns]
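The loop above assumes every response has a non-empty 'Table2'. A stock with no dividend history would raise a KeyError or IndexError at `data['Table2'][0]`. The sketch below separates the row collection from the HTTP calls so that case can be handled; `collect_first_rows` is a hypothetical helper, and the sample responses are made up, reusing only column names visible in the result above.

```python
def collect_first_rows(responses):
    """Take {scrip code: parsed JSON} and return (first rows, codes without rows)."""
    rows, missing = [], []
    for code, data in responses.items():
        table = data.get('Table2') or []
        if table:
            rows.append(table[0])
        else:
            missing.append(code)
    return rows, missing

# Made-up responses that mimic the shape used above
responses = {
    500875: {'Table2': [{'scrip_code': 500875, 'Details': '10.1500'},
                        {'scrip_code': 500875, 'Details': '5.7500'}]},
    500825: {'Table2': []},
    500674: {},
}

rows, missing = collect_first_rows(responses)
print(rows)
print(missing)
```

In the real loop, `responses[code] = r.json()` would fill the dict, and `pd.DataFrame(rows)` would build the final table from the collected rows.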
