Python API to access Stock Market information

I would like to know if there is a place from where I can download metadata for a given stock. I was reading about REST APIs some time back and thought I could maybe use something like this:
import requests

stock_code = "GME"
base_url = "https://somestockmarkekpage.com/api/stock?code={}"
resp = requests.get(base_url.format(stock_code))
print(resp.json()['short_ratio'])
The problem is I don't know any base_url from which I can download this data, and I don't even know if one exists for free. However, any other API or service you could suggest is very welcome.

There is a free API provided by Yahoo that contains up-to-date data for many tickers; the yfinance library wraps it (see https://pypi.org/project/yfinance/ for details). One example of extracting metadata for a ticker would be:
import yfinance as yf

stock_obj = yf.Ticker("GME")
# stock_obj.info is already a Python dict, so it can be used directly;
# no string-replace/JSON round-trip is needed
meta_obj = stock_obj.info
# Some of the short-interest fields
print("sharesShort: " + str(meta_obj['sharesShort']))
print("shortRatio: " + str(meta_obj['shortRatio']))
print("shortPercentOfFloat: " + str(meta_obj['shortPercentOfFloat']))
The output for the ticker you are interested in would be:
sharesShort: 61782730
shortRatio: 2.81
shortPercentOfFloat: 2.2642
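If you prefer the function-shaped interface sketched in the question, a minimal wrapper around yfinance could look like this (a sketch; .get() is used defensively because field availability varies by ticker):
import yfinance as yf

def get_short_ratio(stock_code):
    # Fetch the metadata dict for the ticker and read the short ratio
    info = yf.Ticker(stock_code).info
    return info.get('shortRatio')

print(get_short_ratio("GME"))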

You can use the free Yahoo Finance API through its most popular Python library, yfinance.
Link: https://pypi.org/project/yfinance/
Sample Code:
import yfinance as yf

GME_data = yf.Ticker("GME")
# get stock info
print(GME_data.info)
Other than that, there are many other APIs; for example, you can search for "Stock" on RapidAPI.

Related

Web Scraping AccuWeather site

I have recently started learning web scraping using Scrapy in Python and am facing issues scraping data from the AccuWeather site (https://www.accuweather.com/en/gb/london/ec4a-2/may-weather/328328?year=2020).
Basically I am capturing dates and their weather temperatures for my reporting purposes.
When I inspected the site I found so many div tags that I got confused writing the selectors, so I thought I would seek experts' help on this.
Here is my code for your reference.
import scrapy

class QuoteSpider(scrapy.Spider):
    name = 'quotes'
    start_urls = ['https://www.accuweather.com/en/gb/london/ec4a-2/may-weather/328328?year=2020']

    def parse(self, response):
        All_div_tags = response.css('div.content-module')[0]
        #Grid_tag = All_div_tags.css('div.monthly-grid')
        Date_tag = All_div_tags.css('div.date::text').extract()
        yield {'Date': Date_tag}
I wrote this in PyCharm and am getting the error "code is not handled or not allowed".
Could someone please help me with this?
I've tried to scrape some websites that gave me the same error. It happens because some websites don't allow web scraping. To get data from these websites, you would probably need to use their API, if they have one.
Fortunately, AccuWeather has made it easy to use their API (unlike other APIs):
You first need to create an account at their developers' website: https://developer.accuweather.com/
Now, create a new app by going to My Apps > Add a new app.
You will probably see some information about your app (if you don't, click its name and it should show up). The only piece of information you will need is your API Key, which is required on every request.
AccuWeather has pretty good documentation for their API at https://developer.accuweather.com/, but I will show you how to use the most useful endpoints. You will need the location key of the city you want the weather for; it is shown in the URL of the city's weather page. For example, London's URL is www.accuweather.com/en/gb/london/ec4a-2/weather-forecast/328328, so its location key is 328328.
When you have the location key of the city/cities you want to get the weather from, open a file, and type:
import requests
import json
If you want the daily weather (as shown here), type:
response = requests.get(url="http://dataservice.accuweather.com/forecasts/v1/daily/1day/LOCATIONKEY?apikey=APIKEY")
print(response.status_code)
Replace APIKEY with your API key and LOCATIONKEY with the city's location key. It should now display 200 when you run it (meaning the request was successful).
Now, load it as a JSON file:
response_json = json.loads(response.content)
And you can now get some information from it, such as the day's "definition":
print(response_json["Headline"]["Text"])
The minimum temperature:
min_temperature = response_json["DailyForecasts"][0]["Temperature"]["Minimum"]["Value"]
print(f"Minimum Temperature: {min_temperature}")
The maximum temperature:
max_temperature = response_json["DailyForecasts"][0]["Temperature"]["Maximum"]["Value"]
print(f"Maximum Temperature: {max_temperature}")
The minimum temperature and maximum temperature with the unit:
min_temperature = str(response_json["DailyForecasts"][0]["Temperature"]["Minimum"]["Value"]) + response_json["DailyForecasts"][0]["Temperature"]["Minimum"]["Unit"]
print(f"Minimum Temperature: {min_temperature}")
max_temperature = str(response_json["DailyForecasts"][0]["Temperature"]["Maximum"]["Value"]) + response_json["DailyForecasts"][0]["Temperature"]["Maximum"]["Unit"]
print(f"Maximum Temperature: {max_temperature}")
And more.
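Putting the steps above together, here is a minimal sketch (API_KEY and LOCATION_KEY are placeholders to replace with your own values):
import requests

API_KEY = "APIKEY"       # your AccuWeather API key
LOCATION_KEY = "328328"  # London, taken from its weather-page URL

url = f"http://dataservice.accuweather.com/forecasts/v1/daily/1day/{LOCATION_KEY}"
response = requests.get(url, params={"apikey": API_KEY})
print(response.status_code)  # 200 means the request was successful

response_json = response.json()
print(response_json["Headline"]["Text"])
for bound in ("Minimum", "Maximum"):
    temp = response_json["DailyForecasts"][0]["Temperature"][bound]
    print(f"{bound} Temperature: {temp['Value']}{temp['Unit']}")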
If you have any questions, let me know. I hope I could help you!

Obtain data from an API by setting the path variable in the API link

I am trying to retrieve data from an API.
Here is the link to the API's documentation:
https://documenter.getpostman.com/view/8854915/SzS7R6uu?version=latest
I wish to obtain data from: "https://corona.lmao.ninja/countries/:country" where "country" is a variable, to get the data for any respective country.
Here is my approach to the problem:
import requests

country = "Zimbabwe"
all_cases = requests.get('https://corona.lmao.ninja/countries/<str:country>')
all_cases_json = all_cases.json()
print(all_cases_json)
It doesn't work, giving the output:
{'message': "Country not found or doesn't have any cases"}
How do I retrieve the desired data?
To get the data for a specific country, the API URL is:
https://corona.lmao.ninja/v2/countries/Zimbabwe
You can also use iso2 or iso3 codes instead of the full country name; iso2 codes are like US for the United States or PK for Pakistan. When you pull up information for a specific country, the API response identifies it by its iso2 and iso3 codes, which for Zimbabwe are ZW and ZWE.
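For example, a quick check (a sketch; the /v2/countries/ endpoint should accept a name, iso2, or iso3 code):
import requests

# These three queries should return the same country record
for query in ("Zimbabwe", "ZW", "ZWE"):
    resp = requests.get(f"https://corona.lmao.ninja/v2/countries/{query}")
    print(query, "->", resp.json().get("country"))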
Like Tebogo said above, the API URL is https://corona.lmao.ninja/v2/countries/{country_name}.
With that information, you can use an f-string to build the URL and request it with the requests module:
import requests

country = "Zimbabwe"
cases_raw = requests.get(f'https://corona.lmao.ninja/v2/countries/{country}')
cases_json = cases_raw.json()
print(f'Cases JSON: {cases_json}')

Beautifulsoup returns empty for all table tags

I'm trying to access the table details to ultimately put into a dataframe and save as a CSV with a limited number of rows (the dataset is massive) from the following site: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data
I'm just starting out with web scraping and was practicing on this dataset. I can effectively pull tags like div, but when I try soup.findAll('tr') or td, it returns an empty set.
The table appears to be loaded dynamically (see link above), so that may be my issue, but I'm still unsure how to access the detail rows and headers, etc. Selenium, maybe?
Thanks in advance!
By the looks of it, the website already allows you to export the data:
As it would seem, the original link is:
https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data
The .csv download link is:
https://data.cityofchicago.org/api/views/ijzp-q8t2/rows.csv?accessType=DOWNLOAD
The .json link is:
https://data.cityofchicago.org/resource/ijzp-q8t2.json
Therefore you could simply extract the ID of the data, in this case ijzp-q8t2, and replace it on the download links above. Here is the official documentation of their API.
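For example, since only a limited number of rows is needed, a sketch reading straight from the CSV export (nrows caps how many rows pandas parses; the output filename is arbitrary):
import pandas as pd

# Read only the first 2000 rows straight from the export endpoint
url = "https://data.cityofchicago.org/api/views/ijzp-q8t2/rows.csv?accessType=DOWNLOAD"
df = pd.read_csv(url, nrows=2000)
df.to_csv("chicago_crimes_sample.csv", index=False)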
import pandas as pd
from sodapy import Socrata

# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.cityofchicago.org", None)

# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.cityofchicago.org",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")

# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("ijzp-q8t2", limit=2000)

# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
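From there, saving the limited CSV the question asks for is one line (the filename is arbitrary):
# Write the 2000-row sample to disk
results_df.to_csv("crimes_2000_rows.csv", index=False)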

How can I get the first results of a Google Search that is not an ad using python?

I am trying to get the financial statements of a bunch of Australian companies as PDFs. I have all the companies stored in a pandas dataframe; their names are in a column called 'Company'. This is my code so far to build the search URLs:
import webbrowser

tabUrl = "http://google.com/?#q="
append = "+financial+report+2017"
file_type = 'filetype%3Apdf+'

for company in data["Company"]:
    googleSearch = tabUrl + file_type + company.replace(" ", "+") + append
    print(googleSearch)
Every search returns (unsurprisingly) a number of ads as the first result. How do I open the first result that is not an ad?
Thanks!
Right now you are sending requests to the Google web page URL, so the results will contain the same ads you would see by going to https://www.google.com.
A better way to do this would be to use google Custom Search API to send your requests and get the results. You can get the documentation here: https://developers.google.com/custom-search/json-api/v1/using_rest
From their documentation, you can see that you can make REST requests to their service endpoint once you generate your API key and Custom Search Engine ID:
GET https://www.googleapis.com/customsearch/v1?key=INSERT_YOUR_API_KEY&cx=017576662512468239146:omuauf_lfve&q=lectures
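A sketch of the same request from Python (API_KEY and CSE_ID are placeholders for your own credentials):
import requests

API_KEY = "INSERT_YOUR_API_KEY"
CSE_ID = "INSERT_YOUR_CSE_ID"

params = {
    "key": API_KEY,
    "cx": CSE_ID,
    "q": "filetype:pdf Some Company financial report 2017",
}
resp = requests.get("https://www.googleapis.com/customsearch/v1", params=params)
resp.raise_for_status()
items = resp.json().get("items", [])
if items:
    # Custom Search results contain no ads, so the first item is organic
    print(items[0]["link"])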

Discogs API => How to retrieve genre?

I've crawled a tracklist of 36,000 songs that have been played on the Danish national radio station P3. I want to do some statistics on how frequently each genre has been played within this period, so I figured the Discogs API might help label each track with a genre. However, the documentation for the API doesn't seem to include an example of querying the genre of a particular song.
I have a CSV file with 3 columns: Artist, Title & Test (Test is where I want the API to write each song's genre).
Here's a sample of the script I've built so far:
import json
import pandas as pd
import requests
import discogs_client

d = discogs_client.Client('ExampleApplication/0.1')
d.set_consumer_key('key-here', 'secret-here')

input = pd.read_csv('Desktop/TEST.csv', encoding='utf-8', error_bad_lines=False)
df = input[['Artist', 'Title', 'Test']]
df.columns = ['Artist', 'Title', 'Test']

for i in range(0, len(list(df.Artist))):
    x = df.Artist[i]
    g = d.artist(x)
    df.Test[i] = str(g)

df.to_csv('Desktop/TEST2.csv', encoding='utf-8', index=False)
This script has been working with a dummy file with 3 records in it so far, mapping the artist for a given ID. But as soon as the file gets larger (e.g. 2000 records), it returns an HTTPError when it cannot find the artist.
I have some questions regarding this approach:
1) Would you recommend using the search query function in the API for retrieving a variable like 'Genre'? Or do you think it is possible to retrieve Genre with a 'd.' function from the API?
2) Will I need to acquire an API key? I have successfully mapped the 3 records without an API key so far. It looks like the key is free, though.
Here's the guide I have been following:
https://github.com/discogs/discogs_client
And here's the documentation for the API:
https://www.discogs.com/developers/#page:home,header:home-quickstart
Maybe you need to re-read the discogs_client examples; I am not an expert myself, but a newbie trying to use this API.
AFAIK, g = d.artist(x) fails because x must be an integer ID, not a string.
So you must first do a search, then get the artist ID, then call d.artist(artist_id).
Sorry for not providing an example; I am a Python newbie right now ;)
Also, have you checked AcoustID?
It's probably a rate limit.
Read the status code of your response; you should find a 429 Too Many Requests.
Unfortunately, if that's the case, the only solution is to add a sleep in your code to make one request per second.
Check out the API doc:
http://www.discogs.com/developers/#page:home,header:home-rate-limiting
I found this guide:
https://github.com/neutralino1/discogs_client.
Access the API with your user token and try something like:
import discogs_client

d = discogs_client.Client('something.py', user_token=auth_token)
release = d.release(774004)
genre = release.genres
print(genre)
If you find a better solution, please share.
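To label tracks by artist and title, as the question asks, here is a sketch using the client's search (method names as in the discogs_client README; verify against the version you have installed):
import time
import discogs_client

# 'token-here' is a placeholder for your own user token
d = discogs_client.Client('ExampleApplication/0.1', user_token='token-here')

def genre_for(artist, title):
    # Search for releases matching the track and take the first hit's genres
    results = d.search(title, artist=artist, type='release')
    if results.count == 0:
        return None
    return results[0].genres

for artist, title in [("Daft Punk", "Get Lucky")]:
    print(artist, title, genre_for(artist, title))
    time.sleep(1)  # stay under the documented rate limit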
