Python: get all stock tickers

This question has been asked to death, but none of the answers provide an actually workable solution. I had previously found one in get-all-tickers:
pip install get-all-tickers
Recently, for whatever reason, the package get-all-tickers has stopped working:
from get_all_tickers import get_tickers as gt
list_of_tickers = gt.get_tickers()
gives me the error:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 23, saw 46
As this is the only package I found that actually gave a complete ticker list (a good check is "NKLA", which is missing from all other "solutions" I've found on Stack Overflow or elsewhere), I now need either a new way to get up-to-date ticker lists or a fix for this...
Any ideas?

Another solution would be to load this data as CSV.
Get the CSV from:
https://plextock.com/us-symbols?utm_source=so
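If you go that route, here is a minimal loading sketch. The local filename is an assumption, and the export's column names are unknown, so inspect df.columns for the actual layout:
import pandas as pd

df = pd.read_csv('us-symbols.csv')   # assumes the export was saved locally under this name
print(df.columns.tolist())           # inspect the actual column names
print(len(df), 'rows')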

See this answer first: https://quant.stackexchange.com/a/1862/38968
NASDAQ makes this information available via FTP and they update it
every night. Log into ftp.nasdaqtrader.com anonymously. Look in the
directory SymbolDirectory. You'll notice two files: nasdaqlisted.txt
and otherlisted.txt. These two files will give you the entire list of
tradeable symbols, where they are listed, their name/description, and
an indicator as to whether they are an ETF.
Given this list, which you can pull each night, you can then query
Yahoo to obtain the necessary data to calculate your statistics.
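For reference, here is a minimal sketch of pulling those two files with ftplib and loading them with pandas. The pipe-delimited layout, the 'Symbol' / 'ACT Symbol' column names, and the trailing "File Creation Time" footer row are assumptions based on how the files looked at the time of writing, so check the raw files if this errors out.
import io
from ftplib import FTP

import pandas as pd

def fetch_symbol_file(filename):
    # download one of NASDAQ's nightly symbol files via anonymous FTP
    buf = io.BytesIO()
    with FTP('ftp.nasdaqtrader.com') as ftp:
        ftp.login()  # anonymous login
        ftp.retrbinary(f'RETR SymbolDirectory/{filename}', buf.write)
    buf.seek(0)
    return pd.read_csv(buf, sep='|').iloc[:-1]  # drop the "File Creation Time" footer row

nasdaq = fetch_symbol_file('nasdaqlisted.txt')
other = fetch_symbol_file('otherlisted.txt')

tickers = sorted(set(nasdaq['Symbol'].dropna()) | set(other['ACT Symbol'].dropna()))
print(len(tickers), 'NKLA' in tickers)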
Also, the New York Stock Exchange provides a search function:
https://www.nyse.com/listings_directory/stock
... and this page seems to have a lot as well; it has Nikola/NKLA at least ;)
https://www.advfn.com/nasdaq/nasdaq.asp?companies=N

This pip package broke recently; someone has already raised an issue on the project's GitHub (https://github.com/shilewenuw/get_all_tickers/issues/12).
It was caused by a recent update to the NASDAQ API.

Not perfect, but Kaggle has some:
https://www.kaggle.com/datasets/jacksoncrow/stock-market-dataset?resource=download

You could use the free Alpha Vantage API: https://www.alphavantage.co/documentation/
Example:
import requests

key = '2DHC1EFVR3EOQ33Z'  # free key from https://www.alphavantage.co/support/#api-key -- no registration required
result = requests.get('https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol=NKLA&apikey=' + key).json()
print(f'The price for NKLA right now is ${result["Global Quote"]["05. price"]}.')

Related

How to detect failed downloads using yfinance

I am using the API yfinance: https://github.com/ranaroussi/yfinance
With the simple code below:
import yfinance as yf

data = yf.download("A AA AAA Z LOL KE QP")
I got the following output:
[*********************100%***********************] 7 of 7 completed
2 Failed downloads:
- LOL: 1d data not available for startTime=-2208988800 and endTime=1621954979. Only 100 years worth of day granularity data are allowed to be fetched per request.
- QP: 1d data not available for startTime=-2208988800 and endTime=1621954979. Only 100 years worth of day granularity data are allowed to be fetched per request.
I would like to know how I can detect in my code that "LOL" and "QP" failed.
This is the code where the 'error' is raised in the yfinance package. It is not an actual exception, so you would have to override the download function, which is quite big.
if shared._ERRORS:
    print('\n%.f Failed download%s:' % (
        len(shared._ERRORS), 's' if len(shared._ERRORS) > 1 else ''))
    # print(shared._ERRORS)
    print("\n".join(['- %s: %s' %
                     v for v in list(shared._ERRORS.items())]))
Edit
I found a way to get the failed downloads:
simply import the shared.py module and read the _ERRORS dict.
This dict stores the errors from the last call to download. It is reset before each download, so it is accessible right after the call returns.
Simply use the following code:
import yfinance as yf
import yfinance.shared as shared

data = yf.download("A AA AAA Z LOL KE QP")
print(list(shared._ERRORS.keys()))
After playing a little bit more with the data output, I found a non-elegant way of checking for failed values, for example for the element "LOL":
all(pd.isna(v) for v in dict(data.Close["LOL"]).values())
It checks whether all values of the closing price are NaN.
This method works but is not optimal, I think; there might be a better and simpler way of doing it. Let's hope someone finds it :)
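For what it's worth, a slightly more compact variant of that idea (my own sketch, not part of the original answer) is to flag any ticker whose entire Close column is NaN after the download:
import yfinance as yf

data = yf.download("A AA AAA Z LOL KE QP")

# tickers whose Close column is entirely NaN most likely failed to download
failed = [t for t in data['Close'].columns if data['Close'][t].isna().all()]
print(failed)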

bibtex to html with pybtex, python 3

I want to take a file of one or more bibtex entries and output it as an html-formatted string. The specific style is not so important, but let's just say APA. Basically, I want the functionality of bibtex2html but with a Python API since I'm working in Django. A few people have asked similar questions here and here. I also found someone who provided a possible solution here.
The first issue I'm having is pretty basic, which is that I can't even get the above solutions to run. I keep getting errors similar to ModuleNotFoundError: No module named 'pybtex.database'; 'pybtex' is not a package. I definitely have pybtex installed and can make basic API calls in the shell no problem, but whenever I try to import pybtex.database.whatever or pybtex.plugin I keep getting ModuleNotFound errors. Is it maybe a python 2 vs python 3 thing? I'm using the latter.
The second issue is that I'm having trouble understanding the pybtex python API documentation. Specifically, from what I can tell it looks like the format_from_string and format_from_file calls are designed specifically for what I want to do, but I can't seem to get the syntax correct. Specifically, when I do
pybtex.format_from_file('foo.bib',style='html')
I get pybtex.plugin.PluginNotFound: plugin pybtex.style.formatting.html not found. I think I'm just not understanding how the call is supposed to work, and I can't find any examples of how to do it properly.
Here's a function I wrote for a similar use case--incorporating bibliographies into a website generated by Pelican.
from pybtex.plugin import find_plugin
from pybtex.database import parse_string

APA = find_plugin('pybtex.style.formatting', 'apa')()
HTML = find_plugin('pybtex.backends', 'html')()

def bib2html(bibliography, exclude_fields=None):
    exclude_fields = exclude_fields or []
    if exclude_fields:
        bibliography = parse_string(bibliography.to_string('bibtex'), 'bibtex')
        for entry in bibliography.entries.values():
            for ef in exclude_fields:
                if ef in entry.fields.__dict__['_dict']:
                    del entry.fields.__dict__['_dict'][ef]
    formattedBib = APA.format_bibliography(bibliography)
    return "<br>".join(entry.text.render(HTML) for entry in formattedBib)
Make sure you've installed the following:
pybtex==0.22.2
pybtex-apa-style==1.3
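A quick usage sketch of the function above (the BibTeX entry is made up purely for illustration):
from pybtex.database import parse_string

bib_source = """
@article{doe2020,
    author  = {Doe, Jane},
    title   = {A Sample Article},
    journal = {Journal of Examples},
    year    = {2020},
    note    = {Removed below via exclude_fields}
}
"""

bibliography = parse_string(bib_source, 'bibtex')
print(bib2html(bibliography, exclude_fields=['note']))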

how to use pyknackhq python library for getting whole objects/tables from my knack builder

I am trying to connect the Knack online database with my Python data-handling scripts in order to refresh objects/tables directly in my Knack app builder. I discovered that the pyknackhq Python API for KnackHQ can fetch objects and return JSON objects for an object's records. So far so good.
However, following the documentation (http://www.wbh-doc.com.s3.amazonaws.com/pyknackhq/quick%20start.html) I have tried to fetch all rows (records in Knack) for my object/table (344 records in total).
My code was:
i = 0
for rec in undec_obj.find():
    print(rec)
    i = i + 1
print(i)
>> 25
The first 25 records were indeed returned; however, the rest (up to the 344th) never were. The documentation of the pyknackhq library is relatively sparse, so I couldn't find a way around my problem there. Is there a solution to get all my records/rows? (I have also changed the settings in Knack so that all my records appear on the same page, page 1.)
The ultimate goal is to take all records and make them a pandas dataframe.
thank you!
I haven't worked with that library, but I've written another Python Knack API wrapper that should help:
https://github.com/cityofaustin/knackpy
The docs should get you where you want to go. Here's an example:
>>> from knackpy import Knack

# download data from knack object
# will fetch records in chunks of 1000 until all records have been downloaded
# optionally pass a rows_per_page and/or page_limit parameter to limit record count
>>> kn = Knack(
        obj='object_3',
        app_id='someappid',
        api_key='topsecretapikey',
        page_limit=10,       # not needed; this is the default
        rows_per_page=1000   # not needed; this is the default
    )

>>> for row in kn.data:
        print(row)

{'store_id': 30424, 'inspection_date': 1479448800000, 'id': '58598262bcb3437b51194040'},...
Hope that helps. Open a GitHub issue if you have any questions using the package.
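Since the ultimate goal is a pandas DataFrame: kn.data is a list of record dicts, so (assuming the download above succeeded) it should drop straight into pandas:
import pandas as pd

df = pd.DataFrame(kn.data)  # one row per Knack record
print(df.shape)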

Feedparser returns only first entry of ATOM feed

I updated my (already) working code from Python 2.7 to Python 3.5, and the following problem suddenly appeared.
When parsing the given Atom feed, which has many entries (and correct syntax), feedparser 5.2.1 returns only the first entry of the feed and, of course, the feed's "meta" data.
My (unmodified) code:
import feedparser

feed_data = feedparser.parse("www.myfeed.com/myfeeds.atom")
for entry in feed_data.entries:
    print(entry)
output
{'uid':'99999','author':'XY', ...more content of the first entry...}
{}
The next (second) entry is empty... and the other entries are not even listed... The length of feed_data.entries is 2 (it should be 78).
UPDATE
Now (today) I get 3 entries as output, because one new entry was appended at the beginning of the entry-list, so I guess it is an "encoding" problem with the specific 3rd entry in the current feed.
Any ideas how to fix the problem?
Okay guys,
Python 3.5 is not supported yet, but support for this Python version has been prepared in the develop branch of the GitHub project (see here).
It works with that development version of feedparser, so I'll try it and might wait (nothing has happened in a year) for the official release of this "feature".

Discogs API => How to retrieve genre?

I've crawled a tracklist of 36,000 songs that have been played on the Danish national radio station P3. I want to do some statistics on how frequently each of the genres has been played within this period, so I figured the Discogs API might help with labeling each track with a genre. However, the documentation for the API doesn't seem to include an example of querying the genre of a particular song.
I have a CSV file with 3 columns: Artist, Title & Test (Test is where I want the API to put each song's genre).
Here's a sample of the script I've built so far:
import json
import pandas as pd
import requests
import discogs_client

d = discogs_client.Client('ExampleApplication/0.1')
d.set_consumer_key('key-here', 'secret-here')

input = pd.read_csv('Desktop/TEST.csv', encoding='utf-8', error_bad_lines=False)
df = input[['Artist', 'Title', 'Test']]
df.columns = ['Artist', 'Title', 'Test']

for i in range(0, len(list(df.Artist))):
    x = df.Artist[i]
    g = d.artist(x)
    df.Test[i] = str(g)

df.to_csv('Desktop/TEST2.csv', encoding='utf-8', index=False)
This script has worked with a dummy file of 3 records so far, mapping the artist of a given ID#. But as soon as the file gets larger (e.g. 2,000 records), it returns an HTTPError when it cannot find the artist.
I have some questions regarding this approach:
1) Would you recommend using the search query function in the API for retrieving a variable such as 'Genre'? Or do you think it is possible to retrieve the genre with a 'd.' function from the API?
2) Will I need to acquire an API key? I have successfully mapped the 3 records without an API key so far. The key looks free, though.
Here's the guide I have been following:
https://github.com/discogs/discogs_client
And here's the documentation for the API:
https://www.discogs.com/developers/#page:home,header:home-quickstart
Maybe you need to re-read the discogs_client examples; I am not an expert myself, just a newbie trying to use this API.
AFAIK, g = d.artist(x) fails because x must be an integer, not a string.
So you must first do a search, then get the artist id, then call d.artist(artist_id).
Sorry for not providing an example; I am a Python newbie right now ;)
Also, have you checked AcoustID?
It's probably a rate limit.
Read the status code of your response; you should find a 429 Too Many Requests.
Unfortunately, if that's the case, the only solution is to add a sleep to your code so that it makes one request per second.
Check out the API docs:
http://www.discogs.com/developers/#page:home,header:home-rate-limiting
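If it is indeed the rate limit, a throttled lookup loop is one way around it. The sketch below is only illustrative: the discogs_client search call, the 429 handling, and the sample artist/title row are my assumptions rather than something from the docs above, so adjust them to your client version.
import time

import discogs_client
from discogs_client.exceptions import HTTPError

d = discogs_client.Client('ExampleApplication/0.1', user_token='your-user-token')

def lookup_genres(artist, title):
    # search Discogs for a release and return its genres, or None if not found
    try:
        results = d.search(title, artist=artist, type='release')
        return results[0].genres
    except IndexError:
        return None                 # no matching release
    except HTTPError as e:
        if e.status_code == 429:    # rate limited: back off before continuing
            time.sleep(60)
        return None

for artist, title in [('Daft Punk', 'Get Lucky')]:  # placeholder rows
    print(artist, title, lookup_genres(artist, title))
    time.sleep(1.1)                 # stay at roughly one request per second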
I found this guide:
https://github.com/neutralino1/discogs_client.
Access the API with your key and try something like:
d = discogs_client.Client('something.py', user_token=auth_token)
release = d.release(774004)
genre = release.genres
If you found a better solution please share.
