Looping a Bloomberg function over a list of tickers - python

I would like to loop a Bloomberg IntraDayBar request over a dynamic list of 22 tickers and then combine the results into one dataframe.
This code generates the following list of tickers:
bquery = blp.BlpQuery().start()
dates = pd.bdate_range(end='today', periods=31)
time = datetime.datetime.now()
bcom_info = bquery.bds("BCOM Index", "INDX_MEMBERS")
bcom_info['ticker'] = bcom_info['Member Ticker and Exchange Code'].astype(str) + ' Comdty'
I would like to create a dataframe that returns the volume for each ticker, contained in the 'TRADE' event_type, effectively looping the code below over each of the tickers in bcom_info.
bquery.bdib(bcom_info['ticker'], event_type='TRADE', interval=60, start_datetime=dates[0], end_datetime=time)
I tried this but couldn't get it to work:
def bloom_func(x, func):
    bloomberg = bquery
    return bquery.bdib(x, func, event_type='TRADE', interval=60, start_datetime=dates[0], end_datetime=time)

for d in bcom_info['ticker']:
    x[d] = bloom_func(d)
It generates the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-80-fcbf4acd6840> in <module>
2
3 for d in tickers:
----> 4 x[d] = bloom_func(d)
TypeError: bloom_func() missing 1 required positional argument: 'func'
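The immediate TypeError comes from calling bloom_func(d) with one argument while it is defined to take two. A minimal sketch of the intended loop, assuming the blp.BlpQuery interface from the question (the fetch_intraday_volumes name is hypothetical), tags each result with its ticker and concatenates:

```python
import pandas as pd

def fetch_intraday_volumes(bquery, tickers, start, end):
    # hypothetical helper: loop bdib over each ticker and combine the results
    frames = []
    for ticker in tickers:
        bars = bquery.bdib(ticker, event_type='TRADE', interval=60,
                           start_datetime=start, end_datetime=end)
        bars = bars.copy()
        bars['ticker'] = ticker  # tag rows so each ticker stays distinguishable
        frames.append(bars)
    return pd.concat(frames, ignore_index=True)

# usage, with the objects from the question:
# df = fetch_intraday_volumes(bquery, bcom_info['ticker'], dates[0], time)
```

This is a sketch, not a tested Bloomberg call; whether bdib accepts exactly these keyword arguments should be checked against the blp documentation you are using.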

TypeError: sequence item 0: expected str instance, tuple found

I analyzed the data and tried to apply topic modeling. According to the error, I think join expects strings but found a tuple instead, and I don't know how to fix that part. Here is the code I am using:
#Join the review
word_list = ",".join([",".join(i) for i in sexualhomicide['tokens']])
word_list = word_list.split(",")
This is the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
C:\Users\Public\Documents\ESTsoft\CreatorTemp\ipykernel_13792\3474859476.py in <module>
1 #Join the review
----> 2 word_list = ",".join([",".join(i) for i in sexualhomicide['tokens']])
3 word_list = word_list.split(",")
C:\Users\Public\Documents\ESTsoft\CreatorTemp\ipykernel_13792\3474859476.py in <listcomp>(.0)
1 #Join the review
----> 2 word_list = ",".join([",".join(i) for i in sexualhomicide['tokens']])
3 word_list = word_list.split(",")
TypeError: sequence item 0: expected str instance, tuple found
This is the print output for sexualhomicide:
print(sexualhomicide['cleaned_text'])
print("="*30)
print(twitter.pos(sexualhomicide['cleaned_text'][0],Counter('word')))
I can't upload the output of this code; it gets classified as spam during the upload process.
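With KoNLPy-style taggers such as the twitter.pos call shown above, each entry in tokens is typically a (word, POS) tuple rather than a plain string, which is exactly what the error says join received. A sketch of the fix, assuming that tuple shape, is to join only the word part of each tuple:

```python
# assume each document's tokens look like [('word', 'POS tag'), ...]
tokens = [[('night', 'Noun'), ('walk', 'Verb')], [('knife', 'Noun')]]

# join only the first element (the word) of each tuple, not the tuple itself
word_list = ",".join(",".join(word for word, pos in doc) for doc in tokens)
word_list = word_list.split(",")
print(word_list)  # → ['night', 'walk', 'knife']
```

The same one-line change applies to the original: replace ",".join(i) with ",".join(word for word, pos in i).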

Trying to form a DataFrame from an API, but the function is getting a NameError

import requests # get connection
import pandas as pd
import json

def get_info(data):
    data=[]
    source=[]
    published_date=[]
    adx_keywords=[]
    byline=[]
    title=[]
    abstract=[]
    des_facet=[]
    per_facet=[]
    media=[]
    Api_Key=''
    url='https://api.nytimes.com/svc/mostpopular/v2/viewed/7.json?api-key=' # key redacted
    response=requests.get(url).json()
    for i in response['results']:
        source.append(i['source'])
        published_date.append(i['published_date'])
        adx_keywords.append(i['adx_keywords'])
        byline.append(i['byline'])
        title.append(i['title'])
        abstract.append(i['abstract'])
        des_facet.append(i['des_facet'])
        per_facet.append(i['per_facet'])
        media.append(i['media'])
    data=data.append({'source':source,'published_date':published_date,'adx_keywords':adx_keywords,byline':byline, 'title':title,'abstract':abstract,'des_facet':des_facet,
    'per_facet':per_facet,'media':media})
    df=df.append(d)
    return df
df
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-292-00cf07b74dcd> in <module>()
----> 1 df
NameError: name 'df' is not defined
Your quotes are in the wrong place: the opening quote before byline is missing.
before:
data=data.append({'source':source,'published_date':published_date,'adx_keywords':adx_keywords,byline':byline, 'title':title,'abstract':abstract,'des_facet':des_facet,
'per_facet':per_facet,'media':media})
after:
data=data.append({'source':source,'published_date':published_date,'adx_keywords':adx_keywords,'byline':byline, 'title':title, 'abstract':abstract,'des_facet':des_facet,
'per_facet':per_facet,'media':media})
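Even with the quote fixed, data.append() returns None (list.append mutates in place and returns nothing), and df is never defined inside the function, which is why evaluating df afterwards raises the NameError. A sketch of a version that builds the DataFrame directly, splitting out a hypothetical rows_from_results helper so the parsing can be shown without calling the live API:

```python
import pandas as pd
import requests

FIELDS = ['source', 'published_date', 'adx_keywords', 'byline', 'title',
          'abstract', 'des_facet', 'per_facet', 'media']

def rows_from_results(results):
    # one dict per article, so the DataFrame gets one row per article
    return [{field: item.get(field) for field in FIELDS} for item in results]

def get_info(api_key):
    url = f'https://api.nytimes.com/svc/mostpopular/v2/viewed/7.json?api-key={api_key}'
    response = requests.get(url).json()
    return pd.DataFrame(rows_from_results(response['results']))
```

Building a list of row dicts and handing it to pd.DataFrame once avoids both the lists-of-columns bookkeeping and the repeated append calls.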

TypeError: list indices must be integers or slices, not str; multiple failures after trying to debug in a different cell

I have two dataframes, as follows, and the following function:
def get_user_movies(user_id):
    movie_id = user_movie_df[user_movie_df['UserID'] == user_id]['MovieID'].tolist()
    movie_title = []
    for i in range(len(movie_id)):
        a = movie_title[movie_title['MovieID'] == movie_id[i]]['Title'].values[0]
        movie_title.append(a)
    if movie_id == [] and movie_title == []:
        raise Exception
    return movie_id,movie_title

get_user_movies(30878)
And I have the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-55-9c58c22528ff> in <module>
8 raise Exception
9 return movie_id,movie_title
---> 10 get_user_movies(30878)
<ipython-input-55-9c58c22528ff> in get_user_movies(user_id)
3 movie_title = []
4 for i in range(len(movie_id)):
----> 5 a = movie_title[movie_title['MovieID'] == movie_id[i]]['Title'].values[0]
6 movie_title.append(a)
7 if movie_id == [] and movie_title == []:
TypeError: list indices must be integers or slices, not str
I debugged a couple of times; the failing line runs with no problem when I try it with a single movie_id, or with some random movie_ids together in another loop. I just don't understand why this error keeps popping up.
Please take a look! Thanks!
The name movie_title is used for both your list and your DataFrame: inside the loop, the empty list shadows the DataFrame, so movie_title['MovieID'] tries to index a list with a string. Rename one of them.
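A sketch of the corrected function, renaming the list so it no longer shadows the movie_title DataFrame (the two small frames here are stand-ins for the questioner's data):

```python
import pandas as pd

# stand-in frames mirroring the question's structure
user_movie_df = pd.DataFrame({'UserID': [30878, 30878], 'MovieID': [1, 2]})
movie_title = pd.DataFrame({'MovieID': [1, 2], 'Title': ['Alpha', 'Beta']})

def get_user_movies(user_id):
    movie_ids = user_movie_df[user_movie_df['UserID'] == user_id]['MovieID'].tolist()
    titles = []  # renamed: no longer shadows the movie_title DataFrame
    for mid in movie_ids:
        titles.append(movie_title[movie_title['MovieID'] == mid]['Title'].values[0])
    if not movie_ids and not titles:
        raise Exception(f'no movies found for user {user_id}')
    return movie_ids, titles
```

With the shadowing gone, movie_title['MovieID'] always refers to the DataFrame, and the boolean-mask lookup works on every iteration.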

Scraping Google News with pygooglenews

I am trying to scrape Google News with pygooglenews. I want to scrape more than 100 articles at a time (Google sets the limit at 100) by changing the target dates with a for loop. Below is what I have so far, but I keep getting this error message:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-84-4ada7169ebe7> in <module>
----> 1 df = pd.DataFrame(get_news('Banana'))
2 writer = pd.ExcelWriter('My Result.xlsx', engine='xlsxwriter')
3 df.to_excel(writer, sheet_name='Results', index=False)
4 writer.save()
<ipython-input-79-c5266f97934d> in get_titles(search)
9
10 for date in date_list[:-1]:
---> 11 search = gn.search(search, from_=date, to_=date_list[date_list.index(date)])
12 newsitem = search['entries']
13
~\AppData\Roaming\Python\Python37\site-packages\pygooglenews\__init__.py in search(self, query, helper, when, from_, to_, proxies, scraping_bee)
140 if from_ and not when:
141 from_ = self.__from_to_helper(validate=from_)
--> 142 query += ' after:' + from_
143
144 if to_ and not when:
TypeError: unsupported operand type(s) for +=: 'dict' and 'str'
import pandas as pd
from pygooglenews import GoogleNews
import datetime

gn = GoogleNews()

def get_news(search):
    stories = []
    start_date = datetime.date(2021,3,1)
    end_date = datetime.date(2021,3,5)
    delta = datetime.timedelta(days=1)
    date_list = pd.date_range(start_date, end_date).tolist()
    for date in date_list[:-1]:
        search = gn.search(search, from_=date.strftime('%Y-%m-%d'), to_=(date+delta).strftime('%Y-%m-%d'))
        newsitem = search['entries']
        for item in newsitem:
            story = {
                'title':item.title,
                'link':item.link,
                'published':item.published
            }
            stories.append(story)
    return stories

df = pd.DataFrame(get_news('Banana'))
Thank you in advance.
It looks like you are correctly passing a string into get_news(), which is then passed on as the first argument (search) to gn.search().
However, you're reassigning search to the result of gn.search() in the line:
search = gn.search(search, from_=date.strftime('%Y-%m-%d'), to_=(date+delta).strftime('%Y-%m-%d'))
# ^^^^^^
# gets overwritten with the result of gn.search()
On the next iteration, this reassigned search, now a dict, is passed into gn.search(), which expects a string.
If you look at the code in pygooglenews, it looks like gn.search() is returning a dict which would explain the error.
To fix this, simply use a different variable, e.g.:
result = gn.search(search, from_=date.strftime('%Y-%m-%d'), to_=(date+delta).strftime('%Y-%m-%d'))
newsitem = result['entries']
Also, since pygooglenews has a limit of 100 articles per query, scraping each day separately in a loop, as you are doing, is the right approach.
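Putting the fix together, a sketch of the corrected loop (gn is taken as a parameter here only so the search/result distinction is explicit; otherwise it mirrors the question's code):

```python
import datetime
import pandas as pd

def get_news(gn, search):
    stories = []
    delta = datetime.timedelta(days=1)
    date_list = pd.date_range(datetime.date(2021, 3, 1),
                              datetime.date(2021, 3, 5)).tolist()
    for date in date_list[:-1]:
        # keep the query string and the result dict in separate names
        result = gn.search(search, from_=date.strftime('%Y-%m-%d'),
                           to_=(date + delta).strftime('%Y-%m-%d'))
        for item in result['entries']:
            stories.append({'title': item.title,
                            'link': item.link,
                            'published': item.published})
    return stories
```

Because search is never reassigned, it stays a string on every pass through the loop, and the TypeError disappears.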

tuple index out of range for regexp_replace - pyspark-sql

SELECT url,
       regexp_replace(title, '(http|ftp|file|https)://[-a-z0-9+&##/\%?=~_-|!:,.;/]*|\<.*?\>|(=+)\s*(.*?)\s*(=+)|&\w+;', '') AS text_body
FROM df_table_doc
0 https://demo.com New Arch {Onboarding}..Lets (Onboard) it..
1 https://example.com New Arch (Onboarding)
Adding the pattern \{.*?\} to replace anything within {} fails with:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-1-20460659c049> in <module>
----> 1 get_ipython().run_cell_magic('spark_sql', '--limit 200', "select url, regexp_replace(title, '(http|ftp|file|https)://[-a-z0-9+&##/\\%?=~_-|!:,.;/]*|\\<.*?\\>|\\{.*?\\}|(=+)\\s*(.*?)\\s*(=+)|&\\w+;', '') as text_body\n from df_table_doc\n")
IndexError: tuple index out of range
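One plausible cause, and this is an assumption about how the %%spark_sql cell magic preprocesses the query rather than anything Spark itself does: if the cell text is passed through Python's str.format, a bare {.*?} looks like a positional replacement field, and indexing an empty args tuple raises exactly this IndexError before the SQL ever reaches Spark. Doubling the braces escapes them for str.format while leaving the regex Spark sees unchanged:

```python
# "{.*?}" looks like a str.format replacement field
pattern_raw = r"regexp_replace(title, '\{.*?\}', '')"
try:
    pattern_raw.format()
    survived = True
except IndexError:
    survived = False  # tuple index out of range, as in the traceback

# doubling the braces escapes them for str.format
pattern_escaped = r"regexp_replace(title, '\{{.*?\}}', '')"
assert pattern_escaped.format() == pattern_raw
```

If the magic in use does not format the cell, this explanation does not apply, but writing \{{.*?\}} in the cell is a cheap experiment to confirm or rule it out.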
