IMDbPY handling None object - python

I'm trying to pull data about cinematographers with IMDbPY and I'm running into a None object. I'm not sure how to deal with that None object in the code. Could someone help me out, please?
Here's where I have got to:
from imdb import IMDb, IMDbError

ia = IMDb()
itemdop = ''
doplist = []
items = ["0050083", "6273736", "2582802"]

def imdblistdop(myList=[], *args):
    for x in myList:
        movie = ia.get_movie(x)
        cinematographer = movie.get('cinematographers')[0]
        cinematographer2 = movie.get('cinematographers')
        print(cinematographer)
        print(doplist)
        try:
            itemdop = cinematographer['name']
            doplist.append(itemdop)
        except KeyError as ke:
            print('Nope!')

imdblistdop(items)
The code is not working at all, and all I get is this:
Boris Kaufman
[]
TypeError Traceback (most recent call last)
in ()
21
22
---> 23 imdblistdop(items)
24
25
in imdblistdop(myList, *args)
10 for x in myList:
11 movie = ia.get_movie(x)
---> 12 cinematographer = movie.get('cinematographers')[0]
13 cinematographer2 = movie.get('cinematographers')
14 print(cinematographer)
TypeError: 'NoneType' object is not subscriptable

movie.get('cinematographers') normally returns a list, which means you can point to an entry in it using its index, for example cinematographers[0]. But when a movie has no cinematographer data, .get() returns None instead of a list, and None cannot be indexed at all, which is exactly what the traceback says. Check for None (or an empty list) before taking [0].
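A minimal sketch of one way to guard against that, using the same IDs as above (the reworked helper below is just for illustration, not the questioner's original function):

from imdb import IMDb, IMDbError

ia = IMDb()
items = ["0050083", "6273736", "2582802"]

def imdblistdop(movie_ids):
    doplist = []
    for movie_id in movie_ids:
        try:
            movie = ia.get_movie(movie_id)
        except IMDbError as err:
            print(err)
            continue
        # movie.get() returns None when IMDb has no cinematographer data for this title
        cinematographers = movie.get('cinematographers')
        if not cinematographers:
            print('No cinematographer listed for', movie_id)
            continue
        doplist.append(cinematographers[0]['name'])
    return doplist

print(imdblistdop(items))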

Related

sequence item 0: expected str instance, tuple found

I analyzed the data in the precedents and tried to use topic modeling. Here is the syntax I am using. According to the error, I think a string is expected when joining, but a tuple was found instead; I don't know how to fix this part.
#Join the review
word_list = ",".join([",".join(i) for i in sexualhomicide['tokens']])
word_list = word_list.split(",")
This is the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
C:\Users\Public\Documents\ESTsoft\CreatorTemp\ipykernel_13792\3474859476.py in <module>
1 #Join the review
----> 2 word_list = ",".join([",".join(i) for i in sexualhomicide['tokens']])
3 word_list = word_list.split(",")
C:\Users\Public\Documents\ESTsoft\CreatorTemp\ipykernel_13792\3474859476.py in <listcomp>(.0)
1 #Join the review
----> 2 word_list = ",".join([",".join(i) for i in sexualhomicide['tokens']])
3 word_list = word_list.split(",")
TypeError: sequence item 0: expected str instance, tuple found
This is the print output for 'sexualhomicide':
print(sexualhomicide['cleaned_text'])
print("="*30)
print(twitter.pos(sexualhomicide['cleaned_text'][0],Counter('word')))
I can't upload the results of this code; the upload fails because it gets classified as spam.
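If each entry in sexualhomicide['tokens'] is a list of (word, POS-tag) tuples, as the twitter.pos output above suggests, one likely way around the error is to join only the word part of each tuple. A minimal sketch under that assumption:

# str.join() only accepts strings, so pull the word out of each (word, tag) tuple first
word_list = ",".join(
    ",".join(word for word, tag in tokens)
    for tokens in sexualhomicide['tokens']
)
word_list = word_list.split(",")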

Getting KeyError: 'viewCount' when using the YouTube API in Python

I'm trying to get the view count for a list of videos from a channel. I've written a function, and when I run it with just 'video_id', 'title' and 'published date' I get the output. However, when I ask for the view count, or anything else from the statistics part of the API, I get a KeyError.
Here's the code:
def get_video_details(youtube, video_ids):
    all_video_stats = []
    for i in range(0, len(video_ids), 50):
        request = youtube.videos().list(
            part='snippet,statistics',
            id=','.join(video_ids[i:i+50]))
        response = request.execute()
        for video in response['items']:
            video_stats = dict(
                Video_id=video['id'],
                Title=video['snippet']['title'],
                Published_date=video['snippet']['publishedAt'],
                Views=video['statistics']['viewCount'])
            all_video_stats.append(video_stats)
    return all_video_stats

get_video_details(youtube, video_ids)
And this is the error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18748/3337790216.py in <module>
----> 1 get_video_details(youtube, video_ids)
~\AppData\Local\Temp/ipykernel_18748/1715852978.py in get_video_details(youtube, video_ids)
14 Title = video['snippet']['title'],
15 Published_date = video['snippet']['publishedAt'],
---> 16 Views = video['statistics']['viewCount'])
17
18 all_video_stats.append(video_stats)
KeyError: 'viewCount'
I was referencing this Youtube video to write my code.
Thanks in advance.
I got it.
I had to use .get() to avoid the KeyError; it returns None when the key is missing instead of raising an exception.
I replaced this line to get the solution:
Views = video['statistics'].get('viewCount')
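Put back into the loop, a sketch of the same fix (the .get() fallbacks on the snippet fields are my own addition, not part of the original answer):

for video in response['items']:
    video_stats = dict(
        Video_id=video['id'],
        Title=video['snippet'].get('title'),
        Published_date=video['snippet'].get('publishedAt'),
        # statistics can omit 'viewCount' when a video hides its view counter
        Views=video.get('statistics', {}).get('viewCount'))
    all_video_stats.append(video_stats)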

TypeError: list indices must be integers or slices, not str; multiple failures after trying to debug in a different cell

I have two dataframes, as follows:

And I have the following function:
def get_user_movies(user_id):
    movie_id = user_movie_df[user_movie_df['UserID'] == user_id]['MovieID'].tolist()
    movie_title = []
    for i in range(len(movie_id)):
        a = movie_title[movie_title['MovieID'] == movie_id[i]]['Title'].values[0]
        movie_title.append(a)
    if movie_id == [] and movie_title == []:
        raise Exception
    return movie_id, movie_title

get_user_movies(30878)
And I have the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-55-9c58c22528ff> in <module>
8 raise Exception
9 return movie_id,movie_title
---> 10 get_user_movies(30878)
<ipython-input-55-9c58c22528ff> in get_user_movies(user_id)
3 movie_title = []
4 for i in range(len(movie_id)):
----> 5 a = movie_title[movie_title['MovieID'] == movie_id[i]]['Title'].values[0]
6 movie_title.append(a)
7 if movie_id == [] and movie_title == []:
TypeError: list indices must be integers or slices, not str
I debugged a couple of times; the line that raises the error runs with no problem when I try it with a single movie_id, or with a few random movie_ids together in another loop. I just don't understand why this error keeps popping up.
Please take a look! Thanks!
def get_user_movies(user_id):
    movie_id = user_movie_df[user_movie_df['UserID'] == user_id]['MovieID'].tolist()
    movie_title = []
    for i in range(len(movie_id)):
        a = movie_title[movie_title['MovieID'] == movie_id[i]]['Title'].values[0]
        movie_title.append(a)
    if movie_id == [] and movie_title == []:
        raise Exception
    return movie_id, movie_title

get_user_movies(30878)
The name movie_title is used for both the empty list and the dataframe. Inside the loop, movie_title refers to the list you just created, so movie_title['MovieID'] tries to index a list with a string, which raises the TypeError. Rename one of them.
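A sketch of the function with the local list renamed, assuming the second dataframe is the one called movie_title:

def get_user_movies(user_id):
    movie_id = user_movie_df[user_movie_df['UserID'] == user_id]['MovieID'].tolist()
    titles = []  # renamed so it no longer shadows the movie_title dataframe
    for mid in movie_id:
        titles.append(movie_title[movie_title['MovieID'] == mid]['Title'].values[0])
    if movie_id == [] and titles == []:
        raise Exception
    return movie_id, titles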

Scraping Google News with pygooglenews

I am trying to scrape Google News with pygooglenews.
I want to scrape more than 100 articles at a time (Google sets the limit at 100) by changing the target dates with a for loop. Below is what I have so far, but I keep getting this error message:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-84-4ada7169ebe7> in <module>
----> 1 df = pd.DataFrame(get_news('Banana'))
2 writer = pd.ExcelWriter('My Result.xlsx', engine='xlsxwriter')
3 df.to_excel(writer, sheet_name='Results', index=False)
4 writer.save()
<ipython-input-79-c5266f97934d> in get_titles(search)
9
10 for date in date_list[:-1]:
---> 11 search = gn.search(search, from_=date, to_=date_list[date_list.index(date)])
12 newsitem = search['entries']
13
~\AppData\Roaming\Python\Python37\site-packages\pygooglenews\__init__.py in search(self, query, helper, when, from_, to_, proxies, scraping_bee)
140 if from_ and not when:
141 from_ = self.__from_to_helper(validate=from_)
--> 142 query += ' after:' + from_
143
144 if to_ and not when:
TypeError: unsupported operand type(s) for +=: 'dict' and 'str'
import pandas as pd
from pygooglenews import GoogleNews
import datetime

gn = GoogleNews()

def get_news(search):
    stories = []
    start_date = datetime.date(2021, 3, 1)
    end_date = datetime.date(2021, 3, 5)
    delta = datetime.timedelta(days=1)
    date_list = pd.date_range(start_date, end_date).tolist()
    for date in date_list[:-1]:
        search = gn.search(search, from_=date.strftime('%Y-%m-%d'), to_=(date+delta).strftime('%Y-%m-%d'))
        newsitem = search['entries']
        for item in newsitem:
            story = {
                'title': item.title,
                'link': item.link,
                'published': item.published
            }
            stories.append(story)
    return stories

df = pd.DataFrame(get_news('Banana'))
Thank you in advance.
It looks like you are correctly passing in a string into get_news() which is then passed on as the first argument (search) into gn.search().
However, you're reassigning search to the result of gn.search() in the line:
search = gn.search(search, from_=date.strftime('%Y-%m-%d'), to_=(date+delta).strftime('%Y-%m-%d'))
# ^^^^^^
# gets overwritten with the result of gn.search()
In the next iteration this reassigned search is passed into gn.search() which it doesn't like.
If you look at the code in pygooglenews, it looks like gn.search() is returning a dict which would explain the error.
To fix this, simply use a different variable, e.g.:
result = gn.search(search, from_=date.strftime('%Y-%m-%d'), to_=(date+delta).strftime('%Y-%m-%d'))
newsitem = result['entries']
I know that pygooglenews has a limit of 100 articles, so you have to make a loop in which it scrapes each day separately.
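Putting the rename into the original function, a sketch of the day-by-day loop (otherwise unchanged from the question):

def get_news(search):
    stories = []
    start_date = datetime.date(2021, 3, 1)
    end_date = datetime.date(2021, 3, 5)
    delta = datetime.timedelta(days=1)
    date_list = pd.date_range(start_date, end_date).tolist()
    for date in date_list[:-1]:
        # keep the query string in `search`; store the returned dict under its own name
        result = gn.search(search,
                           from_=date.strftime('%Y-%m-%d'),
                           to_=(date + delta).strftime('%Y-%m-%d'))
        for item in result['entries']:
            stories.append({
                'title': item.title,
                'link': item.link,
                'published': item.published,
            })
    return stories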

ValueError: dictionary update sequence element #13 has length 1; 2 is required

I am getting the following error:
ValueError Traceback (most recent call last)
<ipython-input-19-ec485c9b9711> in <module>
31 except Exception as e:
32 print(e)
---> 33 raise e
34 print(i)
35 i = i+1
<ipython-input-19-ec485c9b9711> in <module>
21 # cc = dict(x.split(':') for x in c.split(','))
22 c = '"'.join(c)
---> 23 cc = dict(x.split(':') for x in c.split(','))
24 df_temp = pd.DataFrame(cc.items())
25 df_temp = df_temp.replace('"','',regex=True)
ValueError: dictionary update sequence element #13 has length 1; 2 is required
Below is the block that is throwing the error. I checked out some of the posts here, but they are code-specific. I'm not sure whether this is an input issue or a problem with the code.
df_final = pd.DataFrame()
i = 1
for file in files:
    try:
        s3 = session.resource('s3')
        key = file
        obj = s3.Object('my-bucket', key)
        n = obj.get()['Body'].read()
        gzipfile = BytesIO(n)
        gzipfile = gzip.GzipFile(fileobj=gzipfile)
        content = gzipfile.read()
        content = content.decode('utf-8')
        if len(content) > 0:
            content = re.findall(r"(?<=\{)(.*?)(?=\})", content)
            for c in content:
                c = c.split('"')
                for index, val in enumerate(c):
                    if index % 2 == 1:
                        c[index] = val.replace(':', '_').replace(',', '_')
                c = '"'.join(c)
                cc = dict(x.split(':') for x in c.split(','))
                df_temp = pd.DataFrame(cc.items())
                df_temp = df_temp.replace('"', '', regex=True)
                df_temp = df_temp.T
                new_header = df_temp.iloc[0]  # grab the first row for the header
                df_temp = df_temp[1:]  # take the data less the header row
                df_temp.columns = new_header
                df_final = pd.concat([df_final, df_temp])
    except Exception as e:
        print(e)
        raise e
    print(i)
    i = i + 1
Can you tell me what the issue is here? This used to work fine before. Should I change something, or can I ignore the error?
My guess is that your data is malformed. I'm guessing that at some point, x.split(':') is producing a list with only one element in it because there is no : in x, the string being split. This leads, during the creation of a dictionary from this data, to a single value being passed when a pair of values (for "key" and "value") is expected.
I would suggest that you fire up your debugger and either let the debugger stop when it hits this error, or figure out when it happens and get to a point where you're just about to have the error occur. Then look at the data that's being or about to be processed in your debugger display and see if you can find this malformed data that is causing the problem. You might have to run a prep pass on the data to fix this problem and others like it before running the line that is throwing the exception.
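For example, a sketch of one defensive way to build the dictionary, skipping (and logging) any fragment without a ':'; the to_dict helper below is hypothetical, not part of the original script:

def to_dict(c):
    """Build a dict from 'key:value,key:value' text, skipping malformed fragments."""
    pairs = {}
    for fragment in c.split(','):
        key, sep, value = fragment.partition(':')
        if not sep:  # no ':' in this fragment, so it has length 1 after splitting
            print('Skipping malformed fragment:', repr(fragment))
            continue
        pairs[key] = value
    return pairs

cc = to_dict(c)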
