I am trying to run Python code that connects to the YouTube Data API v3 and extracts data from it. However, as soon as I run the code, it gives me the following error on the very first import:
File "C:/Users/asaxena/Desktop/py4e/Social Media Data Analytics/youtube_search.py", line 3, in <module>
from apiclient.discovery import build
ModuleNotFoundError: No module named 'apiclient'
I have already installed google-api-python-client in my working directory with the command: pip install --upgrade google-api-python-client
But it's not helping me run the code.
from apiclient.discovery import build
import argparse
import csv
import unidecode
def youtube_search(options):
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=DEVELOPER_KEY)
    search_response = youtube.search().list(q=options.q, part="id,snippet", maxResults=options.max_results).execute()
    videos = []
    channels = []
    playlists = []
    csvFile = open('video_result.csv','w')
    csvWriter = csv.writer(csvFile)
    csvWriter.writerow(["title","videoId","viewCount","likeCount","dislikeCount","commentCount","favoriteCount"])
    for search_result in search_response.get("items", []):
        if search_result["id"]["kind"] == "youtube#video":
            title = search_result["snippet"]["title"]
            title = unidecode.unidecode(title) # Dongho 08/10/16
            videoId = search_result["id"]["videoId"]
            video_response = youtube.videos().list(id=videoId,part="statistics").execute()
            for video_result in video_response.get("items",[]):
                viewCount = video_result["statistics"]["viewCount"]
                if 'likeCount' not in video_result["statistics"]:
                    likeCount = 0
                else:
                    likeCount = video_result["statistics"]["likeCount"]
                if 'dislikeCount' not in video_result["statistics"]:
                    dislikeCount = 0
                else:
                    dislikeCount = video_result["statistics"]["dislikeCount"]
                if 'commentCount' not in video_result["statistics"]:
                    commentCount = 0
                else:
                    commentCount = video_result["statistics"]["commentCount"]
                if 'favoriteCount' not in video_result["statistics"]:
                    favoriteCount = 0
                else:
                    favoriteCount = video_result["statistics"]["favoriteCount"]
                csvWriter.writerow([title,videoId,viewCount,likeCount,dislikeCount,commentCount,favoriteCount])
    csvFile.close()
At the end, I should be able to establish a successful connection with YouTube Data API v3, and extract data in a csv file.
You're importing a module that doesn't exist. According to the documentation, you should be using:
from googleapiclient.discovery import ...
instead of:
from apiclient.discovery import ...
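To check which of the two names the interpreter can actually find before changing the import, something like the following may help. It is only a sketch: the helper name available_top_level is made up for illustration, and it checks that a top-level module can be located, not that it imports cleanly.

```python
import importlib.util


def available_top_level(names):
    """Return the top-level module names that this interpreter can locate."""
    return [name for name in names if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    # Run this with the same interpreter that runs youtube_search.py.
    found = available_top_level(["googleapiclient", "apiclient"])
    print("Importable from this interpreter:", found or "neither name")
```

If neither name is listed, the package was probably installed for a different Python installation than the one running the script.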
I solved it:
I downloaded "google-api-python-client-master" separately into my working directory and installed it manually by running "setup.py install" from the command line.
I then installed "Unidecode-master" the same way, from inside the unzipped "google-api-python-client-master" folder, again by running "setup.py install" from the command line.
I then ran the code above, and it worked.
I believe you have to install the API folders separately, or it won't work. Hope this is useful.
My code works, but I think there must be a simpler solution.
Maybe someone knows one?
I can't find anything in the GitLab API about importing a package (a .py file) from GitLab.
In database_connect.py I have a function that connects to a database (as an example).
For example, I would like to be able to write something like:
database_connect = project.files.get(file_path='FUNCTION/DATABASE/database_connect.py', ref='main')
import database_connect
My code so far:
import gitlab
import base64
gl = gitlab.Gitlab('link',
                   private_token='API')
projects = gl.projects.list()
for project in projects[3:4]:
    # print(project) # prints all the meta data for the project
    print("Project: ", project.name)
    print("Gitlab URL: ", project.http_url_to_repo)
    database_connect = project.files.get(file_path='FUNCTION/DATABASE/database_connect.py', ref='main')
    database_connect_content = base64.b64decode(database_connect.content).decode("utf-8")
    database_connect_package = database_connect_content.replace('\\n', '\n')
    exec(database_connect_package)
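One possible way to get closer to a real import is to execute the downloaded source inside its own module object instead of the current namespace. The sketch below uses only the standard library. The helper name module_from_source and the sample source string are made up for illustration; in the code above the text would come from database_connect_content.

```python
import sys
import types


def module_from_source(name, source):
    """Build a module object from source text and register it in sys.modules."""
    module = types.ModuleType(name)
    # A pseudo filename makes tracebacks from the downloaded code readable.
    code = compile(source, f"<{name} from GitLab>", "exec")
    exec(code, module.__dict__)
    # After registration, a later "import <name>" returns this same object.
    sys.modules[name] = module
    return module


if __name__ == "__main__":
    # Stand-in for the decoded file content fetched through the GitLab API.
    sample_source = "def connect():\n    return 'connected (placeholder)'\n"
    database_connect = module_from_source("database_connect", sample_source)
    print(database_connect.connect())
```

This keeps the downloaded functions under database_connect.<name> rather than mixing them into the calling script's globals. Like the plain exec call, it runs whatever code the repository contains, so it should only be pointed at repositories you trust.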
I am using the Azure function below, which takes an HTTP trigger as input.
import logging
import azure.functions as func
from . import test
def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    name = req.params.get('name')
    strng = req.params.get('strng')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
    if name:
        return func.HttpResponse(f"Hello, {name}. This HTTP triggered function executed successfully.sum = {test.testfunc(strng)}")
    else:
        return func.HttpResponse(
            "This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response.",
            status_code=200
        )
Below is the test.py file, which I import in __init__.py.
import json
import pandas as pd
from pandas import DataFrame
from azure.storage.blob import AppendBlobService
from datetime import datetime
def testfunc(strng):
    # return strng
    API = json.loads(strng)
    test = pd.json_normalize(API, record_path='Vol', meta=['studyDate'])
    df = pd.DataFrame(test)
    df["x"] = df["Vol"] * 2
    df["y"] = df["Vol"] * 50
    df1 = df[['Date', 'x', 'y']]
    df2 = df1.to_json(orient='records')
    append_blob_service = AppendBlobService(account_name='actname',
                                            account_key='key')
    date = datetime.now()
    blobname = f"test_cal{date}.json"
    append_blob_service.create_blob('container', blobname, if_none_match="*")
    append_blob_service.append_blob_from_text('container', blobname, text=df2)
    return df2
The function above works when I run it in PyCharm and Databricks. But when I run the Azure function from Visual Studio Code, I get the error below.
Exception: ImportError: cannot import name 'AppendBlobService' from 'azure.storage.blob'
I have tried the following and still get the same error.
pip install azure-storage --upgrade
pip install azure-storage-blob
Kindly help with this error.
Is there any other way I can save the df2 variable to Azure Storage?
Thank you.
According to the documentation:
If the module (for example, azure-storage-blob) cannot be found, Python throws the ModuleNotFoundError. If it is found, there may be an issue with loading the module or some of its files. Python would throw a ImportError in those cases.
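The error in the question ("cannot import name") falls in the second category: the azure.storage.blob module was found, but the requested name is not in it. A small sketch that reproduces both cases with the standard library only (the helper name classify_import and the deliberately missing names are made up for illustration):

```python
def classify_import(statement):
    """Run an import statement and report which exception, if any, it raises."""
    try:
        exec(statement, {})
    except ModuleNotFoundError:
        # Subclass of ImportError, so it has to be caught first.
        return "ModuleNotFoundError"
    except ImportError:
        return "ImportError"
    return "ok"


if __name__ == "__main__":
    print(classify_import("import json"))
    print(classify_import("import no_such_module_hypothetical"))
    print(classify_import("from json import NoSuchName"))
```

The second statement fails because the module cannot be located at all, and the third fails because the module exists but does not define the name.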
Try setting the FUNCTIONS_WORKER_RUNTIME environment variable to python, or
try adding azure.core to your requirements.txt file.
References:
ImportError: cannot import name 'BlockBlobService' from 'azure.storage.blob'
Azure Functions fails running when deployed
The current version of the library provides BlobServiceClient instead of BlockBlobService. Pinning version 2.1.0 solved it for me:
pip install azure-storage-blob==2.1.0
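The question also asks whether df2 can be saved another way. If staying on the current library is preferred over pinning 2.1.0, a minimal sketch with the 12.x client could look like the following. It assumes azure-storage-blob 12.x is installed and that a storage connection string is available. The helper names make_blob_name and save_json_to_blob, and the timestamp format, are made up for illustration. It writes a regular block blob in one call rather than an append blob.

```python
from datetime import datetime, timezone


def make_blob_name(now):
    """Build a blob name with a compact timestamp (no spaces or colons)."""
    return f"test_cal_{now.strftime('%Y%m%dT%H%M%S')}.json"


def save_json_to_blob(connection_string, container, json_text):
    """Upload json_text as a new blob and return the blob name."""
    # Imported here so make_blob_name stays usable without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob_name = make_blob_name(datetime.now(timezone.utc))
    blob_client = service.get_blob_client(container=container, blob=blob_name)
    blob_client.upload_blob(json_text, overwrite=True)
    return blob_name


if __name__ == "__main__":
    # Only the naming helper is exercised here; uploading needs real credentials.
    print(make_blob_name(datetime(2024, 1, 2, 3, 4, 5)))
```

In testfunc, the call would be roughly save_json_to_blob(conn_str, 'container', df2), where conn_str is a placeholder for the account's connection string.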
I tried to run my script with pypy3 c.py, but the error below occurred.
I installed pyshark with pypy3 -m pip install pyshark, but it still fails:
pypy3 c.py
ModuleNotFoundError: No module named 'lxml.objectify'
import pyshark
import pandas as pd
import numpy as np
from multiprocessing import Pool
import re
import sys
temp_array = []
cap = pyshark.FileCapture("ddos_attack.pcap")
#print(cap._extract_packet_json_from_data(cap[0]))
def parse(capture):
    print(capture)
    packet_raw = [i.strip('\r').strip('\t').split(':') for i in str(capture).split('\n')]
    packet_raw = map(lambda num:[num[0].replace('(',''),num[1].strip(')').replace('(','')] if len(num)== 2 else [num[0],':'.join(num[1:])] ,[i for i in packet_raw])
    raw = list(packet_raw)[:-1]
    cols = [i[0] for i in raw]
    vals = [i[1] for i in raw]
    temp_array.append(dict(zip(cols,vals)))
    return dict(zip(cols,vals))
def preprocess_dataset(x):
    count = 0
    temp = []
    #print(list(cap))
    #p = Pool(5)
    #r = p.map(parse,cap)
    #p.close()
    #p.join()
    #print(r)
    try:
        for i in list(cap):
            temp.append(parse(i))
            count += 1
    except Exception:
        print("somethin")
    data = pd.DataFrame(temp)
    print(data)
    data = data[['Packet Length','.... 0101 = Header Length','Protocol','Time to Live','Source Port','Length','Time since previous frame in this TCP stream','Window']]
    # rename() returns a new DataFrame, so the result has to be assigned back
    data = data.rename(columns={".... 0101 = Header Length": 'Header Length'})
    filtr = ["".join(re.findall(r'\d.',str(i))) for i in data['Time since previous frame in this TCP stream']]
    data['Time since previous frame in this TCP stream'] = filtr
    print(data.to_csv('data.csv'))
Check your terminal settings.
Try running the script from another environment, such as PyCharm.
It seems lxml is not installed correctly. It is hard to tell what is going on, since you show only the last line of the traceback and do not say what platform you are on or which version of PyPy you are using. The lxml package is listed as a requirement for pyshark, so it should have been installed. What happens when you try import lxml?
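A short diagnostic along these lines would gather the missing details in one run. It is a sketch using only the standard library, and the helper name import_report is made up for illustration. Run it with the same interpreter that fails, for example pypy3.

```python
import importlib
import platform
import sys


def import_report(names):
    """Try to import each name and record 'ok' or the error that was raised."""
    report = {}
    for name in names:
        try:
            importlib.import_module(name)
            report[name] = "ok"
        except Exception as exc:  # broad on purpose: report any import-time failure
            report[name] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    # Interpreter implementation, version, and platform, as asked for above.
    print(platform.python_implementation(), sys.version.split()[0], platform.platform())
    for name, status in import_report(["lxml", "lxml.objectify", "pyshark"]).items():
        print(f"{name}: {status}")
```

If lxml imports but lxml.objectify does not, that points at an incomplete lxml build for this interpreter rather than a missing package.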
I came across the Python class ResourcesMoveInfo for moving resources (Azure images) from one subscription to another with the Azure Python SDK.
But it fails when I use it as shown below:
Pattern 1
reference from https://buildmedia.readthedocs.org/media/pdf/azure-sdk-for-python/v1.0.3/azure-sdk-for-python.pdf
Usage:
metadata = azure.mgmt.resource.resourcemanagement.ResourcesMoveInfo(resources=rid,target_resource_group='/subscriptions/{0}/resourceGroups/{1}'.format(self.prod_subscription_id,self.resource_group))
Error:
AttributeError: module 'azure.mgmt.resource' has no attribute 'resourcemanagement'
Pattern 2
reference from - https://learn.microsoft.com/en-us/python/api/azure-mgmt-resource/azure.mgmt.resource.resources.v2019_07_01.models.resourcesmoveinfo?view=azure-python
Usage:
metadata = azure.mgmt.resource.resources.v2020_06_01.models.ResourcesMoveInfo(resources=rid,target_resource_group='/subscriptions/{0}/resourceGroups/{1}'.format(self.prod_subscription_id,self.resource_group))
Error:
AttributeError: module 'azure.mgmt.resource.resources' has no attribute 'v2020_06_01'
Any help on this requirement/issue would be helpful. Thanks!
Adding code snippet here:
import sys
import os
import time
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.resource import ResourceManagementClient
import azure.mgmt.resource
#from azure.mgmt.resource.resources.v2020_06_01.models import ResourcesMoveInfo
from azure.identity import ClientSecretCredential
from cred_wrapper import CredentialWrapper
class Move():
    def __init__(self):
        self.nonprod_subscription_id = "abc"
        self.prod_subscription_id = "def"
        self.credential = ClientSecretCredential(
            client_id= os.environ["ARM_CLIENT_ID"],
            client_secret= os.environ["ARM_CLIENT_SECRET"],
            tenant_id= os.environ["ARM_TENANT_ID"]
        )
        #resource client for nonprod
        self.sp = CredentialWrapper(self.credential)
        self.resource_client = ResourceManagementClient(self.sp,self.nonprod_subscription_id)
        self.resource_group = "imgs-rg"
    def getresourceids(self):
        resource_ids = list(resource.id for resource in self.resource_client.resources.list_by_resource_group("{0}".format(self.resource_group)) if resource.id.find("latest")>=0)
        return resource_ids
    def getresourcenames(self):
        resource_names = list(resource.name for resource in self.resource_client.resources.list_by_resource_group("{0}".format(self.resource_group)) if resource.id.find("latest")>=0)
        return resource_names
    def deleteoldimages(self,name):
        #resource client id for prod
        rc = ResourceManagementClient(self.sp,self.prod_subscription_id)
        for resource in rc.resources.list_by_resource_group("{0}".format(self.resource_group)):
            if resource.name == name:
                #2019-12-01 is the latest api_version supported for deleting the resource type "image"
                rc.resources.begin_delete_by_id(resource.id,"2020-06-01")
                print("deleted {0}".format(resource.name))
    def moveimages(self):
        rnames = self.getresourcenames()
        for rname in rnames:
            print(rname)
            #self.deleteoldimages(rname)
        time.sleep(10)
        rids = self.getresourceids()
        rid = list(rids[0:])
        print(rid)
        metadata = azure.mgmt.resource.resources.v2020_06_01.models.ResourcesMoveInfo(resources=rid,target_resource_group='/subscriptions/{0}/resourceGroups/{1}'.format(self.prod_subscription_id,self.resource_group))
        #moving resources in the rid from nonprod subscription to prod subscription under the resource group avrc-imgs-rg
        if rid != []:
            print("moving {0}".format(rid))
            print(self.resource_client.resources.move_resources(source_resource_group_name="{0}".format(self.resource_group),parameters=metadata))
            #self.resource_client.resources.move_resources(source_resource_group_name="{0}".format(self.resource_group),resources=rid,target_resource_group='/subscriptions/{0}/resourceGroups/{1}'.format(self.prod_subscription_id,self.resource_group))
            #self.resource_client.resources.begin_move_resources(source_resource_group_name="{0}".format(self.resource_group),parameters=metadata)
if __name__ == "__main__":
    Move().moveimages()
From your inputs, the code itself looks fine. Judging by your error messages, the problem is with importing the modules.
When a package is installed and imported, some of its submodules come along with it and some do not. This depends on the version of the package. To find out which modules a specific version includes, check the version releases in the official documentation.
In your case, it looks like some resource modules are missing. The full error trace should contain a path into your local site-packages directory. Find that package and its subfolders (modules) there and compare them with the Resource module in the Azure SDK for Python, which you can find here.
In that situation, the missing submodules have to be added to the package explicitly. In your case, you can simply download the packaged code from the Git link I gave and merge it into your local folder.
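Separately from a missing install, an AttributeError like the ones in the question can also appear when a submodule exists on disk but was never imported explicitly. Importing a package does not load its submodules unless the package's own __init__.py does so. Whether that is what happens in the Azure case is an assumption, not something the traceback confirms. A minimal sketch with a throwaway package (the names demo_pkg_hypothetical and MoveInfo are made up and have nothing to do with the Azure SDK):

```python
import importlib
import sys
import tempfile
from pathlib import Path


def submodule_visibility():
    """Return (visible_before, visible_after) for an explicit submodule import."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg_hypothetical"
        (pkg / "models").mkdir(parents=True)
        (pkg / "__init__.py").write_text("")  # does not import .models
        (pkg / "models" / "__init__.py").write_text("class MoveInfo:\n    pass\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            top = importlib.import_module("demo_pkg_hypothetical")
            before = hasattr(top, "models")  # attribute lookup alone fails
            importlib.import_module("demo_pkg_hypothetical.models")
            after = hasattr(top, "models")  # explicit import binds the attribute
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.startswith("demo_pkg_hypothetical")]:
                del sys.modules[name]
    return before, after


if __name__ == "__main__":
    print(submodule_visibility())
```

If the versioned models module is present in site-packages, importing ResourcesMoveInfo from its full dotted path, as in the commented-out line of the question, would be the matching thing to try. If that import fails with ModuleNotFoundError, the installed version does not ship that module.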
So, I am trying to get the number of sessions for each traffic source from Google Analytics for a website, but I ran into this error:
Traceback (most recent call last):
  File "/home/mrgrj/PycharmProjects/google-api/get_sessions.py", line 3, in <module>
    from analytics_service_object import initialize_service
ImportError: No module named analytics_service_object
My code so far looks like this:
from analytics_service_object import initialize_service
def get_source_group(service, profile_id, start_date, end_date):
    ids = "ga:" + profile_id
    metrics = "ga:sessions"
    dimensions = "ga:channelGrouping"
    data = service.data().ga().get(
        ids=ids, start_date=start_date, end_date=end_date, metrics=metrics,
        dimensions=dimensions).execute()
    return dict(
        data["rows"] + [["total", data["totalsForAllResults"][metrics]]])
if __name__ == '__main__':
    service = initialize_service()
    profile_id = "your_profile_id"
    start_date = "2014-09-01"
    end_date = "2014-09-30"
    data = get_source_group(service, profile_id, start_date, end_date)
    for key, value in data.iteritems():
        print key, value
I have also installed the Google API Python client library:
pip install -U google-api-python-client
pip install python-gflags
What am I missing?
For what it's worth, all of these imports work:
from apiclient.discovery import build
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client import tools