I am trying to extract text data from images using the Google Cloud Vision API. My initial starting point was here. After enabling the Vision API, creating a service account, and generating the JSON key file, I created a script by referring to this example.
Here's my code
from google.cloud import vision
from google.cloud.vision_v1 import types
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'generated_after_creating_sec_key.json'
image_path = 'images\\image_1.png'
vision_client = vision.ImageAnnotatorClient()
print(vision_client)
image = types.Image()
image.source.image_path = image_path
response = vision_client.text_detection(image=image)
for text in response.text_annotations:
    print(text.description)
The only difference between the example shown on the Google page and my code is that the image in the example is on Cloud Storage, while mine is on local storage.
Here's the complete stacktrace.
<google.cloud.vision_v1.ImageAnnotatorClient object at 0x000001DF861D7970>
Traceback (most recent call last):
File "text_detection.py", line 10, in <module>
image.source.image_path = image_path
File "C:\Users\user\ProjectFolder\ProjName\venv\lib\site-packages\proto\message.py", line 677, in __setattr__
pb_type = self._meta.fields[key].pb_type
KeyError: 'image_path'
What is the root cause of this error? Please help!
Per the Google Cloud Vision docs, you want image.source.image_uri instead.
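Note that image_uri expects a gs:// URI (or a public URL), so for an image on local storage the usual route is to pass the raw bytes instead. A sketch under that assumption, using the current google-cloud-vision client (load_image_bytes and detect_text are hypothetical helper names, not library functions):

```python
import io


def load_image_bytes(path):
    """Read a local image file as raw bytes for the Vision API."""
    with io.open(path, 'rb') as f:
        return f.read()


def detect_text(path):
    # Imported here so the sketch stays importable without credentials set up.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    # For a local file, pass content= rather than setting source.image_uri.
    image = vision.Image(content=load_image_bytes(path))
    response = client.text_detection(image=image)
    return [t.description for t in response.text_annotations]


# Usage (requires GOOGLE_APPLICATION_CREDENTIALS to be set):
# for line in detect_text('images/image_1.png'):
#     print(line)
```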
I'm trying to access and modify data in a Google Spreadsheet using Python. I'm having trouble opening the Google Spreadsheet from Python. I closely followed various tutorials and prepared the following before writing any code.
Enabled Google Sheets API and Google Drive API on GCP Console
Generated and downloaded credentials (JSON file) from GCP Console
Spreadsheet: Shared (edit-access) with the client email found in the JSON file
Installed gspread and oauth2client --> pip install gspread oauth2client
The following is the Python code to interface with Google Sheets. The goal in Lines 12 and 13, is to output to the console all of the data found in the linked Google Spreadsheet.
1 import gspread
2 from oauth2client.service_account import ServiceAccountCredentials
3
4 scope = ["https://spreadsheets.google.com/feeds","https://www.googleapis.com/auth/spreadsheets","https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
5 creds = ServiceAccountCredentials.from_json_keyfile_name('Py-Sheets.json', scope)
6 client = gspread.authorize(creds)
7
8 print("Hello World")
9
10 sheet = client.open("Test-Sheets").sheet1
11
12 sample = sheet.get_all_records()
13 print(sample)
Everything seems to run fine up to line 10 (above), where I get an error saying SpreadsheetNotFound. Here's the error in full (below).
Traceback (most recent call last):
File "/home/username/anaconda3/lib/python3.7/site-packages/gspread/client.py", line 119, in open
self.list_spreadsheet_files(title),
File "/home/username/anaconda3/lib/python3.7/site-packages/gspread/utils.py", line 97, in finditem
return next((item for item in seq if func(item)))
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pysheets.py", line 10, in <module>
sheet = client.open("Test-Sheets").sheet1
File "/home/username/anaconda3/lib/python3.7/site-packages/gspread/client.py", line 127, in open
raise SpreadsheetNotFound
gspread.exceptions.SpreadsheetNotFound
I also received the following error via email.
DNS Error: 15698833 DNS type 'mx' lookup of python-spreadsheets-123456.iam.gserviceaccount.com responded with code NXDOMAIN Domain name not found: python-spreadsheets-123456.iam.gserviceaccount.com
How do I fix the error created after executing Line 10? The code is almost exactly the same as what I found in the tutorials. The spreadsheet is named exactly what I put in client.open(). Does the spreadsheet have to be in a specific Google Drive directory for it to be located?
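One way to narrow this down: SpreadsheetNotFound means gspread could not find a Drive file with that title among the files visible to the service account (list_spreadsheet_files is the same call shown failing in the traceback). A sketch of a quick check, assuming the same key file; find_by_title is a hypothetical helper:

```python
def find_by_title(files, title):
    """Return the Drive file entries whose name matches the given title."""
    return [f for f in files if f.get('name') == title]


def list_visible_spreadsheets(keyfile='Py-Sheets.json'):
    # Imports kept inside the function so the pure helper above is usable
    # without credentials.
    import gspread
    from oauth2client.service_account import ServiceAccountCredentials

    scope = ["https://spreadsheets.google.com/feeds",
             "https://www.googleapis.com/auth/drive"]
    creds = ServiceAccountCredentials.from_json_keyfile_name(keyfile, scope)
    client = gspread.authorize(creds)
    return client.list_spreadsheet_files()


# Usage:
# files = list_visible_spreadsheets()
# print(find_by_title(files, 'Test-Sheets') or 'Not visible to this account')
```

If the sheet does not appear in that listing, sharing it with the client_email from the JSON file has not taken effect.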
An alternative would be opening the spreadsheet by URL on Google Colab:
# Access Google Sheets as a data source.
from google.colab import auth
auth.authenticate_user()
import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
# At this point, you will have a link printed in your output which will direct you to a sign-in page. Pick the relevant google account and copy the provided link. Paste it in the provided line at the output section.
# Load your dataframe
import pandas as pd
wb = gc.open_by_url('https://docs.google.com/spreadsheets/.....') # A URL of your workbook.
sheet1 = wb.worksheet('Sheet1') # Enter your sheet name.
original_df = sheet1.get_all_values()
df = pd.DataFrame(original_df)
df.head()
I need (or at least I think I need) to create a file (it could be a temp file, but that did not work while I was testing it) into which I can copy a file stored in Google Cloud Storage.
This file is a GeoJSON file, and after loading it I will read it using geopandas.
The code will run inside a Kubernetes cluster on Google Cloud.
The code:
def geoalarm(self, input):
    from shapely.geometry import Point
    import uuid
    from google.cloud import storage
    import geopandas as gpd

    fp = open("XXX.geojson", "w+")
    storage_client = storage.Client()
    bucket = storage_client.get_bucket('YYY')
    blob = bucket.get_blob('ZZZ.geojson')
    blob.download_to_file(fp)
    fp.seek(0)
    PAIS = gpd.read_file(fp.name)
    (dictionaryframe, _) = input
    try:
        place = Point((float(dictionaryframe["lon"]) / 100000), (float(dictionaryframe["lat"]) / 100000))
        <...>
The questions are:
How could I create the file in Kubernetes?
Or, how could I use the content of the file as a string (if I use download_as_string) in geopandas, to do the equivalent of geopandas.read_file(name)?
Extra
I tried using:
PAIS = gpd.read_file("gs://bucket/xxx.geojson")
But I have the following error:
DriverError: '/vsigs/bucket/xxx.geojson' does not exist in the file system, and is not recognized as a supported dataset name.
A VERY general overview of the pattern:
You can start by putting the code in a git repository. On Kubernetes, create a deployment/pod with the ubuntu image, and in an initialization script install Python and your Python dependencies and pull your code, with the final line invoking python to run it. In the "command" attribute of your pod template, use /bin/bash to run that script. Assuming you have the correct credentials, you will be able to grab the file from Google Storage and process it. To debug, you can attach to the running container using "kubectl exec".
Hope this helps!
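On the temp-file part of the question: the standard-library tempfile module avoids leaving a stray XXX.geojson behind. A sketch under that assumption (the bucket and blob names are the placeholders from the question; requires google-cloud-storage and geopandas):

```python
import tempfile


def read_geojson_from_gcs(bucket_name, blob_name):
    """Download a GeoJSON blob into a temporary file and load it with geopandas."""
    from google.cloud import storage
    import geopandas as gpd

    client = storage.Client()
    blob = client.get_bucket(bucket_name).get_blob(blob_name)
    with tempfile.NamedTemporaryFile(suffix='.geojson') as tmp:
        blob.download_to_file(tmp)
        tmp.flush()  # make sure the bytes hit disk before geopandas reads them
        return gpd.read_file(tmp.name)
    # The temp file is deleted automatically when the with-block exits.


# Usage:
# PAIS = read_geojson_from_gcs('YYY', 'ZZZ.geojson')
```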
A solution that avoids creating a file in Kubernetes:
import json

import geopandas as gpd
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('buckte')
blob = bucket.get_blob('file.geojson')
string = blob.download_as_string()
# GeoDataFrame.from_features accepts a GeoJSON FeatureCollection dict.
PAIS = gpd.GeoDataFrame.from_features(json.loads(string))
I have Python code running on a Cloud Function (Cloud Functions for Python).
I'm processing an image in the cloud. Now I want to save that image to google-cloud-storage.
from google.cloud import storage
import cv2
from tempfile import NamedTemporaryFile
import os
client = storage.Client()
bucket = client.get_bucket('document')
image = cv2.imread('15.png')
with NamedTemporaryFile() as temp:
    print(type(temp.name), " ", temp.name)
    iName = "".join([str(temp.name), ".jpg"])
    print(iName)
    cv2.imwrite(iName, image)
    print(os.path.exists(str(iName)))
    blob = bucket.blob('document/Test15.jpg')
    blob.upload_from_filename(iName)
My output
<class 'str'>   /var/folders/5j/bqk11bqj4r55f8975mn__72m0000gq/T/tmplcgzaucx
/var/folders/5j/bqk11bqj4r55f8975mn__72m0000gq/T/tmplcgzaucx.jpg
True
I don't know what's going wrong. Can anyone suggest a solution?
Silly mistake. It turns out that
blob = bucket.blob('document/Test15.jpg')
creates another folder named document inside the bucket document,
so the actual path was document/document/Test15.jpg.
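To illustrate the point: blob names are always relative to the bucket root, so the bucket's own name must not be repeated in them (gcs_uri is a hypothetical helper for illustration, not part of the client library):

```python
def gcs_uri(bucket_name, blob_name):
    """Full object URI: the bucket name plus the blob name, which is
    always interpreted relative to the bucket root."""
    return 'gs://{}/{}'.format(bucket_name, blob_name)


# With a bucket already named 'document', repeating it in the blob name
# nests an extra folder:
#   gcs_uri('document', 'document/Test15.jpg')  ->  gs://document/document/Test15.jpg
# Passing just the object name writes to the bucket root:
#   gcs_uri('document', 'Test15.jpg')           ->  gs://document/Test15.jpg
```

So the fix in the code above is simply `blob = bucket.blob('Test15.jpg')`.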
I'm trying to use the Google Vision API, but I can't run my Python script without getting the following error:
google.auth.exceptions.DefaultCredentialsError: ('File /root/GoogleCloudStaff/apikey.json is not a valid json file.', ValueError('Invalid control character at: line 5 column 37 (char 172)',))
My python script:
import io
from google.cloud import vision
vision_client = vision.Client()
#file_name = "/var/www/FlaskApp/FlaskApp/static/"#'375px-Guido_van_Rossum_OSCON_2006_cropped.png'
file_name = '1200px-Guido_van_Rossum_OSCON_2006.jpg'
#file_name = "/var/www/FlaskApp/FlaskApp/static/cyou_pic_folders/cyou_folder_2017_11_16_10_26_18/pi_pic_lc_2017_11_16_10_26_1800049.png"
with io.open(file_name, 'rb') as image_file:
    content = image_file.read()

image = vision_client.image(content=content)
labels = image.detect_labels()
for label in labels:
    print(label.description)
Thanks very much!
DefaultCredentialsError indicates that acquiring default credentials failed. Have you done the initial setup properly?
Take a look at vision
The error you are facing might be caused by an issue with the service account key itself: \n is a control character in the key that signifies a new line, and a raw one inside a JSON string makes the file unparseable. To solve the error, you can either validate the content of the JSON file or download the key from Google Cloud again. The key can be downloaded by following these instructions.
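A quick way to try the first option, validating the key file with the standard json module (validate_key_file is a hypothetical helper; the path is the one from the error message):

```python
import json


def validate_key_file(path):
    """Return the service account email if the key file parses as JSON,
    or a description of the parse error if it does not."""
    try:
        with open(path) as f:
            key = json.load(f)
        return key.get('client_email', '<no client_email field>')
    except ValueError as e:
        # json.JSONDecodeError subclasses ValueError; a raw control
        # character in the key produces exactly the error in the question.
        return 'Invalid JSON: {}'.format(e)


# Usage:
# print(validate_key_file('/root/GoogleCloudStaff/apikey.json'))
```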
After acquiring the service account key, the environment variable GOOGLE_APPLICATION_CREDENTIALS has to be set depending on the Operating System used.
As the final step, run the following Python code which performs a labeling task using the Cloud Vision API. The service account key will be automatically used to authenticate the labeling request.
import io
import os
# Imports the Google Cloud client library
from google.cloud import vision
# Instantiates a client
client = vision.ImageAnnotatorClient()
# The name of the image file to annotate
file_name = os.path.abspath('path/to/file/sample.jpg')
# Loads the image into memory
with io.open(file_name, 'rb') as image_file:
    content = image_file.read()
image = vision.Image(content=content)
# Performs label detection on the image file
response = client.label_detection(image=image)
labels = response.label_annotations
print('Labels:')
for label in labels:
    print(label.description)
The code prints out the labels which the API returns. The Vision API can detect and extract information about entities in an image, across a broad group of categories.
You seem to be missing the authentication config. From Using the client library:
Using the client library
To run the client library, you must first set up authentication.
I am trying to run the quick start demo by Google Vision APIs on MacOS Sierra.
def run_quickstart():
    # [START vision_quickstart]
    import io
    import os

    # Imports the Google Cloud client library
    from google.cloud import vision

    # Instantiates a client
    vision_client = vision.Client()

    # The name of the image file to annotate
    file_name = os.path.join(
        os.path.dirname(__file__),
        'resources/wakeupcat.jpg')

    # Loads the image into memory
    with io.open(file_name, 'rb') as image_file:
        content = image_file.read()
    image = vision_client.image(content=content)

    # Performs label detection on the image file
    labels = image.detect_labels()
    print('Labels:')
    for label in labels:
        print(label.description)
    # [END vision_quickstart]


if __name__ == '__main__':
    run_quickstart()
The script looks as above. I am using a service account key file to authenticate. As the documentation suggested, I installed the google-cloud-vision dependencies via pip and set the environment variable with
export GOOGLE_APPLICATION_CREDENTIALS=/my_credentials.json
The environment variable is correctly set, but the script still raises:
oauth2client.client.HttpAccessTokenRefreshError: invalid_grant: Invalid JWT Signature.
There are similar questions about this error when using API keys, but none of them mention using a service account file.