How can I use AWS Textract with Python

I have tested almost every code example I can find on the Internet for Amazon Textract and I can't get it to work. I can upload and download a file to S3 from my Python client, so the credentials should be OK. Many of the errors point to some region failure, but I have tried every possible combination.
Here is one of the last test calls:
import boto3

def test_parse_3():
    # Document
    s3BucketName = "xx-xxxx-xx"
    documentName = "xxxx.jpg"
    # Amazon Textract client (no region is passed here, so the default configuration is used)
    textract = boto3.client('textract')
    # Call Amazon Textract
    response = textract.detect_document_text(
        Document={
            'S3Object': {
                'Bucket': s3BucketName,
                'Name': documentName
            }
        })
    print(response)
It seems pretty simple, but it generates this error:
botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the DetectDocumentText operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.
Any ideas what is wrong, and does someone have a working example? (I know the indentation in the example code may not be correct.)
I have also tested a lot of permission settings in AWS. The credentials are in a hidden file created by the AWS SDK.

I am sure you already know, but the bucket and object names are case sensitive. If you have verified that both the bucket name and the object name are correct, make sure the appropriate region is set alongside your credentials (for example in the AWS config file, or through the region_name argument when creating the client).
I tested just reading from S3 without including the region in the credentials, and I was able to list the objects in the bucket with no issues. I think this worked because S3 is supposed to be region agnostic. However, since Textract is region specific, you must define the region when using Textract to get the data from the S3 bucket.
I realize this was asked a few months ago, but I hope this sheds some light for others who face this issue in the future.
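
As a minimal sketch of the advice above, the Textract client can be created with an explicit region instead of relying on the default configuration. The helper names make_textract_client and detect_lines are hypothetical, and the region you pass is an assumption: it should be the region your bucket actually lives in.

```python
def make_textract_client(region_name):
    """Create a Textract client pinned to an explicit region."""
    # boto3 is imported lazily so detect_lines can be exercised without AWS access.
    import boto3
    return boto3.client("textract", region_name=region_name)


def detect_lines(textract_client, bucket, key):
    """Return the text of every LINE block Textract reports for an S3 object."""
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
```

With the placeholders from the question, a call would look like detect_lines(make_textract_client("eu-west-1"), "xx-xxxx-xx", "xxxx.jpg"), where "eu-west-1" is only an example region.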

Related

python boto3: AWS Rekognition is unable to access S3 bucket

I am trying to upload the image to S3 and then have AWS Rekognition fetch it from S3 for face detection, but Rekognition cannot do that.
Here is my code - uploading and then detecting:
import boto3

s3 = boto3.client('s3')
s3.put_object(
    ACL='public-read',
    Body=open('/Users/1111/Desktop/kitten800300/kitten.jpeg', 'rb'),
    Bucket='mobo2apps',
    Key='kitten_img.jpeg'
)

rekognition = boto3.client('rekognition')
response = rekognition.detect_faces(
    Image={
        'S3Object': {
            'Bucket': 'mobo2apps',
            'Name': 'kitten_img.jpeg',
        }
    }
)
This produces an error:
Unable to get object metadata from S3. Check object key, region and/or access permissions.
Why is that?
About the permissions: I am authorized with AWS root access keys, so I have full access to all resources.

Here are a few things you can check:
Make sure the region of the S3 bucket is the same as the region used for Rekognition. Otherwise, it won't work. The S3 service is global, but every bucket is created in a specific region. The same region should be used by the AWS clients.
Make sure the access keys of the user or role have the right set of permissions for the resource.
Make sure the file is actually uploaded.
Make sure there is no bucket policy applied that revokes access.
You can enable logging on your S3 bucket to see errors.
Make sure the bucket is not versioned. If versioned, specify the object version.
Make sure the object has the correct set of ACLs defined.
If the object is encrypted, make sure you have permission to use that KMS key to decrypt the object.
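
The first check in the list above can be sketched as follows: look up the bucket's region with get_bucket_location and create the Rekognition client in that same region. The helper names are hypothetical. S3 reports an empty LocationConstraint for buckets in us-east-1 and the legacy value "EU" for some older eu-west-1 buckets, so both are normalized here.

```python
def bucket_region(s3_client, bucket):
    """Return the region name a bucket was created in."""
    location = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    if location is None:
        # Buckets in us-east-1 come back with no LocationConstraint.
        return "us-east-1"
    if location == "EU":
        # Legacy value that S3 may return for eu-west-1.
        return "eu-west-1"
    return location


def rekognition_client_for_bucket(bucket):
    """Create a Rekognition client in the same region as the given bucket."""
    # boto3 is imported lazily so bucket_region can be tested without AWS access.
    import boto3
    region = bucket_region(boto3.client("s3"), bucket)
    return boto3.client("rekognition", region_name=region)
```

The caller needs permission to call GetBucketLocation on the bucket for this lookup to succeed.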

You have to wait a while until the image upload is done.
The code looks like it runs smoothly, so your JPEG starts to upload, and before the upload has finished, Rekognition starts to detect the face in the image. Since the upload is not finished when that code runs, it cannot find the object in S3. Add a short wait.
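
If you do want to wait, a waiter is usually more reliable than a fixed sleep. The sketch below uses the S3 client's object_exists waiter; the helper name and the delay values are assumptions. Note that put_object in boto3 is a blocking call that returns once S3 has answered the request, so this wait may turn out to be unnecessary and the cause may lie elsewhere (for example, the region).

```python
def wait_until_uploaded(s3_client, bucket, key, delay=1, max_attempts=20):
    """Block until S3 reports that the object exists, or raise after max_attempts."""
    waiter = s3_client.get_waiter("object_exists")
    waiter.wait(
        Bucket=bucket,
        Key=key,
        # Poll every `delay` seconds, up to `max_attempts` times.
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
```

With the names from the question, this would be called as wait_until_uploaded(s3, 'mobo2apps', 'kitten_img.jpeg') between the upload and the detect_faces call.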

How do I use a different AWS endpoint with another AWS region?

I am using AWS Rekognition on an S3 bucket of data that is currently located in the US-West-1 region. Unfortunately, AWS Rekognition is not supported in that region. I attempted to copy my bucket over to the US-West-2 region, but encountered difficulties in getting metadata. As such, my question is: how do I route my API call to another endpoint, specifically the endpoint 'https://rekognition.us-east-1.amazonaws.com', even though the bucket is based in another region? Any help or advice would be appreciated.
EDIT: I thought it may be relevant to mention, I am running this on Python.

Assuming you are using boto3 in your Python script, you should be able to select a region when you create your client. Try something similar to this:
re_client = boto3.client('rekognition', region_name='us-east-1')
If your question is whether you can use AWS Rekognition in one region to access a bucket in another region: as far as I know, you can't. However, you might be able to either migrate your bucket to the specific region, or use S3 cross-region replication to make the data available in both regions.
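
A different workaround, not part of the answer above, is to avoid the S3Object reference altogether: download the image with an S3 client and pass the raw bytes to Rekognition, which accepts an Image with a Bytes field. This is only a sketch under that assumption; the helper name is hypothetical, and images sent as bytes are subject to a smaller size limit (about 5 MB) than images referenced in S3.

```python
def detect_faces_from_other_region(s3_client, rekognition_client, bucket, key):
    """Fetch an image from S3 and send its bytes to Rekognition in any region."""
    # Read the whole object into memory; suitable only for small images.
    image_bytes = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    # Passing bytes means Rekognition never has to reach into the bucket itself.
    return rekognition_client.detect_faces(Image={"Bytes": image_bytes})
```

The two clients can then live in different regions, for example boto3.client('s3', region_name='us-west-1') and boto3.client('rekognition', region_name='us-east-1').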

How to properly use create_anonymous_client() function in google cloud storage python library for access on public buckets?

I made a publicly listable bucket on Google Cloud Storage. I can see all the keys if I list the bucket objects in the browser. I was trying to use the create_anonymous_client() function so that I can list the bucket keys in a Python script, but it gives me an exception. I looked everywhere and still can't find the proper way to use the function.
from google.cloud import storage
client = storage.Client.create_anonymous_client()
a = client.lookup_bucket('publically_listable_bucket')
a.list_blobs()
Exception I am getting:
ValueError: Anonymous credentials cannot be refreshed.
Additional query: Can I list and download the contents of public Google Cloud Storage buckets using boto3? If yes, how can I do it anonymously?

I was also struggling with this and couldn't find an answer anywhere online. It turns out you can access the bucket with just the bucket() method.
I'm not sure why, but this method can take several seconds sometimes.
client = storage.Client.create_anonymous_client()
bucket = client.bucket('publically_listable_bucket')
blobs = list(bucket.list_blobs())

This error means the bucket you are attempting to list does not grant the right permission. You must give the "Storage Object Viewer" or "Storage Legacy Bucket Reader" role to "allUsers".

Using Boto3 to upload a file to Amazon WorkDocs

According to the Amazon WorkDocs SDK page, you can use Boto3 to migrate your content to Amazon WorkDocs. I found the entry for the WorkDocs client in the Boto3 documentation, but every call seems to require an "AuthenticationToken" parameter. The only information I can find on AuthenticationToken is that it is supposed to be an "Amazon WorkDocs authentication token".
Does anyone know what this token is? How do I get one? Are there any code examples of using the WorkDocs client in Boto3?
I am trying to create a simple Python script that will upload a single document into WorkDocs, but there seems to be little to no information on how to do this. I was easily able to write a script that can upload/download files from S3, but this seems like something else entirely.

How to upload a file to S3 and make it public using boto3?

I am able to upload an image file using:
s3 = session.resource('s3')
bucket = s3.Bucket(S3_BUCKET)
bucket.upload_file(file, key)
However, I want to make the file public too. I tried looking for functions to set the ACL for the file, but it seems boto3 has changed its API and removed some functions. Is there a way to do it in the latest release of boto3?

To upload and set permission to publicly-readable in one step, you can use:
bucket.upload_file(file, key, ExtraArgs={'ACL':'public-read'})
See https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html#the-extraargs-parameter

I was able to do it using the ObjectAcl API:
s3 = boto3.resource('s3')
object_acl = s3.ObjectAcl('bucket_name','object_key')
response = object_acl.put(ACL='public-read')
For details: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#objectacl

Adi's way works. However, if you were like me, you might have run into an access denied issue. This is normally caused by broken permissions of the user.
I fixed it by adding the following to the Action array:
"s3:GetObjectAcl",
"s3:PutObjectAcl"

In recent versions of boto3, ACL is available as a regular parameter, both when using the S3 client and the resource, it seems. You can just specify ACL="public-read" without having to wrap it in ExtraArgs or use the ObjectAcl API.
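
As a sketch of what that looks like with the S3 client: put_object accepts ACL as a direct keyword argument, whereas upload_file still takes it through ExtraArgs. The helper name upload_public_bytes is hypothetical.

```python
def upload_public_bytes(s3_client, data, bucket, key):
    """Upload bytes to S3 and mark the object publicly readable in one call."""
    return s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        # Canned ACL names use a hyphen: "public-read", not "public_read".
        ACL="public-read",
    )
```

It would be called as upload_public_bytes(boto3.client('s3'), open(file, 'rb').read(), S3_BUCKET, key), using the variable names from the question.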

Set ACL="public-read" as mentioned above.
Also, make sure the Resource line of your bucket policy has both the bare ARN and the /* ARN formats. Not having both can cause strange permission problems.
...
"Resource": ["arn:aws:s3:::my_bucket/*", "arn:aws:s3:::my_bucket"]
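
Putting the last two answers together, a policy statement might be built like the sketch below. It is written as an identity-based (IAM user) policy; a bucket policy would also need a Principal element. The function names are hypothetical, and s3:PutObject is an assumed addition so that the upload itself is allowed; only the two ACL actions and the two Resource ARNs come from the answers above.

```python
import json


def acl_statement(bucket):
    """Build one Allow statement covering uploads and object ACL changes."""
    return {
        "Effect": "Allow",
        # s3:PutObject is an assumption; the ACL actions come from the answer above.
        "Action": ["s3:PutObject", "s3:GetObjectAcl", "s3:PutObjectAcl"],
        # Both the object-level ARN (/*) and the bare bucket ARN are listed.
        "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
    }


def policy_document(bucket):
    """Wrap the statement in a complete policy document."""
    return {"Version": "2012-10-17", "Statement": [acl_statement(bucket)]}


print(json.dumps(policy_document("my_bucket"), indent=2))
```

The printed JSON can be pasted into the policy editor or adapted to your own bucket name.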
