botocore.errorfactory.InvalidS3ObjectException - python

I have AWS Rekognition code written in Python, and it is run by a Node API. It works fine on Windows, but when I deploy it on Linux I'm facing this issue: botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the DetectText operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.
I have attached both the AmazonRekognitionFullAccess and AmazonS3ReadOnlyAccess policies to my IAM user. I still don't know how to get things going.
Python code:
import boto3

bucket = 'image-test'

def image_to_dict(fileName, bucket):
    client = boto3.client('rekognition', 'us-east-2')
    response = client.detect_text(Image={'S3Object': {'Bucket': bucket,
                                                      'Name': fileName}})
    return response
Node code used to run the Python script:
var options = {
    mode: 'text',
    pythonPath: "/usr/bin/python2.7",
    pythonOptions: ['-u'],
    scriptPath: "/home/ubuntu/test",
    args: [imageURl]
};

PythonShell.run('script.py', options, function (err, results) {
    if (err)
        throw err;
    console.log("Data is: " + results);
});
I have Python version 2.7 installed on my Ubuntu, pip version 10.0.1.

Thanks for the help.
The reason behind the issue was that when I was passing the image name as an argument from the Node API, the name was getting manipulated by some substring logic. So when the Python script went looking for that manipulated name in the S3 bucket, it threw the above error because no object with that name existed in the bucket.
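For anyone hitting the same error, a quick way to confirm what the script actually receives is to log the argument and do a cheap head_object check against S3 before calling Rekognition. This is a minimal sketch, not the original script; the bucket name and argument handling are assumptions based on the code above.

import sys
import boto3
from botocore.exceptions import ClientError

bucket = 'image-test'
fileName = sys.argv[1]  # the key passed in from the Node API
print('Received key: %r' % fileName)

s3 = boto3.client('s3', 'us-east-2')
try:
    # HEAD request: verifies the object really exists under that exact key
    s3.head_object(Bucket=bucket, Key=fileName)
except ClientError:
    print('Object %s not found in bucket %s' % (fileName, bucket))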

Related

reading credentials from jenkins credential manager and integrating with python script

I have a python script which is making a couple of api calls and returning the response to me via email. I want to run this script through a jenkins pipeline job.
I have a token, which I have stored in the jenkins credential manager as a secret text.
The problem is that I am unsure as how to fetch this token in my python script. I have tried looking at a number of solutions, but all of those are leaving me confused.
This is what my jenkins pipeline looks like:
pipeline {
    agent {
        node {
            label 'node1'
        }
    }
    environment {
        // I have saved the credential with id 'devadrita-stross', and this, I understand, fetches it for my pipeline
        deva_stross_token = credentials('devadrita-stross')
    }
    stages {
        stage('running python script') {
            steps {
                script {
                    bat """
                    python -u C://Users//Administrator//Desktop//stross//stross-script.py
                    """
                }
            }
        }
    }
}
But what changes should I make to fetch it to my script?
Here is the python script.
import requests
import urllib3
import json
import time
import os

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def initiateScan():
    url = ""
    payload = {}
    files = [
        ('source', ('amail.zip', open('C:/Users/Administrator/Desktop/stross/amail.zip', 'rb'), 'application/zip')),
        ('metadata', ('metadata.json', open('C:/Users/Administrator/Desktop/stross/metadata.json', 'rb'), 'application/json'))
    ]
    headers = {
        'Authorization': ' Bearer **<token required here>**'
    }
    response = requests.request("POST", url, headers=headers, data=payload, files=files, verify=False)
    resp = response.json()
    print(resp)
    jobId = resp["job_id"]
    return(jobId)

def main():
    jobIdFromInitiate = initiateScan()

main()
Thank you in advance for your help!
EDIT-
I understand that there are a few ways to do this, one with credentials() and one with environment variables. The problem is that I am mixing the two up and am not able to implement either method through code. So please help with code.
EDIT 1-
Dividing #rok's answer into steps to follow-
"(1)Add withCredentials block to your Jenkinsfile for reading the credentials and (2)binding them to environment variables. Then, (3)pass the values of the environment variables to your python script as arguments."
pipeline {
    agent {
        node {
            label 'node1'
        }
    }
    environment {
        deva_stross_token = credentials('devadrita-stross')
    }
    withCredentials([string(credentialsId: 'devadrita-stross', variable: 'deva-stross-token')]) {
        // some block
    }
    stages {
        stage('saving stross token and printing') {
            steps {
                script {
                    bat """
                    python -u C://Users//Administrator//Desktop//stross//stross-script.py
                    """
                }
            }
        }
    }
}
(1) added withCredentials block for reading credentials
(2) what is meant by binding the credentials with environment variables? Did the withCredentials block bind the token to environment variable 'deva-stross-token'? Is my environment block useful or needed here?
(3) how to pass values of the environment variables to my python script?
I feel like I should add that I am relatively new to coding, so please bear with me if my questions seem very basic.
Use the Credentials Binding plugin. Add a withCredentials block to your Jenkinsfile for reading the credentials and binding them to environment variables. Then, pass the values of the environment variables to your Python script as arguments.
Jenkins docs list supported credentials types and various bindings.
This is what worked for me:
Jenkinsfile:
stages {
    stage('run stross script') {
        steps {
            withCredentials([string(credentialsId: 'devadrita-stross', variable: 'my_stross_token')]) {
                echo "My stross token is '${my_stross_token}'!"
                bat "python C:\\Users\\Administrator\\Desktop\\stross\\stross-script.py "
            }
        }
    }
}
This binds my credential to the specified environment variable.
I then read the variable in my Python script and use it like any other variable, i.e. pass it to functions, etc.
Here is a code snippet:
def main():
    token = os.environ['my_stross_token']
    print(token)
    jobIdFromInitiate = initiateScan(token)
    getStatus(jobIdFromInitiate, token)
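If you would rather follow the answer's suggestion literally and pass the value as a command-line argument instead of reading the environment, you can append the bound variable to the bat line (e.g. bat "python -u C:\\Users\\Administrator\\Desktop\\stross\\stross-script.py %my_stross_token%") and read it with sys.argv. A minimal sketch of the Python side, assuming the token arrives as the first argument:

# stross-script.py (argument-passing variant)
import sys

def main():
    token = sys.argv[1]                      # first argument from the bat step
    jobIdFromInitiate = initiateScan(token)  # pass it on like any other value
    getStatus(jobIdFromInitiate, token)

if __name__ == '__main__':
    main()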

Google cloud identity api to get company owned devices list

I am trying to get a list of company-owned devices via https://cloud.google.com/identity/docs/reference/rest/v1/devices/list using Go.
I have tried using a service account with domain-wide delegation, and even adding scopes.
It seems that the code below always gives empty values (no device list).
Sample Go code that I used:
cloudidentityService, err := cloudidentity.NewService(context.Background(),
    option.WithCredentialsFile("test.json"),
    option.WithScopes("https://www.googleapis.com/auth/cloud-identity.devices.lookup",
        "https://www.googleapis.com/auth/cloud-identity.groups",
        "https://www.googleapis.com/auth/cloud-identity.groups.readonly",
        "https://www.googleapis.com/auth/cloud-platform"))
if err != nil {
    fmt.Println(err)
    return
}
addevices, err := cloudidentityService.Devices.List().Do()
if err != nil {
    fmt.Println(err)
    return
}
if len(addevices.Devices) == 0 {
    fmt.Println("empty")
} else {
    for _, u := range addevices.Devices {
        fmt.Println(u.Name)
    }
}
Please advise, I'm stuck badly.
The reference code I used is https://github.com/googleapis/google-api-go-client/blob/master/cloudidentity/v1beta1/cloudidentity-gen.go
If anyone can advise how to do this via Postman, that would be great too.

Using python for put_item in dynamodb table with Lambda function

I'm writing my first Lambda function to put an item with Python.
The problem I'm having: the input from the registration form (front end hosted on an S3 bucket with static website hosting enabled) is not reaching the DynamoDB table.
The data is sent to the API hosted on AWS with this function:
const BASE_URL = `API-URL`;

function handleForm() {
    const name = document.querySelector('#name').value;
    const email = document.querySelector('#email').value;
    const phone = document.querySelector('#phone').value;
    const data = {
        name,
        email,
        phone
    }
    console.log(data);
    saveDataToAWS(data);
}

async function saveDataToAWS(data) {
    const result = await axios.post(BASE_URL, data);
    return result.data;
}
I'm not sure I'm using axios the right way, but let's continue.
The Lambda function I'm using now is pretty much this:
import json
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('register')

def lambda_handler(event, context):
    table.put_item(
        Item={
            'name': event['name'],
            'email': event['email'],
            'phone': event['phone']
        }
    )
    respone = {
        'mes': 'Good input !'
    }
    return {
        'statusCode': 200,
        'body': respone
    }
I'm pretty much 99% new to writing code on AWS, so I'm sure I'm doing most of it wrong.
Really looking for your help!
The 'event' attribute has a 'body' parameter that will contain the data in your example:
data = json.loads(event["body"])

table.put_item(
    Item={
        'name': data['name'],
        'email': data['email'],
        'phone': data['phone']
    }
)
Remember to check CloudWatch Logs as well, as it will tell you whether the Lambda was invoked in the first place, and if it failed.
More information on the structure of the event-attribute can be found here:
https://aws-lambda-for-python-developers.readthedocs.io/en/latest/02_event_and_context/
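Putting that together, a minimal sketch of the handler might look like the following. This assumes the API Gateway proxy integration, where the incoming body is a JSON string and the response body must also be a string.

import json
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('register')

def lambda_handler(event, context):
    # With the proxy integration, the form data arrives as a JSON string in event["body"]
    data = json.loads(event['body'])

    table.put_item(
        Item={
            'name': data['name'],
            'email': data['email'],
            'phone': data['phone']
        }
    )

    # The response body must be a string, so serialize it with json.dumps
    return {
        'statusCode': 200,
        'body': json.dumps({'mes': 'Good input !'})
    }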

AWS Lambda keeps returning "\"Hello from Lambda!\""

I'm having some issues with AWS Lambda for Python 3.8. No matter what code I try running, AWS Lambda keeps returning the same response. I am trying to retrieve information from a DynamoDB table with the code below:
import json
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('planets')

def lambda_handler(event, context):
    response = table.get_item(
        Key={
            'id': 'mercury'
        }
    )
    print(response)
    # TODO implement
    return {
        'statusCode': 200,
        'body': response
    }
I am expecting an output like 'body':{'Item': {'id':'mercury', 'temp':'sizzling hot'}}, or an error even, but I keep getting the response below:
Response:
{
"statusCode": 200,
"body": "\"Hello from Lambda!\""
}
I even changed the code, expecting an error, but I still get the same output.
Usually this is due to one of the following reasons:
You are not deploying your code changes. In the new UI, you have to explicitly deploy your function using the orange Deploy button.
You are invoking an old Lambda version rather than your latest version, if you are versioning your functions. You must explicitly choose the correct version to invoke.
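As a quick check for the second case, you can invoke a specific published version from Python with boto3's Qualifier parameter. This is just a sketch; the function name and version number are placeholders.

import json
import boto3

client = boto3.client('lambda')

# Invoke version 3 of the function explicitly; omit Qualifier (or use '$LATEST')
# to hit the most recently deployed code
result = client.invoke(
    FunctionName='my-planets-function',  # placeholder name
    Qualifier='3',                       # placeholder version
    Payload=json.dumps({})
)
print(result['Payload'].read())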

Extracting BigQuery Data From a Shared Dataset

Is it possible to extract data (to Google Cloud Storage) from a shared dataset (where I only have view permissions) using the client APIs (Python)?
I can do this manually using the web browser, but cannot get it to work using the APIs.
I have created a project (MyProject) and a service account for MyProject to use as credentials when creating the service using the API. This account has view permissions on a shared dataset (MySharedDataset) and write permissions on my google cloud storage bucket. If I attempt to run a job in my own project to extract data from the shared project:
job_data = {
    'jobReference': {
        'projectId': myProjectId,
        'jobId': str(uuid.uuid4())
    },
    'configuration': {
        'extract': {
            'sourceTable': {
                'projectId': sharedProjectId,
                'datasetId': sharedDatasetId,
                'tableId': sharedTableId,
            },
            'destinationUris': [cloud_storage_path],
            'destinationFormat': 'AVRO'
        }
    }
}
I get the error:
googleapiclient.errors.HttpError: https://www.googleapis.com/bigquery/v2/projects/sharedProjectId/jobs?alt=json
returned "Value 'myProjectId' in content does not agree with value
sharedProjectId'. This can happen when a value set through a parameter
is inconsistent with a value set in the request.">
Using the sharedProjectId in both the jobReference and sourceTable I get:
googleapiclient.errors.HttpError: https://www.googleapis.com/bigquery/v2/projects/sharedProjectId/jobs?alt=json
returned "Access Denied: Job myJobId: The user myServiceAccountEmail
does not have permission to run a job in project sharedProjectId">
Using myProjectId for both the job immediately comes back with a status of 'DONE' and with no errors, but nothing has been exported. My GCS bucket is empty.
If this is indeed not possible using the API, is there another method/tool that can be used to automate the extraction of data from a shared dataset?
* UPDATE *
This works fine using the API explorer running under my GA login. In my code I use the following method:
service.jobs().insert(projectId=myProjectId, body=job_data).execute()
and removed the jobReference object containing the projectId
job_data = {
    'configuration': {
        'extract': {
            'sourceTable': {
                'projectId': sharedProjectId,
                'datasetId': sharedDatasetId,
                'tableId': sharedTableId,
            },
            'destinationUris': [cloud_storage_path],
            'destinationFormat': 'AVRO'
        }
    }
}
but this returns the error
Access Denied: Table sharedProjectId:sharedDatasetId.sharedTableId: The user 'serviceAccountEmail' does not have permission to export a table in
dataset sharedProjectId:sharedDatasetId
My service account is now an owner on the shared dataset and has edit permissions on MyProject. Where else do permissions need to be set, or is it possible to use the Python API with my GA login credentials rather than the service account?
* UPDATE *
Finally got it to work. How? Make sure the service account has permissions to view the dataset (and if you don't have access to check this yourself and someone tells you that it does, ask them to double check/send you a screenshot!)
After trying to reproduce the issue, I was running into parse errors.
I did, however, play around with the API on the Developer Console [2] and it worked.
What I did notice is that the request code below had a different format than the documentation on the website, as it has single quotes instead of double quotes.
Here is the code that I ran to get it to work.
{
    'configuration': {
        'extract': {
            'sourceTable': {
                'projectId': "sharedProjectID",
                'datasetId': "sharedDataSetID",
                'tableId': "sharedTableID"
            },
            'destinationUri': "gs://myBucket/myFile.csv"
        }
    }
}
HTTP Request
POST https://www.googleapis.com/bigquery/v2/projects/myProjectId/jobs
If you are still running into problems, you can try the jobs.insert API on the website [2] or the bq command-line tool [3].
The following command can do the same thing:
bq extract sharedProjectId:sharedDataSetId.sharedTableId gs://myBucket/myFile.csv
Hope this helps.
[2] https://cloud.google.com/bigquery/docs/reference/v2/jobs/insert
[3] https://cloud.google.com/bigquery/bq-command-line-tool
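For completeness, the same extract can also be scripted with the newer google-cloud-bigquery client library instead of the raw discovery API. A rough sketch, with the project, dataset, table, and bucket names as placeholders:

from google.cloud import bigquery

# Credentials are taken from GOOGLE_APPLICATION_CREDENTIALS (the service
# account key file); the extract job itself runs in myProjectId
client = bigquery.Client(project="myProjectId")

source_table = "sharedProjectId.sharedDatasetId.sharedTableId"  # shared table
destination_uri = "gs://myBucket/myFile.avro"

extract_job = client.extract_table(
    source_table,
    destination_uri,
    job_config=bigquery.ExtractJobConfig(destination_format="AVRO"),
)
extract_job.result()  # wait for the export to finish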
