Upload file from html when block public access is true - python

I am using django-s3direct for file uploads:
https://github.com/bradleyg/django-s3direct
I am using an IAM role because I upload the file from a server running in an ECS container.
Block Public Access is now enabled (true) on the S3 bucket.
When I upload images from the HTML form, I get the following error:
https://s3.ap-northeast-1.amazonaws.com/static-resource-v/images/c64d6e593de44aa5b10dcf1766582547/_origin.jpg?uploads 403 (Forbidden)
initiate error: static-resource-v/line-assets/images/c64d6e593de44aa5b10dcf1766582547/_origin.jpg AWS Code: AccessDenied, Message: Access Denied, status: 403
OK, that much is understandable: the browser tries to access the bucket to initiate the upload.
However, is there any way to upload a file from the browser when Block Public Access is true?
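Since Block Public Access targets anonymous/public access rather than signed requests, one common approach is to have the server (which holds the IAM role credentials) generate a presigned POST and hand it to the browser. Below is a minimal hedged sketch using boto3; it is not django-s3direct's own code, and the bucket name and key are placeholders:

import boto3

# Minimal sketch: the server, which holds the IAM role credentials, generates a
# presigned POST and hands the URL + fields to the browser. The browser then
# uploads directly to S3 with a signed (non-anonymous) request.
s3 = boto3.client("s3", region_name="ap-northeast-1")

presigned = s3.generate_presigned_post(
    Bucket="static-resource-v",        # placeholder bucket name
    Key="images/example/_origin.jpg",  # placeholder object key
    ExpiresIn=3600,                    # signature valid for one hour
)

# Return these to the front end; the browser POSTs the file to presigned["url"]
# together with presigned["fields"] as form data.
print(presigned["url"])
print(presigned["fields"])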

Related

Python Django s3 error "file must be encoded before hashing"

We send a file via an API.
When the file is saved locally or on the same EC2 instance, everything works fine and we get the response from the API.
When the file is saved on AWS S3, we get the error 'Unicode-objects must be encoded before hashing'.
This is the code that works to open and send the file from the local device, but not when getting the file from S3:
my_file = self.original_file.open(mode='rb').read()
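The error usually means a str reached hashlib (or the upload client) where bytes were expected. A hedged sketch of one way to guard against that, assuming original_file is a Django FieldFile backed by either local storage or S3 (the helper name is made up for illustration):

import hashlib

def file_digest(django_file):
    # Hedged sketch: read a Django FieldFile in binary mode and force bytes
    # before hashing, regardless of the storage backend (local disk or S3).
    django_file.open(mode='rb')
    content = django_file.read()
    django_file.close()
    if isinstance(content, str):
        # Some storage backends can hand back text; hashing needs bytes.
        content = content.encode('utf-8')
    return hashlib.md5(content).hexdigest()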

How to directly upload files from google drive to amazon s3?

I want to directly upload some files from Google Drive to Amazon S3, but I am unable to do so.
I don't know how I can directly upload the files from Google Drive to Amazon S3.
I tried getting the download link using Python and the Google API, but when I try to upload to Amazon S3 I get errors:
axios
  .get("https://drive.google.com/u/0/uc?id=" + id + "&export=download", {
    responseType: 'arraybuffer',
    withCredentials: false,
    headers: {
      'Authorization': 'Bearer: <access_token>',
      'Content-Type': 'text/plain'
    }
  })
  .then((response: any) => Buffer.from(response.data, 'binary'))
  .then((data: any) => {
    console.log(data)
  })
ERROR:
has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.
Can anyone please tell me how I can resolve this error?
Once you have the data from Drive, you can use the aws-sdk package to create a client:
s3 = new AWS.S3(....)
and then upload with the s3.putObject method.
Managed to find the answer, and posting it here for anyone who is looking for the same:
First you need a downloadable link, which you can construct from the file id:
"https://www.googleapis.com/drive/v3/files/" + file.id + '?alt=media'
Then you make a GET request with your Google access token in the header:
'Authorization', 'Bearer ' + accessToken
This way you can download the file from Google Drive and then upload it to S3, as in the sketch below.
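For reference, here is a hedged Python sketch of the same flow (the file id, access token, bucket and key are placeholders, not values from the question):

import boto3
import requests

FILE_ID = '<drive-file-id>'             # placeholder
ACCESS_TOKEN = '<google-access-token>'  # placeholder

# Download the file bytes from the Drive v3 files endpoint.
resp = requests.get(
    'https://www.googleapis.com/drive/v3/files/' + FILE_ID + '?alt=media',
    headers={'Authorization': 'Bearer ' + ACCESS_TOKEN},
)
resp.raise_for_status()

# Upload the raw bytes to S3 (placeholder bucket and key).
s3 = boto3.client('s3')
s3.put_object(Bucket='my-bucket', Key='from-drive/file.bin', Body=resp.content)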

Connecting to an API at azurewebsites.net to download a JSON file

I need to connect to an API on azurewebsites.net using Python to download a JSON file automatically.
I can access the website and download the JSON file manually.
I tried to connect using:
url = 'https://myplatformconnectiot.azurewebsites.net/swagger/index.html'
r = requests.get(url, headers={"Authentication": " application/json"}, cookies={}, auth=('user#example.com', 'password'))
r.json()
Do you know how to download a JSON file from azurewebsites.net using Python?
You need to use the Kudu console URL to download a particular file from a web app.
Using the Python code below you can download the file from the web app:
import requests

# Kudu (SCM) URL of the file to download from the web app
url = 'https://<webappname>.scm.azurewebsites.net/wwwroot/wwwroot/css/site.css'

# Authenticate with the web app's publish-profile (deployment) credentials
r = requests.get(url, auth=('username', 'urlpassword'))

with open(r'C:\Users\name.json', 'wb') as f:
    f.write(r.content)
The username and password come from the web app's publish profile credentials file; you can download the publish profile from the Azure portal.
Kudu is the engine behind a number of features in Azure App Service related to source control based deployment, and other deployment methods like Dropbox and OneDrive sync.
For more information about Kudu, refer to the Kudu documentation.

Transfer data from a S3 bucket to a GCP bucket using temporary credentials

I would like to download a public dataset from the NIMH Data Archive. After creating an account on their website and accepting their Data Usage Agreement, I can download a CSV file which contains the path to all the files in the dataset I am interested in. Each path is of the form s3://NDAR_Central_1/....
1. Download on my personal computer
In the NDA Github repository, the nda-tools Python library exposes some useful Python code to download those files to my own computer. Say I want to download the following file:
s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz
Given my username (USRNAME) and password (PASSWD) (the ones I used to create my account on the NIMH Data Archive), the following code allows me to download this file to TARGET_PATH on my personal computer:
from NDATools.clientscripts.downloadcmd import configure
from NDATools.Download import Download
config = configure(username=USRNAME, password=PASSWD)
s3Download = Download(TARGET_PATH, config)
target_fnames = ['s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz']
s3Download.get_links('paths', target_fnames, filters=None)
s3Download.get_tokens()
s3Download.start_workers(False, None, 1)
Under the hood, the get_tokens method of s3Download uses USRNAME and PASSWD to generate a temporary access key, secret key and security token. Then, the start_workers method uses the boto3 and s3transfer Python libraries to download the selected file.
Everything works fine!
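For illustration, a hedged sketch of what that boils down to with boto3 directly, once the temporary credentials have been generated (all values are placeholders; this is not nda-tools' actual code):

import boto3

# Hedged sketch: pass the temporary credentials straight to boto3 and
# download one object from the NDAR bucket.
s3 = boto3.client(
    's3',
    aws_access_key_id='<temporary access key>',
    aws_secret_access_key='<temporary secret key>',
    aws_session_token='<temporary security token>',
)
s3.download_file(
    'NDAR_Central_1',
    'submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz',
    '10263603.tar.gz',
)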
2. Download to a GCP bucket
Now, say I created a project on GCP and would like to directly download this file to a GCP bucket.
Ideally, I would like to do something like:
gsutil cp s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz gs://my-bucket
To do this, I execute the following Python code in the Cloud Shell (by running python3):
from NDATools.TokenGenerator import NDATokenGenerator
data_api_url = 'https://nda.nih.gov/DataManager/dataManager'
generator = NDATokenGenerator(data_api_url)
token = generator.generate_token(USRNAME, PASSWD)
This gives me the access key, the secret key and the session token. Indeed, in the following,
ACCESS_KEY refers to the value of token.access_key,
SECRET_KEY refers to the value of token.secret_key,
SECURITY_TOKEN refers to the value of token.session.
Then, I set these credentials as environment variables in the Cloud Shell:
export AWS_ACCESS_KEY_ID=[copy-paste ACCESS_KEY here]
export AWS_SECRET_ACCESS_KEY=[copy-paste SECRET_KEY here]
export AWS_SECURITY_TOKEN=[copy-paste SECURITY_TOKEN here]
Finally, I also set up the .boto configuration file in my home directory. It looks like this:
[Credentials]
aws_access_key_id = $AWS_ACCESS_KEY_ID
aws_secret_access_key = $AWS_SECRET_ACCESS_KEY
aws_session_token = $AWS_SECURITY_TOKEN
[s3]
calling_format = boto.s3.connection.OrdinaryCallingFormat
use-sigv4=True
host=s3.us-east-1.amazonaws.com
When I run the following command:
gsutil cp s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz gs://my-bucket
I end up with:
AccessDeniedException: 403 AccessDenied
The full traceback is below:
Non-MD5 etag ("a21a0b2eba27a0a32a26a6b30f3cb060-6") present for key <Key: NDAR_Central_1,submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz>, data integrity checks are not possible.
Copying s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz [Content-Type=application/x-gzip]...
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/google/google-cloud-sdk/platform/gsutil/gslib/daisy_chain_wrapper.py", line 213, in PerformDownload
decryption_tuple=self.decryption_tuple)
File "/google/google-cloud-sdk/platform/gsutil/gslib/cloud_api_delegator.py", line 353, in GetObjectMedia
decryption_tuple=decryption_tuple)
File "/google/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 590, in GetObjectMedia
generation=generation)
File "/google/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 1723, in _TranslateExceptionAndRaise
raise translated_exception # pylint: disable=raising-bad-type
AccessDeniedException: AccessDeniedException: 403 AccessDenied
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>A93DBEA60B68E04D</RequestId><HostId>Z5XqPBmUdq05btXgZ2Tt7HQMzodgal6XxTD6OLQ2sGjbP20AyZ+fVFjbNfOF5+Bdy6RuXGSOzVs=</HostId></Error>
AccessDeniedException: 403 AccessDenied
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>A93DBEA60B68E04D</RequestId><HostId>Z5XqPBmUdq05btXgZ2Tt7HQMzodgal6XxTD6OLQ2sGjbP20AyZ+fVFjbNfOF5+Bdy6RuXGSOzVs=</HostId></Error>
I would like to be able to directly download this file from the S3 bucket to my GCP bucket (without having to create a VM, set up Python and run the code above [which works]). Why is it that the temporary credentials work on my computer but do not work in GCP Cloud Shell?
The complete log of the debug command
gsutil -DD cp s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz gs://my-bucket
can be found here.
The procedure you are trying to implement is called "Transfer Job"
In order to transfer a file from Amazon S3 bucket to a Cloud Storage bucket:
A. Click the Burger Menu on the top left corner
B. Go to Storage > Transfer
C. Click Create Transfer
Under Select source, select Amazon S3 bucket.
In the Amazon S3 bucket text box, specify the source Amazon S3 bucket name. The bucket name is the name as it appears in the AWS Management Console.
In the respective text boxes, enter the Access key ID and Secret key associated with the Amazon S3 bucket.
To specify a subset of files in your source, click Specify file filters beneath the bucket field. You can include or exclude files based on file name prefix and file age.
Under Select destination, choose a sink bucket or create a new one. To choose an existing bucket, enter the name of the bucket (without the prefix gs://), or click Browse and browse to it. To transfer files to a new bucket, click Browse and then click the New bucket icon.
Enable overwrite/delete options if needed. By default, your transfer job only overwrites an object when the source version is different from the sink version. No other objects are overwritten or deleted. Enable additional overwrite/delete options under Transfer options.
Under Configure transfer, schedule your transfer job to Run now (one time) or Run daily at the local time you specify.
Click Create.
Before setting up the Transfer Job please make sure you have the necessary roles assigned to your account and the required permissions described here.
Also take into consideration that the Storage Transfer Service is currently only available for certain Amazon S3 regions, as described under the AMAZON S3 tab of the Setting up a transfer job documentation.
Transfer jobs can also be created programmatically. More information here.
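For example, here is a hedged sketch of creating such a transfer job with the Storage Transfer Service API through google-api-python-client (project id, bucket names, keys and dates are placeholders; note that the API takes a long-lived access key/secret pair rather than temporary session credentials):

from googleapiclient import discovery

# Hedged sketch: create a one-time S3 -> Cloud Storage transfer job.
# All identifiers below are placeholders.
client = discovery.build('storagetransfer', 'v1')

transfer_job = {
    'description': 'One-time transfer from S3 to Cloud Storage',
    'status': 'ENABLED',
    'projectId': 'my-gcp-project',
    'transferSpec': {
        'awsS3DataSource': {
            'bucketName': 'NDAR_Central_1',
            'awsAccessKey': {
                'accessKeyId': '<long-lived access key id>',
                'secretAccessKey': '<long-lived secret access key>',
            },
        },
        'gcsDataSink': {'bucketName': 'my-bucket'},
    },
    'schedule': {
        'scheduleStartDate': {'year': 2020, 'month': 1, 'day': 1},
        'scheduleEndDate': {'year': 2020, 'month': 1, 'day': 1},
    },
}

result = client.transferJobs().create(body=transfer_job).execute()
print(result['name'])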
Let me know if this was helpful.
EDIT
Neither the Transfer Service nor the gsutil command currently supports "Temporary Security Credentials", even though they are supported by AWS. A workaround for what you want to do is to change the source code of the gsutil command.
I also filed a Feature Request on your behalf; I suggest you star it in order to get updates.

Uploading large files to Google Storage GCE from a Kubernetes pod

We get this error when uploading a large file (more than 10Mb but less than 100Mb):
403 POST https://www.googleapis.com/upload/storage/v1/b/dm-scrapes/o?uploadType=resumable: ('Response headers must contain header', 'location')
Or this error when the file is more than 5Mb:
403 POST https://www.googleapis.com/upload/storage/v1/b/dm-scrapes/o?uploadType=multipart: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>)
It seems that this API is looking at the file size and trying to upload it via the multipart or resumable method. I can't imagine that is something I should be concerned with as a caller of this API. Is the problem somehow related to permissions? Does the bucket need special permissions so it can accept multipart or resumable uploads?
from google.cloud import storage

try:
    client = storage.Client()
    bucket = client.get_bucket('my-bucket')
    blob = bucket.blob('blob-name')
    blob.upload_from_filename(zip_path, content_type='application/gzip')
except Exception as e:
    print(f'Error in uploading {zip_path}')
    print(e)
We run this inside a Kubernetes pod, so the permissions get picked up by the storage.Client() call automatically.
We already tried these:
Can't upload with gsutil because the container is Python 3 and gsutil does not run in python 3.
Tried this example, but it runs into the same error: ('Response headers must contain header', 'location')
There is also this library. But it is basically alpha quality with little activity and no commits for a year.
Upgraded to google-cloud-storage==1.13.0
Thanks in advance
The problem was indeed the credentials. Somehow the error message was very misleading. When we loaded the credentials explicitly, the problem went away.
# Explicitly use service account credentials by specifying the private key file.
storage_client = storage.Client.from_service_account_json(
    'service_account.json')
I found my node pools had been spec'd with
oauthScopes:
- https://www.googleapis.com/auth/devstorage.read_only
and changing it to
oauthScopes:
- https://www.googleapis.com/auth/devstorage.full_control
fixed the error. As described in this issue, the problem is an uninformative error message.
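To confirm which scopes a pod actually gets before changing the node pool, a hedged option is to query the GCE metadata server from inside the pod:

import requests

# Hedged sketch: list the OAuth scopes exposed to the pod by the node's
# default service account. Seeing only devstorage.read_only here would
# explain a 403 on uploads.
resp = requests.get(
    'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes',
    headers={'Metadata-Flavor': 'Google'},
)
print(resp.text)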
