Update file name and path using Python

Below is my API code where I am trying to update the file path on a daily basis to run an API call.
I am unable to determine how I can update the file name on a daily basis. Some help would be highly appreciated.
import requests
url = "*******"
payload = {'Content-Disposition': 'form-data',
'Content-Type': 'text/plain',
'name': 'file'}
files = [
('file', open('/C:/Users/SET/Desktop/TEST/TEST/test_test_test_file_20201001.csv','rb'))
]
headers = {
'Content-Type': 'multipart/form-data',
'X-API-TOKEN': '*******'
}
response = requests.request("POST", url, headers=headers, data = payload, files = files)
print(response.text.encode('utf8'))

I am trying to update the file path on a daily basis
The code below will create the file name based on the current date:
from datetime import date
today = date.today()
file_name = f'test_test_test_file_{today.year}{today.month:02}{today.day:02}.csv'
print(file_name)
Output:
test_test_test_file_20201007.csv
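To use that name in the upload from the question, the date-stamped name can be joined to the folder before the file is opened. This is a minimal sketch, assuming the daily files sit in one folder and follow the test_test_test_file_YYYYMMDD.csv pattern; build_daily_path and base_dir are hypothetical names, not part of the original code:

```python
from datetime import date
from pathlib import Path


def build_daily_path(base_dir, day=None):
    """Return the path of the CSV file for the given day (default: today)."""
    day = day or date.today()
    # strftime("%Y%m%d") yields a zero-padded year, month and day stamp
    file_name = f"test_test_test_file_{day.strftime('%Y%m%d')}.csv"
    return Path(base_dir) / file_name


# Hypothetical folder; replace it with the real location of the daily files.
path = build_daily_path("C:/Users/SET/Desktop/TEST/TEST", date(2020, 10, 7))
print(path.name)
```

The returned path can then be passed to open(path, 'rb') in the files list of the request.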

Related

Upload file to Google Drive with progress bar with Python requests

This is my code for uploading to google drive with python requests using google-drive-api.
import sys
import json
import requests
from tqdm import tqdm
import requests_toolbelt
from requests.exceptions import JSONDecodeError
class ProgressBar(tqdm):
    def update_to(self, n: int) -> None:
        self.update(n - self.n)
def upload_file(access_token:str, filename:str, filedirectory:str):
    metadata = {
        "title": filename,
    }
    files = {}
    session = requests.session()
    with open(filedirectory, "rb") as fp:
        files["file"] = fp
        files["data"] = ('metadata', json.dumps(metadata), 'application/json')
        encoder = requests_toolbelt.MultipartEncoder(files)
        with ProgressBar(
            total=encoder.len,
            unit="B",
            unit_scale=True,
            unit_divisor=1024,
            miniters=1,
            file=sys.stdout,
        ) as bar:
            monitor = requests_toolbelt.MultipartEncoderMonitor(
                encoder, lambda monitor: bar.update_to(monitor.bytes_read)
            )
            r = session.post(
                "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart",
                data=monitor,
                allow_redirects=False,
                headers={"Authorization": "Bearer " + access_token},
            )
    try:
        resp = r.json()
        print(resp)
    except JSONDecodeError:
        sys.exit(r.text)
upload_file("access_token", "test.txt", "test.txt")
When I send the file with the data argument of the POST request, the file name is not sent, and when I use the files argument, requests-toolbelt does not work. How can I fix this?
From your script, I think the cause is that the content type is not included in the request headers. Without it, the request body itself ends up as the content of the uploaded file. To resolve this, how about the following modification?
From:
r = session.post(
    url,
    data=monitor,
    allow_redirects=False,
    headers={"Authorization": "Bearer " + access_token},
)
To:
r = session.post(
    url,
    data=monitor,
    allow_redirects=False,
    headers={
        "Authorization": "Bearer " + access_token,
        "Content-Type": monitor.content_type,
    },
)
Note that metadata = { "title": filename } is the Drive API v2 form, so with that metadata the url is assumed to be https://www.googleapis.com/upload/drive/v2/files?uploadType=multipart. Please be careful about this.
To use Drive API v3, change metadata = { "title": filename } to metadata = { "name": filename } and use the endpoint https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart.
When the file is uploaded with Drive API v3, a value like {'kind': 'drive#file', 'id': '###', 'name': 'test.txt', 'mimeType': 'text/plain'} is returned.
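The v2/v3 pairing described here can be kept in one place so that the metadata key and the endpoint never get out of sync. A small sketch, using only the two endpoints and keys named in this answer; drive_upload_target is a hypothetical helper name:

```python
def drive_upload_target(filename, api_version="v3"):
    """Return (endpoint, metadata) for a multipart upload with Drive API v2 or v3."""
    if api_version not in ("v2", "v3"):
        raise ValueError("api_version must be 'v2' or 'v3'")
    endpoint = (
        f"https://www.googleapis.com/upload/drive/{api_version}/files"
        "?uploadType=multipart"
    )
    # v2 names the file with "title"; v3 uses "name"
    key = "title" if api_version == "v2" else "name"
    return endpoint, {key: filename}


endpoint, metadata = drive_upload_target("test.txt", "v3")
print(endpoint)
print(metadata)
```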
By the way, if an error such as badContent occurs in your testing, please try the following modification. It seems that an error occurs when the file content is placed before the file metadata in the multipart/form-data request body. I'm not sure whether this is the current specification, and I had not known that the order of the parts in the request body needs to be checked.
From
files = {}
files["file"] = fp
files["data"] = ('metadata', json.dumps(metadata), 'application/json')
To
files = collections.OrderedDict(data=("metadata", json.dumps(metadata), "application/json"), file=fp)
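Since Python 3.7 a plain dict also preserves insertion order, so listing the metadata part first should have the same effect on current interpreters. The sketch below only checks the key order; it assumes the encoder walks the mapping in order, and it uses a BytesIO object as a stand-in for the opened file:

```python
import collections
import io
import json

metadata = {"name": "test.txt"}
fp = io.BytesIO(b"file content")  # stand-in for open(filedirectory, "rb")

# Metadata part first, file part second
ordered = collections.OrderedDict(
    data=("metadata", json.dumps(metadata), "application/json"), file=fp
)
plain = {
    "data": ("metadata", json.dumps(metadata), "application/json"),
    "file": fp,
}
print(list(ordered))
print(list(plain))
```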
Note:
I thought that in your script, an error might occur at file_size = os.path.getsize(filename). Please confirm this again.
When I tested your script with the above modifications applied, I could confirm that a test file was uploaded to Google Drive with the expected filename. In this case, I also modified it as follows.
files = collections.OrderedDict(data=("metadata", json.dumps(metadata), "application/json"), file=fp)
References:
Files: insert of Drive API v2
Files: create of Drive API v3
Upload file data
Metadata needs to be sent in the post body as json.
Python Requests post() Method
data = Optional. A dictionary, list of tuples, bytes or a file object to send to the specified url
json = Optional. A JSON object to send to the specified url
metadata = {
    "name": filename,
}
r = session.post(
    url,
    json=metadata,  # pass the dict itself; requests serializes it to JSON
    allow_redirects=False,
    headers={"Authorization": "Bearer " + access_token},
)
Future readers can find a complete script below; it also describes how to obtain the bearer token for HTTP authentication.
Most of the credit goes to the OP and the answers to the OP's question.
"""
Goal: For one time upload of a large file (as the GDrive UI hangs up)
Step 1 - Create OAuth 2.0 Client ID + Client Secret
- by following the "Authentication" part of https://pythonhosted.org/PyDrive/quickstart.html
Step 2 - Get Access Token
- from the OAuth playground -> https://developers.google.com/oauthplayground/
--> Select Drive API v3 -> www.googleapis.com/auth/drive --> Click on "Authorize APIs"
--> Click on "Exchange authorization code for tokens" --> "Copy paste the access token"
--> Use it in the script below
Step 3 - Run file as daemon process
- nohup python -u upload_gdrive.py > upload_gdrive.log 2>&1 &
- tail -f upload_gdrive.log
"""
import sys
import json
import requests
from tqdm import tqdm
import requests_toolbelt # pip install requests_toolbelt
from requests.exceptions import JSONDecodeError
import collections
class ProgressBar(tqdm):
    def update_to(self, n: int) -> None:
        self.update(n - self.n)
def upload_file(access_token:str, filename:str, filepath:str):
    metadata = {
        "name": filename,
    }
    files = {}
    session = requests.session()
    with open(filepath, "rb") as fp:
        files = collections.OrderedDict(data=("metadata", json.dumps(metadata), "application/json"), file=fp)
        encoder = requests_toolbelt.MultipartEncoder(files)
        with ProgressBar(
            total=encoder.len,
            unit="B",
            unit_scale=True,
            unit_divisor=1024,
            miniters=1,
            file=sys.stdout,
        ) as bar:
            monitor = requests_toolbelt.MultipartEncoderMonitor(
                encoder, lambda monitor: bar.update_to(monitor.bytes_read)
            )
            r = session.post(
                "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart",
                data=monitor,
                allow_redirects=False,
                headers={
                    "Authorization": "Bearer " + access_token,
                    "Content-Type": monitor.content_type,
                },
            )
    try:
        resp = r.json()
        print(resp)
    except JSONDecodeError:
        sys.exit(r.text)
upload_file("<access_token>", "<upload_filename>", "<path_to_file>")

Read JSON Direct from API into DataFrame via PySpark/Databricks

I'm attempting to improve a current process that we have. Presently the process is:
API Call Made > JSON Saved as File > API Calls are iterated ending up with multiple files > Files are then read into a Databricks Dataframe.
I am attempting to remove the need to save the JSON as files and then read those files into a dataframe with read.json before I iterate through the data.
Is there a way I can read the json response into a string and then read it directly into a data frame?
My attempt is below but it keeps failing:
payload={}
headers = {
'Authorization': 'Basic ==',
'Cookie': 'JSESSIONID='
}
response = requests.request("GET", apipath, headers=headers, data=payload)
jsonData = json.dumps(response.text)
jsonDataList = []
jsonDataList.append(jsonData)
jsonRDD = sc.parallelize(jsonDataList)
df = spark.read.json(jsonRDD)
However, when I run df.printSchema(), I am told the data is "corrupt".
I've also tried to do the following:
payload={}
headers = {
'Authorization': 'Basic ==',
'Cookie': 'JSESSIONID='
}
response = requests.request("GET", apipath, headers=headers, data=payload)
jsonData = json.dumps(response.text)
#jsonDataList = []
#jsonDataList.append(jsonData)
#jsonRDD = sc.parallelize(jsonDataList)
df = spark.read.json(jsonData)
But I get a "relative path in URI" error, and I'm guessing that is because it's not reading directly from a file.
Any assistance would be really appreciated.
I reproduced the same error in my environment.
To resolve it, pass the raw response text to parallelize without calling json.dumps on it first:
import requests
resp = requests.get('https://reqres.in/api/users?page=1,name,href')
db1 = spark.sparkContext.parallelize([resp.text])
df2 = spark.read.json(db1)
df2.show()
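The likely reason the question's version failed is that response.text is already a JSON document; calling json.dumps on it wraps the whole document in one quoted string, so the reader no longer sees a JSON object with fields. A small sketch of that effect, using a hard-coded string in place of a real API response (the sample payload is made up):

```python
import json

# Stand-in for response.text; a real API would return its own payload
response_text = '{"page": 1, "data": [{"id": 1}, {"id": 2}]}'

double_encoded = json.dumps(response_text)

# Parsing the raw text yields a dict, i.e. a JSON object with fields
print(type(json.loads(response_text)).__name__)
# Parsing the re-encoded text only yields the original string back
print(type(json.loads(double_encoded)).__name__)
```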

Replacing the existing file in google drive using python

I am trying to replace an existing file with the same id in the google drive.
import json
import requests
headers = {"Authorization": "Bearer Token"}
para = {
"name": "video1.mp4",
}
files = {
'data': ('metadata', json.dumps(para), 'application/json; charset=UTF-8'),
'file': open("./video1.mp4", "rb")
}
r = requests.post(
"https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart",
headers=headers,
files=files
)
print(r.text)
This code creates files with similar names but different IDs. Can I replace an existing file with the same id by mentioning it somewhere in this code?
Although I'm not sure whether I correctly understood "Can I replace an existing file with the same id by mentioning it somewhere in this code?", if you want to update an existing file on Google Drive, how about the following modified script?
From the script you showed, I understood that you want to achieve your goal using requests instead of googleapis for Python.
Modified script 1:
If you want to update both the file content and file metadata, how about the following modification?
import json
import requests
fileId = "###" # Please set the file ID you want to update.
headers = {"Authorization": "Bearer Token"}
para = {"name": "video1.mp4"}
files = {
"data": ("metadata", json.dumps(para), "application/json; charset=UTF-8"),
"file": open("./video1.mp4", "rb"),
}
r = requests.patch("https://www.googleapis.com/upload/drive/v3/files/" + fileId + "?uploadType=multipart",
headers=headers,
files=files,
)
print(r.text)
If you want to update only the file content, please modify para = {"name": "video1.mp4"} to para = {}.
Modified script 2:
If you want to update only the file metadata, how about the following modification?
import json
import requests
fileId = "###" # Please set the file ID you want to update.
headers = {"Authorization": "Bearer Token"}
para = {"name": "Updated filename video1.mp4"}
r = requests.patch(
"https://www.googleapis.com/drive/v3/files/" + fileId,
headers=headers,
data=json.dumps(para),
)
print(r.text)
Note:
This answer assumes that your access token can be used to upload and update the file. Please be careful about this.
Reference:
Files: update
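The two modified scripts differ only in which endpoint receives the PATCH request. A short sketch that captures that choice, using the two URLs from the scripts above; drive_update_url is a hypothetical helper name and "FILE_ID" is a placeholder:

```python
def drive_update_url(file_id, with_content=True):
    """Return the PATCH URL for updating an existing Drive v3 file."""
    if with_content:
        # Content (and optionally metadata) goes through the upload endpoint
        return (
            "https://www.googleapis.com/upload/drive/v3/files/"
            + file_id
            + "?uploadType=multipart"
        )
    # Metadata-only updates use the plain files endpoint
    return "https://www.googleapis.com/drive/v3/files/" + file_id


print(drive_update_url("FILE_ID"))
print(drive_update_url("FILE_ID", with_content=False))
```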

Azure DevOps Python {"count":1,"value":{"Message":"Unexpected character encountered while parsing value: q. Path '', line 0, position 0.\r\n"}}

Trying to get Work Items for an Azure DevOps Project.
import requests
import base64
from azure.devops.connection import Connection
from azure.devops.v5_1.work_item_tracking.models import Wiql
from msrest.authentication import BasicAuthentication
organization = "https://dev.azure.com/dev"
pat = 'ey3nbq'
authorization = str(base64.b64encode(bytes(':'+pat, 'ascii')), 'ascii')
headers = {
'Accept': 'application/json',
'Content-Type': 'application/json',
'Authorization': 'Basic '+authorization
}
payload = {
"query": "SELECT [System.Id] FROM workitemLinks WHERE ([Source].[System.WorkItemType] = 'Task') AND ([System.Links.LinkType] = 'System.LinkTypes.Hierarchy-Reverse') AND ([Target].[System.WorkItemType] = 'User Story') MODE (DoesNotContain)"
}
response = requests.post(url="https://dev.azure.com/dev/Agile_Board/_apis/wit/wiql?api-version=5.1", headers=headers, data=payload)
print(response.text)
This gives a 400 response.
I have tried many things and have been struggling with this. Any help is much appreciated. How can I get a project's work items without using their IDs? Does the request need to be changed in some way?
Change your request to pass the payload with json=payload:
response = requests.post(url="https://dev.azure.com/YOUR_ORG/Agile_Board/_apis/wit/wiql?api-version=5.1", headers=headers, json=payload)
or use something like this:
payload_str = "{\"query\": \"SELECT [System.Id] FROM workitemLinks WHERE ([Source].[System.WorkItemType] = 'Task') AND ([System.Links.LinkType] = 'System.LinkTypes.Hierarchy-Reverse') AND ([Target].[System.WorkItemType] = 'User Story') MODE (DoesNotContain)\"}"
response = requests.post(url="https://dev.azure.com/YOUR_ORG/Agile_Board/_apis/wit/wiql?api-version=5.1", headers=headers, data=payload_str)
Check this question: How to POST JSON data with Python Requests?
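The difference between the two keyword arguments is the encoding of the body: a dict passed as data is sent form-encoded, while json serializes it to a JSON document. The "q" in the error message is consistent with the server trying to parse a body that starts with query=... as JSON. The sketch below imitates both encodings with the standard library, without sending anything; the shortened query is only for illustration:

```python
import json
from urllib.parse import urlencode

payload = {"query": "SELECT [System.Id] FROM workitems"}

# Roughly what data=payload puts on the wire: key=value pairs
form_body = urlencode(payload)
# What json=payload puts on the wire: a JSON document
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```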

Python Requests: Post JSON and file in single request

I need to make an API call to upload a file along with a JSON string with details about the file.
I am trying to use the python requests lib to do this:
import requests
info = {
'var1' : 'this',
'var2' : 'that',
}
data = json.dumps({
'token' : auth_token,
'info' : info,
})
headers = {'Content-type': 'multipart/form-data'}
files = {'document': open('file_name.pdf', 'rb')}
r = requests.post(url, files=files, data=data, headers=headers)
This throws the following error:
raise ValueError("Data must not be a string.")
ValueError: Data must not be a string
If I remove the 'files' from the request, it works.
If I remove the 'data' from the request, it works.
If I do not encode data as JSON it works.
For this reason I think the error is to do with sending JSON data and files in the same request.
Any ideas on how to get this working?
See this thread How to send JSON as part of multipart POST-request
Do not set the Content-type header yourself; leave that to requests to generate.
import json
import os
import requests
def send_request(url, file):
    payload = {"param_1": "value_1", "param_2": "value_2"}
    with open(file, 'rb') as fh:  # close the file once the request is done
        files = {
            'json': (None, json.dumps(payload), 'application/json'),
            'file': (os.path.basename(file), fh, 'application/octet-stream')
        }
        r = requests.post(url, files=files)
    print(r.content)
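The reason for leaving the header alone is that a multipart body is split by a boundary string, and the same string has to appear in the Content-Type header; a hand-written 'multipart/form-data' header carries no boundary, so the server cannot split the parts. The hand-rolled encoder below is only an illustration of that format, not how requests is implemented; build_multipart and the sample parts are made up for the example:

```python
import json
import uuid


def build_multipart(fields):
    """Encode (name, filename, content_type, payload_bytes) parts by hand."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, filename, content_type, payload in fields:
        disposition = f'form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        lines.append(f"--{boundary}".encode())
        lines.append(f"Content-Disposition: {disposition}".encode())
        lines.append(f"Content-Type: {content_type}".encode())
        lines.append(b"")  # blank line separates part headers from content
        lines.append(payload)
    lines.append(f"--{boundary}--".encode())  # closing delimiter
    lines.append(b"")
    body = b"\r\n".join(lines)
    # The header must repeat the boundary used inside the body
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, body


content_type, body = build_multipart([
    ("json", None, "application/json", json.dumps({"param_1": "value_1"}).encode()),
    ("file", "example.bin", "application/octet-stream", b"\x00\x01"),
])
print(content_type)
```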
Don't encode using json.
import requests
info = {
'var1' : 'this',
'var2' : 'that',
}
data = {
'token' : auth_token,
'info' : info,
}
headers = {'Content-type': 'multipart/form-data'}
files = {'document': open('file_name.pdf', 'rb')}
r = requests.post(url, files=files, data=data, headers=headers)
Note that this may not necessarily be what you want, as it will become another form-data section.
I don't think you can send both data and files in a multipart encoded file, so you need to make your data a "file" too:
files = {
'data' : data,
'document': open('file_name.pdf', 'rb')
}
r = requests.post(url, files=files, headers=headers)
I have been using requests==2.22.0
For me, the code below worked.
import requests
data = {
    'var1': 'this',
    'var2': 'that'
}
r = requests.post("http://api.example.com/v1/api/some/",
    files={'document': open('document.pdf', 'rb')},
    data=data,
    headers={"Authorization": "Token jfhgfgsdadhfghfgvgjhN"},  # since I had to authenticate for the same
)
print(r.json())
To send a request to the Facebook Messenger API, I changed all the payload dictionary values to strings. Then I could pass the payload as the data parameter.
import requests
ACCESS_TOKEN = ''
url = 'https://graph.facebook.com/v2.6/me/messages'
payload = {
'access_token' : ACCESS_TOKEN,
'messaging_type' : "UPDATE",
'recipient' : '{"id":"1111111111111"}',
'message' : '{"attachment":{"type":"image", "payload":{"is_reusable":true}}}',
}
files = {'filedata': (file, open(file, 'rb'), 'image/png')}
r = requests.post(url, files=files, data=payload)
1. Sending request
import json
import requests
cover = 'superneat.jpg'
payload = {'title': 'The 100 (2014)', 'episodes': json.dumps(_episodes)}
files = [
('json', ('payload.json', json.dumps(payload), 'application/json')),
('cover', (cover, open(cover, 'rb')))
]
r = requests.post("https://superneatech.com/store/series", files=files)
print(r.text)
2. Receiving request
You will receive the JSON data as a file, get the content and continue...
What is more:
files = {
'document': open('file_name.pdf', 'rb')
}
That will only work if the file is in the current working directory, for example when you run the script from the folder that contains the file.
If you want to attach a file from a different directory, you should do:
files = {
'document': open(os.path.join(dir_path, 'file_name.pdf'), 'rb')
}
where dir_path is the directory that contains your 'file_name.pdf' file.
But what if you'd like to send multiple PDFs?
You can write a small function that returns the list of files you need (in your case, only those with a .pdf extension). It also includes files in subdirectories, because it searches recursively:
def prepare_pdfs():
    # dir_path is the directory to search; this needs "import os"
    return sorted([os.path.join(root, filename) for root, dirnames, filenames in os.walk(dir_path) for filename in filenames if filename.endswith('.pdf')])
Then you can call it:
my_data = prepare_pdfs()
And send them with a simple loop:
for file in my_data:
    with open(file, 'rb') as pdf:  # close each file once it has been sent
        files = {
            'document': pdf
        }
        r = requests.post(url, files=files, ...)
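The same recursive search can also be written with pathlib. This is a sketch under the assumption that matching on the .pdf suffix is enough; find_pdfs is a hypothetical name, and the temporary directory exists only to make the example self-contained:

```python
import tempfile
from pathlib import Path


def find_pdfs(dir_path):
    """Return all .pdf files under dir_path, including subdirectories, sorted."""
    return sorted(p for p in Path(dir_path).rglob("*.pdf") if p.is_file())


# Demo on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.pdf").write_bytes(b"%PDF-1.4")
    (root / "sub" / "b.pdf").write_bytes(b"%PDF-1.4")
    (root / "notes.txt").write_text("not a pdf")
    found = [p.relative_to(root).as_posix() for p in find_pdfs(root)]

print(found)
```

Each returned path can be opened in a with block and sent in its own request, as in the loop above.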
