bulk update failing when document has attachments? - python

I am performing the following operation:
Prepare some documents: docs = [ doc1, doc2, ... ]. The documents may have attachments.
I POST to _bulk_docs the list of documents
I get an exception: Problems updating list of documents (length = 1): (500, ('badarg', '58'))
My _bulk_docs payload is (in this case just one document):
[ { '_attachments': { 'image.png': { 'content_type': 'image/png',
                                     'data': 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAM8AAADkCAIAAACwiOf9AAAAA3NCSVQICAjb4U/gAAAgAElEQVR4nO...'}},
    '_id': '08b8fc66-cd90-47a1-9053-4f6fefabdfe3',
    '_rev': '15-ff3d0e8baa56e5ad2fac4937264fb3f6',
    'docmeta': { 'created': '2013-10-01 14:48:24.311257',
                 'updated': [ '2013-10-01 14:48:24.394157',
                              '2013-12-11 08:19:47.271812',
                              '2013-12-11 08:25:05.662546',
                              '2013-12-11 10:38:56.116145']},
    'org_id': 45345,
    'outputs_id': None,
    'properties': { 'auto-t2s': False,
                    'content_type': 'image/png',
                    'lang': 'es',
                    'name': 'dfasdfasdf',
                    'text': 'erwerwerwrwerwr'},
    'subtype': 'voicemail-st',
    'tags': ['RRR-ccc-dtjkqx'],
    'type': 'recording'}]
This is the detailed exception:
Traceback (most recent call last):
File "portal_support_ut.py", line 470, in test_UpdateDoc
self.ps.UpdateDoc(self.org_id, what, doc_id, new_data)
File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/ps/complex_ops.py", line 349, in UpdateDoc
success, doc = database.UpdateDoc(doc_id, new_data)
File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/updater.py", line 38, in UpdateDoc
res = self.SaveDoc(doc_id, doc)
File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/saver.py", line 88, in SaveDoc
else : self.bulk_append(doc, flush, update_revision)
File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 257, in bulk_append
if force_send or flush or not self.timer.use_timer : self.BulkSend(show_progress=True)
File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 144, in BulkSend
results = self.UpdateDocuments(self.bulk)
File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 67, in UpdateDocuments
results = self.db.update(bulkdocs)
File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/client.py", line 764, in update
_, _, data = self.resource.post_json('_bulk_docs', body=content)
File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 527, in post_json
**params)
File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 546, in _request_json
headers=headers, **params)
File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 542, in _request
credentials=self.credentials)
File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 398, in request
raise ServerError((status, error))
ServerError: (500, ('badarg', '58'))
What does that badarg mean? Is it possible to send attachments when doing _bulk_docs?

The solution is to remove the data:image/png;base64, prefix before sending the attachment to CouchDB.
For a Python alternative, see here.

This was answered on our mailing list; repeating the answer here for completeness.
The data field was malformed in two ways:
'data': 'data:image/png;base64,iVBORw0KGgoAA....'
The 'data:image/png;base64,' prefix is wrong, and the base64 part itself was malformed (CouchDB needs to decode it in order to store it).
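A minimal sketch of that cleanup in Python, assuming the attachment value arrives as a data-URI string (the helper name and the way the documents are walked are illustrative):
import base64

def clean_attachment_data(value):
    # Drop a "data:<mime>;base64," prefix if one is present.
    if value.startswith('data:'):
        value = value.split('base64,', 1)[1]
    # Round-trip through decode/encode so CouchDB receives well-formed base64.
    return base64.b64encode(base64.b64decode(value)).decode('ascii')

for doc in docs:
    for name, meta in doc.get('_attachments', {}).items():
        meta['data'] = clean_attachment_data(meta['data'])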

Related

Problem with reading JSON file in YouTube Data API

I am trying to upload a video to YouTube using the upload_video method, but I am encountering a TypeError that says 'Object of type function is not JSON serializable'.
Here is the code I am using:
def upload_video(title,description,tags,upload_year,uplaod_month,upload_day):
    upload_date_time = datetime.datetime(upload_year,uplaod_month,upload_day, 8, 00, 0).isoformat() + 'Z'
    print(f"this is a upload time {upload_date_time}")
    request_body = {
        'snippet': {
            'categoryI': 19,
            'title': title,
            'description': description,
            'tags': tags
        },
        'status': {
            'privacyStatus': 'private',
            'publishAt': upload_date_time,
            'selfDeclaredMadeForKids': False,
        },
        'notifySubscribers': False
    }
    mediaFile = MediaFileUpload('output.MP4')
    response_upload = service.videos().insert(
        part='snippet,status',
        body=request_body,
        media_body=mediaFile
    ).execute()
This is the error message I am receiving:
Traceback (most recent call last):
  File "c:\Users\Lukas\Dokumenty\python_scripts\Billionare livestyle\main.py", line 216, in <module>
    upload_video(title,"#Shorts", ["motivation", "business", "luxury", "entrepreneurship", "success", "lifestyle", "inspiration", "wealth", "financial freedom", "investing", "mindset", "personal development", "self-improvement", "goals", "hustle", "ambition", "rich life", "luxury lifestyle", "luxury brand", "luxury travel", "luxury cars"],year,month,day)
  File "c:\Users\Lukas\Dokumenty\python_scripts\Billionare livestyle\main.py", line 93, in upload_video
    response_upload = service.videos().insert(
  File "C:\Users\Lukas\Dokumenty\python_scripts\Billionare livestyle\env\youtube\lib\site-packages\googleapiclient\discovery.py", line 1100, in method
    headers, params, query, body = model.request(
  File "C:\Users\Lukas\Dokumenty\python_scripts\Billionare livestyle\env\youtube\lib\site-packages\googleapiclient\model.py", line 160, in request
    body_value = self.serialize(body_value)
  File "C:\Users\Lukas\Dokumenty\python_scripts\Billionare livestyle\env\youtube\lib\site-packages\googleapiclient\model.py", line 273, in serialize
    return json.dumps(body_value)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\json\encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type function is not JSON serializable
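The traceback shows json.dumps failing on a function object somewhere inside request_body, which usually means one of the arguments (for example title or tags) was passed in as a function rather than its return value. A hypothetical pre-flight check at the top of upload_video would point at the offending argument:
def upload_video(title, description, tags, upload_year, upload_month, upload_day):
    # Illustrative sanity check: every value placed in request_body must be
    # JSON-serializable, so a callable here means a missing "()" at the call site.
    for name, value in [('title', title), ('description', description), ('tags', tags)]:
        if callable(value):
            raise TypeError(f'{name} was passed as a function, not a value')
    ...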

Error while uploading file on IPFS (TypeError: expected string or bytes-like object)

I am trying to upload a file on IPFS and retrieve it. The tutorial I am following uses the following approach:
import requests
import json
files = {
    "file" : ("Congrats! You have uploaded this on IPFS."),
}
response_hash = requests.post("https://ipfs.infura.io:5001/api/v0/add", files = files)
p = response_hash.json()
hashed = p["Hash"]
print(p)
print(hashed)
params = (
    ("arg", hashed),
)
response = requests.post("https://ipfs.infura.io:5001/api/v0/block/get", params = params)
print(response.text)
However, I want to upload multiple data, preferably in the form of json arrays. I tried to modify it but I'm running into an error.
My code:
import requests
import json
example = {
    "employees": [
        {"name":"Shyam", "email":"shyamjaiswal#gmail.com"},
        {"name":"Bob", "email":"bob32#gmail.com"},
        {"name":"Jai", "email":"jai87#gmail.com"}
    ]
}
response_hash = requests.post("https://ipfs.infura.io:5001/api/v0/add", files = example)
p = response_hash.json()
hashed = p["Hash"]
print(p)
print(hashed)
params = (
    ("arg", hashed),
)
response = requests.post("https://ipfs.infura.io:5001/api/v0/block/get", params = params)
print(response.text)
Error:
Traceback (most recent call last):
File "ipfs_v1.py", line 16, in <module>
response_hash = requests.post("https://ipfs.infura.io:5001/api/v0/add", files = example)
File "E:\Anaconda3\lib\site-packages\requests\api.py", line 119, in post
return request('post', url, data=data, json=json, **kwargs)
File "E:\Anaconda3\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "E:\Anaconda3\lib\site-packages\requests\sessions.py", line 516, in request
prep = self.prepare_request(req)
File "E:\Anaconda3\lib\site-packages\requests\sessions.py", line 459, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "E:\Anaconda3\lib\site-packages\requests\models.py", line 317, in prepare
self.prepare_body(data, files, json)
File "E:\Anaconda3\lib\site-packages\requests\models.py", line 505, in prepare_body
(body, content_type) = self._encode_files(files, data)
File "E:\Anaconda3\lib\site-packages\requests\models.py", line 166, in _encode_files
rf.make_multipart(content_type=ft)
File "E:\Anaconda3\lib\site-packages\urllib3\fields.py", line 268, in make_multipart
((u"name", self._name), (u"filename", self._filename))
File "E:\Anaconda3\lib\site-packages\urllib3\fields.py", line 225, in _render_parts
parts.append(self._render_part(name, value))
File "E:\Anaconda3\lib\site-packages\urllib3\fields.py", line 205, in _render_part
return self.header_formatter(name, value)
File "E:\Anaconda3\lib\site-packages\urllib3\fields.py", line 116, in format_header_param_html5
value = _replace_multiple(value, _HTML5_REPLACEMENTS)
File "E:\Anaconda3\lib\site-packages\urllib3\fields.py", line 89, in _replace_multiple
result = pattern.sub(replacer, value)
TypeError: expected string or bytes-like object
What am I doing wrong? How do I upload json arrays onto IPFS?
I converted the employee details to a string:
import requests
import json
files = {
    "employees" : ( """{"name":"Shyam", "email":"shyamjaiswal#gmail.com"},
{"name":"Bob", "email":"bob32#gmail.com"},
{"name":"Jai", "email":"jai87#gmail.com"} """),
}
response_hash = requests.post("https://ipfs.infura.io:5001/api/v0/add", files = files)
p = response_hash.json()
hashed = p["Hash"]
print(p)
print(hashed)
params = (
    ("arg", hashed),
)
print(response.text)
output:
{'Name': 'employees', 'Hash': 'QmeGTapzFr36Bag6c1w4ZxiuJVM8wxDGMD7GFFmc7onV8c', 'Size': '161'}
QmeGTapzFr36Bag6c1w4ZxiuJVM8wxDGMD7GFFmc7onV8c
╗ {"name":"Shyam", "email":"shyamjaiswal#gmail.com"},
{"name":"Bob", "email":"bob32#gmail.com"},
{"name":"Jai", "email":"jai87#gmail.com"}

Pymongo ignoring allowDiskUse = True

I've looked at the other answers to this question, and yet it is still not working. I am trying to delete duplicate cases, here is the function:
def deleteDups(datab):
    col = db[datab]
    pipeline = [
        {'$group': {
            '_id': {
                'CASE NUMBER': '$CASE NUMBER',
                'JURISDICTION': '$JURISDICTION'},  # needs to be case insensitive
            'count': {'$sum': 1},
            'ids': {'$push': '$_id'}
            }
        },
        {'$match': {'count': {'$gt': 1}}},
    ]
    results = col.aggregate(pipeline, allowDiskUse = True)
    count = 0
    for result in results:
        doc_count = 0
        print(result)
        it = iter(result['ids'])
        next(it)  # keep the first document, delete the rest
        for id in it:
            deleted = col.delete_one({'_id': id})
            count += 1
            doc_count += 1
            # print("API call received:", deleted.acknowledged)  # debug: is the database receiving requests?
    print("Total documents deleted:", count)
And yet, every time, I get this traceback:
File "C:\Users\*****\Documents\GitHub\*****\controller.py", line 202, in deleteDups
results = col.aggregate(pipeline, allowDiskUse = True)
File "C:\Python38\lib\site-packages\pymongo\collection.py", line 2375, in aggregate
return self._aggregate(_CollectionAggregationCommand,
File "C:\Python38\lib\site-packages\pymongo\collection.py", line 2297, in _aggregate
return self.__database.client._retryable_read(
File "C:\Python38\lib\site-packages\pymongo\mongo_client.py", line 1464, in _retryable_read
return func(session, server, sock_info, slave_ok)
File "C:\Python38\lib\site-packages\pymongo\aggregation.py", line 136, in get_cursor
result = sock_info.command(
File "C:\Python38\lib\site-packages\pymongo\pool.py", line 603, in command
return command(self.sock, dbname, spec, slave_ok,
File "C:\Python38\lib\site-packages\pymongo\network.py", line 165, in command
helpers._check_command_response(
File "C:\Python38\lib\site-packages\pymongo\helpers.py", line 159, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.
I asterisked out bits of the path to protect privacy. But it is driving me absolutely nuts: the line results = col.aggregate(pipeline, allowDiskUse = True) very explicitly passes allowDiskUse = True, and Mongo is just ignoring it. If I misspelled something, I'm blind; True has to be capitalized to pass a bool in Python.
I feel like I'm going crazy here.
According to the documentation:
Atlas Free Tier and shared clusters do not support the allowDiskUse option for the aggregation command or its helper method.
(Thanks to Shane Harvey for this info)

/model/train HTTP API giving 500 error when providing “nlu” data in JSON

I am trying to train a model using the HTTP API and JSON data. Below is the code.
import requests
import json
data = {
    "config": "language: en\npipeline:\n- name: WhitespaceTokenizer\n- name: RegexFeaturizer\n- name: LexicalSyntacticFeaturizer\n- name: CountVectorsFeaturizer\n- name: CountVectorsFeaturizer\nanalyzer: \"char_wb\"\nmin_ngram: 1\nmax_ngram: 4\n- name: DIETClassifier\nepochs: 100\n- name: EntitySynonymMapper\n- name: ResponseSelector\nepochs: 100",
    "nlu": json.dumps({
        "rasa_nlu_data": {
            "regex_features": [],
            "entity_synonyms": [],
            "common_examples": [
                {
                    "text": "i m looking for a place to eat",
                    "intent": "restaurant_search",
                    "entities": []
                },
                {
                    "text": "I want to grab lunch",
                    "intent": "restaurant_search",
                    "entities": []
                },
                {
                    "text": "I am searching for a dinner spot",
                    "intent": "restaurant_search",
                    "entities": []
                },
            ]
        }
    }),
    "force": False,
    "save_to_default_model_directory": True
}
r = requests.post('http://localhost:5005/model/train', json=data)
It gives me a 500 error. Below is the error log:
2020-09-30 07:40:37,511 [DEBUG] Traceback (most recent call last):
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/rasa/server.py", line 810, in train
None, functools.partial(train_model, **info)
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/rasa/train.py", line 50, in train
additional_arguments=additional_arguments,
File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/rasa/train.py", line 83, in train_async
config, domain, training_files
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/rasa/importers/importer.py", line 79, in load_from_config
config = io_utils.read_config_file(config_path)
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/rasa/utils/io.py", line 188, in read_config_file
content = read_yaml(read_file(filename))
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/rasa/utils/io.py", line 124, in read_yaml
return yaml_parser.load(content) or {}
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/ruamel/yaml/main.py", line 343, in load
return constructor.get_single_data()
File "/home/Documents/practice/rasa/test1/venv/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 111, in get_single_data
node = self.composer.get_single_node()
File "_ruamel_yaml.pyx", line 706, in _ruamel_yaml.CParser.get_single_node
File "_ruamel_yaml.pyx", line 724, in _ruamel_yaml.CParser._compose_document
File "_ruamel_yaml.pyx", line 775, in _ruamel_yaml.CParser._compose_node
File "_ruamel_yaml.pyx", line 891, in _ruamel_yaml.CParser._compose_mapping_node
File "_ruamel_yaml.pyx", line 904, in _ruamel_yaml.CParser._parse_next_event
ruamel.yaml.parser.ParserError: while parsing a block mapping
in "<unicode string>", line 1, column 1
did not find expected key
in "<unicode string>", line 11, column 1
When I train the model using terminal commands and a JSON file, it trains successfully. I think I am missing some formatting required by the /model/train API. Can someone tell me where I am going wrong?
I am using Rasa version 1.10.14.
Thank you in advance.
It turns out that the config string was not well-formed. Training failed because of the double quotes used with escape characters. I made some tweaks to the config and it trained the model successfully.
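One way to avoid hand-escaping the YAML is to keep the pipeline in a file and read it at request time. A sketch under those assumptions (a config.yml next to the script, the server on localhost:5005, and the NLU payload trimmed for brevity):
import requests
import json

# Read the pipeline configuration from a real YAML file instead of
# building an escaped string by hand; "config.yml" is an illustrative name.
with open("config.yml") as f:
    config_yaml = f.read()

data = {
    "config": config_yaml,
    "nlu": json.dumps({"rasa_nlu_data": {"common_examples": []}}),
    "force": False,
    "save_to_default_model_directory": True
}
r = requests.post("http://localhost:5005/model/train", json=data)
print(r.status_code)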

get TypeError when posting an array of files in Python using requests

When I try to post an array of files in Python using requests, I get "TypeError: a bytes-like object is required, not 'dict'".
In order to send an album in Telegram from local files, we need to provide an array of media objects, each of which contains the file type. So I try:
import requests
bot_token = '<BOT_TOKEN>'
send_to = 1234567890
data = {'chat_id': send_to}
album = [
    {'type': 'photo', 'media': 'test/foo.png'},
    {'type': 'photo', 'media': 'test/bar.png'},
]
for i in range(len(album)):
    album[i]['media'] = open(album[i]['media'], 'rb')
files = {'media': album}
test = requests.post(
    f'https://api.telegram.org/{bot_token}/sendMediaGroup',
    data=data, files=files)
When I run this I get:
Traceback (most recent call last):
File "D:/Games/GitHub/tgapi/test_album.py", line 87, in <module>
data=data, files=files)
File "C:\ProgramData\Anaconda\lib\site-packages\requests\api.py", line 116, in post
return request('post', url, data=data, json=json, **kwargs)
File "C:\ProgramData\Anaconda\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "C:\ProgramData\Anaconda\lib\site-packages\requests\sessions.py", line 519, in request
prep = self.prepare_request(req)
File "C:\ProgramData\Anaconda\lib\site-packages\requests\sessions.py", line 462, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "C:\ProgramData\Anaconda\lib\site-packages\requests\models.py", line 316, in prepare
self.prepare_body(data, files, json)
File "C:\ProgramData\Anaconda\lib\site-packages\requests\models.py", line 504, in prepare_body
(body, content_type) = self._encode_files(files, data)
File "C:\ProgramData\Anaconda\lib\site-packages\requests\models.py", line 169, in _encode_files
body, content_type = encode_multipart_formdata(new_fields)
File "C:\ProgramData\Anaconda\lib\site-packages\urllib3\filepost.py", line 90, in encode_multipart_formdata
body.write(data)
TypeError: a bytes-like object is required, not 'dict'
P.S.:
Adding headers={'Content-Type': 'multipart/form-data'} into requests.post does not help.
The code to send a single file is:
files = {'photo': open('test/foo.png', 'rb')}
test = requests.post(
    f'https://api.telegram.org/{bot_token}/sendPhoto',
    data=data, files=files)
You're not sending an array; your files variable is a dictionary, which is exactly what the error is telling you.
Instead of adding the album from your loop to a dict key of 'media', just add it to an array and pass that into your request.
By referring to a popular package pyTelegramBotAPI, I've figured out:
import requests
import json
bot_token = '<BOT_TOKEN>'
send_to = 1234567890
data = {'chat_id': send_to}
album = [
    {'type': 'photo', 'media': 'foo.png'},
    {'type': 'photo', 'media': 'bar.png'},
]
files = {}
for i in range(len(album)):
    filename = album[i]['media']
    files[filename] = open(filename, 'rb')
    # "attach://<name>" points Telegram at the uploaded multipart field with that name
    album[i]['media'] = 'attach://' + filename
data['media'] = json.dumps(album)
At this moment,
data = {'chat_id': '345060487',
        'media': '[{"type": "photo", "media": "attach://foo.png"}, {"type": "photo", "media": "attach://bar.png"}]'
}
files = {
    'foo.png': open('test/foo.png', 'rb'),
    'bar.png': open('test/bar.png', 'rb'),
}
And finally this code will work:
test = requests.post(
    f'https://api.telegram.org/{bot_token}/sendMediaGroup',
    data=data, files=files)
Note that, just like what is done in pyTelegramBotAPI, it's much better to generate a random filename by:
import random
import string
def generate_random_filename(length=16):
    return ''.join(random.sample(string.ascii_letters, length))
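For completeness, a small usage sketch (the paths and chat id are placeholders) that combines the random names with the attach:// references:
files = {}
media = []
for path in ['test/foo.png', 'test/bar.png']:
    name = generate_random_filename()
    files[name] = open(path, 'rb')
    media.append({'type': 'photo', 'media': f'attach://{name}'})
data = {'chat_id': send_to, 'media': json.dumps(media)}
test = requests.post(
    f'https://api.telegram.org/{bot_token}/sendMediaGroup',
    data=data, files=files)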
