Cannont take in multiple inputs - python

i have this code in the google apps script.
function createDocument(invoice_id,cust_name)
{
var TEMPLATE_ID = 'practice_link';
var documentId = DriveApp.getFileById(TEMPLATE_ID).makeCopy().getId();
drivedoc = DriveApp.getFileById(documentId);
drivedoc.setName("Invoice " + invoice_id);
doc = DocumentApp.openById(documentId);
var body = doc.getBody();
body.replaceText('{invoice_id}', invoice_id);
body.replaceText('{cust_name}',cust_name);
drivedoc.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.EDIT);
return "https://docs.google.com/document/d/" + documentId + "/export?format=pdf";
}
function doGet(e) {
var invoice_id = e.parameter.invoice_id;
var cust_name = e.parameter.cust_name;
var url = createDocument(invoice_id,cust_name);
return ContentService.createTextOutput(url);
}
when i use the link :
url?invoice_id=1, it runs properly
but when i try :
url?invoice_id=1&cust_name=customer
it runs but does not make changes for cust_name.
and if i try:
url?cust_name=customer
it shows the error
Exception: Invalid argument: replacement (line 12, file "Code")
after many tries i realised that it is showing this error only for when i use, cust_name=str .
please help me out here.i have many more parameters to pass and m stuck here.
i ll be using this app with python to pass on the parameters

Based on the scenarios you've mentioned, error only occurs when the invoice_id parameter is missing and adding cust_name won't affect the code. Those are hints that the URL you are trying to access is outdated or the script changes you've made is not yet deployed.
url?invoice_id=1, it runs properly
but when i try :
url?invoice_id=1&cust_name=customer
it runs but does not make changes for cust_name.
Also, Exception: Invalid argument: replacement (line 12, file "Code") indicates that the 2nd parameter of body.replaceText(pattern, value) in line 12 is undefined. This part should error too for cust_num if the deployed application is updated and the parameter for cust_num is missing.
Follow these steps on how to deploy script as web app:
At the top right of the script project, click Deploy > New deployment.
Next to "Select type," click Enable deployment types settings > Web app.
Enter the information about your web app in the fields under "Deployment configuration."
Click Deploy.
You can also use the test environments to evaluate your application and to make sure that the output of your project is correct before deploying. See Test a deployment.

Related

Add Python Script as a file to AWS SSM Document (YAML)

I'm trying to write a script for a SystemsManager Automation document and would like to keep the Python code in a seperate file so it's easy to invoke on my local machine. For complex scripts they can also be tested using a tool such as unittest.
Example YAML syntax from my SSM Automation:
mainSteps:
- name: RunTestScript
action: aws:executeScript
inputs:
Runtime: python3.8
Handler: handler
InputPayload:
Action: "Create" # This is just a random payload for now I'm only testing
Script: |-
"${script}" # My script should be injected here :)
Note: Yes, I've written the script directly in YAML and it works fine. But I'd like to achieve something similar to:
locals {
script = file("${path.module}/automations/test.py")
}
resource "aws_ssm_document" "aws_test_script" {
name = "test-script"
document_format = "YAML"
document_type = "Automation"
content = templatefile("${path.module}/automations/test.yaml", {
ssm_automation_assume_role = aws_iam_role.ssm_automation.arn
script = local.script
})
}
My console shows that yes the file is being read correctly.
Terraform plan...:
+ "def handler(event, context):
+ print(event)
+ import boto3
+ iam = boto3.client('iam')
+ response = iam.get_role(
+ RoleName='test-role'
+ )
+ print(response)"
Notice how the indentation is broken? My .py file has the correct indentation.
I suspect one of the terraform functions or YAML operators I'm using it breaking the indentation - which is very important for a language such as Python.
If I go ahead and Terraform apply I receive:
Error: Error updating SSM document: InvalidDocumentContent: YAML not well-formed. at Line: 30, Column: 1
I tried changing the last line in my YAML to be Script: "${script}" and I can Terraform Plan and Apply fine, but the Python script is on a single line and fails when executing the automation in AWS.
I've also tried using indent(4, local.script) without success.
Keen to hear/see what ideas and solutions you may have.
Thanks
I noticed my plan output had "'s around the code. So I tried a multiline string in Python """ and it continued to fail. Bearing in mind I assumed SSM was smart enough to strip quotes if it doesn't want them.
Anyway, the mistake was adding quotes around the template variable:
# Mistake
Script: |-
"${script}" # My script should be injected here :)
Solution
Script: |-
${script}
After that I decided okay now that it works, can I remove the indent() method I'm using and I got the YAML not well-formed error back.
So using:
locals {
script = indent(8, file("${path.module}/automations/test.py"))
}
resource "aws_ssm_document" "test_script" {
name = "test-script"
document_format = "YAML"
document_type = "Automation"
content = templatefile("${path.module}/automations/test.yaml", {
ssm_automation_assume_role = aws_iam_role.ssm_automation.arn
script = local.script
})
}
Works great with:
mainSteps:
- name: RunDMSRolesScript
action: aws:executeScript
inputs:
Runtime: python3.8
Handler: handler
InputPayload:
Action: "create"
Script: |-
${script}
If it helps anyone, this is what my SSM Script looks like in the AWS UI when it runs without errors. Formatted correctly and AWS seems to append " around it but turned into "\" when I provided quotes in my YAML template which would've broken the script as it's now a string literal.
"def handler(event, context):
print(event)
import boto3
iam = boto3.client('iam')
response = iam.get_role(
RoleName='test-role'
)
print(response)"
That's one way to lose 3 hours!

Using python and suds, data not read by server side because element is not defined as an array

I am a very inexperienced programmer with no formal education. Details will be extremely helpful in any responses.
I have made several basic python scripts to call SOAP APIs, but I am running into an issue with a specific API function that has an embedded array.
Here is a sample excerpt from a working XML format to show nested data:
<bomData xsi:type="urn:inputBOM" SOAP-ENC:arrayType="urn:bomItem[]">
<bomItem>
<item_partnum></item_partnum>
<item_partrev></item_partrev>
<item_serial></item_serial>
<item_lotnum></item_lotnum>
<item_sublotnum></item_sublotnum>
<item_qty></item_qty>
</bomItem>
<bomItem>
<item_partnum></item_partnum>
<item_partrev></item_partrev>
<item_serial></item_serial>
<item_lotnum></item_lotnum>
<item_sublotnum></item_sublotnum>
<item_qty></item_qty>
</bomItem>
</bomData>
I have tried 3 different things to get this to work to no avail.
I can generate the near exact XML from my script, but a key attribute missing is the 'SOAP-ENC:arrayType="urn:bomItem[]"' in the above XML example.
Option 1 was using MessagePlugin, but I get an error because my section is like the 3 element and it always injects into the first element. I have tried body[2], but this throws an error.
Option 2 I am trying to create the object(?). I read a lot of stack overflow, but I might be missing something for this.
Option 3 looked simple enough, but also failed. I tried setting the values in the JSON directly. I got these examples by an XML sample to JSON.
I have also done a several other minor things to try to get it working, but not worth mentioning. Although, if there is a way to somehow do the following, then I'm all ears:
bomItem[]: bomData = {"bomItem"[{...,...,...}]}
Here is a sample of my script:
# for python 3
# using pip install suds-py3
from suds.client import Client
from suds.plugin import MessagePlugin
# Config
#option 1: trying to set it as an array using plugin
class MyPlugin(MessagePlugin):
def marshalled(self, context):
body = context.envelope.getChild('Body')
bomItem = body[0]
bomItem.set('SOAP-ENC:arrayType', 'urn:bomItem[]')
URL = "http://localhost/application/soap?wsdl"
client = Client(URL, plugins=[MyPlugin()])
transact_info = {
"username":"",
"transaction":"",
"workorder":"",
"serial":"",
"trans_qty":"",
"seqnum":"",
"opcode":"",
"warehouseloc":"",
"warehousebin":"",
"machine_id":"",
"comment":"",
"defect_code":""
}
#WIP - trying to get bomData below working first
inputData = {
"dataItem":[
{
"fieldname": "",
"fielddata": ""
}
]
}
#option 2: trying to create the element here and define as an array
#inputbom = client.factory.create('ns3:inputBOM')
#inputbom._type = "SOAP-ENC:arrayType"
#inputbom.value = "urn:bomItem[]"
bomData = {
#Option 3: trying to set the time and array type in JSON
#"#xsi:type":"urn:inputBOM",
#"#SOAP-ENC:arrayType":"urn:bomItem[]",
"bomItem":[
{
"item_partnum":"",
"item_partrev":"",
"item_serial":"",
"item_lotnum":"",
"item_sublotnum":"",
"item_qty":""
},
{
"item_partnum":"",
"item_partrev":"",
"item_serial":"",
"item_lotnum":"",
"item_sublotnum":"",
"item_qty":""
}
]
}
try:
response = client.service.transactUnit(transact_info,inputData,bomData)
print("RESPONSE: ")
print(response)
#print(client)
#print(envelope)
except Exception as e:
#handle error here
print(e)
I appreciate any help and hope it is easy to solve.
I have found the answer I was looking for. At least a working solution.
In any case, option 1 worked out. I read up on it at the following link:
https://suds-py3.readthedocs.io/en/latest/
You can review at the '!MessagePlugin' section.
I found a solution to get message plugin working from the following post:
unmarshalling Error: For input string: ""
A user posted an example how to crawl through the XML structure and modify it.
Here is my modified example to get my script working:
#Using MessagePlugin to modify elements before sending to server
class MyPlugin(MessagePlugin):
# created method that could be reused to modify sections with similar
# structure/requirements
def addArrayType(self, dataType, arrayType, transactUnit):
# this is the code that is key to crawling through the XML - I get
# the child of each parent element until I am at the right level for
# modification
data = transactUnit.getChild(dataType)
if data:
data.set('SOAP-ENC:arrayType', arrayType)
def marshalled(self, context):
# Alter the envelope so that the xsd namespace is allowed
context.envelope.nsprefixes['xsd'] = 'http://www.w3.org/2001/XMLSchema'
body = context.envelope.getChild('Body')
transactUnit = body.getChild("transactUnit")
if transactUnit:
self.addArrayType('inputData', 'urn:dataItem[]', transactUnit)
self.addArrayType('bomData', 'urn:bomItem[]', transactUnit)

PRAW Errors updating to new Python Distro

So I am trying to create a bot that cross posts from a sub (r/pics) to (r/polpics) using a bit of code from u/GoldenSights. I upgraded to a new python distro and I get a ton of errors, I don't even know where to begin. Here is the code (formatting off, error lines bold):
Traceback (most recent call last):
File "C:\Users\tonyc\AppData\Local\Programs\Python\Python36-32\Lib\site-
packages\praw\subdump.py", line 84, in <module>
r = praw.Reddit(USERAGENT)
File "C:\Users\tonyc\AppData\Local\Programs\Python\Python36-32\lib\site-
packages\praw\reddit.py", line 150, in __init__
raise ClientException(required_message.format(attribute))
praw.exceptions.ClientException: Required configuration setting 'client_id'
missing.
This setting can be provided in a praw.ini file, as a keyword argument to the `Reddit` class constructor, or as an environment variable.
This seems to be related to USERAGENT setting. I don't think I have that configured right.
USERAGENT = ""
# This is a short description of what the bot does. For example
"/u/GoldenSights' Newsletter bot"
SUBREDDIT = "pics"
# This is the sub or list of subs to scan for new posts.
# For a single sub, use "sub1".
# For multiple subs, use "sub1+sub2+sub3+...".
# For all use "all"
KEYWORDS = ["It looks like this post is about US Politics."]
# Any comment containing these words will be saved.
KEYDOMAINS = []
# If non-empty, linkposts must have these strings in their URL
This is the error line:
print('Logging in')
r = praw.Reddit(USERAGENT) <--here, this is error line 84
r.set_oauth_app_info(APP_ID, APP_SECRET, APP_URI)
r.refresh_access_information(APP_REFRESH)
Also in Reddit.py :
raise ClientException(required_message.format(attribute)) <--- error
praw.exceptions.ClientException: Required configuration setting 'client_id'
missing.
This setting can be provided in a praw.ini file, as a keyword argument to
the `Reddit` class constructor, or as an environment variable.
Firstly, you're going to want to have your API credentials stored externally in your praw.ini file. This makes things a lot more secure, and looks like it might go some way to fixing your issue. Here's what a completed praw.ini file looks like, including the useragent, so try to replicate this.
[DEFAULT]
# A boolean to indicate whether or not to check for package updates.
check_for_updates=True
# Object to kind mappings
comment_kind=t1
message_kind=t4
redditor_kind=t2
submission_kind=t3
subreddit_kind=t5
# The URL prefix for OAuth-related requests.
oauth_url=https://oauth.reddit.com
# The URL prefix for regular requests.
reddit_url=https://www.reddit.com
# The URL prefix for short URLs.
short_url=https://redd.it
[appname]
client_id=IE*******T14_w
client_secret=SW***********************CLY
password=******************
username=appname
user_agent=web:appname:1.0.0 (by /u/username)
Let me know how things go after you sort this out.

Google Analytics and Python

I'm brand new at Python and I'm trying to write an extension to an app that imports GA information and parses it into MySQL. There is a shamfully sparse amount of infomation on the topic. The Google Docs only seem to have examples in JS and Java...
...I have gotten to the point where my user can authenticate into GA using SubAuth. That code is here:
import gdata.service
import gdata.analytics
from django import http
from django import shortcuts
from django.shortcuts import render_to_response
def authorize(request):
next = 'http://localhost:8000/authconfirm'
scope = 'https://www.google.com/analytics/feeds'
secure = False # set secure=True to request secure AuthSub tokens
session = False
auth_sub_url = gdata.service.GenerateAuthSubRequestUrl(next, scope, secure=secure, session=session)
return http.HttpResponseRedirect(auth_sub_url)
So, step next is getting at the data. I have found this library: (beware, UI is offensive) http://gdata-python-client.googlecode.com/svn/trunk/pydocs/gdata.analytics.html
However, I have found it difficult to navigate. It seems like I should be gdata.analytics.AnalyticsDataEntry.getDataEntry(), but I'm not sure what it is asking me to pass it.
I would love a push in the right direction. I feel I've exhausted google looking for a working example.
Thank you!!
EDIT: I have gotten farther, but my problem still isn't solved. The below method returns data (I believe).... the error I get is: "'str' object has no attribute '_BecomeChildElement'" I believe I am returning a feed? However, I don't know how to drill into it. Is there a way for me to inspect this object?
def auth_confirm(request):
gdata_service = gdata.service.GDataService('iSample_acctSample_v1.0')
feedUri='https://www.google.com/analytics/feeds/accounts/default?max-results=50'
# request feed
feed = gdata.analytics.AnalyticsDataFeed(feedUri)
print str(feed)
Maybe this post can help out. Seems like there are not Analytics specific bindings yet, so you are working with the generic gdata.
I've been using GA for a little over a year now and since about April 2009, i have used python bindings supplied in a package called python-googleanalytics by Clint Ecker et al. So far, it works quite well.
Here's where to get it: http://github.com/clintecker/python-googleanalytics.
Install it the usual way.
To use it: First, so that you don't have to manually pass in your login credentials each time you access the API, put them in a config file like so:
[Credentials]
google_account_email = youraccount#gmail.com
google_account_password = yourpassword
Name this file '.pythongoogleanalytics' and put it in your home directory.
And from an interactive prompt type:
from googleanalytics import Connection
import datetime
connection = Connection() # pass in id & pw as strings **if** not in config file
account = connection.get_account(<*your GA profile ID goes here*>)
start_date = datetime.date(2009, 12, 01)
end_data = datetime.date(2009, 12, 13)
# account object does the work, specify what data you want w/
# 'metrics' & 'dimensions'; see 'USAGE.md' file for examples
account.get_data(start_date=start_date, end_date=end_date, metrics=['visits'])
The 'get_account' method will return a python list (in above instance, bound to the variable 'account'), which contains your data.
You need 3 files within the app. client_secrets.json, analytics.dat and google_auth.py.
Create a module Query.py within the app:
class Query(object):
def __init__(self, startdate, enddate, filter, metrics):
self.startdate = startdate.strftime('%Y-%m-%d')
self.enddate = enddate.strftime('%Y-%m-%d')
self.filter = "ga:medium=" + filter
self.metrics = metrics
Example models.py: #has the following function
import google_auth
service = googleauth.initialize_service()
def total_visit(self):
object = AnalyticsData.objects.get(utm_source=self.utm_source)
trial = Query(object.date.startdate, object.date.enddate, object.utm_source, ga:sessions")
result = service.data().ga().get(ids = 'ga:<your-profile-id>', start_date = trial.startdate, end_date = trial.enddate, filters= trial.filter, metrics = trial.metrics).execute()
total_visit = result.get('rows')
<yr save command, ColumnName.object.create(data=total_visit) goes here>

Delete all data for a kind in Google App Engine

I would like to wipe out all data for a specific kind in Google App Engine. What is the
best way to do this?
I wrote a delete script (hack), but since there is so much data is
timeout's out after a few hundred records.
I am currently deleting the entities by their key, and it seems to be faster.
from google.appengine.ext import db
class bulkdelete(webapp.RequestHandler):
def get(self):
self.response.headers['Content-Type'] = 'text/plain'
try:
while True:
q = db.GqlQuery("SELECT __key__ FROM MyModel")
assert q.count()
db.delete(q.fetch(200))
time.sleep(0.5)
except Exception, e:
self.response.out.write(repr(e)+'\n')
pass
from the terminal, I run curl -N http://...
You can now use the Datastore Admin for that: https://developers.google.com/appengine/docs/adminconsole/datastoreadmin#Deleting_Entities_in_Bulk
If I were a paranoid person, I would say Google App Engine (GAE) has not made it easy for us to remove data if we want to. I am going to skip discussion on index sizes and how they translate a 6 GB of data to 35 GB of storage (being billed for). That's another story, but they do have ways to work around that - limit number of properties to create index on (automatically generated indexes) et cetera.
The reason I decided to write this post is that I need to "nuke" all my Kinds in a sandbox. I read about it and finally came up with this code:
package com.intillium.formshnuker;
import java.io.IOException;
import java.util.ArrayList;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.labs.taskqueue.QueueFactory;
import com.google.appengine.api.labs.taskqueue.TaskOptions.Method;
import static com.google.appengine.api.labs.taskqueue.TaskOptions.Builder.url;
#SuppressWarnings("serial")
public class FormsnukerServlet extends HttpServlet {
public void doGet(final HttpServletRequest request, final HttpServletResponse response) throws IOException {
response.setContentType("text/plain");
final String kind = request.getParameter("kind");
final String passcode = request.getParameter("passcode");
if (kind == null) {
throw new NullPointerException();
}
if (passcode == null) {
throw new NullPointerException();
}
if (!passcode.equals("LONGSECRETCODE")) {
response.getWriter().println("BAD PASSCODE!");
return;
}
System.err.println("*** deleting entities form " + kind);
final long start = System.currentTimeMillis();
int deleted_count = 0;
boolean is_finished = false;
final DatastoreService dss = DatastoreServiceFactory.getDatastoreService();
while (System.currentTimeMillis() - start < 16384) {
final Query query = new Query(kind);
query.setKeysOnly();
final ArrayList<Key> keys = new ArrayList<Key>();
for (final Entity entity: dss.prepare(query).asIterable(FetchOptions.Builder.withLimit(128))) {
keys.add(entity.getKey());
}
keys.trimToSize();
if (keys.size() == 0) {
is_finished = true;
break;
}
while (System.currentTimeMillis() - start < 16384) {
try {
dss.delete(keys);
deleted_count += keys.size();
break;
} catch (Throwable ignore) {
continue;
}
}
}
System.err.println("*** deleted " + deleted_count + " entities form " + kind);
if (is_finished) {
System.err.println("*** deletion job for " + kind + " is completed.");
} else {
final int taskcount;
final String tcs = request.getParameter("taskcount");
if (tcs == null) {
taskcount = 0;
} else {
taskcount = Integer.parseInt(tcs) + 1;
}
QueueFactory.getDefaultQueue().add(
url("/formsnuker?kind=" + kind + "&passcode=LONGSECRETCODE&taskcount=" + taskcount).method(Method.GET));
System.err.println("*** deletion task # " + taskcount + " for " + kind + " is queued.");
}
response.getWriter().println("OK");
}
}
I have over 6 million records. That's a lot. I have no idea what the cost will be to delete the records (maybe more economical not to delete them). Another alternative would be to request a deletion for the entire application (sandbox). But that's not realistic in most cases.
I decided to go with smaller groups of records (in easy query). I know I could go for 500 entities, but then I started receiving very high rates of failure (re delete function).
My request from GAE team: please add a feature to delete all entities of a kind in a single transaction.
Presumably your hack was something like this:
# Deleting all messages older than "earliest_date"
q = db.GqlQuery("SELECT * FROM Message WHERE create_date < :1", earliest_date)
results = q.fetch(1000)
while results:
db.delete(results)
results = q.fetch(1000, len(results))
As you say, if there's sufficient data, you're going to hit the request timeout before it gets through all the records. You'd have to re-invoke this request multiple times from outside to ensure all the data was erased; easy enough to do, but hardly ideal.
The admin console doesn't seem to offer any help, as (from my own experience with it), it seems to only allow entities of a given type to be listed and then deleted on a page-by-page basis.
When testing, I've had to purge my database on startup to get rid of existing data.
I would infer from this that Google operates on the principle that disk is cheap, and so data is typically orphaned (indexes to redundant data replaced), rather than deleted. Given there's a fixed amount of data available to each app at the moment (0.5 GB), that's not much help for non-Google App Engine users.
Try using App Engine Console then you dont even have to deploy any special code
I've tried db.delete(results) and App Engine Console, and none of them seems to be working for me. Manually removing entries from Data Viewer (increased limit up to 200) didn't work either since I have uploaded more than 10000 entries. I ended writing this script
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import wsgiref.handlers
from mainPage import YourData #replace this with your data
class CleanTable(webapp.RequestHandler):
def get(self, param):
txt = self.request.get('table')
q = db.GqlQuery("SELECT * FROM "+txt)
results = q.fetch(10)
self.response.headers['Content-Type'] = 'text/plain'
#replace yourapp and YouData your app info below.
self.response.out.write("""
<html>
<meta HTTP-EQUIV="REFRESH" content="5; url=http://yourapp.appspot.com/cleanTable?table=YourData">
<body>""")
try:
for i in range(10):
db.delete(results)
results = q.fetch(10, len(results))
self.response.out.write("<p>10 removed</p>")
self.response.out.write("""
</body>
</html>""")
except Exception, ints:
self.response.out.write(str(inst))
def main():
application = webapp.WSGIApplication([
('/cleanTable(.*)', CleanTable),
])
wsgiref.handlers.CGIHandler().run(application)
The trick was to include redirect in html instead of using self.redirect. I'm ready to wait overnight to get rid of all the data in my table. Hopefully, GAE team will make it easier to drop tables in the future.
The official answer from Google is that you have to delete in chunks spread over multiple requests. You can use AJAX, meta refresh, or request your URL from a script until there are no entities left.
The fastest and efficient way to handle bulk delete on Datastore is by using the new mapper API announced on the latest Google I/O.
If your language of choice is Python, you just have to register your mapper in a mapreduce.yaml file and define a function like this:
from mapreduce import operation as op
def process(entity):
yield op.db.Delete(entity)
On Java you should have a look to this article that suggests a function like this:
#Override
public void map(Key key, Entity value, Context context) {
log.info("Adding key to deletion pool: " + key);
DatastoreMutationPool mutationPool = this.getAppEngineContext(context)
.getMutationPool();
mutationPool.delete(value.getKey());
}
One tip. I suggest you get to know the remote_api for these types of uses (bulk deleting, modifying, etc.). But, even with the remote api, batch size can be limited to a few hundred at a time.
Unfortunately, there's no way to easily do a bulk delete. Your best bet is to write a script that deletes a reasonable number of entries per invocation, and then call it repeatedly - for example, by having your delete script return a 302 redirect whenever there's more data to delete, then fetching it with "wget --max-redirect=10000" (or some other large number).
With django, setup url:
url(r'^Model/bdelete/$', v.bulk_delete_models, {'model':'ModelKind'}),
Setup view
def bulk_delete_models(request, model):
import time
limit = request.GET['limit'] or 200
start = time.clock()
set = db.GqlQuery("SELECT __key__ FROM %s" % model).fetch(int(limit))
count = len(set)
db.delete(set)
return HttpResponse("Deleted %s %s in %s" % (count,model,(time.clock() - start)))
Then run in powershell:
$client = new-object System.Net.WebClient
$client.DownloadString("http://your-app.com/Model/bdelete/?limit=400")
If you are using Java/JPA you can do something like this:
em = EntityManagerFactoryUtils.getTransactionalEntityManager(entityManagerFactory)
Query q = em.createQuery("delete from Table t");
int number = q.executeUpdate();
Java/JDO info can be found here: http://code.google.com/appengine/docs/java/datastore/queriesandindexes.html#Delete_By_Query
Yes you can:
Go to Datastore Admin, and then select the Entitiy type you want to delete and click Delete.
Mapreduce will take care of deleting!
On a dev server, one can cd to his app's directory then run it like this:
dev_appserver.py --clear_datastore=yes .
Doing so will start the app and clear the datastore. If you already have another instance running, the app won't be able to bind to the needed IP and therefore fail to start...and to clear your datastore.
You can use the task queues to delete chunks of say 100 objects.
Deleting objects in GAE shows how limited the Admin capabilities are in GAE. You have to work with batches on 1000 entities or less. You can use the bulkloader tool that works with csv's but the documentation does not cover java.
I am using GAE Java and my strategy for deletions involves having 2 servlets, one for doing the actually delete and another to load the task queues. When i want to do a delete, I run the queue loading servlet, it loads the queues and then GAE goes to work executing all the tasks in the queue.
How to do it:
Create a servlet that deletes a small number of objects.
Add the servlet to your task queues.
Go home or work on something else ;)
Check the datastore every so often ...
I have a datastore with about 5000 objects that i purge every week and it takes about 6 hours to clean out, so i run the task on Friday night.
I use the same technique to bulk load my data which happens to be about 5000 objects, with about a dozen properties.
This worked for me:
class ClearHandler(webapp.RequestHandler):
def get(self):
self.response.headers['Content-Type'] = 'text/plain'
q = db.GqlQuery("SELECT * FROM SomeModel")
self.response.out.write("deleting...")
db.delete(q)
Thank you all guys, I got what I need. :D
This may be useful if you have lots db models to delete, you can dispatch it in your terminal. And also, you can manage the delete list in DB_MODEL_LIST yourself.
Delete DB_1:
python bulkdel.py 10 DB_1
Delete All DB:
python bulkdel.py 11
Here is the bulkdel.py file:
import sys, os
URL = 'http://localhost:8080'
DB_MODEL_LIST = ['DB_1', 'DB_2', 'DB_3']
# Delete Model
if sys.argv[1] == '10' :
command = 'curl %s/clear_db?model=%s' % ( URL, sys.argv[2] )
os.system( command )
# Delete All DB Models
if sys.argv[1] == '11' :
for model in DB_MODEL_LIST :
command = 'curl %s/clear_db?model=%s' % ( URL, model )
os.system( command )
And here is the modified version of alexandre fiori's code.
from google.appengine.ext import db
class DBDelete( webapp.RequestHandler ):
def get( self ):
self.response.headers['Content-Type'] = 'text/plain'
db_model = self.request.get('model')
sql = 'SELECT __key__ FROM %s' % db_model
try:
while True:
q = db.GqlQuery( sql )
assert q.count()
db.delete( q.fetch(200) )
time.sleep(0.5)
except Exception, e:
self.response.out.write( repr(e)+'\n' )
pass
And of course, you should map the link to model in a file(like main.py in GAE), ;)
In case some guys like me need it in detail, here is part of main.py:
from google.appengine.ext import webapp
import utility # DBDelete was defined in utility.py
application = webapp.WSGIApplication([('/clear_db',utility.DBDelete ),('/',views.MainPage )],debug = True)
To delete all entities in a given kind in Google App Engine you only need to do as follows:
from google.cloud import datastore
query = datastore.Client().query(kind = <KIND>)
results = query.fetch()
for result in results:
datastore.Client().delete(result.key)
In javascript, the following will delete all the entries for on page:
document.getElementById("allkeys").checked=true;
checkAllEntities();
document.getElementById("delete_button").setAttribute("onclick","");
document.getElementById("delete_button").click();
given that you are on the admin-page (.../_ah/admin) with the entities you want to delete.

Categories

Resources