I have deployed a GCP Cloud Function that updates time_created and time_updated fields in Firestore. My front-end app first creates these fields, and my function updates them after processing the documents. The snippet below generates the timestamps, and I use Firestore's update function to update the document. In a few instances the fields end up stored in Firestore as a dictionary with "seconds" and "nanoseconds" keys rather than as a Timestamp. I have been trying to track down where the issue is coming from; I suspect datetime.now() sometimes does not generate a timestamp value. Help me out if you have an idea or have seen something like this before. The attached snapshot shows an instance of the wrongly formatted date returned from Firestore to my front-end.
Documents affected have the field showing as this:
time_created: {'seconds': 1637694047.0, 'nanoseconds': 580592000.0}
from datetime import datetime
update_doc = {
u"time_created": datetime.now(),
u"time_updated": datetime.now()
}
Per @mark-tolonen, please don't include images in questions when it's trivial to copy-and-paste the text. Various reasons.
I experienced a different issue with Firestore timestamps using the Go SDK. When I read your question, I wondered if the issues were related, but I think not.
That said, you can do some diagnosis. You can, of course, emit the Python datetime.now() values to ensure you know what's being applied.
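For example, a minimal logging sketch (doc_ref is assumed to be your existing DocumentReference):

import logging
from datetime import datetime, timezone

now = datetime.now(timezone.utc)  # timezone-aware, to rule out ambiguity
logging.info("applying timestamps: %r (%s)", now, type(now))

update_doc = {
    u"time_created": now,
    u"time_updated": now,
}
doc_ref.update(update_doc)  # doc_ref: your existing DocumentReference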
You could then use the underlying REST API directly to mimic (and hopefully repro) the calls your code is making, to determine whether the error arises in the API itself, the Python SDK, or your code.
Here's projects.databases.documents.patch, which I think underlies the set operation. There's also projects.databases.documents.create. In both cases, APIs Explorer provides a way for you to try the API methods in the browser and will yield e.g. the curl equivalents for you.
NOTE
The API requires a parent parameter of the form projects/{project_id}/databases/{databaseId}/documents. Replace {project_id} with your Project ID and use (default) (with the parentheses) as the value of {databaseId}.
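For instance, here's a hedged sketch of mimicking the patch call with the requests library; the project, document path, token, and timestamp value are all placeholders:

import requests

project_id = "PROJECT_ID"       # placeholder
doc_path = "COLLECTION/DOC_ID"  # placeholder
token = "..."                   # e.g. from `gcloud auth print-access-token`

url = (
    f"https://firestore.googleapis.com/v1/projects/{project_id}"
    f"/databases/(default)/documents/{doc_path}"
)
# An RFC 3339 string maps to a Firestore timestampValue.
body = {"fields": {"time_created": {"timestampValue": "2021-11-23T19:00:47.580592Z"}}}
resp = requests.patch(
    url,
    params={"updateMask.fieldPaths": "time_created"},
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
print(resp.status_code, resp.json())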
api.get_user with Tweepy will not give description
I'm using Tweepy (4.8.0) and OAuth 2.0 (bearer_token).
I tried to load user information using get_user(...), like this:
client = tweepy.Client(
bearer_token=bearer_token
)
result = client.get_user(username="name", user_fields=['created_at'])
I expected to get the additional data requested in user_fields, but only the basic default data was returned.
Response(data=<User id=1234 name=NAME username=name>, includes={}, errors=[], meta={})
Maybe I'm missing something or made a mistake?
save me plz...
My answer to that question applies here as well.
From the relevant FAQ section in Tweepy's documentation:
Why am I not getting expansions or fields data with API v2 using Client?
If you are simply printing the objects and looking at that output, the string representations of API v2 models/objects only include the default attributes that are guaranteed to exist.
The objects themselves still include the relevant data, which you can access as attributes or by key, like a dictionary.
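For example, a minimal sketch based on the code in the question (created_at is available because it was requested in user_fields):

import tweepy

client = tweepy.Client(bearer_token=bearer_token)  # bearer_token as in the question
result = client.get_user(username="name", user_fields=["created_at"])

user = result.data
print(user.created_at)     # attribute access
print(user["created_at"])  # dictionary-style access, by key
print(user.data)           # the raw underlying data dict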
I am trying to see if we can pull a list of all Salesforce cases that have been deleted, using their API from Python.
The query below returns all Salesforce cases created, but I am trying to see how to retrieve all cases that have been deleted.
SELECT Id FROM Case
I tried the query below, but it returned no data, whereas I know there are deleted cases:
SELECT Id FROM Case where isDeleted = true
Queries that include the Recycle Bin need to be issued differently. In Apex you need to add ALL ROWS.
In the SOAP API it's queryAll vs. the normal query call; in the REST API it's a different service, also called queryAll.
If you're using simple_salesforce, it should be something like:
query = 'SELECT Id FROM Case WHERE IsDeleted = TRUE LIMIT 10'
sf.bulk.Case.query_all(query)  # Bulk API queryAll: includes Recycle Bin records
If you're using another library, you'll need to check its internals: which API it uses and whether it exposes queryAll to you.
(Remember that records purged from the Recycle Bin no longer show up in these queries; at that point your only hope is something like the Data Replication API's getDeleted().)
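For recently deleted records, a hedged sketch using simple_salesforce's wrapper around getDeleted() (treat the connection parameters as placeholders):

import datetime
import pytz
from simple_salesforce import Salesforce

sf = Salesforce(username='user@example.com', password='...', security_token='...')

# getDeleted() requires UTC timestamps and a bounded time window.
end = datetime.datetime.now(pytz.UTC)
start = end - datetime.timedelta(days=10)
deleted_cases = sf.Case.deleted(start, end)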
Using jira-python, I want to retrieve the entire changelog for a JIRA issue:
issues_returned = jira.search_issues(args.jql, expand='changelog')
I discovered that for issues with more than 100 entries in their changelog, I am only receiving the first 100.
My question is how do I specify a startAt and make another call to get subsequent pages of the changelog (using python-jira)?
From this thread at Atlassian I see that API v3 provides an endpoint to get the change log directly:
/rest/api/3/issue/{issueIdOrKey}/changelog
but this doesn't seem to be accessible via jira-python. I'd like to avoid having to do the REST call directly and authenticate separately. Barring a way to do it directly via jira-python, is there a way to make a 'raw' REST API call from jira-python?
In instances where more than 100 results are present, you'll need to edit the 'startAt' parameter when searching issues:
issues_returned = jira.search_issues(args.jql, expand='changelog', startAt=100)
You'll need to set up a statement that compares the total and maxResults values, then run another query with a different startAt parameter if the total is higher, and append the results together.
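A minimal sketch of that loop (names taken from the question; default maxResults applies):

# Page through results by advancing startAt until we've collected `total` issues.
all_issues = []
start_at = 0
while True:
    batch = jira.search_issues(args.jql, expand='changelog', startAt=start_at)
    all_issues.extend(batch)
    start_at += len(batch)
    if not batch or start_at >= batch.total:
        break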
Okay, I have watched the video and read the articles in the App Engine documentation (including Using the High Replication Datastore). However, I am still completely confused about its practical usage. I understand the benefits (from the video) and they sound great, but what I am lacking is a few practical examples. There are plenty of master/slave examples on the web, but very little illustrating (with proper documentation) the high replication datastore. The guestbook example used in the Using the High Replication Datastore article illustrates the ancestor key by adding functionality that the previous guestbook example does not have (it seems you can switch guestbooks), which just adds to the confusion.
I often use djangoforms on GAE and I was wondering if someone can help me translate all these queries into high replication datastore compatible queries (let's forget for a moment the discussion that not all queries necessarily need to be high replication datastore compatible queries and focus on the example itself).
UPDATE: by high replication datastore compatible queries I mean queries that always return the latest data, not potentially stale data. Using entity groups seems to be the way to go here but, as mentioned before, I don't have many practical code examples of how to do this, so that is what I am looking for!
The queries in this article are as follows. The main recurring query is:
query = db.GqlQuery("SELECT * FROM Item ORDER BY name")
which we will translate to:
query = Item.all().order('name')  # datastore request
validating the form happens like:
data = ItemForm(data=self.request.POST)
if data.is_valid():
    # Save the data, and redirect to the view page
    entity = data.save(commit=False)
    entity.added_by = users.get_current_user()
    entity.put()  # datastore request
and getting the latest entry from the datastore for populating a form happens like:
id = int(self.request.get('id'))
item = Item.get(db.Key.from_path('Item', id))  # datastore request
data = ItemForm(data=self.request.POST, instance=item)
So what do I/we need to do to make all these datastore requests compatible with the high replication datastore?
One last thing that is also not clear to me: do ancestor keys have any impact on the model in the datastore? For example, in the guestbook code example they use:
def guestbook_key(guestbook_name=None):
    return db.Key.from_path('Guestbook', guestbook_name or 'default_guestbook')
However, 'Guestbook' does not exist in the model, so how can you use db.Key.from_path on it, and why does this work? Does it change how data is stored in the datastore in a way I need to take into account when retrieving it (e.g. does it add another field I should exclude from display when using djangoforms)?
Like I said before, this is confusing me a lot and your help is greatly appreciated!
I'm not sure why you think you need to change your queries at all. The documentation that you link to clearly states:
The back end changes, but the datastore API does not change at all. You'll use the same programming interfaces no matter which datastore you're using.
The point of that page is just to say that queries may be out of sync if you don't use entity groups. Your final code snippet is just an example of that - the string 'Guestbook' is exactly an ancestor key. I don't understand why you think it needs to exist in the model. Once again, this is unchanged from the non-HR datastore - it has always been the case that keys are built up from paths, which can consist of arbitrary strings. You probably need to reread the documentation on entity groups and keys.
The changes to use the HRD are not in how queries are made, but in what guarantees are made about what data you get back. The example you give:
query = db.GqlQuery("SELECT * FROM Item ORDER BY name")
will work in the HRD as well. The catch (basically) is that this kind of query (using either this syntax, or the Item.all() form) can return objects slightly out-of-date. This is probably not a big deal with the guestbook.
Note that if you're getting an object by key directly, it will never be out-of-date. It's only for queries that you can see this issue. You can avoid this problem with queries by placing all the entities that need to be consistent in a single entity group. Note that this limits the rate at which you can write to the entity group.
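For example, a minimal sketch of the entity-group approach with the db API ('ItemList' is a hypothetical kind used only in the key path):

from google.appengine.ext import db

def items_key(list_name='default'):
    # 'ItemList' doesn't need to be a model class; it's just a kind in the key path.
    return db.Key.from_path('ItemList', list_name)

# Write: give every Item the same parent so they share one entity group.
item = Item(parent=items_key())
item.name = 'example'
item.put()

# Read: an ancestor query is strongly consistent on the HRD.
query = Item.all().ancestor(items_key()).order('name')

Remember that the write-rate limit mentioned above applies to this single entity group.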
In answer to your follow-up question, "Guestbook" is the name of the entity.
I'd like to start by asking for your opinion on how I should tackle this task, instead of simply how to structure my code.
Here is what I'm trying to do: I have a lot of data loaded into a MySQL table for a large number of unique names + dates (i.e., where the date is a separate field). My goal is to be able to select a particular name (using raw_input, and perhaps in the future a drop-down menu) and see a monthly trend, with a moving average and perhaps other stats, for one of the fields (revenue, revenue per month, clicks, etc.). What is your advice: move this data to an Excel workbook via Python, or is there a way to display this information in Python (with charts that rival Excel's, of course)?
Thanks!
Analyzing such (name, date) data can be seen as issuing ad-hoc SQL queries to get time-series information.
You 'sample' your information by a date/time frame (day/week/month/year, or more fine-grained by hour/minute) depending on how large your dataset is.
I often use a query where the date field is truncated to the sample rate; in MySQL the DATE_FORMAT function is handy for that (Postgres and Oracle use date_trunc and trunc respectively).
What you want to see in your data goes in your WHERE conditions.
SELECT DATE_FORMAT(date_field, '%Y-%m-%d') AS day,
       COUNT(*) AS nb_event
FROM yourtable
WHERE name = 'specific_value_to_analyze'
GROUP BY DATE_FORMAT(date_field, '%Y-%m-%d');
Execute this query and output to a CSV file. You could use the mysql command-line client directly for that, but I recommend writing a Python script that executes the query; you can use getopt options for output formatting (with or without column headers, a different separator than the default, etc.), and you can even build the query dynamically based on options.
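For example, a minimal sketch of such a script, assuming PyMySQL and placeholder connection settings, table, and column names:

import csv
import sys

import pymysql

conn = pymysql.connect(host='localhost', user='user', password='...', db='mydb')
try:
    with conn.cursor() as cur:
        # %% escapes the % used by DATE_FORMAT, since %s is the parameter marker.
        cur.execute(
            "SELECT DATE_FORMAT(date_field, '%%Y-%%m-%%d') AS day, COUNT(*) AS nb_event "
            "FROM yourtable WHERE name = %s GROUP BY day",
            ('specific_value_to_analyze',),
        )
        writer = csv.writer(sys.stdout)
        writer.writerow(['day', 'nb_event'])
        writer.writerows(cur.fetchall())
finally:
    conn.close()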
To plot such information, look at time-series tools. If you have missing data (dates that won't appear in the result of such a SQL query), you should choose carefully. Excel is not the right tool for that, I think (or I haven't mastered it enough), but it could be a start.
Personally, I find dygraphs, a JavaScript library, really nice for time-series plotting, and it can use a CSV file as its source. Be careful in such a configuration: due to cross-domain security constraints, the CSV file and the HTML page that displays the Dygraph object should be on the same server (or whatever security constraints your browser will accept).
I build such web apps using Django, as it's my favourite web framework, wrapping URL calls like this:
GET /timeserie/view/<category>/<value_to_plot>
GET /timeserie/csv/<category>/<value_to_plot>
The first URL calls a view that simply outputs a template with a variable referencing the URL for the CSV file consumed by the Dygraph object:
<script type="text/javascript">
g3 = new Dygraph(
document.getElementById("graphdiv3"),
"{{ csv_url }}",
{
rollPeriod: 15,
showRoller: true
}
);
</script>
The second URL calls a view that generates the SQL query and outputs the result as text/csv, to be rendered by Dygraph.
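A sketch of that view (run_timeserie_query is a hypothetical helper that executes the aggregation SQL above and yields (day, value) rows):

import csv

from django.http import HttpResponse

def timeserie_csv(request, category, value_to_plot):
    response = HttpResponse(content_type='text/csv')
    writer = csv.writer(response)
    writer.writerow(['day', value_to_plot])
    for day, value in run_timeserie_query(category, value_to_plot):  # hypothetical helper
        writer.writerow([day, value])
    return response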
It's "home made" could stand simple or be extended, run easily on any desktop computer, could be extended to output json format for use by others javascript libraries/framework.
There are also open-source tools for this kind of reporting (though their time-series capabilities are often not enough for my needs), like Pentaho, JasperReports, and SOFA. In such a tool you define the query as a datasource inside a report and build a graph that outputs the time series.
I find that today's web techniques, with the right JavaScript library/framework, are really starting to be good enough to challenge the old fashion of reporting with classical BI tools, and they make things interactive :-)
Your problem can be broken down into two main pieces: analyzing the data, and presenting it. I assume that you already know how to do the data analysis part, and you're wondering how to present it.
This seems like a problem that's particularly well suited to a web app. Is there a reason why you would want to avoid that?
If you're very new to web programming and programming in general, then something like web2py could be an easy way to get started. There's a simple tutorial here.
For a desktop database-heavy app, have a look at Dabo. It makes things like creating views on database tables really simple. wxPython, on which it's built, also has lots of simple graphing features.