How can I hydrate a dataframe of tweet IDs in Python?

I have a PySpark dataframe called tweets with a column "tweet_id". I want to fetch the full tweets by their IDs and put them in a new dataframe. Can I do that using tweepy, twarc, or twython?

I recommend using Tweepy for handling tweets with Python. Here are the docs.
If you want to hydrate a tweet ID with its metadata, there is a method for that purpose: api.get_status(). As the docs describe:
Returns a single status specified by the ID parameter.
You'll find the information about the method and all the parameters available here.
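Calling api.get_status() once per ID can be slow for a whole column of IDs. As a rough sketch only, the helper below batches IDs and hands each batch to a lookup callable. The helper name hydrate_ids, the callable-based design, and the batch size of 100 (the documented limit of Twitter's v1.1 statuses/lookup endpoint) are assumptions for illustration, not something the answer above prescribes.

```python
from typing import Callable, Iterable, List


def hydrate_ids(tweet_ids: Iterable[int],
                lookup: Callable[[List[int]], list],
                batch_size: int = 100) -> list:
    """Call `lookup` on successive batches of IDs and collect the results."""
    ids = list(tweet_ids)
    hydrated = []
    for start in range(0, len(ids), batch_size):
        hydrated.extend(lookup(ids[start:start + batch_size]))
    return hydrated


# Assumed usage with Tweepy 4.x and valid credentials (not run here):
#   api = tweepy.API(auth)
#   ids = [row.tweet_id for row in tweets.select("tweet_id").collect()]
#   statuses = hydrate_ids(ids, lambda batch: api.lookup_statuses(batch))
```

Passing the lookup in as a callable keeps the batching logic independent of the client library, so the same helper could wrap twarc or twython instead.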

Related

Follow some format to get data from db and fetch to another table

I have a table called restapi information, and I have a JSON document describing the restapi format. I want to pull data from the database using SQL and load it into another table using Python, but I only need the fields that are in the restapi format, not all the records. I also want to add a unique ID column to the same table using Python. Can I do that? Should I use Django?
SELECT * FROM Customers WHERE Last_Name='Smith';
SELECT First_Name, Nickname FROM Friends WHERE Nickname LIKE '%brain%';
SELECT CustomerName, City FROM Customers;
I want to extract data from these queries to match the restapi JSON format, meaning only what is needed, and load each value and its name into a table called restapi table using Python.
After this, I want to create a column containing a unique ID for each value, and keep adding rows in case I want to add anything to the table later.
Yes, you can. No, you don't need to use Django for that!
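A minimal sketch of one way to do this without Django, using only the standard library's sqlite3 module. The JSON layout (a "fields" list), the table and column definitions, and the sample row are all hypothetical, since the question does not show them.

```python
import json
import sqlite3

# Hypothetical JSON format: it lists the only columns we want to keep.
restapi_format = json.loads('{"fields": ["First_Name", "Nickname"]}')
fields = restapi_format["fields"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Friends (First_Name TEXT, Nickname TEXT, City TEXT)")
conn.execute("INSERT INTO Friends VALUES ('Ann', 'brainy', 'Oslo')")

# Target table: one row per (name, value) pair, with an auto-generated unique ID.
conn.execute(
    "CREATE TABLE restapi (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, value TEXT)"
)

# Column names cannot be bound as parameters, so they must come from a trusted format.
columns = ", ".join(fields)
source_rows = conn.execute(
    f"SELECT {columns} FROM Friends WHERE Nickname LIKE '%brain%'"
).fetchall()
for row in source_rows:
    conn.executemany(
        "INSERT INTO restapi (name, value) VALUES (?, ?)",
        list(zip(fields, row)),
    )
conn.commit()

rows = conn.execute("SELECT id, name, value FROM restapi ORDER BY id").fetchall()
```

Because the id column is generated by the database, rows added later keep receiving new unique IDs without extra Python code.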

Twitter API search query grouping

I am trying to use a search query that either matches where the tweet contains one of multiple hashtags or references a specific account:
"query=(%23food%20OR%20%23tastyt%20OR%20%23delicious%20OR%20%23yum)%20OR%20(%40TwitterFood)%20-is:retweet"
but for some reason this only returns tweets referencing the specific account. Is anybody able to help with this?
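One thing that may be worth trying, as a sketch rather than a confirmed fix: put every alternative inside a single outer group so that -is:retweet applies to all of them, and let the standard library do the percent-encoding. The source does not confirm that this resolves the behaviour described.

```python
from urllib.parse import quote, unquote

hashtags = ["#food", "#tastyt", "#delicious", "#yum"]
account = "@TwitterFood"

# One outer group holds every alternative; the negation sits outside it.
alternatives = " OR ".join(hashtags + [account])
query = f"({alternatives}) -is:retweet"

# Percent-encode everything, including '#', '@', ':' and the parentheses.
encoded = quote(query, safe="")
url_param = f"query={encoded}"
```

Decoding with unquote(encoded) gives back the readable query, which makes it easier to check the grouping before sending the request.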

User IDs returned from Tweepy are huge numbers that don't link to any accounts

I have been downloading tonnes of tweets for the last few weeks. In order to reduce download time, I only saved tweet user ids not the user account. I need to pass them through a bot check but have now realised that 90% of the user ids are huge numbers (e.g. 1.25103113308656E+018) and cannot be used to search for the account.
Is there a way to convert these back to an account number?
Notes:
The tweet_id column is an equally huge, different number meaning they haven't been read into the wrong column.
When I raise them from the e notation into their raw number it still doesn't work.
I am limited by the week window of the twitter api so I must find a way of linking the data I have already got to individual accounts. This work is for a charitable cause and your help would be greatly appreciated.
The Tweepy API call returns a Response which contains the data in the _json field. You can parse the user key of that JSON, extract the ID and the screen name of the user, and store them.
Then you can query the Tweepy API again, as per their docs, to get the user information.
Note that when you store the ID field, you have to cast it to a string so that every digit of the ID is preserved.
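To illustrate why the string cast matters, here is a small sketch. The 19-digit ID is made up for the example. A value such as 1.25103113308656E+018 has passed through a floating-point representation, which holds only 53 bits of precision, so the low digits of the original ID are gone.

```python
import csv
import io

user_id = "1251031133086560123"  # hypothetical 19-digit ID, kept as text

# A float has 53 bits of precision, so a round trip changes the low digits.
round_tripped = int(float(user_id))
lost_precision = round_tripped != int(user_id)

# Writing and reading the ID as a string keeps every digit.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["user_id"])
writer.writerow([user_id])
buffer.seek(0)
restored = list(csv.DictReader(buffer))[0]["user_id"]
```

Here lost_precision is True while restored still equals the original text, which is why IDs already saved in e notation generally cannot be expanded back into valid account IDs.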

Dialogflow - Retrieve data between time periods from a Google Sheet

I use an API named Sheetsu to retrieve data from a Googlesheet.
Here is an example of Sheetsu in Python that retrieves data depending on the date parameter:
data = client.search(sheet="Session", date="03-06-2018")
This code lets me retrieve every row from my sheet called Session where the date column equals 03-06-2018.
What I can't manage to do is retrieve, given a time value like 16:30:00, every row where 16:30:00 falls between two datetimes.
So I would like to know whether it is possible to retrieve this data with Sheetsu, whether I should use another API, or whether I could use a library like datetime to filter the data returned by Sheetsu.
Google Sheets API is available and there's a Dialogflow-Google Sheets sample up on Github I'd recommend taking a look at to get started. You'll need to ensure that your service account client_email has permission to access that specific spreadsheetId of interest. This sample goes into the necessary auth steps but take a look at the Sheets documentation as well.
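Whichever API fetches the rows, the range check itself can be done client-side with the datetime module. The sketch below assumes the rows arrive as a list of dictionaries and that each row has "start" and "end" cells in day-month-year format; those column names, the format string, and the sample rows are hypothetical.

```python
from datetime import datetime

FMT = "%d-%m-%Y %H:%M:%S"  # assumed cell format


def rows_containing(rows, moment, start_key="start", end_key="end"):
    """Return the rows whose [start, end] interval contains `moment`."""
    matches = []
    for row in rows:
        start = datetime.strptime(row[start_key], FMT)
        end = datetime.strptime(row[end_key], FMT)
        if start <= moment <= end:
            matches.append(row)
    return matches


sample = [
    {"name": "A", "start": "03-06-2018 16:00:00", "end": "03-06-2018 17:00:00"},
    {"name": "B", "start": "03-06-2018 18:00:00", "end": "03-06-2018 19:00:00"},
]
moment = datetime.strptime("03-06-2018 16:30:00", FMT)
found = rows_containing(sample, moment)
```

Narrowing the rows by date on the server first and then applying this filter keeps the amount of data pulled into Python small.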

Salesforce SOQL for retrieving cases

I am using Python's simple-salesforce to fetch information from Salesforce. My requirement is to pull data about cases in queues set up by my organisation. I am unable to gather information from queues because I don't know how to access them using SOQL. Is there any online resource that might help me get there?
You can query queues from the Group object.
Select Id from Group where Type = 'Queue'
For a specific queue you can use:
Select Id from Group where Type = 'Queue' AND NAME = 'QueueName'
You can find all available fields for the Group object here: https://developer.salesforce.com/docs/atlas.en-us.api.meta/api/sforce_api_objects_group.htm
You should check out workbench.developerforce.com which lets you write SOQL queries and run them against SF objects.
So write your query there first, then replicate it in Python.
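Replicating this in Python could look roughly like the sketch below: look up the queue's Id in Group, then select the cases whose OwnerId matches. The helper names, the selected Case fields, and the two-step approach are assumptions for illustration. The query argument is any callable that takes a SOQL string and returns a dictionary with a "records" list, which is the shape simple-salesforce's query_all returns.

```python
def soql_quote(value: str) -> str:
    """Escape backslashes and single quotes for a SOQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def cases_in_queue(query, queue_name):
    """Return Case records owned by the queue(s) named `queue_name`."""
    groups = query(
        "SELECT Id FROM Group WHERE Type = 'Queue' AND Name = " + soql_quote(queue_name)
    )["records"]
    if not groups:
        return []
    owner_ids = ", ".join(soql_quote(g["Id"]) for g in groups)
    return query(
        f"SELECT Id, CaseNumber, Subject FROM Case WHERE OwnerId IN ({owner_ids})"
    )["records"]


# Assumed usage with simple-salesforce and valid credentials (not run here):
#   sf = Salesforce(username=..., password=..., security_token=...)
#   cases = cases_in_queue(sf.query_all, "QueueName")
```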
