I am very new to MongoDB. I create a database within a loop: every 2 hours I get data from some sources, create a data collection with MongoEngine, and name it after the creation time (for example 05_01_2021_17_00_30).
Now, in another Python script, I want to get the latest database. How can I access the latest database collection without knowing its name?
I found some guidance on Stack Overflow, but the code is old and no longer works. Thanks, guys.
I came up with this answer:
In mongo_setup.py: when I create a database, it is named after the time of creation, and that name is saved in a text file.
import mongoengine
import datetime

def global_init():
    nownow = datetime.datetime.now()
    Update_file_name = str(nownow.strftime("%d_%m_%Y_%H_%M_%S"))
    # For handshaking between Django and the last updated database, export the name
    # of the latest database in a text file; from there, Django will understand
    # which database is the latest
    Updated_txt = open('.\\Latest database to read for Django.txt', 'w')
    Updated_txt.write(Update_file_name)
    Updated_txt.close()
    mongoengine.register_connection(alias='core', name=Update_file_name)
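For completeness: MongoEngine models have to opt into that registered alias, so a document class that should live in the timestamped database needs a meta entry pointing at 'core'. A minimal sketch (the field names are placeholders; the collection name matches the one queried in views.py below):

import mongoengine

class AggregationClass(mongoengine.Document):
    # Placeholder fields; use whatever your collection actually stores
    payload = mongoengine.DictField()
    created_at = mongoengine.DateTimeField()

    meta = {
        'db_alias': 'core',              # must match the alias registered in global_init()
        'collection': 'aggregation__class',
    }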
In Django's views.py: we read the text file to get the latest database's name:
from pymongo import MongoClient

database_name_text_file = 'directory of the text file...'
with open(database_name_text_file, 'r') as db_name_file:
    db_name = db_name_file.read()

# MongoDB database
myclient = MongoClient(port=27017)
mydatabase = myclient[db_name]
classagg = mydatabase['aggregation__class']
database_text = classagg.find()
for i in database_text:
    ...
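An alternative that avoids the text file altogether, sketched under the assumption that every non-system database on the server follows the dd_mm_YYYY_HH_MM_SS naming scheme: list the database names and pick the newest one by parsing the timestamp back out of the name.

from datetime import datetime
from pymongo import MongoClient

myclient = MongoClient(port=27017)
# Skip MongoDB's built-in databases and keep only the timestamped ones
candidates = [name for name in myclient.list_database_names()
              if name not in ('admin', 'config', 'local')]
# Names like 05_01_2021_17_00_30 compare correctly once parsed back into datetimes
latest_name = max(candidates, key=lambda name: datetime.strptime(name, "%d_%m_%Y_%H_%M_%S"))
mydatabase = myclient[latest_name]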
I am trying to find a document in the database by id, but I get None. What am I doing wrong?
python:
card = mongo.db['grl'].find_one({'id': 448510476})
or:
card = mongo.db['grl'].find_one({'id': '448510476'})
document:
{"_id":{"$oid":"5f25b1d787fc4c34a7d9aabe"},
"id":{"$numberInt":"448510476"},"first_name":"Arc","last_name":"Fl"}
I'm not sure how you are initializing your database but try this:
from pymongo import MongoClient

client = MongoClient("mongodb://127.0.0.1:27017")
db = client.database  # selecting the database named "database"

# find one document in the collection named "collection";
# the id field is stored as $numberInt, so query with an int
card = db.collection.find_one({"id": 448510476})
print(card)
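Note that in the sample document the id field is stored as $numberInt, i.e. as an integer, so the lookup value has to be an int as well; find_one({'id': '448510476'}) with a string will return None even though the document exists.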
I am trying to get the list of tables and their last_modified_date using the BigQuery REST API.
In the BigQuery API explorer I get all the fields correctly, but when I use the API from Python code it returns 'None' for the modified date.
This is the Python code I wrote for it:
from google.cloud import bigquery

client = bigquery.Client(project='temp')
datasets = list(client.list_datasets())
for dataset in datasets:
    print dataset.dataset_id
for dataset in datasets:
    for table in dataset.list_tables():
        print table.table_id
        print table.created
        print table.modified
In this code I get the created date correctly, but the modified date is 'None' for all the tables.
Not quite sure which version of the API you are using but I suspect the latest versions do not have the method dataset.list_tables().
Still, this is one way of getting last modified field, see if this works for you (or gives you some idea on how to get this data):
from google.cloud import bigquery

client = bigquery.Client.from_service_account_json('/key.json')
dataset_list = list(client.list_datasets())
for dataset_item in dataset_list:
    dataset = client.get_dataset(dataset_item.reference)
    tables_list = list(client.list_tables(dataset))
    for table_item in tables_list:
        table = client.get_table(table_item.reference)
        print "Table {} last modified: {}".format(
            table.table_id, table.modified)
If you want to get the last modified time from only one table:
from google.cloud import bigquery

def get_last_bq_update(project, dataset, table_name):
    client = bigquery.Client.from_service_account_json('/key.json')
    table_id = f"{project}.{dataset}.{table_name}"
    table = client.get_table(table_id)
    print(table.modified)
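Hypothetical usage, with placeholder names: get_last_bq_update('my-project', 'my_dataset', 'my_table'). The modified attribute comes back as a datetime, so you can compare or format it instead of just printing it.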
I am writing a Python script to query about 60 database tables based on the current timestamp and store the results as CSV files in an S3 bucket. There are some global variables I need access to, like engine, the AWS credentials, current_time, etc. The file currently consists of 60 functions, each querying one table and then calling a function that writes to S3.
How do I organize this code better so I won't have to call these 60 functions from the main function?
More importantly, how do I organize this code following OOP? I am very new to this and any help would be greatly appreciated.
This is what my current code looks like:
import (bunch of imports)

engine = create_engine('sqlite:///bookdatabase.db', echo=False)
access_key = 'adasdasdasdasd'
access_id = 'asdasdasd'

def table_name():
    table_name = 'book'
    sql = "select * from book where modified_date < current_date"
    mn = pandas.read_sql(sql, engine)
    # write_to_s3

def another_table_name():
    # .....

# etc. etc.
Functions that do the same thing with only a single variation are a clue that those actions can really be combined into a better structure.
In your case, you are doing the same thing (querying a database and updating a bucket); the difference is that you call different databases and read different tables.
So why not create a function like this:
from sqlalchemy import create_engine

S3_ACCESS_KEY = '....'
S3_ACCESS_ID = '....'

def export_to_s3(db_configuration):
    for db, tables in db_configuration.items():
        # sqlite:/// plus the relative path to the database file
        engine = create_engine('sqlite:///{}'.format(db), echo=False)
        for table_name in tables:
            sql = "SELECT * FROM {} WHERE modified_date < current_date".format(table_name)
            cursor = engine.raw_connection().cursor()
            cursor.execute(sql)
            for result in cursor:
                # push result to s3
                ...

db_table_names = {'bookdatabase.db': ['book'],
                  'another.db': ['fruits', 'planets']}
export_to_s3(db_table_names)
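The "# push result to s3" part is left open above. A minimal sketch of the write_to_s3 helper the question mentions, using boto3 (the bucket name and key layout are assumptions), could be called with cursor.fetchall() and the table name:

import csv
import io
import boto3

s3 = boto3.client('s3',
                  aws_access_key_id=S3_ACCESS_ID,
                  aws_secret_access_key=S3_ACCESS_KEY)

def write_to_s3(rows, table_name, bucket='my-export-bucket'):
    # Serialize the rows to CSV in memory and upload them under the table's name
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    s3.put_object(Bucket=bucket,
                  Key='exports/{}.csv'.format(table_name),
                  Body=buffer.getvalue())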
My problem is that SQLAlchemy seems to write text to my Oracle database without the proper encoding.
I include fragments of the code below:
engine = create_engine("oracle://%s:%s#%s:%s/%s?charset=utf8"%(db_username, db_password, db_hostname,db_port, db_database), encoding='utf8')
connection = engine.connect()
session = Session(bind = connection)
class MyClass(DeclarativeBase):
    """
    Model to be persisted
    """
    __tablename__ = "enconding_test"
    id = Column(Integer, Sequence('encoding_test_id_seq'), primary_key=True)
    blabla = Column(String(255, collation='utf-8'), default='')
    autoload = True

content = unicode("äüößqwerty", "utf_8")
t = MyClass(blabla=content.encode("utf_8"))
session.add(t)
session.commit()
If I now read the contents of the database, I get something like:
????????qwerty
instead of the original:
äüößqwerty
So basically my question is what do I have to do, to properly store these German characters in the database?
Thanks in advance!
I found a related topic that actually answers my question:
Python 2.7 connection to oracle loosing polish characters
You simply add the following line, before creating the database connection:
os.environ["NLS_LANG"] = "GERMAN_GERMANY.UTF8"
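Putting that together with the connection code from the question, a minimal sketch (the credentials are placeholders) looks like this; the important part is that NLS_LANG is set before the database connection is created:

import os
from sqlalchemy import create_engine

# NLS_LANG has to be in the environment before the connection is created
os.environ["NLS_LANG"] = "GERMAN_GERMANY.UTF8"

engine = create_engine("oracle://%s:%s@%s:%s/%s" % ("user", "password", "localhost", "1521", "orcl"),
                       encoding='utf8')
connection = engine.connect()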
Additional documentation about which strings you need for different languages can be found on the Oracle website:
Oracle documentation on Unicode Support
I have created a Django app and tested the application's performance by populating some 10,000 records. Now I want to delete them using a Python script. Can somebody help me do this? This is the script I created to populate data into the SQL db.
def dumpdata():
    for i in range(2, 10):
        userName = "Bryan"
        designation = 'Technician'
        employeeID = 2312
        dateOfJoin = '2009-10-10'
        EmployeeDetails(userName="Bryan", designation='Technician', employeeID=2312, dateOfJoin='2009-10-10').save()

dumpdata()
Use QuerySet.delete():
EmployeeDetails.objects.filter(...).delete()
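A clean-up script mirroring dumpdata() (a sketch; it assumes you want to remove every EmployeeDetails row the test created) would then be:

def cleandata():
    # Deletes all EmployeeDetails records; narrow it down with .filter(...) if you
    # only want to drop the test rows
    EmployeeDetails.objects.all().delete()

cleandata()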