Just that: Is there a way to copy an Azure table (with a SAS connection) to a database on Microsoft SQL Server? Would it be possible with Python?
Thank you all!
I've tried SSIS in Visual Studio 2019 with no success.
You can use **Azure Data Factory** or Azure Synapse to copy the data from Azure Table storage to an Azure SQL database. Refer to the MS document Introduction to Azure Data Factory - Azure Data Factory | Microsoft Learn if you are new to Data Factory.
Also refer to the MS document Copy data to and from Azure Table storage - Azure Data Factory & Azure Synapse | Microsoft Learn.
I tried to reproduce this in my environment.
Linked services are created for Azure Table storage and the Azure SQL database.
In the Azure Table storage linked service, SAS URI is selected as the authentication method, and the SAS URL and token are provided.
Similarly, the linked service for the Azure SQL database is created by providing the server name, database name, username, and password.
Then a Copy activity is added, a source dataset for Table storage is created, and it is selected in the source settings.
Similarly, a sink dataset is created for the Azure SQL database.
Once the source and sink datasets are configured in the Copy activity, the pipeline is run to copy the data from Table storage to the Azure SQL database.
In this way, data can be copied from Azure Table storage with a SAS key to an Azure SQL database.
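Since the question also asks about Python, the same copy can be scripted without ADF. The sketch below is only illustrative: it assumes the `azure-data-tables` and `pyodbc` packages, a SAS token with read access to the table, and a target SQL Server table that already exists; the account name, table name, connection string, and column names are placeholders to replace with your own.

```python
import pyodbc
from azure.core.credentials import AzureSasCredential
from azure.data.tables import TableClient

# Placeholders -- replace with your own values.
TABLE_ENDPOINT = "https://<accountname>.table.core.windows.net"
TABLE_NAME = "<TableName>"
SAS_TOKEN = "<sas-token>"  # the token part of the SAS URL, without the leading '?'
SQL_CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<server>;DATABASE=<database>;UID=<user>;PWD=<password>"
)

# Read all entities from the Azure table using the SAS credential.
table_client = TableClient(
    endpoint=TABLE_ENDPOINT,
    table_name=TABLE_NAME,
    credential=AzureSasCredential(SAS_TOKEN),
)
entities = table_client.list_entities()

# Insert the entities into an existing SQL Server table with matching columns
# (here assumed to be PartitionKey, RowKey and Value).
conn = pyodbc.connect(SQL_CONN_STR)
cursor = conn.cursor()
for e in entities:
    cursor.execute(
        "INSERT INTO dbo.MyTable (PartitionKey, RowKey, [Value]) VALUES (?, ?, ?)",
        e["PartitionKey"], e["RowKey"], e.get("Value"),
    )
conn.commit()
conn.close()
```

For large tables you would want to batch the inserts (for example with `cursor.executemany` and `cursor.fast_executemany = True`), but the overall flow stays the same.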
I was trying to google how to obtain read/write usage statistics for the tables in a database with Spark SQL, but had no success.
It can be as simple as:
table1 | 3 times this month
table2 | 4 times this month
Or any other more specific statistics will do.
I'm not an owner of the TAC cluster, so I don't have detailed access to the driver logs.
Thanks.
Configure diagnostic log delivery
Log in to the Azure portal as an Owner or Contributor for the Azure Databricks workspace and click your Azure Databricks Service resource.
In the Monitoring section of the sidebar, click the Diagnostic settings tab.
Click Turn on diagnostics.
On the Diagnostic settings page, provide the following configuration:
- Name: enter a name for the logs to create.
- Archive to a storage account: select the storage account the logs should be delivered to.
Refer - https://learn.microsoft.com/en-us/azure/architecture/databricks-monitoring/application-logs
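Once the diagnostic logs start landing in the storage account, you can read them back in Databricks and aggregate them yourself with Spark SQL. The sketch below is only a starting point: the path pattern and the field names (`time`, `operationName`) are assumptions based on the usual Azure diagnostic-log layout, so check the actual schema of the JSON files delivered to your account.

```python
# Read the delivered diagnostic logs (JSON) from the storage account.
# The path below is a placeholder -- adjust it to where your logs land.
logs = spark.read.json("/mnt/databricks-diagnostic-logs/resourceId=*/y=*/m=*/d=*/*/*.json")
logs.createOrReplaceTempView("diagnostic_logs")

# Count operations per month as a rough usage statistic.
spark.sql("""
    SELECT operationName,
           substring(time, 1, 7) AS month,   -- 'yyyy-MM' from the ISO timestamp string
           count(*)              AS times
    FROM diagnostic_logs
    GROUP BY operationName, substring(time, 1, 7)
    ORDER BY month, times DESC
""").show(truncate=False)
```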
I am trying to connect to Azure Table storage from Databricks. I can't seem to find any resources that don't go through blob containers, but I have tried modifying that approach for tables.
spark.conf.set(
    "fs.azure.account.key.accountname.table.core.windows.net",
    "accountkey")
blobDirectPath = "wasbs://accountname.table.core.windows.net/TableName"
df = spark.read.parquet(blobDirectPath)
I am assuming for now that the tables are Parquet files. I am currently getting authentication errors with this code.
According to my research, Azure Databricks does not support Azure Table storage as a data source. For more details, please refer to https://docs.azuredatabricks.net/spark/latest/data-sources/index.html.
Besides, if you still want to use a table store, you can use the Azure Cosmos DB Table API, though there are some differences. For more details, please refer to https://learn.microsoft.com/en-us/azure/cosmos-db/faq#where-is-table-api-not-identical-with-azure-table-storage-behavior.
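If you only need to read the data into a DataFrame (rather than use Table storage as a native Spark data source), one workaround is to pull the entities with the Azure Tables Python SDK on the driver and convert them. This is a minimal sketch assuming the `azure-data-tables` package is installed on the cluster; the account name, key, and table name are placeholders.

```python
from azure.data.tables import TableServiceClient

# Placeholders -- replace with your own storage account details.
conn_str = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=<accountname>;"
    "AccountKey=<accountkey>;"
    "EndpointSuffix=core.windows.net"
)

service = TableServiceClient.from_connection_string(conn_str)
table_client = service.get_table_client("TableName")

# Pull the entities onto the driver and convert them to a Spark DataFrame.
# Fine for small tables; for large tables consider exporting to blob storage first.
rows = [dict(entity) for entity in table_client.list_entities()]
df = spark.createDataFrame(rows)
df.show()
```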
I'm using pydocumentdb to upload some processed data to Cosmos DB as a document on Azure with a Python script. The files come from the same source. The ingestion works well for some files, but gives the following error for files that are larger than 1000 KB:
pydocumentdb.errors.HTTPFailure: Status code: 413
"code":"RequestEntityTooLarge","message":"Message: {\"Errors\":[\"Request
size is too large\"]
I'm using the SQL API, and this is how I create the document inside a collection:
client = document_client.DocumentClient(uri, {'masterKey': cosmos_key})
... I get the Db link and Collection link ...
client.CreateDocument(collection_link, data)
How can I solve this error?
In my experience, the best practice for storing large data or files with Azure Cosmos DB is to upload the data to Azure Blob Storage (or another external store) and create an attachment with its reference, or the associated metadata, in a document on Azure Cosmos DB.
You can refer to the REST API for Attachments and achieve what you need using the pydocumentdb methods such as CreateAttachment, ReplaceAttachment, QueryAttachments, and so on.
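As a rough sketch of that pattern (the container name, blob name, connection string, and document shape below are placeholders, and it reuses the `client` and `collection_link` from the question), you could upload the raw payload to Blob Storage and keep only a small reference document in Cosmos DB:

```python
import json
from azure.storage.blob import BlobServiceClient

# Placeholders -- replace with your own storage account and container.
blob_service = BlobServiceClient.from_connection_string("<blob-connection-string>")
blob_client = blob_service.get_blob_client(container="large-payloads", blob="file-001.json")

# Upload the large payload to Blob Storage instead of embedding it in the document.
blob_client.upload_blob(json.dumps(data), overwrite=True)

# Store only a small reference document in Cosmos DB, well under the size limit.
reference_doc = {
    "id": "file-001",
    "payloadUrl": blob_client.url,   # or keep other associated metadata here
}
client.CreateDocument(collection_link, reference_doc)
```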
Hope it helps.
I need to fetch tables from a dataset using a service account in JSON format.
I have got the list of datasets from one project using the following code snippet:
client = bigquery.Client.from_service_account_json(path)
datasets = list(client.list_datasets())
Now I need the list of all tables from any particular dataset.
I do not have IAM rights.
Therefore I'm using a service account.
I do not really understand what you mean when you say you use Service Account credentials because you do not have IAM rights. But I presume that you have a JSON-encoded key for a Service Account with the right permissions to access the data that you want to retrieve.
As of writing, the latest version of the BigQuery Python Client Library google.cloud.bigquery is 0.32.0. In this version of the library, you can use the list_tables() function to list the tables in a given dataset. To do so, you will have to pass as an argument to that function the reference to a dataset. Below I share a small code example that lists the datasets (and their nested tables) for the project to which the Service Account in use has access:
from google.cloud import bigquery

client = bigquery.Client.from_service_account_json("/path/to/key.json")

for dataset in client.list_datasets():
    print(" - {}".format(dataset.dataset_id))
    dataset_ref = client.dataset(dataset.dataset_id)
    for table in client.list_tables(dataset_ref):
        print("   - {}".format(table.table_id))
The output is something like:
 - dataset1
   - table1
   - table2
 - dataset2
   - table1
You can find more information about the Dataset class and the Table class in the documentation too, as well as the general documentation page for the BigQuery Python Client Library.
I'm trying to create an AWS Lambda webservice that takes a payload with a new username / password to create a new database and user in an RDS instance.
I'd like to use Boto3 to accomplish this, but I can't seem to find any documentation for this function.
Is this possible using this setup?
Currently, the AWS SDKs for RDS (including the Boto3 SDK) do not support this, nor does the AWS CLI.
That is because creating DB users is specific to each DB engine (MySQL, Oracle, etc.).
The option you have is to run a DDL query using your respective database driver.
http://boto3.readthedocs.io/en/latest/reference/services/rds.html#RDS.Client.generate_db_auth_token documents how to create an auth token for connecting to an RDS instance, and http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html covers other setup details.
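For example, for a MySQL-compatible RDS instance, a Lambda handler could run the DDL through a driver such as PyMySQL. This is only a sketch: the host and admin credentials are placeholder environment variables (ideally read from Secrets Manager instead), and the payload field names are assumptions.

```python
import os
import pymysql

def lambda_handler(event, context):
    # New database/user details arrive in the request payload (assumed field names).
    new_db = event["database"]
    new_user = event["username"]
    new_password = event["password"]

    # Connect as an admin user; host and credentials are placeholders.
    conn = pymysql.connect(
        host=os.environ["RDS_HOST"],
        user=os.environ["RDS_ADMIN_USER"],
        password=os.environ["RDS_ADMIN_PASSWORD"],
    )
    try:
        with conn.cursor() as cur:
            # Plain DDL -- the RDS API / Boto3 has no call for this.
            # Validate new_db yourself; identifiers cannot be parameterized.
            cur.execute(f"CREATE DATABASE IF NOT EXISTS `{new_db}`")
            cur.execute("CREATE USER %s@'%%' IDENTIFIED BY %s", (new_user, new_password))
            cur.execute(f"GRANT ALL PRIVILEGES ON `{new_db}`.* TO %s@'%%'", (new_user,))
        conn.commit()
    finally:
        conn.close()

    return {"status": "created", "database": new_db, "user": new_user}
```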