I need a solution to run Python (.py) scripts from Oracle (PL/SQL). Is there any way to do this?
For example: I have a Python script that sends Gmail and creates an Excel spreadsheet from an Oracle database. But I have to call it from Oracle, and I also have to pass it parameters from Oracle.
DBMS_SCHEDULER might be of use.
First, create a shell script that acts as a wrapper for your Python script.
Then create the job.
begin
  dbms_scheduler.create_job
  (
    job_name   => 'PYEXCEL',
    job_type   => 'EXECUTABLE',
    job_action => '/the_path/the_py_script_wrapper.ksh',
    enabled    => TRUE,
    auto_drop  => FALSE, -- keep the job around so it can be re-run on demand
    comments   => 'Call Python stuff'
  );
end;
/
Note that jobs can be configured with arguments (via DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE) in case your script needs them.
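The scheduler hands those job arguments to the executable as ordinary command-line arguments, so the Python end can be as simple as this sketch (the parameter name is made up for illustration):

import sys

def main(argv):
    # Hypothetical script behind the shell wrapper: job arguments defined
    # via DBMS_SCHEDULER arrive as plain command-line arguments.
    recipient = argv[1] if len(argv) > 1 else 'nobody@example.com'  # made-up parameter
    print('Would mail the generated spreadsheet to %s' % recipient)

if __name__ == '__main__':
    main(sys.argv)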
Then run:
BEGIN
  DBMS_SCHEDULER.RUN_JOB(
    JOB_NAME            => 'PYEXCEL',
    USE_CURRENT_SESSION => FALSE);
END;
/
This is the 'purest' PL/SQL-only way, I think.
TenG's method is the easiest path to what you are looking for, but another option is the OS_COMMAND package:
http://plsqlexecoscomm.sourceforge.net/plsqldoc/os_command.html
This (very short) answer to a similar question offers one more possible solution, using the Jython interpreter. Maybe you will find other answers in that thread helpful too.
It is possible to call a Java method inside your PL/SQL that will load and run your .py script using a Jython interpreter. It lacks an example, though.
This article provides an example of how to run Python code from Java using the ProcessBuilder class and any Python interpreter installed on the system.
Here is another example of ProcessBuilder being used to run Python from Java, which in turn can be called from PL/SQL.
Hope this helps.
Generally the database is isolated from the OS for security reasons. There are a couple of workarounds (*):
One is to write an external procedure which calls C code at the OS level.
The other is to write a Java stored procedure which mimics an OS host command and runs a shell script. Find out more.
I think the second option is better for your purposes. In either case you will need to persuade your DBA / security team to allow the granting of the required privileges.
Alternatively, Oracle has the built-in package UTL_MAIL to send email from PL/SQL, and there are third-party PL/SQL libraries which allow us to generate Excel spreadsheets from inside the database. These may be more suitable to your situation (depending on how much you need to re-use your Python code).
The other alternative is to drive the whole thing from Python programs, and just connect to the database to get the data you need.
(*) For completeness, there is a third way to execute OS shell scripts from the database. We can attach pre-processor scripts to external tables which get run whenever we select from the external table. Find out more. But I don't think external tables are relevant in this scenario. And of course external tables also need the granting of OS privileges to the database, so it doesn't avoid that conversation with your DBA / security team.
I have a question and hope someone can point me in the right direction. Basically, every week I have to run a query (in SSMS) to get a table containing some information (date, clientnumber, clientID, orderid, etc.), and then I copy all the information in that table and paste it into a folder as a CSV file. It takes me about 15 minutes to do all this, but I keep thinking: can I automate it, and if so, how? Can I also schedule it so it runs by itself every week? We live in a technological era and this should be done without human input, so I hope I can find someone here willing to show me how to do it using Python.
Many thanks for considering my request.
This should be pretty simple to automate:
Use a database adapter that works with your database; for MSSQL, the one provided by pyodbc will be fine (see the sketch below),
Within the script, connect to the database, perform the query, and parse the output,
Save the parsed output to a .csv file (you can use Python's built-in csv module),
Run the script as a periodic task using cron/schtasks if you work on Linux/Windows respectively.
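For illustration, here is a minimal sketch of such a script, assuming pyodbc; the connection string, query, and file names are placeholders:

import csv
import pyodbc  # third-party: pip install pyodbc

# Placeholder connection string and query; substitute your own.
CONN_STR = ('DRIVER={ODBC Driver 17 for SQL Server};'
            'SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;')
QUERY = 'SELECT date, clientnumber, clientID, orderid FROM weekly_orders'

conn = pyodbc.connect(CONN_STR)
cursor = conn.cursor()
cursor.execute(QUERY)
with open('weekly_report.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow([column[0] for column in cursor.description])  # header row
    writer.writerows(cursor.fetchall())
conn.close()

Point cron (Linux) or schtasks (Windows) at the script and the weekly run takes care of itself.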
Please note that your question is too broad, and shows no research effort.
You will find that Python can do the tasks you desire.
There are many different ways to interact with SQL servers, depending on your implementation. I suggest you learn Python+SQL using the built-in sqlite3 library. You will want to save your query as a string and pass it into an SQL connection manager of your choice; which one depends on your server setup, as there are many different SQL packages for Python.
You can use pandas for parsing the data and saving it to a .csv file (the method is literally called to_csv).
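As a rough sketch of that pandas route (assuming a local SQLite database; the file and table names are made up):

import sqlite3
import pandas as pd  # third-party: pip install pandas

conn = sqlite3.connect('orders.db')                   # made-up database file
df = pd.read_sql_query('SELECT * FROM orders', conn)  # made-up table
df.to_csv('orders.csv', index=False)                  # skip the row-number column
conn.close()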
Python does have many libraries for scheduling tasks, but I suggest you hold off for a while. Develop your code so that it can be run manually, which will still be much faster/easier than doing the job without Python. Once you know your code works, you can easily add a scheduler. The downside is that your program will always need to be running, and you will need to keep checking that it is running. Personally, I would keep to running the script manually; you could compile it to an .exe and bind it to a hotkey if you need the accessibility.
I would like to be able to run a Python script only at the end of a specific dbt model.
My idea is to use the post_hook parameter of that specific model's config() function.
Is there a way to do this?
You cannot do this today. dbt does not provide a Python runtime.
Depending on how you deploy dbt, you could use fal for this (either open source or cloud): https://fal.ai/, or another (heavier) orchestrator, like Airflow, Dagster, or Prefect.
You should also know that there is an active Discussion about External Nodes and/or executable exposures that would solve for this use case: https://github.com/dbt-labs/dbt-core/discussions/5073
dbt is also planning to release Python-language models in the near future, but that is unlikely to solve this use case; that Python will be executed in your warehouse environment, and may or may not be able to make arbitrary web requests (e.g., Snowpark is really just dataframe-style Python that gets transpiled to SQL).
I have downloaded and installed the Perforce API for Python.
I'm able to run the examples on this page:
http://www.perforce.com/perforce/doc.current/manuals/p4script/03_python.html#1127434
But unfortunately the documentation seems incomplete. For example, the P4 class has a method called run_sync, but it's not documented anywhere (in fact, it doesn't even show up if you run dir(p4) in the Python interactive interpreter, despite the fact that you can use the method just fine in the interactive interpreter.)
So I'm struggling with figuring out how to use the API for anything beyond the trivial examples on the page I linked to above.
I would like to write a script which simply downloads the latest revision of a subdirectory to the filesystem of the computer running it and does nothing else. I don't want the server to change in any way, and I don't want any indication that the files came from Perforce (whereas if you get the files via the Perforce application, it marks the files in your filesystem as read-only until you check them out or whatever). That's silly - I just need to pull down a snapshot of what the subdirectory looked like at the moment the script was run.
The Python API follows the same basic structure as the command line client (both are very thin wrappers over the same underlying API), so you'll want to look at the command line client documentation; for example, look at "p4 sync" to understand how "run_sync" in P4Python works:
http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_sync.html
For the task you're describing I would do the following (I'll describe it in terms of Perforce commands since my Python is a little rusty; once you know what commands you're running it should be pretty simple to translate into Python, since the P4Python doc has examples of things like creating and modifying a client spec, which is the hardest part):
1) Create a client that maps the desired depot directory to the desired local filesystem location, e.g. if you want the directory "//depot/foo/..." downloaded to "/usr/team/foo" you'd make a client that looks like:
Client: mytempclient123847
Root: /usr/team/foo
View:
//depot/foo/... //mytempclient123847/...
You should set the "allwrite" option on the client since you said you don't want the synced files to be read-only:
Options: allwrite noclobber nocompress unlocked nomodtime rmdir
2) Sync, using the "-p" option to minimize server impact (the server will not record that you "have" the files).
3) Delete the client.
(I'm omitting some details like making sure that you're authenticated correctly -- that's a whole other potential challenge depending on your server's security and whether it's using external authentication, but it sounds like that's not the part you're having trouble with.)
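Translated into P4Python, the whole recipe might look roughly like the following sketch (untested; the client name and paths are the placeholder values from above):

from P4 import P4  # P4Python

p4 = P4()
p4.connect()
try:
    # 1) Create a throwaway client mapping the depot path to the local path.
    client = p4.fetch_client('mytempclient123847')
    client['Root'] = '/usr/team/foo'
    client['Options'] = 'allwrite noclobber nocompress unlocked nomodtime rmdir'
    client['View'] = ['//depot/foo/... //mytempclient123847/...']
    p4.save_client(client)
    p4.client = 'mytempclient123847'

    # 2) Sync with -p so the server doesn't record that you "have" the files.
    p4.run_sync('-p', '//depot/foo/...')

    # 3) Delete the client.
    p4.delete_client('mytempclient123847')
finally:
    p4.disconnect()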
What I really want to do is determine whether a particular file in the MSI exists and contains a particular string.
My current idea is to run:
import msilib

db = msilib.OpenDatabase(r'c:\Temp\myfile.msi', 1)  # raw string so backslashes aren't escapes
query = "SELECT * FROM File"
view = db.OpenView(query)
view.Execute(None)
cur_record = view.Fetch()  # repeat until I get the record I want
print(cur_record.GetString(3))  # do stuff with this value
And then if it's there, extract all the files using
msiexec /a c:\Temp\myfile.msi /qn TARGETDIR=c:\foo
and use whatever parser to see whether my string is there. But I'm hoping a less clunky way exists.
Note that, as the docs for msilib say, "Support for reading .cab files is currently not implemented". And, more generally, the library is designed for building .msi files, not reading them. There is nothing else in the stdlib that will do what you want.
So, there are a few possibilities:
Find and install another library, like pycabinet. I know nothing about this particular library; it's just the first search hit I got; you probably want to search on your own. But it claims to provide a zipfile-like API for CAB files, which sounds like exactly the part you're missing.
Use win32com (if you've got pywin32) or ctypes (if you're a masochist) to talk to the underlying COM interfaces and/or the classic Cabinet API (which I think is now deprecated, but still works); a sketch of the win32com route follows after this list.
Use IronPython instead of CPython, so you can use the simpler .NET interfaces.
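To give a feel for the win32com route, here is a rough, untested sketch that checks the File table for a given long file name through the Windows Installer automation interface (the helper function is my own invention):

import win32com.client  # pywin32

def msi_contains_file(msi_path, wanted_name):
    installer = win32com.client.Dispatch('WindowsInstaller.Installer')
    db = installer.OpenDatabase(msi_path, 0)  # 0 = open read-only
    view = db.OpenView('SELECT FileName FROM File')
    view.Execute(None)
    while True:
        record = view.Fetch()
        if record is None:
            return False
        # FileName is stored as 'short|long'; compare against the long name.
        if record.StringData(1).split('|')[-1] == wanted_name:
            return True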
Since I don't have a Windows box here, I can't test this, but here's a sketch of Christopher Painter's .NET solution written in IronPython instead of C#:
import clr
clr.AddReference('Microsoft.Deployment.WindowsInstaller')
clr.AddReference('Microsoft.Deployment.WindowsInstaller.Package')
from Microsoft.Deployment.WindowsInstaller import *
from Microsoft.Deployment.WindowsInstaller.Package import *

def FindAndExtractFiles(packagePath, longFileName):
    with InstallPackage(packagePath, DatabaseOpenMode.ReadOnly) as installPackage:
        # ICollection.Count is a property here, not the LINQ Count() extension
        if installPackage.FindFiles(longFileName).Count > 0:
            installPackage.ExtractFiles()
Realize that in using Python you have to deal with the Windows Installer (COM) Automation interface. This means you have to do all the database connections, querying and processing yourself.
If you could move to C# (or, say, PowerShell) you could leverage some higher-level classes that exist in Windows Installer XML (WiX) Deployment Tools Foundation (DTF).
using System.Linq;  // for the Count() extension method
using Microsoft.Deployment.WindowsInstaller;
using Microsoft.Deployment.WindowsInstaller.Package;

static void FindAndExtractFiles(string packagePath, string longFileName)
{
    using (var installPackage = new InstallPackage(packagePath, DatabaseOpenMode.ReadOnly))
    {
        if (installPackage.FindFiles(longFileName).Count() > 0)
            installPackage.ExtractFiles();
    }
}
You could also write this as ComVisible(True) and call it from Python.
The MSI APIs are inherently clunky, so it's only a matter of where the abstraction lies. Bear in mind that if you just need this a couple times, it may be easier to browse the cab file(s) manually in Explorer. (Files are stored by file key instead of file name).
Is it possible to use multiple languages alongside Ruby? For example, I have my application code in Ruby on Rails. I would like to calculate recommendations, and I would like to use Python for that. So essentially, the Python code would get the data from the DB, calculate all the stuff, and update the tables. Is this possible, and what do you think of the advantages/disadvantages?
Thanks
If you are offloading work to an exterior process, you may want to make this a web service (AJAX, perhaps) of some sort so that you have a consistent interface.
Otherwise, you could always execute the Python script in a subshell through Ruby, using stdin/stdout/argv, but this can get ugly quickly.
Depending on your exact needs, you can either call out to an external process (using popen, system, etc) or you can setup another mini-web-server or something along those lines and have the rails server communicate with it over HTTP with a REST-style API (or whatever best suits your needs).
In your example, you have a Ruby frontend website and a number-crunching Python backend service that builds up recommendation data for the Ruby site. A fairly nice solution is to have the Ruby site send an HTTP request to the Python service when it needs data updated (with a payload of information identifying what needs doing), and then the Python backend service can crunch away and update the table, whose changes your Ruby frontend will presumably pick up automatically on the next request and display.
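To give a feel for the Python side of that, here is a minimal sketch using only the standard library (the port and payload shape are made up):

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class RecommendationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The Rails app POSTs a JSON payload describing what to recompute.
        length = int(self.headers['Content-Length'])
        payload = json.loads(self.rfile.read(length))
        # ... crunch the numbers and update the recommendations table here ...
        self.send_response(200)  # tell the Rails side we handled it
        self.end_headers()

HTTPServer(('localhost', 8000), RecommendationHandler).serve_forever()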
I would use the system command, like so:
system("python myscript.py")
An easy, quick 'n' dirty solution, in case you have Python scripts and want to execute them from inside Rails, is this:
%x[shell commands or python path/of/pythonscript.py #{ruby variables to pass on the script}]
or
`shell commands or python path/of/pythonscript.py #{ruby variables to pass on the script}` (with a backtick at the beginning and the end).
Put the above inside a controller and it will execute.
For some reason, inside Ruby on Rails, the system and exec commands didn't work for me (exec crashed my application and system doesn't do anything).