ssh - getting metadata of many remote files - python

There's a remote file system which I can access over SSH.
I need to:
1) Scan this file system to find all files newer than a given datetime.
2) Retrieve a list of those files' names, sizes, and modification timestamps.
Some restrictions:
I can't upload a script to this remote server; I can only run commands through SSH.
There could be well over 100k files on the remote server, and this process should run at least once a minute, so the number of SSH calls should be minimal, preferably just one.
I've already managed to get (1) using this:
`touch -am -t {timestamp} /tmp/some_filename; find {path} -newer /tmp/some_filename; rm /tmp/some_filename`
I thought I could move in the direction of piping the results into "xargs ls -l" and then parsing the output to extract the size and timestamp, but then I found this article...
Also, I'm running the command from Python (i.e. it's not just a command line), so it's fine to do some post-processing on the results coming back from the SSH command.
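One possible sketch of the single-SSH-call idea, assuming GNU find is available on the remote host (its -newermt and -printf options are GNU extensions); the host, path, and cutoff below are placeholders:

import subprocess

REMOTE = "user@remote-host"        # hypothetical host
PATH = "/data"                     # hypothetical directory to scan
CUTOFF = "2016-01-01 12:00:00"     # files modified after this time

# One ssh call: print "path<TAB>size<TAB>mtime" for every matching file.
cmd = f"find {PATH} -type f -newermt '{CUTOFF}' -printf '%p\\t%s\\t%T@\\n'"
out = subprocess.check_output(["ssh", REMOTE, cmd], text=True)

files = []
for line in out.splitlines():
    name, size, mtime = line.rsplit("\t", 2)
    files.append((name, int(size), float(mtime)))

This keeps the single SSH call, avoids the temporary touch file, and skips the xargs ls -l parsing entirely.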

I suggest writing or modifying your Python script on the server side as follows:
1) When no data has been acquired in a while, acquire the initial data using the touch/find script you provided, then stat the found files to get the needed properties.
2) Then, in the Python script on the server, subscribe to inotify events to get updates.
3) When a remote client connects and needs this data, provide the latest state obtained by combining 1 and 2.
inotify is a Linux kernel API that allows you to monitor file system events on a directory in real time.
See:
https://serverfault.com/questions/30292/automatic-notification-of-new-or-changed-files-in-a-folder-or-share
http://linux.die.net/man/7/inotify
https://github.com/seb-m/pyinotify
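As a sketch of step 2, assuming pyinotify can be installed on the server and that the watched directory below is a placeholder:

import pyinotify

WATCH_PATH = "/data"  # hypothetical directory to monitor

changed = {}  # path -> latest event name, consumed by whatever serves remote clients

class Handler(pyinotify.ProcessEvent):
    def process_default(self, event):
        # Record every create/modify/delete/move under the watched tree.
        changed[event.pathname] = event.maskname

wm = pyinotify.WatchManager()
mask = pyinotify.IN_CREATE | pyinotify.IN_MODIFY | pyinotify.IN_DELETE | pyinotify.IN_MOVED_TO
wm.add_watch(WATCH_PATH, mask, rec=True, auto_add=True)

notifier = pyinotify.Notifier(wm, Handler())
notifier.loop()  # blocks; run in a thread alongside the code answering remote requests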

Related

How can I receive the output from a remote (ssh) tail -F command on my local machine with Python?

I am looking for a good way to have a Python script that establishes an SSH connection, runs a tail -F command on a log file, and receives each line as it is appended to the file in real time, so I can further process it and check for certain things. I have tried using tail -1 in an infinite loop, but often many lines are added almost instantaneously, so this is not practical. This is for an automation framework which currently uses Exscript for executing commands over SSH, but I haven't found anything in its docs about this, so I am open to other libraries as well.
I have looked far and wide for many hours for a solution, and nothing is up to date or exactly what I'm looking for.
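As a hedged sketch of one way to do this with just the standard library (assuming key-based SSH login; the host and log path are placeholders): let the ssh client run tail -F and read its stdout line by line.

import subprocess

HOST = "user@remote-host"         # hypothetical host
LOGFILE = "/var/log/app.log"      # hypothetical log file

proc = subprocess.Popen(
    ["ssh", HOST, f"tail -F {LOGFILE}"],
    stdout=subprocess.PIPE,
    text=True,
    bufsize=1,                    # line-buffered on our side
)

for line in proc.stdout:
    line = line.rstrip("\n")
    # further processing / checks go here
    print(line)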

Run Spyder / Python on remote server

So there are variants of this question, but none quite hits the nail on the head.
I want to run Spyder and do interactive analysis on a server. I have two servers; neither has Spyder. They both have Python (they are Linux servers), but I don't have sudo rights to install the packages I need.
In short, the use case is: open Spyder on the local machine, do something (this is where I need help) to use the server's computation power, and then return the results to the local machine.
Update:
I have set up Python with my packages on one server. Now I need to figure out the kernel name and link it to Spyder.
I'm leaving the previous version of the question up, as it is still useful.
The Docker route is a little intimidating, as is paramiko. What are my options?
(Spyder maintainer here) What you need to do is create a Spyder kernel on your remote server and connect to it through SSH. That's the only facility we provide to do what you want.
You can find the precise instructions to do that in our docs.
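Roughly, and only as a sketch (the linked docs are authoritative), the workflow looks like the following, assuming spyder-kernels is installed on the remote server and the host name below is a placeholder:

import subprocess

HOST = "user@remote-server"   # hypothetical server

# 1) Start a Spyder kernel on the remote machine; when it starts it reports the
#    name of its connection file (something like kernel-12345.json).
#    Leave this process running while you work.
kernel = subprocess.Popen(["ssh", HOST, "python -m spyder_kernels.console"])

# 2) In Spyder locally: Consoles > Connect to an existing kernel, point it at
#    that connection file, and enable the SSH option with the same host/user.
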
I did a long search for something like this in my past job, when we wanted to quickly iterate on code which had to run across many workers in a cluster. All the commercial and open source task-queue projects that I found were based on running fixed code with arbitrary inputs, rather than running arbitrary code.
I'd also be interested to see if there's something out there that I missed. But in my case, I ended up building my own solution (unfortunately not open source).
My solution was:
1) I made a Redis queue where each task consisted of a zip file with a bash setup script (for pip installs, etc), a "payload" Python script to run, and a pickle file with input data.
2) The "payload" Python script would read in the pickle file or other files contained in the zip file. It would output a file named output.zip.
3) The task worker was a Python script (running on the remote machine, listening to the Redis queue) that would unzip the file, run the bash setup script, then run the Python script. When the script exited, the worker would upload output.zip.
There were various optimizations, like the worker not running the same bash setup script twice in a row (it remembered the SHA1 hash of the most recent setup script). So, anyway, in the worst case you could do that. It was a week or two of work to set up.
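Purely as an illustration of such a worker (the author's actual system isn't open source), a minimal sketch using redis-py; the queue names, the file names inside the zip, and the result handling are invented for the example:

import hashlib
import subprocess
import zipfile

import redis

r = redis.Redis()                 # hypothetical local Redis instance
QUEUE = "tasks"                   # hypothetical queue name
last_setup_hash = None

while True:
    _, task_zip = r.blpop(QUEUE)  # blocks until a task (a zip blob) arrives
    with open("task.zip", "wb") as f:
        f.write(task_zip)
    with zipfile.ZipFile("task.zip") as z:
        z.extractall("task")

    # Skip the bash setup script if it is identical to the previous one.
    with open("task/setup.sh", "rb") as f:
        setup_hash = hashlib.sha1(f.read()).hexdigest()
    if setup_hash != last_setup_hash:
        subprocess.run(["bash", "setup.sh"], cwd="task", check=True)
        last_setup_hash = setup_hash

    # Run the payload; it is expected to write output.zip next to itself.
    subprocess.run(["python", "payload.py"], cwd="task", check=True)
    with open("task/output.zip", "rb") as f:
        r.rpush("results", f.read())
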
Edit:
A second (much more manual) option, if you just need to run on one remote machine, is to use sshfs to mount the remote filesystem locally, so you can quickly edit the files in Spyder. Then keep an ssh window open to the remote machine, and run Python from the command line to test-run the scripts on that machine. (That's my standard setup for developing Raspberry Pi programs.)
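For completeness, a rough sketch of that sshfs workflow, driven from Python here only to keep these examples in one language; the host and paths are placeholders, and sshfs/FUSE must already be installed locally:

import subprocess

# Mount the remote project directory at a local mount point.
subprocess.run(["sshfs", "user@remote-host:/home/user/project", "/home/me/remote"], check=True)

# ... edit the files under /home/me/remote in Spyder, test-run them over ssh ...

# Unmount when done.
subprocess.run(["fusermount", "-u", "/home/me/remote"], check=True)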

Recover previous Python output to terminal

I have a script, written in Python, that runs on an Ubuntu server. Today, I mistakenly closed the PuTTY window after checking that the script was running correctly.
Some useful information was printed while the script was running, and I would like to recover it.
Is there a directory, like /var/log/syslog for system logs, for Python logs?
This script takes 24 hours to run on a very costly AWS EC2 instance, and running it again is not an option.
Yes, I should have written the useful information to a log file myself from the Python script, but no, I did not do that.
Unless the script has an internal logging mechanism (e.g. using logging, as mentioned in the comments), the output will have been written to /dev/stdout or /dev/stderr respectively. In that case, if you did not capture those streams to a file for persistent storage (for example with tee), your output is lost.
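For future runs, a minimal sketch of the kind of logging setup the answer refers to; the log file name is just an example:

import logging

# Write everything the script reports to both the console and a persistent file.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
    handlers=[
        logging.StreamHandler(),            # still prints to the terminal
        logging.FileHandler("run.log"),     # survives a closed SSH session
    ],
)

logging.info("useful information worth keeping")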

Using python to ssh into multiple servers and grab files with the same postfix

I normally use a bash script to grab all the files onto the local machine and glob to process them. I'm just wondering what would be the best way to use Python (instead of another bash script) to ssh into each server and process those files.
My current program is:
import glob

for filename in glob.glob('*-err.txt'):
    with open(filename, 'rb') as input_open:
        for line in input_open:
            pass  # do something with each line
My files all have the ending -err.txt and the directories where they reside in the remote server have the same name /documents/err/. I am not able to install third party libraries as I don't have the permission.
UPDATE
I am trying not to scp the files from the server but to read them on the remote server instead.
I want to use a local Python script, running locally, to read files on the remote server.
The simplest way to do it is to use scp via paramiko to copy the files from the remote server over SSH (see How to scp in python?).
If you are not allowed to download any libraries, you can create an SSH key pair so that connecting to the server does not require a password (https://www.debian.org/devel/passwordlessssh). You can then, for each file, do:
import os
os.system('scp user@host:/path/to/file/on/remote/machine /path/to/local/file')
Note that using os.system is usually considered less portable than using a library. If you hand the script that uses system('scp ...') to someone who does not have an SSH key pair set up, they will run into problems.
Looks like you want to use a local Python script remotely. This has been answered here.
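As a hedged illustration of that approach (feeding the local script to the remote interpreter, so the globbing happens on the remote filesystem), assuming SSH keys are set up and Python 3 exists on the server; the host and script name below are placeholders:

import subprocess

HOST = "user@remote-host"   # hypothetical host

# Send the local script to the remote Python interpreter over ssh's stdin;
# "process_errs.py" stands in for the glob-based script above, whose pattern
# would need to point at /documents/err/*-err.txt on the remote side.
with open("process_errs.py", "rb") as script:
    subprocess.run(["ssh", HOST, "python3", "-"], stdin=script, check=True)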

Is it possible to use python to establish a putty ssh session and send some input?

First of all, due to company policy, Paramiko, or installing anything that requires administrative access to the local machine, is right out; otherwise I would have just done that.
All I have to work with is Python with the standard library and PuTTY.
I am attempting to automate some tedious work that involves logging into a network device (usually Cisco, occasionally Alcatel-Lucent, or Juniper), running some show commands, and saving the data. (I am planning on using some other scripts to pull data from this file, parse it, and do other things, but that should be irrelevant to the task of retrieving the data.) I know this can be done with telnet, however I need to do this via ssh.
My thought is to use PuTTY's logging ability to record the output from a session to a file. I would like to use Python to establish a PuTTY session, send scripted log-in and show commands, and then close the session. Before I set out on this crusade, does anyone know of a way to do this? The closest answers I have found all suggest using Paramiko or another Python SSH library; I am looking for a way to do this given the constraints I am under.
Ideally, the end result would be usable as a function, so that I can iterate through hundreds of devices from a list of IP addresses.
Thank you for your time and consideration.
If you can't use Paramiko and PuTTY is all you get, then the correct tool is actually not PuTTY itself but its little brother Plink; you can download it here.
Plink is the command-line tool for PuTTY, and you can have your Python script call it using os.system("plink.exe [options] username@server.com [command]").
See the man page here.
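As a hedged sketch of that, using subprocess from the standard library to capture the output directly rather than relying on PuTTY session logging; the host, credentials, and show command are placeholders, and -ssh, -batch, -l, and -pw are standard Plink options:

import subprocess

def run_show_command(host, user, password, command):
    """Run one command on a network device via Plink and return its output."""
    result = subprocess.run(
        ["plink.exe", "-ssh", "-batch",      # -batch: never prompt interactively
         "-l", user, "-pw", password,
         host, command],
        capture_output=True,
        text=True,
    )
    return result.stdout

# Iterate over a list of device IPs (placeholder values).
for ip in ["192.0.2.1", "192.0.2.2"]:
    output = run_show_command(ip, "admin", "secret", "show version")
    with open(f"{ip}-show-version.txt", "w") as f:
        f.write(output)
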
Hope it will help,
Liron
