I am trying to use the Kaggle API. I have installed kaggle using pip and moved kaggle.json to ~/.kaggle, but I haven't been able to run kaggle from the Command Prompt: the command is not recognized. I suspect this is because I have not completed the step "ensure python binaries are on your path", but honestly I am not sure what that means. Here is the error message when I try to download a dataset:
>>> sys.version
'3.9.1 (tags/v3.9.1:1e5d33e, Dec 7 2020, 17:08:21) [MSC v.1927 64 bit (AMD64)]'
>>> import kaggle
>>> kaggle datasets list -s demographics
  File "<stdin>", line 1
    kaggle datasets list -s demographics
           ^
SyntaxError: invalid syntax
kaggle is a Python module, but installing it should also install a script with the same name, kaggle, which you can run in a console/terminal/PowerShell/cmd.exe as
kaggle datasets list -s demographics
but this is NOT code that you can run in the Python shell or in a Python script.
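If the shell does not recognize kaggle, it can help to check where pip is likely to have put the script. This is only a rough sketch using the standard library: `find_script` is a name made up for this example, and the directories it reports are the usual install locations for the current interpreter, not a guarantee that the script is there.

```python
import os
import shutil
import sysconfig


def find_script(name="kaggle"):
    """Return the PATH lookup result for `name` and the usual script directories."""
    # shutil.which returns None when the shell would not find the command either.
    on_path = shutil.which(name)
    # Default location of console scripts for this interpreter.
    candidates = [sysconfig.get_path("scripts")]
    # `pip install --user` uses a per-user scheme ("nt_user" or "posix_user").
    user_scheme = f"{os.name}_user"
    if user_scheme in sysconfig.get_scheme_names():
        candidates.append(sysconfig.get_path("scripts", user_scheme))
    return on_path, candidates


on_path, candidates = find_script("kaggle")
print("found on PATH:", on_path)
for directory in candidates:
    print("pip may have put the script in:", directory)
```

If the first line prints None, adding one of the listed directories to the PATH environment variable is probably what "ensure python binaries are on your path" refers to.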
If you find this kaggle script and open it in an editor, you can see that it imports main from kaggle.cli and runs main().
You can do the same in your own script:
import sys
from kaggle.cli import main
sys.argv += ['datasets', 'list', '-s', 'demographics']
main()
But this method prints the results directly to the screen/console; to capture the text in a variable you would need to assign your own class to sys.stdout.
Something like this:
import sys
import kaggle.cli
class Catcher():
    def __init__(self):
        self.text = ''
    def write(self, text):
        self.text += text
    def close(self):
        pass
catcher = Catcher()
old_stdout = sys.stdout # keep old stdout
sys.stdout = catcher # assign new class
sys.argv += ['datasets', 'list', '-s', 'demographics']
result = kaggle.cli.main()
sys.stdout = old_stdout # assign back old stdout (it is needed for `print()` to work correctly)
print(catcher.text)
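The standard library can do the same capture with contextlib.redirect_stdout, which also restores sys.stdout if main() raises. A minimal sketch, assuming the CLI only writes through sys.stdout; `run_cli` and `fake_main` are made-up names, and `fake_main` is just a stand-in so the example runs without kaggle installed:

```python
import io
import sys
from contextlib import redirect_stdout


def run_cli(main, args):
    """Call a CLI-style main() with temporary arguments and return what it printed."""
    old_argv = sys.argv
    # Replace the arguments instead of appending, so repeated calls do not pile up.
    sys.argv = old_argv[:1] + list(args)
    buffer = io.StringIO()
    try:
        with redirect_stdout(buffer):
            main()
    finally:
        sys.argv = old_argv  # always restore the original arguments
    return buffer.getvalue()


def fake_main():
    # Hypothetical stand-in for kaggle.cli.main: it just echoes its arguments.
    print("args:", " ".join(sys.argv[1:]))


text = run_cli(fake_main, ["datasets", "list", "-s", "demographics"])
print(text)
```

With kaggle installed you would pass kaggle.cli.main instead of fake_main; I have not checked whether it writes everything through sys.stdout.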
Digging into the source code of the kaggle script, I see you can do the same using
import kaggle.api
kaggle.api.dataset_list_cli(search='demographics')
but this also prints everything directly to the screen/console.
EDIT:
You can get the result as a list of objects, which you can then iterate over with a for loop:
import kaggle.api
result = kaggle.api.dataset_list(search='demographics')
for item in result:
    print('title:', item.title)
    print('size:', item.size)
    print('last updated:', item.lastUpdated)
    print('download count:', item.downloadCount)
    print('vote count:', item.voteCount)
    print('usability rating:', item.usabilityRating)
    print('---')
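If you would rather keep the values than print them, the items can be turned into plain dictionaries. This is only a sketch: `to_rows` is a made-up helper, the attribute names are the ones used in the loop above, and SimpleNamespace objects stand in for the real result so the example runs without kaggle or credentials.

```python
from types import SimpleNamespace

# Attribute names taken from the loop above.
FIELDS = ("title", "size", "lastUpdated", "downloadCount",
          "voteCount", "usabilityRating")


def to_rows(items, fields=FIELDS):
    """Copy the selected attributes of each item into a dict (None if missing)."""
    return [{name: getattr(item, name, None) for name in fields} for item in items]


# Stand-in data; in real use this would be kaggle.api.dataset_list(...).
sample = [SimpleNamespace(title="Demo", size="1MB", lastUpdated="2021-01-01",
                          downloadCount=10, voteCount=2, usabilityRating=0.8)]
rows = to_rows(sample)
print(rows[0]["title"], rows[0]["downloadCount"])
```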
I'm trying to change the value of "dockerversion=" in this bash script.
# Docker Variables
containerid=$(docker ps -qf "name=vaultwarden")
imageid=$(docker images -q vaultwarden/server)
dockerversion=1
---------------
# Stop/RM Image
docker stop $containerid
docker rm $containerid
docker rmi $imageid
I'm using Python, and this is what I have so far:
# Pull Portainer Version
url = 'https://github.com/dani-garcia/vaultwarden/releases/latest'
r = requests.get(url)
version = r.url.split('/')[-1]
# Pull Current Version
with open('vaultwarden-update', 'r') as vaultwarden:
    fileversion = vaultwarden.readlines()
    vcurrentversion = re.sub(r'dockerversion=', '', fileversion[14])
# Check who is higher
if version > vcurrentversion:
    with open('vaultwarden-update', '') as vaultwarden:
        for line in fileversion[14]:
            vaultwarden.write(re.sub(re.escape(vcurrentversion), version))
I basically want Python to check the GitHub releases, see if there's a change, compare that number to the bash script variable, update it within the bash script, and then run the script.
The "# Check who is higher" part won't work, as I need to keep the rest of the script file intact. I'm mainly looking for a way to change a value in the file dynamically through Python.
Any thoughts?
(this is literally my first Python script)
I took your code and changed it just enough to do what you want it to do. You would need to do some value validations, so that you could log and exit before execution comes to the final part where you rewrite your file (you don't want to open and rewrite if there is nothing to rewrite with).
import re
import requests
from distutils.version import LooseVersion
# Pull Portainer Version
url = 'https://github.com/dani-garcia/vaultwarden/releases/latest'
r = requests.get(url)
github_version = r.url.split('/')[-1]
# Pull Current Version
with open('vaultwarden-update') as vaultwarden:
    file_content = vaultwarden.read()
file_match = re.search(r'(dockerversion=([0-9.]*))', file_content)
file_version = file_match.group(2)
# Check who is higher
if LooseVersion(github_version) > LooseVersion(file_version):
    print(f'github version ({github_version}) > file version ({file_version})')
    with open('vaultwarden-update', 'w') as vaultwarden:
        new_file_content = file_content.replace(file_match.group(1), f'dockerversion={github_version}')
        vaultwarden.write(new_file_content)
Currently for your content, it outputs:
github version (1.23.0) > file version (1)
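Since distutils is deprecated and no longer ships with Python 3.12, here is a rough sketch of the compare-and-replace step using only `re` and tuples of integers. It works on the file content as a string, so it can be tried without the network or the real file. `parse_version` and `bump_version` are names I made up, and it assumes purely numeric dotted versions like 1.23.0 (no "v" prefix or suffixes).

```python
import re

# Matches the assignment line in the bash script; group 2 is the version value.
VERSION_LINE = re.compile(r"^(dockerversion=)([0-9][0-9.]*)", re.MULTILINE)


def parse_version(text):
    """Turn '1.23.0' into (1, 23, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip(".").split("."))


def bump_version(content, latest):
    """Return content with dockerversion= updated, or unchanged if not newer."""
    match = VERSION_LINE.search(content)
    if match is None:
        raise ValueError("no dockerversion= line found")
    if parse_version(latest) <= parse_version(match.group(2)):
        return content
    # Replace only the value, leaving every other character of the script alone.
    return content[:match.start(2)] + latest + content[match.end(2):]


script = "# Docker Variables\ndockerversion=1\ndocker stop $containerid\n"
print(bump_version(script, "1.23.0"))
```

Comparing tuples avoids the string-comparison trap where '1.9.0' > '1.10.0'.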
I wrote a program in Python 3.6.2 on Windows 10. I want to get the CPU serial number.
Here is my code:
def getserial():
    # Extract serial from cpuinfo file
    cpuserial = "0000000000000000"
    try:
        f = open('/proc/cpuinfo','r')
        for line in f:
            if line[0:6]=='Serial':
                cpuserial = line[10:26]
        f.close()
    except:
        cpuserial = "ERROR000000000"
    return cpuserial
print(getserial())
When I run the program, it prints: ERROR000000000.
How do I fix it?
Your code swallows every exception, so you don't see the actual error: there is no '/proc/cpuinfo' file on Windows.
I have rewritten your code like this:
def getserial():
    # Extract serial from cpuinfo file
    with open('/proc/cpuinfo','r') as f:
        for line in f:
            if line[0:6] == 'Serial':
                return line[10:26]
    return "0000000000000000"
First, I use a with statement so the file is handled by a context manager: whether or not an exception is raised, your file will be closed.
I also simplified the loop: as soon as it finds a "Serial" entry, it returns the value.
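A variant that returns None instead of raising when the file is missing (as on Windows) could look like the sketch below. The parsing is split into its own function so it can be tried on sample text; `parse_serial` and `get_serial` are names chosen for this example, and splitting on ':' instead of slicing fixed columns is my own change.

```python
import os


def parse_serial(lines):
    """Return the value of the first 'Serial' entry, or None if there is none."""
    for line in lines:
        if line.startswith("Serial"):
            # Lines look like "Serial\t\t: 00000000abcdef01"
            return line.split(":", 1)[1].strip()
    return None


def get_serial(path="/proc/cpuinfo"):
    """Read the serial from a cpuinfo-style file; None if the file is absent."""
    if not os.path.exists(path):
        return None  # e.g. on Windows there is no /proc/cpuinfo
    with open(path) as f:
        return parse_serial(f)


sample = ["processor\t: 0\n", "Serial\t\t: 00000000abcdef01\n"]
print(parse_serial(sample))
```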
EDIT
If you have python with a version >= 2.6 you can simply use
import multiprocessing
multiprocessing.cpu_count()
http://docs.python.org/library/multiprocessing.html#multiprocessing.cpu_count
EDIT2
The best solution I found to get the "cpuinfo" is with the py-cpuinfo library.
import cpuinfo
info = cpuinfo.get_cpu_info()
print(info)
But I think the "Serial" entry is not standard; I don't see it on typical systems.
I'm trying to update the script found here to work with IPython 0.13.1, and reached a standstill. The script invokes
import IPython.ipapi
ip = IPython.ipapi.get()
for var in self.magic_who_ls():
    try:
        pickle.dump(user_ns[var],fout,1)
        saved_vars.append(var)
    except:
        # An object that cannot be pickled was encountered
        print("Unable to save object: %s" % var)
I am aware that IPython.ipapi was moved to IPython.core.ipapi, expose_magic was renamed to define_magic, and magic_who_ls was renamed to who_ls, but I am not able to invoke who_ls from within the script to get the list of namespace variables. Can anyone give me a hint?
import IPython
ip = IPython.core.ipapi.get()
for var in ip.run_line_magic('who_ls', ''):
    pass  # potato
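Once you have the variable names, the pickling step from the question can be tried on any plain dict. A small sketch, independent of IPython: `split_picklable` is a made-up helper, and the dict passed in stands in for the user namespace.

```python
import pickle


def split_picklable(namespace, protocol=1):
    """Pickle each value separately; return (pickled values, skipped names)."""
    saved, skipped = {}, []
    for name, value in namespace.items():
        try:
            saved[name] = pickle.dumps(value, protocol=protocol)
        except Exception:
            # An object that cannot be pickled was encountered.
            skipped.append(name)
    return saved, skipped


# Stand-in for the user namespace: lambdas cannot be pickled.
saved, skipped = split_picklable({"n": 42, "f": lambda x: x})
print(sorted(saved), skipped)
```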
I am trying to obtain the current notebook name when running the IPython notebook. I know I can see it at the top of the notebook. What I am after is something like
currentNotebook = IPython.foo.bar.notebookname()
I need to get the name in a variable.
Adding to the previous answers:
to get the notebook name, run the following in a cell:
%%javascript
IPython.notebook.kernel.execute('nb_name = "' + IPython.notebook.notebook_name + '"')
This gives you the file name in nb_name.
Then, to get the full path, you can use the following in a separate cell:
import os
nb_full_path = os.path.join(os.getcwd(), nb_name)
I have the following, which works with IPython 2.0. I observed that the name of the notebook is stored as the value of the attribute 'data-notebook-name' in the <body> tag of the page. The idea is therefore to first ask JavaScript to retrieve the attribute; JavaScript can be invoked from a code cell thanks to the %%javascript magic. The value can then be passed to the Python kernel by executing a command that sets a Python variable. Since this variable is known to the kernel, it can be accessed in other cells as well.
%%javascript
var kernel = IPython.notebook.kernel;
var body = document.body,
    attribs = body.attributes;
var command = "theNotebook = " + "'"+attribs['data-notebook-name'].value+"'";
kernel.execute(command);
From a Python code cell
print(theNotebook)
Out[ ]: HowToGetTheNameOfTheNoteBook.ipynb
A defect of this solution is that when one changes the title (name) of a notebook, the name does not seem to be updated immediately (there is probably some kind of cache), and it is necessary to reload the notebook to get the new name.
[Edit] On reflection, a more efficient solution is to look at the input field for the notebook's name instead of the <body> tag. Looking into the source, it appears that this field has the id "notebook_name". It is then possible to read this value with document.getElementById() and follow the same approach as above. The code becomes, still using the javascript magic:
%%javascript
var kernel = IPython.notebook.kernel;
var thename = window.document.getElementById("notebook_name").innerHTML;
var command = "theNotebook = " + "'"+thename+"'";
kernel.execute(command);
Then, from an IPython cell,
In [11]: print(theNotebook)
Out [11]: HowToGetTheNameOfTheNoteBookSolBis
Contrary to the first solution, changes to the notebook's name are picked up immediately and there is no need to reload the notebook.
As already mentioned you probably aren't really supposed to be able to do this, but I did find a way. It's a flaming hack though so don't rely on this at all:
import json
import os
import urllib2
import IPython
from IPython.lib import kernel
connection_file_path = kernel.get_connection_file()
connection_file = os.path.basename(connection_file_path)
kernel_id = connection_file.split('-', 1)[1].split('.')[0]
# Updated answer with semi-solutions for both IPython 2.x and IPython < 2.x
if IPython.version_info[0] < 2:
    ## Not sure if it's even possible to get the port for the
    ## notebook app; so just using the default...
    notebooks = json.load(urllib2.urlopen('http://127.0.0.1:8888/notebooks'))
    for nb in notebooks:
        if nb['kernel_id'] == kernel_id:
            print nb['name']
            break
else:
    sessions = json.load(urllib2.urlopen('http://127.0.0.1:8888/api/sessions'))
    for sess in sessions:
        if sess['kernel']['id'] == kernel_id:
            print sess['notebook']['name']
            break
I updated my answer to include a solution that "works" in IPython 2.0 at least with a simple test. It probably isn't guaranteed to give the correct answer if there are multiple notebooks connected to the same kernel, etc.
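The kernel id extraction is the one part of this hack that can be checked without a running server. A small sketch of that step, assuming the connection file is named kernel-<id>.json as in the code above; `kernel_id_from_connection_file` is a name invented for this example and the sample path is made up.

```python
import os


def kernel_id_from_connection_file(path):
    """Extract '<id>' from a connection file path like '.../kernel-<id>.json'."""
    name = os.path.basename(path)
    stem = os.path.splitext(name)[0]              # drop the ".json" extension
    prefix, _, kernel_id = stem.partition("-")    # split at the first dash only
    if prefix != "kernel" or not kernel_id:
        raise ValueError(f"unexpected connection file name: {name!r}")
    return kernel_id


example = "/run/user/1000/jupyter/kernel-0e1a2b3c-4d5e-6f70-8192-a3b4c5d6e7f8.json"
print(kernel_id_from_connection_file(example))
```

Unlike the bare split, this raises a readable error if the file name does not have the expected shape.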
It seems I cannot comment, so I have to post this as an answer.
The accepted solution by @iguananaut and the update by @mbdevpl appear not to work with recent versions of the Notebook.
I fixed it as shown below. I checked it on Python v3.6.1 + Notebook v5.0.0 and on Python v3.6.5 and Notebook v5.5.0.
import jupyterlab
if jupyterlab.__version__.split(".")[0] == "3":
    from jupyter_server import serverapp as app
    key_srv_directory = 'root_dir'
else :
    from notebook import notebookapp as app
    key_srv_directory = 'notebook_dir'
import urllib.request  # a plain "import urllib" does not load the request submodule
import json
import os
import ipykernel
def notebook_path(key_srv_directory, ):
    """Returns the absolute path of the Notebook or None if it cannot be determined
    NOTE: works only when the security is token-based or there is also no password
    """
    connection_file = os.path.basename(ipykernel.get_connection_file())
    kernel_id = connection_file.split('-', 1)[1].split('.')[0]
    for srv in app.list_running_servers():
        try:
            if srv['token']=='' and not srv['password']: # No token and no password, ahem...
                req = urllib.request.urlopen(srv['url']+'api/sessions')
            else:
                req = urllib.request.urlopen(srv['url']+'api/sessions?token='+srv['token'])
            sessions = json.load(req)
            for sess in sessions:
                if sess['kernel']['id'] == kernel_id:
                    return os.path.join(srv[key_srv_directory],sess['notebook']['path'])
        except:
            pass # There may be stale entries in the runtime directory
    return None
As stated in the docstring, this works only when either there is no authentication or the authentication is token-based.
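To make the two steps easier to test separately from a live server, the URL building and the session matching can be pulled into small functions. This is a sketch only: `sessions_url` and `find_notebook_path` are made-up names, and the sample data just mimics the fields the code above reads (kernel id and notebook path).

```python
import os
from urllib.parse import urlencode, urljoin


def sessions_url(server_url, token=""):
    """Build the api/sessions URL, adding ?token=... only when a token is set."""
    url = urljoin(server_url, "api/sessions")
    return f"{url}?{urlencode({'token': token})}" if token else url


def find_notebook_path(sessions, kernel_id, root_dir=""):
    """Return the path of the session whose kernel id matches, else None."""
    for sess in sessions:
        if sess.get("kernel", {}).get("id") == kernel_id:
            return os.path.join(root_dir, sess["notebook"]["path"])
    return None


# Hand-written sample shaped like the fields used above.
sample_sessions = [{"kernel": {"id": "abc"}, "notebook": {"path": "work/demo.ipynb"}}]
print(sessions_url("http://localhost:8888/", "t0k"))
print(find_notebook_path(sample_sessions, "abc", "/home/u"))
```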
Note that, as also reported by others, the Javascript-based method does not seem to work when executing a "Run all cells" (but works when executing cells "manually"), which was a deal-breaker for me.
The ipyparams package can do this pretty easily.
import ipyparams
currentNotebook = ipyparams.notebook_name
On Jupyter 3.0 the following works. Here I'm showing the entire path on the Jupyter server, not just the notebook name:
To store the NOTEBOOK_FULL_PATH on the current notebook front end:
%%javascript
var nb = IPython.notebook;
var kernel = IPython.notebook.kernel;
var command = "NOTEBOOK_FULL_PATH = '" + nb.base_url + nb.notebook_path + "'";
kernel.execute(command);
To then display it:
print("NOTEBOOK_FULL_PATH:\n", NOTEBOOK_FULL_PATH)
Running the first Javascript cell produces no output.
Running the second Python cell produces something like:
NOTEBOOK_FULL_PATH:
/user/zeph/GetNotebookName.ipynb
Yet another hacky solution, since my notebook server can change. Basically you print a random string, save the notebook, and then search the working directory for a file containing that string. The while loop is needed because save_checkpoint is asynchronous.
from time import sleep
from IPython.display import display, Javascript
import subprocess
import os
import uuid
def get_notebook_path_and_save():
    magic = str(uuid.uuid1()).replace('-', '')
    print(magic)
    # saves it (ctrl+S)
    display(Javascript('IPython.notebook.save_checkpoint();'))
    nb_name = None
    while nb_name is None:
        try:
            sleep(0.1)
            nb_name = subprocess.check_output(f'grep -l {magic} *.ipynb', shell=True).decode().strip()
        except:
            pass
    return os.path.join(os.getcwd(), nb_name)
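The grep call ties this to systems that have grep. The search step could be done in plain Python instead; here is a sketch of just that part. `find_files_containing` is a name I made up, and the demo uses a temporary directory with fake notebook files rather than a real saved notebook.

```python
import tempfile
from pathlib import Path


def find_files_containing(marker, directory=".", pattern="*.ipynb"):
    """Return the files in `directory` matching `pattern` whose text contains `marker`."""
    matches = []
    for path in sorted(Path(directory).glob(pattern)):
        try:
            if marker in path.read_text(encoding="utf-8", errors="ignore"):
                matches.append(path)
        except OSError:
            continue  # unreadable file, skip it
    return matches


# Demo with two fake notebook files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.ipynb").write_text('{"cells": ["marker-123"]}', encoding="utf-8")
    (Path(tmp) / "b.ipynb").write_text('{"cells": []}', encoding="utf-8")
    found = [p.name for p in find_files_containing("marker-123", tmp)]
    print(found)
```

Inside the loop above you would keep retrying until the returned list is non-empty, since the save still happens asynchronously.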
There is no real way yet to do this in Jupyterlab. But there is an official way that's now under active discussion/development as of August 2021:
https://github.com/jupyter/jupyter_client/pull/656
In the meantime, hitting the api/sessions REST endpoint of jupyter_server seems like the best bet. Here's a cleaned-up version of that approach:
from jupyter_server import serverapp
from jupyter_server.utils import url_path_join
from pathlib import Path
import re
import requests
kernelIdRegex = re.compile(r"(?<=kernel-)[\w\d\-]+(?=\.json)")
def getNotebookPath():
    kernelId = kernelIdRegex.search(get_ipython().config["IPKernelApp"]["connection_file"])[0]
    for jupServ in serverapp.list_running_servers():
        for session in requests.get(url_path_join(jupServ["url"], "api/sessions"), params={"token": jupServ["token"]}).json():
            if kernelId == session["kernel"]["id"]:
                return Path(jupServ["root_dir"]) / session["notebook"]['path']
Tested working with
python==3.9
jupyter_server==1.8.0
jupyterlab==4.0.0a7
Modifying @jfb's method gives the function below, which worked fine on ipykernel-5.3.4.
from IPython.display import display, Javascript
def getNotebookName():
    display(Javascript('IPython.notebook.kernel.execute("NotebookName = " + "\'"+window.document.getElementById("notebook_name").innerHTML+"\'");'))
    try:
        _ = type(NotebookName)
        return NotebookName
    except:
        return None
Note that the display javascript will take some time to reach the browser, and it will take some time to execute the JS and get back to the kernel. I know it may sound stupid, but it's better to run the function in two cells, like this:
nb_name = getNotebookName()
and in the following cell:
for i in range(10):
    nb_name = getNotebookName()
    if nb_name is not None:
        break
However, if you don't need to define a function, the sensible approach is to run display(Javascript(..)) in one cell and check the notebook name in another cell. That way, the browser has enough time to execute the code and return the notebook name.
If you don't mind using a library, the most robust way is:
import ipynbname
nb_name = ipynbname.name()
If you are using Visual Studio Code:
import IPython ; IPython.extract_module_locals()[1]['__vsc_ipynb_file__']
Assuming you have the Jupyter Notebook server's host, port, and authentication token, this should work for you. It's based off of this answer.
import os
import json
import posixpath
import subprocess
import urllib.request
import psutil
def get_notebook_path(host, port, token):
    process_id = os.getpid()
    notebooks = get_running_notebooks(host, port, token)
    for notebook in notebooks:
        if process_id in notebook['process_ids']:
            return notebook['path']
def get_running_notebooks(host, port, token):
    sessions_url = posixpath.join('http://%s:%d' % (host, port), 'api', 'sessions')
    sessions_url += f'?token={token}'
    response = urllib.request.urlopen(sessions_url).read()
    res = json.loads(response)
    notebooks = [{'kernel_id': notebook['kernel']['id'],
                  'path': notebook['notebook']['path'],
                  'process_ids': get_process_ids(notebook['kernel']['id'])} for notebook in res]
    return notebooks
def get_process_ids(name):
    child = subprocess.Popen(['pgrep', '-f', name], stdout=subprocess.PIPE, shell=False)
    response = child.communicate()[0]
    return [int(pid) for pid in response.split()]
Example usage:
get_notebook_path('127.0.0.1', 17004, '344eb91bee5742a8501cc8ee84043d0af07d42e7135bed90')
To see why you can't reliably get the notebook name using these JS-based solutions, run this code and notice how long it takes for the message box to appear after Python has finished executing the cell / entire notebook:
%%javascript
function sayHello() {
    alert('Hello world!');
}
setTimeout(sayHello, 1000);
More info
JavaScript calls are asynchronous and hence not guaranteed to complete before Python starts running another cell that expects the notebook-name variable to already exist, which results in a NameError when that cell accesses the missing variable.
I suspect some upvotes on this page became locked before voters could discover that all %%javascript-based solutions ultimately don't work when the producer and consumer notebook cells are executed together (or in quick succession).
All Json based solutions fail if we execute more than one cell at a time,
because the result will not be ready until after the end of the execution
(it's not a matter of using sleep or waiting for some time; check it yourself, but remember to restart the kernel and run all cells for every test)
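Here is a toy model of why waiting does not help. It is a simplification built on one assumption: the kernel handles execute requests one at a time in arrival order, and the request sent by the browser's kernel.execute() only arrives after the cells that were already queued. All names here (`run_queue`, `producer`, `consumer`, `set_name`) are made up for the illustration; nothing in it talks to a real kernel.

```python
from collections import deque


def run_queue(cells, namespace):
    """Run queued 'cells' strictly one at a time, in arrival order."""
    queue = deque(cells)
    log = []
    while queue:
        cell = queue.popleft()
        cell(namespace, queue, log)
    return log


def set_name(namespace, queue, log):
    # What the browser's kernel.execute('nb_name = "..."') eventually runs.
    namespace["nb_name"] = "demo.ipynb"
    log.append("nb_name assigned")


def producer(namespace, queue, log):
    # The cell only displays the JavaScript; the resulting execute request
    # joins the END of the queue, behind every cell that is already waiting.
    queue.append(set_name)
    log.append("producer finished")


def consumer(namespace, queue, log):
    log.append("consumer saw: " + namespace.get("nb_name", "<NameError>"))


# "Run all": both cells are queued before the browser can answer.
print(run_queue([producer, consumer], {}))

# Cells run one by one by hand: the assignment lands before the consumer runs.
shared = {}
run_queue([producer], shared)
print(run_queue([consumer], shared))
```

In this model a sleep inside the consumer would change nothing, because the assignment is still stuck behind it in the queue.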
Based on previous solutions, this avoids using the %% magic in case you need to put it in the middle of some other code:
from IPython.display import display, Javascript
# can have comments here :)
js_cmd = 'IPython.notebook.kernel.execute(\'nb_name = "\' + IPython.notebook.notebook_name + \'"\')'
display(Javascript(js_cmd))
For Python 3, the following, based on the answer by @Iguananaut and updated for recent Python and possibly multiple servers, will work:
import os
import json
try:
    from urllib2 import urlopen
except ImportError:
    from urllib.request import urlopen
import ipykernel
connection_file_path = ipykernel.get_connection_file()
connection_file = os.path.basename(connection_file_path)
kernel_id = connection_file.split('-', 1)[1].split('.')[0]
running_servers = !jupyter notebook list
running_servers = [s.split('::')[0].strip() for s in running_servers[1:]]
nb_name = '???'
for serv in running_servers:
    uri_parts = serv.split('?')
    uri_parts[0] += 'api/sessions'
    sessions = json.load(urlopen('?'.join(uri_parts)))
    for sess in sessions:
        if sess['kernel']['id'] == kernel_id:
            nb_name = os.path.basename(sess['notebook']['path'])
            break
    if nb_name != '???':
        break
print (f'[{nb_name}]')
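The `!jupyter notebook list` line only works inside IPython, but the parsing of its output can be tried as ordinary Python. A sketch, assuming output lines of the form "URL :: directory" after a one-line header, as the code above does; `session_urls` is a made-up name and the sample lines are hand-written.

```python
def session_urls(list_output):
    """Turn lines of `jupyter notebook list` output into api/sessions URLs."""
    urls = []
    for line in list_output[1:]:              # skip the header line
        server = line.split("::")[0].strip()  # keep the URL, drop the directory
        if not server:
            continue
        base, _, query = server.partition("?")
        url = base + "api/sessions"
        urls.append(f"{url}?{query}" if query else url)
    return urls


# Hand-written sample in the assumed output format.
sample_output = [
    "Currently running servers:",
    "http://localhost:8888/?token=abc123 :: /home/user/notebooks",
    "http://localhost:8889/ :: /tmp",
]
for url in session_urls(sample_output):
    print(url)
```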
Just use ipynbname, which is practical:
import ipynbname
nb_fname = ipynbname.name()
nb_path = ipynbname.path()
print(f"{nb_fname=}")
print(f"{nb_path=}")
I found this in https://stackoverflow.com/a/65907473/15497427