Accessing local files using online Jupyter - python

I am using a locked down system where I cannot install any applications including Anaconda or any other python.
Does anybody know if it is possible to access local files from an online Jupyter solution? I know it would probably be slow, as the files would have to be moved back and forth.
Thanks

Yes, you can use your local files from an online Jupyter solution by doing what you describe: moving them back and forth. (The remote server cannot connect to your local system beyond the browser sandbox, so concerns like the ones Chris mentions aren't an issue.)
I can demonstrate this easily:
Go here and click on the 'launch binder' badge you see.
A temporary session backed by MyBinder.org will spin up. Depending on where you are in the world, you may land on a machine run by the Jupyter folks via Google, or on one run by another member of the federation behind this service, folks who believe it is a valuable service to offer to empower Jupyter users.
After the session comes up, you'll be in the JupyterLab interface. You'll see a file/directory navigation pane on the left side. You can drag a file from your local computer and drop it in that pane, and you should see it show up in the remote directory.
You should be able to open and edit it, and, depending on what it is or what you convert it to, you can even run it.
Of course, you can make a new notebook in the remote session and save it. Then, after saving it, download it back to your local machine by right-clicking on its icon in the file navigation pane and selecting 'Download'.
If you prefer to work in the classic Jupyter notebook interface, you can go to 'Help' and select 'Launch Classic Notebook' from the menu. The classic Jupyter dashboard will come up. You will need to upload things there using the upload button, as drag-and-drop only works in JupyterLab. You can download back to your local computer from the dashboard, or, when you have a notebook open, you can use the file menu to download it back to your local machine, too.
Make sure you save anything useful back to your machine, as the sessions are temporary and will time out after 10 minutes of inactivity. They'll also disconnect after a few hours even if you are actively using them. There's a safety net built in that works if a session does disconnect, but you have to be aware of it ahead of time, and it is best tested a few times in advance, when you don't need it. See 'Getting your notebook after your Binder has stopped'.
As this is going to a remote machine, there are obviously security concerns. Part of this is addressed by the temporary nature of the sessions: nothing is stored remotely once the active session goes away. (Hence the paragraph above, because once it is gone, it is gone.) However, don't upload anything you wouldn't want someone else to see, and don't share keys and the like with this system. In fact, it is now possible to do real-time co-authoring/co-editing of Jupyter notebooks via the MyBinder system, although some of the minor glitches are still being worked out.
You can install a lot of packages right in the session using %pip install or %conda install in cells in the notebook. However, sometimes you want them already installed so the session is ready with the necessary software. (Plus, some software won't work unless it is installed while the image backing the session's container is built.) That is where it becomes handy that you can customize the session that comes up via configuration files in public repositories. You can see a list of places where you can host those files by going to MyBinder.org and pressing the dropdown menu at the top left of the form there, under 'GitHub repository name or URL'. Here's an example: you can look in requirements.txt and see that I install quite a few packages in the data science stack.
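For example, an install from inside the running session is just a cell at the top of the notebook (pandas here is only an illustrative choice of package):

    # Run in a notebook cell: installs into the environment behind the current kernel.
    %pip install pandas

A requirements.txt in the repository backing the Binder plays the same role at image-build time, with one package name per line.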
Of course, there are other related online offerings for Jupyter (or you can install it on remote servers), and many use authentication. As some of those cost money and you are unsure about your locked-down system, the MyBinder.org system may help you test the limits of what you can do on your machine.

Related

How to sync python files from PyCharm IDE to GitHub Automatically?

I'm currently using the PyCharm IDE to learn Python. I am not aware of how to sync my files automatically to GitHub. To be precise, I want my code to sync automatically to my GitHub repo as I type; that is, I want the file to exist on GitHub and to edit it through my IDE.
Is there any solution to make this happen?
Regards,
Kausik
That is not how git (or GitHub) works. Version control systems are designed to capture milestones in your project. I think you're confusing git with cloud file-management services (e.g., Dropbox or Google Drive). If you need something that syncs your files with each "save" you make to a file, then services like Dropbox are what you're looking for.
However, version control systems (e.g., git) are much better suited for code management if you adjust your workflow to follow how they were intended to be used. In PyCharm, after each milestone (e.g., a bug fix or a new feature implementation) you would do the following:
Stage changed files by checking them.
Commit the changes by adding a commit message.
Push changes to the remote repository.
All of these can be found within the Commit window in PyCharm (View >> Tool Windows >> Commit), and all three steps above can be done in one click.
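For reference, those three steps map to three git commands under the hood. Here's a minimal Python sketch that shells out to git (assuming git is on your PATH, you're inside a clone, and the remote is already configured):

    import subprocess

    def commit_milestone(message):
        """Stage, commit, and push -- the same three steps the Commit window performs."""
        subprocess.run(["git", "add", "-A"], check=True)              # stage changed files
        subprocess.run(["git", "commit", "-m", message], check=True)  # commit with a message
        subprocess.run(["git", "push"], check=True)                   # push to the remote

    commit_milestone("Fix off-by-one error in the parser")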
One last thing: if your goal is to collaborate with someone else in real time, then PyCharm has a new feature called "Code With Me" (Tools >> Code With Me...). I don't know if it is available for free, but the idea is that you would invite friends and change the code base together in real time. Eventually, you would push the changes to the remote repository.

How to hide / obfuscate code in Pycharm effectively?

I want to work on a Windows machine with some privacy/intellectual-property-related Python code within PyCharm.
I want to prevent somebody else from later seeing my code on the local file system (or recovering/undeleting it from the SSD-based file system). Therefore I am looking for a solution that keeps my Python script(s) encrypted on the local file system but editable/usable within PyCharm.
I thought of creating a RAM disk or installing a virtual machine on the Windows computer. Unfortunately, I do not have admin rights on that computer, so I cannot install any software or create virtual machines.
Additionally, I can use a USB stick in read-only mode on that machine, but cannot write files back to that USB drive.
I am looking for a solution where I am still able to edit a Python script on that computer within PyCharm, but where the script is not persisted to the file system in plain text. Once I have finished my work, the script should be written back to the file system in an encrypted way.
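For illustration, the round trip I have in mind would look something like this in Python (using the third-party cryptography package as a stand-in; the file names and key location are hypothetical, and I may not even be able to install the package on this machine):

    from pathlib import Path
    from cryptography.fernet import Fernet  # third-party: pip install cryptography

    # The key would live off the machine, e.g. on the read-only USB stick.
    key = Path("E:/fernet.key").read_bytes()      # hypothetical key location
    cipher = Fernet(key)

    enc_path = Path("script.py.enc")              # what actually sits on the SSD

    # Decrypt into memory only; the plain text never touches the disk.
    source = cipher.decrypt(enc_path.read_bytes()).decode()

    # ... edit `source` ...

    # Write it back to the file system in encrypted form only.
    enc_path.write_bytes(cipher.encrypt(source.encode()))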
Use an online IDE and save your code online.
If you really want PyCharm, I would upload the code, delete it from the PC, and download it again whenever I needed it.
You could work 'in the cloud': have an AWS instance that you check out / clone / develop your work on, and use your computer essentially as a thin client for connecting to your remote instance.

Is JupyterLab on https://try.jupyter.org/ persistent / where does it store your notebooks?

I would like to perform a small, one-day-long coding workshop with some kids and teenagers.
For this, I am looking for a publicly hosted JupyterLab system that we can use for free to write some small Python scripts. One requirement is that we can upload a .csv file to this system.
I stumbled across https://try.jupyter.org/ which provides free JupyterLab instances to use.
My question: Does anybody have experience with how long scripts stay "uploaded" there? Is the lab regularly reset, or does it store the scripts somewhere locally (in the browser cache, etc.)? We would shut down the PCs to go for lunch, and the files should be accessible in the lab again once we restart the PCs.
I know we can download and store the notebooks offline (which we will do, just to be sure) and reupload them if we need to, but it would still be nice to know about the persistence of the "Try" service.

What strategy should I use to periodically extract information from a specific folder

With this question I would like to gain some insights into, and verify, that I'm on the right track with my thinking.
The request is as follows: I would like to create a database on a server. This database should be updated periodically by adding information that is present in a certain folder on a different computer. Both the server and the computer will be on the same network (though I may run into some firewall issues).
So the method I am thinking of using is as follows: create a tunnel between the two systems. I will run a script that periodically (hourly or daily) searches through the specified directory, converts the files to data, and adds them to the database. I am planning to use Python, which I am fairly familiar with.
Note: I don't think I will be able to install Python on the PC with the files.
Is this at all doable? Is my approach solid? Please let me know if additional information is required.
Create a tunnel between the two systems.
If you mean setting up the firewall between the two machines to allow a connection, then yeah. Just open the PostgreSQL port. Check postgresql.conf for the port number in case it isn't the default. Also put the correct permissions in pg_hba.conf so the computer's IP can connect to it.
I will run a script that periodically (hourly or daily) searches through the specified directory, converts the files to data, and adds them to the database. I am planning to use Python, which I am fairly familiar with.
Yeah, that's pretty standard. No problem.
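As a rough sketch of what such a script could look like, assuming PostgreSQL, the third-party psycopg2 driver, and CSV files in the watched folder (all names below are placeholders):

    import csv
    from pathlib import Path

    import psycopg2  # third-party: pip install psycopg2-binary

    WATCH_DIR = Path(r"\\other-computer\share\incoming")  # hypothetical network path

    def load_new_files(conn):
        """Scan the folder and insert each CSV row into the database."""
        with conn.cursor() as cur:
            for path in WATCH_DIR.glob("*.csv"):
                with path.open(newline="") as fh:
                    for row in csv.DictReader(fh):
                        cur.execute(
                            "INSERT INTO measurements (name, value) VALUES (%s, %s)",
                            (row["name"], row["value"]),
                        )
        conn.commit()

    if __name__ == "__main__":
        # Run this hourly/daily via cron or the Windows Task Scheduler.
        conn = psycopg2.connect(host="dbserver", port=5432,
                                dbname="mydb", user="loader", password="...")
        try:
            load_new_files(conn)
        finally:
            conn.close()

Letting the operating system's scheduler handle the "periodically" part keeps the script itself simple.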
Note: I don't think I will be able to install Python on the PC with the files.
On Windows you can install Anaconda for all users or just the current user. The latter doesn't require admin privileges, so that may help.
If you can't install Python, then you can use tools like PyInstaller to turn your Python program into an executable that bundles all the libraries, so you just have to drop it into a folder on the computer and execute it.
If you absolutely cannot install anything or execute any program, then you'll have to create a scheduled task that copies the data over the network to a computer that has Python, and run the Python script there, but that adds complication.
If the source computer is automatically backed up to a server, you can also use the backup as a data source, but there will be a delay depending on how often the backup runs.

Interactive Ipython Notebooks on Heroku

I am currently trying to make Python tutorials and host them using an IPython notebook on a Heroku site. The problem is that IPython notebooks are static when uploaded. I want the user to be able to use the notebook interactively (e.g., print outputs). I also don't want the output from their notebooks to be saved permanently on the Heroku site.
From what I understand, you have two issues to deal with:
interactive notebooks
"read only" notebooks (do not save the modifications)
For issue 1, you need a Jupyter (the new IPython name for notebooks) server. Only showing the notebook is not enough, because you need a server to "understand" and execute the modifications. See: http://jupyter-notebook.readthedocs.io/en/latest/public_server.html
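As a sketch of what that guide boils down to, the server is configured through a jupyter_notebook_config.py file, something along these lines (the port and password hash are placeholders):

    # jupyter_notebook_config.py -- minimal settings for a public server,
    # per the public_server guide linked above.
    c.NotebookApp.ip = "0.0.0.0"         # listen on all interfaces, not just localhost
    c.NotebookApp.port = 8888            # placeholder port
    c.NotebookApp.open_browser = False   # it's a server; don't open a local browser
    # Hash generated with: from notebook.auth import passwd; passwd()
    c.NotebookApp.password = "sha1:..."  # placeholder password hash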
I am not familiar with Heroku, but after two seconds of googling I found this: https://github.com/pl31/heroku-jupyter, which was able to deploy a working Jupyter server on a demo Heroku machine.
In my opinion, issue 2 is more difficult to solve.
When the "learners" will change the notebook, the modifications will be applied to the notebook file (.ipnb) so the modifications will be persistent... This is not want you want.
You could try some tricks using file permissions to prevent the kernel from saving the file, but I think that would only crash the kernel...
Moreover, it raises several user-interaction problems; for instance, what if I lose my internet connection? Will I lose my work? Why? Is this what I really want as a learner?
For this, the best solution is to give each user access to the notebook / a workspace where she can save her progress, but that is more work than just deploying a Jupyter server. As an example, see databricks.com (the first (only) one that comes to mind, not necessarily the best).
(As a remark, it seems that multi-user mode is already implemented: https://jupyterhub.readthedocs.io/en/latest/)
I would like to add a last remark about the security of the server. Letting strangers access a server with an embedded shell sounds like a bad idea if you are not prepared for the consequences. I would suggest you look into putting each user's Jupyter session in a "jail" / container, or anything comparable that works on Heroku.
