`pipreqs` generating blank requirements.txt in Docker Container

I am using Python in a Jupyter Lab notebook in a Docker container. I have the following code in one cell:
import numpy as np
import os
import pandas as pd
Then I run the following cell:
!pipreqs /app/loaded_reqs
and get:
INFO: Successfully saved requirements file in /app/loaded_reqs/requirements.txt
But when I open the requirements.txt, it shows up empty/blank. I expected numpy, os and pandas to be in this requirements.txt file. Why might it not be working?

According to this Medium post by Iván Lengyel, pipreqs doesn't support Jupyter notebooks. (This issue in the pipreqs repo, open since 2016, convinces me that the assertion is accurate. Nicely, the issue also suggests the solution I had already found by searching Google for 'pipreqs jupyter'.) Plus, importantly, you generally don't run tools that act on notebook files from inside the notebook you are working in. (Or at least it is something to always watch out for, or test if possible, similar in a way to avoiding iterating over a list you are modifying in the loop.)
Solution -- use pipreqsnb instead:
In that Medium post saying it doesn't work with notebooks, Iván Lengyel proffers a wrapper for it that works for notebooks. So in the terminal outside the notebook, but in the same environment (inside the docker container, in your case), install pipreqsnb via pip install pipreqsnb. Then run it pointing it at your specific notebook file. I'll give an example in the next paragraph.
I just tried it, and it worked in temporary sessions launched from here by pressing the launch binder badge there. When the session came up, I opened a terminal and ran pip install pipreqsnb and then pipreqsnb index.ipynb. That first time I saw requirements.txt get made with details on the versions of matplotlib, numpy, scipy, and seaborn. To fully test that it was working, I opened index.ipynb in the running session, added a cell with import pandas as pd typed in it, and saved the notebook. Then I shut down the kernel and, over in the terminal, ran pipreqsnb index.ipynb again. When I re-examined requirements.txt, pandas had been added along with its version details.
More about maybe why !pipreqs /app/loaded_reqs failed:
I had the idea that maybe you needed to save the notebook first after adding the cell with the import statements. However, never mind: that still won't help because, as stated here and further confirmed in the pipreqs issues list, pipreqs doesn't support Jupyter notebooks.
Also, keep in mind that using the exclamation point in a notebook to run a command in the shell doesn't mean that shell will be in the same environment as the notebook's kernel; see the second paragraph here for more perspective on that. (This can be useful to understand for future things though, such as why you want to use the %pip or %conda magic commands when installing from inside a notebook, see here, and not put an exclamation point in front of that command in modern Jupyter.)
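To see whether the kernel and the shell agree in a given setup, one can compare the interpreter the kernel runs on with the one a shell command would find on PATH. This is only a minimal sketch; kernel_vs_shell_python is a hypothetical helper name, and it assumes the shell started by the exclamation point sees the same PATH as the kernel process.

```python
import shutil
import sys


def kernel_vs_shell_python():
    """Return (interpreter running this kernel, first python found on PATH)."""
    kernel_python = sys.executable
    # shutil.which mimics how a shell would resolve the command name.
    shell_python = shutil.which("python") or shutil.which("python3")
    return kernel_python, shell_python


kernel_python, shell_python = kernel_vs_shell_python()
print("kernel:", kernel_python)
print("shell :", shell_python)
```

If the two paths differ, a package installed with an exclamation-point command may land in an environment the kernel never imports from, which is the situation the %pip magic is meant to avoid.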
Alternatively, inside the notebook at the end, I'd suggest trying %watermark --iversions (see watermark) and then writing some code to generate the requirements.txt from that. (Also, I had seen there was a bug in that related to some packages imported with from X import Y, see here.)
Or I'd suggest trying %pip freeze inside the notebook for the full environment information, though that covers more than just what the file needs.
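As a rough illustration of "making some code to generate the requirements.txt", here is a sketch that takes a different route from watermark: it reads the saved .ipynb file (which is JSON), collects the import statements with the standard ast module, and pins whatever is installed. The function names are hypothetical, and it assumes that the import name equals the distribution name on PyPI, which holds for numpy and pandas but not for every package.

```python
import ast
import json
import sys
from importlib import metadata


def imported_top_level_modules(notebook_text):
    """Return the top-level module names imported by a notebook's code cells."""
    notebook = json.loads(notebook_text)
    names = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        kept = [line for line in source.splitlines()
                if not line.lstrip().startswith(("!", "%"))]
        try:
            tree = ast.parse("\n".join(kept))
        except SyntaxError:
            continue  # skip cells that still do not parse
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def requirement_lines(module_names):
    """Pin each non-stdlib module to its installed version, when one is found."""
    # sys.stdlib_module_names exists on Python 3.10+; older versions skip nothing.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    lines = []
    for name in sorted(module_names):
        if name in stdlib:
            continue
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(name)  # not installed under this name; leave unpinned
    return lines
```

Note that os belongs to the standard library, so it would not be expected in requirements.txt even when the tool works; only numpy and pandas would be listed for the cell in the question.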

Related

Using Matlab.engine and installing tensorflow at the same time

Currently I am working on a project in Jupyter Notebook in which I need to run a MATLAB script (.m) that includes a function providing me with data, which I then try to solve with a TensorFlow model. I can set up an environment that runs the MATLAB code and gives me the data, and I can set up an environment that does the TensorFlow part, but my problem is that I can't do both in the same environment.
Here is the setup and the problems. I am using matlab.engine, which I installed as described here: https://de.mathworks.com/help/matlab/matlab_external/install-the-matlab-engine-for-python.html
To run my Jupyter Notebook I first navigate to the location where my python.exe and the MATLAB files are located ("C:\Users\Philipp\AppData\Local\Programs\Python\Python37-32\Scripts"). If I try to run pip install tensorflow (in Anaconda Prompt) I get a lot of different errors like the following. conda install works, but even when it is installed I can't import it.
ImportError: No module named 'tensorflow.core' or
ERROR: Could not find a version that satisfies the requirement tensorflow or just No module named 'tensorflow'
I searched for all those problems but nothing helped me. I think this has something to do with the directory I am working in and I know it is bad but I have no idea how to change that. The error also occurs in different environments.

Have you tried running !pip install tensorflow directly in Jupyter Notebook? It's a temporary workaround, but I am having the same problems and this one helped. Remember to comment it out after installation, so you won't re-run it by accident.

I found a solution to my problem. For this I needed a Jupyter Notebook and an external .py script that I designed as a Flask app. Luckily, I can run those in different environments. I pass and request the data from the server by using "get" and "post".
If someone still has another idea to do all this in one JN, I would still be happy about answers.
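The two-environment setup described above can be sketched roughly as follows. To keep the example free of third-party dependencies, it swaps Flask for the standard library's http.server, so it is not the code from the answer; DataHandler, start_server, post_data and get_data are hypothetical names. One environment posts its data to the server and the other fetches it.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DataHandler(BaseHTTPRequestHandler):
    """Keeps the most recently posted JSON payload in memory."""

    latest = None  # shared across requests; fine for a single-user sketch

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        DataHandler.latest = json.loads(self.rfile.read(length))
        self.send_response(204)  # stored, nothing to return
        self.end_headers()

    def do_GET(self):
        body = json.dumps(DataHandler.latest).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def start_server(port=0):
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), DataHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_data(port, payload):
    """Send a payload from the producing environment (e.g. the MATLAB side)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("POST", "/", body=json.dumps(payload),
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def get_data(port):
    """Fetch the stored payload from the consuming environment."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

In practice the server script would run under one interpreter and the notebook under another; only JSON-serialisable data crosses the boundary, so arrays need converting to lists first.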

Attempting to replicate already working code, but encountering an error during import

A very simple question with a lot of context behind it, so thanks in advance for your patience. In summary I am attempting to control a DAC/ADC development board by adapting already existing software written in python. However, I am encountering an error when I simply try to import some of the prerequisites.
The board I am attempting to control is the ZCU111. The ZCU111 has an ARM processor on board and some wonderful folks found a way to put linux on board; Pynq. The pynq software is imaged onto a SD card, and the SD card is mounted onto the ZCU111 board and the processor boots from the card. I communicate to the board through a USB serial interface and a hosted SSH server on the ZCU111.
Using pynq, some modules have been created to automatically set the inner workings of the ZCU111 board, ZCU111 GitHub.
Using those modules, someone created some great Jupyter notebooks that help run the ZCU111 through a nice GUI interface; RFSoC_SAM. However, I don't want to run the board through a notebook or GUI; I want to adapt parts of the code into a much simpler .py file to be run from the terminal.
image of filing system on SD card
To the left is an image of the filing system on the SD card. The GitHub folder contains the modules to control the inner workings, and it's identical to the link above. The Jupyter notebook folder contains the notebook which I wish to emulate and which works as expected. The pynq folder contains the modules for pynq itself. The Sami_py folder is where I placed my test code.
Here are where my problems and questions begin: The Notebook that works begins with 2 lines of code:
from rfsoc_sam.overlay import Overlay
sam = Overlay()
When I scour the SD card, I can't find rfsoc_sam anywhere on it. I'm confused about how it works. The GitHub repository listed above for RFSoC_SAM DOES have the accompanying .py files. Why don't those .py files appear on my SD card? Does the Jupyter notebook package all the necessary files? Regardless, the first step in creating my own software is to import the same modules that the Overlay from the rfsoc_sam module does. Despite it not appearing on the SD card, I can open the file from GitHub:
from pynq import Overlay, allocate
import xrfclk
import xrfdc
import os
from .hierarchies import *
from .quick_widgets import Image
from ipywidgets import IntProgress
from IPython.display import display
from IPython.display import clear_output
import time
import threading
That is everything that needs to be imported; however, I just wanted to start with xrfclk, since that folder and its init file can be found on the SD card in the GitHub folder, the same folder as in the link above (ZCU111 GitHub).
I wrote a .py file placed in the Sami_py folder:
import sys
sys.path.insert(1, '//192.168.2.99/xilinx/GitHub/ZCU111-PYNQ/ZCU111/packages/xrfclk')
import xrfclk
The error message I receive:
Traceback (most recent call last):
  File "Sami_py/RFSoC_Trial.py", line 3, in <module>
    import xrfclk
ImportError: No module named xrfclk
I thought I pointed to the right directory. Do I need to point directly to the directory that has the init file? I am not sure why I can't get the import to work. Any thoughts?
Happy to provide more context. Thanks in advance for any help or advice,
Sami

I'm sorry to hear you're having problems with RFSoC SAM. I created this project and can give you some help about RFSoC and PYNQ.
Firstly, the RFSoC SAM package is installed and maintained using pip, which is the python package manager. As PYNQ v2.6 uses Python3, we need to invoke the pip3 command to install RFSoC SAM. In the repository readme, you will see that installing RFSoC SAM on the ZCU111 has a few steps. You need to connect your board to the internet for these steps to work correctly. You should also use the terminal in the Jupyter Labs environment to run the commands (this is detailed in the repository readme). The first step is a patch for the xrfdc package, and looks a bit like this:
mkdir /home/xilinx/GitHub
cd /home/xilinx/GitHub/
git clone https://github.com/dnorthcote/ZCU111-PYNQ
cd /home/xilinx/GitHub/ZCU111-PYNQ
cp /home/xilinx/GitHub/ZCU111-PYNQ/ZCU111/packages/xrfdc/pkg/xrfdc/__init__.py /usr/local/lib/python3.6/dist-packages/xrfdc/__init__.py
The reason a patch is required is because the xrfdc package needs an update that has not yet been implemented for the ZCU111, but it has been implemented for the RFSoC2x2. I simply moved these updates over to the ZCU111, which means RFSoC SAM will function correctly. If you do not implement this patch, the notebook for RFSoC SAM will fail to load. I am not in charge of the xrfdc package, which is why the package needs to be patched (not version controlled and updated).
After implementing the xrfdc patch, you can install RFSoC SAM via pip. The command looks a bit like this:
pip3 install git+https://github.com/strath-sdr/rfsoc_sam
This command will install the RFSoC SAM package to the following location:
/usr/local/lib/python3.6/dist-packages/rfsoc-sam
This location is where all pip packages are stored for PYNQ v2.6. You should not attempt to directly modify packages in this location as they are all under version control. If you would like to develop and change the RFSoC SAM source code, then you need to consider a slightly different workflow.
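If it is unclear where an installed package actually lives on the board, Python's import system can be asked directly. The following is a minimal sketch using only the standard library; locate_module is a hypothetical helper name, and json is used here only as a stand-in for a package such as rfsoc_sam.

```python
import importlib.util


def locate_module(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None  # nothing importable under this name on sys.path
    return spec.origin


print(locate_module("json"))
print(locate_module("no_such_module_xyz_123"))
```

A result of None means no directory on sys.path contains that module or package, which is also what an "ImportError: No module named ..." message indicates.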
The RFSoC SAM package does not have any developer tools. To make changes to the project, I simply Git clone the project to my Jupyter workspace in PYNQ, and then make changes to the project. When I'm ready, I can simply reinstall the package by running the following command in a Jupyter Labs terminal:
pip3 install .
This command will just reinstall RFSoC SAM, but using the offline changes you make instead. Using this workflow, you could easily create a fork of RFSoC SAM, make changes to the source code, commit those changes, and easily install the package on your ZCU111. This workflow is something you should consider.
You may be more interested in trying out RFSoC + PYNQ projects that are of lower complexity. For instance, you will be able to see the following projects: RFSoC QPSK and RFSoC Radio. You may have also seen the RFSoC Workshop.
Going forward, I would recommend investigating the above projects, and also understanding Linux and pip a little better. A few google searches should get you on the right path here. Try to create your own pip package and a Github repository to store the package. If you can do this, you will have greatly improved your understanding of pip, and the PYNQ operating system.
Also, if you have questions in the future, you can use the PYNQ discussion page (https://discuss.pynq.io/). I will be able to find your questions faster here.
If you have any further issues, feel free to drop me a message. You can also use the RFSoC SAM issues page in the repository, which helps me track these type of questions a little easier.
David N.

How can I make the Lux package work in Python?

I was trying to use the package called LUX for Python. I followed this tutorial. So it was pretty simple, I just had to import some csv and when I called my data, I would be able to see multiple graphs. The problem is that I do everything and nothing shows up for me.
I was using Melbourne House Market data, and this is my script so far:
# firstly, we install package and extensions
!pip install lux-api
!jupyter nbextension install --py luxwidget
!jupyter nbextension enable --py luxwidget
# then, load the packages
import lux
import pandas as pd
# load data
melb_data = pd.read_csv("melb_data.csv")
So far so good... at least, I thought so. According to the tutorial mentioned at the beginning of this question, if we call the dataframe now, we should see some graphs as well instead of only the dataset. But that didn't happen for me. I know screenshots are not the best choice, but this is what I see:
As you can see, there is the Toggle button, but it only works to hide the table. The graphs I saw in the tutorial aren't there. I also tried to follow this tutorial, but there isn't anything new there.
Any ideas on what I'm missing here? Why can't I find a way to make this package work?

It seems that you have not installed the JupyterLab extension. It is described in the README file that you linked, here. You will need to execute the following two commands:
jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install luxwidget
and then restart JupyterLab. The prerequisite here is having a JupyterLab 2.x or 3.x and Node.js installed (while many extensions for JupyterLab 3.x do not require node.js any longer, this one still does - as it seems).

Editing packages not taking effect in jupyter

I forked a repo from github and copied to my local machine and opened all the files in Jupyter. Now I have access to all the .py files and all the notebooks.
Let's say I want to add a new function to the package as follows:
def teste(self):
    return self
I do this by writing the function in the here.py file and to make sure it works I test it on a notebook by calling it in a cell and executing that cell:
print(here.teste(worked))
However, this doesn't work. My guess is that I have not updated the package itself, so the function teste() does not exist. How do I commit this change to the package locally (without using a pull request)?

Most likely you need to restart your jupyter kernel for the changes to take effect.
Git is merely a versioning system, it does not care what python does and does not influence how python works.
Python loads your package when it is imported, e.g. with import my_package as mp. When you make changes to that package while Python is running, it is not aware of those changes. If you try to re-import, Python will merely check whether the package is already imported (which is true) and do nothing, so the changes still do not take effect. Only when you restart the kernel and import the package again will they take effect. You can also re-import a package with the following (Python 3.4 and greater):
import importlib
importlib.reload(package)
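The behaviour described above can be reproduced in a small self-contained experiment. This is only a sketch: here_demo is a hypothetical module written to a temporary folder, and bytecode caching is switched off for the duration so that the quick edit is certain to be picked up.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reload_demo():
    """Show that a second import is a no-op while importlib.reload re-reads the file."""
    with tempfile.TemporaryDirectory() as tmp:
        module_file = Path(tmp) / "here_demo.py"
        module_file.write_text("def teste():\n    return 'old'\n")
        sys.path.insert(0, tmp)
        old_flag = sys.dont_write_bytecode
        sys.dont_write_bytecode = True  # avoid a stale .pyc after the fast edit
        try:
            importlib.invalidate_caches()
            import here_demo
            before = here_demo.teste()

            # Edit the file on disk, as one would in an editor.
            module_file.write_text("def teste():\n    return 'new'\n")

            import here_demo  # already in sys.modules, so nothing is re-read
            after_second_import = here_demo.teste()

            importlib.reload(here_demo)  # re-executes the module's source
            after_reload = here_demo.teste()
        finally:
            sys.dont_write_bytecode = old_flag
            sys.path.remove(tmp)
            sys.modules.pop("here_demo", None)
        return before, after_second_import, after_reload


print(reload_demo())  # ('old', 'old', 'new')
```

Keep in mind that reload only rebinds names inside the module object; objects created before the reload, and names brought in earlier with from package import name, keep pointing at the old code.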

Run pip in python idle

I am curious about running pip.
I always run pip in the Windows command shell like this:
c:\python27\script>pip install numpy
But I wondered if I can run it in Python IDLE.
import pip
pip.install("numpy")
Unfortunately, it is not working.

Still cannot comment so I added another answer. Pip has had several entrypoints in the past. And it's not recommended to call pip directly or in-process (if you still want to do it, "runpy" is kind of recommended):
import sys
import runpy
sys.argv=["pip", "install", "packagename"]
runpy.run_module("pip", run_name="__main__")
But this should also work:
try:
    from pip._internal import main as _pip_main
except ImportError:
    from pip import main as _pip_main
_pip_main(["install", "packagename"])

This question is, or should be, about how to run pip from a Python program. IDLE is not directly relevant to this version of the question.
To expand on J. J. Hakala's comment: a command-line such as pip install pillow is split on spaces to become sys.argv. When pip is run as a main module, it calls pip.main(sys.argv[1:]). If one imports pip, one may call pip.main(arg_line.split()), where arg_line is the part of the command line after pip.
Last September (2015) I experimented with using this unintended API from another python program and reported the initial results on tracker issue 23551. Discussion and further results followed.
The problem with executing multiple commands in one process is that some pip commands cache not only sys.path, which normally stays constant, but also the list of installed packages, which normally changes. Since pip is designed to run one command per process, and then exit, it never updates the cache. When pip.main is used to run multiple commands in one process, commands given after the caching may use a stale and no-longer-correct cache. For example, list after install shows how things were before the install.
A second problem for a program that wants to examine the output from pip is that it goes to stdout and stderr. I posted a program that captures these streams into program variables as part of running pip.
Using a subprocess call for each pip command, as suggested by L_Pav, though less efficient, solves both problems. The communicate method makes the output streams available. See the subprocess doc.
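A minimal sketch of the one-subprocess-per-command approach follows. pip_command and run_pip are hypothetical helper names; subprocess.run is used here because it wraps the Popen and communicate steps in a single call, and sys.executable ensures that the pip belonging to the running interpreter is the one invoked.

```python
import subprocess
import sys


def pip_command(*args):
    """Build the argument list for running pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run one pip command in its own process and capture its output."""
    completed = subprocess.run(pip_command(*args), capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr
```

For example, run_pip("install", "numpy") followed by run_pip("list") starts two separate processes, so the second command sees the freshly installed package, and the captured stdout and stderr are available as ordinary strings.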

At the moment there is no official way to do it. You could use pip.main, but your current IDLE session will not 'see' the installed package.
There has been a lot of discussion about how to add a "high level" programmatic API for pip, and it seems promising.

Actually, I think you can use something like subprocess.Popen(["apt-get", "install", "python-numpy"]); I'm not sure how to do it with pip, though.

If you're on a Mac you should be able to do it like this:
Go to your IDLE.
Run help('modules').
Find the HTML module.
Run help('HTML')
A file path should pop up, for example this/file/map/example/.
Go to the Finder, press command+shift+g and paste the file path there. Delete the last component of the path first, so that you end up in the folder holding the module files.
All the modules are there. If you want to add modules, download the module's files and put them there.
I hope this helps you.
