Basics of setting up a Spyder workspace and projects - python

I have searched for a basic tutorial regarding workspaces and projects in the Spyder IDE. What I want to understand is the basic concepts of how to use the workspace and projects to organize my code. Perhaps this is considered a basic programming skill, which would explain why I have trouble finding any kind of overview. This page seems to be related, but is actually about Eclipse and rather sparse. The Pythonxy tutorial and the documentation for Spyder do not go into any detail. Neither does the Anaconda documentation.
The questions I have are:
When should I set up a new workspace (if ever)?
When do I create a new project?
How does the PYTHONPATH depend on my workspace and project settings? Is it the same in all cases or can I customize it per workspace/project?
Are there other settings apart from the PYTHONPATH that I should configure?
How specific are the answers above to Spyder? Would it be the same for other IDEs, like Eclipse?
I am running Spyder on 64-bit Windows 7, as part of the Anaconda package.

Update Oct 2016: Spyder 3 now has project facilities similar to those of other IDEs (especially RStudio).
Now, if you have a folder with scripts, you can go to
Projects > New Projects > Existing Directory
to import it. The selected directory will be set as the base directory for the project.

I use Spyder for data analysis and have just started using the project workspace. I believe it lets you write better code because of the organization it imposes. A previous post stated that "This can be helpful in web development", which is true: web development requires good software engineering because of the number of files and how they interact with each other. The same organization and structure can be used in data analysis as well.
Often, data analysts that use Anaconda have an engineering or science background, not necessarily software engineering or computer science. This means that good software engineering principles may be missing (myself included). Setting up a workspace does one critical thing that I believe is missing from the discussion. It adds the workspace to the system path. Set up a project and then try
import sys
print(sys.path)  # the parenthesized form works in both Python 2.7 and 3
You will see your project's directory in the list, which is the module search path that PYTHONPATH feeds into. This means I can break up my project and import functions from different files within it. This is highly beneficial when an analysis becomes complex or when you want to build a larger model that will be used on a regular basis. I can put all of my functions in one file, perhaps the plotting functions in another, and then import them in a separate script file.
in myScript.py
from myFunctions import func1
from myFunctions import func2
from myPlots import histPlot
This is a much cleaner approach to data analysis and allows you to focus on one specific task at a time.
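To make the mechanism concrete, here is a minimal sketch of what happens when a directory is on sys.path. It does not use Spyder at all: the temporary directory stands in for a project folder, and the file contents and the body of func1 are made up for illustration. The point is only that once the directory is in sys.path, a plain import finds the module.

```python
import os
import sys
import tempfile

# Hypothetical project directory standing in for a Spyder project folder.
project_dir = tempfile.mkdtemp(prefix="my_project_")

# A small helper module playing the role of myFunctions.py (made-up content).
with open(os.path.join(project_dir, "myFunctions.py"), "w") as f:
    f.write("def func1(x):\n    return x + 1\n")

# The answer above reports that Spyder puts the project directory on sys.path.
# Outside Spyder, the same effect can be had by adding it manually.
if project_dir not in sys.path:
    sys.path.insert(0, project_dir)

from myFunctions import func1  # found through the sys.path entry added above

result = func1(41)
print(result)
```

If the import fails in a real project, printing sys.path and checking for the project directory is usually the first thing to try.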
IPython has the %autoreload capability, so you can work on your functions and then go back to your script file, and it will reload them each time you fix an error. I haven't tried this yet (the majority of my work is in 2.7), but it would seem to add even greater flexibility when developing.
So when should you do this? I think it is always a good idea. I just started using this setup and I will never go back!
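For reference, %autoreload is an IPython magic (enabled with %load_ext autoreload followed by %autoreload 2) rather than a feature of the Python language itself. Magics cannot be run from a plain script, so the sketch below shows the closest standard-library equivalent, importlib.reload, doing by hand what the magic automates. The module name demo_functions and its contents are invented for the example.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so this demo always re-reads the edited source file.
sys.dont_write_bytecode = True

module_dir = tempfile.mkdtemp(prefix="reload_demo_")
module_path = os.path.join(module_dir, "demo_functions.py")  # hypothetical module
sys.path.insert(0, module_dir)


def write_module(body):
    """Overwrite the demo module on disk, as if it had been edited in the IDE."""
    with open(module_path, "w") as f:
        f.write(body)
    importlib.invalidate_caches()


write_module("def scale(x):\n    return x * 2\n")
import demo_functions

before = demo_functions.scale(10)

# "Edit" the function, then reload the already-imported module in place.
write_module("def scale(x):\n    return x * 100  # edited version\n")
importlib.reload(demo_functions)
after = demo_functions.scale(10)

print(before, after)
```

Note that names imported with "from module import name" keep pointing at the old objects after a manual reload, which is one of the cases the IPython extension tries to handle for you.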

In my experience, setting up a workspace in Spyder is not always necessary.
A workspace is a space on your computer where you create and save all the files you work in. Workspaces usually help in managing your project files.
Once you create a workspace in Spyder, a pane called "Project Explorer" opens up inside Spyder. There you see in real-time the files of your project. For instance, if you generate a file with Python, it will show in that pane.
The pane lets you keep the files organized, filter them, etc. This can be useful for web development, for example, because it helps you keep your content organized.
I use Python to handle files (e.g. csv) and work with data (data analysis), and I find no use in the workspace feature.
Moreover, if you delete a file in the Project Explorer pane, the file cannot be found in the Windows recycle bin.

One critical piece of information that appears to be missing from the Spyder documentation is how to create a new workspace in the first place. When no workspace exists after installing Spyder, creating your first project automatically initiates the creation of a workspace (at least in the Anaconda 3 distribution). However, it is not as obvious how to create a new workspace when a workspace already exists.
This is the only method I have found for creating a new workspace:
(1) Select the Project explorer window in Spyder. If this window or tab doesn't appear anywhere in the Spyder application, use View > Panes > Project explorer to enable the window.
(2) Click on the folder icon in the upper-right corner of the Project explorer window. This icon brings up a dialog that can create a new workspace. The dialog allows selection of a directory for the .spyderworkspace file.

Related

Managing a Python Monorepo in PyCharm

I'm experimenting with monorepos & python. The idea is having multiple projects in the same repo, each project should have its own virtualenv.
I find it kinda cumbersome managing all of that in PyCharm.
PyCharm supports managing multiple projects with different venvs: https://www.jetbrains.com/help/pycharm/opening-multiple-projects.html?_ga=2.5681206.409054178.1602169802-543218074.1500382704
But it's not very friendly if you have many projects, I will have to "open" and "attach" each and every one of them.
Let's see an example in this repo:
Under the project directory I have 2 projects:
The 2 projects directories are marked in bold (just like the root one), basically meaning they are "PyCharm projects".
Under the preferences window, you can see all the projects:
But there's no option of adding new projects there.
If I had a 3rd project, I would have to open it and attach it to the current window.
Am I missing something? Is there an easier way of marking a sub-directory as a project?
Imagine cloning a repository with 10 projects or more, configuring all the settings on PyCharm is going to be very frustrating.
You can share some of the files in the .idea folder via your repository. This folder, and particularly the .idea/modules.xml file, contains all the configuration details for your project(s).
Therefore, you only have to do the setup once (or as projects are added) then the configuration will be replicated.

Using Eclipse PyDev's search function with external libraries

I find PyDev's search function incredibly useful and use it regularly to navigate around my projects. I've got my interpreters set up correctly so PyDev knows about the external libraries that my code uses, and even lets me follow references into the library modules. This is great, obviously, but I also want to be able to search the external libraries like I can search my own code.
There's a similar question pertaining to Java development here: How do I search Libraries in eclipse?
Is there anything out there for PyDev?
I use two different approaches to allow searching in my library code:
When I am using virtualenv, I keep all my code under myproject/src and add it and myproject/lib/python2.7/site-packages/ as PyDev source folders. (Be sure to set your Python interpreter to myproject/bin/python as well.)
In other cases, I use two different PyDev projects. The first (myproject) includes my code. The second one is called myproject-lib and includes the libraries as its source paths (.../site-packages). The first project references the second project (and usually I keep both of them in one workspace). This works great with virtualenv, but I believe that you can actually create a PyDev project in your system-wide Python. Make sure you use the same Python interpreter in both projects.
Now you can quickly and easily use Open Resource (CTRL+T) and the Globals Browser (CTRL+Shift+T) to look up your libs.
I'm afraid PyDev doesn't support this yet. I created a feature request for this at https://jira.appcelerator.org/browse/APSTUD-7405 Meanwhile, you could link folders of external libraries to your project.

Multiple directories and/or subdirectories in IPython Notebook session?

The IPython documentation pages suggest that opening several different sessions of IPython notebook is the only way to interact with saved notebooks in different directories or subdirectories, but this is not explicitly confirmed anywhere.
I am facing a situation where I might need to interact with hundreds of different notebooks, which are classified according to different properties and stored in subdirectories of a main directory. I have set that main directory (let's call it /main) in the ipython_notebook_config.py configuration file to be the default directory.
When I launch IPython notebook, indeed it displays any saved notebooks that are within /main (but not saved notebooks within subdirectories within /main).
How can I achieve one single IPython dashboard that shows me the notebooks within /main and also shows subdirectories, lets me expand a subdirectory and choose from its contents, or just shows all notebooks from all subdirectories?
Doing this by launching new instances of IPython every time is completely out of the question.
I'm willing to tinker with source code if I have to for this ability. It's an extremely basic sort of feature, we need it, and it's surprising that it's not just the default IPython behavior. For any amount of saved notebooks over maybe 10 or 15, this feature is necessary.
"The IPython documentation pages suggest that opening several different sessions of IPython notebook is the only way to interact with saved notebooks in different directories or subdirectories, but this is not explicitly confirmed anywhere."
Yes, this is a current (temporary) limitation of the Notebook server. Multi-directory support is very high on the notebook todo list (unfortunately that list is long, and devs are few and have day jobs), it is just not there yet. By 0.14 (Fall, probably), you should have no reason to be running more than one nb server, but for now that's the only option for multiple directories. All that is missing for a simple first draft is:
Associating individual notebooks with directories (fairly trivial), and
Web UI for simple filesystem navigation (slightly less trivial).
"I'm willing to tinker with source code if I have to for this ability"
The limiting factor, if you want to poke around in the source, is the NotebookManager, which is associated with a particular directory. If you tweak the list_notebooks() method to handle subdirectories, you are 90% there.
I was curious about this as well, so I tossed together a quick example here that allows you to at least read/run/edit/save notebooks in subdirs (walk depth is limited to 2, but easy to change). Any new notebooks will be in the top-level dir, and there is no UI for moving them around.
The interface and architecture design issues for multiple directory support (and more generally for "project" support) for IPython notebook are important to get right. A design is described in
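The linked example is not reproduced here, but the directory-walking part of the idea can be sketched independently. The function below is a standalone illustration and not the NotebookManager API: find_notebooks and max_depth are hypothetical names, and the temporary directory tree exists only to demonstrate the depth limit of 2 mentioned above.

```python
import os
import tempfile


def find_notebooks(root, max_depth=2):
    """Return sorted paths (relative to root) of .ipynb files found in root
    and in subdirectories nested at most max_depth levels below it."""
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        depth = 0 if rel_dir == "." else rel_dir.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # prune in place so os.walk does not go deeper
        for name in filenames:
            if name.endswith(".ipynb"):
                found.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(found)


# Build a throwaway tree: /main-like root with notebooks at depths 0 to 3.
main_dir = tempfile.mkdtemp(prefix="main_")
for rel_path in ["top.ipynb",
                 os.path.join("a", "one.ipynb"),
                 os.path.join("a", "b", "two.ipynb"),
                 os.path.join("a", "b", "c", "three.ipynb")]:
    full_path = os.path.join(main_dir, rel_path)
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, "w") as f:
        f.write("{}")  # placeholder content, not a valid notebook

notebooks = find_notebooks(main_dir)
print(notebooks)  # three.ipynb sits at depth 3 and is therefore skipped
```

Pruning dirnames in place is what keeps the walk cheap on large trees, since os.walk never descends into directories removed from that list.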
IPEP 16: Notebook multi directory dashboard and URL mapping
and is being discussed at IPEP 16: Notebook multi directory dashboard and URL mapping · Issue #3166 · ipython/ipython

Referencing an external library in a Python appengine project, using Pydev/Eclipse

I started developing in Python a couple of months ago, coming from a C# and Java background.
I'm currently working on 2 different Python/App Engine applications and, as often happens in those cases, both applications share common code, so I would like to refactor and move the common/generic code into a shared place.
In either Java or C# I'd just create a new library project, move the code into the new project and add a reference to the library from the main projects.
I tried the same in Python, but I am unable to make it work.
I am using Eclipse with Pydev plugin.
I've created a new Pydev project, moved the code, and attempted to:
reference the library project from the main projects (using Project Properties -> Project References)
add the library src folder to the main projects (in this case I get an error; I presume it's not possible to leave the project boundaries when adding an existing source folder)
add as external library (pretty much the same as google libraries are defined, using Properties -> External libraries)
Import as link (from Import -> File System and enabling "Create links in workspace")
In all cases I am able to reference the library code while developing, but when I start debugging, the appengine development server throws an exception because it can't find what I have moved into a separate library project.
Of course I've searched a lot for a solution, but it looks like nobody has experienced the same problem, or maybe nobody needs to do the same :)
The closest solution I've been able to find is to add an ant script to zip the library sources and copy in the target project - but this way debugging is a pain, as I am unable to step into the library code.
Any suggestion?
Needless to say, the proposed solution must take into account that the library code has to be included in the upload process to appengine...
Thanks
The dev_appserver and the production environment don't have any concept of projects or libraries, so you need to structure your app so that all the necessary libraries are under the application's root. The easiest way to do this, usually, is to symlink them in as subdirectories, or worst-case, to copy them (or, using version control, make them sub-repositories).
How that maps to operations in your IDE depends on the IDE, but in general, it's probably easiest to get the app structured as you need it on disk, and work backwards from that to get your IDE set up how you like it.
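The "symlink, or worst-case copy" advice can be scripted. What follows is a minimal sketch under stated assumptions: vendor_library is a hypothetical helper name, the directory names common_lib and app are invented, and the script knows nothing about App Engine itself. It only places the shared library under the application's root, falling back to a copy where symlinks are not permitted.

```python
import os
import shutil
import tempfile


def vendor_library(lib_dir, app_root):
    """Expose lib_dir as a subdirectory of app_root.

    Tries a symlink first and falls back to copying the tree.
    Returns "symlink" or "copy" to report which one was used.
    """
    name = os.path.basename(os.path.normpath(lib_dir))
    target = os.path.join(app_root, name)
    if os.path.lexists(target):
        raise FileExistsError(target)
    try:
        os.symlink(os.path.abspath(lib_dir), target, target_is_directory=True)
        return "symlink"
    except (OSError, NotImplementedError):
        # For example, Windows without the privilege to create symlinks.
        shutil.copytree(lib_dir, target)
        return "copy"


# Throwaway layout: a shared library next to an application root.
workspace = tempfile.mkdtemp(prefix="workspace_")
lib_dir = os.path.join(workspace, "common_lib")  # hypothetical shared code
app_root = os.path.join(workspace, "app")        # hypothetical application root
os.makedirs(lib_dir)
os.makedirs(app_root)
with open(os.path.join(lib_dir, "helpers.py"), "w") as f:
    f.write("def greet():\n    return 'hello'\n")

method = vendor_library(lib_dir, app_root)
vendored_file = os.path.join(app_root, "common_lib", "helpers.py")
print(method, os.path.isfile(vendored_file))
```

With a symlink, edits to the library are visible in every application immediately and stepping into the library while debugging keeps working; with a copy, the script has to be rerun after each change.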

How to organize Eclipse - Workspace VS Programming languages

I use Eclipse for programming in PHP (PDT), Python and sometimes Android. Each of these programming languages requires many things to run after Eclipse starts.
Of course I do not use all of them at the same time; I have a different workspace for each of them. Is there any way, or recommendation, to make Eclipse run only the necessary tools when opening a given workspace?
e.g.:
I choose /workspace/www/, so then only PDT tools will run
I choose /workspace/android/, so then only Android tools and toolbar buttons will appear
Do I have to manually remove all unnecessary things from each of the workspaces? Or is it even possible to remove them all?
The plug-ins are stored in the Eclipse installation, not in the workspace folder. So one solution would be to use a different Eclipse installation for every task; in that case only the required plug-ins would load (and the others would not be available). On the other hand, you would have to maintain at least three parallel Eclipse installations.
Another solution is to disable plug-in activation on startup: in Preferences/General/Startup and Shutdown you can prevent individual plug-ins from loading. The problem with this approach is that it only stops the plug-ins themselves from loading; their menu and toolbar contributions will still be loaded.
I haven't done this myself... but apparently you can have ONE installation of Eclipse with multiple configurations: see this stackoverflow question.
Using different Eclipse configurations (as described in the link) would allow you to open Eclipse differently and thus only load the plugins you want.
