For the last six months I have been working on a Python GUI application that I will use at work. Specifically, my GUI will run on a couple of supercomputer clusters that I use for work.
However, I mostly develop the software on my personal computer, where I do not have direct access to the commands that my GUI will call, since the GUI uses subprocess to invoke commands that are only available on the computing cluster.
So, in order to develop the program efficiently, I often have to copy the directory containing all files related to the GUI to the cluster. Then I test my current version there, locate all my bugs, fix them by editing the files on the cluster, and finally copy all the files back to my computer, overwriting the old version.
This just seems like a bad way of doing it, but I have to be able to test my software in the environment it is made for in order to find my bugs.
Surely this is a common problem in software development... What do actual programmers do (as opposed to hobby programmers such as myself)?
Edit:
Examples of commands that are only available on the computing cluster, and that I make heavy use of, are squeue, sacct, and scontrol (SLURM-related commands).
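For context, the GUI shells out to these roughly like so (a minimal sketch; the flags shown are only illustrative):

    # A minimal sketch of how the GUI calls SLURM via subprocess;
    # squeue only exists where SLURM is installed.
    import subprocess

    def squeue_for_user(user):
        result = subprocess.run(
            ["squeue", "--user", user, "--noheader"],  # list one user's jobs
            capture_output=True, text=True, check=True,
        )
        return result.stdout.splitlines()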
Edit2:
I could mention that I tested using ssh connections with Python, but it slowed down the commands significantly, since the ssh connection had to be established for every single command I wanted to run. Unless I can set up a lasting ssh session, as in logging in once when opening my program, I don't think the ssh approach will work.
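What I had in mind with a lasting session would look something like this paramiko sketch (hostname and username are placeholders): log in once when the GUI starts, then reuse the same connection for every command.

    # A hedged sketch of one long-lived SSH connection with paramiko:
    # connect once at program start, then reuse the connection for each
    # SLURM command instead of paying the login cost every time.
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect("cluster.example.com", username="me")  # one login only

    def run_remote(command):
        """Run a command on the cluster over the existing connection."""
        _stdin, stdout, _stderr = client.exec_command(command)
        return stdout.read().decode()

    print(run_remote("squeue --user me"))
    print(run_remote("sacct --starttime today"))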
Explore the concepts that make Vagrant a popular choice for developers
Vagrant is a tool for building and managing virtual machine
environments in a single workflow. With an easy-to-use workflow and
focus on automation, Vagrant lowers development environment setup
time, increases production parity, and makes the "works on my machine"
excuse a relic of the past.
Your use case is covered by a couple of vagrant boxes that create a slurm cluster for development purposes. A good starting point might be
Example slurm cluster on your laptop (multiple VMs via vagrant)
If you understand and can set up your development environment with tools like Vagrant, you might next explore which options modern code editors or integrated development environments (IDEs) offer for remote development. Remote development covers some other use cases that might fit into your developer toolbox as well.
A "good enough", free and open source code editor for Python development is Visual Studio Code. According to the docs it has powerful features for remote development.
Visual Studio Code Remote Development allows you to use a container, remote machine, or the Windows Subsystem for Linux (WSL) as a full-featured development environment.
Read the docs
VS Code Remote Development
Related
So my team and I have bought into Docker - it is fantastic for deployment and testing. My real question is how to set up a great developer experience, specifically around writing Python apps, but this question could be generalized to nodejs, Java, etc.
The problem: When writing a Python app, I really like having decent linting/autocomplete functionality, there are some really good editors out there (Atom, VSCode, PyCharm) that provide these, but most really want a Python install on the local disk. The real advantage of Docker is that all of the core language and any project libraries can all be in the container, so reproducing all of that on the host machine just for developing is a pain.
I know that PyCharm pro does support Docker and docker-compose, but I found it quite sluggish and a lot of the test running capabilities were busted. On top of that, I really would like something that I can commit to version control so that the team can share dev setup and people don't have to repeat all of the steps for their own system.
A few ideas that I had were:
Install an editor (like Atom) in a sidecar Docker container and use X11 forwarding
Use a browser based editor such as https://c9.io/ in a container - this seems most promising
Install some agent in a dev container that could handle autocomplete/linting, etc. and connect to it from a locally running editor - I think this would be the best solution, but I also think that right now it actually doesn't exist.
Has anyone had luck setting up a more productive development environment besides just mounting volumes and editing text?
You should use an 'advanced' IDE like IntelliJ (PyCharm) and configure a remote Python SDK using SSH access to your app Docker container (using a shared SSH key to authenticate against the app container, which has a preinstalled OpenSSH server and a preconfigured authorized_keys file).
You can share this SDK information in your project file with all devs, so they will have this setup out of the box.
1) This ensures your IDE knows about all the Python libs/symbols available/installed in your Docker container at runtime. It also enables you to properly debug remotely at the same time.
2) This ensures you have a full IDE at hand, including a lot of important additional features like the inspector, 3-way diff, and search in path. Hardly any of the browser-based IDEs will catch up with PyCharm on this point, IMHO.
Of course, as already mentioned in the comments, you need to share (i.e. mount) your code into the container. On Linux, you plainly use host volume mounts from your local src folder into the container.
On OSX, you will run into performance issues when using host mounts. You might use something like http://docker-sync.io (I am biased; there are also a lot of other similar tools).
I know this is an old question, but as I stumbled across it while trying to see what other editors might offer in this space, I would like to point out Visual Studio Code's notion of a Dev Container, which seems to provide the best level of integration I've seen for this so far. I'm hoping to see this turn into an industry trend myself.
You could use x11docker:
x11docker allows you to run graphical desktop applications (and entire desktops) in Docker Linux containers.
Docker allows you to run applications in an isolated container environment. Containers need far fewer resources than virtual machines for similar tasks.
Docker does not provide a display server that would allow applications to run with a graphical user interface.
x11docker fills that gap. It runs an X display server on the host system and provides it to Docker containers.
Additionally, x11docker does some security setup to enhance container isolation and to avoid X security leaks. This provides a sandbox environment that fairly well protects the host system from possibly malicious or buggy software.
https://github.com/mviereck/x11docker
https://github.com/mviereck/x11docker/wiki (an extensive knowledge base)
https://dev.to/brickpop/my-dream-come-true-launching-gui-docker-sessions-with-dx11-in-seconds-1a53
I have a python server that I need to run in both a Linux and Windows environment, and my question is about deployment. What is the best method for deploying the solution instead of just double clicking on the file and running it?
Since I use serve_forever() on the server, I can just run the script from the command line, but this keeps the Python console window open. If I log off the machine, naturally the process will stop. So what is the best method for deploying a Python script that needs to keep running whether the user is logged in or not?
Since I am going to be using multiple environments, Linux and Windows, can you please be specific about which OS you are talking about?
For Windows, I was thinking of running the script 'At Startup' using the Windows scheduler, but I wanted to see if anyone had a better option. For Linux, I really don't know what to create. I am assuming a cron job?
Deployment does touch on the code itself: using serve_forever() on a multiprocessing job manager keeps the Python window open upon execution. Is there a way to hide this window through code? Would you recommend using a conversion tool like py2exe instead?
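For reference, the pattern in question is roughly this (a minimal sketch, not the actual job manager): serve_forever() blocks in the foreground, which is exactly why closing the console kills the server.

    # A minimal sketch of the serve_forever() pattern: the call blocks
    # in the foreground, so the process dies with the console session.
    import socketserver

    class EchoHandler(socketserver.BaseRequestHandler):
        def handle(self):
            data = self.request.recv(1024)
            self.request.sendall(data)

    with socketserver.TCPServer(("0.0.0.0", 9000), EchoHandler) as server:
        server.serve_forever()  # runs until the process is terminated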
This is the subject matter of a whole library of books, so I will just give an introduction here :-)
You can basically start scripts directly and then have multiple options to do this in a way that they keep running in the background.
If you have certain functionality that needs to run at regular intervals, you would do this by scheduling it:
Windows: Windows Scheduler or specific scheduling tools
Linux: Cron
If your problem is that you want to start a script without it closing on you while SSH'ing into Linux, you want to look into the "screen" or "tmux" tools.
If you want to have it started automatically, this could be done 'At Startup' as you point out, and Linux has similar functionality, but the preferred and more robust way would be to set up a service that is better integrated with the OS:
Windows: Windows Service
Linux: Daemon
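On the Linux side, a hedged sketch using the third-party python-daemon package, where serve() stands in for your serve_forever() loop:

    # A hedged sketch of daemonizing on Linux with the third-party
    # python-daemon package (pip install python-daemon).
    import daemon

    def serve():
        ...  # your long-running server loop goes here

    with daemon.DaemonContext():
        # the process forks into the background and detaches from the
        # terminal, so it keeps running after you log out
        serve()

On Windows, the equivalent would be a proper Windows Service, which can be written with the pywin32 package.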
Even more capabilities can be gained by using an application server such as Django.
Tomcat (see comment) is an option, but definitely not the standard one; you'll have a hard time finding support both from Tomcat people running Python or Python people running their stuff on Tomcat. That being said, I imagine you could enable CGI and have it run a Python command with your script.
Yet, instead of just starting a Python script, I would strongly encourage you to have a look at the different Python options that are probably available for your specific use case, from lightweight web solutions like Flask, through a versatile networking engine like Twisted, to a full-blown web framework like Django.
They all have rather well-thought-out deployment solutions available. Look up WSGI for more background.
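To make that concrete, the WSGI contract that all of these frameworks deploy through is just a callable; a minimal sketch:

    # A minimal WSGI application: any WSGI server (gunicorn on Linux,
    # waitress on Windows or Linux, mod_wsgi, ...) can host this callable.
    def application(environ, start_response):
        body = b"Hello from a WSGI app\n"
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

A cross-platform server such as waitress can then host it on both of your target OSes, e.g. waitress-serve mymodule:application.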
I am working on a project where I am quite comfortable running linux, virtualenv, pip, manage.py runserver, git and so on for back-end development. I work with a front-end developer who needs to collaborate remotely, currently via a Dropbox synced copy of the codebase (also in a git branch) on Windows. A development server on my side lets the developer see their changes semi-live.
Although this has served us fairly well so far, has anyone come across a similar working arrangement with a better setup for collaboration?
I'm mindful that the source control learning curve and environmental management overhead is potentially significant and somewhat unnecessary for front-end work (as long as I commit from time to time). I'm considering a VM based setup such as BitNami's DjangoStack so that the front-end dev has their own server setup, but I thought I'd ask about other experiences.
I would recommend vagrant not only for quick development setups (which it excels at), but also for sharing VM configurations as you can publish your own vagrant file which your designer uses.
It relies on VirtualBox, Oracle's open source hypervisor, and is available for free on all major platforms.
I have been in a very similar situation before, Rog, where the backend was a Ruby on Rails setup running on *nix and the frontend guy needed Windows. We initially set up a Windows + Apache + MySQL + git + RoR stack (using Cygwin and other tools), but eventually installing our app libraries and gems became a pain on the Windows setup (any time we introduced a new gem (or app, in Django terms) the setup would break on Windows). In the end we finally made the front-end guy work on a *nix setup.
andLinux is extremely useful in these situations: it lets you run a seamless install of Linux within a Windows 2000/XP setup, so the front-end guy can still use Windows tools. It is not like a dual boot; both OSes run at the same time. Have a look into it.
I commit every time I make some changes that I think might work: I don't do extensive testing before a commit. Also, my commits will soon be automatically pushed to a remote repository. (I'm the only developer, and I have to add features or rewrite parts of the code many times a day.)
I'd like to set up a remote computer to run regression tests automatically whenever I commit anything; and then email me back the differences report.
What's the easiest way to set this up?
All my code is in Python 3. My own system is Windows 7, ActiveState Python, TortoiseHG, and Wing IDE. I can set up the remote computer as either Linux or Windows. The application is all command-line, with text input and output.
Use a continuous integration server such as Buildbot or Jenkins and configure it to monitor the repository, then run the tests using that. Buildbot is written in Python, so you should feel right at home with it.
If you feel it's wasteful to make Buildbot or Jenkins poll the repository (even though hg pull uses very few resources when there are no new changesets), then you can configure a changegroup hook in the repository to trigger a build in the CI server.
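A changegroup hook can even be in-process Python; a hedged sketch (the module name and CI trigger URL are hypothetical):

    # cihooks.py -- a hedged sketch of an in-process Mercurial
    # changegroup hook; wire it up in the repository's .hg/hgrc:
    #   [hooks]
    #   changegroup = python:cihooks.trigger_build
    from urllib.request import urlopen

    def trigger_build(ui, repo, hooktype, **kwargs):
        ui.status(b"changesets received, triggering CI build...\n")
        # hit the CI server's build-trigger endpoint (Jenkins-style URL)
        urlopen("http://ci.example.com/job/myproject/build")
        return False  # a falsy return value means the hook succeeded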
I would recommend setting up Buildbot. You can have it watch a remote repository (Mercurial is supported) and automatically kick off a build when the repository changes. In your case, a build would just be running your test suite.
Its waterfall display allows you to see which builds failed and when, in relation to commits from the repository. It can even notify you, with the offending commit, when something breaks.
Jenkins is another option, supporting most of the same features. There are even cloud hosting options, like ShiningPanda that can host it for you, and they offer free licensing for open-source projects.
I'm finding Hadoop on Windows somewhat frustrating: I want to know if there are any serious alternatives to Hadoop for Win32 users. The features I most value are:
Ease of initial setup & deployment on a smallish network (I'd be astonished if we ever got more than 20 worker-PCs assigned to this project)
Ease of management - the ideal framework should have web/GUI based administration system so that I do not have to write one myself.
Something popular & stable. Bonuses depend on us getting this project delivered in time.
BACKGROUND:
The company I work for wants to build a new grid system to run some financial calculations.
The first framework I have been evaluating is Hadoop. This seemed to do exactly what was intended except that it's very UNIX oriented. I was able to get all of the tutorials up & running on an Ubuntu VirtualBox. Unfortunately nothing seems to run easily on Win32.
Yes... Win32: Our company has a policy that everything has to run on Windows. None of the server admins (or anybody outside of select few developers) know anything about Linux. I'd probably get in trouble if they found my virtual Ubuntu environment! The sad fact is that our grid needs to be hosted on Win32 (since all the test PCs run Windows XP 32bit), with an option to upgrade to Win64 at sometime in the future.
To complicate matters, 95% of what we want to run are Python scripts with C++ Windows 32-bit DLL add-ons. Our calculation library is overwhelmingly written in Python, and it will not run on anything other than Windows. I do not really have a choice.
For Python there is:
disco
bigtempo
celery - not really a map-reduce framework, but a good start if you want something very customized (a minimal sketch follows below)
And you can find a bunch of Hadoop clients/integrations on PyPI.
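As a taste of the celery option above, a hedged sketch (assumes a Redis broker running locally; the module name tasks.py is arbitrary):

    # tasks.py -- a minimal celery sketch; start a worker with:
    #   celery -A tasks worker --loglevel=info
    from celery import Celery

    app = Celery("tasks",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/0")

    @app.task
    def add(x, y):
        # stand-in for a real distributed work unit
        return x + y

From any other process, add.delay(2, 3).get() submits the task to a worker and returns 5.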
You could try MPI, a standard for message-passing in concurrent applications. We run it on our Linux cluster, but it is cross-platform. The most popular implementation is mpich2, written in C. There are Python bindings for MPI through the mpi4py library.
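A hedged hello-world with mpi4py, assuming an MPI runtime is installed (launch with something like mpiexec -n 4 python hello.py):

    # hello.py -- a minimal mpi4py sketch; requires an MPI runtime such
    # as mpich2 plus the mpi4py package.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()  # this process's id within the job
    size = comm.Get_size()  # total number of processes

    if rank == 0:
        data = [x * x for x in range(size)]  # root makes one item per process
    else:
        data = None

    item = comm.scatter(data, root=0)  # each process receives its share
    print("rank %d of %d got %r" % (rank, size, item))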
IPython has some parallel computing features that are simple and work on windows. It may be enough for your needs. Here's a good place to start:
http://showmedo.com/videotutorials/video?name=7200100&fromSeriesID=720
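For a sense of how this looks in code, a hedged sketch using the ipyparallel package (the modern successor to IPython's built-in parallel machinery); it assumes a local cluster was started with ipcluster start -n 4:

    # A hedged sketch with ipyparallel; assumes `ipcluster start -n 4`
    # is already running on the machine.
    import ipyparallel as ipp

    rc = ipp.Client()               # connect to the running cluster
    view = rc.load_balanced_view()  # hand tasks to whichever engine is free

    def price(x):
        return x * 1.05  # placeholder for a real financial calculation

    results = view.map_sync(price, range(20))
    print(results)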
I've compiled a list of available MapReduce/Hadoop offerings in the cloud (hosted services, PaaS-level), this might be of help as well.
Many distributed computing frameworks can be used for many-task computing. If you don't need the MapReduce paradigm, but rather the ability to distribute the tasks of a job across separate computers, with communication and resource management, then you could take a look at other platforms in this area like Condor, or even BOINC; both run on Windows.
You could also run Hadoop on Linux virtual machines.