How to keep the same Python console in PyCharm - python

I love PyCharm as it is very useful for my data analysis.
However, there is something I still can't figure out, and it is a big problem.
The console is very useful once I have loaded a lot of variables. But sometimes, especially when I run a piece of code that uses seaborn to create a new graph, all my variables disappear and I have to reload them again from scratch.
Do you know of a way to keep the data stored and run only a piece of my code without this problem?
Thank you

I am not really sure what issue you are having, but from the sounds of it you just need to store a bunch of data and only use some of it at a time.
If that's the case, I would save a file with each set of data and then import the one you want to use at that time.
If you stick to the same variable names in each data file, then your program can just do something like:
import data_435 as data
# now your code can always access data.whatever_var
# even though you have 435 different data sets
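To make the pattern concrete, here is a minimal sketch; the file name data_435.py and the variable names inside it are hypothetical:

# data_435.py -- one module per data set, all exposing the same names
whatever_var = [1.2, 3.4, 5.6]   # stand-in values for illustration
labels = ["a", "b", "c"]

# analysis.py -- switching data sets is a one-line change
import data_435 as data

print(data.whatever_var)
print(data.labels)

If you need to pick the data set at runtime, importlib.import_module("data_435") does the same thing with a string argument.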

Related

Migrating Python code away from Jupyter and over to PyCharm

I'd like to take advantage of a number of features in PyCharm, hence I'm looking to port code over from my notebooks. I've installed everything but am now faced with issues such as:
The display function appears to fail, so dataframe outputs (I used print instead) are not so nicely formatted. Is there an equivalent function?
I'd like to replicate the code cells of a Jupyter notebook. The Jupyter code is split over 9 cells in the one Jupyter file, and Shift+Enter is an easy way to check outputs and then move on. Now I've had to place all the code in the one project/Python file and have 1200 lines of code. Is there a way to section the code like it is in Jupyter? My VBA background envisions 9 routines and one additional calling routine to get the same result.
Each block of code imports data from SQL Server and some flat files, so there is some validation in between running them. I was hoping there was an alternative to manually selecting large chunks of code and executing them, and/or setting breakpoints every time it's run.
Any thoughts/links would be appreciated. I spent some $$ on a Udemy PyCharm course but it does not help me with this one.
Peter
The migration part is solved in this question: convert json ipython notebook(.ipynb) to .py file, but perhaps you already knew that.
The code-splitting part is harder. One reason why Jupyter is so widely used is the ability to split the output and run each cell separately. I would recommend @Andrew's answer, though.
If you are using classes, put each class in a new file.
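One concrete way to get the cell workflow inside PyCharm: Scientific Mode in the Professional edition treats #%% comment markers as cell separators, so a long file can be split into chunks that run independently, much like Shift+Enter in Jupyter. A minimal sketch with stand-in data (the real code would load from SQL Server instead):

#%% Cell 1: load data (stand-in for the SQL Server import)
import pandas as pd
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

#%% Cell 2: validate the load before moving on
print(df.describe())

#%% Cell 3: transform
df["z"] = df["x"] + df["y"]
print(df.head())

Each #%% block can be executed on its own, which avoids manually selecting large chunks of code.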

How to manage complexity while using IPython notebooks?

Imagine that you are working with a large dataset distributed over a bunch of CSV files. You open an IPython notebook and explore stuff, do some transformations, reorder and clean up data.
Then you start doing some experiments with the data, create some more notebooks, and in the end find yourself heaped up with a bunch of different notebooks which have data transformation pipelines buried in them.
How do you organize the data exploration/transformation/learning-from-it process in such a way that:
complexity doesn't blow up but grows gradually;
the codebase stays manageable and navigable;
data transformation pipelines stay reproducible and adjustable?
Well, I have this problem now and then when working with a big set of data. Complexity is something I learned to live with; sometimes it's hard to keep things simple.
What I think helps me a lot is putting everything in a Git repository. If you manage it well and make frequent commits with well-written messages, you can track the transformations to your data easily.
Every time I make some test, I create a new branch and do my work on it. If it gets nowhere, I just go back to my master branch and keep working from there, but the work I did is still available for reference if I need it.
If it leads to something useful, I just merge it into my master branch and keep working on new tests, making new branches as needed.
I don't think this answers all of your questions, and I don't know whether you already use some sort of version control for your notebooks, but it is something that helps me a lot and I really recommend it when using Jupyter notebooks.
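For readers new to Git, a minimal sketch of that branch-per-experiment flow; the branch name and commit message are just examples:

git checkout -b experiment-scaling     # start a throwaway experiment
git add notebooks/
git commit -m "Try min-max scaling"    # frequent, well-described commits

git checkout master                    # dead end? go back, work is kept
git merge experiment-scaling           # or merge it in if it worked out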

Maya selection import

I'm trying to figure out whether it's possible to import a Maya file into a Maya scene, but bring in only certain objects (such as locators named "xyz" and their animation) and skip everything else. (I'm not looking to import a folder full of files, but to select certain elements from a single Maya file.)
I've been searching low and wide for something resembling what I'm after, but I can't seem to find it.
Is it possible with Maya's Python API?
It feels like you will be much better off solving this problem at an earlier stage, rather than waiting until the file is imported into Maya.
If it's a .ma file (which is plain ASCII), you can probably parse it, filter it, and save the relevant elements into another .ma file. Otherwise, I found this forum question that seems relevant:
http://tech-artists.org/t/loading-mb-ma-outside-of-maya/2344
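If pre-filtering the file isn't an option, another workaround, different from the text-parsing approach above, is to import everything into a temporary namespace inside Maya and delete what you don't want. A rough sketch with maya.cmds; the file path and the "xyz" naming convention are assumptions taken from the question:

import maya.cmds as cmds

# import the whole file into an isolated namespace so nothing clashes
cmds.file("/path/to/source.ma", i=True, namespace="tmpImport")

# keep transforms whose short name contains "xyz"; delete the rest
imported = cmds.ls("tmpImport:*", transforms=True, long=True) or []
unwanted = [n for n in imported if "xyz" not in n.split("|")[-1]]
if unwanted:
    cmds.delete(unwanted)

# fold the kept nodes back into the root namespace
cmds.namespace(removeNamespace="tmpImport", mergeNamespaceWithRoot=True)

This is coarser than a true selective import (non-transform nodes such as imported materials need their own cleanup), but it keeps the locators together with their animation.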

Run Python code from a certain point

I am doing some data analysis with Python, and it involves reading data at the beginning of the script. I am currently debugging, and it is cumbersome to wait for the data file to be read on every run. Is there anything similar to a breakpoint that would let Python skip reading the data each time and just begin with the code below the reading step?
It sounds from your question like you have some lines at the beginning of a script which you do not want to re-run each time you execute it. A plain script can't really do that: scripts are read from the top down unless you call a function or something. With that said, here is what I gather you want your workflow to be:
Do some time-consuming data loading (once)
Try out code variations until one works
Be able to run the entire thing when you're done
If that's accurate, I suggest 3 options:
If you don't need the data loaded in step 1 for the specific code you're testing, just comment out the time-consuming portion until you're done with the new code.
If you do need the data, but not ALL of it, to test your new code, create a variable that looks like a small subset of the actual data returned, comment out the time-consuming portion, and switch it back when complete. Something like this:
# data_result = time_consuming_file_parser()  # commented out while testing
data_result = ["row1", "row2", "row3"]  # small stand-in subset

# new code using data_result
Finally, if you absolutely need the full data set but don't want to wait for it to load every time you make changes, try looking into pdb, the Python debugger. This will let you put a breakpoint after your data load and then play around in the Python shell until you are satisfied with your result:
data_result = time_consuming_file_parser()  # the slow part runs once

import pdb
pdb.set_trace()  # execution pauses here; data_result is available at the (Pdb) prompt
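A fourth pattern, not in the list above but common for exactly this workflow, is to cache the parsed result on disk so only the first run pays the loading cost. A minimal sketch using pickle; time_consuming_file_parser stands in for the real loading step, as in the snippets above:

import os
import pickle

CACHE = "parsed_data.pkl"

def time_consuming_file_parser():
    # stand-in for the real, slow data-reading step
    return ["row1", "row2", "row3"]

if os.path.exists(CACHE):
    with open(CACHE, "rb") as f:     # fast path: reuse the cached result
        data_result = pickle.load(f)
else:
    data_result = time_consuming_file_parser()
    with open(CACHE, "wb") as f:     # slow path: parse once, then cache
        pickle.dump(data_result, f)

Deleting parsed_data.pkl forces a fresh parse whenever the source data changes.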

Help with PyEPL logging

I have never used Python before; most of my programming has been in MATLAB and Unix. However, recently I was given a new assignment that involves fixing an old PyEPL program written by a former employee (I've tried contacting him directly but he won't respond to my e-mails). I know essentially nothing about Python, and though I am picking it up, I thought I'd quickly ask for some advice here.
Anyway, there are really two issues at hand. The first is this segment of the code:
exp = Experiment()
exp.setBreak()
vt = VideoTrack("video")
at = AudioTrack("audio")
kt = KeyTrack("key")
log = LogTrack("session")
clk = PresentationClock()
I understand what this is doing; it creates a series of tracking files in the directory after the program is run. However, I have searched a bunch of online tutorials and can't find a reference to any of these commands. Maybe I'm not searching in the right places, but I cannot find ANYTHING about this.
What I need to do is modify the
log = LogTrack("session")
segment of the code so that all of the session.log files go into a new directory, separate from the other log files. I also need to find a way to not only concatenate them into a single session.log file, but also add a new column to that file holding the subject number (the program is meant to be run by multiple subjects to collect data).
I am not asking anyone to do my work for me, but if anyone could give me some pointers or any sort of advice, I would greatly appreciate it.
Thanks
I would first check whether there is a line in the code like
from some_module_name import *
This could easily explain why you can call these names unqualified (they are classes, by the way). It will also tell you which file to look in to modify the code for LogTrack.
Edit:
So, a little digging finds that LogTrack is part of PyEPL's textlog module, and the other classes come from other modules. Somewhere in this person's code there should be lines something like:
from PyEPL.display import VideoTrack
from PyEPL.sound import AudioTrack
from PyEPL.textlog import LogTrack
...
This means that these are classes specific to PyEPL. There are a few ways you could modify how they work: you can change the source of the LogTrack class so that it operates differently, or, perhaps easier, simply subclass LogTrack and override some of its methods.
Either of these requires a fairly thorough understanding of how the class operates.
In any case, I would download the source from here, open up the code/textlog.py file, and start reading how LogTrack works.
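The answer above doesn't address the concatenation half of the question. For that part, a rough sketch in plain Python, assuming each subject's session.log sits in a directory named after the subject number and the logs are plain, line-oriented text (both assumptions):

import glob
import os

# find logs laid out as logs/<subject_number>/session.log (assumed layout)
log_paths = sorted(glob.glob(os.path.join("logs", "*", "session.log")))

with open("combined_session.log", "w") as out:
    for path in log_paths:
        subject = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            for line in f:
                # prepend the subject number as a new first column
                out.write("%s\t%s" % (subject, line))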
