How do you implement experiments with conditional branching in PsychoPy Builder? - python

Many behavioural experimental designs in psychology/neuroscience require conditional branching (e.g. only proceed to the test phase if a requisite performance level has been reached in an initial practice phase). PsychoPy’s Builder view allows one to generate a Python script to run an experiment using largely graphical controls. But it doesn't seem to have built-in support for conditional branching.
Can skipping a particular routine on a given run be implemented in Builder by using Python snippets in a Code component? Or does it require moving to the full Python Coder environment?

The Coder view in PsychoPy gives you full access to the Python programming language and hence you can implement arbitrarily complex experimental designs.
PsychoPy’s graphical Builder view, meanwhile, emphasises ease of use and simplicity over flexibility. One thing it does not cater for directly is conditional branching. It can, however, be hacked to achieve it indirectly.
Let’s say you have a three-phase experiment: a practice block, followed by one of two possible experimental blocks, conditionA or conditionB. After completing the practice block, high-performing subjects are assigned to conditionA, while low-performing subjects are assigned to conditionB.
To implement this in Builder, create three routines to represent the task blocks (Practice, conditionA, and conditionB), each surrounded by its own loop (practice_loop, A_loop, and B_loop, respectively). Also insert a routine between Practice and conditionA (called, say, assignCondition).
In the assignCondition routine, place a Code component. Assume in this case that a performance score counter was maintained in the Practice routine. We can use this to change the number of repetitions of subsequent routines. That is, by setting the repetition number of a loop to zero, we ensure that the routines inside that loop will not be executed. Hence the number of repetitions of these loops will not be a fixed value, but instead a variable (say, repetitionsA and repetitionsB).
In the "Begin Routine" tab of the assignCondition routine's Code component, put a Python snippet like this:
if performanceScore > 25:
    repetitionsA = 50  # run the conditionA routine 50 times
    repetitionsB = 0   # don't run conditionB at all
else:
    repetitionsA = 0   # vice versa: don't run conditionA
    repetitionsB = 50  # do run conditionB
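The performance counter itself can be maintained with a Code component on the Practice routine. A minimal sketch, assuming your Keyboard component is called key_resp and has "store correct" enabled:
# "Begin Experiment" tab: initialise the counter once
performanceScore = 0

# "End Routine" tab of the Practice routine: update it after each trial
if key_resp.corr:
    performanceScore = performanceScore + 1
Then set the nReps field of A_loop to repetitionsA and of B_loop to repetitionsB, so that whichever loop ends up with zero repetitions is skipped entirely.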
A fuller description of this technique is given by Matt Wall in a blog post here (with an fMRI block design as example, in which the order of blocks needs to be variable):
http://computingforpsychologists.wordpress.com/2013/11/12/how-to-hack-conditional-branching-in-the-psychopy-builder/


Which is preferred for mapping values: reusing mapping functions OR building reference tables to do lookups?
This is a very general, high-level question which, I believe, is mostly language-independent. For the example I will use SAS, but I would also like to know how people feel about this in R and Python.
I am on a team that likes to use mapping functions to take an input and generate some output, usually in a one-to-one manner. For example, here is some modified SAS code that shows what my team typically does:
proc fcmp outlib=work.funcs.maps;  /* OUTLIB added so the function can be stored and reused */
    function mapFunction(input $);
        output = .;
        if input = "Day zero" then output = 0;
        else if input = "Day one" then output = 1;
        else if input = "Day two" then output = 2;
        return(output);
    endsub;
run;
The team would then use mapFunction in multiple other programs when mapping needs to be done.
This goes against my instinct. My instinct would be to use the mapping function once to generate a set of key:value pairs or a dictionary or two-column table and thereafter use lookup/indexing/joining/merging operations to refer to the table.
I have a hard time explaining why this is my instinct, but it is. It feels better to have a reference table instead of calling a custom function repeatedly. However, I don't trust my feelings, and I'd like to hear from others to see if I can rationalize one technique over the other.
Can someone provide arguments/counter-arguments to support one technique or the other?
Thank you!
I've done a lot of conversions that involve re-mapping codes, though your situation may be a little different: conversions are one-off events, rather than ongoing transformations like overnight data-warehouse refreshes.
The default position is to use a table, as you suggest. It has real benefits in that a) you can query the data to check for non-existent values in the map before you convert, and b) the responsibility goes back to the business to confirm the code mappings are correct.
It gets trickier where the business rules for the conversion are in flux. For example, your first cut might provide a simple map, but then someone says 'except for these codes, where you need to look up an extra value' and suddenly you need a function. At that point, you need to revisit the code base to find everywhere the mapping table is used and update it to use a function. That can be a lesson learned, depending on the size of the code base.
I have used functions to wrap the use of a mapping table. If you don't know which code is going to be problematic in a month's time, this can be useful. The function just selects a value from the mapping table, as you would in a join. It will be slower to execute, but if you're optimizing programmer hours rather than execution time, that works. In any case, you get the data visibility of the code map as well as flexibility for scope creep from the function. When the scope changes, just update the function.
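To make this concrete in Python terms (the example above is SAS, but the question mentions R and Python too), a minimal sketch of a table wrapped in a function might look like this:
# The reference table is plain data: you can inspect it, join against it,
# and check incoming values for gaps before converting.
DAY_MAP = {
    "Day zero": 0,
    "Day one": 1,
    "Day two": 2,
}

def map_day(value):
    # A plain lookup today; if the rules later grow
    # ("except for these codes..."), only this function changes.
    return DAY_MAP[value]

print(map_day("Day one"))  # -> 1
Callers depend only on map_day, so a later switch from a pure lookup to more involved logic never ripples through the code base.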
When you say 'multiple other programs', that sounds like a red flag. If you store the data in mapping tables, would the database in which they are stored be visible to all of those programs? That would be a big consideration.
What are your evaluation criteria?
Are you looking for:
program run time
developer coding time
usability
ease of maintenance
Different people prioritize different things. The one good thing about the table method is that it's generic, simple and makes sense in almost any language. But...that does not mean it's the best for that language. How each language implements things under the hood will affect the criteria above.
In SAS, I would rarely use FCMP. Formats are one of the fastest lookup methods there, and a format really is a table behind the scenes that can be easily updated, so that would be my choice in SAS. It's slightly slower than the HASH and SET+KEY methods, but it's easier to use and maintain, and that ends up saving more time in the long run. And I prioritize human time over computer time for 99% of my projects.
If you search on lexjansen.com you'll find many papers on the different ways to do look ups and a comparison of their run times. https://www.lexjansen.com/phuse/2007/cs/CS06.pdf

What parts of analytical code to write unit tests for?

For the last while I've been writing analytical Python code that gets run on demand when users interact with a front-end tool, via queue-based batch processing.
Typically the users set some values in the front-end tool that get passed as parameters to the analytical code, and they either supply a dataset or choose a subset of data from an overall data source that their company provides.
Typically each analytical model sits in a larger repo amongst other analytical models, so each model usually sits in its own module, and that module exports one function which is the entry point into that model. The models range from simple ones that take on the order of minutes to very complex statistical or machine-learning models that might use combinations of numpy/Pandas/Numba or Dask dataframes and take on the order of hours.
Now to my question: I've been going back and forth on where I should concentrate my testing efforts for this type of code. Earlier in my career I naively thought that every function should have a unit test, so my code would have a comprehensive set of tests.
I quickly realised that this was counter-productive, as even a small performance refactor could mean ripping apart, and possibly even throwing away, a lot of the unit tests. So it felt like I should only write tests for the main public function of each model. However, this usually leads to the opposite problem: for some of the more complex models, edge cases that sit quite deep in the control flow are now hard to test.
My question then is: how should I properly test these analytical models? Some people would probably say "only test public-facing functions; if you can't reach an edge case through the public-facing functions then it should technically not be reachable, so it doesn't need to be there". But I've found that in reality this doesn't quite work.
To provide a simple example, say the particular model is to calculate a frequency matrix for dropoff/pickup points from a taxi dataset.
import pandas as pd

def _cat(col1, col2):
    cat_col = col1.astype(str).str.cat(col2.astype(str), sep=", ")
    return cat_col

def _make_points_df(taxi_df):
    pickup_points = _cat(taxi_df["pickup_longitude"], taxi_df["pickup_latitude"])
    dropoff_points = _cat(taxi_df["dropoff_longitude"], taxi_df["dropoff_latitude"])
    points_df = pd.DataFrame({"pickup": pickup_points, "dropoff": dropoff_points})
    return points_df

def _points_df_to_freq_mat(points_df):
    mat_df = points_df.groupby(["pickup", "dropoff"]).size().unstack(fill_value=0)
    return mat_df

def _validate_taxi_df(taxi_df):
    if type(taxi_df) is not pd.DataFrame:
        raise TypeError(f"taxi_df param must be a pandas dataframe, got: {type(taxi_df)}")
    expected_cols = {
        "pickup_longitude",
        "pickup_latitude",
        "dropoff_longitude",
        "dropoff_latitude",
    }
    if set(taxi_df) != expected_cols:
        raise RuntimeError(
            f"Expected the following columns for taxi_df param: {expected_cols}. "
            f"Got: {set(taxi_df)}"
        )

def calculate_frequency_matrix(taxi_df, long_round=1, lat_round=1):
    """Calculate a dropoff/pickup frequency matrix which tells you the number of times
    passengers have been picked up and dropped off at a given discrete point. The
    resolution of these points is controlled by the long_round and lat_round params.

    Parameters
    ----------
    taxi_df : pandas.DataFrame
        Dataframe specifying dropoff and pickup long/lat coordinates
    long_round : int
        Number of decimal places to round the dropoff and pickup longitude values to
    lat_round : int
        Number of decimal places to round the dropoff and pickup latitude values to

    Returns
    -------
    pandas.DataFrame
        Dataframe in matrix format of frequency of dropoff/pickup points

    Raises
    ------
    TypeError : If taxi_df is not a pandas DataFrame
    RuntimeError : If taxi_df does not contain correct columns
    """
    _validate_taxi_df(taxi_df)
    taxi_df = taxi_df.copy()
    taxi_df["pickup_longitude"] = taxi_df["pickup_longitude"].round(long_round)
    taxi_df["dropoff_longitude"] = taxi_df["dropoff_longitude"].round(long_round)
    taxi_df["pickup_latitude"] = taxi_df["pickup_latitude"].round(lat_round)
    taxi_df["dropoff_latitude"] = taxi_df["dropoff_latitude"].round(lat_round)
    points_df = _make_points_df(taxi_df)
    mat_df = _points_df_to_freq_mat(points_df)
    return mat_df
Taking in a dataframe like
pickup_longitude pickup_latitude dropoff_longitude dropoff_latitude
0 -73.988129 40.732029 -73.990173 40.756680
1 -73.964203 40.679993 -73.959808 40.655403
2 -73.997437 40.737583 -73.986160 40.729523
3 -73.956070 40.771900 -73.986427 40.730469
4 -73.970215 40.761475 -73.961510 40.755890
5 -73.991302 40.749798 -73.980515 40.786549
6 -73.978310 40.741550 -73.952072 40.717003
7 -74.012711 40.701527 -73.986481 40.719509
Say in terms of a folder structure this code would sit at
analytics/models/taxi_freq/taxi_freq.py
and the
analytics/models/taxi_freq/__init__.py
file would look like
from .taxi_freq import calculate_frequency_matrix
And obviously the private functions in the above code could be split across multiple utility files in analytics/models/taxi_freq/.
Would the consensus be to only test the calculate_frequency_matrix function, or should the "private" helper methods and other utility files/functions within the taxi_freq module also be tested?
As with software development in general, testing requires you to search for solutions that represent the (ideally optimal) tradeoff between competing goals. One of the primary goals of testing in general, and of unit testing in particular, is to find bugs (see Myers, Badgett, Sandler: The Art of Software Testing, or Beizer: Software Testing Techniques, but also many others).
In your project you may take a more relaxed position on this, but there are many software projects where it would have serious consequences if implementation-level bugs escaped to later development phases or even to the field. Some say your goal should rather be to increase confidence in your code - and this is also true, but confidence can only be a consequence of doing testing right. If you don't test to find bugs, then I will simply not have confidence in your code after you have finished testing.
When finding bugs is a primary goal of unit-testing, then attempts to keep unit-test suites completely independent of implementation details is likely to result in inefficient test suites - that is, test suites that are not suited to find all bugs that could be found. Different implementations have different potential bugs. If you don't use unit-testing for finding these bugs, then any other test level (integration, subsystem, system) is definitely less suited for finding them systematically.
For example, think about the different ways to implement a Fibonacci function: as an iterative or recursive function, as a closed form expression (Moivre/Binet), or as a lookup table: The interface is always the same, the possible bugs differ significantly, and so do the unit-testing strategies. There will be a useful set of implementation independent test cases, but these alone will not be sufficient to find all bugs that are likely for the specific implementation.
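To make that concrete, here is a sketch (in Python, for illustration) of two such implementations. Both pass the same interface-level tests for small inputs, but only a test that knows about the floating-point implementation will probe large n, where the closed form silently goes wrong:
from math import sqrt

def fib_iterative(n):
    # Integer arithmetic only: exact for any n.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed_form(n):
    # Moivre/Binet formula: same interface, but float rounding makes it
    # return wrong values once n is large enough (around n = 70 with doubles).
    phi = (1 + sqrt(5)) / 2
    return round(phi ** n / sqrt(5))
An implementation-agnostic suite that checks, say, n = 0..20 cannot distinguish the two; a unit test aimed at the closed form specifically should include a large-n case.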
The goal of having an effective test suite is therefore in competition with another goal, namely having a maintenance-friendly test suite. This second goal, however, comes in different forms with different consequences: you could demand that the unit-test suite shall not be affected when implementation details change. This is quite tough, and IMO puts the secondary goal of maintenance-friendly test code above the primary goal of finding bugs.
Meszaros has a more balanced formulation, namely "The effort for changes to the code base shall be commensurate with the effort to maintain the test suite." (see Meszaros: Principles of Test Automation: Ensure Commensurate Effort). That is, little changes to the SUT shall only require little changes to the test suite, for larger changes to the SUT it is acceptable that the test suite requires equally large changes. (However, for me personally the formulation "the effort for test code maintenance shall be low" is sufficient.)
Conclusion:
For me, as I see finding bugs as the primary goal and test suite maintainability as a secondary goal, this leads to the following consequence: I accept that I have to test implementation details as well in order to find more bugs. Despite this, I nevertheless try to keep the maintenance effort low, mostly by applying the following mechanisms that aim at making it simpler to adjust the test suite when the SUT changes:
First, if the goal of a specific test case can be reached by an implementation agnostic test case and an implementation dependent test case, prefer the implementation agnostic test case. In other words, don't make individual test cases unnecessarily implementation dependent.
Second, hide implementation details behind helper functions. There can be helper functions for specific setups, teardowns, assertions etc. This is an extremely powerful mechanism to limit the effect of implementation details within the test suite.
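Tying this back to the question's taxi example, a minimal pytest sketch of that second mechanism might look as follows; the helper hides the column layout so the tests don't repeat that detail:
import pandas as pd
import pytest
from analytics.models.taxi_freq import calculate_frequency_matrix

def make_taxi_df(rows):
    # Setup helper: the expected column names live in exactly one place.
    return pd.DataFrame(rows, columns=[
        "pickup_longitude", "pickup_latitude",
        "dropoff_longitude", "dropoff_latitude",
    ])

def test_single_trip_counts_once():
    taxi_df = make_taxi_df([[-73.99, 40.73, -73.99, 40.76]])
    mat = calculate_frequency_matrix(taxi_df)
    assert mat.to_numpy().sum() == 1

def test_non_dataframe_rejected():
    with pytest.raises(TypeError):
        calculate_frequency_matrix([1, 2, 3])
If the column layout ever changes, only make_taxi_df needs updating, not every test.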

Has Python got branch prediction?

I implemented a physics simulation in Python (most of the heavy lifting is done in numerical libraries anyways, thus performance is good enough).
Now that the project has grown a bit, I added extra functionality via parameters that do not change during the simulation. With that comes the necessity to have the program do one thing or another based on their values, i.e., quite a few if-else scattered around the code.
My question is simple: does Python implement some form of branch prediction? Is this going to hurt performance significantly, or is the interpreter smart enough to see that some parameters never change? If a constant if-else sits inside a function that is called a million times, is the conditional evaluated every time, or does some magic happen? When there is no easy way to remove the conditional altogether, is there a way to give the interpreter some hints to favour/emulate branch prediction?
You could in theory benefit here from some JIT functionality that may observe the control flow over time and could effectively suppress never-taken branches by rearranging the code. Some of the Python interpreters contain JIT compilers (I think PyPy does in newer versions, maybe Jython as well), and may be able to do this optimization, but that of course depends on the actual code.
However, the main form of branch prediction is done in hardware, and is unrelated to the software or language constructs used (in Python's case, quite a few levels of abstraction above). This mechanism eventually observes these conditional code paths as branches, and may be able to learn them if they are indeed statically determined. However, like any prediction mechanism, it has limited capacity, and since your code is presumably large, it may not be able to accommodate predictions for all these branches. It's still considered quite good, so chances are that the critical ones may work.
Finally, if you really want to optimize your code, you can convert some of these conditions to constants (e.g., assigning an argument a constant value instead of parsing the command line), or hide the condition completely behind something like __debug__. This way you won't have to worry about predicting them, but you can restore the capability with minimal work if you need it in the future.
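As a minimal sketch of that last idea, you can bind the decision once, outside the hot loop, instead of re-evaluating the conditional on every call:
def make_step(use_correction):
    # The branch is taken exactly once, when the simulation is set up.
    if use_correction:
        return lambda x: x * 0.99
    return lambda x: x

step = make_step(use_correction=True)
x = 1.0
for _ in range(1_000_000):
    x = step(x)  # no conditional inside the loop
Note that in CPython the cost of a plain if on a constant is small compared to general interpreter overhead, so treat this as a structural choice and measure before optimizing.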

What does abstraction mean in programming?

I'm learning Python and I'm not sure I understand the following statement: "The function (including its name) can capture our mental chunking, or abstraction, of the problem."
It's the part in bold whose meaning I don't understand in terms of programming. The quote comes from http://www.openbookproject.net/thinkcs/python/english3e/functions.html
How to Think Like a Computer Scientist, 3rd edition.
Thank you!
Abstraction is a core concept in all of computer science. Without abstraction, we would still be programming in machine code, or worse, not have computers in the first place. So IMHO that's a really good question.
What is abstraction
Abstracting something means to give names to things, so that the name captures the core of what a function or a whole program does.
One example is given in the book you reference, where it says
Suppose we’re working with turtles, and a common operation we need is
to draw squares. “Draw a square” is an abstraction, or a mental chunk,
of a number of smaller steps. So let’s write a function to capture the
pattern of this “building block”:
Forget about the turtles for a moment and just think of drawing a square. If I tell you to draw a square (on paper), you immediately know what to do:
draw a square => draw a rectangle with all sides of the same length.
You can do this without further questions because you know by heart what a square is, without me telling you step by step. Here, the word square is the abstraction of "draw a rectangle with all sides of the same length".
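In the book's turtle setting, that abstraction becomes a named function. A minimal sketch using Python's standard turtle module:
import turtle

def draw_square(t, size):
    # The name captures the whole mental chunk "draw a square".
    for _ in range(4):
        t.forward(size)
        t.left(90)

t = turtle.Turtle()
draw_square(t, 100)
From now on you can say draw_square instead of spelling out the four forward-and-turn steps.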
Abstractions run deep
But wait, how do you know what a rectangle is? Well, that's another abstraction for the following:
rectangle => draw two lines parallel to each other, of the same length, and then add another two parallel lines perpendicular to the other two lines, again of the same length but possibly of different length than the first two.
Of course it goes on and on - lines, parallel, perpendicular, connecting are all abstractions of well-known concepts.
Now, imagine each time you want a rectangle or a square to be drawn you have to give the full definition of a rectangle, or explain lines, parallel lines, perpendicular lines and connecting lines -- it would take far too long to do so.
The real power of abstraction
That's the first power of abstractions: they make talking and getting things done much easier.
The second power of abstractions comes from the nice property of composability: once you have defined abstractions, you can compose two or more abstractions to form a new, larger abstraction: say you are tired of drawing squares, but you really want to draw a house. Assume we have already defined the triangle, so then we can define:
house => draw a square with a triangle on top of it
Next, you want a village:
village => draw multiple houses next to each other
Oh wait, we want a city -- and we have a new concept street:
city => draw many villages close to each other, fill empty spaces with more houses, but leave room for streets
street => (some definition of street)
and so on...
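In code, such composition is just functions calling functions. A rough sketch building on draw_square from above (positioning is glossed over for brevity):
def draw_triangle(t, size):
    for _ in range(3):
        t.forward(size)
        t.left(120)

def draw_house(t, size):
    # A new, larger abstraction composed from two existing ones.
    draw_square(t, size)
    draw_triangle(t, size)

def draw_village(t, size, houses=5):
    for _ in range(houses):
        draw_house(t, size)
        t.forward(size * 1.5)  # move along to the next plot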
How does this all apply to programming?
If in the course of planning your program (a process known as analysis and design), you find good abstractions to the problem you are trying to solve, your programs become shorter, hence easier to write and - maybe more importantly - easier to read. The way to do this is to try and grasp the major concepts that define your problems -- as in the (simplified) example of drawing a house, this was squares and triangles, to draw a village it was houses.
In programming, we define abstractions as functions (and some other constructs like classes and modules, but let's focus on functions for now). A function essentially names a set of single statements, so a function essentially is an abstraction -- see the examples in your book for details.
The beauty of it all
In programming, abstractions can make or break productivity. That's why commonly used functions are often collected into libraries which can be reused by others. This means you don't have to worry about the details; you only need to understand how to use the ready-made abstractions. Obviously that should make things easier for you, so you can work faster and thus be more productive:
Example:
Imagine there is a graphics library called "nicepic" that contains pre-defined functions for all abstractions discussed above: rectangles, squares, triangles, house, village.
Say you want to create a program, based on the above abstractions, that paints a nice picture of a house; all you have to write is this:
import nicepic
nicepic.draw_house()
So that's just two lines of code to get something much more elaborate. Isn't that just wonderful?
A great way to understand abstraction is through abstract classes.
Say we are writing a program which models a house. The house is going to have several different rooms, which we will represent as objects. We define a class for a Bathroom, Kitchen, Living Room, Dining Room, etc.
However, all of these are Rooms, and thus share several properties (number of doors/windows, square feet, etc.). BUT a Room can never exist on its own... it's always going to be some type of room.
It then makes sense to create an abstract class called Room, which will contain the properties all rooms share, and then have the classes of Kitchen, Living Room, etc, inherit the abstract class Room.
The concept of a room is abstract and only exists in our head, because any room that actually exists isn't just a room; it's a bedroom or a living room or a classroom.
We want our code to thus represent our "mental chunking". It makes everything a lot neater and easier to deal with.
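In Python, this is exactly what the abc module expresses. A minimal sketch:
from abc import ABC, abstractmethod

class Room(ABC):
    # Properties every room shares.
    def __init__(self, square_feet, doors, windows):
        self.square_feet = square_feet
        self.doors = doors
        self.windows = windows

    @abstractmethod
    def purpose(self):
        """What this room is for; only concrete rooms can answer."""

class Kitchen(Room):
    def purpose(self):
        return "cooking"

kitchen = Kitchen(120, 1, 2)  # fine
# room = Room(100, 1, 1)      # TypeError: can't instantiate an abstract class
Just like the mental concept, Room cannot be instantiated on its own; only concrete rooms can.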
As defined on wikipedia: Abstraction_(computer_science)
Abstraction tries to factor out details from a common pattern so that
programmers can work close to the level of human thought, leaving out
details which matter in practice, but are not exigent to the problem being
solved.
Basically it is removing the details of the problem. For example, to draw a square requires several steps, but I just want a function that draws a square.
Let's say you write a function which receives a bunch of text as a parameter, then reads credentials from a config file, then connects to an SMTP server using those credentials and sends a mail containing that text.
The function should be named sendMail(text), not parseTextReadCredentialsInFileConnectToSmtpThenSend(text), because it is easier to grasp what it does this way, both for yourself and when presenting the API to coworkers or users... even though the second name is more accurate, the first is a better abstraction.
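A sketch of what such a function might look like; the config file name and layout here are made up for illustration:
import smtplib
import configparser
from email.message import EmailMessage

def _read_credentials(path):
    # Detail hidden from callers: where credentials live and how they're parsed.
    cfg = configparser.ConfigParser()
    cfg.read(path)
    return cfg["smtp"]["user"], cfg["smtp"]["password"]

def send_mail(text):
    # The name says *what* happens; the body says *how*.
    user, password = _read_credentials("mail.cfg")
    msg = EmailMessage()
    msg.set_content(text)
    msg["Subject"] = "Automated mail"
    msg["From"] = user
    msg["To"] = user
    with smtplib.SMTP("smtp.example.com") as server:
        server.login(user, password)
        server.send_message(msg)
Callers just write send_mail("report attached"); that is the abstraction doing its job.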
In a simple sentence, I can say: the essence of abstraction is to extract essential properties while omitting inessential details. But why should we omit inessential details? The key motivator is reducing the risk from change.
The best way to describe something is to use examples:
A function is nothing more than a series of commands to get stuff done. Basically, you can organize a chunk of code that does a single thing. That single thing can then get re-used over and over throughout your program.
Now that your function does this thing, you should name it so that it's instantly identifiable as to what it does. Once you have named it, you can re-use it all over the place by simply calling its name.
def bark():
    print("woof!")
Then to use that function you can just do something like:
bark()
What happens if we wanted this to bark 4 times? Well, you could write bark() 4 times.
bark()
bark()
bark()
bark()
Or you could modify your function to accept some type of input, to change how it works.
def bark(times):
    i = 0
    while i < times:
        i = i + 1
        print("woof")
Then we could just call it once:
bark(4)
When we start talking about Object Oriented Programming (OOP) abstraction means something different. You'll discover that part later :)
Abstraction is a very important concept, both in hardware and in software.
Importance: We humans cannot remember everything all the time. For example, if your friend quickly speaks 30 random numbers and asks you to add them all, it won't be possible for you. The reason? You cannot keep all those numbers in mind at once. Even if you write the numbers down on paper, you will still add the rightmost digits one column at a time, ignoring the digits to the left, and only move on to the next column once the rightmost ones have been added.
It shows that at any one time we humans can focus on one particular issue, ignoring the parts that are already solved and moving our focus to what is left to be solved.
Ignoring the less important things and focusing on the most important (for the time being, and in a particular context) is called abstraction.
Here is how abstraction works in programming.
Below is the world's famous hello world program in C language:
// C hello world example: hello.c
#include <stdio.h>

int main()
{
    printf("Hello world\n");
    return 0;
}
This is the simplest program, and usually the first, a programmer writes. When you compile and run it from the command prompt, it simply prints: Hello world
Now, here are the serious questions.
The computer only understands binary code, so how was it able to run your English-like code? You may say that you compiled the code to binary using a compiler. But did you write the compiler to make your program work? No, you did not need to. You installed the GNU C compiler on your Linux system and just used it by giving the command:
gcc -o hello hello.c
It converted your English-like C code to binary code, and you could then run that code by giving the command:
./hello
So, to write an application in C, you never need to know how the C compiler converts C code to binary code: you used the GCC compiler as an abstraction.
Did you write the code for the main() and printf() functions? No, both are already defined by someone else in the C language. When a C program runs, it looks for the main() function as the starting point, while the printf() function, which is declared in stdio.h (which is why we include it), prints output to the computer screen. If these two functions were not already written, we would have to write them ourselves just to print two words, and computers would be the most boring machines on earth. Here again you use abstraction: you don't need to know how printf prints text on the monitor; all you need to know is how to give input to the printf function so that it shows the desired output.
I did not expand the answer to abstraction of operating system, kernel, firmware and hardware for the sake of simplicity.
Things to remember:
While doing programming, you can use abstraction in a variety of ways to make your program simple and easy.
Example 1: You can use a constant to abstract the value of pi (3.14159) in your program, because PI is easier to remember than 3.14159 throughout the rest of the program.
Example 2: You can write a function which returns the square of a given number, and then anyone, including you, can use that function by passing it input as a parameter and getting a return value back.
Example 3: In object-oriented programming (OOP) languages like Java, you may define an object which encapsulates data and methods, and you can use that object by invoking its methods.
Example 4: Many applications provide an API which you use to interact with that application. When you use API methods, you never need to know how they are implemented; abstraction is at work there.
Through all these examples, you can see the importance of abstraction and how it is implemented in programming. One key thing to remember is that abstraction is contextual, i.e., what counts as essential keeps changing with the context.

Writing reusable code

I find myself constantly having to change and adapt old code back and forth for different purposes, and occasionally to re-implement the same purpose it had two versions ago.
One example of this is a function which deals with prime numbers. Sometimes what I need from it is a list of n primes. Sometimes what I need is the nth prime. Maybe I'll come across a third need from the function down the road.
Any way I do it though I have to do the same processes but just return different values. I thought there must be a better way to do this than just constantly changing the same code. The possible alternatives I have come up with are:
Return a tuple or a list, but this seems kind of messy since there will be all kinds of data types within including lists of thousands of items.
Use input statements to direct the code, though I would rather just have it do everything for me when I click run.
Figure out how to utilize class features to return class properties and access them where I need them. This seems to be the cleanest solution to me, but I am not sure since I am still new to this.
Just make five versions of every reusable function.
I don't want to be a bad programmer, so which choice is the correct choice? Or maybe there is something I could do which I have not thought of.
Modular, reusable code
Your question is indeed important. It's important in a programmer's everyday life. It is the question:
Is my code reusable?
If it's not, you will run into code redundancy, having the same lines of code in more than one place. This is a perfect breeding ground for bugs. Imagine you want to change the behavior somehow, e.g., because you discovered a potential problem. Then you change it in one place, but forget the second location - especially if your code reaches dimensions like 1,000, 10,000 or 100,000 lines of code.
This is summarized in the SRP, the Single-Responsibility Principle. It states that every class (and, equally, every function) should have only one responsibility, that it "should do just one thing". If a function does more than one thing, you should break it apart into smaller chunks, smaller tasks.
Every time you come across (or write) a function with more than 10 or 20 lines of (real) code, you should be skeptical. Such functions rarely stick to this principle.
For your example, you could identify as individual tasks:
generate prime numbers, one by one (generate implies using yield for me)
collect n prime numbers. Uses 1. and puts them into a list
get nth prime number. Uses 1., but does not save every number, just waits for the nth. Does not consume as much memory as 2. does.
Find pairs of primes: Uses 1., remembers the previous number and, if the difference to the current number is two, yields this pair
collect all pairs of primes: Uses 4. and puts them into a list
...
...
The list is extensible, and you can reuse it at any level. No function will have more than 10 lines of code, and you will not be reinventing the wheel every time.
Put them all into a module, and use it from every script for a Project Euler problem related to primes (a sketch follows below).
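A sketch of such a module, covering tasks 1-3 with deliberately simple trial division:
from itertools import islice

def gen_primes():
    # Task 1: yield primes one by one, indefinitely.
    n = 2
    while True:
        if all(n % p for p in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def first_n_primes(n):
    # Task 2: collect the first n primes into a list.
    return list(islice(gen_primes(), n))

def nth_prime(n):
    # Task 3: return the nth prime without storing the others.
    return next(islice(gen_primes(), n - 1, None))

print(first_n_primes(5))  # [2, 3, 5, 7, 11]
print(nth_prime(5))       # 11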
In general, I started a small library for my Euler Problem scripts. You really can get used to writing reusable code in "Project Euler".
Keyword arguments
Another option you didn't mention (as far as I understand) is the use of optional keyword arguments. If you regard small, atomic functions as too complicated (though I really insist you should get used to it) you could add a keyword argument to control the return value. E.g., in some scipy functions there is a parameter full_output, that takes a bool. If it's False (default), only the most important information is returned (e.g., an optimized value), if it's True some supplementary information is returned as well, e.g., how well the optimization performed and how many iterations it took to converge.
You could define a parameter output_mode, with possible values "list", "last" or whatever.
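A sketch of how such a flag might look; the names here are illustrative, mirroring scipy's full_output convention:
def nth_prime(n, full_output=False):
    candidate, found, checked = 1, 0, 0
    while found < n:
        candidate += 1
        checked += 1
        if all(candidate % p for p in range(2, int(candidate ** 0.5) + 1)):
            found += 1
    if full_output:
        return candidate, {"candidates_checked": checked}
    return candidate

print(nth_prime(5))                    # 11
print(nth_prime(5, full_output=True))  # (11, {'candidates_checked': 10})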
Recommendation
Stick to small, reusable chunks of code. Getting used to this is one of the most valuable things you can pick up at "Project Euler".
Remark
If you try to implement the pattern I propose for reusable functions, you might run into a problem immediately at point 1: how do you create a generator-style function for this, e.g., if you use the sieve method? But it's not too bad; see the sketch below.
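One well-known way around this is an incremental sieve that keeps a dictionary of upcoming composites instead of a fixed-size array; a sketch of the idea, not tuned for speed:
def gen_primes_sieve():
    # Maps each known composite to the primes that "witness" it.
    composites = {}
    candidate = 2
    while True:
        if candidate not in composites:
            # Not marked by any smaller prime: it's prime.
            yield candidate
            composites[candidate * candidate] = [candidate]
        else:
            # Slide each witness prime forward to its next multiple.
            for prime in composites.pop(candidate):
                composites.setdefault(candidate + prime, []).append(prime)
        candidate += 1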
My suggestion: create a module that contains:
a private core function (for example: return a list of the first n primes, or even something more general)
several wrapper/utility functions that use the core one and prepare the output in different ways (for example: the nth prime number)
Try to break your functions down as much as possible, and reuse them.
For example you might have a function next_prime which is called repeatedly by n_primes and n_th_prime.
This also makes your code more maintainable, as if you come up with a more efficient way to count primes, all you do is change the code in next_prime.
Furthermore, you should make your output as neutral as possible. If your function returns several values, it should return a list or a generator, not a comma-separated string.
