SQLAlchemy sqlalchemy.sql.expression.select vs. sqlalchemy.sql.expression.Select

SQLAlchemy sqlalchemy.sql.expression.select vs. sqlalchemy.sql.expression.Select - python

So I'm brand new to SQLAlchemy, and I'm trying to use the SQL Expression API to create a SELECT statement that specifies the exact columns to return. I found both a class and a function defined in the sqlalchmey.sql.expressions module and I'm not too sure which to use... Why do they have both a class and a function? When would you use one over the other? And would anyone be willing to explain why they need to have both in their library? It doesn't really make much sense to me to be honest, other than just to confuse me. :) JK
Thanks for the help in advance!

Use the source.
Here's the implementation of the select function, from the source code:
def select(columns=None, whereclause=None, from_obj=[], **kwargs):
"""Returns a ``SELECT`` clause element.
(... long docstring ...)
"""
return Select(columns, whereclause=whereclause, from_obj=from_obj, **kwargs)
So, it is exactly the same.

the expression package provides Python functions to do everything. These functions in some cases return a class instance verbatim from the function's arguments and other times compose an object from several components. It was originally the idea that the functions would be doing a lot more composition than they ended up doing in the end. In any case, the package prefers to stick to pep-8 as far as classes being in CamelCase, functions being all lowercase, and wanted the front end API to be all lower case - so you have the public "constructor" functions.
The SQL expression language is very easy to grok if you start with the tutorial.

I think it's pretty much the same. The documentation says for select (the function):
The returned object is an instance of Select.
As you can pass the select function the same parameters that Select.__init__() accepts, I don't really see a difference. At first glance the arguments of the class constructor seem to be a superset of the function's. But the function can be passed any of the constructor's keyword arguments.

Related

python string to a function call with arguments, without using eval

I have a string stored in a database stands for a class instance creation for example module1.CustomHandler(filename="abc.csv", mode="rb"), where CustomHandler is a class defined in module1.
I would like to evaluate this string to create a class instance for a one time use. Right now I am using something like this
statement = r'module1.CustomHandler(filename="abc.csv", mode="rb")' # actually read from db
exec(f'from parent.module import {statement.split(".")[0]}')
func_or_instance = eval(statement) # this is what I need
Only knowledgable developers can insert such records into database so I am not worried about eval some unwanted codes. But I've read several posts saying eval is unsafe and there is always a better way. Is there a way I can achieve this without using eval?

You might want to take a look at the ast Python module, which stands for abstract syntax trees. It's mainly used when you need to process the grammar of the programming language, work with code in string format, and so much more functions available in the official documentation.
In this case eval() function looks like the best solution, clear and readable, but safe only under certain conditions. For example, if you try to evaluate a code that contains a class not implemented in the code, it will throw an exception. That's the main reason why eval is sometimes unsafe.

Python 3 kwargs insight

This has been a source of confusion and frustration for years now. Say you import a particularly poorly documented module and some method that you need to you only has **kwargs for its arguments, how are you supposed to know what keys that method is checking for?
def test(**kwargs):
if 'greeting' in kwargs:
print(kwargs['greeting'])
If i were to call text, how would i know that 'greeting is something the method was looking for?
test(greeting='hi)
Some simplistic cases the IDE can help out with, but most use cases seem to be out of the IDE's scope

Think of kwargs as a dictionary. There is no way to tell from the outside what key-value combinations the method will accept (in your case the test method is essentially a black box) but this is the point of having documentation. Without kwargs, some function headers would get extremely cluttered.

Use documentation!
The subprocess-module's docs is a good example. If you are using a newer version of python (3.7 or 3.6 with backport), consider using dataclasses as an alternative to kwargs, if it fits your usecase.

If it's not documented, your only recourse is to read the source.

Adding a **kwargs argument to a function is used when you don't want to explicitly define the arguments which must be named.
A trivial example:
If a function takes as an argument another function which is undetermined and may have different kwargs each time
def foo(func,**kwargs):
print(func)
return func(**kwargs)
You won't know what the function is explicitly looking for.
You can have in your example
def foo(greeting=None):
which shows the function is looking for greeting but it can be None

Stubbing out functions or classes

Can you explain the concept stubbing out functions or classes taken from this article?
class Loaf:
pass
This class doesn't define any methods or attributes, but syntactically, there needs to be something in the definition, so you use pass. This is a Python reserved word that just means “move along, nothing to see here”. It's a statement that does nothing, and it's a good placeholder when you're stubbing out functions or classes.`
thank you

stubbing out functions or classes
This refers to writing classes or functions but not yet implementing them. For example, maybe I create a class:
class Foo(object):
def bar(self):
pass
def tank(self):
pass
I've stubbed out the functions because I haven't yet implemented them. However, I don't think this is a great plan. Instead, you should do:
class Foo(object):
def bar(self):
raise NotImplementedError
def tank(self):
raise NotImplementedError
That way if you accidentally call the method before it is implemented, you'll get an error then nothing happening.

A 'stub' is a placeholder class or function that doesn't do anything yet, but needs to be there so that the class or function in question is defined. The idea is that you can already use certain aspects of it (such as put it in a collection or pass it as a callback), even though you haven't written the implementation yet.
Stubbing is a useful technique in a number of scenarios, including:
Team development: Often, the lead programmer will provide class skeletons filled with method stubs and a comment describing what the method should do, leaving the actual implementation to other team members.
Iterative development: Stubbing allows for starting out with partial implementations; the code won't be complete yet, but it still compiles. Details are filled in over the course of later iterations.
Demonstrational purposes: If the content of a method or class isn't interesting for the purpose of the demonstration, it is often left out, leaving only stubs.

Note that you can stub functions like this:
def get_name(self) -> str : ...
def get_age(self) -> int : ...
(yes, this is valid python code !)
It can be useful to stub functions that are added dynamically to an object by a third party library and you want have typing hints.
Happens to me... once :-)

Ellipsis ... is preferable to pass for stubbing.
pass means "do nothing", whereas ... means "something should go here" - it's a placeholder for future code. The effect is the same but the meaning is different.

Stubbing is a technique in software development. After you have planned a module or class, for example by drawing it's UML diagram, you begin implementing it.
As you may have to implement a lot of methods and classes, you begin with stubs. This simply means that you only write the definition of a function down and leave the actual code for later. The advantage is that you won't forget methods and you can continue to think about your design while seeing it in code.

The reason for pass is that Python is indentation dependent and expects one or more indented statement after a colon (such as after class or function).
When you have no statements (as in the case of a stubbed out function or class), there still needs to be at least one indented statement, so you can use the special pass statement as a placeholder. You could just as easily put something with no effect like:
class Loaf:
True
and that is also fine (but less clear than using pass in my opinion).

Parameter names in Python functions that take single object or iterable

I have some functions in my code that accept either an object or an iterable of objects as input. I was taught to use meaningful names for everything, but I am not sure how to comply here. What should I call a parameter that can a sinlge object or an iterable of objects? I have come up with two ideas, but I don't like either of them:
FooOrManyFoos - This expresses what goes on, but I could imagine that someone not used to it could have trouble understanding what it means right away
param - Some generic name. This makes clear that it can be several things, but does explain nothing about what the parameter is used for.
Normally I call iterables of objects just the plural of what I would call a single object. I know this might seem a little bit compulsive, but Python is supposed to be (among others) about readability.

I have some functions in my code that accept either an object or an iterable of objects as input.
This is a very exceptional and often very bad thing to do. It's trivially avoidable.
i.e., pass [foo] instead of foo when calling this function.
The only time you can justify doing this is when (1) you have an installed base of software that expects one form (iterable or singleton) and (2) you have to expand it to support the other use case. So. You only do this when expanding an existing function that has an existing code base.
If this is new development, Do Not Do This.
I have come up with two ideas, but I don't like either of them:
[Only two?]
FooOrManyFoos - This expresses what goes on, but I could imagine that someone not used to it could have trouble understanding what it means right away
What? Are you saying you provide NO other documentation, and no other training? No support? No advice? Who is the "someone not used to it"? Talk to them. Don't assume or imagine things about them.
Also, don't use Leading Upper Case Names.
param - Some generic name. This makes clear that it can be several things, but does explain nothing about what the parameter is used for.
Terrible. Never. Do. This.
I looked in the Python library for examples. Most of the functions that do this have simple descriptions.
http://docs.python.org/library/functions.html#isinstance
isinstance(object, classinfo)
They call it "classinfo" and it can be a class or a tuple of classes.
You could do that, too.
You must consider the common use case and the exceptions. Follow the 80/20 rule.
80% of the time, you can replace this with an iterable and not have this problem.
In the remaining 20% of the cases, you have an installed base of software built around an assumption (either iterable or single item) and you need to add the other case. Don't change the name, just change the documentation. If it used to say "foo" it still says "foo" but you make it accept an iterable of "foo's" without making any change to the parameters. If it used to say "foo_list" or "foo_iter", then it still says "foo_list" or "foo_iter" but it will quietly tolerate a singleton without breaking.
80% of the code is the legacy ("foo" or "foo_list")
20% of the code is the new feature ("foo" can be an iterable or "foo_list" can be a single object.)

I guess I'm a little late to the party, but I'm suprised that nobody suggested a decorator.
def withmany(f):
def many(many_foos):
for foo in many_foos:
yield f(foo)
f.many = many
return f
#withmany
def process_foo(foo):
return foo + 1
processed_foo = process_foo(foo)
for processed_foo in process_foo.many(foos):
print processed_foo
I saw a similar pattern in one of Alex Martelli's posts but I don't remember the link off hand.

It sounds like you're agonizing over the ugliness of code like:
def ProcessWidget(widget_thing):
# Infer if we have a singleton instance and make it a
# length 1 list for consistency
if isinstance(widget_thing, WidgetType):
widget_thing = [widget_thing]
for widget in widget_thing:
#...
My suggestion is to avoid overloading your interface to handle two distinct cases. I tend to write code that favors re-use and clear naming of methods over clever dynamic use of parameters:
def ProcessOneWidget(widget):
#...
def ProcessManyWidgets(widgets):
for widget in widgets:
ProcessOneWidget(widget)
Often, I start with this simple pattern, but then have the opportunity to optimize the "Many" case when there are efficiencies to gain that offset the additional code complexity and partial duplication of functionality. If this convention seems overly verbose, one can opt for names like "ProcessWidget" and "ProcessWidgets", though the difference between the two is a single easily missed character.

You can use *args magic (varargs) to make your params always be iterable.
Pass a single item or multiple known items as normal function args like func(arg1, arg2, ...) and pass iterable arguments with an asterisk before, like func(*args)
Example:
# magic *args function
def foo(*args):
print args
# many ways to call it
foo(1)
foo(1, 2, 3)
args1 = (1, 2, 3)
args2 = [1, 2, 3]
args3 = iter((1, 2, 3))
foo(*args1)
foo(*args2)
foo(*args3)

Can you name your parameter in a very high-level way? people who read the code are more interested in knowing what the parameter represents ("clients") than what their type is ("list_of_tuples"); the type can be defined in the function documentation string, which is a good thing since it might change, in the future (the type is sometimes an implementation detail).

I would do 1 thing,
def myFunc(manyFoos):
if not type(manyFoos) in (list,tuple):
manyFoos = [manyFoos]
#do stuff here
so then you don't need to worry anymore about its name.
in a function you should try to achieve to have 1 action, accept the same parameter type and return the same type.
Instead of filling the functions with ifs you could have 2 functions.

Since you don't care exactly what kind of iterable you get, you could try to get an iterator for the parameter using iter(). If iter() raises a TypeError exception, the parameter is not iterable, so you then create a list or tuple of the one item, which is iterable and Bob's your uncle.
def doIt(foos):
try:
iter(foos)
except TypeError:
foos = [foos]
for foo in foos:
pass # do something here
The only problem with this approach is if foo is a string. A string is iterable, so passing in a single string rather than a list of strings will result in iterating over the characters in a string. If this is a concern, you could add an if test for it. At this point it's getting wordy for boilerplate code, so I'd break it out into its own function.
def iterfy(iterable):
if isinstance(iterable, basestring):
iterable = [iterable]
try:
iter(iterable)
except TypeError:
iterable = [iterable]
return iterable
def doIt(foos):
for foo in iterfy(foos):
pass # do something
Unlike some of those answering, I like doing this, since it eliminates one thing the caller could get wrong when using your API. "Be conservative in what you generate but liberal in what you accept."
To answer your original question, i.e. what you should name the parameter, I would still go with "foos" even though you will accept a single item, since your intent is to accept a list. If it's not iterable, that is technically a mistake, albeit one you will correct for the caller since processing just the one item is probably what they want. Also, if the caller thinks they must pass in an iterable even of one item, well, that will of course work fine and requires very little syntax, so why worry about correcting their misapprehension?

I would go with a name explaining that the parameter can be an instance or a list of instances. Say one_or_more_Foo_objects. I find it better than the bland param.

I'm working on a fairly big project now and we're passing maps around and just calling our parameter map. The map contents vary depending on the function that's being called. This probably isn't the best situation, but we reuse a lot of the same code on the maps, so copying and pasting is easier.
I would say instead of naming it what it is, you should name it what it's used for. Also, just be careful that you can't call use in on a not iterable.

What is the proper way to comment functions in Python?

Is there a generally accepted way to comment functions in Python? Is the following acceptable?
#########################################################
# Create a new user
#########################################################
def add(self):

The correct way to do it is to provide a docstring. That way, help(add) will also spit out your comment.
def add(self):
"""Create a new user.
Line 2 of comment...
And so on...
"""
That's three double quotes to open the comment and another three double quotes to end it. You can also use any valid Python string. It doesn't need to be multiline and double quotes can be replaced by single quotes.
See: PEP 257

Use docstrings.
This is the built-in suggested convention in PyCharm for describing function using docstring comments:
def test_function(p1, p2, p3):
"""
test_function does blah blah blah.
:param p1: describe about parameter p1
:param p2: describe about parameter p2
:param p3: describe about parameter p3
:return: describe what it returns
"""
pass

Use a docstring, as others have already written.
You can even go one step further and add a doctest to your docstring, making automated testing of your functions a snap.

Use a docstring:
A string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.
All modules should normally have docstrings, and all functions and classes exported by a module should also have docstrings. Public methods (including the __init__ constructor) should also have docstrings. A package may be documented in the module docstring of the __init__.py file in the package directory.
String literals occurring elsewhere in Python code may also act as documentation. They are not recognized by the Python bytecode compiler and are not accessible as runtime object attributes (i.e. not assigned to __doc__ ), but two types of extra docstrings may be extracted by software tools:
String literals occurring immediately after a simple assignment at the top level of a module, class, or __init__ method are called "attribute docstrings".
String literals occurring immediately after another docstring are called "additional docstrings".
Please see PEP 258 , "Docutils Design Specification" [2] , for a detailed description of attribute and additional docstrings...

The principles of good commenting are fairly subjective, but here are some guidelines:
Function comments should describe the intent of a function, not the implementation
Outline any assumptions that your function makes with regards to system state. If it uses any global variables (tsk, tsk), list those.
Watch out for excessive ASCII art. Having long strings of hashes may seem to make the comments easier to read, but they can be annoying to deal with when comments change
Take advantage of language features that provide 'auto documentation', i.e., docstrings in Python, POD in Perl, and Javadoc in Java

I would go for a documentation practice that integrates with a documentation tool such as Sphinx.
The first step is to use a docstring:
def add(self):
""" Method which adds stuff
"""

Read about using docstrings in your Python code.
As per the Python docstring conventions:
The docstring for a function or method should summarize its behavior and document its arguments, return value(s), side effects, exceptions raised, and restrictions on when it can be called (all if applicable). Optional arguments should be indicated. It should be documented whether keyword arguments are part of the interface.
There will be no golden rule, but rather provide comments that mean something to the other developers on your team (if you have one) or even to yourself when you come back to it six months down the road.

I would go a step further than just saying "use a docstring". Pick a documentation generation tool, such as pydoc or epydoc (I use epydoc in pyparsing), and use the markup syntax recognized by that tool. Run that tool often while you are doing your development, to identify holes in your documentation. In fact, you might even benefit from writing the docstrings for the members of a class before implementing the class.

While I agree that this should not be a comment, but a docstring as most (all?) answers suggest, I want to add numpydoc (a docstring style guide).
If you do it like this, you can (1) automatically generate documentation and (2) people recognize this and have an easier time to read your code.

You can use three quotes to do it.
You can use single quotes:
def myfunction(para1,para2):
'''
The stuff inside the function
'''
Or double quotes:
def myfunction(para1,para2):
"""
The stuff inside the function
"""

The correct way is as follows:
def search_phone_state(phone_number_start,state,dataframe_path,separator):
"""
returns records whose phone numbers begin with a phone_number_start and are from state
"""
dataframe = pd.read_csv(filepath_or_buffer=dataframe_path, sep=separator, header=0)
return dataframe[(pd.Series(dataframe["Phone"].values.tolist()).str.startswith(phone_number_start, na="False"))& (dataframe["State"]==state)]
If you do:
help(search_phone_state)
It will print:
Help on function search_phone_state in module __main__:
search_phone_state(phone_number_start, state, dataframe_path, separator)
returns records whose phone numbers begin with a phone_number_start and are from state

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.