I'm trying to delay evaluation, and so I prefer to work with functions for as long as possible. I have a class Function which defines composition and pointwise arithmetic for functions:
from functools import reduce

def compose(*funcs):
    '''
    Compose a group of functions f1, f2, f3, ... into f1(f2(f3(...)))
    '''
    result = reduce(lambda f, g: lambda *args, **kwargs: f(g(*args, **kwargs)), funcs)
    return Function(result)
class Function:
    '''
    >>> f = Function(lambda x: x**2)
    >>> g = Function(lambda x: x + 4)
    >>> h = f / g
    >>> h(6)
    3.6
    >>> (f + 1)(5)
    26
    >>> (2 * f)(3)
    18

    # >> means composition, but in the order opposite to the mathematical composition
    >>> (f >> g)(6)  # g(f(6))
    40

    # | means apply function: x | f is the same as f(x)
    >>> 6 | f | g  # g(f(6))
    40
    '''
    # implicit type conversion from a non-callable arg to a function that returns arg
    def __init__(self, arg):
        if isinstance(arg, Function):
            # would work without this special case, but I thought long chains
            # of nested functions are best avoided for performance reasons (??)
            self._func = arg._func
        elif callable(arg):
            self._func = arg
        else:
            self._func = lambda *args, **kwargs: arg

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __add__(lhs, rhs):
        # implicit type conversions, to allow expressions like f + 1
        lhs = Function(lhs)
        rhs = Function(rhs)
        new_f = lambda *args, **kwargs: lhs(*args, **kwargs) + rhs(*args, **kwargs)
        return Function(new_f)

    # same for __sub__, __mul__, __truediv__, and their reflected versions
    # ...

    # function composition
    # similar to Haskell's ., but with reversed order
    def __rshift__(lhs, rhs):
        return compose(rhs, lhs)

    def __rrshift__(rhs, lhs):
        return compose(rhs, lhs)

    # function application
    # similar to Haskell's $, but with reversed order and left-associative
    def __or__(lhs, rhs):
        return rhs(lhs)

    def __ror__(rhs, lhs):
        return rhs(lhs)
Originally, all my functions had the same signature: they took as a single argument an instance of a class Data, and returned a float. That said, my implementation of Function didn't depend on this signature.
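For context, typical usage looked roughly like this (the fields of Data below are made up for illustration):

class Data:
    def __init__(self, price, quantity):
        self.price = price
        self.quantity = quantity

price = Function(lambda d: d.price)        # Data -> float
quantity = Function(lambda d: d.quantity)  # Data -> float
notional = price * quantity                # also Data -> float, built lazily

# notional(Data(2.0, 100.0)) == 200.0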
Then I started to add various higher order functions. For example, I often need to create a bounded version of an existing function, so I wrote a function cap_if:
from operator import le

def cap_if(func, rhs):
    '''
    Input arguments:
        func: function that determines if the constraint is violated
        rhs: Function(rhs) is the function to use if the constraint is violated
    Output:
        function that
            takes as an argument a function f, and
            returns a function with the same signature as f

    >>> f = Function(lambda x: x * 2)
    >>> 5 | (f | cap_if(le, 15))
    15
    >>> 10 | (f | cap_if(le, 15))
    20
    >>> 5 | (f | cap_if(le, lambda x: x ** 2))
    25
    >>> 1.5 | (f | cap_if(le, lambda x: x ** 2))
    3.0
    '''
    def transformation(original_f):
        def transformed_f(*args, **kwargs):
            lhs_value = original_f(*args, **kwargs)
            rhs_value = rhs(*args, **kwargs) if callable(rhs) else rhs
            if func(lhs_value, rhs_value):
                return rhs_value
            else:
                return lhs_value
        return Function(transformed_f)
    return Function(transformation)
Here's the problem. I now want to introduce functions that take a "vector" of Data instances and return a "vector" of numbers. At first glance, I could have kept my existing framework unchanged. After all, if I implement a vector as, say, a numpy.array, the vectors would support pointwise arithmetic, and so pointwise arithmetic on the functions would work as intended without any changes to the code above.
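For example, assuming numpy and stubbing out the Data part, pointwise arithmetic on such functions does go through unchanged:

import numpy as np

f = Function(lambda v: v * 2)   # stand-in for a <vector of Data> -> <vector of numbers> function
g = Function(lambda v: v + 1)
h = f + g

# h(np.array([1.0, 2.0])) -> array([4., 7.])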
But the above code breaks on higher order functions such as cap_if (which is supposed to constrain each individual element in the vector): inside transformed_f, func(lhs_value, rhs_value) becomes a boolean array, and the scalar if test raises numpy's "the truth value of an array is ambiguous" error instead of capping element by element. I see three options:
1. Create a new version of cap_if, say vector_cap_if, for the function on vectors. But then I'd need to do that for all other higher order functions, which feels undesirable. The advantage of this approach, though, is that I could in the future replace the implementation of those functions with, say, numpy functions for huge performance gains. (See the first sketch after this list.)

2. Implement functions that "raise" the type of a function from "number -> number" to "<function from Data to number> -> <function from Data to number>", and from "number -> number" to "<function from vector of Data to number> -> <function from vector of Data to number>". Let's call these functions raise_to_data_function and raise_to_vector_function. Then I can define basic_cap_if as a function on individual numbers (rather than a higher order function); I do the same for the other similar helper functions I need. Then I use raise_to_data_function(basic_cap_if) instead of cap_if, and raise_to_vector_function(basic_cap_if) instead of vector_cap_if. This approach seems more elegant, in the sense that I only need to implement each basic function once. But it loses the possible performance gains I described above, and it also results in code that makes a lot of function calls. (See the second sketch below.)

3. Follow the approach in 2, but automatically apply raise_to_data_function or raise_to_vector_function whenever required, based on context. Presumably I can implement this inside the __or__ method (function application): if it detects a function being passed to basic_cap_if, it would check the signature of the function being passed, and apply the appropriate raise_to function to the right-hand side. (The signatures could be exposed, for example, by making functions with different signatures members of different subclasses of Function, or by having a designated method in Function.) This seems very hacky, since a lot of implicit type conversions may happen; but it does reduce code clutter. (See the third sketch below.)
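For what it's worth, here is roughly how I picture each option. First, option 1: a sketch assuming numpy, where np.where replaces the scalar if in cap_if:

import numpy as np

def vector_cap_if(func, rhs):
    # same contract as cap_if, but constrains each element of the result vector
    def transformation(original_f):
        def transformed_f(*args, **kwargs):
            lhs_value = np.asarray(original_f(*args, **kwargs))
            rhs_value = rhs(*args, **kwargs) if callable(rhs) else rhs
            # elementwise selection instead of a scalar if
            return np.where(func(lhs_value, rhs_value), rhs_value, lhs_value)
        return Function(transformed_f)
    return Function(transformation)

# f = Function(lambda v: v * 2)
# capped = f | vector_cap_if(le, 15)
# capped(np.array([5, 10])) -> array([15, 20])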
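Second, option 2 as I currently read it (a sketch; for brevity, rhs is assumed to be a plain number rather than a callable):

import numpy as np

def basic_cap_if(func, rhs):
    # a plain number -> number function; knows nothing about Data or vectors
    def capped(value):
        return rhs if func(value, rhs) else value
    return capped

def raise_to_data_function(g):
    # lifts g: number -> number to a transformer of <Data -> number> functions
    def transformation(original_f):
        return Function(lambda *args, **kwargs: g(original_f(*args, **kwargs)))
    return Function(transformation)

def raise_to_vector_function(g):
    # lifts g: number -> number to a transformer of <vector -> vector> functions
    def transformation(original_f):
        return Function(
            lambda *args, **kwargs: np.vectorize(g)(original_f(*args, **kwargs)))
    return Function(transformation)

# f | raise_to_data_function(basic_cap_if(le, 15))    # Data pipeline
# f | raise_to_vector_function(basic_cap_if(le, 15))  # vector pipeline

The np.vectorize call is where the performance goes: g is applied element by element in Python, rather than as a single numpy operation.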
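Third, option 3 on top of that: a hypothetical sketch of dispatch by subclass, building on the option 2 sketch (the names VectorFunction and ScalarCombinator are made up):

class VectorFunction(Function):
    # marks functions with the <vector of Data> -> <vector of numbers> signature
    pass

class ScalarCombinator:
    # wraps a number -> number helper such as basic_cap_if(le, 15); when applied
    # via |, it lifts itself to match the signature of the function it transforms
    def __init__(self, g):
        self._g = g

    def __call__(self, f):
        if isinstance(f, VectorFunction):
            return raise_to_vector_function(self._g)(f)
        return raise_to_data_function(self._g)(f)

# f | ScalarCombinator(basic_cap_if(le, 15))  # works because Function.__or__ calls rhs(lhs)

Note that the result's subclass is not preserved here, which is part of why this feels hacky.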
Am I missing a better approach, and/or some arguments for or against these options?