python - Propagation of NaN through calculations

Question

Normally, NaN (not a number) propagates through calculations, so I don't need to check for NaN at each step. This works almost always, but apparently there are exceptions. For example:

>>> nan = float('nan')
>>> pow(nan, 0)
1.0

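I found the following comment on this:

The propagation of quiet NaNs through arithmetic operations allows errors to be detected at the end of a sequence of operations without extensive testing during intermediate stages. However, note that depending on the language and the function, NaNs can silently be removed in expressions that would give a constant result for all other floating-point values (e.g. NaN ^ 0, which may be defined as 1), so in general a later test for a set INVALID flag is needed to detect all cases where NaNs are introduced.

To satisfy those wishing a more strict interpretation of how the power function should act, the 2008 standard defines two additional power functions: pown(x, n), where the exponent must be an integer, and powr(x, y), which returns a NaN whenever a parameter is a NaN or the exponentiation would give an indeterminate form.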
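To make the quoted caveat concrete, here is a small Python 3 sketch (an illustration, not part of the quoted comment) that lists a few expressions in which a NaN operand is silently dropped, next to a few in which it survives. The dictionary names are only for illustration.

```python
import math

nan = float("nan")

# Expressions where the NaN operand is silently dropped.
swallowed = {
    "pow(nan, 0)": pow(nan, 0),
    "1.0 ** nan": 1.0 ** nan,
    "math.pow(nan, 0)": math.pow(nan, 0),
    "max(1.0, nan)": max(1.0, nan),
}

# Expressions where the NaN operand survives.
propagated = {
    "nan + 1": nan + 1,
    "nan * 0": nan * 0,
    "max(nan, 1.0)": max(nan, 1.0),
}

for expr, value in list(swallowed.items()) + list(propagated.items()):
    status = "NaN kept" if math.isnan(value) else "NaN lost"
    print(expr, "->", value, "(" + status + ")")
```

Note that `max` depends on argument order, so power functions are not the only place where a NaN can disappear.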

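Is there a way to check the INVALID flag mentioned above from Python? Alternatively, is there any other approach to catch the cases where NaN does not propagate?

Motivation: I decided to use NaN for missing data. In my application, a missing input should produce a missing result. That works well, except for the exception I described.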

score 3 · Accepted Answer

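I realise that a month has passed since this was asked, but I ran into a similar problem (i.e. pow(float('nan'), 1) throws an exception in some Python implementations, e.g. Jython 2.52b2), and I found that the answers above weren't quite what I was looking for.

Using a MissingData type as suggested by 6502 seems like the way to go, but I needed a concrete example. I tried Ethan Furman's NullType class but found that it didn't work with arithmetic operations, because it doesn't coerce data types (see below). I also didn't like that it explicitly names each arithmetic function that is overridden.

Starting from Ethan's example and tweaking code I found here, I arrived at the class below. Although the class is heavily commented, you can see that it contains only a handful of lines of functional code.

The key points are:
1. Use of __coerce__() to return two NoData objects for mixed-type arithmetic operations (e.g. NoData + float), and two strings for string-based operations (e.g. concatenation).
2. Use of __getattr__() to return a callable NoData() object for all other attribute/method access.
3. Use of __call__() to implement all other methods of the NoData() object, by returning a NoData() object.

Here are some examples of its use.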

>>> nd = NoData()
>>> nd + 5
NoData()
>>> pow(nd, 1)
NoData()
>>> math.pow(NoData(), 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: nb_float should return float object
>>> nd > 5
NoData()
>>> if nd > 5:
...     print "Yes"
... else:
...     print "No"
... 
No
>>> "The answer is " + nd
'The answer is NoData()'
>>> "The answer is %f" % (nd)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float argument required, not instance
>>> "The answer is %s" % (nd)
'The answer is '
>>> nd.f = 5
>>> nd.f
NoData()
>>> nd.f()
NoData()

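I noticed that using pow with NoData() calls the ** operator and therefore works with NoData, but math.pow does not, because it first tries to convert the NoData() object to a float. I'm happy using the non-math pow; hopefully 6502 and others were using math.pow when they had the problems with pow described in their comments above.

The other issue I can't think of a way to solve is the format (%f) operator. No methods of NoData are called in this case; the operator simply fails if you don't supply a float. Anyway, here's the class itself.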

# Python 2 only: relies on old-style classes, __coerce__ and __nonzero__.
class NoData():
    """NoData object - any interaction returns NoData()"""

    def __str__(self):
        # I want '' returned as it represents no data in my output (e.g. csv) files
        return ''

    def __unicode__(self):
        return ''

    def __repr__(self):
        return 'NoData()'

    def __coerce__(self, other_object):
        if isinstance(other_object, str) or isinstance(other_object, unicode):
            # Return string objects when coerced with another string object.
            # This ensures that e.g. concatenation operations produce strings.
            return repr(self), other_object
        else:
            # Otherwise return two NoData objects - these will then be passed to the
            # appropriate operator method for NoData, which should then return a
            # NoData object
            return self, self

    def __nonzero__(self):
        # __nonzero__ is called whenever e.g. "if NoData:" occurs, i.e. whenever a
        # NoData object propagates to the test of a branch statement (all operations
        # involving NoData return NoData).
        return False

    def __hash__(self):
        # Prevent NoData() from being used as a key for a dict or used in a set
        raise TypeError("Unhashable type: " + repr(self))

    def __setattr__(self, name, value):
        # Overridden to prevent any attributes from being created on NoData when
        # e.g. "NoData().f = x" is called
        return None

    def __call__(self, *args, **kwargs):
        # If a NoData object is called (i.e. used as a method), return a NoData object
        return self

    def __getattr__(self, name):
        # For all other attribute accesses or method accesses, return a NoData object.
        # Remember that the NoData object can be called (__call__), so if a method is
        # called, a NoData object is first returned and then called. This works for
        # operators, so e.g. NoData() + 5 will:
        # - call NoData().__coerce__, which returns a (NoData, NoData) tuple
        # - call __getattr__, which returns a NoData object
        # - call the returned NoData object with args (self, NoData)
        # - this call (i.e. __call__) returns a NoData object

        # For attribute accesses NoData will be returned, and that's it.

        # print name  # (uncomment to see which attribute was accessed/method was called)
        return self
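On the %f problem mentioned above, the following is a minimal Python 3 sketch. It is separate from the NoData class and assumes that falling back to a plain float NaN is acceptable. The class name `FloatableMissing` is hypothetical. Defining `__float__` lets both `%f` formatting and `math.pow` accept the object, at the cost that the result is an ordinary NaN rather than a missing-data object.

```python
import math


class FloatableMissing:
    """Hypothetical missing-data marker that converts to float NaN on request."""

    def __float__(self):
        # Used by float(), the math module functions, and '%f' formatting.
        return float("nan")

    def __str__(self):
        # Empty string, mirroring the output convention of NoData above.
        return ""


missing = FloatableMissing()
formatted = "The answer is %f" % missing
powered = math.pow(missing, 1)

print(formatted)
print(math.isnan(powered))
```

Whether printing "nan" is preferable to an error depends on the output format. For CSV-style output the `%s` route with an empty string may still be the better fit.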

score 2 · Accepted Answer

If it's just pow() giving you headaches, you can easily redefine it to return NaN under whatever circumstances you like.

def pow(x, y):
    return x ** y if x == x else float("NaN")

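If NaN can also appear as the exponent, you'd want to check for that too. When I tried it, this raised a ValueError except when the base is 1 (apparently on the theory that 1 to any power, even one that's not a number, is 1); this behaviour may differ between Python versions.

(And of course pow() actually takes three operands, the third being optional. Handling that is left as an exercise...)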
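One possible way to fill in that exercise is sketched below in Python 3. The name `nan_pow` and the decision to return NaN for any NaN operand are assumptions made for illustration, not something the answer specifies.

```python
import builtins
import math


def nan_pow(x, y, mod=None):
    """Like the built-in pow(), except that any NaN operand yields NaN."""
    operands = (x, y) if mod is None else (x, y, mod)
    # Only floats can be NaN here; ints and other types are passed through.
    if any(isinstance(v, float) and math.isnan(v) for v in operands):
        return float("nan")
    if mod is None:
        return builtins.pow(x, y)
    return builtins.pow(x, y, mod)


nan = float("nan")
print(nan_pow(nan, 0))       # NaN base is no longer swallowed
print(nan_pow(1, nan))       # NaN exponent is no longer swallowed
print(nan_pow(2, 10, 1000))  # three-operand form still works for integers
```

Giving the function its own name, rather than shadowing `pow`, keeps the built-in available for code that relies on the standard behaviour.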

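Unfortunately, the ** operator has the same behaviour, and there's no way to redefine it for the built-in numeric types. One way to catch this is to write a subclass of float that implements __pow__() and __rpow__(), and to use that class for your NaN values.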
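A minimal Python 3 sketch of such a subclass follows. The class name `StickyNaN` is hypothetical, and the sketch only covers the power operator, as the paragraph above suggests.

```python
import math


class StickyNaN(float):
    """Hypothetical float subclass: a NaN that survives ** and pow()."""

    def __new__(cls):
        return super().__new__(cls, "nan")

    def __pow__(self, other, mod=None):
        # StickyNaN ** anything stays StickyNaN, even for exponent 0.
        return self

    def __rpow__(self, other, mod=None):
        # anything ** StickyNaN stays StickyNaN, even for base 1.
        return self

    def __repr__(self):
        return "StickyNaN()"


s = StickyNaN()
print(s ** 0, 1.0 ** s, pow(s, 0))
print(math.isnan(s ** 0))
print(type(s + 1).__name__)  # other arithmetic falls back to a plain float
```

Two limitations are worth keeping in mind: any operator that is not overridden returns an ordinary float NaN, so the subclass is lost after, say, an addition; and `math.pow` converts its arguments to plain floats first, so it is unaffected by this class.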

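Python doesn't seem to provide access to any flags set by calculations. Even if it did, it is something you'd have to check after each individual operation.

In fact, on further consideration, I think the best solution might be to simply use an instance of a dummy class for missing values. Python will choke on any operation you try to perform with such a value and raise an exception, which you can catch in order to return a default value or whatever. There's no reason to continue with the rest of the calculation if a needed value is missing, so an exception should be fine.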
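The dummy-class idea can be sketched in Python 3 as follows. The names `Missing`, `MISSING` and `safe_compute` are hypothetical, and catching only `TypeError` is an assumption about which exception the unsupported operations raise.

```python
class Missing:
    """Hypothetical placeholder with no arithmetic support at all."""

    def __repr__(self):
        return "MISSING"


MISSING = Missing()


def safe_compute(func, *args, default=None):
    """Run func(*args); return `default` if a missing value breaks the arithmetic."""
    try:
        return func(*args)
    except TypeError:
        # Raised by e.g. MISSING * 2 or pow(MISSING, 0): unsupported operand types.
        return default


def formula(a, b):
    return a * 2 + b


print(safe_compute(formula, 1, 3))
print(safe_compute(formula, MISSING, 3))
print(safe_compute(pow, MISSING, 0))
```

Equality comparisons such as `MISSING == 1` do not raise in Python; they simply return False, so code that branches on equality would still need an explicit check.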

score 2 · Accepted Answer

Why use NaN, which already has other semantics, instead of an instance of a MissingData class that you define yourself?

Defining the operations on MissingData instances so that they propagate should be easy...
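A rough Python 3 sketch of what such a class could look like is given below. The answer does not supply an implementation, so the operator list and the helper `_propagate` are assumptions for illustration only.

```python
class MissingData:
    """Sketch of a missing-value marker that propagates through arithmetic."""

    def __repr__(self):
        return "MissingData()"

    def __bool__(self):
        # A missing value is treated as false in conditions.
        return False


def _propagate(self, *args):
    # Whatever the other operand is, the result is still missing.
    return self


# Install forward and reflected versions of the common binary operators.
for _name in ("add", "sub", "mul", "truediv", "floordiv", "mod", "pow"):
    setattr(MissingData, "__" + _name + "__", _propagate)
    setattr(MissingData, "__r" + _name + "__", _propagate)


m = MissingData()
print(m + 1, 1 + m, m ** 0, 1 ** m, pow(m, 0))
```

Unlike NaN, this marker survives `m ** 0` and `1 ** m`, because the result is decided by the class and not by floating-point rules. Functions in the `math` module would still reject it, since it cannot be converted to a float.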

score 2 · Accepted Answer

To answer your question: no, there is no way to check the flags when using normal floats. You can use the Decimal class, however, which gives you much more control... but it is a bit slower.
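As a hedged illustration of the extra control Decimal offers, the Python 3 sketch below disables the InvalidOperation trap inside a local context and then inspects the flag after an undefined division. The function name `divide_and_report` is hypothetical.

```python
from decimal import Decimal, InvalidOperation, localcontext


def divide_and_report(a, b):
    """Return (a / b, whether InvalidOperation was flagged) without raising."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = False  # record the condition, don't raise
        ctx.clear_flags()
        result = Decimal(a) / Decimal(b)
        return result, bool(ctx.flags[InvalidOperation])


print(divide_and_report(0, 0))  # undefined division: NaN result, flag set
print(divide_and_report(1, 4))  # ordinary division: flag stays clear

# Unlike float, a Decimal NaN is not swallowed by a zero exponent.
print(Decimal("NaN") ** 0)
```

The flag is sticky within a context, so it can be checked once at the end of a sequence of operations, provided it is cleared beforehand.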

Your other option is to use an EmptyData or Null class, such as this one:

import sys

class NullType(object):
    "Null object -- any interaction returns Null"
    def _null(self, *args, **kwargs):
        return self
    __eq__ = __ne__ = __ge__ = __gt__ = __le__ = __lt__ = _null
    __add__ = __iadd__ = __radd__ = _null
    __sub__ = __isub__ = __rsub__ = _null
    __mul__ = __imul__ = __rmul__ = _null
    __div__ = __idiv__ = __rdiv__ = _null
    __mod__ = __imod__ = __rmod__ = _null
    __pow__ = __ipow__ = __rpow__ = _null
    __and__ = __iand__ = __rand__ = _null
    __xor__ = __ixor__ = __rxor__ = _null
    __or__ = __ior__ = __ror__ = _null
    __divmod__ = __rdivmod__ = _null
    __truediv__ = __itruediv__ = __rtruediv__ = _null
    __floordiv__ = __ifloordiv__ = __rfloordiv__ = _null
    __lshift__ = __ilshift__ = __rlshift__ = _null
    __rshift__ = __irshift__ = __rrshift__ = _null
    __neg__ = __pos__ = __abs__ = __invert__ = _null
    __call__ = __getattr__ = _null

    def __divmod__(self, other):
        return self, self
    __rdivmod__ = __divmod__

    if sys.version_info[:2] >= (2, 6):
        __hash__ = None
    else:
        def __hash__(yo):
            raise TypeError("unhashable type: 'Null'")

    def __new__(cls):
        return cls.null
    def __nonzero__(yo):
        return False
    def __repr__(yo):
        return '<null>'
    def __setattr__(yo, name, value):
        return None
    def __setitem__(yo, index, value):
        return None
    def __str__(yo):
        return ''
NullType.null = object.__new__(NullType)
Null = NullType()

You may want to change the __repr__ and __str__ methods. Also, be aware that Null cannot be used as a dictionary key, nor stored in a set.
