c++ - Duplicating Windows Cryptographic Service Provider results in Python with PyCrypto

Question

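Edits and Updates

March 24, 2013:
The output hash from Python now matches the hash from C++ after converting to utf-16 and stopping before an 'e' or 'm' byte is reached. However, the decrypted results still do not match. My SHA1 hash is 20 bytes = 160 bits, and an RC4 key can vary in length from 40 to 2048 bits, so perhaps WinCrypt applies some default salting that I need to mimic. I will check with CryptGetKeyParam, using KP_LENGTH or KP_SALT.

March 24, 2013:
CryptGetKeyParam with KP_LENGTH tells me that my key length is 128 bits. I am feeding it a 160-bit hash. So perhaps it is simply discarding the last 32 bits, or 4 bytes. Testing now.

March 24, 2013: Yes, that was it. If I discard the last 4 bytes of my SHA1 hash in Python, I get the same decryption results.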
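
The arithmetic behind that last update can be checked in isolation. This is a minimal sketch using only hashlib from the standard library; the sample input bytes are made up for illustration and are not taken from the real file.

```python
import hashlib

# Hypothetical input; the real input is the truncated wide-char password.
digest = hashlib.sha1(b"M\x00o\x00n\x00k").digest()

# SHA1 yields 160 bits (20 bytes); the provider reported a 128-bit key.
rc4_key = digest[:16]  # keep the first 128 bits, i.e. drop the last 4 bytes

print(len(digest) * 8, len(rc4_key) * 8)
```

Slicing with `digest[:16]` and with `digest[:-4]` gives the same bytes for a 20-byte digest.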

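Quick info:

I have a C++ program that decrypts a datablock. It uses the Windows Cryptographic Service Provider, so it only works on Windows. I would like it to work on other platforms.

Method overview:

In the Windows Crypto API, an ASCII-encoded password of bytes is converted to a wide-character representation and then hashed with SHA1 to make a key for an RC4 stream cipher.

In Python with PyCrypto, the ASCII-encoded byte string is decoded to a Python string. It is then truncated based on the empirically observed bytes that cause mbstowcs to stop converting in C++. The truncated string is encoded as utf-16, which effectively pads it with a 0x00 byte between characters. This truncated, padded byte string is passed to a SHA1 hash, and the first 128 bits of the digest are passed to a PyCrypto RC4 object.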
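
The steps in this overview can be sketched as one small function. The name `derive_rc4_key` and its `stop_chars` parameter are hypothetical; the stop characters are the empirical list from the question, not documented mbstowcs behavior. The sketch uses `utf-16-le` rather than `utf-16` plus BOM slicing, which assumes the 2-byte little-endian wchar_t found on Windows.

```python
import hashlib

def derive_rc4_key(desc: bytes, stop_chars=("e", "m")) -> bytes:
    """Mirror the described steps; stop_chars is an empirical assumption."""
    text = desc.decode("ascii")
    # Truncate at the earliest stop character, if any is present.
    cuts = [i for i in (text.find(c) for c in stop_chars) if i != -1]
    if cuts:
        text = text[:min(cuts)]
    # UTF-16 little-endian without a BOM: each ASCII byte followed by 0x00.
    wide = text.encode("utf-16-le")
    # Drop the trailing 0x00, as the Python skeleton in the question does.
    wide = wide[:-1]
    # Keep the first 128 bits of the 160-bit SHA1 digest.
    return hashlib.sha1(wide).digest()[:16]

key = derive_rc4_key(b"Monkey")
print(len(key))
```

For the password "Monkey" the hashed bytes are `M\x00o\x00n\x00k`, since the string is cut at the "e" and the final 0x00 is dropped.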

Problem [SOLVED]
I can't seem to get the same results with Python 3.x w/ PyCrypto.

C++ code skeleton:

HCRYPTPROV hProv      = 0x00;
HCRYPTHASH hHash      = 0x00;
HCRYPTKEY  hKey       = 0x00;
wchar_t    sBuf[256]  = {0};

CryptAcquireContextW(&hProv, L"FileContainer", L"Microsoft Enhanced RSA and AES Cryptographic Provider", 0x18u, 0);

CryptCreateHash(hProv, 0x8004u, 0, 0, &hHash);
//0x8004u is SHA1 flag

int len = mbstowcs(sBuf, iRec->desc, sizeof(sBuf));
//iRec is my "Record" class
//iRec->desc is 33 bytes within header of my encrypted file
//this will be used to create the hash key. (So this is the password)

CryptHashData(hHash, (const BYTE*)sBuf, len, 0);

CryptDeriveKey(hProv, 0x6801, hHash, 0, &hKey);

DWORD dataLen = iRec->compLen;  
//iRec->compLen is the length of encrypted datablock
//it's also compressed that's why it's called compLen

CryptDecrypt(hKey, 0, 0, 0, (BYTE*)iRec->decrypt, &dataLen);
// iRec is my record that i'm decrypting
// iRec->decrypt is where I store the decrypted data
//&dataLen is how long the encrypted data block is.
//I get this from file header info

Python code skeleton:

from Crypto.Cipher import ARC4
from Crypto.Hash import SHA

#this is the Decipher method from my record class
def Decipher(self):

    #get string representation of 33byte password
    key_string= self.desc.decode('ASCII')

    #so far, these characters fail, possibly others but
    #for now I will make it a list
    stop_chars = ['e','m']

    #slice off anything beyond where mbstowcs will stop
    for char in stop_chars:
        wc_stop = key_string.find(char)
        if wc_stop != -1:
            #slice operation
            key_string = key_string[:wc_stop]

    #make "wide character"
    #this is equivalent to padding bytes with 0x00

    #Slice off the two byte "Byte Order Mark" 0xff 0xfe 
    wc_byte_string = key_string.encode('utf-16')[2:]

    #slice off the trailing 0x00
    wc_byte_string = wc_byte_string[:len(wc_byte_string)-1] 

    #hash the "wchar" byte string
    #this is the equivalent to sBuf in c++ code above
    #as determined by writing sBuf to file in tests
    my_key = SHA.new(wc_byte_string).digest()

    #create a PyCrypto cipher object
    RC4_Cipher = ARC4.new(my_key[:16])

    #store the decrypted data..these results NOW MATCH
    self.decrypt = RC4_Cipher.decrypt(self.datablock)
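
When PyCrypto is not installed, the RC4 step can be cross-checked with a short pure-Python version. This is a sketch of the standard RC4 algorithm, not the code PyCrypto uses internally, and the function name `rc4` is hypothetical. It is meant for comparing outputs during porting, not for protecting new data.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    # Key-scheduling algorithm (KSA): permute 0..255 based on the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
```

Calling `rc4(my_key[:16], self.datablock)` would play the same role as `ARC4.new(my_key[:16]).decrypt(self.datablock)` in the skeleton above.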

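Suspected [Edit: confirmed] causes
1. Because of the mbstowcs conversion of the password, the "original data" fed to the SHA1 hash was not the same in Python and in C++. mbstowcs was stopping its conversion at 0x65 and 0x6D bytes. The original data ended up being a wide_char encoding of only part of the original 33-byte password.

2. RC4 can have variable-length keys. In the Enhanced Win Crypt Service Provider, the default length is 128 bits. Leaving the key length unspecified meant that the first 128 bits of the 160-bit SHA1 digest of the "original data" were used.

How I investigated (edit): Based on my own experimentation and the suggestions of @RolandSmith, I now know that one of my problems was that mbstowcs behaved in a way I was not expecting. It seems to stop writing to sBuf on "e" (0x65) and "m" (0x6d), and possibly on other bytes. So the password "Monkey" in my description (ASCII-encoded bytes) would look like "M o n k" in sBuf, because mbstowcs stopped at the e and placed a 0x00 between the bytes, based on the 2-byte wchar typedef on my system. I found this out by writing the results of the conversion to a text file.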
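
The "Monkey" layout described here can be reproduced in Python. This is a small sketch that assumes the 2-byte little-endian wchar_t mentioned above, which `utf-16-le` imitates.

```python
password = b"Monkey"

# Every ASCII byte is followed by a 0x00 byte in a 2-byte wide encoding.
wide = password.decode("ascii").encode("utf-16-le")
print(wide)

# The first 8 bytes correspond to the four characters "Monk".
prefix = wide[:8]
print(prefix)

# Replace the NUL bytes with spaces to imitate how a text editor
# might display the dumped buffer.
shown = prefix.replace(b"\x00", b" ").decode("ascii")
print(repr(shown))
```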

BYTE pbHash[256];  //buffer we will store the hash digest in 
DWORD dwHashLen;  //store the length of the hash
DWORD dwCount;
dwCount = sizeof(DWORD);  //how big is a dword on this system?


//see above: "len" is the return value from mbstowcs that tells how
//many multibyte characters were converted from the original
//iRec->desc and placed into sBuf.  In some cases it's 3, 7, 9
//and always seems to stop on "e" or "m"

fstream outFile4("C:/desc_mbstowcs.txt", ios::out | ios::trunc | ios::binary);
outFile4.write((const CHAR*)sBuf, int(len));
outFile4.close();

//now get the hash size from CryptGetHashParam
//and get the actual hash from the hash object hHash
//(0x0002 is HP_HASHVAL), then write it to a file.
if(CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE *)&dwHashLen, &dwCount, 0)) {
  if(CryptGetHashParam(hHash, 0x0002, pbHash, &dwHashLen,0)){

    fstream outFile3("C:/test_hash.txt", ios::out | ios::trunc | ios::binary);
    outFile3.write((const CHAR*)pbHash, int(dwHashLen));
    outFile3.close();
  }
}

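See also:
Wide characters are a problem depending on the environment definition
Difference in Windows Cryptography Service between VC++ 6.0 and VS 2008

Convert a utf-8 string to a utf-16 string
Python - converting wide-char strings from a binary file to Python unicode strings

PyCrypto RC4 example
https://www.dlitz.net/software/pycrypto/api/current/Crypto.Cipher.ARC4-module.html

Hashing a string with Sha256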

http://msdn.microsoft.com/en-us/library/windows/desktop/aa379916(v=vs.85).aspx

http://msdn.microsoft.com/en-us/library/windows/desktop/aa375599(v=vs.85).aspx

score 1 · Accepted Answer

You can test the size of wchar_t with a small test program (in C):

#include <stdio.h> /* for printf */
#include <stddef.h> /* for wchar_t */

int main(int argc, char *argv[]) {
    /* sizeof yields a size_t, so cast it to match the %lu conversion */
    printf("The size of wchar_t is %lu bytes.\n", (unsigned long)sizeof(wchar_t));
    return 0;
}

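If you can run the C++ program from a terminal, you could also use printf() calls in the C++ code to write e.g. iRec->desc, sBuf and the result of the hash to the screen. Otherwise, use fprintf() to dump them to a file.

To better mimic the behavior of the C++ program, you could even use ctypes to call mbstowcs() from your Python code.

Edit: You wrote:

One problem is definitely with mbstowcs. It seems that it is transferring an unpredictable (to me) number of bytes into my buffer to be hashed.

Keep in mind that mbstowcs returns the number of wide characters converted. That is, a 33-byte buffer in a multibyte encoding can contain anything from 5 (UTF-8 6-byte sequences) up to 33 characters, depending on the encoding used.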
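
The difference between byte counts and character counts can be seen without any C code. This is a small sketch with made-up sample data; it assumes a 2-byte wchar_t for the last step.

```python
# 33 bytes of plain ASCII decode to 33 characters...
ascii_desc = b"A" * 33
ascii_chars = len(ascii_desc.decode("ascii"))

# ...while 33 bytes of 3-byte UTF-8 sequences are only 11 characters.
utf8_desc = "\u20ac".encode("utf-8") * 11  # U+20AC EURO SIGN
utf8_chars = len(utf8_desc.decode("utf-8"))

print(len(ascii_desc), ascii_chars)
print(len(utf8_desc), utf8_chars)

# A wide-character count is not a byte count either: with a 2-byte
# wchar_t, n converted characters occupy 2 * n bytes in the buffer.
wide_bytes = len(ascii_desc.decode("ascii").encode("utf-16-le"))
print(ascii_chars, wide_bytes)
```

This may be worth keeping in mind when reading the C++ skeleton in the question, where the mbstowcs return value is later passed as a length to CryptHashData and to the file write, both of which take a count in bytes.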

Edit2: You are using 0 as the dwFlags parameter of CryptDeriveKey. According to its documentation, the upper 16 bits should contain the key length. You should check the return value of CryptDeriveKey to see whether the call succeeded.
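
How a key length fits into the upper 16 bits of a 32-bit flags value is plain bit arithmetic. This is a minimal sketch; the variable names are made up, and 128 is the length that CryptGetKeyParam reported in the question.

```python
KEY_BITS = 128

# The key length goes in the upper 16 bits of the 32-bit flags value.
dw_flags = KEY_BITS << 16
print(hex(dw_flags))

# Reading the length back out of a flags value.
recovered_bits = dw_flags >> 16
print(recovered_bits)

# A dwFlags of 0 leaves the upper 16 bits at zero, so no explicit
# length is requested.
print(0 >> 16)
```

In the C++ skeleton this would correspond to passing `128 << 16` instead of `0` as the fourth argument, which makes the key length explicit rather than relying on the provider default.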

Edit3: You can test mbstowcs in Python (I'm using IPython here):

In [1]: from ctypes import *

In [2]: libc = CDLL('libc.so.7')

In [3]: monkey = c_char_p(u'Monkey')

In [4]: test = c_char_p(u'This is a test')

In [5]: wo = create_unicode_buffer(256)

In [6]: nref = c_size_t(250)

In [7]: libc.mbstowcs(wo, monkey, nref)
Out[7]: 6

In [8]: print wo.value
Monkey

In [9]: libc.mbstowcs(wo, test, nref)
Out[9]: 14

In [10]: print wo.value
This is a test

Note that on Windows you should probably use libc = cdll.msvcrt instead of libc = CDLL('libc.so.7').
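
The session above is Python 2. Below is a sketch of the same experiment for Python 3, where `c_char_p` needs a bytes object rather than a str. The helper `load_libc` is hypothetical, and the library lookup is an assumption about the platform; adjust it if `find_library` does not locate your C runtime.

```python
import ctypes
import ctypes.util
import sys

def load_libc():
    """Load the C runtime; the lookup is platform dependent."""
    if sys.platform.startswith("win"):
        return ctypes.cdll.msvcrt
    return ctypes.CDLL(ctypes.util.find_library("c"))

libc = load_libc()

# size_t mbstowcs(wchar_t *dest, const char *src, size_t n);
libc.mbstowcs.argtypes = [ctypes.c_wchar_p, ctypes.c_char_p, ctypes.c_size_t]
libc.mbstowcs.restype = ctypes.c_size_t

wide = ctypes.create_unicode_buffer(256)
# The third argument is a count of wide characters, not bytes.
count = libc.mbstowcs(wide, b"Monkey", 255)
print(count, wide.value)
```

For this plain ASCII input the call should convert all six characters, as in the session above. The result for non-ASCII input depends on the current C locale.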
