This is probably too late for you, but it may help someone else. I ran into the same problem and needed a reliable way to sanitize a path.
Here is what I ended up using, in three steps:
Step 1: Custom cleaning.
    public static string RemoveSpecialCharactersUsingCustomMethod(this string expression, bool removeSpecialLettersHavingASign = true)
    {
        var newCharacterWithSpace = " ";
        var newCharacter = "";

        // Line-break handling
        // ASCII LINE-FEED character (LF)
        expression = expression.Replace("\n", newCharacterWithSpace);
        // ASCII CARRIAGE-RETURN character (CR)
        expression = expression.Replace("\r", newCharacterWithSpace);

        // less than : used to redirect input, allowed in Unix filenames
        expression = expression.Replace(@"<", newCharacter);
        // greater than : used to redirect output, allowed in Unix filenames
        expression = expression.Replace(@">", newCharacter);
        // colon : used to determine the mount point / drive on Windows;
        // used to determine the virtual device or physical device such as a drive on AmigaOS, RT-11 and VMS;
        // used as a pathname separator in classic Mac OS. Doubled after a name on VMS,
        // indicates the DECnet nodename (equivalent to a NetBIOS (Windows networking) hostname preceded by "\\").
        // Colon is also used in Windows to separate an alternate data stream from the main file.
        expression = expression.Replace(@":", newCharacter);
        // quote : used to mark beginning and end of filenames containing spaces in Windows
        expression = expression.Replace(@"""", newCharacter);
        // slash : used as a path name component separator in Unix-like, Windows, and Amiga systems.
        // (The MS-DOS command.com shell would consume it as a switch character, but Windows itself always accepts it as a separator.)
        expression = expression.Replace(@"/", newCharacter);
        // backslash : also used as a path name component separator in MS-DOS, OS/2 and Windows
        // (where there are few differences between slash and backslash); allowed in Unix filenames
        expression = expression.Replace(@"\", newCharacter);
        // vertical bar or pipe : designates software pipelining in Unix and Windows; allowed in Unix filenames
        expression = expression.Replace(@"|", newCharacter);
        // question mark : used as a wildcard in Unix, Windows and AmigaOS; marks a single character. Allowed in Unix filenames
        expression = expression.Replace(@"?", newCharacter);
        // exclamation mark : valid in Windows filenames, but removed here as well
        expression = expression.Replace(@"!", newCharacter);
        // asterisk or star : used as a wildcard in Unix, MS-DOS, RT-11, VMS and Windows. Marks any sequence of characters
        // (Unix, Windows, later versions of MS-DOS) or any sequence of characters in either the basename or extension
        // (thus "*.*" in early versions of MS-DOS means "all files"). Allowed in Unix filenames
        expression = expression.Replace(@"*", newCharacter);
        // percent : used as a wildcard in RT-11; marks a single character.
        expression = expression.Replace(@"%", newCharacter);
        // period or dot : allowed, but the last occurrence will be interpreted as the extension separator in VMS, MS-DOS and Windows.
        // In other OSes, usually considered as part of the filename, and more than one period (full stop) may be allowed.
        // In Unix, a leading period means the file or folder is normally hidden.
        expression = expression.Replace(@".", newCharacter);
        // space : allowed (except in MS-DOS), but the space is also used as a parameter separator in command line applications.
        // This can be solved by quoting, but typing quotes around the name every time is inconvenient.
        // Note: this also removes the spaces that replaced LF and CR above.
        expression = expression.Replace(@" ", newCharacter);

        if (removeSpecialLettersHavingASign)
        {
            // Accented letters caused issues when zipping
            // More at : http://www.thesauruslex.com/typo/eng/enghtml.htm
            expression = expression.Replace(@"ê", "e");
            expression = expression.Replace(@"ë", "e");
            expression = expression.Replace(@"ï", "i");
            expression = expression.Replace(@"œ", "oe");
        }

        return expression;
    }
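As an aside, the chain of Replace calls in step 1 could also be written as a lookup table. The following is only a sketch of that idea, not the method I actually use: the class name PathCleaningSketch, the method name RemoveSpecialCharactersUsingTable and the two dictionaries are hypothetical. It assumes that line breaks should end up removed (in the method above they become spaces, and spaces are stripped afterwards), and unlike the method above it returns null or empty input unchanged instead of throwing.

```csharp
using System.Collections.Generic;
using System.Text;

public static class PathCleaningSketch
{
    // Hypothetical table: every character listed here is dropped from the input.
    private static readonly HashSet<char> RemovedCharacters = new HashSet<char>
    {
        '\n', '\r', ' ', '<', '>', ':', '"', '/', '\\', '|', '?', '!', '*', '%', '.'
    };

    // Hypothetical table: accented letters and their plain replacements.
    private static readonly Dictionary<char, string> AccentedLetters = new Dictionary<char, string>
    {
        ['ê'] = "e", ['ë'] = "e", ['ï'] = "i", ['œ'] = "oe"
    };

    public static string RemoveSpecialCharactersUsingTable(this string expression, bool removeSpecialLettersHavingASign = true)
    {
        // Assumption: null or empty input is returned as-is.
        if (string.IsNullOrEmpty(expression)) return expression;

        var result = new StringBuilder(expression.Length);
        foreach (var c in expression)
        {
            if (RemovedCharacters.Contains(c))
            {
                continue; // drop the character
            }

            if (removeSpecialLettersHavingASign && AccentedLetters.TryGetValue(c, out var plain))
            {
                result.Append(plain); // e.g. 'œ' becomes "oe"
            }
            else
            {
                result.Append(c);
            }
        }

        return result.ToString();
    }
}
```

With a table, fixing a "leak" means adding one entry rather than one more Replace call, and the string is scanned once instead of once per character.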
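Step 2: Check for any invalid characters that have not been removed yet.
As an additional validation step, I use the Path.GetInvalidPathChars() method mentioned above to detect any invalid characters that have not been removed yet.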
    public static bool ContainsAnyInvalidCharacters(this string path)
    {
        return (!string.IsNullOrEmpty(path) && path.IndexOfAny(Path.GetInvalidPathChars()) >= 0);
    }
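The check above only answers yes or no. If you also want to know which characters slipped through (for logging, for example), a variant along these lines might help. It is a minimal sketch: the class name PathValidationSketch and the method name FindInvalidPathCharacters are hypothetical, and the set returned by Path.GetInvalidPathChars() differs between platforms, so the result for a given string can differ too.

```csharp
using System;
using System.IO;
using System.Linq;

public static class PathValidationSketch
{
    // Returns the distinct characters of 'path' that Path.GetInvalidPathChars() reports as invalid.
    // An empty array means nothing was detected (which does not guarantee the path is valid).
    public static char[] FindInvalidPathCharacters(this string path)
    {
        if (string.IsNullOrEmpty(path)) return Array.Empty<char>();

        var invalid = Path.GetInvalidPathChars();
        return path
            .Where(c => Array.IndexOf(invalid, c) >= 0)
            .Distinct()
            .ToArray();
    }
}
```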
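Step 3: Remove any special characters detected in step 2.
Finally, I use this method as the last step to clean up anything that is left (taken from "How to remove illegal characters from path and filenames?"):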
    public static string RemoveSpecialCharactersUsingFrameworkMethod(this string path)
    {
        return Path.GetInvalidFileNameChars().Aggregate(path, (current, c) => current.Replace(c.ToString(), string.Empty));
    }
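I log any invalid character that was not cleaned in the first step. I chose this approach so that I can improve my custom method as soon as a "leak" is detected. I cannot rely on Path.GetInvalidFileNameChars() alone because of the following statement reported above (from MSDN):
"The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names."
This may not be the ideal solution, but given the context of my application and the level of reliability required, it is the best one I have found.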
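To show how the three steps and the leak logging fit together, here is a rough sketch. It is not my production code: the class name PathSanitizerSketch, the method name SanitizeAndReportLeaks and both delegate parameters are hypothetical. The custom cleaning step is passed in as a delegate so the example stands on its own, and the sketch assumes you want to report exactly the characters the final step removes, so it checks against Path.GetInvalidFileNameChars() rather than Path.GetInvalidPathChars().

```csharp
using System;
using System.IO;
using System.Linq;

public static class PathSanitizerSketch
{
    // customClean : step 1, e.g. s => s.RemoveSpecialCharactersUsingCustomMethod()
    // reportLeak  : called once per distinct invalid character that survived step 1
    public static string SanitizeAndReportLeaks(string input, Func<string, string> customClean, Action<char> reportLeak)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // Step 1: custom cleaning (supplied by the caller).
        var cleaned = customClean(input) ?? string.Empty;

        // Step 2: detect invalid characters that step 1 did not remove.
        var invalid = Path.GetInvalidFileNameChars();
        var leaks = cleaned
            .Where(c => Array.IndexOf(invalid, c) >= 0)
            .Distinct()
            .ToList();

        foreach (var leak in leaks)
        {
            reportLeak(leak); // e.g. write to a log so step 1 can be extended later
        }

        // Step 3: remove whatever leaked through.
        return leaks.Aggregate(cleaned, (current, c) => current.Replace(c.ToString(), string.Empty));
    }
}
```

A call could look like PathSanitizerSketch.SanitizeAndReportLeaks(name, s => s.RemoveSpecialCharactersUsingCustomMethod(), c => Console.WriteLine("Leak: U+" + ((int)c).ToString("X4"))), with the console output replaced by whatever logger you use.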