c# - How to programmatically detect XFA (Adobe XML Forms Architecture) dynamic PDFs

Question

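I have a system that converts PDFs to TIFs. It is a program written in C# that uses iTextSharp to read metadata from each PDF and pdf2tif (http://pdftotif.sourceforge.net/) to convert the file. I noticed that many PDFs are not converted correctly. In Acrobat and Foxit they open as multi-page forms, but in other viewers (Ghostscript ...) they open as a one-page document containing this message:

"To view the full contents of this document, you need a later version of the PDF viewer. You can upgrade to the latest version of Adobe Reader from www.adobe.com/products/acrobat/readstep2.html. For further support, go to http://www.adobe.com/support/products/acrreader.html"

Some googling told me that these are XFA dynamic PDFs. Is there a way to detect them programmatically so that I can process these PDFs differently?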

score 2 · Accepted Answer

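The iText API is a good place to start.

In iTextSharp you access an object's properties instead of calling methods. (If you have done a moderate amount of work with iTextSharp, you probably know this already.)

Anyway, here is a simple example using an HTTP handler: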

<%@ WebHandler Language="C#" Class="iTextXfa" %>
using System;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;

public class iTextXfa : IHttpHandler {
  public void ProcessRequest (HttpContext context) {
    HttpServerUtility Server = context.Server;
    // Two sample files located next to the handler.
    string[] testFiles = { 
      Server.MapPath("./non-XFA.pdf"), Server.MapPath("./XFA.pdf") 
    };
    foreach (string file in testFiles) {
      PdfReader reader = new PdfReader(file);
      try {
        // XfaForm inspects the document's form data for an XFA entry.
        XfaForm xfa = new XfaForm(reader);
        context.Response.Write(string.Format(
          "<p>File: {0} is XFA: {1}</p>",
          Server.HtmlEncode(file),
          xfa.XfaPresent ? "YES" : "NO"
        ));
      } finally {
        reader.Close();  // release the underlying file
      }
    }
  }
  public bool IsReusable { get { return false; } }
}

score 0 · Accepted Answer

A command-line approach:

strings document.pdf | grep XFA

If you get a line or two of output, you are probably dealing with an XFA PDF. For example:

<</Names[(!ADBE::0100_VersChkStrings) 364 0 R(!ADBE::0100_VersChkVars) 365 0 R(!ADBE::0200_VersChkCode_XFACheck) 366 0 R]>>
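
To run the same check over a batch of files, the one-liner could be wrapped in a small script. The following is only a sketch: `classify_pdf` is a made-up helper name, and it assumes a `grep` that supports the `-a` option (GNU and BSD grep do). It matches the raw string `XFA` just as the command above does, so it can report false positives, and it can miss PDFs that keep their form dictionary inside a compressed object stream. Treat a hit as a hint, not as proof.

```shell
#!/bin/sh
# Sketch: flag PDFs whose raw bytes contain the string "XFA".
# classify_pdf is a hypothetical helper name, not part of any tool.

classify_pdf() {
  # -a: treat the binary PDF as text; -q: report via exit status only.
  # LC_ALL=C avoids locale decoding problems on arbitrary bytes.
  if LC_ALL=C grep -a -q 'XFA' "$1"; then
    printf '%s: possible XFA\n' "$1"
  else
    printf '%s: no XFA marker\n' "$1"
  fi
}

# Classify every file passed on the command line.
for f in "$@"; do
  classify_pdf "$f"
done
```

Saved under a name such as `check-xfa.sh` (again, a hypothetical name), it could be invoked as `sh check-xfa.sh *.pdf`. Files flagged as possible XFA can then be confirmed with the iTextSharp check from the other answer before being routed to a different conversion path.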
