java - Java - Determine the size of an XML document

Question

I have a simple piece of code that fetches an XML file from a given URL.

DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(link);

That code returns an XML document (org.w3c.dom.Document). I need to get the size of the resulting XML document. Is there an elegant way to do that without using third-party jar files?

P.S. I mean the size in KB or MB, not the number of nodes.

score 3 · Accepted Answer

A first, naive version: load the file into a local buffer. The length of the input is then known. Then parse the XML from the buffer.

URL url = new URL("...");
InputStream in = new BufferedInputStream(url.openStream());
ByteArrayOutputStream buffer1 = new ByteArrayOutputStream();
int c = 0;
while((c = in.read()) >= 0) {
  buffer1.write(c);
}

System.out.println(String.format("Length in Bytes: %d", 
    buffer1.toByteArray().length));

ByteArrayInputStream buffer2 = new ByteArrayInputStream(buffer1.toByteArray());

Document doc = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder().parse(buffer2);

The drawback is the additional buffer in RAM.
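The buffering approach above reads one byte at a time. A minimal sketch of the same idea that copies in chunks is shown here; the class name BufferedXmlSize, the helper readFully, the 8 KB chunk size, and the in-memory sample document (standing in for url.openStream()) are all assumptions made for illustration, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class for illustration only.
final class BufferedXmlSize {

  // Copies the whole stream into memory, reading in 8 KB chunks.
  static byte[] readFully(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[8192];
    int n;
    while ((n = in.read(chunk)) != -1) {
      out.write(chunk, 0, n);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws Exception {
    // In-memory stand-in for url.openStream(), so the sketch runs offline.
    byte[] source = "<root><item>abc</item></root>".getBytes(StandardCharsets.UTF_8);

    byte[] data;
    try (InputStream in = new ByteArrayInputStream(source)) {
      data = readFully(in);
    }
    System.out.println("Length in bytes: " + data.length);

    // Parse from the buffered copy; the byte array is created only once.
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().parse(new ByteArrayInputStream(data));
    System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
  }
}
```

The extra RAM cost is unchanged; the sketch only avoids the per-byte read calls and the second toByteArray() copy.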

A more elegant second version: wrap the input stream in a custom java.io.FilterInputStream that counts the bytes streamed through it.

URL url = new URL("...");
CountInputStream in = new CountInputStream(url.openStream());
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
System.out.println(String.format("Bytes: %d", in.getCount()));

Here is CountInputStream. The read() methods are overridden so that each call is passed on to the underlying stream and the bytes actually read are counted.

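import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Counts the bytes read through the wrapped stream.
public class CountInputStream extends FilterInputStream {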

  private long count = 0L;

  public CountInputStream(InputStream in) {
    super(in);
  }

  public int read() throws IOException {
    final int c = super.read();
    if(c >= 0) {
      count++;
    }
    return c;
  }

  public int read(byte[] b, int off, int len) throws IOException {
    final int bytesRead = super.read(b, off, len);
    if(bytesRead > 0) {
      count += bytesRead;
    }
    return bytesRead;
  }

  // FilterInputStream.read(byte[]) itself calls read(b, 0, b.length), which
  // dispatches to the override above. Counting here too would count twice.
  @Override
  public int read(byte[] b) throws IOException {
    return read(b, 0, b.length);
  }

  public long getCount() {
    return count;
  }
}
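Since the question asks for the size in KB or MB and getCount() returns a byte count, a small conversion helper may be useful. This is only a sketch: the class SizeFormat and the method formatSize are hypothetical names, and it assumes 1 KB = 1024 bytes.

```java
import java.util.Locale;

// Hypothetical helper for illustration only.
final class SizeFormat {

  // Formats a byte count as B, KB, or MB, assuming 1 KB = 1024 bytes.
  static String formatSize(long bytes) {
    if (bytes < 1024L) {
      return bytes + " B";
    }
    if (bytes < 1024L * 1024L) {
      return String.format(Locale.ROOT, "%.1f KB", bytes / 1024.0);
    }
    return String.format(Locale.ROOT, "%.1f MB", bytes / (1024.0 * 1024.0));
  }

  public static void main(String[] args) {
    // 1536 stands in for a value returned by in.getCount().
    System.out.println(formatSize(1536L));
  }
}
```

With the counting stream above, the call would look like SizeFormat.formatSize(in.getCount()).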

score 0 · Answer

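You can do it the following way:

long start = Runtime.getRuntime().freeMemory();

Build the XML Document object, then call the same method again.

Document document = parser.getDocument();

long now = Runtime.getRuntime().freeMemory();

// Free memory shrinks as the document is built, so subtract in this order.
// This estimates heap usage only; garbage collection can skew the result.
System.out.println(" size of Document " + (start - now));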

score 0 · Answer

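Once an XML file has been parsed into a DOM tree, the source document (as a string) no longer exists. All you have is a tree of nodes built from that document, so the size of the source document can no longer be determined exactly from the DOM document.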

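You can use an identity transformation to turn the DOM document back into an XML file. However, that is a very roundabout way of getting the size, and the result will not match the size of the source document exactly.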
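The identity transformation mentioned above can be sketched with the JDK's javax.xml.transform API. The class name SerializedXmlSize, the helper serializedSize, and the in-memory sample document are assumptions made for illustration; the serialized size generally differs from the source size (whitespace, encoding, and the XML declaration may change), which is the caveat this answer raises.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class for illustration only.
final class SerializedXmlSize {

  // Serializes the DOM with an identity transformer and returns the byte count.
  static long serializedSize(Document doc) throws TransformerException {
    // newTransformer() without a stylesheet performs the identity transformation.
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    identity.transform(new DOMSource(doc), new StreamResult(out));
    return out.size();
  }

  public static void main(String[] args) throws Exception {
    // In-memory stand-in for a downloaded document.
    byte[] source = "<root>\n  <item>abc</item>\n</root>"
        .getBytes(StandardCharsets.UTF_8);
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().parse(new ByteArrayInputStream(source));

    System.out.println("Source bytes:     " + source.length);
    System.out.println("Serialized bytes: " + serializedSize(doc));
  }
}
```

Note that this buffers the whole serialized document in memory, so it costs at least as much RAM as buffering the download.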

For what you are trying to do, the best way is to download the document yourself, note its size, and then pass it to DocumentBuilder.parse as an InputStream.

score 0 · Answer

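Maybe this:

// Document.getTextContent() returns null for a Document node, so go through
// the root element. This counts only text content, not markup.
document.getDocumentElement().getTextContent().getBytes().length;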

answered 2012-07-05T11:56:56.423
