java - Jumping between XML tags

Question

This is a SAX question. I want to process the child tags in an XML file only when their parent tag matches. For example:

<version>
    <parent tag-1>
       <tag 1>
       <tag 2>
     </parent tag-1 >
     <parent tag-2>
       <tag 1>
       <tag 2>
     </parent tag-2>
</version>

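In the code above, I want to first match the parent tag (i.e., parent tag-1 or parent tag-2, depending on user input) and only then process the child tags beneath it. Can this be done with a SAX parser, keeping in mind that SAX offers more limited control than DOM and that I am a beginner in both SAX and Java? If so, could you point me to the relevant methods? TIA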

score 1 · Accepted Answer

The solution proposed by @Wing C. Chen is more than decent, but in your case, I wouldn't use a stack.

A use case for a stack when parsing XML

A common use case for a stack with XML is, for example, verifying that the tags are balanced when you use your own lexer (i.e. a hand-made XML parser with error tolerance).
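As a rough illustration of that use case, here is a minimal sketch of a stack-based balance check. The class name `TagBalanceChecker` and the regular expression are my own assumptions for the example; it is a toy lexer that ignores comments, CDATA and processing instructions, not a real XML parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: checks that opening and closing tags nest correctly.
class TagBalanceChecker {
    // group(1): "/" for a closing tag, group(2): tag name, group(3): "/" for <name/>
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_][\\w.-]*)[^>]*?(/?)>");

    static boolean isBalanced(String xml) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            String name = m.group(2);
            if (selfClosing) {
                continue; // <name/> opens and closes at once
            }
            if (closing) {
                // A closing tag must match the most recently opened tag.
                if (open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else {
                open.push(name);
            }
        }
        // Anything left on the stack was never closed.
        return open.isEmpty();
    }
}
```

Calling `TagBalanceChecker.isBalanced("<a><b></b></a>")` accepts the input, while `<a><b></a></b>` is rejected because the closing `a` does not match the `b` on top of the stack.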

A concrete example of it would be building the outline of an XML document for the Eclipse IDE.

When to use SAX, pull parsers and the like

You need memory efficiency when parsing a huge XML file
You don't need to navigate back and forth in the document

However, using SAX to parse complex documents can become tedious, especially if you want to apply operations to nodes based on some conditions.

When to use DOM-like APIs

You want easy access to the nodes
You want to navigate back and forth in the document at any time
Speed is not the main requirement vs development time/readability/maintenance

My recommendation

If you don't have a huge XML file, use a DOM-like API and select the nodes with XPath. I prefer dom4j personally, but I don't mind other APIs such as JDOM or even XPP3, which has XPath support.
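To make the recommendation concrete, here is a minimal sketch that uses the XPath support bundled with the JDK (`javax.xml.xpath`) rather than dom4j, so that it runs without extra dependencies. The class name `XPathChildSelector` is hypothetical, and the sketch assumes well-formed element names such as `parent1` and `parent2`, since the names in the question contain spaces and are not valid XML. It also assumes `parentName` is a trusted element name, because it is concatenated into the expression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the children of one chosen parent element.
class XPathChildSelector {
    static List<String> childTexts(String xml, String parentName) throws Exception {
        // Load the whole document into memory as a DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select every element child of /version/<parentName>.
        NodeList nodes = (NodeList) xpath.evaluate(
                "/version/" + parentName + "/*", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getNodeName() + "=" + nodes.item(i).getTextContent());
        }
        return result;
    }
}
```

With a document shaped like the one in the question, `childTexts(xml, "parent2")` returns only the children of the second parent, and no state tracking is needed in application code.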

score 1 · Accepted Answer

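If you are considering doing this for performance reasons, note that SAX will spool through the entire document anyway.

From a code-quality point of view, however, you can hook in an XMLFilter so that the SAX parser never hands non-matching children to your handler. You will probably still have to write the logic yourself (something like what Wing C. Chen's post provides), but instead of placing it in your application logic, you can abstract it away into a filter implementation.

This makes the filtering logic easier to reuse and keeps your application code cleaner and easier to follow.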
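A minimal sketch of such a filter, built on `org.xml.sax.helpers.XMLFilterImpl`, might look like this. The class name `ParentFilter`, the helper `childrenOf`, and the choice to forward only the events strictly inside the wanted parent are my own assumptions. It matches on `qName` because the default JAXP parser is not namespace-aware.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: forwards only the events nested inside one parent element.
class ParentFilter extends XMLFilterImpl {
    private final String wantedParent;
    private int depth = 0;         // depth of the element currently being read
    private int wantedDepth = -1;  // depth of the open wanted parent, -1 when outside it

    ParentFilter(XMLReader parent, String wantedParent) {
        super(parent);
        this.wantedParent = wantedParent;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        depth++;
        if (wantedDepth < 0 && qName.equals(wantedParent)) {
            wantedDepth = depth;                       // entering the wanted parent
        } else if (wantedDepth >= 0) {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (depth == wantedDepth) {
            wantedDepth = -1;                          // leaving the wanted parent
        } else if (wantedDepth >= 0) {
            super.endElement(uri, localName, qName);
        }
        depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (wantedDepth >= 0) {
            super.characters(ch, start, length);
        }
    }

    // Application code: it only ever sees the children that passed the filter.
    static List<String> childrenOf(String xml, String parentName) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        ParentFilter filter = new ParentFilter(reader, parentName);
        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                seen.add(qName);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }
}
```

The downstream handler stays free of any parent-tracking state. One trade-off of this sketch is that the forwarded event stream has no single root element, which is fine for a collecting handler but may not suit consumers that expect a complete document.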

score 1 · Accepted Answer

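Sure, this is easy to do if you remember the parent tag.

In general, when parsing XML tags, people use a stack to keep track of the ancestry of those tags. Your case can be solved easily with the following code: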

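// Tag and ParentTag are placeholder classes (ParentTag extends Tag).
// Note: localName is only filled in when the parser is namespace-aware;
// otherwise compare against qName instead.
Stack<Tag> tagStack = new Stack<Tag>();

public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
     if (localName.equalsIgnoreCase("parent")) {
          tagStack.push(new ParentTag());
     } else if (localName.equalsIgnoreCase("tag")) {
          // Check isEmpty() first: peek() throws EmptyStackException on an empty stack.
          if (!tagStack.isEmpty() && tagStack.peek() instanceof ParentTag) {
               //do your things here only when the parent tag is "parent"
          }
     }
}

public void endElement(String uri, String localName, String qName)
        throws SAXException {
     if (localName.equalsIgnoreCase("parent")) {
          tagStack.pop();
     }
}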

Alternatively, you can simply remember which tag you are in by updating a tagName variable.

String tagName = null;

public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
     if (localName.equalsIgnoreCase("parent")) {
          tagName = "parent";
     } else if (localName.equalsIgnoreCase("tag")) {
          if ("parent".equals(tagName)) {
               //do your things here only when the parent tag is "parent"
          }
     }
}

public void endElement(String uri, String localName, String qName)
        throws SAXException {
     // Reset only when the parent closes; resetting on every end tag
     // would skip all children after the first one.
     if (localName.equalsIgnoreCase("parent")) {
          tagName = null;
     }
}

However, I prefer the stack approach, because it keeps track of all the ancestor tags.
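For completeness, here is a self-contained sketch of the stack approach that can be run against a small document. The class name `ParentAwareHandler` and the `parse` helper are hypothetical. It stores element names in a `Deque<String>` instead of the placeholder `Tag` classes, and it matches on `qName` because the default JAXP parser is not namespace-aware.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the children of one chosen parent element.
class ParentAwareHandler extends DefaultHandler {
    private final String wantedParent;
    private final Deque<String> path = new ArrayDeque<>();  // names of all open ancestors
    private final List<String> matched = new ArrayList<>();
    private StringBuilder text;  // non-null only while inside a matching child

    ParentAwareHandler(String wantedParent) {
        this.wantedParent = wantedParent;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // peek() returns null on an empty deque, so the root element is handled safely.
        if (wantedParent.equals(path.peek())) {
            text = new StringBuilder();
        }
        path.push(qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.pop();
        // After the pop, the top of the stack is the parent of the element that just closed.
        if (text != null && wantedParent.equals(path.peek())) {
            matched.add(qName + "=" + text.toString().trim());
            text = null;
        }
    }

    static List<String> parse(String xml, String wantedParent) throws Exception {
        ParentAwareHandler handler = new ParentAwareHandler(wantedParent);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.matched;
    }
}
```

Because the whole ancestor path is on the stack, the same handler could be extended to match on grandparents or on a full path without changing its structure.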

score 0 · Accepted Answer

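If you want to jump to a specific tag, you need to use a DOM parser. It reads the entire document into memory and gives you various ways to access particular nodes in the tree, such as requesting a tag by name and then asking for that tag's children.

So, if you are not restricted to SAX, I would recommend DOM. I think the main reason to use SAX over DOM is that DOM requires more memory, because the whole document is loaded at once.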
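As a sketch of that "request a tag by name, then ask for its children" pattern using the JDK's built-in DOM API: the class name `DomChildLister` is hypothetical, and the element names passed in are assumed to be valid XML names such as `parent1`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: looks up a parent by name and lists its element children.
class DomChildLister {
    static List<String> childNames(String xml, String parentName) throws Exception {
        // The whole document is parsed into memory before any lookup happens.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<String> names = new ArrayList<>();
        NodeList parents = doc.getElementsByTagName(parentName);
        for (int i = 0; i < parents.getLength(); i++) {
            NodeList children = parents.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip whitespace text nodes and comments; keep elements only.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
        }
        return names;
    }
}
```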

score 0 · Accepted Answer

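The SAX parser calls a method in your implementation every time it hits a tag. If you want different behavior depending on the parent, you need to save the parent in a variable.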
