
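Update: I think I have now answered most of this question myself, except for the handling of <pgBreak>. You can see my updates and the latest XSLT under the edit at the end of this post.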

I asked a similar question yesterday and got a good answer. Since then, however, I have realized that it does not cover all my bases, so today I am asking a more detailed question.

XML IN

<?xml version="1.0" encoding="UTF-8"?>    
<root>
<pgBreak pgId="i"/>
    <p xml:id="a-01">
        <highlight rend="italic">Bacon ipsum dolor sit amet</highlight> bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
        bacon filet mignon pork chop tail.
        <note.ref id="0001"><super>1</super></note.ref>
        <note id="0001">
            <p>
                You may need to consult a <highlight rend="italic">latin</highlight> butcher. Good Luck.
            </p>
        </note>   
        Pork loin <pgBreak pgId="01"/> ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
        hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine   
        beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham 
        hock pork hamburger fatback.
    </p>
    <p xml:id="a-02">
        Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
        bacon filet mignon pork chop tail.
    </p>
    <p xml:id="a-03">
        Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. 
            <quote>
                <p> 1.
                    Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
                    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
                    bacon filet mignon pork chop tail.
                </p>
                <p> 2.
                    Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
                    bone. Sirloin <pgBreak pgId="02"/>turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
                    bacon filet mignon pork chop tail.
                </p>
                <p> 3.
                    Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
                    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
                    bacon filet mignon pork chop tail.
                </p>
            </quote>
    </p>
</root>

HTML OUT

  <!DOCTYPE HTML>
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
      <title>Test</title>
   </head>
   <body>
      <div id="pg-i">
        Page i
      </div>
      <p data-chunkid="a-01"> 
         <span class="highlight-italic">Bacon ipsum dolor sit amet</span>bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
         tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin
         pastrami t-
         bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef
         hamburger 
         bacon filet mignon pork chop tail.
         <span class="noteRef" id="0001"><sup>1</sup></span></p>
      <div id="note-0001" data-chunkid="a-01">
         <p>
            You may need to consult a <span class="highlight-italic">latin</span> butcher. Good Luck.

         </p>
      </div>
      <p data-chunkid="a-01">   
         Pork loin
      </p>
      <div id="pg-01">
          Page 01
       </div>
        <p data-chunkId="a-01">
         ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket
         rump ham, tail
         hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola
         short ribs swine   
         beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola
         bacon ham 
         hock pork hamburger fatback.
       </p>
      <p data-chunkid="a-02"><span class="highlight-italic">Bacon ipsum dolor sit</span> amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
         tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin
         pastrami t-
         bone. Sirloin turducken short ribs <span class="highlight-bold">t-bone</span> andouille strip steak pork loin corned beef hamburger 
         bacon filet mignon pork chop tail.

      </p>

      <p data-chunkid="a-03">
         Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs
         doner tri-tip 
         tongue. 

      </p>
      <blockquote data-chunkid="a-03">
        <p> 1.
            Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
            bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
            bacon filet mignon pork chop tail.
        </p>
         <p>2.
               Tri-tip ground round <span class="highlight-italic">short ribs</span> capicola meatloaf shank drumstick short loin pastrami t-
               bone. Sirloin 
          </p>
       </blockquote>
       <div id="pg-02">
         Page: 02
       </div>
       <blockquote data-chunkid="a-03"> 
         <p>
               turducken short ribs t-bone andouille strip steak pork loin corned beef
               hamburger bacon filet mignon pork chop tail.

         </p>
        <p> 3.
            Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
            bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
            bacon filet mignon pork chop tail.
        </p>

      </blockquote>
      <p data-chunkid="a-03">
         Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs
         doner tri-tip 
         tongue. 

      </p>
   </body>
</html>

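I want to transform the XML to HTML5 while keeping each chunk (xml:id) together. I want to avoid divitis (overuse of div), so wrapping each p in a div is out, but I am also trying to avoid invalid HTML. For example, it would be easy to take the parent p (xml:id = a-01) and wrap it around its descendants, but a block-level <div> or another <p> nested inside a <p> is invalid, and browsers would interpret everything after the first run of text as orphaned text.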

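I have tried various modifications of the XSLT from yesterday's question, but I find myself in somewhat unfamiliar territory. A brief explanation of the solution would also help me understand XSLT better, since it looks like I will be spending a lot more time with XSLT over the coming months. I should probably pick up the Michael Kay book or something similar.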

Edit: the current version of the XSLT I am using

Note: I have not attempted the page breaks yet. Also, I cannot get the <meta> tag to close.... oXygen 14 keeps complaining about it.

<xsl:template match="/">
    <html>
        <body>
            <xsl:apply-templates/>
        </body>
    </html>
</xsl:template>

<xsl:template match="p[not((parent::note,.//p, .//div))]">
    <p data-chunkID="{@xml:id}">
        <xsl:apply-templates/>
    </p>
</xsl:template>

<xsl:template match="p[.//p, .//div]">
    <xsl:for-each-group select="node()" group-adjacent="boolean((self::text(), self::note.ref,self::highlight))">
        <xsl:choose>
            <xsl:when test="current-grouping-key()">
                <p data-chunkID="{../@xml:id}">
                    <xsl:apply-templates select="current-group()"/>
                </p>
            </xsl:when>
            <xsl:when test="self::p">
                <p>
                    <xsl:apply-templates/>
                </p>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:for-each-group>
</xsl:template>

<xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
        <xsl:apply-templates/>
    </span>
</xsl:template>

<xsl:template match="super">
    <sup>
        <xsl:apply-templates/>
    </sup>
</xsl:template>

<xsl:template match="note">
    <div id="note-{@id}" data-chunkID="{../@xml:id}">
        <p>
        <xsl:apply-templates/>
        </p>
    </div>
</xsl:template>


<xsl:template match="quote">
    <blockquote data-chunkID="{../@xml:id}">
        <p>
        <xsl:apply-templates/>
        </p>
    </blockquote>
</xsl:template>



<xsl:template match="highlight">
    <xsl:variable name="class" select="concat(name(.),'-',string(@rend))"/>
    <xsl:choose>
        <xsl:when test="@rend[.= 'italic']">
            <span class="{$class}">
                <xsl:apply-templates/>
            </span>
        </xsl:when>
        <xsl:when test="@rend[.= 'bold']">
            <span class="{$class}">
                <xsl:apply-templates/>
            </span>
        </xsl:when>
        <xsl:otherwise>
            <span class="{$class}">
                <xsl:apply-templates/>
            </span>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>


1 Answer


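Your input seems a bit inconsistent with your output. (Is that the expected output, or the output you are currently getting?) Chunks a-02 and a-03 have no <highlight> elements in the input, but the output has <span class="highlight..."> elements for them. Chunk a-03 also has duplicated text after the blockquote.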

I think I have produced a working solution that handles everything in your example. Could you give this a try?

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <html>
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <title>Test</title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="p | div">
    <xsl:variable name="breaks" select="note | pgBreak | quote" />
    <xsl:variable name="firstNonBreak" select="node()[count(. | $breaks) != count($breaks)][1]" />
    <xsl:variable name="nonBreaksAfterBreak"
                  select="$breaks/following-sibling::node()[1][count(. | $breaks) != count($breaks)]" />

    <xsl:apply-templates select="$breaks | $firstNonBreak | $nonBreaksAfterBreak" mode="sectChild" />
  </xsl:template>

  <!-- Template to output the chunk id attribute of a particular hierarchy -->
  <xsl:template name="ChunkId">
    <xsl:variable name="id" select="ancestor::*[../self::root]/@xml:id" />
    <xsl:if test="$id">
      <xsl:attribute name="data-chunkid">
        <xsl:value-of select="$id"/>
      </xsl:attribute>
    </xsl:if>
  </xsl:template>

  <!-- Splitting types - notes, page breaks, quotes -->
  <xsl:template match="pgBreak" mode="sectChild">
    <div id="pg-{@pgId}">
      <xsl:value-of select="concat('Page ', @pgId)"/>
    </div>
  </xsl:template>

  <xsl:template match="quote | note" mode="sectChild">
    <xsl:apply-templates />
  </xsl:template>

  <!-- Receives the first node of each block of content outside of the splitting types
       and passes processing onto itself and siblings within its block-->
  <xsl:template match="text() | highlight | note.ref | super" mode="sectChild">

    <xsl:variable name="content">
      <xsl:apply-templates select="." mode="buildContent" />
    </xsl:variable>

    <xsl:if test="normalize-space($content)">
      <xsl:call-template name="Nest">
        <xsl:with-param name="hierarchy" select="ancestor::*[not(self::root)]" />
        <xsl:with-param name="content" select="$content" />
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Recursive template to output nodes from the top level down to content -->
  <xsl:template name="Nest">
    <xsl:param name="topLevel" select="true()"/>
    <xsl:param name="hierarchy" />
    <xsl:param name="content" />

    <xsl:variable name="top" select="$hierarchy[1]" />
    <xsl:variable name="remainder" select="$hierarchy[position() > 1]" />

    <!-- If there's a quote or note yet to come, don't output tags until we get there -->
    <xsl:variable name="skipTags" select="boolean($remainder[self::quote or self::note])" />
    <!-- Recursive output is captured in a variable, to be output later in this template -->
    <xsl:variable name="inside">
      <xsl:if test="$hierarchy">
        <xsl:call-template name="Nest">
          <xsl:with-param name="topLevel" select="$topLevel and $skipTags" />
          <xsl:with-param name="hierarchy" select="$remainder" />
          <xsl:with-param name="content" select="$content" />
        </xsl:call-template>
      </xsl:if>
    </xsl:variable>

    <xsl:choose>
      <xsl:when test="not($hierarchy)">
        <xsl:copy-of select="$content" />
      </xsl:when>
      <xsl:when test="$top/self::quote">
        <blockquote>
          <xsl:call-template name="ChunkId" />
          <xsl:copy-of select="$inside"/>
        </blockquote>
      </xsl:when>
      <xsl:when test="$top/self::note">
        <div id="note-{$top/@id}">
          <xsl:call-template name="ChunkId" />
          <xsl:copy-of select="$inside"/>
        </div>
      </xsl:when>
      <xsl:when test="not($skipTags)">
        <xsl:element name="{name($top)}">
          <xsl:if test="$topLevel">
            <xsl:call-template name="ChunkId" />
          </xsl:if>
          <xsl:copy-of select="$inside"/>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="$inside"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="node()" mode="buildContent">
    <xsl:if test="not(self::note or self::quote or self::pgBreak)">
      <!-- output this node -->
      <xsl:apply-templates select="self::node()[normalize-space(.)]" mode="contentOutput" />
      <!-- pass processing onto next sibling -->
      <xsl:apply-templates select="following-sibling::node()[1]" mode="buildContent" />
    </xsl:if>
  </xsl:template>

  <!-- Bottom level content - text, note refs, superscript, highlight-->
  <xsl:template match="text()" mode="contentOutput">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="note.ref" mode="contentOutput">
    <span class="noteRef" id="{@id}">
      <xsl:apply-templates mode="contentOutput"/>
    </span>
  </xsl:template>

  <xsl:template match="super" mode="contentOutput">
    <sup>
      <xsl:apply-templates mode="contentOutput"/>
    </sup>
  </xsl:template>

  <xsl:template match="highlight" mode="contentOutput">
    <xsl:variable name="class" select="concat(name(.),'-',string(@rend))"/>
    <span class="{$class}">
      <xsl:apply-templates mode="contentOutput"/>
    </span>
  </xsl:template>
</xsl:stylesheet>
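
Since a short explanation was requested, here is a stripped-down sketch of the core idea in the stylesheet above. It is not the full solution: it splits a p at pgBreak only, ignores note and quote, and does not filter out whitespace-only runs. The mode names `start` and `walk` are made up for this illustration. The test `count(. | $breaks) != count($breaks)` is the usual XSLT 1.0 way of asking "is this node not in the node-set $breaks?", because adding a node that is already in a set does not change the set's size.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="p">
    <!-- Children that force the paragraph to be split -->
    <xsl:variable name="breaks" select="pgBreak"/>
    <!-- The first child, provided it is not itself a break -->
    <xsl:variable name="firstRun"
                  select="node()[1][count(. | $breaks) != count($breaks)]"/>
    <!-- The node directly after each break, provided it is not a break -->
    <xsl:variable name="laterRuns"
                  select="$breaks/following-sibling::node()[1][count(. | $breaks) != count($breaks)]"/>
    <!-- Visit the breaks and the start of every run, in document order -->
    <xsl:apply-templates select="$breaks | $firstRun | $laterRuns" mode="start"/>
  </xsl:template>

  <!-- A break becomes its own block between two paragraphs -->
  <xsl:template match="pgBreak" mode="start">
    <div id="pg-{@pgId}">
      <xsl:value-of select="concat('Page ', @pgId)"/>
    </div>
  </xsl:template>

  <!-- The start of a run opens a new p carrying the chunk id of the source p -->
  <xsl:template match="node()" mode="start">
    <p data-chunkid="{ancestor::p[1]/@xml:id}">
      <xsl:apply-templates select="." mode="walk"/>
    </p>
  </xsl:template>

  <!-- Copy this node, then hand over to the next sibling until a break is reached -->
  <xsl:template match="node()" mode="walk">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::node()[1][not(self::pgBreak)]"
                         mode="walk"/>
  </xsl:template>
</xsl:stylesheet>
```

The full stylesheet follows the same pattern: the `sectChild` mode plays the role of `start`, `buildContent` plays the role of `walk`, and the `Nest` template rebuilds the ancestor elements around each run so that text inside a quote or note ends up inside its own blockquote or div.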

I think the unclosed meta tag is a result of using method="html"; you may need to use method="xml" to get a closed meta tag. With method="html", the transformation above produces the following output from your sample input:

<html>
  <head>
    <META http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Test</title>
  </head>
  <body>
  <p data-chunkid="a-01"><span class="highlight-italic">Bacon ipsum dolor sit amet</span> bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
    tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
    bacon filet mignon pork chop tail.
    <span class="noteRef" id="0001">
      <sup>1</sup>
    </span></p>
      <div id="note-0001" data-chunkid="a-01">
      <p>
        You may need to consult a <span class="highlight-italic">latin</span> butcher. Good Luck.
      </p>
    </div>
    <p data-chunkid="a-01">
    Pork loin </p>
    <div id="pg-01">Page 01</div>
    <p data-chunkid="a-01"> ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
    hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine
    beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham
    hock pork hamburger fatback.
  </p>
  <p data-chunkid="a-02">
    Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
    tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
    bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
    bacon filet mignon pork chop tail.
  </p>
  <p data-chunkid="a-03">
    Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
    tongue.
    </p>
      <blockquote data-chunkid="a-03">
      <p>
        Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin </p>
    </blockquote>
    <div id="pg-02">Page 02</div>
    <blockquote data-chunkid="a-03">
      <p>turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
        bacon filet mignon pork chop tail.
      </p>
    </blockquote>

</body>
</html>

By changing the method to "xml" and manually adding the meta element to the transformation, you get the same result, but with the following <head>:

  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Test</title>
  </head>
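
For reference, a minimal sketch of that second variant looks roughly like this. The `omit-xml-declaration` attribute is my own addition here and is an assumption about what you want at the top of the output file; drop it if you need the XML declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- method="xml": empty elements are serialized self-closed, e.g. <meta ... />.
       With method="html" the same element is written without the closing slash. -->
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <html>
      <head>
        <!-- Added by hand, because the xml method does not insert it for you -->
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
        <title>Test</title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

If your processor supports XSLT 2.0 (the xsl:for-each-group in your question suggests it does), method="xhtml" may be another option worth trying, though I have not tested it against your data.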
answered 2013-01-18T11:06:39.300