regex - Split a string according to a pattern using XSLT 2.0

Question

I have a string that I need to parse using XSLT 2.0.

Input string

Hoffmann, Rüdiger (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry)); Author, X; Author, B. (University-C, SomeCity (SomeCountry))

Expected output
Hoffmann, Rüdiger (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry))
Author, X
Author, B. (University-C, SomeCity (SomeCountry))

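The structure is an author name followed by that author's university. However, one author can have two universities, and the delimiter between universities is the same as the delimiter between author groups (a semicolon in this case).

I need to split on the delimiter between author-affiliation groups while ignoring the semicolons between affiliations.

I believe this can be done with a regular expression, but I don't have much experience building regular expressions myself.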

score 0 · Accepted Answer

As long as the list of universities and the country in parentheses are always present, you can match on them:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">

    <xsl:output method="text"/>
    <xsl:param name="authors">Author, A. (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry));Author, B. (University-C, SomeCity (SomeCountry))</xsl:param>

    <xsl:template match="/">
        <xsl:value-of select="mf:split($authors)" separator="&#10;"/>
    </xsl:template>

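    <!-- Returns one string per author group: a name followed by a
         parenthesized affiliation list ending with a country in parentheses -->
    <xsl:function name="mf:split" as="xs:string*">
        <xsl:param name="input" as="xs:string"/>
        <!-- Each match ends at a "))": the country's closing parenthesis
             followed by the one that closes the affiliation list.
             Text outside the matches (such as the separating semicolon)
             is discarded. -->
        <xsl:analyze-string select="$input" regex="[^;)]*?\([^(]*?\([^(]*?\)\)">
            <xsl:matching-substring>
                <xsl:sequence select="."/>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </xsl:function>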
</xsl:transform>
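
The stylesheet above relies on every author having an affiliation list, so an entry such as `Author, X` from the question's input would fall outside every match and be dropped. As a rough sketch of one way to keep such entries, the variant below matches whole groups instead: a run of characters that are neither `;` nor `(`, mixed with parenthesized blocks that may contain one further level of parentheses. The function name `mf:split-groups` is made up for illustration, and the pattern assumes balanced parentheses nested at most two levels deep, as in the sample data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">

    <xsl:output method="text"/>
    <!-- Input string from the question, including the author without an affiliation -->
    <xsl:param name="authors">Hoffmann, Rüdiger (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry)); Author, X; Author, B. (University-C, SomeCity (SomeCountry))</xsl:param>

    <xsl:template match="/">
        <xsl:value-of select="mf:split-groups($authors)" separator="&#10;"/>
    </xsl:template>

    <!-- Hypothetical helper (not part of the original answer):
         returns one string per author group -->
    <xsl:function name="mf:split-groups" as="xs:string*">
        <xsl:param name="input" as="xs:string"/>
        <!-- A group is one or more of:
               * any character other than ';' or '('
               * a parenthesized block, optionally containing inner
                 parenthesized blocks (one extra level only)
             Semicolons inside a block are consumed together with it,
             so only top-level semicolons end a group. -->
        <xsl:analyze-string select="$input"
            regex="([^;(]|\([^()]*(\([^()]*\)[^()]*)*\))+">
            <xsl:matching-substring>
                <!-- Skip whitespace-only matches and trim the space that
                     follows a separating semicolon. Note: normalize-space()
                     also collapses runs of whitespace inside the group. -->
                <xsl:if test="normalize-space(.)">
                    <xsl:sequence select="normalize-space(.)"/>
                </xsl:if>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </xsl:function>
</xsl:transform>
```

Run against any input document with an XSLT 2.0 processor, this should print the three lines from the expected output in the question. A `(` that is unbalanced or nested more deeply than the pattern allows will not match and will split the group at that point, so the regex would need another nesting level for such data.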
