I'm new to web automation with Selenium, and I'm having trouble extracting the text between two div tags.

This is part of the HTML I'm trying to extract the text from:

 ...
<tr>
    <td width="150">
    <a href="https://rads.stackoverflow.com/amzn/click/com/B0099RGRT8" rel="nofollow noreferrer">
    <img height="90" border="0" width="90" alt="iOttie Easy Flex2 Windshield Dashboard Car Mount H&hellip by iOttie" src="http://ecx.images-amazon.com/images/I/51mf6Ry9J2L._SL500_SS90_.jpg">
    </a>
    <div class="xxsmall" style="margin-top: 5px">
        <a href="https://rads.stackoverflow.com/amzn/click/com/B0099RGRT8" rel="nofollow noreferrer">iOttie Easy Flex2 Windshield Dashboard Car Mount Holder Desk Stand for iPhone 5 4S 4 3GS Samsung Gal&amp;hellip</a>
        by iOttie
    </div>
    </td>
    <td style="padding-left: 10px;">
        <div>
            <div>
                <span style="margin-left:-5px; vertical-align: -1">

                </span>
                <b>
                <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_title_1?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Bought for my wife, now I want one. Excellent Product.</a>
                </b>
                ,
                <span class="nowrap">November 30, 2012</span>
            </div>
            <div style="margin-top: 5px;">
                I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving.
                <br>
                <br>
                So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones.
                <br>
                <br>
                The phone is very easy to insert and remove , even while driving.
                <br>
                The mount is easy to position but not loose enough that it doesn't hold the position you want.
                <br>
                <br>
                I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point…
                <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_more?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Read more</a>
            </div>
        </div>
    </td>
</tr>
...

The other div tags actually contain other text as well.

What I want to extract from here is:

            I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving.

            So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones.

            The phone is very easy to insert and remove , even while driving.

            The mount is easy to position but not loose enough that it doesn't hold the position you want.

            I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point…

This is my code:

String review;
try {
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText();
} catch (NoSuchElementException nsee) {
    review = "NA";
}

This actually extracts all the text from every nested div tag, which is not what I want. I can target a specific div tag with ./td/div/div[3], but I can't get only the text between the div tags.

Any ideas?

Thanks

2 Answers

As a workaround, you can use a regular expression.

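String review;
try {
    // Read innerHTML so the markup is still present; getText() returns the
    // rendered text with the tags already stripped, leaving nothing to match.
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getAttribute("innerHTML");
    // Strings are immutable, so the result of replaceAll must be assigned.
    review = review.replaceAll("(<.+>)", "");
} catch (NoSuchElementException nsee) {
    review = "NA";
}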

The regex removes all tags together with the text of the inner elements, so only the first-level text remains. For example, if you have:

some strange<div>other text</div> text

the resulting string will be:

some strange text
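The behaviour of that pattern can be checked without Selenium. The following is a minimal sketch using only java.lang.String; the class and method names are made up for illustration. It also shows a caveat: because .+ is greedy, the match runs from the first < to the last > on a line, so first-level text sitting between two child elements is removed too.

```java
class RegexStripDemo {
    // The pattern from the answer. ".+" is greedy and does not cross line
    // breaks, so on each line it spans from the first '<' to the last '>'.
    static String stripGreedy(String html) {
        // replaceAll returns a new String; the input is left unchanged.
        return html.replaceAll("(<.+>)", "");
    }

    public static void main(String[] args) {
        // One child element: only the first-level text is left.
        System.out.println(stripGreedy("some strange<div>other text</div> text"));
        // prints: some strange text

        // Two child elements: the text between them (" keep ") is lost as well.
        System.out.println(stripGreedy("a<b>x</b> keep <i>y</i> z"));
        // prints: a z
    }
}
```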

If you need a more complex regex, here is a handy link for testing it.
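As one possible "more complex" variant, the sketch below removes each child element separately instead of relying on a single greedy match, then turns leftover single tags such as <br> into line breaks. It assumes the input is the innerHTML string of the target div; the class, method, and sample string are hypothetical. Regexes are fragile on real HTML (for example, nested elements with the same tag name are not handled), so treat this as an illustration rather than a robust parser.

```java
import java.util.regex.Pattern;

class OwnTextRegexDemo {
    // A paired element with its contents, e.g. <a href="...">Read more</a>.
    // ".*?" is reluctant, "\1" must repeat the opening tag name, and DOTALL
    // lets '.' cross line breaks.
    private static final Pattern PAIRED =
            Pattern.compile("<(\\w+)\\b[^>]*>.*?</\\1\\s*>", Pattern.DOTALL);

    // Any remaining single tag, such as <br>.
    private static final Pattern SINGLE = Pattern.compile("<[^>]+>");

    static String ownText(String innerHtml) {
        // 1. Drop child elements together with their text.
        String withoutChildren = PAIRED.matcher(innerHtml).replaceAll("");
        // 2. Replace leftover tags with line breaks.
        String withoutTags = SINGLE.matcher(withoutChildren).replaceAll("\n");
        // 3. Collapse runs of line breaks and surrounding whitespace.
        return withoutTags.replaceAll("[ \\t]*\\n\\s*", "\n").trim();
    }

    public static void main(String[] args) {
        String innerHtml =
                "First line.<br><br>Second line. <a href=\"#\">Read more</a>";
        System.out.println(ownText(innerHtml));
        // prints:
        // First line.
        // Second line.
    }
}
```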

answered 2013-03-28T07:19:16.163