regex - How to use regular expressions in Selenium locators

Question

I'm using Selenium RC and would like, for example, to get all the link elements whose href attribute matches:

http://[^/]*\d+com

I would like to use something like:

sel.get_attribute( '//a[regx:match(@href, "http://[^/]*\d+.com")]/@name' )

This would return a list of the name attributes of all the links that match the regex (or something like that).

Thanks

score 13 · Accepted Answer

The answer above is probably the right way to find ALL of the links that match a regex, but I thought it'd also be helpful to answer the other part of the question, how to use regex in Xpath locators. You need to use the regex matches() function, like this:

xpath=//div[matches(@id,'che.*boxes')]

(this, of course, would click the div with 'id=checkboxes', or 'id=cheANYTHINGHEREboxes')

Note, however, that the matches function is not supported by all native browser implementations of Xpath (most notably, using it in FF3 throws an error: invalid xpath[2]).

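If you have trouble with a particular browser (as I did with FF3), try using Selenium's allowNativeXpath("false") to switch over to the JavaScript Xpath interpreter. It will be slower, but it does seem to work with more Xpath functions, including 'matches' and 'ends-with'. :)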

score 3 · Accepted Answer

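You can use the Selenium command getAllLinks to get an array of the ids of the links on the page. You can then loop through it and check each href using getAttribute, which takes the locator followed by an @ and the attribute name. For example, in Java this might be: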

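// getAllLinks returns the id of each link on the page ("" for links without an id)
String[] allLinks = selenium.getAllLinks();
List<String> matchingLinks = new ArrayList<String>();

for (String linkId : allLinks) {
    if (linkId.isEmpty()) {
        continue; // cannot build an id= locator without an id
    }
    String linkHref = selenium.getAttribute("id=" + linkId + "@href");
    // String.matches must match the whole href; \\. is a literal dot
    if (linkHref.matches("http://[^/]*\\d+\\.com")) {
        matchingLinks.add(linkId);
    }
}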
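
The filtering step can also be sketched in Python, which the question's sel.get_attribute call suggests is the asker's language. This is only a sketch under stated assumptions: filter_links is a hypothetical helper, the hrefs are passed in as a plain dict instead of being fetched from a live Selenium session, and the pattern escapes the dot before "com".

```python
import re

# Assumed pattern: the question's regex with the dot before "com" escaped.
HREF_PATTERN = re.compile(r"http://[^/]*\d+\.com")


def filter_links(hrefs_by_id, pattern=HREF_PATTERN):
    """Return the link ids whose href starts with a match for the pattern.

    hrefs_by_id maps a link id to its href, e.g. values collected with
    getAllLinks and getAttribute as in the Java loop above.
    """
    return [
        link_id
        for link_id, href in hrefs_by_id.items()
        if href and pattern.match(href)
    ]


# Hypothetical sample data standing in for values read from the page.
sample = {
    "a1": "http://site123.com/page",
    "a2": "http://example.com/",
    "a3": "https://site9.com/",
}
print(filter_links(sample))
```

Note that re.match only anchors at the start of the href, whereas Java's String.matches requires the whole string to match, so this sketch also accepts hrefs that carry a path after the host.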

score 2 · Accepted Answer

A possible solution is to use sel.get_eval() and write a JS script that returns a list of the links, something like this answer: selenium: Is it possible to use the regexp in selenium locators

score 0 · Accepted Answer

Here are some alternate methods for Selenium RC as well. These aren't pure Selenium solutions; they let your programming language's data structures interact with Selenium.

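You can also get the HTML page source and then run a regular expression over it to return a match set of links. Use regex grouping to separate out the URLs, link text/ID, etc., and you can then pass them back to Selenium to click on or navigate to.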

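Another method is to get the HTML page source or the innerHTML (via DOM locators) of a parent/root element, then convert the HTML to XML as a DOM object in your programming language. You can then traverse the DOM with the desired XPath (with or without regular expressions) and obtain a node set of only the links of interest. From there, parse out the link text/ID or URL and pass it back to Selenium to click on or navigate to.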

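Upon request, I'm providing examples below. They are in mixed languages, since the post didn't appear to be language specific anyway; I'm just using what I had available to hack together. They aren't fully tested, or tested at all, but I have worked with bits of this code before in other projects, so treat these as proof-of-concept examples of how you might implement the solutions I just described.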

//Example of element attribute processing by page source and regex (in PHP)
$pgSrc = $sel->getPageSource();
//simple hyperlink extraction via regex below, replace with better regex pattern as desired
//[^>]* and [^\"]* keep each match inside a single tag and a single attribute value
preg_match_all("/<a\\s[^>]*href=\"([^\"]*)\"/i",$pgSrc,$matches,PREG_PATTERN_ORDER);
//$matches is a 2D array, $matches[0] is array of whole string matched, $matches[1] is array of what's in parenthesis
//you either get an array of all matched link URL values in parenthesis capture group or an empty array
$links = count($matches) >= 2 ? $matches[1] : array();
//now do as you wish, iterating over all link URLs
//NOTE: these are URLs only, not actual hyperlink elements
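
For comparison, here is a rough Python sketch of the same page-source idea that uses regex grouping to pull out both the URL and the link text. It assumes the page source is already available as a string; extract_links and LINK_RE are hypothetical names, and the pattern only handles double-quoted href values, so treat it as a starting point, not a robust HTML parser.

```python
import re

# Assumed pattern: group 1 is the href value, group 2 is the link text.
# Regexes are fragile on real-world HTML; only double-quoted hrefs are handled.
LINK_RE = re.compile(
    r'<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def extract_links(page_source):
    """Return (href, link_text) tuples for every anchor found in the source."""
    return LINK_RE.findall(page_source)


# Hypothetical page source used only for illustration.
html = '<p><a id="x" href="http://site42.com/">Site</a> <a href="/about">About</a></p>'
for href, text in extract_links(html):
    print(href, text)
```

The returned URLs or link texts could then be filtered with a second regex and handed back to Selenium as locators or navigation targets.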

//Example of XML DOM parsing with Selenium RC (in Java)
String locator = "id=someElement";
String htmlSrcSubset = sel.getEval("this.browserbot.findElement(\""+locator+"\").innerHTML");
//using the jsoup HTML parser library for Java, see jsoup.org (imports: org.jsoup.Jsoup, org.jsoup.nodes.Document)
Document doc = Jsoup.parse(htmlSrcSubset);
/* once you have this document object, can then manipulate & traverse
it as an XML/HTML node tree. I'm not going to go into details on this
as you'd need to know XML DOM traversal and XPath (not just for finding locators).
But this tutorial URL will give you some ideas:

http://jsoup.org/cookbook/extracting-data/dom-navigation

the example there seems to indicate first getting the element/node defined
by content tag within the "document" or source, then from there get all
hyperlink elements/nodes and then traverse that as a list/array, doing
whatever you want with an object oriented approach for each element in
the array. Each element is an XML node with properties. If you study it,
you'd find this approach gives you the power/access that WebDriver/Selenium 2
now gives you with WebElements but the example here is what you can do in
Selenium RC to get similar WebElement kind of capability
*/
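
The DOM-parsing idea can also be sketched in Python with only the standard library. This is a sketch under assumptions, not the jsoup approach itself: LinkCollector and links_matching are hypothetical names, the HTML fragment is assumed to have been fetched already (for example as innerHTML of a parent element), and the href pattern is the question's regex with the dot escaped.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the attributes of every <a> element as a dict."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag
        if tag == "a":
            self.links.append(dict(attrs))


def links_matching(html_fragment, href_pattern):
    """Return attribute dicts of anchors whose href matches href_pattern."""
    collector = LinkCollector()
    collector.feed(html_fragment)
    collector.close()
    regex = re.compile(href_pattern)
    return [a for a in collector.links if regex.search(a.get("href") or "")]


# Hypothetical fragment standing in for innerHTML fetched from the page.
fragment = (
    '<div><a name="first" href="http://host1.com/x">one</a>'
    '<a name="second" href="http://host.com/y">two</a></div>'
)
names = [a.get("name") for a in links_matching(fragment, r"http://[^/]*\d+\.com")]
print(names)
```

This yields the name attributes of the matching links, which is what the question asked for; the same attribute dicts also carry id or href values that can be turned back into Selenium locators.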
