javascript - JS/Node:- Selecting a tag using node.io

Question

I'm a beginner, and I have an assignment to scrape the content of this page using node.io:
http://www.nycourts.gov/reporter/3dseries/2013/2013_06966.htm

I want to save the text content under the <P> tags into a variable as a string.

This is my code:

var nodeio = require('node.io');

var methods = {
    input: false,
    run: function() {
        this.getHtml('http://www.nycourts.gov/reporter/3dseries/2013/2013_06966.htm', function(err, $) {
            // Handle any request / parsing errors
            if (err) this.exit(err);

            // Select every <P> element (this is the call that fails)
            var content = $('P');

            this.emit(content);
        });
    }
};

exports.job = new nodeio.Job({timeout: 10}, methods);

This fails with the error: No elements matching 'P'. Please help.

score 1 · Accepted Answer

I also got Error: No elements matching 'P' when running this command:

$ ./node_modules/.bin/node.io query http://www.nycourts.gov/reporter/3dseries/2013/2013_06966.htm P

The root cause is that the <P> tags on that page are never closed with </P>, and node.io does not auto-correct malformed HTML the way modern web browsers do. Querying for <blockquote>, however, works fine:

$ ./node_modules/.bin/node.io query http://www.nycourts.gov/reporter/3dseries/2013/2013_06966.htm blockquote
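
One way to check this diagnosis on a saved copy of the page is to count opening and closing tags in the raw HTML. The snippet below is a minimal sketch in plain Node with no dependencies; countTags is a hypothetical helper (it is not part of node.io), and the sample string is made up to mimic the page's structure rather than copied from it.

```javascript
// Count opening and closing tags of one kind in a raw HTML string.
// countTags is a hypothetical helper, not part of node.io.
function countTags(html, tagName) {
  // Opening tag: "<p>" or "<p attr=...>", matched case-insensitively.
  var open = html.match(new RegExp('<' + tagName + '(\\s[^>]*)?>', 'gi')) || [];
  // Closing tag: "</p>", matched case-insensitively.
  var close = html.match(new RegExp('</' + tagName + '\\s*>', 'gi')) || [];
  return { open: open.length, close: close.length };
}

// Made-up sample mimicking the page: <P> is opened twice and never closed.
var sample = '<BLOCKQUOTE><P>First paragraph.<P>Second paragraph.</BLOCKQUOTE>';

console.log(countTags(sample, 'p'));          // { open: 2, close: 0 }
console.log(countTags(sample, 'blockquote')); // { open: 1, close: 1 }
```

A large gap between the two counts for P, with matching counts for blockquote, would be consistent with the behaviour described above.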

You can still get the <P> content, though, by using Selenium so that the HTML document is parsed by a real browser.

Here is a JavaScript example that you can run with node against a Selenium grid on your host to get what you need. You can also refer to my other answer on this topic:

var webdriverjs = require('webdriverjs');

var client = webdriverjs.remote({
  host: 'localhost',
  port: 4444,
  desiredCapabilities: {
    browserName: 'safari', // you can change this accordingly
    version: '7',
    platform: "MAC"  // you can change this accordingly
  }
});

client.init();

client.url('http://www.nycourts.gov/reporter/3dseries/2013/2013_06966.htm')
  .getText("P",function(err, text) { console.log (text)}).call(function () {});

client.end();
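
If running a real browser is too heavy for the assignment, a rougher alternative (not part of the Selenium approach above) is to work on the raw HTML string directly. Because the <P> tags are never closed, each opening tag can be treated as a paragraph separator. The sketch below assumes the HTML has already been downloaded into a string; paragraphTexts is a hypothetical helper and the sample input is made up. It is regex-based, so it does not decode HTML entities and will not cope with every kind of markup.

```javascript
// Hypothetical helper: split raw HTML on opening <P> tags and
// return the plain text of each resulting chunk.
function paragraphTexts(html) {
  return html
    .split(/<p(?:\s[^>]*)?>/i)      // every opening <P ...> acts as a separator
    .slice(1)                       // drop whatever precedes the first <P>
    .map(function (chunk) {
      return chunk
        .replace(/<[^>]*>/g, '')    // strip any remaining tags
        .replace(/\s+/g, ' ')       // collapse runs of whitespace
        .trim();
    })
    .filter(function (text) { return text.length > 0; });
}

// Made-up sample with unclosed <P> tags, as on the page in question.
var sampleHtml = '<BLOCKQUOTE><P>First paragraph.<P>Second <I>paragraph</I>.</BLOCKQUOTE>';

// Join the paragraphs into the single string the question asks for.
var content = paragraphTexts(sampleHtml).join('\n');
console.log(content);
```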
