java - Fixing HtmlCleaner failing with HTTP response code 403

Question

I am using HtmlCleaner to fetch data from a website, but I keep getting this error:

Server returned HTTP response code: 403 for URL: http://www.groupon.com/browse/chicago?z=skip

I have used the same code before and it worked perfectly, so I can't tell what is going wrong. Can anyone help me?

The code is below:

import java.net.URL;
import java.util.ArrayList;

import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

public ArrayList<String> ParseGrouponDeals(ArrayList<String> arrayList) {
    try {
        CleanerProperties props = new CleanerProperties();

        props.setTranslateSpecialEntities(true);
        props.setTransResCharsToNCR(true);
        props.setOmitComments(true);

        // HtmlCleaner downloads the page itself here; this is the call that fails with the 403.
        TagNode root = new HtmlCleaner(props).clean(new URL("http://www.groupon.com/browse/chicago?z=skip"));

        // Get the wrapper element that holds all deals.
        Object[] objects = root.evaluateXPath("//*[@id=\"browse-deals\"]");
        TagNode dealWrapper = (TagNode) objects[0];

        // Get the child tiles, one per deal.
        TagNode[] todayDeals = dealWrapper.getElementsByAttValue("class", "deal-list-tile grid_5_third", true, true);
        System.out.println("++++ Groupon Deal Today: " + todayDeals.length + " deals");
        for (int i = 0; i < todayDeals.length; i++) {
            // The href is site-relative, so prefix it with the host.
            String link = String.format("http://www.groupon.com%s", todayDeals[i].findElementByAttValue("class", "deal-permalink", true, true).getAttributeByName("href"));
            arrayList.add(link);
        }
        return arrayList;
    } catch (Exception e) {
        System.out.println("Error parsing Groupon:" + e.getMessage());
        e.printStackTrace();
    }
    // Reached only when fetching or parsing failed.
    return null;
}
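
One possible cause, which the question does not confirm, is that the server has started rejecting requests carrying Java's default `Java/<version>` User-Agent header. If so, opening the connection by hand and sending an explicit User-Agent may get past the 403. The following is only a sketch under that assumption: the class name `UserAgentFetch`, the helper names `prepare` and `open`, the User-Agent string, and the timeout values are all made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class UserAgentFetch {
    // Hypothetical value; any User-Agent string the server accepts would do.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; ExampleFetcher/1.0)";

    /** Builds a connection with an explicit User-Agent. No network traffic happens yet. */
    static HttpURLConnection prepare(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setConnectTimeout(10_000); // milliseconds, arbitrary choice
        conn.setReadTimeout(10_000);    // milliseconds, arbitrary choice
        return conn;
    }

    /** Sends the request and returns the body stream, or throws on a non-200 status. */
    static InputStream open(URL url) throws IOException {
        HttpURLConnection conn = prepare(url);
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            conn.disconnect();
            throw new IOException("Unexpected HTTP status " + status + " for " + url);
        }
        return conn.getInputStream();
    }
}
```

The stream returned by `open` could then be handed to HtmlCleaner's `clean(InputStream)` overload instead of passing the `URL` directly, so that HtmlCleaner parses the page without performing the download itself. If the 403 persists, the server is probably filtering on something other than the User-Agent.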

