
I have a problem.

I have a list of several hundred SKU numbers, and I'm trying to match each one with the title of the product it belongs to. I have thought of a few ways to accomplish this, but I feel like I'm missing something... I'm hoping someone here has a quick and efficient idea to help me get this done.

The products come from Aidan Gray.

Attempt #1 (Batch Program Method) - FAIL:

After searching for a SKU in Aidan Gray, the website returns a URL that looks like below:

http://www.aidangrayhome.com/catalogsearch/result/?q=SKUNUMBER

... with "SKUNUMBER" obviously being a SKU.
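As a small illustration of the URL pattern above, the search URLs for a list of SKUs could be generated like this. This is only a sketch: the helper name searchUrl and the sample SKUs are made up, and it assumes an environment with console.log (Node.js or a browser console).

```javascript
// Base search URL taken from the pattern shown above.
var BASE_URL = 'http://www.aidangrayhome.com/catalogsearch/result/?q=';

// Hypothetical helper: builds the search URL for a single SKU.
function searchUrl(sku) {
    // Strip surrounding whitespace, then percent-encode the SKU so that
    // spaces or symbols cannot break the query string.
    var cleaned = sku.replace(/^\s+|\s+$/g, '');
    return BASE_URL + encodeURIComponent(cleaned);
}

// Made-up SKUs, for illustration only.
var exampleSkus = ['ABC123', ' F 204 '];

for (var i = 0; i < exampleSkus.length; i++) {
    console.log(searchUrl(exampleSkus[i]));
}
```

Encoding the SKU is a precaution; for purely alphanumeric SKUs the URL is the same as simple concatenation.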

The first result of the webpage is almost always the product.

To click the first result from the address bar, the following can be entered (provided the browser allows JavaScript in the address bar):

javascript:{document.getElementsByClassName("product-image")[0].click();}

I wanted to create a .bat file through Command Prompt and execute the following command:

firefox http://www.aidangrayhome.com/catalogsearch/result/?q=SKUNUMBER javascript:{document.getElementsByClassName("product-image")[0].click();}

... but Firefox doesn't seem to allow these two commands to execute in the same tab.

If that had worked, I was going to go to http://tools.buzzstream.com/meta-tag-extractor, paste in the resulting links to get the page titles, export the data to CSV, and copy over the data I wanted.

Unfortunately, I am unable to open both the webpage and the Javascript in the same tab through a batch program.

Attempt #2 (I'm Feeling Lucky Method):

I was going to use Google's &btnI URL suffix to automatically redirect to the first result.

http://www.google.com/search?btnI&q=site:aidangrayhome.com+SKUNUMBER

After opening all the links in tabs, I was going to use a Firefox add-on called "Send Tab URLs" to copy the names of the tabs (which contain the product names) to the clipboard.

The problem is that most of the results were simply not lucky enough...

If anybody has an idea or tip to get this accomplished, I'd be very grateful.


1 Answer


I suggest using JScript for this. It's easy to include as hybrid code in a batch script, its structure and syntax are familiar to anyone comfortable with JavaScript, and you can use it to fetch web pages via XMLHTTPRequest (also known as Ajax to the less-informed) and build a DOM object from the .responseText using an htmlfile COM object.

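Anyway, challenge: accepted. Save this with a .bat extension. It looks for a text file containing SKUs, one per line, then fetches and scrapes the search page for each, writing info from the first anchor element with a .className of "product-image" to a CSV file.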

@if (@CodeSection == @Batch) @then

@echo off
setlocal

set "skufile=sku.txt"
set "outfile=output.csv"
set "URL=http://www.aidangrayhome.com/catalogsearch/result/?q="

rem // invoke JScript portion
cscript /nologo /e:jscript "%~f0" "%skufile%" "%outfile%" "%URL%"

echo Done.

rem // end main runtime
goto :EOF

@end // end batch / begin JScript chimera

var fso = WSH.CreateObject('scripting.filesystemobject'),
    skufile = fso.OpenTextFile(WSH.Arguments(0), 1),
    skus = skufile.ReadAll().split(/\r?\n/),
    outfile = fso.CreateTextFile(WSH.Arguments(1), true),
    URL = WSH.Arguments(2);

skufile.Close();

String.prototype.trim = function() { return this.replace(/^\s+|\s+$/g, ''); }

// returns a DOM root object
function fetch(url) {
    var XHR = WSH.CreateObject("Microsoft.XMLHTTP"),
        DOM = WSH.CreateObject('htmlfile');

    WSH.StdErr.Write('fetching ' + url);

    XHR.open("GET",url,true);
    XHR.setRequestHeader('User-Agent','XMLHTTP/1.0');
    XHR.send('');
    while (XHR.readyState!=4) {WSH.Sleep(25)};
    DOM.write(XHR.responseText);
    return DOM;
}

function out(what) {
    WSH.StdErr.Write(new Array(79).join(String.fromCharCode(8)));
    WSH.Echo(what);
    outfile.WriteLine(what);
}

WSH.Echo('Writing to ' + WSH.Arguments(1) + '...')
out('sku,product,URL');

for (var i=0; i<skus.length; i++) {
    if (!skus[i]) continue;

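    // trim and percent-encode the SKU so stray whitespace or symbols can't break the query
    var DOM = fetch(URL + encodeURIComponent(skus[i].trim())),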
        anchors = DOM.getElementsByTagName('a');

    for (var j=0; j<anchors.length; j++) {
        if (/\bproduct-image\b/i.test(anchors[j].className)) {
            // double any embedded quotes so the quoted CSV field stays valid
            out(skus[i] + ',"' + anchors[j].title.trim().replace(/"/g, '""') + '","' + anchors[j].href + '"');
            break;
        }
    }
}

outfile.Close();

Too bad the htmlfile COM object doesn't support getElementsByClassName. :/ But this seems to work well enough in my testing.
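If the tag-name-plus-regex loop gets repeated elsewhere, it could be wrapped in a small stand-in for getElementsByClassName. The following is only a sketch: the name elementsByClassName is made up, and it assumes nothing about the root object beyond a getElementsByTagName method, which the script above already relies on.

```javascript
// Hypothetical stand-in for getElementsByClassName: returns an array of
// elements with the given tag name whose class list contains className.
function elementsByClassName(root, tagName, className) {
    var all = root.getElementsByTagName(tagName),
        matches = [];

    for (var i = 0; i < all.length; i++) {
        // className is a space-separated list of class tokens
        var classes = (all[i].className || '').split(/\s+/);

        for (var j = 0; j < classes.length; j++) {
            if (classes[j] === className) {
                matches.push(all[i]);
                break;
            }
        }
    }
    return matches;
}
```

In the script above it could be called as elementsByClassName(DOM, 'a', 'product-image')[0]. Comparing whole tokens is slightly stricter than the \bproduct-image\b test, which would also match a class such as "product-image-large" because the hyphen counts as a word boundary. Unlike that regex, this comparison is case-sensitive.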

Answered 2015-03-26T14:19:48.887