Webサイトをスクレイピングしますか?mechanizeを使用します:
#!/usr/bin/ruby1.8
require 'rubygems'
require 'mechanize'
agent = WWW::Mechanize.new
page = agent.get("http://xkcd.com")
page = page.link_with(:text=>'Forums').click
page = page.link_with(:text=>'Mathematics').click
page = page.link_with(:text=>'Math Books').click
#puts page.parser.to_html # If you want to see the html you just got
posts = page.parser.xpath("//div[@class='postbody']")
for post in posts
title = post.at_xpath('h3//text()').to_s
author = post.at_xpath("p[@class='author']//a//text()").to_s
body = post.xpath("div[@class='content']//text()").collect do |div|
div.to_s
end.join("\n")
puts '-' * 40
puts "title: #{title}"
puts "author: #{author}"
puts "body:", body
end
出力の最初の部分:
----------------------------------------
title: Math Books
author: Cleverbeans
body:
This is now the official thread for questions about math books at any level, fr\
om high school through advanced college courses.
I'm looking for a good vector calculus text to brush up on what I've forgotten.\
We used Stewart's Multivariable Calculus as a baseline but I was unable to pur\
chase the text for financial reasons at the time. I figured some things may hav\
e changed in the last 12 years, so if anyone can suggest some good texts on thi\
s subject I'd appreciate it.
----------------------------------------
title: Re: Multivariable Calculus Text?
author: ThomasS
body:
The textbooks go up in price and new pretty pictures appear. However, Calculus \
really hasn't changed all that much.
If you don't mind a certain lack of pretty pictures, you might try something li\
ke Widder's Advanced Calculus from Dover. it is much easier to carry around tha\
n Stewart. It is also written in a style that a mathematician might consider no\
rmal. If you think that you might want to move on to real math at some point, i\
t might serve as an introduction to the associated style of writing.