For your first question, see Ethics of robots.txt
You need to keep in mind the purpose of robots.txt. Robots that crawl a site can potentially wreak havoc on it and essentially mount a DoS attack. So if your "automation" is crawling at all, or is downloading more than just a few pages every day or so, AND the site has a robots.txt file that excludes you, then you should honor it.
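If you want your script to check whether a page is excluded before fetching it, here is a minimal sketch using Python's standard urllib.robotparser; the site URL, page URL, and user-agent string are just placeholders.

```python
from urllib import robotparser

# Fetch and parse the site's robots.txt (placeholder site).
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

url = "https://example.com/some/page"  # placeholder page you want to grab
if rp.can_fetch("MyScript/1.0", url):
    print("robots.txt allows fetching", url)
else:
    print("robots.txt excludes you; honor it and skip", url)
```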
Personally, I find there is a little grey area here. If my script works at the same pace as a human using a browser and only grabs a few pages, then, in the spirit of the robots exclusion standard, I have no problem scraping those pages as long as it doesn't access the site more than once a day. Please read that last sentence carefully before judging me; I feel it is perfectly logical, though many people may disagree with me there.
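For what it's worth, "the same pace as a human" just means putting a generous pause between requests. A rough sketch, assuming Python and placeholder URLs:

```python
import time
import urllib.request

pages = [
    "https://example.com/page1",  # placeholder URLs
    "https://example.com/page2",
]

for url in pages:
    with urllib.request.urlopen(url) as resp:
        html = resp.read()
    # A human clicking through a site takes seconds per page, not milliseconds,
    # so pause between requests to keep the load comparable.
    time.sleep(10)
```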
For your second question, web servers can return a 403 based on the User-Agent header sent with your request. To have your script mimic a browser, you have to misrepresent yourself: set the User-Agent header to the same value a mainstream web browser sends (e.g., Firefox, IE, Chrome). Right now it probably says something like 'Mechanize'.
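A minimal sketch of what that looks like, assuming the Python port of Mechanize (other Mechanize ports have an equivalent setting) and a placeholder URL and browser string:

```python
import mechanize

br = mechanize.Browser()
# Replace the default "Mechanize" User-Agent with a mainstream browser's string
# (this one is a Firefox string; any current browser's value works).
br.addheaders = [("User-Agent",
                  "Mozilla/5.0 (Windows NT 10.0; rv:115.0) Gecko/20100101 Firefox/115.0")]

response = br.open("https://example.com/some/page")  # placeholder URL
html = response.read()
```

If I recall correctly, the Python port still honors robots.txt by default, which fits nicely with the first half of this answer.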
Some sites are more sophisticated than that and have other methods for detecting non-human visitors. In that case, give up because they really don't want you accessing the site in that manner.