I have an application that is implemented in WinForms that processes ~7,800 URLs in approximately 5 minutes (downloads the URL, parses the content, looks for specific pieces of data and if it finds what its looking for does some additional processing of that data.
This specific application used to take between 26 to 30 minutes to run, but by changing the code to the TPL (Task Parallel Library in .NET v4.0) it executes in just 5. The computer is a Dell T7500 workstation with dual quad core Xeon processors (3 GHz), running with 24 GB of RAM, and Windows 7 Ultimate 64-bit edition.
Though, it's not exactly the same as your situation this too is extremely IO intensive. The documentation on TPL states it was originally conceived for processor bound problem sets, but this doesn't rule out using it in IO situations (as my application demonstrates to me). If you have at least 4 cores and you're not seeing your processing time drop significantly then it's possible you have other implementation issues that are preventing the TPL from really being efficient (locks, hard drive items, etc.). The book Parallel Programming with Microsoft .NET really helped me to understand "how" your code needs to be modified to really take advantage of all that power.
Worth a look in my opinion.