Answering the 'why not to use NumRenderingThreads' question.
It's complicated :-)
PostScript is an interpreted language and can only be run as a single thread of interpretation because the state can change as you go, and program execution can rely on the current state. Since you cannot multi-thread the interpreter you need to start at the beginning of the program and proceed to the end.
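To make the state dependence concrete, here is a toy sketch. It is not PostScript and not how Ghostscript is implemented; the two-operation 'language' and the `interpret` function are invented purely for illustration. The point is only that where a shape lands depends on everything that ran before it.

```python
def interpret(program):
    """Run a toy program and return the absolute rectangles it draws."""
    x, y = 0, 0          # current origin: the 'state' that changes as we go
    drawn = []
    for op, a, b in program:
        if op == "translate":
            # Moves the origin; every later operation depends on this.
            x, y = x + a, y + b
        elif op == "rect":
            # Draws an a-by-b rectangle at the *current* origin.
            drawn.append((x, y, a, b))
    return drawn

program = [
    ("translate", 10, 20),
    ("rect", 5, 5),
    ("translate", 100, 0),
    ("rect", 5, 5),
]

full = interpret(program)                    # interpreted from the beginning
second_half_alone = interpret(program[2:])   # a 'thread' starting in the middle

print(full[1])               # where the second rectangle really belongs
print(second_half_alone[0])  # where it lands if the earlier state is missed
```

Starting in the middle puts the second rectangle in the wrong place, which is why the interpretation has to run from the beginning in a single thread.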
So, in order to run threads, we write a 'clist', a form of what's known as a display list. This isn't interpreted; it's little more than a list of graphic primitives and their positions on the output page, derived by executing the PostScript program. NB the clist is at a fixed resolution and the original PostScript program isn't. The PostScript program can take different actions depending on the environment it executes in; the clist can't, etc.
We then split the output page into horizontal stripes and use the clist to run one thread per stripe, rendering just the bits that lie on that stripe. The clist allows access from multiple threads and, because it's not interpreted, values don't change. Some objects will lie across stripes and will be (partially) rendered multiple times (this is important for image data). In order to create the final page we stitch the stripes back together.
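The stripe approach can be sketched roughly as follows. This is a toy model, not Ghostscript's clist format or its renderer: the page size, the `DISPLAY_LIST` of filled rectangles and all the function names are assumptions made up for the example. It shows the shape of the work only (Python threads won't give a real speed-up here).

```python
from concurrent.futures import ThreadPoolExecutor

PAGE_W, PAGE_H, BAND_H = 16, 12, 4   # hypothetical page and stripe size, in pixels

# Hypothetical display list: (x, y, width, height, value), fixed device pixels.
DISPLAY_LIST = [
    (1, 1, 6, 3, 1),
    (4, 2, 8, 7, 2),    # spans three stripes, so it is visited three times
    (0, 9, 16, 2, 3),
]

def render_band(y0, y1):
    """Render only rows y0..y1-1, reading the shared, read-only display list."""
    band = [[0] * PAGE_W for _ in range(y1 - y0)]
    for x, y, w, h, value in DISPLAY_LIST:
        # Clip the object to this stripe; objects outside it paint nothing.
        for row in range(max(y, y0), min(y + h, y1)):
            for col in range(max(x, 0), min(x + w, PAGE_W)):
                band[row - y0][col] = value
    return band

def render_banded(threads=3):
    """One task per stripe, then stitch the stripes back together in order."""
    bands = [(y0, min(y0 + BAND_H, PAGE_H)) for y0 in range(0, PAGE_H, BAND_H)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        rendered = list(pool.map(lambda band: render_band(*band), bands))
    return [row for band in rendered for row in band]

def render_full_page():
    """The page-buffer alternative: one pass over the whole page."""
    return render_band(0, PAGE_H)

same = render_banded() == render_full_page()
```

Both routes produce the same page. The banded one simply walks the list once per stripe and has the extra stitching step at the end.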
This means that overall we need to interpret the program, write the clist, then read the clist multiple times to create multiple stripes, which we then need to put back together.
Alternatively we can use a page buffer, a chunk of memory large enough to hold the entire output. We interpret once, and render as we go. We don't write a clist, we only render each object once and we don't need to consolidate the outputs. Not surprisingly this is faster.
So what's the point of a clist and multiple threads? Well, at high resolution most systems don't have enough memory to hold the entire output in one go, so we have to produce stripes and stitch them together, which means we need a clist. If we have to go down that route then yes, NumRenderingThreads will be faster.
At 150 dpi, unless you are printing banners for the sides of buildings, this is unlikely to be the case :-)
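As a back-of-envelope check, here is the size of an uncompressed page buffer at two resolutions. The page size (US Letter, 8.5 x 11 inches) and the colour depth (24-bit RGB, 3 bytes per pixel) are assumptions for the example; the real figure depends on the output device and its colour model.

```python
def page_buffer_bytes(width_in, height_in, dpi, bytes_per_pixel):
    """Bytes needed to hold one whole uncompressed page in memory."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

# Assumed: US Letter page, 3 bytes per pixel (24-bit RGB).
low_res = page_buffer_bytes(8.5, 11, 150, 3)
high_res = page_buffer_bytes(8.5, 11, 2400, 3)

print(f"150 dpi:  {low_res / 1e6:.1f} MB")
print(f"2400 dpi: {high_res / 1e9:.2f} GB")
```

Under those assumptions the 150 dpi page needs roughly 6 MB and the 2400 dpi page roughly 1.6 GB. The buffer grows with the square of the resolution, so it's only at the high end that holding the whole page in memory becomes a problem.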
So NumRenderingThreads can be faster, but in your case it almost certainly isn't. But it may be so fast anyway that you can't tell.