Don't know if this one's commercial enough - but I'd certainly buy it, so perhaps other webmasters with big sites would too:
Since Google Panda has slammed lots of big sites for having "low quality" pages on their domain - it's become important to look hard at the quality of every page on your domain that Google has in its index.
But if you have a big site (or in fact anything over 1,000 pages) - it's almost impossible to get a list of all the URLs that Google has in its index.
For example - we have a site where a site: query reports 135,000 pages indexed - but Google will only return 1,000 URLs. So you're left guessing as to what the rest are.
I reckon we've got around 100,000 real pages - so Google's indexed around 35,000 strange variations, pages with query strings, or stuff I've plain forgotten about.
A utility that could extract a full list (or even a big list) of all the pages Google has in its index for a particular domain would be incredibly useful - once you know what they've got, you could use noindex and robots.txt to clean it up.
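To give an idea of what I'd do with such a list, here's a minimal Python sketch. It assumes you already have the indexed URLs from somewhere, plus your own list of real pages (from a sitemap or database, say). The function name triage_indexed_urls and the three bucket names are made up for illustration.

```python
from urllib.parse import urlsplit


def triage_indexed_urls(indexed_urls, known_urls):
    """Sort indexed URLs into expected pages, query-string variants, and unknowns."""
    known = set(known_urls)
    report = {"expected": [], "query_string": [], "unknown": []}
    for url in indexed_urls:
        if url in known:
            # A page we know about and want indexed.
            report["expected"].append(url)
        elif urlsplit(url).query:
            # Not in our list and carries a query string: likely a variation.
            report["query_string"].append(url)
        else:
            # Not in our list at all: forgotten or stray pages.
            report["unknown"].append(url)
    return report
```

The "query_string" and "unknown" buckets are the ones you'd review as candidates for noindex or robots.txt rules.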
I reckon this might be possible by querying Google with site: and words that only appear on a few pages on your site - and then de-duping the results, to create a list of unique pages in Google's index.
Might not be possible to be exhaustive - but I reckon it would be easy to get way beyond the 1,000 or so URLs that Google will return.
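Here's a rough Python sketch of the site: plus rare-word idea, just to show the shape of it. The search argument is a hypothetical function you'd supply yourself that takes a query string and returns the URLs found for it (each call capped at whatever the search engine allows). I'm not assuming any particular Google API here. The names build_queries, normalize and collect_indexed_urls are mine, and stripping #fragments and lowercasing the host is just one possible way of de-duping.

```python
from urllib.parse import urlsplit, urlunsplit


def build_queries(domain, rare_words):
    """One site: query per rare word, e.g. 'site:example.com widget'."""
    return [f"site:{domain} {word}" for word in rare_words]


def normalize(url):
    """Lowercase scheme and host, drop the #fragment, keep path and query."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def collect_indexed_urls(domain, rare_words, search):
    """Run every query through the caller-supplied search() and de-dupe the results."""
    seen = set()
    ordered = []
    for query in build_queries(domain, rare_words):
        for url in search(query):
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                ordered.append(key)
    return ordered
```

The more rare words you feed in, the more distinct result sets you get, so the merged list can grow well past the cap on any single query, though there's no guarantee it ever covers the whole index.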