What format are the files in? I did some experimenting with the TIFF OCR last fall and had mixed results. http://www.3dvision.com/wordpress/2012/10/22/epdm-works-with-scanned-paper-drawings/
Generally I think the PDF iFilter is better than the TIFF one. I've never experimented with the ones you have to pay for.
The content search runs on Microsoft indexing. If you have the RAM and disk space, I wouldn't worry much about a practical limit. You can turn the OCR on before or after the files are in the vault.
Good article, Jeff. It speaks to the two categories of documents I want to glean information from. First is 30-plus years of drawings we have digitized so that we can consider turfing the hard copies. Second are client specifications that we must consult for projects. They are all PDFs already, and most of them are clean enough that I think we can get a decent OCR on them.
The legacy drawings were a hopeful item, since OCR could give us a search capability we do not currently have, but if EPDM can't do it, that's fine. Not a big deal.
But the client specs would really help. There are hundreds of documents (and they are long), so being able to search the content for keywords would be grand!
If these documents are in Office or text format you'll probably be very happy with the result. If they are graphical and clean, I would expect pretty good results.
The more graphics an image has, or the sloppier it is, the worse the results will be.
You won't know until you try. You can always change and use different iFilters if you aren't happy.
Let us know what you learn!
If you want to get text content from a scanned image or document, you need OCR software to help you. An OCR converter recognizes the text in an image, and some advanced OCR tools can save that text to a plain .txt file or a searchable PDF document.
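For anyone who wants to try this outside EPDM first, here is a minimal sketch of that OCR step in Python. It assumes the open-source Tesseract engine is installed along with the pytesseract and Pillow packages; none of these were mentioned in this thread, and this is not how the EPDM iFilters work internally. The function names and the example file path are made up for illustration.

```python
from pathlib import Path
from typing import Tuple


def output_paths(image_path: Path) -> Tuple[Path, Path]:
    """Return the .txt and .pdf paths that sit next to the scanned image."""
    return image_path.with_suffix(".txt"), image_path.with_suffix(".pdf")


def ocr_scan(image_path: Path, lang: str = "eng") -> Tuple[Path, Path]:
    """OCR one scanned image into a plain text file and a searchable PDF.

    Assumption: Tesseract, pytesseract and Pillow are installed.
    """
    # Imported here so output_paths() works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    txt_path, pdf_path = output_paths(image_path)
    with Image.open(image_path) as image:
        # Plain text: the recognized characters only.
        text = pytesseract.image_to_string(image, lang=lang)
        txt_path.write_text(text, encoding="utf-8")
        # Searchable PDF: the original image with an invisible text layer.
        pdf_bytes = pytesseract.image_to_pdf_or_hocr(
            image, extension="pdf", lang=lang
        )
        pdf_path.write_bytes(pdf_bytes)
    return txt_path, pdf_path


if __name__ == "__main__":
    # Hypothetical path; point this at one of your own scans.
    print(ocr_scan(Path("scans/legacy-drawing.tif")))
```

Note that this only processes the first page of a multi-page TIFF. Running it on a handful of typical drawings and specs is a cheap way to see how clean the recognized text is before turning OCR on for the whole vault.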