Posted by: Wildan Maulana | February 19, 2009

Index ratio Nutch Index

Just a copy and paste 🙂:

—————
It is ~1M pages = 1.5G RAM for just the index (not segments, linkdb, etc.). So
a 4G RAM box can fit ~2M pages, and an 8G box ~4.5M pages. That
being said, it all depends on the amount of content you index per page
(max content size).

Dennis

Miguel Costa wrote:
> Hi,
>
> I would like to know the ratio between (index size)/(collection size) for
> collections larger than 1 TB.
> My objective is to keep the whole index in memory, so given x GB of memory,
> what is the maximum size of a collection I can index?
> Can anyone give me some numbers from your own indexing runs?
>
> Regards,
>
> —
>
> Miguel Costa

http://www.lucidimagination.com/search/document/c6c099bf31b0de55/index_ratio#de145fe338543d5b
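
To make Dennis's rule of thumb easier to play with, here is a rough sketch of the arithmetic in Python. The only figure taken from the thread is ~1.5 GB of index RAM per 1M pages. The `reserved_gb` parameter (memory set aside for the OS and JVM) and the function name are my own assumptions, chosen because a 1 GB reserve roughly reproduces his 4G and 8G examples. Real numbers will vary with how much content you index per page.

```python
# Rule of thumb from the thread: ~1M pages need ~1.5 GB of RAM for the index alone.
GB_PER_MILLION_PAGES = 1.5


def max_pages_millions(ram_gb, reserved_gb=1.0):
    """Estimate how many million pages fit in RAM as an in-memory index.

    reserved_gb is a hypothetical allowance for the OS, JVM, etc.
    It is an assumption of this sketch, not a figure from the thread.
    """
    usable_gb = max(ram_gb - reserved_gb, 0.0)
    return usable_gb / GB_PER_MILLION_PAGES


if __name__ == "__main__":
    for ram in (4, 8):
        print(f"{ram} GB RAM -> ~{max_pages_millions(ram):.1f}M pages")
```

With the default reserve this gives about 2M pages for a 4G box and a little over 4.5M for an 8G box, which is in line with the estimates quoted above.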

