Wednesday, March 30, 2011

Index Size vs Disk Space for Solr Optimize

You need at least twice as much disk space as the size of the index. This is because of the way Solr optimizes the index: during optimization, Solr writes a new copy of the index and, when it is ready, replaces the current copy with the new one. An optimize call takes longer than a commit call. Optimize is CPU-, disk-, and time-intensive and should be done infrequently, maybe once a day. Because it is so resource intensive, optimize should be run at a time of day when query volume is low.
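As a rough pre-flight check, the rule above can be sketched in Python: add up the size of the index directory and compare it with the free space on the same disk. The function names and the safety_factor parameter are illustrative assumptions, not part of Solr, and the index path you pass in depends on your installation.

```python
import os
import shutil


def directory_size_bytes(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # A file may disappear while Solr is writing segments; skip it.
                pass
    return total


def has_room_for_optimize(index_dir, safety_factor=1.0):
    """Check that free space is at least the current index size.

    An optimize writes a second copy of the index, so the free space on the
    disk holding index_dir should be at least as large as the index itself.
    safety_factor (hypothetical knob) lets you demand extra headroom.
    """
    index_size = directory_size_bytes(index_dir)
    free_bytes = shutil.disk_usage(index_dir).free
    return free_bytes >= index_size * safety_factor
```

Calling has_room_for_optimize on your data\index directory before a scheduled optimize gives an early warning; a safety_factor above 1.0 leaves room for documents added in the meantime.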

The extra disk usage is apparent when you run large loads using DataImportHandler. Monitor the size of the data\index directory in your Solr installation to watch the size change. To skip the optimize step, add the query string argument optimize=false to your DataImportHandler call; the commit at the end of the import then runs without an optimize.
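A minimal sketch of building such a request URL in Python follows. The base URL in the usage note is an assumption (your host, port, and handler path may differ), and build_dataimport_url is a hypothetical helper, not a Solr API; command, commit, and optimize are the DataImportHandler request parameters.

```python
from urllib.parse import urlencode


def build_dataimport_url(base_url, optimize=False, commit=True):
    """Build a DataImportHandler full-import URL.

    base_url is the address of the dataimport handler (assumed; adjust it to
    your setup). optimize=False asks Solr to commit without optimizing when
    the import finishes.
    """
    params = {
        "command": "full-import",
        "commit": str(commit).lower(),      # "true" or "false"
        "optimize": str(optimize).lower(),  # "true" or "false"
    }
    return base_url + "?" + urlencode(params)
```

For example, build_dataimport_url("http://localhost:8983/solr/dataimport") produces a URL ending in command=full-import&commit=true&optimize=false, which you can request from a browser or a scheduled job.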

For more information on the optional attributes of commit and optimize, visit: http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
