shuboarder

asked on

Problems with Notes database size

Hi all,

I have a strange problem...

I have a database containing approx. 1800 documents that is nearly 400 MB in size.
There are no attachments in the documents and no large images used in the database design.

I do have agents that delete all records and create new records every 8 hours.
However, I have disabled soft deletions and would expect this database to be no more than 60 MB in size.
I have also tried a compact without success. Are these deleted documents somehow still being stored in the database?

Or, is there something I'm missing?
Steve Knight

Have you tried a compact -B  from the server to reduce file size?

load compact path\filename.nsf -B

or possibly -C to do an old copy-style compact.

Deleted documents should just leave a stub behind, but until you run a compact the unused space will remain in the database.

Steve
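
For reference, roughly the same check-and-compact can also be scripted. A minimal LotusScript sketch, assuming a local database (the NotesDatabase Compact method only works on local databases, and "apps\mydb.nsf" is a placeholder path):

    Sub CompactIfNeeded
        Dim s As New NotesSession
        Dim db As NotesDatabase
        ' Placeholder path - point this at a local replica
        Set db = s.GetDatabase("", "apps\mydb.nsf")
        ' PercentUsed reports how much of the file holds live data;
        ' a low figure means white space that a compact can reclaim.
        If db.PercentUsed < 90 Then
            Print "Compact freed " & db.Compact & " bytes"
        End If
    End Sub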
If you delete all records using your script and then check the database properties for the no. of documents, does it show 0, or could it be there are other documents in there that do not meet your view selection formulae?

It would also be worth compacting the db as above when there are no documents in it, then letting your script re-populate it.

Is there no way your script could update the existing records BTW rather than re-create them three times a day?

Steve
Hi shuboarder,
if you've already analyzed the db design and established that the design isn't causing this size inconsistency, and Compact doesn't yield anything, I suggest you create a view with the selection formula:
    @All
Add a column with the formula:
    Form
Add another column with the formula:
    @DocLength/1024
(@DocLength returns the approximate size of a document in bytes, so this way you'll get kilobytes)

Make all columns sortable, descending and ascending, and you'll be able to analyze the document sizes.

Hope this helps,
Marko
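
If building the view is awkward, roughly the same analysis can be done with a throwaway LotusScript agent run on the database. A minimal sketch (output goes to the agent log/console; documents without a Form item will just show an empty name):

    Sub ListDocumentSizes
        Dim s As New NotesSession
        Dim col As NotesDocumentCollection
        Dim doc As NotesDocument
        Set col = s.CurrentDatabase.AllDocuments
        Set doc = col.GetFirstDocument
        Do Until doc Is Nothing
            ' NotesDocument.Size is the approximate size in bytes, attachments included
            Print doc.GetItemValue("Form")(0) & ": " & Format$(doc.Size / 1024, "0") & " KB"
            Set doc = col.GetNextDocument(doc)
        Loop
    End Sub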
ASKER CERTIFIED SOLUTION
Sjef Bosman
So if your database is NEVER replicated, you can bring down the value for "Removing documents" to some decent value, e.g. 3. As you say, all documents are re-created every day...
shuboarder (ASKER)

Hi Sjef,

thanks for the reply...

Where can I set this?
Found it....

Database >> Replication >> Settings >> Space Savers

I have set this to 3 as suggested.

Let's see what happens. (Don't worry, this is only a test database at the moment!)
After having set it to 3, compact the database once again.
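
For what it's worth, the same purge-interval setting can also be read or changed in code through the NotesReplication class. A hedged sketch, assuming (as I recall) that CutoffInterval maps to the Space Savers day value:

    Dim s As New NotesSession
    Dim db As NotesDatabase
    Dim rep As NotesReplication
    Set db = s.CurrentDatabase
    Set rep = db.ReplicationInfo
    ' CutoffInterval should correspond to "Remove documents not modified
    ' in the last X days" on the Space Savers tab
    rep.CutoffInterval = 3
    Call rep.Save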
just a thought.

any special reason why you have to delete all docs and create new ones again? would updating the existing docs be easier?
To put it all into perspective: 400 MB on an 80 GB disk that costs $80... The file storage costs you about forty cents!
Sjef,

it's still 330 MB after setting it to 3 and compacting

Dragon-IT

I don't appear to be able to set this to 0. 1 is the lowest.

CezarF

Updating existing docs would be better, yes, but this way seemed easier.

Sjef,

it's not the cost of storage that is the problem. It's going to be the cost of replicating nearly 400 MB of data across our WAN. If these deletion stubs exist in the quantity you suggest, then replication of this database will eventually grind to a halt.
So you ARE replicating the database! Then the setting to 3 won't be correct, for if replication doesn't work for just one day, you'll see documents in your views that should have been deleted.

If replication is all you are worried about, then relax. Those 400 MB aren't copied from one replica to the other, but only the differences. What will happen during a replication is something like this:
- database documents are compared (only ids and update times)
- you deleted 1800 documents, so 1800 deletions are sent to the other replica
- you added 1800 documents, so all documents are replicated to the replica database
This will happen no matter the physical size of the database. Even the replicas may differ in physical size, depending on the views that are used. Unused views cost nothing.

To see the actual sizes of the views in your database, please look in the log.nsf database on the server, open Usage/by Size, locate the document for your database and open it...
Thanks for the info. Sjef.

However, if someone is to make a local replica, this takes a long time.
Does this form of replication (i.e. new replica) replicate all these stubs as well?

If possible I would like to lose as many of these stubs as possible.

Are there any replication settings I can change to improve this and reduce the size?
shuboarder, would you rather do extra coding than face the db size problem? :)

by deleting all docs and recreating them, Notes will update all view indices, the FT index, etc., and who knows what else Domino has to update internally?
Cezar, the difficult thing is, these records are created from MS Excel.

I don't know how much code would be involved in querying Excel and synching it with a Notes database.
It seemed easier to just have 1 agent import all excel data. Then a second agent to delete the contents of the view periodically.
As I said before, you can do better than just delete all records. I fully agree with Cezar.

Just an alternative way (a rough sketch follows below):
- create your main and a temporary database (with the same design)
- import Excel in the temp. database
- create an agent that
    opens two views at the same time, in both databases
    compares documents, finds differences and updates the existing documents,
    adds documents if they don't exist yet
    removes documents that no longer exist
- the temp. database need never be replicated
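
A minimal LotusScript sketch of such an agent, assuming both databases have a view sorted on a key field. The view name "ByKey", the fields "Key" and "Value", and the file paths are all hypothetical placeholders:

    Sub SyncFromTemp
        Dim s As New NotesSession
        Dim dbMain As NotesDatabase, dbTemp As NotesDatabase
        Dim vMain As NotesView, vTemp As NotesView
        Dim docTemp As NotesDocument, docMain As NotesDocument, docNext As NotesDocument

        Set dbMain = s.GetDatabase("Server/Org", "apps\main.nsf")   ' placeholder paths
        Set dbTemp = s.GetDatabase("", "apps\temp.nsf")
        Set vMain = dbMain.GetView("ByKey")                         ' hypothetical view name
        Set vTemp = dbTemp.GetView("ByKey")

        ' Pass 1: add or update documents found in the temp database
        Set docTemp = vTemp.GetFirstDocument
        Do Until docTemp Is Nothing
            Set docMain = vMain.GetDocumentByKey(docTemp.GetItemValue("Key")(0), True)
            If docMain Is Nothing Then
                Call docTemp.CopyToDatabase(dbMain)                 ' new row
            ElseIf docMain.GetItemValue("Value")(0) <> docTemp.GetItemValue("Value")(0) Then
                Call docMain.ReplaceItemValue("Value", docTemp.GetItemValue("Value")(0))
                Call docMain.Save(True, False)                      ' changed row
            End If
            Set docTemp = vTemp.GetNextDocument(docTemp)
        Loop

        ' Pass 2: remove documents that no longer exist in the temp database
        Set docMain = vMain.GetFirstDocument
        Do Until docMain Is Nothing
            Set docNext = vMain.GetNextDocument(docMain)
            If vTemp.GetDocumentByKey(docMain.GetItemValue("Key")(0), True) Is Nothing Then
                Call docMain.Remove(True)
            End If
            Set docMain = docNext
        Loop
    End Sub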
shuboarder, i see. i guess you'll have to bear with the trade-offs then. :)
i would do it this way.

an agent to
 - flag all docs for deletion
 - read the Excel file to check if the doc exists and update it accordingly (and set the delete flag to false)
 - delete all docs flagged for deletion (a minimal sketch of this is below).
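
A minimal sketch of that flag-and-sweep idea in LotusScript. "DeleteFlag" is a hypothetical field name, and the Excel matching in step 2 is left out:

    Sub FlagAndSweep(db As NotesDatabase)
        Dim col As NotesDocumentCollection
        Dim doc As NotesDocument, nextDoc As NotesDocument

        ' Step 1: flag every document for deletion
        Set col = db.AllDocuments
        Set doc = col.GetFirstDocument
        Do Until doc Is Nothing
            Call doc.ReplaceItemValue("DeleteFlag", 1)
            Call doc.Save(True, False)
            Set doc = col.GetNextDocument(doc)
        Loop

        ' Step 2 (not shown): walk the Excel rows, update matching documents
        ' and set their DeleteFlag back to 0

        ' Step 3: delete whatever is still flagged
        Set col = db.Search("DeleteFlag = 1", Nothing, 0)
        Set doc = col.GetFirstDocument
        Do Until doc Is Nothing
            Set nextDoc = col.GetNextDocument(doc)
            Call doc.Remove(True)
            Set doc = nextDoc
        Loop
    End Sub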

Sjef, although that sounds like a great idea, I get the feeling that it's not that straightforward...
I admit the database situation is not ideal, but is there anything else I can do to improve it?
If the documents are never modified manually, the process can really be pretty straightforward.

You already had a look in log.nsf? Usage/By Size? What's so large in your database?
And NotesPeek? What does it say? Or ScanEZ? Or DXLPeek from OpenNTF? To download:
    http://www-10.lotus.com/ldd/sandbox.nsf/0/2791869F4E1D3FA385256F2C00432973?Open
    http://www.ytria.com/WebSite.nsf/Er_Download?ReadForm&Lang=en
    http://www.openntf.org/projects/pmt.nsf/ProjectLookup/DXLPeek

I have a database here that is 720 MB with 62,000 documents. Not many deletions (10%), so I guess 400 MB can hardly be called "slim". Some ideas to reduce space:
- select "Don't maintain unread marks" in the Advanced Database Properties
- disable specialized response hierarchy information (and run a Compact)
Ok,

NotesPeek didn't work....

ScanEZ says I have 21992 deletion stubs on one replica
and 7344 on another.
That must be after setting the number of days to 3 and the Compact. To be on the safe side, set that number to 21, to cope with accidentally delayed replication of a week.

And what says log.nsf?
Hi Sjef,

I have set this to 21 and have let it settle for a couple of days. However, one replica is 32 MB and the other is 48 MB and growing in size daily. Is this normal? And why is only one replica growing?

Unfortunately I can't access the log.nsf so can't comment on that.
Can't access log.nsf?? That would be the first thing to do: either ask for an extract of it from one of the Admins, or go to them and ask them about your database, from the view Usage/By Size. I think it's ESSENTIAL you get that information, otherwise we'll just continue to stumble in the dark.

If one replica is growing faster than the other, then probably one replica is used and the other isn't. A database in use means that views are opened. Unopened views occupy (almost) no space; that's why it takes some time to open a view no one has opened before (in the last 45 days).
The database seems to be settling down now.
However, when I try to create a new replica it is taking hours.

I assume it is trying to also replicate all the deletion stubs?
Is there any way to prevent replication of stubs or would that defeat the object?
Arrgghhhh there are 1.5 million updates!
But of course it will replicate deletion stubs! That's the whole point of those things, that every database knows that a certain document should be deleted if it still exists.

Admin, with log.nsf??
So is the answer just don't ever replicate it?

What do I need to look for in log.nsf?
To see the actual sizes of the views in your database, please look in the log.nsf database on the server, open Usage/by Size, locate the document for your database and open it...

Ask any decent Admin to assist you if you don't have access yourself. They ought to help you. I 'spose...