You may notice that for a given set of data the MongoDB datafiles in /data/db are larger than the data set inserted into the database.  There are several reasons for this.

local.* files and replication

The replication oplog is preallocated as a capped collection in the local database. By default it is allocated approximately 5% of disk space on 64-bit installations. If you would like a smaller oplog, use the --oplogSize command line parameter.

Datafile Preallocation

Each datafile is preallocated to a given size.  (This is done to prevent file system fragmentation, among other reasons.)  The first file for a database is <dbname>.0, then <dbname>.1, etc.  <dbname>.0 will be 64MB, <dbname>.1 128MB, etc., up to 2GB.  Once the files reach 2GB in size, each successive file is also 2GB.

Thus if the last datafile present is, say, 1GB, that file might be 90% empty if it was only recently put into use.
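The doubling schedule above can be sketched as a small calculation. This is only an illustration of the sizes as stated in this article (64MB for the first file, doubling up to a 2GB cap); the function names are hypothetical and are not part of MongoDB, and the sketch ignores any extra file preallocated in the background.

```javascript
// Sizes in MB, following the schedule described in the text above.
var FIRST_FILE_MB = 64;   // size of <dbname>.0
var MAX_FILE_MB = 2048;   // files stop growing at 2GB

// Size of datafile <dbname>.n (hypothetical helper).
function datafileSizeMB(n) {
  return Math.min(FIRST_FILE_MB * Math.pow(2, n), MAX_FILE_MB);
}

// Total space preallocated by the first fileCount datafiles.
function totalPreallocatedMB(fileCount) {
  var total = 0;
  for (var n = 0; n < fileCount; n++) {
    total += datafileSizeMB(n);
  }
  return total;
}

// Number of datafiles needed before dataMB of data fits.
function filesNeededFor(dataMB) {
  var count = 0;
  var total = 0;
  while (total < dataMB) {
    total += datafileSizeMB(count);
    count++;
  }
  return count;
}

var files = filesNeededFor(1100);
console.log(files + " files, " + totalPreallocatedMB(files) + " MB on disk");
// prints "5 files, 1984 MB on disk"
```

Under these assumptions, 1100MB of data spills into the 1GB file <dbname>.4, leaving most of that file empty.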

Additionally, on Unix, mongod will preallocate an additional datafile in the background and do background initialization of this file.  These files are prefilled with zero bytes.  This initialization can take up to a minute (less on a fast disk subsystem) for larger datafiles; without prefilling in the background, this could result in significant delays when a new file must be prepopulated.

You can disable preallocation with the --noprealloc option to the server. This flag is nice for tests with small datasets where you drop the db after each test. It shouldn't be used on production servers.

For large databases (hundreds of GB or more), the preallocation overhead is of no significant consequence, as the preallocated but unused space is small relative to the data.

Deleted Space

MongoDB maintains lists of deleted space within the datafiles when objects or collections are deleted.  This space is reused but never returned to the operating system.

To compact this space, run db.repairDatabase() from the mongo shell (note this operation will block and is slow).

When testing and investigating the size of datafiles, if your data is just test data, use db.dropDatabase() to clear all datafiles and start fresh.

Checking Size of a Collection

Use the validate command to check the size of a collection; that is, from the shell run:

    db.<collectionname>.validate()

This command returns info on the collection data but note there is also data allocated for associated indexes.
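Because the validate command covers only the collection data, it can help to look at data, allocated storage, and index sizes side by side. The sketch below wraps the shell's collection helpers dataSize(), storageSize(), totalIndexSize() and totalSize(), which report sizes in bytes. The function name collectionSizeReport and the field names it returns are hypothetical, and the "unused" figure is only a rough estimate that assumes allocated storage minus data approximates free and deleted space.

```javascript
// Hypothetical helper: summarize a collection's footprint in bytes.
// `coll` is expected to be a mongo shell collection object,
// for example db.mycollection (a placeholder name).
function collectionSizeReport(coll) {
  var data = coll.dataSize();          // bytes of document data
  var storage = coll.storageSize();    // bytes allocated, including unused space
  var indexes = coll.totalIndexSize(); // bytes used by all indexes
  return {
    dataBytes: data,
    storageBytes: storage,
    indexBytes: indexes,
    totalBytes: coll.totalSize(),      // storage plus indexes
    unusedBytes: storage - data        // rough estimate only (assumption)
  };
}
```

From the shell, this could be used as printjson(collectionSizeReport(db.mycollection)), where mycollection stands in for a real collection name.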
