Need to edge out the competition for your dream job? Train for certifications today.
Experts Exchange Solution brought to you by
"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.
When should I use GridFS?
For documents in a MongoDB collection, you should always use GridFS for storing files larger than 16 MB.
In some situations, storing large files may be more efficient in a MongoDB database than on a system-level filesystem.
• If your filesystem limits the number of files in a directory, you can use GridFS to store as many files as needed.
• When you want to keep your files and metadata automatically synced and deployed across a number of systems and facilities. When using geographically distributed replica sets MongoDB can distribute files and their metadata automatically to a number of mongod instances and facilities.
• When you want to access information from portions of large files without having to load whole files into memory, you can use GridFS to recall sections of files without reading the entire file into memory.
Do not use GridFS if you need to update the content of the entire file atomically. As an alternative you can store multiple versions of each file and specify the current version of the file in the metadata. You can update the metadata field that indicates “latest” status in an atomic update after uploading the new version of the file, and later remove previous versions if needed.
Furthermore, if your files are all smaller the 16 MB BSON Document Size limit, consider storing the file manually within a single document. You may use the BinData data type to store the binary data. See your drivers documentation for details on using BinData.
Another big plus is that if we use the ordinary filesystem, we would have to handle backup/replication/scaling ourselves. We would also have to come up with some sort of hashing scheme ourselves plus we would need to take care about cleanup/sorting/moving because filesystems do not love lots of small files.
With GridFS, we can use MongoDB's built-in replication/backup/scaling e.g. scale reads by adding more read-only slaves and writes by using sharding. We also get out of the box hashing (read uuid (universally unique identifier)) for stored content plus we do not suffer from filesystem performance degradation because of a myriad of small files.
Also, we can easily access information from random sections of large files, another thing traditional tools working with data right off the filesystem are not good at. Last but not least, we can keep information associated with the file (who has edited it, download count, description, etc.) right with the file itself.
Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.
From novice to tech pro — start learning today.
Premium members can enroll in this course at no extra cost.
Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.
Have a better answer? Share it in a comment.
Please enter a first name
Please enter a last name
Must be at least 4 characters long.
Join and Comment