mysql as filesystem

Posted on 2009-04-18
Last Modified: 2013-11-14
Thanks for looking. Ive got a server with over 9 million files to store (called the "archive" from now on). The archive is comprised of roughly half pdfs, and half the .txt files derived from those pdf files. Pdf's are an average of 200kB, txt is generally 2kB. These files are split into sequentially numbered folders with roughly 10k f8iles in each so as not to overload the file system. The archive is expected to double in size.

For both redundancy and sheer size requirements we have gotten an adaptec external minisas raid card and run that to an external enclosure with 4x 1.5TB SATA drives in raid 10. This is working, however it is shockingly slow. For reference when we copied the archive (roughly 900GB total) from one 1TB non-raided drive to another it took about 5 hours. When we copied it to the adaptec raid array it took almost 48 hours. My assumption is that this is because of the massive number of files (ie would be much faster with 1x 1tb file) and how that gets stored on raid??

So, I had an idea. Our mysql DB is running over 100 million records and doing so just fine. So, I considered setting up a new DB on mysql with a table which has an index (we already have a file numbering system in place) and a blob field, then just putting the txt / pdf files in the blob. This would:
1. allow mysql to act as a filesystem
2. provide one, or a small number, of actual files to be placed on the raid array (ie the mysql table files would be stored there instead of the pdf/txt files).

So, here go the questions:

1. would this be faster / slower?
2. should I build one table like:   column: primary index, column: pdf(blob), column TXT (blob)
or split it into two tables, one for the pdf, one for the corresponding txt files? (question from a pure performance standpoint)
3. I would be using innodb tables to make use of row level locking and because we wont ever be doing text searches. Is there a way to break the data file into multiple ones without paging the actual table (ie table: pdf 0-1000000, table pdf 1000001-2000000 etc)
4. related to above, should I consider paging the tables to avoid a 1TB table? or is that not a big deal? ie would the db suffer in performance due to the sheer table file size?
5. would a 9 million record DB table with 200k blob perform worse because of the data size than a 9 million db table with a very small amount of data (say int(1))? ie  can i expect any kind of performance like I have with my other tables.

Thanks for all your help.
Question by:thummper
    LVL 12

    Expert Comment

    What is the underlying filesystem you're using? If it's ext3, before making the leap to a database for your files, have you looked at tuning the ext3 filesystems for your non-typical application? The striped raid drives are going to squeeze about as much performance out of the physical hardware as you can, but there are many parameters that can be adjusted to increase the filesystem's performance (writeback frequency, journalling options, directory indexing, etc..). I don't have experience with as large a system as yours, but the out-of-the-box ext3 is configured more for data integrity than for performance.

    Author Comment

    yes, the current fs is ext3, however it is also ext on both non-raided sata drives which were used and they got MUCH better performance. Im wondering if, because the files are so small, theres a performance hit with striping, ie files smaller than 1 stripe. To illustrate what I mean by all drives being ext3 take the following scenario:

    Drive A - 1TB ext3 - sda1
    Drive B - 1TB ext3 - sdb1
    Drive C - 4x 1.5TB drives raid10 (hw raid from pciex8 card) - sdc1

    The total archive (being roughly 9million small files) originated on drive A, and was copied to drive B in 5 hours. From there it was then copied from drive B to drive C which took 48 hours.  Given that A and B are the same, the read part of the two copies are the same, so the only difference, and hence the short pole here, has to be the difference in writing to a setup like drive B vs writing to a setup like drice C (ie a raid issue in the current conditions) My assumption is that its raid in the situation of a bazillion small files.

    I do think you have a great line of thinking here, in either tuning ext3 or the raid setup, but not sure how to do either or the optimal specific settings for either. Any insight?

    The possible advantage of using mysql as the virtual filesystem, in my mind at least, is that you are now striping 1 (or a handful) of very large files, and backup becomes rather easy as well, as you just do a mysql dump to the backup media (also subject to easy gz compression without a massive tar operation first). To elaborate on the backup model, we have looked into tape backups and were disappointed that they tend to be a fe hundred GB rather than the ridiculously high capacity I assumed. So, well likely be backing up to additional sata drives, which definitely cant take 48 hours to do each time we do it. My assumption is that by doing a gz mysql dump we can fit a 2TB db table easily into a standard 1 or 1.5TB drive.
    LVL 12

    Accepted Solution

    The stripe size just maps blocks in a virtual device created by raid to a physical device. The size of the files is more of an issue for the ext3 file system and how big the chunks of free space are that are allocated to files. Allocating more blocks when a file needs to be extended can result in lots of wasted space for small files.

    Stripe size affects how much of this virtual device gets read/written to the disks at one time. Larger stripe sizes (like 1/4 to a full cylinder) increase the amount of the virtual filesystem that is read or written at one time. Therefore, I don't think that you'll buy much by exchanging multiple small files for a single database file.

    The current wisdom I've seen indicates that large stripe sizes result in better performance, all else being equal.

    However, writes to a striped volume are always slower. That's because the error correction information can be scattered across multiple platters, all of which need to be read. It's the read performance that striping enhances. So your backup of the filesystem is probably highlighting the worst characteristic of striping, while reading the files from a striped volume and writing to a non-striped volume would probably show striping at its best.

    So ultimately, you need to decide what behaviour you want to optimize. if your goal is to optimize write performance, while still having the reliablity, simple mirrored raid is probably your best choice. But striping should give you the best read performance.

    If you choose the latter, look at turning off the "last access" feature.  Also, if you have reliable power, or a UPS that can allow the OS to do an orderly shutdown, look at going to writeback journalling to reduce writes to the disk. If data integrity is of prime importance, you might consider storing the journal on another, perhaps non-striped, spindle.

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    How your wiki can always stay up-to-date

    Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
    - Increase transparency
    - Onboard new hires faster
    - Access from mobile/offline

    As a database administrator, you may need to audit your table(s) to determine whether the data types are optimal for your real-world data needs.  This Article is intended to be a resource for such a task. Preface The other day, I was involved …
    In this article, I provide some information on storage disks which go into calculations that will help you figure out how much Input/output Operations Per Second (IOPS) your disk subsystem can deliver. To effectively size & tune up applications l…
    This video teaches viewers how to encrypt an external drive that requires a password to read and edit the drive. All tasks are done in Disk Utility. Plug in the external drive you wish to encrypt: Make sure all previous data on the drive has been …
    This Micro Tutorial will teach you how to reformat your flash drive. Sometimes your flash drive may have issues carrying files so this will completely restore it to manufacturing settings. Make sure to backup all files before reformatting. This w…

    761 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    13 Experts available now in Live!

    Get 1:1 Help Now