thummper

asked:

mysql as filesystem

Thanks for looking. I've got a server with over 9 million files to store (called the "archive" from now on). The archive consists of roughly half PDFs and half .txt files derived from those PDFs. The PDFs average about 200 kB and the txt files are generally around 2 kB. The files are split into sequentially numbered folders with roughly 10k files in each so as not to overload the filesystem. The archive is expected to double in size.

For both redundancy and sheer capacity we got an Adaptec external mini-SAS RAID card and ran it to an external enclosure with 4x 1.5 TB SATA drives in RAID 10. This is working; however, it is shockingly slow. For reference, when we copied the archive (roughly 900 GB total) from one non-RAID 1 TB drive to another it took about 5 hours. When we copied it to the Adaptec RAID array it took almost 48 hours. My assumption is that this is because of the massive number of files (i.e. it would be much faster with a single 1 TB file) and how that gets stored on RAID?

So, I had an idea. Our MySQL DB is running over 100 million records and doing just fine. I considered setting up a new database in MySQL with a table that has an index (we already have a file numbering system in place) and a blob field, then just putting the txt/pdf files into the blob (roughly like the sketch after this list). This would:
1. allow MySQL to act as a filesystem
2. provide one, or a small number of, actual files to be placed on the RAID array (i.e. the MySQL table files would be stored there instead of the pdf/txt files).
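
Roughly what I have in mind; table and column names are just placeholders and I haven't tested any of this (a plain BLOB tops out at 64 kB, so the ~200 kB PDFs would need MEDIUMBLOB or bigger):

    mysql archive_db <<'EOF'
    CREATE TABLE archive_files (
        file_id   INT UNSIGNED NOT NULL PRIMARY KEY,  -- our existing file number
        pdf_data  MEDIUMBLOB,                         -- the original pdf (~200 kB)
        txt_data  MEDIUMBLOB                          -- the derived text (~2 kB)
    ) ENGINE=InnoDB;
    EOF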

So, here are the questions:

1. Would this be faster or slower?
2. Should I build one table like: column: primary index, column: pdf (blob), column: txt (blob)? Or should I split it into two tables, one for the PDFs and one for the corresponding txt files (that split layout is sketched below the list)? This is from a pure performance standpoint.
3. I would be using InnoDB tables to make use of row-level locking and because we won't ever be doing text searches. Is there a way to break the data file into multiple files without paging the actual table (i.e. without splitting it into table pdf_0-1000000, table pdf_1000001-2000000, etc.)?
4. Related to the above, should I consider paging the tables to avoid a 1 TB table, or is that not a big deal? I.e. would the DB suffer in performance due to the sheer table file size?
5. Would a 9-million-record table with ~200 kB blobs perform worse, because of the data size, than a 9-million-record table holding a very small amount of data (say INT(1))? I.e. can I expect the kind of performance I get from my other tables?
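
For question 2, the split layout I'm describing would be something like this (again just an untested sketch with placeholder names):

    mysql archive_db <<'EOF'
    CREATE TABLE archive_pdf (
        file_id  INT UNSIGNED NOT NULL PRIMARY KEY,
        pdf_data MEDIUMBLOB
    ) ENGINE=InnoDB;

    CREATE TABLE archive_txt (
        file_id  INT UNSIGNED NOT NULL PRIMARY KEY,
        txt_data MEDIUMBLOB
    ) ENGINE=InnoDB;
    EOF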

Thanks for all your help.
Hugh Fraser

What is the underlying filesystem you're using? If it's ext3, before making the leap to a database for your files, have you looked at tuning the ext3 filesystem for your non-typical application? The striped RAID drives are going to squeeze about as much performance out of the physical hardware as you can, but there are many parameters that can be adjusted to increase the filesystem's performance (writeback frequency, journalling options, directory indexing, etc.). I don't have experience with as large a system as yours, but out-of-the-box ext3 is configured more for data integrity than for performance.
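
For example, something along these lines (device name and mount point are just examples, and I haven't benchmarked any of this against a workload like yours):

    # enable hashed (htree) directory indexing, then rebuild existing directories
    # (run the e2fsck pass with the filesystem unmounted)
    tune2fs -O dir_index /dev/sdc1
    e2fsck -fD /dev/sdc1

    # /etc/fstab entry with relaxed options: skip atime updates, writeback
    # journalling, and a longer commit interval (writeback trades some crash
    # safety for speed)
    # /dev/sdc1  /archive  ext3  noatime,data=writeback,commit=60  0  2

Directory indexing in particular tends to matter once you have thousands of entries per directory.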
thummper

ASKER

Yes, the current fs is ext3; however, it is also ext3 on both of the non-RAID SATA drives that were used, and they got MUCH better performance. I'm wondering if, because the files are so small, there's a performance hit with striping, i.e. files smaller than one stripe. To illustrate what I mean by all drives being ext3, take the following scenario:

Given:
Drive A - 1TB ext3 - sda1
Drive B - 1TB ext3 - sdb1
Drive C - 4x 1.5TB drives in RAID 10 (hardware RAID from a PCIe x8 card) - sdc1

The total archive (roughly 9 million small files) originated on drive A and was copied to drive B in 5 hours. From there it was copied from drive B to drive C, which took 48 hours. Given that A and B are the same, the read part of the two copies is the same, so the only difference, and hence the bottleneck here, has to be the difference between writing to a setup like drive B and writing to a setup like drive C (i.e. a RAID issue under the current conditions). My assumption is that it's RAID in the situation of a bazillion small files.
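
One rough way I figure we can test that assumption (paths are just examples):

    # compare many small files vs. one large streamed file on the array
    time cp -a /archive/folder_0001 /mnt/raid/smallfile_test                      # ~10k small files
    time dd if=/dev/zero of=/mnt/raid/bigfile_test bs=1M count=2000 conv=fdatasync  # one ~2 GB stream

If the dd throughput looks healthy but the cp crawls, it's per-file overhead rather than raw write speed.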

I do think you have a great line of thinking here, in either tuning ext3 or the RAID setup, but I'm not sure how to do either, or what the optimal settings would be. Any insight?

The possible advantage of using MySQL as the virtual filesystem, in my mind at least, is that you are now striping one (or a handful of) very large files, and backup becomes rather easy as well, since you just do a mysqldump to the backup media (also subject to easy gzip compression without a massive tar operation first).

To elaborate on the backup model: we looked into tape backups and were disappointed that they tend to hold a few hundred GB rather than the ridiculously high capacity I assumed. So we'll likely be backing up to additional SATA drives, which definitely can't take 48 hours each time we do it. My assumption is that with a gzipped mysqldump we can easily fit a 2 TB table onto a standard 1 or 1.5 TB drive.
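
I.e. something along these lines (database name and backup path are just examples):

    # consistent InnoDB snapshot, binary columns dumped as hex, compressed on the fly
    mysqldump --single-transaction --hex-blob archive_db | gzip > /mnt/backup/archive_db.sql.gz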
ASKER CERTIFIED SOLUTION
Hugh Fraser