How scalable is this?

Posted on 2002-04-10
Last Modified: 2010-04-21
Hello all
We have a box from which we FTP data.
The data is stored in the filesystem, in a hierarchy of directories and subdirectories, for example:
productx
|-- year
    |-- month
        |-- day
            |-- hour
                |-- minute
 
So to look up a file, you have to find the right product, navigate down to the minute in which the product was produced, and then parse the listing there to find the right file.
There are rarely more than half a dozen files of the same product produced in the same minute, and commonly only a few.
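
The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the code running on the box: the zero-padded numeric directory names and the function names `minute_dir` and `find_files` are assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path
from typing import List


def minute_dir(root: Path, product: str, when: datetime) -> Path:
    """Build the directory for one product and one minute of production.

    Assumes zero-padded numeric names (e.g. 2002/04/10/13/05); the real
    box may name its directories differently.
    """
    return (
        root / product
        / f"{when.year:04d}" / f"{when.month:02d}" / f"{when.day:02d}"
        / f"{when.hour:02d}" / f"{when.minute:02d}"
    )


def find_files(root: Path, product: str, when: datetime) -> List[Path]:
    """List the files produced in that minute; empty if the directory is missing."""
    directory = minute_dir(root, product, when)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())
```

Whatever the total volume, the path is always six levels deep below the root, and the only listing that has to be read is the one for the final minute directory.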

Some people want to use this box for a much higher-volume product, and there are concerns that such a lookup system won't scale, so I thought I'd ask the experts for their opinions.
The people who work with the box regularly say it can scale, but those who don't work with it think that not using a modern RDBMS is ridiculous.
Thanks much for your input, and I'll also give at least the few best answers some points, too, maybe 50 each.
Vic
Question by:vlg

Expert Comment

by:jlevie
ID: 6932721
Where is the concern about scalability? In terms of storage, access time, FTP server load?

It would seem to me that storage would not be an issue with this sort of layout unless there could be more files located in a particular minute than there were inodes on the disk that contained them. The layout of the storage lends itself to multiple mount points, so even a humongous amount of 'product' could be easily handled.
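
As a rough way to check the inode headroom mentioned above, a minimal sketch using Python's `os.statvfs` on a Unix-like system. The helper name `inode_usage` is made up for this example, and some filesystems report 0 for both counts.

```python
import os
from typing import Tuple


def inode_usage(path: str) -> Tuple[int, int]:
    """Return (total, free) inode counts for the filesystem holding path."""
    stats = os.statvfs(path)
    return stats.f_files, stats.f_ffree


if __name__ == "__main__":
    # "." is a placeholder; point this at the mount that holds the product tree.
    total, free = inode_usage(".")
    print(f"inodes: {total} total, {free} free")
```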

Likewise I doubt that access time would be an issue. There are other limits that would come into play long before traversing the file system(s) would be a problem.

Server load could well be an issue. Presumably if a much higher volume product is to be handled there would be a higher volume of FTP traffic, both in placing the product on the server and in retrieving it. And this is an area where I don't see an RDBMS helping all that much. Distributing the existing file system across multiple servers looks to be a better solution as far as scalability is concerned. That way both the data storage and the FTP load are distributed. With an RDBMS you'd still have a single choke point, even if there were multiple FTP servers.
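
One possible way to spread products across several servers is to hash the product name, sketched below. This is an illustrative technique, not something described in the thread: the server names are hypothetical and `server_for` is a name invented for the example. A fixed lookup table mapping each product to a server would serve equally well.

```python
import hashlib
from typing import Sequence

# Hypothetical server names, for illustration only.
SERVERS = ["ftp1.example.com", "ftp2.example.com", "ftp3.example.com"]


def server_for(product: str, servers: Sequence[str] = SERVERS) -> str:
    """Pick the server that holds a product's whole directory tree."""
    digest = hashlib.sha256(product.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

The same product name always maps to the same server, so each product's year/month/day tree stays in one place. Note that with this simple modulo scheme, changing the number of servers moves most products to a different server.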

So it would seem to me that a file system based FTP repository would scale better than an RDBMS based system.

Author Comment

by:vlg
ID: 6934423
jlevie

Thanks for your response - we're primarily concerned about access time.  You mentioned:

There are other limits that would come into play
long before traversing the file system(s) would be a problem.

Can you tell me which limits those are, and why they come into play first?

Thanks again

Edmund
Accepted Solution

by:jlevie (earned 100 total points)
ID: 6934682
Well, since access to the data is by FTP, the network and the FTP service will limit the process first. The actual data transfer via FTP is the most efficient means of moving data between systems, but the overhead associated with getting from the initial FTP connect to the get/put is significant. You also have to consider what limits the OS and hardware have, like max number of processes, max open files, installed memory, etc. Obviously, if there are lots and lots of FTP sessions active, you could easily get into a situation where the server would start to swap, which would significantly impact access time. And that's where a number of servers, each with a portion of the storage, has an advantage.
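
To see two of the OS limits mentioned above (open files and processes), a small sketch using Python's standard `resource` module. It assumes a Unix-like system where `RLIMIT_NPROC` is available, and `show_limits` is a name made up for the example. It reports the limits of the process that runs it, not those of the FTP daemon.

```python
import resource
from typing import Dict, Tuple


def show_limits() -> Dict[str, Tuple[int, int]]:
    """Return (soft, hard) limits for open files and processes."""
    return {
        "max open files": resource.getrlimit(resource.RLIMIT_NOFILE),
        "max processes": resource.getrlimit(resource.RLIMIT_NPROC),
    }


if __name__ == "__main__":
    for name, (soft, hard) in show_limits().items():
        # resource.RLIM_INFINITY marks a limit that is not enforced.
        print(f"{name}: soft={soft} hard={hard}")
```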

Author Comment

by:vlg
ID: 6934972
Thanks much!