Distributing Threads Across Multiple Servers

Posted on 2009-12-28
Last Modified: 2013-12-10
I don't quite understand why you cannot treat processors in a cluster the same way you treat cores in a CPU.  Shouldn't I be able to run a cluster of 10 quad-core machines and have my programs see it as a single 40-core CPU?  Is this what the Kerrighed kernel does?
Question by:incero
    LVL 26

    Expert Comment

    Have an Admin add this question to a couple of Linux OS categories.
    I think you'll get more assistance there.
    LVL 43

    Expert Comment

    You could, if you also had a single memory address space spread across all cluster nodes; however, cluster nodes usually have local memory only.
    For the same reason, a whole process has to be transferred from one machine to another, not just individual threads.

    If your clustering kernel supports a common address space (accessing memory on other hosts as if it were local), that should be possible. Even then, it would greatly slow down the application, right?
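To make the address-space point concrete, here is a minimal Python sketch. It is only an illustration, not how any clustering kernel works: the function names and the JSON-over-pipe transport are assumptions chosen for the example. A thread mutates the caller's data directly, while a separate process (standing in for "another machine") only sees a copy that has to be serialized out and back.

```python
import json
import subprocess
import sys
import threading


def increment_in_thread(counter):
    """Runs in the same address space: mutates the caller's dict directly."""
    counter["value"] += 1


def increment_in_other_process(counter):
    """Runs in a separate address space: state is copied out and copied back."""
    # Hypothetical child program; it receives JSON on stdin and replies on stdout.
    child_code = (
        "import json, sys\n"
        "c = json.load(sys.stdin)\n"
        "c['value'] += 1\n"
        "json.dump(c, sys.stdout)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        input=json.dumps(counter),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Thread: the change is visible to the caller with no copying.
shared = {"value": 0}
worker = threading.Thread(target=increment_in_thread, args=(shared,))
worker.start()
worker.join()

# Separate process: the caller's dict is untouched; only the returned copy changed.
original = {"value": 0}
returned = increment_in_other_process(original)
```

The second case is roughly what every cross-node access turns into when there is no common address space: an explicit copy in each direction.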
    LVL 69

    Expert Comment

    An OS knows about threads and how to manage them on a local machine.  If you want to handle a cluster as a parallel computing resource, you need a specialized OS or cluster software stack, such as a Beowulf cluster.  There are also special programming platforms like CUDA for NVIDIA GPUs that treat the stream processors in certain video cards as cores.
    LVL 10

    Expert Comment

    In principle this could work, but practical considerations would normally rule it out. All threads running on one computer have access to local memory, which allows very rapid data sharing and inter-thread communication. In a cluster that does not share memory, you would be limited by the relatively slow network link. For some application types this may not be a serious problem. What would be a problem is finding software that can manage such a system efficiently.
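A rough way to see why the network link dominates is a back-of-the-envelope model. The sketch below is an assumption-laden illustration: the formula (perfectly divisible compute plus a fixed cost per synchronization point) and all the numbers are made up for the example, not measurements of any real system.

```python
def estimated_speedup(compute_seconds, workers, sync_points, latency_seconds):
    """Speedup over one core if work divides evenly but every sync point costs latency."""
    parallel_time = compute_seconds / workers + sync_points * latency_seconds
    return compute_seconds / parallel_time


# Illustrative (assumed) figures: 10 s of work, 40 cores, one million sync points.
COMPUTE_SECONDS = 10.0
WORKERS = 40
SYNC_POINTS = 1_000_000

# Assumed order-of-magnitude latencies, for illustration only.
SHARED_MEMORY_LATENCY = 100e-9   # about 100 ns per exchange inside one machine
NETWORK_LATENCY = 50e-6          # about 50 microseconds per exchange over a network

local_speedup = estimated_speedup(COMPUTE_SECONDS, WORKERS, SYNC_POINTS, SHARED_MEMORY_LATENCY)
cluster_speedup = estimated_speedup(COMPUTE_SECONDS, WORKERS, SYNC_POINTS, NETWORK_LATENCY)

print(f"shared memory: {local_speedup:.1f}x, network: {cluster_speedup:.2f}x")
```

Under these assumed numbers the chatty workload speeds up nicely in shared memory but runs slower than a single core once every exchange crosses the network. With few synchronization points the two cases converge, which matches the remark that some application types are not seriously affected.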
    LVL 57

    Accepted Solution

    It takes some serious horsepower and high-speed connections to do what you are talking about.  Each OS image must be able to talk to the other OS images fast enough that it does not slow down normal processing.  From reading about the Kerrighed kernel, it seems that it is attempting to do this.

    One environment that does come close to doing this is IBM z/OS using Parallel Sysplex.  This is a clustering environment where the OS images talk to each other via a Coupling Facility.  It is not 100% like what you are asking, but it is close.  Parallel Sysplex gives the impression that instead of "X" number of z/OS systems (up to 32), you have 1 system.  It uses Workload Manager (each system runs this, and it talks to all of the other systems via the Coupling Facility) to see which system has what resources and how busy each system is.
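The idea of routing work based on how busy each system is can be sketched in a few lines of Python. This is a toy model and not how z/OS Workload Manager is implemented: the system names, the integer utilization percentages, and the fixed cost per job are all hypothetical.

```python
def pick_least_busy(load_by_system):
    """Return the name of the system with the lowest reported utilization."""
    if not load_by_system:
        raise ValueError("no systems reported")
    return min(load_by_system, key=lambda name: load_by_system[name])


def dispatch(jobs, load_by_system, cost_per_job=10):
    """Place each job on the currently least busy system, updating its load."""
    load = dict(load_by_system)      # work on a copy of the reported loads
    placement = {}
    for job in jobs:
        target = pick_least_busy(load)
        placement[job] = target
        load[target] += cost_per_job  # assumed fixed utilization cost per job
    return placement


# Hypothetical utilization percentages reported by three systems.
reported = {"SYSA": 80, "SYSB": 35, "SYSC": 50}
placement = dispatch(["job1", "job2", "job3", "job4"], reported)
```

In this toy run the busiest system (SYSA) receives nothing, which is the "one system" impression described above: the submitter does not choose where the work lands.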

    Right now, with up to 32 z/OS images and each image supporting up to 64 CPUs, you can have a 2048-CPU "single image" system.  I have never heard of a site that large yet.

    Part of the problem is that you need high-speed communications between the boxes, and 10 Gb Ethernet is not fast enough.  I think that right now the coupling facility links run at up to 4 GB/s (32 Gb/s).
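The byte/bit conversion behind those figures is easy to check. The sketch below assumes the quoted link speeds are per-second rates and ignores protocol overhead and latency, so the transfer times are best-case estimates only; the function names are made up for the example.

```python
BITS_PER_BYTE = 8


def gigabytes_to_gigabits(gigabytes):
    """Convert gigabytes (GB) to gigabits (Gb)."""
    return gigabytes * BITS_PER_BYTE


def transfer_seconds(payload_gigabytes, link_gigabits_per_second):
    """Best-case time to move a payload over a link, ignoring overhead and latency."""
    return gigabytes_to_gigabits(payload_gigabytes) / link_gigabits_per_second


coupling_link_gbit = gigabytes_to_gigabits(4)          # 4 GB/s expressed in Gb/s
ethernet_time = transfer_seconds(1, 10)                 # 1 GB over 10 Gb Ethernet
coupling_time = transfer_seconds(1, coupling_link_gbit) # 1 GB over the faster link
```

So 4 GB/s is 32 Gb/s, a little over three times the raw rate of 10 Gb Ethernet, before accounting for latency differences.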

    You can use RDMA over IP to access the memory of a remote machine; at least, that is my impression of what RDMA is used for.

    LVL 39

    Expert Comment

    OpenVMS comes as close as it gets (especially a Galaxy cluster, which has shared memory).

    Separating machines puts up a boundary in memory. To cross that boundary you need to add communication, which means overhead.
    Multithreading is used to handle more work WITHIN one process context; splitting a process context across memory boundaries just makes no sense. Instead, build a system with more cores or split your procedure into multiple cooperating processes.
    (VMS clusters go up to ~200 nodes of up to 64 cores each, managed as a single resource domain. If needed for disaster tolerance, the diameter of the cluster can be 500 km with the default setup.)
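Splitting a procedure into multiple processes could look something like the Python sketch below. It is a minimal illustration under stated assumptions: the worker program, the helper names, and the text-over-pipe protocol are invented for the example, and local child processes stand in for cluster nodes. Each worker gets its own chunk, works on it independently, and communicates only once to report a result.

```python
import subprocess
import sys

# Hypothetical worker program: reads whitespace-separated integers, prints their sum.
WORKER_CODE = "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"


def split_into_chunks(items, parts):
    """Split a list into at most `parts` contiguous chunks of near-equal size."""
    if not items:
        return []
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(numbers, parts):
    """Sum numbers using one worker process per chunk, then combine the results."""
    workers = []
    for chunk in split_into_chunks(numbers, parts):
        proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        # Small payloads only: large ones would need proper streaming to avoid blocking.
        proc.stdin.write(" ".join(str(n) for n in chunk))
        proc.stdin.close()           # signals end of input; the worker starts summing
        workers.append(proc)

    total = 0
    for proc in workers:
        total += int(proc.stdout.read())
        proc.stdout.close()
        proc.wait()
    return total


total = parallel_sum(list(range(1, 101)), parts=4)
```

The only data that crosses a process boundary is the input chunk and a single number per worker, which is the kind of design that tolerates a slow link between machines.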
