ravenpl asked:

x86_64 and process vmemory limit

What exactly is the per-process virtual memory limit? For i386 it's 3G (I know there are patches, but the default is 3G).
What's the limit for x86_64? (I know the bus is 40 bits wide, 48 bits for the new Opterons.)
Please provide some links describing the matter.
pjedmond

See the kernel source documentation for the memory model:

Documentation/x86_64/mm.txt

The paging design described there gives each process a virtual address space limit of 1 terabyte on x86_64, *BUT* that figure was specifically for a 40-bit-wide bus.

It may be capable of being bigger on the new Opterons, but I guess there might be other dependencies in the code somewhere?

(   (()
(`-' _\
 ''  ''

ravenpl

ASKER

Well, in fact I'm familiar with this doc, but I'm still unsure:
Can process virtual memory grow to 2^48 bytes, or is the limit lower?

pjedmond

Looking at the source code, it is definitely capable of 1 terabyte (provided the architecture supports it). Beyond that, I'd not want to try, as all of the memory is defined in blocks stored in tables. Working out what can be fitted in those tables is beyond me without an x86_64 system to test and tweak on.

Do you *really* need more than 1 terabyte? From a practical perspective, I'd say don't exceed 2^40. From the way the code blocks appear to be referenced (I'm looking at a 2.6.16 kernel), anything greater than 2^41 is impossible to reference. However, there may be other areas of the code that deal with the 'newer Athlons'. In addition, the current kernel may have evolved since then to take account of newer Athlons.

>Can process virtual memory grow to 2^48 bytes or lower?

Therefore, for most x86_64 processors (possibly all?), virtual memory *for a single application* cannot exceed 1 terabyte. (For all applications combined, concurrently referenced by the kernel, it definitely cannot exceed 2TB.) This applies to a Linux 2.6.16 FC 4 kernel... so lower :)

Having said that, I was working on an Intel supercomputer system a few years ago that had multiple processors and was going to be used for weather forecasting, which makes me start considering potential ways to use a cluster if you should need to exceed the aforementioned limits. Do you?

(   (()
(`-' _\
 ''  ''





ravenpl

ASKER

For i386 it's 4G, but in the default kernel only 3G of that is available to a userspace process.
Again, I have no clear response: what is the limit on x86_64 (I mean for a userspace process)?
I understand there is a hardware limit of 2^40 (1TB), but that applies to the kernel.
Duncan Roe
I do have an AMD-64 system, so I experimented with the following program:

> ! cat tryme.c
/*
gcc -Wall -Wmissing-prototypes -Wstrict-prototypes -g3 -ggdb tryme.c -o tryme
*/
#include <stdio.h>
#include <unistd.h>
int main (int argc, char**argv)
{
  unsigned char *p;
  unsigned long l;
 
  /* Step sizes tried in successive runs, with the highest address reached.
     Only the last assignment takes effect; reorder the lines to repeat a run. */
  l = 0xca000000L; /* best seen 0x40312000000; varies */
  l = 0x80000000L; /* best seen 0x2aaa80000000; consistent. */
  l = 0x40000000L; /* best seen 0x2aaa80000000; consistent. */
  l =  0x1000000L; /* Runs out of physical mem & crashes. */
  l =  0x4000000L; /* Runs out of physical mem & crashes. */
  l =  0x8000000L; /* best seen 0x2aaaa8000000; consistent. */
  l = 0x10000000L; /* best seen 0x2aaaa0000000; consistent. */
 
  /* Raise the program break from 2 GiB in steps of l until brk() fails. */
  for(p = (void *)0x80000000L;; p += l)
  {
    if (brk(p))
      break;
    *(p-1) = 0;                    /* Make sure system lets us use last byte */
  }
  /* p - l is the last break value that brk() accepted. */
  printf("Allocated %p, l = 0x%lx\n", (void *)(p - l), l);
  return 0;
}

The brk() system call is what malloc() traditionally uses to extend the size of program memory (glibc also uses mmap() for large requests). The test program does actually make sure it can write through the pointer which brk() reported was allocated.

It seems that the address limit is around 0x2aaaa8000000 (roughly 42.7 TiB), which is well over 40 bits (I have configured the kernel for sparse memory). Allocating memory more than 2G at a time gave me erratic results, with an increment of somewhere slightly above 0xca000000 hardly allocating anything at all.

So that seems to be about the extent of the user address space limit. Why did you want to know, by the way?
ravenpl

ASKER

Great test!
One more thing: could you verify the VmPeak and VmSize values from /proc/self/status?
Pre-open the file and preallocate the read buffer to avoid memory problems.
ASKER CERTIFIED SOLUTION
Duncan Roe
