Solved

64bit milter grows huge in vsize?

Posted on 2011-05-12
Last Modified: 2013-12-26
I have developed a milter application for use with Postfix.
I'm running it on two 32-bit Scientific Linux 6 machines and one 64-bit machine.
On the 64-bit machine the application's vsize grows very large. The top output looks like this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13184 mail      20   0 1074m 1256  628 S  0.0  0.0   0:00.00 milter+

It's not a big problem, since RSS is pretty small and does not grow over time. Actually, the vsize does not grow over time either.
But on the 32-bit systems I don't observe this behavior.

I looked at /proc/pid/smaps and found many pairs of mappings like the following:
7f26cc021000-7f26d0000000 ---p 00000000 00:00 0
Size:              65404 kB
Rss:                   0 kB
...
7f26d0000000-7f26d0021000 rw-p 00000000 00:00 0
Size:                132 kB
Rss:                  36 kB
...

Please note that the 64M blocks have no access rights (---p) assigned!?
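
A quick arithmetic check on the pair quoted above, as a minimal Python sketch (the helper name `region_kb` is made up for illustration). It only recomputes the sizes from the address ranges and checks how the two regions relate to a 64 MiB boundary:

```python
# Sketch: recompute the sizes of the two quoted mappings from their addresses.
KB = 1024
MIB = 1024 * KB

def region_kb(line):
    """Size in kB of a maps/smaps header line of the form 'start-end perms ...'."""
    start, end = (int(x, 16) for x in line.split()[0].split("-"))
    return (end - start) // KB

no_access = "7f26cc021000-7f26d0000000 ---p 00000000 00:00 0"
read_write = "7f26d0000000-7f26d0021000 rw-p 00000000 00:00 0"

print(region_kb(no_access))     # 65404
print(region_kb(read_write))    # 132

# The two sizes add up to exactly 64 MiB ...
print(region_kb(no_access) + region_kb(read_write) == 64 * MIB // KB)   # True
# ... and the read-write region starts on a 64 MiB boundary.
print(int(read_write.split("-")[0], 16) % (64 * MIB) == 0)              # True
```

So each inaccessible block ends exactly where a 64 MiB-aligned read-write region begins, which suggests the mappings are carved out of 64 MiB-aligned, 64 MiB-sized reservations.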

At first I thought these were thread stacks, but first, my application does not run that many threads (verified via /proc), and second, after running 'ulimit -s 4096' I see a few 4M maps like:
7f26c7c00000-7f26c8000000 rw-p 00000000 00:00 0
Size:               4096 kB
Rss:                  24 kB
7f26c8000000-7f26c8021000 rw-p 00000000 00:00 0
Size:                132 kB
Rss:                  12 kB

Note that there's still a 132K map right after.

If I run a 32-bit build of the milter on the 64-bit system, those rights-less maps are still there, but they are only 892K in size.

Any thoughts on what those maps are?
Question by:ravenpl
21 Comments
 
LVL 53

Expert Comment

by:Infinity08
ID: 35744945
To figure out what exactly creates those mappings, you could strace your application, and look for the mmap calls that created the mappings.

You'll probably find that they don't actually consume any memory (physical or swap).
The size is probably for alignment reasons, and/or to avoid fragmentation.

Their existence might indeed be for threads (stack space eg. as you suspect). As part of a thread pool maybe, or your threading library (pthreads ?) preparing for future threads.
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35761101
I'll try strace if I can find a milter tester.
Meanwhile, I think I have verified it is not a stack (not used, no rights, and then there is the 132K map just after).

More ideas? I especially wonder why the milter or the thread library would mmap 64M chunks with no rights.
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35761155
Rights of a mapping can be modified. It's not because a mapping doesn't have any rights now, that it can't have them later. Without further information, my best guess is still what I posted in my first response. The strace will probably shed some more light on things.
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35761187
> Rights of a mapping can be modified. It's not because a mapping doesn't have any rights now, that it can't have them later.
Of course they can, but do You know glibc/milter/pthreads to mmap memory without rights first?

I know I can do the investigation myself (I'm a developer); I just thought someone might have already seen this...
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35761329
>> Of course they can, but do You know glibc/milter/pthreads to mmap memory without rights first?

Yes. And/or removing rights to old mappings, in order for later re-use, or for alignment purposes.

But as I said : the information you gave is not enough to say for certain what they're for. You need to find out how they were created, and strace can help you figure that out.
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35761443
> Yes. And/or removing rights to old mappings, in order for later re-use, or for alignment purposes.
But those are not old - I can assure You. Moreover, old mappings would probably have non-zero RSS - unless unmapped/remapped.
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35761457
Again : there's no way to tell for certain what they're for without knowing how they were created.
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35763590
Results from strace:
- the mmap is even bigger; half of it is then unmapped immediately
- a new big mmap happens only when a new thread is created, and only if the process has never had that many threads at once before
- in other words, there are as many big mmaps as the maximum number of threads the process has ever had at the same time
- spawning a thread into a "new slot" looks like this:
[pid  7746] mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fc07abfe000
Process 8168 attached
[pid  8168] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fc06c000000
[pid  8168] munmap(0x7fc070000000, 67108864) = 0

I also used gdb to see some more.
The backtrace shows the mmap is a result of malloc - but the malloc is very small (e.g. 12 bytes), so I would blame glibc/pthread rather than the milter.
#0  0x0000003ce10de380 in mmap64 () from /lib64/libc.so.6
#1  0x0000003ce1076a01 in new_heap () from /lib64/libc.so.6
#2  0x0000003ce1077173 in arena_get2 () from /lib64/libc.so.6
#3  0x0000003ce1079b1a in malloc () from /lib64/libc.so.6
#4  0x0000003ce1407d22 in mi_rd_cmd () from /usr/lib64/libmilter.so.1.0
#5  0x0000003ce1405785 in mi_engine () from /usr/lib64/libmilter.so.1.0
#6  0x0000003ce1407818 in mi_handle_session () from /usr/lib64/libmilter.so.1.0
#7  0x0000003ce1406369 in ?? () from /usr/lib64/libmilter.so.1.0
#8  0x0000003ce18077e1 in start_thread () from /lib64/libpthread.so.0
#9  0x0000003ce10e18ed in clone () from /lib64/libc.so.6

I also tried a different memory allocator (preloading libtcmalloc from google-perftools), and those mmaps were gone.

So those mmaps have to be some kind of per-thread cache or similar. I haven't googled what it is yet.

The questions are
- what is it
- how can I disable it or shrink its size.
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35763752
>> [pid  8168] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fc06c000000
>> [pid  8168] munmap(0x7fc070000000, 67108864) = 0

Ok, so what happens here, is that it wants a 64MB aligned mapping.
But since it has no control over the alignment, it asks for a 128MB block instead, and then it cuts off the start and end using munmap to keep only the 64MB part that is aligned on a 64MB boundary.
In this case, it so happens that the block returned was already aligned to 64MB, so only the end half needed to be cut off.
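
The alignment trick described above can be sketched as plain address arithmetic. This is only an illustration in Python (the function name `trim_to_aligned` is made up here and is not anything from glibc); it takes the address returned by the 128MB mmap in the strace and works out which parts would have to be cut off:

```python
MIB = 1024 * 1024

def trim_to_aligned(base, align):
    """For a mapping of 2*align bytes at `base`, return (aligned_start, head, tail).

    `head` is the number of bytes to unmap before the aligned block,
    `tail` the number of bytes to unmap after it. `align` must be a power of two.
    """
    aligned = (base + align - 1) & ~(align - 1)       # round up to the boundary
    head = aligned - base
    tail = (base + 2 * align) - (aligned + align)
    return aligned, head, tail

# Values taken from the strace output quoted above.
aligned, head, tail = trim_to_aligned(0x7fc06c000000, 64 * MIB)
print(hex(aligned), head, tail)           # 0x7fc06c000000 0 67108864
print(hex(aligned + 64 * MIB))            # 0x7fc070000000, the munmap address

# A hypothetical unaligned result would need both ends trimmed.
print(trim_to_aligned(0x7fc06c021000, 64 * MIB))
```

With the address from the strace, nothing needs trimming at the front and exactly 67108864 bytes are dropped starting at 0x7fc070000000, which matches the single munmap call.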


>> - what is it

Since the strace shows that it happens when spawning a thread, it shows that the first mmap is the stack for the thread, and the second (the 64MB one) is the so-called "guard page", which protects against a stack overflow.


>> - how can I disable it or shrink its size.

Normally, you don't need to. The mapping is MAP_NORESERVE, so it doesn't take up any physical or virtual memory.

But if you do, you can use pthread_attr_setguardsize to change the size of the guard page :

        http://linux.die.net/man/3/pthread_attr_setguardsize

Or you can provide your own stack using pthread_attr_setstack, in which case the guard page is not used :

        http://linux.die.net/man/3/pthread_attr_setstack
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35763762
>> it shows that the first mmap is the stack for the thread

After a closer look, it seems like the first mmap is unrelated to the second (given its relative location to the guard page), so the thread's stack might have been allocated elsewhere.
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35763798
> The mapping is MAP_NORESERVE, so it doesn't take up any physical or virtual memory.
I know, but it still looks very strange. I'd like to know what it is - it does not look very normal (unless it's Java - haha).

This big mmap is unrelated to the stack. strace shows two mmaps: the first is the stack (10M default stack size), the second is the big mmap. They are unrelated - both are requested with a NULL address.
So both questions still stand
- what is it
- how can I disable it or shrink its size.
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35763820
Did you miss my earlier post (http:#35763752), where I answered that ?
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35763847
No, I haven't missed it. But if You want 64M aligned mem You can ask for 64M only. I'm not getting why waste 64M vspace per thread?
If I have multithreaded application run with vsize ulimit then what - have to use different allocator to run it?

About stack - we already agreed(i think) it's unrelated - there are separate 10M mmaps for stacks.

Infinity08: do know what it really is or just assuming? This thread is getting little long...
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35763905
>> But if You want 64M aligned mem You can ask for 64M only.

No, you can't, because you can't be sure that the 64MB block you ask for will be aligned on a 64MB boundary.


>> I'm not getting why waste 64M vspace per thread?

As I said : for the guard page.


>> If I have multithreaded application run with vsize ulimit then what - have to use different allocator to run it?

If you want to control the size of the guard page (or get rid of it), I gave you two options to do so.


>> About stack - we already agreed(i think) it's unrelated - there are separate 10M mmaps for stacks.

There is the stack, and then there's the guard page. They are intrinsically related, because the guard page is there to guard against stack overflow.


>> Infinity08: do know what it really is or just assuming?

Yes.


>> This thread is getting little long...

I can't help it that it took a while before you posted the strace output ;)
0
 
LVL 43

Accepted Solution

by:
ravenpl earned 0 total points
ID: 35764021
>>> But if You want 64M aligned mem You can ask for 64M only.
>No, you can't, because you can't be sure that the 64MB block you ask for will be aligned on a 64MB boundary.
You're right here.

>>> I'm not getting why waste 64M vspace per thread?
>As I said : for the guard page.
What guard page? This is not a stack! For a heap it does not make sense to guard only the last object in the heap, or to have a guard before the heap.
Please do the math! The stack was allocated at 0x7fc07abfe000, then 128M at 0x7fc06c000000 - but 0x7fc06c000000 + 128M is 0x7fc074000000, which is over 100M below the start of the stack mapping (actually the end of the stack, since it grows downward).
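
The same math spelled out as a small Python sketch, using only the addresses and lengths from the strace output (the variable names are just for illustration):

```python
MIB = 1024 * 1024

stack_base = 0x7fc07abfe000   # address returned by the first mmap (MAP_STACK)
stack_len = 10489856          # its length
big_base = 0x7fc06c000000     # address returned by the PROT_NONE mmap
big_len = 134217728           # its length

big_end = big_base + big_len
gap = stack_base - big_end

print(big_len // MIB)               # 128
print(hex(big_end))                 # 0x7fc074000000
print(gap, gap // MIB)              # 113238016 107
print(stack_len - 10 * MIB)         # 4096
```

The big mapping ends about 107 MiB below the stack mapping, so the two are not adjacent. The stack mapping itself is 10 MiB plus 4096 bytes, which would be consistent with a single-page guard being part of that first mmap, although the strace alone does not prove that.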


OK, I browsed glibc sources, and glibc does that.
It defines HEAP_MAX_SIZE as 2 * DEFAULT_MMAP_THRESHOLD_MAX, where DEFAULT_MMAP_THRESHOLD_MAX is 512K on 32bit and 32M on 64bit, so HEAP_MAX_SIZE is 1M and 64M respectively.
Then it wants a heap per thread - if there is no spare heap, it allocates a new one [new_heap()]. The new heap has to be HEAP_MAX_SIZE aligned so that subsequent heaps can fit into possible holes (to save vsize - so ironic!). To align, it simply allocates double the size and then unmaps the spare part.
Finally it changes the protection for the needed region only, leaving the majority of the new heap without any rights. If more heap is needed, it changes the protection on the next chunk [grow_heap()], and so on.
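
These constants can be checked against the numbers observed in smaps. A minimal Python sketch, assuming the threshold values quoted above and assuming the initially accessible part of each heap is the 132 kB seen in the 64-bit smaps output (for the 32-bit build that figure is a guess, and the function names are made up for illustration):

```python
KB = 1024
MIB = 1024 * KB

# Values as quoted from the glibc sources above.
DEFAULT_MMAP_THRESHOLD_MAX = {"32bit": 512 * KB, "64bit": 32 * MIB}

def heap_max_size(arch):
    """HEAP_MAX_SIZE = 2 * DEFAULT_MMAP_THRESHOLD_MAX."""
    return 2 * DEFAULT_MMAP_THRESHOLD_MAX[arch]

def inaccessible_kb(arch, accessible_kb=132):
    """Size in kB of the part of a new heap left without any rights."""
    return heap_max_size(arch) // KB - accessible_kb

print(heap_max_size("32bit") // KB)     # 1024
print(heap_max_size("64bit") // MIB)    # 64
print(inaccessible_kb("32bit"))         # 892
print(inaccessible_kb("64bit"))         # 65404

# Address space reserved by 64 such heaps on 64bit, in GiB.
print(64 * heap_max_size("64bit") // (1024 * MIB))   # 4
```

The 892 kB and 65404 kB results match the rights-less maps seen with the 32-bit and 64-bit builds.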

To me - it looks stupid and vsize wasteful. On the other hand why someone would limit vspace on 64bit? Good Q, isn't it?
Also, it cannot be changed - these are compile-time constants.

All in all, I'll have to live with this huge vsize in multithreaded applications or link against a different memory allocator.
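
One more knob that may be worth trying, as a hedged sketch: the heap size itself is fixed, but glibc also reads a MALLOC_ARENA_MAX environment variable that caps how many arenas (and hence how many of these per-thread heaps) it creates. Whether the installed glibc honours it needs to be verified. The helper name and the binary path below are hypothetical:

```python
import os

def env_with_arena_cap(max_arenas, base_env=None):
    """Return a copy of the environment with MALLOC_ARENA_MAX set.

    Assumption: the glibc used by the child process supports MALLOC_ARENA_MAX.
    """
    if max_arenas < 1:
        raise ValueError("max_arenas must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["MALLOC_ARENA_MAX"] = str(max_arenas)
    return env

# Usage (hypothetical path, shown as a comment so nothing is launched here):
#   import subprocess
#   subprocess.Popen(["/path/to/milter"], env=env_with_arena_cap(2))
print(env_with_arena_cap(2, {"PATH": "/bin"}))
```

Fewer arenas means fewer 64M reservations at the cost of more lock contention between threads in malloc, so this is a trade-off rather than a fix.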
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35764077
>> Please do the math!

I did. Which prompted me to post http:#35763762. I noticed the discrepancy, and then corrected myself.


>> OK, I browsed glibc sources, and glibc does that.

I guess you found an explanation for the discrepancy then.
So it sounds like you're using a pthread implementation that doesn't make use of guard pages, and I was on the wrong track (it looks exactly like a guard page would look though heh : same size, alignment, permissions, options, creation time - but I guess the old saying "if it looks and quacks like a duck ..." is not always true).


>> To me - it looks stupid and vsize wasteful. On the other hand why someone would limit vspace on 64bit? Good Q, isn't it?

I don't think a few 64MB chunks of vspace really make much of a difference in the grand scheme of things heh. There's plenty of vspace :)
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35764103
> So it sounds like you're using a pthread implementation that doesn't make use of guard pages
Can You tell me why heap should use guard pages? I posted the strace on purpose: it shows that two areas are allocated (10M stack, 128M something).
> though heh : same size
And why they(guard pages) would be 64MB large?

> I don't think a few 64MB chunks of vspace really make much of a difference in the grand scheme of things heh. There's plenty of vspace :)
Yes, but having 64+ threads (which is likely in a milter implementation) gives You a vspace of over 4GB! I'm not saying it's a large number on 64bit - only it looks very strange. Doesn't it?
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35764210
>> Can You tell me why heap should use guard pages?

I'm not sure what you mean by that, but the thread stack is commonly allocated directly from virtual memory.
If you are referring to the stack trace you posted, I couldn't tell which mmap it was for, so I didn't consider it for my reply.


>> And why they(guard pages) would be 64MB large?

Technically, it's a multiple of the page size (http://linux.die.net/man/2/getpagesize), which is commonly 4KB on 32bit systems eg.

64MB is admittedly big, but unless your page size is larger than 64MB it's a multiple of the page size, which makes it consistent with the size a guard page would have.

Anyway, since you've found it's not a guard page, it seems like it's a moot point now.



>> only it looks very strange. Doesn't it?

It looks like something that was worth investigating and learning about, yes :)
0
 
LVL 43

Author Comment

by:ravenpl
ID: 35764249
> If you are referring to the stack trace you posted, I couldn't tell which mmap it was for, so I didn't consider it for my reply.
If You read carefully, You would notice it is not a stack. It's all there in both the strace and gdb's bt.

>> And why they(guard pages) would be 64MB large?
I see it's very hard to simply admit it's .. say unwise.

EOT.
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35764268
I'm not sure what you're getting at here, ravenpl ... But I get the feeling you're trying to make me come off as some kind of quack, and I can't say I appreciate that.

All this time, I've worked with the information you decided to give, and with that very limited view I came up with an explanation (out of all the possible reasons for such mappings) that turned out to be the wrong one. That's all there is to it.

If you want to make anything more from it, be my guest, but that'll be without me.
0
 
LVL 43

Author Closing Comment

by:ravenpl
ID: 35814137
Answered myself.
0
