Snail in a Turtleneck

Kristina Chodorow's Blog


Playing with Virtual Memory

Linux: the developer's personal gentleman

When you run a process, it needs some memory to store things: its heap, its stack, and any libraries it’s using. Linux provides and cleans up memory for your process like an extremely conscientious butler. You can (and generally should) just let Linux do its thing, but it’s a good idea to understand the basics of what’s going on.

One easy way (I think) to understand this stuff is to actually look at what’s going on using the pmap command. pmap shows you memory information for a given process.

For example, let’s take a really simple C program that prints its own process id (PID) and pauses:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
  printf("run `pmap %d`\n", getpid());
  pause();
}

Save this as mem_munch.c. Now compile and run it with:

$ gcc mem_munch.c -o mem_munch
$ ./mem_munch
run `pmap 25681`

The PID you get will probably be different than mine (25681).

At this point, the program will “hang.” This is because of the pause() function, and it’s exactly what we want. Now we can look at the memory for this process at our leisure.

Open up a new shell and run pmap, replacing the PID below with the one mem_munch gave you:

$ pmap 25681
25681:   ./mem_munch
0000000000400000      4K r-x--  /home/user/mem_munch
0000000000600000      4K r----  /home/user/mem_munch
0000000000601000      4K rw---  /home/user/mem_munch
00007fcf5af88000   1576K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007fcf5b112000   2044K -----  /lib/x86_64-linux-gnu/libc-2.13.so
00007fcf5b311000     16K r----  /lib/x86_64-linux-gnu/libc-2.13.so
00007fcf5b315000      4K rw---  /lib/x86_64-linux-gnu/libc-2.13.so
00007fcf5b316000     24K rw---    [ anon ]
00007fcf5b31c000    132K r-x--  /lib/x86_64-linux-gnu/ld-2.13.so
00007fcf5b512000     12K rw---    [ anon ]
00007fcf5b539000     12K rw---    [ anon ]
00007fcf5b53c000      4K r----  /lib/x86_64-linux-gnu/ld-2.13.so
00007fcf5b53d000      8K rw---  /lib/x86_64-linux-gnu/ld-2.13.so
00007fff7efd8000    132K rw---    [ stack ]
00007fff7efff000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total             3984K

This output is how memory “looks” to the mem_munch process. If mem_munch asks the operating system for 00007fcf5af88000, it will get libc. If it asks for 00007fcf5b31c000, it will get the ld library.
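
(As an aside, pmap gets this table from /proc/<pid>/maps. If you're curious, here's a rough sketch, not part of the mem_munch examples, of a program that dumps the same information for itself:)

#include <stdio.h>

int main() {
  char line[512];
  // pmap reads /proc/<pid>/maps; this prints the same data for our own process
  FILE *maps = fopen("/proc/self/maps", "r");
  if (!maps) {
    perror("fopen");
    return -1;
  }
  while (fgets(line, sizeof(line), maps)) {
    // each line: address range, permissions, offset, device, inode, backing file
    fputs(line, stdout);
  }
  fclose(maps);
}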

This output is a bit dense and abstract, so let’s look at how some more familiar memory usage shows up. Change our program to put some memory on the stack and some on the heap, then pause.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
  int on_stack, *on_heap;

  // local variables are stored on the stack
  on_stack = 42;
  printf("stack address: %p\n", &on_stack);

  // malloc allocates heap memory
  on_heap = (int*)malloc(sizeof(int));
  printf("heap address: %p\n", on_heap);

  printf("run `pmap %d`\n", getpid());
  pause();
}

Now compile and run it:

$ ./mem_munch
stack address: 0x7fff497670bc
heap address: 0x1b84010
run `pmap 11972`

Again, your exact numbers will probably be different than mine.

Before you kill mem_munch, run pmap on it:

$ pmap 11972
11972:   ./mem_munch
0000000000400000      4K r-x--  /home/user/mem_munch
0000000000600000      4K r----  /home/user/mem_munch
0000000000601000      4K rw---  /home/user/mem_munch
0000000001b84000    132K rw---    [ anon ]
00007f3ec4d98000   1576K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007f3ec4f22000   2044K -----  /lib/x86_64-linux-gnu/libc-2.13.so
00007f3ec5121000     16K r----  /lib/x86_64-linux-gnu/libc-2.13.so
00007f3ec5125000      4K rw---  /lib/x86_64-linux-gnu/libc-2.13.so
00007f3ec5126000     24K rw---    [ anon ]
00007f3ec512c000    132K r-x--  /lib/x86_64-linux-gnu/ld-2.13.so
00007f3ec5322000     12K rw---    [ anon ]
00007f3ec5349000     12K rw---    [ anon ]
00007f3ec534c000      4K r----  /lib/x86_64-linux-gnu/ld-2.13.so
00007f3ec534d000      8K rw---  /lib/x86_64-linux-gnu/ld-2.13.so
00007fff49747000    132K rw---    [ stack ]
00007fff497bb000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total             4116K

Note that there’s a new entry between the final mem_munch section and libc-2.13.so. What could that be?


# from pmap
0000000001b84000 132K rw--- [ anon ]
# from our program
heap address: 0x1b84010

The addresses are almost the same. That block ([ anon ]) is the heap. (pmap labels blocks of memory that aren’t backed by a file [ anon ]. We’ll get into what being “backed by a file” means in a sec.)

The second thing to notice:


# from pmap
00007fff49747000 132K rw--- [ stack ]
# from our program
stack address: 0x7fff497670bc

And there’s your stack!

One other important thing to notice: this is how memory “looks” to your program, not how memory is actually laid out on your physical hardware. Look at how much memory mem_munch has to work with. According to pmap, it can address memory from 0x0000000000400000 all the way up to 0xffffffffff600000 (in principle the whole 64-bit range, although user space on x86-64 actually tops out at 0x00007fffffffffff, and the addresses above that are special). For those of you playing along at home, a full 64-bit address space is almost 17 million terabytes. That’s a lot of memory. (If your computer has that kind of memory, please leave your address and times you won’t be at home.)

So, the amount of memory the program can address is kind of ridiculous. Why does the computer do this? Well, lots of reasons, but one important one is that this means you can address more memory than you actually have on the machine and let the operating system take care of making sure the right stuff is in memory when you try to access it.
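
You can see this over-allocation for yourself. Here's a rough sketch (not one of the programs above) that asks the kernel for a terabyte of address space without using any real memory to speak of; the mapping shows up in pmap, but the resident set stays tiny:

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
  size_t one_tb = (size_t)1 << 40;  // 1TB of address space

  // PROT_NONE + MAP_NORESERVE: the kernel hands out addresses, not RAM
  void *huge = mmap(NULL, one_tb, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (huge == MAP_FAILED) {
    perror("mmap");
    return -1;
  }

  printf("reserved 1TB at %p, run `pmap -x %d`\n", huge, getpid());
  pause();
}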

Memory Mapped Files

Memory mapping a file basically tells the operating system to load the file so the program can access it as an array of bytes. Then you can treat a file like an in-memory array.

For example, let’s make a (pretty stupid) random number generator by creating a file full of random numbers, then mmap-ing it and reading off random numbers.

First, we’ll create a big file called random (note that this creates a 1GB file, so make sure you have the disk space and be patient, it’ll take a little while to write):

$ dd if=/dev/urandom bs=1024 count=1000000 of=/home/user/random
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 123.293 s, 8.3 MB/s
$ ls -lh random
-rw-r--r-- 1 user user 977M 2011-08-29 16:46 random

Now we’ll mmap random and use it to generate random numbers.

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
  char *random_bytes;
  FILE *f;
  int offset = 0;

  // open "random" for reading
  f = fopen("/home/user/random", "r");
  if (!f) {
    perror("couldn't open file");
    return -1;
  }

  // we want to inspect memory before mapping the file
  printf("run `pmap %d`, then press <enter>", getpid());
  getchar();

  random_bytes = mmap(0, 1000000000, PROT_READ, MAP_SHARED, fileno(f), 0);

  if (random_bytes == MAP_FAILED) {
    perror("error mapping the file");
    return -1;
  }

  while (1) {
    printf("random number: %d (press <enter> for next number)", *(int*)(random_bytes+offset));
    getchar();

    offset += 4;
  }
}

If we run this program, we’ll get something like:

$ ./mem_munch
run `pmap 12727`, then press <enter>

The program hasn’t done anything yet, so the output of running pmap will basically be the same as it was above (I’ll omit it for brevity). However, if we continue running mem_munch by pressing enter, our program will mmap random.

Now if we run pmap it will look something like:

$ pmap 12727
12727:   ./mem_munch
0000000000400000      4K r-x--  /home/user/mem_munch
0000000000600000      4K r----  /home/user/mem_munch
0000000000601000      4K rw---  /home/user/mem_munch
000000000147d000    132K rw---    [ anon ]
00007fe261c6f000 976564K r--s-  /home/user/random
00007fe29d61c000   1576K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007fe29d7a6000   2044K -----  /lib/x86_64-linux-gnu/libc-2.13.so
00007fe29d9a5000     16K r----  /lib/x86_64-linux-gnu/libc-2.13.so
00007fe29d9a9000      4K rw---  /lib/x86_64-linux-gnu/libc-2.13.so
00007fe29d9aa000     24K rw---    [ anon ]
00007fe29d9b0000    132K r-x--  /lib/x86_64-linux-gnu/ld-2.13.so
00007fe29dba6000     12K rw---    [ anon ]
00007fe29dbcc000     16K rw---    [ anon ]
00007fe29dbd0000      4K r----  /lib/x86_64-linux-gnu/ld-2.13.so
00007fe29dbd1000      8K rw---  /lib/x86_64-linux-gnu/ld-2.13.so
00007ffff29b2000    132K rw---    [ stack ]
00007ffff29de000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total           980684K

This is very similar to before, but with an extra line: the /home/user/random mapping, which kicks virtual memory usage up quite a bit (from about 4MB to about 980MB).

However, let’s re-run pmap with the -x option. This shows the resident set size (RSS): only 4KB of random are resident. Resident memory is memory that’s actually in RAM. There’s very little of random in RAM because we’ve only accessed the very start of the file, so the OS has only pulled the first bit of the file from disk into memory.

pmap -x 12727
12727:   ./mem_munch
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000       0       4       0 r-x--  mem_munch
0000000000600000       0       4       4 r----  mem_munch
0000000000601000       0       4       4 rw---  mem_munch
000000000147d000       0       4       4 rw---    [ anon ]
00007fe261c6f000       0       4       0 r--s-  random
00007fe29d61c000       0     288       0 r-x--  libc-2.13.so
00007fe29d7a6000       0       0       0 -----  libc-2.13.so
00007fe29d9a5000       0      16      16 r----  libc-2.13.so
00007fe29d9a9000       0       4       4 rw---  libc-2.13.so
00007fe29d9aa000       0      16      16 rw---    [ anon ]
00007fe29d9b0000       0     108       0 r-x--  ld-2.13.so
00007fe29dba6000       0      12      12 rw---    [ anon ]
00007fe29dbcc000       0      16      16 rw---    [ anon ]
00007fe29dbd0000       0       4       4 r----  ld-2.13.so
00007fe29dbd1000       0       8       8 rw---  ld-2.13.so
00007ffff29b2000       0      12      12 rw---    [ stack ]
00007ffff29de000       0       4       0 r-x--    [ anon ]
ffffffffff600000       0       0       0 r-x--    [ anon ]
----------------  ------  ------  ------
total kB          980684     508     100

If the virtual memory size (the Kbytes column) is all 0s for you, don’t worry about it. That’s a bug in Debian/Ubuntu’s -x option. The total is correct, it just doesn’t display correctly in the breakdown.

You can see that the resident set size, the amount that’s actually in memory, is tiny compared to the virtual memory. Your program can access any memory within a billion bytes of 0x00007fe261c6f000, but if it accesses anything past 4KB, it’ll probably have to go to disk for it*.
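
You don't have to take pmap's word for it, either. For illustration, here's a sketch of a helper (the name count_resident_pages and its structure are just for this sketch; it assumes a mapping like random_bytes from the program above) that uses mincore() to count how many pages of a mapping are actually in RAM:

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// returns the number of resident pages in [addr, addr+length), or -1 on error
long count_resident_pages(void *addr, size_t length) {
  long page_size = sysconf(_SC_PAGESIZE);
  size_t pages = (length + page_size - 1) / page_size;
  unsigned char *vec = malloc(pages);
  long resident = 0;

  // mincore fills vec with one byte per page; bit 0 is set if the page is in RAM
  if (!vec || mincore(addr, length, vec) != 0) {
    free(vec);
    return -1;
  }
  for (size_t i = 0; i < pages; i++) {
    resident += vec[i] & 1;
  }
  free(vec);
  return resident;
}

Calling this right after the mmap should report only a handful of resident pages; calling it after reading through the whole file should report essentially all of them.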

What if we modify our program so it reads the whole file/array of bytes?

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
  char *random_bytes;
  FILE *f;
  int offset = 0;

  // open "random" for reading
  f = fopen("/home/user/random", "r");
  if (!f) {
    perror("couldn't open file");
    return -1;
  }

  random_bytes = mmap(0, 1000000000, PROT_READ, MAP_SHARED, fileno(f), 0);

  if (random_bytes == MAP_FAILED) {
    printf("error mapping the file\n");
    return -1;
  }

  for (offset = 0; offset < 1000000000; offset += 4) {
    int i = *(int*)(random_bytes+offset);

    // to show we're making progress
    if (offset % 1000000 == 0) {
      printf(".");
    }
  }

  // at the end, wait for a signal so we can check memory
  printf("\ndone, run `pmap -x %d`\n", getpid());
  pause();
}

Now the resident set size is almost the same as the virtual memory size:

$ pmap -x 5378
5378:   ./mem_munch
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000       0       4       4 r-x--  mem_munch
0000000000600000       0       4       4 r----  mem_munch
0000000000601000       0       4       4 rw---  mem_munch
0000000002271000       0       4       4 rw---    [ anon ]
00007fc2aa333000       0  976564       0 r--s-  random
00007fc2e5ce0000       0     292       0 r-x--  libc-2.13.so
00007fc2e5e6a000       0       0       0 -----  libc-2.13.so
00007fc2e6069000       0      16      16 r----  libc-2.13.so
00007fc2e606d000       0       4       4 rw---  libc-2.13.so
00007fc2e606e000       0      16      16 rw---    [ anon ]
00007fc2e6074000       0     108       0 r-x--  ld-2.13.so
00007fc2e626a000       0      12      12 rw---    [ anon ]
00007fc2e6290000       0      16      16 rw---    [ anon ]
00007fc2e6294000       0       4       4 r----  ld-2.13.so
00007fc2e6295000       0       8       8 rw---  ld-2.13.so
00007fff037e6000       0      12      12 rw---    [ stack ]
00007fff039c9000       0       4       0 r-x--    [ anon ]
ffffffffff600000       0       0       0 r-x--    [ anon ]
----------------  ------  ------  ------
total kB          980684  977072     104

Now if we access any part of the file, it will be in RAM already. (Probably. Until something else kicks it out.) So, our program can access a gigabyte of memory, but the operating system can lazily load it into RAM as needed.
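
You can also hint the kernel instead of waiting for it to react to page faults. A rough sketch (not from the programs above; the helper name prefetch_mapping is just for illustration) using madvise():

#include <stdio.h>
#include <sys/mman.h>

// ask the kernel to start paging `length` bytes of `mapping` into RAM before
// we touch them; the mirror image, MADV_DONTNEED, lets it drop them from RAM
int prefetch_mapping(void *mapping, size_t length) {
  if (madvise(mapping, length, MADV_WILLNEED) != 0) {
    perror("madvise");
    return -1;
  }
  return 0;
}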

And that’s why your virtual memory is so damn high when you’re running MongoDB.

Left as an exercise to the reader: try running pmap on a mongod process before it’s done anything, once you’ve done a couple operations, and once it’s been running for a long time.

* This isn’t strictly true**. The kernel actually says, “If they want the first N bytes, they’re probably going to want some more of the file,” so it’ll load, say, the first dozen KB of the file into memory but only tell the process about 4KB. When your program tries to access memory that is already in RAM, but that it didn’t know was in RAM, that’s called a minor page fault (as opposed to a major page fault, where the kernel actually has to hit disk to load new data).

** This note is also not strictly true. In fact, the whole file will probably be in memory before you map anything because you just wrote the thing with dd. So you’ll just be doing minor page faults as your program “discovers” it.
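
If you want to watch these faults happen, getrusage() reports the counters. Here's a small sketch (my own illustration, not part of the programs above; print_faults is a made-up name) that you could call before and after the read loop:

#include <stdio.h>
#include <sys/resource.h>

// print the calling process's cumulative page-fault counters
void print_faults(const char *label) {
  struct rusage ru;
  if (getrusage(RUSAGE_SELF, &ru) == 0) {
    printf("%s: %ld minor faults, %ld major faults\n",
           label, ru.ru_minflt, ru.ru_majflt);
  }
}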

Tags: pmap, RAM, virtual memory. This entry was posted by Kristina Chodorow on August 30, 2011 at 1:07 pm, and is filed under Linux, Programming.