Thursday, December 13, 2007

Case Project Chapter 3: Memory Management In Linux

Linux kernel is a Unix-like operating system kernel. It is the namesake of the Linux family of operating systems. Released under the GNU General Public License (GPL) and developed by contributors worldwide, Linux is one of the most prominent examples of free software / open source.[3]

The Linux Kernel was initially conceived and assembled by Linus Torvalds in 1991. Early on, the Minix community contributed code and ideas to the Linux kernel. At the time, the GNU Project had created many of the components required for a free software operating system, but its own kernel, GNU Hurd, was incomplete and unavailable. The BSD operating system had not yet freed itself from legal encumbrances. This meant that despite the limited functionality of the early versions, Linux rapidly accumulated developers and users who adopted code from those projects for use with the new operating system.[4] Today the Linux kernel has received contributions from thousands of programmers.

Linux Memory Management

The Linux memory manager implements demand paging with a copy-on-write strategy relying on the 386's paging support. A process acquires its page tables from its parent (during a fork()) with the entries marked as read-only or swapped. Then, if the process tries to write to that memory space, and the page is a copy-on-write page, it is copied, and the page is marked read-write. An exec() results in the reading in of a page or so from the executable. The process then faults in any other pages it needs.

Each process has a page directory which means it can access 1 KB of page tables pointing to 1 MB of 4 KB pages which is 4 GB of memory. A process' page directory is initialized during a fork by copy_page_tables(). The idle process has its page directory initialized during the initialization sequence.

Each user process has a local descriptor table that contains a code segment and data-stack segment. These user segments extend from 0 to 3 GB (0xc0000000). In user space, linear addresses and logical addresses are identical.

On the 80386, linear address run from 0GB to 4GB. A linear address points to a particular memory location within this space. A linear address is not a physical address--it is a virtual address. A logical address consists of a selector and an offset. The selector points to a segment and the offset tells how far into that segment the address is located)

The kernel code and data segments are privileged segments defined in the global descriptor table and extend from 3 GB to 4 GB. The swapper page directory (swapper_page_dir is set up so that logical addresses and physical addresses are identical in kernel space.

The space above 3 GB appears in a process' page directory as pointers to kernel page tables. This space is invisible to the process in user mode but the mapping becomes relevant when privileged mode is entered, for example, to handle a system call. Supervisor mode is entered within the context of the current process so address translation occurs with respect to the process' page directory but using kernel segments. This is identically the mapping produced by using the swapper_pg_dir and kernel segments as both page directories use the same page tables in this space. Only task[0] (the idle task, sometimes called the swapper task for historical reasons, even though it has nothing to do with swapping in the Linux implementation) uses the swapper_pg_dir directly.

  • The user process' segment_base = 0x00, page_dir private to the process.
  • user process makes a system call: segment_base=0xc0000000 page_dir = same user page_dir.
  • swapper_pg_dir contains a mapping for all physical pages from 0xc0000000 to 0xc0000000 + end_mem, so the first 768 entries in swapper_pg_dir are 0's, and then there are 4 or more that point to kernel page tables.
  • The user page directories have the same entries as swapper_pg_dir above 768. The first 768 entries map the user space.
The upshot is that whenever the linear address is above 0xc0000000 everything uses the same kernel page tables.

The user stack sits at the top of the user data segment and grows down. The kernel stack is not a pretty data structure or segment that I can point to with a ``yon lies the kernel stack.'' A kernel_stack_frame (a page) is associated with each newly created process and is used whenever the kernel operates within the context of that process. Bad things would happen if the kernel stack were to grow below its current stack frame.

User pages can be stolen or swapped. A user page is one that is mapped below 3 GB in a user page table. This region does not contain page directories or page tables. Only dirty pages are swapped.

Minor alterations are needed in some places (tests for process memory limits comes to mind) to provide support for programmer defined segments.