Virtual Memory
MIPS gives each program its own 32-bit address space
Programs can access any byte in their 32-bit address space (A: 2^32 bytes = 4GB. A 32-bit address space gives you (theoretically) 4GB of memory you can address. In practice the OS reserves some of it, so it is closer to 2GB of usable space.)
Addresses are always 32 bits in MIPS. The usable size depends on how much memory is installed (e.g., 1GB of RAM means a 30-bit physical address space, 1GB < 4GB). We crash if we try to access more than that 1GB.
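The arithmetic above can be checked directly; this is a small sketch assuming a 1GB-RAM machine as in the example:

```python
# Address-space arithmetic for a 32-bit MIPS machine (sketch).
ADDR_BITS = 32
address_space = 2 ** ADDR_BITS        # bytes addressable by a 32-bit pointer
print(address_space)                  # 4294967296 bytes = 4GB

# Suppose only 1GB of RAM is installed (30-bit physical addresses).
installed_ram = 2 ** 30
# Programs can form addresses the RAM can't back:
assert installed_ram < address_space
```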
Memory fragmentation: as processes are loaded and removed from memory, the free memory space is broken into little pieces. Over time, processes cannot be allocated into these free blocks because the blocks are too small, so the blocks remain unused.
Each program can access any 32-bit memory address
What if multiple programs access the same address? -- sw R2, 1024(R0) will write to address 1024 regardless of the program that's running
They can corrupt or crash each other: security and reliability
If all programs have access to the same 32-bit memory space:
Can crash if less than 4GB of RAM memory in the system
Can run out of space if we run multiple programs
Can corrupt other programs' data
How do we solve this?
Key to the problem:"same memory space"
Can we give each program its own virtual memory space?
if so, we can:
Separately map each program's memory space to the RAM memory(Mapping gives us flexibility in how we use the physical RAM memory) space
And even move it to disk if we run out of memory
Virtual memory is a layer of indirection
Any problem in computer science can be solved by adding indirection.
Virtual memory takes program addresses and maps them to RAM addresses
VM: mapping gives us flexibility in how we use the RAM
2.2.1 not enough memory
Map some of the program's address space to the disk
When we need it, we bring it into memory(Mapping lets us use our disk to give the illusion of unlimited memory)
2.2.2 holes in the address space
We can map a program's addresses to RAM addresses however we like
Each program has its own mapping.
Mapping lets us put our program data wherever we want in the RAM.
2.2.3 keeping programs secure
Program 1's and Program 2's addresses map to different RAM addresses
Because each program has its own address space, they can't access each other's data: security and reliability
Virtual memory: what the program sees
Physical memory: the physical RAM in the computer
What the program uses
In MIPS, this is the full 32-bit address space: 0 to 2(32)-1
What the hardware uses to talk to the RAM
Address space determined by how much RAM is installed
3.4.1 How does a program access memory?
Program executes a load with a virtual address(VA)
Computer translates the address to the physical address(PA) in memory
(If the physical address(PA) is not in memory, the operating system loads the page in from disk and stores it in RAM)
The computer then reads the RAM using the physical address(PA) and returns the data to the program
The map from Virtual Addresses(VA) to Physical Addresses(PA) is the Page Table.
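The translation steps above can be sketched as a toy Page Table; the 4KB page size and the VPN-to-frame mapping here are made-up illustration values:

```python
# Minimal sketch of virtual-to-physical translation.
# The page table maps virtual page numbers (VPNs) to physical frame numbers.
PAGE_SIZE = 4096                     # hypothetical 4KB pages

page_table = {0: 7, 1: 3, 2: 9}      # VPN -> physical frame (made-up mapping)

def translate(va):
    vpn, offset = divmod(va, PAGE_SIZE)
    if vpn not in page_table:
        # In a real system this is a page fault: the OS loads the page from disk.
        raise LookupError("page fault")
    return page_table[vpn] * PAGE_SIZE + offset

print(hex(translate(0x1ABC)))        # VPN 1 -> frame 3, so 0x1ABC -> 0x3abc
```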
So far we have had one Page Table Entry (PTE) for every Virtual Address.
1 word == 4 bytes
We need to translate every possible address:
Our programs have 32-bit Virtual Address spaces
That's 2^30 words that need Page Table Entries (1 billion entries!): 1 entry per word. (If a word doesn't have a Page Table Entry, we can't access it, because we can't find its physical address.)
How can we make this more manageable?
What if we divided memory up into chunks(pages) instead of words?
Coarse-grained: pages instead of words
The Page Table manages larger chunks (pages) of data: each entry maps a chunk (e.g. 4KB) instead of a word
Fewer Page Table entries needed to cover the whole address space
But, less flexibility in how to use the RAM(have to move a page at a time)
Today:
Typically 4KB pages(1024 words per page): In total we have 1 billion words, so 1 billion / 1024 is 1 million Page Table entries. This is much more manageable.
Sometimes 2MB pages (524,288 words per page)
32 bit Virtual Addresses
28 bit Physical Addresses
4kB pages require 12 bits for the page offset
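Splitting a 32-bit virtual address into a virtual page number and a page offset is just bit arithmetic; a sketch with the 4kB numbers above:

```python
# Splitting a 32-bit virtual address with 4kB pages (sketch).
OFFSET_BITS = 12                      # 4kB = 2^12 bytes
VPN_BITS = 32 - OFFSET_BITS           # 20 bits of virtual page number

def split(va):
    vpn = va >> OFFSET_BITS                     # upper 20 bits
    offset = va & ((1 << OFFSET_BITS) - 1)      # lower 12 bits
    return vpn, offset

vpn, offset = split(0xDEADBEEF)
print(hex(vpn), hex(offset))          # 0xdeadb 0xeef

# 20 VPN bits means 2^20 = 1M page table entries per process.
assert 1 << VPN_BITS == 1_048_576
```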
Page Table Entry says the page is on disk
Hardware(CPU) generates a page fault exception
The hardware jumps to the OS page fault handler to clean up
The OS chooses a page to evict from RAM and write to disk
If the page is dirty(has been changed), it needs to be written back to disk first
The OS then reads the page from disk and puts it in RAM
The OS then changes the Page Table to map the new page
The OS jumps back to the instruction that caused the page fault.(This time it won't cause a page fault since the page has been loaded.)
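The evict/write-back/load/remap steps above can be sketched as a toy fault handler; the eviction policy, `Entry` type, and dict-based RAM/disk here are hypothetical simplifications:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    frame: int            # physical frame holding this page
    dirty: bool = False   # has the page been written since it was loaded?

def handle_page_fault(page_table, ram, disk, faulting_vpn):
    # 1. Choose a page to evict (simplest possible policy: first resident page).
    victim_vpn, entry = next(iter(page_table.items()))
    # 2. If the victim is dirty, write it back to disk first.
    if entry.dirty:
        disk[victim_vpn] = ram[entry.frame]
    del page_table[victim_vpn]
    # 3. Read the faulting page from disk into the freed frame.
    ram[entry.frame] = disk[faulting_vpn]
    # 4. Map the new page; the faulting instruction is then re-executed.
    page_table[faulting_vpn] = Entry(entry.frame)

# Tiny demo: one-frame RAM, page 0 resident and dirty, fault on page 1.
pt = {0: Entry(frame=0, dirty=True)}
ram = {0: b"old data"}
disk = {1: b"new data"}
handle_page_fault(pt, ram, disk, faulting_vpn=1)
print(pt, disk[0])        # page 1 now mapped; page 0 written back to disk
```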
In the time it takes to handle one page fault, you could execute 80 million cycles on a modern CPU.
Page faults are the SLOWEST possible thing that can happen to a computer(except for human interaction).
very slow: when you have to page
very good: you don't crash because you ran out of memory
Some systems do not page:
iOS: the OS kills your program if you use too much memory
OS X 10.9: the OS compresses your program first, then pages if it has to
Using virtual memory to prevent programs from corrupting each other
If each program has its own Page Table, then we can map each program's virtual addresses to unique physical addresses
Prevents programs from accessing each other's data
Each program has its own 32-bit virtual address space
Linux defines how that address space is used:
1GB reserved for the kernel. No read/write.(Keeps kernel data mapped)
Program static data at the bottom
heap grows up
stack grows down(and has a fixed maximum size)
the bottom 128MB around zero is unavailable (for I/O)
Random offsets to enhance security
You never know exactly where a certain piece of data/code will be.
Kernel space is shared by all.
Shared data is mapped to both.
Separate mappings allow us to keep programs isolated.
Mappings allow programs to share data.
each process needs its own Page Table
OS makes sure they only map to the same physical address when we want sharing
VM is great:
Unlimited programs/memory, protection, flexibility, etc.
But it comes at a high cost:
Every memory operation has to look up in the page table
Need to access (1) the page table and (2) the memory address: 2x memory accesses. (Programs average 1.33 memory accesses per instruction. This is going to hurt.)
How can we make a page table look up really fast?
software would be far too slow
Perhaps a hardware page table cache?
To make VM fast we add a special Page Table cache: the Translation Lookaside Buffer(TLB)
Fast: less than 1 cycle
Very similar to a cache
To be fast, TLBs must be small:
Separate TLBs for instructions(iTLB) and data(dTLB)
64 entries, 4-way(4kB pages)
32 entries, 4-way(2MB pages)
Good: Page in RAM
PTE in the TLB
excellent
<1 cycle to translate, then go to RAM(or cache)
PTE not in the TLB
poor
20-1000 cycles to load PTE from RAM, then go to RAM(With 1.33 memory accesses per instruction we can't afford 20-1000 cycles very often.)
Bad: Page not in RAM
PTE in the TLB(unlikely)
Horrible
1 cycle to know it's on disk
~80M cycles to get it from disk
PTE not in the TLB
(ever so slightly more) horrible
20-1000 cycles to know it's on disk
~80M cycles to get it from disk
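The hit/miss cases above come from putting a small cache (the TLB) in front of the page table; a sketch with made-up mappings and a trivial FIFO eviction policy:

```python
# Sketch of a TLB as a small cache in front of the page table.
PAGE_TABLE = {vpn: vpn + 100 for vpn in range(8)}   # made-up VPN -> frame map
tlb = {}                                            # tiny cache of recent PTEs
TLB_CAPACITY = 4                                    # real TLBs are ~64 entries

def lookup(vpn):
    if vpn in tlb:                  # TLB hit: <1 cycle
        return tlb[vpn], "hit"
    frame = PAGE_TABLE[vpn]         # TLB miss: 20-1000 cycles to walk RAM
    if len(tlb) >= TLB_CAPACITY:
        tlb.pop(next(iter(tlb)))    # evict the oldest entry (FIFO sketch)
    tlb[vpn] = frame
    return frame, "miss"

print(lookup(3))                    # first access: miss, PTE loaded into TLB
print(lookup(3))                    # second access: hit
```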
Make pages larger
Add a second TLB that is larger, but a bit slower
Have hardware fill the TLB automatically (this is called a "hardware page table walk": the hardware assumes the page table is in a special form in memory, so it can fetch entries on a TLB miss without having to go to the OS)
For 32-bit machine with 4kB pages we need:
1M Page Table Entries(32 bits - 12 bits for page offset= 20 bits, 2^20=1M)
Each PTE is about 4 bytes
4MB total
Not bad
except each program needs its own page table
If we have 100 programs running, we need 400MB of page tables
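The sizing above is straightforward to verify; a sketch using the section's numbers:

```python
# Page-table size arithmetic for 32-bit addresses and 4kB pages.
entries = 2 ** (32 - 12)          # 20 VPN bits -> 1M page table entries
pte_bytes = 4                     # each PTE is about 4 bytes
per_process = entries * pte_bytes
print(per_process // 2**20)       # 4 (MB per process)
print(100 * per_process // 2**20) # 400 (MB for 100 running programs)
```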
And here's the tough part:
We can't swap the page tables out to disk
If the page table is not in RAM, we have no way to access it to find it.
How can we fix this?
just add more indirection...
With multi-level page tables, what is the smallest amount of page table data we need to keep in memory for each 32-bit application?
4kB + 4kB:
We always need the 1st-level page table so we can find the 2nd-level ones.
But, the 1st-level page table only helps us find other page tables. It isn't enough by itself to translate any program addresses. So we need at least one 2nd-level page table to actually translate memory addresses:
One 1st-level page table:4kB
One 2nd-level page table: 4kB
4kB+4kB per application (much better than 4MB per application!)
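A two-level lookup splits the 20-bit VPN into two 10-bit indices, and 2nd-level tables only exist where the program actually uses memory. A sketch with a made-up mapping:

```python
# Sketch of a two-level page table lookup (4kB pages, 10+10 bit VPN split).
OFFSET_BITS, LEVEL_BITS = 12, 10

# L1 index -> 2nd-level table; L2 index -> physical frame (made-up values).
# Only one 2nd-level table is allocated here: 4kB + 4kB total, not 4MB.
level1 = {0: {0: 7, 1: 3}}

def translate(va):
    offset = va & ((1 << OFFSET_BITS) - 1)
    vpn = va >> OFFSET_BITS
    i1 = vpn >> LEVEL_BITS                  # top 10 bits index the 1st level
    i2 = vpn & ((1 << LEVEL_BITS) - 1)      # next 10 bits index the 2nd level
    level2 = level1[i1]                     # missing L1 entry => no L2 table
    return level2[i2] * (1 << OFFSET_BITS) + offset

print(hex(translate(0x1ABC)))               # VPN 1 -> i1=0, i2=1 -> frame 3
```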
VIPT
Idea:
Look up in the cache with a virtual address
verify that the data is right with a physical tag
VIPT: Virtually Indexed, Physically Tagged
Data in the cache is indexed by the virtual address
But tagged by the physical address
We only get a hit if the tag matches the physical address, but we can start looking with the virtual address
TLB translation and Cache lookup at the same time
Use virtual page bits to index the TLB
Use page offset bits to index the cache
TLB->Physical Page
Cache->Physical Tag
Physical Page=Physical tag->Cache hit!
Fast: look in the TLB at the same time as the cache
Safe: Cache hit only on PA match
But, can only use non-translated bits to index cache(limit on how large the cache can be)
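The parallel lookup above can be sketched in (sequential) code; the TLB contents, cache contents, and 16-byte line size are made-up illustration values:

```python
# Sketch of a VIPT lookup: index the cache with untranslated page-offset bits
# while the TLB translates the VPN, then hit only if physical tags match.
OFFSET_BITS = 12
tlb = {1: 3}                          # VPN -> physical frame (made up)
cache = {0xAB: (3, b"hello")}         # cache index -> (physical tag, data)

def vipt_load(va):
    offset = va & ((1 << OFFSET_BITS) - 1)
    index = offset >> 4               # 16-byte lines: offset bits pick the set
    frame = tlb[va >> OFFSET_BITS]    # in hardware this happens in parallel
    tag, data = cache.get(index, (None, None))
    return data if tag == frame else None   # hit only on physical tag match

print(vipt_load(0x1ABC))              # index 0xAB, frame 3 -> cache hit
```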
need to combine the TLB and the cache for good performance
physical caches: require translation first(slow)
Virtual caches: use virtual addresses(no translation: fast, but protection)
Virtually-Indexed, Physically-Tagged(VIPT) caches: use VA for index and PA for tag