Tuesday, April 14, 2020

MMU and addressing


Memory management in modern CPUs relies on a hardware component called the Memory Management Unit, better known by its acronym: the MMU. This component controls access to memory address ranges.
There are several implementations of the MMU. We will deal with the general concept of the MMU, without referring to any specific system architecture.
The provided exercises and examples are intended to help you gain a thorough understanding of the subject through practice.



Why use an MMU?


The key answers are concurrency, virtualization and isolation.

Concurrency and Virtualization.

  1. Modern operating systems provide the illusion of running many programs at the same time. This behaviour is called "concurrency".
  2. Another illusion they provide is that every program "sees" all the memory ranges available in your computer. This illusion is called "virtualization".
    • But here is the catch - assume that your computer uses only 32 address bits, providing access to 2^32 bytes = 4 GiB of addresses, and that the system runs 100 programs. That means 400 GiB (100 * 4) of addresses are needed to allow this. So, some "magic" has to be done to make it possible.

Isolation.

  1. The operating system ensures that a program has no permission to access memory used by another program - unless this permission is granted, by explicitly calling specific functions.
  2. As said before, a program can "see" 4 GiB - but it does not have access to all of it. The operating system itself, on the other hand (remember - it is also a program), has full access everywhere.
This blog will reveal the way the MMU resolves these issues in a methodical way - lesson after lesson, with supplementary exercises for registered readers. You are all welcome to enjoy it and provide useful comments.

 

 



Lesson 1: MMU and address ranges.

 

Physical addresses in a nutshell.


Generally, every hardware component that is not part of the CPU itself has an address range within a board's physical address map. Such components include the SDRAM, the Ethernet controller and the timers.
The diagram below presents the physical address map of a typical System on Chip (SoC) in which every hardware component has a fixed address range. 


Figure 1 - Complete theoretical board's physical address map

 


As you can see from the diagram, the complete board's physical address map can be seen as a sequence of contiguous "Physical Pages". This is exactly how the MMU sees it.
You may also notice that four totally different hardware components share the same physical page. Manufacturers try to avoid this, but we are still in the theoretical phase - so it is acceptable.


Virtual addresses in a nutshell.



Another thing clearly seen in figure 1 is that the SDRAM uses a relatively small number of Physical Pages - but programs are not aware of that. On the contrary, they "see" more memory than the size of the SDRAM. This memory is called "Virtual Memory".

Every program is either loaded into and run from a Physical Address in RAM, or runs directly from media that allow it. However, the program does not "know" this, unless it issues a specific operating system call, which is beyond the scope of this course. The program has no notion of Physical Addresses.

When a program uses a pointer, it actually uses a "key". Moreover, every address the program uses to store and fetch data is a key as well. This key is called a "Virtual Address".

A Virtual Address is neither a hardwired address nor an address that can be accessed through a chip-select mechanism or through the data lines. It is a purely logical address.

When a program accesses a Virtual Address, a transparent translation to a Physical Address is handled by the MMU.




Constructing an address value in a paging system.


What we are still missing is an understanding of how an address is constructed from pages.
We can see pages as blocks. Therefore, every address can be calculated by adding an offset to the address of its block. This applies to both Virtual and Physical addresses. A block used in Physical Addressing is the "Physical Page" we have already discussed; in the Virtual Addressing case, it is the "Virtual Page".





Figure 2 - Layout of Addresses in a paging system






The above figure assumes that you are all familiar with Binary and Hexadecimal Numeral Systems.



It is extremely important to understand that the number of bits assigned to the offset field is the same for Virtual and Physical addresses.

On the other hand, the number of bits assigned to the Page Block Address and the "Index" may differ. Intel's XScale-based 81348 I/O processor is an example of that - a 32-bit Virtual Address can be translated into a 36-bit Physical Address.

  • The Index part of the Virtual Address points to the memory location of a "token" that the MMU uses to make the translation.
  • The offset value can vary between 0 and 2^(number of bits assigned to the offset) - 1.
  • During the translation, only the bits assigned to the Index are replaced - by the most significant bits of the Physical Page Block Address.
Presentation 1 demonstrates the calculations made by the MMU in order to perform the translation.




Presentation 1: Example of Virtual to Physical Address translation.

 

 

Tokens.


For now, all you need to know is that the MMU "senses" an array of "Tokens". This array is constructed (or modified) by one of the following:

  • operating system code that loads a new program.
  • the boot loader, before loading the operating system startup code.
This is done by filling the array with tokens (or modifying them) and then issuing a specific low-level CPU instruction (in assembly language) to notify the MMU that it should use the updated array's contents.

There are basically two types of tokens:
  • Indirect - the token may:
    • Map one entry in an array of indirect tokens to an entry in an array of direct tokens.
    • Map one entry in an array of indirect tokens to an entry in another array of indirect tokens.
  • Direct - the token maps one entry in an array of direct tokens to one Physical Page Base Address.

For the sake of simplicity, in this lesson we assume the following:
  • Our theoretical system uses direct tokens, while "real life" systems use indirect tokens as well.
  • The direct token's location is calculated from the value in the "Index" part of the Virtual Address.



The direct token is a 32-bit value whose bit fields represent the following:
  • The most significant bits of the Physical Page Base Address it is associated with.
  • An indication of the size of the Physical Page.
  • An ID of the program that uses it. This ID plays a major role in writing secure code.
  • The operations and access types that the associated program is permitted to use.




Presentation 2: Examples of direct tokens for a system with 32bit Physical Address Range.



Loading and securing our code using MMU.


By default, once the boot sequence has finished, a programmer can access the board's entire physical address map - as long as the MMU has not been initialized. In fact, at initialization time there is no such thing as a Virtual Address. All addresses are physical - and all are accessible, as long as a "C" pointer is big enough to contain the value of the physical address.

However, in modern multitasking operating systems - this is exactly what we want to avoid. 

We want to restrict one program's access to the memory used by another program.
This cannot be achieved with software conditional statements; instead, we need hardware to help us. That piece of hardware is the MMU.

Setting up the array:


  • The array MUST be placed in a contiguous physical memory area in RAM (let's assume the SDRAM - though it is not always the case). This means it should be "seen" by the CPU as one block with a start physical address and a size.
  • System software fills this array with tokens and then sends a CPU-specific command to the MMU to start "sensing" it.
  • There may be two types of systems:
    • Tiny system
      • The operating system uses one array, with each program assigned a portion of it, and each "token" also contains the associated program ID.
    • Full scale system
      • The operating system may reinitialize the MMU at run time as many times as it needs. 
      • It does it by filling and modifying different arrays of "tokens". At run-time, only one array is loaded to the CPU internal non-addressable memory.
      • All the "tokens" in each array are associated with the same program ID. 



Use case 1 : Tiny system - Memory allocation and mapping through MMU on a system with one array of tokens.

 

Consider the following scenario:

  • The first two physical pages are used by hardware components such as Ethernet, UART, etc. The RAM starts at Page 3.
  • Program1 is loaded into a Virtual Page that is translated to Physical Page 3.
  • Program2 is loaded into a Virtual Page that is translated to Physical Page n-4.
  • Program1 allocates two physical pages from the SDRAM by calling the function malloc or its equivalent.
    • In return, it gets a virtual address, associated with a sequence of locations in an array of tokens. The associated tokens are (n-4) and (n-3), which translate to Physical Pages (n-1) and (n-3) respectively.
    • Program1 has full access to the allocated physical pages. 
  • Program2 issues a "mapping" to read from the same set of physical pages already allocated by Program1 - without permission to modify their contents.
    • In return, it gets a virtual address - associated with another set of successive tokens, (n-1) and (n), which in turn point to Physical Pages (n-1) and (n-3) respectively.

The presentation below demonstrates the whole process.





Presentation 3: Example of allocation and mapping on tiny systems.


Use case 1 - Summary:

 

  1. There is one array of tokens in the system, that the MMU should "know" about.
  2. Each Page (either Virtual or Physical) is associated with one token.
  3. There may be tokens that are not associated with a Physical Page. We will deal with this in Lesson 3 - MMU and exception handling.
  4. The number of successive tokens is determined by the allocation size.
  5. Successive tokens may not correspond to successive Physical Pages, but they do correspond to successive Virtual Pages.
  6. The programs don't "know" which Physical Addresses are used - since they "hide" behind the assigned tokens.
  7. The access type for each physical page is determined by the token which is selected and set by either the allocating or the mapping call. Both the selection and the setting of the token are part of the internal implementation of these calls, and are hidden from the program.
  • Read only.
  • Full Access.
  • Extended Access.  We will discuss it while dealing with caches, CPU privilege modes and domains.
  • No Access.  

Limitations exposed by use case 1:

  1. The size of physical memory that can be assigned to each program depends on the usage of the array.
  2. Using one array for all the programs means that whenever a program is unloaded or replaced, an expensive task of scanning and clearing "tokens" takes place.
  3. Finding a Virtual Address range for a new program becomes a very complex procedure, since as time goes by it gets harder to find a chunk of successive tokens.


Use case 2: General system - Memory allocation and mapping through MMU on a system with each task associated with one array of tokens.

 

Consider the following scenario:

  • The first two physical pages are used by hardware components such as Ethernet, UART, etc. The RAM starts at Page 3.
  • Program1 is associated with array1 and allocates a small initial virtual memory portion associated with token n - linked to physical page n-2.
  • Program2 is associated with array2 and allocates a small initial virtual memory portion associated with token n - linked to physical page 4.
  • Program2 issues a "mapping" to read from a virtual address corresponding to Token (n-1), directly linked to the physical page which was assigned to Program1.
  • Program2 allocates virtual memory associated with tokens (n-4) to (n-2). Token (n-4) is linked to physical page (n-1), Token (n-3) is linked to physical page (n-3) and Token (n-2) uses a "tentative" link to physical page (n-4).
    • "Tentative" means that the operating system provides this link as a hint, and the "real" link will be established only when the program needs to access the virtual memory location corresponding to this "Token".

The presentation below demonstrates the whole process.



Presentation 4: Example of allocation and mapping on general systems.


Use case 2 - Summary:

  1. There is one array of tokens per program, that the MMU should "know" about.
  2. Each Page (either Virtual or Physical) is associated with one token.
  3. There may be tokens that are not associated with a Physical Page. We will deal with this in Lesson 3 - MMU and exception handling.
  4. The number of successive tokens is determined by the allocation size.
  5. Successive tokens may not correspond to successive Physical Pages, but they do correspond to successive Virtual Pages.
  6. The programs don't "know" which Physical Addresses are used - since they "hide" behind the assigned tokens.
  7. The access type for each physical page is determined by the token which is selected and set by either the allocating or the mapping call. Both the selection and the setting of the token are part of the internal implementation of these calls, and are hidden from the program.
  • Read only.
  • Full Access.
  • Extended Access. We will discuss it while dealing with caches, CPU privilege modes and domains.
  • No Access.

Use case 2 vs use case 1:

  1. This use case requires a lot of maintenance by the operating system, which has to juggle all the arrays.
  2. On the other hand, the limitation of finding "free" tokens is not as hard as in use case 1.
  3. Operating systems generally work according to use case 2.

 

What next:



In order to proceed, let's replace some of the terms used so far with new ones: the term "array" will be replaced by the term "Page Table", and the term "Token" will be replaced by the term "Page Table Entry", or "PTE".
In addition, we used the term "programs" throughout this discussion. For now, we only distinguish between "programs" and the "operating system". However, there are many types of programs - some use isolation and some don't.
The next lesson will use the new terms.


     







