* Most of the material is taken from this course on Brilliant.
Hexadecimal Representation
- Base 16 is referred to as hexadecimal, or hex for short.
- Convention: Prefix binary numbers with 0b and hexadecimal numbers with 0x (e.g. 45 is 0b101101 in binary and 0x2D in hexadecimal).
- Concatenating numbers in hex gives the same result as concatenating them in binary, provided each hex digit is written out as a full 4-bit binary group, including leading zeros.
Linear Memory Model
- Memory can be understood as a long tape of cells, and the program can read from or write to cells.
- We need two concepts: 1) Address, an integer that specifies a place on the tape, 2) Data, the information stored in each cell
- In modern computer architectures, one memory address points to one byte (i.e. each cell can contain 1 byte of information).
- The beginning address of memory is determined by the OS.
- When working with memory, sizes that are powers of 2 are often used (e.g. MiB = 2^20 bytes and GiB = 2^30 bytes, instead of MB = 10^6 bytes and GB = 10^9 bytes).
- If a computer has 1 MiB of memory and the beginning address of the memory is 0xC00000, the address of the last byte of memory is 0xCFFFFF
(0xC00000 + 0x100000 = 0xD00000 and we have 0xD00000 - 1 = 0xCFFFFF).
- Since one byte can hold values no larger than 255, we often work with integers that span multiple bytes to hold larger numbers.
For example, we need at least 2 bytes of memory to represent the number 43981, which is 0xABCD in hexadecimal (i.e. 0xAB in one byte and 0xCD in another consecutive one).
- The order in which multi-byte integers are stored depends on the computer's architecture. Storing the most-significant byte first is called big-endian. Storing the least-significant byte first is called little-endian.
- The entire memory can be considered as a giant 1-dimensional tape of bytes. However, to make it easy to work with, it is often wrapped across multiple lines such that there are many bytes per line (often 16),
and the memory address of the first byte on each line will be specified in a column to the left of the resulting table.
This method of displaying memory is called a hex dump and is quite common in debuggers, packet sniffers, hex editors and the like.
Memory of Programs
- The compiler reads source files, compiles them, and outputs a file called an executable file which contains instructions and information that are specific to both the OS and the CPU architecture.
- There are two essential pieces of information that are stored in the executable file: machine code and data.
Machine code is the compiled version of functions, together with arguments and local variables defined in the function. It is a set of instructions that will be executed by the CPU. Each instruction is a basic operation, such as reading from or writing to memory, adding or subtracting numbers, etc.
Data is the compiled version of global variables together with their initial values (0 if not initialized, at least in C) and constants.
- Memory segments
- stack: Allocated memory for local variables. One type of stack is the call stack, where each stack element is a stack frame. A stack frame contains information for one function call. It consists of arguments, local variables, and return address. Stack frames are allocated before the function runs, and removed from the call stack after the function call finishes.
  - heap: Memory dynamically allocated by malloc and similar functions. When we want a dynamically resizable data structure, or want to free unused memory before the function exits, etc., we use malloc and free to do this.
    malloc is a function that allocates a certain amount of memory that we specify, e.g. int *x = (int *)malloc(sizeof(int) * 100); This allocates a contiguous chunk of memory of (100 times the size of an int) bytes and assigns its address to the pointer variable x.
- static: Allocated memory for global variables. This segment cannot be resized and the memory cannot be freed while the program runs.
- code: Allocated memory for machine code that corresponds to functions.
Virtual Memory
(assuming one CPU core)
- Each time an executable file is executed, the OS creates a process, allocates memory, and lets the process run.
Suppose you are typing in the text editor and the music player is playing.
It looks like the text editor and music player are running at the same time in parallel.
However, they are actually being switched back and forth very frequently.
The OS gives each process a certain amount of time to run, then pauses it and switches to another process. This feature is called multitasking.
- Each process has a virtual memory space. This is a memory address space that is isolated from other processes.
This means that a process can access its own virtual memory space, but not the memory space of another process.
Each virtual memory space contains the four segments we learned about: code, stack, static, and heap.
Memory addresses for virtual memory spaces are called virtual addresses. Physical addresses are accessible to the OS but not to processes.
When processes access memory, the memory management unit (MMU) does the virtual-to-physical memory address mapping.
- A memory page is a chunk of memory that has a fixed size (most common size is 4 KiB).
Techniques for Performance
- It is often the case that files which have been accessed recently will be accessed again. Therefore, keeping those files in RAM will improve performance, since memory access is faster than disk access.
- In order to keep the recently accessed files in memory, the OS uses the page cache, which is an in-memory copy of files.
- Each file in the page cache is managed in pages, which are 4096-byte chunks of memory.
Caching
- There are two basic types of memory units that are used in integrated circuits: SRAM (static random access memory) and DRAM (dynamic random access memory).
Typically, one SRAM cell uses six transistors to store one bit of data. One DRAM cell uses one transistor and one capacitor to store one bit of data.
SRAM is faster, but more expensive, so DRAM is used as the backing store and SRAM as the cache.
- Instead of having one large caching layer, modern CPUs have multiple layers of caches (L1 cache and L2 cache).
The memory address translation (performed by the MMU) is done at the L1 cache. Therefore the L2 cache can simply work with physical addresses directly.
The L1 cache is usually separated into two parts: I-cache which stands for “instruction cache”, and D-cache which stands for “data cache”.
Machine code instructions are accessed sequentially, while data around the top of the stack segment are accessed repeatedly, but not sequentially.
Therefore, instructions and data have a different locality of reference pattern, and separating the two increases the cache hit ratio.
The L2 cache, on the other hand, does not separate the two.