Operating systems from scratch; level 2 (younger half)

  • Tutorial

In this part we will write a memory manager in order to unlock the use of Vec , String , HashMap and all that. Immediately after that, we implement the FAT32 file system and connect the driver for EMMC (such a thing for communicating with SD cards). In the end there will be a couple of new teams in our shell: cd , pwd , cat , ls .


Zero Lab


First Lab: Younger Half and Older Half


Usefulness



Phase 0: Getting Started


To begin with, we make sure that nothing has changed in our environment from the very beginning:


  • There Linux, BSD or macOS
  • It is 64 bit
  • You have a usb port

Also found this: git , wget , tar , screen , make and all that was in the last series. We are convinced of stability and move on.


Obtaining the required code


We clone cs140e this repository into our daddy by name cs140e (or as you called it there):


git clone https://web.stanford.edu/class/cs140e/assignments/2-fs/skeleton.git 2-fs

After all this, there should be something like this inside:


cs140e
├── 0-blinky
├── 1-shell
├── 2-fs
└── os

Now you need to merge the repository os , in which there is some of your code, with the code that will be used in this part:


cd os
git fetch
git checkout 2-fs
git merge master

You may have to resolve merge conflicts. You may see something similar to:


Auto-merging kernel/src/kmain.rs
CONFLICT (content): Merge conflict in kernel/src/kmain.rs
Automatic merge failed; fix conflicts and then commit the result.

Automatically failed. Will have to hands. In order to resolve this, you will need to manually change it kmain.rs . Be sure to save all the changes from the previous series. To resolve conflicts, you need to add fixed files using git add and git commit . Any useful info on resolving merger conflicts and generally about git can be found in the githowto.com git guide . You can go through this tutorial. Anything will come in handy.


Firmware update


It is required to download the firmware again using the command make fetch from the turnip 2-fs . The team itself will download and unpack everything. It remains for us to put the files from the subdirectory files/ to the root of our microSD card. Namely files firmware/bootcode.bin , firmware/config.txt , firmware/fixup.dat and firmware/start.elf . Do not forget that if you use the bootloader from the previous part in quality kernel8.img , then you need to add this line to config.txt :


kernel_address=0x4000000

Install ttywrite


Now Makefile daddy os/kernel has an additional goal install . She collects the core and sends it directly to the raspberry using ttywrite from the last part. Accordingly, if the raspberry kernel8.img contains a bootloader written in the previous series, this very bootloader will accept and load our file with the kernel. In order to start this, we need to execute make install in the daddy with the kernel code.


At the same time, this target from ttywrite directly Makefile calls ttywrite just by name. Those. ttywrite must be in one of the locations indicated by the environment variable PATH . In order to cram ttywrite into one of these places we can go in 1-shell/ttywrite and perform cargo install . Our utility will collect and copy to the right place. If everything is correct, then we can call ttywrite --help and it will give us help.


By default, make install uses /dev/tty.SLAB_USBtoUART for the device name. Edit Makefile so that it indicates what you need. Those. What device do you have for your adapter? There is a variable with a name PI_TTY . So give it a different meaning.


ALLOCATOR.initialize() вызывает панику!
Наша оболочка должна была остаться работоспособной. Однако если вы попробуете протестировать цель make install , то обноружите, что оболочка не работает. Скорее всего дело в вызове ALLOCATOR.initialize() . Там нет распределителя памяти. Только заглушка, которая просто паникует при выполнении безо всяких предупреждений об этом. Чуть позже этот достадный факт мы исправим. Пока можно просто закоментировать эту строчку, чтоб всё более или менее, но работало.

Phase 1: Memory Line



In this phase, we will implement two memory allocators. The simplest bump allocator and a slightly more complex bin allocator. That’s practically all we need for memory allocation on the heap to work. In other words Vec , Box they String will work. In order to determine the amount of available memory, we will use ARM tags ( ATAGS ). In addition, we need to implement the internals of the function panic_fmt that is ultimately called when called panic! .


Subphase A: Panic!


Let's start with the implementation panic_fmt . We will work with the file kernel/src/lang_items.rs .


Language Items


When the Rust compiler is configured to compile programs for purposes without an operating system (such as our Raspberry Pi), the compiler asks us for a couple of functions. We need to write them with our own hands. Such things are called language items. These elements are functions whose calls the compiler substitutes under certain conditions. We can define such functions for any elementary language. To do this, we need to annotate such a function with an attribute #[lang_item] .


Rust currently requires us only two of these functions:


  • panic_fmt: (fmt: ::std::fmt::Arguments, file: &str, line: u32, col: u32) -> ! which is called when called panic! . Arguments panic! are passed as a parameter fmt . In addition, the file name, line and column numbers of the place where it was called are transmitted panic! . This is appropriate file , line and col .
  • eh_personality : This function depends on the specific OS or ABI. It is called upon stack unwinding or after abort, if necessary. Usually all this happens when panic! you exit the stream. We will not implement this function.

Both of these features are already declared in kernel/src/lang_items.rs . We need to add a function panic_fmt so that it displays something useful.


Implementation panic_fmt


Now we will add this function. Our implementation should output the transmitted information to our console, and then go into an endless loop loop . By the way. fmt::Arguments implements trait Display . Hence we can use kprintln!("{}", fmt) . Make a conclusion as you personally like it. For example, you can be inspired by the panic of the Linux kernel :


            (
       (      )     )
         )   (    (
        (          `
    .-""^"""^""^"""^""-.
  (//\\//\\//\\//\\//\\//)
   ~\^^^^^^^^^^^^^^^^^^/~
     `================`

    The pi is overdone.

---------- PANIC ----------

FILE: src/kmain.rs
LINE: 40
COL: 5

index out of bounds: the len is 3 but the index is 4

Then test the implementation with panic_fmt some kind of kernel panic. I remind you that you can use it make install for compilation with subsequent execution through the bootloader. And yes. ALLOCATOR.initialize() it causes panic! somewhere inside. So when you run it, you should also see an error message.


In addition to this, try to cause panic in many ways. Using panic!() , using other similar macros. And something else. When you are sure that your implementation is working for yourself, proceed to the next phase.


Subphase B: ATAGS


In this subphase, we gash the iterator over the ARM (ATAGS) tags that the raspberry firmware provides us. We will use the iterator in order to determine the amount of available memory. The main work will be done in the directory pi/src/atags and in the file kernel/src/allocator/mod.rs .


ARM Tags


ATAGS or ARM tags is a mechanism used by the ARM bootloader and firmware in order to transfer various information to the kernel of the operating system. Linux can also be configured to use ATAGS.


Malinka places an array of ATAG structures starting at the address 0x100 . In addition, each tag begins with an 8-bit header:


struct AtagHeader {
    dwords: u32,
    tag: u32,
}

The field dwords contains the size of the entire tag including the title itself. The size is in double words (double words, i.e. 32-bit word). The minimum size is 2 (in the size of the header). The field tag contains type ATAG. There are about a dozen of these in ATAGS Help . Malinka uses only four of them. They are interesting to us:


Name Type ( tag ) The size Description
CORE 0x54410001 5 or 2 (if empty list) Used as a starter
NONE 0x00000000 2 Signifies the end
Mem 0x54410002 4 Describes a piece of physical memory
CMDLINE 0x54410009 different Passing arguments to the kernel

The type of tag determines how we will interpret the data after the header. By clicking on the links that are tied to names, you can find out more. For example, a tag MEM contains something like the following:


struct Mem {
    size: u32,
    start: u32
}

Tags lie in memory sequentially without any filling between them. This list starts with a tag CORE . And the last tag on this list is this NONE . Everything else can be in any order. In order to find out where the next tag begins, you need to look at the contents dwords . Graphically, it all looks something like this:


ATAGS


Union and Security



Raw structures for ATAG tags are declared in the file pi/src/atags/raw.rs . They are used there union . Joins in Rust are almost identical to joins in Neat C. They define a structure in which some fields share memory.


pub struct Atag {
    dwords: u32,
    tag: u32,
    kind: Kind
}

pub union Kind {
    core: Core,
    mem: Mem,
    cmd: Cmd
}

In other words, associations allow you to write arbitrary structures into memory without taking into account the correct choice. It turns out that access to specific union fields is not safe in Rust.


There are already wrappers unsafe in many places. At least you don’t have to worry about how to contact the associations yourself. However, passing associations to the end users of the library is a bad idea. For this reason, there is another structure Atag in the file pi/src/atags/atag.rs . This structure is completely secure for access. It is we who will pass it to the external code. In the process of adding a module, atag you will write a transformation between the internal representation and this structure.


Почему передача объединений конечным пользователям будет плохой идеей? [enduser-unsafe]

Мы прилагаем достаточно много усилий для создания безопасного интерфейса для небезопасных структур. Вы ещё не один раз увидите подобное в Rust. Стандартная библиотека тоже хороший пример этого. Какая же польза от создания безопасной прослойки? Можем ли мы перенести подобный подход на такие языки, как Няшный Си?

Command line arguments


The tag CMDLINE deserves extra attention. This tag is declared this way:


struct Cmd {
    /// The first byte of the command line string.
    cmd: u8
}

According to the comment, the field cmd contains the first byte of the string. In other words &cmd , it is a pointer to a C-like string, which eventually ends with a null byte. The safe version of this tag is Cmd(&'static str) . When you convert from raw -version to safe, you will need to determine the size of this line. Those. find the null terminator (character with code 0 ). After that, you can use the address and size to convert this to a slice using slice::from_raw_parts() . In turn, this slice can be converted to a string using str::from_utf8() or str::from_utf8_unchecked() . You have already used them in Lab 1.


Implementation atags


Well. Now we have everything we need in order to implement the module atags that lies in pi/src/atags . Start by implementing raw::Atag::next() from atags/raw.rs . This method determines the address of the next ATAG, after the current one. Here you have to resort to the unsafe code. Then implement helper methods and properties to convert from raw -structures to safe ones, which can be found in atags/atag.rs . When implementing, you From<&'a raw::Cmd> for Atag will have to use a bit of unsafe code. At the end, implement the trait Iterator for Atags , which can be found in atags/mod.rs . Here, a bit of unsafe code may also be required unsafe .


Подсказки:

Вы можете (по крайней мере попытайтесь!) реализовать метод Atags::next() в три строчки кода.

Вы можете преобразовать из x: &T в *const u32 при помощи x as *const T as *const u32 .

Обратное преобразование из x: *const T в &T можно выполнить при помощи &*x .

Вы можете выполнять арифметические операции над сырыми указателями при помощи add() sub() или offset .

Testing atags


Test your own implementation of ATAGS. Try to get all the values ​​through an iterator and print the whole thing to the console. Straight in kernel/src/kmain.rs . You should see at least one of three tags other than NONE . At the same time, make sure that the implementation of each ATAG meets your expectations. Once the implementation is complete, proceed to the next subphase.


Подсказка : Используйте `{:#?} для более красивого вывода структур на консольку.



Что содержат теги с типом CMDLINE ? [atag-cmdline]

Есть ли какая либо ценность в этих тегах? Какие аргументы прошивка вам передала? Как вы думаете, откуда это всё взялось и как это можно использовать?



Сколько памяти вам доступно согласно тегу MEM ? [atag-mem]

Каков точный начальный адрес и размер доступной памяти, который сообщается в теге MEM ? Насколько всё это близко к 1 гигабайту памяти, которая имеется у малинки?

Subphase C: Warming Up


In this subphase, we will prepare everything necessary for the subsequent writing of two memory allocators. We implement align_up and functions align_down that align addresses to powers of two. In addition, we implement a function memory_map that returns the start and end addresses from system memory. This function will be used by both allocators to determine how much memory is available for allocation.


Alignment


An address in memory is called N-aligned when it is completely divisible by N . In other words, for the address k will be true k % n == 0 . Usually we don’t need to worry about aligning our memory. However, we are now system programmers. Our increased attention to this topic is related to the fact that hardware, protocols, and all that, prescribe quite certain properties for themselves in terms of alignment. For example, the 32-bit ARM architecture requires the stack pointer to be 8-byte aligned. The AArch64 architecture (our case) requires alignment for a stack pointer of 16 bytes. x86-64 requires roughly the same. Addresses of pages of memory should be aligned on 4 kilobytes. There are many more different examples, but we can already see that we cannot do without it.


In Cute C, the memory addresses returned by the allocator from libC will be guaranteed to be 8-byte aligned on a 32-bit system and 16 bytes on a 64-bit system. In addition, the caller cannot specifically control the alignment of the returned memory address and must take care of itself. For this, there is, for example, a function posix_memalign from the POSIX standard.


Почему для Няшного Си выбраны именно такие свойства выравнивания? [libc-align]

Выбор гарантии в 8 или 16 байт для выравнивания для сишной функции malloc небезоснавателен. Почему в стандартной библиотеке Си были выбранны именно такие гарантии?

In nyashnye Xi ads malloc() and free() look like so:


void *malloc(size_t size);

void free(void *pointer);

Rust uses alloc and dealloc , which look like this:


// `layout.size()` is the requested size, `layout.align()` the requested alignment
unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>;

// `layout` should be the same as was used for the call that returned `ptr`
unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout);

Note that the calling code may indicate alignment. As a result, the allocator, and not the calling code, must take care to return a aligned pointer. When you implement memory allocators in the following sub-phases, you should make sure that the return address is correctly aligned.


In addition, it is worth canceling, which dealloc , unlike from free Nyashnoye, requires that the caller transmit Layout exactly the same as for alloc . Therefore, external code must take care of storing size and alignment for a specific allocated memory.


Как вы думаете, почему Rust разделяет обязанности между аллокатором и вызывающим кодом именно так? [onus]

В Няшном аллокатор имеет меньше ограничений на выравнивание адресов памяти, которые он может возвращать. Но при этом он должен сам хранить размер выделенного пространства. В Rust всё ровно наоборот. Как вы думаете, почему Rust выбрал противоположный путь? Какие приемущества это даёт для аллокатора и для вызывающего кода?

Goodies: align_up and align_down


When you implement allocators in the following sub-phases, it will be useful for you to determine the next or previous aligned address. Those. for the u next address >= or <= u , which will be aligned in powers of two. There are already (unrealized of course) align_up and align_down . You can find them in the file kernel/src/allocator/util.rs . They are declared in approximately this way:


/// Align `addr` downwards to the nearest multiple of `align`.
/// Panics if `align` is not a power of 2.
fn align_down(addr: usize, align: usize) -> usize;

/// Align `addr` upwards to the nearest multiple of `align`.
/// Panics if `align` is not a power of 2.
fn align_up(addr: usize, align: usize) -> usize;

Release these features right now. You can verify the correctness of this process by calling make test or cargo test from a directory kernel . The tests themselves can be found in the file kernel/src/allocator/tests.rs . If everything is correct in this part, then all tests from align_util must pass.


Внимание : в процессе тестирования kprint{ln}! превращаются в вызовы print{ln}! , так что всё должно работать.

Подсказки:

Реализация будет занимать одну или две строки.

Реализации align_up и align_down будут очень похожи между собой.

Thread safety


Memory allocators like malloc() from libC and our two, which we will implement, are global . Those. they can be called anywhere and anytime. Including in any stream. Therefore, allocators must be thread safe. Rust takes thread safety very seriously. For this reason, it is difficult to implement an allocator, which will not be such, even if our system still does not have a mechanism for servicing concurrency (for example, threads).


The thread-safe memory allocators topic is quite extensive. A lot of research work can be found on this topic. At the moment, we will not touch on this topic. Just wrap our memory allocator in Mutex . This wrapper is already in kernel/src/allocator/mod.rs . Read all this code now. Please note that there is an implementation of the Alloc trait there . It is thanks to this that Rust will know that this is quite a valid allocator. The implementation of this trait is required for such things as #[global_allocator] (c kmain.rs ). Sobsna #[global_allocator] is an annotation that is added to a statically declared allocator to tell Rust to use it for all of these Vec , String and others. Box . Those. all calls alloc() and dealloc() will be redirected there.


Switch between allocator implementations


The implementation Alloc for Allocator from kernel/src/allocator/mod.rs simply redirects its calls to imp::Allocator after it deals with the lock mutex . In this case, the module is imp virtual . It does not lead to any file in the file system tree. Instead, it uses an annotation to sort of #[path = "bump.rs"] tell Rust where to actually find it. This allows us to switch between the two implementations simply by assigning a different path for #[path] . First, we use the bump allocator from the file bump.rs . Then you should switch to the bin allocator from the file bin.rs .


Goodies: memory_map


The last piece in the file kernel/src/allocator/mod.rs is the function memory_map . This function is called from Allocator::initialize() , which will then be called from kmain() . This one initialize() creates an instance of the structure imp::Allocator and places it inside our wrapper.


The function memory_map returns the starting and ending addresses of free memory in the system. Please note that free memory and all available memory are not the same thing. Those. we need to define it all using ATAGS. At the same time, as the beginning of the free memory section, we take the end of the memory used for kernel code. The already declared variable is responsible for this binary_end . It should point somewhere where it points _end (declared in layout.ld ).


Implement memory_map using Atags subphase B and the variable binary_end . Make sure that this function returns the correct value. After that, try calling something like String::from("Hi!") . Or any other way to allocate memory on the heap of your choice. Make sure that it will be called panic!() from the bump allocator. If it memory_map works correctly, and a certain code from a bump-allocator causes a panic, you can go to the next subphase. We will write code that will correct this misunderstanding with panic.


Субфаза D: Bump Allocator



In this part, we will implement the simplest allocator. Bump Allocator The main work is in the file kernel/src/allocator/bump.rs .


Bump Allocator работает следующим образом. When alloc allocator returns current as a pointer. At the same time aligning it as necessary. Immediately after that, it increases current by the requested size (taking into account the necessary alignment of course). If the memory runs out, an error is returned. When dealloc nothing happens.


Below is a picture that shows what happens after 1 kilobyte of memory is allocated, and then another 512 bytes. Please note that in this case there are no problems with alignment.


BUMP


The task itself is to realize everything you need kernel/src/allocator/bump.rs . Those. functions new() , alloc() and dealloc() for bump::Allocator . Use align_up and align_down to ensure proper address alignment. Unit tests are provided for self-testing. They can be performed using make test or cargo test from the catalog kernel . Actually, at least all allocator::bump_* tests must pass allocator::bump_* .


Внимание . Убедитесь, что вы не выполняете каких либо операций, потенциально приводящих к переполнению!

Используйте методы saturating_add и saturating_sub для предотвращения переполнения.

Once all unit tests succeed, try testing your implementation out a bit kmain() . For example, like this:


let mut v = vec![];
for i in 0..1000 {
    v.push(i);
    kprintln!("{:?}", v);
}

As soon as the implementation is ready, proceed to the next subphase.


Как выглядит цепочка вызовов alloc ? [bump-chain]

Допустим вы приостановили выполнение программы при вызове bump::Allocator::alloc() . Как будет выглядеть цепочка вызовов в этом месте? Другими словами: подробно объясните, как вызов v.push(i) приводит к вызову bump::Allocator::alloc() .

Subphase E: Bin Allocator



In this subphase, we will implement a more complex allocator: a bin allocator. The main work in the file kernel/src/allocator/bin.rs .


A bin allocator divides memory into segments, classifying them by size. Each individual dimension class (or bean in another way) contains a linked list of memory links with the size defined by this class. Each piece of allocated memory is rounded to the nearest bin. If there is an element in the linked list of beans, it is taken out and returned. If there is no free memory in the bin, then the new memory is returned from the global pool. Deallocations add an item to the linked list of the corresponding bin.


One of the popular hipster ideas is to assign each bin a size that is a multiple of the powers of two. For example, an allocator can choose to divide memory into k - 2 bins with sizes 2^n for those n that begin with 3 and before k ( 2^3 , 2^4 ... 2^k ). Any request to allocate or free memory equal 2^3 to will be processed by the bin 2^3 . Size between 2^3 and 2^4 bin with size 2^4 and so on:


  • bin 0 ( 2^3 bytes) will handle sizes in between (0, 2^3]
  • bin 1 ( 2^4 bytes) will handle sizes in between (2^3, 2^4]
  • bin 2 ( 2^5 bytes) will handle sizes in between (2^4, 2^5]
  • bin k ( 2^k bytes) will handle sizes in between (2^(k - 1), 2^k]
  • Well, you got the point ...

Linked list


An intrusive linked list for addresses has . You can find it in the file kernel/src/allocator/linked_list.rs . In addition, import kernel/src/allocator/bin.rs can be found in the file LinkedList , which kakbe hints at the usefulness of its use.


Что за интузивный связанный список такой?

В интузивном связанном списке указатели next previous если нужно) хранятся в самих передаваемых значениях. При этом такой список не требует для своей работы отдельной дополнительной памяти. С другой стороны пользователь должен предоставить хранилище в самом элементе.

A brand new clean list is created using a call LinkedList::new() . You can add an address using push() . The first address (if any) can be deleted and returned using pop() . Or simply returned without removal using peek() . Something like this:


let mut list = LinkedList::new();
unsafe {
    list.push(address_1);
    list.push(address_2);
}

assert_eq!(list.peek(), Some(address_2));
assert_eq!(list.pop(), Some(address_2));
assert_eq!(list.pop(), Some(address_1));
assert_eq!(list.pop(), None);

It LinkedList provides two iterators. The first can be obtained by calling iter() . It returns the addresses. The second can be obtained by calling iter_mut() . This returns Node which refers to each address in the list. The methods value() and pop() can be used to read the value or to take out the value, respectively.


let mut list = LinkedList::new();
unsafe {
    list.push(address_1);
    list.push(address_2);
    list.push(address_3);
}

for node in list.iter_mut() {
    if node.value() == address_2 {
        node.pop();
    }
}

assert_eq!(list.pop(), Some(address_3));
assert_eq!(list.pop(), Some(address_1));
assert_eq!(list.pop(), None);

Read the code LinkedList now. Pay particular attention to the safety features that you must provide for push() . Most likely you will want to use the one provided LinkedList to implement the beans in your memory allocator.


Почему удобно будет использовать именно интузивный связанный список? [ll-alloc]

Это хорошее годное решение — использовать предоставленную реализацию связанного списка. Какие проблемы могут возникнуть, если вместо этого мы попытаемся использовать обычный связанный список, который выделяет дополнительную память?

Fragmentation


The concept of memory fragmentation refers to that memory that is not used, but cannot be freed. The allocator either processes this, or creates a fairly strong fragmentation of memory throughout use. Under ideal conditions, an ideal allocator has zero fragmentation. In other words, it never uses more memory than is necessary to process the request. And it can always use all available memory to process new distribution requests. In practice, this property is not desirable or achievable, subject to some limitations. However, for a good memory allocator, low fragmentation is a good goal.


Usually there are two types of fragmentation:


  • Internal fragmentation . This is the amount of memory that is spent on internal expenses due to rounding (to guarantee correct alignment, for example). In the case of a bin allocator, this is the difference between the size of the distribution and the size of the class of the bin in which it is processed.
  • External fragmentation . The amount of memory spent by the allocator due to the inability to use free memory for new allocations. For our bin allocator, this is equivalent to the amount of free space in each bin that cannot be used to process a request for distribution of a sufficiently large request. Even if the sum of all the pieces of free memory meets or exceeds the requested size.

Your implementation of the allocator should try to avoid fragmentation as much as possible.


Implementation


Implement the bin allocator in the file kernel/src/allocator/bin.rs . The specific design of the dispenser (in addition to being a bin-like dispenser) is entirely up to you. In this case, the allocator must be able to reuse the freed memory. In addition, the distributor should not be subjected to excessive internal or external fragmentation. The unit tests provided verify all of these things. You can run the tests with make test or cargo test . And do not forget to replace bump.rs with bin.rs in the annotations #[path] in the file kernel/src/allocator/mod.rs . Well i.e. instead of using a bump allocator, use a bin type allocator when the latter is ready.


Once all the tests have passed and you put the bin allocator as a global one, you can proceed to the next phase.


Как выглядит ваш аллокатор? [bin-about]

Кратко объясните общий дизайн вашего распределителя памяти. По меньшей мере стоит ответить на эти вопросы:
  • Какие размеры для каждого класса (бина) вы выбрали и почему?
  • Как ваш аллокатор обрабатывает разные выравнивания?
  • В каких границах будет внешняя и внутренняя фрагментации для вашего дизайна аллокатора?




Каким образом можно уменьшить фрагментацию? [bin-frag]

Вероятно ваша версия аллокатора создаёт достаточно большую фрагментацию памяти и это нормально! Но как это можно было бы сделать лучше, чем есть? Напишите пару идей (прямо в комментарии) парочку таких идей для уменьшения фрагментации.




UPD : The next half of the lab