A summary of the basic Fuchsia concepts described in the official documentation

Fuchsia Concepts

1. component

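  1. component framework: the part of the system that supports component communication, libraries, building, and so on
  2. component manager:

    1. Boots the system (it is the first component to start and the last to shut down) and starts other essential components, such as the filesystem
    2. Acts as an intermediary, for example by performing capability routing
    3. Supports component interaction with the environment and supports extension
  3. component manifest: a set of description/configuration files for a specific component
  4. component lifecycle: determined by the component framework or the component runner

    Bind: when A uses a capability of B, we say A binds to B

    • eager binding: if b is a child of B and is marked eager, then when A binds to B, b is bound as well
    • reboot: a component is restarted after it exits (including after a successful exit)
  5. topology, identifier, realm: all covered earlier in the Get Started guide
  6. Environment: lets developers configure the behavior of a component realm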

Driver

  1. In Fuchsia, drivers can bind to matching "parent" devices and publish "children" of their own.

    This hierarchy extends as required: one driver might publish a child, only to have another driver consider that child their parent, with the second driver publishing its own children, and so on.

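Driver startup steps:

  • At system start the root node is created, and the root requests a driver to bind to it
  • The system looks for a suitable driver and binds it to the root
  • The driver runs and may create new parent nodes that request drivers of their own

    • For example, when the PCI driver discovers a new peripheral, it creates a new parent node, and that node requests a driver to bind to it; this step repeats for every peripheral discovered
  • After binding, the driver is initialized, including its interfaces and so on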
$ dm dump
[root]
   <root> pid=1509
      [null] pid=1509 /boot/driver/builtin.so
      [zero] pid=1509 /boot/driver/builtin.so
   [misc]
      <misc> pid=1645
         [console] pid=1645 /boot/driver/console.so
         [dmctl] pid=1645 /boot/driver/dmctl.so
         [ptmx] pid=1645 /boot/driver/pty.so
         [i8042-keyboard] pid=1645 /boot/driver/pc-ps2.so
            [hid-device-001] pid=1645 /boot/driver/hid.so
         [i8042-mouse] pid=1645 /boot/driver/pc-ps2.so
            [hid-device-002] pid=1645 /boot/driver/hid.so
   [sys]
      <sys> pid=1416 /boot/driver/bus-acpi.so
         [acpi] pid=1416 /boot/driver/bus-acpi.so
         [pci] pid=1416 /boot/driver/bus-acpi.so
            [00:00:00] pid=1416 /boot/driver/bus-pci.so
            [00:01:00] pid=1416 /boot/driver/bus-pci.so
               <00:01:00> pid=2015 /boot/driver/bus-pci.proxy.so
                  [bochs_vbe] pid=2015 /boot/driver/bochs-vbe.so
                     [framebuffer] pid=2015 /boot/driver/framebuffer.so
            [00:02:00] pid=1416 /boot/driver/bus-pci.so
               <00:02:00> pid=2052 /boot/driver/bus-pci.proxy.so
                  [e1000] pid=2052 /boot/driver/e1000.so
                     [ethernet] pid=2052 /boot/driver/ethernet.so
            [00:1f:00] pid=1416 /boot/driver/bus-pci.so
            [00:1f:02] pid=1416 /boot/driver/bus-pci.so
               <00:1f:02> pid=2156 /boot/driver/bus-pci.proxy.so
                  [ahci] pid=2156 /boot/driver/ahci.so
            [00:1f:03] pid=1416 /boot/driver/bus-pci.so

The sys device has its own driver host, which at this point loads the [acpi] device and its driver, bus-acpi.so

Next, ACPI enumerates its devices and finds a PCI bus, so it creates a parent device (carrying a set of protocols), [pci] pid=1416 /boot/driver/bus-acpi.so; the driver host then binds the bus-pci.so driver to it

During its binding, this driver scans all devices on the PCI bus

Among them, PCI device 00:02:00 is an Intel Ethernet interface, and the system finds that the e1000.so driver is a suitable match for it (the protocol matches).

At this point the PCI driver creates a parent device (carrying a set of protocols), and a new driver host (pid 2052) is created

A proxy is then created, <00:02:00> pid=2052 /boot/driver/bus-pci.proxy.so; the proxy is the interface between driver host 2052 and the PCI driver

The DSO (e1000.so) is then bound inside that driver host

This DSO then publishes a ZX_PROTOCOL_ETHERNET_IMPL, which binds to a matching child (the ethernet.so DSO in the dump above; it's considered a match because it has a ZX_PROTOCOL_ETHERNET_IMPL protocol).

The resulting ethernet device in the device filesystem is:

/dev/sys/platform/pci/00:02:00/e1000

ethernet.so in turn publishes a ZX_PROTOCOL_ETHERNET for clients to call
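The chain of "a driver binds to a parent and publishes children" can be sketched in a few lines of Python. This is only an illustration of the idea, not Fuchsia's implementation: the `DRIVERS` table, the protocol names, and the `Device` class are hypothetical, and each output line shows the driver bound to a device rather than the driver that published it.

```python
from dataclasses import dataclass, field

# Hypothetical driver table: protocol a driver binds to ->
# (driver file, list of (child name, child protocol) it publishes).
DRIVERS = {
    "PCI_BUS": ("bus-pci.so", [("00:02:00", "PCI_DEVICE")]),
    "PCI_DEVICE": ("e1000.so", [("e1000", "ETHERNET_IMPL")]),
    "ETHERNET_IMPL": ("ethernet.so", [("ethernet", "ETHERNET")]),
}

@dataclass
class Device:
    name: str
    protocol: str
    driver: str = ""
    children: list = field(default_factory=list)

def bind(device):
    """Bind a matching driver to `device`, then recurse into what it publishes."""
    match = DRIVERS.get(device.protocol)
    if match is None:
        return  # no driver matches: the device stays a leaf for clients to open
    device.driver, published = match
    for child_name, child_protocol in published:
        child = Device(child_name, child_protocol)
        device.children.append(child)
        bind(child)

def dump(device, depth=0):
    """Return the tree as indented lines, loosely in the style of `dm dump`."""
    lines = ["   " * depth + f"[{device.name}] {device.driver}".rstrip()]
    for child in device.children:
        lines.extend(dump(child, depth + 1))
    return lines

root = Device("pci", "PCI_BUS")
bind(root)
print("\n".join(dump(root)))
```

The recursion stops at the `ethernet` device because no driver in the table binds to its protocol, which mirrors the point where clients take over.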

  1. Driver binding

    Must follow a defined specification

  2. Driver ops

These hooks (the boxes in the figure) are called at run time by other drivers

  1. Driver lifecycle
  2. Device driver lifecycle:

    1. The bind program is compiled by the bind compiler, which produces the ZIRCON_DRIVER macro; the macro places the bind program in an ELF NOTE section, so the Device Coordinator can inspect it without loading the whole driver
    2. init(): called for drivers that need special initialization or that do not want to visibly publish their device(s) until that succeeds
    3. bind(): offers the driver a device to bind to; the driver needs to create a child device
    4. create()
    5. release(): this method is currently not used
  3. device lifecycle:

    1. device_add(): adds a child device; the parent device is either the device passed in to bind() or another device which has been created by the same device driver.
    2. device_async_remove(): removes a device. The removal of a device consists of four parts: running the device's unbind() hook, removal of the device from the Device Filesystem, dropping the reference acquired by device_add() and running the device's release() hook.
    3. unbind(): optional; once it runs, the device is guaranteed not to accept new incoming requests
    4. The parent device guarantees that related requests return an error while a child device is shutting down
    5. release() and unbind(): devices are torn down recursively, for example in the diagram below

                  +------------+
                  | USB Device | .unbind()
                  +------------+ .release()
                        |
                  +------------+
                  |  WLAN PHY  | .unbind()
                  +------------+ .release()
                    |        |
          +------------+  +------------+
          | WLAN MAC 0 |  | WLAN MAC 1 | .unbind()
          +------------+  +------------+ .release()

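.unbind() starts at the USB device and proceeds down the tree; once the bottom is reached, the two MAC devices are released first, and .release() then proceeds back up in reverse order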
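The ordering described above (unbind from the top down, release from the bottom up) is a pre-order pass followed by a post-order pass over the device tree. A minimal sketch, assuming a hypothetical `Device` class that only stores a name and its children:

```python
class Device:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def teardown_order(root):
    """Return the (device, hook) calls in the order they would run."""
    order = []

    def unbind(device):
        order.append((device.name, "unbind"))   # parent first ...
        for child in device.children:
            unbind(child)                       # ... then its children

    def release(device):
        for child in device.children:
            release(child)                      # children first ...
        order.append((device.name, "release"))  # ... then the parent

    unbind(root)
    release(root)
    return order

usb = Device("USB Device", [
    Device("WLAN PHY", [Device("WLAN MAC 0"), Device("WLAN MAC 1")]),
])
order = teardown_order(usb)
for name, hook in order:
    print(f"{name}.{hook}()")
```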

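  1. device power management
  2. device protocol: I did not fully follow this part; it covers the whole flow of how processes, devices, and device protocols call one another, and I only understood it roughly

    • Roughly, a protocol is a contract: any driver that conforms to it must provide a given set of functions
    • Platform dependent vs platform independent: "dependent" means an extra layer is added between the client and the driver (for example, shared buffer-handling functions) to reduce code duplication
    • process: Fuchsia drivers run in driver hosts

      • driver host: a process that contains a protocol stack; the driver host loads drivers dynamically

    For a detailed walkthrough, see the driver startup steps above

  3. platform bus

These are low-level drivers that provide interfaces and support to higher-level drivers; they are preloaded at system startup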

Filesystem

  1. File lifecycle

    1. Establishing a Connection: the client sends RPC requests to filesystem servers using FIDL
    2. namespace: lives entirely on the client side. It is a table of "absolute path" -> "handle" mappings. All paths accessed from within a process are opened by directing requests through this namespace mapping.
    3. passing data: also done with RPC messages, using the FIDL protocol
    4. mmap: the client gets back virtual memory objects; this applies only to read-only files
    5. Other Operations acting on paths: for example, rename(old, new) needs two paths; Fuchsia filesystems use this ability to refer to one Vnode while acting on the other.
    6. vnode: represents a path, a file, and so on
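The client-side namespace table from item 2 can be pictured as a longest-prefix lookup: the process finds the entry that covers the path and sends the remainder of the path to the server behind that handle. The sketch below is an assumption-laden illustration; the handle values are plain strings and `resolve` is a hypothetical helper, not an fdio function.

```python
def resolve(namespace, path):
    """Map an absolute path to (handle, path relative to that handle)."""
    best = None
    for prefix in namespace:
        covers = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if covers and (best is None or len(prefix) > len(best)):
            best = prefix  # keep the longest matching prefix
    if best is None:
        raise FileNotFoundError(path)
    return namespace[best], path[len(best):].lstrip("/")

# Hypothetical namespace of a process: absolute path -> handle.
namespace = {
    "/svc": "handle-svc",
    "/data": "handle-data",
    "/pkg": "handle-pkg",
}

print(resolve(namespace, "/data/logs/a.txt"))
```

A path such as `/database/x` is not covered by the `/data` entry, so the lookup fails locally without any request leaving the process.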
  2. Filesystem Lifecycle

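  1. Filesystem Management: only administrators have permission
  2. Mounting: the filesystem is initialized first and then attached to the parent (mounting) filesystem; which mountpoints exist elsewhere depends on the situation, and not every location is reachable
  3. FVM: keeps a virtual mapping from (virtual partitions, blocks) to (slice, physical block).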

      +---------------------------------+ <- Physical block 0
      |           metadata              |
      | +-----------------------------+ |
      | |       metadata copy 1       | |
      | |  +------------------------+ | |
      | |  |    superblock          | | |
      | |  +------------------------+ | |
      | |  |    partition table     | | |
      | |  +------------------------+ | |
      | |  | slice allocation table | | |
      | |  +------------------------+ | |
      | +-----------------------------+ | <- Size of metadata is described by
      | |       metadata copy 2       | |    superblock
      | +-----------------------------+ |
      +---------------------------------+ <- Superblock describes start of
      |                                 |    slices
      |             Slice 1             |
      +---------------------------------+
      |                                 |
      |             Slice 2             |
      +---------------------------------+
      |                                 |
      |             Slice 3             |
      +---------------------------------+
      |                                 |

partition table: for each partition, its name, its partition ID, and the number of slices already allocated to it

slice allocation table: made up of slice entries

Each slice entry contains its allocation status and, if it is allocated, what partition it belongs to and what logical slice within the partition the slice maps to
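To make the slice allocation table concrete, here is a small sketch of translating a (partition, virtual block) pair into a (physical slice, offset) pair. The `Fvm` and `SliceEntry` classes, the zero-based slice numbering, and the slice size are assumptions for illustration, not the on-disk format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SliceEntry:
    partition: Optional[int] = None  # None means the slice is free
    vslice: Optional[int] = None     # logical slice within the partition

class Fvm:
    def __init__(self, slice_count, blocks_per_slice):
        self.blocks_per_slice = blocks_per_slice
        self.table = [SliceEntry() for _ in range(slice_count)]

    def allocate(self, partition, vslice):
        """Claim the first free physical slice for (partition, vslice)."""
        for pslice, entry in enumerate(self.table):
            if entry.partition is None:
                self.table[pslice] = SliceEntry(partition, vslice)
                return pslice
        raise RuntimeError("no free slices")

    def translate(self, partition, vblock):
        """Map a virtual block to (physical slice, block offset in slice)."""
        vslice, offset = divmod(vblock, self.blocks_per_slice)
        for pslice, entry in enumerate(self.table):
            if entry.partition == partition and entry.vslice == vslice:
                return pslice, offset
        raise KeyError((partition, vslice))

fvm = Fvm(slice_count=4, blocks_per_slice=8)
fvm.allocate(partition=1, vslice=0)
fvm.allocate(partition=2, vslice=0)
fvm.allocate(partition=1, vslice=1)
print(fvm.translate(partition=1, vblock=9))
```

Partition 1 ends up with two physical slices that are not adjacent, which is the point of the mapping: a partition can grow slice by slice without being contiguous on disk.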
  1. MinFs: MinFS is a simple, unix-like filesystem built for Zircon.
  2. BlobFs: BlobFS is a content-addressable filesystem optimized for write-once, read-often files; it is mainly used for packages

On-disk structure of BlobFS:

  • The Superblock storing filesystem-wide metadata,
  • The Block Map, a bitmap used to keep track of free and allocated data blocks,
  • The Node Map, a flat array of Inodes (reference to where a blob's data starts on disk) or ExtentContainers (reference to several extents containing some of a blob's data).

    • There are two kinds of node: Inodes and ExtentContainers
    • Properties of the node linked-list: a set of rules guarantees that extents are kept in order; anything else is treated as an error
  • The Journal, a log of filesystem operations that ensures filesystem integrity, even if the device reboots or loses power during an operation, and
  • The Data Blocks, where blob contents and their verification metadata are stored in a series of extents.

    • Currently BlobFS does not perform defragmentation.
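  1. Random access compression in BlobFS

    1. The default algorithm is zstd
    2. To support demand paging, a file is split into frames that are compressed and decompressed independently (chunked compression)
  2. Block devices: as with filesystems, the program acts as a client and sends requests to the devhost over RPC

    fast block I/O: the client registers a "transaction buffer" and then passes only small messages (for example, the write offset plus the start address of the data), which avoids the heavy cost of copying

  3. zxcrypt
  4. Life of an 'Open': in Fuchsia, open is not a system call; a client connects to a filesystem over a channel. Once a process has been initialized, it is handed a namespace

    1. The standard library defines the open function
    2. Fdio: provides a uniform interface over files, sockets, services, and more
    3. FIDL: the protocols that govern how client and server interact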

Process

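  1. core library

    1. FBL: builds on some C++ constructs and adds some of its own
    2. FXL: a platform-independent library containing basic C++ building blocks
  2. Namespace

    1. namespaces are defined per-component: each component has its own root
  3. Object: The items within a namespace are called objects; for example, a namespace entry points to an object, and that object is a file or a directory

    1. access: done over FIDL; a client can create new objects and access child objects
    2. object name: different names can refer to the same object; a name is determined by the enclosing container (similar to a dict)
  4. Object Relative Path Expressions: path names such as a/b/c; reaching outside the container (for example with ..) is not supported
  5. Client Interpreted Path Expressions: the client can define where its own root is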

SandBox

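  1. A newly created process has no privileges at all; it is usually granted some handles and so on
  2. The namespace of a process is key: it defines what the process can reach
  3. Component capabilities: a component that runs as a process receives a /svc directory in its namespace
  4. Legacy components: the services offered through /svc are a subset of the services in the environment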

JOB

In Fuchsia, jobs are a means of organizing, controlling, and regulating processes

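  1. A job can have child jobs; exceptions propagate upward (parent <- child), while policies and quotas propagate downward (parent -> child)
  2. Starting from the root job, jobs form a job tree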
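The two propagation directions can be sketched with a tiny job tree. This is a simplified model under stated assumptions: the `Job` class, the policy names, and the rule that an ancestor's setting always wins are illustrative and do not reproduce Zircon's actual policy semantics.

```python
class Job:
    def __init__(self, name, parent=None, policy=None, handles_exceptions=False):
        self.name = name
        self.parent = parent
        self.own_policy = policy or {}
        self.handles_exceptions = handles_exceptions

    def effective_policy(self):
        """Policies flow downward: ancestors override what a child sets."""
        inherited = self.parent.effective_policy() if self.parent else {}
        return {**self.own_policy, **inherited}

    def route_exception(self):
        """Exceptions flow upward until some job has a handler."""
        job = self
        while job is not None:
            if job.handles_exceptions:
                return job.name
            job = job.parent
        return None

# Hypothetical policy names, for illustration only.
root = Job("root", policy={"new_process": "deny"}, handles_exceptions=True)
child = Job("child", root, policy={"new_process": "allow", "new_vmo": "deny"})
grandchild = Job("grandchild", child)

print(grandchild.effective_policy())
print(grandchild.route_exception())
```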

Booting

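Boot steps
  1. After the kernel starts, it launches the first userspace process, userboot
  2. userboot is built to be fast: the kernel gives userboot a handle to the ZBI; userboot finds the bootfs image in the ZBI, decompresses it, and locates the libraries and other files it needs.
  3. It then starts the first real process: component manager
  4. component manager starts the following components

  1. driver manager: starts the driver host processes, and the driver hosts run drivers
  2. fshost: starts the filesystems; it finds block devices, locates and loads fvm and zxcrypt, and then starts the minfs and blobfs filesystems
  3. appmgr: component manager uses the /pkgfs handle from fshost to load appmgr, which is used to share capabilities

Startup sequence

appmgr creates the app realm, the app realm creates sysmgr, and sysmgr creates the sys realm

The sys realm holds a large number of FIDL services; it starts many services, manages them, and lazily starts some components

At this point, boot is complete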

FIDL

library fidl.examples.echo;

@discoverable
protocol Echo {
    EchoString(struct {
        value string:optional;
    }) -> (struct {
        response string:optional;
    });
};

This declares a protocol (much like a class) named Echo that clients can discover; it has one method, EchoString, which takes a parameter named value and returns a string named response

IPC models in FIDL
library fidl.examples.echo;

@discoverable
protocol Echo {
    EchoString(struct {
        value string:optional;
    }) -> (struct {
        response string:optional;
    });

    SendString(struct { value string:optional; });

    ->ReceiveString(struct { response string:optional; });
};

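SendString is a one-way (fire-and-forget) method: after sending, the client carries on without waiting for any reply

ReceiveString is an event: the client does not request data; the handler runs only when the server sends data

Workflow
  1. The user writes *.fidl files and stores them in a FIDL library; libraries can import one another
  2. publisher: FIDL libraries are placed in an SDK or a public repository
  3. consumer: uses the FIDL compiler to generate code in the consumer's own language

Life of a handle

Mainly explains how FIDL transfers handles and their rights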

Kernel

system call: most system calls operate through handles

Handles and Rights: handles can be transferred and duplicated (rights can be reduced on duplication)

Kernel Object IDs: Every object in the kernel has a "kernel object id" or "koid", which identifies the object and is used in turn to manage its lifecycle and so on

Running Code: Jobs, Processes, and Threads: a job contains processes, and a process contains threads

Without a Job Handle, it is not possible for a Thread within a Process to create another Process or another Job.

Message Passing: Sockets and Channels: sockets are stream-oriented; channels are message-oriented and buffer whole messages

Objects and Signals: each object has up to 32 signals; a signal flags a piece of state, for example whether the object is readable

Waiting: Wait One, Wait Many, and Ports

Events, Event Pairs: an event is the simplest object; an event pair is a pair of events that can signal each other

Shared Memory: Virtual Memory Objects (VMOs): represent a set of physical pages of memory.

Virtual Memory Address Regions (VMARs): provide an abstraction for managing a process's address space.

LK

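Zircon is built on LK

kernel objects

handle

A handle is bound either to a process or to the kernel; when a handle is bound to the kernel we say it's 'in-transit'.

A handle links a process to a specific kernel object. It is created with an initial set of rights, and those rights can be dropped when the handle is duplicated.

Reclamation: once nothing refers to a kernel object any more, the object is destroyed or set aside for reclamation; the kernel object behind any handle is always guaranteed to be valid.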
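The rule that rights can only shrink on duplication is easy to model with a bitmask. The sketch below is illustrative only: the `Rights` flags are a made-up subset and `Handle.duplicate` is a hypothetical method, not the real `zx_handle_duplicate` signature.

```python
from enum import IntFlag

class Rights(IntFlag):
    # Hypothetical subset of rights, for illustration.
    READ = 1
    WRITE = 2
    DUPLICATE = 4
    TRANSFER = 8

class Handle:
    def __init__(self, obj, rights):
        self.obj = obj        # the kernel object this handle refers to
        self.rights = rights  # what the holder may do with it

    def duplicate(self, rights):
        """Return a new handle to the same object with equal or fewer rights."""
        if not self.rights & Rights.DUPLICATE:
            raise PermissionError("handle lacks the DUPLICATE right")
        if int(rights) & ~int(self.rights):
            raise PermissionError("duplication cannot add rights")
        return Handle(self.obj, rights)

vmo = object()
full = Handle(vmo, Rights.READ | Rights.WRITE | Rights.DUPLICATE)
read_only = full.duplicate(Rights.READ)
print(read_only.rights == Rights.READ, read_only.obj is full.obj)
```

The read-only duplicate can be handed to a less trusted process: it refers to the same object but cannot write to it or duplicate itself further.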

Signal

One bit of information used to communicate state, for example whether a channel holds messages that have not been read yet.

system call

Scheduling

design

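Each logical CPU has its own scheduler; schedulers communicate with one another through IPIs

Each CPU has its own set of FIFO queues, one per priority level (32 levels in total). In each queue is an ordered list of runnable threads awaiting execution

For these queues:

  1. The CPU picks the highest-priority non-empty queue and pops the thread at its front
  2. If the thread used up its timeslice without finishing, it goes to the tail of the appropriate queue
  3. If the thread did not use up its timeslice, it goes to the head of the queue, but next time it only runs for the remainder of that timeslice
  4. If the thread waits on a shared resource, it is moved to a wait queue; when it becomes runnable again the same rule applies: tail of the appropriate queue if the timeslice was used up, head of the queue with only the remaining timeslice otherwise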
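The queue rules above can be sketched as 32 deques, one per priority. This is a toy model under assumptions: `FULL_TIMESLICE` is an arbitrary number, 31 is taken to be the highest priority, and threads are plain strings.

```python
from collections import deque

NUM_PRIORITIES = 32
FULL_TIMESLICE = 10  # arbitrary units, for illustration only

class RunQueues:
    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def enqueue(self, thread, priority, timeslice_left):
        if timeslice_left > 0:
            # Unused timeslice: go to the front, run only for the remainder.
            self.queues[priority].appendleft((thread, timeslice_left))
        else:
            # Timeslice used up: go to the back with a fresh timeslice.
            self.queues[priority].append((thread, FULL_TIMESLICE))

    def pick_next(self):
        """Pop the front of the highest-priority non-empty queue."""
        for priority in range(NUM_PRIORITIES - 1, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None  # nothing runnable: the CPU would go idle

rq = RunQueues()
rq.enqueue("a", priority=16, timeslice_left=0)
rq.enqueue("b", priority=16, timeslice_left=0)
rq.enqueue("c", priority=16, timeslice_left=3)  # e.g. just unblocked
rq.enqueue("d", priority=20, timeslice_left=0)
picked = [rq.pick_next() for _ in range(5)]
print(picked)
```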
Priority management
  1. There are 32 priority levels, 0 through 31
  2. A dynamic boost in the range [-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] is adjusted as follows:
  • When a thread is unblocked, after waiting on a shared resource or sleeping, it is given a one point boost.
  • When a thread yields (volunteers to give up control), or volunteers to reschedule, its boost is decremented by one but is capped at 0 (won’t go negative).
  • When a thread is preempted and has used up its entire timeslice, its boost is decremented by one but is able to go negative.
  1. If a thread holds a resource and thereby blocks another, higher-priority thread, it is given a temporary boost (priority inheritance)
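The three boost rules translate directly into small functions. In this sketch the value of `MAX_PRIORITY_ADJ` is an assumption picked for the example, and the function names are hypothetical.

```python
MAX_PRIORITY_ADJ = 4   # assumed value, for illustration only
LOWEST_PRIORITY, HIGHEST_PRIORITY = 0, 31

def on_unblock(boost):
    """Unblocked after waiting or sleeping: one point up, within the range."""
    return min(boost + 1, MAX_PRIORITY_ADJ)

def on_yield(boost):
    """Voluntary yield: one point down, but never below zero because of it."""
    return boost - 1 if boost > 0 else boost

def on_timeslice_expired(boost):
    """Preempted after using the whole timeslice: may go negative."""
    return max(boost - 1, -MAX_PRIORITY_ADJ)

def effective_priority(base, boost):
    """Base priority plus boost, clamped to the 0..31 range."""
    return max(LOWEST_PRIORITY, min(HIGHEST_PRIORITY, base + boost))

boost = 0
boost = on_unblock(boost)            # an interactive thread wakes up
print(effective_priority(16, boost))
```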
CPU assignment and migration

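Each thread has a CPU affinity mask: for example, a thread that prefers CPU 1 and CPU 3 (counting CPUs from 1) has the mask 0b101, where the positions of the two 1 bits identify the CPUs.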

When selecting a CPU for a thread the scheduler will choose, in order:

  1. The CPU doing the selection, if it is idle and in the affinity mask.
  2. The CPU the thread last ran on, if it is idle and in the affinity mask.
  3. Any idle CPU in the affinity mask.
  4. The CPU the thread last ran on, if it is active and in the affinity mask.
  5. The CPU doing the selection, if it is the only one in the affinity mask or all cpus in the mask are not active.
  6. Any active CPU in the affinity mask
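The selection order can be written as a chain of checks. The sketch below simplifies the model: every CPU is treated as online and either idle or busy, so step 5 reduces to "the selecting CPU is the only one in the mask", and bit i of the mask stands for CPU index i counted from zero. `select_cpu` and its parameters are hypothetical names.

```python
def select_cpu(affinity_mask, current_cpu, last_cpu, idle_cpus, num_cpus):
    """Pick a CPU for a thread, following the six-step order above."""
    allowed = [c for c in range(num_cpus) if affinity_mask & (1 << c)]
    if not allowed:
        raise ValueError("affinity mask selects no CPU")

    # Steps 1-3: prefer an idle CPU (selecting CPU, last CPU, then any).
    if current_cpu in allowed and current_cpu in idle_cpus:
        return current_cpu
    if last_cpu in allowed and last_cpu in idle_cpus:
        return last_cpu
    for cpu in allowed:
        if cpu in idle_cpus:
            return cpu

    # Steps 4-6: no idle CPU is allowed, so fall back to a busy one.
    if last_cpu in allowed:
        return last_cpu
    if allowed == [current_cpu]:
        return current_cpu
    return allowed[0]

print(select_cpu(0b101, current_cpu=0, last_cpu=2, idle_cpus={2}, num_cpus=4))
```

Preferring the last CPU in steps 2 and 4 is what keeps a thread close to its warm caches when there is a choice.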

Zircon Fair Scheduler

Briefly, these properties are:

  • Intuitive bandwidth allocation mechanism: A thread with twice the weight of another thread will receive approximately twice the CPU time, relative to the other thread over time. Whereas, a thread with the same weight as another will receive approximately the same CPU time, relative to the other thread over time.
  • Starvation free for all threads: Proportional bandwidth division ensures that all competing threads receive CPU time in a timely manner, regardless of how low the thread weight is relative to other threads. Notably, this property prevents unbounded priority inversion.
  • Fair response to system overload: When the system is overloaded, all threads share proportionally in the slowdown. Solving overload conditions is often simpler than managing complex priority interactions required in other scheduling disciplines.
  • Stability under evolving demands: Adapts well to a wide range of workloads with minimal intervention compared to other scheduling disciplines.

Zircon uses a worst-case fair scheduler: Worst-Case Fair Weighted Fair Queuing (WF2Q)
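The "twice the weight, twice the CPU time" property can be demonstrated with a much simpler relative of WF2Q: plain virtual-time weighted fair queuing. The sketch below is not the Zircon algorithm; `simulate` is a hypothetical function in which each thread's virtual time advances by 1/weight per slice it runs, and the thread with the smallest virtual time runs next.

```python
import heapq

def simulate(weights, slices):
    """Run `slices` equal timeslices and count how many each thread gets."""
    # Heap of (virtual time, thread name); ties are broken by name.
    heap = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(heap)
    ran = {name: 0 for name in weights}
    for _ in range(slices):
        vtime, name = heapq.heappop(heap)
        ran[name] += 1
        # A heavier thread accumulates virtual time more slowly,
        # so it is picked more often.
        heapq.heappush(heap, (vtime + 1.0 / weights[name], name))
    return ran

print(simulate({"a": 2, "b": 1}, slices=300))
```

Because every thread's virtual time keeps being compared with the others, even a very low weight thread is eventually the minimum and gets to run, which is the starvation-free property listed above.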

Security

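Each thread has two stacks instead of the usual one: a "safe stack" and an "unsafe stack".

The unsafe stack holds data whose address may be taken, such as arrays and address-taken variables; the safe stack holds things like return addresses, which keeps a stack buffer overflow from overwriting them

The shadow call stack pointer supports code built with shadow-call-stack instrumentation

Cryptographically Secure Pseudo Random Number Generator: generates random numbers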

Errors

error: errors are grouped into categories. The first error code in each category is the generic code and is used when no more specific code applies

Not much different from traditional error codes

Zircon Kernel IPC Limits

If kernel buffers are read more slowly than they are written, the system can run out of kernel buffers

waiting

Timer Slack:

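Slack defines how the system may alter the timer's deadline. A timer here means, for example, an object waiting for a given amount of time or until the timer is cancelled.

Slack means that timers can be coalesced, so that timers with nearby deadlines fire together; the amount is the allowed deviation from the deadline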
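Coalescing can be illustrated as a small interval problem: if each timer may fire anywhere in [deadline, deadline + slack], a greedy pass finds the fewest wakeups that satisfy all of them. This sketch assumes "late" slack only, and `coalesce` is a hypothetical helper, not a Zircon API.

```python
def coalesce(timers):
    """timers: list of (deadline, slack). Return the wakeup times needed.

    Each timer accepts any firing time in [deadline, deadline + slack].
    Processing windows by their latest allowed time and waking up as late
    as possible lets one wakeup serve as many timers as it can.
    """
    wakeups = []
    for deadline, slack in sorted(timers, key=lambda t: t[0] + t[1]):
        if wakeups and deadline <= wakeups[-1]:
            continue  # the previous wakeup already falls in this window
        wakeups.append(deadline + slack)
    return wakeups

timers = [(100, 10), (105, 10), (108, 2), (200, 0)]
print(coalesce(timers))
```

With zero slack every timer needs its own wakeup; the more deviation a timer tolerates, the more likely it shares a wakeup with its neighbours.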

Tracing

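Used to inspect the state of processes in kernel and user space

A trace provider writes into a buffer, and the trace manager sends the data to the trace client over a socket

The trace client controls, through the manager, whether trace providers are running: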

A trace client contacts the trace manager to request that tracing should either start or stop. A trace client can also request to save collected trace data.

packages

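A set of files that provides one or more programs

  1. A package is downloaded from a Fuchsia package server as BLOBs; BLOBs with identical content have the same name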
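"Same content, same name" is what content addressing means: the name is derived from the bytes. The sketch below uses a plain SHA-256 digest as a stand-in for the hash Fuchsia actually uses, and `BlobStore` is a hypothetical in-memory class.

```python
import hashlib

class BlobStore:
    """Toy content-addressed store: a blob's name is a hash of its bytes."""

    def __init__(self):
        self.blobs = {}

    def put(self, content: bytes) -> str:
        # SHA-256 is used here only as a stand-in for the real blob hash.
        name = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(name, content)  # identical content is stored once
        return name

    def get(self, name: str) -> bytes:
        content = self.blobs[name]
        # The name doubles as an integrity check on read.
        if hashlib.sha256(content).hexdigest() != name:
            raise ValueError("blob content does not match its name")
        return content

store = BlobStore()
first = store.put(b"libfoo.so contents")
second = store.put(b"libfoo.so contents")   # e.g. shipped by another package
other = store.put(b"different contents")
print(first == second, len(store.blobs))
```

Two packages that ship the same library therefore share one blob on disk, and a download can be skipped when the blob is already present.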

Base packages:These are the packages that are part of the foundation of the Fuchsia operating system

cached packages:These are packages on the device which are not part of base. These packages exist when the device is flashed or paved, so these packages are usable if the device boots without a network connection

Universe packages: packages that live on the Fuchsia package server

package structure
  • meta.far

    • meta/package: a JSON file that contains the name and version of the package.
    • meta/contents: the contents file, generated by the pm tool
  • BLOBs outside of meta/

    • most files of a package exist outside of the meta/directory and each are a BLOB.
package url
fuchsia-pkg://<repository>/<package-name>?hash=<package-hash>#<resource-path>
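The four parts of the URL template above can be pulled apart with Python's standard URL parser. This is a sketch: `parse_package_url` is a hypothetical helper, and the repository, package name, hash, and resource path in the example are made up.

```python
from urllib.parse import parse_qs, urlsplit

def parse_package_url(url):
    """Split a fuchsia-pkg URL into repository, package, hash, and resource."""
    parts = urlsplit(url)
    if parts.scheme != "fuchsia-pkg":
        raise ValueError(f"not a fuchsia-pkg URL: {url}")
    return {
        "repository": parts.netloc,
        "package": parts.path.lstrip("/"),
        "hash": parse_qs(parts.query).get("hash", [None])[0],  # optional
        "resource": parts.fragment or None,                    # optional
    }

# Made-up example values.
example = "fuchsia-pkg://example.com/my-package?hash=abc123#meta/app.cm"
print(parse_package_url(example))
```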
Developing with Fuchsia packages
  1. The development host serves packages over HTTP; the target device connects to it by IP address on TCP port 8083
  2. Build with the fx build command
  3. Triggering package updates

Security

verified exec(VX):

Fuchsia has taken verification into the runtime of the system

VX considers two security models: the running software model and the verified boot model.

The Verified Boot Security Model:The goal of a defender in this security model is to recover by eliminating untrustworthy states (code and data) in which attackers could persist control across reboots.

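  1. If untrusted state is present, reboot directly and then remove the untrustworthy state
  2. Reject rollbacks, so that an attacker cannot bypass protections by rolling back to an older version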

The Running Software Model

In this security model, the aim of a defender is to solve or mitigate possible vulnerabilities by hardening code against malicious input.

Phases of Verification:

Phase Zero: Hardware to First Bootloader:the hardware is assumed to be trusted.

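Phase One: First Bootloader to Main Bootloader: once the first bootloader has been verified, it gains the authority to verify and execute software; this stage is also tied to the hardware

Phase Two: Main Bootloader to Preauthorized Code: the main bootloader verifies the preauthorized code, which runs directly on the hardware and includes, for example, the kernel, drivers, and the package manager.

Phase Three: Non-Preauthorized Code

Some devices may not run non-preauthorized code at all

Implementation of the above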

Main Bootloader

The main bootloader implementation relies on Android Verified Boot for verification and kernel rollback protection.

BlobFS

BlobFS is a cryptographic, content-addressed filesystem purpose-built to support verified execution.

Package Management System

Component Framework

Source code

vendoring = third party code

Session

Stores the attributes and configuration needed for a particular user session

Elements

A component that the UI adds to a session is an element