About systems: Fuchsia concepts


A summary of the basic Fuchsia concepts from the official documentation

Fuchsia Concepts

1. component

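  1. component framework: the part that supports component communication, libraries, building, and so on
  2. component manager:

    1. Starts the system (it is the first component to start and the last to shut down) and launches other essential components, such as the filesystem
    2. Acts as an intermediary, for example for capability routing
    3. Supports interaction between components and their environment, and supports extension
  3. component manifest: a set of description/configuration files for a particular component
  4. component lifecycle: determined by the component framework or the component runner

    Bind: when A uses a capability of B, we say A binds to B

    • eager binding: if b is a child of B and is marked eager, then when A binds to B it also binds to b
    • reboot: a component is restarted after it exits (including after a successful exit)
  5. topology, identifier, and realm were all covered earlier in the getting-started material
  6. Environment: lets developers configure the behavior of a component realm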

Driver

  1. In Fuchsia, drivers can bind to matching “parent” devices, and publish “children” of their own.

    This hierarchy extends as required: one driver might publish a child, only to have another driver consider that child their parent, with the second driver publishing its own children, and so on.

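Driver startup steps:

  • At system start the root node is created, and the root asks for a driver to bind
  • The system searches for a suitable driver and binds it to the root
  • The driver runs and may create new parent nodes that ask for further drivers

    • For example, when the PCI driver discovers a new peripheral it creates a new parent node, which then requests a driver to bind; this step repeats for every peripheral discovered
  • After binding, the driver is initialized, including its interfaces and so on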
$ dm dump
[root]
   <root> pid=1509
      [null] pid=1509 /boot/driver/builtin.so
      [zero] pid=1509 /boot/driver/builtin.so
   [misc]
      <misc> pid=1645
         [console] pid=1645 /boot/driver/console.so
         [dmctl] pid=1645 /boot/driver/dmctl.so
         [ptmx] pid=1645 /boot/driver/pty.so
         [i8042-keyboard] pid=1645 /boot/driver/pc-ps2.so
            [hid-device-001] pid=1645 /boot/driver/hid.so
         [i8042-mouse] pid=1645 /boot/driver/pc-ps2.so
            [hid-device-002] pid=1645 /boot/driver/hid.so
   [sys]
      <sys> pid=1416 /boot/driver/bus-acpi.so
         [acpi] pid=1416 /boot/driver/bus-acpi.so
         [pci] pid=1416 /boot/driver/bus-acpi.so
            [00:00:00] pid=1416 /boot/driver/bus-pci.so
            [00:01:00] pid=1416 /boot/driver/bus-pci.so
               <00:01:00> pid=2015 /boot/driver/bus-pci.proxy.so
                  [bochs_vbe] pid=2015 /boot/driver/bochs-vbe.so
                     [framebuffer] pid=2015 /boot/driver/framebuffer.so
            [00:02:00] pid=1416 /boot/driver/bus-pci.so
               <00:02:00> pid=2052 /boot/driver/bus-pci.proxy.so
                  [e1000] pid=2052 /boot/driver/e1000.so
                     [ethernet] pid=2052 /boot/driver/ethernet.so
            [00:1f:00] pid=1416 /boot/driver/bus-pci.so
            [00:1f:02] pid=1416 /boot/driver/bus-pci.so
               <00:1f:02> pid=2156 /boot/driver/bus-pci.proxy.so
                  [ahci] pid=2156 /boot/driver/ahci.so
            [00:1f:03] pid=1416 /boot/driver/bus-pci.so

The sys device has its own driver host (pid 1416); at this point the [acpi] device and its driver, bus-acpi.so, are loaded.

Next, ACPI enumeration finds a PCI bus and creates a parent for it (carrying some protocols): [pci] pid=1416 /boot/driver/bus-acpi.so. The driver host then binds the bus-pci.so driver to it.

During its binding, this driver scans all devices on the PCI bus.

Among them, PCI device 00:02:00 is an Intel Ethernet interface, and the system finds that the e1000.so driver is a suitable match (the protocol fits).

At this point the PCI driver creates a parent (carrying some protocols), and a new driver host (pid 2052) is created.

Then a proxy is created, <00:02:00> pid=2052 /boot/driver/bus-pci.proxy.so; it serves as the interface between driver host 2052 and the PCI driver.

Next, the DSO (e1000.so) is bound into that driver host.

This DSO then publishes a ZX_PROTOCOL_ETHERNET_IMPL, which binds to a matching child (the ethernet.so DSO, shown as [ethernet] in the dump; it’s considered a match because it has a ZX_PROTOCOL_ETHERNET_IMPL protocol).

The resulting ethernet device in the device filesystem is:

/dev/sys/platform/pci/00:02:00/e1000

That driver (ethernet.so) publishes a ZX_PROTOCOL_ETHERNET for clients to call.
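The bind-and-publish chain walked through above can be sketched as a toy model. This is only an illustration under assumptions: `Device`, `Driver`, and `bind_all` are hypothetical names, not Fuchsia APIs, the protocol strings `PCI_ROOT` and `ZX_PROTOCOL_PCI` are placeholders chosen for the sketch, and real drivers match on more than a single protocol name.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    protocol: str                      # protocol this node publishes
    children: list = field(default_factory=list)


@dataclass
class Driver:
    name: str
    binds_to: str                      # protocol the driver matches on
    publishes: list                    # (child name, child protocol) pairs


def bind_all(device, drivers, depth=0, log=None):
    """Bind every matching driver to `device`, then recurse into its children."""
    log = [] if log is None else log
    for drv in drivers:
        if drv.binds_to == device.protocol:
            log.append("  " * depth + f"{device.name} <- {drv.name}")
            for child_name, child_proto in drv.publishes:
                child = Device(child_name, child_proto)
                device.children.append(child)
                bind_all(child, drivers, depth + 1, log)
    return log


# Hypothetical driver table loosely following the dm dump above.
DRIVERS = [
    Driver("bus-pci.so", "PCI_ROOT", [("00:02:00", "ZX_PROTOCOL_PCI")]),
    Driver("e1000.so", "ZX_PROTOCOL_PCI", [("e1000", "ZX_PROTOCOL_ETHERNET_IMPL")]),
    Driver("ethernet.so", "ZX_PROTOCOL_ETHERNET_IMPL", [("ethernet", "ZX_PROTOCOL_ETHERNET")]),
]

log = bind_all(Device("pci", "PCI_ROOT"), DRIVERS)
print("\n".join(log))
```

Each log line reads "device <- driver bound to it"; the indentation mirrors how one driver's child becomes the next driver's parent.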

  1. Driver binding

    Binding must follow a defined set of rules

  2. Driver ops

These hooks (drawn as boxes in the source figure) are called at runtime by other drivers

  1. Driver lifecycle
  2. Device driver lifecycle:

    1. The binding program is compiled by the binding compiler, producing the ZIRCON_DRIVER macro, which places the binding program in an ELF NOTE section so the Device Coordinator can inspect it without loading the whole driver
    2. init(): called for drivers that need special initialization, or that do not want to visibly publish their device(s) until that succeeds
    3. bind(): offers the driver a device to bind to; the driver needs to create a child device
    4. create()
    5. release(): this method is currently not used
  3. device lifecycle:

    1. device_add(): adds a child device; the parent is either the device passed in to bind() or another device which has been created by the same device driver.
    2. device_async_remove(): removes a device. The removal of a device consists of four parts: running the device’s unbind() hook, removal of the device from the Device Filesystem, dropping the reference acquired by device_add() and running the device’s release() hook.
    3. unbind(): optional; while it runs, the device is guaranteed not to receive outside requests
    4. The parent device ensures that a child device returns errors for related requests while it is shutting down
    5. release and unbind: devices are released recursively, as in the diagram below

                  +------------+
                  | USB Device | .unbind()
                  +------------+ .release()
                        |
                  +------------+
                  |  WLAN PHY  | .unbind()
                  +------------+ .release()
                    |        |
          +------------+  +------------+
          | WLAN MAC 0 |  | WLAN MAC 1 | .unbind()
          +------------+  +------------+ .release()

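.unbind() starts at the USB device and works its way down; once the bottom is reached the two MACs are released, and release() then proceeds back up in reverse order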

  1. device power management
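  2. device protocol: I did not fully follow this part; it covers the whole path of how a process calls into a device and its device protocol. My rough understanding:

    • It is essentially a contract: any driver that conforms to it should provide a given set of functions
    • Platform dependent vs platform independent: "dependent" means an extra layer is added between the client and the driver, e.g. for buffer handling, to reduce code duplication
    • process: Fuchsia is based on driver hosts

      • driver host: a process that contains a protocol stack; the driver host loads drivers dynamically

    For details, see the driver startup steps above

  3. platform bus

These are low-level drivers that provide interfaces and support to higher-level drivers; they are preloaded at system startup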

Filesystem

  1. File lifecycle

    1. Establishing a connection: the client sends RPC requests to filesystem servers using FIDL
    2. namespace: lives entirely on the client side. It is a table of “absolute path” -> “handle” mappings. All paths accessed from within a process are opened by directing requests through this namespace mapping.
    3. passing data: also done with RPC messages, using the FIDL protocol
    4. mmap: what the client gets back is a virtual memory object (VMO); this applies only to read-only files
    5. Other operations acting on paths: for example rename(old, new) needs two paths; Fuchsia filesystems use this ability to refer to one Vnode while acting on the other.
    6. vnode: represents a path, a file, and so on
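The client-side namespace table described in this list can be sketched as a dictionary from absolute path prefixes to handles. This is an illustrative model only: handles are stand-in strings, `resolve` is a hypothetical helper, and longest-prefix matching is an assumption about how the lookup is done.

```python
def resolve(namespace, path):
    """Return (handle, remaining path) for the longest matching prefix."""
    best = None
    for prefix in namespace:
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise FileNotFoundError(path)
    # The remainder is what would be sent to the server behind the handle.
    return namespace[best], path[len(best):].lstrip("/")


# Hypothetical namespace of one process.
namespace = {
    "/svc": "handle:svc",
    "/pkg": "handle:pkg",
    "/data": "handle:data",
}

print(resolve(namespace, "/pkg/meta/contents"))
```

A path with no entry in the table (for example `/tmp/x` here) simply cannot be opened, which is how the namespace limits what a process can reach.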
  2. Filesystem Lifecycle

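  1. Filesystem Management: only administrators have permission
  2. Mounting: the filesystem is initialized first, then attached to the parent (mounting) filesystem; which mountpoints exist elsewhere depends on the situation, and not every location can be reached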
  3. FVM:keep virtual mapping from (virtual partitions, blocks) to (slice, physical block).

          +---------------------------------+ <- Physical block 0
          |           metadata              |
          | +-----------------------------+ |
          | |       metadata copy 1       | |
          | |  +------------------------+ | |
          | |  |    superblock          | | |
          | |  +------------------------+ | |
          | |  |    partition table     | | |
          | |  +------------------------+ | |
          | |  | slice allocation table | | |
          | |  +------------------------+ | |
          | +-----------------------------+ | <- Size of metadata is described by
          | |       metadata copy 2       | |    superblock
          | +-----------------------------+ |
          +---------------------------------+ <- Superblock describes start of
          |                                 |    slices
          |             Slice 1             |
          +---------------------------------+
          |                                 |
          |             Slice 2             |
          +---------------------------------+
          |                                 |
          |             Slice 3             |
          +---------------------------------+
          |                                 |

partition table: name, partition ID, and the number of slices already allocated to this partition

slice allocation table: made up of slice entries

Each slice entry contains: allocation status
if it is allocated,
        what partition it belongs to and
        what logical slice within the partition the slice maps to
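A minimal sketch of the slice allocation table and the virtual-to-physical translation it enables. The slice size, the class names, and the first-free allocation policy are assumptions made for illustration; the real on-disk format is more involved.

```python
from dataclasses import dataclass
from typing import Optional

SLICE_SIZE_BLOCKS = 8   # assumed slice size in blocks, for illustration only


@dataclass
class SliceEntry:
    partition: Optional[int] = None   # None means the slice is free
    vslice: Optional[int] = None      # logical slice index within the partition


class Fvm:
    def __init__(self, num_slices):
        self.table = [SliceEntry() for _ in range(num_slices)]

    def allocate(self, partition, vslice):
        """Claim the first free physical slice for (partition, vslice)."""
        for pslice, entry in enumerate(self.table):
            if entry.partition is None:
                self.table[pslice] = SliceEntry(partition, vslice)
                return pslice
        raise RuntimeError("no free slices")

    def translate(self, partition, vblock):
        """Map (partition, virtual block) to (physical slice, block offset in it)."""
        vslice, offset = divmod(vblock, SLICE_SIZE_BLOCKS)
        for pslice, entry in enumerate(self.table):
            if entry.partition == partition and entry.vslice == vslice:
                return pslice, offset
        raise KeyError((partition, vblock))


fvm = Fvm(num_slices=4)
fvm.allocate(partition=1, vslice=0)
fvm.allocate(partition=2, vslice=0)
fvm.allocate(partition=1, vslice=1)
print(fvm.translate(partition=1, vblock=10))
```

Partition 1's two logical slices end up in non-adjacent physical slices, which is the point of keeping the mapping in a table.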
  1. MinFs: MinFS is a simple, unix-like filesystem built for Zircon.
  2. BlobFs: BlobFS is a content-addressable filesystem optimized for write-once use, mainly for packages

On-disk structure of BlobFS:

  • The Superblock storing filesystem-wide metadata,
  • The Block Map, a bitmap used to keep track of free and allocated data blocks,
  • The Node Map, a flat array of Inodes (reference to where a blob’s data starts on disk) or ExtentContainers (reference to several extents containing some of a blob’s data).

    • There are two kinds of node: Inodes and ExtentContainers
    • Properties of the node linked-list: certain rules guarantee that the extents are ordered; anything else is treated as an error
  • The Journal, a log of filesystem operations that ensures filesystem integrity, even if the device reboots or loses power during an operation, and
  • The Data Blocks, where blob contents and their verification metadata are stored in a series of extents.

    • Currently BlobFS does not perform defragmentation.
  1. Random access compression in BlobFS

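    1. zstd is the default
    2. To support demand paging, a file is split into frames that are compressed and decompressed independently (chunked compression)
  2. Block devices: as with filesystems, the program acts as a client and sends requests to the devhost (over RPC)

    fast block I/O: register a “transaction buffer” and pass, for example, the write location plus the start address of the data, avoiding the heavy cost of copying

  3. zxcrypt
  4. Life of an ‘Open’: in Fuchsia, open is not a system call; the client connects to the filesystem over a channel. Once a process is initialized it is given a namespace

    1. The standard library defines the open function
    2. Fdio: provides a unified interface for files, sockets, services, and more
    3. FIDL: a set of protocols that govern the interaction between client and server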
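The chunked compression mentioned under "Random access compression in BlobFS" can be sketched as follows. This is a stand-in, not the BlobFS format: it uses `zlib` from the Python standard library instead of zstd, and the frame size and the offset table layout are assumptions. The idea it shows is that a read only has to decompress the frames covering the requested range.

```python
import zlib

FRAME_SIZE = 4096  # assumed uncompressed frame size


def compress_chunked(data):
    """Compress each frame independently and record where each one starts."""
    frames, offsets, pos = [], [], 0
    for i in range(0, len(data), FRAME_SIZE):
        frame = zlib.compress(data[i:i + FRAME_SIZE])
        offsets.append(pos)
        frames.append(frame)
        pos += len(frame)
    return b"".join(frames), offsets


def read_range(blob, offsets, start, length):
    """Decompress only the frames that cover [start, start + length)."""
    if length <= 0:
        return b""
    first = start // FRAME_SIZE
    last = (start + length - 1) // FRAME_SIZE
    out = bytearray()
    for idx in range(first, min(last, len(offsets) - 1) + 1):
        end = offsets[idx + 1] if idx + 1 < len(offsets) else len(blob)
        out += zlib.decompress(blob[offsets[idx]:end])
    skip = start - first * FRAME_SIZE
    return bytes(out[skip:skip + length])


data = bytes(range(256)) * 64          # 16 KiB of sample data
blob, offsets = compress_chunked(data)
chunk = read_range(blob, offsets, 5000, 100)   # touches a single frame
```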

Process

  1. core library

    1. FBL: carries over some C++ constructs and adds some of its own
    2. FXL: a platform-independent library containing basic C++ building blocks
  2. Namespace

    1. namespaces are defined per-component: each component has its own root
  3. Object: The items within a namespace are called objects; for example, a namespace entry points to an object, which may be a file or a directory

    1. access: done over FIDL; new objects can be created and child objects can be accessed
    2. object name: different names can refer to the same object; a name is determined by the containing container (similar to a dict)
  4. Object Relative Path Expressions: path names such as a/b/c; access outside the container (e.g. ..) is not supported
  5. Client Interpreted Path Expressions: the client can define where the root is

SandBox

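  1. A newly created process has no privileges at all; it is usually then given some handles and the like
  2. A process's namespace is important
  3. Component capabilities: a process that is a component gets a /svc directory in its namespace
  4. Legacy components: the services offered in /svc are a subset of the services in the environment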

JOB

In Fuchsia, jobs are a means of organizing, controlling, and regulating processes

  1. A job can have child jobs; exceptions propagate upward (parent <- child), while policy and quota propagate downward (parent -> child)
  2. Starting from the root job, the jobs form a job tree

Booting

Boot steps
  1. After the kernel starts, userspace boots first
  2. The userboot job has to be fast: the kernel gives userboot a handle to the ZBI, userboot finds the bootfs image in the ZBI, decompresses it, and locates the libraries it needs.
  3. The first process is then started -> component manager
  4. component manager starts the following components

  1. driver manager -> starts processes: driver hosts; driver hosts run drivers
  2. fshost: starts the filesystems, finds block devices, finds and loads fvm and zxcrypt, then starts the minfs and blobfs filesystems
  3. appmgr: component manager uses the /pkgfs handle from fshost to load appmgr. It is used to share capabilities

Startup sequence

appmgr creates the app realm, the app realm creates sysmgr, and sysmgr creates the sys realm

The sys realm holds a large number of FIDL services; it starts many services and manages and lazily starts some components

At this point, boot is complete

FIDL

library fidl.examples.echo;

@discoverable
protocol Echo {
    EchoString(struct {
        value string:optional;
    }) -> (struct {
        response string:optional;
    });
};

This defines something like a class: the Echo protocol is visible to clients, and it has one method, EchoString, whose parameter is value and whose reply is response, a string

IPC models in FIDL
library fidl.examples.echo;

@discoverable
protocol Echo {
    EchoString(struct {
        value string:optional;
    }) -> (struct {
        response string:optional;
    });

    SendString(struct { value string:optional;});

    ->ReceiveString(struct { response string:optional;});
};

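SendString is a send-only (fire-and-forget) method: after sending, the client carries on without waiting to see whether there is a reply

ReceiveString is an event: the client does not request any data; it is handled only when the server sends data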

Workflow
  1. The author writes *.fidl files and keeps them in a FIDL library; libraries can import one another
  2. publisher: FIDL libraries are placed in the SDK or in a public repository
  3. consumer: uses the FIDL compiler to generate code for the consumer's own language

Life of a handle

Mainly explains how FIDL transfers handles and their rights

Kernel

system call: most system calls operate through a handle

Handles and Rights: handles can be transferred and duplicated (rights can be reduced when duplicating)

Kernel Object IDs: Every object in the kernel has a “kernel object id” or “koid”, used to identify it and, from there, to manage its lifecycle and so on

Running Code: Jobs, Processes, and Threads: a job contains processes, and a process contains threads

    Without a Job Handle, it is not possible for a Thread within a Process to create another Process or another Job.

Message Passing: Sockets and Channels: sockets are stream-oriented; channels are message-oriented and buffer their messages

Objects and Signals: each object has up to 32 signals; a signal flags, for example, whether the object is readable

Waiting: Wait One, Wait Many, and Ports

Events, Event Pairs: an event is the simplest object; an event pair is a pair of events that can signal each other

Shared Memory: Virtual Memory Objects (VMOs): represent a set of physical pages of memory.

Virtual Memory Address Regions (VMARs):provide an abstraction for managing a process’s address space.

LK

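Zircon is developed on top of LK

Kernel objects

handle

A handle is bound to either a process or the kernel; when a handle is bound to the kernel we say it’s ‘in-transit’.

A handle links a process to a specific kernel object. It is created with an initial set of rights, and rights can be dropped when the handle is duplicated.

Reclamation: once nothing refers to a kernel object any more, it is destroyed or queued for reclamation; the kernel object behind every handle is guaranteed to be valid.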
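A small sketch of "rights can only shrink on duplication". The `Rights` flags and the `Handle` class are simplified stand-ins invented for this illustration, not the Zircon constants or syscalls; the rule that duplicating requires its own right is also modeled here as an assumption.

```python
from enum import IntFlag


class Rights(IntFlag):
    READ = 1
    WRITE = 2
    DUPLICATE = 4
    TRANSFER = 8


class Handle:
    def __init__(self, obj, rights):
        self.obj = obj          # the kernel object this handle refers to
        self.rights = rights

    def duplicate(self, rights):
        """Return a new handle to the same object with equal or fewer rights."""
        if not (self.rights & Rights.DUPLICATE):
            raise PermissionError("handle lacks the DUPLICATE right")
        if (rights & self.rights) != rights:
            raise PermissionError("cannot add rights when duplicating")
        return Handle(self.obj, rights)


vmo = object()  # placeholder for a kernel object
full = Handle(vmo, Rights.READ | Rights.WRITE | Rights.DUPLICATE)
read_only = full.duplicate(Rights.READ)
```

Both handles refer to the same object, but `read_only` can neither write nor be duplicated further.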

Signal

One bit of information used to signal state, for example whether a channel holds messages that have not been read yet.

system call

Scheduling

design

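Each logical CPU has its own scheduler; the schedulers communicate with each other through IPIs

Each CPU has its own set of FIFO queues, one per priority level (32 levels in total). In each queue is an ordered list of runnable threads awaiting execution

For these queues:

  1. The CPU picks the highest-priority queue and pops the thread at its front
  2. If the thread uses up its timeslice without finishing, it is put at the tail of the appropriate queue
  3. If it stops before using up its timeslice, it is put at the head of the queue, but next time it runs only for the remainder of that timeslice
  4. If it waits on a shared resource, it is put on a wait queue; once runnable again, the two rules above apply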
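The queue rules above can be sketched with one deque per priority level. `RunQueues` is a hypothetical class for illustration; it assumes that a larger number means a higher priority and leaves out timeslice accounting and wait queues.

```python
from collections import deque

NUM_PRIORITIES = 32


class RunQueues:
    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def enqueue(self, thread, priority, front=False):
        queue = self.queues[priority]
        if front:
            queue.appendleft(thread)   # timeslice left over: back to the head
        else:
            queue.append(thread)       # timeslice used up: go to the tail

    def pick_next(self):
        """Pop the front of the highest-priority non-empty queue."""
        for priority in range(NUM_PRIORITIES - 1, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None                    # nothing runnable


rq = RunQueues()
rq.enqueue("a", priority=10)
rq.enqueue("b", priority=20)
rq.enqueue("c", priority=20)
rq.enqueue("d", priority=20, front=True)   # was interrupted with timeslice left
order = [rq.pick_next() for _ in range(5)]
```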
Priority management
  1. There are 32 priority levels, 0 to 31
  2. A thread's priority is boosted by an amount within [-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] when:
  • When a thread is unblocked, after waiting on a shared resource or sleeping, it is given a one point boost.
  • When a thread yields (volunteers to give up control), or volunteers to reschedule, its boost is decremented by one but is capped at 0 (won’t go negative).
  • When a thread is preempted and has used up its entire timeslice, its boost is decremented by one but is able to go negative.
  1. If a thread holds a resource that blocks another, higher-priority thread, it is given a temporary boost up
CPU assignment and migration

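Each thread has a CPU affinity mask: for example, a preference for the first and third CPUs (CPU 0 and CPU 2 when counting from zero) is 0b101, where the positions of the two 1 bits identify the CPUs.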

When selecting a CPU for a thread the scheduler will choose, in order:

  1. The CPU doing the selection, if it is idle and in the affinity mask.
  2. The CPU the thread last ran on, if it is idle and in the affinity mask.
  3. Any idle CPU in the affinity mask.
  4. The CPU the thread last ran on, if it is active and in the affinity mask.
  5. The CPU doing the selection, if it is the only one in the affinity mask or all cpus in the mask are not active.
  6. Any active CPU in the affinity mask
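The six-step selection order can be written out directly. This is a literal reading of the list under assumptions: `select_cpu` is a hypothetical function, CPUs are numbered from zero, "active" is modeled as a set of online CPUs, and idle CPUs are taken to be a subset of it.

```python
def select_cpu(mask, current, last, idle, active, num_cpus):
    """Pick a CPU for a thread following the six rules in order."""
    allowed = [c for c in range(num_cpus) if (mask >> c) & 1]
    idle_ok = [c for c in allowed if c in idle]
    active_ok = [c for c in allowed if c in active]

    if current in idle_ok:                        # 1. selecting CPU, idle
        return current
    if last in idle_ok:                           # 2. last CPU, idle
        return last
    if idle_ok:                                   # 3. any idle CPU in the mask
        return idle_ok[0]
    if last in active_ok:                         # 4. last CPU, active
        return last
    if allowed == [current] or not active_ok:     # 5. selecting CPU as fallback
        return current
    return active_ok[0]                           # 6. any active CPU in the mask


ALL = {0, 1, 2, 3}
MASK = 0b101  # CPUs 0 and 2

picks = [
    select_cpu(MASK, current=0, last=2, idle={0, 2}, active=ALL, num_cpus=4),
    select_cpu(MASK, current=1, last=2, idle={0, 2}, active=ALL, num_cpus=4),
    select_cpu(MASK, current=1, last=3, idle={0}, active=ALL, num_cpus=4),
    select_cpu(MASK, current=1, last=2, idle=set(), active=ALL, num_cpus=4),
    select_cpu(MASK, current=1, last=3, idle=set(), active=ALL, num_cpus=4),
]
```

The five calls exercise rules 1, 2, 3, 4, and 6 in turn.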

Zircon Fair Scheduler

Briefly, these properties are:

  • Intuitive bandwidth allocation mechanism: A thread with twice the weight of another thread will receive approximately twice the CPU time, relative to the other thread over time. Whereas, a thread with the same weight as another will receive approximately the same CPU time, relative to the other thread over time.
  • Starvation free for all threads: Proportional bandwidth division ensures that all competing threads receive CPU time in a timely manner, regardless of how low the thread weight is relative to other threads. Notably, this property prevents unbounded priority inversion.
  • Fair response to system overload: When the system is overloaded, all threads share proportionally in the slowdown. Solving overload conditions is often simpler than managing complex priority interactions required in other scheduling disciplines.
  • Stability under evolving demands: Adapts well to a wide range of workloads with minimal intervention compared to other scheduling disciplines.

Zircon uses a worst-case fair scheduler: Worst-Case Fair Weighted Fair Queuing (WF2Q)
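The "twice the weight, twice the CPU time" property can be checked with a few lines of arithmetic. This only computes the ideal proportional share among competing threads; it is not an implementation of WF2Q, and `cpu_shares` and the thread names are made up for the example.

```python
from fractions import Fraction


def cpu_shares(weights):
    """Ideal long-run CPU fraction for each competing thread."""
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}


shares = cpu_shares({"a": 2, "b": 1, "c": 1})
for name, share in shares.items():
    print(name, share)
```

Every thread with a positive weight gets a non-zero share, which is the starvation-free property in its simplest form.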

Security

each thread has two stacks instead of the usual one: a “safe stack” and an “unsafe stack”.

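The unsafe stack holds, for example, pointers into the heap, while the safe stack holds, for example, return addresses, guarding against stack overflows and similar attacks

The shadow call stack pointer provides support for shadow-call-stack code

Cryptographically Secure Pseudo Random Number Generator: random number generation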

Errors

error: errors are divided into categories: The first error code in each category is the generic code and is used when no more specific code applies

Not much different from conventional error codes

Zircon Kernel IPC Limits

If kernel buffers are read more slowly than they are written, the system may run out of kernel buffers

waiting

Timer Slack:

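Slack defines how the system may alter the timer’s deadline. A timer here means, for example, an object waiting for a given time or until the timer is canceled.

Slack means timers may be coalesced, which can shift the actual wait time; Amount is the allowed deviation from the deadline.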

Tracing

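Used to inspect the state of processes in the kernel and in user space

Trace providers write into buffers; the manager passes the data to the trace client over a socket

The trace client controls, through the manager, whether trace providers are running: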

A trace client contacts the trace manager to request that tracing should either start or stop. A trace client can also request to save collected trace data.

packages

A set of files that provides one or more programs

  1. Packages are downloaded from the Fuchsia server as BLOBs; BLOBs with the same content have the same name

Base packages:These are the packages that are part of the foundation of the Fuchsia operating system

cached packages:These are packages on the device which are not part of base. These packages exist when the device is flashed or paved, so these packages are usable if the device boots without a network connection

Universe packages: packages that live on the Fuchsia server

Package structure
  • meta.far

    • meta/package:a JSON file that contains the name and version of the package.
    • meta/contents: the contents file, generated by the pm command
  • BLOBs outside of meta/

    • most files of a package exist outside of the meta/directory and each are a BLOB.
package url
fuchsia-pkg://<repository>/<package-name>?hash=<package-hash>#<resource-path>
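The URL template above splits cleanly with the Python standard library. The sketch below is illustrative only: `parse_package_url` is a hypothetical helper, the example URL and its values are invented, and it does not validate anything beyond the scheme.

```python
from urllib.parse import parse_qs, urlsplit


def parse_package_url(url):
    """Split a fuchsia-pkg URL into the four parts named in the template."""
    parts = urlsplit(url)
    if parts.scheme != "fuchsia-pkg":
        raise ValueError(f"not a fuchsia-pkg URL: {url}")
    query = parse_qs(parts.query)
    return {
        "repository": parts.netloc,
        "package_name": parts.path.lstrip("/"),
        "hash": query.get("hash", [None])[0],      # optional
        "resource_path": parts.fragment or None,   # optional
    }


# Made-up example values.
info = parse_package_url(
    "fuchsia-pkg://example.com/my-package?hash=abc123#meta/my-component.cm"
)
```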
Developing with Fuchsia packages
  1. The development host serves packages over HTTP; the target connects to it by IP address on TCP port 8083
  2. Build with the fx build command
  3. Triggering package updates

Security

verified exec(VX):

Fuchsia has taken verification into the runtime of the system

VX considers two security models: the running software model and the verified boot model.

The Verified Boot Security Model:The goal of a defender in this security model is to recover by eliminating untrustworthy states (code and data) in which attackers could persist control across reboots.

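  1. If untrusted state is found, reboot immediately and then remove the untrustworthy state
  2. Refuse rollback, so that an attacker cannot get around protections by rolling back to an older version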

The Running Software Model

In this security model, the aim of a defender is to solve or mitigate possible vulnerabilities by hardening code against malicious input.

Phases of Verification:

Phase Zero: Hardware to First Bootloader:the hardware is assumed to be trusted.

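Phase One: First Bootloader to Main Bootloader: once the first bootloader has been verified, it gains the authority to verify and execute software; it is also treated as hardware

Phase Two: Main Bootloader to Preauthorized Code: the main bootloader verifies the preauthorized code, which runs on the hardware and includes, for example, the kernel, drivers, and the package manager.

Phase Three: Non-Preauthorized Code

Some devices may not run non-preauthorized code at all

Implementation of the above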

Main Bootloader

The main bootloader implementation relies on Android Verified Boot for verification and kernel rollback protection.

BlobFS

BlobFS is a cryptographic, content-addressed filesystem purpose-built to support verified execution.

Package Management System

Component Framework

Source code

vendoring = third party code

Session

Stores the properties and configuration needed for a particular user session

Elements

A component that adds UI to the session is an element
