About Artificial Intelligence: Uploading, Reading, Training on, and Saving Data with FATE Federated Learning

How do you install FATE? This post picks up where that article left off.

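Background

FATE runs as a service that implements federated learning, so participants take one of two identities: client and host (FATE's own API calls the initiating side `guest` rather than client). Users are generally clients. A user wants to upload their own data and combine it with other parties' data to end up with a better model, which is why the data has to be "uploaded".

In the FATE framework, horizontal federated learning is called homo and vertical federated learning is called hetero. For example, the vertical secure boosting tree model is called hetero secure boost.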
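To make the homo/hetero distinction concrete, here is a minimal sketch in plain Python with no FATE involved. The table, column names, and the helpers `split_homo` and `split_hetero` are made up for illustration: a horizontal split gives each party the same columns for different samples, while a vertical split gives each party different columns for the same samples, joined on a shared id.

```python
# Toy table: each row is one sample, each column one field (made-up values).
rows = [
    {"id": 1, "age": 30, "income": 50, "y": 0},
    {"id": 2, "age": 41, "income": 72, "y": 1},
    {"id": 3, "age": 25, "income": 38, "y": 0},
    {"id": 4, "age": 57, "income": 90, "y": 1},
]


def split_homo(table, n_first):
    """Horizontal split: both parties hold the same columns, different samples."""
    return table[:n_first], table[n_first:]


def split_hetero(table, guest_cols, host_cols, key="id"):
    """Vertical split: both parties hold the same samples, different columns."""
    guest = [{key: r[key], **{c: r[c] for c in guest_cols}} for r in table]
    host = [{key: r[key], **{c: r[c] for c in host_cols}} for r in table]
    return guest, host


homo_a, homo_b = split_homo(rows, 2)
hetero_guest, hetero_host = split_hetero(rows, ["age", "y"], ["income"])
```

In the vertical case only one side holds the label column `y`, which is why the two parties need to agree on the shared ids before training.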

Upload

Official docs: https://fate.readthedocs.io/e...
I strongly recommend reading this post side by side with the official docs!

Tools

FATE can upload data with the pipeline tool.

First install fate_client, because Pipeline is a tool that ships inside fate_client.

pip install fate_client

According to the docs, pipeline has to be set up from the command line before it can be used:

pipeline init --ip=xxx --port=xxx

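Pipeline only works after it has been initialized in a terminal. The ip and port must match the ip and port FATE was started with. For a standalone deployment the ip is 127.0.0.1 and the port is usually 9380.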
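If `pipeline init` succeeds but later calls hang or fail, one quick sanity check is whether anything is listening on that ip and port at all. The sketch below uses only the Python standard library. The helper name `port_is_open` is hypothetical and not part of fate_client, and it only tests TCP reachability, not whether the listener is actually FATE.

```python
import socket


def port_is_open(ip, port, timeout=1.0):
    """Return True if a TCP connection to (ip, port) succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example call, using the standalone values mentioned in this post:
# port_is_open("127.0.0.1", 9380)
```

A `False` here suggests the service is not running or is bound to a different address, so it is worth checking before suspecting the pipeline configuration.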

If you can't remember FATE's configuration, use (I haven't found the exact command yet; to be filled in later)

flow

If you can't remember pipeline's configuration, run

pipeline config show

to view it.

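Python development

The following code in a Python file is enough to upload a csv file.
Every uploaded dataset has its own table_name and namespace. FATE uses these two fields to name and tell apart each uploaded dataset.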

from pipeline.backend.pipeline import PipeLine

pipeline = PipeLine() \
        .set_initiator(role='guest', party_id=9999) \
        .set_roles(guest=9999, host=10000)  # what do guest and host stand for?

data_path = '/root/Downloads/dummy.csv'
table_name = 'dummy'
namespace = 'dummy'
pipeline.add_upload_data(file=data_path, table_name=table_name, namespace=namespace)
pipeline.upload(drop=1)  # what does drop=1 or 0 mean?

After a successful run, the terminal prints output indicating that the upload went through.
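To try the upload without real data, a small throwaway csv is enough. The sketch below writes one with the standard library. The layout is an assumption on my part (an `id` column first, a binary label column named `y`, then numeric feature columns `x0`, `x1`, ...), and the helper `write_dummy_csv` is hypothetical, so check the official docs for the layout your FATE version expects.

```python
import csv
import os
import random
import tempfile


def write_dummy_csv(path, n_rows=10, n_features=3, seed=0):
    """Write a small csv with an id column, a 0/1 label, and random features."""
    rng = random.Random(seed)  # seeded so repeated runs give the same file
    header = ["id", "y"] + [f"x{i}" for i in range(n_features)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for i in range(n_rows):
            label = rng.randint(0, 1)
            features = [round(rng.random(), 4) for _ in range(n_features)]
            writer.writerow([i, label] + features)
    return header


# Written to the temp directory here; point data_path at this file to upload it.
csv_path = os.path.join(tempfile.gettempdir(), "dummy.csv")
header = write_dummy_csv(csv_path)
```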

Getting data from the FATE service

Official docs: https://fate.readthedocs.io/e...
I strongly recommend reading this post side by side with the official docs!

The sbt in the docs is simply Secure Boost Tree, a boosted decision-tree model. It is called Secure because it runs on FATE.

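Tools

In FATE, the Reader class is used to fetch data from the FATE framework.

The docs say "load data", and at first I assumed that meant loading from the local disk. Oops! The docs would be clearer as "load data from FATE service"...

Once Reader has fetched the data, DataTransform can be used to transform it. Both the docs and the example code cover this, so refer to them. Intersection can be used to run PSI on two datasets. As I read the Component docs, PSI here stands for private set intersection, which finds the entries the two datasets have in common. FATE of course provides many more components; the code in the docs only uses PSI as one example.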
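As a rough picture of what finding the overlap between two datasets produces, the sketch below computes the shared ids with an ordinary set intersection. This is deliberately not private: both id lists are visible in one place, which is exactly what FATE's Intersection component is designed to avoid. The function `common_ids` and the sample rows are made up for illustration.

```python
def common_ids(guest_rows, host_rows, key="id"):
    """Return the sorted ids that appear in both datasets (plain, non-private)."""
    guest_ids = {r[key] for r in guest_rows}
    host_ids = {r[key] for r in host_rows}
    return sorted(guest_ids & host_ids)


guest_rows = [{"id": 1, "y": 0}, {"id": 2, "y": 1}, {"id": 3, "y": 0}]
host_rows = [{"id": 2, "x0": 0.4}, {"id": 3, "x0": 0.9}, {"id": 5, "x0": 0.1}]

shared = common_ids(guest_rows, host_rows)
# Share of the guest's samples that the host also holds.
overlap_ratio = len(shared) / len(guest_rows)
```

Only the samples in `shared` can be used for vertical training, since those are the only rows for which both parties can contribute columns.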

Python

from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader, DataTransform, HeteroSecureBoost, Evaluation
from pipeline.interface import Data

# set pipeline operation party ids.
pipeline = PipeLine() \
        .set_initiator(role='guest', party_id=9999) \
        .set_roles(guest=9999, host=10000)

reader_0 = Reader(name="reader_0")
# bind reader operation tables
reader_0.get_party_instance(role='guest', party_id=9999).component_param(
    table={"name": "dummy", "namespace": "dummy"})

data_transform_0 = DataTransform(name="data_transform_0")
# bind transformation operation party
data_transform_0.get_party_instance(role='guest', party_id=9999).component_param(
    with_label=True)

# state a boost tree and evaluation
hetero_secureboost_0 = HeteroSecureBoost(name="hetero_secureboost_0",
                                         num_trees=5,
                                         bin_num=16,
                                         task_type="classification",
                                         objective_param={"objective": "cross_entropy"},
                                         encrypt_param={"method": "paillier"},
                                         tree_param={"max_depth": 3})
evaluation_0 = Evaluation(name="evaluation_0", eval_type="binary")

# add everyone into pipeline and ready for training
pipeline.add_component(reader_0)
pipeline.add_component(data_transform_0, data=Data(train_data=reader_0.output.data))
pipeline.add_component(hetero_secureboost_0, data=Data(train_data=data_transform_0.output.data))
pipeline.add_component(evaluation_0, data=Data(data=hetero_secureboost_0.output.data))
pipeline.compile()

# training
pipeline.fit()

# load another dataset via predict_pipeline
# predict_pipeline.predict()

# save results
pipeline.dump("pipeline_saved.pkl")

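Use pipeline to specify which parties the job operates on.
After defining the reader, bind it to its party; data_transform works the same way.
How do you inspect what the fetched data actually contains? (To be filled in later.)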

If training fails, Python reports it, and you can look into the job with fate board or fate client.
How do you use fate board and fate client? (To be filled in later.)

All of a pipeline's information can be saved to a pkl file with dump.
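For readers who have not worked with pkl files, the sketch below shows the general save-and-restore round trip with Python's standard `pickle` module. It assumes the dumped file is an ordinary pickle. The `state` dict and the helpers `dump_state` and `load_state` are hypothetical stand-ins, not FATE objects or FATE API.

```python
import os
import pickle
import tempfile


def dump_state(state, file_path):
    """Serialize a Python object to a pkl file."""
    with open(file_path, "wb") as f:
        pickle.dump(state, f)


def load_state(file_path):
    """Read a Python object back from a pkl file."""
    with open(file_path, "rb") as f:
        return pickle.load(f)


# Stand-in for whatever a trained pipeline carries around.
state = {
    "components": ["reader_0", "data_transform_0", "hetero_secureboost_0"],
    "params": {"num_trees": 5, "max_depth": 3},
}

pkl_path = os.path.join(tempfile.gettempdir(), "toy_pipeline_saved.pkl")
dump_state(state, pkl_path)
restored = load_state(pkl_path)
```

Unpickling can execute arbitrary code, so only load pkl files that come from a source you trust.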