On algorithms: EL2242 State Machines

EL2242 THE BRIEF/INSTRUCTIONS

State machines, also called finite state machines, are a method of modelling a system. At its basis, the model describes the behaviour of the system as a set of states, each state being a distinct part of the overall model. The actions that move the system between states are called transitions. A simple state machine is a turnstile, described below.

Figure 1 – Turnstile described as a state machine. Image taken from https://en.wikipedia.org/wiki/Finite-state_machine

As can be seen in Figure 1, the turnstile has two states, locked and unlocked, and two permitted actions (a coin being inserted and the user pushing the turnstile). The entry to the system is the "locked" state. If a user pushes, the state remains the same (locked). When a coin is inserted, the state changes to unlocked. If the user then pushes the turnstile (to move through it), the state changes back to locked. If another coin is inserted while in the unlocked state, the machine stays in the unlocked state. The state machine diagram in Figure 1 expresses the above paragraph in an easy-to-understand form that allows this logical system to be implemented in a number of ways.

You are required to develop a simple finite state machine using the C programming language on an ARM microcontroller, and using Verilog HDL in the Quartus development environment. The state machine will mimic the operation of a 2-stage security gate. The security gate works with the following operation. ...
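The turnstile behaviour described above can be sketched as a transition table. This is only an illustration in Python, not the C or Verilog deliverable the brief asks for; the names `State`, `TRANSITIONS` and `run_turnstile`, and the event strings "coin" and "push", are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    LOCKED = "locked"
    UNLOCKED = "unlocked"


# Transition table: (current state, event) -> next state.
# The four entries mirror the four cases in the turnstile description.
TRANSITIONS = {
    (State.LOCKED, "push"): State.LOCKED,
    (State.LOCKED, "coin"): State.UNLOCKED,
    (State.UNLOCKED, "coin"): State.UNLOCKED,
    (State.UNLOCKED, "push"): State.LOCKED,
}


def run_turnstile(events, start=State.LOCKED):
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state
```

Keeping the transitions in a table rather than nested conditionals makes it straightforward to check the code against the diagram one arrow at a time.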

April 17, 2023 · 3 min · jiezi

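On algorithms: The compute cluster released by Tencent ("the Goose Factory") can train a trillion-parameter large model in as little as 4 days

For large models to succeed, compute is the key.

These are the performance figures Tencent Cloud has published for its new-generation HCC high-performance computing cluster, aimed at large-model training: "Compute performance is up 3x over the previous generation, and server access bandwidth has risen from 1.6T to 3.2T."

Built on the latest generation of Tencent Cloud's self-developed Xingxinghai (星星海) servers and equipped with NVIDIA H800 Tensor Core GPUs, this generation of the HCC cluster delivers up to 1979 TFlops of compute per GPU.

How strong is it in practice? Last October, Tencent completed training of its first trillion-parameter AI model, the Hunyuan (混元) NLP large model, cutting training time on the same dataset from 50 days to 11 days. On the new-generation cluster, training time would shrink further to 4 days.

01. Simply stacking cards does not scale compute linearly

Interest in large models keeps rising, but training a successful large model requires compute, algorithms and data; none of the three can be missing.

The stronger the model, the more compute is needed to train it. Having powerful compute is the key to a successful AI large model.

Because the computing power of a single server is limited, thousands of servers have to be connected to build a large-scale, distributed high-performance computing cluster. Industry benchmark large models generally have very high training compute requirements and use thousands to tens of thousands of GPUs.

At such parameter scales a single GPU cannot even hold the model, so thousands of servers must be connected over a network to form a large compute cluster that supplies the compute the model needs.

The HCC high-performance computing cluster was born from this need. Stringing this many cards together, however, takes strong technical capability behind the scenes.

By the barrel effect (the shortest stave limits the barrel), simply stacking cards does not bring linear growth in compute. Computation, storage, networking and the upper-layer frameworks all have to work together to deliver a high-performance, high-bandwidth, low-latency intelligent computing platform.

02. Behind the strongest compute: breakthroughs in self-developed low-level technology

To deliver maximum compute output, the Tencent Cloud HCC cluster introduces technical innovations on many fronts, from the underlying infrastructure up to the training framework.

2.1 Compute: industry-leading ultra-high density that raises single-node compute

Single-server performance is the foundation of cluster compute. Without sparsity, each GPU in the new-generation cluster delivers up to 495 TFlops (TF32), 989 TFlops (FP16/BF16) and 1979 TFlops (FP8).

For large-model training, the Tencent Cloud Xingxinghai servers use a 6U ultra-high-density design, raising rack density by 30% compared with what the industry can otherwise support;
applying parallel-computing principles, an integrated design of CPU and GPU nodes raises single-node compute further;
the servers are fully upgraded to 4th-generation Intel Xeon Scalable processors, with PCIe bandwidth and memory bandwidth improved by up to 100%.

2.2 Network: the self-developed Xingmai (星脉) high-performance computing network raises cluster compute by a further 20%

The larger the parameter count, the higher the bandwidth demand. Thousands of GPUs work together for weeks or longer, and there is a massive amount of internal data exchange between GPUs and between server nodes.

Traditional training of small and medium models usually involves only a few GPU servers, cross-server communication is relatively light, and general-purpose 100Gbps bandwidth is enough. Training a trillion-parameter model, by contrast, is a bandwidth-sensitive workload and typically uses an All-to-All communication pattern.

In the large-model setting, a single GPU failure affects only a few thousandths of the cluster's compute, whereas congestion caused by one unevenly loaded link becomes the short stave of the barrel and affects the connectivity of dozens of GPUs or more.

Cluster training also introduces extra communication overhead, so the compute of N GPUs falls short of N times the compute of one GPU. Open-source GPU collective communication libraries (such as NCCL) cannot push network communication performance to its limit either.

If the industry's latest-generation GPUs are sports cars, then a professional race track is needed so that a training cluster of N GPUs can reach its full potential.

Tencent's self-developed Xingmai high-performance computing network is that race track. It is deeply customised for GPU cluster networking. Network node bandwidth has been increased to give compute nodes a 3.2T ETH RDMA high-performance network, sharply reducing the share of time spent on communication.

In effect, the same GPUs achieve higher cluster compute on a higher-bandwidth network. Measurements show that, with the same GPUs, the latest 3.2T Xingmai network raises overall cluster compute by 20% compared with the 1.6T network.

The "traffic rules" on this track have been optimised as well. In a large training cluster, communication between GPUs is carried by several kinds of network: inter-machine networks and intra-machine networks.

Traditional communication schemes rely heavily on inter-machine communication, which makes the cluster's communication overhead large. The Xingmai network uses both kinds of network at once and aggregates small flows into large flows, improving overall network transmission performance by reducing the number of flows. Measurements show that in large-scale All-to-All scenarios the Xingmai network improves communication transmission performance by 30%.

Based on a non-blocking network architecture with multi-rail aggregation, active congestion control and a customised accelerated communication library, the new-generation cluster currently offers industry-leading cluster construction capability and supports networking of up to the 100,000-card level in a single cluster.

TCCL, Tencent's self-developed high-performance collective communication library, is deeply optimised for the Xingmai network hardware platform and incorporates custom-designed solutions for global path planning, topology-aware affinity scheduling, and real-time alerting and self-healing of network faults. Compared with open-source collective communication libraries, it improves load performance for large-model training by 40% and eliminates training interruptions caused by a number of network problems.

In very large clusters it still maintains a good communication overhead ratio and throughput, supporting horizontal scaling of large-model training and inference workloads.

2.3 Storage: TB-level throughput and ten-million-level IOPS to reduce compute-node waiting

Over the past 5 years, model parameter counts have grown 100,000-fold while GPU memory has grown only 4-fold. In theory, pooled resources in the cloud can solve this problem.

In training, however, several thousand compute nodes read a batch of datasets at the same time, so storage buckets also face high concurrency. Large-model datasets consist mainly of GB-scale files, and it takes several minutes from loading the model to finishing start-up; GPUs left idle during that time also drag down overall training efficiency.

If the network is a professional race track built for the GPUs, then high-performance storage is a pit stop that changes tyres in seconds: it prepares data ahead of time and minimises compute-node waiting, bringing cluster performance closer to optimal.

The new-generation cluster introduces Tencent Cloud's latest self-developed storage architecture, with TB-level throughput and ten-million-level IOPS, supporting storage needs across different scenarios.

The COS+GooseFS solution provides multi-layer cache acceleration on top of object storage, greatly improving end-to-end data read performance and giving large-model workloads massive, fast and cost-effective storage. Public datasets, training data and model outputs are stored uniformly in COS object storage, achieving unified storage and efficient data flow. GooseFS caches hot data on demand in GPU memory and on local disks, giving large-model training low-latency local access, speeding up training and improving training efficiency. ...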

April 17, 2023 · 1 min · jiezi

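On algorithms: How to implement quick sort

1. Quick Sort

Quick sort uses divide and conquer. First pick one element of the sequence as the pivot. Then traverse the data, placing every element smaller than the pivot on the left and every element larger than the pivot on the right. Apply the same method recursively to the left and right subsequences until all the data is ordered. In the ideal case, each chosen pivot splits the current sequence almost evenly, and the time complexity of the whole algorithm is O(n log n).

In the worst case, each chosen pivot is the largest or smallest element of the current sequence (already sorted and reverse-sorted input are both worst cases), and the time complexity is O(n²).

Average time complexity is O(n log n), space complexity is O(log n), and the sort is not stable.

Source code of the implementation:

    // Quick sort
    template <class T>
    int Partition(T data[], int left, int right)
    {
        T pivot = data[left];  // first element is the pivot; its slot becomes the hole
        while (left < right)
        {
            // Scan from the right for an element not greater than the pivot.
            while (left < right && data[right] > pivot)
                right--;
            data[left] = data[right];
            // Scan from the left for an element greater than the pivot.
            while (left < right && data[left] <= pivot)
                left++;
            data[right] = data[left];
        }
        data[left] = pivot;  // the hole is the pivot's final position
        return left;
    }

    template <class T>
    void QuickSort(T data[], int left, int right)
    {
        if (left < right)
        {
            int p = Partition(data, left, right);
            QuickSort(data, left, p - 1);
            QuickSort(data, p + 1, right);
        }
    }

2. Selection Sort

Traverse all the data, find the largest or smallest element and put it at the start of the sequence; then keep finding the largest or smallest element among the remaining data and place each in turn, until all the data is ordered. The original ordering of the data does not affect the running time, which is always O(n²); this is relatively slow and not suitable for sorting large amounts of data.

Average time complexity is O(n²), space complexity is O(1), and the sort is not stable.

Source code of the implementation:

    // Selection sort
    template <class T>
    void SelectionSort(T data[], int n)
    {
        for (int i = 1; i < n; i++)
        {
            int k = i - 1;  // index of the smallest element found so far
            for (int j = i; j < n; j++)
            {
                if (data[j] < data[k])
                {
                    k = j;
                }
            }
            if (k != i - 1)
            {
                T t = data[k];
                data[k] = data[i - 1];
                data[i - 1] = t;
            }
        }
    }

3. Insertion Sort

Treat the first i elements (initially i = 1) as an ordered sequence. Traverse the data in order, inserting the current element at the appropriate position in that ordered sequence to form an ordered sequence of i+1 elements, and continue until every element has been processed and the whole sequence is ordered. Reverse-ordered input takes the longest, O(n²); already ordered input takes the shortest, O(n). Insertion sort suits small amounts of data that are already partly sorted. ...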
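The excerpt is cut off before the insertion sort listing. As a minimal sketch of the procedure described in the last paragraph, written in Python rather than the C++ templates used above (the function name `insertion_sort` is an assumption, not the author's code):

```python
def insertion_sort(data):
    """Sort a list in place in ascending order and return it."""
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right.
        while j >= 0 and data[j] > current:
            data[j + 1] = data[j]
            j -= 1
        # Drop the current element into the gap that opened up.
        data[j + 1] = current
    return data
```

On already ordered input the inner loop never runs, which is where the O(n) best case comes from; on reverse-ordered input every element is shifted past the whole prefix, giving O(n²).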

April 14, 2023 · 1 min · jiezi

On algorithms: ITP4501 Mobile Systems

ITP4501 Programming Techniques for Mobile Systems, Assignment, Semester 3

Institute of Vocational Education
Department of Information and Communications Technology
HDSE (IT114105)
ITP4501 Programming Techniques for Mobile Systems
Summer Semester 2021-2022
Assignment

Submission Guidelines
• This is an individual assignment.
• The submission deadline of the assignment is 11:55pm, 3 July 2022 (Sunday).
• You need to submit all program sources (in a single zip file) and your answers to the two questions in section 7 (in an MS Word file) to the Moodle website http://moodle.vtc.edu.hk assignment dropbox before the deadline. You are advised to upload your work reasonably earlier than the cut-off date and time. Moodle allows multiple submissions; however, only the latest copy will be retained. You will receive ZERO MARKS for LATE SUBMISSION.
• You are also required to give a demonstration. 40 marks will be deducted if the demonstration is not done.

1 Aims and Objectives
• To gain experience in mobile application UI and program design.
• To gain practical skill in Android application development.
• To understand the constraints and limitations of mobile applications and the ways to overcome them.
• To obtain knowledge of connecting a mobile device to internet services and building a multi-tier distributed system.

2 Introduction
In this assignment, you are required to develop an Android application to play a Tic-Tac-Toe game. The app will also record the result and the time required to complete each game, and use charts to show the history records.
You can use the following link to learn how to play Tic-Tac-Toe: https://en.wikipedia.org/wiki/Tic-tac-toe

3 Functional Requirements
Listed below are the basic requirements of your application. You need to refer to the Local Database section for the database schema. ...

April 13, 2023 · 5 min · jiezi

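On algorithms: So easy your grandma could do it: deploy the open-source ChatGLM-6B on a cloud server and have your own ChatGPT

1. Background

Hi everyone, I'm Xiaojuan (小卷).

Recently ChatGPT not only released GPT-4 but has also been "unsealed" so that it can go online. AI is iterating so fast that it is hard to keep up. As you have probably noticed, though, with every ChatGPT update OpenAI restricts access more and more. GPT-3 used to be freely reachable from networks in China; now accounts get banned at the slightest provocation.

So today I will show you how to deploy ChatGLM-6B, open-sourced by Tsinghua University. In brief, ChatGLM is a conversational language model optimised for Chinese question answering and dialogue. The current model has 6.2 billion parameters, a 130-billion-parameter model is planned, and I hope ChatGLM keeps getting stronger.

ChatGLM's open-source repository: THUDM/ChatGLM-6B

Without further ado, here is the result: a Chinese conversation produced by ChatGLM (not ChatGPT).

(PS: at the end of the article there is a free trial address for ChatGLM and a way to try the compute platform for free, so do read to the end.)

2. Preparation

The official documentation states that ChatGLM needs at least 13G of GPU memory.

You need to prepare the following:
• One GPU cloud server (16GB GPU memory, 32G RAM)
• The GPU driver, CUDA and the PyTorch framework installed on the server (the platforms all have ready-made images that can be installed directly)

On choosing a provider: GPU servers are fairly expensive, so I compared GPU offerings from several large and small vendors, looking only at configurations that meet the requirements at a reasonable price.

Vendor | Configuration | Price | Pros and cons
Alibaba Cloud | 4 cores, 15G RAM, 16G GPU memory, NVIDIA T4 | 1878/month | Big-vendor service, but too expensive
Tencent Cloud | 10 cores, 40G RAM, NVIDIA T4 | 8.68/hour | Big-vendor service, but a dedicated GPU is slightly pricey
Huawei Cloud | 8 cores, 32G RAM, 16G GPU memory, NVIDIA T4 | 3542/month | Too expensive
mistGPU | 8 cores, 32G RAM, 24G GPU memory, NVIDIA 3090 | 4.5/hour | Drawback: only 1GB of free storage
Lanrui Xingzhou (揽睿星舟) | 10 cores, 40G RAM, 24G GPU memory, NVIDIA 3090 | 1.9/hour | Recommended: high spec, low price, NVIDIA 3090 currently on special offer

Here we use a server from the Lanrui Xingzhou compute platform; price is its advantage. Note that the GPU server should be on pay-as-you-go billing: you are charged for the time you use it, and nothing is charged once you shut it down.

3. Server configuration

This step buys the server and installs the environment; it is fairly simple.

3.1 Register

Open the Lanrui Xingzhou registration page: https://www.lanrui-ai.com/register?invitation_code=4104

Enter invitation code 4104 when registering and the platform will credit your account for free, so you can try the server at no cost. You can also top up your account from the top-right corner.

3.2 Buy a server and install the image

Buy the server configuration you need in the site's compute marketplace. I chose the 3090-24G model; click the Use button to enter the image installation page.

For the runtime image choose Public image -> pytorch; the latest version is fine. Then under Advanced settings choose the pre-trained model chatglm-6b (this preloads the ChatGLM model onto the server so it does not have to be downloaded by hand), and create the instance (make sure your account has enough balance).

After about 5 minutes the workspace is created. Click Enter -> JupyterLab to get onto the server, and then get ready to install ChatGLM.

4. Deploying ChatGLM

4.1 Git acceleration

To keep git clone from being too slow, set up git academic-resource acceleration on the command line in advance.

    # Run the two commands below to enable git academic-resource acceleration
    git config --global http.proxy socks5h://172.16.16.39:8443
    git config --global https.proxy socks5h://172.16.16.39:8443

git clone in the later steps will then no longer stall.

Turning the acceleration off is just as simple; run the commands below (only after all the steps are finished).

    # Disable git academic-resource acceleration
    git config --global --unset https.proxy
    git config --global --unset http.proxy

4.2 Download the ChatGLM source code

On the Jupyter page you will see two directories:
• data: holds data, shared across the platform
• imported_models: holds pre-trained models, that is, the model you chose when creating the workspace

Open the data directory and you will see the ChatGLM-6B folder, which contains the ChatGLM source code.

If there is no ChatGLM-6B directory, download the code in this step as follows: open a Terminal on the page and run

    git clone https://github.com/THUDM/ChatGLM-6B.git

4.3 Install dependencies

1. Run the command to switch to the ChatGLM-6B directory ...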

April 13, 2023 · 1 min · jiezi

On algorithms: COSC 363 Computer Graphics

COSC 363: Computer Graphics Assignment 2

Aim
In this assignment you will implement a ray tracer that can handle different types of geometric objects and global illumination features, and demonstrate its capability in enhancing the visual realism of a rendered scene.

The Basic Ray Tracer (Max. marks: 10)
In labs 6 and 7, you will develop a simple ray tracer that can handle scenes containing planes and spheres. You will also implement methods for generating shadows and reflections. This assignment builds upon that ray tracer. As a minimum, your ray tracer should include the following features/objects. ...

April 13, 2023 · 3 min · jiezi

On algorithms: CSE111 Setup

CSE111 Assignment 4

Background information
To be truly parallel, sorting a single list when multiple CPU cores are available should show a significant speedup over a single-threaded approach. Radix sorting lends itself to truly parallel implementations; consult the literature for approaches you might consider taking. Remember that MSD is a sorting outcome, not a sorting algorithm, so investigate sorting algorithms that lend themselves to parallel implementation.

Setup
SSH into any of the CSE111 teaching servers using your CruzID Blue credentials: ...

April 13, 2023 · 3 min · jiezi

On algorithms: COMP2003J Shortest Path and Spanning Tree Algorithms

Assignment 3: Shortest Paths and Minimum Spanning Trees
COMP2003J: Data Structures and Algorithms 2
Weight: 50% of final grade
Document Version: 1.0

Introduction
The goal of this assignment is to analyze and program some graph algorithms and visualize them. This assignment includes three tasks.

Task 1 - Shortest Paths
• (5%) A program called DijkstraLabeller.java tries to label the shortest path for a given weighted graph with a starting vertex by harnessing Dijkstra's algorithm. It may work but not be perfect. Please study this implementation carefully and point out its weakness(es), such as lacking enough information in returned objects, low efficiency, etc. When you find a weakness, you need to make an in-depth analysis. For example, if this implementation has a low-efficiency issue, you need to specify where the inefficiencies come from, their time complexity, etc.
• (10%) Based on the analysis from the previous step, you need to reimplement this solution to solve these issues. You need to create a new Java class named DijkstraLabeller2.java within the package dsa.algorithms. If needed, you can create a few other classes. For example, as we mentioned in our lecture, if you want to use an adaptable priority queue, you may need to create a new interface and its implementation as well. In your solution, you can use Java built-in data structures, such as Map, List, etc. However, a graph and its edges and vertices must be represented by the classes provided within the assignment.
• (5%) Create a test class named TestDijkstraLabeller.java to check that your solution is correct and make comparisons with the previous solution.

Task 2 - Minimum Spanning Trees
• (5%) A program called KruskalLabeller.java manages to label the minimum spanning tree in a given graph by utilizing Kruskal's algorithm. Similar to Task 1, please study this implementation carefully and point out its weakness(es), particularly in terms of its efficiency.
• (10%) Based on the analysis from the previous step, you need to reimplement this solution to solve these issues. You need to create a new Java class named KruskalLabeller2.java within the package dsa.algorithms. If needed, you can create a few other classes. For example, you may need to implement a Union-Find structure. In your solution, you can use Java built-in data structures, such as Map, List, etc. However, a graph and its edges and vertices must be represented by the classes provided within the assignment.
• (5%) Create a test class named TestKruskalLabeller.java to check that your solution is correct and make comparisons with the previous solution.

Task 3 - Visualization (10%)
Visualization can help us to better understand graphs and examine our graph algorithms. This task requires you to study the existing Java-based techniques for graph visualization and choose a suitable one to implement a solution that visualizes the graphs used in your testing in Task 1 and Task 2 and demonstrates the process of Dijkstra's algorithm and Kruskal's algorithm.

Instructions
• Download the file Assignment3-Source.zip from Brightspace. The contents of this file include DijkstraLabeller.java and KruskalLabeller.java and all their dependent classes.
• When you study the weaknesses of the existing implementations, you need to record these weaknesses and your analysis in your report.
• In your solutions to Task 1 and Task 2, a graph must be represented by IGraph, and its vertices and edges must be represented by IVertex and IEdge, which are defined in the dsa.iface package. A graph implementation, EdgeListGraph, is provided in the dsa.impl package; you should use it in your testing to hold your graph data.
• You can design your own returned data type to hold any data you need for the next step of visualizing graphs.
• This assignment requires you to do some independent research outside of what is directly covered in the lectures. For example, two chapters in Goodrich and Tamassia's book are suggested reading, i.e., Chapter 9.5 Adaptable Priority Queues and Chapter 14.7.3 Disjoint Partitions and Union-Find Structures. You can learn the solutions provided by these chapters and then make your own solutions.
• When testing your implementation and making comparisons, you should compare their efficiencies on different graphs, record the results and analyze them in your report.
• In the visualization task, you can use any Java-based components.
• You should summarise the studies for graph visualization and briefly depict your solution. It is essential to put critical screenshots of the visualization produced by your program into the report.

Submission
This is an individual assignment. Therefore, all code and the report must be written by yourself. Assignment 1 contained some advice about avoiding plagiarism in programming assignments.
• Submit a zip file to Brightspace, which should include all Java files, libraries, and data used in your project. All code should be well-formatted and well-commented to describe what it is trying to do.
• Submit a pdf report to Brightspace. This report should be a human-readable document (i.e., do not simply include code). In your report, it is recommended to cover the following essential topics, among others:
  o Record the weaknesses of the existing implementations and provide your in-depth analyses.
  o Depict any tricks (novel or different ideas) used in your solution.
  o Document the testing strategies, record results and provide your analyses.
  o Include a short literature review about Java-based graph visualization.
  o Depict your visualization solution.
  o List newly added Java classes, and describe their functionalities.
• The pdf file of your report must be submitted as a separate file, i.e., it cannot be compressed into the zip file with your code or data, for the purpose of originality checking.
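Task 2 mentions a Union-Find structure as one way to make Kruskal's algorithm efficient. The following is a minimal sketch of that idea, assuming vertices are numbered 0 to n-1 and edges are given as (weight, u, v) tuples. It is written in Python for brevity and does not use the assignment's IGraph, IVertex or IEdge interfaces; the names `UnionFind` and `kruskal` are hypothetical.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Walk to the root, pointing each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already joined."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger one
        self.size[ra] += self.size[rb]
        return True


def kruskal(n, edges):
    """Return (total weight, chosen edges) for edges given as (weight, u, v)."""
    uf = UnionFind(n)
    total, chosen = 0, []
    for weight, u, v in sorted(edges):
        # An edge joins the tree only if it connects two different components.
        if uf.union(u, v):
            total += weight
            chosen.append((weight, u, v))
    return total, chosen
```

With this structure the cycle check per edge is nearly constant time, so sorting the edges dominates the running time.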

April 12, 2023 · 4 min · jiezi

On algorithms: INFT 3019 Network Architecture

INFT 3019 Network Architecture 2022
Assignment 2: Wireless Implementation (25%)
Due: Tuesday 14th June 2022 @ 11:59 PM (Week 14)
Individual Assignment
Submission: via the course website

Overview
Stelmaria Incorporated has finally finished moving their headquarters to Mawson Lakes Tech Park and now want to upgrade their wireless infrastructure from a standard WPA2 PSK setup to a more enterprise solution with Wireless LAN Controllers (WLCs) and WPA2 Enterprise. They are also looking to learn more about the WAN connections available to the company to enhance interconnectivity between branch offices. They will also be looking at implementing more WAN links; currently Stelmaria relies on Mawson Lakes Tech Park to route all traffic from all branch offices to the ISP and back to the branch offices. Stelmaria is looking for other options that give redundancy and scalability for branch offices and its headquarters.

In this assignment you will make use of the skills you have learnt over the entire course to create an IP addressing scheme and a network implementation with wireless, and to recommend a WAN solution for Stelmaria that meets their needs as a growing enterprise.

Deliverables
You will be required to complete four deliverables and include them in your submission:
• IP addressing scheme (Excel Spreadsheet or Word Document/PDF).
• Test documentation (Excel Spreadsheet or Word Document/PDF).
• Completed network configuration (Packet Tracer file).
• Recommended WAN solution, justifications, and assumptions (Word Document/PDF).
Do not add these deliverables to a ZIP archive on submission. Submit them as separate files.

Weighting
The assignment is worth 25% of your overall grade for this course. The following table breaks down each component of the assignment, giving it a percentage out of the 25% for this assignment. The Implementation is worth 17% and the WAN solution is worth 8%.

Component | Weight
IP addressing scheme | 2.5%
Basic configuration | 1%
VLANs & VTP | 1%
IP addressing implementation | 1%
OSPF & Routing | 3%
DHCP | 2%
NAT | 2%
Wireless | 3.5%
Testing | 1%
WAN Solution | 8%

(4/04/2022 v1.0) ...

April 12, 2023 · 9 min · jiezi

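On algorithms: Coding Interviews (剑指 Offer), 2nd edition: problem-solving notes, Java edition

03. Duplicate number in an array

Problem: all the numbers in an array nums of length n lie in the range 0 to n-1. Some numbers in the array are duplicated, but it is not known how many numbers are duplicated or how many times each is repeated. Find any one duplicated number in the array.

Example 1: input: [2, 3, 1, 0, 2, 5, 3]; output: 2 or 3.

Approach 1: sorting

Sort the array first, then traverse the sorted array and check whether nums[i] and nums[i+1] are equal; if they are, return that value immediately. This approach is fairly simple. The code is as follows:

    import java.util.Arrays;

    class Solution {
        public int findRepeatNumber(int[] nums) {
            Arrays.sort(nums);
            // After sorting, equal values sit next to each other.
            for (int i = 0; i < nums.length - 1; i++) {
                if (nums[i] == nums[i + 1]) {
                    return nums[i];
                }
            }
            return -1;  // no duplicate found
        }
    }

Complexity analysis:
• Time complexity: sorting the array is O(n log n) and traversing it is O(n), so the total is O(n log n).
• Space complexity: no extra storage is introduced, so the space complexity is O(1).

Approach 2: hashing

Introduce a hash table and traverse the array in order, using each element value as the key. If the key is absent, add it to the hash table; if the key is already present, return it immediately. The code is as follows:

    import java.util.HashMap;

    class Solution {
        public int findRepeatNumber(int[] nums) {
            HashMap<Integer, Integer> hashMap = new HashMap<Integer, Integer>();
            int result = -1;
            for (int i = 0; i < nums.length; i++) {
                if (hashMap.get(nums[i]) == null) {
                    hashMap.put(nums[i], 1);  // first time this value is seen
                } else {
                    result = nums[i];         // seen before: this is a duplicate
                    break;
                }
            }
            return result;
        }
    }

Complexity analysis: ...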

April 11, 2023 · 4 min · jiezi

On algorithms: MTRN4110 Phase A Task

MTRN4110 22T2 Phase A Task Description (Week 1-3)
Updated 3/6/2022: Asking macOS users to configure Makefile to use C++14 (Page 12, Section 4.1.20)
Released 30/5/2022

Overview of the Course Project:
The main project of MTRN4110 22T2 is a simulation-based project adapted from the Micromouse competition. Webots will be used as the simulation platform throughout the project. You will design a mobile robot and implement a controller and a vision program to negotiate a maze autonomously in Webots. The project will contribute 60% to your final mark in this course.

The project consists of four sequential phases which are connected, but attempting one phase is not dependent on the completion of another:
• Phase A: Driving and Perception (Week 1-3, 14%, individual)
• Phase B: Path Planning (Week 4-6, 14%, individual)
• Phase C: Computer Vision (Week 7-9, 14%, individual)
• Phase D: Integration and Improvement (Week 10-12, 18%, group)

This document will describe the tasks of Phase A. ...

April 11, 2023 · 16 min · jiezi

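On algorithms: Hash-algorithm guessing-game system: features and development source code example

Hashing is a kind of cryptographic algorithm, also called a hash function or digest function. A hash function is a public function that maps a message M of any length to a shorter, fixed-length value H(M), which is called the hash value or message digest. It is a one-way cryptographic system, that is, an irreversible mapping from plaintext to ciphertext: there is an encryption process but no decryption process.

1. Monotonicity: if some content has already been assigned to caches through hashing and new caches are then added to the system, the hash result should guarantee that previously assigned content maps either to its original cache or to a new cache, and never to a different cache in the old set.

2. Load: the load problem looks at spread from another angle. Since different clients may map the same content to different caches, a particular cache may likewise be mapped to different content by different users. Like spread, this should be avoided, so a good hash algorithm should keep the load on each cache as low as possible.

3. Balance: the hash results should be distributed across all caches as evenly as possible so that all cache space is used. Many hash algorithms satisfy this condition.

4. Spread: in a distributed environment a client may not see all the caches, only some of them. When clients map content to caches through hashing, different clients may see different sets of caches, so their hash results are inconsistent and the same content ends up mapped to different caches by different clients. This should clearly be avoided, because it stores the same content in several caches and lowers the storage efficiency of the system. Spread is defined as the severity of this situation. A good hash algorithm should avoid such inconsistency as far as possible, that is, keep spread as low as possible.

Hash guessing-game system development source code example

    /*
     * @Author: Carlos
     * @Date: 2020-07-2 23:48:50
     * @LastEditTime: 2020-07-2 23:48:50
     * @LastEditors: Carlos
     * @Description: Hash
     */
    #include "stdio.h"
    #include "stdlib.h"
    #define HASHSIZE 7 // hash table length (length of the array)
    #define NULLKEY -1
    typedef struct
    {
        int *elem; // base address of the stored elements (dynamically allocated array)
        int count; // current number of elements
    } HashTable;
    /**
     * @Description: initialise the hash table
     * @Param: HashTable *hashTable, pointer to the struct
     * @Return: none
     * @Author: Carlos
     */
    void Init(HashTable *hashTable)
    {
        int i;
        hashTable->elem = (int *)malloc(HASHSIZE * sizeof(int));
        hashTable->count = HASHSIZE;
        for (i = 0; i < HASHSIZE; i++)
        {
            hashTable->elem[i] = NULLKEY;
        }
    }
    /**
     * @Description: hash function (division/remainder method)
     * @Param: int data, the value to hash
     * @Return: address at which data is stored after hashing
     * @Author: Carlos
     */
    int Hash(int data)
    {
    ...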
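The C listing is cut off at the hash function, so how it resolves collisions is not shown. The sketch below illustrates the division (remainder) method named in the listing, and assumes linear probing for collisions and non-negative integer keys; both are assumptions for the example, not taken from the source. `HASHSIZE` and `NULLKEY` follow the listing, while `make_table`, `hash_key`, `insert` and `search` are hypothetical names.

```python
HASHSIZE = 7   # table length, as in the C listing
NULLKEY = -1   # marks an empty slot, so stored keys must be non-negative


def make_table():
    """Create an empty table with every slot set to NULLKEY."""
    return [NULLKEY] * HASHSIZE


def hash_key(data):
    """Division method: the remainder picks the home slot."""
    return data % HASHSIZE


def insert(table, data):
    """Place data at its home slot, probing linearly past occupied slots."""
    addr = hash_key(data)
    for _ in range(HASHSIZE):
        if table[addr] == NULLKEY or table[addr] == data:
            table[addr] = data
            return addr
        addr = (addr + 1) % HASHSIZE  # assumed collision policy: next slot
    raise OverflowError("hash table is full")


def search(table, data):
    """Return the slot holding data, or -1 if it is absent."""
    addr = hash_key(data)
    for _ in range(HASHSIZE):
        if table[addr] == data:
            return addr
        if table[addr] == NULLKEY:
            return -1  # an empty slot ends the probe sequence
        addr = (addr + 1) % HASHSIZE
    return -1
```

For example, 13 and 20 both have remainder 6, so under this policy 13 takes slot 6 and 20 wraps around to slot 0.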

April 10, 2023 · 1 min · jiezi

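On algorithms: Part 2: Initializing the Server

Part 2: Initializing the Server

Hello, welcome to this instalment of the blog. This time I will explain how Redis initializes its network state at start-up. Let's learn together!

First, look at where server initialization is called in the main function:

    int main(int argc, char *argv[]) {
        loadServerConfig(server.configfile, config_from_stdin, options);
        /* ***** */
        initServer();
        /* **** */
    }

Once the configuration file has been loaded into the global variable server, Redis initializes the server state according to the contents of that configuration. server is defined globally as follows:

    /* Global vars */
    struct redisServer server; /* Server global state */

Next, let's look at what initializing the server actually does.

1 Signals

    signal(SIGHUP, SIG_IGN);
    signal(SIGPIPE, SIG_IGN);
    setupSignalHandlers();
    makeThreadKillable();

signal(SIGHUP, SIG_IGN) ignores the SIGHUP signal, which is widely used in Unix and Unix-like operating systems. On Unix, when the controlling terminal hangs up, SIGHUP is sent to every process in the process group; it is commonly used to reinitialize a process. By default a process that receives SIGHUP terminates, but signal(SIGHUP, SIG_IGN) makes the process ignore the signal and so avoids being terminated. In the code above, when the Redis server receives SIGHUP it does nothing.

signal(SIGPIPE, SIG_IGN) ignores the SIGPIPE signal, which is raised by a write to a pipe or socket whose reading end has already been closed. In network socket communication, if the peer disconnects while the local socket is still writing data, SIGPIPE is raised and by default the program exits. Setting the SIGPIPE handler to SIG_IGN makes the program ignore the signal and avoids an abnormal exit.

    void setupSignalHandlers(void) {
        struct sigaction act;

        /* When the SA_SIGINFO flag is set in sa_flags then sa_sigaction is used.
         * Otherwise, sa_handler is used. */
        sigemptyset(&act.sa_mask);
        act.sa_flags = 0;
        act.sa_handler = sigShutdownHandler;
        sigaction(SIGTERM, &act, NULL);
        sigaction(SIGINT, &act, NULL);

        sigemptyset(&act.sa_mask);
        act.sa_flags = SA_NODEFER | SA_RESETHAND | SA_SIGINFO;
        act.sa_sigaction = sigsegvHandler;
        if(server.crashlog_enabled) {
            sigaction(SIGSEGV, &act, NULL);
            sigaction(SIGBUS, &act, NULL);
            sigaction(SIGFPE, &act, NULL);
            sigaction(SIGILL, &act, NULL);
            sigaction(SIGABRT, &act, NULL);
        }
        return;
    }

setupSignalHandlers() installs the signal handlers for the Redis process. On Unix/Linux, signals are an asynchronous communication mechanism used to deliver asynchronous events to processes. Redis handles several signals, for example SIGTERM, which requests termination of the process, and SIGINT, which interrupts the process. By installing handlers, Redis can respond to these asynchronous events, for example by cleaning up and shutting down gracefully when SIGTERM arrives. ...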

April 9, 2023 · 14 min · jiezi

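On algorithms: Deploy ChatGPT privately and say goodbye to network troubles

ChatGPT has been red-hot lately, and practically everyone has an account. Use the tool well: work for 5 minutes, slack off for the rest of the day.

Access to ChatGPT has become more and more restricted recently, though, and even reaching the official site runs into network problems. Today I (Xiaojuan) will introduce a way to have a ChatGPT of your own, so you no longer need to worry about being unable to reach it when you want to use it.

The project is an open-source ChatGPT project on GitHub, a demo based on the OpenAI GPT-3.5 Turbo API. Address: https://github.com/ddiu8081/chatgpt-demo

The result looks like this:

Steps

1. Install the Node.js environment on the server

Prepare an overseas server (a US node). Ubuntu is used as the example here.

Node: Node v18 or later is required.

    # Update the package index
    apt-get update
    # Install node
    apt-get install nodejs
    # Install npm
    apt-get install npm
    # Install the n module
    npm install -g n
    # Install the latest version of node
    sudo n latest

Finally check the version with node -v; v18 or above is fine.

2. Install pnpm

pnpm is recommended for managing dependencies. Install it with:

    npm i -g pnpm

3. Download the code

Clone the code from GitHub:

    git clone https://github.com/ddiu8081/chatgpt-demo.git

4. Install dependencies

After the code is downloaded, change into the chatgpt-demo directory and install the required dependencies:

    pnpm install

5. Add the API key

You need the key from your own ChatGPT account, available at: https://platform.openai.com/account/api-keys

Rename the .env.example file to .env and write your key into the .env file:

    # Rename the file
    mv .env.example .env
    # Write the key
    vim .env

Replace the key in the text below with your own key; afterwards press Esc and then type :wq to save and quit.

    OPENAI_API_KEY=sk-xxx...

6. Run the application

Run the project so that the application is reachable from the public internet. Execute the command below; when an IP and port number appear, it is running successfully.

    pnpm run dev --host 0.0.0.0

Note that the IP shown is usually the cloud server's internal IP, which cannot be reached directly; access the application at the server's public IP on port 3000.

Taking Alibaba Cloud as an example, every server has both a public IP and an internal IP; remember to use the public IP.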

April 9, 2023 · 1 min · jiezi

On algorithms: Implementation details of sorting algorithms


April 8, 2023 · 1 min · jiezi

On algorithms: COMP3121 Dynamic Programming

COMP3121/9101 22T2 - Assignment 4 (UNSW Sydney)
Due 28th July 2022 at 4pm Sydney time

In this assignment we apply dynamic programming. There are four problems each worth 20 marks, for a total of 80 marks.

Your solutions must be typed, machine readable PDF files. All submissions will be checked for plagiarism!

For each question requiring you to design an algorithm, you must justify the correctness of your algorithm. If a time bound is specified in the question, you also must argue that your algorithm meets this time bound.

To describe a dynamic programming algorithm, you must include:
• the subproblem specification,
• the recurrence,
• any base cases,
• how the overall answer is calculated, including (if necessary) the order in which the subproblems are solved, and
• the time complexity analysis.

The recurrence, base cases and final answer calculation must each be accompanied with (often brief) worded reasoning to justify the correctness of the algorithm.

Partial credit will be awarded for progress towards a solution.

Question 1
You are playing a video game, where you control a character in a grid with $m$ rows and $n$ columns. The character starts at the square in the top left corner $(1, 1)$, and must walk to the square in the bottom right corner $(m, n)$. The character can only move one square at a time downwards or rightwards. Every square $(i, j)$, other than the starting square and the ending square, contains a known number of coins $a_{i,j}$.

1.1 [10 marks] Design an algorithm which runs in $O(mn)$ time and determines the maximum number of coins that your character can accumulate by walking from $(1, 1)$ to $(m, n)$ using a combination of downwards and rightwards moves.

1.2 [10 marks] After playing this game many times, you have broken the controller, and you can no longer control your character. They now walk randomly as follows:
• if there is only one possible square to move to, they move to it;
• otherwise, they move right with probability $p$ and down with probability $1 - p$.
Note that this guarantees that the character arrives at $(m, n)$.

Design an algorithm which runs in $O(mn)$ time and determines the expected number of coins that your character will accumulate by walking from $(1, 1)$ to $(m, n)$ according to the random process above.

Recall that for a discrete random variable $X$ which attains values $x_1, \ldots, x_n$ with probabilities $p_1, \ldots, p_n$, the expected value of $X$ is defined as
$$E[X] = \sum_{i=1}^{n} p_i x_i.$$

Question 2
You are managing a garage with two mechanics, named Alice and Bob. Each mechanic can serve at most one customer at a time.

There will be $n$ customers who come in during the day. The $i$th customer wants to be served for the entire period of time starting at time $s_i$ and ending at time $e_i$. You may assume that the customers are indexed by their order of arrival, i.e. $s_1 < s_2 < \ldots < s_n$.

For each customer $i$, the business will:
• make $a_i$ dollars if customer $i$ is served by Alice;
• make $b_i$ dollars if customer $i$ is served by Bob;
• lose $c_i$ dollars if customer $i$ is not served.

Your task is to maximise the net earnings of the garage, which is calculated as the total amount made minus the total amount lost.

2.1 [8 marks] Consider the following greedy algorithm.
Process each customer $i$ in order of arrival as follows.
• If both Alice and Bob are available at time $s_i$:
  – if $a_i \ge b_i$, assign customer $i$ to Alice;
  – otherwise, assign the customer to Bob.
• If only one mechanic is available at time $s_i$, assign customer $i$ to that mechanic.
• If neither mechanic is available at time $s_i$, do not serve customer $i$.

Design an instance of the problem which is not correctly solved by this algorithm. You must:
• specify a number of customers $n$,
• for each customer, provide values for $s_i$, $e_i$, $a_i$, $b_i$ and $c_i$,
• apply the greedy algorithm to this instance and calculate the net earnings achieved, and
• show that a higher net earnings figure can be achieved.

2.2 [12 marks] Design an algorithm which runs in $O(n^2)$ time and determines the maximum net earnings of the garage.

Question 3
You are given a simple directed weighted graph with $n$ vertices and $m$ edges. The edge weights may be negative, but there are no cycles whose sum of edge weights is negative.

3.1 [10 marks] An edge $e$ is said to be useful if there is some pair of vertices $u$ and $v$ such that $e$ belongs to at least one shortest path from $u$ to $v$.
Design an algorithm which runs in $O(n^3)$ time and determines the set of useful edges.

3.2 [10 marks] An edge $e$ is said to be very useful if there is some pair of vertices $u$ and $v$ such that $e$ belongs to every shortest path from $u$ to $v$.
Design an algorithm which runs in $O(n^3)$ time and determines the set of very useful edges.

Question 4
There are $2n$ players who have signed up to a chess tournament. For all $1 \le i \le 2n$, the $i$th player has a known skill level of $s_i$, which is a non-negative integer. Let $S = \sum_{i=1}^{2n} s_i$, the total skill level of all players.

In the tournament, there will be $n$ matches. Each match is between two players, and each player will play in exactly one match. The imbalance of a match is the absolute difference between the skill levels of the two players. That is, if a match is played between the $i$th player and the $j$th player, its imbalance is $|s_i - s_j|$. The total imbalance of the tournament is the sum of imbalances of each match.

The organisers have provided you with a value $m$ which they consider to be the ideal total imbalance of the tournament.

Design an algorithm which runs in $O(n^2 S)$ time and determines whether or not it is possible to arrange the matches in order to achieve a total imbalance of $m$, assuming:
4.1 [4 marks] all $s_i$ are either 0 or 1;
4.2 [16 marks] the $s_i$ are distinct non-negative integers. ...
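As an illustration of the recurrence style the assignment asks for, here is one possible sketch of the grid walk in Question 1.1, written in Python with 0-based indices. It assumes the input `coins` is a list of m rows of n non-negative integers with 0 in the start and end squares; the name `max_coins` is hypothetical, and this is a sketch under those assumptions rather than a model answer.

```python
def max_coins(coins):
    """coins[i][j] is the coin count on square (i+1, j+1); return the best total."""
    m, n = len(coins), len(coins[0])
    # best[i][j] = most coins collectable on any path from the start to (i, j).
    best = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                prev = 0                       # base case: the starting square
            elif i == 0:
                prev = best[i][j - 1]          # top row: can only arrive from the left
            elif j == 0:
                prev = best[i - 1][j]          # left column: can only arrive from above
            else:
                prev = max(best[i - 1][j], best[i][j - 1])
            best[i][j] = prev + coins[i][j]
    return best[m - 1][n - 1]
```

Each of the mn table entries is filled in constant time from entries already computed, which gives the O(mn) bound.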

April 7, 2023 · 5 min · jiezi

About algorithms: COSC1073-RMIT-Classification

RMIT Classification: Trusted
COSC1073 Programming 1, Semester 1 2022

Background Information
For this assessment you need to write an object-oriented console application in the Java programming language which adheres to the following basic object-oriented programming principles:
a) Your code should follow good object-oriented principles such as encapsulation, composition, cohesion, etc.
b) Set the visibility of all instance variables to private or protected.
c) Use getter and setter methods only where needed and appropriate, with consideration given to scope and visibility.
d) Only use static variables and methods when there is good reason to do so, such as reducing code repetition whilst still maintaining good object-oriented design.
e) Use superclass methods to retrieve and/or manipulate superclass properties from within subclasses, avoiding direct access to class attributes if possible.
You are being assessed on your ability to write a program in an object-oriented manner. Writing the program in a procedural style is not permitted and will be considered inadmissible, receiving a zero grade.

Programming guidelines
The following guidelines apply throughout this assessment:
f) Arrays should be appropriately sized.
g) You must use an array to store any collections of objects in your program.
h) You must not use ArrayList, HashMap or any other data structures within the Java API or from 3rd party libraries.
i) You are not permitted to use streams from Java 1.8.
j) You must use the provided program template, and must not modify the main() method.

Overview
The manager of a theatre wants you to build a program that allows customers to buy tickets online. The system will be able to:
• Create a sale or a cart.
• Add tickets to the cart.
• Display the details of the cart.
• Update the cart.
• Check out.
You will be addressing these requirements by implementing a series of classes designed to meet these needs.
The classes must be designed according to object-oriented principles.

Getting Started
This assessment is divided up into several stages:
• Stage 1 – A ticket to be added (6 marks)
• Stage 2 – A cart to hold the ticket(s) (9 marks)
• Stage 3 – Updating the ticket(s) in the cart (9 marks)
• Stage 4 – VIP features (10 marks)
• Code quality (6 marks)
The assessment is designed to be easy at earlier stages but more difficult in later stages. The specification will be more descriptive in the earlier stages. The later stages of the assessment are less descriptive as you will need to be more independent and resolve design issues on your own.

Stage 1 – A ticket to book
In this stage, you will implement a basic online shopping cart. You are required to create one class: Ticket. Build the Ticket class with the following specifications:
• Instance variables
  o String name
  o int price
  o int quantity
• No-argument constructor Ticket()
  o Initialize name to String “UNKNOWN”
  o Initialize price to -1
  o Initialize quantity to 0
• Constructor Ticket(String name, int price, int quantity)
  o Initialize instance variable name to name if the latter is not null
  o Initialize instance variable price to price if the latter is >= 0
  o Initialize instance variable quantity to quantity if the latter is >= 1
  o If any of the parameters passed in is invalid, throw IllegalArgumentException.
• Public mutators & accessors
  o setName(String name) & getName()
  o setPrice(int price) & getPrice()
  o setQuantity(int quantity) & getQuantity()
  o Setters should consider the validity of the parameter passed in
• Public method getTotalPrice() which returns the total cost of the item to purchase. The total price is computed using the formula: price * quantity
• Override the toString() method which returns the following String:
  o “name quantity X $price = $totalCost”
The CartManager class contains the main() method.
Complete the stage1(Ticket ticket) method, which prompts the user for one ticket and sets the attributes of the Ticket passed into stage1(). Print the information of the item. Example outputs are as follows. Given the same user input, your program is expected to produce the same output as follows. Only minor format differences (e.g., an extra line break) are allowed.
  Stage 1
  Enter the name of the ticket:
  The Matrix Reloaded
  Enter the ticket price:
  15
  Enter the quantity:
  3
  The Matrix Reloaded 3 X $15 = $45

Stage 2 – A basic shopping cart
You must complete Stage 1 before attempting Stage 2. If Stage 1 has not been completed Stage 2 will not be assessed.
In this stage, you will create a basic shopping cart. The shopping cart is supposed to (1) store a customer’s information; (2) allow a customer to add items; and (3) compute the total cost of the items within the shopping cart. Implement the class ShoppingCart. Specifically, the ShoppingCart class should contain:
• Private fields
  o String customerName
  o String date
  o Ticket[] inCartTickets
  o static final int CAPACITY of value 5 (provided)
  o int count
• No-argument constructor ShoppingCart()
  o Initialize customerName to String “UNKNOWN”
  o Initialize date to String “1 May 2022”
  o Initialize inCartTickets to an empty array of size CAPACITY
  o Initialize count to 0
• Constructor ShoppingCart(String name, String date)
  o Initialize instance variable name to name
  o Initialize instance variable date to date
  o Initialize inCartTickets to an empty array of size CAPACITY
  o Initialize count to 0
• Public accessor getName() which returns the customer name of the shopping cart
• Public mutator setName(String name) which sets the customer name
• Public accessor getDate() which returns the sign up date of the customer
• Public mutator setDate(String date) which sets the current date of shopping cart
• Public member method add(Ticket ticket), which adds the ticket parameter to the inCartTickets array.
The method determines whether the ticket passed in is not null and is different to any ticket in inCartTickets. If yes, it adds ticket to the shopping cart and returns true. Otherwise, it prints a message “Ticket invalid or already added.” and returns false. Note, all tickets should have different names ignoring cases. “Star wars”, “star wars” and “Star Wars” are considered the same. If the shopping cart is full, the ticket then cannot be added. It prints “SHOPPING CART IS FULL” and returns false.
• Public member method getCount() which returns the quantity of all tickets in the cart. Note that the quantity of all tickets must be computed as the total quantity of all different tickets in the cart. E.g. if there is 1 sale of 3 tickets and 1 sale of 1 ticket, the method will return 4, not 2.
• Public member method getCost() which returns the total cost of the tickets in the cart.
• Public member method printTotal() which outputs (1) the customer and the date; (2) the number of total sales in the shopping cart; (3) the information of each sale in the shopping cart; and (4) the total cost of the shopping cart. If the cart is empty, it outputs (1) the customer and the date; and (2) the message “SHOPPING CART IS EMPTY.”. Please refer to the example outputs when implementing this method.
Complete the stage2(ShoppingCart cart) method in CartManager. Ask the user to enter a name and date. Set the customerName and date of the cart passed in. Print the information of the shopping cart created by calling printTotal() (note, this is the first time printTotal() is called). Prompt the user for two ticket sales, create two objects of the Ticket class, and add these two objects into the shopping cart created. Hint: Before prompting for the second item, you may need to call scanner.nextLine(); to allow the user to input a new string (depends on your program).
Print the information of the shopping cart by calling printTotal() (calling printTotal() for the second time).
Example outputs are as follows. Given the same user input, your program is expected to produce the same output. Only minor format differences (e.g., an extra line break) are allowed.
Example output after printTotal() is called for the first time:
  Stage 2
  Enter the name of the customer:
  Andrew Smith
  Enter the current date:
  1 May 2022
  Andrew Smith - 1 May 2022
  SHOPPING CART IS EMPTY
Example output after printTotal() is called for the second time:
  Enter the name of the ticket:
  The Matrix Reloaded
  Enter the ticket price:
  15
  Enter the quantity:
  3
  Enter the name of the ticket:
  Top Gun
  Enter the ticket price:
  25
  Enter the quantity:
  2
  Andrew Smith - 1 May 2022
  Number of tickets: 5
  The Matrix Reloaded 3 X $15 = $45
  Top Gun 2 X $25 = $50
  Total cost: $95

Stage 3 – Updating the shopping cart
You must complete Stage 2 before attempting Stage 3. If Stage 2 has not been completed Stage 3 will not be assessed.
In this stage, you will need to allow the user to update the items already added to the shopping cart. You need to extend the ShoppingCart class by adding:
• Public method removeTicket(String ticketName)
  o It removes a ticket from the shopping cart if its name matches the parameter ticketName. After removing, output this message “Ticket ticketName removed from the cart”. It returns true in this case.
  o If no item in the cart matches ticketName, print this message “Ticket not found. Cart remains unchanged.” It returns false in this case.
• Public method updateTicket(String ticketName)
  o Modifies a ticket’s quantity and returns true if successful.
  o If an item whose name matches the parameter ticketName can be found in the cart, prompt the user to enter the new quantity with the message “Please enter the new quantity:”.
Update the ticket quantity according to the user input.
  o If the ticket cannot be found (by ticketName), print the message: “Ticket not found. Cart remains unchanged.” and return false.
• Public method checkout()
  o Generate a summary of the shopping cart by calling printTotal(). Allow the customer to check out.
  o Removes all items from the shopping cart and returns true.
  o If the shopping cart is empty, do not generate a summary. Instead, output the message “SHOPPING CART IS EMPTY.” and return false.
Implement the stage3() method in CartManager. Note that in main(), stage3() will work on the cart object that has been updated by Stage 2. Ask the user for the ticket to be removed. Perform the removal by calling removeTicket(String ticketName), and print the summary of the cart by calling printTotal(). Next, ask for the ticket name to be modified, then call updateTicket(String ticketName). Print the information of the shopping cart by calling the printTotal() method. Finally, prompt the user to check out and call the checkout() method.
Example outputs are as follows. Given the same user input, your program is expected to produce the same output as follows. Only minor format differences (e.g., an extra line break) are allowed.
Example of removing a ticket (the name match is case insensitive):
  Stage 3
  Do you want to remove a ticket from the cart? Y/N
  Y
  Enter the name of the ticket:
  top gun
  Top Gun removed from the shopping cart.
  Andrew Smith - 1 May 2022
  Number of tickets: 3
  The Matrix Reloaded 3 X $15 = $45
  Total cost: $45
Another example, where no such ticket exists:
  Stage 3
  Do you want to remove a ticket from the cart? Y/N
  Y
  Enter the name of the ticket:
  Topgun
  Ticket not found. Cart remains unchanged.
  Andrew Smith - 1 May 2022
  Number of tickets: 5
  The Matrix Reloaded 3 X $15 = $45
  Top Gun 2 X $25 = $50
  Total cost: $95
Example outputs of updateTicket() and printTotal(). Assume 2 tickets of “Top Gun” in the cart (the name match is case insensitive):
  Do you want to update a ticket from the cart? Y/N
  Y
  Enter the name of the ticket:
  Top gun
  Please enter the new quantity:
  4
  Andrew Smith - 1 May 2022
  Number of tickets: 7
  The Matrix Reloaded 3 X $15 = $45
  Top Gun 4 X $25 = $100
  Total cost: $145
Example outputs after calling checkout() if the cart is not empty:
  Do you want to checkout? Y/N
  Y
  Andrew Smith - 1 May 2022
  Number of tickets: 7
  The Matrix Reloaded 3 X $15 = $45
  Top Gun 4 X $25 = $100
  Total cost: $145
  Thank you for shopping

Stage 4 – VIP Shopping Cart
You must complete Stage 3 before attempting Stage 4. If Stage 3 has not been completed Stage 4 will not be assessed.
In this stage, you will extend the program to create a shopping cart for VIP customers. VIP customers have a few privileges:
(1) If the total price of the purchased tickets is greater than or equal to $100, then one ticket of the last sale is free. Using the last example from Stage 3, Andrew Smith ordered 4 tickets for Top Gun but only needs to pay for 3, i.e. $75.
(2) Collect points after check-out. The number of points collected is equal to the cost that a customer actually pays at check-out. If a VIP pays $75 at check-out, then 75 points will be rewarded.
(3) A VIP customer can use the points to pay at check-out. Specifically, every 20 points can be used equivalently as $1. When a VIP proceeds to check-out, your program should prompt the user for the number of points he/she wishes to use. When using points to pay, a customer can only use multiples of 20 points, e.g., 20, 40, 100 points etc. If the customer enters a number that is not a multiple of 20, the nearest lower multiple of 20 will be used, e.g. 80 for 95, 40 for 49. If the customer enters a number greater than the total points available, print the message: “Not enough points. Please re-enter or enter -1 to quit point redeeming.”. If a valid number is entered, deduct the points from the balance and continue the checkout.
If -1 is entered, the program will proceed with the normal checkout.
To achieve the above requirements, you will need to implement a new class VIPCart that is a subclass of the ShoppingCart class. Additional private fields, new methods or overridden methods may be added. You then need to implement the stage4() method. In this method, prompt the user to create a VIP shopping cart. Add one item to the shopping cart, and then call checkout().
Example outputs are as follows. Given the same user input, your program is expected to produce the same output as follows. Only minor format differences (e.g., an extra line break) are allowed.
Example: a VIP ordered 3 tickets of “The Matrix Reloaded”, and redeemed 100 points at checkout (100 points are redeemed, not the 118 entered):
  Stage 4
  Enter the name of the VIP customer:
  Andrew Smith
  Enter the current date:
  1 May 2022
  Enter the VIP points available:
  350
  Enter the name of the ticket:
  The Matrix Reloaded
  Enter the ticket price:
  15
  Enter the quantity:
  3
  Andrew is a VIP!
  Andrew Smith - 1 May 2022
  Number of tickets: 3
  The Matrix Reloaded 3 X $15 = $45
  Total cost: $45
  Do you want to checkout? Y/N
  Y
  How many points to redeem? Enter -1 to checkout without using points.
  118
  Redeemed 100 points for $5
  Total price paid: $40
  40 VIP pointed added. Available 290 points.
  Thank you for shopping
Andrew from the above example is now buying two sets of tickets. Assume he has 350 VIP points. The last ticket is free, so he only pays for 3 of the 4 Top Gun tickets:
  Enter the name of the ticket:
  The Matrix Reloaded
  Enter the ticket price:
  15
  Enter the quantity:
  3
  Enter the name of the ticket:
  Top Gun
  Enter the ticket price:
  25
  Enter the quantity:
  4
  Andrew Smith - 1 May 2022
  Number of tickets: 7
  The Matrix Reloaded 3 X $15 = $45
  Top Gun 4 X $25 = $75 (with VIP discount)
  Total cost: $120
  Do you want to checkout? Y/N
  Y
  How many points to redeem? Enter -1 to checkout without using points.
  -1
  No VIP points used
  Total price paid: $120
  120 VIP pointed added. Available 470 points.
  Thank you for shopping

Marking Guide
Implementation of Ticket class [5 marks]
• Private fields, mutators, and accessors [1 mark]
• Constructors [1 mark]
• Mutators & accessors [1 mark]
• getTotalPrice() [1 mark]
• toString() [1 mark]
Implementation of stage1() [1 mark]
Implementation of ShoppingCart class [8 marks]
• Private fields, mutators and accessors [1 mark]
• Constructors [1 mark]
• add() [2 marks]
• getCount() [1 mark]
• getCost() [1 mark]
• printTotal() [2 marks]
Implementation of stage2() [1 mark]
Extending ShoppingCart class [7 marks]
• removeTicket() [3 marks]
• updateTicket() [3 marks]
• checkout() [1 mark]
Implementation of stage3() [2 marks]
Implementation of VIPCart class [8 marks]
• VIP points accumulation [2 marks]
• Redeeming VIP points [3 marks]
• VIP discount for order over $100 [3 marks]
Implementation of stage4() [2 marks]
Code quality [6 marks]
• No duplicate code (segments with overlap of >= 3 lines) [1 mark]
• Proper variable names [1 mark]
• Comments [1 mark]
• Consistent indentation [1 mark]
• No overly long methods (methods >= 50 lines) [1 mark]
• No magic numbers [1 mark]
Total [40 marks]
Penalties ...
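The Stage 4 point arithmetic (20 points per $1, rounding a request down to a multiple of 20, earning one point per dollar actually paid) can be sketched as follows. This is a language-neutral illustration written in Python rather than the Java the assessment requires; the function names are hypothetical, and the sketch ignores the "last ticket free" discount and the case where redeemed points would exceed the total cost, neither of which the specification pins down.

```python
POINTS_PER_DOLLAR = 20  # from the spec: every 20 points is worth $1


def redeemable_points(requested, balance):
    """Points actually redeemed for a request, or None if not enough points.

    A request that is not a multiple of 20 is rounded down to the
    nearest lower multiple of 20 (e.g. 95 -> 80, 49 -> 40).
    """
    if requested < 0:
        raise ValueError("requested points must be non-negative")
    if requested > balance:
        return None  # caller should print the "Not enough points" message
    return (requested // POINTS_PER_DOLLAR) * POINTS_PER_DOLLAR


def vip_checkout(total_cost, balance, requested):
    """Return (points_used, amount_paid, new_balance).

    requested == -1 means "check out without using points".
    """
    if requested == -1:
        used = 0
    else:
        used = redeemable_points(requested, balance)
        if used is None:
            raise ValueError("not enough points")
    paid = total_cost - used // POINTS_PER_DOLLAR
    new_balance = balance - used + paid  # one point earned per dollar paid
    return used, paid, new_balance
```

With the figures from the first Stage 4 example, `vip_checkout(45, 350, 118)` redeems 100 points, charges $40 and leaves 290 points; with the second, `vip_checkout(120, 350, -1)` charges $120 and leaves 470 points.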

April 7, 2023 · 14 min · jiezi

About algorithms: CS-369 algorithm problems

See Canvas for due dates.
In the first part of this assignment, we use a Hidden Markov Model to model secondary structure in protein sequences and implement a couple of algorithms we saw in lectures. In the second part, we simulate sequences down a tree according to the Jukes-Cantor model then use distance methods to try to reconstruct the tree.
Write your code in Python and present your code embedded in a report in a Jupyter Notebook. Make sure you test your code thoroughly and write clear, commented code that others can understand.
Submit two files to Canvas: the .ipynb and .html, both showing code and results, by 10pm on the due date.
There are 30 marks in total for this assessment. ...
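Distance methods under the Jukes-Cantor model usually start from the standard corrected distance $d = -\tfrac{3}{4}\ln\left(1 - \tfrac{4}{3}p\right)$, where $p$ is the proportion of sites at which two aligned sequences differ. A minimal sketch, assuming equal-length, gap-free sequences; the function names are illustrative and are not taken from the assignment:

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance: d = -(3/4) ln(1 - (4/3) p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at saturation; report an infinite distance.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A matrix of these pairwise distances is the usual input to tree-building methods such as UPGMA or neighbour joining.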

April 6, 2023 · 7 min · jiezi

About algorithms: ECON7150 question-type analysis

Assignment 2
Due June 3 2022 at 4 PM
Instructions:
• All answers to the assignment must be neatly hand written.
• Scan your assignment and save it as one pdf document. If you do not have access to a flatbed scanner you can use a phone app such as “Adobe Scan” or “Microsoft Office Lens”.
• Submit your assignment on Blackboard through Turnitin.
• Make sure you show all steps, key formulae, and workings clearly. Final solutions should be simplified as much as possible and either highlighted or circled. Round to the nearest hundredth if necessary.
• 100 Marks - 30% of overall assessment ...

April 6, 2023 · 4 min · jiezi

About algorithms: CSE-101 data structures and algorithms

Introduction to Data Structures and Algorithms
Programming Assignment 8
In this project you will re-create the Dictionary ADT from pa7, but now based on a Red-Black Tree. Red-black trees are covered in Chapter 13 of the text, and will be discussed at length in lecture. All relevant algorithms for RBTs (and BSTs) are posted on the webpage under Examples/Pseudo-code. Aside from having a RBT as its underlying data structure, your Dictionary ADT will have only slight changes to its interface. The recommended approach for this project is to just copy Dictionary.cpp from pa7 and make the necessary changes, but you can start from scratch if you feel it is necessary. The header file Dictionary.h is posted in Examples/pa8. Its most significant difference from the header file for pa7 is a new Node field of type int called color. Other than that, the only difference is a new section for RBT helper functions. Although these functions are listed as optional, and you may make changes as you like, you should consider them as absolutely necessary for this project. ...

April 6, 2023 · 4 min · jiezi

About algorithms: FIT-5003 software security algorithms

FIT 5003 Software Security
2022 S1
Environment Setup
In this unit, hands-on labs and assignments will be conducted on a dedicated virtual machine image. You are strongly advised to follow the guidelines to set up your own hands-on environment before doing your assignments and labs. The environment includes the virtual machine software, e.g., VirtualBox and Linux (Ubuntu 16.04), with which you can work on the assignments and labs using your own personal computers. Getting familiar with them is critical.
Virtual Machine Software:
VirtualBox is recommended for the assignment in this unit; it is open-source and completely free. It can be downloaded at https://www.virtualbox.org/wiki/Downloads. We note that other virtual machine software like VMware Player and Parallels Desktop is also compatible.
Go to the download page shown below and choose the appropriate installation package according to your host operating system: ...

April 6, 2023 · 3 min · jiezi

About algorithms: CSCI-3110 data structures and algorithms

CSCI 3110 Assignment 1, posted: 04.05.2022
Instructor: Travis Gagie, due: midnight 20.05.2022
You can work in groups of up to three people. One group member should submit a copy of the solutions on Brightspace, with all members’ names and banner numbers on it; the other group members should submit text files with all members’ names and banner numbers. (Brightspace won’t let us assign marks to people who haven’t submitted anything!) You may consult with other people but each group should understand the solutions: after discussions with people outside the groups, discard any notes and do something unrelated for an hour before writing up your solutions; it’s a problem if no one in a group can explain one of their answers. For programming questions you should submit your code, which should compile and run correctly to receive full marks. ...

April 5, 2023 · 8 min · jiezi

About algorithms: MTH2222 mathematical algorithms

MTH2222 Mathematics of Uncertainty
Sem 1, 2022
Assignment 2
Due on Tuesday May 3rd by 5 pm. Submission via Moodle using the Assignment folder.
MTH2222 students' work is assessed on questions 1, 2, 3, 4, 5, 6. MTH2225 students' work is assessed on questions 2, 3, 4, 5, 6, 7.

1. The goal of this problem is to show the following. If $X$ and $Y$ are normally distributed and are uncorrelated, then they might still be dependent! Suppose that $X$ has a standard normal distribution. Let $\varepsilon$ be a random variable, independent of $X$, which takes values in $\{-1, 1\}$, each with probability $1/2$.
(a) Find the distribution of $Y = \varepsilon X$. [2 marks]
(b) Are $X$ and $Y$ independent? Justify your answer. [3 marks]
(c) Are $X$ and $Y$ correlated? Justify your answer. [3 marks]
(d) Is $(X, Y)$ bivariate normal? Justify your answer. [2 marks]
[10 marks]

2. Suppose that $X_1, X_2$ are independent geometric with parameter $p$, where $p \in (0, 1/2]$. Find the $p$ which maximises the probability of the event $X_1 = X_2$. [4 marks]

3. Let $X$ be Poisson with parameter 1. Prove (step by step) that
$$P(X < 4) = \int_1^\infty \frac{1}{6} x^3 e^{-x}\,dx.$$
[4 marks]

4. Let $X$ be a random variable with MGF
$$M_X(t) = \lambda^t e^{t^2},$$
for some parameter $\lambda > 0$. Find $P(X > \ln \lambda)$. [4 marks]

5. Find the constant $c$ such that
$$f(x) = c\,e^{-x - e^{-x}}, \quad x \in \mathbb{R},$$
is a probability density function. [4 marks]

6. Let $p_1 < p_2 < p_3 < \dots$ be the prime numbers, i.e. natural numbers which are not the product of two smaller natural numbers (1 is not prime with this definition). For all $i \in \mathbb{N}$, let $\lambda_i = p_i^{-2}$, and let $X_i$ be a random variable taking values in $\{0, 1, 2, \dots\}$, such that
$$P(X_i = k) = (1 - \lambda_i)\lambda_i^k.$$
Assume the $(X_i)_i$ are independent. Let $M = \prod_{i=1}^{\infty} p_i^{X_i}$. Find the p.m.f. of $M$. You might need that $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$ and that each natural number has a unique decomposition in terms of products of primes. [6 marks]

7. For MTH2225 students only. Let $(S_n)_n$ be a simple random walk. Find the (approximate) probability $P(S_{10000} \ge 100)$. [4 marks] ...
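A quick numerical sanity check of the Poisson identity can be useful before attempting the step-by-step proof. The sketch below reads the identity as $P(X < 4) = \int_1^\infty \tfrac{1}{6} x^3 e^{-x}\,dx$ for $X$ Poisson with parameter 1, and compares the finite sum with a trapezoid-rule approximation of the integral. The truncation point 60 and the step count are arbitrary choices for illustration; a numerical match is not a proof.

```python
import math


def poisson_prob_below(k, rate=1.0):
    """P(X < k) for X ~ Poisson(rate), as a finite sum of the p.m.f."""
    return sum(math.exp(-rate) * rate ** j / math.factorial(j) for j in range(k))


def tail_integral(lower=1.0, upper=60.0, steps=200_000):
    """Trapezoid-rule approximation of the integral of x^3 e^{-x} / 6.

    The integrand is negligible beyond `upper`, so truncating the
    infinite range there costs far less than the quadrature error.
    """
    def integrand(x):
        return x ** 3 * math.exp(-x) / 6.0

    h = (upper - lower) / steps
    total = 0.5 * (integrand(lower) + integrand(upper))
    for i in range(1, steps):
        total += integrand(lower + i * h)
    return total * h


if __name__ == "__main__":
    print(poisson_prob_below(4), tail_integral())
```

Both quantities equal $\tfrac{8}{3e}$, since $e^{-1}\left(1 + 1 + \tfrac12 + \tfrac16\right) = \tfrac{8}{3}e^{-1}$.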

April 4, 2023 · 2 min · jiezi

About algorithms: CSE3BDC big data processing methods explained

La Trobe University
Department of Computer Science and Computer Engineering
CSE3BDC Assignment 2022

Objectives
• Gain in-depth experience playing around with big data tools (Hive, Spark RDDs, and Spark SQL).
• Solve challenging big data processing tasks by finding highly efficient solutions.
• Experience processing three different types of real data:
  a. Standard multi-attribute data (Bank data)
  b. Time series data (Twitter feed data)
  c. Bag of words data.
• Practice using programming APIs to find the best API calls to solve your problem. Here are the API descriptions for Hive and Spark (for Spark, look especially under RDD; there are a lot of really useful API calls).
  a) [Hive] https://cwiki.apache.org/confluence/display/Hive/LanguageManual
  b) [Spark] http://spark.apache.org/docs/latest/api/scala/index.html#package
  c) [Spark SQL] https://spark.apache.org/docs/latest/sql-programming-guide.html
     https://spark.apache.org/docs/latest/api/scala/index.html#org...t
     https://spark.apache.org/docs/latest/api/sql/index.html
• If you are not sure what a Spark API call does, try to write a small example and try it in the Spark shell.

This assignment is due 10:00 a.m. on Friday 20th of May, 2022.
Penalties are applied to late assignments (accepted up to 5 business days after the due date only). Five percent is deducted per business day late. A mark of zero will be assigned to assignments submitted more than 5 days late.
This is an individual assignment. You are not permitted to work as a part of a group when writing this assignment.

Submission checklist
• Ensure that all of your solutions read their input from the full data files (not the small example versions)
• Check that all of your solutions run without crashing in the docker containers that you used in the labs.
• Delete all output files
• Archive up everything into a single zip file and submit your assignment via LMS

Copying, Plagiarism
Plagiarism is the submission of somebody else’s work in a manner that gives the impression that the work is your own.
For individual assignments, plagiarism includes the case where two or more students work collaboratively on the assignment. The Department of Computer Science and Computer Engineering treats plagiarism very seriously. When it is detected, penalties are strictly imposed.

Expected quality of solutions
a) In general, writing more efficient code (less reading/writing from/into HDFS and fewer data shuffles) will be rewarded with more marks.
b) This entire assignment can be done using the docker containers supplied in the labs and the supplied data sets without running out of memory. It is time to show your skills!
c) I am not too fussed about the layout of the output, as long as it looks similar to the example outputs for each task. That will be good enough. The idea is not to spend too much time massaging the output into the right format but instead to spend the time solving problems.
d) For Hive queries, we prefer answers that use fewer tables.

The questions in the assignment will be labelled using the following:
• [Hive]
  o Means this question needs to be done using Hive
• [Spark RDD]
  o Means this question needs to be done using Spark RDDs; you are not allowed to use any Spark SQL features like dataframes or datasets.
• [Spark SQL]
  o Means this question needs to be done using Spark SQL and therefore you are not allowed to use RDDs. In addition, you need to do these questions using the Spark dataframe or dataset API; do not use SQL syntax.

Assignment structure:
• A script which puts all of the data files into HDFS automatically is provided for you. Whenever you start the docker container again you will need to run the following script to upload the data to HDFS again, since HDFS state is not maintained across docker runs:
  $ bash put_data_in_hdfs.sh
  The script will output the names of all of the data files it copies into HDFS.
  If you do not run this script, solutions to the Spark questions will not work since they load data from HDFS.
• For each Hive question a skeleton .hql file is provided for you to write your solution in. You can run these just like you did in labs:
  $ hive -f Task_XX.hql
• For each Spark question, a skeleton project is provided for you. Write your solution in the .scala file in the src directory. Build and run your Spark code using the provided scripts:
  $ bash build_and_run.sh

Tips:
• Look at the data files before you begin each task. Try to understand what you are dealing with!
• For each subtask we provide small example input and the corresponding output in the assignment specifications below. These small versions of the files are also supplied with the assignment (they have “-small” in the name). It’s a good idea to get your solution working on the small inputs first before moving on to the full files.
• In addition to testing the correctness of your code using the very small example input, you should also use the large input files that we provide to test the scalability of your solutions.
• It can take some time to build and run Spark applications from .scala files. So for the Spark questions it’s best to experiment using spark-shell first to figure out a working solution, and then put your code into the .scala files afterwards.

Task 1: Analysing Bank Data [38 marks total]
We will be doing some analytics on real data from a Portuguese banking institution [1].
The datais stored in a semicolon (“;”) delimited format.The data is supplied with the assignment at the following locations:Small version Full versionTask_1/Data/bank-small.csv Task_1/Data/bank.csvThe data has the following attributesAttributeindexAttributenameDescriptionage numericjob type of job (categorical: "admin.", "unknown", "unemployed","management", "housemaid", "entrepreneur", "student",“blue-collar", "self-employed", "retired", "technician", "services")marital marital status (categorical: "married", "divorced", "single"; note:"divorced" means divorced or widowed)education (categorical: "unknown", "secondary", "primary", "tertiary")default has credit in default? (binary: "yes", "no")balance average yearly balance, in euros (numeric)housing has housing loan? (binary: "yes", "no")loan has personal loan? (binary: "yes", "no")contact contact communication type (categorical: “unknown", "telephone","cellular")day last contact day of the month (numeric)month last contact month of year (categorical: "jan", "feb", "mar", ...,"nov", "dec")duration last contact duration, in seconds (numeric)campaign number of contacts performed during this campaign and for thisclient (numeric, includes last contact)pdays number of days that passed by after the client was last contactedfrom a previous campaign (numeric, -1 means client was notpreviously contacted)previous number of contacts performed before this campaign and for thisclient (numeric)poutcome outcome of the previous marketing campaign (categorical:"unknown","other","failure","success")termdeposit has the client subscribed a term deposit? 
(binary: "yes","no"): Banking data source: http://archive.ics.uci.edu/ml/datasets/Bank+MarketingHere is a small example of the bank data that we will use to illustrate the subtasks below (weonly list a subset of the attributes in this example, see the above table for the description ofthe attributes):job marital education balance loanmanagement Married tertiary 2143 Yestechnician Divorced secondary 29 Yesentrepreneur Single secondary 2 Noblue-collar Married unknown 1506 Noservices Divorced secondary 829 Yestechnician Married tertiary 929 YesManagement Divorced tertiary 22 Notechnician Married primary 10 NoPlease note we specify whether you should use [Hive] or [Spark RDD] for each subtask at thebeginning of each subtask.a) [Hive] Report the number of clients of each job category. Write the results to“Task_1a-out”. For the above small example data set you would report the following(output order is not important for this question):"blue-collar" 1"entrepreneur" 1"management" 2"services" 1"technician" 3[8 marks]b) [Hive] Report the average yearly balance for all people in each education category.Write the results to “Task_1b-out”. For the small example data set you would reportthe following (output order is not important for this question):"primary" 10.0"secondary" 286.6666666666667"tertiary" 1031.3333333333333"unknown" 1506.0[8 marks]c) [Spark RDD] Group balance into the following three categories:a. Low: -infinity to 500b. Medium: 501 to 1500 =>c. High: 1501 to +infinityReport the number of people in each of the above categories. Write the results to“Task_1c-out” in text file format. For the small example data set you should get thefollowing results (output order is not important in this question):(High,2)(Medium,2)(Low,4)[10 marks]d) [Spark RDD] Sort all people in ascending order of education. For people with thesame education, sort them in descending order by balance. This means that all peoplewith the same education should appear grouped together in the output. 
For eachperson report the following attribute values: education, balance, job, marital, loan.Write the results to “Task_1d-out” in text file format (multiple parts are allowed). Forthe small example data set you would report the following:("primary",10,"technician","married","no")("secondary",829,"services","divorced","yes")("secondary",29,"technician","divorced","yes")("secondary",2,"entrepreneur","single","no")("tertiary",2143,"management","married","yes")("tertiary",929,"technician","married","yes")("tertiary",22,"management","divorced","no")("unknown",1506,"blue-collar","married","no")[12 marks] Task 2: Analysing Twitter Time Series Data [32 marks]In this task we will be doing some analytics on real Twitter data2. The data is stored in a tab(“\t”) delimited format.The data is supplied with the assignment at the following locations:Small version Full versionTask_2/Data/twitter-small.tsv Task_2/Data/twitter.tsvThe data has the following attributesAttributeindexAttribute name DescriptiontokenType In our data set all rows have Token type of hashtag. Sothis attribute is useless for this assignment.month The year and month specified like the following:YYYYMM. So 4 digits for year followed by 2 digits formonth. So like the following 200905, meaning the yearand month of Maycount An integer representing the number tweets of this hashtag for the given year and monthhashtagName The #tag name, e.g. 
babylove, mydate, etc.Here is a small example of the Twitter data that we will use to illustrate the subtasks below:Token type Month count Hash Tag Namehashtag 200910 2 babylovehashtag 200911 2 babylovehashtag 200912 90 babylovehashtag 200812 100 mycoolwifehashtag 200901 201 mycoolwifehashtag 200910 1 mycoolwifehashtag 200912 500 mycoolwifehashtag 200905 23 abchashtag 200907 1000 abc: Twitter data source: http://www.infochimps.com/datasets/twitter-census-conversatio...a) [Spark RDD] Find the single row that has the highest count and for that row report themonth, count and hashtag name. Print the result to the terminal output using println.So, for the above small example data set the result would be:month: 200907, count: 1000, hashtagName: abc[6 marks]b) [Do twice, once using Hive and once using Spark RDD] Find the hash tag name thatwas tweeted the most in the entire data set across all months. Report the total numberof tweets for that hash tag name. You can either print the result to the terminal oroutput the result to a text file. So, for the above small example data set the outputwould be:abc 1023[12 marks total: 6 marks for Hive and 6 marks for Spark RDD]c) [Spark RDD] Given two months x and y, where y > x, find the hashtag name that hasincreased the number of tweets the most from month x to month y. Ignore the tweetsin the months between x and y, so just compare the number of tweets at month x andat month y. Report the hashtag name, the number of tweets in months x and y. Ignoreany hashtag names that had no tweets in either month x or y. You can assume thatthe combination of hashtag and month is unique. Therefore, the same hashtag andmonth combination cannot occur more than once. Print the result to the terminaloutput using println. For the above small example data set:Input x = 200910, y = 200912Output hashtagName: mycoolwife, countX: 1, countY: 500For this subtask you can specify the months x and y as arguments to the script. 
This isrequired to test on the full-sized data. For example:$ bash build_and_run.sh 200901 200902[14 marks]Task 3: Indexing Bag of Words data [30 marks]In this task you are asked to create a partitioned index of words to documents that contain thewords. Using this index you can search for all the documents that contain a particular wordefficiently.The data is supplied with the assignment at the following locations3:Small version Full versionTask_3/Data/docword-small.txt Task_3/Data/docword.txtTask_3/Data/vocab-small.txt Task_3/Data/vocab.txtThe first file is called docword.txt, which contains the contents of all the documents stored inthe following format:AttributeindexAttribute name DescriptiondocId The ID of the document that contains the wordvocabId Instead of storing the word itself, we store an ID from thevocabulary file.count An integer representing the number of times this wordoccurred in this document.The second file called vocab.txt contains each word in the vocabulary, which is indexed byvocabIndex from the docword.txt file.Here is a small example content of the docword.txt file.docId vocabId count3 6003 7022 1205 2002 5001 1005 20004 1223 12001 1000: Data source: http://archive.ics.uci.edu/ml/datasets/Bag+of+WordsHere is an example of the vocab.txt filevocabId wordplanecarmotorbiketruckboatComplete the following subtasks using Spark:a) [spark SQL] Calculate the total count of each word across all documents. List thewords in ascending alphabetical order. Write the results to “Task_3a-out” in CSVformat (multiple output parts are allowed). So for the above small example input theoutput would be the following (outputs with multiple parts will be considered in order ofthe part number):boat,2200car,620motorbike,2502plane,1100truck,122Note: spark SQL will give the output in multiple files. You should ensure that thedata is sorted globally across all the files (parts). 
So, all words in part 0, will bealphabetically before the words in part 1.[8 marks]b) [spark SQL] Create a dataframe containing rows with four fields: (word, docId, count,firstLetter). You should add the firstLetter column by using a UDF which extracts thefirst letter of word as a String. Save the results in parquet format partitioned byfirstLetter to docwordIndexFilename. Use show() to print the first 10 rows of thedataframe that you saved.So, for the above example input, you should see the following output (the exact ...

April 4, 2023 · 12 min · jiezi

On Algorithms: MATH40082 Advanced Algorithms

MATH40082 (Computational Finance)
Assignment No. 2: Advanced Methods
Version 10545349

1 Background Theory

1.1 Convertible Bonds

You are asked to price a bond contract in which the holder has the option to choose between receiving the principal $F$ or alternatively receiving $R$ underlying stocks with price $S$ at time $t = T$. The contract can therefore be expressed as a function of the underlying stock price $S$ and time $t$, assuming that risk-free interest rates are constant and the default risk of the bond is negligible. The terminal condition of such a contract is therefore given by

$$V(S, T) = \max(F, RS) \qquad (1)$$

The issuer of the bond has also decided to pay out a continuous coupon at the rate of

$$K(t) = C e^{-\alpha t} \qquad (2)$$

for constants $C$ and $\alpha$ of their choosing. This means that the holder of the bond contract will receive

$$C e^{-\alpha t}\,dt \qquad (3)$$

at each instant in time from the issuer.

Assume now that the risk-neutral process followed by the underlying stock price is given by

$$dS = \kappa(\theta(t) - S)\,dt + \sigma S^{\beta}\,dW. \qquad (4)$$

Here $\kappa$ is the mean reversion rate, $\beta$ is the elasticity of variance in the market, and the function $\theta$ is given by

$$\theta(t) = (1 + \mu) X e^{\mu t}$$

where $X$ and $\mu$ are constant model parameters that can be determined from the market, reflecting the dividend payout policy of the firm.

It is relatively straightforward to show that the market value $V(S, t)$ of this contract satisfies the following … for some functions $A$ and $B$ to be derived.

1.2 Options embedded in the contract

In this section we consider the case where the firm issuing the bond contract looks to embed further American-style options into the contract. The first addition they make to the contract is to enable the holder to exercise the decision to convert the bond into stock at any time before the maturity of the contract. This results in an American-style condition which gives the inequality

$$V \ge RS \qquad (8)$$

for all $t < T$.

From this condition we can assume that for large enough $S$ the bond holder will always choose to convert, so that

$$V(S, t) \to RS \quad \text{as } S \to \infty \qquad (9)$$

is the boundary condition for large $S$.

Call Option

If the bond has written in the contract that the issuer may buy back the bond at the price $C_p$ over some time period $t < t_0$, then the value of the bond must satisfy the following condition:

$$V(S, t) \le \max(C_p, RS) \quad \text{if } t \le t_0. \qquad (10)$$

For this condition to work we are assuming that the holder is allowed to choose to convert rather than be bought out.

Put Option

Another option often written in a bond contract is that the holder may have the option to sell the bond back to the issuer at the price $P_p$ over some time period $t < t_0$. In this case the value of the bond must satisfy the following condition:

$$V(S, t) \ge P_p \quad \text{if } t \le t_0. \qquad (11)$$

Default Barrier

Sometimes the bond holders are given an option to force the company to buy back the bond if the value of equity goes below some pre-specified level and the firm is at risk of bankruptcy. We assume here that they only have the option to force the sale at the price $K_p$ during some fixed time period $t < t_0$, and that the barrier $B$ below which $S$ must not go is written in the contract. In this case we find the value of the bond must satisfy the following condition:

$$V(S, t) = K_p \quad \text{if } S \le B \text{ and } t \le t_0. \qquad (12)$$

2 Tasks

2.1 European Options

Include in your report a brief derivation of the boundary condition (7) for large $S$. You will need to find the functions $A(t)$ and $B(t)$ by solving the approximate problem … assuming that the solution to this equation is of the form

$$V(S, t) = S A(t) + B(t).$$

(understanding 5 marks)

Write code to calculate the value of the option $V$. You must use the finite-difference method with a Crank-Nicolson scheme, along with an appropriate method to solve the algebraic system. You can derive an analytical result when $\beta = 1$ and $\kappa = 0$ to check that your code is working (this is left up to you to check), but do not include these results in your report. Write out the correct numerical scheme (i.e. $a_j =$, $b_j =$, $c_j =$ and $d_j =$), including at $j = 0$ and $j = jMax$, in your report. Be careful to make your notation clear and understandable.

(coding 3 marks, understanding 5 marks)

Unless otherwise instructed, you should assume that the following standard values for the parameters apply: $T = 2$, $F = 170$, $R = 4$, $r = 0.0229$, $\kappa = 0.125$, $\mu = 0.0146$, $X = 46.47$, $C = 1.95$, $\alpha = 0.01$, $\beta = 0.197$ and $\sigma = 7.11$.

Plot out the value of the option $V(S, t)$ as a function of the underlying asset price $S$ (at $t = 0$) for the following two cases: ...
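The task above asks for "an appropriate method to solve the algebraic system" that a Crank-Nicolson step produces. For a plain tridiagonal system, one common choice is the Thomas algorithm, sketched below. This is an illustration under that assumption, not the assignment's prescribed solver: it performs no pivoting, and it does not enforce the American-style constraints of Section 1.2. The arrays `a`, `b`, `c`, `d` follow the $a_j$, $b_j$, $c_j$, $d_j$ notation of the task; the function name `thomasSolve` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Solves a[j]*x[j-1] + b[j]*x[j] + c[j]*x[j+1] = d[j] for j = 0..n-1.
// a[0] and c[n-1] are ignored. No pivoting is done, so the matrix is
// assumed to be well conditioned (for example, diagonally dominant).
std::vector<double> thomasSolve(const std::vector<double>& a,
                                const std::vector<double>& b,
                                const std::vector<double>& c,
                                const std::vector<double>& d) {
    const std::size_t n = d.size();
    assert(n > 0 && a.size() == n && b.size() == n && c.size() == n);

    std::vector<double> diag(n), rhs(n), x(n);

    // Forward sweep: eliminate the sub-diagonal.
    diag[0] = b[0];
    rhs[0] = d[0];
    for (std::size_t j = 1; j < n; ++j) {
        const double w = a[j] / diag[j - 1];
        diag[j] = b[j] - w * c[j - 1];
        rhs[j] = d[j] - w * rhs[j - 1];
    }

    // Back substitution, from the last row up to the first.
    x[n - 1] = rhs[n - 1] / diag[n - 1];
    for (std::size_t j = n - 1; j-- > 0;) {
        x[j] = (rhs[j] - c[j] * x[j + 1]) / diag[j];
    }
    return x;
}
```

At each time step the caller would fill `a`, `b`, `c` and `d` from the discretised PDE and boundary rows, then use the returned vector as the option values at the new time level.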

April 4, 2023 · 9 min · jiezi

On Algorithms: COMS-W4115 How Translators Are Implemented

COMS W4115 Programming Languages and Translators
Homework Assignment 3

Submit this assignment online via Courseworks as a PDF file. Fill in or annotate this PDF or print it out, write on it, and scan it. Please keep your answers in the boxes.

Do this assignment alone. You may consult the instructor and the TAs, but not other students.

Name: Uni:

1. (20 pts.) For the following C array on a processor with the usual alignment rules,

    int a[2][3];

(a) Show the order in which its elements are arranged in memory.
(b) Write an expression for the byte address of a[i][j] in terms of a (the address of the start of the array), i, and j.
(c) Verify parts a) and b) by writing a small C program that tests your hypothesis. Examine the assembly language output with the C compiler's -S flag (e.g., gcc -O -S array.c). Such a program should be simple and contain and access such an array, but not be so simple that the compiler optimizes most of it away. On the next page, include an annotated assembly listing that explains how it verifies your hypothesis. Make sure the assembly listing is no more than about 40 lines, either by simplifying your program or trimming the output.

C program:

Assembly listing:

2. (20 pts.) For a 32-bit little-endian processor with the usual alignment rules, show the memory layout and size in bytes of the following three C variables.

    union {
        struct {
            char a;   /* 8-bit */
            int b;    /* 32-bit */
            short c;  /* 16-bit */
        } s;
        struct {
            int d;    /* 32-bit */
            short e;  /* 16-bit */
        } t;
    } u1;

Layout:
Size in bytes:

    struct {
        char a;
        int b;
        short c;
        short d;
    } s1;

Layout:
Size in bytes:

    struct {
        short a;
        char b;
        short c;
        char d;
        short e;
    } s2;

Layout:
Size in bytes:

3. (20 pts.) Draw the layout of the stack just before bar is called in foo. Indicate storage for function arguments, local variables, return addresses, and stored frame pointers. Indicate where the stack and frame pointers point.

    void bar(int x, int y, int z);

    void foo(int a, int b) {
        int d, e;
        bar(2, 5, 7);
    }

4. (20 pts.) Draw the layouts of s1 and s2 and the virtual tables for the Ellipse and Square classes.

    public class Shape {
        double x, y;
        public double area() { ... }
    }

    class Ellipse extends Shape {
        private double height, width;
        public double area() { ... }
    }

    class Square extends Shape {
        private double width;
        public double area() { ... }
    }

    public class Main {
        public static void main() {
            Shape s1 = new Square(10, 3, 14);
            Shape s2 = new Ellipse(3, 8, 2, 6);
            System.out.println(s1.area());
        }
    }

Square Virtual Table:
s1 object:
Ellipse Virtual Table:
s2 object:

5. (20 pts.) For the program below, written in a C-like language with nested function definitions,

    void main() {
        int x = 5;
        void bar() {
            x = x + 2;
        }
        void foo() {
            int x = 8;
            bar();
            printf("%d\n", x);
        }
        foo(); /* Body of main() */
    }

What would it print if the language used static scoping?
What would it print if the language used dynamic scoping?
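For the array question at the top of this assignment, one way to test a layout hypothesis without reading assembly is to measure element offsets directly. The sketch below assumes the hypothesis is row-major order, which is what the C and C++ standards specify for built-in arrays; the function names `measuredOffset` and `rowMajorOffset` are hypothetical. It uses `sizeof(int)` rather than a hard-coded 4 so that it does not depend on the target.

```cpp
#include <cstddef>

// Byte offset of a[i][j] from the start of `int a[2][3]`,
// measured from the actual addresses the compiler assigns.
std::ptrdiff_t measuredOffset(int i, int j) {
    static int a[2][3];
    const char* base = reinterpret_cast<const char*>(&a[0][0]);
    const char* elem = reinterpret_cast<const char*>(&a[i][j]);
    return elem - base;
}

// Row-major hypothesis: each row holds 3 ints, so element (i, j)
// sits (3*i + j) ints past the start of the array.
std::ptrdiff_t rowMajorOffset(int i, int j) {
    return static_cast<std::ptrdiff_t>((3 * i + j) * sizeof(int));
}
```

Comparing the two functions for every valid (i, j) confirms or refutes the hypothesis; the assembly listing requested in part (c) should show the same arithmetic.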

April 4, 2023 · 4 min · jiezi

On Algorithms: Book Recommendation: Regularization in Deep Learning

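Book: Regularization in Deep Learning
Author: Peng Liu
Publisher: Manning

01 About the book

Regularization in Deep Learning teaches you how to use a toolbox of regularization techniques to improve model performance. It covers both well-established regularization methods and groundbreaking modern ones. Each technique is introduced with graphics, illustrations, and step-by-step code walkthroughs that make the complex mathematics easier to follow.

In this book you will learn how to augment a dataset with random noise, improve a model's architecture, and apply regularization during optimization. You will soon be building focused deep learning models that avoid sprawling complexity and give more accurate results even on new or messy datasets. These practical regularization techniques improve training efficiency and help avoid overfitting, making your deep learning models more general and adaptable.

02 About the author

Peng Liu is an experienced data scientist focused on applied research and development of high-performance machine learning models in production. He holds a PhD in statistics from the National University of Singapore and teaches advanced analytics courses as a visiting university lecturer. He specializes in the statistical aspects of deep learning.

Homepage: https://faculty.smu.edu.sg/profile/liu-peng-6471

03 Book outline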

April 4, 2023 · 1 min · jiezi

On Algorithms: ACM Contest Daily Training DAY10, Solutions and Analysis: 月月给华华出题, 华华给月月出题 (Sieve, Euler's Totient Function, Number Theory)

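DAY10 has 2 problems: 月月给华华出题 (Yueyue sets a problem for Huahua) and 华华给月月出题 (Huahua sets a problem for Yueyue). Both are fairly hard.

Author: Eriktse
About: computer science undergraduate at a Project 211 university and active ACM silver medalist, aiming to explain algorithms in plain language. Follow me to talk about C++/Python algorithms. (More posts to come.)

Original link (read the original for a better experience): https://www.eriktse.com/algorithm/1104.html

Before attempting today's two problems, I strongly recommend reading the article "[ACM Number Theory] Sum transformation techniques, perhaps one of the best explanations".

月月给华华出题

Problem link: https://ac.nowcoder.com/acm/problem/23048

For N = n we have:

$$ans_n = \sum_{i=1}^{n}\frac{i}{\gcd(i, n)}$$

When the gcd is awkward to pin down, a standard move is to introduce a new variable d that enumerates the value of gcd(i, n):

$$\sum_{d|n}\sum_{i=1}^{n}[\gcd(i, n) = d]\frac{i}{d}$$

Next substitute i with i·d, which gives:

$$\sum_{d|n}\sum_{i=1}^{\frac{n}{d}}i\,[\gcd(i,\tfrac{n}{d})=1]$$

We may replace n/d by d directly. This does not change the result, because both range over exactly the divisors of n:

$$\sum_{d|n}\sum_{i=1}^{d}i\,[\gcd(i,d)=1]$$

The inner sum evaluates to:

$$\sum_{i=1}^{d}i\,[\gcd(i,d)=1] = \frac{d \times \phi(d)}{2}$$

Sketch of proof: since gcd(i, n) = gcd(n - i, n), numbers with the same gcd against n come in symmetric pairs. So if gcd(i, n) = 1 then gcd(n - i, n) = 1 as well, which means the numbers coprime to n average n/2. Multiplying that average by their count phi[n] gives the sum of all positive integers coprime to n. The case n = 1 needs special handling: the only term is i = 1, its partner n - i = 0 lies outside the range, and the formula would give 1/2 instead of 1. For n > 1 the count is exact: n - i = i requires n to be even, and for n > 2 the value n/2 is not coprime to n, while for n = 2 the single coprime value i = 1 is itself n/2. The final result is therefore:

$$ans_n=\sum_{d|n}\frac{d \times \phi(d)}{2}$$

with the d = 1 term taken as 1.

Compute phi (Euler's totient function) with a linear (Euler) sieve, then enumerate d and add its contribution to every multiple of d.

    #include <bits/stdc++.h>
    #define int long long
    using namespace std;

    const int N = 1e6 + 9;
    int phi[N], ans[N];

    // phi[n] = n * ((p1 - 1) / p1) * ((p2 - 1) / p2) * ... * ((pk - 1) / pk),
    // where the p are the distinct primes dividing n
    void init(int n)
    {
        bitset<N> vis;
        vector<int> prim;
        // Initialise vis[1] and phi[1]
        vis[1] = true, phi[1] = 1;
        for (int i = 2; i <= n; ++i)
        {
            // If i has not been sieved out it is prime: add it to prim and set phi[i] = i - 1
            if (!vis[i]) prim.push_back(i), phi[i] = i - 1;
            // This loop updates the entries for i * prim[j]
            for (int j = 0; j < prim.size() && i * prim[j] <= n; ++j)
            {
                vis[i * prim[j]] = true; // multiplied by a prime, so i * prim[j] cannot be prime
                if (i % prim[j] == 0)
                {
                    // i already contains prim[j], so i * prim[j] gains no new prime factor
                    phi[i * prim[j]] = phi[i] * prim[j];
                    break;
                }
                phi[i * prim[j]] = phi[i] * (prim[j] - 1);
            }
        }
    }

    signed main()
    {
        int n; scanf("%lld", &n);
        init(n);
        for (int i = 2; i <= n; ++i)          // enumerate every d = i
        {
            for (int j = 1; i * j <= n; ++j)  // enumerate every multiple i * j of d
            {
                ans[i * j] += i * phi[i] / 2;
            }
        }
        // The + 1 here adds the result for d = 1
        for (int i = 1; i <= n; ++i) printf("%lld\n", 1 + ans[i]);
        return 0;
    }

华华给月月出题

Problem link: https://ac.nowcoder.com/acm/problem/23047 ...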
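Returning to the first problem, the closed form can be sanity-checked against a direct evaluation for small n. The sketch below is not part of the original solution; the helper names `bruteForce`, `phiOf` and `closedForm` are illustrative, and the totient is computed by trial division rather than a sieve because only small inputs are checked. It needs C++17 for `std::gcd`.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Direct evaluation of sum_{i=1..n} i / gcd(i, n).
std::int64_t bruteForce(std::int64_t n) {
    assert(n >= 1);
    std::int64_t s = 0;
    for (std::int64_t i = 1; i <= n; ++i) s += i / std::gcd(i, n);
    return s;
}

// Euler's totient by trial division (adequate for small n).
std::int64_t phiOf(std::int64_t n) {
    std::int64_t result = n;
    for (std::int64_t p = 2; p * p <= n; ++p) {
        if (n % p == 0) {
            while (n % p == 0) n /= p;
            result -= result / p;
        }
    }
    if (n > 1) result -= result / n;  // one prime factor above sqrt remains
    return result;
}

// Closed form: 1 (the d = 1 term) plus d * phi(d) / 2 over divisors d > 1.
std::int64_t closedForm(std::int64_t n) {
    std::int64_t s = 1;
    for (std::int64_t d = 2; d <= n; ++d) {
        if (n % d == 0) s += d * phiOf(d) / 2;
    }
    return s;
}
```

For n = 6 both functions return 11: the direct terms are 1, 1, 1, 2, 5, 1, and the divisor terms are 1 (d = 1), 1 (d = 2), 3 (d = 3) and 6 (d = 6).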

April 4, 2023 · 3 min · jiezi

On Algorithms: CS265 Network Algorithm Program

CS265 Computer Networking

Instructions: Complete the following problem. There are 100 total points. If a problem has multiple parts, they are equi-valued. Please use a word processor or text editor for solutions, and submit as a pdf file via blackboard.

Problem 1

Imagine two autonomous systems ASX and ASY, and assume all of the following:

• ASY has been allocated a prefix PY containing a host H's public IP address.
• A router R resides within the domain of ASX.
• ASX uses RIP for intra-AS routing, a shortest AS-PATH policy for inter-AS routing, and an intra-AS hot-potato policy to break AS-PATH ties for inter-AS routing.
• ASX contains three gateways, G1, G2, and G3.
• R has three interfaces, I1, I2, and I3, and entries (G1, I1) and (G2, I2) in its forwarding table.
• G1 is fewer hops away than G2 along their respective shortest paths from R, though the path to G1 traverses an older set of links with lower average throughput.
• At time Tj the only path to prefix PY to have been advertised to ASX is ASB ASC ASY with NEXT-HOP address G7.

Answer each of the following:

a. If G7 has an inter-AS peering link with ASX only to G2, to which interface will R forward datagrams addressed to H?
b. Imagine that at some time Tk, which is after Tj, a new BGP advertisement is received at ASX for prefix PY, with AS-PATH ASD ASE ASY and NEXT-HOP G8. Furthermore, G8 has a peering link with both G2 and G1. Should R's forwarding table be updated based on this new information? If so, how?
c. Suppose that instead of RIP, ASX uses OSPF with an average throughput metric for routing path costs. Should R's forwarding table be updated after receipt of the new advertisement at time Tk in this case?

2. Answer the following:

a. Was Google actively malicious in this incident?
b. Why was Japan most severely affected by the incident?
c. What red flag was missed by Verizon that could have decreased the severity of this incident? ...

April 4, 2023 · 2 min · jiezi

On Algorithms: CPT206 Financial Computing

CPT206 Computer Programming for Financial Mathematics: Coursework 3 Task Specification

Thomas Selig
Set: Saturday, 30 April, 2022
Due date: Sunday, 22 May, 2022, 10pm

This is the specification task sheet for the Coursework 3 assessment component of your CPT206 module. The task covers all Learning Outcomes, and has a weighting of 70% towards the final grade for this module. This assignment has two parts: a coding part described in Section 1, and a report described in Section 2. The submission deadline for this assignment is Sunday, 22 May, 2022, at 10pm. Detailed submission instructions are provided in Section 3.

1 Program description (65 marks)

The aim of this coursework is to build a program to store, manipulate, and retrieve information from the evolution of stock prices over time. All the work should be coded into a single Java NetBeans project, with the class structure and different functionalities of the program described as follows. All classes should be properly encapsulated, as seen in the Lectures and Labs throughout the semester. Your project should also contain a Controller class for testing.

1.1 PricePoint class (10 marks)

The basic building block of the program will be a simple PricePoint data class. A PricePoint object consists of a pair (t, p(t)), where t is a time coordinate and p(t) is the price at time t. PricePoint objects should be compared according to their time coordinate.

1.2 PriceData class (25 marks)

A PriceData object stores a collection of PricePoint objects. You should choose a suitable data structure in the Java collection framework for this, according to the following conditions.

• There should be no duplicate time coordinates among the PricePoint objects in the collection.
• The collection should be maintained in increasing order of time coordinates.

Leave a comment in your code explaining your choice of data structure for this.

In addition, your PriceData class should have the following functionalities. ...

April 4, 2023 · 7 min · jiezi

On Algorithms: STAT3022 Linear Models

Multiple Linear Regression Models - Part 2
Residual Diagnostics, Unusual observations
STAT3022 Applied linear models

Regression Diagnostics

Background

Recall the MLR model

$$y = X\beta + \varepsilon, \quad E(y) = X\beta, \quad \mathrm{Var}(y) = \mathrm{Var}(\varepsilon) = \sigma^2 I_n.$$

Assuming the design matrix $X$ is full-ranked, …

Background

Similar to model diagnostics for SLR, diagnostics for MLR are based on the residuals, which depend critically on the hat matrix $H$.

• $H$ is symmetric, i.e. $H^\top = H$. As a result, the matrix $I_n - H$ is also symmetric.
• Next, $HX = X$. As a result, $(I_n - H)X = X - X = 0$.
• Third, $H^2 = H$, so we say $H$ is idempotent. As a result, the matrix $I_n - H$ is also idempotent.
• Finally, as proved in Tutorial 4, $\mathrm{trace}(H) = \sum_{i=1}^{n} h_{ii} = p$.

Residual vector

• First, let's compute its expectation:

$$E(e) = E\{(I_n - H)y\} = (I_n - H)E(y) = (I_n - H)X\beta = 0.$$

• Second, let's compute the variance-covariance matrix:

$$\mathrm{Var}(e) = \mathrm{Var}\{(I_n - H)y\} = (I_n - H)\,\mathrm{Var}(y)\,(I_n - H)^\top = (I_n - H)\,\sigma^2 I_n\,(I_n - H) = \sigma^2 (I_n - H)(I_n - H) = \sigma^2 (I_n - H),$$

i.e. $\mathrm{Var}(e_i) = \sigma^2(1 - h_{ii})$ and $\mathrm{Cov}(e_i, e_j) = -\sigma^2 h_{ij}$.

These computations tell us that (1) each residual term $e_i$ has a smaller variance than the true error $\varepsilon_i$, and (2) these residuals are correlated.

Residual plots

We can use residual plots similar to those used for simple linear regression for model diagnostics. Specifically:

• To check the constant variance assumption: use the plot of residuals $e_i$ vs. fitted values $\hat{y}_i$, or the plot of residuals vs. each covariate. No news is good news.
• To check the normality assumption: use a normal quantile-quantile plot, or a normality test.

• A distinct characteristic of MLR compared to SLR is that it has more than one predictor. As such, the intercorrelation between predictors plays an important role in the estimated coefficients as well as in inference for the MLR.
• Such intercorrelation is known as multicollinearity (multi: many; collinear: linear dependence).
• We will study three cases: ...

April 4, 2023 · 4 min · jiezi

On Algorithms: A UIE Semi-Supervised Smart Annotation Scheme Based on Label Studio (Local Version)

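A UIE Semi-Supervised Smart Annotation Scheme Based on Label Studio (Local Version)

For more technical detail, see the previous project. This post focuses on getting the local pipeline working end to end, for speed and efficiency:

A UIE semi-supervised deep learning smart annotation scheme based on Label Studio (cloud version)

For more, see the source code linked at the end of the post.

Smart annotation schemes for information extraction in natural language processing include the following:

Rule-based annotation: write a set of rules that identify the entities, relations, and other information in the text and label them. This is the traditional approach. Its advantage is that it is easy to understand and implement; its drawback is that it needs a great deal of manual work, and rules rarely cover every case.

Machine-learning-based annotation: train a model that automatically identifies entities, relations, and other information in the text and labels them. This is an automated approach that trains a model on an already labeled dataset and uses it to label text. Its advantage is that it can handle large volumes of data and the model can be adjusted adaptively; its drawback is that it needs a lot of labeled data and computing resources, and model performance is limited by the quality and quantity of the labeled data.

Deep-learning-based annotation: use a deep learning model to extract entities, relations, and other information from text automatically and label them. This is the most recent approach. Its advantage is that it handles large volumes of data with high accuracy; its drawback is that it needs a lot of labeled data and computing resources, and training and tuning the model require specialist knowledge and skills.

Semi-supervised annotation: train a model on a small amount of hand-labeled data together with a large amount of unlabeled data, and use it to label automatically. Its advantage is that unlabeled data can be used to improve model performance; its drawback is that it needs a large amount of unlabeled data and computing resources, and model performance is limited by the quality of the labeled data.

Distant-supervision-based annotation: use an existing knowledge base to label entities, relations, and other information in text automatically, reducing the manual labeling workload.

This project mainly covers the annotation scheme based on semi-supervised deep learning.

1. Smart annotation, local version: Machine Learning integration tutorial

1.1 Starting Label Studio locally

Install label-studio:

    # Create a virtual environment named labelstudio (Python 3.8 in this example)
    conda create -n labelstudio python=3.8
    # Activate the virtual environment
    conda activate labelstudio
    # Install label-studio with pip (version 1.7.2)
    pip install label-studio==1.7.2

1.2 Starting the Machine Learning Backend

Run the following commands in a terminal, in order:

    # Install the label-studio machine learning backend; dirname is the folder that holds the code
    cd dirname
    git clone https://github.com/heartexlabs/label-studio-ml-backend
    # Install label-studio and its dependencies
    cd label-studio-ml-backend
    pip install -U -e .
    # (Optional) Install the requirements needed to run the label-studio examples
    pip install -r label_studio_ml/examples/requirements.txt

Creating and starting the model:

Define the model. Before using the label-studio backend you must define your own training model. The model definition has to inherit from the class that label-studio specifies; see Section 4 for details.

Create the backend model. Assuming the model file you created as required is at /Users/kyrol/Desktop/my_ml_backend.py, run the following in a terminal:

    # Initialise the custom machine learning backend
    label-studio-ml init my_ml_backend --script /Users/kyrol/Desktop/my_ml_backend.py
    # When the command finishes it creates a folder named my_ml_backend in the current directory,
    # containing my_ml_backend.py, _wsgi.py and so on.
    # _wsgi.py is the main Python file that gets run; you can inspect its contents.
    # Note: dependency files also need to be placed in the my_ml_backend folder.

    # Start the machine learning backend service
    label-studio-ml start my_ml_backend

After a successful start, the terminal shows the URL of the ML backend.

1.3 Model configuration and training

To open the visual interface, open another terminal window. First activate the corresponding conda environment, then cd to the directory that holds the label-studio code, then run the following terminal command to start the visual interface:

Once the custom machine learning backend is running, it can be added to a Label Studio project. ...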

April 3, 2023 · 9 min · jiezi

On Algorithms: Pros and Cons of Several Sorting Algorithms

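1. Quick Sort

Quick sort uses divide and conquer. First pick one element of the sequence as the pivot. Scan the data, placing every element smaller than the pivot on the left and every element larger than the pivot on the right. Then apply the same procedure recursively to the left and right subsequences until all the data is in order.

The ideal case is when each chosen pivot splits the current sequence almost evenly; the whole algorithm then runs in O(n log n).

The worst case is when each chosen pivot is the largest or smallest element of the current sequence (already sorted and reverse sorted input are both worst cases); the whole algorithm then runs in O(n²).

Average time complexity is O(n log n) and space complexity is O(log n). It is an unstable sorting algorithm.

Source code for the algorithm:

    // Quick sort
    template <class T>
    int Partition(T data[], int left, int right)
    {
        T pivot = data[left];
        while (left < right)
        {
            while (left < right && data[right] > pivot)
                right--;
            data[left] = data[right];
            while (left < right && data[left] <= pivot)
                left++;
            data[right] = data[left];
        }
        data[left] = pivot;
        return left;
    }

    template <class T>
    void QuickSort(T data[], int left, int right)
    {
        if (left < right)
        {
            int p = Partition(data, left, right);
            QuickSort(data, left, p - 1);
            QuickSort(data, p + 1, right);
        }
    }

2. Selection Sort

Scan all the data, find the largest or smallest element, and put it at the start of the sequence. Then keep finding the largest or smallest element among the remaining data and appending it to the sorted part until all the data is in order. The initial order of the data does not affect the running time, which is O(n²). It is relatively slow and not suited to sorting large amounts of data.

Average time complexity is O(n²) and space complexity is O(1). It is an unstable sorting algorithm.

Source code for the algorithm:

    // Selection sort
    template <class T>
    void SelectionSort(T data[], int n)
    {
        for (int i = 1; i < n; i++)
        {
            int k = i - 1;
            for (int j = i; j < n; j++)
            {
                if (data[j] < data[k])
                {
                    k = j;
                }
            }
            if (k != i - 1)
            {
                T t = data[k];
                data[k] = data[i - 1];
                data[i - 1] = t;
            }
        }
    }

3. Insertion Sort

Treat the first i elements (initially i = 1) as a sorted sequence. Scan the data in order, inserting the current element at the right position in that sorted sequence to form a sorted sequence of i + 1 elements, and continue until every element has been processed and the whole sequence is in order. Reverse sorted data takes the longest, O(n²); already sorted data takes the shortest, O(n). It is suited to sorting small amounts of data that are already partly in order. ...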
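The insertion sort described in the third section can be sketched in the same style as the two listings above. The excerpt is cut off before the author's own code appears, so this is a minimal sketch rather than the original listing; the name `InsertionSort` is chosen here only to match the earlier routines.

```cpp
// Insertion sort: data[0..i-1] is always sorted; each pass inserts
// data[i] into that prefix by shifting larger elements one slot right.
template <class T>
void InsertionSort(T data[], int n)
{
    for (int i = 1; i < n; i++)
    {
        T key = data[i];
        int j = i - 1;
        // Shift elements greater than key; the strict comparison
        // keeps equal elements in their original order (stable).
        while (j >= 0 && data[j] > key)
        {
            data[j + 1] = data[j];
            j--;
        }
        data[j + 1] = key;
    }
}
```

On already sorted input the inner loop never runs, which is where the O(n) best case comes from; on reverse sorted input every element is shifted past the whole prefix, giving O(n²).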

April 1, 2023 · 1 min · jiezi

On Algorithms: The Bard, Live: Teasing Google's New AI Chatbot Bard

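I joined the Bard waitlist a week ago and finally got Google's notification email last night, so today I had time to grill it, or rather tease it a little.

First I asked what it was trained on. The answer was rather vague.

GPT-3.5 next door still uses material from before 2021. Bard seems to think it can do better.

So let's ask about this year's Oscars. That doesn't look right: these are the 2022 results.

It backed down at once. This time the Best Picture answer was not wrong.

Let's ask about the other awards. That looks right.

But why does the rest feel like deja vu?

Cruella (《黑白魔女库伊拉》) is not a film from last year.

I told it that at least two more films were wrong, and it really did find only two. Dune: ???

Since it is called Bard, let's ask about poetry. The poem is quite famous, but why only half of it?

Tell me about Li Bai.

As expected, it does not understand Chinese.

It cannot write code.

But it can still do multiplication.

What about something more complicated? Apparently not. How about the general approach?

It seems to have lost track of the context and started making things up.

There seems to be no length limit on a single message.

Even in a very long conversation it can trace back the context (really?).

There is no public API yet.

Well, at least it is honest.

=== End ===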

March 31, 2023 · 1 min · jiezi

On Algorithms: FIT5216 Modelling Discrete Optimization Problems

FIT5216: Modelling Discrete Optimization Problems
Assignment 2 (extended): Testing Vaccines

1 Overview

For this assignment, your task is to write a MiniZinc model for a given problem specification.

• Submit your work to the MiniZinc auto grading system (using the submit button in the MiniZinc IDE). You must submit via the IDE to be graded and receive marks.
• Submit your model (copy and paste the contents of the .mzn file) using the Moodle assignment.

You have to submit by the due date (1st May 2022, 11:55pm), using MiniZinc and using the Moodle assignment, to receive full marks. You can submit as often as you want before the due date. Late submissions without special consideration receive a penalty of 10% per day. Submissions are not accepted more than 3 days after the original deadline.

This is an individual assignment. Your submission has to be entirely your own work. We will use similarity detection software to detect any attempt at collusion, and the penalties are quite harsh. If in doubt, contact your teaching team with any questions!

2 Problem Statement

In the race to find an effective vaccine for COVID-19 for the many new strains developing, the World Health Organization is trialing different vaccine combinations. (This is a fictional scenario, and completely made up! But this type of problem, called an experiment design, is quite common in practice.) The vaccines are given as an enumerated type:

    enum VACCINE;

To determine the effectiveness of vaccines and their combinations we need to test them on different subsets of the population, to determine efficacy and any other details. It's also important to test placebos, to see what effect comes just from having any treatment. So we are guaranteed a value in the set representing the placebo vaccine (an injection of sterile water):

    VACCINE: placebo;

The placebo vaccine is guaranteed to occur before any real vaccine in the type:

    constraint assert(forall(v in VACCINE)(placebo <= v),
                      "vaccine \(v) appears before placebo\n");

The number of test populations which will undergo different vaccine combinations is defined by N.

    int: N; % number of test populations
    set of int: POP = 1..N;

The testing will happen over a W week trial.

    int: W; % number of weeks
    set of int: WEEK = 1..W;

We have to determine for each population a treatment plan, indicating what activity is planned for each week. The possible treatments for every member in the population are: WAIT, do nothing; VAX, give a vaccination shot; PCR, give a PCR test; RAT, give a RAT test; and SAT, give a SAT test. Each treatment has an associated cost. This is defined in MiniZinc as

    enum TREATMENT = { WAIT, VAX, PCR, RAT, SAT };
    array[TREATMENT] of int: cost; % cost of a treatment to the population

The key decisions are, for each population, what the treatment plan is and what set of vaccinations they get:

    array[POP,WEEK] of var TREATMENT: schedule;
    array[POP] of var set of VACCINE: vaccinations;

The constraints on the assignment are as follows: ...

March 31, 2023 · 7 min · jiezi

On Algorithms: Anolis White Paper Selection: File Encryption (fscrypt) in Practice with the SM4 Algorithm

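By Zhang Tianjia

We usually use files as the carrier for data and store them on media such as disks, USB flash drives, and SD cards. Even when data is stored offline, there is no guarantee that the storage medium will never be lost, and if it is lost the result may be disastrous for us. Encrypting important data files kept in offline storage is therefore very necessary. This section describes how to encrypt files in a file system with the Chinese national (SM) cryptographic algorithms.

01 Introduction to fscrypt

fscrypt in the kernel is a library that file systems can use to support transparent encryption of files and directories.

Unlike dm-crypt, fscrypt operates at the file system level rather than the block device level. This allows it to encrypt different files with different keys and to have unencrypted files on the same file system. This is useful for multi-user systems in which each user's data at rest needs to be cryptographically isolated from the others. Apart from file names, fscrypt does not encrypt file system metadata.

Unlike eCryptfs, which is a stacked file system, fscrypt is integrated directly into the supported file systems; the file systems that currently support fscrypt are ext4, F2FS, and UBIFS. fscrypt allows encrypted files to be read and written without caching both the decrypted and the encrypted pages in the page cache, which nearly halves the memory used and brings it in line with unencrypted files. Likewise, half as many dentries and inodes are needed. eCryptfs also limits encrypted file names to 143 bytes, which causes application compatibility problems; fscrypt allows the full 255 bytes (NAME_MAX). Finally, unlike eCryptfs, the fscrypt API can be used by unprivileged users without depending on any other component.

fscrypt does not support encrypting files in place. Instead, it supports marking an empty directory as encrypted. Then, once user space has provided the key, all regular files, directories, and symbolic links created in that directory tree are transparently encrypted.

02 Supported encryption modes and usage

fscrypt allows one encryption mode to be specified for file contents and one for file names. Different directory trees may use different encryption modes. The following pairs of encryption modes are currently supported:

• AES-256-XTS for contents and AES-256-CTS-CBC for file names
• AES-128-CBC for contents and AES-128-CTS-CBC for file names
• Adiantum for both contents and file names
• AES-256-XTS for contents and AES-256-HCTR2 for file names (v2 policies only)
• SM4-XTS for contents and SM4-CTS-CBC for file names (v2 policies only)

AES-128-CBC is intended only for low-power embedded devices with accelerators that do not support XTS mode. To use AES-128-CBC, CONFIG_CRYPTO_ESSIV and CONFIG_CRYPTO_SHA256 (or another SHA-256 implementation) must be enabled so that ESSIV can be used.

Adiantum is a stream-cipher-based mode that is fast even on CPUs without dedicated crypto instructions. Unlike XTS, it is also a true wide-block mode. To use Adiantum, CONFIG_CRYPTO_ADIANTUM must be enabled. In addition, fast implementations of ChaCha and NHPoly1305 should be enabled, for example CONFIG_CRYPTO_CHACHA20_NEON and CONFIG_CRYPTO_NHPOLY1305_NEON on the ARM architecture. ...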

March 31, 2023 · 2 min · jiezi

On Algorithms: ACS130 C Programming Project

ACS130 Introduction to Systems Engineering and Software
C Programming Project

Assignment weighting: 25% of module mark
Assignment released: Friday 1 April (Semester 2, week 8)
Assignment due: 11.59pm Monday 9 May (Semester 2, Week 11)

Submission (all dropboxes are in the ‘Assessment’ tab): Submit your .c file to the submission dropbox entitled “C Program .c file”. Do not copy your .c file into any other format. Do not submit .exe files to the dropbox. Submit any text files to “Supplementary material”.

Please make sure your code runs; it must run on Codeblocks, or XCode if using a Mac.

Marking and Feedback: You will be required to attend a 1-1 viva to demonstrate your code. We will download your code from Blackboard to run. You will get oral feedback during this marking session. A day and time will be provided for everyone closer to the time. Provisional marks may change as a result of unfair means and/or late submissions. If you don't turn up to the marking session, then your code will not be marked.

Assignment briefing: This assignment will assess your ability to solve a problem using C as a tool. Your program must be well laid out and commented, including a meaningful header comment. Your program must provide meaningful interactions with the user. You must also use all of the following C programming elements:

• selection (if or switch case),
• repetition (for and/or while and/or do while),
• at least one function with an array of struct in the input parameter list; the array of struct is declared in main (or another function) and this function must set values in the array of struct, and the calling function (main or other) must make use of the array of struct on return (you choose whether to pass by value or pass by address),
• at least one function with a pointer to a variable of type char, int, float or double in the parameter list; the function must set the contents of the variable, and the calling function (main or other) must make use of the variable on return,
• your program needs several instances where the user must input commands from the keyboard in order to interact with the program,
• your program must loop to the start, allowing the user to run through the program time after time (this is an independent requirement from the second point to use repetition).

Optional:

o you can create other functions and/or structures so as to modularise your code (but you must have the 2 functions mentioned above),
o you can use anything else we have covered in this module, e.g. arrays/strings, an input and/or output file.

You need to decide on what problem to solve in your program. Do NOT use one of the problems from the ACS130 lectures, labs, previous assignments or past papers.

The 3-5pm Friday 29th April session (week 9): I will be available if you want to discuss your problem with me. During this session I will not be looking at individual code or helping you with debugging, but I will be available to answer general questions or clarify certain concepts with respect to lecture slides and tutorial questions.

Instructions to the screen need to be clear so that I can successfully understand how your code runs!

Component / Marks / Comments

Does the program use any global variables? Does the program use while(1) loop/s? Does the program use break (other than in switch case), go to, jump, continue etc.? Yes/no (If yes, 50% penalty)

Program layout and commenting: /2
• Meaningful header comment (including a detailed description of code)
• Sufficient comments interspersed within code
• Good indentation
• Meaningful variable names

Marks available for executing program only:

Is there sufficient interaction between user and program? Yes/No

Interaction with the user (if No, 0 for this section)
The user needs to be able to interact several times with the program.
• Is the user given prompts to enter any information that is needed from the keyboard?
• Is the information displayed to the user easy to understand, including any required ranges?
• Does the program loop to start?
...

March 31, 2023 · 7 min · jiezi

On algorithms: ICSI311 Principles of Programming Languages

COLLEGE OF ENGINEERING AND APPLIED SCIENCES
DEPARTMENT OF COMPUTER SCIENCE
ICSI311 Principles of Programming Languages

Table of Contents
Part I: General information
Part II: Grading Rubric
Part III: Description

Part I: General Information
• All assignments are individual assignments unless it is notified otherwise.
• All assignments must be submitted via Blackboard. No late submissions or e-mail submissions or hard copies will be accepted.
• Unlimited submission attempts will be allowed on Blackboard. Only the last attempt will be graded.
• Work will be rejected with no credit if:
o The work is late.
o The work is not submitted properly (blurry, wrong files, crashed files, files that can't open, etc.).
o The work is a copy or partial copy of others' work (such as work from another person or the Internet).
• Students must turn in their original work. Any cheating violation will be reported to the college. Students can help others by sharing ideas and should not allow others to copy their work.
• Documents to be submitted:
o Scheme source file(s) with inline comments with file extension .rkt
o Test case document with file extension .PDF
• Students are required to submit a design, all the error-free source files with Javadoc style inline comments and supporting files. Lack of any of the required items or programs with errors will result in a low credit or no credit.
• Grades and feedback: TAs will grade. Feedback and grades for properly submitted work will be posted on Blackboard. For questions regarding the feedback or the grade, students should reach out to their TAs first. Students have limited time/days from when a grade is posted to dispute the grade. Check email daily for the grade review notifications sent from the TAs. Any grade dispute request after the dispute period will not be considered.

Part II: Grading Rubric
See the requirements in part III.

Part III: Description
Goals:
• Review and develop a deep and comprehensive understanding of functional programming
• Review and develop a deep and comprehensive understanding of the divide-and-conquer technique and programming technique such as recursion
• Review and develop a deep and comprehensive understanding of data structures such as binary search trees

Instructions:
Please read the following requirements carefully. A program that doesn't meet the following requirements may be returned with 0 points awarded.
• Use Scheme as a pure functional programming language.
• No other built-in functions are allowed except the basic functions such as the ones listed below.
o Numeric Predicate Functions: =, <>, >, <, >=, <=
o CAR, CDR, CONS, append
• Use recursion, and not iteration.
• Adequate comments must be provided for each logical block in a function for all functions in a program.
• Adequate comments must be provided for the entire program.
• Code must be formatted properly using indentation to enhance readability.
• The program must be tested completely.
o First, test each function completely. Choose at least two different test cases and explain how to arrive at the output. List the test cases and their results in comment format at the end of the program (for grading). Write the explanations in a separate document using the template provided. You must name this document using the following naming convention: FirstNameLastNameBSTTestCases and submit a PDF for this document. A Word template is provided.
o Next, test the entire program by invoking the functions in logical order. Choose at least two different test cases and explain how to arrive at the output. List the test cases and their results in comment format at the end of the program (for grading). Add the explanations to the above-mentioned document.
o As mentioned in the previous steps, you must include the test cases and their results at the bottom of the program in comment format and complete the explanations in the test case document. A program without test cases included at the end of the program or without a test case explanation document will be returned with 0 points awarded. A submission of the test case explanation document without the source code will be returned with 0 points awarded.
• Submit the program with file extension .rkt and the test case document as a PDF.

Write a program that implements a binary search tree (BST). Assume that a BST organizes integers and can't contain duplicate values.
A binary search tree organizes its data by value. Each node n in a binary search tree satisfies the following properties:
• n's value is greater than all values in its left subtree TL.
• n's value is less than all values in its right subtree TR.
• Both TL and TR are binary search trees.
A non-empty binary search tree can be represented as a list of three elements: (a value, TL, TR). The first element is a node's value (the root), the second element is the node's left subtree, and the third element is the node's right subtree. ...
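The three-element representation described above can be sketched outside Scheme as well. The following Python sketch is only an illustration of the data structure, not the assignment's required Scheme solution; the function names `bst_insert`, `bst_contains`, and `bst_inorder`, and the use of an empty tuple for the empty tree, are assumptions made here. Each function is recursive and returns new values instead of mutating the tree, in the spirit of the pure-functional requirement.

```python
# A BST as nested tuples (value, left, right); the empty tree is ().
# All names below are illustrative, not taken from the assignment.

def bst_insert(tree, value):
    """Return a new tree that contains value; duplicates are ignored."""
    if not tree:                       # empty tree: create a leaf node
        return (value, (), ())
    root, left, right = tree
    if value < root:                   # smaller values go to the left subtree
        return (root, bst_insert(left, value), right)
    if value > root:                   # larger values go to the right subtree
        return (root, left, bst_insert(right, value))
    return tree                        # value already present, tree unchanged


def bst_contains(tree, value):
    """Return True if value is stored in the tree."""
    if not tree:
        return False
    root, left, right = tree
    if value == root:
        return True
    return bst_contains(left if value < root else right, value)


def bst_inorder(tree):
    """Return the stored values in ascending order."""
    if not tree:
        return []
    root, left, right = tree
    return bst_inorder(left) + [root] + bst_inorder(right)


# The loop is only demo scaffolding; the tree functions themselves recurse.
tree = ()
for v in [8, 3, 10, 1, 6, 3]:
    tree = bst_insert(tree, v)
print(bst_inorder(tree))  # [1, 3, 6, 8, 10]
```

The duplicate 3 in the demo input is dropped, which matches the stated assumption that the tree cannot contain duplicate values.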

March 31, 2023 · 6 min · jiezi

On algorithms: MATH4007 Computational Statistics

MATH4007/G14CST COMPUTATIONAL STATISTICS
Assessed Coursework 2 - 2021/2022

Your work should be submitted electronically via the module's Moodle page by 15:00 Wednesday 4th May 2022. Since this work is assessed, your submission must be entirely your own work (see the University's policy on Academic Misconduct). Submissions up to five working days late will be subject to a penalty of 5% of the maximum mark per working day.

Submission requirements
The submission should be uploaded electronically via the submission box on Moodle, and contain: ...

March 31, 2023 · 6 min · jiezi

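On algorithms: Machine Learning in Practice [1]: Industrial Steam Volume Prediction (latest version, part 2, covering feature optimization, model fusion, and more)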

Industrial Steam Volume Prediction (latest version, part 2)

5. Model validation
5.1 Model evaluation concepts and regularization
5.1.1 Overfitting and underfitting

```python
# Generate and plot the dataset
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  # Jupyter notebook magic

np.random.seed(666)
x = np.random.uniform(-3.0, 3.0, size=100)
X = x.reshape(-1, 1)
y = 0.5 * x**2 + x + 2 + np.random.normal(0, 1, size=100)
plt.scatter(x, y)
plt.show()
```

Fit the data with linear regression:

```python
from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()
lin_reg.fit(X, y)
lin_reg.score(X, y)
# Output: 0.4953707811865009
```

The score (the coefficient of determination R², not an accuracy) is 0.495, which is low: a straight line fits this data poorly.

```python
# Use the mean squared error to judge the quality of the fit
from sklearn.metrics import mean_squared_error

y_predict = lin_reg.predict(X)
mean_squared_error(y, y_predict)
# Output: 3.0750025765636577
```

```python
# Plot the fitted line
y_predict = lin_reg.predict(X)
plt.scatter(x, y)
plt.plot(np.sort(x), y_predict[np.argsort(x)], color='r')
plt.show()
```

5.1.2 Evaluation metrics for regression models and how to call them

```python
# Fit with polynomial regression, wrapped in a Pipeline
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import StandardScaler

def PolynomialRegression(degree):
    return Pipeline([
        ('poly', PolynomialFeatures(degree=degree)),
        ('std_scaler', StandardScaler()),
        ('lin_reg', LinearRegression())
    ])
```

Fit the data with the Pipeline, using degree = 2:

```python
poly2_reg = PolynomialRegression(degree=2)
poly2_reg.fit(X, y)
y2_predict = poly2_reg.predict(X)
# Mean squared error between true and predicted values
mean_squared_error(y, y2_predict)
# Output: 1.0987392142417856
```

Plot the fitted curve:

```python
plt.scatter(x, y)
plt.plot(np.sort(x), y2_predict[np.argsort(x)], color='r')
plt.show()
```

...
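The two numbers reported above, the value returned by `score` and the mean squared error, can be reproduced by hand. The sketch below is a plain-Python illustration of the two formulas on a made-up four-point example; the helper names `mse` and `r2` and the sample values are assumptions for illustration and are not part of the original notebook.

```python
# Hand-written versions of the two regression metrics used above.
# mse and r2 are illustrative names; the data is a toy example.

def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(mse(y_true, y_pred))  # 0.125
print(r2(y_true, y_pred))
```

A lower mean squared error and an R² closer to 1 both indicate a better fit, which is why the degree-2 pipeline (MSE about 1.10) is preferred over the straight line (MSE about 3.08) on this dataset.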

March 31, 2023 · 25 min · jiezi

On algorithms: ENST20001 Human Behaviour and Environment

ENST20001 Human Behaviour and Environment
Assignment 2: Assessing factors that influence environmental action

Due Date: 11pm Sunday May 15, 2022
Word limit: 1500 words (+/-10%) excluding reference list, interviewee quotes, coding table and transcripts of interviews and field notes
Submission: Electronic copy submitted online through LMS (in the assessment section)
Assessment weighting: 35% of final grade
Late penalty: 5% per day

Objective
The objective of this assignment is to explore the relative importance of contextual and attitudinal factors in shaping environmentally significant behaviour.
It responds in part to Stern (2000)'s assertion that: "The attitude-behaviour relationship is strongest when contextual factors are neutral and approaches zero when contextual factors are strongly positive or negative, effectively compelling or prohibiting the behaviour in question" (Stern 2000, p415).

Learning outcomes
Through completing this assignment, you will develop: ...

March 30, 2023 · 7 min · jiezi

On algorithms: ECE 536 Power System Protection

ECE 536 - Power System Protection
Spring 2022, 3 credits
Tue, Thu 10:00-11:20am, Strand Agriculture Hall 260
Course reference materials, assignments, grading: Canvas (ECE_536_001_S2022)

Catalog Course Description: Fundamentals of protective relaying. Relay input sources. Generation, transmission and distribution systems protection. Stability and load shedding.

Office hours: By appointment (see dedicated Keybase channel)

Course Content:
• Introduction to protective relaying
• Review: per unit system, symmetrical components, fault analysis
• Relay input sources
• Protection design fundamentals
• Grounding principles
• Bus protection
• Transmission line protection
• Generator and motor protection
• Stability and load shedding

Course Specific Measurable Student Learning Outcomes:
Upon completing this course, students will be able to ... ...

March 30, 2023 · 2 min · jiezi

On algorithms: COMSM0088 Advanced Data Analytics

Tableau Geocoding Cheat Sheet
Advanced Data Analytics
COMSM0088

1 Preamble
Tableau, for whatever reason, does not natively support UK territory geocoding out of the box. As such, if we want to represent our data as a map, there are some steps we must follow before simply dragging data onto the sheet.
There are two ways in which we can gain support for UK geocoding, each with their benefits and drawbacks.

1.1 Geocoding Packs
The simplest way is to make use of a precompiled geocoding pack. These typically overwrite the preexisting geocoding presets offered by Tableau. For this tutorial, we will use the pack offered by Craig Bloodworth of The Information Lab, found at https://www.theinformationlab.co.uk/2015/06/01/uk-filled-map-geocoding-pack-for-tableau/. This is referenced in the coursework specification.
Download the geocoding pack, following either the link available in the above referenced webpage, or directly from https://fileshare.theinformationlab.co.uk/index.php/s/RIHqO61bWZsWvaY/download.

1.1.1 Setting up the Geocoding Pack
Follow the steps below in order to correctly set up the geocoding pack after downloading it. ...

March 30, 2023 · 5 min · jiezi

On algorithms: ACCT2019 Management Accounting

ACCT2019 Management Accounting
Group Assignment
Semester 1, 2022

Instructions for Parts A & B
Scope: There are two parts in this assignment. Part A is a group assessment and Part B is an individual assessment. Part A requires students, as a group, to carry out an analysis of the case study (Gretzky Pty Ltd, described in this document) and submit an executive report in PowerPoint format. Part B requires each student to map the Gretzky Pty Ltd case study data in the SAP accounting system, complete several transactions and reports, and submit a document. This assignment requires students to demonstrate their:
i) Ability to identify and apply relevant management accounting concepts and techniques to practical business contexts and make recommendations with a focus on the usage of qualitative and quantitative information.
ii) Specialist SAP software skills by mapping the business scenario in SAP, determination of relevant master data and transactions, their creation and/or execution and producing relevant reports from the SAP accounting system. ...

March 30, 2023 · 17 min · jiezi

On algorithms: MAS61006 Topics and Discussion

MAS61006 Assessed Project
This project counts for 40% of the assessment for MAS61006.

1 Aim
The aim of this project is to assess you on the skills in Bayesian modelling via computational methods that you have learned on this module. Exploration and choice of an appropriate modelling approach, as well as how you can disseminate your Bayesian inference to a general audience, are key elements of this assessment.

2 Background
You are a statistician working with a pharmaceutical company who market a birth control drug. Your involvement with the client is to assist them in better understanding their potential customer base by investigating a range of demographic variables that may be linked to the uptake of birth control by a woman. Primary interest is in:
• Identifying the key demographic variables that have an effect on birth control use,
• Quantifying any such demographic variable effect, and
• Predicting the chance of certain demographic groups purchasing birth control in the future (to know their key marketing groups).
There is a single deliverable for this project, in the form of a written report.

3 Data
The data made available to you by the client are the results of a market research investigation. This is called birth-control-data.csv, and is available on the course Blackboard page. This data contains information on 1,934 women regarding the following variables:
• birthControl: a binary (0/1) response for whether the subject uses birth control (1 encoded as use of birth control and 0 as not),
• region: a factor variable describing the primary care region that the subject belongs within. Note that this market research involved 60 care regions, which does not cover the full range of the company's target market,
• homeStyle: a factor variable indicating whether the subject lives in a rural or urban area (0 is encoded as rural, and 1 as urban),
• children: the number of children the subject has. Note that the average number of children an individual has in this market research study is 2.65,
• age: the age of the subject. Note that this variable has been standardised, so that the average age in this study is 0,
• wealth: the financial wealth of the subject. Note that this variable has been standardised, so that the average wealth measure in this study is 0.
You can import this data to R using the read_csv() function in the usual way.

4 Scope of analysis
The requirements of this analysis are: ...

March 30, 2023 · 5 min · jiezi

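On algorithms: Machine Learning in Practice [1]: Industrial Steam Volume Prediction (latest version, part 1, covering data exploration, feature engineering, and more)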

Machine Learning in Practice [1]: Industrial Steam Volume Prediction

Background
The basic principle of thermal power generation is as follows: burning fuel heats water to produce steam, the steam pressure drives a turbine, and the turbine drives a generator that produces electricity. In this chain of energy conversions, the key factor for generation efficiency is the combustion efficiency of the boiler, that is, how effectively burning fuel heats water into high-temperature, high-pressure steam. Many factors influence boiler combustion efficiency, including adjustable boiler parameters such as fuel feed rate, primary and secondary air, induced air, return air, and feed-water volume, as well as boiler operating conditions such as bed temperature, bed pressure, furnace temperature and pressure, and superheater temperature.

Task description
Given anonymized boiler sensor data (sampled at minute-level frequency), predict the amount of steam produced from the boiler operating conditions.

Data description
The data is split into training data (train.txt) and test data (test.txt). The 38 fields "V0" to "V37" are the feature variables and "target" is the target variable. Participants train a model on the training data and predict the target variable of the test data; ranking is based on the MSE (mean square error) of the predictions.

Evaluation
Predictions are judged by mean square error.

Original project link: https://www.heywhale.com/home/column/64141d6b1c8c8b518ba97dcc

1. Exploratory data analysis

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline  # Jupyter notebook magic

# Download the datasets used below
!wget http://tianchi-media.oss-cn-beijing.aliyuncs.com/DSW/Industrial_Steam_Forecast/zhengqi_test.txt
!wget http://tianchi-media.oss-cn-beijing.aliyuncs.com/DSW/Industrial_Steam_Forecast/zhengqi_train.txt
```

Both downloads return HTTP 200 OK: zhengqi_test.txt is 466,959 bytes (456K) and zhengqi_train.txt is 714,370 bytes (698K). In the logged run the files were saved as zhengqi_test.txt.1 and zhengqi_train.txt.1.

```python
# Read the data files with the pandas read_csv() function; the separator is '\t'
train_data_file = "./zhengqi_train.txt"
test_data_file = "./zhengqi_test.txt"
train_data = pd.read_csv(train_data_file, sep='\t', encoding='utf-8')
test_data = pd.read_csv(test_data_file, sep='\t', encoding='utf-8')
```

1.1 Inspecting the data

```python
# Show information about the feature variables
train_data.info()
```

```
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2888 entries, 0 to 2887
Data columns (total 39 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   V0      2888 non-null   float64
 1   V1      2888 non-null   float64
 (rows 2 to 36, columns V2 to V36, have the same form: 2888 non-null float64)
 37  V37     2888 non-null   float64
 38  target  2888 non-null   float64
dtypes: float64(39)
memory usage: 880.1 KB
```

The training set has 2888 samples and 38 feature variables, V0 to V37. All variables are numeric and no feature has missing values. Because the fields have been anonymized, the concrete meaning of each feature has been removed. The target field is the label variable. ...

March 30, 2023 · 30 min · jiezi

On algorithms: Machine Learning Algorithms (9): LDA Handwritten Digit Classification with a Linear Discriminant Model

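1. Machine Learning Algorithms (9): LDA Handwritten Digit Classification with a Linear Discriminant Model

Project link: https://www.heywhale.com/home/column/64141d6b1c8c8b518ba97dcc

1.1 Introduction to LDA and its applications
Linear discriminant analysis (LDA) is widely used in pattern recognition, for example in face recognition and other image recognition tasks. LDA is a supervised dimensionality reduction technique: every sample in its dataset carries a class label. This differs from PCA, which is an unsupervised technique that ignores class labels. The idea of LDA can be summarized in one sentence: "after projection, minimize the within-class variance and maximize the between-class variance". We project the data onto a lower-dimensional space so that the projected points of each class lie as close together as possible while the centers of different classes lie as far apart as possible. In other words, after projection the points form one cluster per class, and points of the same class end up closer together in the projected space.

One goal of LDA is to push different classes as far apart as possible and pull members of the same class as close together as possible. Pushing classes apart is easy to understand: the farther apart they are, the easier they are to distinguish. As for the second goal, covariance reflects not only the correlation between variables but also the dispersion of a multi-dimensional sample distribution (for one-dimensional samples, the variance does this). The larger the covariance (in absolute value, for negative correlation), the more scattered the data. This explains the statement that "to make the projections of same-class samples as close as possible, make the covariance matrix of same-class samples as small as possible".

$J(w)=\frac{|w^T(\mu_1-\mu_2)|^2}{s_1^2+s_2^2}$

In the formula $J(w)$ above, the numerator is the squared difference between the projected class means ($\mu_1$ and $\mu_2$ are the class means before projection) and the denominator is the sum of the projected within-class variances $s_1^2$ and $s_2^2$. LDA aims to maximize $J$: maximizing the numerator pushes the classes apart, and minimizing the denominator shrinks the variance inside each class, so that under the projection $w$ the classes are separated as clearly as possible.

Note that LDA is suited to linearly separable data. The MNIST handwritten digit data used in the practical part below is in fact not linearly separable, yet LDA still classifies it reasonably well. In later work, use LDA with caution on data that is not linearly separable.

1.2 Applications
LDA is widely used in pattern recognition (face recognition, ship recognition, and other image recognition tasks), so it is worth understanding how it works. Before studying it, distinguish it from the LDA used in natural language processing, where the abbreviation stands for Latent Dirichlet Allocation, a topic model for documents. This article discusses linear discriminant analysis, and every later mention of LDA refers to it.

Besides dimensionality reduction, LDA can also be used for classification. A common approach assumes that the samples of each class follow a Gaussian distribution. After projecting with LDA, maximum likelihood estimation gives the mean and variance of each class's projected data, and hence the Gaussian probability density function of that class. When a new sample arrives, we project it, evaluate the density function of every class at the projected feature to obtain the probability that the sample belongs to that class, and predict the class with the highest probability.

2. Workflow
Understand the basic principles of LDA
Learn to use LDA in code

Part 1: Demo
Step 1: Import libraries
Step 2: Train the model
Step 3: Inspect the model parameters
Step 4: Visualize the data and the model
Step 5: Predict with the model

Part 2: Handwritten digit classification with LDA
Step 1: Import libraries
Step 2: Read/load the data
Step 3: Take a quick look at the data and visualize it
Step 4: Train and predict on handwritten digits with LDA

3. Code
3.1 Demo
Step 1: Import libraries

```python
# Core array library
import numpy as np
# Plotting library
import matplotlib.pyplot as plt
# 3D plotting toolkit
from mpl_toolkits.mplot3d import Axes3D
# LDA model
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
# Helper for generating demo data
from sklearn.datasets import make_classification
```

Step 2: Train the model

```python
# Generate 1000 samples in four classes (about 250 samples per class)
X, y = make_classification(n_samples=1000, n_features=3, n_redundant=0,
                           n_classes=4, n_informative=2,
                           n_clusters_per_class=1, class_sep=3,
                           random_state=10)
# Show the four classes in 3D.
# fig.add_axes is used because recent Matplotlib releases no longer attach
# an Axes3D(fig, ...) instance to the figure automatically.
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1], projection='3d', elev=20, azim=20)
ax.scatter(X[:, 0], X[:, 1], X[:, 2], marker='o', c=y)
plt.show()
```

...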
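To make the criterion $J(w)$ from section 1.1 concrete, the sketch below evaluates it for two candidate projection directions on a tiny two-class dataset. It is a plain-Python illustration under assumptions made here: the sample points, the helper names (`project`, `scatter`, `fisher_criterion`), and the use of the sum of squared deviations as the within-class scatter are all illustrative and do not come from the original article.

```python
# Evaluate the two-class LDA criterion J(w) for a given direction w.
# Data and function names are illustrative.

def project(points, w):
    """Project each point onto direction w (dot product)."""
    return [sum(wi * xi for wi, xi in zip(w, p)) for p in points]


def mean(values):
    return sum(values) / len(values)


def scatter(values):
    """Within-class scatter: sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fisher_criterion(class1, class2, w):
    """Squared distance of projected means over summed projected scatter."""
    z1, z2 = project(class1, w), project(class2, w)
    return (mean(z1) - mean(z2)) ** 2 / (scatter(z1) + scatter(z2))


class1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]
class2 = [(4.0, 1.0), (5.0, 2.0), (6.0, 4.0)]

j_diag = fisher_criterion(class1, class2, (1.0, 1.0))    # along the clusters
j_anti = fisher_criterion(class1, class2, (1.0, -1.0))   # across the clusters
print(j_diag, j_anti)
```

Both classes are stretched roughly along the direction (1, 1), so projecting onto it mixes them and gives a small $J$. Projecting onto (1, -1) collapses each class to a tight group and separates the means, which gives a much larger $J$; LDA searches for the direction that maximizes this value.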

March 29, 2023 · 7 min · jiezi

On algorithms: Machine Learning Algorithms (8): Breast Cancer Classification with a BP Neural Network

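Machine Learning Algorithms (8): Breast Cancer Classification with a BP Neural Network

Project link: https://www.heywhale.com/home/column/64141d6b1c8c8b518ba97dcc

1. Introduction and applications
1.1 Introduction
The BP (Back Propagation) network was proposed in 1986 by a group of scientists led by Rumelhart and McClelland. It is a multi-layer feedforward network trained with the error backpropagation algorithm and is one of the most widely used neural network models. A BP network can learn and store a large number of input-output mappings without requiring the mathematical equations that describe those mappings in advance. Its learning rule uses steepest descent: backpropagation repeatedly adjusts the weights and thresholds of the network to minimize the sum of squared errors. The topology of a BP neural network consists of an input layer, hidden layers, and an output layer. During training the error produced by the system is collected and propagated backward, the weights are adjusted, and this iteration continues until the model approaches an overall optimum (this is a loop, and training a neural network means repeating it over and over).

The key points of a BP neural network are the forward propagation of data and the backward propagation of error, which together update the parameters so that the loss is minimized.

The error backpropagation algorithm is called the backpropagation algorithm, or BP algorithm, for short. A multi-layer perceptron trained with backpropagation is also called a BP neural network. BP is an iterative algorithm whose basic idea is:
(1) Compute the state and activation of each layer in turn up to the last layer (the signal propagates forward);
(2) Compute the error of each layer, starting from the last layer and moving backward (this is where the name backpropagation comes from);
(3) Update the parameters (the goal is to reduce the error). Repeat the previous steps until a stopping criterion is met, for example when the errors of two consecutive iterations differ very little.

The chain rule for derivatives is essential in this process. It is worth deriving the gradient backpropagation of a BP network by hand and becoming fluent with the chain rule in order to update the parameters.

1.2 Applications
BP reflects the basic process by which a biological nervous system handles external stimuli. It is a computing system developed by simulating the neural tissue of the human brain: a network built from a large number of widely interconnected processing units. It has the basic characteristics of a biological nervous system, reflects several functions of the human brain to some degree, and is a kind of simulation of a biological system. Its advantages include large-scale parallelism, distributed processing, self-organization, and self-learning. It is widely used in speech analysis, image recognition, digital watermarking, computer vision, and many other fields, with many notable results. With the recent rapid development of artificial neural networks, it has become a powerful tool for pattern recognition. Neural networks have opened up new fields and solved problems that other pattern recognition methods cannot, and their classification ability is particularly well suited to pattern recognition and classification.

2. Workflow
Understand the basic principles of BP
Learn to use BP in code

Part 1: Demo
Step 1: Import libraries
Step 2: Train the model
Step 3: Inspect the model parameters
Step 4: Visualize the data and the model
Step 5: Predict with the model

Part 2: Breast cancer classification with a BP neural network
Step 1: Import libraries
Step 2: Read/load the data
Step 3: Take a quick look at the data and visualize it
Step 4: Train and predict on the breast cancer data with a BP neural network

3. Code
3.1 Part 1: Demo
Step 1: Import libraries

```python
# Core array library
import numpy as np
# Plotting library
import matplotlib.pyplot as plt
# 3D plotting toolkit
from mpl_toolkits.mplot3d import Axes3D
# BP model
from sklearn.neural_network import MLPClassifier
# Helper for generating demo data
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import warnings
from sklearn.exceptions import ConvergenceWarning
```

Step 2: Train the model

```python
# Generate 1000 samples in five classes (about 200 samples per class)
train_samples, train_labels = make_classification(
    n_samples=1000, n_features=3, n_redundant=0, n_classes=5,
    n_informative=3, n_clusters_per_class=1, class_sep=3, random_state=10)
# Show the five classes in 3D.
# fig.add_axes is used because recent Matplotlib releases no longer attach
# an Axes3D(fig, ...) instance to the figure automatically.
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1], projection='3d', elev=20, azim=20)
ax.scatter(train_samples[:, 0], train_samples[:, 1], train_samples[:, 2],
           marker='o', c=train_labels)
plt.title('Demo Data Map')
```

...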
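The three steps of the BP algorithm listed in section 1.1 (forward pass, backward error computation, parameter update) can be sketched with NumPy for a network with one hidden layer. This is a minimal illustration under assumptions chosen here, not the `MLPClassifier` implementation used in the demo: sigmoid activations, a sum-of-squares loss, the XOR inputs as toy data, a learning rate of 0.5, and the function names are all illustrative.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, X):
    """Step (1): compute the activations layer by layer."""
    W1, b1, W2, b2 = params
    a1 = sigmoid(X @ W1 + b1)     # hidden-layer activations
    a2 = sigmoid(a1 @ W2 + b2)    # output-layer activations
    return a1, a2


def loss(params, X, y):
    """Half the sum of squared errors over all samples."""
    _, a2 = forward(params, X)
    return 0.5 * np.sum((a2 - y) ** 2)


def backward(params, X, y):
    """Step (2): propagate the error from the output back to the input."""
    W1, b1, W2, b2 = params
    a1, a2 = forward(params, X)
    delta2 = (a2 - y) * a2 * (1 - a2)           # output-layer error (chain rule)
    delta1 = (delta2 @ W2.T) * a1 * (1 - a1)    # hidden-layer error
    # Gradients in the same order as params: W1, b1, W2, b2
    return [X.T @ delta1, delta1.sum(axis=0), a1.T @ delta2, delta2.sum(axis=0)]


rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
params = [rng.normal(size=(2, 3)), np.zeros(3), rng.normal(size=(3, 1)), np.zeros(1)]

print("loss before:", loss(params, X, y))
for _ in range(200):
    grads = backward(params, X, y)
    # Step (3): move every parameter against its gradient
    params = [p - 0.5 * g for p, g in zip(params, grads)]
print("loss after:", loss(params, X, y))
```

The two `delta` lines are where the chain rule appears: each layer's error is the error of the layer after it, pushed back through the weights and multiplied by the derivative of the sigmoid.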

March 28, 2023 · 4 min · jiezi

On algorithms: Training a Chinese ChatGPT Is Not That Hard: No A100 Needed, Open-Source Alpaca-LoRA + RTX 4090 Is Enough

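Reposted from 机器之心. Editor: 一点人工一点智能. Original article: "Training a Chinese ChatGPT Is Not That Hard: No A100 Needed, Open-Source Alpaca-LoRA + RTX 4090 Is Enough".

In 2023 the chatbot field seems to have only two camps left: "OpenAI's ChatGPT" and "everyone else". ChatGPT is powerful, but OpenAI is very unlikely to open-source it. The "everyone else" camp performs less well, but many people are putting effort into open source, for example LLaMA, which Meta open-sourced a while ago.

LLaMA is the collective name for a family of models ranging from 7 billion to 65 billion parameters. Among them, the 13-billion-parameter LLaMA model outperforms the 175-billion-parameter GPT-3 "on most benchmarks". However, the model has not been instruction-tuned, so its generations are relatively poor.

To improve its performance, researchers from Stanford carried out the instruction tuning and trained a new 7-billion-parameter model called Alpaca (based on LLaMA 7B). Specifically, they had OpenAI's text-davinci-003 model generate 52K instruction-following samples in a self-instruct manner and used them as Alpaca's training data. Experiments show that many of Alpaca's behaviors resemble those of text-davinci-003. In other words, the lightweight 7B-parameter Alpaca performs comparably to very large language models such as GPT-3.5.

For ordinary researchers this is a practical and cheap way to fine-tune, although it still needs considerable compute (the authors report fine-tuning for 3 hours on eight 80GB A100s). Moreover, Alpaca's seed tasks are all in English and the collected data is also in English, so the trained model is not optimized for Chinese.

To reduce the cost of fine-tuning further, another Stanford researcher, Eric J. Wang, reproduced Alpaca's results with LoRA (low-rank adaptation). Specifically, using a single RTX 4090 graphics card, Eric J. Wang trained a model on par with Alpaca in only 5 hours, bringing the compute requirement of such models down to consumer grade. The model can also run on a Raspberry Pi (for research).

How LoRA works. The idea of LoRA is to add a bypass next to the original PLM that projects down to a lower dimension and then back up, to model the so-called intrinsic rank. During training the PLM parameters are frozen and only the down-projection matrix A and the up-projection matrix B are trained. The input and output dimensions of the model stay the same, and at output time BA is added to the PLM parameters. A is initialized from a random Gaussian distribution and B is initialized to zero, which guarantees that the bypass is still a zero matrix at the start of training (quoted from https://finisky.github.io/lora/).

The biggest advantages of LoRA are that it is faster and uses less memory, so it can run on consumer-grade hardware.

The Alpaca-LoRA project released by Eric J. Wang. Project address: https://github.com/tloen/alpaca-lora

For researchers who want to train their own ChatGPT-like model (including a Chinese one) but lack top-tier compute, this is undoubtedly a welcome surprise. Since the Alpaca-LoRA project appeared, tutorials and training results built around it have kept emerging, and this article introduces several of them. ...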
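The LoRA bypass described above (a frozen weight plus a trainable low-rank product BA, with A drawn from a Gaussian and B starting at zero) can be sketched for a single linear layer with NumPy. This is only an illustration of the idea, not code from the Alpaca-LoRA repository; the layer sizes, the rank `r`, the initialization scale, and the function name `lora_forward` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 8, 2                       # illustrative sizes; r is the low rank

W = rng.normal(size=(d_out, d_in))             # stand-in for a frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))     # down-projection, Gaussian init
B = np.zeros((d_out, r))                       # up-projection, zero init


def lora_forward(x, W, A, B):
    """Frozen path x W^T plus the low-rank bypass x A^T B^T."""
    return x @ W.T + (x @ A.T) @ B.T


x = rng.normal(size=(3, d_in))

# At initialization B is zero, so the bypass contributes nothing.
print(np.allclose(lora_forward(x, W, A, B), x @ W.T))  # True

# Only A and B would be trained: r * (d_in + d_out) values instead of d_out * d_in.
print(A.size + B.size, "trainable values versus", W.size)
```

After training, the product `B @ A` has the same shape as `W`, so it can be added to the frozen weight once and the merged layer used without any extra cost at inference time.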

March 27, 2023 · 2 min · jiezi

On algorithms: Prefix Sum Practice Problems

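Truncating an array

Given an array a1, a2, …, an of length n.
The array is to be cut in two places, producing three non-empty subarrays.
The sums of the elements in the three subarrays must all be equal.
How many different ways of cutting are there?

Input format
The first line contains the integer n.
The second line contains n integers a1, a2, …, an.

Output format
Output one integer, the number of ways to cut.

Constraints
The first six test cases satisfy $1≤n≤10$.
All test cases satisfy $1≤n≤10^5$, $−10000≤a_i≤10000$.

Sample input 1:
4
1 2 3 3
Sample output 1:
1

Sample input 2:
5
1 2 3 4 5
Sample output 2:
0

Sample input 3:
2
0 0
Sample output 3:
0

code

```cpp
#include <iostream>
#include <algorithm>
#include <cstdio>   // puts, printf
using namespace std;

const int N = 100010;
typedef long long LL;

int n;
int s[N];   // prefix sums

int main()
{
    cin >> n;
    for (int i = 1; i <= n; i ++ )
    {
        int x;
        cin >> x;
        s[i] = s[i - 1] + x;
    }

    // Each subarray must sum to one third of the total
    if (s[n] % 3) puts("0");
    else
    {
        LL res = 0;
        // Look for the positions of the two cuts
        for (int j = 3, cnt = 0; j <= n; j ++ )
        {
            // A first cut after position j - 2 is valid if the prefix is one third
            // of the total; cnt counts the valid first cuts up to position j - 2
            if (s[j - 2] == s[n] / 3) cnt ++ ;
            // A second cut after position j - 1 is valid if the remaining suffix is
            // one third of the total; it pairs with each of the cnt first cuts
            if (s[n] - s[j - 1] == s[n] / 3) res += cnt;
        }
        printf("%lld\n", res);
    }
    return 0;
}
```

K-multiple intervals

Given a sequence $A_1,A_2,…A_N$ of length N, if the sum of a contiguous subsequence $A_i,A_{i+1},…A_j$ is a multiple of K, we call the interval [i,j] a K-multiple interval. ...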
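The definition of a K-multiple interval suggests a prefix-sum approach: the sum of $A_i,…,A_j$ is a multiple of K exactly when the prefix sums up to $j$ and up to $i-1$ leave the same remainder modulo K. The Python sketch below counts such pairs. It is one standard way to solve the problem and is offered as an assumption, since the original solution is cut off in this excerpt; the function name `count_k_intervals` is illustrative.

```python
from collections import Counter


def count_k_intervals(a, k):
    """Count contiguous subarrays of a whose sum is a multiple of k."""
    seen = Counter({0: 1})     # remainder of the empty prefix
    prefix = 0
    total = 0
    for x in a:
        prefix = (prefix + x) % k      # remainder of the current prefix sum
        total += seen[prefix]          # each earlier equal remainder ends an interval here
        seen[prefix] += 1
    return total


print(count_k_intervals([1, 2, 3, 4, 5], 2))  # 6
```

For the sample call, the prefix remainders modulo 2 are 0, 1, 1, 0, 0, 1 (including the empty prefix), so there are three prefixes with each remainder and 3 + 3 = 6 matching pairs. The loop runs in time linear in the length of the sequence.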

March 26, 2023 · 4 min · jiezi

On algorithms: SLAM Classics: FAST-LIO

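Reposted from: 柑橘全程机械化平台. Editor: 一点人工一点智能. Original article: "SLAM Classics: FAST-LIO".

Today we share a classic paper published in IEEE Robotics and Automation Letters: FAST-LIO: A Fast, Robust LiDAR-inertial Odometry Package by Tightly-Coupled Iterated Kalman Filter. The paper introduces FAST-LIO, a SLAM algorithm developed by the Mars Lab at the University of Hong Kong. Paper: https://arxiv.org/abs/2010.08196v3

01 Background
SLAM is a concept that grew out of the development of robotics to solve the problem of autonomous localization and navigation. It is now widely used in autonomous driving, drones, surveying and mapping, and AR and VR. SLAM is divided into LiDAR SLAM and visual SLAM, which differ mainly in the sensors used and the data acquired. FAST-LIO belongs to LiDAR SLAM.

02 Problems and methods
Many classic LiDAR SLAM algorithms existed before FAST-LIO, such as LOAM and LeGO-LOAM, but the authors found that these algorithms all share the following limitations to some degree:
1) The feature points in LiDAR measurements are usually geometric structures in the environment (for example edges and planes). When a UAV operates in a cluttered environment without strong features, LiDAR-based solutions degenerate easily. The problem is more pronounced when the LiDAR has only a small field of view.
2) Because of the high resolution along the scan direction, a LiDAR scan usually contains many feature points. These feature points cannot guarantee a reliable pose estimate when degeneration occurs, yet fusing such a large number of feature points with IMU measurements still requires enormous computing resources that a UAV onboard computer can hardly afford.
3) Because a LiDAR samples points with several lasers/receivers, the points in a scan are not guaranteed to be sampled at the same time. This causes motion distortion, which significantly degrades scan registration. In addition, the continuous rotation of the UAV propellers and motors adds noticeable noise to the IMU measurements.

To address these problems, the paper takes the following approach:
1) To cope with fast motion, noise, or cluttered environments where degeneration occurs, it uses a tightly-coupled iterated Kalman filter to fuse LiDAR feature points with IMU measurements.
2) To reduce the heavy computation caused by the large number of LiDAR feature points, it proposes a new formula for computing the Kalman gain and proves that it is equivalent to the conventional Kalman gain formula. The computational complexity of the new formula depends on the state dimension instead of the measurement dimension.
3) It proposes a backward propagation process to compensate for the motion distortion introduced during LiDAR sampling.

03 Algorithm
The overall FAST-LIO pipeline is shown in the figure below. During mapping, the LiDAR input is first fed into the feature extraction module to obtain planar and edge features. The extracted features and the IMU measurements are then fed into the state estimation module, which estimates the state at 10Hz−50Hz. The estimated pose is then used to register the feature points into the global frame and merge them with the feature point map built so far. The updated map is used to register more new points in the next step. The clear difference from earlier algorithms is that a backward propagation process is added to the state estimation module, and its result is fed into the residual computation together with the result of forward propagation. Each part involves a large amount of mathematical derivation; interested readers can consult the original paper.

04 Experiments
To verify the advantages of FAST-LIO, the authors carried out three experiments.
In the first experiment, with all other conditions unchanged, they used the new and the old Kalman gain formulas in turn and compared the resulting computation time against the number of feature points. The results are shown in the table; the new Kalman gain formula does reduce the computation and the running time substantially.
In the second experiment, a UAV running FAST-LIO was flown indoors and compared against LOAM. The results show that FAST-LIO runs faster and more stably. The angular velocity and acceleration of the UAV during the experiment and the final comparison are shown in the figure below.
In the third experiment, a UAV running FAST-LIO was flown outdoors. The drift in this experiment was less than 0.05% (0.07 m of drift over a 140 m trajectory). The scan rate was set to 10 Hz, the average processing time per scan was 25 ms, and there were 1497 valid feature points per scan on average. The mapping result for the main building of the University of Hong Kong is shown in the figure below.

05 Summary
This paper proposes a LiDAR-inertial odometry framework based on a tightly-coupled iterated Kalman filter. The framework uses forward and backward propagation to predict the state and to compensate for motion in LiDAR scans. In addition, the paper proves and implements a formula that has lower complexity but is equivalent to the conventional Kalman gain computation. Finally, all of the tests show that the algorithm can provide accurate, real-time, and reliable navigation results.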
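The claim that the Kalman gain can be computed with a cost that depends on the state dimension can be checked numerically. The sketch below compares the conventional gain $K = P H^T (H P H^T + R)^{-1}$ with the equivalent form $K = (H^T R^{-1} H + P^{-1})^{-1} H^T R^{-1}$ on random matrices. The dimensions, the random test matrices, and the variable names are assumptions made for illustration; the sketch shows the algebraic equivalence only and is not the FAST-LIO implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 50                       # state dimension n, measurement dimension m

M = rng.normal(size=(n, n))
P = M @ M.T + n * np.eye(n)        # symmetric positive definite state covariance
H = rng.normal(size=(m, n))        # measurement Jacobian
R = 0.1 * np.eye(m)                # measurement noise covariance (diagonal)

# Conventional form: inverts an m x m matrix (m grows with the number of points)
K_standard = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)

# Equivalent form: the dense inversion is only n x n
R_inv = np.linalg.inv(R)           # cheap when R is diagonal, as assumed here
K_state_dim = np.linalg.inv(H.T @ R_inv @ H + np.linalg.inv(P)) @ H.T @ R_inv

print(np.allclose(K_standard, K_state_dim))  # True
```

With thousands of feature points per scan, `m` is far larger than `n`, so replacing the `m x m` inversion with an `n x n` one is where the saving comes from.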

March 24, 2023 · 1 min · jiezi

On algorithms: Don't Let Fatigue Be Your Last Journey: How to Avoid Driver Fatigue | 曼孚科技

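How serious are the consequences of fatigued driving?

When drivers are mildly fatigued, their thinking and movements slow down, so they do not react in time.
When drivers are severely fatigued, they may forget to perform an operation or doze off involuntarily, and may even lose control of the vehicle.

According to the UK Transport Research Laboratory, driver fatigue accounts for about 10% of all road traffic accidents each year. Drivers need a method that can detect fatigue behaviour and promptly remind them, through voice, vibration, alarms, and similar means, to adjust their state, thereby reducing traffic accidents.

At present, research on fatigue-driving detection falls into subjective detection and objective detection.

Subjective detection judges the driver's fatigue state from self-assessment forms, self-report questionnaires, the Pearson fatigue scale, and sleepiness scales. This approach depends heavily on the driver and is hard to run in real time, so it has gradually been replaced by objective detection.

Objective detection falls into two categories: detecting the state of the driver and detecting the state of the vehicle.

1. Driver state detection
■ Detection based on physiological indicators: monitor the driver's pulse, EEG signals, and ECG signals to track the driver's physical state. This method detects fatigued driving accurately, but it is costly and may interfere with normal driving.
■ Detection based on behavioural state: this non-intrusive method uses image detection to compare the driver's facial features, such as eye features, gaze direction, mouth state, and head position, and judges the driver's fatigue state from them.

2. Vehicle state detection
Vehicle state detection analyzes data such as vehicle speed, vehicle trajectory, steering wheel grip/torque, lane departure, and brake and accelerator use, and compares it with data from normal driving to infer how fatigued the driver is, monitoring the driver indirectly.

Compared with the previous method, vehicle-based detection is cheaper to deploy. However, it does not monitor the driver directly and it depends on road and lane conditions (how clearly the lanes are marked). In complex real-world scenes it struggles to assess driver fatigue and distraction accurately and is prone to false alarms.

At present, detecting the driver's behavioural state is the mainstream fatigue-detection approach on the market. The signs of fatigue in a person are direct and obvious, for example blink rate, eye movement, yawning, and nodding. A camera records these states and they are then recognized and judged.

Technically, behavioural state detection mainly uses computer vision (face recognition) to analyze the driver's facial features and to recognize identity, expression, eye state, head pose, and other indicators, from which the driver's fatigue state is judged.

In computer vision tasks, the performance and practical effectiveness of a model depend directly on the quality and quantity of the training data, and data annotation is the key step in ensuring training data quality.

As the foundation of algorithm training, the unstructured data collected by sensors must go through manual or automatic annotation before it can be turned into structured data that a model can understand.

For example, suppose we are developing a fatigue-driving detection system based on behavioural state. It captures images of the driver with an in-vehicle camera and uses deep learning algorithms to detect fatigued driving behaviour. In this case the training data must be annotated so that the algorithm can recognize and learn the different fatigue behaviours.

Specifically, data annotation for fatigue behaviour detection can include the following steps:
■ Define the annotation goal: first make the goal explicit, for example detecting fatigued driving or preventing traffic accidents.
■ Choose annotation tools: select suitable tools and methods according to the goal and the objects to annotate. For example, video annotation can be used to capture the driver's behavioural state, which is then annotated with the corresponding tools.
■ Define annotation guidelines: to keep the annotated data consistent and accurate, define annotation guidelines and standards. For example, specify that the signs of fatigued driving are frequent yawning, long eye-closure times, and so on.
■ Annotate: annotate according to the guidelines and standards. Annotators need to examine the collected data carefully and label it according to the standard.
■ Review the annotations: to guarantee quality, the annotation results need to be reviewed. A random subset of the data can be annotated a second time and the results compared.
■ Organize and store the data: once annotation is complete, organize and store the annotated data. It can be stored in a database or in files, with backups.
■ Analyze and use the data: once annotation is complete, the data can be analyzed and used, for example to train a fatigue-driving detection model or to analyze driver behaviour patterns.

These steps help the algorithm learn and recognize different fatigue behaviours, which improves the accuracy and reliability of the fatigue-driving detection system and helps keep drivers safe.

As technology continues to advance, more efficient, accurate, and reliable fatigue-driving detection techniques will keep emerging and provide better protection for drivers on the road.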

March 24, 2023 · 1 min · jiezi

On algorithms: The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception

以下文章来源于智能汽车开发者平台 ,作者Alex Zihao Zhu1编辑:一点人工一点智能原文:多车平面事件相机数据集:用于3D感知的事件相机数据集 00  摘要基于事件的摄像机是一种新的无源传感形式,与传统的摄像机相比有许多长处,包含极低的提早、异步数据采集、高动静范畴和极低的功耗。最近,人们对利用算法来应用事件执行各种3D感知工作十分感兴趣,比方特色跟踪、视觉里程数测量和平面深度预计。然而,目前不足像传统相机那样丰盛的标记数据,无奈用于测试和开发。在本文中,咱们展现了一个大型数据集,该数据集采纳了基于事件的同步平面摄影零碎,该零碎由一个手持式设施携带,在各种不同的照明程度和环境中,由一架六轴飞行器航行,在汽车顶部驱动,并装置在摩托车上。从每个相机中,咱们提供事件流、灰度图像和IMU读数。此外,咱们利用IMU、刚性装置的激光雷达零碎、室内外静止捕获和GPS的组合,以高达100Hz的频率为每个摄像机提供精确的姿态和深度图像。为了进行比拟,咱们还提供了同步的灰度图像和基于框架的平面摄像机零碎的IMU读数。 01  简介基于EVENT的相机通过检测图像的对数强度的变动来感知世界。通过以几十微秒的精度记录这些变动,以及异步的、简直是即时的反馈,与传统相机通常有几十毫秒的提早相比,它们能够实现极低的提早响应。此外,通过跟踪日志强度的变动,摄像机具备十分高的动静范畴(>130dB,而传统摄像机约为60dB),这使得它们对照明的戏剧性变动的场景十分有用,如室内-室外的过渡,以及有强光源的场景,如太阳。然而,大多数古代机器人算法都是为同步传感器设计的,测量后果以固定的工夫距离达到。此外,生成的事件自身并不带有任何强度信息。图1:残缺的传感器设施,包含平面DAVIS相机、VI传感器和Velodyne激光雷达因而,必须开发新的算法以充分利用该传感器提供的劣势。可怜的是,因为测量方法的不同,咱们不能间接利用传统相机捕捉到的大量标签数据。事实证明,这些数据对于为新办法提供实在和统一的评估、训练机器学习零碎以及为无奈接触到这些传感器的钻研人员提供新的倒退机会来说,是极其重要的。在这项工作中,咱们旨在提供一些不同的序列,以促成钻研和开发一些不同问题的新鲜解决方案。一个次要的奉献是建设了第一个具备同步平面事件摄像零碎的数据集。通过校准的平面零碎对于用度量衡进行深度预计很有帮忙,这有助于解决诸如姿态预计、绘图、避障和3D重建等问题。在基于事件的摄像机的平面深度预计方面曾经有了一些工作,然而,因为不足精确的地面实况深度,评估只限于小的、不相干的序列,包含摄像机后面的几个物体。相比之下,这个数据集提供了来自两个同步和校准的动静视觉和被动像素传感器(DAVIS- m346b)的事件流,在各种光照和速度下的室内和室外长序列,以及准确的深度图像和高达100Hz的姿态,由装置在相机顶部的激光雷达零碎产生,如图1,同时还有静止捕获和GPS。咱们心愿这个数据集能够帮忙为一些利用中基于事件的算法评估提供一个独特的根底。残缺的数据集能够在网上找到:https:// daniilidis-group.github.io/mvse本文的次要奉献可演绎为:●   第一组带有同步平面事件相机的数据集,具备精确的地面实况深度和姿势。●   来自手持式钻机、六轴飞行器、汽车和摩托车的事件数据,以及来自3D激光雷达、IMU和基于框架的图像的校准传感器数据,来自各种不同的速度、照明程度和环境。 02  相干工作2.1 相干数据集目前,有一些现有的数据集提供了来自单目事件相机的事件,并与其余各种传感形式和地面实况测量相结合,实用于测试一些不同的3D感知工作。Weikersdorfer等人[1]将晚期的128x128分辨率的eDVS传感器与Primesense RGBD传感器联合起来,并提供了一个室内序列的数据集,其地面实况姿态来自静止捕获零碎,深度来自RGBD传感器。Rueckauer等人[2]提供了来自DAVIS 240C相机的纯旋转静止的数据,以及基于陀螺仪报告的角速度的地面实况光学流,只管这受到报告速度中的乐音影响。Barranco等人[3]提出了一个数据集,其中的DAVIS 240B相机装置在一个平移歪斜安装的顶部,与微软的Kinect传感器一起连贯在一个挪动基地上。该数据集提供了基地在室内环境中以5自由度挪动的序列,以及来自基地上的轮子编码器和平移歪斜安装的角度的地面实况深度、光学流和姿态。尽管来自Kinect的深度是精确的,但光学流和姿态会受到底座的轮子编码器的地位预计的影响而产生漂移。Mueggler等人[4]提供了一些用于在各种室内和室外环境中进行姿态预计的手持序列,这些序列由DAVIS 
240C生成。一些室内场景提供了姿势的根底实在,是由动作捕获零碎捕捉的。然而,没有户外序列,或其余具备显著位移的序列,具备地面实况信息。Binas等人[5]提供了一个装置在汽车挡风玻璃前面的DAVIS 346B的大型数据集,其中有12个小时的驾驶,旨在对各种驾驶相干的工作进行端对端学习。作者提供了一些来自车辆的辅助测量数据,如转向角、加速器踏板地位、车速等,以及来自GPS安装的经度和纬度。然而,没有提供6自由度的姿态,因为只能从所提供的GPS输入中推断出2D平移。这些数据集为开发和评估基于事件的办法提供了贵重的数据。然而,迄今为止,他们只有单目序列,地面实况6自由度的姿态仅限于小型室内环境,很少有序列具备地面实况深度。相比之下,这项工作提供了在各种室内和室外环境中具备地面实况姿势和深度图像的平面序列。 2.2 基于事件的3D感知晚期的工作[6],[7]提出了平面深度预计的后果,有一些空间和工夫老本。起初在[8]、[9]和[10]的工作中,将平面深度的单干办法适应于基于事件的摄像机,因为它们实用于异步的、基于点的测量。同样,[11]和[12]利用了一套工夫、极性、排序和极性束缚来确定匹配,而[13]则将其与基于方位仪库输入的匹配进行比拟。作者在[14]中展现了一种新的办法来确定外极线,利用于立体匹配。在[15]中,作者提出了一个新的上下文描述符来进行匹配,[16]中的作者应用了一个经验纯旋转的平面事件相机来进行深度预计和全景拼接。也有一些对于基于事件的视觉测距和SLAM问题的工作。作者在[17]和[18]中提出了在事件空间中进行特色跟踪的新办法,他们在[19]和[20]中对这些办法进行了扩大,别离进行视觉和视觉惯性测距。在[1]中,作者将一个基于事件的相机与深度传感器联合起来,进行视觉测距和SLAM。[21]中的作者应用事件来预计摄像机的角速度,而[22]和[23]则通过建设一个最高比例的地图来进行视觉测向。此外,[24]和[25]还将事件与来自IMU的测量值相交融,进行视觉惯性测距。尽管较新的作品基于公共数据集进行评估,如[4],但大多数是在仅为论文而产生的小数据集上进行评估,使得性能的比拟变得很艰难。对于基于平面事件的摄像机来说,状况尤其如此。在这项工作中,咱们试图产生更宽泛的根底假相,以便对新算法进行更有意义的评估,为办法之间的比拟提供根底。 03  数据集对于该数据集中的每个序列,咱们以ROS bag1格局提供以下测量后果:●  事件,APS灰度图像和来自左右DAVIS相机的IMU测量。●   来自VI传感器的图像和IMU测量。●   来自Velodyne VLP-16激光雷达2的点云。●   右边DAVIS相机的地面实况参考姿态。●   右边和左边DAVIS相机的地面实况参考深度图像。 3.1 传感器表I中列出了传感器及其特点。此外,图2a显示了传感器安装的CAD图,所有的传感器轴都被表明,图2显示了传感器在每辆车上的装置形式。  图2:从左到右:(a):传感器安装的CAD模型。所有的传感器轴都被贴上标签,并涂上对应的色彩:R:X、G:Y、B:Z,每对轴之间只有大概90度的旋转组合。(b): 装置在六轴飞行器上的传感器包。(c): 应用玻璃吸力三角架装置在汽车天窗上的传感器包。(d): DAVIS相机和VI传感器装置在摩托车上。请留神,在所有的配置中,VI传感器都是倒着装置的。最好以黑白形式观看。表1 传感器和特色如第五节所述,所有传感器之间的外在因素是通过校准来预计的。对于事件的产生,两个实验性的mDAVIS-346B相机被装置在一个程度的平面设置中。这些相机与[26]类似,但具备更高的346x260像素的分辨率,高达50fps的APS(基于帧的图像)输入,和更高的动静范畴。立体声设施的基线是10厘米,摄像机的工夫戳同步是通过应用从左侧摄像机(主摄像机)产生的触发信号,通过内部电线向右侧(从摄像机)输送同步脉冲。两台摄像机都有4毫米的镜头,程度视场角约为87度,每台摄像机上都有一个额定的红外切割滤波器,以克制来自静止捕获零碎的红外闪光。APS的曝光是手动设置的(没有主动曝光),这取决于照明条件,但相机之间总是雷同的。尽管灰度DAVIS图像的工夫戳是同步的,但遗憾的是没有方法同步图像采集自身。因而,图像之间可能有高达10ms的偏移。为了提供地面实况的参考姿态和深度(第四节),咱们将Velodyne Puck 
LITE装置在平面DAVIS相机上方。Velodyne激光雷达零碎提供了传感器四周大量点的高度准确深度。激光雷达的装置形式是,激光雷达较小的垂直视场与平面DAVIS设施的视场齐全重叠。在室外场景中,咱们还装置了一个GPS设施,作为经纬度的第二个地面实况参考。通常状况下,GPS被搁置在远离传感器安装的中央,以防止USB 3.0数据线的烦扰。此外,咱们还装置了一个VI传感器[27],最后由Skybotix开发,用于与基于框架的办法进行比拟。该传感器与IMU有一对立体声,都是同步的。可怜的是,惟一的装置抉择是将摄像机倒置装置,但咱们提供了它们与DAVIS摄像机之间的转换。 3.2 序列表二中列出了所有的序列和统计摘要,图三中列出了叠加了事件的APS图像样本。1)  具备静止捕获性能的六轴飞行器:  传感器装置在六轴飞行器的计算堆上面,向下倾斜25度,如图2b所示。两个静止捕获零碎被用来为这个数据集生成序列,一个在室内,一个在室外(图4)。26.8m x 6.7m x 4.6m的室内区域用20台Vicon Vantage VP-16摄像机进行检测。30.5米x 15.3米x 15.3米的户外网区装备了全天候静止捕获零碎,由34台高分辨率Qualisys Oqus 700摄像机组成。这两个零碎通过发射红外频闪和跟踪搁置在六轴飞行器上的红外反射标记,以100Hz的频率提供毫米级精度的姿态。咱们在每个区域提供不同长度和速度的航行序列。2)  手持式:为了测试高动静范畴状况下的性能,整个传感器安装在室外和室内环境以及有无内部照明的室内环境中都进行了循环。地面实况姿势和深度是由激光雷达SLAM提供的。表二:每辆车的序列。T:总工夫,D:总行驶间隔,lvlmax 。最大线速度,llmax : 最大角速度,MER:均匀事件率。这些序列没有VI-Sensor的数据。+一个硬件故障导致这些序列的右侧DAVIS灰度图像生效。*图3:白天和早晨的室内和室外序列的样本图像与重叠的事件(蓝色和红色)。最好以黑白观看。图4:静止捕获场地。左:室内Vicon场地;右:户外Qualisys场地。3)  户外驾驶:对于慢速到中速的序列,传感器安装被装置在一辆轿车的天窗上,如图2c所示,并以最高12米/秒的速度在西费城的几个街区行驶。在白天和早晨的状况下都提供了序列,包含太阳间接在相机视线内的序列。地面实况是由激光雷达地图的深度图像,以及来自环形关闭激光雷达测距和GPS的姿势提供的。对于高速序列,DAVIS平面设施和VI传感器与GPS设施一起被装置在摩托车的车把上(图2d)。这些序列波及以高达38米/秒的速度行驶。经度和纬度以及相对速度是由GPS提供的。 04  地面实况的生成为了提供地面实况姿势,在有条件的状况下,会应用动作捕获姿势。否则,如果有激光雷达,Cartographer[28]将用于驱动序列,将激光雷达扫描和IMU数据交融成激光雷达的循环闭合2D姿势,利用第五章D节的校准将其转换为左DAVIS帧。对于户外场景,咱们也提供原始的GPS读数。对于每个有激光雷达测量的序列,咱们运行激光雷达测绘(LOAM)算法[29]来生成密集的三维部分地图,这些地图被投射到每个DAVIS相机中,以20Hz的频率生成密集的深度图像,并为手持序列提供3D姿态。咱们应用了两种独立的激光雷达测距算法,因为咱们留神到,LOAM产生了更好的、排列更参差的部分地图,而Cartographer的环形闭合则产生了更准确的全局地位,对于较长的轨迹来说,漂移更少。尽管Cartographer只预计了一个2D的姿态,但咱们置信这是一个无效的假如,因为所驾驶的路线在大多数状况下都有一个繁多的统一的等级。 4.1 地面实况姿态对于室内和室外运动捕获畛域的序列,在每个工夫t的传感器设施worldHbody(t)的主体框架的姿态是以100Hz测量的,精度为毫米级。对于户外序列,咱们依附Cartographer来执行循环闭合,并将激光雷达扫描和IMU数据交融到一个繁多的循环闭合的主体(在这种状况下是激光雷达)的2D姿态中,并使其漂移最小。为了对最终姿态的品质进行量化掂量,咱们将地位与GPS测量值对齐,并为数据集中的每个户外序列提供叠加的卫星图像,以及所提供的高空实景和GPS之间的地位差别。图7提供了Car Day 
2的样本笼罩,其中Cartographer和GPS之间的平均误差始终在5m左右,没有漂移。这个误差在所有的户外驾驶序列中是统一的,总体平均误差为4.7米,与GPS的预期误差大小类似。请留神,440秒左右的误差峰值是因为微小的GPS误差造成的,对应于笼罩图右上方的黑体局部。在这两种状况下,对于每个从左DAVIS帧到帧取点的序列,外在的变换,示意为4×4的同质变换矩阵体worldHDAVIS,而后用来预计工夫t的左DAVIS绝对于工夫t0的第一个左DAVIS的姿态: 4.2 深度图的生成在有激光雷达的每个序列中,每个DAVIS相机的深度图像都是为每个激光雷达测量而生成的。咱们首先通过将以后测量四周的部分窗口中的每个激光雷达点云转换为以后测量的框架,应用LOAM的姿态来生成一个部分地图。在每次测量时,确定窗口大小,使窗口中以后、第一和最初一个LOAM姿态之间的间隔至多为d米,并且以后、第一和最初一个LOAM姿态之间至多有s秒,其中d和s是为每个序列调整的参数。这些地图的例子能够在图5中找到。而后,咱们应用规范的针孔投影方程,将所失去的点云中的每个点p投射到每个DAVIS相机的图像中:其中是投影函数:而K是矩形图像的相机本征矩阵(即投影矩阵的左上方3×3)。任何落在图像边界之外的点都会被抛弃,图像中每个像素地位上最靠近的点被用来生成最终的深度图,其例子能够在图6中找到。此外,咱们还通过应用相机本征和OpenCV对改正后的深度图像进行勾销改正和扭曲,提供没有任何失真的原始深度图像。 05  校准在这一节中,咱们形容了为校准每个DAVIS和VI-Sensor相机的外在参数而进行的各种步骤,以及每个相机、IMU和激光雷达之间的外在转换。所有的校准后果都以yaml模式提供。应用Kalibr工具箱3 [30], [31], [32]对相机本征、平面外征和相机-IMU外征进行校准,应用相机和范畴校准工具箱4 [33]对左DAVIS相机和Velodyne激光雷达的外征进行校准。在动作捕获世界帧中的Mocap模型姿势与左DAVIS相机姿势之间的手眼校准是用CamOdoCal5[34]进行的,并由人工进行微调。图5:为地面实况生成的样本地图。左图:汽车第1天序列的全图,绿色为轨迹;  右图:来自Hexacopter Indoor 3序列的部分地图。图6:深度图像(红色)与事件叠加(蓝色),来自Hexacopter Indoor 2和Car Day 1序列。请留神,因为激光雷达的垂直视场和范畴无限,图像的局部区域(彩色区域,特地是顶部)没有深度。这些局部在数据中被标记为NaN。最好以黑白形式观看。图7:GPS和Cartographer在卫星图像上叠加的 Car Day 2 轨迹的比拟。请留神,Cartographer和GPS之间的误差峰值对应于左侧笼罩图右上方的黑体局部,次要是因为GPS误差造成的。最好以黑白观看。为了对消所装置设施的变动,在收集数据的每一天,以及每次批改传感有效载荷的时候,都要反复每一次校准。除了校准参数外,每天的原始校准数据也可按需提供,以便用户在须要时进行本人的校准。 ...
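The depth-map step described above (project every lidar point with the pinhole model, drop points outside the image, keep the nearest point per pixel, leave pixels without depth as NaN) can be sketched as follows. This is a minimal illustration rather than the dataset authors' code: the function name `depth_image`, the intrinsics `fx, fy, cx, cy`, and the choice of the z coordinate as "depth" are assumptions, and points are assumed to be already expressed in the camera frame.

```python
import math

def depth_image(points, fx, fy, cx, cy, width, height):
    """Render a depth image from 3D points given in the camera frame.

    Hypothetical helper: fx, fy, cx, cy are the entries of a pinhole
    intrinsic matrix K; depth is taken as the z coordinate (assumption).
    """
    # Pixels that receive no point stay NaN, as in the article
    depth = [[math.nan] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera, cannot be projected
        # Pinhole projection: normalise by z, then apply the intrinsics
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if not (0 <= u < width and 0 <= v < height):
            continue  # falls outside the image bounds, discard
        # Keep only the closest point seen at this pixel
        if math.isnan(depth[v][u]) or z < depth[v][u]:
            depth[v][u] = z
    return depth
```

Calling `depth_image(points, fx, fy, cx, cy, 346, 260)` would match the DAVIS resolution quoted in the article; the intrinsics themselves would have to come from the calibration files.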

March 23, 2023 · 1 min · jiezi

关于算法:SLAM技术支持的物体6Dof位姿估计的自训练方法

转载:深蓝AI分享嘉宾:卢子琦文稿整顿:张琳编辑:东岸因为@一点人工一点智能原文:SLAM技术支持的物体6Dof位姿预计的自训练方法 01 嘉宾介绍卢子琦,目前于麻省理工学院计算机科学于人工智能实验室攻读博士学位,师从Prof. JohnLeonard,钻研方向为鲁棒和子进步的物体级机器人感知于建图。个人主页:https://520xyxyzq.github.io 02  背景2.1 Object-based SLAMSLAM就是机器人同步定位与建图,通过一些传感器的测量数据同时去建设环境的地图,且利用这个地图对于机器人的状态进行预计,机器人的状态包含机器人的位姿、速度和机器人的参数,比方内参。环境地图包含比方点的地位,线的地位,面的地位。常见的SLAM零碎由前端和后端组成,如图1所示,前端个别从一些原始的传感器数据中采集一些特色,后端利用概率的推断模型对采集的模型进行交融生成全局统一的环境地图。图1 SLAM构造要晓得环境中有哪些物体,就须要进行物体级SLAM,简略而言就是以物体为指标的SLAM零碎,对物体和机器人的状态进行预计,如图2所示。一个是对空间中的几何体加上语义信息,对上游的工作有作用,另一个是十分节俭存储空间的示意。如果用浓密点云就须要用很多的存储空间,然而基于物体级的SLAM造成的示意是十分轻量化的形容。图2 物体级SLAM 2.2 如何做Object SLAM?和宽泛SLAM相似,首先要在原始的测量数据中提取特色,应用物体的感知模型,包含二维的指标检测,也包含实例宰割。明天波及的是六自由度物体预计指标检测,后端也是用概率推断模型对于多帧进行交融生成全局统一的地图。图3列举了一些指标SLAM的文章和办法。图3 相干SLAM办法和文章 2.3 为什么object SLAM是比拟艰难的问题?因为有一些宽泛的SLAM具备的挑战,也面临一些新的挑战。宽泛的challenge包含ambiguous data assosiation的问题,比方在一个停车场检测到一辆车,那么怎么把真的观测和地图外面的进行关联,那么哪一辆车是以后被观测的车呢?另外一个问题就是动静的问题,比方有一个车在前进,如何判断这辆车是在前进,如何避免这个前进的车对相机跟踪产生影响,而后如何依据这个车的行进去一直地更新地图,这些问题是比拟难解决的。新的挑战次要是源于引入了object perception model,这两个模型联合的过程中就会产生一些information瓶颈,比方在deep learning model做出一些预测的时候,很难对不确定性进行量化,很难晓得预测是好是坏。在这种状况下如何去应用深度学习的model,如何给观测赋权重是一个比拟艰难的问题。另外,一个比拟重要的在object SLAM畛域中的问题就是domain gap问题,在新的环境中会有性能降落的问题。就是在训练perception model的时候,个别在特定环境中采取数据,给数据增加标注,用这些标注的数据训练网络。但当应用或测试这个网络时,往往在一个新环境中测试,训练和测试环境之间很可能有一些区别,比方光照的不同,背景的不同,噪声状况的不同,这个差别会使测试数据和训练数据造成散布不匹配的问题,这个问题就是一个domain gap的问题。还会导致perception model性能降落的问题。合成数据在真实情况中应用时,它的成果会大打折扣。心愿可能做到的体现是机器人在摸索不同环境的时候,可能主动的适应以后的环境,把它的perception model调整到比拟好的性能状态。这里对于object SLAM的介绍告一段落,如果感兴趣能够在面4的主页中关注。图4 主页 03  办法介绍3.1 什么是6自由度物体位姿预计?如图5所示,图片中有物体,而后通过模型计算物体绝对于相机的位姿,这个位姿包含3自由度的旋转和3自由度的平移,所以称它为6自由度的物体位姿预计。具备代表性的工作,比如说CNN和明天会波及到的办法。图5 6自由度的物体位姿预计明天要探讨的不是如何去设计一个更好的6自由度位姿预计,而是在实在场景中的体现如何,把它从文章中拿进去,和其余的位姿模型在同样的benchmark中进行比照它们的体现最终如何。BOP办法进行6自由度位姿预计,而后这个benchmark它的指标就是这样的体现,模型对应的物体是刚性物体,它们的输出是RGB和RGBD的图像。BOP challenge依据指标对不同的model进行打分,而后分数比拟高的就能够取得奖项,每年的会议上都有BOP challenge的workshop,介绍如图6所示。图6 BOP六维物体位姿预计BOP 
challenge在2019年的后果,在这一年有很多办法在一些task下来竞争,表1列举了不同办法的性能比拟,依照性能从高到低排列。能够看到这一年的经典办法就是基于这种特色的办法是因为基于深度学习办法的。表1 BOP challenge性能比拟针对下面的问题给出解释,首先不足在真实世界中训练的图片,还有实在的测试图片和通常应用的合成的训练图片之间有比拟大的domain gap。这两个起因属于一个问题,就是短少在实在环境或者测试环境中带有6自由度物体标注的数据。为了解决这个问题,有哪些计划呢?一种解决方案就是去进步合成数据的真实性,生成更加成熟的数据,另一种计划是能够利用test devirament没有标注的数据去进步体现。须要用到文章应用的self-training。Semi-supervised learning联合一些带有标签和数据和不带有标签的数据去进步模型的预测性能。为什么这样的事件可能胜利呢?为什么可能用不带有标签的数据去进步性能呢?因为不带标签的数据上往往携带了对于预测的task有用的一些信息。比方雾天的数据是不带标签的数据的话,那它就携带了这种background的信息,这样的信息有可能被提出的semi-supervised learning的办法学习到,进步模型的体现。可怜的是,Semi-supervised learning大部分的办法都没有对于收敛的一个保障,很可能越去训练它这个模型的体现越差,因为预报的一些谬误的在这个训练过程中会一直的增强本身导致的。 3.2 什么是self-training?self-training是比拟晚期的办法,用学习模型的预测去进步模型预测的能力。图7是具体的流程图,首先从一些带有标签的数据开始去训练Deep CNN model,而后用模型在不带标签的数据上预测,再把这些预测当做新的标签,这些标签就叫做伪标签,并不是实在的标签,是模型的预测。这些伪标签可能会有好有坏,为了选出好的伪标签,须要应用selection algorithm选出外面高质量的label造成一些带有伪标签的数据。把这些带有伪标签的数据和原始的带有实在标签的预训练的数据联合在一起,微调或从新训练网络。能够看到,整个流程图中比拟重要的一环就是抉择算法,如果通过这个算法可能胜利的抉择出高质量的数据的话,就能够进步性能体现,反之可能会升高性能体现。图7 self-training流程图对于文章SLAM-supported self-training for 6D object pose estimation,首先是一些动机,为什么要做6自由度物体的位姿预计,因为它能够给出这些环境中的几何和语义的信息,如图8所示。图8 环境的几何和语义信息在一个环境中训练,在另外一个环境中测试,就会存在domain gap问题。这个问题的体现展现了一个video可视化问题,在合成数据上训练,有了实在数据再测试,能够看到它很难对这些物体进行正确的预测。那么如何去解决这样的问题呢?一种最简略的形式就是在测试数据中采集一些数据,给这些数据加上物体位姿的标注,而后微调6自由度位姿预计器。然而,整个6自由度物体位姿标注的过程十分费时费力,更重要的一点是心愿机器人在摸索不同环境的时候是不被打断的,如果机器人进入到新的环境,还要去标注这个新环境的数据,那它对机器人的自主运行就是一个很不利的事件。所以心愿做的就是机器人可能本人去给它采集到的数据进行标注,做一个self label。图9 domain gap问题应运而生,有一些办法来解决问题,个别用合成带有标签的数据和一些实在不带标签的数据一起去进步位姿预计的性能。如图10所示,它们能够分为single-view methods和Multi-view methods,前者输出的数据是无序的,然而个别机器人采集的数据都是依照肯定的秩序采集的,会有工夫和空间上的连续性。single-view不能利用连续性,于是利用Multi-view办法,交融不同视角对于物体位姿的预计来造成更加牢靠的对于物体的了解,用这个更加牢靠的位姿对一些数据做标注,再微调,但大部分须要高精度相机的静止信息。图10 single-view办法和multi-view办法于是,提出了一种用SLAM来反对的办法,通过机器人采集的数据把它放到一个这种鲁棒的物体级SLAM的零碎外面,而后生成一个全局统一的,包含相机的位姿和物体的位姿,而后生成一些伪标签,利用一致性的标签作为新的训练数据去微调位姿预计模型,如图11所示。图11 
SLAM反对的办法办法的流程图如图12所示,从带有标签的图片数据动手,预训练一个6D的物体位姿预计器,把这个预计器放在机器人上,在前进过程中对物体的位姿进行预计,而后联结物体的位姿预计和机器人的里程计造成位姿图。用提出的一些鲁棒的优化办法求解SLAM预计,包含机器人的位姿和物体的位姿,从这些模型所预测的物体位姿和优化的位姿物体之中选出比拟高质量的物体位姿作为伪标签,把它和原始的带有实在标签的数据进行交融。整个流程图和self-training是一样的过程,从宏观上来看,办法左半边是在做一个鲁棒的状态预计,造成全局统一的场景地图,右半边实际上是在用semi-supervised learning进步物体位姿预计的性能,办法联合了两方面的一个成绩。图12 办法流程图如何进行鲁棒的位姿图优化来失去比拟牢靠的SLAM预计?提出了一种主动协方差调整的位姿图优化,这里如果开展讲可能须要很长时间,在这边只做一个比拟宏观的介绍。如果大家有趣味,能够去文章中的相干章节看到比拟细节的公式推导。首先要思考为什么要做这样一个主动协方差的调整,个别在做位姿图预计的过程中会假如观测是合乎高斯分布的,这样就能把问题转换为一个非线性最小二乘问题去求解,为了指定这样的高斯分布,须要两个量,一个是冀望,一个是方差,对于高维的高斯分布须要一个冀望和一个协方差矩阵。冀望很好失去,能够通过SLAM预计还有测量模型计算每个测量的期望值,但协方差个别都是经验性的给出一个值,在理论中依据对于传感器噪声大小的一个了解去制订这样的协方差值,比方传感器的噪声比拟大,给一个比拟大的协方差矩阵,反之给一个比拟小的协方差矩阵。当初对于物体位姿的预计都是从深度学习模型失去的,也就是说传感器变成了模型,对噪声没有十分牢靠的了解,预测没有方法很好的量化。在这种状况下,如何指定协方差矩阵?提出的计划是不指定协方差矩阵,把协方差矩阵和SLAM的变量进行联结优化,如图13所示,在公式里展现。第一项代表物体位姿的损失值,最初一项是机器人里程计的损失值,第一项是正则化项,目标是避免值跑到正无穷,像零这个方向去正则化。求解联结优化的问题是用的alternating minimization办法,这个办法有两个劣势,第一个劣势是对最优的协方差矩阵有一个解析解,第二个益处是能够在重量级别对协方差矩阵进行拟合。失去位姿预测时,对六个自由度的重量进行不同水平的拟合,与传统办法相比更加灵便,也可能拟合更宽泛的噪声模型。图13 主动协方差调整公式推导如图14所示是hybrid pseudo-labeling办法,在两种位姿中选取高质量的伪标签。图14 hybrid pseudo-labeling办法如图15所示,Hybrid model利用了两种数据,一种数据是模型间接在图片上预测的物体位姿,另一种是通过优化失去的物体位姿,为了对位姿进行好坏的评估,有两种评估办法,一种利用几何信息,另一种利用视觉信息,几何信息应用卡方测试,预测的物体位姿是否和优化的物体位姿有显著的差别,如果有显著差别可能是比拟差的位姿预计,反之是比拟好的位姿预计。视觉查看依据物体位姿预计生成一个渲染图片,把渲染物体和实在物体比照,转换到特色空间,在特色空间上的向量看它们是不是类似。通过这两个check,就能失去比拟高质量的位姿标签数据。图15 Hybrid model提出办法的后果如图16所示,在两个数据集上进行试验,并测试方法。第一个数据集是一个公开数据集,叫做YCB video experiment。首先用一些合成数据去预训练,而后拿到模型上进行self-training。值得强调的是,在进行self-training时,不去应用这些label标注,齐全通过self-training生成标注,最初一步就是把self-training后的放在下面去评估体现。Video展现的是它们在测试集的体现,就是在self-training之前和之后进行的比照,能够看到self-training后性能更加稳固,可能检测出更多物体,也有更少离群的位姿预计。图16 提出办法的后果如图17所示,第二个试验是在实在车下面做的试验,把相机放在机器人上,围绕物体进行导航。做实在机器人试验的目标就是为了测试方法对于挑战的可行性,提出办法在静止含糊等状况下仍然能够失去比拟好的性能,比拟多的进步训练后的体现,离群值很少。图17 实在车试验 ...

March 23, 2023 · 1 min · jiezi

About algorithms: Lion, an optimizer that outperforms AdamW

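Produced by: the Towhee technical team

Optimization algorithms, that is, optimizers, play a fundamental role in training neural networks. A large number of hand-designed optimizers have been introduced in recent years, most of them adaptive. Even so, Adam with decoupled weight decay (AdamW) and Adafactor with factored second moments remain the de facto standard optimizers for training most deep neural networks, especially recent state-of-the-art language, vision, and multimodal models.

Another direction is to discover such optimization algorithms automatically. Learning-to-optimize (L2O) methods propose discovering optimizers by training a parameterized model (for example, a neural network) to output the updates. However, these black-box optimizers are usually trained on a limited number of small tasks and generalize poorly to state-of-the-art settings, where much larger models need many more training steps.

Another family of methods uses reinforcement learning or Monte Carlo sampling to discover new optimizers, with a search space defined by trees built from predefined operands and operators. To keep the search tractable, they usually restrict the search space by fixing the operands and bounding the tree size, which limits what can be discovered. As a result, the discovered algorithms have not reached the state of the art.

Google and UCLA proposed the Lion (EvoLved Sign Momentum) optimizer, which outperforms the classic optimizers in both performance and results. Lion is discovered through program search and is then used to improve deep neural network training. The approach addresses the challenge of searching an infinite, sparse program space.

Besides being more efficient, Lion generalizes well across architectures, datasets, and tasks. On image classification, Lion improves the accuracy of ViT on ImageNet by up to 2% and saves up to 5x of the pre-training compute on JFT. On vision-language contrastive learning, Lion achieves 88.3% zero-shot and 91.1% fine-tuning accuracy on ImageNet, surpassing the previous best results by 2% and 0.1% respectively. On diffusion models, Lion outperforms Adam, achieving a better FID score and reducing training compute by up to 2.3x. For autoregressive language modeling, masked language modeling, and fine-tuning, Lion performs comparably to or better than Adam.

Lion uses efficient search techniques to explore the infinite, sparse program space. To bridge the generalization gap between the proxy tasks and the target tasks, it also introduces program selection and simplification strategies. Unlike adaptive optimizers, Lion applies an update of the same magnitude to every parameter, computed through the sign operation. Its performance gain grows with the training batch size. Because the sign function produces updates with a larger norm, it also needs a smaller learning rate.

AdamW vs. Lion optimizer. The figure above compares the AdamW and Lion algorithms. Whereas AdamW and other adaptive optimizers must keep both first and second moments, Lion only keeps the momentum, which halves the extra memory footprint. This is useful when training large models and/or with large batches. ...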
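The update rule summarized above (a sign-based update of uniform magnitude, with momentum as the only optimizer state) can be sketched in a few lines. This is an illustrative pure-Python version under stated assumptions, not the reference implementation: the function name `lion_step`, the list-of-floats parameter layout, and the default hyperparameters `beta1=0.9`, `beta2=0.99` are choices made for the example.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion-style update on flat lists of floats (hypothetical helper)."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate momentum and gradient, then keep only the sign,
        # so every parameter moves by the same magnitude lr
        c = beta1 * m + (1 - beta1) * g
        update = sign(c) + weight_decay * p  # decoupled weight decay
        new_params.append(p - lr * update)
        # The momentum is the only state carried between steps
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_params, new_momentum
```

Only one buffer per parameter (`momentum`) is kept between steps, which is where the memory saving relative to AdamW's two buffers comes from.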

March 23, 2023 · 1 min · jiezi

About algorithms: Two-pointer algorithm template and exercises

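The article and code are archived in the GitHub repository algorithms-notes; they are also available by replying 算法笔记 to the WeChat official account AIShareLab.

Log statistics

Xiao Ming maintains a programmers' forum. He has collected a "like" log with N lines in total. Each line has the format:

ts id

meaning that the post numbered id received one like at time ts.

Xiao Ming now wants to find out which posts were ever "hot posts". If a post ever received at least K likes within some time span of length D, Xiao Ming considers it to have been a hot post. Precisely, if there is a time T such that the post received at least K likes during [T, T+D) (note the interval is closed on the left and open on the right), the post was once a hot post.

Given the log, help Xiao Ming list the ids of all posts that were ever hot posts.

Input format

The first line contains three integers N, D, K. Each of the following N lines is one log entry with two integers ts and id.

Output format

Output the hot post ids in ascending order, one id per line.

Data range

$1≤K≤N≤10^5$, $0≤ts,id≤10^5$, $1≤D≤10000$

Sample input:

7 10 2
0 1
0 10
10 10
10 1
9 1
100 3
100 3

Sample output:

1
3

Naive approach (pseudocode)

    for (time) // iterate over every time window
    {
        memset(cnt, 0, sizeof cnt);
        for (id) // iterate over every like that falls inside this window
        {
            cnt[id] ++ ; // count likes per post
            if (cnt[id] >= k) st[id] = true; // mark the post as hot
        }
    }
    // ids start at 0, so the scan must start at 0 as well
    for (int i = 0; i <= 100000; i ++ )
    {
        if (st[i]) cout << i << endl;
    }

...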
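The naive approach recounts every window from scratch. A two-pointer (sliding window) version sorts the log by time once and then keeps a window of likes whose timestamps differ by less than D. The sketch below is one way to write it in Python; the function name `hot_posts` and the list-of-tuples input are assumptions made for the example.

```python
from collections import defaultdict

def hot_posts(logs, d, k):
    """Return sorted ids of posts with at least k likes inside some [T, T+d).

    logs is a list of (ts, post_id) pairs; hypothetical helper.
    """
    logs = sorted(logs)           # order likes by timestamp
    count = defaultdict(int)      # likes per post inside the current window
    hot = set()
    left = 0                      # left pointer: oldest like still in window
    for ts, pid in logs:          # right pointer advances one like at a time
        count[pid] += 1
        # Shrink the window until it spans strictly less than d time units
        while ts - logs[left][0] >= d:
            count[logs[left][1]] -= 1
            left += 1
        if count[pid] >= k:
            hot.add(pid)
    return sorted(hot)
```

Each log entry enters and leaves the window at most once, so after the sort the scan is linear in N. On the sample above, post 10 is not hot because its two likes are exactly 10 apart and the window is half-open.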

March 22, 2023 · 2 min · jiezi

关于算法:快递管家无需API开发连接伙伴云实现快递信息自动同步到表单汇总

1.  快递管家用户应用场景:每当商家应用快递管家打印面单后,常须要物流人员通过订单编号在搭档云查问订单详情,而后将快递公司、快递编码等信息更新到相应的地位,确保订单信息的完整性。但人工手动数据费时费力且易出错,一旦订单其中的某项数字出错,还需从新核查,减少大量的工作量。 因而,商家经常在想这一套流程是否能够实现自动化?如果要连贯两个不同零碎的数据,往往须要零碎开发,费用高,工夫周期长,并且像快递管家这种比拟灵便,企业常常会调整应用流程,零碎字段,这会导致须要一直地进行调整和开发。 2. 快递管家如何无代码集成第三方零碎?利用集简云零碎,企业能够轻松实现这个性能,将多个软件中的数据主动同步,并且无需开发,即使没有任何技术常识的业务人员,也能够轻松应用。 集简云:连贯软件与软件更简略的形式: 通过集简云无代码集成平台,无需开发就能够将快递管家无缝集成到各种第三方利用零碎,例如:OA办公零碎,客户服务零碎,MySQL数据库,企业微信,表单零碎,CRM等数十款利用零碎,以及企业外部零碎进行数据同步与性能执行。 查看残缺的可用利用列表:集简云apps 集简云的应用流程:· 触发动作:当一个利用零碎产生了什么事件时 · 执行动作:主动在一个或者多个不同零碎中执行不同事件 【快递管家+字段查问+搭档云+搭档云】具体操作演示快递100是一站式快递服务平台,提供超800家快递查问及网点、电话查问;提供收费快递查问API接口,为您搜寻周边优质快递员;反对在线寄快递、在线领取。 字段查问是集简云的一个内置利用,可做为执行利用应用。其次要性能是设置一个字段列表进行字段匹配关系查问。 搭档云提供比云表格/在线Excel更灵便的权限治理和数据合作性能,搭配自动化工作流与大数据分析引擎,疾速构建各类企业治理利用与绩效数据仪表盘,本人入手,5分钟配置一个业务场景,还能与微信残缺买通。 1. 实现目标: 每当快递管家打印面单时,主动查问该订单在搭档云的数据,并更新相干快递公司、快递编码等信息。无需人工再一一手动录入,省时省力,晋升工作效率。 2. 数据流程由两个局部组成(触发&执行) ● 触发动作:当快递管家有快递面单打印时● 执行动作:字段查问主动设置匹配关系● 执行动作:搭档云主动查问数据列表● 执行动作:搭档云自动更新数据 点击此模板,立刻应用 3.  达成成果:通过集简云,每当快递管家打印面单时,主动查问该订单在搭档云的数据,并更新相干快递公司、快递编码等信息。无需人工再一一手动录入,省时省力,晋升工作效率。 3.  更多流程示例:● 电商零碎+快递管家:当电商零碎有新增订单或者新增退货订单后,将订单/退货地址增加的快递管家打印运单或者追踪退货货运进度信息 ● 表单零碎+快递管家:当表单零碎中增加发货信息后,主动同步到快递管家创立运单 ● 飞书OA零碎 + 快递管家:通过飞书机器人发送货运信息,创立货运订单 集简云: 让连贯更简略集简云是一款超级软件连接器,无需开发,无需代码常识就能够轻松买通数百款软件之间的数据连贯,构建自动化与智能化的业务流程。通过自动化业务流程,每月可节俭您数百甚至数万小时的人工成本。 咱们置信:业务流程自动化与智能化是企业新的增长引擎官网地址:「集简云官网」软件集成能够如此简略 为什么抉择集简云 ?无需开发,简略疾速地扩大现有零碎的性能 通过集简云能够疾速扩大您现有零碎的性能,例如为您的表单零碎减少微信揭示,邮件揭示,短信揭示性能,为您的微信公众号减少赠送卡券同步CRM零碎性能,为您的OA办公零碎减少逻辑判断与数据存储性能等等。而这所有无需任何技术开发,简略疾速地晋升您零碎的能力。 业务流程自动化,节俭企业数万小时的人工成本 您的团队还在人工导出导入不同零碎之间的数据信息,手动的在不同的零碎中录入,批改和执行各种操作吗?通过集简云,无需任何开发既能够疾速搭建自动化的业务流程,简略快捷,人人可用,几分钟创立的自动化业务流程或者能够节俭企业数万小时的人工成本。 利用AI人工智能技术,晋升业务流程的效率与价值 在自动化业务流程之外,集简云提供了AI人工智能组件,帮忙企业将那些须要人工参加的重复性工作转由AI人工智能技术主动解决,包含语义剖析,预测模型,信息主动提取等多种不同的AI模块。 「集简云官网」软件集成能够如此简略集简云开放平台:让您的零碎领有与500+款软件连贯的能力 集简云开放平台是集简云为开发者(软件公司,企业外部开发者,独立开发者)提供疾速与集简云平台中的利用进行连贯的能力,您能够将您的软件接口上线到集简云平台轻松实现数百款应用软件的数据互通。您也能够将集简云的集成能力嵌入到您的软件系统中,将数百款软件的集成能力变成您产品的性能与卖点,扩大额定支出,晋升客户成交率,成交金额与满意度。

March 22, 2023 · 1 min · jiezi

About algorithms: Multi-view container number detection and recognition with PaddleOCR

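1. Project introduction

A container number identifies a container carrying export goods, and it is a required field when filling in a shipping order. Standard container numbers follow the ISO 6346 (1995) standard.

A standard container number consists of 11 characters, for example CBHU 123456 7, in three parts:

Part 1 is four letters. The first three letters identify the owner or operator, and the fourth identifies the container type. For example, a standard container whose number starts with CBHU belongs to and is operated by COSCO Container Lines (中远集运).
Part 2 is six digits. It is the registration number of the container body, a unique identifier held by that container.
Part 3 is the check digit, the 11th character. It is computed from the four letters and six digits by the check rule and is used to detect errors during validation.

This tutorial uses PaddleOCR for container number detection and recognition: it trains a detection model and a recognition model separately on the data and then chains them to complete the task.

Result preview: (image)

2. Environment setup

First, click the toolkit panel on the left and choose PaddleOCR to download it.

3. Dataset

The container number dataset used in this tutorial contains 3003 container images at a resolution of 1920×1080.

1. Label format for PaddleOCR detection model training, with the two fields separated by "\t":

Image file name    Image annotation encoded with json.dumps
ch4_test_images/img_61.jpg    [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}, {...}]

Before json.dumps encoding, the image annotation is a list of dictionaries. In each dictionary, points holds the (x, y) coordinates of the four corners of the text box, listed clockwise starting from the top-left corner, and transcription holds the text in that box. When transcription is "###", the text box is invalid and is skipped during training.

2. Label format for PaddleOCR recognition model training, with the two fields separated by "\t":

Image file name    Image annotation
train_data/rec/train/word_001.jpg    简略可依赖
train_data/rec/train/word_002.jpg    用科技让简单的世界更简略

4. Data preparation

4.1 Data for the detection model

Split the 3000 images of the dataset 2:1 into a training set and a validation set by running the following code:

    from tqdm import tqdm

    # Read every annotation line from the full label file
    filename = "all_label.txt"
    with open(filename) as f:
        lines = f.readlines()

    # The first 2000 lines go to the training set, the rest to the validation set
    with open("det_train_label.txt", "w") as t, open("det_eval_label.txt", "w") as v:
        for count, line in enumerate(tqdm(lines)):
            if count < 2000:
                t.write(line)
            else:
                v.write(line)

4.2 Data for the recognition model

Using the detection annotations, we crop the dataset images so that, as far as possible, they contain only the text region, and use the crops as recognition data. Run the following code ...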
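The introduction notes that the 11th character is a check digit derived from the first ten, which gives a cheap way to sanity-check OCR output. The article does not spell out the rule, so the sketch below follows the ISO 6346 scheme as it is commonly documented (letters map to values from 10 upward skipping multiples of 11, position i is weighted by 2^i, and the sum is reduced modulo 11, with 10 mapped to 0). The function names are hypothetical, and the example number CSQU3054383 is a commonly cited illustration, not one taken from the tutorial's dataset.

```python
import string

def letter_values():
    """Map A..Z to 10..38, skipping the multiples of 11 (11, 22, 33)."""
    values, v = {}, 10
    for ch in string.ascii_uppercase:
        if v % 11 == 0:
            v += 1
        values[ch] = v
        v += 1
    return values

def check_digit(first_ten):
    """Compute the check digit for the first 10 characters of a number."""
    table = letter_values()
    total = 0
    for i, ch in enumerate(first_ten):
        value = table[ch] if ch.isalpha() else int(ch)
        total += value * (2 ** i)   # weight doubles at every position
    return total % 11 % 10          # a remainder of 10 becomes 0

def is_valid_container_number(text):
    """Check an 11-character number, ignoring spaces in the OCR output."""
    s = text.replace(" ", "").upper()
    if len(s) != 11 or not s[:4].isalpha() or not s[4:].isdigit():
        return False
    return check_digit(s[:10]) == int(s[10])
```

A check like this could run after the recognition model to flag numbers whose characters were misread, for example `is_valid_container_number("CSQU 305438 3")`.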

March 21, 2023 · 4 min · jiezi

About algorithms: Hellobike (哈啰) bike-return experience optimization based on trajectories and on-device intelligence

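Return background

Return flow

Under the designated-spot return model, users must return bikes inside designated areas. After parking, the user taps "Return bike" in the app or mini program, the phone sends its location to the backend, and the system decides whether it is inside a station. If it is, the user is prompted to tap "Confirm lock" and manually closes the lock to finish the return.

If the user has not granted location permission, the system cannot get the phone's location and tells the user that the bike is not inside a station. When the user taps return again, we send a positioning command to the bike; if the bike is found to be inside a station, the user is told they can lock and return. If the bike's location still cannot be obtained, the return is affected.

Return problems

The current problems are mainly a poor return experience and high risk. Users find returning slow, the flow cumbersome, and the charges arbitrary. Some phones report lagging or drifting locations, and a few users may spoof their phone location, which causes trouble for G-side and B-side operations. The causes include location problems, rigid and cumbersome liability determination, areas with positioning discrepancies, and the lock being closed before the order.

Return goals

We want returns to be both fast and accurate, and set several metrics: NPS detractors, first-attempt return success rate, return duration, return positioning accuracy, service help-request rate, and liability-determination accuracy.

Levers

Analyzing the positioning data, we found that the user's location carries a risk of inaccurate liability determination, while the bike's location is more accurate than the user's. However, it requires sending a command to re-position and waiting for the reply, which lengthens the return. The information we can use includes the bike positions reported continuously during the order, the smart component on the bike (the accelerometer), and station management to handle directional drift in some areas.

Based on this, our approach has two parts. The first is to predict the return action, issue the positioning command and run liability determination ahead of time, and push the result to the phone. The second is to predict the return point from the trajectory, which avoids re-positioning the bike and lets us check whether the phone's location is plausible.

Accelerometer

An accelerometer measures changes in an object's motion through the effect of gravity. Imagine a ball in a box: when the box lies level, gravity keeps the ball in contact with the bottom face, and the bottom face exerts an upward force that cancels gravity, which registers as an upward acceleration of magnitude G.

We collected accelerometer readings from bikes in different scenarios. Ideally, the fluctuating parts of the plot mean the bike is moving and the flat parts mean it has stopped. How, then, do we tell waiting at a traffic light mid-trip apart from actually parking?

We ran some experiments: we fit a plane to the distribution of the x, y, z values and compute the pitch and roll angles. Samples that satisfy the equation are treated as parked, and traffic-light waits mid-trip were not recognized as parked. In real business scenarios, for bikes with an accelerometer, more than 85% of orders can be pre-judged before the user taps "Return".

Trajectory prediction

A trajectory is essentially a time-ordered sequence of location fixes. We preprocess it with sharp-angle removal, speed checks, and dwell smoothing, use the road network to remove implausible positions promptly, and then derive a deviation from the gaps, speed plausibility, and direction plausibility of the preprocessed trajectory.

Next is how the road network is used to correct the trajectory. We use a hidden Markov model. As in the figure, take a raw trajectory point P with three road segments within a certain distance: the closer the point is to its position on a nearby segment, the higher the probability that the point lies on that segment. The closer the distance between two raw points is to the distance between their projections, the higher the transition probability.

We predict the trajectory from the speed and direction of the points just before the stop, and use both the raw points and the predicted points to decide whether the bike can be returned: every station within this stretch is treated as a valid return point.

With these methods we reworked the original return pipeline so that liability determination happens before the user notices it. After parking, the user taps return and can leave, which greatly shortens the time and improves the return experience.

(Author: Gao Ting)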

March 21, 2023 · 1 min · jiezi

关于算法:题型STATS-3860B解说

Assignment 3/9155B Winter 2023

This assignment is due Monday, April 10th, at 11:55 pm. You must write your R code and answers using Rmarkdown, generating a single pdf file. Submissions must be made via Gradescope. You must carefully assign each question part to its corresponding page (or pages) on your pdf file. Question parts with no pages assigned to them will receive zero marks.

Each student must submit their own work. Scholastic offences are taken seriously, and students are directed to read the appropriate policy, specifically, the definition of what constitutes a Scholastic Offence, at the following Web site: http://www.uwo.ca/univsec/pdf/academic_policies/appeals/scholastic_discipline_undergrad.pdf

Question 1

The denim dataset concerns the amount of waste in material cutting for a jeans manufacturer due to five suppliers. Consider the code below to first remove two outliers from the dataset.

    library(faraway)
    data(denim)
    ...

March 21, 2023 · 3 min · jiezi

关于算法:CVPR-2023-GPT4与文心一言同台竞技居然是为了自动驾驶UniAD工作

以下文章来源于OpenDriveLab ,作者OpenDriveLab编辑:一点人工一点智能原文:CVPR 2023 | GPT-4与文心一言同台竞技,竟然是为了主动驾驶UniAD工作! 00  前言都说 ChatGPT 是自然语言解决中技术大魔王,国内百度的文心一言是国内技术一霸,那主动驾驶中的技术魔王,你听过说吗?另外,ChatGPT 和文心一言都好评的主动驾驶端到端模型,大家不好奇吗?图源:文心一言;关键词:技术大魔王ChatGPT 的横空出世解决了自然语言中绝大多数的工作:包含语言生成、文本分类、机器翻译、文本摘要和对话生成。ChatGPT 对自然语言解决工作体现出弱小的“统治能力”,曾经一统语言解决的江湖。国内百度的文心一言也兼顾解决了汇集中文环境中的自然语言解决的工作。看着这些自然语言解决的技术大魔王,再看看OpenDriveLab本人的钻研畛域——主动驾驶。不禁提问:一个大的工作只须要一个模型就足够了吗?会存在主动驾驶畛域的大魔王吗?主动驾驶是一项高度简单的技术,须要多个学科畛域的常识和技能,包含传感器技术、机器学习、门路布局等方面。主动驾驶还须要适应不同的路线规定和交通文化,与其余车辆和行人进行良好的交互,以实现高度牢靠和平安的主动驾驶零碎。面对这种简单的场景,大部分主动驾驶相干的工作都聚焦在具体的某个模块,对于框架性的研究则绝对匮乏。主动驾驶是个绝对艰难的工作,然而上海人工智能试验OpenDriveLab 主动驾驶团队迎难而上,勇攀高峰的精力让咱们团队的精力小伙们摸索出主动驾驶中魔王级别的算法框架——Unified Autonomous Driving(UniAD)!从工作看,UniAD 首次将检测,跟踪,建图,轨迹预测,占据栅格预测以及布局整合到一个基于 Transformer 的端到端网络框架下。从性能看,UniAD 在nuScenes 数据集下的所有相干工作都达SOTA 性能,尤其是预测和布局成果远超其余模型。目前论文已被 CVPR 2023 接管。UniAD 完满符合了大魔王“多任务”和“高性能”的特点,可称为主动驾驶中的技术大魔王。同时 UniAD 也取得了 ChatGPT 和文心一言的认可,堪称是通过了技术魔王的“同行评议”:ChatGPT 版本:咱们把论文中的文字局部输出给ChatGPT,让他来了解 UniAD。文中其余的答复也都基于在模型了解完论文之后给出的回答。文心一言版本:同样,咱们把论文的文字局部输出到文心一言中,让他来了解 UniAD。文中其余的答复也都基于在模型了解完论文之后给出的回答。想晓得的更多 UniAD 的细节,上面的两个链接会给你答案。 我的项目地址:https://github.com/OpenDriveLab/UniAD论文地址:https://arxiv.org/abs/2212.10156 01  魔王诞生无关 UniAD 的诞生,要不先听听技术大佬们:青年研究员陈立、ChatGPT 和文心一言怎么说?UniAD为什么会诞生?能够先听听咱们团队青年才俊、主动驾驶研究员陈立的认识:ChatGPT 是这样认为的文心一言也剖析得有条有理:通过咱们的青年研究员和两个技术大魔王的剖析,置信大家必定有所理解。接下来给大家具体论述为什么 UniAD 会诞生,这必然会回到一个问题:“为什么之前的模型没有同时做到这么多的工作呢?”或者还要从主动驾驶的框架开始剖析:主动驾驶UniAD框架比照 (a)传统模块化(b)多任务模块(c)端到端主动驾驶模块如上图所示,现有主动驾驶零碎可大抵归为三类:a. 传统模块化每个模型负责独自的子工作,劣势在于易于调试迭代,然而解耦就会失落最优性,各个模块的优化指标并不是以驾驶为最终目标,并且每个模块的误差会传递到之后的模块。b. 多任务模块多任务范式利用一个共享的特征提取器来实现多个子工作,益处是节俭计算成本,毛病在于不同工作之间可能会存在负面影响。c. 
端到端模块端到端(End-to-end, E2E)范式以最终的驾驶性能为指标,具体又能够细分为两种范式:隐式的端到端和显式的端到端。其中隐式端到端是以传感器数据作为输出,间接输入布局或者控制指令。这种范式的益处是较为简洁,毛病是不足可解释性,难以调式及迭代。显式端到端则是将多个模块囊括在端到端模型之中,每个模块有各自的输入,并且会将提取到的特色传递到上游工作。咱们对目前显式端到端主动驾驶工作进行了比拟:端到端主动驾驶工作比照能够发现,大多数工作都关注了感知、决策和布局三局部,但具体任务存在差别,且没有框架交融所有的工作。那为什么会呈现这种状况呢?一方面受限于对主动驾驶的意识,研究者们没有对工作之间的关联和构建形式钻研分明;另一方面受限于模型的最终成果,或者有人已经尝试过把全副工作交融,然而成果不佳。为了探讨这一问题,UniAD 首次将所有检测,跟踪,建图,轨迹预测,占据栅格预测与布局都蕴含进来,从实现方面解决了这一难点。另一方面,通过严格的融化试验发现,在正确的交融形式下,所有的工作对最终的驾驶性能都是有收益的。至此,主动驾驶方面的技术魔王为了解决理论问题而来。 02  魔王登基那为什么咱们的模型能够解决不同工作的交融难的问题,从而实现多任务和高性能呢?让咱们开始揭晓主动驾驶技术大魔王的真身:整体而言,UniAD 利用多组 query 实现了全栈 Transformer 的端到端模型。如图所示,UniAD 由 2 个感知模块,2 个预测模块以及一个布局模块组成。其中感知和预测模块是通过 transformer 架构进行预测,每个模块输入的特色会传递到之后的模块来辅助上游工作。UniAD整体框架图 秘密武器1:多组 query 的全 Transformer 模型UniAD 利用多组 query 实现了全栈 Transformer 的端到端模型,咱们能够从具体 Transformer 的输入输出感触到信息交融。在 TrackFormer 中,Track query 通过与 BEV 特色通过 attention 的形式进行交互,对特色进行输入。相似的,Map query 通过 MapFormer 的更新后,失去相应的特色。MotionFormer 应用 Motion query 与 BEV 特色进行交互,失去将来轨迹。OccFormer 以密集的 BEV 特色和稠密的特色对应的地位信息来构建实例级别的占据栅格。 ...

March 21, 2023 · 1 min · jiezi

About algorithms: Graph algorithms for anti-cheating in campaign scenarios

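Author | ANTI

Overview: As the contest between anti-cheating and the cheating black market grows fiercer and cheating methods evolve quickly, we keep trying new methods for new cheating problems. This article describes how graph algorithms are used against community-style (group) cheating in campaign scenarios. A graph model can learn from the graph topology and the node features at the same time, and, being semi-supervised, it makes better use of unlabeled data and improves recall. The GCN model and SCGCN (a multi-graph serial model) discussed here both performed well at recalling cheating.

4,102 characters in total; estimated reading time 11 minutes.

01 Introduction

Operational campaigns are an important way for a company to secure user growth and retention, and one of its core competencies. They mainly take two forms, acquisition and activation. Acquisition gains new users by having existing users invite them, which enlarges the user pool; activation raises DAU and user stickiness through task-based campaigns. The "complete tasks to get a red envelope" campaigns we join in various apps are one concrete form. By running campaigns that fit its products, a company can raise retention and conversion and, in turn, revenue and influence. Baidu apps also run many such campaigns, for example "invite friends to get a red envelope" and "complete tasks to get a red envelope". However, many cheaters (such as the online black market) obtain illegitimate gains through cheating, which hurts the marketing effect. The anti-cheating system therefore has to identify the black market using multi-dimensional information such as user profiles, user behavior, and device information, to safeguard the company's campaigns. In recent years, through continuous attack and defense, black-market methods have moved from large-scale machine-generated cheating to crowdsourced cheating and even small-scale cheating by real people, which makes identification harder and harder. We therefore need to keep iterating on new methods to identify and block it.

02 Challenges

Take acquisition campaigns as an example. Once an invitation happens, a relationship is automatically established between the users, which we call the "mentor-apprentice relationship" (the inviter is the "mentor" and the invitee is the "apprentice"). Pic. 3 shows a user relationship graph produced by invitations: each upper-level person is the mentor of the lower-level person, and each lower-level person is the apprentice of the upper-level person. A mentor can recruit several apprentices and is rewarded for it; usually, the more apprentices, the larger the reward.

△Pic. 1 Invite-friends campaign; Pic. 2 National Day campaign

△Pic. 3 Relationships between people in an invitation campaign

Anti-cheating modeling for acquisition currently faces two problems:

1. Little ability to describe the links between users. The models used so far in campaign anti-cheating include tree models, DNNs, and other machine learning models. If we treat users as nodes, these models concentrate on the features of each node and have little ability to learn the relationship features between nodes. In several recent attacks we found patterns that attack at scale with the "community" as the basic unit: the accounts clearly share behavior and device information, and the cheaters are strongly correlated with each other. We need a model that is better at learning this correlation.

2. Low sample purity limits recall. Black samples are usually obtained through manual sampled evaluation and enrichment from customer complaints, while white samples are drawn at random at a fixed ratio. The difficulty is that the white samples may contain unknown cheating data, which lowers their purity and hurts the training of supervised models.

Below we describe how graph models address both problems.

03 Applying graph algorithms

To solve these two business problems, we model the task with graph neural networks. The advantage of a graph model is that it learns from the graph topology and the node features together. Through the edges between nodes it connects information and supplements the model's ability to learn edge relationships, which expands recall; and as a semi-supervised model it makes better use of unlabeled data, which also improves recall.

3.1 Graph models in brief

Commonly used graph neural network models fall into two groups. One is based on graph walks, such as random-walk models; the other is based on graph convolution, such as GCN, GAT, and GraphSAGE. GCN works on the whole graph and bridges the gap between the raw graph structure and neural networks, but the cost of whole-graph computation becomes a bottleneck at large scale; GraphSAGE, which works on local subgraphs, eases this to some extent. GAT adds an attention mechanism; the extra parameters strengthen its learning capacity but also raise time and space complexity, so training needs more sample information and compute. In our business scenario the sample size is manageable, so we use GCN directly. Its principle is outlined below.

GCN is a multi-layer graph convolutional network. Each convolutional layer only processes first-order neighborhood information, and stacking several layers propagates information across multi-order neighborhoods.

The propagation rule of each convolutional layer is [1]:

$$H^{(l+1)}=\sigma\left(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}\right)$$

where

\( \tilde{A}=A+I_{N} \) is the adjacency matrix of the undirected graph \( G \) with self-connections added, and \( I_{N} \) is the identity matrix;
\( \tilde{D} \) is the degree matrix of \( \tilde{A} \), that is, \( \tilde{D}_{ii}=\sum_j\tilde{A}_{ij} \);
\( H^{(l)} \) is the activation matrix of layer \( l \), with \( H^{(0)}=X \);
\( W^{(l)} \) is the parameter matrix of layer \( l \);
\( \sigma \) is a nonlinear activation function such as ReLU.

The adjacency matrix \( A \) carries the information passed in from a node's neighbors, and the identity matrix \( I_{N} \) carries the node's own information. This is why a GCN learns both the features of a node itself and its links to other nodes: it aggregates the information of the node and of its neighbors for training.

△Pic. 4 GCN principle

△Pic. 5 An example

Graph neural networks are an active research topic and in recent years have been applied widely across industry with good results.

3.2 Applications of graph algorithms

3.2.1 A GCN recall model for cheating in acquisition campaigns

Modeling the acquisition scenario

Acquisition campaigns are one of the main cheating scenarios. In the "mentor-apprentice invitation" scenario, if a mentor successfully invites an apprentice to become a new user, both of them are rewarded. The black market uses batches of fake apprentice accounts to help the mentor complete invitations and collect the reward. Statistical analysis of the data shows that these fake apprentices share IP addresses and have overlapping device models. Accordingly, we take "mentor users" as the nodes of the graph and build graphs with "city + device model" and "IP + device model" as the edge relations.

Graph pruning

Because not every pair of mentors that share an IP and device model shows a cheating signal, we keep only edges whose weight is greater than a threshold T, which strengthens the features.

Model results

△Table 1 Comparison of model results

The experiments show that the GCN algorithm is clearly effective, raising the recall of cheating samples by 42.97%.

3.2.2 Exploring multi-graph fusion

The experiments above show that different ways of building the graph recall different cheating groups. Would fusing the information that differs between these groups recall even more? We therefore looked for an effective way to integrate the information of different graphs into one model and raise the recall of cheating samples. Following this multi-graph fusion idea, we propose three methods and experiment with each.

Fusion method: edge\_union

Edge fusion builds the edges of graph A and graph B into the same graph for training, so that the information contained in graph A and graph B is fused together.

△Pic. 6 edge\_union model

△Pic. 7 edge\_union graph construction ...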
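To make the propagation rule concrete, the sketch below applies one GCN layer to a tiny graph using plain Python lists. It is an illustration under stated assumptions rather than the production model: the function names `matmul` and `gcn_layer` are made up for the example, ReLU is assumed as the activation, and a real system would use a sparse-matrix library instead of nested lists.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adj, feats, weight):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 H W) (hypothetical helper)."""
    n = len(adj)
    # Add self-connections so each node keeps its own features
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Degree of each node in the self-connected graph
    deg = [sum(row) for row in a_tilde]
    # Symmetric normalisation: entry (i, j) is divided by sqrt(d_i * d_j)
    a_hat = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    # Aggregate neighbour features, apply the layer weights, then ReLU
    out = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]
```

For two connected nodes with features 1 and 3 and an identity weight, both outputs become 2: each node averages its own feature with its neighbor's, which is the information sharing between linked accounts that the article relies on.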

March 21, 2023 · 1 min · jiezi

关于算法:COMP27112视沉计算

Introduction to Visual Computing
Coursework Assignment 3
Image Processing Exercise 1

Introduction

The aim of this exercise is to get you started with OpenCV and to do some very simple image processing.

OpenCV (opencv.org) is an open source computer vision and machine learning library that's released under the BSD licence. It can be used freely for academic and commercial purposes. The first version was released in 2001, it now has a user community of over 47,000 users and the libraries have been downloaded over 18 million times.

Since it's a well-established and widely used library, there are a lot of resources on the internet. If you run into problems with the coursework, this should be one of the first places to seek help. Other useful websites are listed in Blackboard.

This piece of coursework is worth 7.5% of the unit's assessment, so you should spend no more than about 7.5 hours on it.

Getting Started

Follow the instructions on the Coursework area of Blackboard to download and install the version of OpenCV for your operating system. In this lab, you'll write two simple programmes, one to prove to yourself that you've installed OpenCV correctly, the second to perform thresholding.

Hello OpenCV

To check that everything is working correctly, print out the installed OpenCV major and minor versions in the console. To do that, OpenCV provides two macros representing the two integers: CV_MAJOR_VERSION and CV_MINOR_VERSION. ...

March 21, 2023 · 5 min · jiezi

关于算法:张俊林由ChatGPT反思大语言模型LLM的技术精要

作者:张俊林原文:https://zhuanlan.zhihu.com/p/597586623编辑:一点人工一点智能原文:张俊林:由ChatGPT反思大语言模型(LLM)的技术精要 导读:ChatGPT呈现后惊喜或惊醒了很多人。惊喜是因为没想到大型语言模型(LLM,Large Language Model)成果能好成这样;惊醒是顿悟到咱们对LLM的认知及倒退理念,间隔世界最先进的想法,差得有点远。我属于既惊喜又惊醒的那一批,也是典型的中国人,中国人长于自我反思,于是开始反思,而这篇文章正是反思的后果。实话实说,国内在LLM模型相干技术方面,此刻,间隔最先进技术的差距进一步加大了。技术当先或技术差距这事件,我感觉要动静地以倒退的眼光来看。在Bert呈现之后的一到两年间,其实国内在这块的技术追赶速度还是很快的,也提出了一些很好的改良模型,差距拉开的分水岭应该是在 GPT 3.0进去之后,也就是2020年年中左右。在过后,其实只有很少的人觉察到:GPT 3.0它不仅仅是一项具体的技术,其实体现的是LLM应该往何处去的一个倒退理念。自此之后,差距拉得越来越远,ChatGPT只是这种倒退理念差别的一个天然后果。所以,我集体认为,抛开是否有财力做超大型LLM这个因素,如果单从技术角度看,差距次要来自于对LLM的认知以及将来应往何处去的倒退理念的不同。国内被国外技术甩得越来越远,这个是事实,不抵赖也不行。前阵子网上很多人担心说国内AI当初处于“危急存亡之秋”,我感觉倒也不至于这么重大。君不见,这个世界上,具备这么超前眼光的只有OpenAI一家吗?包含Google在内,其实对于LLM倒退理念的了解,显著都落后OpenAI一个身位。事实是OpenAI体现过于优良,把所有人都甩开了,不仅仅是国内。我感觉,OpenAI对LLM在理念及相干技术方面,当先国外的Google、DeepMind大概半年到一年的工夫,当先国内大略两年左右的工夫。在LLM这个事件上,感觉梯队很显著,Google应该是排在第二位,最能体现Google技术眼光的是PaLM和Pathways,推出工夫大略在22年2月到4月间,同一期间,OpenAI推出的却是InstructGPT,从这里就能够看出Google和OpenAI的差距了,至于为何这么说,你看了我前面的注释后大略能了解。DeepMind之前的重心始终在强化学习攻克游戏和AI for science这些方面,切入LLM其实很晚,应该是21年才开始器重这个方向,目前也处于追赶状态。Meta就更不用说了,重心始终不在LLM上,目前感觉也发力开始追赶。这还是目前做得最好的一批机构,尚且如此,更何况国内呢?我感觉情有可原。至于OpenAI对于LLM的理念是什么,我在本文的最初一部分,会谈谈我的认知。本文梳理自GPT 3.0呈现之后的支流LLM技术,在此之前的支流技术能够参考:《乘风破浪的PTM:两年来预训练模型的技术停顿》https://zhuanlan.zhihu.com/p/254821426我置信看完这两篇文章,可能让您对LLM畛域的技术脉络,LLM技术倒退过程中呈现过的不同倒退理念,乃至将来可能的发展趋势,有比拟清晰的认知。当然,很多中央讲的内容是我集体认识,有很大的主观性,错漏不免,所以还请审慎参考。本文试图答复上面一些问题:ChatGPT是否带来了NLP乃至AI畛域的钻研范式转换?如果是,那会带来怎么的影响?LLM从海量数据中学到了什么常识?LLM又是如何存取这些常识的?随着LLM规模逐渐增大,会带来什么影响?什么是In Context Learning?为什么它是一项很神秘的技术?它和Instruct又是什么关系?LLM具备推理能力吗?思维链CoT又是怎么做的?等等,置信看完,能让您对这些问题有一个答案。首先,在谈LLM技术现状前,先宏观地谈下我心目中的钻研范式转换问题。这样,咱们能力“先见森林,再见树木”,对具体技术为何会是如此变动有个更清晰的认知。 01  潮流之巅:NLP钻研范式的转换如果咱们把工夫线往前拉得更长一些,回到NLP畛域的深度学习时代,在更长时间窗口内察看技术变迁及其影响,可能会更容易看清其中的一些要害节点。我集体认为,在最近10年来NLP畛域的技术倒退过程中,可能存在两次大的钻研范型转换。 1.1 范式转换1.0:从深度学习到两阶段预训练模型这个范式转换所涵盖的工夫范畴,大抵在深度学习引入NLP畛域(2013年左右),到GPT 3.0呈现之前(2020年5月左右)。在Bert和GPT模型呈现之前,NLP畛域风行的技术是深度学习模型,而NLP畛域的深度学习,次要依靠于以下几项关键技术:以大量的改良LSTM模型及大量的改良CNN模型作为典型的特色抽取器;以Sequence to 
Sequence(或叫encoder-decoder亦可)+Attention作为各种具体任务典型的总体技术框架。在这些核心技术加持下,NLP畛域深度学习的次要钻研指标,如果演绎一下,是如何无效减少模型层深或模型参数容量。就是说,怎么能力往encoder和decoder里一直叠加更深的LSTM或CNN层,来达成减少层深和模型容量的指标。这种致力,只管的确一直减少了模型层深,然而从解决具体任务的成果角度看,总体而言,不算很胜利,或者说和非深度学习办法绝对,带来的劣势不算大。深度学习之所以不够胜利,我认为次要起因来自于两个方面:一方面是某个具体任务无限的训练数据总量。随着模型容量的减少,须要靠更大量的训练数据来撑持,否则即便你能把深度做起来,工作成果也做不下来。而在预训练模型呈现之前,很显著这是NLP钻研畛域一个重大问题;另外一个方面是LSTM/CNN特色抽取器,表达能力不够强。意思是就算给你再多的数据也没用,因为你不能无效地排汇数据里蕴含的常识。次要应该是这两个起因,妨碍了深度学习在NLP畛域的胜利解围。Bert/GPT这两个预训练模型的呈现,无论在学术研究角度看,还是工业利用角度来看,都代表了NLP畛域的一个技术飞跃,并带来了整个畛域钻研范式的转换。这种范式转换带来的影响,体现在两个方面:首先,是局部NLP钻研子畛域的消退乃至逐渐沦亡;其次,NLP不同子畛域的技术办法和技术框架日趋对立,在Bert呈现后一年左右,技术栈根本收敛到两种技术模式中。对于这两点,咱们分头来谈。 影响一:两头工作的沦亡NLP是一个宏观钻研畛域的统称,外面有形形色色具体的子畛域与子方向,如果仔细分析,从工作的性质角度,能够把这些工作分成两大类:一类能够叫做“两头工作”,一类能够称为“最终工作”。典型的两头工作包含:中文分词、词性标注、NER、句法分析、指代消解、语义Parser等,这类工作个别并不解决利用中的理论需要,大多数是作为那些解决理论需要工作的两头阶段或者辅助阶段存在的,比方简直没有需要说,我要一个句法Parser,把这个句子的句法分析树给用户看看,用户不须要看到这些NLP的两头阶段处理结果,他只关怀某个具体任务你有没有干好。“最终工作”包含比方文本分类、文本相似性计算、机器翻译、文本摘要等等,有很多。这类工作的特点是每个子畛域都解决某个理论需要,工作后果根本能间接出现给用户,比方用户的确存在给你一句英文,通知他中文是什么的需要。按理说,“两头工作”就不应该呈现,而之所以会存在,这是NLP技术倒退程度不够高的一种体现。在技术倒退晚期阶段,因为过后的技术绝对落后,很难一步做好有难度的最终工作。比方机器翻译,晚期技术要做好机器翻译是很艰难的,于是科研人员就把难题分而治之,合成成分词、词性标注、句法分析等各种两头阶段,先把每个两头阶段做好,而后再拼起来实现最终工作,这也是没方法的事件。然而自从Bert/GPT呈现之后,其实就没有必要做这些两头工作了,因为通过大量数据的预训练,Bert/GPT曾经把这些两头工作作为语言学特色,排汇到了Transformer的参数里,此时咱们齐全能够端到端地间接解决那些最终工作,而无须对这种两头过程专门建模。这里可能争议最大的是中文分词,其实情理也是一样的,哪些字应该组成一个词,这个其实你不必管,让LLM本人当特色去学就行了,只有对于解决工作有帮忙,它天然会去学该学的正当分词形式,也未必肯定要和咱们人类了解的分词规定雷同。基于以上认知,其实在Bert/GPT一呈现,你就应该得出这类NLP的两头阶段的工作,会逐渐退出历史舞台这个论断。 
影响二:不同钻研方向技术路线的对立在阐明具体影响前,咱们先探讨下另外一种NLP工作划分形式,这对于了解前面内容有帮忙。如果对“最终工作”进一步进行分类,又大抵能够分为两大不同类型的工作:自然语言了解类工作和自然语言生成类工作。如果排除掉“两头工作”的话,典型的自然语言了解类工作包含文本分类、句子关系判断、情感偏向判断等,这种工作实质上都是分类工作,就是说输出一个句子(文章),或者两个句子,模型参考所有输出内容,最初给出属于哪个类别的判断。自然语言生成也蕴含很多NLP钻研子方向,比方聊天机器人、机器翻译、文本摘要、问答零碎等。生成类工作的特点是给定输出文本,对应地,模型要生成一串输入文本。这两者的差别次要体现在输入输出模式上。自从Bert/GPT模型诞生后,呈现了显著的技术对立趋势。首先,NLP中不同的子畛域,其特色抽取器都逐步从LSTM/CNN对立到Transformer上。其实,自Bert公开后不久,就应该意识到,这必然会成为技术趋势。至于其起因,在几年前我写的这篇:《放弃空想,全面拥抱Transformer:自然语言解决三大特色抽取器(CNN/RNN/TF)比拟》https://zhuanlan.zhihu.com/p/54743941中做了阐明和剖析,感兴趣的同学可参考。而且,目前Transformer不仅对立了NLP诸多畛域,也正在逐渐地替换图像处理各种工作中被宽泛应用的CNN等其它模型的过程之中,相似的,多模态模型目前也根本都采纳了Transformer模型。这种Transformer从NLP登程,攻城略地逐渐对立AI越来越多畛域的趋势,起始于2020年底呈现的Vision Transformer (ViT) ,之后蓬勃发展,到目前已大获胜利,且其持续向更多畛域拓展的势头会越来越迅猛。其次,大多数NLP子畛域的研发模式切换到了两阶段模式:模型预训练阶段+利用微调(Fine-tuning)或利用Zero/Few Shot Prompt模式。更精确地说,NLP各种工作其实收敛到了两个不同的预训练模型框架里:对于自然语言了解类工作,其技术体系对立到了以Bert为代表的“双向语言模型预训练+利用Fine-tuning”模式;而对于自然语言生成类工作,其技术体系则对立到了以GPT 2.0为代表的“自回归语言模型(即从左到右单向语言模型)+Zero /Few Shot Prompt”模式。至于为何会分化成两条技术路线,有其偶然性,对于这点咱们放在前面解释。这两种模式,看似比拟相像,但其背地蕴含了迥异的倒退思路,也会导向不同的将来倒退方向。不过遗憾的是,咱们中的绝大多数人,在过后都低估了GPT 这条倒退路线的后劲,而把视觉核心聚焦到了Bert这种模式上。 1.2 范式转换2.0: 从预训练模型走向通用人工智能 (AGI)这个范式转换所涵盖的工夫范畴,大抵在GPT3.0呈现之后(20年6月左右),始终到目前为止,咱们应该正处于这个范式转换过程中。ChatGPT是触发这次范型转换的要害节点,然而在InstructGPT呈现之前,其实LLM处于这次范式转换前的一个过渡期。 过渡期:以GPT 3.0为代表的“自回归语言模型+Prompting”模式占据统治位置后面说过,在预训练模型倒退的晚期,技术框架收敛到了Bert模式和GPT模式这两种不同的技术范型,而且人们广泛更看好Bert模式一些,相当少数的后续技术改良,都是沿着Bert那条路走的。然而,随着技术的持续倒退,你会发现,目前规模最大的LLM模型,简直清一色都是相似GPT 3.0这种“自回归语言模型+Prompting”模式的,比方GPT 3、PaLM、GLaM、Gopher、Chinchilla、MT-NLG、LaMDA等,没有例外。为什么会这样呢?背地肯定有其偶然性,我认为可能次要源于两个起因。首先,Google的T5模型,在模式上对立了自然语言了解和自然语言生成工作的外在表现形式。如上图所示,标为红色的是个文本分类问题,黄色的是判断句子相似性的回归或分类问题,这都是典型的自然语言了解问题。在T5模型里,这些自然语言了解问题在输入输出模式上和生成问题放弃了统一,也就是说,能够把分类问题转换成让LLM模型生成对应类别的字符串,这样了解和生成工作在表现形式就实现了齐全的对立。这阐明自然语言生成工作,在表现形式上能够兼容自然语言了解工作,若反过来,则很难做到这一点。这样的益处是:同一个LLM生成模型,能够解决简直所有NLP问题。而如果依然采取Bert模式,则这个LLM模型无奈很好解决生成工作。既然这样,咱们当然偏向于应用生成模型,这是一个起因。第二个起因,如果想要以零示例提醒语(zero shot prompting)或多数示例提醒语(few shot 
prompting)的形式做好工作,则必须要采取GPT模式。当初已有钻研(参考:On the Role of Bidirectionality in Language Model Pre-Training)证实:如果是以fine-tuning形式解决上游工作,Bert模式的成果优于GPT模式;若是以zero shot/few shot prompting这种模式解决上游工作,则GPT模式成果要优于Bert模式。这阐明了,生成模型更容易做好zero shot/few shot prompting形式的工作,而Bert模式以这种形式做工作,是人造有劣势的。这是第二个起因。然而问题来了:为什么咱们要谋求zero shot/few shot prompting这种形式来做工作呢?要解释分明这个问题,咱们首先须要搞清楚另外一个问题:什么样的LLM模型,对咱们是最现实的?上图展现了一个现实的LLM该有的样子。首先,LLM应该具备弱小的自主学习能力。假如咱们把世界上能取得的所有文本或者图片等不同类型的数据喂给它,它应该可能主动从中学习到外面蕴含的所有知识点,学习过程不须要人的染指,并且能灵便利用所学常识,来解决理论问题。因为数据是海量的,要排汇所有常识,就要十分多的模型参数来存储常识,所以这个模型必然会是一个巨无霸模型。其次,LLM应该能解决NLP任何子畛域的问题,而不仅反对无限畛域,甚至它应该能够响应NLP之外其它畛域的问题,最好是任意畛域的问题都能失去很好地答复。再者,当咱们应用LLM解决某个具体畛域问题的时候,应该用咱们人类习惯的表达方式,就是说LLM应该了解人类的命令。这体现出让LLM适配人,而不是反过来,让人去适配LLM模型。人适配LLM的典型例子,比方搜索枯肠去尝试各种不同的prompt,以试图找到好的提醒语,能力很好地解决手头问题。对于这点,上图在人类和LLM交互的接口层,举了几个例子,阐明什么是好的人应用LLM模型的接口模式。看完这个现实中的LLM,咱们再回头解释下面遗留的问题:为什么咱们要谋求zero shot/few shot prompting这种形式来做工作呢?有两个起因。第一,这个LLM模型规模必然十分微小,有能力作出这个模型,或改变这个模型参数的机构必然很少。而工作需求方是千千万万的中小机构甚至是集体,就算你把模型开源进去,他们也有力部署这个模型,更不用说再用Fine-tuning这种模式去批改模型参数了。所以,咱们应该谋求不修改模型参数,就能让工作需求方实现工作的形式,也就是应该采取prompt模式实现工作,而非Fine-tuning模式(由此可看出,soft prompting技术方向是违反这个发展趋势的)。模型制作方则将LLM作成专用服务,以LLM as Service的模式运行。作为服务反对方,思考到变幻无穷的用户需要,所以LLM模型制作方更要谋求让LLM能实现尽可能多类型的工作,这是附带的影响,也是为何超级大模型肯定会谋求走向AGI的事实因素。第二,zero shot prompting也好,few shot prompting也好,甚至促成LLM推理能力的思维链(CoT,Chain of Thought)Prompting也好,就是上图中接口层中的现有技术。具体而言,zero shot prompting的初衷,其实就是人类和LLM的现实接口,间接用人类所习惯的工作表述形式让LLM做事件,然而发现LLM并不能很好地了解,成果也不好。通过持续钻研,转而发现:对于某项工作,如果给LLM几个示例,用这些示例来代表工作形容,成果会比zero shot prompting好,于是大家都去钻研更好的few shot prompting技术。能够了解为,原本咱们心愿LLM可能用人类罕用的命令形式来执行某个工作,然而目前技术还做不到,所以退而求其次,用这些代替技术来表白人类的工作需要。如果了解了上述逻辑,很容易得出如下论断:few shot prompting(也被称为In Context Learning)只是一种过渡时期的技术。如果咱们可能更天然地去形容一个工作,而且LLM能够了解,那么,咱们必定会毫不犹豫地摈弃这些过渡期的技术,起因很显著,用这些办法来形容工作需要,并不合乎人类的应用习惯。这也是为何我将GPT3.0+Prompting列为过渡期技术的起因,ChatGPT的呈现,扭转了这个现状,用Instruct取代了Prompting,由此带来新的技术范式转换,并产生若干后续影响。 ...

March 20, 2023 · 3 min · jiezi

关于算法:COS3003-问题求解

ECOS3003 Problem set
Due date: 23:59 22 Mar

Please keep your answers brief and concise. Excessively long and irrelevant answers will be penalised. You can handwrite your answers if you wish.

Consider the following game, which has been loosely based on the trust models studied in class. For the purposes of this game, focus on pure strategies only. Each agent moves simultaneously. Agent 1 can take either action T, M or B, whereas Agent 2 can take actions L, C, or R. The payoffs are shown in the following normal form game.

a. Outline and explain the Nash equilibria if the game is played once. (Again, focus only on pure-strategy equilibria.)

b. Now consider the case when the game is played twice. That is, in the first period at the firm, Agents 1 and 2 simultaneously choose their actions. Their choices are revealed before the second period, in which the agents again simultaneously choose their actions. Then the game ends. There is no discounting of payoffs between periods. Outline how, as part of a subgame perfect equilibrium, the threat to play either B or R in the final period rather than M or C can help sustain the cooperative outcome of (T, L) in the first period. Interpret this two-period game as a trust game. Explain why 'trust' (or cooperation) can be achieved in the first period of this game without having to resort to an infinite-horizon game.

Cindy and Andy are both workers on a team. At the same time, both workers can choose to work on the risky project (R) or the safe project (S). If both choose to work on the R project, the payoffs are 5 to Cindy and 4 to Andy. If both work on the S project, the payoffs are 4 to Cindy and 5 to Andy. If Cindy works on R and Andy on S, the payoffs are (8, 10) to Cindy and Andy respectively. Finally, if Cindy plays S and Andy plays R, the payoffs are (10, 7).

a. What are the Nash equilibria of the game? Interpret your equilibria in terms of a firm using teams.

b. Now assume that Cindy can move first and choose which project to work on. Andy observes Cindy's choice before making his choice of either R or S. What are the subgame perfect equilibria of the game? Interpret this sequential game in terms of communication and leadership in organisations.

Consider the following delegation versus centralisation model of decision making, loosely based on some of the discussion in class.

A principal has to implement a decision that has to be a number between 0 and 1; that is, a decision d needs to be implemented where $0 \le d \le 1$. The difficulty for the principal is that she does not know what decision is appropriate given the current state of the economy, but she would like to implement a decision that exactly equals what is required given the state of the economy. In other words, if the economy is in state s (where $0 \le s \le 1$) the principal would like to implement a decision d = s, as the principal's utility $U_P$ (or loss from the maximum possible profit) is given by $U_P = -|s - d|$. With such a utility function, maximising utility really means making the loss as small as possible. For simplicity, the two possible levels of s are 0.4 and 0.7, and each occurs with probability 0.5.

There are two division managers, A and B, who each have their own biases. Manager A always wants a decision of 0.4 to be implemented and incurs a disutility $U_A$ that increases the further the decision d that is actually implemented is from 0.4; specifically, $U_A = -|0.4 - d|$. Similarly, Manager B always wants a decision of 0.7 to be implemented, and incurs a disutility $U_B$ that is (linearly) increasing in the distance between 0.7 and the actual decision that is implemented; that is, $U_B = -|0.7 - d|$. Each manager is completely informed, so that each of them knows exactly what the state of the economy s is.

(a) The principal can opt to centralise the decision, but before making her decision (given she does not know what the state of the economy is) she asks for recommendations from her two division managers. Centralisation means that the principal commits to implement a decision that is the average of the two recommendations she received from her managers. The recommendations are sent simultaneously and cannot be less than 0 or greater than 1. Assume that the state of the economy is s = 0.7. What is the report (or recommendation) that Manager A will send if Manager B always truthfully reports s? ...
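As a rough illustration of how pure-strategy Nash equilibria can be checked mechanically, the sketch below enumerates every action profile of the Cindy and Andy team game and keeps the profiles where neither player gains by deviating alone. The payoff numbers are taken from the problem text; the function name `pure_nash` and the dictionary layout are illustrative choices, not part of the problem set.

```python
from itertools import product

# Payoffs (Cindy, Andy) indexed by (Cindy's action, Andy's action),
# copied from the problem statement.
PAYOFFS = {
    ("R", "R"): (5, 4),
    ("S", "S"): (4, 5),
    ("R", "S"): (8, 10),
    ("S", "R"): (10, 7),
}
ACTIONS = ["R", "S"]


def pure_nash(payoffs, actions_1, actions_2):
    """Return all pure-strategy profiles where each action is a best response."""
    equilibria = []
    for a1, a2 in product(actions_1, actions_2):
        u1, u2 = payoffs[(a1, a2)]
        # Player 1 cannot do better by switching rows while player 2 stays put.
        best_for_1 = all(payoffs[(d, a2)][0] <= u1 for d in actions_1)
        # Player 2 cannot do better by switching columns while player 1 stays put.
        best_for_2 = all(payoffs[(a1, d)][1] <= u2 for d in actions_2)
        if best_for_1 and best_for_2:
            equilibria.append((a1, a2))
    return equilibria


if __name__ == "__main__":
    print(pure_nash(PAYOFFS, ACTIONS, ACTIONS))
```

The same function works for the 3 x 3 trust game once its payoff matrix is filled in; that matrix is not reproduced in this excerpt, so it is left out here.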

March 20, 2023 · 7 min · jiezi

About algorithms: COMP3221 routing algorithm study

Due: March 31st, 2023 (Friday, Week 6) by 11:59 PM

COMP3221 Assignment 1: Routing Algorithm

The goal of this assignment is to implement routing protocols for a network topology using socket and multi-threaded programming in Python. Important: This is an individual project; therefore, each student has to submit their own assignment.

1 Learning Objectives

In this assignment, you are required to implement routing protocols for a specified network using Python. In particular, your task is to complete a program that helps each node exchange routing information with its neighboring nodes and then find the least-cost path to all nodes in the network. Furthermore, your program has to deal with link cost changes and failures in the network. It should have at least three separate threads to do the necessary tasks for each node, such as listening (for receiving information from its neighbors), sending (for sending information updates every 10 seconds), and routing calculations (when a link cost changes).

On completing this assignment you will gain sufficient expertise in the following skills:
- Designing a routing protocol
- Socket and multithreaded programming using Python
- Handling routing dynamics

2 Network Architecture and Simulation

You are required to generate a network topology including 10 nodes and at least 15 connections between nodes. Those connections and their link costs are generated randomly. Figure 1 is a sample network topology with 10 nodes and 15 connections (you are not allowed to reuse this sample topology).

Since we do not have access to a real network, we will simulate the network on a single machine and use it for implementation and testing. You can use different terminals (one for each node in the net topology) to run different instances of your program on the same machine (use "localhost").

[Figure 1: A sample network topology with 10 nodes and 15 connections.]

3 Program structure

The program should be named COMP3221_A1_Routing.py and accept the following command line arguments:

    python COMP3221_A1_Routing.py <Node-ID> <Port-NO> <Node-Config-File>

For example:

    python COMP3221_A1_Routing.py F 6005 Fconfig.txt

- Node-ID: the ID of a node in the net topology. In this assignment, Node-ID is indexed following the English alphabet, for example, A, B, C, D, etc.
- Port-NO: the port number on which a node listens for information update packets. The port number is indexed using integers starting from 6000 and is increased by one for each node. For example, the net topology in Fig. 1 has 10 nodes: A, B, C, D, E, F, G, H, I, and J. The port numbers of these nodes are 6000, 6001, 6002, 6003, 6004, 6005, 6006, 6007, 6008, and 6009, respectively.
- Node-Config-File: Fconfig.txt, for example, is the configuration file for Node F and has the following details:

    4
    A 2.3 6000
    C 3.2 6002
    E 2.8 6004
    J 5.4 6009

The first line of this file indicates the number of neighbors of Node F (it is not the total number of nodes in the network). The next four lines describe the connections of Node F to its neighbors. Each of those lines starts with the neighbor ID, followed by the cost to reach this neighbor from Node F, and finally the port number that this neighbor uses for listening.

For example, the second line in the Fconfig.txt above indicates that the cost to neighbor A is 2.3 (link costs are floating-point numbers) and that Node A is using port number 6000 for receiving information update packets. Note that all link costs are symmetric (the same in both directions, e.g., the cost from F to A is the same as that from A to F). Moreover, Node F must not have global knowledge (i.e. information about the entire network topology).

Initially, each node creates an information update packet (containing the appropriate information: its neighboring nodes and the link costs between it and its neighbors) and sends this packet to all direct neighbors. You are free to define the exact format of the information update packets. Upon receiving these information update packets, each neighboring node will incorporate the provided information for routing algorithms. Each node should periodically broadcast the information update packet to its neighbors every 10 seconds.

On receiving information update packets from all other nodes, a node can build up a reachability matrix. Given a view of the neighboring nodes and their reachability, a node should run a routing algorithm (Bellman-Ford, Dijkstra, etc.) to compute the shortest paths to all other nodes within the network. Each node should wait for 60 seconds after starting up and then execute the routing algorithm.

Once a node finishes running the routing algorithm, it will print to the terminal the least-cost path to each destination node in the topology (excluding itself) along with the cost of this path. The following is an example output for node F in some arbitrary network:

    I am Node F
    Least cost path from F to A: FA, link cost: 2.3
    Least cost path from F to B: FAGB, link cost: 4.7
    Least cost path from F to C: FC, link cost: 3.2
    Least cost path from F to D: FCD, link cost: 4.3
    Least cost path from F to E: FE, link cost: 4.5
    Least cost path from F to G: FAG, link cost: 4.5
    Least cost path from F to H: FEH, link cost: 5
    Least cost path from F to I: FCDI, link cost: 5.8
    Least cost path from F to J: FJ, link cost: 5.4

Your program should execute forever (as a loop).
In other words, each node should keep broadcasting information update packets every 10 seconds, and the routing algorithm should be executed every time a link cost change occurs.

Please note that all routing algorithms must be implemented by yourself from scratch; using libraries is not allowed.

4 Dealing with changes in link cost and failure

You must ensure that your program has the ability to deal with changes in link costs and with failures. Whenever a link cost changes, the network must be able to reconverge to accommodate the new costs.

To simulate changes in link cost and failures, you must provide a command-line interface (CLI) for each node that gives the ability to modify the link cost to its neighbors. A simple method would be to use a separate thread to modify the configuration files. Any modification made on the CLI will affect the configuration files.

Once the cost of a link changes, the connected nodes must recalculate the cost of reaching other nodes and must also provide an update to their neighbors, who will then notify their neighbors, and so on until the network converges.

Here is an example of the correct output for Node F before and after the failure of Node C. You should check your implementation for correctness at all nodes with multiple node failures.

Output with all nodes working:

    I am Node F
    Least cost path from F to A: FA, link cost: 2.3
    Least cost path from F to B: FAGB, link cost: 4.7
    Least cost path from F to C: FC, link cost: 3.2
    Least cost path from F to D: FCD, link cost: 4.3
    Least cost path from F to E: FE, link cost: 4.5
    Least cost path from F to G: FAG, link cost: 4.5
    Least cost path from F to H: FEH, link cost: 5
    Least cost path from F to I: FCDI, link cost: 5.8
    Least cost path from F to J: FJ, link cost: 5.4

Output after Node C fails:

    I am Node F
    Least cost path from F to A: FA, link cost: 2.3
    Least cost path from F to B: FAGB, link cost: 4.7
    Least cost path from F to D: FAGBD, link cost: 5.4
    Least cost path from F to E: FE, link cost: 4.5
    Least cost path from F to G: FAG, link cost: 4.5
    Least cost path from F to H: FEH, link cost: 5
    Least cost path from F to I: FAGBDI, link cost: 6.9
    Least cost path from F to J: FJ, link cost: 5.4

5 Submission and Report

You are required to submit the following files to Canvas separately.
- Code (zip file that includes all implementation and config files for all nodes): SSID_COMP3221_Code.zip
- Code Text (all implementation in one file, exported as a txt file for plagiarism checking): SSID_COMP3221_Code.txt
- Readme (clearly state how to start the program, change link-cost, client failures, and running environment): SSID_COMP3221_Readme.txt
- Report (pdf): SSID_COMP3221_Report.pdf

The size of your report MUST be under 2 pages. Your report should briefly document your net topology, routing algorithm, techniques and methodology used for implementation, and the simulation results of each requirement. It should act as a reference for your markers to quickly figure out what you have and haven't completed and how you did it, and it should mention anything you think is special about your system.

Please note that you must upload your submission BEFORE the deadline. Canvas will continue accepting submissions after the due date; however, late submissions carry a penalty per day, with a maximum of 5 days of late submission allowed.

6 Academic Honesty / Plagiarism

By uploading your submission to Canvas you implicitly agree to abide by the University policies regarding academic honesty, and in particular that all the work is original and not plagiarised from the work of others. If you believe that part of your submission is not your work you must bring this to the attention of your tutor or lecturer immediately. See the policy slides released in Week 1 for further details.

In assessing a piece of submitted work, the School of Computer Science may reproduce it entirely, may provide a copy to another member of faculty, and/or communicate a copy of this assignment to a plagiarism checking service or in-house computer program. A copy of the assignment may be maintained by the service or the School of Computer Science for the purpose of future plagiarism checking.

7 Marking

This assignment is worth 15% of your final grade for this unit of study. The ratio of assignment parts is divided as follows.
- Code: 80%
- Report: 20%

Please refer to the rubric in Canvas (Canvas -> Assignment 1 -> Rubric) for the detailed marking scheme. The report and the code are to be submitted in Canvas by the due date.

After Assignment 1 marks come out, please submit your inquiries about marking within the 1st week. All inquiries after that will NOT be responded to. ...
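As a minimal sketch of two pieces described in section 3 (Program structure), the code below parses a neighbour configuration file in the stated format and runs a hand-written Dijkstra search that also recovers the path string used in the sample output. The small graph in `EDGES` is hypothetical and only loosely mirrors the sample output; the function names `parse_config`, `build_graph`, and `least_cost_paths` are assumptions, and sockets, threads, and the packet format are deliberately left out.

```python
def parse_config(text):
    """Parse a node config: first line = neighbour count, then 'ID cost port'."""
    rows = [line.split() for line in text.strip().splitlines()]
    count = int(rows[0][0])
    return {node: (float(cost), int(port)) for node, cost, port in rows[1:1 + count]}


def build_graph(edges):
    """Build a symmetric adjacency map from (node, node, cost) triples."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def least_cost_paths(graph, source):
    """Dijkstra without library helpers; returns {target: (path, cost)}."""
    dist = {node: float("inf") for node in graph}
    prev = {}
    dist[source] = 0.0
    unvisited = set(graph)
    while unvisited:
        # Pick the closest unvisited node; sorting makes tie-breaking repeatable.
        u = min(sorted(unvisited), key=lambda node: dist[node])
        if dist[u] == float("inf"):
            break  # the remaining nodes are unreachable
        unvisited.remove(u)
        for v, cost in graph[u].items():
            if v in unvisited and dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                prev[v] = u
    result = {}
    for target in sorted(graph):
        if target == source or dist[target] == float("inf"):
            continue
        hops = [target]
        while hops[-1] != source:
            hops.append(prev[hops[-1]])
        result[target] = ("".join(reversed(hops)), dist[target])
    return result


# Hypothetical topology, not the one from the assignment figure.
EDGES = [("F", "A", 2.3), ("F", "C", 3.2), ("A", "G", 2.2),
         ("G", "B", 0.2), ("C", "D", 1.1), ("B", "D", 0.7)]

if __name__ == "__main__":
    print("I am Node F")
    for target, (path, cost) in least_cost_paths(build_graph(EDGES), "F").items():
        print(f"Least cost path from F to {target}: {path}, link cost: {round(cost, 1)}")
```

A node failure can be simulated in this sketch by rebuilding the graph without the failed node's edges and running `least_cost_paths` again.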

March 20, 2023 · 8 min · jiezi

About algorithms: CSE-410565

CSE 410/565 Spring 2023
Homework 2
Due on March 12, 2023, at 11:59pm
Total: 60 points

You need to submit a pdf generated from LaTeX. If you do not have a local LaTeX setup, I suggest https://www.overleaf.com/. You need to write the math formulas correctly using LaTeX. This homework should be finished independently. AI violations are taken seriously.

This exercise will allow you to gain experience working with cryptographic libraries. For this purpose, you are to write a program in Python, C, or Java using a standard cryptographic library, and your program should run on the UB CSE Fall 2021 virtual machine image (Ubuntu). The VM image can be downloaded from https://cse.buffalo.edu/eblanton/misc/vm/. The image can be used from any operating system using suitable VM environment software such as VirtualBox. Once you start the VM, you can run commands from a terminal and write your program directly in the VM (using an editor such as emacs). When finished, you will need to copy your program to one of the CSE student machines for submission.

Implementation and execution tasks

Your program is to call different cryptographic functions and measure the speed of different cryptographic operations. For that purpose, it will need to create or read a small file of size 1KB and a large file of size 10MB. These can be randomly generated data or existing files of any type (of the specified size). The program is to implement and measure the runtime of the following functionalities:

(a) [4 points for 565; 6 for 410] Create a 128-bit AES key, then encrypt and decrypt each of the two files using AES in the CBC mode. AES implementations need to be based on a hardware implementation of AES, so ensure that your libraries are chosen or configured properly.

(b) [4 points for 565; 6 for 410] Repeat part (a) using AES in the CTR mode.

(c) [4 points for 565; 6 for 410] Repeat part (b) with a 256-bit key.

(d) [4 points for 565; 6 for 410] Create a 2048-bit RSA key, then encrypt and decrypt the files above with PKCS #1 v2 padding (at least v2.0, but use v2.2 if available; it may also be called OAEP). This experiment can use a 1MB file for the second file size to reduce the runtime.

(e) [4 points for 565; 6 for 410] Repeat part (d) with a 3072-bit key. This experiment can use a 1MB file for the second file size to reduce the runtime.

(f) [4 points for 565; 6 for 410] Compute a hash of each of the files using the hash functions SHA-256, SHA-512, and SHA3-256.

(g) (CSE 565 only; 4 points) Create a 2048-bit DSA key, sign the two files and verify the corresponding signatures. If creating a key takes two parameters, use 224 bits for the exponent size. If the hash function algorithm needs to be specified separately, use SHA-256.

(h) (CSE 565 only; 4 points) Repeat part (g) with a 3072-bit DSA key (if the second parameter is required, use 256).

Include simple checking of the correctness of your code, namely, that computed ciphertexts decrypt to the original data and that signed messages properly verify. (There is no need to test whether the library functions themselves work correctly.)

Your program will need to measure the following execution times: ...
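As a small illustration of the measurement pattern for part (f), the sketch below times SHA-256, SHA-512, and SHA3-256 over 1KB and 10MB of random data using only Python's standard `hashlib` and `time` modules. The helper name `time_hash` and the use of `os.urandom` to generate the inputs are assumptions for this example; the AES, RSA, and DSA parts need a third-party cryptographic library and are not covered here.

```python
import hashlib
import os
import time

# Algorithm names as accepted by hashlib.new().
HASHES = ["sha256", "sha512", "sha3_256"]


def time_hash(name, data):
    """Hash `data` with the named algorithm; return (hex digest, seconds taken)."""
    start = time.perf_counter()
    digest = hashlib.new(name, data).hexdigest()
    elapsed = time.perf_counter() - start
    return digest, elapsed


if __name__ == "__main__":
    inputs = {
        "1KB": os.urandom(1024),
        "10MB": os.urandom(10 * 1024 * 1024),
    }
    for label, data in inputs.items():
        for name in HASHES:
            _, seconds = time_hash(name, data)
            print(f"{name:9s} {label:5s} total {seconds:.6f} s, "
                  f"per byte {seconds / len(data):.3e} s")
```

A single run is noisy for the 1KB input, so repeating each measurement and averaging is one reasonable refinement.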

March 20, 2023 · 6 min · jiezi

About algorithms: CSSE2310CSSE7231 data structures and algorithms

The University of Queensland
School of Information Technology and Electrical Engineering
CSSE2310/CSSE7231, Semester 1, 2023
Assignment 1 (version 1.1, 9 March 2023)
Marks: 75
Weighting: 15%
Due: 4:00pm Friday 24 March, 2023

Introduction
The goal of this assignment is to give you practice at C programming. You will be building on this ability in the remainder of the course (and subsequent programming assignments will be more difficult than this one). You ...

March 20, 2023 · 35 min · jiezi

About algorithms: Binary indexed tree templates and exercises

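The articles and code have been archived in the [GitHub repository: algorithms-notes]; they can also be obtained by replying "算法笔记" (algorithm notes) to the official account [AIShareLab].

Binary indexed tree

Note: the indices of a binary indexed tree must start from 1!

The main use of a binary indexed tree is to do the following quickly (in O(log n) time):
- add a number at a given position (point update)
- compute a prefix sum (range query)

Other variants, such as point query and range update, are derived from these two basic operations.

It differs from a plain prefix-sum array in that it supports point updates. If a point update is applied directly to a prefix-sum array, every prefix sum after the modified position has to be updated as well, so the cost becomes O(n). A simple comparison:

| | Point update | Range query | Overall |
| Prefix sum | O(n) | O(1) | (O(n) + O(1)) / 2 = O(n) |
| Binary indexed tree | O(log n) | O(log n) | O(log n) |

Basic idea

The original array is A and the tree array is C. The relationship between the levels is as shown above. Numbers whose binary representations have the same number of trailing zeros are on the same level, for example 2 -> 10 and 6 -> 110.

Here:

    C[1] = A[1]
    C[2] = A[2] + C[1] = A[2] + A[1]
    C[3] = A[3]
    C[4] = A[4] + C[3] + C[2] = A[4] + A[3] + A[2] + A[1]
    ...

Core: C[x] stores the sum of A over the interval (x - lowbit(x), x].

lowbit(x) = x & -x = $2^k$, where k is the number of trailing zeros in the binary representation of x.

Template

Add a number at a given position (point update)

    // a[x] += v
    // the parent of node x is x + lowbit(x)
    for (int i = x; i <= n; i += lowbit(i)) c[i] += v;

Compute a prefix sum (range query) ...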
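The two basic operations described above can be sketched in Python as follows. This is a minimal illustration rather than the article's own template (which is cut off in this excerpt); the class name `FenwickTree` and the method names `add`, `prefix_sum`, and `range_sum` are illustrative choices.

```python
class FenwickTree:
    """Binary indexed tree over positions 1..n (index 0 is unused)."""

    def __init__(self, n):
        self.n = n
        self.c = [0] * (n + 1)

    @staticmethod
    def lowbit(x):
        # Lowest set bit of x, i.e. 2^k where k is the number of trailing zeros.
        return x & -x

    def add(self, x, v):
        """Point update: a[x] += v, walking up through the parents x + lowbit(x)."""
        while x <= self.n:
            self.c[x] += v
            x += self.lowbit(x)

    def prefix_sum(self, x):
        """Sum of a[1..x], collecting the blocks (x - lowbit(x), x] on the way down."""
        total = 0
        while x > 0:
            total += self.c[x]
            x -= self.lowbit(x)
        return total

    def range_sum(self, left, right):
        """Sum of a[left..right] as a difference of two prefix sums."""
        return self.prefix_sum(right) - self.prefix_sum(left - 1)


if __name__ == "__main__":
    tree = FenwickTree(8)
    for position, value in enumerate([1, 2, 3, 4, 5, 6, 7, 8], start=1):
        tree.add(position, value)
    print(tree.prefix_sum(4))   # sum of the first four values
    tree.add(3, 5)              # a[3] becomes 8
    print(tree.range_sum(3, 5))
```

Both loops touch at most one node per bit of the index, which is where the O(log n) bound for update and query comes from.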

March 18, 2023 · 3 min · jiezi

About algorithms: FIT5216 modelling and solving discrete optimisation problems

FIT5216: Modelling Discrete Optimization Problems
Assignment 1: Air Defence

1 Overview

For this assignment, your task is to write a MiniZinc model for a given problem specification.
- Submit your work to the MiniZinc auto grading system (using the submit button in the MiniZinc IDE).
- Submit your model (copy and paste the contents of the .mzn file) using the Moodle assignment.

You have to submit by the due date (26th March 2023, 11:59pm), using MiniZinc and using the Moodle assignment, to receive full marks. You can submit as often as you want before the due date. Late submissions without special consideration receive a penalty of 10% of the available marks per day. Submissions are not accepted more than 7 days after the original deadline.

This is an individual assignment. Your submission has to be entirely your own work. We will use similarity detection software to detect any attempt at collusion, and the penalties are quite harsh. If in doubt, contact your teaching team with any questions!

2 Problem Statement

Your task is to plan the air defence of a small region of a country under attack. The area to defend is a rectangle which is subdivided into W × H cells. You need to decide on a set of locations at which to place air defence equipment. Each cell has a value for defending it.

The goal is to defend the most value within the given constraints. For task (a) you are given a fixed number of equipment. For task (b) you are planning for a maximum number of equipment.

Each air defence equipment location must be at least 3 cells distant from any other equipment location, i.e. have a Manhattan distance of at least 3; otherwise the sensing methods can interfere. You cannot place equipment on cells with value 0 (which represents enemy control). You cannot use more of a type of equipment than is available. You have a given budget, which is the maximum cost you can spend on equipment.

Input data is given in MiniZinc data format:

    W = 〈 width 〉;
    H = 〈 height 〉;
    value = 〈 2D W × H array of value of position 〉;
    EQUIPMENT = 〈 A set of possible equipment types 〉;
    cost = 〈 cost of each type of equipment 〉;
    avail = 〈 number available of each type of equipment 〉;
    radius = 〈 radius at which each equipment protects 〉;
    limit = 〈 limit on total number of equipment 〉;
    budget = 〈 defense budget 〉;

Here is a sample data set:

    W = 9;
    H = 5;
    value = [| 0, 0, 1, 0, 0, 0, 0, 0, 0
             | 1, 4, 1, 1, 1, 1, 0, 0, 0
             | 1, 1, 4, 1, 1, 2, 1, 0, 0
             | 1, 2, 2, 2, 1, 0, 0, 0, 0
             | 1, 1, 1, 1, 1, 1, 0, 0, 0 |];
    EQUIPMENT = { S300, NASAMS, PATRIOT };
    cost = [ 3, 5, 8 ];
    avail = [ 3, 3, 1 ];
    radius = [1, 2, 3];
    budget = 15;
    limit = 4;

On this 9x5 grid, there are 3 types of equipment, with different costs, availability, and defence radius. We need to place (at most) 4 equipment for maximum coverage with a budget of 15.

Part A - Fixed Number of Equipment

Create a model airdefence.mzn that takes data in the format specified above and decides on exactly limit different air defence locations.

Here is a sample solution. The S300 positions are represented in light blue, the NASAMS positions in light purple, and the defended areas in light gray.

[Figure: the 9 × 5 value grid with the equipment positions and defended cells highlighted]

Note that there are 4 equipment, as required. None of the equipment is closer than 3 squares to another equipment, and no equipment is on a zero reward (red) cell. The total protected value is 33, the sum of the coloured cells.

Your model must define the positions of the equipment cells as x and y coordinates and their type, together with the total protected value. One correct output for the solution above is

    x = [5, 2, 1, 3];
    y = [3, 2, 4, 5];
    t = [NASAMS, S300, S300, S300];
    total_protection = 33;

Note that you will not be able to obtain full marks by just answering part A. Some problems will have no solution, whereas using part B they have a solution.

Part B - Bounded Number of Equipment

Modify your model airdefence.mzn to treat limit as a bound on the maximal possible number of equipment. For example, an optimal solution for the example data is illustrated below, where PATRIOT positions are shown as dark purple.

[Figure: the 9 × 5 value grid with the equipment positions of the part B solution highlighted]

The number of equipment is 3 with a total cost of 14. The protected value is improved to 34.

To model this extension, unused equipment slots must be defined as having x and y coordinate ...
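As a rough cross-check of the sample solution from Part A, the sketch below validates a placement against the stated constraints and adds up the protected value. It is written in Python rather than MiniZinc and rests on two assumptions that the excerpt does not state outright: the protection radius is measured as Manhattan distance, and (x, y) are 1-based column and row indices into the value matrix as printed. Under those assumptions the sample placement scores 33, which matches the text. The function name `protected_value` is illustrative.

```python
from collections import Counter

W, H = 9, 5
VALUE = [
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 4, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 4, 1, 1, 2, 1, 0, 0],
    [1, 2, 2, 2, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, 0],
]
COST = {"S300": 3, "NASAMS": 5, "PATRIOT": 8}
AVAIL = {"S300": 3, "NASAMS": 3, "PATRIOT": 1}
RADIUS = {"S300": 1, "NASAMS": 2, "PATRIOT": 3}
BUDGET, LIMIT = 15, 4


def protected_value(xs, ys, ts):
    """Return the protected value of a placement, or None if it breaks a constraint."""
    placements = list(zip(xs, ys, ts))
    if len(placements) > LIMIT:
        return None
    if sum(COST[t] for _, _, t in placements) > BUDGET:
        return None
    used = Counter(t for _, _, t in placements)
    if any(used[t] > AVAIL[t] for t in used):
        return None
    for i, (x, y, _) in enumerate(placements):
        if VALUE[y - 1][x - 1] == 0:
            return None  # equipment may not sit on a zero-value cell
        for x2, y2, _ in placements[i + 1:]:
            if abs(x - x2) + abs(y - y2) < 3:
                return None  # equipment must be at least 3 cells apart
    covered = set()
    for x, y, t in placements:
        for cx in range(1, W + 1):
            for cy in range(1, H + 1):
                # Assumption: coverage is a Manhattan-distance ball of the given radius.
                if abs(cx - x) + abs(cy - y) <= RADIUS[t]:
                    covered.add((cx, cy))
    # Each covered cell counts once, even if several equipment protect it.
    return sum(VALUE[cy - 1][cx - 1] for cx, cy in covered)


if __name__ == "__main__":
    print(protected_value([5, 2, 1, 3], [3, 2, 4, 5],
                          ["NASAMS", "S300", "S300", "S300"]))
```

The check accepts up to `LIMIT` placements, which corresponds to Part B; Part A would additionally require exactly `LIMIT` of them.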

March 17, 2023 · 5 min · jiezi

About algorithms: MATH7861 numerical analysis and solutions

MATH7861 (S1, 2023) Assignment 1
Due: 4pm on 20 March 2023

All assignments in this course must be submitted electronically and SUBMITTED AS A SINGLE PDF FILE. Prepare your assignment solutions using Word, LaTeX, Windows Journal, or another application, ensuring that your name, student number and tutorial group number appear clearly at the top of the first page, and then save your file in pdf format. Alternatively, you may handwrite your solutions and scan or photograph your handwritten work to create a pdf file. Make sure that your pdf file is legible and that the file size is not excessive. Use the assignment submission link in Blackboard to submit the pdf file. ...

March 17, 2023 · 3 min · jiezi

About algorithms: COMP30027 machine learning algorithms

The University of Melbourne
School of Computing and Information Systems
COMP30027 Machine Learning, 2023 Semester 1
Project 1: Music genre classification with naïve Bayes

Due: 7 pm, 7 April 2021
Submission: Source code (in Python) and written responses
Groups: You may choose to form a group of 1 or 2. Groups of 2 will respond to more questions, and commensurately produce more implementation.
Marks: The project will be marked out of 16 points (individual project) or 24 points (group project). In either case, this project will contribute 20% of your total mark.

Overview

[Figure: A visualisation (mel spectrogram) of a music clip from the GTZAN dataset [3].]

State-of-the-art AI research is focused on developing computer systems that can recognize and understand text, images, and audio in the ways that humans do. A classic problem in audio AI is music genre classification, which is useful for applications like music recommendation systems. Given a piece of music, how do we interpret what "type" of music it is (e.g., pop, classical, hip-hop, or jazz)? This task is challenging for computers because the artists, styles, and features of music within a genre can be quite varied, and songs from different genres may share some features.

In this project, you will implement a supervised naïve Bayes learner to classify the genre of a music clip from high-level acoustic features. You will train, test, and evaluate your classifier on a provided dataset, and then you will have a choice of either extending this basic model in various ways, or using it to answer some conceptual questions about naïve Bayes.

Data

The data for this assignment is drawn from the GTZAN music genre dataset [1], a dataset for music genre classification. It consists of 1000 30-second mp3 audio clips from 10 different classes (100 samples per class). The classes are blues, classical, country, disco, hip hop, jazz, metal, pop, reggae, and rock. For this assignment, we'll use a processed version of the dataset from Kaggle [2], which provides 57 high-level acoustic features [3] extracted from the music clips. You do not need the original audio files for this assignment, though if you are interested, you can download them through Kaggle.

Separate training and test datasets are provided. Please use the provided train/test splits for this assignment, unless a question asks you to create your own splits. Each row in the dataset is a music clip with the class label given in the label column.

Naïve Bayes classifier [4 marks]

There are some suggestions for implementing your learner in the "Naïve Bayes" and "Discrete & Continuous Data" lectures, but ultimately, the specifics of your implementation are up to you. Your implementation must be able to perform the following functions:
- preprocess() the data by reading it from a file and converting it into a useful format for training and testing
- train() by calculating prior probabilities and likelihoods from the training data and using these to build a naïve Bayes model
- predict() classes for new items in a test dataset
- evaluate() the prediction performance by comparing your model's class outputs to ground truth labels

Your implementation should be able to handle numeric attributes and it should assume that numeric attributes are Gaussian-distributed. Your model will not be expected to handle nominal attributes.

Your implementation should actually compute the priors, likelihoods, and posterior probabilities for the naïve Bayes model. You may use built-in functions to read data and compute Gaussian probabilities. However, you must implement the naïve Bayes algorithm yourself and not simply call an existing implementation such as GaussianNB from scikit-learn.

Task 1. Pop vs. classical music classification [8 marks]

Use the pop vs classical train.csv dataset to train your naïve Bayes model and then evaluate it on the pop vs classical test.csv dataset. Answer questions 1-2 below in a short write-up (no more than 250 words total). ...
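To make the train/predict/evaluate structure described above more concrete, here is a minimal Gaussian naïve Bayes sketch using only the standard library. It is one possible shape for such a learner, not the expected solution: the toy two-feature data is invented for illustration, preprocessing from CSV is left out, and the small variance floor (`1e-9`) is an assumption added to avoid division by zero.

```python
import math
from collections import defaultdict


def train(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for every class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            variance = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, max(variance, 1e-9)))  # variance floor is an assumption
        model[label] = (prior, stats)
    return model


def log_gaussian(x, mean, variance):
    """Log of the Gaussian density N(x; mean, variance)."""
    return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)


def predict(model, rows):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    predictions = []
    for row in rows:
        scores = {
            label: math.log(prior)
            + sum(log_gaussian(x, m, v) for x, (m, v) in zip(row, stats))
            for label, (prior, stats) in model.items()
        }
        predictions.append(max(scores, key=scores.get))
    return predictions


def evaluate(predictions, truth):
    """Accuracy: fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


# Invented toy data with two numeric features per clip.
TRAIN_X = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.8], [3.0, 1.0], [3.2, 0.8], [2.9, 1.3]]
TRAIN_Y = ["pop", "pop", "pop", "classical", "classical", "classical"]

if __name__ == "__main__":
    model = train(TRAIN_X, TRAIN_Y)
    guesses = predict(model, [[1.1, 5.1], [3.1, 1.0]])
    print(guesses, evaluate(guesses, ["pop", "classical"]))
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied, which matters with 57 features.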

March 16, 2023 · 8 min · jiezi

About algorithms: COMP9017 development experience discussion

COMP2017 / COMP9017 Assignment 2
Due: 11:59PM Tuesday 28 March 2023 local Sydney time
This assignment is worth 5% + 30% of your final assessment

Task Description

Ah, yes, because what the world really needs right now is yet another virtual machine. But not just any virtual machine, no! We need one with the highly coveted and incredibly useful feature of heap banks. Because who needs to worry about memory allocation when you can just throw it in a heap, am I right? So gather 'round, folks, and get ready to develop the vm_RISKXVII, because nothing says "cutting edge" like a virtual machine named after a board game that this assignment has absolutely nothing to do with. Now, let's dive into the specs and get this party started!

In this assignment you will be implementing a simple virtual machine. Your program will take a single command line argument, the path to the file containing your RISK-XVII assembly code.

Before attempting this assignment it would be a good idea to familiarise yourself with registers, memory space, program counters, assembly and machine code. A strong understanding of these concepts is essential to completing this assignment. Sections 3.6 and 3.7 of the course textbook provide detail specific to the x86_64 architecture; however, you can review these as a reference.

In order to complete this assignment at a technical level you should revise your understanding of bitwise operations, file IO, pointers and arrays.

Some implementation details are purposefully left ambiguous; you have the freedom to decide on the specifics yourself. Additionally, this description does not define all possible behaviour that can be exhibited by the system; some error cases are not documented. You are expected to gracefully report and handle these errors yourself.

You are encouraged to ask questions on Ed (https://edstem.org/au/courses/10466/discussion/). Make sure your question post is of the "Question" post type and is under the "Assignment" category → "A2" subcategory. As with any assignment, make sure that your work is your own (not GPT-3/4's, ChatGPT's or Copilot's, etc.), and that you do not share your code or solutions with other students.

The Architecture

In this assignment you will implement a virtual machine for a 32-bit instruction set. The memory-mapped virtual components of your machine are outlined below:
- 0x0000 - 0x3ff: Instruction Memory - Contains 2^10 bytes for the text segment.
- 0x0400 - 0x7ff: Data Memory - Contains 2^10 bytes for global variables and the function stack.
- 0x0800 - 0x8ff: Virtual Routines - Accesses to these addresses will cause special operations to be called.
- 0xb700 +: Heap Banks - Hardware-managed 128 x 64 byte banks of dynamically allocatable memory.

Your machine also has a total of 32 registers, as well as a PC (program counter) that points to the address of the current instruction in memory. Each of the general-purpose registers can store 4 bytes (32 bits) of data that can be used directly as operands for instructions. All registers are general-purpose except for the first one, which has an address of 0. This register is called the zero register, as any read from it will return a value of 0. Writes to the zero register are ignored.

During execution you should not store any information about the state of the machine outside of the virtual memory devices and the register bank.

Note: A register stores a single value using a fixed bit width. The size of a register corresponds to the processor word size, in this case 32 bits. Think of registers as primitive variables. Physical processor hardware is constrained, and the number of registers is always fixed. There are registers which serve specific purposes, and those which are general. Please identify these in the description and consider them for your solution. You need not consider special purpose registers, such as floating point, in this assignment.

RISK-XVII Instruction-Set Architecture

An Instruction-Set Architecture (ISA) specifies a set of instructions that can be accepted and executed by the target machine. A program for the target machine is an ordered sequence of instructions.

Our virtual machine will operate on a home-brewed 'RISK-XVII' instruction set architecture. During marking, you will be provided with binaries in this ISA to run on your virtual machine. RISK-XVII is a reduced version of the well-known RV32I instruction set architecture, and your virtual machine should be able to execute binary programs compiled for RV32I, as long as they do not include instructions that were not specified by 'RISK-XVII'.

There are in total 33 instructions defined in RISK-XVII; they can be classified into three groups by their functionality: ...
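The register and memory-map rules in the Architecture section can be sketched as below. The assignment itself is written in C; this Python sketch only illustrates the semantics. The class name `RegisterFile`, the function `region`, and the region labels are illustrative, and the upper bound of the heap region is derived here from 128 banks of 64 bytes starting at 0xb700, which is an assumption because the text only says "0xb700 +".

```python
HEAP_BASE = 0xB700
HEAP_SIZE = 128 * 64  # assumed extent: 128 banks of 64 bytes each


class RegisterFile:
    """32 registers of 32 bits each; register 0 always reads as zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        return 0 if index == 0 else self._regs[index]

    def write(self, index, value):
        if index == 0:
            return  # writes to the zero register are ignored
        self._regs[index] = value & 0xFFFFFFFF  # keep only the low 32 bits


def region(address):
    """Classify an address according to the memory map in the text."""
    if 0x0000 <= address <= 0x03FF:
        return "instruction"
    if 0x0400 <= address <= 0x07FF:
        return "data"
    if 0x0800 <= address <= 0x08FF:
        return "virtual_routine"
    if HEAP_BASE <= address < HEAP_BASE + HEAP_SIZE:
        return "heap"
    return "unmapped"


if __name__ == "__main__":
    regs = RegisterFile()
    regs.write(0, 123)
    regs.write(5, 0x1_0000_0002)
    print(regs.read(0), regs.read(5), region(0x0804))
```

Dispatching every load and store through a function like `region` is one way to route accesses to the virtual routines and heap banks without special cases scattered through the instruction handlers.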

March 16, 2023 · 20 min · jiezi

About algorithms: COMP212 distributed protocol algorithms

Department of Computer ScienceCOMP212 - 2023 - CA Assignment 1Coordination and Leader ElectionSimulating and Evaluating Distributed Protocols in JavaAssessment InformationAssignment Number 1 (of 2)Weighting 15%Assignment Circulated 10th February 2023Deadline 17th March 2023, 17:00 UK Time (UTC)Submission Mode Electronic via CANVASLearning outcomes assessed (1) An appreciation of the main principles underlyingdistributed systems: processes, communication, naming, synchronisation, consistency, fault tolerance, andsecurity. (3) Knowledge and understanding of the essential facts, concepts, principles and theories relatingto Computer Science in general, and Distributed Computing in particular. (4) A sound knowledge of the criteria and mechanisms whereby traditional and distributedsystems can be critically evaluated and analysed to determine the extent to which they meet the criteria defined for their current and future development.Purpose of assessment This assignment assesses the understanding of coordination and leader election in distributed systems and implementing, simulating, and evaluating distributed protocols by using the Java programming language.Marking criteria Marks for each question are indicated under the corresponding question.Submission necessary in order Noto satisfy Module requirements?Late Submission Penalty Standard UoL Policy.11 Overall marking schemeThe coursework for COMP212 consists of two assignments contributing altogether 30% ofthe final mark. 
The contribution of the individual assignments is as follows:

Assignment 1: 15%
Assignment 2: 15%
TOTAL: 30%

2 Objectives
This assignment requires you to implement in Java two distributed algorithms for leader election in a ring network and then to experimentally validate their correctness and evaluate their performance.

3 Description of coursework
Throughout this coursework, the network on which our algorithms are to be executed is a bidirectional ring, as depicted in Figure 1.

Figure 1: A bidirectional ring network on n processors.

In our setting, all processors execute the same algorithm and do not know the number $n$ of processors in the system in advance, but they do know the structure of the network and are equipped with unique ids. The ids are not necessarily consecutive and for simplicity you can assume that they are chosen from $\{1, 2, \ldots, \alpha n\}$, where $\alpha \geq 1$ is a small constant (e.g., for $\alpha = 3$, the $n$ processors will be assigned unique ids from $\{1, 2, \ldots, 3n - 1, 3n\}$ every time). Additionally, every processor can distinguish its clockwise from its counterclockwise neighbour, so that, for example, it can choose to send to only one of them or to send a different message to each of them. Processors execute in synchronous rounds, as in every example we have discussed so far in class.

3.1 Implementing the LCR Algorithm - 30% of the assignment mark
As a first step, you are required to implement the LCR algorithm for leader election in a ring. The pseudocode of the non-terminating version of LCR can be found in the lecture notes and is also given here for convenience (Algorithm 1).

Algorithm 1 LCR (non-terminating version)
Code for processor u_i, i ∈ {1, 2, ..., n}:
Initially:
    u_i knows its own unique id stored in myID_i
    sendID_i := myID_i
    status_i := “unknown”

if round = 1 then
    send ⟨sendID_i⟩ to clockwise neighbour
else  // round > 1
    upon receiving ⟨inID⟩ from counterclockwise neighbour
    if inID > myID_i then
        sendID_i := inID
        send ⟨sendID_i⟩ to clockwise neighbour
    else if inID = myID_i then
        status_i := “leader”
    else if inID < myID_i then
        do nothing
    end if
end if

You are required to implement a terminating version of the LCR algorithm in which all processors eventually terminate and know the id of the elected leader.

3.2 Implementing the HS Algorithm - 30% of the assignment mark
Next, you are required to implement another algorithm for leader election on a ring, known as the HS algorithm. Like LCR, HS also elects the processor with the maximum id. The main difference is that HS, instead of trying to send ids all the way around in one direction (which is what LCR does), has every processor trying to send its id in both directions some distance away (e.g., $k$) and then has the ids turn around and come back to the originating processor. As long as a processor succeeds, it does so repeatedly (in “phases”) to successively greater distances (doubling the distance to be travelled each time, e.g., $2k$). See Figure 2 for an illustration.

Figure 2: Trajectories of successive “phases” originating at processor $u_4$ (imagine the rest of the processors doing something similar in parallel, but not depicted here). The id transmitted by $u_4$ aims to travel some distance out in both directions and then return back. If it succeeds, then $u_4$ doubles the aimed distance and repeats.

Informally, each processor $u_i$ “operates in phases” $l = 0, 1, \ldots$ (where each phase $l$ consists of one or more rounds). In each phase $l$, processor $u_i$ sends out a “token” (i.e., a message) containing its id $id_i$ in both directions.
These are intended to travel distance $2^l$ (that is, as in Figure 2, distance $2^0 = 1$ for $l = 0$, distance $2^1 = 2$ for $l = 1$, distance $2^2 = 4$ for $l = 2$, and so on) and then return to their origin. If both tokens manage to return, then $u_i$ goes to the next phase; otherwise it stops producing its own tokens (and from that point on only performs the rest of the algorithm’s operations). A token is discarded if it ever meets a processor with greater id while travelling outwards (away from its origin). While travelling inwards (back to its origin), a token is forwarded by all processors without any check. The termination criterion is as follows: If a token travelling outwards meets its own origin $u_i$ (meaning that this token managed to perform a complete turn of the whole ring while travelling outwards), then $u_i$ elects itself as the leader. Observe that in order for tokens to know how far they should travel each time and in which direction, this information has to be included inside the transmitted messages (that is, apart from the id being transmitted, the messages should also contain this auxiliary information).

The pseudocode of the non-terminating version of HS is given in Algorithm 2. As with LCR, you are required to implement a terminating version of the HS algorithm in which all processors eventually terminate and know the id of the elected leader.

3.3 Experimental Evaluation, Comparison & Report - 40% of the assignment mark
After implementing the terminating LCR and HS algorithms, the next step is to conduct an experimental evaluation of their correctness and performance.

Correctness. Execute each algorithm in rings of varying size (e.g., $n = 3, 4, \ldots, 1000, \ldots$; in practice, up to the point where the simulation takes too long to complete) and starting from various different id assignments for each given ring size. For instance, you could execute them on both specifically constructed id assignments (e.g., ids ascending clockwise or counterclockwise) and random id assignments. In each execution, your simulator should check that eventually precisely one leader is elected. Of course, this will not be a replacement for a formal proof that the algorithms are correct, as you won’t be able to test them on all possible combinations of ring sizes and id assignments, but at least it will be a first indication that they may do as intended.

Performance. Execute, as above, each algorithm in rings of varying size and starting from various different id assignments for each given ring size. For each execution, your simulator should record the number of rounds and the total number of messages transmitted until termination. ...
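The round-based behaviour of LCR and the two quantities to record (rounds and messages) can be sketched in a few lines. The sketch below is in Python rather than the Java the assignment asks for, and it is only one possible way to make LCR terminate: the elected leader sends an announcement once around the ring. That termination rule, the function name `lcr`, and the message tags `"id"` and `"leader"` are assumptions made for illustration and are not taken from the lecture notes.

```python
def lcr(ids):
    """Simulate a terminating LCR variant on a ring of synchronous processors.

    ids[i] is the unique id of processor i; processor i sends clockwise to
    processor (i + 1) % n. Returns (leader_known, rounds, messages).
    """
    n = len(ids)
    leader_known = [None] * n              # leader id learned by each processor
    outbox = [("id", uid) for uid in ids]  # round 1: everyone sends its own id
    rounds = 0
    messages = 0
    while any(m is not None for m in outbox):
        rounds += 1
        messages += sum(m is not None for m in outbox)
        # Processor i receives what its counterclockwise neighbour sent.
        inbox = [outbox[(i - 1) % n] for i in range(n)]
        outbox = [None] * n
        for i, msg in enumerate(inbox):
            if msg is None:
                continue
            kind, value = msg
            if kind == "id":
                if value > ids[i]:
                    outbox[i] = ("id", value)       # forward the larger id
                elif value == ids[i]:
                    leader_known[i] = ids[i]        # own id returned: leader
                    outbox[i] = ("leader", ids[i])  # assumed announcement step
                # a smaller id is swallowed
            elif leader_known[i] is None:
                leader_known[i] = value             # learn the leader, pass it on
                outbox[i] = ("leader", value)
            # otherwise the announcement is back at the leader: stop
    return leader_known, rounds, messages


print(lcr([3, 1, 2]))  # prints ([3, 3, 3], 6, 8)
```

With this termination rule the simulation always takes 2n rounds (n for the maximum id to travel around the ring, n for the announcement), so the message count is the more informative quantity when comparing id assignments.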

March 16, 2023 · 10 min · jiezi

About Algorithms: DATA3888 R Web Crawler Solutions

DATA3888 (2023): Assignment 1

Instructions
• Your assignment submission needs to be an HTML document that you have compiled using R Markdown or Quarto.
• Name your file as “SIDXXX_Assignment.html” where XXX is your Student ID.
• Under author, put your Student ID at the top of the Rmd file (NOT your name).
• For your assignment, please use set.seed(3888) at the start of each chunk (where required).
• Do not upload the code file (i.e. the Rmd or qmd file).
• You must use code folding so that the marker can inspect your code where required.
• Your assignment should make sense and provide all the relevant information in the text when the code is hidden. Don’t rely on the marker to understand your code.
• Any output that you include needs to be explained in the text of the document. If your code chunk generates unnecessary output, please suppress it by specifying chunk options like message = FALSE.
• Start each of the 3 questions in a separate section. The parts of each question should be in the same section.
• You may be penalised for excessive or poorly formatted output.

Question 1 - Case Study 1 (Reef): Visualising data
Sully and colleagues have curated a public dataset containing characteristics linked to coral bleaching over the last two decades. The data is in the file Reef_Check_with_cortad_variables_with_annual_rate_of_SST_change.csv, and the authors curated coral bleaching events at 3351 locations in 81 countries from 1998 to 2017. The full description of the variables can be found in the supplementary table of the study.

a. In the paper, the authors claim “the highest probability of coral bleaching occurred at tropical mid-latitude sites (15–20 degrees north and south of the Equator)”. Create an informative map visualisation to explore this claim and comment on what you can learn from your visualisation.

b. A researcher wants to investigate coral bleaching events around the world as they occurred from 1998 to 2017. Create an interactive map visualisation, representing the information you think would be important. Justify your choice of visualisation, and comment on what you can learn from your visualisation.

Question 2 - Case Study 2 (Kidney): Blood vs Biopsy Biomarker for classification
In the data GSE46474, we estimated the accuracy of our predictive model for graft rejection from a peripheral blood gene expression dataset. However, rejection is a very active process that occurs in the kidney itself. Here we will look at a similar kidney microarray dataset. Therefore, instead of genes being isolated and sequenced from blood, we examine another dataset, GSE138043, where the samples have been sequenced from a kidney biopsy.

a. In each of the GSE46474 and GSE138043 datasets, use the topTable function in the limma package to output the most differentially expressed genes between patients that experience graft rejection and stable patients. Which genes overlap between the top 300 differentially expressed genes for each dataset? In other words, which genes can be found in the top 300 differentially expressed genes for BOTH datasets?

Hint. In the GSE46474 dataset, the outcome is found in the title column of the featureData and the gene symbols are found in the Gene Symbol column of the featureData. In the GSE138043 dataset, the outcome is found in the characteristics_ch1 column of the featureData and the gene symbols are found in the gene_assignment column of the featureData, between the first and second // symbols.

b. Consider the following framework for cross-validation for a support vector machine (SVM) classifier.

Framework 1. Identify the 50 most differentially expressed genes from the entire dataset. Subset the entire dataset to the 50 most differentially expressed genes. Randomly split the data into training and testing sets (80:20 split). Build an SVM classifier on the training set. Calculate the accuracy of the classifier when applied on the testing set.

For each of the GSE46474 and GSE138043 datasets, use repeated 5-fold cross validation (with 50 repeats), following the framework above, to estimate the accuracy of graft survival prediction (rejection or stable). Show your results in a visualisation and comment on the result.

c. Consider the following framework for cross-validation for a support vector machine (SVM) classifier.

Framework 2. Randomly split the entire dataset into training and testing sets (80:20 split). Identify the 50 most differentially expressed genes from the training data. Subset both the training and testing data to the most differentially expressed genes. Build an SVM classifier on the training set. Calculate the accuracy of the classifier when applied on the testing set.

For each of the GSE46474 and GSE138043 datasets, use repeated 5-fold cross validation (with 50 repeats), following the framework above, to estimate the accuracy of graft survival prediction (rejection or stable). Show your results in a visualisation and comment on the result.

d. Compare all the results from b and c using an appropriate graphic. Which of framework 1 or framework 2 is more valid? Is using blood or biopsy more accurate? Justify your answers.

Question 3: Case Study 3 (Brain): Streaming classifier for Brain-box
A physics instructor, Zoe, has created a data set stored under zoe_spiker.zip that contains brain signal series (each series is a file) which correspond to sequences of eye movements of varying lengths. The file name corresponds to the true eye movement. For example, the file LRL_z.wav corresponds to left-right-left eye movements; the file LLRLRLRL_z.wav corresponds to left-left-right-left-right-left-right-left eye movements. There are a total of 31 files.

a. Build a classification rule for detecting a series of {L, R} under a streaming condition where the function will take a sequence of signals as an input. Explain how your classification rule works.

Note. Your function should take the entire .wav file as an input, but should run through the .wav file under streaming conditions (e.g., by considering overlapping/rolling windows in the signal).

b. Create a metric to estimate the accuracy of your classifier on the length 3 wave files, justifying your choice. Comment on the performance of your classifier (i.e. is it reasonable for this context?).

c. Compare at least four different classification rules on the length 3 wave files, using the metric you created. (This may include changing the parameters, different rules to identify events from non-events, or different rules to identify left-movement from right-movement.) What is your best model? Justify your answer with appropriate visualisations.

d. For the best model that you found in part c, evaluate its performance on sequences of varying lengths. Does the length of the sequence have an impact on the classification accuracy? Justify your answer with appropriate visualisations.

dir("data/zoe_spiker/Length3")
## [1] "LLL_z.wav" "LLL_z2.wav" "LLL_z3.wav" "LLR_z.wav" "LLR_z2.wav"
## [6] "LLR_z3.wav" "LRL_z.wav" "LRL_z2.wav" "LRL_z3.wav" "LRR_z.wav"
## [11] "LRR_z3.wav" "LRRz_2.wav" "RLL_z.wav" "RLL_z2.wav" "RLL_z3.wav"
## [16] "RLR_z.wav" "RLR_z2.wav" "RLR_z3.wav" "RRL_z.wav" "RRL_z2.wav"
## [21] "RRL_z3.wav" "RRR_z.wav" "RRR_z2.wav" "RRR_z3.wav"
dir("data/zoe_spiker/Length8")
## [1] "LLRLRLRL_z.wav" "LLRRLLLR_z.wav" "LLRRRLLL_z.wav" "LRRRLLRL_z.wav"
## [5] "RRRLRLLR_z.wav"
dir("data/zoe_spiker/Long")
## [1] "LLLRLLLRLRRLRRRLRLLL_Z.wav" "RRLRRLRLRLLLLLLRRLRL
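The streaming condition described in the note to Question 3a (overlapping windows scanned left to right) can be sketched as follows. This is a Python sketch on a synthetic list of samples, not the R workflow or the real .wav files the assignment uses. The window size, step, threshold, and especially the rule that a peak followed by a trough means "L" are hypothetical choices made purely for illustration.

```python
def label_segment(segment):
    """Hypothetical rule: peak before trough is 'L', otherwise 'R'."""
    return "L" if segment.index(max(segment)) < segment.index(min(segment)) else "R"


def classify_stream(signal, window=4, step=2, threshold=1.0):
    """Scan overlapping windows and emit one label per detected event."""
    labels = []
    event_start = None  # first sample of the event currently being tracked
    event_end = None    # one past the last sample of that event
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        if max(chunk) - min(chunk) > threshold:  # window looks like an event
            if event_start is None:
                event_start = start
            event_end = start + window
        elif event_start is not None:            # event just ended: label it
            labels.append(label_segment(signal[event_start:event_end]))
            event_start = None
    if event_start is not None:                  # event ran to the end
        labels.append(label_segment(signal[event_start:event_end]))
    return "".join(labels)


# Synthetic signal: quiet, an up-then-down deflection, quiet, down-then-up, quiet.
signal = [0.0] * 6 + [2.0, -2.0] + [0.0] * 6 + [-2.0, 2.0] + [0.0] * 6
print(classify_stream(signal))  # prints LR
```

Because each window only sees past and current samples, the same loop could be fed samples as they arrive; the window size and threshold are the natural parameters to vary when comparing classification rules.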

March 16, 2023 · 5 min · jiezi

About Algorithms: COMP9417 Machine Learning

COMP9417 - Machine Learning
Homework 2: Kernel Features & Model Combinations

Introduction
In this homework we first take a closer look at feature maps induced by kernels. We then explore a creative use of the gradient descent method introduced in homework 1. We will show that gradient descent techniques can be used to construct combinations of models from a base set of models such that the combination can outperform any single base model.

Points Allocation
There are a total of 28 marks.

What to Submit
• A single PDF file which contains solutions to each question. For each question, provide your solution in the form of text and requested plots. For some questions you will be requested to provide screenshots of code used to generate your answer; only include these when they are explicitly asked for.
• .py file(s) containing all code you used for the project, which should be provided in a separate .zip file. This code must match the code provided in the report.
• You may be deducted points for not following these instructions.
• You may be deducted points for poorly presented/formatted work. Please be neat and make your solutions clear. Start each question on a new page if necessary.
• You cannot submit a Jupyter notebook; this will receive a mark of zero. This does not stop you from developing your code in a notebook and then copying it into a .py file though, or using a tool such as nbconvert or similar.
• We will set up a Moodle forum for questions about this homework. Please read the existing questions before posting new questions. Please do some basic research online before posting questions. Please only post clarification questions. Any questions deemed to be fishing for answers will be ignored and/or deleted.
• Please check Moodle announcements for updates to this spec. It is your responsibility to check for announcements about the spec.
• Please complete your homework on your own; do not discuss your solution with other people in the course. General discussion of the problems is fine, but you must write out your own solution and acknowledge if you discussed any of the problems in your submission (including their name(s) and zID).
• As usual, we monitor all online forums such as Chegg, StackExchange, etc. Posting homework questions on these sites is equivalent to plagiarism and will result in a case of academic misconduct.
• You may not use SymPy or any other symbolic programming toolkits to answer the derivation questions. This will result in an automatic grade of zero for the relevant question. You must do the derivations manually.

When and Where to Submit
• Due date: Week 7, Monday March 27th, 2023 by 5pm. Please note that the forum will not be actively monitored on weekends.
• Late submissions will incur a penalty of 5% per day from the maximum achievable grade. For example, if you achieve a grade of 80/100 but you submitted 3 days late, then your final grade will be $80 - 3 \times 5 = 65$. Submissions that are more than 5 days late will receive a mark of zero.
• Submission must be made on Moodle, no exceptions.

Question 1. Kernel Power
Consider the following 2-dimensional data-set, where $y$ denotes the class of each point.

index   x1   x2    y
1        1    0   -1
2        0    1   -1
3        0   -1   -1
4       -1    0   +1
5        0    2   +1
6        0   -2   +1
7       -2    0   +1

Throughout this question, you may use any desired packages to answer the questions.

(a) Use the transformation $x = (x_1, x_2) \mapsto (\phi_1(x), \phi_2(x))$ where $\phi_1(x) = 2x_2^2 - 4x_1 + 1$ and $\phi_2(x) = x_1^2 - 2x_2 - 3$. What is the equation of the best separating hyperplane in the new feature space? Provide a plot with the data set and hyperplane clearly shown.
What to submit: a single plot, the equation of the separating hyperplane, a screen shot of your code, a copy of your code in your .py file for this question.

(b) Fit a hard margin linear SVM to the transformed data-set in the previous part [1]. What are the estimated values of $(\alpha_1, \ldots, \alpha_7)$? Based on this, which points are the support vectors? What error does your computed SVM achieve?
What to submit: the indices of your identified support vectors, the train error of your SVM, the computed $\alpha$’s (rounded to 3 d.p.), a screen shot of your code, a copy of your code in your .py file for this question.

(c) Consider now the kernel $k(x, z) = (2 + x^\top z)^2$. Run a hard-margin kernel SVM on the original (un-transformed) data given in the table at the start of the question. What are the estimated values of $(\alpha_1, \ldots, \alpha_7)$? Based on this, which points are the support vectors? What error does your computed SVM achieve?
What to submit: the indices of your identified support vectors, the train error of your SVM, the computed $\alpha$’s (rounded to 3 d.p.), a screen shot of your code, a copy of your code in your .py file for this question.

(d) Provide a detailed argument explaining your results in parts (a), (b) and (c). Your argument should explain the similarities and differences in the answers found. In particular, is your answer in (c) worse than in (b)? Why? To get full marks, be as detailed as possible, and use mathematical arguments or extra plots if necessary.
What to submit: some commentary and/or plots. If you use any code here, provide a screen shot of your code, and a copy of your code in your .py file for this question.

Question 2. Gradient Descent for Learning Combinations of Models
In this question, we discuss and implement a gradient descent based algorithm for learning combinations of models, which are generally termed ‘ensemble models’. The gradient descent idea is a very powerful one that has been used in a large number of creative ways in machine learning beyond direct minimization of loss functions as in the previous question.

The Gradient-Combination (GC) algorithm can be described as follows: Let $\mathcal{F}$ be a set of base learning algorithms [2].
The idea is to combine the base learners in $\mathcal{F}$ in an optimal way to end up with a good learning algorithm. Let $\ell(y, \hat{y})$ be a loss function, where $y$ is the target, and $\hat{y}$ is the predicted value [3]. Suppose we have data $(x_i, y_i)$ for $i = 1, \ldots, n$, which we collect into a single data set $D_0$. We then set the number of desired base learners to $T$ and proceed as follows:

(I) Initialize $f_0(x) = 0$ (i.e. $f_0$ is the zero function.)
(II) For $t = 1, 2, \ldots, T$:
    (GC1) Compute:
    $$r_{t,i} = -\frac{\partial}{\partial f(x_i)} \sum_{j=1}^{n} \ell(y_j, f(x_j)) \Bigg|_{f(x_j) = f_{t-1}(x_j),\ j = 1, \ldots, n}$$
    for $i = 1, \ldots, n$. We refer to $r_{t,i}$ as the $i$-th pseudo-residual at iteration $t$.
    (GC2) Construct a new pseudo data set, $D_t$, consisting of pairs: $(x_i, r_{t,i})$ for $i = 1, \ldots, n$.
    (GC3) Fit a model to $D_t$ using our base class $\mathcal{F}$. That is, we solve
    $$h_t = \arg\min_{f \in \mathcal{F}} \sum_{i=1}^{n} \ell(r_{t,i}, f(x_i))$$
    (GC4) Choose a step-size. This can be done by either of the following methods:
        (SS1) Pick a fixed step-size $\alpha_t = \alpha$.
        (SS2) Pick a step-size adaptively according to
        $$\alpha_t = \arg\min_{\alpha} \sum_{i=1}^{n} \ell(y_i, f_{t-1}(x_i) + \alpha h_t(x_i)).$$
    (GC5) Take the step
    $$f_t(x) = f_{t-1}(x) + \alpha_t h_t(x).$$
(III) Return $f_T$.

We can view this algorithm as performing (functional) gradient descent on the base class $\mathcal{F}$. Note that in (GC1), the notation means that after taking the derivative with respect to $f(x_i)$, we replace all occurrences of $f(x_j)$ in the resulting expression with the prediction of the current model $f_{t-1}(x_j)$, for all $j$. For example:
$$\frac{\partial}{\partial x} \log(x + 1) \Bigg|_{x=23} = \frac{1}{x + 1} \Bigg|_{x=23} = \frac{1}{24}.$$

Footnote 1: If you are using the SVC class in sklearn, to get a hard-margin SVM, you need to set the hyperparameter C to be very large.
Footnote 2: For example, you could take $\mathcal{F}$ to be the set of all regression models with a single feature, or alternatively the set of all regression models with 4 features, or the set of neural networks with 2 layers etc.
Footnote 3: Note that this set-up is general enough to include both regression and classification algorithms.

(a) Consider the regression setting where we allow the y-values in our data set to be real numbers. Suppose that we use squared error loss $\ell(y, \hat{y}) = \frac{1}{2}(y - \hat{y})^2$. For round $t$ of the algorithm, show that $r_{t,i} = y_i - f_{t-1}(x_i)$. Then, write down an expression for the optimization problem in step (GC3) that is specific to this setting (you don’t need to actually solve it).
What to submit: your working out, either typed or handwritten.

(b) Using the same setting as in the previous part, derive the step-size expression according to the adaptive approach (SS2).
What to submit: your working out, either typed or handwritten.

(c) We will now implement the gradient-combination algorithm on a toy dataset from scratch, and we will use the class of decision stumps (depth 1 decision trees) as our base class ($\mathcal{F}$), and squared error loss as in the previous parts [4]. The following code generates the data and demonstrates plotting the predictions of a fitted decision tree (more details in q2.py) [5]:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.tree import DecisionTreeRegressor

    # f (the true function) and f_sampler (the data generator) are assumed to come from q2.py
    np.random.seed(123)
    X, y = f_sampler(f, 160, sigma=0.2)
    X = X.reshape(-1, 1)

    fig = plt.figure(figsize=(7, 7))
    dt = DecisionTreeRegressor(max_depth=2).fit(X, y)  # example model
    xx = np.linspace(0, 1, 1000)
    plt.plot(xx, f(xx), alpha=0.5, color='red', label='truth')
    plt.scatter(X, y, marker='x', color='blue', label='observed')
    plt.plot(xx, dt.predict(xx.reshape(-1, 1)), color='green', label='dt')  # plotting example model
    plt.legend()
    plt.show()

The figure generated is not shown here.

Your task is to generate a 5 x 2 figure of subplots showing the predictions of your fitted gradient-combination model. There are 10 subplots in total: the first should show the model with 5 base learners, the second subplot should show it with 10 base learners, etc. The last subplot should be the gradient-combination model with 50 base learners. Each subplot should include the scatter of data, as well as a plot of the true model (basically, the same as the plot provided above but with your fitted model in place of dt). Comment on your results: what happens as the number of base learners is increased? You should do this two times (two 5 x 2 plots), once with the adaptive step size, and the other with the step-size taken to be $\alpha = 0.1$ fixed throughout. There is no need to split into train and test data here. Comment on the differences between your fixed and adaptive step-size implementations. How does your model perform on the different x-ranges of the data?
What to submit: two 5 x 2 plots, one for adaptive and one for fixed step size, some commentary, and a screen shot of your code and a copy of your code in your .py file.

Footnote 4: In your implementation, you may make use of sklearn.tree.DecisionTreeRegressor, but all other code must be your own. You may use NumPy and matplotlib, but do not use an existing implementation of the algorithm if you happen to find one.
Footnote 5: Although we will not cover decision trees until week 4, we are treating the decision tree as a black box algorithm that can be called using the sklearn implementation. For more on using sklearn models, see Lab 1.

(d) Repeat the analysis in the previous part but with depth 2 decision trees as base learners instead. Provide the same plots. What do you notice for the adaptive case? What about the non-adaptive case?
What to submit: two 5 x 2 plots, one for adaptive and one for fixed step size, some commentary, and a copy of your code in your .py file.

(e) Now, consider the classification setting where $y$ is taken to be an element of $\{-1, 1\}$. We consider the following classification loss: $\ell(y, \hat{y}) = \log(1 + e^{-y\hat{y}})$. For round $t$ of the algorithm, what is the expression for $r_{t,i}$? Write down an expression for the optimization problem in step (GC3) that is specific to this setting (you don’t need to actually solve it).
What to submit: your working out, either typed or handwritten.

(f) Using the same setting as in the previous part, write down an expression for $\alpha_t$ using the adaptive approach in (SS2). Can you solve for $\alpha_t$ in closed form? Explain.
What to submit: your working out, either typed or handwritten, and some commentary.

(g) In practice, if you cannot solve for $\alpha_t$ exactly, explain how you might implement the algorithm. Assume that using a constant step-size is not a valid alternative. Be as specific as possible in your answer. What, if any, are the additional computational costs of your approach relative to using a constant step size?
What to submit: some commentary. ...
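The loop (I) to (III) can be sketched compactly for the squared-error setting. The version below is a pure-Python illustration under stated assumptions, not a model solution: inputs are one-dimensional, the decision stump is fitted by a hand-written exhaustive search instead of `sklearn.tree.DecisionTreeRegressor`, and the adaptive step uses the least-squares minimiser of (SS2) for squared loss, $\alpha_t = \sum_i r_{t,i} h_t(x_i) / \sum_i h_t(x_i)^2$. The names `fit_stump`, `stump_predict` and `gradient_combination` are hypothetical.

```python
def fit_stump(xs, targets):
    """Least-squares decision stump on 1-D inputs (needs >= 2 distinct xs).

    Returns (threshold, left_mean, right_mean).
    """
    best = None
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]


def stump_predict(stump, x):
    thr, lm, rm = stump
    return lm if x <= thr else rm


def gradient_combination(xs, ys, T, step=None):
    """Run T rounds of GC with squared loss; step=None means adaptive (SS2)."""
    preds = [0.0] * len(xs)  # f_0 is the zero function
    model = []               # list of (alpha_t, stump_t)
    for _ in range(T):
        # (GC1) pseudo-residuals for squared loss: r = y - f_{t-1}(x)
        residuals = [y - p for y, p in zip(ys, preds)]
        # (GC2)-(GC3) fit the base learner to (x_i, r_i)
        stump = fit_stump(xs, residuals)
        h = [stump_predict(stump, x) for x in xs]
        # (GC4) step size: fixed, or least-squares optimum for squared loss
        if step is None:
            denom = sum(v * v for v in h)
            alpha = sum(r * v for r, v in zip(residuals, h)) / denom if denom else 0.0
        else:
            alpha = step
        # (GC5) take the step
        preds = [p + alpha * v for p, v in zip(preds, h)]
        model.append((alpha, stump))
    return model, preds


model, preds = gradient_combination([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 1.0], T=1)
print(model[0][0], preds)  # prints 1.0 [0.0, 0.0, 1.0, 1.0]
```

On this tiny step-shaped data set a single stump already fits perfectly, so the adaptive step comes out as 1; with `step=0.1` the same stump is re-selected every round and the fit approaches the data geometrically.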

March 15, 2023 · 10 min · jiezi

About Algorithms: MAS286 Mathematics and Statistics

University of Sheffield
School of Mathematics and Statistics
MAS286 Mathematics and Statistics in Action

What is pharmaco-kinetics?
Pharmaco-kinetics is concerned with how pharmaceutical drugs move around the body. In reality this is a highly complex process, as drugs are absorbed, distributed, metabolised and eliminated through various parts of the body. We will make various assumptions that mean we can simplify these processes. In particular we will assume the body is made up of a small number of ‘compartments’ through which the drugs move. We will also simplify the processes by which drugs are administered. Our focus will be on how the concentration of a particular drug in the body changes over time. We will thus construct mathematical models using ordinary differential equations that describe how this occurs. The models will be made up of functions that describe the key mechanisms that we think are involved. While data will guide our knowledge of these mechanisms as well as ‘sensible’ parameter values to use, here we will not be looking to fit trends to data-sets, or what we might call statistical modelling. Instead, we will construct mechanistic mathematical models and analyse them.

Some details
In this topic you will see how we can build models to represent pharmaco-kinetics using sets of first-order ordinary differential equations. The mathematical techniques will thus build on content from MAS110 and MAS111. There will also be some programming in Python. I do not assume any prior knowledge of Python, since some of you have never met it before. Sample code that you can edit is available on Blackboard, as well as some instructions for those of you who feel unfamiliar with Python. There will be some medical/biological terminology you will need to learn, but this is secondary to the mathematical techniques. The overarching methodology I am trying to introduce to you is mathematical modelling: specifying a problem, translating it into a mathematical model, analysing the model, and translating back.

1 Single intravenous bolus dose
We will start with the simplest (yet realistic) model we can think of, before gradually building more complexity into it. Let us assume that:
• the body can be considered as a single compartment (effectively the bloodstream);
• there is a rapid dose of the drug into the body, called an intravenous bolus;
• only one dose of the drug is given;
• the drug is eliminated from the body at a rate proportional to the drug concentration.

Let $C(t)$ be the concentration of the drug in the bloodstream at time $t$. If the rate of elimination of the drug is $k$, we can write
$$\frac{dC}{dt} = -kC \tag{1.1}$$
for $t > 0$, with some initial concentration $C(0) = C_0 > 0$. Note that we only have a negative term on the right-hand side of equation (1.1), indicating that the concentration of the drug is always decreasing. This fits with our assumptions above, since we assume no additional drug is added after $t = 0$, so all that can happen to the drug is that it is eliminated from the body.

Equation (1.1) is a first-order, linear ordinary differential equation so can be solved by separation of variables:
$$\int \frac{1}{C}\,dC = -\int k\,dt \implies \ln(C) = -kt + A \implies C = e^{A}e^{-kt} \quad \text{(taking exponentials)},$$
where $A$ is a constant of integration. Imposing the initial condition $C = C_0$ at $t = 0$, we find that $e^{A} = C_0$, hence
$$C(t) = C_0 e^{-kt}. \tag{1.2}$$
We thus predict exponential decay of the drug concentration.

1.1 Calculating k
Suppose we wish to find out what the elimination rate is for a particular drug. All we need are two data-points for the concentration at known times. One of these could be the starting concentration ($C_0$ at $t = 0$), and suppose we take a measurement of the concentration, $C_1$, after $t_1$ hours (hr). If we take logs of both sides of our solution (1.2) we find
$$\ln(C(t)) = \ln(C_0) - kt \implies k = \frac{\ln(C_0) - \ln(C(t))}{t} \implies k = \frac{\ln(C_0) - \ln(C_1)}{t_1} \quad \text{(imposing } C(t_1) = C_1\text{)}. \tag{1.3}$$
In fact, if we had multiple data points, we would simply need to know the gradient of the line of $\ln(C(t))$ vs $t$. Since $k$ is a rate, it has units of hr$^{-1}$. Elimination rates for real drugs tend to be in the range $k \in [0.02, 0.4]$ hr$^{-1}$.

1.2 A note on C0
We assumed above that we know the initial concentration, $C_0$. This might seem obvious since we know the dose, but dose and concentration are not the same thing. The dose is the actual amount of the drug administered, while the concentration is the relative amount of drug in the body, as a proportion of the effective volume of the body. We can convert between these two quantities by noting that
$$C_0 = \frac{\text{drug dose}}{\text{effective volume}}.$$
For complex biological reasons, the effective volume is not necessarily equal to a patient’s actual body (or blood) volume, and varies from drug to drug. For simplicity, however, we will assume it is a fixed value for each drug in every patient.

1.3 Drug half-life
Given that we expect exponential decay of the drug concentration over time, we cannot give a precise time when the concentration reaches zero. However, we can calculate a ‘half-life’ for the drug based on its elimination rate. If we know that the concentration is $C_a$ after $t_a$ hours, then to find the half-life we need to know how much longer it takes for the concentration to fall to $C_a/2$. Using equation (1.3), we have
$$\ln(C_a/2) = \ln(C_a) - kt_{1/2} \implies t_{1/2} = \frac{\ln(C_a) - \ln(C_a/2)}{k} \implies t_{1/2} = \frac{\ln(2)}{k}. \tag{1.4}$$

1.4 An example problem
Suppose the half-life of a drug is known to be 3 hours, and the effective volume for the drug in a patient is 30 litres (L). What should the initial dose be such that after 4 hours the concentration of the drug in the patient is 2.5 mg/L?

First, we find the elimination rate $k$ from the half-life, using equation (1.4):
$$t_{1/2} = \frac{\ln(2)}{k} \implies k = \frac{\ln(2)}{t_{1/2}} \implies k = 0.231 \text{ hr}^{-1}.$$
Next, we work out what the initial concentration must be. We know that $k = 0.231$ hr$^{-1}$ and that $C = 2.5$ mg/L at $t = 4$ hr. Using equation (1.2), we have
$$C(t) = C_0 e^{-kt} \implies C_0 = C(t)e^{kt} \implies C_0 = 2.5e^{0.231 \times 4} = 6.297 \text{ mg/L}.$$
The final step is to multiply the initial concentration by the effective volume to get the actual dose, $6.297 \times 30 \approx 190$ mg. A plot of the drug dynamics for these parameter values is shown in Figure 1.

Figure 1: Exponential decay according to equation (1.2), plotted as concentration (mg/L) against time (hr). Parameter values are $k = 0.231$ hr$^{-1}$, $C_0 = 6.297$ mg/L, and $V = 30$ L.

2 Repeated intravenous bolus doses
In Section 1 we assumed that a single dose of a drug is given to a patient, and studied how the concentration of the drug decays over time. In some medical circumstances this may well be all that happens. However, for ongoing conditions, a patient may need repeated doses of a drug. This leads to an important question: what is the optimal time interval between doses for a particular drug? We need to ensure the patient always has enough drug in their system so it is effective, but not so much drug in their system that they overdose.

In Section 1 we could assume there was no drug in the bloodstream prior to the single dose. After repeated doses, however, the drug will accumulate in the bloodstream. We know that eventually the drug concentration decays towards zero after each dose, so if we leave enough time between doses then there will be (effectively) no accumulation. However, if doses are given more frequently, then some of the drug will already be present.

Consider the process demonstrated in Figure 2. A drug with a half-life of 6 hours is given to a patient in 100 mg doses every 6 hours, with an effective volume of 25 L. The initial concentration is thus 4 mg/L. The concentration halves to 2 mg/L after 6 hours, at which point the second dose is given and the concentration increases to 6 mg/L. The concentration then halves to 3 mg/L after a further 6 hours, at which point the third dose is given and the concentration increases to 7 mg/L.
As such the drug is accumulating in the bloodstream. Notice, however, that this rate of accumulation starts to slow, and eventually the drug concentration simply fluctuates between 8 mg/L and 4 mg/L. There is thus a limit to accumulation, because eventually the constant amount of drug that is added at each dose becomes equal to the density-dependent amount that is removed over the next 6 hours. We can therefore design our dosage regime to ensure that the patient always has an effective, but safe, amount of drug in their system.

Figure 2 (concentration in mg/L): Repeated doses according to equation (2.13). Parameter values are $k = 0.116$ hr$^{-1}$, $C_0 = 4$ mg/L, $V = 25$ L, and $\tau = 6$ hr.

In fact, this process is given by a geometric series. Recall from Section 1 that the dynamics of a single dose are given by

$$C(t) = C_0 e^{-kt}. \tag{2.1}$$

Suppose that a dose of the drug is given every $\tau$ hours. We will also label the concentrations with a subscript for the dose number. Thus $\tau$ hours after the first dose, but immediately before the second dose, the drug concentration is

$$C_1(\tau) = C_0 e^{-k\tau}. \tag{2.2}$$

The second dose is then given. Immediately after this second dose, re-starting time as $t = 0$, the drug concentration is

$$C_2(0) = C_1(\tau) + C_0 = C_0(1 + e^{-k\tau}). \tag{2.3}$$

This quantity is therefore the new 'initial condition', and the dynamics for the next $\tau$ hours are given by

$$C_2(t) = C_0(1 + e^{-k\tau})e^{-kt}, \tag{2.4}$$

and exactly $\tau$ hours after this second dose, but just before the third, we have

$$C_2(\tau) = C_0(1 + e^{-k\tau})e^{-k\tau}. \tag{2.5}$$

This pattern continues. For example, immediately and $\tau$ hours after the third dose we have, respectively,

$$C_3(0) = C_0(1 + e^{-k\tau})e^{-k\tau} + C_0 = C_0(1 + e^{-k\tau} + e^{-2k\tau}), \tag{2.6}$$

$$C_3(\tau) = C_0(1 + e^{-k\tau} + e^{-2k\tau})e^{-k\tau}. \tag{2.7}$$

Let $R = e^{-k\tau}$, then we can re-write (2.6)–(2.7) as

$$C_3(0) = C_0(1 + R + R^2), \tag{2.8}$$

$$C_3(\tau) = C_0(R + R^2 + R^3). \tag{2.9}$$

Extrapolating, the concentrations immediately and $\tau$ hours after the $n$-th dose are given by

$$C_n(0) = C_0(1 + R + \cdots + R^{n-1}) = C_0 \sum_{i=1}^{n} R^{i-1}, \tag{2.10}$$

$$C_n(\tau) = C_0(R + R^2 + \cdots + R^n) = C_0 \sum_{i=1}^{n} R^{i}. \tag{2.11}$$

These are clearly geometric series.
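The dose-by-dose bookkeeping above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names are hypothetical, and the dosing interval is written as `tau` on the assumption that this is the symbol lost in extraction. It iterates "add a dose, then decay for one interval" and compares the result with the sum in (2.10), using the Figure 2 values (half-life 6 hr, $C_0 = 4$ mg/L, doses every 6 hr).

```python
import math


def peaks_and_troughs(c0, k, tau, n_doses):
    """Iterate dose by dose: add a dose, then decay over one interval."""
    r = math.exp(-k * tau)          # fraction remaining after one interval
    peaks, troughs = [], []
    conc = 0.0
    for _ in range(n_doses):
        conc += c0                  # immediately after a dose: C_n(0)
        peaks.append(conc)
        conc *= r                   # just before the next dose: C_n(tau)
        troughs.append(conc)
    return peaks, troughs


def peak_from_series(c0, k, tau, n):
    """Evaluate the sum in (2.10) directly."""
    r = math.exp(-k * tau)
    return c0 * sum(r ** (i - 1) for i in range(1, n + 1))


# Figure 2 scenario: half-life of 6 hours, 4 mg/L per dose, every 6 hours.
k = math.log(2) / 6
peaks, troughs = peaks_and_troughs(c0=4.0, k=k, tau=6.0, n_doses=20)

# The first peaks are 4, 6 and 7 mg/L, as in the text; later peaks
# approach 8 mg/L and troughs approach 4 mg/L.
for n in (1, 2, 3, 20):
    print(n, round(peaks[n - 1], 4), round(troughs[n - 1], 4))
```

Because the loop and the series agree term by term, the sketch is only a sanity check on the algebra, not an alternative model.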
We can express (2.10) as a simple fraction as follows:

$$
\begin{aligned}
C_n(0) = C_0(1 + R + \cdots + R^{n-1}) &\implies R\,C_n(0) = C_0(R + R^2 + \cdots + R^n) \\
&\implies C_n(0) - R\,C_n(0) = C_0(1 - R^n) \\
&\implies C_n(0) = C_0\,\frac{1 - R^n}{1 - R} = C_0\,\frac{1 - e^{-nk\tau}}{1 - e^{-k\tau}}.
\end{aligned}
$$

By similar reasoning, we find that

$$C_n(\tau) = C_0 R\,\frac{1 - R^n}{1 - R} = C_0 e^{-k\tau}\,\frac{1 - e^{-nk\tau}}{1 - e^{-k\tau}}. \tag{2.12}$$

This also leads to an expression for the concentration at any time, $t$, after the $n$-th dose,

$$C_n(t) = C_0 e^{-kt}\,\frac{1 - e^{-nk\tau}}{1 - e^{-k\tau}}. \tag{2.13}$$

We can use these expressions to find the minimum and maximum concentrations of the drug after many doses, i.e. as $n \to \infty$. In particular, notice that $e^{-nk\tau} \to 0$ as $n \to \infty$, meaning we have

$$C_\infty(0) = C_+ = \frac{C_0}{1 - e^{-k\tau}}$$

for the maximum concentration, and

$$C_\infty(\tau) = C_- = \frac{C_0 e^{-k\tau}}{1 - e^{-k\tau}}$$

for the minimum.

2.1 An example problem

Suppose that a drug has a half-life of 4 hours, and is given in 200 mg doses every 6 hours with an effective volume of 25 L. Clinical guidelines suggest that, in the long term, the maximum safe concentration of the drug is 20 mg/L, and a minimum concentration of 7.5 mg/L is needed for the drug to be effective. Does this drug dosage regime fit within these guidelines?

First, recalling Section 1.2, we calculate the initial concentration $C_0 = 200/25 = 8$ mg/L. Next, we use equation (1.4) to calculate the elimination rate $k = \ln 2/4 = 0.173$ hr$^{-1}$. This means we can find $e^{-k\tau} = 0.354$. Substituting these values and $\tau = 6$ hr into the expressions for $C_+$ and $C_-$, we obtain

$$C_+ = \frac{8}{1 - 0.354} = 12.38 \text{ mg/L}$$

for the maximum concentration, and

$$C_- = \frac{8 \times 0.354}{1 - 0.354} = 4.38 \text{ mg/L}$$

for the minimum concentration. Therefore this dosage regime is always safe (since the maximum is less than 20 mg/L) but not always effective (since considerable time will be spent at concentrations lower than 7.5 mg/L).

Now let's think about the problem in a different way. Suppose we wish to design our dosage regime so that it fits the clinical guidelines exactly, that is, a maximum concentration of 20 mg/L and a minimum of 7.5 mg/L. What should the dosage amount and time interval be? We already know that $k = 0.173$ hr$^{-1}$.
First, notice that

$$\frac{C_\infty(\tau)}{C_\infty(0)} = R = e^{-k\tau} \implies \frac{7.5}{20} = e^{-0.173\tau}.$$

By taking logs and re-arranging this we get $\tau = -\ln(0.375)/0.173 = 5.67$ hr. Let's round this to the more realistic time frame of 6 hours. Then, by considering the maximum, we have

$$C_0 = C_\infty(0)(1 - e^{-k\tau}) \implies C_0 = 20(1 - e^{-0.173 \times 6}) \implies C_0 = 12.92 \text{ mg/L}.$$

Given that $V = 25$ L, this means a dose of 322.92 mg, which we might round down (because we definitely wish to avoid overdosing) to the more realistic 300 mg. Thus, based on our knowledge of the drug, we would recommend a dose of 300 mg every 6 hours for maximum, yet safe, efficacy. (A quick check shows that our rounding means this actually gives a maximum of 18.58 mg/L and a minimum of 6.58 mg/L.) A time course of this regime is shown in Figure 3(a).

2.2 Loading dose and maintenance dose

A downside of the approach we have just described is that it might take a long time for the drug concentration to reach these long-term maximum and minimum values. For example, Figure 3(a) shows that after the first dose the concentration is ineffective for over 3 hours. What we might consider doing, then, is to initiate treatment with a larger 'loading dose' that takes the concentration to (or near) the maximum concentration, with following dosages given as a 'maintenance dose'. The loading dose should come as close as possible to immediately reaching the maximum concentration. For our example above, then, we might take a loading dose of $20 \times 25 = 500$ mg. The maintenance dose and timing would then be identical to previously (300 mg every 6 hours) since this regime is already balanced to fluctuate between the maximum and minimum concentrations. Plots comparing the time-course of the concentrations with and without the loading dose are shown in Figure 3.

We can still write a general solution, similar to equation (2.13), by considering the dynamics over subsequent doses. Calling the loading dose $C_L$ and the maintenance dose $C_M$, some time $t$ after the first dose we have

$$C_1(t) = C_L e^{-kt}. \tag{2.14}$$

Immediately after the second dose at time $\tau$, re-starting time as $t = 0$, we have

$$C_2(0) = C_1(\tau) + C_M = C_M + C_L e^{-k\tau}, \tag{2.15}$$

and the dynamics for the next $\tau$ hours will now be given by

$$C_2(t) = (C_M + C_L e^{-k\tau})e^{-kt}. \tag{2.16}$$

Figure 3 (concentration in mg/L against time in hr, panels (a) and (b)): Repeated doses according to (a) equation (2.13) with no loading dose, and (b) equation (2.20) with a loading dose. In each case, parameter values are $k = 0.173$ hr$^{-1}$, $C_0 = C_M = 12$ mg/L, $C_L = 20$ mg/L, $V = 25$ L, and $\tau = 6$ hr. The maximum and minimum concentrations given by the clinical guidelines are shown as red dashed lines.

Then, immediately after the third dose we have

$$C_3(0) = C_M + (C_M e^{-k\tau} + C_L e^{-2k\tau}). \tag{2.17}$$

Note that we can then re-write this as

$$C_3(0) = C_M(1 + e^{-k\tau} + e^{-2k\tau}) + (C_L - C_M)e^{-2k\tau}, \tag{2.18}$$

and for any time $t$ after this we would have

$$C_3(t) = C_M(1 + e^{-k\tau} + e^{-2k\tau})e^{-kt} + (C_L - C_M)e^{-2k\tau}e^{-kt}. \tag{2.19}$$

We can see the first term is exactly the same as we had previously and will again be given by a geometric sum. There is then the addition of a second term to do with the extra loading dose. The final expression for the dynamics in this case is thus given by

$$C_n(t) = C_M e^{-kt}\,\frac{1 - e^{-nk\tau}}{1 - e^{-k\tau}}$$ ...
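As a rough check on the loading-dose argument, the regime can be simulated dose by dose without relying on the closed-form expression, which is cut off in this excerpt. The sketch below is illustrative only: the function names are hypothetical, and it uses the example's values ($k = 0.173$ hr$^{-1}$, 6-hour interval, 12 mg/L maintenance dose, 20 mg/L loading dose).

```python
import math


def simulate_regime(loading, maintenance, k, tau, n_doses):
    """Peak and trough concentrations when the first dose may differ."""
    r = math.exp(-k * tau)          # fraction remaining after one interval
    peaks, troughs = [], []
    conc = 0.0
    for dose_number in range(n_doses):
        conc += loading if dose_number == 0 else maintenance
        peaks.append(conc)          # immediately after the dose
        conc *= r
        troughs.append(conc)        # just before the next dose
    return peaks, troughs


def long_term_bounds(c0, k, tau):
    """Long-term maximum and minimum for a constant dose c0."""
    r = math.exp(-k * tau)
    return c0 / (1 - r), c0 * r / (1 - r)


k, tau = 0.173, 6.0
c_max, c_min = long_term_bounds(12.0, k, tau)      # about 18.58 and 6.58 mg/L

plain_peaks, plain_troughs = simulate_regime(12.0, 12.0, k, tau, 12)
loaded_peaks, loaded_troughs = simulate_regime(20.0, 12.0, k, tau, 12)

print("long-term bounds:", round(c_max, 2), round(c_min, 2))
print("first trough without loading dose:", round(plain_troughs[0], 2))
print("first trough with loading dose:   ", round(loaded_troughs[0], 2))
```

In this sketch both regimes settle to the same long-term peaks and troughs; the loading dose only changes how quickly the concentration gets there.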

March 15, 2023 · 26 min · jiezi

On Algorithms: CHAI3D Widgets

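Introduction

In CHAI3D, widgets are the primary elements for displaying data and status information in 2D. Every camera contains a front layer and a back layer to which widgets can be attached. When a camera scene is rendered, the 2D back layer is rendered first, then the 3D world, and finally the 2D front layer. In the next section we review some of these basic widgets.

Widgets in CHAI3D

Panel

A cPanel can be used to place an empty panel on a window. Panels have properties for rounded corners, colour, and material texture. A panel is defined by its width and height and can be placed anywhere in the viewport.

    using namespace chai3d;

    // create a panel
    cPanel* panel = new cPanel();

    // add panel to front layer of camera
    camera->m_frontLayer->addChild(panel);

    // set width and height of panel
    panel->setSize(300, 200);

    // assign radius of each corner of panel
    panel->setCornerRadius(10, 10, 10, 10);

    // assign a position (x,y) to panel
    panel->setLocalPos(40, 60);

    // set a uniform color to panel
    panel->setColor(cColorf(1.0, 0.5, 0.5));

    // assign a transparency level to panel
    panel->setTransparencyLevel(0.5);

Bitmap

An application can add a bitmap widget to its user interface to display an image. The image is displayed with its origin at the bottom-left corner of the widget's target rectangle, and it can be stretched along the X and Y axes. ...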

March 15, 2023 · 2 min · jiezi

On Algorithms: MTHM506 Statistical Data Modelling

MTHM506/COMM511 - Statistical Data Modelling

Topic 3 - Introduction

Preliminaries

In this session, we will start Topic 3 and introduce Generalised Additive Models, another, more flexible class of models that are often seen as an extension to Generalised Linear Models. These notes refer to Topics 3.1-3.8 from the lecture notes. In this session, we need the mgcv package to help us fit Generalised Additive Models. We use the install.packages() function to download and install the most recent version of the package, and the library() function to load it into the R session. ...

March 15, 2023 · 12 min · jiezi

On Algorithms: ECMT3150 Statistical Summaries in R

ECMT3150: Assignment 1 (Semester 1, 2023)

[Total: 24 marks]

Note: Please append your R code (as a separate .R file) for part (g) when you submit the assignment.

Let $X_i$ denote the log-price of a stock, Cherry Inc. (code: CRRY), at the end of trading day $i$, and let $\Delta X_i := X_i - X_{i-1}$; thus $\Delta X_i$ is the log-return on trading day $i$ (i.e., over period $(i-1, i]$).

Assume $\{X_i\}_{i \ge 0}$ follows the AR(1) model:

$$X_i = \phi_0 + \phi_1 X_{i-1} + u_i, \tag{1}$$

where $u_i \sim$ iid normal with mean 0 and variance $\sigma^2$. Let $\{\mathcal{F}_i\}_{i \ge 0}$ be the natural filtration generated by $\{u_i\}_{i \ge 0}$.

(a) [2 marks] Express $\Delta X_i$ in terms of $X_{i-1}$ and $u_i$.

(b) [2 marks] Compute $E(\Delta X_i \mid \mathcal{F}_{i-1})$.

(c) [2 marks] Compute $Var(\Delta X_i \mid \mathcal{F}_{i-1})$.

(d) [2 marks] What is the condition on $\phi_0$ and $\phi_1$ such that $\{\Delta X_i\}_{i \ge 1}$ is a martingale difference sequence?

A trading strategy is defined by $\{\theta_i\}_{i \ge 0}$, where $\theta_i$ is measurable with respect to $\mathcal{F}_i$. Specifically, $\theta_i$ represents the number of CRRY shares a trader buys at the start of day $i$. The log-return due to the trading strategy over period $(0, T]$ is given by

$$r_T = \sum_{i=1}^{T} \theta_{i-1} \Delta X_i.$$

(e) [4 marks] Alice invested in a share of CRRY using a buy-and-hold strategy, with $\theta_i \equiv 1$ for all $i$. Compute $E(r_T)$ and $Var(r_T)$ with $\phi_0 = 0$ and $\phi_1 = 1$.

(f) [4 marks] Bob suggested another strategy, with $\theta_i \equiv \Delta X_i$ for $i > 0$ and [...]. Compute $E(r_T)$ and $Var(r_T)$ with $\phi_0 = 0$ and $\phi_1 = 1$.

(g) [8 marks] Carol suggested yet another strategy, with $\theta_i \equiv 1\{\Delta X_i > 0\}$ and $\theta_0 = 1$. We want to evaluate the risk-return trade-off of the proposed strategies using computer simulation.

Start an R session, and set a random seed equal to the last 3 digits of your student ID.[1] Then generate $B$ sample values of $r_T$ (name them $r_T^{(1)}, r_T^{(2)}, \ldots, r_T^{(B)}$), and compute the sample mean and variance of $r_T$ as follows:

$$\bar{r}_T = \frac{1}{B}\sum_{b=1}^{B} r_T^{(b)}, \qquad se(r_T) = \frac{1}{B-1}\sum_{b=1}^{B}\left(r_T^{(b)} - \bar{r}_T\right)^2.$$

For the purpose of your simulations, set $T = 63$, $\sigma^2 = 0.1$, $B = 1000$.

The Sharpe ratio, defined as $SR = \bar{r}_T / se(r_T)$, is a common measure of the risk-return trade-off. Trading strategies with higher SR are preferred by investors.

Complete the following table with SR values.
Comment on the performance of the trading strategies under different scenarios.

| $\phi_1$ | Alice | Bob | Carol |
|---|---|---|---|
| [...] | | | |

[1] This is to ensure that your answers are replicable but different from those of other students.

[Total: 16 marks]

Let $M$ denote the mood of Mimi (h: happy; a: angry), and let $W$ denote the weather (s: sunny; r: rainy). The joint probability distribution of $M$ and $W$ is given in the table below. The row and column sums are displayed in the last column and in the last row, respectively.

| $p(m, w)$ | $M = h$ | $M = a$ |
|---|---|---|
| $W = s$ | 0.4 | 0.1 |
| $W = r$ | 0.2 | 0.3 |

(a) [2 marks] Compute $P(M = a)$.

(b) [2 marks] Derive the conditional distribution of $W$ given $M = a$.

Assume that, given $m$ and $w$, your test score $S$ follows a normal distribution with mean $\mu(m, w) := E(S \mid M = m, W = w)$ and standard deviation 5. The conditional mean function $\mu(m, w)$ is given in the table below:

| $\mu(m, w)$ | $m = h$ | $m = a$ |
|---|---|---|
| $w = s$ | 80 | 50 |
| $w = r$ | 70 | 40 |

The passing score is 50 or above.

(c) [3 marks] Compute the mean score $E(S)$.

(d) [3 marks] Given that Mimi was angry, what is the mean score you would get?

(e) [3 marks] Compute the probability of failing the test.

(f) [3 marks] Given that you failed the test, what is the probability that Mimi was angry?

[Total: 20 marks]

Note: Please append your R code (as a separate .R file) when you submit the assignment.

Carol, an amateur economist, proposes the following time series model for the unemployment rate:

$$y_t = \frac{1}{20} + \frac{\sqrt{3}}{2}\,y_{t-1} - \frac{1}{4}\,y_{t-2} + \varepsilon_t, \tag{2}$$

where $\varepsilon_t \sim$ iid $N(0, 0.02^2)$ (normal distribution with mean 0 and variance $0.02^2$). The time period is measured in number of quarters.

(a) [3 marks] Show that the time series $\{y_t\}$ generated by model (2) is stationary.

(b) [3 marks] There is a stochastic cycle in the time series generated by model (2). Find its periodicity in number of quarters.

(c) [4 marks] Compute the ACF for the first 3 lags, i.e., $\rho(1)$, $\rho(2)$ and $\rho(3)$.

(d) [2 marks] Write an R program to simulate a sample path of $\{y_t\}$ over 30 years. Set the initial values $y_0$ and $y_1$ to be $y_0 = 0.1$ and $y_1 = 0.12$.
While simulating the random numbers for $\varepsilon_t$, set the random seed to be the last 3 digits of your student ID.

(e) [2 marks] Plot the sample ACF and record its value for the first 3 lags (the values can be retrieved from the acf command output stored as a list). Why are they different from your answers in part (c)?

(f) [3 marks] Using the simulated sample path in part (d), estimate an AR(2) model using the R command arima. Write down the estimated model with the parameter estimates and their standard errors. Also record the estimated variance of the innovations.

[Important note: the 'intercept' estimate in the arima output is in fact the unconditional mean; see Rob Hyndman's page for details: https://robjhyndman.com/hyndsight/arimaconstants/.]

(g) [3 marks] Using the simulated sample path in part (d) and the R package forecast, plot the point forecast and the confidence interval for each period over the next 5 years. Describe the short-run and long-run behaviour of the point forecast and the confidence interval.
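For readers who want to see how stationarity, cycle length and the theoretical ACF of an AR(2) model are usually checked, here is a minimal Python sketch. It rests on an assumption: the model equation is garbled in this excerpt, and the coefficients below ($\phi_1 = \sqrt{3}/2$, $\phi_2 = -1/4$, intercept $1/20$) are a reading of it, not a confirmed transcription. The variable names are illustrative, and the assignment itself asks for R.

```python
import cmath
import math

# Assumed reading of the garbled model:
#   y_t = 1/20 + (sqrt(3)/2) y_{t-1} - (1/4) y_{t-2} + eps_t
phi1 = math.sqrt(3) / 2
phi2 = -0.25
intercept = 1 / 20

# Roots of z^2 - phi1*z - phi2 = 0; the process is stationary when
# both roots lie strictly inside the unit circle.
disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
roots = [(phi1 + disc) / 2, (phi1 - disc) / 2]
is_stationary = all(abs(z) < 1 for z in roots)

# Complex roots imply a stochastic cycle; its period is 2*pi / angle.
angle = abs(cmath.phase(roots[0]))
period = 2 * math.pi / angle

# Unconditional mean and Yule-Walker recursion for the theoretical ACF.
mean = intercept / (1 - phi1 - phi2)
rho = [1.0, phi1 / (1 - phi2)]
for lag in range(2, 4):
    rho.append(phi1 * rho[lag - 1] + phi2 * rho[lag - 2])

print("root moduli:", [round(abs(z), 4) for z in roots])
print("stationary:", is_stationary, "period (quarters):", round(period, 4))
print("mean:", round(mean, 4), "ACF lags 1-3:", [round(x, 4) for x in rho[1:]])
```

Sample autocorrelations from a simulated path will generally differ from these theoretical values because of sampling variation.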

March 14, 2023 · 5 min · jiezi

On Algorithms: MMGT6001 Strategy Analysis

MMGT6001 – Strategy

Group Assessment (20%)

ASSESSMENT TASK

The group assignment will ask you to perform a strategic analysis of an Australian company, generate strategic options and design a learning launch.

The strategic analysis should provide the following elements:

- A sound analysis of the current situation, the strategic issues to tackle as well as the key success factors of the industry. To do so, you should use relevant and applied frameworks discussed in class. Ideally, your analysis should include an external analysis, an internal analysis/activity analysis and a discussion of the customers. Pay particular attention to emerging trends that affect the industry as well as emerging business models.
- Based on your findings, you should suggest two strategic options. Use theory, evidence as well as your critical thinking to design these potential options. You should then evaluate these options and select one strategic option that you would like to investigate further in a learning launch. There are no wrong answers; however, you should back up your choice of a specific strategic option with a short comparative analysis of the strategic options (outcomes, input, time, etc.).
- You should then provide a proposition of a learning launch. Hess and Liedtka define a learning launch as a "carefully designed experiment or prototype designed to test the key underlying value-generating assumptions of a potential new-growth initiative. In contrast to a full new-product roll-out, a learning launch is a learning experiment conducted quickly and inexpensively" (Liedtka, 2009: p.1).
- Finally, you should evaluate your strategy by using the four Hess & Liedtka tests. A document with the relevant information will be uploaded to Canvas in the 'Group Assessment' Module.

Your group is responsible for choosing the Australian company to analyse. I strongly recommend you select a public company for which there is information available such as annual reports, earnings calls and strategy presentations.
You can find this information in the 'Investor' section of the company's website, as well as on the ASX database or other databases you might have access to, such as Capital IQ. Your chosen company should have interesting strategies, and/or be in industries that are disrupting the way customers do things or in industries facing disruption. Your chosen company should be approved by the lecturer. Please notify the lecturer of your selection via email (massimo.garbuio@sydney.edu.au) by 28 February 2023. Along with this, please also brainstorm and email me five challenges and five opportunities for your selected company, which may later form the basis of your Group Assessment.

Some examples of companies that you could investigate are Afterpay, Seek, Tyro, Splitit, Xero, Myer, Santos, Atlassian, Coles, Woolworths, Airtasker, Zip, any Big Four bank (ANZ, CBA, NAB, Westpac), Fortescue Metals, or any Australian lithium or hydrogen company (e.g. Lake Resources).

The Group for this assignment will be allocated by the lecturer and made available on Canvas.

Your Group should consider itself a team of consultants that have been called by the Board of Directors to provide a strategic analysis of the situation the company faces, strategic options and a learning launch. Therefore, your presentation needs to be backed up by solid quantitative and qualitative evidence.

During your analysis, look at both what the organization says of itself and what other sources say of it. Industry and consulting reports will be valuable to identify key trends, as well as the business press (e.g. BusinessWeek, Bloomberg, The Financial Times, FastCompany, Entrepreneur and Australian Financial Review; use Factiva if you don't have direct access to these sources).
To identify relevant internal issues, the company's annual report can be complemented by interviews with experts, reports of competitors as well as articles from the business press.

Following submission, each team will be assigned another group's presentation to look at in detail. You should prepare 2-3 questions to ask during the live session in the final class (Tuesday 21 March 2023). Formulating these questions is another opportunity to practice your strategic thinking. In formulating the questions, consider: ...

March 14, 2023 · 5 min · jiezi

On Algorithms: COMP9334 Computer Networks

COMP9334 Capacity Planning of Computer Systems and Networks

Assignment (Version 1.01), Term 1, 2023

Due 5:00pm, Fri 17 March 2023 (Friday Week 5)

Change log and version info

Updates, changes and clarifications will appear in this box.

- Version 1.01 (7 March 2023) revises the wording in Question 1.
- Version 1.00 issued on 27 February 2023

Instructions

(1) There are 3 questions in this assignment. Answer all questions.

(2) The total mark for this assignment is 20 marks.

(3) The submission deadline is 5:00pm Friday 17 March 2023. Submissions made after the deadline will incur a penalty of 5% per day. Late submissions will only be accepted until 5:00pm Wednesday 22 March 2023, after which no submissions will be accepted.

(4) In answering the questions, it is important for you to show your intermediate steps and state what arguments you have made to obtain the results. You need to note that both the intermediate steps and the arguments carry marks. Please note that we are not just interested in whether you can get the final numerical answer right; we are more interested to find out whether you understand the subject matter. We do that by looking at your intermediate steps and the arguments that you have made to obtain the answer. Thus, if you can show us the perfect intermediate steps and the in-between arguments but get the numerical values wrong for some reason, we will still award you marks for having understood the subject matter. You can take a look at the solutions to the revision problems to get some idea of the level of explanation that is required.

(5) If you use a computer program to perform any part of your work, you must submit the program or you lose marks for the steps.

(6) This is an individual assignment.

(7) Your submission should consist of:

(a) A report describing the solution to the problems. This report can be typewritten or a scan of handwritten pages. This report must be in pdf format and must be named report.pdf.
The submission system will only accept the name report.pdf.

(b) One or more computer programs if you use them to solve the problems numerically. You should use zip to archive all the computer programs into one file with the name supp.zip. The submission system will only accept this name. The report must refer to the programs so that we know which program is used for which part.

(8) Submission can be made via the course website.

(9) You can submit as many times as you wish before the deadline. A later submission will over-write the earlier one. We will only mark the last submission that you make.

(10) If you want to ask questions on the assignment, you can attend a consultation (see the Timetable section of the course website for dates and times) or post your question on the forum. Please note that if your forum post shows part of your solution or code, you must mark that forum post private.

(11) Additional assignment conditions:

- Joint work is not permitted on this assignment.
  - This is an individual assignment. The work you submit must be entirely your own work: submission of work even partly written by any other person is not permitted.
  - Do not request help from anyone other than the teaching staff of COMP9334.
  - Do not post your assignment work or code to the course forum.
  - Assignment submissions are routinely examined both automatically and manually for work written by others.

  Rationale: this assignment is designed to develop the individual skills needed to solve problems. Using work/code written by, or taken from, other people will stop you learning these skills. Other CSE courses focus on skills needed for working in a team.

- The use of AI generative tools, such as ChatGPT, is not permitted on this assignment.

  Rationale: this assignment is designed to develop your understanding of basic concepts. Using AI tools will stop you learning these fundamental concepts, which will significantly impact your ability to complete future courses.
Moreover, ChatGPT has been found to give incorrect answers for advanced problems covered in this course.

- Sharing, publishing, or distributing your assignment work is not permitted.
  - Do not provide or show your assignment work to any other person, other than the teaching staff of COMP9334. For example, do not message your work to friends.
  - Do not publish your assignment code via the Internet. For example, do not place your assignment in a public GitHub repository.

  Rationale: by publishing or sharing your work, you are facilitating other students using your work. If other students find your assignment work and submit part or all of it as their own work, you may become involved in an academic integrity investigation.

- Sharing, publishing, or distributing your assignment work after the completion of COMP9334 is not permitted.
  - For example, do not place your assignment in a public GitHub repository after this offering of COMP9334 is over.

  Rationale: COMP9334 may reuse assignment themes covering similar concepts and content. If students in future terms find your assignment work and submit part or all of it as their own work, you may become involved in an academic integrity investigation.

Question 1 (5 marks)

Assume that you are the administrator of an interactive computer system. The computer system consists of a multi-core CPU and a disk. During an observation time of 1800 seconds, you obtained the following measurements from the system:

| Measurement | Value |
|---|---|
| Average busy time per core | 1575 s |
| Disk busy time | 1124 s |
| Number of requests completed by the computer system | 57 |

This computer system is used by 16 interactive users and the thinking time per interactive user is 45 seconds.

You consider the current throughput of the system to be too low. You are considering a proposal to upgrade the current CPU, which has 4 cores, to a new CPU with 6 cores and the same processing speed per core as that of the current CPU.

For this question, you can assume that the total workload remains the same before and after the upgrade.
You can also assume that the requests (note the plural) are almost evenly distributed among the cores at the moment and that the requests can still be evenly distributed among the cores after the upgrade. As the system administrator, you know that when each request (note the singular) uses the CPU, it uses only one core at a time, i.e., a request does not use multiple cores concurrently.

Answer the following questions.

(a) Determine the current average service demand of a core.

(b) What will the average service demand per core be if the proposed CPU upgrade is carried out?

Hint: The service demand of a core depends on two factors: the number of visits to the core and the service time needed per visit to the core. For the set-up of this question, one of these two factors remains the same after the upgrade, while the other factor will change.

(c) What will the throughput bound of the computer system be if the proposed CPU upgrade is carried out?

Reminder: If you use a computer program to derive your numerical answers, you must include your computer program in your submission. Do not forget to show us your steps to obtain your answer.

Question 2 (7 marks)

Assume that you are the owner of a CPU and you are happy for outsiders to use your CPU as long as these outsiders' work does not interrupt yours and your work takes precedence over theirs. This question is inspired by people who donate their spare CPU time for scientific research.

In this question, we will use the term primary user to refer to the CPU owner (which is you) and the term external users to refer to the outsiders. We make the following assumptions:

- The CPU is configured as a single processing unit without any queueing spaces.
- If a request (which can be from the primary user or an external user) arrives when the CPU is idle, the request will be admitted to the CPU.
- If a request (which can be from the primary user or an external user) arrives when the CPU is working on a primary user's request, the arriving request will be rejected. This is because there are no queueing spaces in the CPU.
- If a request from the primary user arrives when the CPU is working on an external user's request, the external user's work will be terminated immediately and the primary user's request will be admitted to the CPU immediately. In this case, the external user loses their work and its remaining work will not be resumed. Therefore, you can consider that the external user has left permanently.
- If a request from an external user arrives when the CPU is working on another external user's request, the newly arriving request will be rejected.
- The inter-arrival times for the primary user's requests are exponentially distributed with mean arrival rate $\lambda_p$; those for the external users' requests are exponentially distributed with mean arrival rate $\lambda_e$.
- The service times of the primary user's requests are exponentially distributed with a mean service time of $1/\mu_p$; those for the external users' requests are exponentially distributed with mean $1/\mu_e$.
- The four probability distributions mentioned in the last two dot points are independent of each other.

Answer the following questions. You are expected to express your answers in terms of these rate parameters: $\lambda_p$, $\lambda_e$, $\mu_p$ and $\mu_e$.

(a) Let us assume that at time $t$, you observe that the request at the CPU belongs to an external user. What is the probability that this observed request will still be in the CPU at time $(t + \delta t)$, where $\delta t$ is an infinitesimal time change? You should express this probability in terms of $\delta t$ and any of the appropriate rate parameters.

(b) Formulate a continuous-time Markov chain for the CPU.
Your formulation should include the definition of the states and the transition rates between states.

(c) Write down the balance equations for the continuous-time Markov chain that you have formulated.

(d) Derive the expressions for the steady state probabilities of the continuous-time Markov chain that you have formulated. You should be able to solve for the steady state probabilities analytically and provide answers in terms of $\lambda_p$, $\lambda_e$, $\mu_p$ and $\mu_e$.

(e) What is the probability that a request from the primary user will be admitted? Why is this probability independent of the rate parameters of the external users?

(f) What is the probability that a request from an external user will be admitted?

Question 3 (8 marks)

This question is based on the system illustrated in Figure 1. The system consists of a database server and an external queue. The database server consists of a front-end server and a back-end server; each server has its own queue. Each of the three queues in this system (i.e., external, front-end, back-end) has the capacity to hold only one request.

Figure 1 (diagram labels: incoming requests, external queue, database server with front-end and back-end, departing requests)

The mode of operation for the system in Figure 1 is as follows:

- The total number of requests in the database server (i.e., the two servers and two queues) must be two or less.
- If an incoming request arrives when there are a total of 2 requests in the database server, then the incoming request will join the external queue if it is empty; otherwise it will be rejected if the external queue is already occupied.
- If an incoming request arrives when there are no requests in the database server, then the incoming request will be sent to the front-end server.
- If there is one request in the database server, then an incoming request will be sent to the front-end server if it is idle or it will be placed in the front-end queue if the front-end server is busy.
- After the front-end server has finished processing a request, there is a probability of $p$ that the front-end server will send the request to the back-end server for further processing, and a probability of $(1 - p)$ that the request will leave the database server (hence the system) permanently.
- After a request has been processed by the back-end server, the request will always be sent back to the front-end for further processing; this request will need to join the queue if the front-end server is busy.
- If there is a request waiting in the external queue at the time a request is leaving the database server permanently, then the request in the external queue will be admitted to the database server. There are two scenarios depending on whether the front-end queue is occupied at the time when the permanent departure takes place. If the front-end queue is occupied, then upon the permanent departure, the request in the front-end queue will move to the server and the request in the external queue will move to the front-end queue. If the front-end queue is unoccupied, then the request in the external queue will go to the front-end server.

You can assume the following for the workload:

- The incoming requests are Poisson distributed with a mean arrival rate of $\lambda$ requests per unit time.
- The service time (i.e., per visit) to the front-end is exponentially distributed with a mean of $1/\mu_f$.
- The service time (i.e., per visit) to the back-end is exponentially distributed with a mean of $1/\mu_b$.
- All the service times and inter-arrival times are independent of each other.

(a) Formulate a continuous-time Markov chain for the system. Your formulation should include the definition of the states and the transition rates between states.
The transition rates should be expressed in terms of $\lambda$, $\mu_f$, $\mu_b$ and $p$.

(b) Assume that $\lambda = 1.4$, $\mu_f = 2.1$, $\mu_b = 1.8$ and $p = 0.3$.

(i) Determine the steady state probabilities of the states of the continuous-time Markov chain that you have specified in Part (a).

(ii) Determine the throughput of the database server.

(iii) Determine the mean response time of the database server.

Reminder: If you use a computer program to derive your numerical answers, you must include your computer program in your submission. Do not forget to show us your steps to obtain your answer.

End of assignment ...
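As background for Question 1, the measured quantities can be converted into utilisation, throughput and service demand using the standard operational laws (utilisation law, service demand law, and the interactive response time law). The Python sketch below is illustrative only: the variable names are hypothetical, it covers just the current measured system, and it makes no claim about the upgraded configuration.

```python
# Measurements quoted in Question 1 (observation period of 1800 s).
observation_time = 1800.0    # seconds
core_busy_time = 1575.0      # average busy time per core, seconds
disk_busy_time = 1124.0      # seconds
completions = 57             # requests completed by the system
users = 16                   # interactive users
think_time = 45.0            # seconds per interactive user

# Throughput: completed requests per second.
throughput = completions / observation_time

# Utilisation law: U = busy time / observation time.
core_utilisation = core_busy_time / observation_time
disk_utilisation = disk_busy_time / observation_time

# Service demand law: D = U / X, which equals busy time / completions.
core_demand = core_utilisation / throughput
disk_demand = disk_utilisation / throughput

# Interactive response time law: R = N / X - Z.
response_time = users / throughput - think_time

print(f"throughput        = {throughput:.5f} requests/s")
print(f"core utilisation  = {core_utilisation:.3f}")
print(f"disk utilisation  = {disk_utilisation:.3f}")
print(f"demand per core   = {core_demand:.3f} s")
print(f"disk demand       = {disk_demand:.3f} s")
print(f"response time     = {response_time:.3f} s")
```

These laws hold for any measured system regardless of distributional assumptions, which is why they are a common first step before bounds or queueing models are applied.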

March 14, 2023 · 11 min · jiezi

On Algorithms: CHAI3D Lights and Shadows

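Recommended: add the NSDT Scene Editor to your 3D development toolchain.

Introduction

Light is the most important idea behind the visual representation of anything a human can perceive. What you see is based not on the objects you are looking at but on the rays of light cast by a light source and reflected from those objects. Your eyes do not see objects directly, because there is no physical connection between your eyes and those objects.

All of this is theoretical, of course. We use the term "ray of light" only to abstract a more complex mechanism.

Light rays usually come from an energy source such as the sun or a lamp in a room. In theory, light travels in straight lines, and when you visually perceive an object, what your eyes absorb is the light reflected or scattered by that object.

Abstract types of light

The following terms describe the types of light you need to know when programming a 3D application that requires a light source. It is important to understand the effect each type has on the surface of a rendered 3D object. The terms exist because certain effects that light produces on objects needed names, so that the complex mathematics of light could be abstracted away. This does not mean that these exact types of light exist in nature; they are abstractions of the effects light can produce when cast on different materials. Computing the real mechanics of light would be very time-consuming, so OpenGL adopts this common set of light types: ambient, diffuse and specular. Emissive light differs from the others in that it is light emitted by an object, whereas the other three are usually used to describe a light source. Let us look at each type in detail.

Figure: the ambient, diffuse and specular components combined.

Ambient light

Figure: a 3D sphere lit only by ambient light looks 2D.

Ambient light is the average amount of light produced by all the light sources surrounding (or located inside) the lit area. When sunlight passes through the window of a room, it hits the walls, is reflected and scattered in many directions, and evenly brightens the whole room. Ambient light describes this visual quality. Ambient light alone cannot convey a complete representation of an object in 3D space, because all vertices are lit evenly with the same color and the object appears two-dimensional, as in the figure above. Although the displayed object is actually a 3D sphere, it looks flat on screen when lit only by ambient light.

Diffuse light

Figure: a red diffuse light cast onto an object, defining its 3D shape.

Diffuse light represents directional light cast by a light source. It can be described as light that has a position in space and comes from a single direction. A flashlight held slightly above the object it illuminates can be regarded as emitting diffuse light. In the figure above, the light source casting red diffuse light is to the left of the object. When diffuse light touches the surface of an object, it scatters and reflects evenly across that surface.

To see how ambient and diffuse light work together to make an object look more realistic, imagine a 3D sphere lit by a dark red ambient light. Now, by placing a diffuse light source to the right of the sphere, we get a result in which the sphere looks 3D.

Specular light

Figure: in addition to the ambient and diffuse layers, a specular reflection (or specular highlight) is shown. Note how much the specular property of the light source enhances the 3D representation of the object.

Like diffuse light, specular light is directional: it comes from one particular direction. The difference is that specular light reflects off the surface in a sharp and uniform way. How specular light is rendered depends on the angle between the viewer and the light source. From the viewer's standpoint, specular light creates a highlighted area on the surface of the object, known as a specular highlight or specular reflection. The intensity of the specular reflection depends on the material the object is made of and on the strength of the light source that contains the specular component.

Emissive light

Emissive light is a little different from the other light components explained above. The emissive component is responsible for the property of an object's material to reflect or absorb light. When applied to an object's material, emissive light simulates light that is reflected off the object.

With no other light sources around, an object to which only an emissive component is applied has the same visual quality as an object lit only by ambient light. However, the way any additional diffuse or specular light reacts with the surface of that object is different. Consider an object that emits an average amount of green. In the figure below, the emissive component is applied to the sphere. The result is similar to the effect created by applying ambient light to the same sphere in the example above.

Figure: a 3D sphere reflecting green emissive light. Until other light sources are introduced into the scene, the effect is similar to ambient light.

As you know, a light source can have all three components assigned to it: ambient, diffuse and specular. Let us see what happens when we apply a light source to the scene above. The light source has the following properties: red ambient light, red diffuse light and white specular light.

If the sphere above were not emitting green light, it would appear red. However, a green emissive component is applied to it. When the "rays" of the light source hit the sphere's surface, the "source" and "destination" colors merge, producing a yellowish surface. The specular component of the light source is white. The center of the specular reflection is white, but as it spreads out it blends with the green and red, shifting toward yellow (that is, green plus red). Again, note that without the emissive light applied, the sphere would look like the one shown in the specular section above: entirely red with a white specular highlight.

The following part of this tutorial covers how OpenGL shades polygons to simulate light, and how light-source properties are assigned to lights and materials.

Light sources

To compute the shading of a 3D object, CHAI3D needs to know the intensity, direction and color of the light that falls on it. These properties are provided by light objects in the world. The base color and intensity are set in the same way for all lights, but the direction depends on the type of light you use. Light may also attenuate with distance from the source. The three types of light source available in CHAI3D are described below.

Figure: CHAI3D light sources.

Positional light

A positional light is located at a point in space and emits light equally in all directions. The direction of light hitting a surface is the line from the point of contact back to the center of the light object. Intensity decreases with distance from the light, reaching zero at a specified range. Point lights are useful for simulating lamps and other local light sources in a scene.

    using namespace chai3d;

    // create a light source
    light = new cPositionalLight(world);

    // attach the light to the world
    world->addChild(light);

    // enable the light source
    light->setEnabled(true);

    // position the light source
    light->setLocalPos(1.0, 1.0, 0.5);

Directional light

A directional light has no identifiable source position, so the light object can be placed anywhere in the world. All objects in the world are lit as if the light always came from the same direction. The distance from the light to the target object is undefined, so the light does not attenuate.

Directional lights represent large, distant sources located outside the range of the active workspace. In a realistic scene they can be used to simulate the sun or the moon. In an abstract simulated world, they are a useful way to add convincing shading to objects without specifying exactly where the light comes from. When inspecting an object in the scene view (for example, to check the look of its mesh, shader and material), a directional light is usually the quickest way to see how its shading appears. For such a test you are usually not interested in where the light comes from; you just want to see the object look "solid" and spot glitches in the model. ...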
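The way the emissive, ambient, diffuse and specular components described above add up at a single surface point can be sketched as follows. This is a minimal illustration of a classic Phong-style sum, not the actual code of OpenGL or CHAI3D; the function name `shade`, the dictionary layout for lights and materials, and the `shininess` value are all assumptions made for this example.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade(normal, to_light, to_viewer, light, material, shininess=32.0):
    """Return an RGB color: emissive + ambient + diffuse + specular, clamped to 1.

    `light` and `material` are hypothetical dicts of RGB tuples,
    not a real OpenGL or CHAI3D data structure.
    """
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_dot_l = max(dot(n, l), 0.0)  # diffuse term depends only on the light angle
    # Mirror the light direction about the normal: r = 2(n.l)n - l
    r = tuple(2.0 * dot(n, l) * nc - lc for nc, lc in zip(n, l))
    # Specular term depends on the viewer; no highlight on surfaces facing away
    spec = max(dot(r, v), 0.0) ** shininess if n_dot_l > 0.0 else 0.0
    color = []
    for i in range(3):
        c = (material["emissive"][i]
             + light["ambient"][i] * material["ambient"][i]
             + light["diffuse"][i] * material["diffuse"][i] * n_dot_l
             + light["specular"][i] * material["specular"][i] * spec)
        color.append(min(c, 1.0))
    return tuple(color)


# Red ambient/diffuse light with a white specular component,
# shining on a sphere whose material emits green.
light = {"ambient": (0.2, 0.0, 0.0), "diffuse": (1.0, 0.0, 0.0), "specular": (1.0, 1.0, 1.0)}
material = {"emissive": (0.0, 0.5, 0.0), "ambient": (1.0, 1.0, 1.0),
            "diffuse": (1.0, 1.0, 1.0), "specular": (1.0, 1.0, 1.0)}

highlight = shade((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0), light, material)
side = shade((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0), light, material)
unlit = shade((0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (0.0, 0.0, 1.0), light, material)
```

In this sketch the center of the highlight saturates to white, the obliquely lit point mixes red and green into a yellowish tone, and the point facing away from the light keeps only the emissive and ambient contributions.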

March 14, 2023 · 2 min · jiezi

On algorithms: As AI represented by ChatGPT merges with industry, will AI services be a growth point for new forms of consumption?

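With the rapid development of artificial intelligence, AI services have started to become an emerging business in every industry. In consumer markets especially, AI services have gradually entered people's daily lives through smart homes, intelligent customer service, smart shopping assistants and so on. It is therefore fair to say that AI services have become one of the important growth points for new forms of consumption.

AI services can improve the shopping experience. Through intelligent recommendation, intelligent customer service and similar technologies, consumers can find the products and services they need more quickly and with fewer tedious steps. AI can use a consumer's purchase history, browsing history and preferences to recommend similar products or services, helping the consumer find the right item quickly. Such recommendations save consumers a great deal of time and effort.

AI can power intelligent customer-service systems. Using natural language processing and machine learning, these systems recognise and handle consumers' questions and needs and provide timely, accurate answers. They are not limited by time or place and can help consumers anytime, anywhere.

AI can provide virtual fitting rooms. Using computer vision and deep learning, a consumer's live image is matched against products to help them quickly choose a suitable size and style. This reduces the time and frustration of trying clothes on in a physical store.

AI services can raise the quality of products and the level of service. AI can analyse consumers' needs and preferences and offer more personalised products and services, improving satisfaction and loyalty. AI services can personalise what they offer through the following techniques:

(1) Machine learning: machine learning algorithms can predict consumers' needs and preferences. For example, classification and clustering algorithms can segment consumers into groups in order to understand the needs and preferences of each group.
(2) Natural language processing: NLP can analyse text such as consumer reviews and ratings to learn about satisfaction, likes and dislikes.
(3) Visual recognition: computer vision can analyse images of consumers to learn about appearance, clothing and similar attributes, and so provide more personalised recommendations and services.

AI services can also lower a company's operating costs and risk. With intelligent customer service and intelligent assistants, companies can reduce labour costs, improve service efficiency and lower operating risk, which improves competitiveness and profitability. In short, I believe AI services will play an important role in future consumption. AI still has shortcomings and faces challenges in some areas, but as the technology develops and matures, the prospects for AI services are very broad, and they will play an increasingly important role in the consumer market.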

March 13, 2023 · 1 min · jiezi

On algorithms: ECON0019 Economic Analysis

ECON0019: QUANTITATIVE ECONOMICS AND ECONOMETRICS
EMPIRICAL PROJECT 2023

Instructions

The mark for the empirical project is worth 20% of your total mark for the module.

Please follow these instructions so that we can ensure anonymity in marking and ensure compliance with UCL assessment policies. We will only be able to give you credit for your project if you follow these instructions. If the instructions are not followed, you will receive a mark of zero.

• Please elect one group member to submit the project for the group.
• All answers must be uploaded via Turnitin by 12pm on March 27, 2023.
• All marking on Turnitin is anonymised. Do not put your name or student number or group name anywhere on your submitted answer, either in the document or in the file name.
• Put the candidate numbers for all group members at the top of the first page. Candidate numbers are NOT student numbers! Use the candidate number from this year; it is not the same as last year.
• You should submit one PDF or Word document that includes: your answers and explanations in the main text (including tables and figures, if any), as well as an appendix with your code producing these results. If you use software other than Stata, you should state which programme was used. You may optionally include raw statistical output (e.g. Stata log-file) after the code, but such output does NOT substitute for your answers and explanations.
• Your answers should be no more than 800 words, including footnotes but excluding tables, figures, the code appendix, and the raw statistical output. State the number of words at the top of the first page of your submission.
• Submissions will be checked for plagiarism. By submitting this assessment, you pledge your honour that you have not violated UCL's Assessment Regulations which are detailed in https://www.ucl.ac.uk/academic-manual/chapters/chapter-6-student-casework-framework/section-9-student-academic-misconduct-procedure, which include (but are not limited to) plagiarism, self-plagiarism, unauthorised collaboration between students, sharing my assessment with another student or third party, accessing another student's assessment, falsification, contract cheating, and falsification of extenuating circumstances.
• Please make sure to allow sufficient time should problems arise with Turnitin. Check the submission inbox for confirmation that your essay has been submitted. Once your submission has been accepted you will return to the 'My Submissions' tab where you will be able to see the details of your submission. If your submission is not confirmed for some reason, or you are having issues uploading the document, get in touch with ISD (servicedesk@ucl.ac.uk) as soon as possible to figure out what the problem might be.

You will be awarded a mark of 0% or Grade F if you (1) do not attempt the summative assessment component or (2) attempt so little of the summative assessment component that it cannot be assessed. Please check the UCL Academic Manual (Section 3.11) for information on the consequences of not submitting or engaging with any of your assessment components.

If you have extenuating circumstances that affect your ability to engage with any of the module assessment components, please apply for alternative arrangements to the Economics Department as soon as possible. See details in Section 6 of the Academic Manual and send your request to economics.ug@ucl.ac.uk.

If you have a disability or long-term medical condition, you may be entitled to adjustments for assessments. This may include an extension for this essay. Please see Section 5 of the Academic Manual for information on how to apply for adjustments. Contact the Departmental Tutor, Dr Frank Witte (f.witte@ucl.ac.uk) and the UG Admin team (economics.ug@ucl.ac.uk). Do not contact the course lecturers about this.

QUESTION:

In "Testing for Imperfect Competition at the Fulton Fish Market" (RAND Journal of Economics, 1995) and later work, Kathryn Graddy studies demand and competition in the main market for whiting (a type of fish) in New York City in 1992. The author spent a lot of time at the market and hand-collected daily observations on the quantity sold and the average price, as well as the quantities and prices separately for Asian and white buyers. The data include 97 daily observations.

The Stata data file FISH.dta contains observations on the variables of interest. Specifically:

• t: day of observation, excluding weekends (and running from 1 to 100, with three days excluded because the data are missing);
• totqty, ltotqty: total quantity sold (to Asian + white buyers) and its log;
• avgprc, ltotprc: average per-unit price and its log;
• mon, tues, wed, thurs: indicator variables for the day of the week of the observation (with Friday as an omitted category);
• wave2: average max wave height at sea over last 2 days, measured in feet;
• wave3: lagged average wave height (two days prior to those in wave2);
• prca: price paid by Asian buyers;
• prcw: price paid by white buyers.

The Stata data file FISH panel.dta is a panel version of FISH.dta, with separate observations for each day for both Asian and white buyers (with 2 × 97 observations in total):

• t: day of observation;
• asian: indicator equal to 1 if observation is for Asian buyers, 0 for white buyers;
• lprc: log of price for given group of buyers;
• lqty: log of quantity for given group of buyers;
• mon, tues, wed, thurs, wave2, wave3: described above.

Some Stata hints:

• Command test allows you to compute F-statistics and perform two-sided tests on (single or multiple) coefficients or their linear combinations.
• Type help command to get more details on how a particular command works, e.g. help test.
• Type gen varname = f(x) to generate a new variable equal to the function f(x).
• Type tsline varname1 varname2 ... to plot the time series of the selected variables. Before running, type tsset varname to use varname as the time variable.
• Type predict varname after a regression to generate predicted values and name them varname.
• Type predict varname, resid after a regression to generate predicted residuals and name them varname.
• L.varname is the first lag of varname.
• ac varname plots the autocorrelation function of varname.
• c(pi) is the π = 3.14... constant.
• To compute Newey-West standard errors with the ivreg2 command, replace the r for robust in the syntax with bw(auto).

Answer the following questions:

1. Run a regression to test whether log total quantity depends on the day of the week. (Allow for heteroskedasticity in all of your analyses, and assume for now that there is no serial correlation in the errors.) Report the F-statistic and p-value testing the null hypothesis that the log total quantity is the same on all days of the week, on average. What do you conclude? Describe any seasonal pattern you find.

2. Recall that another way to account for seasonality is to use trigonometric functions. Generate two new deterministic season variables as a function of time, t, with a weekly (i.e., 5-day) frequency: Regress log total quantity on these two variables (and a constant). Compute the estimated seasonal "trend" in this regression and that in the regression of question 1 and plot them together. What do you conclude about the two approaches?

3. Estimate an OLS regression of log total quantity on log average price, controlling for day-of-the-week dummies. (Keep using these controls in all regressions below.) Report the slope coefficient with 3 significant digits. Under which (strong) condition is this estimate consistent for the demand elasticity?

To deal with simultaneity of demand and supply, Graddy uses instrumental variables which measure the conditions at sea. Specifically, she uses lagged wave heights (wave2 and wave3).[1] Waves above 4.5 feet make fishing more difficult.

[1] She also used lagged wind speeds (speed2 and speed3) but we won't.

4. Estimate the demand elasticity, using wave2 as a single excluded instrument. Report the elasticity estimate and its standard error with 3 significant digits. Test whether the instrument is strong; report which test statistic you used, which value it takes, and which critical value you are comparing it to. Provide an argument for the exogeneity of this instrument.

5. Looking for stronger instruments for log price, you recall that waves are supposed to be bad for fishing only when they exceed 4.5 feet. You therefore conjecture that a dummy wave2high, indicating that wave2 > 4.5, may better predict log price than wave2 itself. Test this conjecture in the data. Should one use wave2high as an additional instrument when estimating the demand elasticity? (You need to generate the wave2high dummy.)

6. To estimate the inverse demand elasticity, swap log price and log quantity variables in your IV regression from question 4. Report the inverse demand elasticity estimate and its standard error with 3 significant digits. Relate the estimate to the IV estimate of demand elasticity. Which concerns may you have about this estimate, relative to the one in question 4?

7. Coming back to the demand elasticity in question 4, use both wave2 and wave3 as instruments for price. Report the elasticity and its standard error with 3 significant digits. Test the exogeneity of the two instruments; report which test statistic you used, which value it takes, and how you make the conclusion.

8. Are bad weather conditions persistent? Estimate a probit regression of the indicator variable wave2high from question 5 on its first lag (with the standard controls). What is the estimated coefficient and its statistical significance? What is the average partial effect of wave2high yesterday on the probability that wave2high = 1 today? Explain the intuition for your finding.

9. We have so far assumed that heteroskedasticity-robust standard errors were valid, implicitly assuming no autocorrelation in the errors. To assess this assumption, first generate residuals from the model you estimated in question 4. Plot the autocorrelation function for the residuals. What do you observe? Test whether the errors are serially correlated in an AR(1) model. Report an appropriate test statistic and p-value. For this question, you can assume strict exogeneity.

10. Re-estimate the model in question 4 with heteroskedasticity and autocorrelation robust standard errors (using the default Newey-West bandwidth). How does the estimated elasticity compare to that in question 4? How does the p-value compare?

11. How much does the mean (non-logged) price paid by Asian and white buyers differ? Compute the means of prca and prcw and interpret their difference. Now load the panel version of the dataset, FISH panel. Rerun the 2SLS regression in question 7 adding the ethnicity indicator as an exogenous regressor and the interaction of asian with lprc as a second endogenous regressor. You should also interact the instruments with asian to allow the first-stage coefficients to differ by ethnicity. Cluster standard errors at the day level. Is the price elasticity significantly different for Asian and white buyers?
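The single-instrument estimator that the IV questions above rely on can be sketched in a few lines. This is only an illustration on synthetic data, not the FISH dataset and not a substitute for Stata's ivreg2: the helper names `cov` and `iv_slope`, the data-generating coefficients and the seed are all assumptions, and the day-of-the-week controls are left out for brevity. With one endogenous regressor and one instrument, the IV slope is the ratio cov(z, y) / cov(z, x).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def iv_slope(y, x, z):
    """Just-identified IV estimate of the slope of y on x, using instrument z."""
    return cov(z, y) / cov(z, x)


# Synthetic stand-ins (hypothetical, NOT the real data):
# z plays the role of wave height, u is a demand shock, v is a supply shock.
random.seed(0)
n = 97
z = [random.gauss(0.0, 1.0) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]
v = [random.gauss(0.0, 1.0) for _ in range(n)]

# Log price reacts to the instrument and to the demand shock (simultaneity),
# so an OLS regression of log quantity on log price would be inconsistent.
log_price = [0.5 * zi + 0.4 * ui + 0.3 * vi for zi, ui, vi in zip(z, u, v)]
log_qty = [-1.0 * p + ui for p, ui in zip(log_price, u)]

elasticity = iv_slope(log_qty, log_price, z)          # quantity on price
inverse_elasticity = iv_slope(log_price, log_qty, z)  # price on quantity
```

Because both estimates are ratios of the same two covariances, the inverse-demand estimate in this just-identified sketch is exactly the reciprocal of the demand estimate. Their standard errors are a separate matter and are not computed here.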

March 11, 2023 · 8 min · jiezi

On algorithms: MATH4321 Game Theory

MATH4321 Game Theory (2023 Spring)
Assignment 1
Submission deadline of Assignment 1: 11:59 p.m. of 10th Mar, 2022 (Fri)

Instruction: Please complete all required problems. Full details (including (i) description of methods used and explanation, (ii) key formula and theorem used and (iii) calculation and final answer) must be shown clearly to receive full credits. Please make sure that your work is clearly presented. Marks can be deducted for incomplete solution or unclear solution. You may earn extra score by completing some bonus problems. Also, additional score will be given for well-written assignment.

Please submit your completed work via the submission system in Canvas before the deadline. Late assignment will not be accepted.

Your submission must be 100% hand-written (typed assignment will not be accepted; you may write on iPad if you wish) in a single PDF file (other file formats will not be accepted) and contain your full name and student ID on the first page of the assignment. ...

March 10, 2023 · 5 min · jiezi

On algorithms: CSCI-1200 Inverse Word Search Recursion

CSCI-1200 Data Structures, Spring 2023
Homework 6: Inverse Word Search Recursion

In this homework we will build an inverse word search program using the techniques of recursion. The goal is to construct a grid of letters that one can search to find specific words. Understanding the non-linear word search program from Lectures 12 & 13 will be helpful in thinking about how you will solve this problem. We strongly urge you to study and play with that program, including tracing through its behavior using a debugger or cout statements or both. Please read the entire handout before beginning your implementation.

Your Tasks

For this assignment, you will be given the dimensions (width and height) of a word search puzzle, a set of words that should appear in the grid (forwards, backwards, up, down, or along any diagonal), and optionally a set of words that should not appear anywhere in the grid. Each grid cell will be assigned one of the 26 lowercase letters. Note that unlike the non-linear word search problem we discussed in class, we will only allow words that appear in a straight line (including diagonals). Your task is to output all unique word search grids that satisfy the requirements. Rotations and mirroring of the board will be considered unique solutions.

Your program should expect three command line arguments, the name of the input file, the name of the output file, and a string:

    inverse_word_search.exe puzzle2.txt out2.txt one_solution
    inverse_word_search.exe puzzle2.txt out2.txt all_solutions

The third argument indicates whether the program should find all solutions, or just one solution. Here's an example of the input file format: ...
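The acceptance rule described above (every required word appears in a straight line in one of eight directions, and no forbidden word appears anywhere) can be sketched as a checker for a finished grid. This is only the validity test, not the recursive search the homework asks for, and it is written in Python rather than the course's C++; the names `word_in_grid` and `is_valid_grid` are hypothetical.

```python
# The eight straight-line directions: rows, columns and both diagonals,
# each walked forwards and backwards.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def word_in_grid(grid, word):
    """Return True if `word` appears in `grid` along any straight line."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in DIRECTIONS:
                # Skip starts from which the word would run off the grid.
                end_r = r + dr * (len(word) - 1)
                end_c = c + dc * (len(word) - 1)
                if not (0 <= end_r < rows and 0 <= end_c < cols):
                    continue
                if all(grid[r + dr * i][c + dc * i] == ch for i, ch in enumerate(word)):
                    return True
    return False


def is_valid_grid(grid, required, forbidden):
    """A grid is acceptable if every required word and no forbidden word appears."""
    return (all(word_in_grid(grid, w) for w in required)
            and not any(word_in_grid(grid, w) for w in forbidden))


grid = ["cat",
        "xox",
        "dog"]
ok = is_valid_grid(grid, required=["cat", "dog", "dot"], forbidden=["xyz"])
rejected = is_valid_grid(grid, required=["cat"], forbidden=["god"])
```

Here `dot` is found along the rising diagonal from the bottom-left corner, and the second call is rejected because `god` appears as `dog` read backwards. A recursive solver would typically place words one at a time and run a check like this on the forbidden words once the grid is full.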

March 10, 2023 · 6 min · jiezi

On algorithms: ACCT2019 Information Algorithm Study

ACCT2019 Management Accounting
Group Assignment
Semester 1, 2022

Instructions for Parts A & B

Scope: There are two parts in this assignment. Part A is a group assessment and Part B is an individual assessment. Part A requires students, as a group, to carry out an analysis of the case study (Gretzky Pty Ltd, described in this document) and submit an executive report in PowerPoint format. Part B requires each student to map the Gretzky Pty Ltd case study data in the SAP accounting system, complete several transactions and reports, and submit a document. This assignment requires students to demonstrate their:

i) Ability to identify and apply relevant management accounting concepts and techniques to practical business contexts and make recommendations with a focus on the usage of qualitative and quantitative information.
ii) Specialist SAP software skills by mapping the business scenario in SAP, determination of relevant master data and transactions, their creation and/or execution and producing relevant reports from the SAP accounting system. ...

March 9, 2023 · 17 min · jiezi

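On algorithms: Event recap | 2023 Meet TVM's first gathering in Shanghai: over a hundred engineers discuss the present and future of machine learning compilation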

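This article was first published on the WeChat official account HyperAI超神经.

Overview: "2023 Meet TVM · First Gathering of the Year" was held successfully in person in Shanghai. More than 100 participants from companies and universities came together to talk about the present and future of machine learning compilation.
Keywords: 2023 Meet TVM, offline event

On March 4, the 2023 Meet TVM offline meetup, hosted by the MLC.AI community and co-organized by the Shanghai Wujiaochang Innovation and Entrepreneurship Institute, HyperAI超神经 and OpenBayes贝式计算, was held successfully in Shanghai. More than 100 attendees from Shanghai, Hangzhou, Beijing and Nanjing gathered for lively face-to-face discussion.

On the day, Tianqi Chen, the main creator of TVM and a well-known young scholar in machine learning, also prepared an opening video in which he shared his assessment of trends in machine learning compilation and the roadmap for Apache TVM.

Some highlights follow:

Hello to everyone in the TVM Chinese community, I am Tianqi Chen. Thank you very much for joining this Meet TVM event in Shanghai, and thanks to the local organizers for their support.

Over the past few years, AI and AI deployment have changed enormously. Machine learning is no longer driven by algorithms alone: data, algorithms and the systems themselves all affect whether a machine learning system is deployed successfully.

Machine learning compilation has likewise gone from a field just beginning to be discussed at the frontier to one that is gradually entering public view.

The TVM community has worked in this direction for 5 years. We have always known that only by continually reinventing ourselves and drawing on past experience can we keep taking the whole field, including machine learning compilation and machine learning systems, to the next stage.

Starting last year, the TVM community made a very bold change and pushed forward the TVM Unity approach. The hope is to solve, at a fundamental level, problems such as dynamic shapes, deployment on varied hardware, and the integration of operator libraries with automatic machine learning optimization. We also want to make iterative development our primary goal, so that people working on optimization algorithms and optimization systems can keep iterating within a Python framework.

Last year we also launched the MLC online course to explain this material. This year we will progressively connect Unity end to end and apply it to real models. Everyone is welcome to join in building the community and to help take machine learning compilation and the TVM toolchain itself to the next stage.

Talk summaries and slides

On the day we invited 4 speakers to present on site. Talk topic: TVM and the development of machine learning compilation ...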

March 8, 2023 · 1 min · jiezi

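On algorithms: Two major upgrades to dynamic-to-static conversion: a leading one-click conversion success rate and 18% faster training on key models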

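Mainstream deep learning frameworks currently support two programming styles: dynamic graphs and static graphs. Dynamic graphs offer a more Pythonic programming experience and are easier to debug, but their performance lags somewhat behind static graphs. Static graphs build the network first and then execute it; because the full network structure is known in advance, global optimization is easier. They are harder to debug but run faster.

Baidu PaddlePaddle uses a unified dynamic-static architecture and provides a dynamic-to-static module (@to_static), which lets users program in dynamic-graph style and switch to static-graph training and deployment with one click. In November 2022, PaddlePaddle framework 2.4 (hereafter Paddle v2.4) was officially released. The dynamic-to-static "conversion success rate" and "training performance" were both upgraded across the board, giving users a new experience.

• The dynamic-to-static success rate improved markedly; the one-click conversion success rate reached 92.1%.
• Dynamic-to-static training is noticeably faster; training of key models can be sped up by 18% or more.

One-click conversion success rate significantly improved

The conversion success rate is an important metric of the dynamic-to-static feature and is closely tied to user experience. Paddle v2.4 focused its optimizations and upgrades on two areas: more complete dynamic-to-static syntax support, and unified dynamic/static API behaviour.

More complete dynamic-to-static syntax

• JIT-style dynamic execution: added JIT-mode interfaces such as Shape, Len, Attr, List, Unpack and Indexable, improving the robustness of syntax transcription.
• Control-flow refactoring: refactored the transcription logic for IF/For/While control flow, fully supporting difficult cases such as variable-name resolution in complex nested scenarios.
• Keyword optimization: optimized the transcription of keywords such as early return, break and continue in control flow, effectively reducing the redundant operators introduced into the static-graph intermediate representation and improving execution efficiency.

Unified dynamic/static API behaviour

• Variable attribute parameters: upgraded the parameters of more than 20 medium- and high-frequency dynamic-graph APIs. For example, the axis parameter of the Reduce family (paddle.mean/sum/max/min) now also accepts a Tensor, so it can vary dynamically.
• Unified interfaces: filled in the Tensor interfaces that were missing under static graphs and upgraded high-frequency APIs such as paddle.to_tensor and paddle.grad to support static-graph calls.
• Einsum upgrade: implemented a unified dynamic/static Einstein summation operator that supports binary and multi-operand Python inputs and is shared by training and inference.

Figure: dynamic-to-static success rate and syntax coverage.

Under Paddle v2.4, dynamic-to-static conversion has richer syntax support. Python syntax coverage reached 90%, and on a set of more than 80 real paper-reproduction models from external users, the one-click conversion success rate rose to 92.1%. Both completeness and ease of use improved markedly.

Below is sample code that exports an inference model via dynamic-to-static conversion:

    import paddle

    class SimpleNet(paddle.nn.Layer):
        def __init__(self):
            super(SimpleNet, self).__init__()
            self.linear = paddle.nn.Linear(10, 3)

        @paddle.jit.to_static   # step 1: add the decorator
        def forward(self, x):
            out = self.linear(x)
            out = out + 1
            return out

    net = SimpleNet()
    train(net)  # the training loop is omitted here

    # step 2: switch to eval() mode
    net.eval()

    # step 3: call the jit.save interface
    paddle.jit.save(net, path='./simple_net')

After running the sample above, three files are generated in the current directory, which indicates that the inference model was exported successfully:

    simple_net.pdiparams        // stores all the weight data of the model
    simple_net.pdmodel          // stores the network structure of the model
    simple_net.pdiparams.info   // stores additional information

Exporting a model via dynamic-to-static conversion generally takes three steps:

• Add the decorator: apply the @to_static decorator to the forward function.
• Switch to eval mode: interfaces such as Dropout and LayerNorm behave quite differently in train() and eval() modes. Before exporting, make sure the model has been switched to the correct mode.
• Call the save interface: call paddle.jit.save to export the corresponding model file and parameter file.

For more on how to use @to_static, see "Further reading: dynamic-to-static usage examples".

Dynamic-to-static training is noticeably faster

In PaddlePaddle, dynamic-graph training is usually enough for most scenarios. Paddle v2.4 optimized the logic behind dynamic-to-static training, and for key models its performance now rivals static graphs. On models such as ResNet50, Transformer and YOLOv3, dynamic-to-static training is 18.6% to 21.5% faster than dynamic-graph training.

Figure: speedups on key models.

In the following scenario, developers can consider training with dynamic-to-static conversion to obtain a clear performance gain.

Scenario 1: scheduling-heavy models

These are models in which the GPU kernel behind each API takes little compute time and finishes soon after being launched from the CPU. Such workloads have the following characteristics:

• GPU utilization is low (check with watch -n 1 nvidia-smi).
• They are common in NLP and in AMP/FP16 tasks.
• The training bottleneck is mainly host-side scheduling overhead.

The figure above shows the timelines of a scheduling-heavy model in dynamic-graph mode and after dynamic-to-static conversion. It shows that:

• The training time of one batch depends on the total host-side time.
• In dynamic-graph mode, every Python API call triggers an interaction between Python and C++ at run time, which creates large scheduling overhead.
• After conversion, the whole step is split into two Python C API calls, one for the forward pass and one for the backward pass, which removes much of the scheduling overhead between APIs.
• The dynamic-to-static kernel executor has also been heavily optimized (for example with instruction caching), so kernel launches are more efficient than in pure dynamic-graph mode.

Users who want to keep their dynamic-graph training code only need to add the @to_static decorator to the forward function at the entry point of the network. No other code changes are needed to switch to dynamic-to-static training. The @to_static decorator converts all subLayers inside that function into one static subgraph and executes it. Below is sample code for dynamic-to-static training: ...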

March 7, 2023 · 1 min · jiezi

On algorithms: ESTR2102B Data Structures and Algorithms

CSCI2100F/ESTR2102B Data Structures (Spring 2022)
Lab Assignment #5
Schedule: Week 8

• To become familiar with implementations of general trees
• To become familiar with tree terminology

Submission guideline (Online judge)

• Name your program(s) according to the question (e.g. lab5-q1.c)
• Write your programs under "Desktop/mywork/lab5"
• Duplicate your programs before submission
• Submit your programs under "Desktop/submission/lab5"
• Check the submission folder for your grading report (Be patient, auto grading needs some time. You can press F5 to refresh the browser.)
• Resubmit your work if you do not receive full score
• Visit "Other Locations/Computer/2100share/lab4" in the VM to download the starting programs
• Version of the gcc compiler in the online judge: C99
• Deadline: 7th March, 2023 (Tuesday)
• Late penalty is 30%, and the judge for lab 5 is closed on 14th March, 2023 (Tuesday).
• There will be 5 late days in total for all 10 labs. The late days will be used automatically if your submission is beyond the deadline and achieves a better score.

Grading scheme (CSCI2100F)

• Total score is 100%
• The weighting of each question is marked on the question
• Bonus score is at most 13%
• Bonus score: Extra question (13%)
  ○ Mark given according to the number of correct answers of the question (13%)

Requirements for ESTR2102B: Non-zero scores for the extra question

Useful commands ...

March 6, 2023 · 9 min · jiezi

On algorithms: COMM5030 Social Entrepreneurship Practicum

COMM5030 Social Entrepreneurship Practicum – Project Brief 1

Business name: Coolamon
Business website: Coolamon.org.au

Business overview: The Coolamon is a platform to elevate and build First People small to medium enterprises.

The objectives of the Coolamon are to:

a) Establish clear pathways with milestones for First People Enterprises to be sustainable and self-determined.
b) The Coolamon is a platform that connects First People Enterprises with Non-Indigenous Enterprises to fulfil procurement and/or RAP requirements.
c) The Coolamon provides necessary evaluation, training and mapping for the First People Enterprise to reach the Non-Indigenous Enterprise supply chain requirements.
d) The Coolamon identifies a First Peoples Community Development Framework built from Indigenous Leaders based on the UN Declaration of Indigenous Peoples to maintain Cultural Integrity objectives as they relate to business and Government objectives.

The Coolamon was formed recently (December 2021) as an Enterprise of Annecto, an Aged and Disability Provider ...

March 6, 2023 · 2 min · jiezi

On algorithms: Cross Compilation and RPC

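This article is translated from the English document Cross Compilation and RPC, by Ziheng Jiang and Lianmin Zheng. More TVM documentation in Chinese is available at → TVM 中文站.

This tutorial shows how to use RPC in TVM for cross compilation and remote device execution.

With cross compilation and RPC, you can compile a program on your local machine and run it on a remote device. This is especially useful when the remote device has limited resources, as on a Raspberry Pi or a mobile platform. This tutorial uses a Raspberry Pi as the CPU example and a Firefly-RK3399 as the OpenCL example.

Build the TVM runtime on the device

First, build the TVM runtime on the remote device.

Note: all commands in this section and the next should be run on the target device (for example, the Raspberry Pi). We assume the target device runs Linux.

Because the local machine only compiles, and the remote device runs the generated code, only the TVM runtime needs to be built on the remote device.

    git clone --recursive https://github.com/apache/tvm tvm
    cd tvm
    make runtime -j2

After the runtime builds successfully, set the environment variable in your ~/.bashrc file. Edit it with vi ~/.bashrc and add the following line (assuming the TVM directory is ~/tvm):

    export PYTHONPATH=$PYTHONPATH:~/tvm/python

Run source ~/.bashrc to update the environment variables.

Set up the RPC server on the device

Run the following command on the remote device (the Raspberry Pi in this example) to start the RPC server:

    python -m tvm.exec.rpc_server --host 0.0.0.0 --port=9090

If you see the line below, the RPC server has started successfully. ...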

March 6, 2023 · 2 min · jiezi

On algorithms: COMP26020 Solution Methods

COMP26020 - Lab exercise for Part III (Compilers)
Register Allocation using Graph Colouring

Background

Computer programs, regardless of the programming language, often use many more variables than the number of variables that can fit in all CPU registers. When a program is compiled for execution on a given processor, the compiler needs to consider what variables will stay in registers and for how long. If we think that moving data from the memory takes several cycles, there is a performance benefit if the compiler can minimise such transfers. How to do this? By doing some 'clever' register allocation, for example, by making sure that the most frequently used variables are placed in registers.

To understand the problem, consider the following piece of code: ...
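The idea named in the title, register allocation by graph colouring, can be sketched as follows. This is a simple greedy heuristic under stated assumptions, not the algorithm the lab specifies (that part of the handout is truncated): variables are described by hypothetical live ranges, two variables interfere when their ranges overlap, and the names `build_interference` and `colour` are made up for this example.

```python
def build_interference(live_ranges):
    """Build an interference graph from {variable: (first_use, last_use)}.

    Two variables interfere (need different registers) when their
    inclusive live ranges overlap.
    """
    graph = {v: set() for v in live_ranges}
    names = list(live_ranges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (start_a, end_a), (start_b, end_b) = live_ranges[a], live_ranges[b]
            if start_a <= end_b and start_b <= end_a:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def colour(graph, num_registers):
    """Greedy colouring: visit high-degree variables first.

    Returns (assignment, spilled): a register index per coloured variable,
    and the variables that found no free register and must live in memory.
    """
    assignment, spilled = {}, []
    for v in sorted(graph, key=lambda name: (-len(graph[name]), name)):
        used = {assignment[n] for n in graph[v] if n in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[v] = free[0]
        else:
            spilled.append(v)
    return assignment, spilled


live_ranges = {"a": (1, 4), "b": (2, 6), "c": (5, 8), "d": (7, 9)}
graph = build_interference(live_ranges)
two_regs, spilled_two = colour(graph, num_registers=2)
one_reg, spilled_one = colour(graph, num_registers=1)
```

With two registers every variable in this example gets a register, because the interference graph is a simple chain a-b-c-d. With a single register, some variables have to be spilled. A production allocator would also weigh how often each variable is used when choosing what to spill, which is the "most frequently used variables" point made above.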

March 5, 2023 · 6 min · jiezi

On algorithms: COMP3223 Description

Assignment title: Coursework 1
Assignment type and description: Coursework assignment
Rationale: Learning the mathematical basis of symmetric cryptosystems
Weighting: 20% of total mark
Submission deadline: March 6th 2023 at 09:00
Submission method: Turnitin submission through Minerva
Feedback provision: Feedback provided on Minerva
Learning outcomes assessed: (i) Understand and apply in practice the fundamental principles of cryptography and information security. (ii) Analyse and evaluate the strengths and weaknesses of cryptosystems. (iii) Apply mathematical analysis to understand how symmetric cryptosystems are constructed. ...

March 4, 2023 · 4 min · jiezi

On algorithms: MA117 Scientific Computing Methods and Techniques

MA117 Programming for Scientists: Project 3
Deadline: 12pm, Friday 6th May 2022

MA117 Project 3: Determinants of Matrices

Administrative Details

• This project is the third of the three assignments required for the assessment in this course. It is to be submitted by 12pm, Friday 6th May 2022. Details of the method of the submission via the Tabula system have been described in the lecture notes and are also available on the course web page.
• This assignment will count for 40% of your total grade in the course.
• The automated submission system requires that you closely follow instructions about the format of certain files; failure to do so will result in the severe loss of points in this assessment.
• You may work on the assignment during the lab session, provided you have completed the other tasks that have been set. You can always use the work areas when they are not booked for teaching, 7 days per week. If you are working on the assignment on your home system, you are advised to make regular back-up copies (for example by transferring the files to the University systems). You should note that no allowance will be made for domestic disasters involving your own computer system. You should make sure well ahead of the deadline that you are able to transfer all necessary files to the University system and that it works there as well.
• The Tabula system will be open for the submission of this assignment starting from 14th March 2022. You will not be able to test your code for correctness using Tabula, but you can resubmit your work several times, until the deadline, if you find a mistake after your submission. A later submission always replaces the older one, but you must re-submit all files.
• Remember that all work you submit should be your own work. Do not be tempted to copy work; this assignment is not meant to be a team exercise. There are both human and automated techniques to detect pieces of the code which have been copied from others. If you are stuck, then ask for assistance in the lab sessions. TAs will not complete the exercise for you, but they will help if you do not understand the problem, are confused by an error message, need advice on how to debug the code, require further explanation of a feature of Java or similar matters.
• If you have more general or administrative problems e-mail me immediately. Always include the course number (MA117) in the subject of your e-mail.

1 Formulation of the Problem

Matrices are one of the most important mathematical concepts to be modelled by computer, being used in many problems from solving simple linear systems to modelling complex partial differential equations. Whilst a matrix (in our formulation) is simply an element of the vector space ℝ^(m×n), it usually possesses some structure which we can exploit to gain computational speed. For example, a matrix-matrix multiplication generally requires of the order of floating-point operations. If the matrix has some special structure which we can exploit using a clever method, then we might be able to reduce this to operations. For large values of, this significantly improves the performance of our code.

In this project, you will write two classes representing matrices of the form: Also, the tri-diagonal matrices you need to represent will always be square.

In a similar fashion to Fraction, you will then write functions to perform various matrix operations: ...
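The point about exploiting structure can be illustrated with the determinant of a tri-diagonal matrix, which the project title and text both mention. The sketch below is not the project's required Java class design (that part of the brief is truncated); it is a Python illustration using the standard three-term recurrence f_k = d_k f_(k-1) - l_(k-1) u_(k-1) f_(k-2), which needs only a number of operations proportional to n. Storing the matrix as three lists, and the function name `tridiagonal_det`, are assumptions made for this example.

```python
def tridiagonal_det(lower, diag, upper):
    """Determinant of a square tri-diagonal matrix in O(n) operations.

    diag  : the n main-diagonal entries
    lower : the n-1 entries just below the diagonal
    upper : the n-1 entries just above the diagonal
    """
    n = len(diag)
    if n == 0 or len(lower) != n - 1 or len(upper) != n - 1:
        raise ValueError("expected n diagonal entries and n-1 off-diagonal entries")
    # prev and curr hold the determinants of the leading (k-1)x(k-1)
    # and kxk blocks; the empty block has determinant 1.
    prev, curr = 1.0, float(diag[0])
    for k in range(1, n):
        prev, curr = curr, diag[k] * curr - lower[k - 1] * upper[k - 1] * prev
    return curr


# 3x3 example:  [ 2 -1  0 ]
#               [-1  2 -1 ]
#               [ 0 -1  2 ]
det3 = tridiagonal_det(lower=[-1, -1], diag=[2, 2, 2], upper=[-1, -1])

# 2x2 check against ad - bc for [[1, 2], [3, 4]]
det2 = tridiagonal_det(lower=[3], diag=[1, 4], upper=[2])
```

A general dense determinant computed by elimination needs work that grows with the cube of n, so the saving from using only the three diagonals becomes large as the matrix grows.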

March 4, 2023 · 12 min · jiezi

On algorithms: Binary trees

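Level-order traversal

The core code for this type of problem in 剑指 Offer:

    while (!queue.isEmpty()) {
        // Capture the size before the loop: it is the number of nodes on the
        // current level, and it makes clear that the loop walks one level.
        int size = queue.size();
        List<Integer> level = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            // queue must be a Queue; think about why a List will not do
            TreeNode node = queue.poll();
            level.add(node.val);
            if (node.left != null) {
                queue.add(node.left);
            }
            if (node.right != null) {
                queue.add(node.right);
            }
        }
        res.add(level); // res is a variable defined outside the loop
    }

It is worth mentioning that if the method must return an int[], a List<Integer> cannot be converted with toArray(new int[list.size()]), because toArray only works with object arrays. Use list.stream().mapToInt(Integer::intValue).toArray() instead.

Subtrees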

March 3, 2023 · 1 min · jiezi

On algorithms: 3. Training-data annotation guide based on Label Studio: text classification tasks

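Text classification tasks: Label Studio user guide

1. Training-data annotation guide based on Label Studio: information extraction (entity and relation extraction), text classification, and more
2. Training-data annotation guide based on Label Studio: (intelligent document) document extraction tasks, annotation for PDF, table and image extraction
3. Training-data annotation guide based on Label Studio: text classification tasks
4. Training-data annotation guide based on Label Studio: sentiment analysis tasks, opinion word extraction and attribute extraction

Contents

1. Installation
2. Annotating text classification tasks
   2.1 Project creation
   2.2 Data upload
   2.3 Label construction
   2.4 Task annotation
   2.5 Data export
   2.6 Data conversion
   2.7 More configuration

1. Installation

The annotation examples below use the following environment:

• Python 3.8+
• label-studio == 1.7.1

Install label-studio with pip in a terminal:

    pip install label-studio==1.7.1

After installation, run the following command:

    label-studio start

Open http://localhost:8080/ in a browser, enter your username and password to log in, and start annotating with label-studio.

2. Annotating text classification tasks

2.1 Project creation

Click Create to start a new project, fill in the project name and description, then choose Text Classification under Labeling Setup.

• Fill in the project name and description
• Data upload: upload a txt file from your local disk, choose List of tasks, then import it into this project
• Set up the task and add labels

2.2 Data upload

After the project has been created, you can click Import under Project/(the text classification project) to import more data. As before, upload a txt file from your local disk and choose List of tasks; see Project creation for details.

2.3 Label construction

After the project has been created, you can continue configuring labels under Setting/Labeling Interface; see Project creation for details.

2.4 Task annotation

2.5 Data export

Tick the IDs of the annotated texts, choose JSON as the export file type, and export the data.

2.6 Data conversion

Rename the exported file to label_studio.json and put it in the ./data directory. The label_studio.py script converts it into the data format used by UTC.

The conversion step also needs the candidate labels, stored in ./data/label.txt with one label per line. For example, in medical intent classification the candidate labels are ["diagnosis", "treatment plan", "cause analysis", "indicator interpretation", "medical advice", "disease description", "consequence description", "precautions", "efficacy", "medical costs", "other"]. They can also be configured directly through the options parameter.

    python label_studio.py \
        --label_studio_file ./data/label_studio.json \
        --save_dir ./data \
        --splits 0.8 0.1 0.1 \
        --options ./data/label.txt

2.7 More configuration

• label_studio_file: the annotation file exported from Label Studio.
• save_dir: the directory in which training data is saved; defaults to the data directory.
• splits: the proportions of the training and validation sets when splitting the dataset. The default [0.8, 0.1, 0.1] splits the data into training, validation and test sets at a ratio of 8:1:1.
• options: the class labels of the classification task. If a file is given, it contains one label per line.
• is_shuffle: whether to shuffle the dataset randomly; defaults to True.
• seed: the random seed; defaults to 1000.

Note: ...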
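How the splits, is_shuffle and seed options described above interact can be sketched as follows. This is not the code of label_studio.py, whose internals are not shown in this guide; it is a minimal illustration of a seeded shuffle followed by a ratio split, and the function name `split_dataset` and its rounding rule are assumptions.

```python
import random


def split_dataset(examples, splits=(0.8, 0.1, 0.1), is_shuffle=True, seed=1000):
    """Split examples into (train, dev, test) lists by the given ratios.

    A fixed seed makes the shuffle, and therefore the split, reproducible.
    """
    if len(splits) != 3 or abs(sum(splits) - 1.0) > 1e-9:
        raise ValueError("splits must be three ratios that sum to 1")
    items = list(examples)
    if is_shuffle:
        # A private Random instance avoids touching the global random state.
        random.Random(seed).shuffle(items)
    n_train = round(len(items) * splits[0])
    n_dev = round(len(items) * splits[1])
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # whatever remains goes to the test set
    return train, dev, test


examples = [f"text-{i}" for i in range(20)]
train, dev, test = split_dataset(examples)
train_again, _, _ = split_dataset(examples)
ordered_train, _, _ = split_dataset(examples, is_shuffle=False)
```

With 20 examples and the default ratios this yields 16, 2 and 2 items. Running it twice with the same seed gives the same split, which is the usual reason for exposing a seed option.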

March 2, 2023 · 1 min · jiezi

On algorithms: PHS2061 Quantum Mechanics

School of Physics and Astronomy
PHS2061 Quantum Mechanics - Assignment 2

Question 1

A particle with energy E is incident on a step potential as shown in Fig. 1. The wavefunction in the two regions is given by:

Calculate the reflection coefficient R. A numerical value is required for R.

Figure 1: A particle of energy E > 0 incident on a step potential.

Question 2

A particle of mass m and energy is incident on a Pöschl-Teller potential (see Fig. 2) defined by:

where is a real constant and the hyperbolic secant function is

Figure 2: A particle of energy E > 0 is incident on a Pöschl-Teller potential from the far left ...

March 2, 2023 · 3 min · jiezi

On algorithms: STAT 337 Analysis of Difficult Points

STAT 337 ASSIGNMENT 2
Due: 5:00pm EDT Thursday, June 16, 2022

Notes for Submission: Upload your assignment directly to Crowdmark via the link you receive by email. It is your responsibility to make sure your solution to each question is submitted in the correct section, that the pages are rotated correctly, and that everything is legible. Typed solutions are preferred.

Notes on the use of statistical software: Unless specifically told otherwise, you are free to do your calculations using any software you like (SAS, R, Excel, etc) but your solutions should clearly explain the steps you used in the computation, showing intermediate calculations when necessary, and give the formulas that you used. Any code and output created should also be submitted. ...

March 1, 2023 · 9 min · jiezi