关于chatgpt:Prompt-learning-教学基础篇prompt基本原则以及使用场景技巧助力你更好使用chatgpt

5次阅读

共计 23304 个字符,预计需要花费 59 分钟才能阅读完成。

Prompt learning 教学[根底篇]:prompt 根本准则以及应用场景技巧助力你更好应用 chatgpt,失去你想要的答案

  • 如果你想零碎学习

    • 如果你对 AI 和 Prompt Engineering 不是很理解,甚至连 ChatGPT 也不是很理解,那我倡议你从根底篇开始读起。根底篇更多的会从用户的角度教你如何应用 AI 产品,或者换句话说,会讲更多 prompt 的内容。
    • 如果你曾经理解根本的用法,并且想要学习如何更好地开发 AI 产品,想理解更多 prompt engineering 的内容。那能够间接跳去高级篇浏览。
  • 如果你只是想疾速学会应用

    • 如果你只是想疾速理解应用 AI 产品的技巧,你能够间接读技巧篇,那里汇总了所有应用技巧。
    • 如果你曾经理解如何应用了,但想理解更多应用场景,能够看看材料 & 产品举荐篇。

# 0.Dyno Prompt Engineering 介绍

Dyno Prompt Engineering IDE 是一款基于人工智能技术的集成开发环境(IDE),它能够帮忙开发者更疾速、更高效地进行软件开发。Dyno Prompt Engineering IDE 的次要特点包含以下几个方面:

  • 智能代码提醒:Dyno Prompt Engineering IDE 能够通过剖析代码语法和上下文,智能地提供代码提醒和主动补全性能,帮忙开发者更疾速地编写代码。
  • 智能谬误检测:Dyno Prompt Engineering IDE 能够通过剖析代码语法和逻辑,智能地检测代码中的谬误和潜在问题,并提供相应的修复倡议,帮忙开发者更疾速地调试代码。
  • 智能重构:Dyno Prompt Engineering IDE 能够通过剖析代码构造和依赖关系,智能地进行代码重构和优化,帮忙开发者更高效地改良代码品质和性能。
  • 智能集成:Dyno Prompt Engineering IDE 能够与其余开发工具和平台进行智能集成,例如版本控制系统、测试工具、部署平台等,帮忙开发者更高效地进行软件开发和治理。

Dyno Prompt Engineering IDE 是一款基于人工智能技术的集成开发环境,它能够帮忙开发者更疾速、更高效地进行软件开发,进步开发效率和代码品质。

Dyno:Prompt Engineering IDE

1.Prompt Engineering 简介

解释这个词之前,首先须要解释 prompt 这个词。简略的了解它是 给 AI 模型的指令。能够是一个问题、一段文字描述,甚至能够是带有一堆参数的文字描述。AI 模型会基于 prompt 所提供的信息,生成对应的文本,亦或者图片。比方,咱们在 ChatGPT 里输出 What is the capital of China? (中国的首都是什么?),这个问题就是 prompt。

而 Prompt Engineering(中文意思为提醒工程,后缩写为 PE)则是:

Prompt Engineering 是一种人工智能(AI)技术,它通过设计和改良 AI 的 prompt 来进步 AI 的体现。Prompt Engineering 的指标是创立高度无效和可控的 AI 零碎,使其可能精确、牢靠地执行特定工作。

看上去很难懂,我试着换个形式让你了解。你可能用过不少 AI 相干的产品,你或者会感觉如同只须要会谈话、会打字,就能让 AI 输入答案。如同不须要什么技术。确实,如果你只想让 AI 给你答案,你不须要额定做什么,只须要输出文字即可。但如果你想要失去称心的答案,甚至准确的答案。就须要用到 PE 这个技术。因为人类的语言从根本上说是不准确的,目前机器还没法很好地了解人类说的话,所以才会呈现 PE 这个技术。另外,受制于目前大语言模型 AI 的实现原理,局部逻辑运算问题,须要额定对 AI 进行提醒(这里你不须要深究起因,临时先晓得这是个问题即可)。

举个例子,如果咱们在 ChatGPT 里输出这样的一段话:

What is 100*100/400*56?

ChatGPT 会返回一个谬误的答案 0.4464(留神,如果你用下方的 Dyno 运行,答案应该也是谬误的,API 版本的答复是 14):但如果咱们对 prompt 进行一些批改,答案则会是正确的。留神,如果你用下方的 Dyno 运行旧版的模型 API 的版本,做了批改后,答案应该还是谬误的,须要用到 Role Prompting 能力生成正确答案。(这个后续章节会解说)另外,目前的 AI 产品还比拟晚期,因为各种起因,产品设置了很多限度,如果你想要绕过一些限度,或者更好地施展 AI 的能力,也须要用到 Prompt Engineering 技术。这个咱们在后续的章节会讲到。

所以,总的来说,Prompt Engineering 是一种重要的 AI 技术:

  • 如果你是 AI 产品用户,能够通过这个技术,充分发挥 AI 产品的能力,取得更好的体验,从而进步工作效率。
  • 如果你是产品设计师,或者研发人员,你能够通过它来设计和改良 AI 零碎的提醒,从而进步 AI 零碎的性能和准确性,为用户带来更好的 AI 体验。

2.prompt 根本准则

在和 ChatGPT 对话时,亦或者在应用和设计 prompt 时,有以下几个准则与倡议。记住这几个准则,能让你写出更好的 prompt。如果你是间接应用 AI 产品,比方 ChatGPT 或者 Midjourney,那无需在意这个准则。如果你是通过 API 或者 OpenAI Playground 的形式应用,则倡议你先应用最新的模型测试。

2.1 Prompt 里最好蕴含残缺的信息

这个是对后果影响最大的因素。比方如果你想让 AI 写一首对于 OpenAI 的诗。

Less effective prompt:

Write a poem about OpenAI.

它生成的答案可能就会很宽泛,而更好的形式是减少更多的信息。

Better prompt:

Write a short inspiring poem about OpenAI, focusing on the recent DALL-E product launch (DALL-E is a text to image ML model) in the style of a {famous poet}

2.2 Prompt 最好简洁易懂,并缩小歧义

这个比拟好了解,即便你跟人谈话,说一些简略的短句,对方也会更容易听懂,AI 也是这样。另外,在 prompt 里也须要缩小一些歧义,少用不置可否的词语。

比方像这个就很不明确,什么叫 not too much more?

The description for this product should be fairly short, a few sentences only, and not too much more.

更好的 prompt 是这样的,明确告知要写多少句话,就比拟明确:

Use a 3 to 5 sentence paragraph to describe this product.

另外须要留神的是,简略并不代表简短。你的 prompt 也能够很长,只有你的 prompt 形容更充沛就能够,即便长一点也没有关系。

2.3 Prompt 要应用正确的语法、拼写,以及标点,从简略的先开始,并给产品多一点急躁

最初一点算是我集体的倡议。如我在后面提到的例子 What is 100*100/40*56?一样,如果发现机器了解谬误,无妨补充点信息,无妨多试验几次,给 AI 多一点急躁。

3. 根本应用场景和应用技巧

3.1 场景 1:问答问题

这个场景应该是应用 AI 产品最常见的办法。以 ChatGPT 为例,个别就是你提一个问题,ChatGPT 会给你答案,比方像这样:

在这个场景下,prompt 只有满足后面提到的根本准则,基本上就没有什么问题。但须要留神,不同的 AI 模型善于的货色都不太一样,prompt 可能须要针对该模型进行微调。另外,目前的 AI 产品,也不是无所不能,有些问题你再怎么优化 prompt 它也没法答复你。以 ChatGPT 为例:

  1. ChatGPT 比拟善于答复根本事实的问题,比方问 什么是牛顿第三定律?。但不太善于答复意见类的问题,比方问它 谁是世界第一足球运动员?,它就没法答复了。
  2. 另外,ChatGPT 的数据仅有 2021 年 9 月以前的,如果你问这个工夫当前的问题,比方 当初的美国总统是谁?它的答案是「截至 2021 年 9 月,现任美国总统是乔·拜登(Joe Biden)。」

另外,正如我在后面根底用法一章中提到的那样,问答场景里还有一个很重要的玩法,就是多轮聊天,你能够针对某个问题,进行多轮的发问。

应用技巧一:To do and Not To do

我介绍的技巧其实在各个场景都能够应用,我将其放在某个场景下解释,只是因为我感觉它更有可能在这个场景用到。你也会更容易记住这个用法。并不意味着这个技巧仅能在此场景应用。并且多技巧混用也是个不错的用法。

在问答场景里,为了让 AI 答复更加精确,个别会在问题里加条件。比方让 AI 举荐一部电影给你 Recommend a movie to me。但这个 prompt 太空泛了,AI 无奈间接答复,接着它会问你想要什么类型的电影,但这样你就须要跟 AI 聊很多轮,效率比拟低。

所以,为了提高效率,个别会在 prompt 里看到相似这样的话(意思是不要询问我对什么感兴趣,或者问我的个人信息):

DO NOT ASK FOR INTERESTS. DO NOT ASK FOR PERSONAL INFORMATION.

如果你在 ChatGPT 里这样发问,或者应用 ChatGPT 最新的 API,它就不会问你问题,而是间接举荐一部电影给你,它的 Output 是这样的:

Certainly! If you're in the mood for an action-packed movie, you might enjoy"John Wick"(2014), directed by Chad Stahelski and starring Keanu Reeves. The movie follows a retired hitman named John Wick who seeks vengeance against the people who wronged him. It's a fast-paced and stylish film with lots of thrilling action sequences and an engaging story. If you're looking for something that will keep you on the edge of your seat,"John Wick" is definitely worth a watch!

但如果你应用的是如 Davinci-003 这样的模型,它的 Output 很可能是这样的,它还会问你的兴趣爱好:

Sure, I can recommend a movie based on your interests. What kind of movie would you like to watch? Do you prefer action, comedy, romance, or something else?

所以 OpenAI 的 API 最佳实际文档里,提到了一个这样的最佳实际:

Instead of just saying what not to do, say what to do instead. 与其告知模型不能干什么,无妨通知模型能干什么。

我本人的实际是,尽管当初最新的模型曾经了解什么是 Not Todo,但如果你想要的是明确的答案,退出更多限定词,告知模型能干什么,答复的效率会更高,且预期会更明确。还是电影举荐这个案例,你能够退出一个限定词:

Recommend a movie from the top global trending movies to me.

当然并不是 Not Todo 就不能用,如果:

  • 你曾经告知模型很明确的点,而后你想放大范畴,那减少一些 Not Todo 会进步不少效率。
  • 你是在做一些摸索,比方你不晓得如何做精准限定,你只晓得不要什么。那能够先退出 Not Todo,让 AI 先发散给你答案,当摸索实现后,再去优化 prompt。

以下是一些场景案例,我整顿了两个 Less Effective(不太无效的)和 Better(更好的)prompt,你能够本人尝试下这些案例:

场景 Less Effective Better 起因
举荐雅思必背英文单词 Please suggest me some essential words for IELTS Please suggest me 10 essential words for IELTS 后者 prompt 会更加明确,前者会给大略 20 个单词。这个依然有晋升的空间,比方减少更多的限定词语,像字母 A 结尾的词语。
举荐香港值得玩耍的中央 Please recommend me some places to visit in Hong Kong. Do not recommend museums. Please recommend me some places to visit in Hong Kong including amusement parks. 后者的举荐会更精确高效一些,但如果你想进行一些摸索,那前者也能用。

3.2 基于实例答复

在某些场景下,咱们能比较简单地向 AI 形容出什么能做,什么不能做。但有些场景,有些需要很难通过文字指令传递给 AI,即便形容进去了,AI 也不能很好地了解。

比方给宠物起英文名,外面会夹杂着一些所谓的名字格调。此时你就能够在 prompt 里减少一些例子,咱们看看这个例子。

这个是没有任何示例的 Prompt:

Suggest three names for a horse that is a superhero.

Output 如下所示。第一个感觉还行,第二个 Captain 有 hero 的感觉,但 Canter 就像是说这匹马跑得很慢,感觉不太适合,而且三个都比拟个别,不够酷。

Thunder Hooves, Captain Canter, Mighty Gallop

技巧 2:减少示例

如果你无奈用文字精确解释问题或批示,你能够在 prompt 里减少一些案例:

Suggest three names for an animal that is a superhero.

Animal: Cat
Names: Captain Sharpclaw, Agent Fluffball, The Incredible Feline
Animal: Dog
Names: Ruff the Protector, Wonder Canine, Sir Barks-a-Lot
Animal: Horse
Names:

减少例子后,Output 的后果就更酷一些,或者说是靠近我想要的那种格调的名字。

Gallop Guardian, Equine Avenger, The Mighty Stallion

以下是一些场景案例,我整顿了两个 Less Effective(不太无效的)和 Better(更好的)prompt,你能够本人尝试下这些案例:

场景 Less Effective Better 起因
起产品名 Product description: A pair of shoes that can fit any foot size.<br/>Seed words: adaptable, fit, omni-fit.<br/>Product names: Product description: A home milkshake maker<br/>Seed words: fast, healthy, compact.<br/>Product names: HomeShaker, Fit Shaker, QuickShake, Shake Maker<br/>Product description: A pair of shoes that can fit any foot size.<br/>Seed words: adaptable, fit, omni-fit.<br/>Product names: 能够在下方运行这个案例,在不给示例的状况下 AI 会给你什么答案。
将电影名称转为 emoji Convert Star Wars into emoji Convert movie titles into emoji. <br/>Back to the Future: 👨👴🚗🕒<br/>Batman: 🤵🦇<br/>Transformers: 🚗🤖<br/>Star Wars: 能够在下方运行这个案例,在不给示例的状况下 AI 会给你什么答案。

3.3 推理

在问答这个大场景下,还有一个子场景是推理,这个场景十分有意思,而且是个十分值得深挖的场景,prompt 在此场景里施展的作用十分大。

如果你想用 ChatGPT API 做点什么小利用,我倡议能够从这个场景动手,相对来说没有其余场景那么红海。

举个比较简单的例子,我将美国工商管理硕士的入学考试 GMAT 的一道数学题输出进了 ChatGPT:

If a real estate agent received a commission of 6 percent of the selling price of a certain house, what was the selling price of the house?
(1) The selling price minus the real estate agent’s commission was $84,600.
(2) The selling price was 250 percent of the original purchase price of $36,000.

(A) Statement (1) ALONE is sufficient, but statement (2) alone is not sufficient.
(B) Statement (2) ALONE is sufficient, but statement (1) alone is not sufficient.
(C) BOTH statements TOGETHER are sufficient, but NEITHER statement ALONE is sufficient.
(D) EACH statement ALONE is sufficient.
(E) Statements (1) and (2) TOGETHER are NOT sufficient.

这道题目的中文意思是这样的(你无妨也思考下这个题目?):

如果一个房地产经纪人的佣金是某个房子的售价的 6%,那么这个房子的售价是多少?(1)售价减去房地产经纪人的佣金为 84,600 美元。(2)购买价是 36,000 美元,售价是购买价的 250%。(A)仅陈说(1)足以答复问题,但仅陈说(2)不能答复问题。(B)仅陈说(2)足以答复问题,但仅陈说(1)不能答复问题。(C)两个陈说合起来足以答复问题,但没有一个陈说独自足以答复问题。(D)每个陈说独自足以答复问题。(E)陈说(1)和(2)合起来不能答复问题。

它的 Output 是这样的,它会依据题目的要求去推理和运算:

Let's assume the selling price of the house is S.

According to the given information, the real estate agent's commission is 6% of S, which is 0.06S.

So, the amount received by the seller after paying the commission would be (S - 0.06S) = 0.94S.

Statement (1) tells us that 0.94S = $84,600. Solving for S, we get S = $90,000.

Therefore, statement (1) alone is sufficient to determine the selling price of the house.

Statement (2) tells us that the selling price is 250% of the original purchase price, which means S = 2.5*$36,000 = $90,000.

Therefore, statement (2) alone is also sufficient to determine the selling price of the house.

Since both statements individually provide the same answer, we can conclude that together they are also sufficient to determine the selling price of the house.

Hence, the answer is (D) EACH statement ALONE is sufficient.

翻译成中文是这样的:

假如房子的售价为 S。依据所给信息,房地产经纪人的佣金是 S 的 6%,即 0.06S。因而,在领取佣金后卖方收到的金额为(S-0.06S)= 0.94S。陈说(1)通知咱们,0.94S = 84,600 美元。解出 S,咱们失去 S = $90,000。因而,陈说(1)独自足以确定房子的售价。陈说(2)通知咱们,售价是原始购买价的 250%,这意味着 S = 2.5 * $36,000 = $90,000。因而,陈说(2)独自也足以确定房子的售价。因为两个陈说别离提供雷同的答案,咱们能够得出结论,它们独特足以确定房子的售价。因而,答案是(D)每个陈说独自足以答复问题。

这个场景还比较简单,我就不在这举更多例子了,你能够将各种题目输出进 ChatGPT 里试试。

3.4 写代码

除了答复问题外,另一个重要的场景是让 AI 依据你的要求实现一些内容生成工作,依据输入的后果差别,我将其概括为以下几个次要场景:

  1. 无中生有
  2. 精益求精
  3. 化繁为简

本章,咱们先来聊聊「无中生有」场景。顾名思义,就是让 AI 给你生成一些内容。你只须要向 AI 形容你想写的内容,AI 就会按你的要求写出该内容。比方:

  1. 撰写招聘信息
  2. 撰写电商平台的货物形容
  3. 撰写短视频脚本
  4. 甚至让它写代码都能够

像撰写招聘信息等,你只须要明确指标,将指标定得比拟明确即可生成能让你较为称心的答案。本章我想聊下写代码。上面是一个让 ChatGPT 写代码的案例。

Prompt:

Create a MySQL query for all students in the Computer Science Department:
Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]

Output:

SELECT students.*
FROM students
INNER JOIN departments
ON students.DepartmentId = departments.DepartmentId
WHERE departments.DepartmentName = 'Computer Science';

技巧 3:应用疏导词,疏导模型输入特定语内容

在代码生成场景里,有一个小技巧,下面提到的案例,其 prompt 还能够持续优化,在 prompt 最初,减少一个代码的疏导,告知 AI 我曾经将条件形容完了,你能够写代码了。

在 prompt 的最初减少 SELECT 能够很好地提醒 AI 能够写 SQL 代码了。Better prompt:

Create a MySQL query for all students in the Computer Science Department:
Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]
SELECT

同样的情理,如果你想让 AI 写 Python 代码,那 import 会是比拟好的提醒。但须要留神,这个只是告知 AI 能够写代码了,并不能告知 AI 写何种代码,你依然须要在 prompt 里减少提醒,告知 AI 要生成何种语言的代码。

在吴恩达的 ChatGPT Prompt Engineering 课程中,也提到这个技巧,只是在课程中,疏导词并不是放在最初,而是在 prompt 里间接阐明,让 AI 生成一个 JSON 格局的内容。课程中的例子是这样的(留神这个是 python 代码):

prompt = f"""
Generate a list of three made-up book titles along \
with their authors and genres.
Provide them in JSON format with the following keys:
book_id, title, author, genre.
"""

3.5 批改内容(润色、翻译、查错)

除了从 0 生成内容外,你也能够先给 AI 一段曾经写好的内容,而后让 AI 对其进行批改,让 AI 帮你将内容改得更好。理论的场景有如:

  1. 翻译:将中文文档翻译成英文,亦或者将英文翻译成中文。对于翻译,我还想强调,像 ChatGPT 除了能翻译人类的语言外,还能翻译编程语言,比方将 Python 代码翻译成 Haskell 代码。
  2. 批改:批改内容的语法,甄别内容里的错别字。
  3. 润色:润色文章,将内容改成另一种格调。

本章重点介绍下润色场景,上面这是个润色的例子。

Prompt(内容来自维基百科):

Please rewrite the following sentences to make them easier to understand.
OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI Incorporated (OpenAI Inc.) and its for-profit subsidiary corporation OpenAI Limited Partnership (OpenAI LP). OpenAI conducts AI research with the declared intention of promoting and developing a friendly AI. OpenAI systems run on the fifth most powerful supercomputer in the world.[5][6][7] The organization was founded in San Francisco in 2015 by Sam Altman, Reid Hoffman, Jessica Livingston, Elon Musk, Ilya Sutskever, Peter Thiel and others,[8][1][9] who collectively pledged US$1 billion. Musk resigned from the board in 2018 but remained a donor. Microsoft provided OpenAI LP with a $1 billion investment in 2019 and a second multi-year investment in January 2023, reported to be $10 billion.[10]

它的 Output 是这样的:

OpenAI is a research organization that focuses on artificial intelligence (AI). It has two parts: a non-profit called OpenAI Incorporated and a for-profit subsidiary called OpenAI Limited Partnership. The organization's goal is to create AI that is friendly and beneficial to humans. OpenAI uses one of the most powerful supercomputers in the world for its research. The organization was founded in San Francisco in 2015 by a group of people who pledged a total of $1 billion, including Sam Altman, Reid Hoffman, Jessica Livingston, Elon Musk, Ilya Sutskever, and Peter Thiel. Musk resigned from the board in 2018 but still supports the organization financially. In 2019, Microsoft invested $1 billion in OpenAI LP, and in January 2023, they made a second investment reported to be $10 billion.

技巧 4:减少 Role(角色)或人物

后面提到的改写例子,我在 prompt 里加了 Role 让其更易于浏览,这个优化办法是 OK 的。也很罕用,比方你想将你写的 Email 改得更商务,则只须要加 business 相干的词即可。

我再介绍一个更无效的技巧,就是在 prompt 里减少一些 role(角色)相干的内容,让 AI 生成的内容更合乎你的需要。

比方还是下面那个 rewrite 的例子,我在例子前退出这样的一段话,我让 AI 假如本人是一个小学老师,并且很善于将简单的内容转变成 7、8 岁小朋友也能听懂的话,而后再改写这段话:

You are a primary school teacher who can explain complex content to a level that a 7 or 8 year old child can understand. Please rewrite the following sentences to make them easier to understand:

Output 就变成了上面这样。不论是用词,还是语法,看起来都更容易懂了。

OpenAI is a place where people work on making computers really smart. It's like a lab where they do research to make computers do amazing things! They want the computers to be really friendly and helpful to people. They use one of the most powerful computers in the world to do this research. A bunch of really rich and smart people gave money to start this lab. One of them is Elon Musk, but he left in 2018. Another really big company, Microsoft, gave them a lot of money too.

另外除了减少角色外,还能让 AI 模仿某个人,比方让 AI 以 David Foster Wallace 的格调改写上述那段话:

Imagine you are the famous writer David Foster Wallace, please rewrite this sentences.

Output 是这样的:

As David Foster Wallace, I might rewrite these sentences to sound more like this:

OpenAI is a lab that studies something called artificial intelligence, which is like when computers can do things that normally only humans can do. They have a big computer that's one of the most powerful in the world, and they're trying to make sure the AI they create is friendly and helpful. The lab was started in 2015 in a city called San Francisco, and a bunch of important people gave them a lot of money to do their research. One of those people was Elon Musk, who stopped working there but still gives them money, and Microsoft also gave them a lot of money recently.

对于这个技巧的利用,以及如何更好地应用 AI 晋升写作能力,倡议各位浏览 Nat Eliason 的 Using ChatGPT as a Writing Coach,他应用 ChatGPT 辅助其写作,就用到了上述的技巧。

3.6 内容解释(代码正文、论文解释等)

精益求精下的第二大的场景是信息解释。它跟改写内容有点像,但又不太一样,信息解释有可能提供超过原文内容的信息。

举几个信息解释的例子,大家应该就能了解了:

  1. 解释代码:比方你看到一段 Python 的代码,但你看不懂,你能够让 AI 解释下代码的含意。
  2. 解释论文:看某篇论文看不懂,或者论文里的某一段看不懂,你也能够让 AI 解释。

以解释代码为例,它的 prompt 是这样的(此案例来自 OpenAI 官网案例):

Explanation of what the code does:

Python 3 
def remove_common_prefix(x, prefix, ws_prefix): 
    x["completion"] = x["completion"].str[len(prefix) :] 
    if ws_prefix: 
        # keep the single whitespace as prefix 
        x["completion"] = "" + x["completion"] 
return x

以下是 ChatGPT 的 Output,它论述了这段代码是拿来干什么的,同时还论述了各个参数的含意:

This is a Python function called remove_common_prefix that takes in three parameters: x, prefix, and ws_prefix.

The x parameter is assumed to be a Pandas DataFrame or Series object with a column named "completion", which contains strings that may or may not start with the prefix string.

The prefix parameter is a string that represents the common prefix that needs to be removed from the beginning of each string in the "completion" column of the DataFrame.

The ws_prefix parameter is a Boolean value that indicates whether or not to keep a single whitespace character as a prefix after the common prefix has been removed.

The function modifies the "completion" column of the DataFrame by removing the prefix string from the beginning of each string in that column using the Pandas str method and string slicing. The len(prefix) : syntax is used to slice the string starting from the character that follows the prefix.

If ws_prefix is True, then the function adds a single whitespace character as a prefix to each string in the "completion" column after removing the prefix.

Finally, the modified DataFrame is returned by the function.

3.7 信息总结(摘要)

内容生成大场景下的第二个场景是化繁为简,这个场景其实很好了解,就是将简单的内容,转为简略的内容,个别常遇到的场景有:

  1. 信息总结:顾名思义,就是对一堆信息进行总结。
  2. 信息解释:这个跟改写内容有点像,但这个更偏差于解释与总结。下一章会给大家介绍更多的例子。
  3. 信息提取:提取信息里的某一段内容,比方从一大段文字中,找到要害内容,并分类。

本章会讲一下信息总结。信息总结还是比较简单的,基本上在 prompt 里退出总结 summarize 就能够了。但如果你想要一些特地的成果,无妨组合应用之前介绍的技巧,比方:

  1. 应用技巧 2,减少总结示例,让 AI 总结合乎你需要的内容
  2. 应用技巧 4,减少 role,让 AI 总结的内容具备肯定的格调

不过在这个场景,还有个技巧须要各位留神。

技巧 5:应用特殊符号将指令和须要解决的文本离开

不论是信息总结,还是信息提取,你肯定会输出大段文字,甚至多段文字,此时有个小技巧。

能够用“”“将指令和文本离开。依据我的测试,如果你的文本有多段,减少”“”会晋升 AI 反馈的准确性(这个技巧来自于 OpenAI 的 API 最佳实际文档)

像咱们之前写的 prompt 就属于 Less effective prompt。为什么呢?据我的测试,次要还是 AI 不晓得什么是指令,什么是待处理的内容,用符号分隔开来会更利于 AI 辨别。

Please summarize the following sentences to make them easier to understand.
OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI Incorporated (OpenAI Inc.) and its for-profit subsidiary corporation OpenAI Limited Partnership (OpenAI LP). OpenAI conducts AI research with the declared intention of promoting and developing a friendly AI. OpenAI systems run on the fifth most powerful supercomputer in the world.[5][6][7] The organization was founded in San Francisco in 2015 by Sam Altman, Reid Hoffman, Jessica Livingston, Elon Musk, Ilya Sutskever, Peter Thiel and others,[8][1][9] who collectively pledged US$1 billion. Musk resigned from the board in 2018 but remained a donor. Microsoft provided OpenAI LP with a $1 billion investment in 2019 and a second multi-year investment in January 2023, reported to be $10 billion.[10]

Better prompt:

Please summarize the following sentences to make them easier to understand.

Text: """OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI Incorporated (OpenAI Inc.) and its for-profit subsidiary corporation OpenAI Limited Partnership (OpenAI LP). OpenAI conducts AI research with the declared intention of promoting and developing a friendly AI. OpenAI systems run on the fifth most powerful supercomputer in the world.[5][6][7] The organization was founded in San Francisco in 2015 by Sam Altman, Reid Hoffman, Jessica Livingston, Elon Musk, Ilya Sutskever, Peter Thiel and others,[8][1][9] who collectively pledged US$1 billion. Musk resigned from the board in 2018 but remained a donor. Microsoft provided OpenAI LP with a $1 billion investment in 2019 and a second multi-year investment in January 2023, reported to be $10 billion.[10]"""

另外,在吴恩达的 ChatGPT Prompt Engineering 课程中,还提到,你能够应用其余特殊符号来宰割文本和 prompt,比方<><tag></tag> 等,课程中的案例是这样的(留神这个是 python 代码,须要关注的是 prompt 里的 text):

text = f"""
You should express what you want a model to do by \
providing instructions that are as clear and \
specific as you can possibly make them. \
This will guide the model towards the desired output, \
and reduce the chances of receiving irrelevant \
or incorrect responses. Don't confuse writing a \
clear prompt with writing a short prompt. \
In many cases, longer prompts provide more clarity \
and context for the model, which can lead to \
more detailed and relevant outputs.
"""prompt = f"""
Summarize the text delimited by triple backticks \
into a single sentence.
`{text}`
"""

3.8 信息提取:通过格局词论述须要输入的格局

介绍完信息总结,再聊聊信息提取,我认为这个场景是继场景 3 推理以外,第二个值得深挖的场景。这个场景有十分多的有意思的场景,比方:

  1. 将一大段文字,甚至网页里的内容,按要求转为一个表格。
  2. 依照特定格局对文章内容进行信息归类。

第二个可能比拟难了解,举个 OpenAI 里的例子,它的 prompt 是这样的(为了有足够空间显示内容,我仅节选了 text 里的局部内容,残缺内容,能够点击这里查看):

Extract the important entities mentioned in the article below. First extract all company names, then extract all people names, then extract specific topics which fit the content and finally extract general overarching themes
Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-

Text: """Powering Next Generation
Applications with OpenAI Codex
Codex is now powering 70 different applications across a variety of use cases through the OpenAI API.

May 24, 2022
4 minute read
OpenAI Codex, a natural language-to-code system based on GPT-3, helps turn simple English instructions into over a dozen popular coding languages. Codex was released last August through our API and is the principal building block of GitHub Copilot.

Warp is a Rust-based terminal, reimagined from the ground up to help both individuals and teams be more productive in the command-line.

Terminal commands are typically difficult to remember, find and construct. Users often have to leave the terminal and search the web for answers and even then the results might not give them the right command to execute. Warp uses Codex to allow users to run a natural language command to search directly from within the terminal and get a result they can immediately use.“Codex allows Warp to make the terminal more accessible and powerful. Developers search for entire commands using natural language rather than trying to remember them or assemble them piecemeal. Codex-powered command search has become one of our game changing features.”—Zach Lloyd, Founder, Warp


Machinet helps professional Java developers write quality code by using Codex to generate intelligent unit test templates.

Machinet was able to accelerate their development several-fold by switching from building their own machine learning systems to using Codex. The flexibility of Codex allows for the ability to easily add new features and capabilities saving their users time and helping them be more productive.“Codex is an amazing tool in our arsenal. Not only does it allow us to generate more meaningful code, but it has also helped us find a new design of product architecture and got us out of a local maximum.”—Vladislav Yanchenko, Founder, Machinet"""

Prompt 有点长,我解释下,它是让 AI 将文章里的重点内容进行抽离,并将其依据特定格局进行输入。要求将文章里提到的公司、人名、主题都依照 comma separated list(按逗号空格离开)的形式列举进去。

最初的 output 是这样的:

Company names: OpenAI, Microsoft, GitHub, Pygma, Replit, Warp, Machinet
People names: Emile Paffard-Wray, Amjad Masad, Zach Lloyd, Vladislav Yanchenko
Specific topics: GPT-3, OpenAI API, Azure OpenAI Service, GitHub Copilot, Pygma, Replit, Warp, Machinet
General themes: Natural language-to-code, Productivity, Problem solving, Creativity, Learning

技巧 6:通过格局词论述须要输入的格局

上述的案例,就是用到了这个技巧。这个技巧是技巧 2 的变种,比拟罕用于生成文本场景。其实在场景 7 中,你也能够用到这个技巧。

假如你想让 AI 总结一篇十分十分长的文章,并且依照特定格局给你总结,那你能够在文章后面明确输入的格局(为了不占用太多显示空间,我省略了注释内容,你能够在 OpenAI 的 Playground 看到残缺案例)。它的意思其实是让 ChatGPT 按 Topic 总结,每个 Topic 里依照无序列表(就是外面那个 -)将每个 Topic 的次要观点列举进去。

Summarize the main points of the following speech
Use the following format:
Topic 1: <topic_name_1>
- <point_1>
..
Topic 2: <topic_name_2>
- <point_1>
..
Topic 10: ..

Text: """
Thank you so much, Fred, for that lovely introduction. And thanks to the Atlantic Council for hosting me today.

The course of the global economy over the past two years has been shaped by COVID-19 and our efforts to fight the pandemic. It’s now evident, though, that the war between Russia and Ukraine has redrawn the contours of the world economic outlook. Vladimir Putin’s unprovoked attack on Ukraine and its people is taking a devastating human toll, with lives tragically lost, families internally displaced or becoming refugees, and communities and cities destroyed.
...

"""

Output 是这样的:

Topic 1: The war in Ukraine
- The war is taking a human toll with lives lost, families displaced, and communities destroyed
- The Biden administration is committed to holding Russia accountable
- The war has violated international law and is a challenge to the international order

Topic 2: The global economy
- The war is having negative impacts on the global economy, including higher commodity prices and inflation
- The IMF and World Bank will be focused on helping developing countries weather the impacts of the war
- The ultimate outcome for the global economy depends on the path of the war

对于这个场景和技巧,我想再解释一下为什么后劲很大。依据我应用各种 Summary 或者信息提取的产品,我发现,AI 并不知道什么是重点,所以在总结的过程中,会失落很多内容。如何疏导 AI 进行总结,就变得十分重要,且具备肯定的可玩性。


应用此技巧能够使输入更结构化。比方针对一篇文章进行问答,你不仅想要失去一个答案,也心愿 ChatGPT 的答案合乎特定的格局,不便你下一步进行自动化。

比方问 “ 这里的债券 duration 是多少?” , 失常 GPT 模型的答案可能是 “ 债券 duration 是 4 年 ” 或 “duration 4 年 ”。
ChatGPT 的答复不稳固,且不不便持续解决。

解法:
咱们能够通过这个技巧,让模型了解咱们预期的格局。并在此基础上,为了不便自动化,让模型输入特定的结构化答案 (比方 JSON/Markdown 等)。
也能够不便集成更多的额定要求,比方减少一个 ”confidence level”, 并通过 prompt 的模式指定这些数值的格局与甚至区间。

比方:

{context}
Question: What is bond duration mentioned here.
Answer template (Valid JSON format):
{{
"duration": $duration_numeric_value_in_year,
"confidence_level": $answer_confidence_level_high_moderate_or_low,
}}
Answer:

在吴恩达的 ChatGPT Prompt Engineering 课程中,有提到一个这个技巧的高级用法,在让 AI 依照特定格局输入内容的同时,还让 AI 依据内容是否满足特定条件,来判断应该输入什么后果,上面课程中的案例的 prompt(留神这个是 python 代码,有一些转义字符,能够不必管):

You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \
then simply write \"No steps provided.\"

\"\"\"{text}\"\"\"

简略解释下,这个 prompt 分成两步:

让 AI 将输出的 text 转为步骤(就是 prompt 里的 Step 1、2)

而后还减少了一个判断,如果输出的 text 里没有 step 的内容,那么就输入 No Step

如果输出的 text 是一个泡茶的步骤介绍:

Making a cup of tea is easy! First, you need to get some \
water boiling. While that's happening, \
grab a cup and put a tea bag in it. Once the water is \
hot enough, just pour it over the tea bag. \
Let it sit for a bit so the tea can steep. After a \
few minutes, take out the tea bag. If you \
like, you can add some sugar or milk to taste. \
And that's it! You've got yourself a delicious \
cup of tea to enjoy.

那么 AI 输入的内容是这样的(因为内容中蕴含了步骤式的内容):

Step 1 - Get some water boiling.
Step 2 - Grab a cup and put a tea bag in it.
Step 3 - Once the water is hot enough, pour it over the tea bag.
Step 4 - Let it sit for a bit so the tea can steep.
Step 5 - After a few minutes, take out the tea bag.
Step 6 - Add some sugar or milk to taste.
Step 7 - Enjoy your delicious cup of tea!

但如果咱们输出的是这样的 text:

The sun is shining brightly today, and the birds are \
singing. It's a beautiful day to go for a \
walk in the park. The flowers are blooming, and the \
trees are swaying gently in the breeze. People \
are out and about, enjoying the lovely weather. \
Some are having picnics, while others are playing \
games or simply relaxing on the grass. It's a \
perfect day to spend time outdoors and appreciate the \
beauty of nature.

从内容上看,这段话,没有任何步骤式的内容,所以 AI 的输入是这样的:

No steps provided.

参考链接:https://github.com/thinkingjimmy/Learning-Prompt

正文完
 0