How to Use the Python multiprocessing Module

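In this article, we will learn how to use one specific Python class from the multiprocessing module: the Process class. I'll give you a quick overview with examples.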

What is the multiprocessing module?

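What better way to describe a module than to quote the official documentation? "multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads."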

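The threading module is not the focus of this article, but in short: the threading module handles the execution of small pieces of code (lightweight, with shared memory), while the multiprocessing module handles the execution of whole programs (heavier, and fully isolated).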
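The difference between shared and isolated memory can be seen in a small experiment. This is a minimal sketch, not part of the original article: the `counter` dictionary and the `increment`, `run_in_thread`, and `run_in_process` helpers are hypothetical names made up for illustration.

```python
import multiprocessing
import threading

# Module-level state that both kinds of worker will try to modify.
counter = {"value": 0}


def increment():
    counter["value"] += 1


def run_in_thread():
    """Run increment() in a thread and return the parent's view of the counter."""
    counter["value"] = 0
    worker = threading.Thread(target=increment)
    worker.start()
    worker.join()
    return counter["value"]


def run_in_process():
    """Run increment() in a child process and return the parent's view of the counter."""
    counter["value"] = 0
    worker = multiprocessing.Process(target=increment)
    worker.start()
    worker.join()
    return counter["value"]


if __name__ == "__main__":
    # The thread modifies the parent's own dictionary.
    print(f"After thread:  {run_in_thread()}")
    # The child process modifies only its private copy.
    print(f"After process: {run_in_process()}")
```

The thread's change is visible to the parent, while the child process works on its own copy of `counter`, so the parent's value is left untouched.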

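In general, the multiprocessing module provides a variety of other classes, functions, and utilities for managing multiple processes while a program runs. It is designed to be the main point of interaction when a program needs parallelism in its workflow. We won't cover every class and utility in the multiprocessing module; instead, we will focus on one very specific class, the Process class.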

What is the Process class?

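In this section, we will try to give a better picture of what a process is, and of how to identify, use, and manage processes in Python. As the GNU C Library manual explains: "Processes are the primitive units for allocation of system resources. Each process has its own address space and (usually) one thread of control. A process executes a program; you can have multiple processes executing the same program, but each process has its own copy of the program within its own address space and executes it independently of the other copies."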

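But what does that look like in Python? So far we have described what a process is and how processes differ from threads, but we have not touched any code yet. Let's change that with a very simple process example in Python: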

#!/usr/bin/env python
import os

# A very, very simple process.
if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")

This produces the following output:

[r0x0d@fedora ~]$ python /tmp/tmp.iuW2VAurGG/scratch.py
Hi! I'm process 144112

As you can see, any running Python script or program is a process of its own.

Creating a child process

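So what about spawning a separate child process from within the parent process? For that we need the help of the Process class from the multiprocessing module, and it looks like this: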

#!/usr/bin/env python
import os
import multiprocessing

def child_process():
    print(f"Hi! I'm a child process {os.getpid()}")

if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")

    # Here we create a new instance of the Process class and assign our
    # `child_process` function to be executed.
    process = multiprocessing.Process(target=child_process)

    # We then start the process
    process.start()

    # And finally, we join the process. This blocks our script until the
    # child process is done.
    process.join()

This produces the following output:

[r0x0d@fedora ~]$ python /tmp/tmp.iuW2VAurGG/scratch.py
Hi! I'm process 144078
Hi! I'm a child process 144079

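One very important note about the previous script: if you do not call process.join() to wait for the child process to run and finish, any code after that point executes right away, and it can become hard to keep your workflow synchronized.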

Consider the following example:

#!/usr/bin/env python
import os
import multiprocessing

def child_process():
    print(f"Hi! I'm a child process {os.getpid()}")

if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")

    # Here we create a new instance of the Process class and assign our
    # `child_process` function to be executed.
    process = multiprocessing.Process(target=child_process)

    # We then start the process
    process.start()

    # This time we deliberately skip the join, so the script does NOT wait
    # for the child process before moving on.
    #process.join()

    print("AFTER CHILD EXECUTION! RIGHT?!")

This snippet produces the following output:

[r0x0d@fedora ~]$ python /tmp/tmp.iuW2VAurGG/scratch.py
Hi! I'm process 145489
AFTER CHILD EXECUTION! RIGHT?!
Hi! I'm a child process 145490

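Of course, it would also be incorrect to claim that the snippet above is wrong. It all depends on how you want to use the module and on how your child processes will run. So use it wisely.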
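There is also a middle ground between always joining and never joining: `Process.join()` accepts an optional timeout, and `Process.is_alive()` tells you whether the child is still running. The sketch below is only an illustration; `slow_child` and `wait_briefly` are hypothetical helpers, and the sleep stands in for real work.

```python
import multiprocessing
import time


def slow_child(seconds):
    # Stand-in for real work.
    time.sleep(seconds)


def wait_briefly(child_seconds, timeout):
    """Start a child, wait at most `timeout` seconds, and report whether it is still running."""
    process = multiprocessing.Process(target=slow_child, args=(child_seconds,))
    process.start()

    # join() returns once the timeout expires, even if the child is not done.
    process.join(timeout)
    still_running = process.is_alive()

    # Wait for real before returning so that no child is left behind.
    process.join()
    return still_running


if __name__ == "__main__":
    print(f"Still running after a short wait? {wait_briefly(0.5, 0.05)}")
```

This lets the parent do other work between short waits instead of blocking until the child finishes.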

Creating multiple child processes

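If you want to spawn several processes, you can use a for loop (or any other kind of loop). It lets you create as many references to processes as you need and start/join them at a later stage.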

#!/usr/bin/env python
import os
import multiprocessing

def child_process(id):
    print(f"Hi! I'm a child process {os.getpid()} with id#{id}")

if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")
    list_of_processes = []

    # Loop through the numbers 0 to 9 and create a process for each one of
    # them.
    for i in range(0, 10):
        # Here we create a new instance of the Process class and assign our
        # `child_process` function to be executed. Note the difference now that
        # we are using the `args` parameter now, this means that we can pass
        # down parameters to the function being executed as a child process.
        process = multiprocessing.Process(target=child_process, args=(i,))
        list_of_processes.append(process)

    for process in list_of_processes:
        # We then start the process
        process.start()

        # And finally, we join the process. Because we join right after
        # starting, each child finishes before the next one is started.
        process.join()

This produces the following output:

[r0x0d@fedora ~]$ python /tmp/tmp.iuW2VAurGG/scratch.py
Hi! I'm process 146056
Hi! I'm a child process 146057 with id#0
Hi! I'm a child process 146058 with id#1
Hi! I'm a child process 146059 with id#2
Hi! I'm a child process 146060 with id#3
Hi! I'm a child process 146061 with id#4
Hi! I'm a child process 146062 with id#5
Hi! I'm a child process 146063 with id#6
Hi! I'm a child process 146064 with id#7
Hi! I'm a child process 146065 with id#8
Hi! I'm a child process 146066 with id#9
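Note that calling start() and join() in the same loop iteration runs the children one after another, which is why the output above is in strict order. If the goal is to have them run at the same time, one common variation is to start every process first and only then join them. This is a sketch under that assumption; `run_all` is a hypothetical helper and the short sleep is a stand-in for real work.

```python
import multiprocessing
import os
import time


def child_process(worker_id):
    time.sleep(0.2)  # stand-in for real work
    print(f"Hi! I'm a child process {os.getpid()} with id#{worker_id}")


def run_all(count):
    """Start `count` children together, wait for all of them, and return their exit codes."""
    processes = [
        multiprocessing.Process(target=child_process, args=(i,))
        for i in range(count)
    ]

    # First loop: start everything, so the children overlap in time.
    for process in processes:
        process.start()

    # Second loop: wait for each child to finish.
    for process in processes:
        process.join()

    return [process.exitcode for process in processes]


if __name__ == "__main__":
    print(f"Exit codes: {run_all(4)}")
```

With this layout the order of the children's printed lines is no longer guaranteed, since they genuinely run side by side.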

Communicating data

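In the previous section, I described adding a new argument, args, to the multiprocessing.Process class constructor. This argument lets you pass values to the child process for use inside the function. But do you know how to return data from a child process?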

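You might think that, to get data back from the child, you simply use a return statement in it. Processes are well suited to executing functions in isolation, without interfering with shared resources, and that isolation is exactly why the normal, familiar way of returning data from a function does not work here: the return value of the target function never reaches the parent.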

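Instead, we can use the Queue class, which gives us an interface for communicating data between the parent process and its children. In this context, a queue is a regular FIFO (first in, first out) with built-in mechanisms for working across processes.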

Consider the following example:

#!/usr/bin/env python
import os
import multiprocessing

def child_process(queue, number1, number2):
    print(f"Hi! I'm a child process {os.getpid()}. I do calculations.")
    # Named `total` to avoid shadowing the built-in `sum`.
    total = number1 + number2

    # Putting data into the queue
    queue.put(total)

if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")

    # Defining a new Queue()
    queue = multiprocessing.Queue()

    # Here we create a new instance of the Process class and assign our
    # `child_process` function to be executed. Note the difference now that
    # we are using the `args` parameter now, this means that we can pass
    # down parameters to the function being executed as a child process.
    process = multiprocessing.Process(target=child_process, args=(queue,1, 2))

    # We then start the process
    process.start()

    # And finally, we join the process. This blocks our script until the
    # child process is done.
    process.join()

    # Accessing the result from the queue.
    print(f"Got the result from child process as {queue.get()}")

It gives the following output:

[r0x0d@fedora ~]$ python /tmp/tmp.iuW2VAurGG/scratch.py
Hi! I'm process 149002
Hi! I'm a child process 149003. I do calculations.
Got the result from child process as 3
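The same idea extends to several children sharing one queue. Results can arrive in any order, so in the sketch below each child tags its result with a job id. The multiprocessing programming guidelines also advise removing all items from a queue before joining the processes that put them there, so this version calls get() before join(). The names `add` and `run_jobs` and the 10-second timeout are assumptions made for illustration.

```python
import multiprocessing


def add(queue, job_id, number1, number2):
    # Tag the result with its job id so the parent can match it up later.
    queue.put((job_id, number1 + number2))


def run_jobs(jobs):
    """Run one child per (a, b) pair and return the sums in the original job order."""
    queue = multiprocessing.Queue()
    processes = [
        multiprocessing.Process(target=add, args=(queue, job_id, a, b))
        for job_id, (a, b) in enumerate(jobs)
    ]

    for process in processes:
        process.start()

    # Drain the queue first: one result per child, in whatever order they arrive.
    # The timeout keeps the parent from waiting forever if a child dies.
    results = dict(queue.get(timeout=10) for _ in processes)

    # Only now wait for the children to exit.
    for process in processes:
        process.join()

    return [results[job_id] for job_id in range(len(jobs))]


if __name__ == "__main__":
    print(run_jobs([(1, 2), (10, 20), (5, 5)]))
```

For a single small value, as in the example above, joining before reading works fine; the ordering matters more as the amount of queued data grows.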

Exception handling

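Handling exceptions is a special and somewhat tricky task that we have to deal with constantly when using the multiprocessing module. The reason is that, by default, any exception raised inside a child process is handled by the Process class that spawned it, so it never reaches a try/except block in the parent.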

The code below raises an exception with a message:

#!/usr/bin/env python
import os
import multiprocessing

def child_process():
    print(f"Hi! I'm a child process {os.getpid()}.")
    raise Exception("Oh no! :(")

if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")

    # Here we create a new instance of the Process class and assign our
    # `child_process` function to be executed.
    process = multiprocessing.Process(target=child_process)

    try:
        # We then start the process
        process.start()

        # And finally, we join the process. This blocks our script until the
        # child process is done.
        process.join()

        print("AFTER CHILD EXECUTION! RIGHT?!")
    except Exception:
        print("Uhhh... It failed?")

Output:

[r0x0d@fedora ~]$ python /tmp/tmp.iuW2VAurGG/scratch.py
Hi! I'm process 149505
Hi! I'm a child process 149506.
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib64/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib64/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/tmp.iuW2VAurGG/scratch.py", line 7, in child_process
    raise Exception("Oh no! :(")
Exception: Oh no! :(
AFTER CHILD EXECUTION! RIGHT?!

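If you follow the code, you will notice that a print statement was deliberately placed after the process.join() call, to show that the parent process keeps running even after an unhandled exception was raised in the child. The except branch in the parent is never reached.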
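Even though the exception itself does not reach the parent, the parent can still tell that something went wrong: after join(), `Process.exitcode` is 0 for a child that finished normally and non-zero for one that died from an unhandled exception. The helper names in this sketch (`failing_child`, `healthy_child`, `run_and_report`) are hypothetical.

```python
import multiprocessing


def failing_child():
    raise Exception("Oh no! :(")


def healthy_child():
    pass


def run_and_report(target):
    """Run `target` in a child process and return the child's exit code."""
    process = multiprocessing.Process(target=target)
    process.start()
    process.join()
    # exitcode is only meaningful once the child has terminated.
    return process.exitcode


if __name__ == "__main__":
    for target in (healthy_child, failing_child):
        code = run_and_report(target)
        status = "succeeded" if code == 0 else "failed"
        print(f"{target.__name__} {status} (exit code {code})")
```

The exit code says that the child failed, but not why; the traceback is still only printed by the child.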

One way to deal with this situation is to actually handle the exception inside the child process, like so:

#!/usr/bin/env python
import os
import multiprocessing

def child_process():
    try:
        print(f"Hi! I'm a child process {os.getpid()}.")
        raise Exception("Oh no! :(")
    except Exception:
        print("Uh, I think it's fine now...")

if __name__ == "__main__":
    print(f"Hi! I'm process {os.getpid()}")

    # Here we create a new instance of the Process class and assign our
    # `child_process` function to be executed.
    process = multiprocessing.Process(target=child_process)

    # We then start the process
    process.start()

    # And finally, we join the process. This blocks our script until the
    # child process is done.
    process.join()

    print("AFTER CHILD EXECUTION! RIGHT?!")

Now your exception is handled inside the child process, which means you control what happens to it and what should be done in that case.
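Handling the exception in the child can be combined with the queue from the previous section, so that the parent learns what happened. The sketch below is one possible arrangement, not the only one: the `risky` function, the `("ok", ...)` / `("error", ...)` message shape, and the 10-second timeout are all assumptions made for illustration.

```python
import multiprocessing


def risky(number):
    # Hypothetical piece of work that may fail.
    if number < 0:
        raise ValueError("negative input")
    return number * 2


def child_process(queue, number):
    try:
        queue.put(("ok", risky(number)))
    except Exception as error:
        # Report the failure as plain text instead of letting the child crash.
        queue.put(("error", repr(error)))


def run(number):
    """Run risky(number) in a child and return its (status, payload) message."""
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=child_process, args=(queue, number))
    process.start()

    # Read the message before joining; the timeout guards against a silent child.
    status, payload = queue.get(timeout=10)
    process.join()
    return status, payload


if __name__ == "__main__":
    print(run(3))
    print(run(-1))
```

The parent can then decide for itself whether to retry, log, or raise its own exception based on the status it received.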

Summary

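The multiprocessing module is very powerful when your work and implementation depend on solutions that execute in parallel, especially when it is used together with the Process class. It opens up the possibility of executing any function in its own isolated process.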
