How to Parse Command-Line Options Elegantly in Python

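As our programming experience grows and we become more comfortable with the command line, many of us gradually come to appreciate how efficient it is.

Naturally, we want the programs (or plain scripts) we write ourselves to behave like native commands and other programs: their behavior should be set or changed through arguments given at run time, without having to dig down to the right configuration file, locate the relevant entry, edit it, save, and exit...

Just thinking about it is tiresome.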

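1. Manual parsing
So let's start parsing command-line arguments.

In an earlier article on modules we mentioned the variable sys.argv, which holds the command-line arguments passed in when the current script was invoked.

Let's take a look at this variable first: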

# test_sys.py
import sys

print(sys.argv)

Invoke it from the command line:

$ python test_sys.py -d today -t now --author justdopython --country China --auto

This produces the following output:

['test_sys.py', '-d', 'today', '-t', 'now', '--author', 'justdopython', '--country', 'China', '--auto']

As you can see, sys.argv is simply a list of strings: the command line split on whitespace. Its first element is the name of the script being run.
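
One detail worth noting is that the splitting is done by the shell before Python starts, so a quoted argument arrives as a single element. The sketch below uses shlex.split from the standard library, which approximates POSIX shell splitting rules; the quoted author value is made up for illustration.

```python
import shlex

# A hypothetical command line with one quoted value containing spaces
command_line = 'python test_sys.py -d today --author "just do python"'

# shlex.split approximates how a POSIX shell would split the line
tokens = shlex.split(command_line)

# The script would see everything after the interpreter name as sys.argv
simulated_argv = tokens[1:]
print(simulated_argv)
# ['test_sys.py', '-d', 'today', '--author', 'just do python']
```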

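To extract each option and its value, we first have to tell long options from short options. They are marked by a leading "--" and "-" respectively, so we use that as the test: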

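import sys


# Skip sys.argv[0], which is the script name
for command_arg in sys.argv[1:]:
    # Test for '--' first, since a long option also starts with '-'
    if command_arg.startswith('--'):
        print("%s is a long option" % command_arg)
    elif command_arg.startswith('-'):
        print("%s is a short option" % command_arg)

Test result:

$ python manually_parse_argv.py -d today -t now --author justdopython --country China --auto

-d is a short option
-t is a short option
--author is a long option
--country is a long option
--auto is a long option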

Next, building on this step, we need to extract the value that goes with each option:

# manually_parse_argv.py
import sys


# sys.argv[0] is the name of the current script, so skip it
args = sys.argv[1:]

for index, command_arg in enumerate(args):
    if command_arg.startswith('--'):
        try:
            # The next token is the value unless it is itself an option
            value = args[index + 1]
            if not value.startswith('-'):
                print("%s is a long option with value %s" % (command_arg, value))
                continue
        except IndexError:
            # Last token on the command line: there is no value
            pass

        print("%s is a long option with no value" % command_arg)

    elif command_arg.startswith('-'):
        try:
            value = args[index + 1]
            if not value.startswith('-'):
                print("%s is a short option with value %s" % (command_arg, value))
                continue
        except IndexError:
            pass

        print("%s is a short option with no value" % command_arg)

Test again:

$ python manually_parse_argv.py -d today -t now --author justdopython --country China --auto

-d is a short option with value today
-t is a short option with value now
--author is a long option with value justdopython
--country is a long option with value China
--auto is a long option with no value

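That looks decent.

But look at our code again... the real logic has not even started, and we have already written dozens of lines just to parse the command line. That is not pythonic at all, and it does not yet include handling for other edge cases.

On top of that, we would have to add a block like this to every similar program.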
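
To make the duplication concrete, here is a minimal sketch of the same manual logic folded into one reusable function. The name parse_manually and the (kind, option, value) tuple layout are assumptions made for this illustration, not part of the original script.

```python
def parse_manually(argv):
    """Pair each option with the next token, unless that token is itself an option.

    Hypothetical helper: returns a list of (kind, option, value) tuples,
    where value is None when the option has no value.
    """
    parsed = []
    for index, token in enumerate(argv):
        if not token.startswith('-'):
            # A plain value; it is picked up by the option in front of it
            continue
        kind = 'long' if token.startswith('--') else 'short'
        following = argv[index + 1] if index + 1 < len(argv) else None
        has_value = following is not None and not following.startswith('-')
        parsed.append((kind, token, following if has_value else None))
    return parsed


sample = ['-d', 'today', '-t', 'now', '--author', 'justdopython',
          '--country', 'China', '--auto']
for kind, option, value in parse_manually(sample):
    print(kind, option, value)
```

Even in this compact form, the function still knows nothing about which options are valid or which ones require a value.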

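2. The getopt module to the rescue
One of the nice things about Python is its rich ecosystem: for almost any feature you need, someone has already written a module you can call.

Being waited on hand and foot is usually the stuff of dreams, but when writing Python it is not too much to hope for.

For command-line parsing, for example, there is a module named getopt that can both reliably distinguish long and short options and correctly extract their values.

Let's take a look: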

# test_getopt.py
import sys
import getopt


opts, args = getopt.getopt(sys.argv[1:], 'd:t:', ["author=", "country=", "auto"])

print(opts)
print(args)

Printed result:

$ python test_getopt.py -d today -t now --author justdopython --country China --auto
[('-d', 'today'), ('-t', 'now'), ('--author', 'justdopython'), ('--country', 'China'), ('--auto', '')]
[]

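Now let's go through what each of these parameters means.

The getopt function in the getopt module parses command-line arguments.

It takes three parameters: args, shortopts and longopts (the last one is optional), which stand for "the command-line arguments", "the short options to accept" and "the long options to accept".

args and longopts are both lists of strings, while shortopts is a single string.

As before, because the first item of sys.argv is the name of the current script, in most cases we pass sys.argv[1:] as args.

The string passed as shortopts says which short options to parse; each letter in it stands for one short option: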

import sys
import getopt


opts, args = getopt.getopt(sys.argv[1:], 'dt')

print(opts)
print(args)

Output:

$ python test_getopt.py -d  -t
[('-d', ''), ('-t', '')]
[]

Of course, supplying fewer options than expected does not make parsing fail:

$ python test_getopt.py  -t
[('-t', '')]
[]

But an option that was not declared makes the module raise an error:

$ python test_getopt.py -d  -t -k
Traceback (most recent call last):
  File "test_getopt.py", line 11, in <module>
    opts, args = getopt.getopt(sys.argv[1:], 'dt')
      ...
    raise GetoptError(_('option -%s not recognized') % opt, opt)
getopt.GetoptError: option -k not recognized

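This behavior matches our experience with ordinary commands. Think of it as "better to go without than to accept something unexpected".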
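
In a real script you would usually catch this exception rather than let the traceback reach the user. A minimal sketch, assuming a hypothetical helper named parse_or_exit, an invented usage line, and the common convention of exit status 2 for usage errors:

```python
import getopt
import sys


def parse_or_exit(argv):
    """Parse argv with getopt; on an unknown option, print a hint and exit."""
    try:
        opts, args = getopt.getopt(argv, 'dt')
    except getopt.GetoptError as err:
        # str(err) is the message, e.g. which option was not recognized
        print("error: %s" % err, file=sys.stderr)
        print("usage: test_getopt.py [-d] [-t]", file=sys.stderr)  # assumed usage text
        sys.exit(2)
    return opts, args


print(parse_or_exit(['-d', '-t']))
# ([('-d', ''), ('-t', '')], [])
```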

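If the letter for a short option is followed by a colon (:), that option requires a value. getopt takes the next command-line argument as the value, whatever form that argument has: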

import sys
import getopt


opts, args = getopt.getopt(sys.argv[1:], 'd:t')

print(opts)
print(args)

# $ python test_getopt.py -d  -t
# [('-d', '-t')]
# []
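
Two related behaviors of short options can be checked the same way. The sketch below passes explicit lists instead of sys.argv[1:] so it can be run without a command line; the sample values are only illustrative.

```python
import getopt

# A value may be attached directly to a short option that requires one
attached, _ = getopt.getopt(['-dtoday', '-t'], 'd:t')
print(attached)
# [('-d', 'today'), ('-t', '')]

# Short options without values may be combined behind a single dash
combined, _ = getopt.getopt(['-dt'], 'dt')
print(combined)
# [('-d', ''), ('-t', '')]

# With 'd:', the text after -d is always the value, even if it looks like an option
greedy, _ = getopt.getopt(['-d', '-t'], 'd:t')
print(greedy)
# [('-d', '-t')]
```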

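In addition, as soon as getopt finds a string that does not start with "--" or "-" at a position where it expects an option, it stops parsing. The remaining unparsed strings are all returned as the second item of the result tuple.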

$ python test_getopt.py -d d_value o --pattern -t
[('-d', 'd_value')]
['o', '--pattern', '-t']
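
If you would rather keep parsing options that appear after a non-option argument, the same module also provides gnu_getopt, which allows options and other arguments to be mixed. A short comparison, assuming --pattern is declared as a long option and that the POSIXLY_CORRECT environment variable is not set (setting it makes gnu_getopt stop at the first non-option too):

```python
import getopt

argv = ['-d', 'd_value', 'o', '--pattern', '-t']

# getopt stops at the first non-option argument ('o')
strict_opts, strict_rest = getopt.getopt(argv, 'd:t', ['pattern'])
print(strict_opts, strict_rest)

# gnu_getopt keeps scanning and collects non-options separately
gnu_opts, gnu_rest = getopt.gnu_getopt(argv, 'd:t', ['pattern'])
print(gnu_opts, gnu_rest)
```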

Similarly, the longopts parameter lists the long options to parse.

Each string in the list stands for one long option:

import sys
import getopt


opts, args = getopt.getopt(sys.argv[1:], '', ["author","country"])

print(opts)
print(args)

# $ python test_getopt.py --author  --country
# [('--author', ''), ('--country', '')]
# []

To parse a long option that takes a value, append an equals sign (=) to its name in the list to mark that a value is required:

import sys
import getopt


opts, args = getopt.getopt(sys.argv[1:], '', ["author=","country"])

print(opts)
print(args)

# $ python test_getopt.py --author justdopython --country
# [('--author', 'justdopython'), ('--country', '')]
# []
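
A long option's value can also be written in the same token, joined with an equals sign. A small sketch using explicit argument lists (the values are just the sample data used throughout this article):

```python
import getopt

longopts = ['author=', 'country']

# Value given as a separate argument
separate, _ = getopt.getopt(['--author', 'justdopython', '--country'], '', longopts)

# Value attached with '=' in the same argument
attached, _ = getopt.getopt(['--author=justdopython', '--country'], '', longopts)

print(separate)
print(attached)
# Both forms give [('--author', 'justdopython'), ('--country', '')]
```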

And so we arrive back at the parse result we saw at the beginning:

import sys
import getopt


opts, args = getopt.getopt(sys.argv[1:], 'd:t:', ["author=", "country=", "auto"])

print(opts)
print(args)

# $ python test_getopt.py -d today -t now --author justdopython --country China --auto
# [('-d', 'today'), ('-t', 'now'), ('--author', 'justdopython'), ('--country', 'China'), ('--auto', '')]
# []

Once parsing is done, all that remains is to pull the values we need out of opts.
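
One way to do that extraction is to loop over the (option, value) pairs and fill a dictionary of settings. This is only a sketch: the function name build_config, the dictionary keys, and the defaults are assumptions chosen for illustration.

```python
import getopt


def build_config(argv):
    """Turn getopt's (option, value) pairs into a settings dictionary."""
    opts, args = getopt.getopt(argv, 'd:t:', ['author=', 'country=', 'auto'])

    # Hypothetical defaults used when an option is not supplied
    config = {'date': None, 'time': None, 'author': None,
              'country': None, 'auto': False}

    for option, value in opts:
        if option == '-d':
            config['date'] = value
        elif option == '-t':
            config['time'] = value
        elif option == '--author':
            config['author'] = value
        elif option == '--country':
            config['country'] = value
        elif option == '--auto':
            # A flag carries no value; its presence is the information
            config['auto'] = True
    return config, args


config, rest = build_config(['-d', 'today', '-t', 'now',
                             '--author', 'justdopython',
                             '--country', 'China', '--auto'])
print(config)
print(rest)
```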

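Good news for the lazy
Besides saving us the time and effort of writing our own parsing code, getopt also lets you type a few letters fewer when entering options. Strictly speaking, though, we do not recommend relying on this. Use with caution!

getopt supports prefix matching for long options: as long as what you type matches exactly one of the declared options, it is parsed as expected.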

$ python test_getopt.py -d today -t now --auth justdopython --coun China --auto
[('-d', 'today'), ('-t', 'now'), ('--author', 'justdopython'), ('--country', 'China'), ('--auto', '')]
[]

As you can see, we typed only part of the author and country options, yet getopt still parsed them correctly.
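
The "exactly one match" condition matters: with the options declared above, the prefix --au fits both author and auto, so getopt refuses it. A short sketch that shows both cases using explicit argument lists:

```python
import getopt

longopts = ['author=', 'country=', 'auto']

# '--auth' can only mean '--author', so it is accepted and expanded
unique, _ = getopt.getopt(['--auth', 'justdopython'], '', longopts)
print(unique)
# [('--author', 'justdopython')]

# '--au' matches both 'author' and 'auto', so getopt raises GetoptError
ambiguous_message = None
try:
    getopt.getopt(['--au'], '', longopts)
except getopt.GetoptError as err:
    ambiguous_message = str(err)
print(ambiguous_message)
```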

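Summary
This article covered two ways to parse command-line options in Python. One is the somewhat clumsy manual approach, where you write your own parsing code; the other is to call the ready-made and more robust getopt module.

From now on, we can finally leave tedious configuration files behind and change a program's behavior in a concise, elegant way.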
