Command-line options and file-based configuration

Updated 2023-01-06

1 Command-line options

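"""
python xxx.py -a -b arg1 -c arg2 arg3 arg4 arg5
"""

import getopt
import sys


def getopts(argv):
    try:
        # "ab:c:": -a is a bare flag; -b and -c each take one argument
        opts, args = getopt.getopt(argv, "ab:c:")
    except getopt.GetoptError as err:
        # Print the actual parse error, not the exception class
        print(err, file=sys.stderr)
        sys.exit(1)

    print(opts)  # parsed (option, value) pairs
    print(args)  # remaining positional arguments

    for opt, arg in opts:
        if opt == "-a":
            pass
        elif opt == "-b":
            pass
        elif opt == "-c":
            pass
    return 0


getopts(sys.argv[1:])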
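
To make the option string concrete, here is a small sketch that feeds the sample command line from the comment above to getopt and shows what comes back. The helper name parse_demo is hypothetical and used only for illustration.

```python
import getopt


def parse_demo(argv):
    """Split argv into (option, value) pairs and leftover positional arguments."""
    # "ab:c:" means: -a is a bare flag, -b and -c each consume one argument.
    return getopt.getopt(argv, "ab:c:")


opts, args = parse_demo(["-a", "-b", "arg1", "-c", "arg2", "arg3", "arg4", "arg5"])
print(opts)  # [('-a', ''), ('-b', 'arg1'), ('-c', 'arg2')]
print(args)  # ['arg3', 'arg4', 'arg5']
```

Parsing stops at the first non-option argument, which is why arg3, arg4, and arg5 end up in args.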

2 File-based configuration

Reference: "Why JSON isn't a good configuration language"

2.1 JSON

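Many projects use JSON as their configuration language, for example package.json, which npm and yarn use. Why is JSON so popular? Convenience, naturally: almost every tool supports it. As a configuration language, though, JSON is actually a poor choice: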

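  1. Hard to comment. Comments are essential in a configuration language. You can fake one by using a special key in an object, such as "//" or "__comment", but that is hard to read and breaks down when a single object needs several comments. Douglas Crockford, the creator of JSON, suggests stripping comments with a preprocessor before parsing. Yet some libraries, such as Ruby's json module, accept comments in their input, so a file that one parser accepts may be rejected by another.
  2. Too strict. The strict specification is exactly why JSON is so easy to parse, but it costs both readability and writability.
  3. No line breaks. A newline inside a string must be escaped as \n. If you want a string to span several lines in the file itself, give up: JSON cannot do that.
  4. Low signal-to-noise ratio. JSON carries a lot of punctuation that does nothing for configuration, such as the outermost braces and the quotes around keys, and it makes the file noisy. JSON also forbids a trailing comma, even though trailing commas make it easier to add new entries.
  5. Limited numbers. The JSON specification defines numbers as finite values in decimal notation, so hexadecimal numbers, infinity, and NaN cannot be represented. The grammar sets no precision limit, but most parsers map numbers to double-precision floats.
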
{
  "name": "example",
  "description": "A really long description that needs multiple lines.\nThis is a sample project to illustrate why JSON is not a good configuration format. This description is pretty long, but it doesn't have any way to go onto multiple lines.",
  "version": "0.0.1",
  "main": "index.js",
  "//": "This is as close to a comment as you are going to get",
  "keywords": ["example", "config"],
  "scripts": {
    "test": "./test.sh",
    "do_stuff": "./do_stuff.sh"
  },
  "bugs": {
    "url": "https://example.com/bugs"
  },
  "contributors": [{
    "name": "John Doe",
    "email": "johndoe@example.com"
  }, {
    "name": "Ivy Lane",
    "url": "https://example.com/ivylane"
  }],
  "dependencies": {
    "dep1": "^1.0.0",
    "dep2": "3.40",
    "dep3": "6.7"
  }
}

import json

# json.load reads and parses the file object directly
with open("xxx.json") as f:
    json_dict = json.load(f)
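
Several of the limitations listed above can be checked directly with Python's json module. The following is a minimal sketch; the sample inputs are made up for illustration.

```python
import json

# Each sample uses one feature that the JSON grammar does not allow.
samples = {
    "comment": '{"a": 1 // note\n}',
    "trailing comma": '{"a": 1,}',
    "raw newline in string": '{"a": "line1\nline2"}',
    "hexadecimal number": '{"a": 0x10}',
}

rejected = []
for label, text in samples.items():
    try:
        json.loads(text)
    except json.JSONDecodeError:
        rejected.append(label)

print(rejected)
```

All four samples should be rejected, which is the strictness the list describes: easy for a parser, inconvenient for a person editing the file by hand.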

2.2 HJSON and JSON5

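HJSON is a JSON-based format that fixes some of the problems above: it allows comments, unquoted keys, trailing commas, and multi-line strings. Command-line tools exist that convert HJSON to JSON, so you can write HJSON first and convert it afterwards.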

{
  name: example
  description: '''
  A really long description that needs multiple lines.

  This is a sample project to illustrate why JSON is
  not a good configuration format.  This description
  is pretty long, but it doesn't have any way to go
  onto multiple lines.
  '''
  version: 0.0.1
  main: index.js
  # This is a comment
  keywords: ["example", "config"]
  scripts: {
    test: ./test.sh
    do_stuff: ./do_stuff.sh
  }
  bugs: {
    url: https://example.com/bugs
  }
  contributors: [{
    name: John Doe
    email: johndoe@example.com
  } {
    name: Ivy Lane
    url: https://example.com/ivylane
  }]
  dependencies: {
    dep1: ^1.0.0
    # Why we have this dependency
    dep2: "3.40"
    dep3: "6.7"
  }
}

JSON5 is very similar to HJSON.

2.3 HOCON

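HOCON is a superset of JSON, so existing JSON files work as they are. Beyond comments, trailing commas, and multi-line strings, HOCON can also include other files and reference other keys (using dots as path separators), which avoids duplication.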

name = example
description = """
A really long description that needs multiple lines.

This is a sample project to illustrate why JSON is
not a good configuration format.  This description
is pretty long, but it doesn't have any way to go
onto multiple lines.
"""
version = 0.0.1
main = index.js
# This is a comment
keywords = ["example", "config"]
scripts {
  test = ./test.sh
  do_stuff = ./do_stuff.sh
}
bugs.url = "https://example.com/bugs"
contributors = [
  {
    name = John Doe
    email = "johndoe@example.com"
  }
  {
    name = Ivy Lane
    url = "https://example.com/ivylane"
  }
]
dependencies {
  dep1 = "^1.0.0"
  # Why we have this dependency
  dep2 = "3.40"
  dep3 = "6.7"
}

2.4 YAML

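YAML (YAML Ain't Markup Language) is almost a superset of JSON. It is more flexible, it supports references (anchors and aliases let one node reuse another within a document), and YAML libraries are as ubiquitous as JSON libraries. The downside is that the same value can be written in many ways, and different spellings may be parsed into different results. Copying and pasting is also awkward, because indentation is significant.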

name: example
description: >
  A really long description that needs multiple lines.

  This is a sample project to illustrate why JSON is not a good
  configuration format. This description is pretty long, but it
  doesn't have any way to go onto multiple lines.
version: 0.0.1
main: index.js
# this is a comment
keywords:
  - example
  - config
scripts:
  test: ./test.sh
  do_stuff: ./do_stuff.sh
bugs:
  url: "https://example.com/bugs"
contributors:
  - name: John Doe
    email: johndoe@example.com
  - name: Ivy Lane
    url: "https://example.com/ivylane"
dependencies:
  dep1: ^1.0.0
  # Why we depend on dep2
  dep2: "3.40"
  dep3: "6.7"

There are two ways to write a multi-line string: put | after the colon to keep the line breaks, or > to fold them into spaces.

"""
pip install pyyaml
"""

import yaml

# safe_load builds only plain Python types, which is enough for configuration
with open("xxx.yaml") as f:
    yaml_dict = yaml.safe_load(f)
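
The two block-string styles, and the way quoting changes the parsed type, can be seen in a short sketch. It assumes PyYAML is installed; the key names are made up for illustration.

```python
import yaml  # third-party: pip install pyyaml

doc = """
literal: |
  line one
  line two
folded: >
  line one
  line two
unquoted: 3.40
quoted: "3.40"
"""

data = yaml.safe_load(doc)
print(repr(data["literal"]))   # line breaks kept
print(repr(data["folded"]))    # line breaks folded into spaces
print(repr(data["unquoted"]))  # parsed as a float, so the trailing zero is lost
print(repr(data["quoted"]))    # stays a string
```

This is one reason the version strings in the example above are quoted: an unquoted 3.40 would silently become the number 3.4.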

2.5 TOML

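TOML is an increasingly popular configuration language and is much simpler than YAML. Cargo (Rust's build tool), pip (Python's package manager), and dep (Go's dependency manager) all use TOML. TOML resembles INI, except that it has well-defined syntax for nested structures. With a lot of nesting, TOML can become verbose.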

name = "example"
description = """
A really long description that needs multiple lines.
This is a sample project to illustrate why JSON is not a \
good configuration format. This description is pretty long, \
but it doesn't have any way to go onto multiple lines."""

version = "0.0.1"
main = "index.js"
# This is a comment
keywords = ["example", "config"]

[bugs]
url = "https://example.com/bugs"

[scripts]
test = "./test.sh"
do_stuff = "./do_stuff.sh"

[[contributors]]
name = "John Doe"
email = "johndoe@example.com"

[[contributors]]
name = "Ivy Lane"
url = "https://example.com/ivylane"

[dependencies]
dep1 = "^1.0.0"
# Why we depend on dep2
dep2 = "3.40"
dep3 = "6.7"

2.6 INI

An INI (initialization) file consists of sections, keys, and values.

;comment text

[Section1 Name]
KeyName11=Value11
KeyName12=Value12

[Section2 Name]
KeyName21=Value21
KeyName22=Value22
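
A file like this can be read with Python's standard configparser module. The sketch below parses a shortened copy of the example from a string rather than from disk.

```python
import configparser

text = """
;comment text

[Section1 Name]
KeyName11=Value11
KeyName12=Value12

[Section2 Name]
KeyName21=Value21
"""

parser = configparser.ConfigParser()
parser.read_string(text)

print(parser.sections())                     # section names, in file order
print(parser["Section1 Name"]["KeyName11"])  # key lookup is case-insensitive
print(list(parser["Section2 Name"]))         # keys are stored in lower case by default
```

Every value comes back as a string; configparser offers getint, getfloat, and getboolean for explicit conversion.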

2.7 Summary

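If your program is written in a scripting language and the configuration comes from a trusted source, the best choice is to write the configuration in that language itself: it is flexible and convenient. A program written in a compiled language can embed a scripting language for the same purpose, though that may introduce serious security problems.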
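
For a Python program, one possible way to do this is to execute a trusted configuration script and collect its top-level names. The sketch below uses runpy from the standard library; the file name config.py and its contents are hypothetical, and the file is written to a temporary directory only to keep the example self-contained.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical configuration script; in practice this would be a file on disk.
CONFIG_SOURCE = '''
name = "example"
version = "0.0.1"
# Comments and computed values come for free.
scripts = {task: f"./{task}.sh" for task in ("test", "do_stuff")}
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.py"
    path.write_text(CONFIG_SOURCE, encoding="utf-8")
    # run_path executes the file, so only use it with trusted sources.
    namespace = runpy.run_path(str(path))

# Drop names such as __name__ and __file__ that the interpreter adds.
config = {key: value for key, value in namespace.items() if not key.startswith("_")}
print(config)
```

Because the script is executed, it can do anything the program can, which is why this approach only suits configuration from a reliable source.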

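With so many better configuration languages available, there is no reason to stick with JSON. Even if some requirement rules out every language above (which is almost never the case), you can consider designing your own. Think carefully before you do: you will have to write and maintain a parser yourself, and your users will have to learn an unfamiliar configuration language.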