The Logging Framework in Python (the logging Library)


Background

https://murphypei.github.io/blog/2019/09/python-logging

The Python version used here is Python 3.8.8.

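Part 1: The built-in `logging` package

1.1 Log levels

Level	Description	Value
`CRITICAL`	Critical: a serious error that prevents the application from continuing to run	50
`ERROR`	Error	40
`WARNING`	Warning	30
`INFO`	Informational	20
`DEBUG`	Debugging	10
`NOTSET`	Not set (no level assigned)	0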

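The logging package also lets users define their own levels. Levels are compared by their numeric values. For example:

# Signature:
# logging.addLevelName(level, levelName)
logging.addLevelName(25, 'POINT')

This defines a new level named POINT with the value 25, which places it between INFO (20) and WARNING (30).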
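
A minimal sketch of how such a custom level might be used once it is registered. The logger name level_demo and the in-memory stream are assumptions made for illustration; only addLevelName, Logger.log and setLevel come from the logging package itself.

```python
import io
import logging

POINT = 25  # sits between INFO (20) and WARNING (30)
logging.addLevelName(POINT, "POINT")

# Write to an in-memory stream so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("level_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(POINT)  # threshold: POINT and above

logger.info("dropped, because INFO (20) < POINT (25)")
logger.log(POINT, "checkpoint reached")  # custom levels are logged via log()
logger.warning("kept, because WARNING (30) > POINT (25)")

output = stream.getvalue()
print(output)
```

Because levels are compared numerically, setting the threshold to POINT drops the INFO call and keeps the other two.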

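1.2 Logging flow

See the official documentation: https://docs.python.org/3/howto/logging.html#logging-basic-tutorial

The detailed flow is shown in the figure below. The roles involved are:

Logger: the logger object. Once defined, it is what application code calls. It checks whether it is enabled and whether the level of the call passes; if not, the flow ends.
LogRecord: the log record created for each logging call. If a filter rejects it, the flow ends; otherwise it is passed on to the corresponding handlers.
Handler: sends the log records produced by a logger to the appropriate destination. Many handler types are available.
Filter: provides finer-grained control over which log records are output.
Formatter: specifies the layout of log records in the final output.

[Figure: logging_flow]

The official documentation splits the flow into a Logger flow and a Handler flow. The processing steps are:

1. Check whether the Logger is enabled for the level of the call. If so, continue; otherwise the flow ends.
2. Create a LogRecord. If a Filter attached to the Logger rejects the record (returns False), nothing is logged and the flow ends; otherwise continue.
3. Pass the LogRecord to each Handler attached to the current Logger. Within each Handler (the sub-flow in the figure), the record is processed only if its level is at or above the Handler's level and every Filter attached to the Handler returns True; the Handler then formats and emits it. Otherwise that Handler skips the record.
4. Check the propagate flag of the current Logger. If it is false, the flow ends; otherwise continue.
5. Check whether the current Logger has a parent. If not (the current Logger is the root Logger at the top of the hierarchy), the flow ends. Otherwise set the current Logger to its parent and repeat steps 3 and 4, so that the parent's handlers also emit the record, until the root Logger is reached.

The sections below revisit this flow with a concrete configuration file.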
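
The steps above can be traced with a small experiment. This is a sketch under stated assumptions: ListHandler is a hypothetical helper that stores messages in a list, and the logger names flow_demo and flow_demo.child are made up for the demonstration.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that stores formatted messages in a list."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


parent = logging.getLogger("flow_demo")
child = logging.getLogger("flow_demo.child")
parent.propagate = False  # stop at the parent so root is not involved

parent_handler = ListHandler()                   # accepts every level
child_handler = ListHandler(level=logging.ERROR)  # accepts ERROR and above
parent.addHandler(parent_handler)
child.addHandler(child_handler)

child.setLevel(logging.INFO)
# Logger-level filter: reject any record whose message contains "secret".
child.addFilter(lambda record: "secret" not in record.getMessage())

child.debug("below the logger level")  # dropped by the level check
child.info("a secret value")           # dropped by the logger filter
child.info("plain info")               # skipped by child_handler, emitted by parent_handler
child.error("plain error")             # emitted by both handlers

print(child_handler.messages)
print(parent_handler.messages)
```

The parent's handler receives the INFO record even though the child's own handler skipped it, which shows that handler levels are evaluated per handler while propagation continues up the hierarchy.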

1.3 Basic usage

1.3.1 Example

import logging

logging.basicConfig(filename="test.log",
                    filemode="a",
                    format="%(asctime)s %(name)s:%(levelname)s:%(message)s",
                    datefmt="%d-%m-%Y %H:%M:%S",  # %m is the month; %M would be the minute
                    level=logging.INFO)

logging.debug('debug')
logging.info('info')
logging.warning('warning')
logging.error('error')
logging.critical('critical')
# Output in test.log (created automatically if it does not exist; records are appended).
# The debug call is dropped because the level is INFO.
#05-05-2022 20:38:32 root:INFO:info
#05-05-2022 20:38:32 root:WARNING:warning
#05-05-2022 20:38:32 root:ERROR:error
#05-05-2022 20:38:32 root:CRITICAL:critical

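basicConfig accepts the following configuration parameters:

Parameter	Description
`filename`	Name of the file the log is written to
`filemode`	Mode used to open the file, such as r[+], w[+] or a[+]; the default is a
`format`	Format of the log output
`datefmt`	Format of the date and time attached to each record
`style`	Placeholder style of the format string: "%", "{" or "$"; the default is "%"
`level`	Threshold level of the root logger
`stream`	Output stream used to initialize a `StreamHandler`; cannot be combined with `filename`, otherwise a `ValueError` is raised
`handlers`	An iterable of already created `Handler` objects to attach to the root logger; cannot be combined with `filename` or `stream`, otherwise a `ValueError` is also raised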
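
A short sketch of two rows of the table: the conflict between `filename` and `stream`, and the `style` parameter. It assumes Python 3.8 or later, because it passes force=True so that basicConfig reconfigures the root logger even if handlers are already attached; the file name unused.log is a placeholder that is never opened.

```python
import io
import logging

# Passing filename and stream together is rejected with ValueError.
try:
    logging.basicConfig(filename="unused.log", stream=io.StringIO(), force=True)
    conflict = None
except ValueError as exc:
    conflict = str(exc)
print(conflict)

# A valid call: log to an in-memory stream using "{"-style placeholders.
buffer = io.StringIO()
logging.basicConfig(
    stream=buffer,
    style="{",
    format="{levelname}|{name}|{message}",
    level=logging.INFO,
    force=True,  # Python 3.8+: replace any handlers already on root
)
logging.info("configured with brace style")
logging.debug("below INFO, so this is dropped")

print(buffer.getvalue())
```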

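1.3.2 Log records

Every log entry is a LogRecord object; the full list of its attributes is given in the table below. In the example, the format parameter defines the output layout, for example format="%(asctime)s %(name)s:%(levelname)s:%(message)s". The message attribute carries the text supplied by the user; the remaining attributes are selected and arranged by the user as needed.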

Attribute	Format	Description
args	You shouldn’t need to format this yourself.	The tuple of arguments merged into `msg` to produce `message`, or a dict whose values are used for the merge (when there is only one argument, and it is a dictionary).
asctime	`%(asctime)s`	Human-readable time at which the record was created. By default it has millisecond precision, for example `2018-10-13 23:24:57,832`; the `datefmt` parameter can be used to change the format.
created	`%(created)f`	Time when the `LogRecord` was created (as returned by `time.time()`).
exc_info	You shouldn’t need to format this yourself.	Exception tuple (à la `sys.exc_info`) or, if no exception has occurred, None.
filename	`%(filename)s`	File name without the path.
funcName	`%(funcName)s`	Name of the function containing the logging call.
levelname	`%(levelname)s`	Text name of the level (`'DEBUG'`, `'INFO'`, `'WARNING'`, `'ERROR'`, `'CRITICAL'`).
levelno	`%(levelno)s`	Numeric logging level for the message (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`).
lineno	`%(lineno)d`	Line number of the logging call.
module	`%(module)s`	Module (name portion of `filename`).
msecs	`%(msecs)d`	Millisecond portion of the time when the `LogRecord` was created.
message	`%(message)s`	The logged message, computed as `msg % args`. This is set when `Formatter.format()` is invoked.
msg	You shouldn’t need to format this yourself.	The format string passed in the original logging call. Merged with `args` to produce `message`, or an arbitrary object (see Using arbitrary objects as messages).
name	`%(name)s`	Name of the logger.
pathname	`%(pathname)s`	Full path of the source file containing the logging call.
process	`%(process)d`	Current process ID.
processName	`%(processName)s`	Current process name.
relativeCreated	`%(relativeCreated)d`	Time in milliseconds when the LogRecord was created, relative to the time the logging module was loaded.
stack_info	You shouldn’t need to format this yourself.	Stack frame information (where available) from the bottom of the stack in the current thread, up to and including the stack frame of the logging call which resulted in the creation of this record.
thread	`%(thread)d`	Current thread ID.
threadName	`%(threadName)s`	Current thread name.

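If the built-in attributes are not enough, the logging package has provided the factory functions getLogRecordFactory() and setLogRecordFactory() since Python 3.2, which let users add their own record attributes. In a distributed deployment, for example, each log entry may need the host name and IP address of the node the program runs on, and the user has to add these. For example: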

import logging
import socket

historyFactory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    record = historyFactory(*args, **kwargs)
    record.hostname = socket.gethostname()
    record.hostip = socket.gethostbyname(socket.gethostname())
    return record

logging.setLogRecordFactory(record_factory)

The format string can then reference the newly defined attributes hostname and hostip. In effect, the function above adds two attributes to every record.

format: "%(asctime)s - %(hostname)s - %(hostip)s - %(levelname)s - %(name)s(%(lineno)d) - %(message)s"
# Output: 2022-05-08 15:24:54,163 - localhost-PC - 192.168.152.1 - INFO - TestLogger(48) - info

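Another option is to subclass logging.Formatter, for example (the snippet lives in __init__.py of a Python module named logTest):

import logging
import socket

class CustomFormatter(logging.Formatter):
    def format(self, record):
        # Attach host information before the record is formatted
        record.hostname = socket.gethostname()
        record.hostip = socket.gethostbyname(socket.gethostname())
        return super().format(record)

With a YAML configuration file, the configuration looks like this: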

formatters:
  custom:
    format: "%(asctime)s - %(hostname)s - %(hostip)s - %(levelname)s - %(name)s(%(lineno)d) - %(message)s"
    (): logTest.CustomFormatter

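The same result can be achieved with a custom log filter. A filter can drop records, but it can also add attributes to a record or rewrite it, which makes it quite flexible. Filters are covered later.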

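1.3.3 Log rotation

Programs in production run continuously, so log files need lifecycle management to keep them from growing without bound. The usual approach is log rotation: the log is split according to a rule, and old files are removed according to a retention policy.

The logging package offers two kinds of rotation: by time and by size.

By size: the logging.handlers.RotatingFileHandler class.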

import logging
from logging.handlers import RotatingFileHandler

# logging.basicConfig()
logger = logging.getLogger('test')
logger.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s')
# Define a RotatingFileHandler that keeps at most 5 backup files, each at most about 0.01 MB
rotatingHandler = RotatingFileHandler('test.log', maxBytes=int(0.01*1024*1024), backupCount=5)

rotatingHandler.setFormatter(formatter)
logger.addHandler(rotatingHandler)

# Test case (runs until interrupted)
if __name__ == "__main__":
    while True:
        logger.info("info")
# This yields at most 6 files: test.log plus the backups test.log.1 to test.log.5
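
A bounded variant of the experiment may be easier to inspect than an endless loop. The sketch below assumes a temporary directory and deliberately tiny limits (maxBytes=200, backupCount=2) chosen only to force several rollovers; the logger name rotate_demo is hypothetical.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()  # throwaway directory for the demo
log_path = os.path.join(log_dir, "test.log")

# Tiny limits so that 100 short messages trigger several rollovers.
handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=2)
handler.setFormatter(logging.Formatter("%(message)s"))

logger = logging.getLogger("rotate_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for i in range(100):
    logger.info("message number %03d", i)

handler.close()
files = sorted(os.listdir(log_dir))
print(files)  # the active file plus at most backupCount backups
```

Only the active file and the two newest backups survive; older backups are deleted on each rollover.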

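By time: the logging.handlers.TimedRotatingFileHandler class.

Its signature is: TimedRotatingFileHandler(filename [,when [,interval [,backupCount]]])

filename: prefix of the output file name.

when: unit of the interval, one of:

"S": Seconds
"M": Minutes
"H": Hours
"D": Days
"W0"-"W6": Weekday (0=Monday)
"midnight": Roll over at midnight

interval: number of when units to wait before the handler rolls over to a new file.
backupCount: number of rotated files to keep; the default 0 means old files are never deleted.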

import logging
from logging.handlers import TimedRotatingFileHandler

logger = logging.getLogger('test')
logger.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s')
# Define a TimedRotatingFileHandler that keeps at most 5 backup files and rotates once per minute
timedRotatingFileHandler = TimedRotatingFileHandler('Test.log', "M", 1, 5)

# Note: with a custom suffix, the handler's extMatch pattern may also need adjusting,
# otherwise the backupCount cleanup may not recognize the rotated files.
timedRotatingFileHandler.suffix = "%Y%m%d-%H%M.log"
timedRotatingFileHandler.setFormatter(formatter)
logger.addHandler(timedRotatingFileHandler)

# Test case (runs until interrupted)
if __name__ == "__main__":
    while True:
        logger.info("info")
# Rotated file: Test.log.20220505-2309.log; the name is filename + "." + suffix

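1.3.4 `handler` classes

The logging package ships a large number of handler classes. The source of StreamHandler, FileHandler and NullHandler is in __init__.py; the other handlers are in handlers.py. They are:

StreamHandler instances send messages to streams (file-like objects), such as sys.stderr, sys.stdout or a file.
FileHandler instances send messages to disk files.
BaseRotatingHandler is the base class for handlers that rotate log files. It is not meant to be instantiated directly; use RotatingFileHandler or TimedRotatingFileHandler instead.
RotatingFileHandler instances send messages to disk files, with support for maximum log file sizes and log file rotation.
TimedRotatingFileHandler instances send messages to disk files, rotating the log file at certain timed intervals.
SocketHandler instances send messages to TCP/IP sockets. Since 3.4, Unix domain sockets are also supported.
DatagramHandler instances send messages to UDP sockets. Since 3.4, Unix domain sockets are also supported.
SMTPHandler instances send messages to a designated email address.
SysLogHandler instances send messages to a Unix syslog daemon, possibly on a remote machine.
NTEventLogHandler instances send messages to a Windows NT/2000/XP event log.
MemoryHandler instances send messages to a buffer in memory, which is flushed whenever specific criteria are met.
HTTPHandler instances send messages to an HTTP server using either GET or POST.
WatchedFileHandler instances watch the file they are logging to. If the file changes, it is closed and reopened using the file name. This handler is only useful on Unix-like systems; Windows does not support the underlying mechanism.
QueueHandler instances send messages to a queue, such as those implemented in the queue or multiprocessing modules.
NullHandler instances do nothing with error messages. They are used by library developers who want to use logging but want to avoid the "No handlers could be found for logger XXX" message that can appear if the library user has not configured logging. See Configuring Logging for a Library for more information.

Users can also subclass logging.Handler to implement their own handler and then reference it in the configuration.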
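
A minimal sketch of such a user-defined handler. RingBufferHandler is a hypothetical class invented for this illustration: it keeps only the most recent formatted records in memory. The only requirement taken from the logging package is that a Handler subclass implements emit().

```python
import collections
import logging


class RingBufferHandler(logging.Handler):
    """Hypothetical handler that keeps only the newest formatted records."""

    def __init__(self, capacity, level=logging.NOTSET):
        super().__init__(level)
        self.buffer = collections.deque(maxlen=capacity)

    def emit(self, record):
        try:
            self.buffer.append(self.format(record))
        except Exception:
            # Delegate to the standard error hook instead of raising
            # into the application code that made the logging call.
            self.handleError(record)


ring = RingBufferHandler(capacity=2)
ring.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("ring_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(ring)

for n in range(4):
    logger.info("event %d", n)

recent = list(ring.buffer)
print(recent)
```

In a dictConfig-style configuration, such a class would be referenced through the handler's class key by its import path.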

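1.3.5 Turning logging off

In special situations where no log output is wanted, logging provides the corresponding switches. A defined logger can be turned off like this:

logger = logging.getLogger("FileLogger")
logger.disabled = True

This turns the logger off. In a configuration file, the handlers list can be left empty instead. Note that when no handler is found at all, records at WARNING and above still reach the fallback handler logging.lastResort.

FileLogger:
   handlers: []
   level: DEBUG
   propagate: no

To suppress all log output at or below a given level, use:

logging.disable(logging.INFO) # disables all logging at INFO and below

Note: this setting applies to every logger in the process.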
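
A small sketch of the effect of logging.disable, assuming a hypothetical logger named disable_demo that writes level names to an in-memory stream. Passing logging.NOTSET afterwards lifts the restriction again.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s"))

logger = logging.getLogger("disable_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(handler)

logging.disable(logging.INFO)    # suppress INFO and everything below it
logger.debug("suppressed")
logger.info("suppressed")
logger.warning("still emitted")

logging.disable(logging.NOTSET)  # lift the process-wide restriction
logger.info("emitted again")

levels = stream.getvalue().split()
print(levels)
```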

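1.3.6 Log filters

Not everything that application code logs should end up in the output. That is why each record is given a severity level such as INFO or DEBUG when it is logged. For some requirements, however, this coarse-grained classification is not enough. Is there a programmable, user-defined filter function filter(record)?

Unlike handlers, logging does not ship a rich set of ready-made filter classes. It would be helpful if the project provided some commonly used filter classes, as the Java logging framework logback does.

Since logging has no ready-made filters, users have to write their own by subclassing logging.Filter. For example: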

import logging

logger = logging.getLogger('test')

class NoStringFilter(logging.Filter):
    def filter(self, record):
        # Drop records whose message starts with 'user'
        return not record.getMessage().startswith('user')

logger.addFilter(NoStringFilter())

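The filter method takes the log record as its only argument. A return value of 0 or False means the record is discarded; 1 or True means the record is let through.

The example above drops log messages that start with user. Configuration through a YAML file is covered later.

Since Python 3.2, a custom filter no longer has to subclass logging.Filter: any callable that accepts a record can be passed to logger.addFilter. A typical use of a filter is masking sensitive information, such as passwords, before it is written to the log.

The following snippet lives in __init__.py of a Python module named logTest.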

import logging

class CustomFilter(logging.Filter):
    def filter(self, record):
        # Mask sensitive keywords; non-string messages are left untouched
        if isinstance(record.msg, str):
            record.msg = record.msg.replace("password", "*****").replace("pwd", "*****")
        return True

The filter masks the keywords password and pwd wherever they appear in msg.

In the YAML configuration file it is configured as follows; the console handler applies the filter.

filters:
    filter_pwd:
      (): logTest.CustomFilter
handlers:
  console:
     class: logging.StreamHandler
     level: INFO
     formatter: custom
     filters: [filter_pwd]

With this in place, the keywords are masked in any log message that contains them.

logger.error("user：admin and password") # user：admin and *****
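
Since Python 3.2 a plain function can serve as the filter. The sketch below is a variation, not the filter shown above: it assumes secrets appear as password=... or pwd=... pairs and masks the value rather than the keyword. The pattern, the function name mask_secrets and the logger name mask_demo are all assumptions made for illustration.

```python
import io
import logging
import re

# Hypothetical pattern: the value that follows "password=" or "pwd=".
SECRET = re.compile(r"\b(password|pwd)=\S+")


def mask_secrets(record):
    """Filter function: rewrite the message, then let the record through."""
    if isinstance(record.msg, str):
        record.msg = SECRET.sub(r"\1=*****", record.msg)
    return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(mask_secrets)  # a plain callable is accepted as a filter

logger = logging.getLogger("mask_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(handler)

logger.error("login failed: user=admin password=hunter2")
masked = stream.getvalue()
print(masked)
```

Attaching the filter to the handler rather than the logger means every record that reaches this output is masked, including records propagated from child loggers.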

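1.4 Best practices

1.4.1 `yaml` configuration file

For a Python project running in production, logging parameters should live in configuration rather than being coupled to application code. The first step is therefore to move the logging settings into a configuration file. Common choices are ini, YAML, JSON and a Python file; YAML is recommended.

For example, the following configuration file (logConfig.yaml):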

version: 1
disable_existing_loggers: True
incremental: False

formatters:
  standard:
      format: "%(asctime)s - %(levelname)s - %(name)s(%(lineno)d) - %(message)s"
  error:
    format: "%(asctime)s-%(levelname)s <PID %(process)d:%(processName)s> %(name)s.%(funcName)s(): %(message)s"
  custom:
    format: "%(asctime)s - %(hostname)s - %(hostip)s - %(levelname)s - %(name)s(%(lineno)d) - %(message)s"
    (): logTest.CustomFormatter

filters:
    require_debug_false:
      (): logTest.NoParsingFilter
    filter_pwd:
      (): logTest.CustomFilter

handlers:
  console:
     class: logging.StreamHandler
     level: INFO
     formatter: custom
     filters: [filter_pwd]

  file:
     class: logging.FileHandler
     filename: ./logs/logging.log
     level: DEBUG
     formatter: standard

  TimedRotatingfile:
      class: logging.handlers.TimedRotatingFileHandler
      level: DEBUG
      formatter: standard
      filters: [require_debug_false]
      when: M
      backupCount: 5
      filename: ./logs/timedRotating.log
      encoding: utf8

  rotatingFileHandler:
      class: logging.handlers.RotatingFileHandler
      level: DEBUG
      formatter: standard
      filename: ./logs/RotatingFile.log
      maxBytes: 100000
      backupCount: 5
      encoding: utf8

loggers:
  FileLogger:
     handlers: []
     level: DEBUG
     #propagate: no

  SampleLogger:
    level: DEBUG
    handlers: [console,rotatingFileHandler]
    #propagate: no

  TestLogger:
    level: DEBUG
    handlers: [console]
    #propagate: no

root:
  level: DEBUG
  handlers: []

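The parameters in the configuration file are:

version: the schema version; currently the only accepted value is 1 (or 1.0).

disable_existing_loggers: defaults to True.

This parameter deserves a closer look, because explanations found online contradict each other; the author ran experiments and read the source code. Taken literally, it decides whether loggers that already exist are disabled. What counts as an existing logger? Consider the following example, in which two configuration files (Config.yaml and logConfig.yaml) each define a logging configuration: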

import logging.config
import yaml

with open('Config.yaml', 'r') as f:
    config = yaml.safe_load(f.read())
    logging.config.dictConfig(config)

with open('logConfig.yaml', 'r') as f:
    config = yaml.safe_load(f.read())
    logging.config.dictConfig(config)

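Both files define a logger named TestLogger, with different settings (for example, different handlers). Config.yaml is loaded before logConfig.yaml, so which definition takes effect?

The later call wins for every logger it names: dictConfig replaces the existing handlers and reconfigures TestLogger from logConfig.yaml regardless of disable_existing_loggers. The parameter only affects loggers that existed before the call and are not named in the new configuration (and are not children of a named logger): with True they are disabled, with False they stay enabled.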
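
The behavior can be checked with a small experiment. This is a sketch using an inline dictionary instead of the two YAML files; the logger name old_only is hypothetical, while TestLogger mirrors the name used in the text.

```python
import logging
import logging.config

# Both loggers exist before dictConfig is called.
old_logger = logging.getLogger("old_only")      # not named in the new config
named_logger = logging.getLogger("TestLogger")  # named in the new config

config = {
    "version": 1,
    "disable_existing_loggers": True,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {"TestLogger": {"level": "DEBUG", "handlers": ["console"]}},
}
logging.config.dictConfig(config)

print(old_logger.disabled)    # existing and unnamed: disabled
print(named_logger.disabled)  # named in the config: reconfigured, not disabled
handler_types = [type(h).__name__ for h in named_logger.handlers]
print(handler_types)
```

Setting disable_existing_loggers to False in the same dictionary would leave old_only enabled, which is usually what is wanted when modules create their loggers at import time, before the configuration is loaded.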

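incremental: whether the configuration is interpreted as an incremental change to the existing configuration; defaults to False. Setting it to True produced unexplained errors in the author's tests, and its use has not been investigated further.
formatters: log formatters. Several formats can be defined for handlers to reference.
filters: log filters. Filters defined here can be referenced by handlers and loggers.

handlers: log handlers. The configuration file above defines console, file, TimedRotatingfile and rotatingFileHandler, which the loggers section then references. One handler is annotated below as an example; the others are not covered in detail.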

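# name (label) of the handler
TimedRotatingfile:
    # handler class; a user-implemented Handler can also be configured here
    class: logging.handlers.TimedRotatingFileHandler
    # level threshold of the handler
    level: DEBUG
    # formatter to use; error must be defined in the formatters section
    formatter: error
    # filters to apply; they must be defined in the filters section
    filters: [require_debug_false]
    # rotation unit; M rotates by minute
    when: M
    # number of rotated files to keep
    backupCount: 5
    # log file name; this is a relative path, resolved against the working directory
    filename: ./logs/timedRotating.log
    # output encoding; utf8 supports Chinese text
    encoding: utf8
    # note: suffix cannot be set through these configuration keys; it has to be set on the handler object in code
    # suffix: "%Y-%m-%d_%H-%M"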

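loggers: logger objects. Several loggers can be defined in the configuration file. Application code obtains one like this:
logger = logging.getLogger("FileLogger")
Loggers also have an important concept: parent and child loggers. A user-defined logger whose name contains no dot is a child of root by default, and a child logger passes its records on to its parent. When propagate is set to no, records are no longer passed to the parent, as in the following configuration.
FileLogger:
  handlers: [TimedRotatingfile]
  level: DEBUG
  propagate: no
Besides the default parent root, users can define their own parents through dotted names, for example:
logger = logging.getLogger("FileLogger.SampleLogger")
Here the parent of FileLogger.SampleLogger is FileLogger (provided FileLogger itself has been created or configured), and the parent of FileLogger is root. The relationship is: FileLogger.SampleLogger ----> FileLogger ---> root.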

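Does a name such as FileLogger.SampleLogger.TestLogger give a multi-level parent chain? Yes. The hierarchy follows the dots in the name and can be arbitrarily deep, so the chain is FileLogger.SampleLogger.TestLogger -> FileLogger.SampleLogger -> FileLogger -> root, provided the intermediate loggers have been created or configured and propagate is left enabled at each level.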
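
A brief sketch that walks the parent chain of a three-level logger and confirms that a record logged at the leaf reaches a handler attached two levels up. The in-memory stream is an assumption made so that the result can be inspected.

```python
import io
import logging

top = logging.getLogger("FileLogger")
middle = logging.getLogger("FileLogger.SampleLogger")
leaf = logging.getLogger("FileLogger.SampleLogger.TestLogger")

# Walk the parent links from the leaf up to root.
chain = []
node = leaf
while node is not None:
    chain.append(node.name)
    node = node.parent
print(" -> ".join(chain))

# A handler on the top-level logger receives records logged at the leaf.
stream = io.StringIO()
top.addHandler(logging.StreamHandler(stream))
top.propagate = False  # keep the demo output away from root
leaf.setLevel(logging.INFO)
leaf.info("from the leaf")

received = stream.getvalue()
print(received)
```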

root: the ultimate parent logger. It receives the records propagated by all child loggers and serves as the catch-all. For example:

root:
  level: DEBUG
  handlers: []
  # handlers: [console]

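root is an optional configuration entry. If its handlers list is empty and the handlers of all child loggers are empty as well, a default output is still produced for records at WARNING and above, in a minimal format that contains only %(message)s.

Calling getLogger without a name returns root, for example:

print(logging.getLogger() == logging.root) # True

Inspect the configuration of root: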

print(logging.root.handlers)
print(logging.lastResort)
print(logging.root.filters)
# []
# <_StderrHandler <stderr> (WARNING)>
# []

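So root has no handlers of its own by default; the fallback handler logging.lastResort is a StderrHandler with level WARNING that writes to stderr (which some IDEs display in red). When running from a shell, stderr can be redirected:

$ python log.py 2> /dev/null

Application code loads the configuration as follows: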

import logging.config
import yaml

with open('logConfig.yaml', 'rb') as f:
    config = yaml.safe_load(f.read())
    logging.config.dictConfig(config)

logger = logging.getLogger("FileLogger")
logger1 = logging.getLogger("SampleLogger")

logger.debug("debug")
logger1.debug("debug")

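1.4.2 Other configuration files

For the YAML file, the yaml package first parses the configuration file into a dictionary, which is then handed to the following function.

logging.config.dictConfig()

logging also provides a native interface that reads a configuration file directly:

logging.config.fileConfig()

This function currently supports only configparser-style files, that is, ini-format files such as the following (excerpt):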

[loggers]
keys=root,sampleLogger

[handlers]
keys=consoleHandler

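fileConfig() is an older interface and does not support configuring filters. The official documentation recommends dictConfig() where possible, and new features are added to dictConfig() only.

For a JSON file, logConfig.json: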

{
    "version":1,
    "disable_existing_loggers":false,
    "formatters":{
        "simple":{
            "format":"%(asctime)s - %(name)s - %(levelname)s - %(message)s"
        }
    },
    "handlers":{
        "console":{
            "class":"logging.StreamHandler",
            "level":"DEBUG",
            "formatter":"simple",
            "stream":"ext://sys.stdout"
        }
    },
    "loggers":{
        "my_module":{
            "level":"ERROR",
            "handlers":["console"],
            "propagate":false
        }
    },
    "root":{
        "level":"INFO",
        "handlers":["console"]
    }
}

Example code for loading the file:

import logging.config
import json

with open('logConfig.json', 'r') as f:
    config = json.load(f)
    logging.config.dictConfig(config)

# my_module is the logger defined in logConfig.json; its level is ERROR
logger = logging.getLogger("my_module")
logger.error("error")

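1.5 Thread safety and process unsafety

1.5.1 Thread safety

The logging library is thread-safe. Each Handler creates a reentrant lock and holds it while emitting a record, as the following excerpt from the Handler class shows: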

class Handler(Filterer):
    def __init__(self, level=NOTSET):
        # ......
        self.createLock()
    
    def createLock(self):
        """
        Acquire a thread lock for serializing access to the underlying I/O.
        """
        self.lock = threading.RLock()
        _register_at_fork_reinit_lock(self)

    def acquire(self):
        """
        Acquire the I/O thread lock.
        """
        if self.lock:
            self.lock.acquire()

    def release(self):
        """
        Release the I/O thread lock.
        """
        if self.lock:
            self.lock.release()
    def handle(self, record):
        """
        Conditionally emit the specified logging record.

        Emission depends on filters which may have been added to the handler.
        Wrap the actual emission of the record with acquisition/release of
        the I/O thread lock. Returns whether the filter passed the record for
        emission.
        """
        rv = self.filter(record)
        if rv:
            self.acquire()
            try:
                self.emit(record)
            finally:
                self.release()
        return rv
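
The effect of this lock can be observed with a small experiment. The sketch below assumes four worker threads sharing one handler that writes to an in-memory stream; the logger name thread_demo and the thread names are made up. Every record should arrive as one complete line.

```python
import io
import logging
import threading

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(threadName)s %(message)s"))

logger = logging.getLogger("thread_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)


def worker(count):
    for i in range(count):
        logger.info("line %d", i)


threads = [
    threading.Thread(target=worker, args=(200,), name=f"worker-{n}")
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = stream.getvalue().splitlines()
# Each line should be exactly "<thread name> line <number>".
intact = all(
    len(parts) == 3 and parts[0].startswith("worker-") and parts[1] == "line"
    for parts in (line.split(" ") for line in lines)
)
print(len(lines), intact)
```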

1.5.2 Process safety
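
The handler lock shown in 1.5.1 exists per process, so it cannot serialize writes from several processes to the same file. One commonly used pattern is to send records through a queue to a single consumer that owns the file. The following is a minimal in-process sketch of that pattern with QueueHandler and QueueListener; it assumes a queue.Queue as a stand-in, on the assumption that a multiprocessing.Queue would take its place when the producers are separate processes. The logger name queue_demo is hypothetical.

```python
import io
import logging
import logging.handlers
import queue

# Stand-in queue; across processes a multiprocessing.Queue is assumed instead.
log_queue = queue.Queue()

# Producer side: the logger only puts records on the queue.
producer = logging.getLogger("queue_demo")  # hypothetical logger name
producer.setLevel(logging.INFO)
producer.propagate = False
producer.addHandler(logging.handlers.QueueHandler(log_queue))

# Consumer side: a single listener thread owns the real output handler.
stream = io.StringIO()
sink = logging.StreamHandler(stream)
sink.setFormatter(logging.Formatter("%(name)s:%(message)s"))
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

producer.info("first")
producer.info("second")

listener.stop()  # drains the queued records, then joins the listener thread
written = stream.getvalue()
print(written)
```

Because only the listener touches the output, the records are written one at a time in the order they were queued.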

1.6 Summary

References

1. The logging package: https://docs.python.org/3.9/library/logging.html

2. Introduction to logging: https://rmcomplexity.com/article/2020/12/01/introduction-to-python-logging.html

3. Introduction to logging: https://coralogix.com/blog/python-logging-best-practices-tips/
