How to handle the "not mark dirty" problem after enabling extreme RTO with resource pooling


Symptoms

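The problem occurs during failover. The standby holds the latest version of a page; after the primary fetches that page from the standby, the buffer should be marked dirty, but it is not.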

  • Buffer dirty flag: bufferinfo->dirtyflag = false

  • LSN on disk: buffDesc->lsn_on_disk 2/419077B0

  • LSN of the buffer page: PageGetLSN(bufferinfo->pageinfo.page) 2/4198B4F0

  • LSN of the xlog record: bufferinfo->lsn 2/419077B0

  • Summary: xlog LSN = on-disk LSN < buffer page LSN
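The relation in the summary can be checked mechanically when triaging such logs. The following is a minimal sketch in Python, not openGauss code. It assumes the LSNs are printed in the PostgreSQL-style `hi/lo` hexadecimal form seen in the log, and the helper names `parse_lsn` and `is_unmarked_newer_page` are hypothetical.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN printed as 'hi/lo' (both hex) into one 64-bit integer."""
    hi, lo = text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def is_unmarked_newer_page(dirty: bool, lsn_on_disk: str, page_lsn: str) -> bool:
    """True when the buffer page is newer than the disk copy but not flagged dirty."""
    return (not dirty) and parse_lsn(page_lsn) > parse_lsn(lsn_on_disk)


# Values taken from the bullets above.
xlog_lsn = "2/419077B0"
lsn_on_disk = "2/419077B0"
page_lsn = "2/4198B4F0"

# xlog LSN = on-disk LSN < buffer page LSN
print(parse_lsn(xlog_lsn) == parse_lsn(lsn_on_disk) < parse_lsn(page_lsn))
print(is_unmarked_newer_page(False, lsn_on_disk, page_lsn))
```

Comparing the values as 64-bit integers avoids mistakes that plain string comparison would make when the two halves have different lengths.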

    2024-02-02 15:12:24.163 [unknown] [unknown] localhost 140190590953216 0[0:0#0]  0 [BACKEND] PANIC:  extreme_rto segment page not mark dirty:lsn 2/419077B0, lsn_disk 2/40187550,                                   lsn_page 2/4198B4F0, page 1663/15201/5004 60990

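    Error location

    bufferinfo was not marked dirty, even though the page is the latest version.

    SSMarkBufferDirtyForERTO: the initial suspicion is that BUF_ERTO_NEED_MARK_DIRTY is being handled abnormally; the cause still needs to be found.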

    Related logs

    MarkSegPageRedoChildPageDirty

    lsn_on_disk is incorrect.


    Analysis results

    1

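    Error message


    Failing page: 1663/16388/5005/16384 0-3606. The problem is that this page's lsn_on_disk (0/DF6D60C8) is lower than the LSN of the page in the buffer (0/E8C0D3B0), yet the buffer was not marked dirty.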

    extreme_rto segment page not mark dirty:lsn 0/DDAACD58, lsn_disk 0/DF6D60C8, lsn_page 0/E8C0D3B0, page 1663/16388/5005 3606
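When several such PANIC lines need to be compared, the fields can be pulled out with a regular expression. This is a rough sketch that assumes the message layout shown in the line above; the names `PANIC_RE`, `lsn_to_int` and `parse_panic` are hypothetical and not part of openGauss.

```python
import re

# Matches the tail of the PANIC message; \s* absorbs the padding before lsn_page.
PANIC_RE = re.compile(
    r"not mark dirty:lsn (?P<lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+),\s*"
    r"lsn_disk (?P<lsn_disk>[0-9A-Fa-f]+/[0-9A-Fa-f]+),\s*"
    r"lsn_page (?P<lsn_page>[0-9A-Fa-f]+/[0-9A-Fa-f]+),\s*"
    r"page (?P<page>\d+/\d+/\d+) (?P<block>\d+)"
)


def lsn_to_int(text):
    """Turn 'hi/lo' hex text into a single integer so LSNs can be ordered."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_panic(line):
    """Return the fields of a 'not mark dirty' PANIC line, or None if it does not match."""
    m = PANIC_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("lsn", "lsn_disk", "lsn_page"):
        fields[key] = lsn_to_int(fields[key])
    fields["block"] = int(fields["block"])
    return fields


line = ("PANIC:  extreme_rto segment page not mark dirty:lsn 0/DDAACD58, "
        "lsn_disk 0/DF6D60C8,                                   "
        "lsn_page 0/E8C0D3B0, page 1663/16388/5005 3606")
info = parse_panic(line)
# Show the page, the block number, and whether lsn < lsn_disk < lsn_page holds.
print(info["page"], info["block"], info["lsn"] < info["lsn_disk"] < info["lsn_page"])
```

In this occurrence the xlog LSN is lower than lsn_disk, which in turn is lower than lsn_page, so the page in the buffer is newer than the disk copy while the record being replayed is older than both.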

      2024-02-06 14:57:16.998 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] PANIC:  extreme_rto segment page not mark dirty:lsn 0/DDAACD58, lsn_disk 0/DF6D60C8,                                   lsn_page 0/E8C0D3B0, page 1663/16388/5005 3606
      2024-02-06 14:57:16.998 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] CONTEXT:  xlog redo [segpage] segment head extend: relfilenode/fork:, nblocks[3606->3607], (phy loc 128/101203), reset_zero:1
      2024-02-06 14:57:16.998 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] BACKTRACELOG:  tid[2717562]'s backtrace:
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb() [0x1163ac4]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z9errfinishiz+0x324) [0x1157c1c]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z29MarkSegPageRedoChildPageDirtyP14RedoBufferInfo+0x2a4) [0x22038ec]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z21SegPageRedoChildStateP17XLogRecParseState+0x84) [0x2203a04]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z21ProcSegPageCommonRedoP17XLogRecParseState+0xf0) [0x2203bf8]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto24RedoPageManagerDdlActionEP17XLogRecParseState+0x108) [0x20055b8]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto35PageManagerProcSegPipeLineSyncStateEP17XLogRecParseState+0x120) [0x20060ec]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto25PageManagerRedoParseStateEP17XLogRecParseState+0xcc) [0x20063d8]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto30PageManagerRedoDistributeItemsEP17XLogRecParseState+0xc0) [0x20066b0]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto19RedoPageManagerMainEv+0x12c) [0x20067f8]