Description: fs:workload/{begin clusters/1a5s-mds-1c-client-3node conf/{client mds mon osd} distro/{ubuntu_latest} mount/kclient/{mount overrides/{distro/stock/{k-stock rhel_8} ms-die-on-skipped}} objectstore-ec/bluestore-comp-ec-root omap_limit/10000 overrides/{frag_enable osd-asserts session_timeout whitelist_health whitelist_wrongly_marked_down} ranks/5 scrub/yes tasks/{0-check-counter workunit/suites/ffsb}}

Log: http://qa-proxy.ceph.com/teuthology/mchangir-2021-03-05_13:27:07-fs:workload-wip-mchangir-mds-scrub-error-with-background-task-distro-basic-smithi/5939223/teuthology.log

Failure Reason:

"2021-03-05T13:50:16.778415+0000 mon.a (mon.0) 308 : cluster [WRN] Health check failed: 1 MDSs report slow metadata IOs (MDS_SLOW_METADATA_IO)" in cluster log

  • kernel:
    • client:
      • sha1: distro
    • kdb: True
    • sha1: distro
  • tasks:
    • internal.check_packages:
    • internal.buildpackages_prep:
    • internal.save_config:
    • internal.check_lock:
    • internal.add_remotes:
    • console_log:
    • internal.connect:
    • internal.push_inventory:
    • internal.serialize_remote_roles:
    • internal.check_conflict:
    • internal.check_ceph_data:
    • internal.vm_setup:
    • kernel:
      • client:
        • sha1: distro
      • kdb: True
      • sha1: distro
    • internal.base:
    • internal.archive_upload:
    • internal.archive:
    • internal.coredump:
    • internal.sudo:
    • internal.syslog:
    • internal.timer:
    • pcp:
    • selinux:
    • ansible.cephlab:
    • clock:
    • install:
      • extra_system_packages:
        • deb:
          • bison
          • flex
          • libelf-dev
          • libssl-dev
          • network-manager
          • iproute2
          • util-linux
          • dump
          • indent
          • libaio-dev
          • libtool-bin
          • uuid-dev
          • xfslibs-dev
        • rpm:
          • bison
          • flex
          • elfutils-libelf-devel
          • openssl-devel
          • NetworkManager
          • iproute
          • util-linux
          • libacl-devel
          • libaio-devel
          • libattr-devel
          • libtool
          • libuuid-devel
          • xfsdump
          • xfsprogs
          • xfsprogs-devel
      • extra_packages:
        • deb:
          • python3-cephfs
          • cephfs-shell
          • cephfs-top
          • cephfs-mirror
        • rpm:
          • python3-cephfs
          • cephfs-top
          • cephfs-mirror
      • sha1: f4f90b687273c03c7ef17619d1f4323272a78333
    • ceph:
    • kclient:
    • fwd_scrub:
      • scrub_timeout: 900
      • cluster: ceph
      • sleep_between_iterations: 1
    • check-counter:
      • workunit:
        • clients:
          • all:
            • suites/ffsb.sh
        • branch: wip-mchangir-mds-scrub-error-with-background-task
        • sha1: f4f90b687273c03c7ef17619d1f4323272a78333
    • verbose: False
    • pid:
    • duration: 0:55:43
    • owner: scheduled_mchangir@teuthology
    • flavor: basic
    • status_class: danger
    • targets:
      • smithi172.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDsyKXK4EZ9Gd/ONov6OrBgzUG2yNXxyUHCtl+fSPAsWLHOw7AXJJPaZZRHi6DNQbX4gJoV8RI9PsJ2DfJvWipyWR+xwgsMJ30T8xmGqVG5GEMkmo2O0jwo5wkE8p3zVTgfZfzkvFKqvrh0rml0BPNt1ItitMLY/zYtqtGXg2AdhU+chMT47O5zaqFrUlEK+1QotxSJLhiwbVtpgHqohV9AtGPTZKDGjZsBuWgiPYMZA9b669EL20Z4wMuzRINW/n8uM6YH8YfZ5LqjKGYBKH9tJQPPewK8d7x+YGNSpev292Jy0j6HMBESSaizNk2IrCGB/QpSiTqCzSNRnvbYdN8CH8WcZvBPL4vwxRUH6OPQVOoNQQZ90t3y8GbLCzLQm6jtEvEC6xpPkGVHeV9SmxlJmbudnEWlGUTnaR4hiT+naQ3BPGEvuX/ZYFT4wZD0p4TRmoFy/3j80uAbl82ctDIvOF4s7A5th4or7B7t7ccsUhtTRNorevrjrA97nZ9NpR8=
      • smithi119.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDEgcu5p3qg3kXqC2sBi77miRFXMm46tIFUCy81t2LbaY/pIarmyN1YnjTUGAGyHxlC9l0c5yuBuLjQbnTETRdTipOADsU1oOfmW3qtE9sEP5X6kN/ti6CHWkvrd6zIB2DljZiJoFFO0tjNLRvU9NAWlpfG88aW8UxV4M7j6UuvHOU6g195YFRE2z4mPIfcxjFYW+UA11fNxFinjj1C5lN8xq4Bv+IbSEcP6Tjn57SEh92E3/9UCxmWlRVMqqvGyBIEgfROnJahXIOvNbV34xw6b+UmqaGm3wFBndRcXm5us0ElxSprSZWeSrnFfbocACvPGSJ+BvuxTcr3dqLTXaMNv8G2k2TkohpLUb1jPPlOulwwtFeYLsbE3WtLatAxvhymboheWZO923CUeXD30shH0wj34EZnBTqLvhwI+ie4aydsCrLxV0pWNY9tZRMjxu7ksyWDH5Uc1+bz5dBLLhf2bIBF4YqzQ4m2pgiyBiYjsmNotnJI6/APScauuswWRr0=
      • smithi096.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC9isUZlJzwiBmRTV7Jc7vJYV7nFo7nXUfTzN6FYCdW/aqEkMx9Z9cKgJ+cFcoCrYDEp7I6qzlK81lcOqJh9Hcug+dlHSEGzyQeSdGeTjn+PI9OHgimnAtJdXcfzUK1ZM6yHKx6Mek3udRMY84AGfX1OApMPbZLLuuBapfW+TQr/PP9v5cuEYB9gfAGIl9UpTKt1eEBPVL3m3jycsz/dq9F8uImh6JiphOZabC6wVOP1IbXKsLNhInxyWKjav+4F9E4uQdK2WkqyqQPYggmuvN0Mhzz2ZNzO+EZpTgQ3Yf/EtzLvyjMORyME7eATv6ATzlsKgSK95VFm5D43mpNIJrfDoBwmwpUnRAZrkbDwYqxUpQ+BH+ArK1BxxWUH5CaXoB4hjPk/inZ9k9DBMhocUp16zLJXyvCKnqHgQY16fAl1Utzz4+8hv3Grn94JTcDySH5QSuQtJqDe1M686uF06FZXRJBSV4rZjMVsieRpG28nvuKLn2jp7xLR30pGq5yO8c=
    • job_id: 5939223
    • log_href: http://qa-proxy.ceph.com/teuthology/mchangir-2021-03-05_13:27:07-fs:workload-wip-mchangir-mds-scrub-error-with-background-task-distro-basic-smithi/5939223/teuthology.log
    • suite_branch: wip-mchangir-mds-scrub-error-with-background-task
    • wait_time: 0:07:45
    • os_version: 8.3
    • branch: wip-mchangir-mds-scrub-error-with-background-task
    • pcp_grafana_url:
    • email: mchangir@redhat.com
    • archive_path: /home/teuthworker/archive/mchangir-2021-03-05_13:27:07-fs:workload-wip-mchangir-mds-scrub-error-with-background-task-distro-basic-smithi/5939223
    • updated: 2021-03-05 14:32:36
    • description: fs:workload/{begin clusters/1a5s-mds-1c-client-3node conf/{client mds mon osd} distro/{ubuntu_latest} mount/kclient/{mount overrides/{distro/stock/{k-stock rhel_8} ms-die-on-skipped}} objectstore-ec/bluestore-comp-ec-root omap_limit/10000 overrides/{frag_enable osd-asserts session_timeout whitelist_health whitelist_wrongly_marked_down} ranks/5 scrub/yes tasks/{0-check-counter workunit/suites/ffsb}}
    • started: 2021-03-05 13:29:08
    • last_in_suite: False
    • machine_type: smithi
    • sentry_event:
    • posted: 2021-03-05 13:27:14
    • teuthology_branch: master
    • sha1: f4f90b687273c03c7ef17619d1f4323272a78333
    • name: mchangir-2021-03-05_13:27:07-fs:workload-wip-mchangir-mds-scrub-error-with-background-task-distro-basic-smithi
    • roles:
      • [mon.a, mgr.x, mds.a, mds.d, osd.0, osd.3, osd.6, osd.9, client.0]
      • [mon.b, mgr.y, mds.b, mds.e, osd.1, osd.4, osd.7, osd.10]
      • [mon.c, mgr.z, mds.c, mds.f, osd.2, osd.5, osd.8, osd.11]
    • overrides:
      • ceph-deploy:
        • conf:
          • client:
            • log file: /var/log/ceph/ceph-$name.$pid.log
          • mon:
            • osd default pool size: 2
      • check-counter:
        • counters:
          • mds:
            • mds.exported
            • mds.imported
            • mds.dir_split
      • selinux:
        • whitelist:
          • scontext=system_u:system_r:logrotate_t:s0
      • workunit:
        • sha1: f4f90b687273c03c7ef17619d1f4323272a78333
        • branch: wip-mchangir-mds-scrub-error-with-background-task
      • ceph:
        • log-whitelist:
          • \(MDS_ALL_DOWN\)
          • \(MDS_UP_LESS_THAN_MAX\)
        • sha1: f4f90b687273c03c7ef17619d1f4323272a78333
        • fs: xfs
        • conf:
          • global:
            • ms die on skipped message: False
          • mgr:
            • debug ms: 1
            • debug mgr: 20
          • client:
            • rados osd op timeout: 15m
            • debug ms: 1
            • rados mon op timeout: 15m
            • debug client: 20
            • client mount timeout: 600
          • mon:
            • debug ms: 1
            • debug mon: 20
            • debug paxos: 20
            • mon op complaint time: 120
          • mds:
            • mds bal split bits: 3
            • mds bal split size: 100
            • mds bal fragment size max: 10000
            • debug mds: 20
            • mds bal merge size: 5
            • debug ms: 1
            • mds bal frag: True
            • mds verify scatter: True
            • rados osd op timeout: 15m
            • osd op complaint time: 180
            • mds op complaint time: 180
            • rados mon op timeout: 15m
            • mds debug scatterstat: True
            • mds debug frag: True
          • osd:
            • mon osd full ratio: 0.9
            • osd_max_omap_entries_per_request: 10000
            • debug ms: 20
            • bluestore fsck on mount: True
            • filestore flush min: 0
            • osd heartbeat grace: 60
            • debug osd: 25
            • bluestore compression mode: aggressive
            • debug bluestore: 20
            • debug bluefs: 20
            • osd objectstore: bluestore
            • mon osd backfillfull_ratio: 0.85
            • osd op complaint time: 180
            • bluestore block size: 96636764160
            • osd shutdown pgref assert: True
            • debug filestore: 20
            • debug rocksdb: 10
            • mon osd nearfull ratio: 0.8
            • osd failsafe full ratio: 0.95
            • debug journal: 20
        • cephfs:
          • session_timeout: 300
          • ec_profile:
            • m=2
            • k=2
            • crush-failure-domain=osd
          • max_mds: 5
        • log-ignorelist:
          • \(MDS_ALL_DOWN\)
          • \(MDS_UP_LESS_THAN_MAX\)
          • overall HEALTH_
          • \(FS_DEGRADED\)
          • \(MDS_FAILED\)
          • \(MDS_DEGRADED\)
          • \(FS_WITH_FAILED_MDS\)
          • \(MDS_DAMAGE\)
          • \(FS_INLINE_DATA_DEPRECATED\)
          • \(OSD_DOWN\)
          • \(OSD_
          • but it is still running
          • is not responding
          • SLOW_OPS
          • slow request
      • install:
        • ceph:
          • sha1: f4f90b687273c03c7ef17619d1f4323272a78333
      • admin_socket:
        • branch: wip-mchangir-mds-scrub-error-with-background-task
      • thrashosds:
        • bdev_inject_crash_probability: 0.5
        • bdev_inject_crash: 2
    • success: False
    • failure_reason: "2021-03-05T13:50:16.778415+0000 mon.a (mon.0) 308 : cluster [WRN] Health check failed: 1 MDSs report slow metadata IOs (MDS_SLOW_METADATA_IO)" in cluster log
    • status: fail
    • nuke_on_error: True
    • os_type: rhel
    • runtime: 1:03:28
    • suite_sha1: f4f90b687273c03c7ef17619d1f4323272a78333
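
For reference, the fwd_scrub task above (scrub_timeout: 900, sleep_between_iterations: 1) repeatedly drives forward scrubs against the active MDS ranks while the ffsb workunit runs. A rough manual equivalent, assuming a filesystem named cephfs (the name is illustrative, not taken from this job):

```sh
# Start a recursive forward scrub on rank 0, then poll its progress.
ceph tell mds.cephfs:0 scrub start / recursive
ceph tell mds.cephfs:0 scrub status
```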
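
Likewise, the cephfs overrides (ec_profile with k=2, m=2, crush-failure-domain=osd, and max_mds: 5) correspond roughly to the following manual setup; the pool, profile, and filesystem names here are illustrative only:

```sh
# Sketch: recreate the EC root data pool profile and multi-active-MDS layout.
ceph osd erasure-code-profile set ec-21 k=2 m=2 crush-failure-domain=osd
ceph osd pool create cephfs_data_ec erasure ec-21
ceph osd pool set cephfs_data_ec allow_ec_overwrites true  # needed for CephFS data on EC
ceph fs set cephfs max_mds 5                               # matches ranks/5 in the description
```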