gabrioux-2024-09-04_20:28:00-orch:cephadm-wip-guits-main-2024-09-04-1323-distro-default-smithi

User	Scheduled	Started	Updated	Runtime	Suite	Branch	Machine Type	Revision	Pass	Fail	Dead
gabrioux	2024-09-04 20:28:00	2024-09-04 20:29:20	2024-09-05 04:46:49	8:17:29	orch:cephadm	wip-guits-main-2024-09-04-1323	smithi	f9fcca5	4	13	1

Average wait time: 0:14:45

Status	Job ID	Posted	Started	Updated	Runtime	Duration	In Waiting	Machine	Teuthology Branch	OS Type	OS Version	Description	Nodes
fail	7889713	2024-09-04 20:28:09	2024-09-04 20:29:00	2024-09-04 21:10:17	0:41:17	0:32:47	0:08:30	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/no overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/reef/{reef} 1-volume/{0-create 1-ranks/1 2-allow_standby_replay/no 3-inline/no 4-verify} 2-client/kclient 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
Failure Reason: "2024-09-04T21:00:00.000164+0000 mon.smithi062 (mon.0) 531 : cluster [WRN] osd.5 (root=default,host=smithi066) is down" in cluster log
pass	7889714	2024-09-04 20:28:10	2024-09-04 20:29:20	2024-09-04 21:32:49	1:03:29	0:56:20	0:07:09	smithi	main	centos	9.stream	orch:cephadm/thrash/{0-distro/centos_9.stream 1-start 2-thrash 3-tasks/radosbench fixed-2 msgr/async-v1only root}	2
fail	7889715	2024-09-04 20:28:12	2024-09-04 20:29:51	2024-09-04 20:54:58	0:25:07	0:16:10	0:08:57	smithi	main	centos	9.stream	orch:cephadm/smb/{0-distro/centos_9.stream_runc tasks/deploy_smb_mgr_ctdb_res_ips}	4
Failure Reason: SELinux denials found on ubuntu@smithi033.front.sepia.ceph.com: ['type=AVC msg=audit(1725483041.415:11643): avc: denied { nlmsg_read } for pid=64084 comm="ss" scontext=system_u:system_r:container_t:s0:c316,c427 tcontext=system_u:system_r:container_t:s0:c316,c427 tclass=netlink_tcpdiag_socket permissive=1']
fail	7889716	2024-09-04 20:28:13	2024-09-04 20:31:51	2024-09-04 21:30:52	0:59:01	0:51:32	0:07:29	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/yes overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/squid 1-volume/{0-create 1-ranks/2 2-allow_standby_replay/yes 3-inline/yes 4-verify} 2-client/fuse 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
Failure Reason: reached maximum tries (50) after waiting for 300 seconds
pass	7889717	2024-09-04 20:28:14	2024-09-04 20:31:52	2024-09-04 20:51:17	0:19:25	0:13:18	0:06:07	smithi	main	centos	9.stream	orch:cephadm/smoke-roleless/{0-distro/centos_9.stream 0-nvme-loop 1-start 2-services/nfs 3-final}	2
fail	7889718	2024-09-04 20:28:16	2024-09-04 20:32:02	2024-09-04 21:22:04	0:50:02	0:42:36	0:07:26	smithi	main	centos	9.stream	orch:cephadm/upgrade/{1-start-distro/1-start-centos_9.stream-squid 2-repo_digest/repo_digest 3-upgrade/staggered 4-wait 5-upgrade-ls agent/off mon_election/connectivity}	2
Failure Reason: Command failed on smithi098 with status 1: 'sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:squid shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 2b860878-6afe-11ef-bcd6-c7b262605968 -e sha1=f9fcca5273b6971f640393d33a94730179073754 -- bash -c \'ceph versions \| jq -e \'"\'"\'.rgw \| length == 1\'"\'"\'\''
fail	7889719	2024-09-04 20:28:17	2024-09-04 20:32:42	2024-09-04 20:52:30	0:19:48	0:11:51	0:07:57	smithi	main	centos	9.stream	orch:cephadm/workunits/{0-distro/centos_9.stream agent/off mon_election/classic task/test_rgw_multisite}	3
Failure Reason: Command failed on smithi084 with status 1: 'sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:f9fcca5273b6971f640393d33a94730179073754 shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 8f8c8b1c-6afe-11ef-bcd6-c7b262605968 -- bash -c \'set -e\nset -x\nwhile true; do TOKEN=$(ceph rgw realm tokens \| jq -r \'"\'"\'.[0].token\'"\'"\'); echo $TOKEN; if [ "$TOKEN" != "master zone has no endpoint" ]; then break; fi; sleep 5; done\nTOKENS=$(ceph rgw realm tokens)\necho $TOKENS \| jq --exit-status \'"\'"\'.[0].realm == "myrealm1"\'"\'"\'\necho $TOKENS \| jq --exit-status \'"\'"\'.[0].token\'"\'"\'\nTOKEN_JSON=$(ceph rgw realm tokens \| jq -r \'"\'"\'.[0].token\'"\'"\' \| base64 --decode)\necho $TOKEN_JSON \| jq --exit-status \'"\'"\'.realm_name == "myrealm1"\'"\'"\'\necho $TOKEN_JSON \| jq --exit-status \'"\'"\'.endpoint \| test("http://.+:\\\\d+")\'"\'"\'\necho $TOKEN_JSON \| jq --exit-status \'"\'"\'.realm_id \| test("^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")\'"\'"\'\necho $TOKEN_JSON \| jq --exit-status \'"\'"\'.access_key\'"\'"\'\necho $TOKEN_JSON \| jq --exit-status \'"\'"\'.secret\'"\'"\'\n\''
pass	7889720	2024-09-04 20:28:18	2024-09-04 20:33:33	2024-09-04 21:14:16	0:40:43	0:32:35	0:08:08	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/no overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/reef/{v18.2.1} 1-volume/{0-create 1-ranks/1 2-allow_standby_replay/yes 3-inline/no 4-verify} 2-client/kclient 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
dead	7889721	2024-09-04 20:28:19	2024-09-04 20:35:03	2024-09-05 04:46:49	8:11:46			smithi	main	centos	9.stream	orch:cephadm/smoke-roleless/{0-distro/centos_9.stream 0-nvme-loop 1-start 2-services/rgw-ingress 3-final}	2
Failure Reason: hit max job timeout
fail	7889722	2024-09-04 20:28:21	2024-09-04 20:37:24	2024-09-04 22:58:02	2:20:38	0:17:03	2:03:35	smithi	main	centos	9.stream	orch:cephadm/osds/{0-distro/centos_9.stream_runc 0-nvme-loop 1-start 2-ops/deploy-raw}	2
Failure Reason: reached maximum tries (120) after waiting for 120 seconds
fail	7889723	2024-09-04 20:28:22	2024-09-04 20:38:35	2024-09-04 21:40:05	1:01:30	0:51:00	0:10:30	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/yes overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/squid 1-volume/{0-create 1-ranks/2 2-allow_standby_replay/no 3-inline/yes 4-verify} 2-client/fuse 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
Failure Reason: reached maximum tries (50) after waiting for 300 seconds
fail	7889724	2024-09-04 20:28:24	2024-09-04 20:43:06	2024-09-04 21:21:44	0:38:38	0:32:48	0:05:50	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/no overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/reef/{v18.2.0} 1-volume/{0-create 1-ranks/2 2-allow_standby_replay/no 3-inline/no 4-verify} 2-client/kclient 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
Failure Reason: "2024-09-04T21:10:00.000148+0000 mon.smithi022 (mon.0) 433 : cluster [WRN] osd.3 (root=default,host=smithi043) is down" in cluster log
pass	7889725	2024-09-04 20:28:25	2024-09-04 20:43:06	2024-09-04 21:20:12	0:37:06	0:31:22	0:05:44	smithi	main	centos	9.stream	orch:cephadm/upgrade/{1-start-distro/1-start-centos_9.stream-reef 2-repo_digest/defaut 3-upgrade/simple 4-wait 5-upgrade-ls agent/on mon_election/connectivity}	2
fail	7889726	2024-09-04 20:28:26	2024-09-04 20:43:16	2024-09-04 21:30:35	0:47:19	0:36:36	0:10:43	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/yes overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/squid 1-volume/{0-create 1-ranks/1 2-allow_standby_replay/yes 3-inline/yes 4-verify} 2-client/fuse 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
Failure Reason: reached maximum tries (50) after waiting for 300 seconds
fail	7889727	2024-09-04 20:28:27	2024-09-04 20:47:37	2024-09-04 21:36:35	0:48:58	0:42:46	0:06:12	smithi	main	centos	9.stream	orch:cephadm/upgrade/{1-start-distro/1-start-centos_9.stream-squid 2-repo_digest/repo_digest 3-upgrade/staggered 4-wait 5-upgrade-ls agent/off mon_election/classic}	2
Failure Reason: Command failed on smithi031 with status 1: 'sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:squid shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 34153714-6b00-11ef-bcd6-c7b262605968 -e sha1=f9fcca5273b6971f640393d33a94730179073754 -- bash -c \'ceph versions \| jq -e \'"\'"\'.rgw \| length == 1\'"\'"\'\''
fail	7889728	2024-09-04 20:28:29	2024-09-04 20:47:38	2024-09-04 21:14:59	0:27:21	0:15:30	0:11:51	smithi	main	centos	9.stream	orch:cephadm/smb/{0-distro/centos_9.stream tasks/deploy_smb_mgr_ctdb_res_ips}	4
Failure Reason: "2024-09-04T21:11:53.808313+0000 mon.a (mon.0) 782 : cluster [WRN] Health check failed: 2 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON)" in cluster log
fail	7889729	2024-09-04 20:28:30	2024-09-04 20:51:38	2024-09-04 21:17:19	0:25:41	0:17:40	0:08:01	smithi	main	centos	9.stream	orch:cephadm/workunits/{0-distro/centos_9.stream_runc agent/off mon_election/classic task/test_monitoring_stack_basic}	3
Failure Reason: Command failed on smithi162 with status 5: 'sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:f9fcca5273b6971f640393d33a94730179073754 shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 37544644-6b01-11ef-bcd6-c7b262605968 -- bash -c \'set -e\nset -x\nceph orch apply node-exporter\nceph orch apply grafana\nceph orch apply alertmanager\nceph orch apply prometheus\nsleep 240\nceph orch ls\nceph orch ps\nceph orch host ls\nMON_DAEMON=$(ceph orch ps --daemon-type mon -f json \| jq -r \'"\'"\'last \| .daemon_name\'"\'"\')\nGRAFANA_HOST=$(ceph orch ps --daemon-type grafana -f json \| jq -e \'"\'"\'.[]\'"\'"\' \| jq -r \'"\'"\'.hostname\'"\'"\')\nPROM_HOST=$(ceph orch ps --daemon-type prometheus -f json \| jq -e \'"\'"\'.[]\'"\'"\' \| jq -r \'"\'"\'.hostname\'"\'"\')\nALERTM_HOST=$(ceph orch ps --daemon-type alertmanager -f json \| jq -e \'"\'"\'.[]\'"\'"\' \| jq -r \'"\'"\'.hostname\'"\'"\')\nGRAFANA_IP=$(ceph orch host ls -f json \| jq -r --arg GRAFANA_HOST "$GRAFANA_HOST" \'"\'"\'.[] \| select(.hostname==$GRAFANA_HOST) \| .addr\'"\'"\')\nPROM_IP=$(ceph orch host ls -f json \| jq -r --arg PROM_HOST "$PROM_HOST" \'"\'"\'.[] \| select(.hostname==$PROM_HOST) \| .addr\'"\'"\')\nALERTM_IP=$(ceph orch host ls -f json \| jq -r --arg ALERTM_HOST "$ALERTM_HOST" \'"\'"\'.[] \| select(.hostname==$ALERTM_HOST) \| .addr\'"\'"\')\n# check each host node-exporter metrics endpoint is responsive\nALL_HOST_IPS=$(ceph orch host ls -f json \| jq -r \'"\'"\'.[] \| .addr\'"\'"\')\nfor ip in $ALL_HOST_IPS; do\n curl -s http://${ip}:9100/metric\ndone\n# check grafana endpoints are responsive and database health is okay\ncurl -k -s https://${GRAFANA_IP}:3000/api/health\ncurl -k -s https://${GRAFANA_IP}:3000/api/health \| jq -e \'"\'"\'.database == "ok"\'"\'"\'\n# stop mon daemon in order to trigger an alert\nceph orch daemon stop $MON_DAEMON\nsleep 120\n# check prometheus endpoints are responsive and mon down alert is firing\ncurl -s http://${PROM_IP}:9095/api/v1/status/config\ncurl -s http://${PROM_IP}:9095/api/v1/status/config \| jq -e \'"\'"\'.status == "success"\'"\'"\'\ncurl -s http://${PROM_IP}:9095/api/v1/alerts\ncurl -s http://${PROM_IP}:9095/api/v1/alerts \| jq -e \'"\'"\'.data \| .alerts \| .[] \| select(.labels \| .alertname == "CephMonDown") \| .state == "firing"\'"\'"\'\n# check alertmanager endpoints are responsive and mon down alert is active\ncurl -s http://${ALERTM_IP}:9093/api/v1/status\ncurl -s http://${ALERTM_IP}:9093/api/v1/alerts\ncurl -s http://${ALERTM_IP}:9093/api/v1/alerts \| jq -e \'"\'"\'.data \| .[] \| select(.labels \| .alertname == "CephMonDown") \| .status \| .state == "active"\'"\'"\'\n\''
fail	7889730	2024-09-04 20:28:31	2024-09-04 20:51:59	2024-09-04 21:35:27	0:43:28	0:36:42	0:06:46	smithi	main	centos	9.stream	orch:cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_9.stream conf/{client mds mgr mon osd} fail_fs/yes overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/squid 1-volume/{0-create 1-ranks/1 2-allow_standby_replay/no 3-inline/yes 4-verify} 2-client/fuse 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}}	2
Failure Reason: reached maximum tries (50) after waiting for 300 seconds