ACCRE R9 Cluster Quick and Dirty Status
Report generated at Fri Feb 6 08:23:01 AM CST 2026
Problem Nodes
HOSTNAMES STATE AVAIL_FEATURES TIMESTAMP USER REASON
cn1412 drained* intel_e5-2630_v3,haswell,intel 2026-01-20T08:55:55 broadrt Troy - RT96858 - prolog error + investigate other is
cn1547 drained intel_5218,cascadelake,intel,x 2026-01-21T13:18:27 broadrt Troy - RT96892 - T/S PSU issues
cn1574 drained intel_5218,cascadelake,intel,x 2026-02-03T09:29:57 broadrt Alex - RT91343 - replacing fans for 75 & 78
cn1575 drained intel_5218,cascadelake,intel,x 2026-02-03T09:30:30 broadrt Alex - RT91343 - these are quad nodes so the chassis
cn1576 drained intel_5218,cascadelake,intel,x 2026-02-03T09:30:53 broadrt Alex - RT91343 - are drained for repair
cn1577 drained intel_5218,cascadelake,intel,x 2026-01-23T10:28:06 root Alex - RT91343 - replacing fans
cn1586 drained intel_5218,cascadelake,intel,x 2026-01-23T10:27:29 root Alex - RT91343 - replacing fans
cn1587 drained intel_5218,cascadelake,intel,x 2026-01-23T10:27:36 root Alex - RT91343 - replacing fans
cn1588 drained intel_5218,cascadelake,intel,x 2026-01-20T14:07:42 root Alex - RT91343 - replacing fans
cn1589 drained intel_5218,cascadelake,intel,x 2026-01-23T10:27:19 root Alex - RT91343 - replacing fans
cn1593 drained* intel_5218,cascadelake,intel,x 2026-01-27T10:50:58 broadrt Troy - RT96959 - down, troubleshoot via console
dgx02 drained* dgx,a100_40gb,amd_7742,zen2,ze 2026-02-03T10:02:42 root Scott - RT96990 - read only filesystem, hot ssd
gpu0084 drained* a100,amd_9554,zen4,zen,amd,x86 2026-02-04T15:54:29 slurm Troy - RT96861 - OS drive read-only issues : Not res
gracehopper02 drained gh200,aarch64 2026-02-05T09:45:36 goffta1 Thomas - RT97016 - re-image
Queue Summary (Batch)
GROUP USER ACTIVE_JOBS ACTIVE_CORES PENDING_JOBS PENDING_CORES
-----------------------------------------------------------------------------------------
accre_guests 0 0 1 100
haojz 0 0 1 100
-----------------------------------------------------------------------------------------
beam_lab 2 32 0 0
zhuj29 2 32 0 0
-----------------------------------------------------------------------------------------
behringer_lab 2 96 0 0
haleof 2 96 0 0
-----------------------------------------------------------------------------------------
bias_group 1 35 0 0
biasds 1 35 0 0
-----------------------------------------------------------------------------------------
biostat_faculty 1 50 0 0
nianh1 1 50 0 0
-----------------------------------------------------------------------------------------
booth_lab 2 12 0 0
mathura 1 4 0 0
vessa 1 8 0 0
-----------------------------------------------------------------------------------------
brg_cores 1 16 0 0
kandelr 1 16 0 0
-----------------------------------------------------------------------------------------
cds_group 0 0 30 302
cartaij 0 0 30 302
-----------------------------------------------------------------------------------------
cgg 0 0 1 64
liy110 0 0 1 64
-----------------------------------------------------------------------------------------
chem_5420 3 6 0 0
walkeas2 3 6 0 0
-----------------------------------------------------------------------------------------
cms 71 3528 656 1532
cmslocal 34 1824 325 760
cmspilot 37 1704 331 772
-----------------------------------------------------------------------------------------
coxlab 1 1 0 0
mille131 1 1 0 0
-----------------------------------------------------------------------------------------
cqs_si 0 0 4 8
chenarsw 0 0 4 8
-----------------------------------------------------------------------------------------
csb_sanders 12 480 0 0
lig7 12 480 0 0
-----------------------------------------------------------------------------------------
das_lab 1 1 0 0
shiltmh1 1 1 0 0
-----------------------------------------------------------------------------------------
davis_lab 0 0 1 16
bluejor 0 0 1 16
-----------------------------------------------------------------------------------------
econgrads 1 1 0 0
chaj3 1 1 0 0
-----------------------------------------------------------------------------------------
edwards_lab 1 5 0 0
gorejl1 1 5 0 0
-----------------------------------------------------------------------------------------
g_benntor_lab 1 3 0 0
mccorcl1 1 3 0 0
-----------------------------------------------------------------------------------------
hadjim_lab 2 16 1 16
reasosa2 2 16 1 16
-----------------------------------------------------------------------------------------
h_biostat_kang 26 26 0 0
yanb1 26 26 0 0
-----------------------------------------------------------------------------------------
h_biostat_student 34 287 3 24
gaix1 2 40 0 0
koy2 2 24 2 8
namy1 1 1 0 0
shil10 2 2 0 0
yangc16 26 204 0 0
yih4 1 16 1 16
-----------------------------------------------------------------------------------------
h_cqs 4 60 0 0
xuh14 4 60 0 0
-----------------------------------------------------------------------------------------
h_darby_lab 2 4 0 0
mills37 2 4 0 0
-----------------------------------------------------------------------------------------
h_vmac 5 160 45 1440
wuy64 1 32 0 0
zhanm32 4 128 45 1440
-----------------------------------------------------------------------------------------
l3_aboud_lab 1 64 0 0
hongm1 1 64 0 0
-----------------------------------------------------------------------------------------
l3_jasper_lab 1 1 0 0
hattleee 1 1 0 0
-----------------------------------------------------------------------------------------
l3_vuiis_cci 25 25 0 0
vuiis_daily_s 25 25 0 0
-----------------------------------------------------------------------------------------
maha 0 0 1 1
wardbm1 0 0 1 1
-----------------------------------------------------------------------------------------
maier_lab 1 12 0 0
poggigp 1 12 0 0
-----------------------------------------------------------------------------------------
mass_spec 1 12 0 0
masapps 1 12 0 0
-----------------------------------------------------------------------------------------
mchaourab 1 8 399 399
kaot1 0 0 399 399
tangq3 1 8 0 0
-----------------------------------------------------------------------------------------
mcml 0 0 3 256
odenyogg 0 0 2 192
subravvr 0 0 1 64
-----------------------------------------------------------------------------------------
moro_lab 1 8 0 0
lehn 1 8 0 0
-----------------------------------------------------------------------------------------
nbody 165 657 119 464
ligo 164 656 119 464
smitm77 1 1 0 0
-----------------------------------------------------------------------------------------
neurogroup 1 3 0 0
roggeokk 1 3 0 0
-----------------------------------------------------------------------------------------
nordman_lab 1 32 0 0
shephc3 1 32 0 0
-----------------------------------------------------------------------------------------
p_csb_meiler 474 1553 16940 129507
huntek1 388 388 6808 6808
mothcw 3 3 996 996
tydingcw 83 1162 8659 121226
yange8 0 0 477 477
-----------------------------------------------------------------------------------------
p_dsi 0 0 1 1
yangi1 0 0 1 1
-----------------------------------------------------------------------------------------
p_matheny_lab 3 13 0 0
koolajd1 3 13 0 0
-----------------------------------------------------------------------------------------
p_meiler 0 0 1 1
yange8 0 0 1 1
-----------------------------------------------------------------------------------------
rer 4 48 0 0
cantrekb 1 6 0 0
hum6 2 32 0 0
paciarja 1 10 0 0
-----------------------------------------------------------------------------------------
rke_group 30 120 0 0
sleethmr 30 120 0 0
-----------------------------------------------------------------------------------------
rokaslab 3 25 0 0
copea1 1 1 0 0
riedlio 2 24 0 0
-----------------------------------------------------------------------------------------
rubinov_lab 2 38 0 0
abbasia 1 30 0 0
wuw11 1 8 0 0
-----------------------------------------------------------------------------------------
ruderferlab 2 2 1 6
davya2 1 1 1 6
palmesa3 1 1 0 0
-----------------------------------------------------------------------------------------
sbcs 1 2 0 0
liq17 1 2 0 0
-----------------------------------------------------------------------------------------
stein_lab 1 1 0 0
karakg1 1 1 0 0
-----------------------------------------------------------------------------------------
tk_lab 6 240 0 0
yoonh15 6 240 0 0
-----------------------------------------------------------------------------------------
vgi 15 139 0 0
gaow9 13 130 0 0
niarcm2 1 5 0 0
salerl1 1 4 0 0
-----------------------------------------------------------------------------------------
walker_lab 1 16 0 0
davishl4 1 16 0 0
-----------------------------------------------------------------------------------------
wankowicz_lab 1700 1700 50974 50974
wankows 1700 1700 50974 50974
-----------------------------------------------------------------------------------------
wan_lab 1 400 0 0
hardenn 1 400 0 0
-----------------------------------------------------------------------------------------
williams_roberson_lab 1 1 0 0
yeohb1 1 1 0 0
-----------------------------------------------------------------------------------------
womelsdorf_lab 1 20 0 0
azezewka 1 20 0 0
-----------------------------------------------------------------------------------------
zhu_group 1 32 1 32
zhuw12 1 32 1 32
-----------------------------------------------------------------------------------------
Totals: 2617 9991 69182 185143
Queue Summary (Batch GPU)
GROUP USER ACTIVE_JOBS ACTIVE_GPUS PENDING_JOBS PENDING_GPUS
-----------------------------------------------------------------------------------------
accre_guests_acc 5 5 0 0
lih30 3 3 0 0
liy110 2 2 0 0
-----------------------------------------------------------------------------------------
csb_gpu_acc 5 7 0 0
kaermel 1 1 0 0
karadim 1 1 0 0
lybrantp 1 1 0 0
ranx 2 4 0 0
-----------------------------------------------------------------------------------------
h_oguz_lab_acc 1 1 0 0
lih30 1 1 0 0
-----------------------------------------------------------------------------------------
h_vmac_acc 0 0 1 1
janveva 0 0 1 1
-----------------------------------------------------------------------------------------
mchaourab_acc 1 4 399 399
kaot1 0 0 399 399
wut18 1 4 0 0
-----------------------------------------------------------------------------------------
nbody_acc 2 16 3 3
bustam1 0 0 3 3
khanfm 2 16 0 0
-----------------------------------------------------------------------------------------
p_meiler_acc 1 2 1 1
labeilro 1 2 0 0
scotj14 0 0 1 1
-----------------------------------------------------------------------------------------
Totals: 15 35 404 404
Queue Summary (interactive)
GROUP USER ACTIVE_JOBS ACTIVE_CORES PENDING_JOBS PENDING_CORES
-----------------------------------------------------------------------------------------
g_giri_group_int 2 8 0 0
breyem3 1 4 0 0
giria1 1 4 0 0
-----------------------------------------------------------------------------------------
h_vmac_int 1 12 0 0
jackb13 1 12 0 0
-----------------------------------------------------------------------------------------
maiziezhou_lab_phd_int 2 25 1 20
luoc1 2 25 0 0
zhuy45 0 0 1 20
-----------------------------------------------------------------------------------------
rubinov_lab_int 1 10 0 0
sardarn 1 10 0 0
-----------------------------------------------------------------------------------------
vgi_int 2 22 0 0
nitinr 1 6 0 0
shellejp 1 16 0 0
-----------------------------------------------------------------------------------------
yang_lab_int 1 8 0 0
shaoq1 1 8 0 0
-----------------------------------------------------------------------------------------
Totals: 9 85 1 20
Queue Summary (interactive_gpu)
GROUP USER ACTIVE_JOBS ACTIVE_GPUS PENDING_JOBS PENDING_GPUS
-----------------------------------------------------------------------------------------
dsi_dgx_iacc 4 4 0 0
chattec 1 1 0 0
criswea 1 1 0 0
may19 1 1 0 0
mohamb2 1 1 0 0
-----------------------------------------------------------------------------------------
Totals: 4 4 0 0
Partition Summary
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
interactive up 14-00:00:0 8 mix cn[1287,1301-1302,1322,1328,1330,1812-1813]
interactive up 14-00:00:0 1 alloc cn1329
interactive up 14-00:00:0 19 idle cn[1323-1326,1707,1800-1811,1814,1816]
batch* up 14-00:00:0 2 drain* cn[1412,1593]
batch* up 14-00:00:0 9 drain cn[1547,1574-1577,1586-1589]
batch* up 14-00:00:0 100 mix cn[1202-1203,1207,1215-1216,1219,1222,1224,1226,1229-1231,1234,1237-1238,1327,1378-1380,1385,1415,1419,1430,1437-1438,1441,1448,1450,1454,1456,1463,1466,1470-1476,1481,1483-1484,1492,1496,1500-1501,1511,1513,1515-1517,1525,1529-1530,1532,1537,1543-1545,1548-1549,1551,1553-1556,1558,1562,1564,1567-1569,1571,1573,1579,1583,1594,1602-1605,1607,1609-1610,1612,1615-1616,1619,1625-1626,1630-1631,1701-1706,1708,2000]
batch* up 14-00:00:0 277 alloc cn[1204-1206,1208-1213,1217-1218,1220-1221,1223,1225,1227-1228,1232-1233,1235-1236,1239-1242,1257-1262,1264-1286,1288-1299,1303-1318,1320-1321,1331-1355,1357-1369,1371-1377,1381-1384,1387-1411,1414,1416-1418,1420-1427,1431-1432,1434-1436,1439-1440,1442-1443,1445-1447,1449,1452-1453,1455,1457-1458,1460-1462,1464,1467-1469,1477-1480,1482,1485-1491,1493-1495,1497-1499,1502-1510,1512,1514,1518-1520,1522-1524,1526-1528,1531,1533-1536,1538,1540,1546,1550,1552,1557,1559,1561,1563,1565-1566,1570,1578,1580-1582,1584-1585,1592,1595-1597,1606,1608,1613-1614,1617-1618,1620-1624,1627-1629,1632-1633,1700]
batch_gpu up 14-00:00:0 1 drain* gpu0084
batch_gpu up 14-00:00:0 1 drain gracehopper02
batch_gpu up 14-00:00:0 10 mix gpu[0059,0062-0063,0075-0077,0082],gracehopper01,hgx[01-02]
batch_gpu up 14-00:00:0 1 alloc gpu0302
batch_gpu up 14-00:00:0 46 idle gpu[0013,0015,0017-0022,0026-0027,0033-0034,0039,0045-0046,0049-0050,0053,0060-0061,0064-0074,0078-0081,0085,0300-0301,0303-0310]
interactive_gpu up 14-00:00:0 1 drain* dgx02
interactive_gpu up 14-00:00:0 2 mix dgx[01,03]
interactive_gpu up 14-00:00:0 3 idle dgx04,gpu[0058,0207]
sam up 2-02:00:00 2 alloc cms-sam-[01-02]
reserved inact infinite 0 n/a
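Note on reading the NODELIST column: entries such as cn[1547,1574-1577,1586-1589] use Slurm's compressed host list notation, where a bracketed group holds comma-separated numbers and inclusive ranges. On a cluster login node, `scontrol show hostnames` expands such an expression. Where Slurm is not available, the following is a minimal sketch of the same expansion, not Slurm's own implementation. The function name `expand_hostlist` is hypothetical, and the sketch assumes at most one bracket group per name, which holds for every entry in this report.

```python
import re


def expand_hostlist(expr):
    """Expand a compressed node list such as 'cn[1547,1574-1577]' into names.

    Assumption: each name has at most one bracket group (no nested or
    multi-dimensional expressions).
    """
    hosts = []
    # Take each comma-separated piece, keeping a trailing [...] group attached
    # so that commas inside brackets do not split the piece.
    for part in re.findall(r"[^,\[]+(?:\[[^\]]*\])?", expr):
        match = re.fullmatch(r"([^\[]+)\[([^\]]*)\]", part)
        if match is None:
            hosts.append(part)  # plain host name, e.g. 'cn1329'
            continue
        prefix, ranges = match.groups()
        for item in ranges.split(","):
            low, _, high = item.partition("-")
            high = high or low  # a single number is a range of one
            width = len(low)  # preserve zero padding, e.g. '0059'
            for number in range(int(low), int(high) + 1):
                hosts.append(f"{prefix}{number:0{width}d}")
    return hosts


if __name__ == "__main__":
    drained = expand_hostlist("cn[1547,1574-1577,1586-1589]")
    print(len(drained), drained)
```

The length of the expanded list can be compared against the NODES column of the same row as a quick sanity check.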