Failure of Systems B and C (already recovered)
Publication date: Oct. 16, 2017
A system failure occurred on Systems B and C from 2:50 p.m. to 5:10 p.m. on October 16, 2017. High-load programs escaped the control of the job scheduler and ran on computation nodes where other jobs were already running. During this period, memory load was high on computation nodes nb-0001 to nb-0702 of System B and nc-0001 to nc-0016 of System C.
Because of this failure, the performance of jobs running during this period may have degraded. In addition, on System C, some jobs may have been abnormally terminated or re-executed. Please check your jobs and resubmit any that were abnormally terminated.
The programs in question have now been canceled, and the systems have been restored.
We apologize for any inconvenience this may have caused you.
|Date of occurrence|2017/10/16 14:50 to 2017/10/16 17:10|
Supercomputing Section, IT Services Division, Information Management Department, Kyoto University