Current Trouble

Failure of Storage (already recovered)

Publication date: Nov. 11, 2019


To all supercomputer users

The storage (LARGE0/LARGE1) has been responding slowly since 2:20 p.m. on November 11.
We are currently investigating the cause and working to restore service. Please wait until the storage has recovered.

We apologize for any inconvenience this has caused you.


Additional notes: 2019/11/11 18:00

At present, part of the large-capacity disk (LARGE0/LARGE1) is not being recognized, and checking the disk is taking a long time.
Because of this trouble, jobs that have already been submitted are temporarily not being started.
Job execution will resume as soon as the storage recovers. Please wait a while longer. We will notify you again once it has recovered.


Additional notes: 2019/11/12 8:50

The failure was resolved on Monday, November 11 at 20:08.
Since the failure was caused by hardware, the equipment will be replaced today.

The system will not be stopped for the replacement of the equipment.
The following jobs were abnormally terminated or rerun because of the failure.
If your job terminated abnormally, please submit it again.


● System A: Jobs that terminated abnormally during the failure

1000209 1000213 1000219 1000217

● System B: Jobs that terminated abnormally during the failure

3947486 3943203 3950869 3950872 3949181 3949182 3921516
3921516 3949831 3950663 3906867 3950934 3950306 3950656
3949790 3950656 3950119 3950127 3906949 3950647 3950938
3949814 3950825 3950850 3949808 3947716 3949245 3928647
3942837 3942370 3950345 3947705 3945647 3947705

3945647

● System B: Jobs rerun after recovery from the failure

3949716 3910611 3928267 3910611 3928267 3949716 3949929
3936281 3947806 3936282 3950774 3928224 3928226 3938689
3938690 3950954 3922479 3950299 3925832 3950103 3950104
3950106 3949917 3947733 3947734 3947824 3947827 3950393
3925822 3918774 3924214 3947749 3947750 3947751 3950228
3950229 3950230 3945860 3945859 3917195 3950786 3947764
3947765 3947766 3950255 3950256 3950257 3947643 3947644
3947645 3950805[469] 3950805[474] 3950805[475] 3950805[476] 3950805[477] 3950805[478]
3950805[479] 3950805[480] 3950805[481] 3950805[482] 3950805[483] 3950805[484] 3950805[485]
3950805[486] 3950805[487] 3950805[488] 3950805[489] 3950805[490] 3950805[491] 3950805[492]
3950805[493] 3950805[494] 3950805[495] 3950805[496] 3950805[497] 3950805[498] 3950805[499]
3950805[146] 3950805[182] 3950805[500] 3950805[2] 3950805[38]

● System C: Jobs that terminated abnormally during the failure

369991 370323 370169 366932 367029 367030 370277

● System C: Jobs rerun after recovery from the failure

370264

The system has been restored. We apologize for any inconvenience and trouble this may have caused.

Date of occurrence: 2019/11/11 14:20 ~ 2019/11/11 20:08
Inquiry: Supercomputing Section, IT Services Division, Information Management Department, Kyoto University
E-mail: consultkudpc.kyoto-u.ac.jp
Copyright © Institute for Information Management and Communication, Kyoto University, all rights reserved.