Cluster down
StarExec is unexpectedly down today due to a power outage. We had been told it would be a brief outage, so we took down all2.q, since those nodes are not on a UPS. We expected n033 to n192 to stay up thanks to their UPS. Unfortunately, there was some miscommunication between the data center folks and our IT folks, and the outage is 4 hours, not 2 minutes. So the whole cluster is down now, and will be until the afternoon.
Hopefully this will not inconvenience anyone too much. For running jobs, you may just have to re-run a few job pairs that might have gotten lost when the cluster was taken down. This can be done from the details page for the job pair, or via StarExecCommand.
Sorry for the inconvenience and the short notice (we just found out about this minutes ago),
Aaron