Applications unavailable

Incident Report for Simployer

Postmortem

Postmortem report for the incident on October 31st, 2019

At 17:47 (Norwegian time) on October 31st, 2019, our hosting provider initiated a disaster test in their datacenter in Halden, Norway. The test was meant to simulate a power failure in the datacenter and engage the diesel generators for backup power. Unfortunately, none of their customers were warned ahead of this test, and the test was run outside of scheduled service windows.

The test went horribly wrong, and power to the whole datacenter was not restored as expected. As a result, all our products and services went down.

We were notified of the problem immediately by our monitoring systems, which run in another datacenter, and started working with the hosting provider to restore services. We also posted regular status updates for customers throughout the incident.

Unfortunately, bringing a datacenter back to full operation after such an incident takes time, and our services were not fully operational until 22:45 Norwegian time. No data was lost as a result of the incident.

Scheduled integrations that were set to run between our systems and customers’ systems were not executed between 17:47 and 22:45. If you have custom integrations that were affected by the outage, please contact our Customer Care, and they will provide help.

We will follow up with the hosting provider to make sure such tests are never run outside of scheduled service windows.

We apologize to all our customers affected by the outage. This should never have happened, and we will do our very best, together with our subcontractors, to make sure it does not happen again.

Best regards

Flemming Ottosen

Technology Director

Infotjenester AS

Posted Nov 01, 2019 - 09:24 CET

Resolved

This incident has been resolved.
Posted Nov 01, 2019 - 08:44 CET

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Oct 31, 2019 - 22:54 CET

Update

All services are back up and running. We will continue to monitor the situation to verify the stability of all services.
We are truly sorry for any inconvenience the outage may have caused.
Posted Oct 31, 2019 - 22:51 CET

Update

The incident is related to a major outage with our Norwegian hosting partner. We are working with the hosting provider to restore all services as fast as possible.
Posted Oct 31, 2019 - 20:06 CET

Update

We are continuing to work on a fix for this issue.
Posted Oct 31, 2019 - 19:21 CET

Identified

An issue has caused our applications to be unavailable. We have identified the issue and are working on restoring them.
We apologize for the inconvenience.
Posted Oct 31, 2019 - 19:20 CET
This incident affected: Simployer HRM (HRM Web application, HRM Mobile app, HRM Reports, HRConnect API), Status notifications (Status Email Notifications, Status SMS Notifications), Simployer Handbooks (Handbooks web, Handbooks Mobile), Simployer Expert (Expert help, My messages), Simployer Middleware (Filedrop, Middleware, SFTP), Simployer Talent Portal (Web Application, Public API, Notifications, Middleware (User imports), Reports), Simployer Time and Planning (Capitech Tid, Capitech Flow, My Capitech), Simployer Logon, and Simployer HES Deviation.