Client Access: External Monitoring Failure
Incident Report for Method13
Postmortem

Incident Summary

On 2024-09-13 at 5:13 MDT, the Client Access Portal (https://www.method13.com/crm/) experienced an unscheduled maintenance event, resulting in an outage that lasted approximately 42 minutes. During this time, clients were unable to access the portal to upload or retrieve files.

Timeline of Events

  • 5:13 MDT: Initial reports indicated that clients were unable to access the portal. Engineers were alerted and began investigating the issue.
  • 5:55 MDT: The issue was resolved, and the portal was fully operational again.

Root Cause

The root cause of the unscheduled maintenance was determined to be an authentication failure caused by a misconfiguration, which temporarily blocked client login attempts using the provided credentials.

Impact

  • Affected Users: All clients attempting to access the portal between 5:13 MDT and 5:55 MDT.
  • Business Impact: File transfers and client activities dependent on the portal were delayed for approximately 42 minutes.

Resolution Steps

  1. Engineers identified the issue with our SSO integration and applied a fix to restore normal functionality.
  2. The system was thoroughly tested post-resolution to ensure there were no remaining issues.

Lessons Learned

  1. Improve real-time monitoring of critical services to detect similar issues faster.
  2. Evaluate redundancy and failover mechanisms to prevent total service outages in case of a single point of failure.

Next Steps

  1. Enhance monitoring and alert systems to improve early detection of portal issues.
  2. Review and test critical system components to ensure stability and minimize downtime in the future.
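
As an illustration of the kind of external check described in the first step above, the following is a minimal sketch and not a description of our actual monitoring. It assumes a plain HTTP probe of the portal URL from the Incident Summary, and an alert that fires only after several consecutive failed probes so that a single network blip does not page anyone. The names `fetch_status`, `is_healthy`, and `FailureTracker`, and the threshold of 3, are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Optional

# URL taken from the Incident Summary; everything else here is illustrative.
PORTAL_URL = "https://www.method13.com/crm/"


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code, or None if the request could not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 401, 500).
        return exc.code
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def is_healthy(status: Optional[int]) -> bool:
    """Treat 2xx/3xx as healthy; error statuses and no response as unhealthy."""
    return status is not None and 200 <= status < 400


class FailureTracker:
    """Counts consecutive failed probes and signals when to alert."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold  # hypothetical value, tune as needed
        self.consecutive_failures = 0

    def record(self, healthy: bool) -> bool:
        """Record one probe result; return True exactly when the threshold is reached."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures == self.threshold
```

A scheduler would call `tracker.record(is_healthy(fetch_status(PORTAL_URL)))` at a fixed interval and raise an alert whenever it returns True. Because this incident involved an authentication failure, a probe of the landing page alone might not have caught it; a more complete check would also need to exercise the login path with a dedicated test account.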

Conclusion

We apologize for the inconvenience caused by this unscheduled maintenance and appreciate your patience as we worked to resolve the issue. We are taking steps to improve our systems and avoid future disruptions.

Posted Sep 15, 2024 - 13:49 MDT

Resolved
This incident has been resolved.
Posted Sep 13, 2024 - 05:55 MDT
Investigating
Engineers have been alerted to this status and are investigating.
Posted Sep 13, 2024 - 05:13 MDT