Impact and Timelines
From 08:39:00 AM to 09:28:15 AM UTC on March 17, 2023, OKX's trading systems were partially or fully unavailable. The detailed incident timeline follows:
- 08:39:00 AM UTC: Intermittent alerts triggered in trading systems. Engineering teams initiated immediate investigations.
- 08:49:00 AM UTC: Trading suspended proactively to maintain an orderly market. Root cause identified.
- 08:50:00 AM UTC: Outage notification published on the Status Page.
- 09:18:15 AM UTC: Partial services restored (order cancellations, post-only orders, fund transfers).
- 09:28:15 AM UTC: Full trading services resumed.
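The elapsed time at each milestone can be checked with simple arithmetic on the timestamps above. The sketch below is illustrative only: the event labels and helper names are hypothetical and are not taken from any OKX system.

```python
from datetime import datetime

# Timestamps copied from the timeline above (all UTC, March 17, 2023).
# The event labels are illustrative names chosen for this sketch.
TIMELINE = [
    ("alerts_triggered", "08:39:00"),
    ("trading_suspended", "08:49:00"),
    ("status_page_notice", "08:50:00"),
    ("partial_restore", "09:18:15"),
    ("full_restore", "09:28:15"),
]


def elapsed_since_start(timeline, day="2023-03-17"):
    """Return {event: whole seconds elapsed since the first event}."""
    parsed = [
        (name, datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S"))
        for name, clock in timeline
    ]
    start = parsed[0][1]
    return {name: int((ts - start).total_seconds()) for name, ts in parsed}


def format_mmss(seconds):
    """Format a second count as minutes and seconds, e.g. 600 -> '10m 00s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


elapsed = elapsed_since_start(TIMELINE)
for name, seconds in elapsed.items():
    print(f"{name}: +{format_mmss(seconds)}")
```

Run as written, the last line printed is `full_restore: +49m 15s`, and trading suspension falls 10 minutes after the first alerts.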
Root Cause Analysis
The downtime resulted from resource exhaustion in a core infrastructure component due to:
- Unexpectedly high transient load from a log process.
- Subsequent failure cascading to downstream trading systems.
Proactive measure: Trading suspension prevented disorderly market conditions during resolution.
Preventative Actions
To minimize future disruptions, OKX is implementing:
Technical Optimizations
- Log process scaling (e.g., file size limits).
- Server/client-end monitoring enhancements.
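As one illustration of what a file size limit on a log process can look like, the sketch below uses Python's standard `logging.handlers.RotatingFileHandler` to cap both the size of each log file and the number of files kept. This is an assumption-based example, not a description of OKX's actual logging stack; the function name, logger name, and limits are hypothetical.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_capped_logger(path, max_bytes=1_000_000, backup_count=3):
    """Return a logger whose on-disk footprint is bounded.

    At most (backup_count + 1) files are kept, and the active file is
    rotated before a new record would push it to max_bytes. All names
    and limits here are illustrative.
    """
    logger = logging.getLogger(f"capped:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backup_count
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With this kind of cap, a sudden burst of log output overwrites the oldest rotated file instead of growing without bound, so disk usage stays near `max_bytes * (backup_count + 1)` as long as individual records are smaller than `max_bytes`.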
Procedural Improvements
- Detailed incident documentation for root-cause analysis.
- Streamlined alert protocols for faster response.
System Redundancies
- Upgraded infrastructure resilience.
Commitment to Reliability
OKX prioritizes:
- Transparency: Real-time updates via the Status Page and API channels.
- Prevention: Continuous system performance audits.
- Communication: Timely notifications through official channels.
FAQs
Q: How long did the outage last?
A: 49 minutes (08:39–09:28 AM UTC).
Q: Were user funds affected?
A: No. Fund safety protocols remained intact.
Q: What's being done to prevent recurrence?
A: Infrastructure upgrades and enhanced monitoring.
Q: Where can I check real-time system status?
A: Visit the OKX Status Page.
Note: This report replaces all prior communications dated March 20, 2023.