Source: lists.sync.global

CIP-TBD: Canton Ecosystem Status Page & Incident Transparency

cip-discuss · 3 messages · started 06-02-2026
Also mentions: CIP-0023, CIP-0060, CIP-0062, CIP-0089
  1. #1 Evan V. Temple Digital Group, 06-02-2026

    Abstract

    This CIP proposes the creation of a Canton Ecosystem Status Page and a complementary Slack channel to provide real-time visibility into network health, outages, scheduled maintenance, and incident resolution across the Global Synchronizer and connected infrastructure. The goal is to eliminate redundant back-and-forth communication during technical incidents, increase transparency between Super Validators, Validators, and application builders, and let participants across the ecosystem contribute to incident awareness.

    Motivation

    As the Canton Network matures and onboards more validators, application providers, and institutional participants, the need for a centralized, authoritative source of network health information has become critical. Today, when outages or degraded performance occur on the Global Synchronizer — whether during planned upgrades (e.g., Major Upgrades with Downtime per CIP-0062, CIP-0089) or unplanned incidents — builders and validators must rely on fragmented communication across mailing lists, Slack threads, and ad-hoc messages to understand what is happening, what is affected, and what the resolution timeline looks like.

    This creates several problems:

    For Builders: Application developers building on Canton (e.g., trading platforms, settlement systems, custody integrations) cannot easily distinguish between an issue in their own stack and a network-level incident. Without a canonical status page, engineering teams waste hours debugging local infrastructure before discovering the root cause is upstream.

    For Validators: Validator operators, including Node-as-a-Service providers hosting on behalf of institutions, need timely incident data to communicate with their own customers. 

    For Super Validators: During incidents, Super Validator operators are often inundated with the same questions from multiple parties simultaneously. A canonical status source would dramatically reduce this operational overhead and let SV ops teams focus on resolution rather than communication.

    Specification

    1. Public Status Page

    Deploy a publicly accessible status page (e.g., status.canton.network or status.sync.global) that provides:

    Component Monitoring:

    • Global Synchronizer (sequencer health, round progression)
    • Scan API and Validator API availability
    • SV/Validator availability and response times
    • DevNet, TestNet, and MainNet environments (displayed separately)

    Incident Management:

    • Real-time incident creation with severity levels (Operational, Degraded Performance, Partial Outage, Major Outage)
    • Timestamped status updates throughout the incident lifecycle (Investigating → Identified → Monitoring → Resolved)
    • Post-incident summaries with root cause analysis
    • Scheduled maintenance windows with advance notice (minimum 48 hours for planned upgrades, consistent with existing Major Upgrade with Downtime procedures)
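
    As an illustration only, the severity levels, lifecycle states, and 48-hour notice rule listed above could be modeled roughly as follows. The class and function names are hypothetical, and the rule that an incident may only move forward through the lifecycle (skipping states is allowed) is an assumption, not something this CIP specifies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum


class Severity(Enum):
    OPERATIONAL = "Operational"
    DEGRADED_PERFORMANCE = "Degraded Performance"
    PARTIAL_OUTAGE = "Partial Outage"
    MAJOR_OUTAGE = "Major Outage"


class Status(Enum):
    INVESTIGATING = "Investigating"
    IDENTIFIED = "Identified"
    MONITORING = "Monitoring"
    RESOLVED = "Resolved"


# Enum members iterate in definition order, which matches the lifecycle.
LIFECYCLE = list(Status)

# Minimum advance notice for planned upgrades, as stated in the proposal.
MIN_MAINTENANCE_NOTICE = timedelta(hours=48)


@dataclass
class Incident:
    title: str
    severity: Severity
    status: Status = Status.INVESTIGATING
    updates: list = field(default_factory=list)

    def advance(self, new_status, note, now=None):
        """Record a timestamped update and move the incident forward.

        Assumption: moving backward (or staying in place) is rejected.
        """
        if LIFECYCLE.index(new_status) <= LIFECYCLE.index(self.status):
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        now = now or datetime.now(timezone.utc)
        self.status = new_status
        self.updates.append((now, new_status, note))


def has_sufficient_notice(announced_at, window_start):
    """True if a maintenance window was announced at least 48 hours ahead."""
    return window_start - announced_at >= MIN_MAINTENANCE_NOTICE
```

    Keeping the update history as a list of (timestamp, status, note) tuples is one simple way to back the post-incident summary; a hosted status page platform would normally store this itself.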

    Subscription Mechanism:

    • Email, Slack, webhook, and RSS subscription options for status updates
    • Granular subscriptions (e.g., MainNet-only, specific components)
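
    A minimal sketch of how granular subscriptions might be matched against status events is shown below. The `Subscription` type and its fields are hypothetical; the convention that an empty filter means "everything" is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    channel: str  # e.g. "email", "slack", "webhook", or "rss"
    environments: frozenset = frozenset()  # empty set = all environments
    components: frozenset = frozenset()    # empty set = all components

    def matches(self, environment, component):
        """True if an event for this environment/component should be delivered."""
        env_ok = not self.environments or environment in self.environments
        comp_ok = not self.components or component in self.components
        return env_ok and comp_ok


# Example: a subscriber who only wants MainNet events, for any component.
mainnet_only = Subscription(channel="email", environments=frozenset({"MainNet"}))
```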

    2. Slack Channel with Webhook Integration

    Create a dedicated Slack channel (e.g., #canton-status) within the Canton Network Slack workspace that:

    • Receives automated posts from the status page via webhook whenever an incident is created, updated, or resolved
    • Posts scheduled maintenance reminders at 48h, 24h, and 1h before planned windows
    • Includes links back to the full status page for detailed information and historical context
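
    The sketch below shows one way the reminder schedule and the Slack post could work if a custom integration were needed. It assumes a standard Slack incoming webhook, which accepts a JSON body with a `text` field; the function names and message layout are hypothetical, and most hosted status page platforms provide this integration without custom code.

```python
import json
import urllib.request
from datetime import timedelta

# Reminder offsets before a planned maintenance window, as proposed above.
REMINDER_OFFSETS = (timedelta(hours=48), timedelta(hours=24), timedelta(hours=1))


def reminder_times(window_start, now):
    """Return the reminder times that are still in the future, earliest first."""
    return [window_start - off for off in REMINDER_OFFSETS if window_start - off > now]


def build_slack_message(event, title, status, status_page_url):
    """Build an incoming-webhook payload that links back to the status page."""
    return {
        "text": f"[{event}] {title}: {status}\n<{status_page_url}|View on status page>"
    }


def post_to_slack(webhook_url, message):
    """POST the payload to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

    Filtering out reminder times that have already passed means a window announced with less than 48 hours of notice still receives its 24h and 1h reminders.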

    Implementation

    Platform

    The status page should be deployed using an established status page platform (e.g., Atlassian Statuspage, Instatus, or an open-source solution like Cachet or Upptime) that supports:

    • Webhook integrations for Slack and other messaging platforms
    • API access for programmatic incident creation and status queries
    • Custom domain hosting under a canton.network or sync.global subdomain
    • SSO integration for administrator access

    Phase 1 (Within 30 days of CIP approval):

    • Deploy status page with manual incident management for MainNet
    • Create the Slack channel with webhook integration
    • Establish incident severity definitions and escalation procedures

    Phase 2 (Within 90 days of CIP approval):

    • Add DevNet and TestNet environments
    • Implement automated health checks for key components (round progression, API availability)
    • Enable subscriber notifications (email, webhook, RSS)
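
    As a rough sketch of an automated round-progression check, the function below flags the network as stalled when the observed round number has not advanced within a threshold. How round numbers are sampled (for example, by polling a Scan API) is left out, and the threshold value is an operator choice that this CIP does not define; all names here are hypothetical.

```python
from datetime import timedelta


def last_advance_time(samples):
    """Return the time at which the round number last increased.

    `samples` is a list of (observed_at, round_number) tuples, oldest first.
    """
    last_round = None
    advanced_at = None
    for observed_at, round_number in samples:
        if last_round is None or round_number > last_round:
            last_round = round_number
            advanced_at = observed_at
    return advanced_at


def round_progression_state(samples, now, stall_after):
    """Classify round progression as 'unknown', 'stalled', or 'progressing'."""
    advanced_at = last_advance_time(samples)
    if advanced_at is None:
        return "unknown"  # no data yet, so make no claim either way
    if now - advanced_at > stall_after:
        return "stalled"
    return "progressing"
```

    Returning "unknown" when there are no samples avoids reporting an outage simply because the monitor itself has just started or lost connectivity.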

    Phase 3 (Within 120 days of CIP approval):

    • Integrate automated monitoring that can trigger incident creation based on anomaly detection
    • Publish a public SLA dashboard with historical uptime metrics
    • Evaluate integration with the Canton Scan explorer for correlated visibility
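
    For the historical uptime metrics, one possible calculation is sketched below: outage intervals are clipped to the reporting window, overlapping intervals are merged so downtime is not counted twice, and uptime is reported as a percentage. The function name and input shape are hypothetical, and which incident severities count as downtime is a policy decision left open here.

```python
def uptime_percent(window_start, window_end, outages):
    """Percentage of the window not covered by any outage interval.

    `outages` is an iterable of (start, end) datetime pairs.
    """
    total = (window_end - window_start).total_seconds()
    if total <= 0:
        raise ValueError("window_end must be after window_start")

    # Keep only outages that intersect the window, clipped to its bounds.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in outages
        if end > window_start and start < window_end
    )

    downtime = 0.0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # Disjoint from the running interval: close it and start a new one.
            if current_end is not None:
                downtime += (current_end - current_start).total_seconds()
            current_start, current_end = start, end
        else:
            # Overlaps the running interval: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        downtime += (current_end - current_start).total_seconds()

    return 100.0 * (total - downtime) / total
```

    For example, two overlapping outages covering hours 0 to 2 and 1 to 3 of a 100-hour window count as 3 hours of downtime, not 4.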

    Governance

    • The SV Operations Committee (CIP-0060) is responsible for administering the status page and approving incident publications
    • Any Super Validator operator can create an incident
     
    This CIP is licensed under CC0-1.0: Creative Commons CC0 1.0 Universal.
  2. #2 Wayne Collier, 10-02-2026
    On Fri, Feb 6, 2026 at 11:54 AM, Evan V. Temple Digital Group wrote:
    The SV Operations Committee (CIP-0060)
    Could you clarify which CIP you're referring to here? CIP-0023 was withdrawn, and CIP-0060 does not refer to an SV Operations Committee. 
  3. #3 Evan V. Temple Digital Group, 10-02-2026
    Hey Wayne,
     
    Thanks for your feedback. I must have copied the wrong CIPs; I was referring to updates that required downtime, such as https://github.com/canton-foundation/cips/blob/main/CIP-0089/CIP-0089.md