Admin Operations¶

This guide is for tenant administrators and operators who run SigID as part of a production application. It focuses on repeatable operating practices rather than implementation internals.

Operational ownership¶

Assign owners before production launch.

Area	Owner should know
Tenant administration	Who can change identity settings
Application integration	Redirect URIs, client secrets, scopes, token validation
User operations	Invitations, suspension, removal, support escalation
Organization operations	SSO, domains, ownership, directory sync
Agent operations	Agent registration, keys, delegations, vault grants
Security operations	Audit review, incidents, secret rotation, emergency freezes
Billing operations	Plan limits, checkout, portal, payment failures

Document owners in your internal runbook. Identity incidents are resolved faster when everyone knows who can make changes.

Administrator access¶

Administrators should use the strongest available account security.

Minimum standard:

at least two administrators
MFA required
passkeys preferred
no shared administrator accounts
break-glass access documented
quarterly access review
immediate removal after role or employment changes
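
The minimum standard above can be checked mechanically during an access review. The following is a minimal sketch, assuming administrator accounts can be exported into simple records. The `AdminAccount` fields and the 90-day approximation of "quarterly" are illustrative assumptions, not part of any SigID API.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AdminAccount:
    # Hypothetical export format; field names are assumptions for illustration.
    email: str
    mfa_enabled: bool
    shared: bool
    last_reviewed: date


def admin_standard_violations(admins: list[AdminAccount], today: date) -> list[str]:
    """Return human-readable violations of the minimum administrator standard."""
    problems = []
    if len(admins) < 2:
        problems.append("fewer than two administrators")
    for admin in admins:
        if not admin.mfa_enabled:
            problems.append(f"{admin.email}: MFA not enabled")
        if admin.shared:
            problems.append(f"{admin.email}: shared account")
        # "Quarterly" is approximated as 90 days (an assumption).
        if today - admin.last_reviewed > timedelta(days=90):
            problems.append(f"{admin.email}: access review overdue")
    return problems
```

An empty result means the exported accounts meet the checks encoded here. Passkey preference and break-glass documentation still need a manual look.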

Keep administrator access separate from normal product access when possible.

Change management¶

Treat these as high-risk changes:

adding or removing administrators
changing redirect URIs or allowed origins
adding SSO enforcement
changing organization ownership
adding broad scopes or roles
rotating client or webhook secrets
changing agent delegation policy
enabling wallet signing or increasing budgets
deleting a webhook subscription

For high-risk changes:

Record the reason.
Confirm the requester and approver.
Make the smallest possible change.
Verify the expected behavior.
Review audit logs.
Notify support if user-facing behavior changed.

Database partition operations¶

Hash partition counts are part of capacity planning, not routine tuning. Revisit partition counts for the `session`, `tenant_membership`, `tenant_subject_identifier`, and referral tables during release planning when any of these conditions holds for a full business week:

average live rows exceed 1 million per hash partition
any partition is more than 4x the median live-row count
btree indexes for hot tenant-scoped lookups no longer fit in shared buffers
autovacuum lag for hot partitions exceeds the write retention target
p95 latency for tenant-scoped session, membership, or referral queries regresses by 25 percent or more after normal vacuum/analyze has caught up
partition planning time is a measurable part of p95 query latency on tenant-scoped hot paths
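
The two row-count triggers in the list above lend themselves to an automated check. This is a minimal sketch, assuming you already collect one live-row estimate per hash partition (on PostgreSQL, for example, from `pg_stat_user_tables.n_live_tup`). The function name is hypothetical, and the sketch evaluates a single snapshot; the guidance above applies only when a condition holds for a full business week.

```python
from statistics import mean, median

ROWS_PER_PARTITION_LIMIT = 1_000_000  # average live rows per hash partition
SKEW_LIMIT = 4  # multiple of the median live-row count


def partition_row_triggers(live_rows: list[int]) -> list[str]:
    """Evaluate the row-count triggers for one partitioned table.

    live_rows holds one live-row estimate per hash partition.
    """
    triggers = []
    if mean(live_rows) > ROWS_PER_PARTITION_LIMIT:
        triggers.append("average live rows exceed 1 million per partition")
    if max(live_rows) > SKEW_LIMIT * median(live_rows):
        triggers.append("a partition exceeds 4x the median live-row count")
    return triggers
```

Recording the result daily and alerting only after five consecutive business days of the same trigger would match the "full business week" condition. The remaining triggers (cache residency, autovacuum lag, latency, planning time) need metrics this sketch does not cover.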

Increase partition counts only with a fresh migration plan that creates the new partitioned table, copies rows by tenant hash, validates row counts and constraints, swaps names during a maintenance window, and drops the old table only after the rollback window has passed. Decrease counts only when planning and autovacuum overhead are the bottleneck and per-partition indexes remain comfortably cache-resident after the merge.
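
The "validates row counts" step of the migration plan can be illustrated with a small comparison. This is a sketch under assumptions: it expects per-tenant row counts for the old and new tables to have been gathered already (for instance with a grouped count query keyed on a tenant identifier column, whose name depends on your schema), and the function name is hypothetical.

```python
def row_count_mismatches(
    old_counts: dict[str, int], new_counts: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Return tenants whose row counts differ between the old and new table.

    Each value is (old_count, new_count); a tenant missing on one side counts as 0.
    """
    mismatches = {}
    for tenant in old_counts.keys() | new_counts.keys():
        old = old_counts.get(tenant, 0)
        new = new_counts.get(tenant, 0)
        if old != new:
            mismatches[tenant] = (old, new)
    return mismatches
```

An empty result is a precondition for the name swap, not proof of correctness. Constraint validation and spot checks of row contents are still needed.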

Access reviews¶

Run access reviews on a schedule that matches tenant risk.

Review:

tenant administrators
organization owners and admins
users with billing, policy, webhook, or agent permissions
support impersonation or break-glass permissions
active API keys and OAuth clients
webhook subscriptions and event coverage
agents, keys, anchors, and delegations
vault grants and wallet access
stale invitations and suspended users

Review output should include the reviewer, date, exceptions, corrective actions, and due dates.

Support runbook¶

Common support workflows:

Request	First checks
User cannot sign in	Tenant membership, suspension, SSO enforcement, MFA state
User cannot access app	Token scopes, organization membership, role assignment
SSO login fails	Domain verification, provider metadata, client secret, redirect URI
Callback fails	Registered redirect URI, `state`, PKCE, client ID
Webhook missing	Subscription active, event type selected, delivery status
Agent denied	Anchor status, key status, scopes, delegation, policy
Billing blocked	Plan limits, payment state, billing role

Keep support replies user-facing. Avoid exposing internal event payloads, tokens, secrets, or policy internals.
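
For the "Callback fails" row, two of the first checks (`state` and PKCE) can be reproduced offline when triaging an integration. The sketch below follows the S256 method from RFC 7636. The function names and the idea that support has access to the stored challenge are assumptions for illustration, not a description of SigID internals.

```python
import base64
import hashlib
import hmac


def pkce_s256_challenge(code_verifier: str) -> str:
    """Derive the S256 code challenge from a verifier (RFC 7636, section 4.2)."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    # Base64url without padding, as the RFC requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def callback_problems(
    sent_state: str, returned_state: str, code_verifier: str, stored_challenge: str
) -> list[str]:
    """List which of the state and PKCE checks fail for one callback."""
    problems = []
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(sent_state.encode(), returned_state.encode()):
        problems.append("state mismatch")
    derived = pkce_s256_challenge(code_verifier)
    if not hmac.compare_digest(derived.encode(), stored_challenge.encode()):
        problems.append("PKCE verifier does not match challenge")
    return problems
```

Verifiers and `state` values are secrets for the duration of the flow, so run checks like this only on test traffic or redacted reproductions, never on values pasted into a support ticket.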

Incident response¶

Follow the incident response procedure and apply the containment actions below.

Risk	Action
Compromised user	Suspend tenant user, revoke sessions, review grants and audit logs
Compromised admin	Remove admin role, revoke sessions, rotate any secrets the account created or changed
Exposed OAuth secret	Rotate client secret and review token activity
Exposed webhook secret	Rotate webhook secret and reject old signatures
Agent compromise	Suspend agent, revoke keys, delegations, and vault grants
Wallet risk	Freeze wallet signing and review transactions
SSO outage	Use break-glass admin and disable enforcement only if approved
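
For the exposed webhook secret row, "reject old signatures" means the receiving endpoint should verify against the new secret only. The following is a minimal sketch that assumes deliveries are signed with HMAC-SHA256 over the raw request body and sent as a hex digest. That scheme and the function names are assumptions; confirm the actual signature format in the webhook documentation before relying on it.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature (assumed scheme, for illustration)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(body: bytes, signature: str, current_secret: bytes) -> bool:
    """Accept a delivery only if it was signed with the current secret."""
    expected = sign(current_secret, body)
    # Constant-time comparison; signatures made with a rotated-out secret fail here.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the receiver holds no reference to the old secret after rotation, a delivery signed with the exposed secret cannot validate. Some teams accept both secrets briefly during planned rotations, but after an exposure the old secret should not be accepted at all.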

Audit review¶

Audit logs should answer who changed what, when, from where, and with what outcome.

Prioritize review for:

administrator access changes
organization ownership transfer
SSO configuration
domain verification
role assignment
policy updates
agent key or anchor changes
vault credential access
wallet signing
webhook creation or secret rotation
billing changes
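
One way to make the "who changed what, when, from where, and with what outcome" test concrete is to check that exported audit events carry a field for each question, then filter for the priority categories above. This is a sketch only: the `AuditEvent` shape and the action names in `PRIORITY_ACTIONS` are hypothetical and would need to be mapped to the event types your tenant actually emits.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical action names covering a few of the priority categories above.
PRIORITY_ACTIONS = {
    "admin.access_changed",
    "organization.ownership_transferred",
    "sso.config_updated",
    "role.assigned",
    "webhook.secret_rotated",
}


@dataclass(frozen=True)
class AuditEvent:
    actor: str             # who
    action: str            # what kind of change
    target: str            # what was changed
    occurred_at: datetime  # when
    source_ip: str         # from where
    outcome: str           # with what outcome, e.g. "success" or "failure"


def priority_events(events: list[AuditEvent]) -> list[AuditEvent]:
    """Keep only events whose action falls in a priority review category."""
    return [event for event in events if event.action in PRIORITY_ACTIONS]
```

If an exported event cannot populate one of these fields, that gap is itself worth raising, because the log can no longer answer the corresponding question.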

Production readiness¶

Before launch, ensure both the server infrastructure and tenant configuration are production-ready. For server-level configuration (database, TLS, KMS, production mode, environment variables), see Server Configuration.

Tenant-level launch checks: complete the production checklist, then confirm these additional operational readiness items:

support has a runbook for login, SSO, and consent triage
billing owner understands plan limits
audit review ownership is assigned
on-call rotation covers identity incidents

Ongoing maintenance¶

Schedule:

Frequency	Activity
Weekly	Review failed webhooks, SSO failures, and high-severity audit events
Monthly	Review administrators, organization owners, active agents, and stale invitations
Quarterly	Review OAuth clients, redirect URIs, webhook subscriptions, roles, and policies
After incidents	Rotate affected secrets and validate that detection rules fired

Professional identity operations are boring on purpose: narrow access, documented ownership, reviewed changes, and predictable recovery paths.