
Case Study: WebAuthn Passkey Authentication

This case study demonstrates the specification framework through a complex feature example: adding WebAuthn passkey authentication to a video surveillance platform backend. It highlights the challenges of integrating with a third-party identity provider (AWS Cognito) and of managing breaking changes imposed by that provider's constraints.

High-Level Specification (Proposal)

The proposal captures business intent, technical constraints, and breaking change implications.

markdown
# Change Proposal: Add WebAuthn Passkey Support

## Why

Users increasingly expect passwordless authentication options for better security
and user experience. Passkeys (WebAuthn) provide phishing-resistant authentication
that's more secure than traditional passwords and easier for users. AWS Cognito
added native support for passwordless authentication including passkeys in
November 2024, making this a viable enhancement for the platform.

## What Changes

- Add WebAuthn/passkey authentication as an alternative first-factor method
- Implement passkey registration workflow (up to 20 passkeys per user)
- Add GraphQL APIs for passkey management (register, list, delete)
- Modify authentication flow to support password and passwordless paths
- Add organization-level setting `passkeyEnabled` (default false)

## Impact

### Affected Specs
- NEW: authentication/passkey-management
- MODIFIED: organization-settings (add passkeyEnabled flag with MFA constraint)

### Breaking Changes
- **BREAKING**: Organizations with `isRequiredMFA=true` cannot enable passkey
  authentication (AWS Cognito constraint: MFA and passwordless are mutually exclusive)
- Organizations must choose between enforced MFA or passkey support
- Users with personal MFA enabled cannot authenticate with passkeys

### Dependencies
- AWS Cognito Essentials or Plus plan (Lite plan does not support passkeys)
- Frontend must implement WebAuthn browser APIs
- Choice-based authentication flow configuration in Cognito

### Migration
- No data migration required for existing users
- Organizations wanting passkeys must disable MFA enforcement first
- Feature flag for controlled rollout

Key Characteristics:

  • Explicitly identifies platform constraint (Cognito MFA/passkey mutual exclusivity)
  • Classifies breaking change with clear impact scope
  • Lists external dependencies that affect implementation choices

Design Document

The design document captures architectural decisions driven by external platform constraints.

markdown
# Design Document: WebAuthn Passkey Support

## Context

Vortex Backend uses AWS Cognito for user authentication with two MFA mechanisms:
1. **User-level MFA**: Individual users enable TOTP via `AdminSetUserMFAPreference`
2. **Organization-level MFA**: Organizations require all users to use MFA via
   `isRequiredMFA` flag

AWS Cognito constraint (November 2024 documentation):
> "If you require multi-factor authentication (MFA) in your user pool,
> you cannot use passwordless authentication."

## Goals / Non-Goals

### Goals
- Enable passwordless authentication using WebAuthn passkeys
- Support passkey registration, listing, and deletion
- Maintain backward compatibility with password + MFA authentication
- Provide clear configuration options for organizations

### Non-Goals
- Replace existing password-based authentication
- Modify existing TOTP MFA implementation
- Custom WebAuthn implementation (will use Cognito native support)
- Passkey-based MFA (AWS Cognito doesn't support this)

## Decisions

### Decision 1: Use Cognito Native Passkey Support
**Choice:** Use AWS Cognito's native WebAuthn support rather than custom implementation.

**Rationale:**
- Reduces implementation complexity and security risks
- Leverages AWS's managed infrastructure and compliance
- Native integration with existing user pools
- Built-in support for standard WebAuthn APIs

**Alternatives Considered:**
- Custom WebAuthn with Lambda: Rejected due to security audit requirements
- Third-party service (Okta): Rejected to avoid vendor migration costs

### Decision 2: MFA and Passkey Mutual Exclusivity at Organization Level
**Choice:** Add `passkeyEnabled` boolean (default false) with validation:
`isRequiredMFA` and `passkeyEnabled` cannot both be true.

**Rationale:**
- AWS Cognito enforces this at the user pool level
- Clear separation prevents user confusion
- Explicit opt-in for passkey functionality

**Alternatives Considered:**
- Auto-disable MFA when passkeys enabled: Too dangerous (security downgrade)
- Separate user pool for passkey users: Too complex (data sync issues)

### Decision 3: Support Both ES256 and RS256 Algorithms
**Choice:** Accept both ES256 (-7) and RS256 (-257) for passkey registration.

**Rationale:**
- Maximizes device compatibility (different platforms prefer different algorithms)
- Cognito supports both; no security downside

### Decision 4: Support Synced and Device-bound Passkeys
**Choice:** Allow both device-bound and synced passkeys (iCloud Keychain, Google).

**Rationale:**
- Better UX (passkeys work across user's devices)
- Industry trend toward synced passkeys for usability
- Users choose based on security/convenience preference

## Risks

### Risk 1: Organizations Cannot Use Both MFA and Passkeys
**Impact:** Organizations requiring MFA for compliance must choose between
enforcement and passkey convenience.

**Mitigation:**
- Document constraint clearly in admin UI
- Provide migration guide
- Monitor AWS Cognito roadmap for future support

### Risk 2: User Lockout
**Impact:** Users might lose access if they lose passkey device without backup.

**Mitigation:**
- Allow multiple passkeys per user (up to 20)
- Encourage synced passkeys
- Always keep password authentication as backup

Key Characteristics:

  • Documents external platform constraints as first-class concerns
  • Each decision traces to platform limitation or security consideration
  • Risks acknowledge constraints outside team's control
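
As a minimal sketch of the validation in Decision 2, the rule can be expressed as a single check applied to the proposed settings before they are saved. The class and function names (`OrgSettings`, `SettingsError`, `update_settings`) and the message text are hypothetical; only the two flags and the `MFA_PASSKEY_CONFLICT` error code come from the specification documents.

```python
from dataclasses import dataclass, replace


class SettingsError(Exception):
    """Raised when an organization settings update is rejected (hypothetical type)."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


@dataclass(frozen=True)
class OrgSettings:
    is_required_mfa: bool = False
    passkey_enabled: bool = False  # default false: passkeys are an explicit opt-in


def validate_settings(settings: OrgSettings) -> OrgSettings:
    """Reject the one combination that Decision 2 forbids."""
    if settings.is_required_mfa and settings.passkey_enabled:
        raise SettingsError(
            "MFA_PASSKEY_CONFLICT",
            "Required MFA and passkeys cannot both be enabled; "
            "disable MFA enforcement before enabling passkeys.",
        )
    return settings


def update_settings(current: OrgSettings, **changes: bool) -> OrgSettings:
    """Apply changes to a copy and validate the result before accepting it."""
    return validate_settings(replace(current, **changes))
```

Validating the resulting state, rather than the individual change, means the same check covers both directions: enabling passkeys while MFA is required, and requiring MFA while passkeys are enabled.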

Domain Specification

Domain specifications define requirements with scenarios covering the constraint.

markdown
# Authentication Capability: WebAuthn Passkey Support

## Requirement: Passkey Registration Challenge
The system SHALL provide an API to initiate passkey registration.

#### Scenario: User requests passkey registration challenge
- **GIVEN** a user is authenticated
- **AND** the organization has `passkeyEnabled=true`
- **WHEN** the user requests a registration challenge via `registerPasskeyChallenge`
- **THEN** the system SHALL return challenge ID and WebAuthn options
- **AND** the challenge SHALL be valid for 5 minutes

#### Scenario: Organization has passkeys disabled
- **GIVEN** a user belongs to an organization with `passkeyEnabled=false`
- **WHEN** the user requests a passkey registration challenge
- **THEN** the system SHALL return error "PASSKEYS_NOT_ENABLED"

## Requirement: Passkey Registration Completion
The system SHALL accept and verify WebAuthn credentials.

#### Scenario: Valid passkey credential submitted
- **GIVEN** a user has requested a registration challenge
- **WHEN** the user submits valid signed credential within 5 minutes
- **THEN** the system SHALL verify the credential with AWS Cognito
- **AND** store the passkey in the user's account
- **AND** return success with passkey metadata

#### Scenario: Maximum passkeys limit reached
- **GIVEN** a user has 20 registered passkeys
- **WHEN** the user attempts to register another passkey
- **THEN** the system SHALL return error "MAX_PASSKEYS_REACHED"

## Requirement: Passkey Listing
The system SHALL provide an API to list a user's registered passkeys.

#### Scenario: User lists their passkeys
- **GIVEN** a user has registered passkeys
- **WHEN** the user queries via `listPasskeys`
- **THEN** the system SHALL return passkey metadata including:
  - Passkey ID, Credential ID, Friendly name
  - Creation timestamp, Last used timestamp
  - Supported transports

## Requirement: Passkey Deletion
The system SHALL provide an API to delete a registered passkey.

#### Scenario: User deletes their own passkey
- **GIVEN** a user has registered passkeys
- **WHEN** the user deletes a passkey via `deletePasskey`
- **THEN** the system SHALL remove the passkey from Cognito
- **AND** return success response

#### Scenario: User attempts to delete another user's passkey
- **GIVEN** a passkey is registered to another user
- **WHEN** the user attempts to delete that passkey via `deletePasskey`
- **THEN** the system SHALL return authorization error

## Requirement: MFA and Passkey Mutual Exclusivity
The system SHALL enforce mutual exclusivity between MFA requirement and passkeys.

#### Scenario: Enable passkeys when MFA not required
- **GIVEN** an organization has `isRequiredMFA=false`
- **WHEN** admin sets `passkeyEnabled=true`
- **THEN** the system SHALL update the setting
- **AND** users can register passkeys

#### Scenario: Attempt to enable passkeys with required MFA
- **GIVEN** an organization has `isRequiredMFA=true`
- **WHEN** admin attempts to set `passkeyEnabled=true`
- **THEN** the system SHALL reject with error "MFA_PASSKEY_CONFLICT"
- **AND** return message explaining the constraint

#### Scenario: Attempt to require MFA with passkeys enabled
- **GIVEN** an organization has `passkeyEnabled=true`
- **WHEN** admin attempts to set `isRequiredMFA=true`
- **THEN** the system SHALL reject with error "MFA_PASSKEY_CONFLICT"

## Requirement: Audit Logging
The system SHALL log all passkey operations for security audit.

#### Scenario: Passkey registration logged
- **GIVEN** a user registers a passkey
- **WHEN** registration completes
- **THEN** the system SHALL log: user ID, timestamp, credential ID, result

Key Characteristics:

  • Constraint scenarios (MFA/passkey conflict) treated as first-class requirements
  • Error codes specified for frontend integration
  • Security requirements (audit logging) included
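
The registration-challenge scenarios can be sketched as follows. This is an illustration under stated assumptions, not the backend's implementation: in the design above, Cognito's native support would generate the WebAuthn options, whereas here they are built locally so that the 5-minute validity, the 20-passkey limit, the two error codes, and the ES256/RS256 algorithm identifiers are visible in one place. The names `Challenge`, `PasskeyError`, and `register_passkey_challenge`, and the choice to check the limit when the challenge is requested, are hypothetical.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional

CHALLENGE_TTL_SECONDS = 5 * 60  # "the challenge SHALL be valid for 5 minutes"
MAX_PASSKEYS_PER_USER = 20      # limit stated in the proposal


class PasskeyError(Exception):
    """Carries an error code for the frontend (hypothetical type)."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


@dataclass(frozen=True)
class Challenge:
    challenge_id: str
    expires_at: float  # Unix timestamp
    options: dict      # WebAuthn credential creation options

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


def register_passkey_challenge(
    passkey_enabled: bool, registered_count: int, now: Optional[float] = None
) -> Challenge:
    """Return a registration challenge or raise a coded error."""
    now = time.time() if now is None else now
    if not passkey_enabled:
        raise PasskeyError("PASSKEYS_NOT_ENABLED")
    if registered_count >= MAX_PASSKEYS_PER_USER:
        raise PasskeyError("MAX_PASSKEYS_REACHED")
    options = {
        "challenge": secrets.token_urlsafe(32),
        "pubKeyCredParams": [
            {"type": "public-key", "alg": -7},    # ES256
            {"type": "public-key", "alg": -257},  # RS256
        ],
        "timeout": CHALLENGE_TTL_SECONDS * 1000,  # WebAuthn timeouts are in milliseconds
    }
    return Challenge(secrets.token_hex(16), now + CHALLENGE_TTL_SECONDS, options)
```

Passing `now` explicitly keeps the expiry logic testable without waiting for real time to elapse.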

Implementation Tasks

Tasks are organized by layer, reflecting the constraint-driven design.

markdown
# Implementation Tasks: WebAuthn Passkey Support

## 1. Prerequisites and Investigation
- [ ] 1.1 Verify Cognito user pool plan (Essentials/Plus required)
- [ ] 1.2 Review Cognito passkey API documentation and limits
- [ ] 1.3 Test passkey registration in staging Cognito pool
- [ ] 1.4 Document browser compatibility matrix for WebAuthn

## 2. Infrastructure Configuration
- [ ] 2.1 Enable choice-based authentication flow in Cognito
- [ ] 2.2 Enable passkey authentication method
- [ ] 2.3 Configure allowed origins for WebAuthn
- [ ] 2.4 Set passkey algorithm preferences (ES256, RS256)
- [ ] 2.5 Update Terraform configurations

## 3. Database Schema Changes
- [ ] 3.1 Add `passkeyEnabled` boolean to Organization table
- [ ] 3.2 Create migration script (default `passkeyEnabled=false`)
- [ ] 3.3 Add validation preventing `isRequiredMFA && passkeyEnabled`

## 4. Domain Layer
- [ ] 4.1 Create passkey domain entities
  - [ ] 4.1.1 PasskeyEntity (id, credentialId, friendlyName, timestamps)
  - [ ] 4.1.2 PasskeyChallenge entity for registration flow
  - [ ] 4.1.3 Error types for passkey operations
- [ ] 4.2 Define passkey service interface

## 5. Cognito Adapter Layer
- [ ] 5.1 Extend Cognito adapter with WebAuthn methods
  - [ ] 5.1.1 StartWebAuthnRegistration
  - [ ] 5.1.2 CompleteWebAuthnRegistration
  - [ ] 5.1.3 ListWebAuthnCredentials
  - [ ] 5.1.4 DeleteWebAuthnCredential
- [ ] 5.2 Implement error handling for Cognito API errors
- [ ] 5.3 Add audit logging for passkey operations

## 6. Application Service Layer
- [ ] 6.1 Create passkey application services
  - [ ] 6.1.1 RegisterPasskeyChallengeService
  - [ ] 6.1.2 RegisterPasskeyCompleteService
  - [ ] 6.1.3 ListPasskeysService
  - [ ] 6.1.4 DeletePasskeyService
- [ ] 6.2 Add privilege checks (own passkeys only)
- [ ] 6.3 Implement rate limiting for registration

## 7. Organization Settings Update
- [ ] 7.1 Add `passkeyEnabled` to organization service
- [ ] 7.2 Implement MFA/passkey mutual exclusivity validation
- [ ] 7.3 Add audit logging for configuration changes

## 8. GraphQL Schema
- [ ] 8.1 Define passkey types (PasskeyInfo, Challenge, Credential)
- [ ] 8.2 Add mutations (registerChallenge, registerComplete, delete)
- [ ] 8.3 Add queries (listPasskeys)
- [ ] 8.4 Update Organization type with `passkeyEnabled`

## 9. GraphQL Resolvers
- [ ] 9.1 Implement passkey mutation resolvers
- [ ] 9.2 Implement passkey query resolvers
- [ ] 9.3 Add error handling and logging

## 10. Testing
- [ ] 10.1 Unit tests for domain and service layers
- [ ] 10.2 Integration tests for passkey registration flow
- [ ] 10.3 Test MFA/passkey conflict scenarios
- [ ] 10.4 Test max passkeys limit (20)
- [ ] 10.5 Test unauthorized access scenarios

## 11. Documentation
- [ ] 11.1 API documentation with passkey endpoints
- [ ] 11.2 Migration guide for MFA-required organizations
- [ ] 11.3 Browser compatibility requirements
- [ ] 11.4 Troubleshooting guide

## 12. Deployment
- [ ] 12.1 Deploy infrastructure changes (Cognito config)
- [ ] 12.2 Run database migrations
- [ ] 12.3 Deploy with feature flag disabled
- [ ] 12.4 Enable for pilot organizations
- [ ] 12.5 Gradual rollout to all organizations

Key Characteristics:

  • Infrastructure tasks come first (Cognito configuration required)
  • Validation logic explicitly called out (MFA/passkey conflict)
  • Rollout strategy reflects breaking change sensitivity

Specification Traceability

The Passkey feature demonstrates traceability through a constraint-driven implementation:

Intent: "Users need passwordless authentication for better security and UX"

    ├── Proposal (proposal.md)
    │   ├── Identifies AWS Cognito constraint
    │   └── Classifies breaking change (MFA/passkey mutual exclusivity)

    ├── Design Document (design.md)
    │   ├── 4 architectural decisions
    │   ├── External constraint documented as first-class concern
    │   └── Risk analysis for compliance-driven organizations

    ├── Domain Specs
    │   ├── authentication/passkey-management (NEW)
    │   │   ├── 6 requirements, 15+ scenarios
    │   │   └── Constraint scenarios as requirements
    │   └── organization-settings (MODIFIED)
    │       └── passkeyEnabled flag with validation

    └── Implementation Tasks (tasks.md)
        └── 12 task groups, 50+ individual tasks

Why This Case Study Matters:

  1. External Constraints: Demonstrates how to handle third-party platform limitations (AWS Cognito MFA/passkey mutual exclusivity) in specifications

  2. Breaking Changes: Shows explicit classification and communication of breaking changes that affect organizational configuration

  3. Compliance Impact: Illustrates how security compliance requirements (MFA enforcement) interact with feature enablement

  4. Multi-Layer Coordination: Backend changes require frontend WebAuthn implementation, infrastructure configuration, and careful rollout

  5. Agent Context: AI agents need the full constraint chain to generate correct validation logic and error handling

Post-Mortem: Strategic Dependency and Hidden Costs

Outcome: The passkey proposal was ultimately cancelled for two reasons: (1) there was no budget for the Cognito plan upgrade, and (2) the "One Account" initiative was identified as the long-term solution.

This outcome reveals critical organizational patterns that the specification-driven workflow should address:

The Hidden Dependency Problem

Timeline of Discovery (AI-Accelerated):
───────────────────────────────────────
Day 1: Engineering proposes passkey feature
Day 2: AI-assisted investigation reveals Cognito MFA/passkey constraint
Day 3: AI-assisted cost analysis and design document completed
Day 4: PM reveals "One Account" (Okta migration) is planned
        → Engineering was unaware this would affect ALL auth changes
        → Proposal cancelled as "throwaway work"

How AI Accelerated Discovery:

The specification-driven workflow with AI agents compressed what would traditionally take five to nine weeks into four days:

| Phase | Traditional Timeline | AI-Assisted Timeline | Acceleration |
|---|---|---|---|
| Proposal drafting | 1-2 weeks | 1 day | 7-14x |
| Technical investigation | 2-4 weeks | 1 day | 14-28x |
| Design document | 1-2 weeks | 1 day | 7-14x |
| Task breakdown (50+ tasks) | 1 week | Hours | 10-20x |
| Total to decision point | 5-9 weeks | 4 days | 9-16x |
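
The acceleration figures are consistent with comparing calendar days, which is an assumption about how they were derived rather than something the case study states. A short sketch of that arithmetic follows; the task-breakdown ratio is left out because "Hours" is not precise enough to recompute it.

```python
DAYS_PER_WEEK = 7  # calendar days (assumed basis for the ratios)

# Traditional duration ranges in weeks, taken from the table.
traditional_weeks = {
    "Proposal drafting": (1, 2),
    "Technical investigation": (2, 4),
    "Design document": (1, 2),
    "Task breakdown": (1, 1),
}


def acceleration(weeks_range, ai_days):
    """Return the (low, high) speed-up for a range in weeks versus a duration in days."""
    low, high = weeks_range
    return (low * DAYS_PER_WEEK / ai_days, high * DAYS_PER_WEEK / ai_days)


# Phases that took one day with AI assistance.
for phase in ("Proposal drafting", "Technical investigation", "Design document"):
    low, high = acceleration(traditional_weeks[phase], ai_days=1)
    print(f"{phase}: {low:.0f}-{high:.0f}x")

# Total: sum the per-phase ranges, then compare with the 4-day AI-assisted total.
total_weeks = (
    sum(low for low, _ in traditional_weeks.values()),
    sum(high for _, high in traditional_weeks.values()),
)
total_low, total_high = acceleration(total_weeks, ai_days=4)
print(f"Total: {total_weeks[0]}-{total_weeks[1]} weeks, "
      f"{total_low:.2f}-{total_high:.2f}x")
```

On this reading, the total works out to 35/4 = 8.75 and 63/4 = 15.75, which rounds to the 9-16x shown in the table.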

The Paradox: AI acceleration exposed the strategic blocker (One Account) much faster than traditional processes would have. Without AI assistance, the team might have spent 5-9 weeks on investigation before discovering the same blocker - a significantly higher sunk cost.

Key Insight: AI-assisted specification work produces valuable artifacts even when proposals are cancelled:

  • Complete technical investigation documented (Cognito constraints, MFA interactions)
  • Design decisions and alternatives preserved for future reference
  • 50+ implementation tasks defined - ready if initiative unblocks
  • Decision rationale captured for organizational learning
  • Effort not entirely "wasted" - knowledge is retained in machine-readable format

The Real Value of AI Acceleration: Fast failure is better than slow failure. Discovering a strategic blocker in 4 days vs. 5-9 weeks saves significant organizational resources and allows teams to pivot quickly.

The Compounding Cost Analysis

| Scenario | Direct Cost | Hidden Cost | Total Impact |
|---|---|---|---|
| Proceed with Cognito passkey | Engineering effort + plan upgrade | Throwaway when Okta migrates | Medium-High |
| Wait for One Account | Zero engineering effort | Delayed user value indefinitely | Unknown |
| One Account delayed/cancelled (like Privilege 2.0) | Zero engineering effort | All blocked improvements permanently lost | Very High |

Process Gap: Demand Source Management

A fundamental question emerged during the retrospective:

"Is there demand from the field?" (PM question)

This reveals an undefined process:

| Question | Current State | Recommended State |
|---|---|---|
| How are field demands tracked? | Unclear | Centralized demand registry |
| How do engineering-identified improvements flow? | Ad-hoc | Proposal → Impact assessment → Prioritization |
| Was passkey ever in product roadmap? | No | Proactive technology radar |
| Who owns authentication experience roadmap? | Unclear | Defined ownership |

Solution Proposal: Transparent, AI-Visible Product Planning Context

The root cause of the passkey proposal cancellation was information asymmetry: strategic plans (One Account) existed but were not visible to engineering teams or AI agents during proposal creation.

Proposed Solution: Unified Product Planning Registry

product-planning/
├── strategic-initiatives/
│   ├── one-account.md              # Status, timeline, affected domains
│   ├── privilege-2.0.md            # Status, timeline, affected domains
│   └── platform-migration.md       # Status, timeline, affected domains
├── domain-ownership/
│   ├── authentication.md           # Owner, roadmap, blocked-by
│   ├── authorization.md            # Owner, roadmap, blocked-by
│   └── device-management.md        # Owner, roadmap, blocked-by
├── technology-radar/
│   ├── adopt/                      # Technologies to adopt
│   ├── trial/                      # Technologies being evaluated
│   ├── assess/                     # Technologies to assess
│   └── hold/                       # Technologies on hold
└── demand-registry/
    ├── field-requests/             # Customer/field demands
    ├── engineering-proposals/      # Internal improvement proposals
    └── competitive-response/       # Market-driven requirements

Integration with Proposal Workflow:
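
As a minimal sketch of how a proposal workflow might consult the registry, assume each file under `strategic-initiatives/` has been parsed into a record with a status and a set of affected domains. The `Initiative` type, the `find_conflicts` function, the status values, and the sample entries are all hypothetical; the registry layout above does not prescribe a schema.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Initiative:
    name: str
    status: str                       # assumed values: "planned", "active", "cancelled"
    affected_domains: FrozenSet[str]  # domains the initiative will change


def find_conflicts(proposal_domain: str, initiatives: List[Initiative]) -> List[Initiative]:
    """Return live initiatives that touch the domain a new proposal targets."""
    return [
        initiative
        for initiative in initiatives
        if initiative.status != "cancelled"
        and proposal_domain in initiative.affected_domains
    ]


# Illustrative entries only: the statuses and domains are assumptions,
# not facts recorded in the case study.
registry = [
    Initiative("one-account", "planned", frozenset({"authentication"})),
    Initiative("platform-migration", "planned", frozenset({"device-management"})),
]

conflicts = find_conflicts("authentication", registry)
for initiative in conflicts:
    print(f"Proposal overlaps with strategic initiative: {initiative.name}")
```

Run at proposal creation time, a check of this kind would have surfaced the One Account overlap on Day 1 rather than Day 4.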

Key Learnings

  1. Visibility of Strategic Plans: Long-term initiatives (One Account, Privilege 2.0) must be visible in a shared roadmap that engineering consults during proposal creation

  2. Cost of Waiting: "Wait for X" decisions should include risk assessment of X being delayed or cancelled

  3. Demand Pipeline: Need clear process for both field-driven and engineering-driven improvements

  4. Pattern Recognition: Organizations should track "blocked by future initiative" patterns to identify systemic planning issues

Framework Addition: Specifications blocked by strategic initiatives should be archived with "BLOCKED_BY_INITIATIVE" status, with periodic review triggers to reassess if the blocking initiative is still on track.
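
One possible shape for such an archived record is sketched below. Only the `BLOCKED_BY_INITIATIVE` status comes from the text; the `ArchivedSpec` type, its field names, and the 90-day review interval are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ArchivedSpec:
    name: str
    status: str                     # e.g. "BLOCKED_BY_INITIATIVE"
    blocked_by: str                 # name of the blocking initiative
    last_reviewed: date
    review_interval_days: int = 90  # assumed cadence, not specified in the text

    def next_review(self) -> date:
        """Date on which the blocking initiative should be reassessed."""
        return self.last_reviewed + timedelta(days=self.review_interval_days)

    def review_due(self, today: date) -> bool:
        """True once a blocked spec has reached its next review date."""
        return self.status == "BLOCKED_BY_INITIATIVE" and today >= self.next_review()
```

A scheduled job could call `review_due` for every archived specification and flag those whose blocking initiative needs to be rechecked.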
