# Vimeo Video Platform - Solution Architecture

## Mission
Build a low-latency backend integration layer that turns Vimeo into an in-app video library. The backend synchronises a deeply nested Vimeo folder tree into the platform, serves only authorised content to signed-in users, and stays aligned with changes made directly in Vimeo. It also supports ranking, analytics, audit logging, Kafka-driven events, and per-video HubSpot chat context.
## Constraints

- Backend stack: Go
- Cloud: AWS
- Primary database: PostgreSQL
- Optional cache: Redis
- Video source of truth: Vimeo
- User authentication: Google / Apple
- Target backend latency for library/detail endpoints: p95 ≤ 300 ms
- No reliance on front-end caching
- Vimeo folder tree can change at any time
- Users can only see videos permitted by role/profile
## Core Design

- Never call Vimeo on the user read path for library browsing.
- Mirror Vimeo folders/videos into PostgreSQL.
- Use Redis for hot metadata, ranked feeds, and entitlement-friendly read acceleration.
- Use Kafka for async processing:
  - Vimeo sync events
  - analytics events
  - audit events
  - ranking feature events
- Use both:
  - webhook-driven sync for freshness
  - scheduled reconciliation for correctness
- Enforce authorisation in the backend before returning video playback metadata.
- Let Vimeo deliver the media; the backend delivers metadata, structure, entitlements, ranking, and context.
## AWS Services

- Amazon Cognito - Google / Apple federated sign-in
- Amazon ECS Fargate - runs `api-service`, `sync-service`, `analytics-consumer`, `audit-consumer`, `admin-service`
- Amazon RDS for PostgreSQL - primary database
- Amazon ElastiCache for Redis - cache layer
- Amazon MSK - managed Kafka
- Amazon EventBridge Scheduler - triggers scheduled reconciliation
- AWS Secrets Manager - provider credentials
- Amazon CloudWatch + OpenTelemetry - observability
## High-Level Architecture

### Services

#### api-service (Go)
- Authenticates requests using Cognito JWTs
- Maps external identity to internal user
- Serves:
  - library tree
  - folder contents
  - video detail
  - playback metadata
  - ranked video feed
  - HubSpot chat context
  - analytics ingest
  - admin read endpoints
#### sync-service (Go)
- Consumes Vimeo webhooks
- Verifies signatures
- Writes raw provider events
- Produces Kafka events
- Reconciles Vimeo tree on schedule
- Upserts folders/videos into PostgreSQL
- Invalidates/rebuilds Redis caches
#### analytics-consumer (Go)
- Consumes watch events from Kafka
- Builds watch sessions / aggregates
- Writes analytics tables in PostgreSQL
#### audit-consumer (Go)
- Consumes audit events from Kafka
- Writes immutable audit records
#### ranking-pipeline

- Inputs:
  - user profile
  - entitlements
  - watch behaviour
  - content metadata
- Outputs:
  - ranked video IDs per user
- Writes:
  - `user_video_rankings` in PostgreSQL
  - hot ranked feeds in Redis
#### admin-service
- Sync health
- Failed events
- Replay actions
- Folder/video drift visibility
- Audit views
## Read Path

### Rule
- User requests must never require live traversal of the Vimeo folder tree.
- All library/folder/video pages are served from local indexed state.
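The cache-aside read path above can be sketched as follows. This is a minimal illustration under assumptions: `Cache` stands in for Redis, `Store` for the PostgreSQL mirror, and the key shape is a simplified version of the scheme described in the Redis Strategy section. The real handlers would also apply entitlement filtering.

```go
package main

import (
	"errors"
	"fmt"
)

// Cache abstracts Redis; Store abstracts the PostgreSQL mirror.
type Cache interface {
	Get(key string) (string, bool)
	Set(key, val string)
}

type Store interface {
	LoadFolderJSON(folderID string) (string, error)
}

// GetFolderPage serves a folder page from cache first, then from the local
// mirror. It never calls Vimeo: the sync pipeline keeps the mirror fresh.
func GetFolderPage(c Cache, s Store, folderID, segment string, version int) (string, error) {
	key := fmt.Sprintf("folder:%s:%s:%d", folderID, segment, version)
	if v, ok := c.Get(key); ok {
		return v, nil // cache hit: no database work
	}
	v, err := s.LoadFolderJSON(folderID)
	if err != nil {
		return "", err
	}
	c.Set(key, v) // populate for subsequent readers
	return v, nil
}

// --- in-memory fakes for demonstration ---

type memCache map[string]string

func (m memCache) Get(k string) (string, bool) { v, ok := m[k]; return v, ok }
func (m memCache) Set(k, v string)             { m[k] = v }

type memStore map[string]string

func (m memStore) LoadFolderJSON(id string) (string, error) {
	if v, ok := m[id]; ok {
		return v, nil
	}
	return "", errors.New("folder not found")
}

func main() {
	c := memCache{}
	s := memStore{"42": `{"folder":"42"}`}
	page, _ := GetFolderPage(c, s, "42", "viewer", 7)
	fmt.Println(page)
}
```

On a hit the database is never touched, which is what keeps p95 under the 300 ms budget even for deep folder pages.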
### Library Request Flow

### Video Detail / Playback Flow
## Sync Path

### Rule
- Webhooks provide freshness.
- Reconciliation provides correctness.
- Neither is trusted alone.
### Sync Flow

### Reconciliation Flow

## Vimeo Traversal Strategy
- Do not fetch all Vimeo videos globally.
- Start from the known root folder.
- Recursively traverse child folders.
- For each folder:
  - fetch direct child folders
  - fetch direct videos
- Persist:
  - folder hierarchy
  - Vimeo IDs
  - parent-child relations
  - content metadata
  - visibility state
- On reconciliation:
  - mark missing objects as deleted/unavailable
  - detect moved folders/videos by parent/path changes
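The traversal above can be sketched as a depth-first walk over an abstract fetcher. The `FolderFetcher` interface and its method names are assumptions, not Vimeo SDK calls; a real implementation would wrap the Vimeo API with pagination, rate limiting, and upserts into `folders`/`folder_videos` instead of collecting results in memory.

```go
package main

import "fmt"

// FolderFetcher abstracts the two Vimeo calls the strategy needs per folder.
type FolderFetcher interface {
	ChildFolders(folderID string) ([]string, error)
	ChildVideos(folderID string) ([]string, error)
}

// Traverse walks the tree from the known root, depth-first, returning the
// videos found under every folder. A real sync job would upsert rows and
// record parent/path relations instead of collecting them.
func Traverse(f FolderFetcher, root string) (map[string][]string, error) {
	out := map[string][]string{}
	var walk func(id string) error
	walk = func(id string) error {
		vids, err := f.ChildVideos(id)
		if err != nil {
			return err
		}
		out[id] = vids
		children, err := f.ChildFolders(id)
		if err != nil {
			return err
		}
		for _, c := range children {
			if err := walk(c); err != nil {
				return err
			}
		}
		return nil
	}
	if err := walk(root); err != nil {
		return nil, err
	}
	return out, nil
}

// fakeTree is an in-memory stand-in for the Vimeo folder API.
type fakeTree struct {
	folders map[string][]string
	videos  map[string][]string
}

func (t fakeTree) ChildFolders(id string) ([]string, error) { return t.folders[id], nil }
func (t fakeTree) ChildVideos(id string) ([]string, error)  { return t.videos[id], nil }

func main() {
	t := fakeTree{
		folders: map[string][]string{"root": {"a", "b"}, "a": {"a1"}},
		videos:  map[string][]string{"root": {"v0"}, "a1": {"v1", "v2"}},
	}
	got, _ := Traverse(t, "root")
	fmt.Println(len(got)) // folders visited: root, a, a1, b
}
```

Reconciliation then compares the visited set against the mirror: anything in PostgreSQL but absent from the walk is marked deleted/unavailable, and parent changes flag a move.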
## Data Model

### Core Tables
| Table | Purpose |
|---|---|
| `users` | Internal user profile |
| `user_identities` | Cognito / Google / Apple identity mappings |
| `roles` | Role catalogue |
| `user_roles` | User-to-role mapping |
| `folders` | Vimeo folder mirror |
| `videos` | Vimeo video mirror |
| `folder_videos` | Video placement inside folders |
| `entitlements` | Role/profile access rules for folders/videos |
| `user_video_rankings` | Ranked videos per user |
| `watch_events` | Raw playback-related events |
| `watch_sessions` | Sessionised viewing facts |
| `provider_events` | Raw Vimeo webhook/reconciliation events |
| `sync_runs` | Reconciliation runs and outcomes |
| `audit_logs` | Immutable operational/security audit trail |
| `hubspot_context` | Video/user-to-chat context mappings |
| `dead_letters` | Failed provider/analytics events |
### Folder Modelling

Use PostgreSQL with:
- adjacency columns: `id`, `parent_id`, `vimeo_folder_id`
- plus a materialised `path` column
- plus indexes on `parent_id`, `path`, `vimeo_folder_id`
This gives:
- fast tree traversal
- easy subtree queries
- efficient cache rebuilds
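A small sketch of the materialised-path convention, assuming each `folders` row stores its path as `/rootID/.../selfID`. The helper names and the `LIKE`-pattern approach are illustrative choices, not a fixed schema; the trailing-slash handling is what prevents `/f1` from matching `/f10`.

```go
package main

import (
	"fmt"
	"strings"
)

// ChildPath derives a child's materialised path from its parent's path.
func ChildPath(parentPath, childID string) string {
	return strings.TrimSuffix(parentPath, "/") + "/" + childID
}

// SubtreePattern yields the LIKE pattern selecting a folder's whole subtree,
// e.g.: SELECT id FROM folders WHERE path LIKE $1
func SubtreePattern(path string) string {
	return strings.TrimSuffix(path, "/") + "/%"
}

// IsDescendant reports whether candidate lies strictly under ancestor. The
// appended slash avoids false prefix matches like "/f1" vs "/f10".
func IsDescendant(ancestorPath, candidatePath string) bool {
	return strings.HasPrefix(candidatePath, strings.TrimSuffix(ancestorPath, "/")+"/")
}

func main() {
	root := "/f1"
	child := ChildPath(root, "f2")
	fmt.Println(child)                     // /f1/f2
	fmt.Println(SubtreePattern(root))      // /f1/%
	fmt.Println(IsDescendant(root, child)) // true
}
```

The same pattern drives cache rebuilds: when a folder moves, one `LIKE` query finds every descendant whose path (and cached pages) must be refreshed.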
Video Modelling
videosidvimeo_video_idtitledescriptionthumbnail_urlduration_secondsstatusvisibilityembed_keyupdated_at
folder_videosfolder_idvideo_idsort_order
Entitlements
- Access can be granted at:
  - folder level
  - video level
  - role level
  - profile segment level
- Backend resolves effective access before returning content.
## Redis Strategy

### Cache

Use Redis for:
- library snapshots: `library:root:{role}:{profile_hash}:{version}`
- folder contents: `folder:{folder_id}:{role}:{profile_hash}:{version}`
- video detail: `video:{video_id}:{user_segment}:{version}`
- ranked feeds: `ranked:{video_id}:{version}`
### Invalidation

- On folder/video update:
  - bump the content version
  - invalidate affected folder/video keys
- On entitlement update:
  - bump the entitlement version
- On ranking update:
  - replace only ranked-feed keys
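Version-bump invalidation can be sketched as follows: keys embed a version segment, so bumping the version makes every reader miss and rebuild, while stale keys simply age out under TTL. Key shapes follow the scheme above in simplified form; the `Versions` struct and function names are assumptions.

```go
package main

import "fmt"

// Versions holds the counters that participate in cache keys. Bumping one is
// the invalidation: old keys are never deleted, just abandoned.
type Versions struct {
	Content     int
	Entitlement int
}

func (v *Versions) BumpContent()     { v.Content++ }
func (v *Versions) BumpEntitlement() { v.Entitlement++ }

// FolderKey builds a folder-contents key per the scheme
// folder:{folder_id}:{role}:{profile_hash}:{version}.
func FolderKey(folderID, role, profileHash string, v Versions) string {
	return fmt.Sprintf("folder:%s:%s:%s:%d", folderID, role, profileHash, v.Content)
}

// RankedKey builds a ranked-feed key; ranking updates replace these directly
// rather than bumping a shared version.
func RankedKey(userID string, rankedVersion int) string {
	return fmt.Sprintf("ranked:%s:%d", userID, rankedVersion)
}

func main() {
	v := Versions{Content: 3}
	before := FolderKey("42", "member", "abc", v)
	v.BumpContent() // a folder/video update invalidates by version bump
	after := FolderKey("42", "member", "abc", v)
	fmt.Println(before != after) // true: readers now miss and rebuild
}
```

The trade-off is memory for simplicity: no key-scanning deletes, at the cost of briefly holding orphaned entries until TTL expiry.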
## Authorisation Model

- Identity: Cognito JWT from Google / Apple federation
- Internal auth:
  - backend maps the token subject to an internal `user`
  - backend resolves `roles` and profile flags
- Access resolution:
  - users can browse only folders/videos allowed by entitlement rules
  - the video detail endpoint re-checks access even if the video was listed
- Playback:
  - backend returns Vimeo playback metadata only after the access check passes
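Effective-access resolution can be sketched like this. The structures are deliberately simplified assumptions: the real `entitlements` table also carries profile-segment rules, and grants would be loaded from PostgreSQL rather than held in maps.

```go
package main

import "fmt"

// Entitlements holds per-role grants at video and folder level.
type Entitlements struct {
	VideoGrants  map[string]map[string]bool // role -> videoID -> granted
	FolderGrants map[string]map[string]bool // role -> folderID -> granted
}

// CanWatch resolves effective access for one video. The detail and playback
// endpoints call this even for videos the user already saw in a listing,
// so a stale listing can never leak playback metadata.
func CanWatch(e Entitlements, roles []string, videoID, folderID string) bool {
	for _, r := range roles {
		if e.VideoGrants[r][videoID] || e.FolderGrants[r][folderID] {
			return true
		}
	}
	return false
}

func main() {
	e := Entitlements{
		VideoGrants:  map[string]map[string]bool{"premium": {"v9": true}},
		FolderGrants: map[string]map[string]bool{"member": {"f1": true}},
	}
	fmt.Println(CanWatch(e, []string{"member"}, "v3", "f1")) // true via folder grant
	fmt.Println(CanWatch(e, []string{"guest"}, "v9", "f1"))  // false: no grant
}
```

Folder-level grants make the common case cheap: one folder row can entitle every video placed inside it, with video-level rows reserved for exceptions.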
## Analytics

### Ingest

- App emits: `video_opened`, `playback_started`, `heartbeat`, `paused`, `seeked`, `completed`, `chat_opened`
- API validates:
  - user
  - video
  - session token
- API publishes to Kafka immediately
### Analytics Flow

### Storage

- `watch_events` - raw event stream
- `watch_sessions` - sessionised facts:
  - start time
  - end time
  - last position
  - watched seconds
  - completion ratio
- materialised aggregates:
  - per user
  - per video
  - per folder/category
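A sketch of how the analytics consumer might fold raw events into a `watch_sessions` row. The aggregation rule here (sum wall-clock gaps between heartbeats, capped to guard against lost events) and the 30-second cap are assumptions, not a specified algorithm.

```go
package main

import "fmt"

// WatchEvent is a simplified raw event from watch_events.
type WatchEvent struct {
	Type      string  // "playback_started", "heartbeat", "completed", ...
	PositionS float64 // playhead position in seconds
	TS        int64   // unix seconds
}

// WatchSession mirrors the sessionised facts listed above.
type WatchSession struct {
	StartTS, EndTS  int64
	LastPositionS   float64
	WatchedSeconds  float64
	CompletionRatio float64
}

// Sessionise folds one user's ordered events for one video into a session.
func Sessionise(events []WatchEvent, durationS float64) WatchSession {
	var s WatchSession
	if len(events) == 0 {
		return s
	}
	s.StartTS = events[0].TS
	prev := events[0]
	for _, e := range events[1:] {
		if e.Type == "heartbeat" || e.Type == "completed" {
			// Count the wall-clock gap as watched time, capped so a
			// missing heartbeat doesn't inflate the total.
			gap := float64(e.TS - prev.TS)
			if gap > 0 && gap <= 30 {
				s.WatchedSeconds += gap
			}
		}
		prev = e
	}
	s.EndTS = prev.TS
	s.LastPositionS = prev.PositionS
	if durationS > 0 {
		s.CompletionRatio = s.LastPositionS / durationS
	}
	return s
}

func main() {
	evts := []WatchEvent{
		{Type: "playback_started", PositionS: 0, TS: 100},
		{Type: "heartbeat", PositionS: 10, TS: 110},
		{Type: "heartbeat", PositionS: 20, TS: 120},
		{Type: "completed", PositionS: 40, TS: 140},
	}
	s := Sessionise(evts, 40)
	fmt.Println(s.WatchedSeconds, s.CompletionRatio) // 40 1
}
```

Because the consumer reads from Kafka, sessionisation can lag ingestion without affecting the user-facing read path at all.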
## Audit Logging
Log:
- logins
- entitlement changes
- admin replay actions
- webhook receipt
- sync failures
- content moves/deletions
- video access decisions if required
- HubSpot context generation if sensitive
Write audit events through Kafka, then persist immutably in PostgreSQL.
## Kafka Topics

| Topic | Purpose |
|---|---|
| `vimeo.webhooks.raw` | Raw provider events |
| `vimeo.sync.normalized` | Normalized sync actions |
| `analytics.events` | Watch/chat events |
| `audit.events` | Operational/security audit |
| `ranking.features` | Ranking input events |
| `ranking.updated` | New ranking outputs |
| `deadletters.vimeo` | Failed sync events |
| `deadletters.analytics` | Failed analytics events |
## Retry / DLQ / Idempotency

### Idempotency

Use:
- a unique key on the provider event ID where available
- otherwise a derived key: `provider + object_id + event_type + source_timestamp`
- a unique insert into `provider_events`
- consumers process each event key only once
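The key-derivation rule above can be sketched like this. The digest truncation and function names are illustrative assumptions; in production the once-only check is the unique insert into `provider_events`, not an in-memory set.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// EventKey builds the idempotency key: the provider's own event ID when
// present, otherwise a digest of provider + object_id + event_type +
// source_timestamp.
func EventKey(provider, providerEventID, objectID, eventType string, sourceTS int64) string {
	if providerEventID != "" {
		return provider + ":" + providerEventID
	}
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s|%s|%s|%d", provider, objectID, eventType, sourceTS)))
	return provider + ":" + hex.EncodeToString(sum[:8])
}

// ProcessOnce returns true only the first time a key is seen; redeliveries
// are skipped. The map stands in for the unique-keyed provider_events table.
func ProcessOnce(seen map[string]bool, key string) bool {
	if seen[key] {
		return false
	}
	seen[key] = true
	return true
}

func main() {
	seen := map[string]bool{}
	k := EventKey("vimeo", "", "video/123", "video.updated", 1700000000)
	fmt.Println(ProcessOnce(seen, k)) // true: first delivery processed
	fmt.Println(ProcessOnce(seen, k)) // false: redelivery skipped
}
```

Deriving the key from the event's own fields means webhook redeliveries and reconciliation re-reads of the same change collapse to one processed event.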
### Retries

- transient failures:
  - exponential backoff
  - retry topic or delayed requeue
- permanent failures:
  - move to a dead-letter topic/table
### DLQ Handling

- failed events appear in the admin dashboard
- admin can:
  - inspect payload
  - replay event
  - mark ignored
  - attach note
## Rate-Limit Handling

### Vimeo / HubSpot API client

- central provider client in Go
- token-bucket limiter per provider credential
- jittered exponential backoff on `429`
- circuit-breaker behaviour on repeated upstream failure
- scheduled reconciliation throttled by a budgeted API call rate
## HubSpot Chat Context

### Goal

- When a user opens a video, HubSpot chat can be contextualised with:
  - user identity
  - current video
  - folder/category
  - rank / recommendation reason if desired
### Flow

## Admin Dashboard

### Views
- sync run history
- webhook receipt status
- failed events / dead letters
- folder tree diff
- deleted / unavailable videos
- cache version state
- analytics lag
- Kafka consumer lag
- audit log search
### Actions
- replay failed sync event
- trigger folder/video reconciliation
- invalidate cache subtree
- mark video hidden internally
- inspect entitlement resolution
## API Shape

### Public / App APIs

- `GET /library/home`
- `GET /folders/:id`
- `GET /videos/:id`
- `GET /videos/:id/playback`
- `GET /videos/:id/chat-context`
- `POST /analytics/events`
### Admin APIs

- `GET /admin/sync-runs`
- `GET /admin/provider-events`
- `GET /admin/dead-letters`
- `POST /admin/replay/:event_id`
- `POST /admin/reconcile/root`
- `POST /admin/cache/invalidate`
## Deployment

## Why This Meets the Requirements
- Low latency:
  - user reads hit Redis/PostgreSQL, not Vimeo
- No front-end caching required:
  - backend pre-indexes and caches everything needed
- Vimeo stays source of truth:
  - sync + reconciliation keep the backend aligned
- Role/profile restrictions:
  - enforced in the backend before listing/playback
- Ranking:
  - precomputed and stored for fast retrieval
- Analytics/audit:
  - event-driven through Kafka
- HubSpot chat:
  - contextual endpoint per video
- Reliability:
  - idempotency, retries, DLQs, replay tools
- Operability:
  - admin dashboard + sync visibility
## Main Trade-Offs

- More moving parts than a naive direct-Vimeo read approach
- Slight staleness risk between a Vimeo change and local sync
- Extra cache invalidation complexity
- Kafka adds operational weight, but is justified for:
  - analytics
  - audit logs
  - decoupled sync/ranking pipelines
## Non-Negotiable Rule
- Vimeo is never queried on the critical user browse path unless there is an explicit admin/debug fallback.
- All user-facing library performance depends on local indexed state.
## Suggested Build Order

1. `folders`/`videos` mirror in PostgreSQL
2. Vimeo reconciliation job from the root folder
3. user auth + entitlement model
4. library APIs
5. Redis caching
6. playback metadata endpoint
7. Kafka analytics ingest
8. audit pipeline
9. HubSpot context
10. ranking pipeline
11. admin dashboard