Inventory of every personal data field HiringCoachAI stores, where it lives, who processes it, and how long it is retained.
Firestore collections
User-scoped subcollections live under users/{uid}/.... Top-level collections that key on userId (or equivalent) are listed flat. Operational and admin-internal collections that hold no customer personal data are excluded; the automated data-map audit maintains an explicit allowlist for those.
User identity and authentication
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
users | id, name, email, emailVerified, phone, photoURL, displayName, lastAuthAt, role, cookieConsent (necessary, analytics, marketing, version, decidedAt; written by the cookie-consent API on user action), preferences | Confidential | Until account deletion + audit retention | Contract (6.1.b), consent for optional analytics/marketing tracking | Firebase/GCP; Mailchimp for account/customer communications and communication-list management |
users/{uid}/userDetails | extended profile fields, updatedAt | Confidential | Until account deletion | Contract | Firebase |
accounts | userId, provider, providerAccountId, refreshToken, accessToken, type | Restricted | Until account deletion | Contract | Firebase |
sessions | userId, sessionToken, expires | Restricted | Max 7 d (customer) / 12 h (admin) | Contract | Firebase |
authTokens | userId, tokenHash, createdAt, expiresAt | Restricted | 5 min TTL | Contract | Firebase |
verificationTokens | identifier, tokenHash, expires | Restricted | TTL per token (typically 24 h) | Contract | Firebase |
account_deletion_challenges | challengeId, userId, confirmTokenHash, createdAt, expiresAt | Restricted | 15 min TTL | Contract | Firebase |
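
The short-lived retention windows in the table above amount to fixed time-to-live values applied at issue time. The following is a minimal sketch of that arithmetic, not the production implementation: the `TokenKind` names, `TTL_MS` table, and helper functions are hypothetical, and only the durations (7 d, 12 h, 5 min, 15 min) come from this document.

```typescript
// Hypothetical names; only the durations are taken from the table above.
type TokenKind =
  | "customerSession"
  | "adminSession"
  | "authToken"
  | "accountDeletionChallenge";

const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;

const TTL_MS: Record<TokenKind, number> = {
  customerSession: 7 * DAY_MS, // sessions: max 7 d (customer)
  adminSession: 12 * HOUR_MS, // sessions: max 12 h (admin)
  authToken: 5 * MINUTE_MS, // authTokens: 5 min TTL
  accountDeletionChallenge: 15 * MINUTE_MS, // account_deletion_challenges: 15 min TTL
};

// Compute the expiry timestamp to store alongside the record.
function expiresAt(kind: TokenKind, issuedAt: Date): Date {
  return new Date(issuedAt.getTime() + TTL_MS[kind]);
}

// A record is expired once the current time reaches its expiry.
function isExpired(expires: Date, now: Date): boolean {
  return now.getTime() >= expires.getTime();
}
```

Keeping the durations in one table would make it easier to compare the code against the retention column during review.
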
Resumes, applications, and job-search content
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
users/{uid}/resumes | File name, extracted text, metadata, createdAt | Confidential | Until account deletion | Contract | Firebase/GCS, OpenAI (on generation; per-request store: false, no Zero Data Retention (ZDR) amendment), ElevenLabs (TTS output only) |
users/{uid}/files | fileId, ownerUserId, original file name, content type, size, SHA-256 hash, storage bucket/object path, source, status, created/updated/deleted timestamps | Confidential | Until account deletion | Contract | Firebase/GCS |
users/{uid}/resumeMetadata | resume tags, ATS scoring, last-edit timestamps | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/coverLetters | Job title, company, content, status | Confidential | Until account deletion | Contract | Firebase, OpenAI (per-request store: false, no Zero Data Retention (ZDR) amendment) |
users/{uid}/applications | jobId, status, dates, notes, attached resume reference | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/drafts | in-progress application materials | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/customQuestions | question, responses, createdAt | Confidential | Until account deletion | Contract | Firebase, OpenAI (per-request store: false, no Zero Data Retention (ZDR) amendment) |
users/{uid}/shortAnswers | short-answer responses for application packets | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/fitAnalysis | per-job fit analysis records | Confidential | Until account deletion | Contract | Firebase, OpenAI |
users/{uid}/candidateAnalysis | candidate-evaluation outputs | Confidential | Until account deletion | Contract | Firebase, OpenAI |
users/{uid}/intelBriefings | per-company intel briefings | Confidential | Until account deletion | Contract | Firebase, OpenAI, Perplexity |
users/{uid}/explore | exploration session state (saved searches, comparisons) | Confidential | Until account deletion | Contract | Firebase |
adminResumeBenchFixtures | benchmark fixture labels, anonymized resume text, source label, content hash, creator, notes | Confidential | Until admin deletion of fixture or superseded benchmark corpus | Legitimate interest (quality assurance and model evaluation) | Firebase |
adminResumeBenchRuns | benchmark run configuration, sampled fixture/model IDs, status, cost/quality summary, creator, timestamps | Confidential | 2 years or until admin deletion | Legitimate interest (quality assurance and model evaluation) | Firebase, OpenAI, Anthropic, Google Gemini |
adminResumeBenchRuns/{runId}/attempts | per-fixture/model attempt status, run number, parser/judge outputs and scores, latency, token/cost metadata, error details | Confidential | 2 years or until parent run deletion | Legitimate interest (quality assurance and model evaluation) | Firebase, OpenAI, Anthropic, Google Gemini |
Coaching, interview prep, and pep-talks
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
users/{uid}/interviewQuestions | generated practice questions per role | Confidential | Until account deletion | Contract | Firebase, OpenAI |
users/{uid}/interviewResearchCases | research-case payloads for interview prep | Confidential | Until account deletion | Contract | Firebase, OpenAI, Perplexity |
users/{uid}/pepTalks | generated pep-talks (text + audio) | Confidential | Until account deletion | Contract | Firebase, OpenAI, ElevenLabs / Google Cloud Text-to-Speech |
users/{uid}/tasks | title, description, status, dueDate, tags | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/onboarding | onboarding progress / preferences | Confidential | Until account deletion | Contract | Firebase |
Contacts and follow-ups
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
users/{uid}/contacts | name, email, phone, LinkedIn URL, notes, lastContacted | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/contactLinks | per-contact relationship metadata | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/followUps | scheduled follow-ups per contact | Confidential | Until account deletion | Contract | Firebase |
users/{uid}/followUpReminders | reminder records for follow-ups | Confidential | Until account deletion | Contract | Firebase |
Integrations (LinkedIn, OAuth-based imports)
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
users/{uid}/integrations | per-integration tokens / configuration | Restricted | Until account deletion or revocation | Consent | Firebase |
users/{uid}/linkedIn | LinkedIn profile snapshot, last sync, scope | Confidential | Until account deletion | Consent | Firebase, LinkedIn |
users/{uid}/linkedinJobExports | LinkedIn job-search exports | Confidential | Until account deletion | Consent | Firebase, LinkedIn |
users/{uid}/linkedinProfileExports | LinkedIn profile exports | Confidential | Until account deletion | Consent | Firebase, LinkedIn |
linkedinCookies | uid, liAt (AES-256-GCM encrypted), createdAt, expiresAt | Restricted | TTL 1 h (configurable), or until revoke | Consent | Firebase |
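
The linkedinCookies row states that liAt is stored AES-256-GCM encrypted. As an illustration of what that involves, here is a minimal sketch using Node's built-in crypto module. The function names, the `EncryptedValue` shape, and the base64 encoding are assumptions; key management and the actual storage layout are not described in this document.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage shape: all three parts are needed to decrypt later.
interface EncryptedValue {
  iv: string; // base64, unique per encryption
  tag: string; // base64 GCM authentication tag
  ciphertext: string; // base64
}

// Encrypt a cookie value with a 32-byte key (AES-256-GCM).
function encryptCookie(plaintext: string, key: Buffer): EncryptedValue {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

// Decrypt and verify integrity; throws if the tag, key, or ciphertext is wrong.
function decryptCookie(value: EncryptedValue, key: Buffer): string {
  const decipher = createDecipheriv(
    "aes-256-gcm",
    key,
    Buffer.from(value.iv, "base64"),
  );
  decipher.setAuthTag(Buffer.from(value.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(value.ciphertext, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

Because GCM is authenticated, a tampered record fails at decryption instead of yielding a corrupted cookie value.
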
Subscriptions and feedback
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
subscriptions | userId, stripeCustomerId, stripeSubscriptionId, status, created, updated | Confidential | Until account deletion + 7 y for tax records | Contract, legal obligation | Firebase, Stripe |
subscriptionHistory | per-user subscription lifecycle events | Confidential | 7 y (tax / billing audit) | Legal obligation | Firebase, Stripe |
feedback | user-submitted product feedback (often includes free-text) | Confidential | Until account deletion | Legitimate interest | Firebase |
aiOutputFeedback | thumbs up/down on AI outputs, optional comment | Confidential | Until account deletion | Legitimate interest | Firebase |
Pilot / group programs (per-user data within a sponsor program)
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
pilotMemberships | userId, pilotId, email, display name, role/status, cohort tags/subgroups, invitation/activation/removal timestamps | Confidential | Identifying fields until account deletion; anonymized program participation retained until program end | Contract, legitimate interest | Firebase |
pilotAdmins | userId, pilotId, role, permission overrides, status, invite/grant/revocation timestamps | Confidential | Until program end or access revocation + audit retention | Contract, legitimate interest | Firebase |
pilotSessions | userId, membershipId, sessionId, start/end/heartbeat timestamps, active/idle/engaged duration, page-view and action counters, features/page groups used, entry/exit paths, device/browser metadata | Confidential | Identifying fields until account deletion; anonymized usage retained until program end | Contract, legitimate interest | Firebase |
pilotEvents | userId, membershipId, sessionId, event name/category/feature, page path/route/referrer, timestamp, duration/count values, platform/device/browser/user-agent, event properties | Confidential | Identifying fields until account deletion; anonymized usage retained until program end | Contract, legitimate interest | Firebase |
pilotUserDailyRollups | userId, membershipId, date, session count, active duration, meaningful-action count, feature-usage counts, page views, last-active timestamp | Confidential | Identifying fields until account deletion; anonymized usage retained until program end | Contract, legitimate interest | Firebase |
pilotGoals | program-level goals and targets | Internal / Confidential | Until program end | Contract | Firebase |
pilotInterventions | coach interventions logged per user | Confidential | Until program end | Contract | Firebase |
therapistSessions | session metadata for coaching engagements | Confidential | 7 y (record retention) | Legitimate interest | Firebase |
Account-deletion ledgers
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
deleted_users | deletedUid, normalizedEmail, displayEmail, userName, status, deletionReason, contentCreated, billingCleanupStatus | Confidential | 365 d audit retention | Legitimate interest (fraud/abuse detection, restoration) | Firebase |
deleted_account_snapshots | full user doc + subscription snapshot + content inventory | Restricted | 30 d recovery window | Legitimate interest | Firebase |
deleted_account_feedback | userId, feedbackReason, feedbackNotes | Internal | 365 d | Legitimate interest (product improvement) | Firebase |
Audit and security ledgers
| Collection | Fields | Classification | Retention | Lawful basis (GDPR) | Processors |
|---|---|---|---|---|---|
auditLog | ts, actorUid, actorRole, ip, userAgent, action, resourceType, resourceId, delta, tenantId, requestId | Confidential | 2 years | Legal obligation (security), legitimate interest (forensics) | Firebase |
securityAuditMonitorRuns | runId, timestamps, lookback window, reviewed-event count, finding type/severity/count, hashed actor/IP/resource identifiers, retention expiry | Confidential | 365 days | Legitimate interest (security monitoring, compliance evidence) | Firebase, Sentry for finding notifications |
aiCallAudit | uid, endpoint, model, promptHash, tokensIn, tokensOut, requestedAt, durationMs, piiDetected | Confidential | 1 year | Legitimate interest (AI governance) | Firebase |
complianceTrainingLog | signerUid, signerEmail, signerName, document title/path/hash/version, signed statement, acknowledgment method, IP, user-agent, timestamp | Confidential | 7 years after access relationship ends | Legal obligation (compliance evidence), legitimate interest (security governance) | Firebase |
users/{uid}/complianceAcknowledgments | latest per-person/per-document onboarding acknowledgment rollup, latest log ID, document metadata, signer identity, timestamp | Confidential | 7 years after access relationship ends | Legal obligation (compliance evidence), legitimate interest (security governance) | Firebase |
complianceDocumentReviewLog | reviewer UID/email/name, document title/path/hash/version, review cadence, scheduled window, notes, timestamp | Confidential | 7 years | Legal obligation (compliance evidence), legitimate interest (security governance) | Firebase |
complianceDocumentReviews | latest per-document review rollup, latest log ID, reviewer identity, document metadata, cadence, scheduled window, notes | Confidential | Superseded by latest review; retained in complianceDocumentReviewLog for 7 years | Legal obligation (compliance evidence), legitimate interest (security governance) | Firebase |
compliancePatchVerificationRuns | reviewer UID/email/name, repository, check status, Dependabot PR/alert summaries, SBOM status, notes, timestamp | Confidential | 7 years | Legal obligation (compliance evidence), legitimate interest (security governance) | Firebase |
compliancePatchVerification | latest monthly patch-verification rollup, latest passing run timestamp, latest check status, reviewer identity | Confidential | Superseded by latest run; retained in compliancePatchVerificationRuns for 7 years | Legal obligation (compliance evidence), legitimate interest (security governance) | Firebase |
Non-Firestore data (sub-processors and external systems)
This section is the data-map projection of the sub-processor list; the automated data-map audit keeps the two in sync.
Core infrastructure
| System | Data | Classification | Retention |
|---|---|---|---|
| Google Cloud Platform / Firebase | Authentication, Firestore database content (all rows above), Cloud Storage objects for account documents / resume binaries, Cloud Tasks payloads, Cloud Text-to-Speech inputs, backup/export buckets, and Cloud Functions source buckets. | Confidential (inherits source) | Until account deletion (live data); Firestore PITR retains 7 days; managed Firestore backups retain 98 days |
| Vercel | Application hosting, Edge Middleware, Serverless Function inputs, AI Gateway routing, platform logs | Internal | Vendor retention for runtime/platform logs; enabled log drain sends selected production/preview log sources to Sentry |
| Cloudflare | Public DNS, reverse proxy, CDN/security edge for hiringcoach.ai | Internal / Confidential if request metadata includes user identifiers | Per Cloudflare defaults |
| Firestore backups | Firestore PITR, managed daily Firestore backups, manual local Firestore JSON export tooling, and the US multi-region backup/export bucket | Confidential (inherits source) | PITR: 7 days; managed daily Firestore backups: 98 days; primary export bucket: 90-day soft delete; local exports: 30 days target |
| Domain registrar / DNS | Domain registration metadata, DNS records | Internal | Until domain transfer or expiry |
| GitHub | Source-code hosting, CI runs, deploy artifacts | Internal | Per repository policy |
Payments and email
| System | Data | Classification | Retention |
|---|---|---|---|
| Stripe | Card tokens, customer IDs, payment intents, invoices, subscription events | Restricted (tokens); Confidential (customer IDs) | Per Stripe's policy; we hold only identifiers |
| SendGrid (Twilio) | Email addresses, send metadata, bounce / complaint records (transactional only) | Confidential | Per SendGrid default (90 d activity) |
| Mailchimp (Intuit) | Email, name, account/customer communication status, marketing-email preferences where applicable | Confidential | Until unsubscribe or suppression |
AI / generation providers
| System | Data | Classification | Retention |
|---|---|---|---|
| OpenAI (called both directly and via Vercel AI Gateway) | Prompts + completions | Confidential (input); output is HiringCoachAI-owned | Per-request store: false. No Zero Data Retention (ZDR) amendment: OpenAI's then-current standard API retention windows apply. |
| Perplexity AI | Research-backed search prompts | Confidential | Standard API terms; provider default retention applies. |
| ElevenLabs | Text-to-speech audio output | Confidential | Standard API terms; provider default retention applies. |
| Deepgram | Audio-to-text transcripts | Confidential | Per-request redact=true to redact sensitive number-like entities from transcripts, such as payment cards and Social Security numbers; provider default audio retention otherwise applies. |
| Google Cloud Text-to-Speech | Alternate TTS pipeline | Confidential | Standard API terms; provider default retention applies. |
OAuth and import providers
| System | Data | Classification | Retention |
|---|---|---|---|
| Google OAuth | Profile, email; Google Drive scope only on user grant | Confidential | Until user revokes |
| LinkedIn | OAuth sign-in; profile import (with user consent) | Confidential | Until user revokes |
| Facebook OAuth | Profile, email | Confidential | Until user revokes |
| Canva | Design asset import metadata | Internal | Until user revokes |
| Mapbox | Geocoding and location display (approximate location strings) | Internal | Per Mapbox defaults |
Monitoring and analytics (consent-gated for non-essential)
| System | Data | Classification | Retention |
|---|---|---|---|
| Sentry | Error payloads, stack traces, Vercel drained logs, user IDs only where needed for debugging after beforeSend scrubbing | Confidential | 90 d |
| Amplitude | Product analytics events (anonymous or identified) | Internal / Confidential (if identified) | Per vendor defaults; revocable via analytics consent |
| Mixpanel | Product analytics events | Internal / Confidential (if identified) | Per vendor defaults; revocable via analytics consent |
| Hotjar | Heatmaps and session insights with input masking | Internal | Per vendor defaults; revocable via analytics consent |
| Google Analytics / Google Tag Manager (GTM) | Page views, conversion events | Internal | Per vendor defaults; gated by analytics consent |
| Meta Pixel (Facebook) | Conversion events, hashed identifiers | Internal | Per vendor defaults; gated by marketing consent |
| PostHog (PostHog Inc., US) — client-side | Product-analytics event names and properties (e.g. login_attempted, email_magic_link_sent, registration_form_opened, upgrade_page_viewed, checkout_initiated, resume_optimization_started, job_search_submitted, job_board_link_clicked); stable user identifier (Firebase UID) and email attached only after analytics consent | Internal / Confidential (when identified) | Per vendor defaults; gated by analytics consent. SDK is opted-out by default in instrumentation-client.ts; capture begins only after the analytics consent category is granted and ceases immediately on revocation (distinct ID is reset). Lawful basis: consent. |
| PostHog (PostHog Inc., US) — server-side transactional | Operational events tied to contract performance: subscription_purchased, payment_failed (from Stripe webhook), and account_deletion_confirmed (from the account-deletion confirmation handler). Distinct ID is the Firebase UID for billing reconciliation and churn/fraud analysis. | Confidential | Per vendor defaults. Lawful basis: contract performance (subscription events) and legitimate interest (account-deletion churn analysis, payment-failure fraud signal). These events are operational telemetry, not marketing or behavioral analytics, and are not gated by cookie consent. |
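
The consent gating described in the client-side PostHog row can be sketched as a small function that reacts to the stored cookieConsent decision. This is an illustrative sketch, not the code in instrumentation-client.ts: `applyAnalyticsConsent` and the `AnalyticsClient` interface are hypothetical, with method names chosen to mirror the posthog-js methods of the same names so that a real client could be passed in.

```typescript
// Minimal interface; method names mirror posthog-js, the wiring is assumed.
interface AnalyticsClient {
  opt_in_capturing(): void;
  opt_out_capturing(): void;
  reset(): void;
}

// Subset of the cookieConsent fields listed in the users row.
interface CookieConsent {
  necessary: boolean;
  analytics: boolean;
  marketing: boolean;
}

// Start capture only when analytics consent is granted; on revocation,
// stop capture and reset the distinct ID so later events are not linked.
function applyAnalyticsConsent(
  client: AnalyticsClient,
  consent: CookieConsent,
): void {
  if (consent.analytics) {
    client.opt_in_capturing();
  } else {
    client.opt_out_capturing();
    client.reset();
  }
}
```

Calling this function from the cookie-consent handler on every decision change would keep the opted-out default in force until the analytics category is explicitly granted.
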
PII fields (consolidated)
For data subject request (DSR) purposes, a user's personal data is distributed across:
- users/{uid} and every user-scoped subcollection above (resumes, files, coverLetters, contacts, applications, integrations, etc.)
- subscriptions and subscriptionHistory (rows where userId == uid)
- linkedinCookies (rows where the uid field matches the user's UID)
- accounts, sessions, authTokens (rows where userId == uid) and verificationTokens (rows whose identifier refers to the user)
- auditLog (rows where actorUid == uid) and aiCallAudit (rows where the uid field matches the user's UID)
- pilotMemberships, pilotAdmins, pilotSessions, pilotEvents, pilotUserDailyRollups, pilotGoals, pilotInterventions, therapistSessions (rows where uid, userId, or membershipId links to the data subject)
- Stripe (customer object keyed by stripeCustomerId)
- SendGrid (email address)
- LinkedIn / Google / Facebook (provider-side records under the user's revocable OAuth grants)
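
For the top-level Firestore collections in the list above, locating a data subject's rows comes down to knowing which field holds the user's UID. The sketch below expresses that as a lookup table over in-memory rows. It is illustrative only: `SUBJECT_KEY_FIELDS` and `collectSubjectRows` are hypothetical names, the key fields follow the field lists in the tables of this document (for example, aiCallAudit lists uid), and verificationTokens and the pilot collections are omitted because their linkage is not a single UID field.

```typescript
// Hypothetical lookup table: the field that ties each collection's rows to a user.
const SUBJECT_KEY_FIELDS: Record<string, string> = {
  subscriptions: "userId",
  subscriptionHistory: "userId",
  linkedinCookies: "uid",
  accounts: "userId",
  sessions: "userId",
  authTokens: "userId",
  auditLog: "actorUid",
  aiCallAudit: "uid",
};

type Row = Record<string, unknown>;

// Return, per collection, the rows whose key field equals the subject's UID.
// Collections with no matching rows are left out of the result.
function collectSubjectRows(
  data: Record<string, Row[]>,
  uid: string,
): Record<string, Row[]> {
  const found: Record<string, Row[]> = {};
  for (const [collection, keyField] of Object.entries(SUBJECT_KEY_FIELDS)) {
    const rows = data[collection] ?? [];
    const matches = rows.filter((row) => row[keyField] === uid);
    if (matches.length > 0) {
      found[collection] = matches;
    }
  }
  return found;
}
```

In a real workflow each entry would translate into a Firestore query on the named field; the in-memory version only demonstrates the mapping.
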
The account-deletion flow cascades across user-scoped Firebase collections, linked authentication and session records, user-linked pilot administrator assignments, pilot-program direct identifiers, and Stripe subscription cancellation/verification before active account data removal. Pilot usage records are retained only after direct user identifiers are replaced with non-reversible deleted-participant identifiers and direct contact, name, and free-text fields are removed. SendGrid and other vendor-side records outside Stripe are handled through the DSR process rather than automatic API deletion.
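
The pilot-record handling in the paragraph above can be illustrated with a short sketch. It assumes a random, non-derived replacement identifier and a fixed list of fields to strip; the field names (`email`, `displayName`, `notes`), the `deleted-participant-` prefix, and both function names are hypothetical and not taken from the actual deletion flow.

```typescript
import { randomUUID } from "node:crypto";

type PilotRecord = Record<string, unknown>;

// Hypothetical field names for direct contact, name, and free-text data.
const DIRECT_IDENTIFIER_FIELDS = ["email", "displayName", "notes"];

// A random identifier cannot be reversed to the original UID because it is
// not derived from it. Reuse one value across a user's records so that
// anonymized usage can still be grouped per participant.
function newDeletedParticipantId(): string {
  return `deleted-participant-${randomUUID()}`;
}

// Copy the record, dropping direct identifiers and swapping the userId.
function anonymizePilotRecord(
  record: PilotRecord,
  deletedParticipantId: string,
): PilotRecord {
  const result: PilotRecord = {};
  for (const [key, value] of Object.entries(record)) {
    if (DIRECT_IDENTIFIER_FIELDS.includes(key)) continue;
    result[key] = key === "userId" ? deletedParticipantId : value;
  }
  return result;
}
```

Generating the identifier randomly, instead of hashing the UID, avoids leaving a value that someone who knows the UID could recompute.
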
The self-service account export currently includes the user's profile, recursive user subcollections, subscription record, linked authentication and session records with security secrets redacted, metadata-only audit rows from the audit log and AI call audit, and pilot membership, pilot-admin assignment, pilot-session, pilot-event, and pilot user-daily-rollup rows tied to the user. Vendor-side Stripe, SendGrid, analytics, or OAuth-provider records are handled through the DSR workflow with the applicable provider rather than the self-service JSON export.
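
Redacting security secrets from exported authentication and session rows can be sketched as a field-level filter. This is a hedged illustration: the secret field names are taken from the tables in this document (refreshToken, accessToken, sessionToken, tokenHash, confirmTokenHash), but which fields the real export redacts, the `[REDACTED]` placeholder, and the `redactSecrets` name are assumptions.

```typescript
// Field names come from the tables in this document; the set itself is assumed.
const SECRET_FIELDS = new Set([
  "refreshToken",
  "accessToken",
  "sessionToken",
  "tokenHash",
  "confirmTokenHash",
]);

const REDACTED = "[REDACTED]";

// Return a copy of the row with secret values replaced by a placeholder.
// Keys are kept so the export still shows that the field existed.
function redactSecrets(row: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(row)) {
    out[key] = SECRET_FIELDS.has(key) ? REDACTED : value;
  }
  return out;
}
```

For example, an accounts row would keep provider and providerAccountId in the export while its token values appear only as the placeholder.
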
Maintenance
- An automated data-map audit runs during local compliance checks and the manual security workflow. It scans every Firestore collection reference in the application code and compares it to this document, fails if a collection is referenced in code but not represented here, and maintains an explicit allowlist for operational and admin-internal collections that hold no personal data.
- The same audit surfaces drift between this document and the sub-processor list whenever local compliance checks or the manual security workflow run.
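
The collection check performed by the audit can be sketched as a set comparison. This is a simplified illustration, not the actual audit script: the function names, the regular expression used to find collection references, and the inputs are all assumptions.

```typescript
// Hypothetical sketch of the drift check; names and regex are illustrative only.
interface AuditResult {
  undocumented: string[]; // referenced in code, absent from the data map and allowlist
  ok: boolean;
}

// Extract names passed as string literals to .collection("...") calls.
function findCollectionRefs(source: string): Set<string> {
  const refs = new Set<string>();
  const pattern = /\.collection\(\s*["'`]([A-Za-z0-9_]+)["'`]\s*\)/g;
  for (const match of source.matchAll(pattern)) {
    const name = match[1];
    if (name) refs.add(name);
  }
  return refs;
}

// Fail when any referenced collection is neither documented nor allowlisted.
function auditDataMap(
  sources: string[],
  documented: Set<string>,
  allowlist: Set<string>,
): AuditResult {
  const undocumented = new Set<string>();
  for (const source of sources) {
    for (const name of findCollectionRefs(source)) {
      if (!documented.has(name) && !allowlist.has(name)) {
        undocumented.add(name);
      }
    }
  }
  const sorted = Array.from(undocumented).sort();
  return { undocumented: sorted, ok: sorted.length === 0 };
}
```

A regex scan like this misses collection names built dynamically, so a real audit would likely need additional handling for those cases.
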
Change history
| Date | Change | Author |
|---|---|---|
| 2026-04-24 | Initial map | Security Officer |
| 2026-05-06 | Comprehensive rewrite: added user-scoped subcollections (applications, drafts, fitAnalysis, candidateAnalysis, intelBriefings, interviewQuestions, interviewResearchCases, pepTalks, onboarding, contactLinks, followUps, followUpReminders, integrations, linkedIn / linkedinJobExports / linkedinProfileExports, shortAnswers, resumeMetadata, userDetails, explore, feedback, aiOutputFeedback, subscriptionHistory, verificationTokens, therapistSessions, pilotMemberships, pilotGoals, pilotInterventions); split compound vendor cells in non-Firestore section so each sub-processor has its own row; added drift gate via the automated data-map audit. | Security Officer |
| 2026-05-07 | Clarified self-service export coverage and vendor-side deletion status after code review and export expansion. | Security Officer |
| 2026-05-07 | Added compliance onboarding acknowledgments, training log, scheduled document review log, and latest document review rollups after onboarding/review portal implementation. | Security Officer |
| 2026-05-08 | Added monthly patch-verification run and rollup collections; corrected data-map audit workflow description to local/manual checks. | Security Officer |
| 2026-05-12 | Clarified that the cookie marketing-consent field governs marketing/attribution tracking and is not itself a marketing-email subscription. Mailchimp processing is listed for account/customer communications and communication-list management; marketing-email preference handling remains separate from cookie tracking consent. | Security Officer |
| 2026-05-12 | Added user-linked pilot engagement collections (pilotAdmins, pilotSessions, pilotEvents, pilotUserDailyRollups) to the personal-data inventory and account-export coverage notes. | Privacy Officer |
| 2026-05-12 | Clarified account-deletion handling for pilot programs: direct user identifiers are removed from pilot membership and usage records while anonymized usage is retained for program reporting. | Privacy Officer |
| 2026-05-14 | Added securityAuditMonitorRuns after implementing the hourly audit-log anomaly monitor and retention target. | Security Officer |
| 2026-05-17 | Added PostHog (PostHog Inc., US) as a product-analytics sub-processor. Client-side capture is opted-out by default and gated through the analytics consent category. Server-side transactional events (subscription, payment-failure, account-deletion confirmation) are listed separately under contract performance / legitimate interest. | Privacy Officer |
| 2026-05-17 | Added users/{uid}/files metadata for server-mediated account document uploads; aligned GCP/Firebase row with account document storage now that the upload feature is live. | Security Officer |
| 2026-05-20 | Added resume parser benchmark fixture, run, and attempt collections after introducing the admin benchmark harness. | Security Officer |