A comprehensive technical deep-dive into CortexOS: zero-knowledge cryptography, on-device Llama 3.2 LLM, sovereign ML pipeline, health integration, and notification intelligence.
Modern AI applications function as thin clients streaming user data to centralized servers. Every prompt, every journal entry, every voice recording crosses a trust boundary the user cannot audit. CortexOS rejects this model entirely.
Sovereign AI means your thoughts stay on your device. CortexOS leverages hardware acceleration on modern mobile processors — including on-device LLM inference via Llama 3.2 — to deliver intelligence that rivals cloud services without ever transmitting personal content.
Your journal entries, voice recordings, health correlations, generative AI outputs, and ML-derived insights are processed entirely on your device. Zero bytes of personal content ever touch our servers.
CortexOS implements application-layer encryption with user-derived keys. No server, cloud provider, or CortexOS itself can read your journal. The encryption key exists only on your device and is never transmitted in any form.
Verify the implementation: The cryptographic layer (key derivation, encryption, vault) is open source — github.com/CortexOS-App/CortexOS-crypto-core
During onboarding, CortexOS generates a 6-word BIP39 recovery phrase combined with a 4-digit PIN you choose. These two factors together are the sole source of all cryptographic material. Neither is ever sent to any server.
| Factor | Format | Entropy |
|---|---|---|
| Recovery phrase | 6 words from BIP39 wordlist (2,048 words) | 2,048⁶ ≈ 7.4 × 10¹⁹ combinations |
| PIN | 4 digits | 10,000 combinations |
| Combined | "word1 word2 word3 word4 word5 word6-1234" | 2,048⁶ × 10,000 ≈ 7.4 × 10²³ — brute-force infeasible with Argon2id |
The combined phrase is normalized (lowercase, whitespace trimmed) and fed into Argon2id. Three independent keys are derived from it — each with a different purpose-bound domain salt so that compromising one key reveals nothing about the others.
Argon2id (winner of the Password Hashing Competition, OWASP’s current recommendation) is used for all key derivation. Its memory-hard design makes GPU and ASIC attacks economically infeasible: a single derivation requires 64 MB of RAM held for the full computation.
| Parameter | Value | Rationale |
|---|---|---|
| Algorithm | Argon2id v1.3 | Hybrid variant: resists both GPU brute-force and side-channel attacks |
| Memory cost | 65,536 KB (64 MB) | Memory-hard — each attempt requires 64 MB RAM |
| Time cost | 3 iterations | ~800 ms on mid-range mobile; negligible for users, prohibitive for attackers |
| Parallelism | 4 lanes | Matches mobile multi-core; attacker gains nothing from parallelism |
| Output length | 32 bytes (256 bits) | Exact AES-256 key size; no truncation or expansion |
Three separate Argon2id calls produce three independent keys. Domain separation is achieved via purpose-bound salts — a technique that ensures the same input never produces the same output for different purposes:
| Key | Salt | Used For |
|---|---|---|
| `accountId` | `"cortexos-account-id-v2-argon2id"` (fixed) | Opaque server identifier. A fixed salt ensures the same phrase produces the same ID on any device — enabling cross-device vault lookup without registration. |
| `encryptionKey` | `"cortexos-encryption-key-v2-argon2id"` + 32-byte random user salt | AES-256 key for all local and vault encryption. The per-user salt ensures uniqueness even if two users share the same phrase. |
| `authToken` | `"cortexos-auth-token-v2-argon2id"` + 32-byte random user salt | API authentication bearer token. Fully independent from the encryption key — a server compromise reveals no key material. |
// Argon2id parameters — MUST match Android's BouncyCastle implementation exactly
private static let argon2Memory: UInt32 = 65536 // 64 MB in KB
private static let argon2TimeCost: UInt32 = 3 // iterations
private static let argon2Parallelism: UInt32 = 4 // threads
private static let hashLength: Int = 32 // 256 bits
// Domain-separation salts (raw UTF-8 bytes — identical on iOS and Android)
private static let saltAccountId = "cortexos-account-id-v2-argon2id"
private static let saltEncryptionKey = "cortexos-encryption-key-v2-argon2id"
private static let saltAuthToken = "cortexos-auth-token-v2-argon2id"
public static func deriveAllKeys(from fullPhrase: String, userSalt: Data) throws -> DerivedKeys {
let normalized = normalizePhrase(fullPhrase) // lowercase, trim whitespace
// accountId: fixed salt — same result on every device for the same phrase
let accountIdBytes = try argon2id(password: normalized, salt: saltAccountId)
// encryptionKey: purpose prefix + per-user random salt
let encSalt = saltEncryptionKey.data(using: .utf8)! + userSalt
let encryptionKeyBytes = try argon2id(password: normalized, saltData: encSalt)
// authToken: purpose prefix + per-user random salt (independent from encryptionKey)
let authSalt = saltAuthToken.data(using: .utf8)! + userSalt
let authTokenBytes = try argon2id(password: normalized, saltData: authSalt)
return DerivedKeys(
accountId: accountIdBytes.hexString, // 64-char hex
encryptionKey: SymmetricKey(data: encryptionKeyBytes),
authToken: authTokenBytes.hexString // 64-char hex
)
}
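The derivation scheme above can be sketched end-to-end in a few lines of Python. This is an illustrative analogue, not the app's code: `hashlib.scrypt` stands in for Argon2id (the Python standard library has no Argon2 binding), so the outputs differ from the real app, but the normalization and purpose-bound-salt structure are the same.

```python
import hashlib

def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace, mirroring normalizePhrase."""
    return " ".join(phrase.lower().split())

def derive(phrase: str, salt: bytes) -> bytes:
    # scrypt stands in for Argon2id: both are memory-hard KDFs (32-byte output)
    return hashlib.scrypt(normalize(phrase).encode(), salt=salt,
                          n=2**14, r=8, p=1, dklen=32)

user_salt = bytes(32)  # per-user random salt (zeroed here for illustration)
phrase = "Apple banana cherry delta echo fox-1234"

account_id     = derive(phrase, b"cortexos-account-id-v2-argon2id")
encryption_key = derive(phrase, b"cortexos-encryption-key-v2-argon2id" + user_salt)
auth_token     = derive(phrase, b"cortexos-auth-token-v2-argon2id" + user_salt)

# Purpose-bound salts yield three unrelated 256-bit keys from one input
assert len({account_id, encryption_key, auth_token}) == 3
```

Because the `accountId` salt is fixed, the same normalized phrase reproduces the same identifier on any device, while the two keys vary with the per-user salt.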
All content is encrypted with AES-256-GCM (Galois/Counter Mode). GCM provides both confidentiality and authenticity in a single pass — any tampering with the ciphertext is detected and rejected before decryption proceeds. This protects against padding oracle attacks, bit-flipping attacks, and silent data corruption.
| Component | Value |
|---|---|
| Cipher | AES-256-GCM |
| Key size | 256 bits (32 bytes) |
| Nonce / IV | 96 bits (12 bytes), cryptographically random per operation |
| Authentication tag | 128 bits (16 bytes) |
| Output format | Nonce ‖ Ciphertext ‖ AuthTag — combined into a single opaque blob |
| Key storage | iOS: Keychain (kSecAttrAccessibleWhenUnlockedThisDeviceOnly) • Android: Keystore (hardware-backed TEE/HSM) |
public func encryptData(_ data: Data) throws -> Data {
let key = try getMasterKeySync() // loaded from Keychain
let sealedBox = try AES.GCM.seal(data, using: key)
guard let combined = sealedBox.combined else { throw EncryptionError.encryptionFailed }
return combined // nonce || ciphertext || tag
}
public func decryptData(_ data: Data) throws -> Data {
let key = try getMasterKeySync()
let sealedBox = try AES.GCM.SealedBox(combined: data)
return try AES.GCM.open(sealedBox, using: key) // fails if tag invalid
}
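For intuition about the combined blob format, the layout can be unpacked with simple slicing. `split_blob` is a hypothetical helper for illustration, not part of the app's API:

```python
NONCE_LEN, TAG_LEN = 12, 16   # 96-bit nonce, 128-bit GCM auth tag

def split_blob(blob: bytes):
    """Split nonce || ciphertext || tag out of a combined AES-GCM blob."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("blob too short to contain nonce and tag")
    return blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]

blob = bytes(12) + b"opaque ciphertext" + bytes(16)
nonce, ciphertext, tag = split_blob(blob)
assert (len(nonce), len(tag)) == (12, 16)
assert ciphertext == b"opaque ciphertext"
```

The tag travels with the ciphertext, so any modification to either is detected when `AES.GCM.open` verifies it.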
Every piece of user-generated content is encrypted before being written to disk:
AI analysis runs against plaintext in memory only. Results are encrypted before being written to the database. The on-device LLM never receives decrypted entries from a network source — it reads from the already-decrypted in-memory context only during an active session.
A zero-knowledge system has an inherent UX problem: a wrong passphrase silently produces a wrong key. CortexOS solves this with two independent mechanisms:
Canary verification: During onboarding, a known string is encrypted with the derived key and stored. On each login, the system attempts to decrypt the canary. A match confirms the correct key was derived — without the key, phrase, or PIN ever being stored anywhere.
Secure word hashes: Each of the 6 phrase words is hashed as SHA-256("position:word") and stored in the Keychain. On re-login, the user is challenged with 2 randomly selected positions. Only the correct word at the correct position produces the matching hash. Binding the position into the hash prevents a hash captured for one position from being replayed at another, and an attacker who obtains a single hash can at most brute-force that one word from the 2,048-word list — never the full phrase or the PIN.
// SHA-256("position:word") — position prevents cross-position hash reuse attacks
public func hashWordAtPosition(_ word: String, position: Int) -> String {
let input = "\(position):\(word.lowercased())"
let hash = SHA256.hash(data: Data(input.utf8))
return hash.compactMap { String(format: "%02x", $0) }.joined()
}
// Login challenge: 2 random positions from the 6-word phrase + PIN
public func verifyPhraseChallenge(
wordIndex1: Int, word1: String,
wordIndex2: Int, word2: String,
pin: String
) -> Bool {
guard verifyPin(pin) else { return false }
let hash1 = hashWordAtPosition(word1, position: wordIndex1)
let hash2 = hashWordAtPosition(word2, position: wordIndex2)
return hash1 == getPhraseWordHash(wordIndex1) &&
hash2 == getPhraseWordHash(wordIndex2)
}
CortexOS offers optional encrypted cloud backup. The vault is architecturally zero-knowledge: the server stores encrypted blobs it cannot decrypt. There is no registration, no email, no account management server-side — just opaque bytes identified by a deterministic hash.
The vault pipeline has three layers, each with a single responsibility: serialization to JSON, client-side encryption with the `encryptionKey`, and upload of the opaque blob. The key never leaves the device.

| What the server stores | What the server cannot determine |
|---|---|
| Encrypted vault blob (opaque AES-256-GCM ciphertext) | Any plaintext journal content |
| Opaque `accountId` (deterministic Argon2id hash) | The user's identity, email, or recovery phrase |
| The `authToken` presented for API access | The `encryptionKey` or any other key material |
Even under legal compulsion, the server operator cannot produce plaintext journal data. There is nothing to hand over except encrypted bytes.
The accountId uses a fixed domain salt for its Argon2id derivation. This means the same recovery phrase + PIN always produces the same accountId on any device, on any platform (iOS or Android) — enabling vault recovery on a new device without any server-side registration or account linking. No email or identifier is ever provided to the server.
// 1. Serialize entries to JSON
let vaultData = VaultData(version: .currentVersion, platform: "ios", entries: entryDTOs)
let jsonData = try JSONEncoder().encode(vaultData)
// 2. Encrypt client-side — server never sees plaintext
let encryptedData = try VaultEncryption.encrypt(data: jsonData, key: key)
// 3. Upload opaque blob — authenticated with authToken, NOT encryptionKey
try await api.uploadVault(data: encryptedData)
When a user installs CortexOS on a new device and enters their recovery phrase + PIN:
1. The app derives the `accountId` and `encryptionKey` deterministically from the phrase + PIN
2. It derives the `authToken` and downloads the encrypted vault blob from the server
3. It decrypts the blob locally with the `encryptionKey`

At no point does the server participate in key exchange, key storage, or key escrow. If the recovery phrase is lost, the vault cannot be decrypted by anyone — including CortexOS.
| Threat | CortexOS’s Protection |
|---|---|
| Server breach (Cloudflare R2 compromised) | Attacker gets encrypted blobs and opaque accountIds. No decryption possible without the user’s phrase + PIN. |
| Network interception (MitM) | All vault traffic is TLS + the payload is independently AES-256-GCM encrypted. Interception yields an encrypted blob over an encrypted channel. |
| Subpoena / legal compulsion of server | Server operator can only produce encrypted blobs. No key material exists on the server. No user identity beyond opaque accountId. |
| Brute-force of recovery phrase | 2,048⁶ × 10,000 ≈ 7.4 × 10²³ combinations. Each attempt requires 64 MB RAM × 3 iterations of Argon2id. Computationally infeasible. |
| Compromised authToken | authToken authenticates vault API access but is cryptographically independent from the encryptionKey. A leaked token allows downloading the blob but not decrypting it. |
| Malicious app update replacing the crypto | App Store code signing and Apple’s review process. Open-sourced crypto layer allows independent verification that the published code matches the app. |
| Attacker with physical device access | Data at rest is encrypted. Keychain keys are WhenUnlockedThisDeviceOnly — inaccessible when the device is locked or moved to another device. |
| Lost recovery phrase | Data cannot be recovered by the user or by CortexOS. This is by design — it is the cost of zero-knowledge encryption. |
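The brute-force figure in the table is easy to verify with back-of-the-envelope arithmetic. The guess rate below is an assumption for illustration (it is deliberately generous given the ~800 ms, 64 MB cost per Argon2id derivation stated earlier):

```python
combos = 2048 ** 6 * 10_000      # 6 BIP39 words x 10,000 possible PINs
assert combos > 7.3e23

# Assume a well-funded attacker sustaining 1,000 Argon2id guesses/sec
# (each guess costs 64 MB of RAM and ~800 ms on commodity hardware).
seconds_to_half_keyspace = combos / 2 / 1_000
years = seconds_to_half_keyspace / (365 * 24 * 3600)
assert years > 1e13              # on the order of ten trillion years
```

Even granting the attacker several more orders of magnitude of throughput, the expected search time remains far beyond any practical horizon.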
CortexOS runs AI models entirely offline using a four-system architecture, each serving distinct purposes while sharing common ML infrastructure.
- **Entry analysis (real-time).** Triggered when the user saves a journal entry. Routes through `AIAnalyzerFactory` to the tier-appropriate TFLite / Lexicon pipeline. Outputs sentiment, emotions, entities, and cognitive distortions.
- **Cross-entry analysis (background).** Runs via `InsightWorker` (9 AM) and `NightlyAnalysisWorker` (3 AM). Uses `MLCrossEntryAnalyzer` for semantic clustering, mood trajectories, and correlation detection.
- **Notification intelligence.** `MLContextualEngine` scores prompts by relevance (60%) + novelty (40%). Considers time of day, mood trajectory, and streak status for smart nudges.
- **Generative AI (on-demand).** Llama 3.2 via `LlamaGenerativeService`. Generates cognitive reframings, weekly summaries, streak celebrations, and therapeutic nudge messages. See Section 5.
| Model | Size | Purpose |
|---|---|---|
| `mobilebert_sentiment.tflite` | ~25 MB | Primary BERT-based sentiment analysis |
| `sentiment_model.tflite` | ~2 MB | Lightweight sentiment fallback |
| `emotion_model.tflite` | ~5 MB | 15-class emotion detection |
| `sentence_encoder.tflite` | ~22 MB | 512-dimensional embeddings (USE-Lite) |
| `llama-3.2-1b-q4_k_m.gguf` | ~700 MB | On-device generative AI (mid-range devices) |
| `llama-3.2-3b-q4_k_m.gguf` | ~1.8 GB | On-device generative AI (high-end devices, 6 GB+ RAM) |
| `whisper-tiny.bin` | ~75 MB | Voice-to-text transcription |
When models fail or device resources are constrained, the system degrades gracefully through five fallback levels, bottoming out at a pure lexicon analyzer. The lexicon contains sentiment-scored words from the AFINN and SentiWordNet datasets, enabling full offline analysis without any ML models loaded.
DeviceCapabilityChecker detects available RAM and CPU tier at startup. The ML pipeline adapts accordingly:
| Device Tier | RAM | Capabilities |
|---|---|---|
| HIGH_END | 6 GB+ | All TFLite models + Llama 3.2 3B + Whisper |
| MID_END | 4–6 GB | All TFLite models + Llama 3.2 1B + Whisper |
| LOW_END | 3–4 GB | Lightweight TFLite + Lexicon + Whisper Tiny |
| MINIMAL | < 3 GB | Lexicon only. LLM disabled entirely. |
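The tier decision reduces to a threshold ladder over available RAM. A sketch of the mapping, using the boundaries from the table (the exact `DeviceCapabilityChecker` logic also weighs CPU tier, which is omitted here):

```python
def device_tier(ram_gb: float) -> str:
    """Map available RAM to an ML capability tier (thresholds from the table)."""
    if ram_gb >= 6:
        return "HIGH_END"   # all TFLite models + Llama 3.2 3B + Whisper
    if ram_gb >= 4:
        return "MID_END"    # all TFLite models + Llama 3.2 1B + Whisper
    if ram_gb >= 3:
        return "LOW_END"    # lightweight TFLite + lexicon + Whisper Tiny
    return "MINIMAL"        # lexicon only; LLM disabled entirely

assert device_tier(8) == "HIGH_END"
assert device_tier(4.5) == "MID_END"
assert device_tier(3.5) == "LOW_END"
assert device_tier(2) == "MINIMAL"
```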
CortexOS v4.0 introduces full generative AI running entirely on-device using Meta’s Llama 3.2 Instruct models via llama.cpp. No text is ever sent to a server. This is the single largest capability addition since launch.
| Parameter | Value |
|---|---|
| Model | Llama 3.2 Instruct (1B or 3B, quantized q4_k_m) |
| Context Window | 2,048 tokens |
| Max Output Tokens | 256 (default) |
| Temperature | 0.7 |
| Top-P | 0.9 |
| Timeout | 30 seconds (configurable) |
Every LLM call uses a two-tier contract: try Llama first, fall back to a high-quality static template if inference fails, times out, or the model isn’t loaded.
`LlamaGenerativeService.kt`:
suspend fun generateOrFallback(
prompt: String,
fallback: String,
maxTokens: Int = 200,
temperature: Float = 0.7f,
timeoutMs: Long = 30_000L
): String
suspend fun generateOrNull(
prompt: String,
maxTokens: Int = 200,
temperature: Float = 0.7f,
timeoutMs: Long = 30_000L
): String?
fun isAvailable(): Boolean = MLCoreEngine.hasLlamaSupport()
This ensures the user experience is never degraded by model unavailability. Static templates in AIContentLibrary (500+ prompts, 200+ insights, 40+ emotion definitions) provide rich fallback content.
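The same two-tier contract can be expressed in a few lines of Python. This is a language-agnostic sketch of the pattern, not the app's implementation; `generate` stands in for the llama.cpp inference call:

```python
import concurrent.futures

def generate_or_fallback(generate, prompt: str, fallback: str,
                         timeout_s: float = 30.0) -> str:
    """Try LLM inference with a deadline; return the static fallback on
    timeout, model-not-loaded, or any inference error."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(generate, prompt).result(timeout=timeout_s)
    except Exception:            # TimeoutError, RuntimeError, ...
        return fallback
    finally:
        pool.shutdown(wait=False)

assert generate_or_fallback(lambda p: "reframe: " + p, "stuck", "template") == "reframe: stuck"
assert generate_or_fallback(lambda p: 1 / 0, "stuck", "template") == "template"
```

The caller always receives a usable string, which is what lets the UI treat LLM availability as an enhancement rather than a dependency.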
ML models are trained on general datasets, but every person writes differently. CortexOS includes a PersonalizedSentimentAdapter that learns per-user sentiment corrections over time without retraining the underlying model.
The adapter operates as a lightweight post-processing layer after TFLite inference. If a user’s entries about “gym” are consistently positive but the model scores them neutral, the adapter learns a topic→sentiment offset and adjusts future scores.
Learned offsets are refreshed by `NightlyAnalysisWorker`; at inference time the adapter applies them:

suspend fun adjust(text: String, baseSentiment: Float): Float {
    val keywords = extractKeywords(text)
    var totalOffset = 0f
    var matchCount = 0
    for (keyword in keywords) {
        learnedOffsets[keyword.lowercase()]?.let { offset ->
            totalOffset += offset
            matchCount++
        }
    }
    // Average the matched offsets; with no matches the base score is unchanged
    val averageOffset = if (matchCount > 0) totalOffset / matchCount else 0f
    return (baseSentiment + averageOffset).coerceIn(-1f, 1f)
}
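The snippet above shows how offsets are applied, not how they are learned. One plausible update rule — a hypothetical sketch, since the actual learning step isn't shown — is an exponential moving average over user corrections, which converges on a stable per-topic offset without storing history:

```python
LEARNING_RATE = 0.3  # assumed smoothing factor

def learn_offset(learned: dict, keyword: str,
                 model_score: float, user_score: float) -> None:
    """Nudge a topic -> sentiment offset toward the user's correction (EMA)."""
    target = user_score - model_score            # how far off the model was
    prev = learned.get(keyword.lower(), 0.0)
    learned[keyword.lower()] = prev + LEARNING_RATE * (target - prev)

offsets = {}
for _ in range(20):   # user repeatedly rates "gym" entries as positive
    learn_offset(offsets, "gym", model_score=0.0, user_score=0.6)
assert abs(offsets["gym"] - 0.6) < 0.01   # converges toward the +0.6 correction
```

An EMA keeps the adapter responsive to genuine shifts in how a user writes about a topic while damping one-off outliers.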
CortexOS offers three subscription tiers plus a 14-day free trial. Cortex Deep and Cortex Ultra have identical capabilities — Ultra is the one-time lifetime payment option.
| Feature | Nano (Free) | Deep (Premium) | Ultra (Lifetime) | Trial (14 days) |
|---|---|---|---|---|
| Entries / month | 50 | Unlimited | Unlimited | Unlimited |
| Sentiment analysis | 3-level | 5-level + confidence | 5-level + confidence | 5-level + confidence |
| Emotions detected | 5 | 15 | 15 | 15 |
| Sentiment lexicon | 1,200 words | 15,234 words | 15,234 words | 15,234 words |
| Context window | 1,000 words | 5,000 words | 5,000 words | 5,000 words |
| Cognitive distortions | — | 15 types + reframing | 15 types + reframing | 15 types + reframing |
| Cross-entry insights | Basic | ML-enhanced | ML-enhanced | ML-enhanced |
| Weekly summaries | — | ✓ | ✓ | ✓ |
| Semantic throwbacks | — | ✓ | ✓ | ✓ |
| Daily reflection | — | ✓ | ✓ | ✓ |
| Mood prediction | — | ✓ | ✓ | ✓ |
| Therapist export | — | ✓ | ✓ | ✓ |
| Voice journaling | ✓ | ✓ | ✓ | ✓ |
| Chapters | ✓ | ✓ | ✓ | ✓ |
| Health Integration | ✓ | ✓ | ✓ | ✓ |
| Cloud vault backup | ✓ | ✓ | ✓ | ✓ |
| Home screen widget | ✓ | ✓ | ✓ | ✓ |
fun getAnalyzer(context: Context): BaseAIAnalyzer {
val prefs = UserPreferences(context)
val effectiveTier = when {
prefs.isLifetime -> TIER_LIFETIME
prefs.isPremium -> TIER_PREMIUM // Includes trial
else -> prefs.subscriptionTier
}
return when (effectiveTier) {
TIER_LIFETIME, TIER_PREMIUM
-> LifetimeAIAnalyzer(context) // Cortex Ultra/Deep
else
-> FreemiumAIAnalyzer(context) // Cortex Nano
}
}
CortexOS implements semantic understanding using sentence embeddings, enabling discovery of entries that feel similar despite using entirely different words.
Each journal entry is encoded into a 512-dimensional vector using a quantized Universal Sentence Encoder Lite (USE-Lite). Embeddings capture semantic meaning beyond keywords and are stored as 2,048 bytes (512 float32 values) per entry.
`MLCoreEngine.kt`:
const val EMBEDDING_DIM = 512
const val EMBEDDING_BYTE_SIZE = EMBEDDING_DIM * 4 // 512 floats × 4 bytes = 2,048 bytes
Rather than matching by date alone, the system finds entries by meaning through cosine similarity with a 0.5–0.7 threshold.
Today’s entry about “feeling overwhelmed at work” surfaces a 6-month-old entry about “the pressure of deadlines” with 0.73 similarity — despite sharing no common words.
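Cosine similarity is the standard measure for this kind of matching. A minimal sketch with toy 4-dimensional vectors standing in for the 512-dimensional USE-Lite embeddings (the vectors and threshold band are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

THRESHOLD = 0.5  # lower bound of the 0.5-0.7 matching band

# Toy embeddings: semantically close entries point in similar directions
overwhelmed_at_work   = [0.9, 0.3, 0.1, 0.0]
pressure_of_deadlines = [0.8, 0.4, 0.2, 0.1]
weekend_hike          = [0.0, 0.1, 0.9, 0.4]

assert cosine(overwhelmed_at_work, pressure_of_deadlines) > THRESHOLD
assert cosine(overwhelmed_at_work, weekend_hike) < THRESHOLD
```

The same comparison over real embeddings is what lets entries with no shared vocabulary score as related.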
MLCrossEntryAnalyzer uses K-means clustering (k=5) on embeddings to automatically group entries by theme, enabling pattern detection without manual tagging. Themes are extracted from cluster centroids and labeled using the most representative entries.
NightlyAnalysisWorker scans for entries missing embeddings (created before ML was available, or migrated from older versions) and regenerates up to 50 per night. This ensures the semantic index gradually reaches full coverage without impacting daytime performance.
CortexOS transforms journaling from isolated entries into a continuous narrative. The Journey System organizes entries into 7-day chapters with AI-generated names, creating a sense of progression and story.
`ChapterFinalizationWorker` automatically finalizes expired chapters twice daily. `ChapterNameGenerator` aggregates emotions, themes, and mood trajectory across all entries in the chapter to produce a poetic title. When Llama 3.2 is available, names are generated by the LLM for richer, more personalized results.
ChapterAnalyzer provides ML-enhanced aggregation across a chapter’s entries.
CortexOS integrates with Apple HealthKit (iOS) and Android Health Connect (Android) to surface correlations between physical health and emotional state. All health data stays on-device and is processed locally.
| Data Type | What’s Read | Alert Threshold |
|---|---|---|
| Sleep | Sessions, total duration, sleep stages | < 5 hours triggers concern |
| Heart Rate | Average, resting, min/max | > 100 bpm triggers alert |
| Steps & Activity | Daily steps, distance, calories | — |
| Workouts | Exercise sessions, types, duration | — |
HealthSignalMonitor watches for concerning patterns (low sleep, elevated heart rate) and caches health concern flags for InsightWorker to surface. A 24-hour cooldown prevents alert fatigue. Health insights are integrated into chapter analytics, weekly summaries, and therapist exports.
Each data type is independently togglable. CortexOS requests only the permissions the user explicitly enables. Health data is encrypted alongside journal entries using the same AES-256-GCM pipeline.
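The monitoring logic reduces to threshold checks gated by a cooldown. A hedged sketch of a `HealthSignalMonitor`-style check, using the thresholds from the table (the function name and interface are assumptions for illustration):

```python
def health_concerns(sleep_hours, avg_heart_rate, hours_since_last_alert):
    """Flag concerning patterns; a 24-hour cooldown prevents alert fatigue."""
    if hours_since_last_alert is not None and hours_since_last_alert < 24:
        return []                                  # still in cooldown
    concerns = []
    if sleep_hours is not None and sleep_hours < 5:
        concerns.append("low_sleep")               # < 5 hours triggers concern
    if avg_heart_rate is not None and avg_heart_rate > 100:
        concerns.append("elevated_heart_rate")     # > 100 bpm triggers alert
    return concerns

assert health_concerns(4.2, 105, None) == ["low_sleep", "elevated_heart_rate"]
assert health_concerns(4.2, 105, 6) == []          # within the 24 h cooldown
assert health_concerns(7.5, 70, 48) == []
```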
Voice journaling uses OpenAI’s Whisper model ported for entirely on-device operation via Whisper.cpp. No audio ever leaves the device.
| Parameter | Value |
|---|---|
| Engine | Android: Whisper.cpp via JNI • iOS: WhisperKit |
| Models | whisper-tiny.bin (75 MB) or whisper-base.bin (150 MB) |
| Sample Rate | 16 kHz mono PCM |
| Languages | 99 languages supported |
| Processing | 100% on-device, zero cloud |
CortexOS has 7+ independent notification systems. Without coordination, worst-case days could produce 10–14 notifications. The Notification Intelligence layer ensures notifications are helpful, never overwhelming.
NotificationThrottler is a global gatekeeper that every notification must pass through before firing. It enforces a daily budget and minimum spacing.
| Priority | Types | Budget | Spacing |
|---|---|---|---|
| CRITICAL | User-set reminders | Always delivered | None |
| HIGH | Mood alerts, declining mood check-ins | Subject to budget | None |
| MEDIUM | Daily reflection, nudges, inactivity | Subject to budget | 2 hours minimum |
| LOW | Throwbacks, streaks, chapters, summaries | Subject to budget | 2 hours minimum |
Default daily budget: 4 notifications (user-configurable, range 2–8). Resets at midnight. CRITICAL reminders never count against the budget.
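The gatekeeping rules above combine into a small state machine. A sketch of a throttler honoring the budget, spacing, and CRITICAL-bypass rules (class shape and method names are assumptions, not the app's API):

```python
from datetime import datetime, timedelta

class NotificationThrottler:
    """Daily-budget + minimum-spacing gatekeeper for notifications."""
    def __init__(self, daily_budget: int = 4, spacing=timedelta(hours=2)):
        self.budget, self.spacing = daily_budget, spacing
        self.sent_today, self.last_sent = 0, None

    def allow(self, priority: str, now: datetime) -> bool:
        if priority == "CRITICAL":               # user reminders always fire
            return True
        if self.sent_today >= self.budget:       # daily budget exhausted
            return False
        if priority in ("MEDIUM", "LOW") and self.last_sent is not None \
                and now - self.last_sent < self.spacing:
            return False                         # 2-hour minimum spacing
        self.sent_today += 1
        self.last_sent = now
        return True

t = NotificationThrottler()
noon = datetime(2025, 1, 1, 12)
assert t.allow("HIGH", noon)
assert not t.allow("LOW", noon + timedelta(minutes=30))   # too soon
assert t.allow("CRITICAL", noon)                          # never throttled
```

A midnight reset of `sent_today` (omitted here) completes the behavior described above.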
InsightWorker runs 7 analysis checks daily but collects notification candidates into a priority queue rather than firing each one. Only the top 1–2 candidates are delivered. Unfired insights are still saved to the database and visible in the Insights tab.
Priority order: mood alert > streak milestone > throwback > weekly summary > smart nudge > health signal.
SessionMoodMonitor provides real-time intra-session detection during active journaling sessions.
| Parameter | Value |
|---|---|
| Session window | 4 hours for decline detection |
| Session timeout | 6 hours of inactivity |
| Decline trigger | 3+ increasingly negative entries in window |
| Sudden drop threshold | > 0.6 deviation from 7-day average |
| Alert cooldown | 3 days between session mood alerts |
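The two trigger conditions from the table can be sketched directly. Sentiment scores are assumed to lie in [-1, 1] as elsewhere in the document; the function shape is illustrative, not the app's API:

```python
def should_alert(session_scores, week_avg, drop_threshold=0.6):
    """Session decline: 3+ increasingly negative entries in the window, or a
    sudden drop of more than `drop_threshold` below the 7-day average."""
    declining = (len(session_scores) >= 3 and
                 all(b < a for a, b in zip(session_scores, session_scores[1:])))
    sudden = bool(session_scores) and (week_avg - session_scores[-1]) > drop_threshold
    return declining or sudden

assert should_alert([0.2, -0.1, -0.5], week_avg=0.1)   # steady decline
assert should_alert([-0.7], week_avg=0.2)              # sudden drop: 0.9 > 0.6
assert not should_alert([0.1, 0.3], week_avg=0.2)
```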
InactivityNudgeWorker sends escalating re-engagement nudges at 3, 7, 14, 30, and 60+ day inactivity thresholds. Each threshold fires only once; the counter resets when the user creates a new entry.
ReminderDetector uses NLP to extract reminders directly from journal text (“remind me in 2 hours”, “don’t forget tomorrow”). Detected reminders are automatically scheduled and delivered as CRITICAL-priority notifications.
CortexOS coordinates 13 background workers through a platform-native scheduler (WorkManager on Android, BGTaskScheduler on iOS):
| Worker | Schedule | Purpose |
|---|---|---|
| `NightlyAnalysisWorker` | 3 AM daily | Heavy ML analysis, cache pre-computation, embedding regeneration |
| `InsightWorker` | 9 AM daily | Throwbacks, mood alerts, weekly summaries, streak checks |
| `DailyReflectionWorker` | User-configured hour | Mood-aware journaling prompts |
| `NudgeWorker` | Every 8–24 h | Contextual journaling nudges |
| `InactivityNudgeWorker` | Daily | Progressive re-engagement (3/7/14/30/60+ days) |
| `ChapterFinalizationWorker` | Every 12 h | Finalize expired 7-day chapters |
| `DailyReminderScanWorker` | 6 AM daily | Reschedule missed reminders |
| `ReminderWorker` | On-demand | Fire individual reminders at their scheduled time |
| `VaultSyncWorker` | On write (debounced) | Encrypted cloud backup |
TherapistExportManager generates shareable reports for mental health professionals while preserving journal privacy. All processing is on-device.
Reports are generated in a secure temporary directory and programmatically wiped when no longer needed.
Unlike analytics-driven apps, CortexOS learns user behavior locally to optimize the journaling experience — without transmitting behavioral data to any server.
MLContextualEngine analyzes entry timestamps across the last 30 days to predict optimal journaling times. It counts entries by hour, identifies peak hours, calculates confidence, and provides natural-language reasoning (e.g., “You usually journal in the evening”).
Notifications adapt to user context using locally computed signals.
SmartVarietySelector in the AIContentLibrary tracks the last 10 prompts shown to each user and ensures no repetition. Prompts are scored by relevance (60%) and novelty (40%) before selection.
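The 60/40 scoring with a recency exclusion fits in a few lines. A sketch under stated assumptions — `pick_prompt` and its dictionary-based scores are illustrative stand-ins for the `SmartVarietySelector` internals:

```python
def pick_prompt(prompts, relevance, novelty, recent):
    """Score prompts as 0.6*relevance + 0.4*novelty, skipping the last 10 shown."""
    candidates = [p for p in prompts if p not in recent[-10:]]
    return max(candidates, key=lambda p: 0.6 * relevance[p] + 0.4 * novelty[p])

prompts   = ["gratitude", "stress", "sleep"]
relevance = {"gratitude": 0.9, "stress": 0.8, "sleep": 0.4}
novelty   = {"gratitude": 0.1, "stress": 0.9, "sleep": 0.9}
recent    = ["gratitude"]   # shown recently -> excluded from candidates

assert pick_prompt(prompts, relevance, novelty, recent) == "stress"
```

Weighting relevance above novelty keeps prompts on-topic while the recency window guarantees variety.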
NightlyAnalysisWorker runs heavy analysis at 3 AM while the user sleeps — pre-computing correlations, mood trajectories, emotional cycles, and embedding regeneration. Results are cached in AnalysisCache for instant access on wake. This ensures the Insights tab loads instantly without running expensive ML queries in the foreground.
CortexOS v4.0 represents the most comprehensive implementation of sovereign AI in a consumer application, combining zero-knowledge cryptography, on-device LLM inference via Llama 3.2, a tiered sovereign ML pipeline, health integration, and notification intelligence.
Your mind belongs to you. Your software should respect that.
Security researchers are welcome to review our approach. Responsible disclosure: info@cortexos.app