18 years of nonprofit technology operations. Independent AI production systems on the Anthropic Claude API. This portfolio documents methodology, production artifacts, quality assurance systems, and live output.
[email protected] · Bloomfield, NJ · LinkedIn · Medium
In early 2026 I independently designed and built a daily AI content production system on the Anthropic Claude API. The system generates essays in ten distinct character voices, converts them to multi-speaker podcast audio via VibeVoice on Kaggle GPUs, verifies citations through a three-stage adversarial process, runs a four-pass editorial review, and publishes finished episodes overnight with no manual intervention. In daily production since launch across nine publications.
Orchestrated across four platforms: Make.com automation scenarios handle the primary essay pipeline, Power Automate integrates the Claude API with Microsoft 365, Python scripts via Windows Task Scheduler handle batch submission and news aggregation, and Kaggle T4 GPU notebooks run VibeVoice for text-to-speech.
Make.com overnight batch scenario: 24 modules spanning Gmail news fetch, Claude API Batch submission, the poll/retrieve cycle, email delivery, and OneDrive storage.
Batch cleanup: polls for completed batches, extracts text, delivers via email, archives to OneDrive.
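The poll/retrieve cycle described above can be sketched as a generic polling loop. This is an illustrative reconstruction, not the production Make.com scenario: the function name `poll_until_complete`, the status string `"ended"`, and the injected callables are assumptions standing in for whatever the real batch backend exposes.

```python
import time

def poll_until_complete(get_status, fetch_results, interval_s=60, max_polls=120):
    """Poll a batch job until it reports 'ended', then fetch its results.

    get_status and fetch_results are injected so the same loop works
    against any backend (e.g. thin adapters over a batch API).
    """
    for _ in range(max_polls):
        if get_status() == "ended":
            return fetch_results()
        time.sleep(interval_s)
    raise TimeoutError("batch did not complete within the polling window")
```

Injecting the two callables keeps the scheduling logic separate from any particular API client, which is the same separation the Make.com scenario enforces between its poll module and its retrieve module.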
Power Automate daily digest: Claude API + Microsoft 365. The voice guide is a plain Word document on OneDrive that non-technical program staff can edit on their own — no code, no pipeline access. The pipeline reads it at runtime.
A reusable methodology for defining and testing AI voice guides — structured prompt documents that control how the Claude API generates text in a specific character voice. Codified in a meta-document and successfully taught to a non-technical user who produced working voice guides independently.
Thinking Style vs. Costume
Every voice has two layers. Thinking Style is the cognitive pattern — how the character processes the world, what tensions they carry. Costume is the surface layer — sentence rhythm, tone, verbal habits. A voice defined at the thinking level produces original output. A voice defined at the surface level produces mimicry.
Balance Rule
If the costume is more detailed than the thinking style, the output sounds right but thinks blandly — mannerisms draped over default Claude reasoning. This is the most common failure mode.
Audience Alignment
Every voice needs a tailored audience model that pulls in the same direction as the costume. A fast, combative voice needs a bored or resistant audience. Mismatched audiences cause the voice to spend generation energy managing the conflict instead of thinking.
Grounding Rule
Every voice needs an explicit instruction for moving from abstraction to concreteness. Without one, the model defaults to vague anecdote. The instruction should force a choice: fully specific or fully mythic. The middle ground is where every voice sounds the same.
Separation of Generation and Editing
Including editorial rules during generation causes the model to self-edit while writing, producing safer but flatter output. The editing checklist belongs in a separate pass only.
One of twenty voice guides in production. The passthrough section below is the compact, token-efficient definition sent to the Claude API. Full guides include Part 1 (development rationale, failure mode analysis) and a separate editing-pass checklist.
Thinking Style: Sees every problem as an implementation problem. Not “what should we do” but “what happens when we try to do it.” Two tensions: genuine belief that institutions improve the world AND willingness to abandon any institution the moment the math says it’s failing. Deep care for individuals AND utilitarian commitment to collective outcomes.
Costume: Clean, measured, direct prose. Medium-length sentences building arguments in visible steps. Key move: the parenthetical admission, where the person arguing is briefly visible inside the argument. Second move: the composure crack — one short, plain sentence that lets something real through.
Audience: Someone who cares about the same problem and has a worse plan. Firm, specific, not unkind.
Grounding: Specific institutional detail: the exact policy mechanism, the exact budget number. Never vague. Either the specific study and its specific finding, or nothing.
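A passthrough definition like the one above can be assembled mechanically from the four guide sections. The section names mirror the guide structure documented here; the dictionary shape and wire format below are illustrative, not the production schema.

```python
def build_passthrough(voice: dict) -> str:
    """Assemble a compact per-voice definition for the API system prompt.

    The four section names follow the voice-guide methodology; ordering
    matters, so they are listed explicitly rather than iterated from the
    dict.
    """
    sections = ["Thinking Style", "Costume", "Audience", "Grounding"]
    return "\n\n".join(f"{name}: {voice[name]}" for name in sections)
```

Building the passthrough from named sections keeps the token-efficient API payload in sync with the longer human-readable guide.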
Multi-speaker podcast episodes up to 90 minutes, produced using VibeVoice on Kaggle T4 GPUs. 2–4 speakers with consistent voice identity and natural conversational delivery.
Kaggle GPU notebook: VibeVoice pipeline pulling scripts from OneDrive, generating multi-speaker audio on T4 accelerators.
Audacity: engineering voice reference samples by analyzing pitch contours and selecting segments with the right prosodic characteristics for TTS cloning.
Voice Reference Engineering
Custom voice references built by isolating speakers from source recordings using pitch-based clustering and MFCC analysis. Speaker-order enforcement, voice prompt trimming, and chunking systems handle voice reassignment at boundaries.
Script Engineering for TTS
Punctuation as prosody control. Energy continuity across speaker switches. The monotone test: if a line doesn’t communicate with zero inflection, rewrite it.
The Chunking Solution
Multi-speaker TTS breaks at long durations. Solved by designing script-level commercial breaks in a 1930s radio format that re-introduce all speakers at each boundary — turning a technical constraint into a format feature.
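The chunking approach can be sketched as a splitter that opens every chunk after the first with a speaker re-introduction. The "Speaker: text" line format and the announcer copy are illustrative stand-ins, not the production break script.

```python
def chunk_script(lines, speakers, max_lines=40):
    """Split a multi-speaker script into TTS-sized chunks.

    Each chunk after the first opens with a commercial-break style
    re-introduction of every speaker, so the TTS model re-anchors voice
    identity at the boundary instead of drifting.
    """
    intro = ["Announcer: And now, back to our program with "
             + ", ".join(speakers) + "."]
    chunks = []
    for i in range(0, len(lines), max_lines):
        chunk = lines[i:i + max_lines]
        if i > 0:
            chunk = intro + chunk
        chunks.append(chunk)
    return chunks
```

Because the re-introduction is part of the script rather than a post-processing step, the boundary reads as a 1930s radio convention instead of an artifact.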
Every generated piece passes through four sequential editorial stages before publication. Each stage has a dedicated AI editor with a specific jurisdiction, a defined scoring methodology, and an explicit constraint preventing it from evaluating outside its domain. The separation of concerns is the point.
AI Detection Check
A 12-category phrase-detection checklist. Binary gate: three or more hits in the same passage are diagnostic of machine-generated text. It catches tells invisible to quality-focused editors.
Conceptual Craft
Is the piece thinking, or performing? Evaluates originality, conceptual architecture, voice presence, humor, and thermal range. Swap test: strip the voice and see what survives.
Originality (2x), Architecture (2x), Voice, Humor, Thermal Range → 1–100
Structural Craft
Is the piece built correctly? Proportion, pacing, transitions, register control, architectural integrity. Watches for structural AI tells: uniform section length, repeated openings, symmetric structure without justification.
Proportion (2x), Pacing (2x), Transitions, Register, Integrity → 1–100
Prose Craft
Are the sentences working? Requires quantitative evidence before scoring: sentence length bands, syntactic opening patterns, verb counts. The count overrides impressions.
Rhythm (2x), Economy, Vitality, Architecture, Control (2x) → 1–100
Sample prose judge output: quantitative sentence analysis (word count, length band, syntactic opening type) with diagnostic. The count overrides impressions — if the data shows monotonous openings, the score reflects it regardless of how the prose felt on first read.
Each editor has calibrated scoring (40 = functional, 55 = competent, 75 = sustained craft, 85+ = doing meaning-work), explicit failure modes for the judge itself, and a swap test to prevent scoring outside jurisdiction. Full editor documents: Conceptual · Structural · Prose
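The prose judge's quantitative evidence can be sketched as a small sentence tally. The band edges and opening categories below are illustrative choices, not the production judge's exact rubric.

```python
import re

def sentence_stats(text):
    """Tally sentence length bands and opening types.

    This is the kind of count the prose judge requires before scoring:
    if the data shows monotonous openings, the score reflects it.
    """
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", text.strip())
                 if s.strip()]
    bands = {"short (<8)": 0, "medium (8-20)": 0, "long (>20)": 0}
    openings = {}
    for s in sentences:
        n = len(s.split())
        if n < 8:
            bands["short (<8)"] += 1
        elif n <= 20:
            bands["medium (8-20)"] += 1
        else:
            bands["long (>20)"] += 1
        first = s.split()[0].strip(".,;:!?").lower()
        kind = "pronoun" if first in {"i", "we", "it", "they", "this", "that"} else "other"
        openings[kind] = openings.get(kind, 0) + 1
    return bands, openings
```

Counting before judging is the point: the tally overrides impressions formed on a first read.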
12-category detection system based on frequency-ratio analysis. One flagged phrase is normal English. Three or more in the same passage is diagnostic. Developed through analysis of hundreds of generated pieces across multiple model versions.
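The binary gate can be sketched as a hit counter over a flagged-phrase list. The phrases below are illustrative stand-ins; the production checklist's 12 categories are not reproduced here.

```python
# Illustrative stand-ins only; the real checklist's 12 categories
# and phrase lists are not reproduced here.
FLAGGED = {
    "delve into",
    "tapestry",
    "it's worth noting",
    "in today's fast-paced world",
}

def is_machine_flagged(passage, phrases=FLAGGED, threshold=3):
    """Binary gate: one hit is normal English; `threshold` or more
    hits in the same passage is diagnostic of machine generation."""
    text = passage.lower()
    hits = sum(text.count(p) for p in phrases)
    return hits >= threshold
```

The threshold is what makes the gate usable: any single phrase occurs in ordinary human prose, so only co-occurrence within one passage triggers the flag.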
Seven specific patterns identified through daily operation of ten concurrent voices. Each diagnosed from live output, resolved through prompt architecture or batch configuration changes.
Signature Calcification
The model locks onto favorite phrases and repeats them across pieces. Resolved by adding variety constraints and rotating opener patterns.
Convergent Invention
Independent pieces fabricate identical details — same fictional expert, same invented statistic. Resolved through citation verification pipelines.
Exploitation-Default Drift
One analytical lens crowds out all others over repeated runs. Resolved by strengthening secondary tensions in voice guides and varying subjects.
Cross-Contamination
In multi-voice batches, characteristics bleed between voices. Resolved through batch isolation and staggered generation timing.
Structural Pivot Dependency
The “It’s not X, it’s Y” reframe is the most persistent structural crutch. Dressing pivots in more complex language doesn’t fix the problem — if the output relies on 9–12 pivots per piece, the essay lacks something to say. Each pivot looks sharp in isolation; the pattern is diagnostic at scale. Consistently the most challenging check to pass.
Metaphor Homogeneity
Over repeated runs, the model collapses to a narrow set of structural metaphors — particularly “walls,” “load-bearing,” and “angles.” Resolved by seeding a default metaphor domain per voice (plants, weather, music, etc.) to keep figurative language distinct across the roster.
Contextless Superlatives
Essays default to sweeping evaluative claims — “the best in the history of television” — that sound human individually but become repetitive across pieces. The model reaches for superlatives as a compression shortcut instead of building the specific case.
These pieces were generated by the overnight pipeline, reviewed by the four-pass editorial system, and published without manual editing.
Auto-generated essay with inline footnotes. The citation system requires 50+ sources per piece, recorded during research per Stage 1 of the verification process.
Footnote apparatus from the same essay. Each citation verified by blind summary (Stage 2) and human spreadsheet review (Stage 3).
Essays about the production system, written to explain the methodology to a general audience.
Three eras: political campaign data infrastructure, nonprofit database management, AI production systems. The through-line is making complex technical systems work for people who are not engineers.
Harvard University, BA Government and Economics, 2004.
Publication: T. Vila, R. Greenstadt, D. Molnar. “Why we can’t be bothered to read privacy policies.” WEIS ’03, 2003.