Firmwatch package.json conflict research — 2026-04-25

Trigger: Paul 2026-04-25 10:45 PDT — "can we please build in a cleaner method for handling all of these package.json conflicts? Am I wrong or is it like the most offending conflicting file? ... I'm looking for a more robust solution which will enable more concurrency not limit or hinder it."

Pass 1: Verify the hypothesis

Paul's right. Hard data from the last day on dev:

Rank | File                                  | Commits touching it (last 24h)
1    | package.json                          | 34
2    | functions/api/chat/tool-schemas.ts    | 7
3    | wrangler.toml                         | 6
3    | src/components/ChatDrawer.jsx         | 6
3    | functions/api/chat/tools-retrieval.ts | 6
3    | functions/api/chat/tools-render.ts    | 6
3    | functions/api/chat/system-prompt.ts   | 6

package.json is touched roughly 5x as often as the next file (34 commits vs. 7). And the conflict is concentrated on a single field:


scripts.test = "<7,674-char string of 150 commands && && && ...>"

Every PR that adds a test command appends to this single line. Two parallel PRs both modify the same line of the same file → guaranteed conflict on every overlap.
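
A ranking like the one above can be reproduced from git history. This is a minimal sketch, assuming the log text was captured with something like `git log --since="24 hours ago" --name-only --pretty=format: dev`; the exact command behind the table is not recorded in this brief, and `countTouches` is a hypothetical helper name.

```javascript
// Count how many commits touched each file, given the text output of
// `git log --name-only --pretty=format:` (one path per line, blank lines between commits).
// Assumes each path appears at most once per commit, which is what --name-only emits.
function countTouches(logText) {
  const counts = new Map();
  for (const line of logText.split("\n")) {
    const path = line.trim();
    if (path === "") continue; // separator between commits
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  // Sort by count descending, then by path so ties come out in a stable order.
  return [...counts.entries()]
    .map(([file, commits]) => ({ file, commits }))
    .sort((a, b) => b.commits - a.commits || a.file.localeCompare(b.file));
}

// Illustrative input only, not the real Firmwatch history.
const sampleLog = [
  "package.json",
  "wrangler.toml",
  "",
  "package.json",
  "",
  "package.json",
  "src/components/ChatDrawer.jsx",
].join("\n");

console.log(countTouches(sampleLog));
```

Re-running this weekly would show whether package.json stays the top offender after any of the options below ship.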

Pass 2: Why the current architecture creates this

The &&-chain is a degenerate one-line append target. Three structural problems compound:

1. No DAG / parallelism: even when CI runs, every test runs sequentially via &&. We don't even get the speedup from running tests in parallel; we just get the conflict cost.
2. No discovery: every new test must be manually added to the chain. This is what makes parallel PRs collide.
3. No package boundaries: Firmwatch is single-package (firmwatch-build); there's no apps/foo/package.json we could move tests into.

Consensus from the research (Vercel, Stripe, Linear, Google/Bazel, Meta/Buck, Shopify, Strapi): the "monolithic test chain" pattern doesn't scale. The teams surveyed use discovery-based test runners + DAG orchestration + per-package boundaries.

Pass 3: Solution spectrum (5 options, ranked by surgery)

I'm giving you the full spectrum from "5-min hack" to "real architectural fix" so you can pick where you want to land.

Option 1 — JSON merge driver only (5 min, ~50% conflict reduction)

What: register a git merge driver for package.json that runs jq (or a small Node script) to do a semantic union merge on the scripts object. Keys added or changed on only one side, and keys changed identically on both sides, merge cleanly; only true conflicts (same key, different value) need human resolution.


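# .gitattributes
package.json merge=json-union

# .git/config (per-clone or via a script)
# %O = common ancestor, %A = our version (the driver writes the result here), %B = their version, %P = path
[merge "json-union"]
  driver = node scripts/json-union-merge.js %O %A %B %P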

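Pros: surgical, no new deps, no test-runner change, deploys in an hour, ~50% of current conflicts go away (the append-to-test-chain ones become auto-merged). Note that this requires the driver to split scripts.test on && and union the command lists: to a key-level merge, two appends to scripts.test are the same key with different values.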

Cons: still a 7,674-char one-liner. Still slow (sequential tests). Doesn't help when two PRs add the SAME script key (rare but possible).
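
The core of such a driver can be sketched as a three-way merge over one flat object. This is a minimal sketch under stated assumptions, not the contents of scripts/json-union-merge.js (which does not exist yet): `unionMerge` is a hypothetical name, values are assumed to be strings as in the scripts map, and only the key-level rule is shown.

```javascript
// Conservative three-way merge of one flat JSON object (e.g. the "scripts" map).
// base = common ancestor (%O), ours = current branch (%A), theirs = other branch (%B).
// A key is auto-resolved only when at most one side changed it, or both sides made
// the identical change. Anything else is reported as a conflict and left out.
function unionMerge(base, ours, theirs) {
  // Read own properties only, so keys like "constructor" are not picked up from the prototype.
  const get = (obj, key) =>
    Object.prototype.hasOwnProperty.call(obj, key) ? obj[key] : undefined;
  const merged = {};
  const conflicts = [];
  const keys = new Set([...Object.keys(base), ...Object.keys(ours), ...Object.keys(theirs)]);
  for (const key of keys) {
    const b = get(base, key);
    const o = get(ours, key);
    const t = get(theirs, key);
    let value;
    if (o === t) value = o;        // both sides agree (including both deleted)
    else if (o === b) value = t;   // only theirs changed
    else if (t === b) value = o;   // only ours changed
    else {
      conflicts.push(key);         // same key, different values: needs a human
      continue;
    }
    if (value !== undefined) merged[key] = value;
  }
  return { merged, conflicts };
}

// Illustrative script names only.
const baseScripts = { build: "vite build", test: "node a.js" };
const clean = unionMerge(
  baseScripts,
  { ...baseScripts, smoke: "node smoke.js" },
  { ...baseScripts, canary: "node canary.js" }
);
const chained = unionMerge(
  baseScripts,
  { ...baseScripts, test: "node a.js && node b.js" },
  { ...baseScripts, test: "node a.js && node c.js" }
);
console.log(clean.conflicts, chained.conflicts);
```

In a real driver, the script would read the three files Git passes in, write the merged JSON back to the %A path, and exit non-zero whenever conflicts is non-empty so Git marks the file as conflicted. The second example shows why the &&-chain needs extra handling: both sides changed the same key, so the key-level rule alone reports it as a conflict.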

Net: the right immediate fix. We were going to do this anyway via FIRM-261.

Option 2 — Move test discovery to a config file (1 day, ~80% conflict reduction)

What: stop appending to scripts.test. Replace with a one-liner:


"test": "node scripts/run-all-tests.mjs"

Then run-all-tests.mjs glob-discovers tests from filesystem patterns: scripts/test_*.js, scripts/test_*.mjs, tests/**/*.test.{ts,jsx}, etc.

New tests just need a new file with the right name pattern. No package.json edit.
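
The discovery step could look roughly like this. It is a sketch, not the real run-all-tests.mjs: it reads the patterns as scripts/test_*.js, scripts/test_*.mjs and tests/**/*.test.{ts,jsx}, works on a list of repo-relative paths rather than walking the filesystem, and `discoverTests` / `commandFor` are hypothetical names. The extension-to-runner mapping is also an assumption.

```javascript
// Discovery patterns expressed as regular expressions over repo-relative paths.
// Assumption: the tests/ pattern matches at any depth under tests/.
const TEST_PATTERNS = [
  /^scripts\/test_[^/]*\.js$/,
  /^scripts\/test_[^/]*\.mjs$/,
  /^tests\/(?:.*\/)?[^/]+\.test\.(?:ts|jsx)$/,
];

// Filter a list of paths down to test files, in a deterministic order.
function discoverTests(paths) {
  return paths
    .filter((p) => TEST_PATTERNS.some((re) => re.test(p)))
    .sort(); // plain code-unit sort, so the order is identical on every machine
}

// Hypothetical mapping from file extension to the command that runs it.
function commandFor(file) {
  if (file.endsWith(".ts") || file.endsWith(".jsx")) return ["npx", "tsx", file];
  return ["node", file];
}

// Illustrative paths only.
const repoPaths = [
  "scripts/test_render.js",
  "scripts/build.js",
  "tests/chat/tools.test.ts",
  "scripts/test_api.mjs",
  "src/components/ChatDrawer.jsx",
  "tests/drawer.test.jsx",
];

for (const file of discoverTests(repoPaths)) {
  console.log(commandFor(file).join(" "));
}
```

A full runner would collect the paths with a recursive directory walk and spawn each command, failing the run if any exits non-zero. Because nothing here is listed by hand, adding a test never touches package.json.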

Pros: package.json scripts.test becomes static. Conflicts on it disappear almost entirely. Most other scripts.* entries (smoke, canary, render-tool-specific) are also rarely-touched. Pure mechanical change.

Cons: requires teaching the orch's spec authoring step + Strata's "always-add-test-cmd" pattern to NOT append to package.json — instead, just create the test file with the right name. Ticket-template change. Also: discovery order may differ from current explicit ordering (test_* alphabetical instead of "added in this order"). Probably fine, but verify.

Net: the highest-ROI fix. Most conflict pain goes away with one config refactor + a docs update.

Option 3 — Vitest workspaces with auto-discovery (3-4 days)

What: adopt vitest as the test runner. Configure workspace globs in vitest.config.ts. Tests discovered by file-path patterns, no manual chain.

Pros: real parallelism (vitest runs tests concurrently natively), watch mode, snapshot support, modern test ergonomics. Plus the auto-discovery benefit of Option 2.

Cons: ~120 of our 150 tests are written as plain Node scripts (node scripts/test_X.js style), some npx tsx, some python3. Migrating them to vitest's expect/describe/it API is real work: at ~3 hours per 10 tests, that is ~45 hours if all 150 were migrated (~36 hours for the ~120 Node-script tests alone). Alternatively, we keep the Node-script tests on a parallel track, use vitest only for new tests, and migrate gradually. Hybrid is probably the right path.

Cons cont'd: does NOT solve the multi-language issue (we have python3 + tsx + node tests). Vitest is JS/TS only.

Net: the right v2 direction but probably not the immediate fix. Migrate gradually.

Option 4 — Turborepo task graph (1 week + adopt monorepo workspace structure)

What: declare turbo.json with a pipeline.test task (the key is named tasks in Turborepo 2.x). Turborepo runs tests in parallel based on dependency-graph + caching. Per-package package.json files in packages/foo/, apps/firmwatch-pages/, etc. Root package.json becomes the orchestrator.

Pros: what Vercel themselves use. Native CI caching (re-running unchanged tests is a no-op). Per-package conflict isolation — adding a test to packages/scrapers/package.json doesn't conflict with adding one to packages/web/package.json. Real parallelism.

Cons: requires restructuring the repo into packages — splitting functions/, src/, scripts/, e2e/ along package boundaries. That's a real refactor (~3-5 days of careful work). Risk of breaking deploys, smoke tests, the orchestrator, etc.

Net: the right v3 direction once we feel like the project's complexity warrants monorepo discipline. Today probably premature.

Option 5 — GitHub merge queue (already filed in spec 037 §8 as v2 Decision Memo)

What: GitHub native merge queue serializes PR merges + auto-rebases each PR onto the latest dev tip + runs CI before merging. Eliminates the "auto-merge enabled but stuck on conflict" class entirely because each PR is rebased before being attempted.

Pros: GitHub-native, no orch changes, eliminates the auto-merge race wholesale.

Cons: forced sequencing — ironically reduces concurrency at the merge step (only one PR merges at a time per queue). Adds 2-5 min latency per PR. Plus our PRs come from non-trunk branches and many touch package.json — the queue would still hit auto-rebase failures and we'd need a rebase strategy underneath it.

Net: would mostly eliminate Class 1 (auto-merge race) but doesn't solve the underlying package.json contention. Best paired with Option 1 or 2.

Recommendation matrix

Goal                        | Build                                                     | Cost       | Risk        | Reduces conflicts by
Fastest visible improvement | Option 1 (JSON merge driver)                              | 1-2 hours  | Low         | ~50%
Best ROI                    | Option 1 + Option 2 (driver + discovery)                  | 1 day      | Low         | ~85-90%
Long-term scalability       | Option 1 + Option 2 + gradual Option 3 (vitest migration) | Multi-week | Medium      | ~95%+
Architectural completeness  | All of the above + Option 4 (turborepo)                   | 1-2 weeks  | Medium-high | ~99%

My pick

Build Option 1 + Option 2 NOW. Skip Option 3/4 until they earn their way in.

Reasoning:

Spec 037 (filed 30 min ago) handles the symptom (auto-merge race + zombie coordinators); this brief proposes the root-cause fix for what's CAUSING the high conflict rate.

Honest case-against-building

We've been bitten by "too clever" merge resolution before: yesterday's Claude Code run hallucinated a "deep API incompatibility" between two near-identical FeedbackCard files (FIRM-526). The risk with Option 1 is that the JSON merge driver gets a real conflict wrong and silently picks the wrong branch.

Mitigation: the driver should be conservative — it auto-resolves ONLY when both sides added different keys to the same object. If both sides modified the SAME key with different values, it falls through to standard conflict markers and humans handle it.

For Option 2, the risk is that some test was relying on its position in the chain (e.g., setup test must run first). Need a one-shot audit. For our current tests, there's no inter-test ordering dependency I'm aware of — everything is independent.
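
One way to run that audit is to execute the discovered tests in a few shuffled orders and check whether anything fails only in some of them. A minimal sketch of a reproducible shuffle follows; `seededShuffle` is a hypothetical helper, and the idea is to log the seed next to any failure so the exact order can be replayed.

```javascript
// Deterministic Fisher-Yates shuffle driven by a small linear congruential generator.
// The same seed and input always produce the same order, so a failing order is replayable.
function seededShuffle(items, seed) {
  const out = items.slice(); // never mutate the caller's array
  let state = seed >>> 0;
  const next = () => {
    // 32-bit LCG step; Math.imul keeps the multiplication in integer range.
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // float in [0, 1)
  };
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(next() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Illustrative test names only.
const auditTests = ["test_a.js", "test_b.js", "test_c.js", "test_d.js", "test_e.js"];
console.log(seededShuffle(auditTests, 1));
console.log(seededShuffle(auditTests, 2));
```

If every test passes under several seeds, that is reasonable evidence that nothing depends on its position in the old chain; a test that fails under only some seeds is the one to look at.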

Recommendation: file as Strata Standard brief

If you say go, this becomes spec 038. Tickets:

5-6 tickets, 1-2 layers, all sonnet, ~4-6h wall-clock.

Want me to file?

Generated 2026-04-25 by Morty