Native Job Model — Cutover Runbook (Phases 8, 9, 10)
Date: 2026-04-25
Owner: Nexa Systems
Status: Draft. Verify each step on entech-clone before live cutover.
Predecessor: Phases 1–7 complete (commits up to current HEAD on
feat/fp-native-job-model). Spec:
docs/superpowers/specs/2026-04-25-fp-native-job-model-design.md. Plan:
docs/superpowers/plans/2026-04-25-fp-native-job-model.md.
This runbook covers the operational phases of the migration:
- Phase 8 — End-to-end testing on a clone of entech (~5 days)
- Phase 9 — Live cutover weekend (4 hour window)
- Phase 10 — 2-week burn-in with rollback safety net
Phase 8 — E2E testing on entech-clone (5 days)
8.1 Prepare the clone
- Snapshot live entech: pct snapshot 111 pre_fp_jobs_clone on pve-worker5.
- Spin up a sibling LXC (e.g. entech-clone at LXC 511 / pve-worker5).
  - Restore from the snapshot
  - Configure new IP: 10.200.1.27 (so it doesn't compete with live entech 10.200.1.26)
  - Update odoo.conf to a separate database name e.g. admin_clone
- Update Tailscale: add entech-clone to your Tailscale ACL so SSH works.
- Verify clone independence: any DB writes on entech-clone must NOT bleed to live entech. Different DB name, different IP.
8.2 Pre-migration audit
Run on entech-clone:
ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/audit_pre_migration.py"
Expected output: counts of MOs, WOs, dependent records, data quality flags.
Capture the baseline numbers in phase8_baseline.txt for diffing later.
8.3 Run migration
ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/migrate_to_fp_jobs.py"
Watch for errors in the output. Audit log at /tmp/fp_jobs_migration.log.
8.4 Post-migration audit
ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/audit_post_migration.py"
Verify:
- fp.job count == mrp.production count (every MO has a mirror)
- fp.job.step count == mrp.workorder count
- Dependent x_fc_*_id counts match production_id / workorder_id counts
If any mismatch, dig into the audit log for errors.
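The count checks above lend themselves to a scripted comparison. The following is a minimal sketch, not the project's audit tooling: it assumes (hypothetically) that the audit output has been saved as plain text with one `name: count` pair per line, which may not match what audit_post_migration.py actually prints. The helper names `parse_counts` and `find_mismatches` are illustrative.

```python
def parse_counts(text):
    """Parse 'name: count' lines into a dict; any other line is ignored."""
    counts = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counts[name.strip()] = int(value.strip())
    return counts


# Pairs that must be equal after migration (taken from the Verify list).
EXPECTED_PAIRS = [
    ("fp.job", "mrp.production"),
    ("fp.job.step", "mrp.workorder"),
]


def find_mismatches(counts, pairs=EXPECTED_PAIRS):
    """Return (native, legacy, native_count, legacy_count) for each failing pair.

    A pair fails if the counts differ or if either count is missing.
    """
    mismatches = []
    for native, legacy in pairs:
        n, m = counts.get(native), counts.get(legacy)
        if n is None or m is None or n != m:
            mismatches.append((native, legacy, n, m))
    return mismatches
```

An empty list from `find_mismatches` means the two headline counts line up; the dependent x_fc_*_id checks would need their own pairs added to `EXPECTED_PAIRS`.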
8.5 Smoke test the new flow
Manual on the clone via browser:
- Toggle x_fc_use_native_jobs=True in Settings → Fusion Plating Jobs.
- Create a new SO with a plating line.
- Confirm the SO. Verify a WH/JOB/... record appears in Plating Jobs (new) menu.
- Verify the recipe steps generated correctly.
- Open a step, click Start, then Finish. Verify timelog row, duration_actual, cost_total all populate.
- Print the new Job Sticker (6×4"). Verify QR scans to /fp/job/<id> and redirects to the form.
- Print the Job Traveller. Verify all steps listed.
- Click Mark Done on the job. Verify state=done, draft delivery created, draft cert created (best-effort).
8.6 Replay 30 days of activity
Identify the last 30 days of MO activity on entech (pre-clone) and replay those operator actions through the new flow on the clone. Look for:
- Operations that succeeded on the legacy flow but error on native
- Reports that render differently
- Cost / margin numbers that differ between legacy and native
Diff certificates byte-for-byte: render 100 random CoC PDFs from the legacy MO and again from its migrated native job. Each pair should be identical. Any difference is an audit-grade red flag (Nadcap / aerospace).
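One way to run the certificate comparison is to hash each rendered file and compare digests. This is a sketch under assumptions: it supposes the legacy and native renders have been saved into two directories under the same file names, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def diff_rendered(legacy_dir, native_dir):
    """Compare same-named PDFs in two directories.

    Returns (differing, missing): names whose bytes differ, and names
    present in legacy_dir but absent from native_dir.
    """
    differing, missing = [], []
    for legacy_pdf in sorted(Path(legacy_dir).glob("*.pdf")):
        native_pdf = Path(native_dir) / legacy_pdf.name
        if not native_pdf.exists():
            missing.append(legacy_pdf.name)
        elif sha256_of(legacy_pdf) != sha256_of(native_pdf):
            differing.append(legacy_pdf.name)
    return differing, missing
```

PDF renderers may embed creation timestamps or document IDs, so a byte mismatch does not by itself prove a visible difference. Treat each name in `differing` as a trigger for a side-by-side visual review.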
8.7 Performance baseline
Measure on the clone:
- Plant Overview load time with N active steps (grouped by work_centre)
- Job form open time with 50-step recipe
- Job traveller PDF render time
- Job sticker PDF render time
- Migration script runtime (target: < 30 min on entech-scale data)
If anything is significantly slower than the legacy MO/WO flow, investigate indexes (M2M tables, related stores) before cutover.
8.8 Rollback test
On the clone, simulate a rollback:
- Restore the pre-cutover snapshot.
- Verify legacy MO/WO data is intact.
- Verify the fusion_plating_jobs module is still installed but inert (flag is False).
- Verify nothing in bridge_mrp / fusion_plating_reports / shopfloor / notifications regressed.
Rollback safety is the most important thing to prove before live cutover.
8.9 Sign-off criteria
Before scheduling Phase 9:
- All Phase 1+2 tests pass (50+ tests)
- Migration script runs cleanly on clone with 0 errors in audit log
- Pre/post audit counts match
- 100 sample CoCs byte-identical
- All performance baselines within 20% of legacy
- Rollback test successful
If any item fails, identify the gap, fix in feat/fp-native-job-model, and
re-run §§ 8.2–8.8.
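The performance criterion in the sign-off list can be checked mechanically once timings are recorded. The sketch below assumes timings are collected by hand into two dicts of metric name to seconds, and it reads "within 20% of legacy" as "no more than 20% slower than legacy"; both are assumptions, and the function name is hypothetical.

```python
def regressions(legacy_s, native_s, tolerance=0.20):
    """Return metrics whose native timing exceeds legacy by more than tolerance.

    Both arguments map metric name -> seconds. A metric missing from
    native_s is reported with None so it cannot pass silently.
    """
    failed = {}
    for name, legacy in legacy_s.items():
        native = native_s.get(name)
        if native is None or native > legacy * (1 + tolerance):
            failed[name] = native
    return failed
```

An empty dict means every measured metric meets the threshold. Faster-than-legacy results always pass under this reading.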
Phase 9 — Cutover weekend (1 calendar day, ~4 hours active work)
9.1 Pre-cutover communication (T-7 days)
- Email entech operators: "Friday MM/DD evening: ~4 hours offline for system upgrade. Monday morning normal."
- Brief 2-3 plating managers on the new menu and the demo path.
- Confirm Saturday on-site presence: 1 manager + 1 tech (you).
9.2 Friday 6pm — stop new work
- Operators wrap up active jobs. No new SO confirms. No new WOs started.
- Verify no in_progress WOs left running. Pause any timers.
9.3 Friday 8pm — backup
# Full DB dump
ssh pve-worker5 "pct exec 111 -- bash -c 'su - postgres -c \"pg_dump admin\" > /var/backups/admin_pre_fp_jobs_$(date +%Y%m%d).sql'"
# Filesystem snapshot
ssh pve-worker5 "pct snapshot 111 pre_fp_jobs_cutover"
Tag the current commit:
cd /Users/gurpreet/Github/Odoo-Modules
git tag -a pre-cutover-$(date +%Y%m%d) -m "Pre-cutover backup point"
git push origin pre-cutover-$(date +%Y%m%d)
9.4 Friday 9pm — deploy + migrate
- Deploy the latest fusion_plating_jobs to entech (it should already be installed from Phase 7 development; just refresh).
# Sync feat/fp-native-job-model branch state to entech if not already
# (skip if entech is already on this branch)
- Update the module:
ssh pve-worker5 "pct exec 111 -- bash -c 'systemctl stop odoo && su - odoo -s /bin/bash -c \"/usr/bin/odoo -c /etc/odoo/odoo.conf -d admin -u fusion_plating_jobs --stop-after-init\" && systemctl start odoo'"
- Run the migration:
ssh pve-worker5 "pct exec 111 -- bash -c 'systemctl stop odoo && su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/migrate_to_fp_jobs.py"
- Verify with the post-audit script.
- Toggle the cutover flag:
# Via odoo shell:
env['ir.config_parameter'].sudo().set_param('fusion_plating_jobs.use_native_jobs', 'True')
env.cr.commit()
- Restart Odoo.
9.5 Friday 10pm — smoke test
Same as §8.5 but on live entech. If anything fails, restore backup (§9.7) and abort.
9.6 Saturday/Sunday — buffer
Shop is offline weekends. Use the time to:
- Fix anything that surfaced during smoke test
- Run additional spot checks on historical jobs
- Verify that print menus default to the new reports for new jobs
- Test sticker scans on a phone
9.7 Rollback procedure (if needed by Sunday evening)
If unrecoverable issues:
# Stop Odoo
ssh pve-worker5 "pct exec 111 -- systemctl stop odoo"
# Restore DB
ssh pve-worker5 "pct exec 111 -- bash -c 'su - postgres -c \"dropdb admin && createdb admin && psql admin < /var/backups/admin_pre_fp_jobs_<date>.sql\"'"
# Or restore container snapshot (faster, but loses any post-snapshot DB writes)
ssh pve-worker5 "pct rollback 111 pre_fp_jobs_cutover"
# Start Odoo
ssh pve-worker5 "pct exec 111 -- systemctl start odoo"
# Communicate to operators that we're back on the legacy flow
After day 7, rollback becomes "forward fix only" — too much new shop activity to restore.
9.8 Monday 7am — operators back on
- 1 manager + 1 tech on site for the first 2 hours
- Walk operators through the new menu (Plating Jobs (new) → Jobs)
- Watch for confusion or errors
- Field tickets as they come in
Phase 10 — Burn-in (2 weeks calendar, ~1 day active work)
10.1 Daily monitoring (Days 1–14)
Check daily:
- Odoo error log: tail -f /var/log/odoo/odoo-server.log | grep -i error
- Job creation rate: SELECT COUNT(*) FROM fp_job WHERE create_date > now() - interval '1 day'
- Step creation rate: SELECT COUNT(*) FROM fp_job_step WHERE create_date > now() - interval '1 day'
- Failed lifecycle hooks: grep -c "failed to" /var/log/odoo/odoo-server.log
- Operator support tickets
Run audit_post_migration.py weekly to catch any drift.
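The two log checks in the daily list could be combined into one pass over the log. This is a rough sketch rather than an existing script: the pattern names and the `count_log_matches` helper are hypothetical, and the patterns only mirror the two grep commands above.

```python
import re

PATTERNS = {
    "errors": re.compile(r"error", re.IGNORECASE),  # mirrors: grep -i error
    "failed_hooks": re.compile(r"failed to"),       # mirrors: grep -c "failed to"
}


def count_log_matches(lines, patterns=PATTERNS):
    """Count lines matching each pattern, like one grep -c per pattern."""
    counts = {name: 0 for name in patterns}
    for line in lines:
        for name, pattern in patterns.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, open /var/log/odoo/odoo-server.log and pass the file object as `lines`. Recording the daily result makes a jump in either count easy to spot across the 14 days.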
10.2 Forward-fix
Anything that surfaces during burn-in goes through the standard PR/review
workflow on feat/fp-native-job-model (or a new follow-up branch). The
underlying data layer is locked — fixes are mostly UI/report polish.
10.3 Day 14 — drop legacy snapshots
After 14 days of stable operation:
# Drop the pre-cutover snapshot
ssh pve-worker5 "pct delsnapshot 111 pre_fp_jobs_cutover"
# Optional: archive the SQL backup off-site
mv /var/backups/admin_pre_fp_jobs_*.sql /off-site/long-term-archive/
10.4 Bridge_mrp deprecation
fusion_plating_bridge_mrp is still installed and inert (the SO confirm
hook only fires when x_fc_use_native_jobs=False, which it never is post-
cutover). Options for full deprecation:
A) Leave it installed forever. Zero impact.
B) Archive (set installable=False in its manifest, so a future re-install
wouldn't activate it).
C) Uninstall (write an uninstall hook that drops the bridge tables but
preserves the data already migrated to fp.job).
Recommend (A) for the first 6 months, then revisit.
10.5 Phase-end polish
The list of deferred Minor items from Phase 1-7 reviews:
- currency_id required=True on fp.work.centre and fp.job (and ondelete policies on M2Os uniformly across both core and jobs)
- tracking=True on fp.job.manager_id, facility_id
- digits='Product Unit of Measure' on qty
- _('New') translation safety in create
- Field labels: "Reference Product" → cleaner string
- Recipe boolean tests on fp.job.step
- index=True on M2Os queried frequently (recipe_id, partner_id)
- Author/website/maintainer block in fusion_plating_jobs manifest
- i18n wrapping (_()) on user-visible strings
- _compute_state_ready for fp.job.step pending → ready transition (Task 1.5 TODO)
- button_pause / button_skip / button_cancel real implementations
- Operator UI rewrite (Plant Overview, Tablet Station, Manager Dashboard, Process Tree OWL component), a Phase 6 deferral
These can be batched into one polish PR after burn-in completes (Day 14+).
Appendix A — Communication templates
Email to operators (T-7)
Subject: System maintenance Friday evening (~4 hours)
Team, we're upgrading the Fusion Plating Jobs system from 9pm Friday MM/DD through Saturday morning. The shop will be offline during that window. By Monday 7am everything will be normal except you'll see a new "Plating Jobs (new)" menu in addition to the existing menus. Same data, better workflow. Manager + tech will be on site Monday morning to help.
No action needed from you. Just don't start any new jobs after 6pm Friday.
Questions? Reply or ping the manager.
Manager briefing (T-3)
Walk through:
- The new menu structure
- The settings flag and how to toggle it
- The migration script and rollback procedure
- What to do if an operator reports a bug Monday morning
Appendix B — Open decisions for the user before Phase 9
Schedule the cutover weekend with at least 4 weeks' notice. Confirm:
- Date of cutover weekend
- Which manager will be on-site Monday morning
- Whether to keep the legacy menus visible after cutover (recommend: yes, for the first 14 days, then hide via group permission)
- Whether to send the operator email template above as-is or customize
- Acceptance criteria for "burn-in complete" (recommend: 14 days zero critical errors, zero operator support tickets that map to migration issues)
Appendix C — File checklist before Phase 8 starts
Verify these are present (committed to feat/fp-native-job-model):
- fusion_plating_jobs/__manifest__.py: version >= 19.0.2.0.0, depends on 9 modules
- fusion_plating_jobs/models/fp_job.py: _inherit with all extension fields, hooks, helpers, legacy_id
- fusion_plating_jobs/models/fp_job_node_override.py: override model
- fusion_plating_jobs/models/sale_order.py: SO confirm hook
- fusion_plating_jobs/models/res_config_settings.py: flag
- fusion_plating_jobs/models/fp_portal_job.py: x_fc_job_id link
- fusion_plating_jobs/models/fp_batch.py: x_fc_step_id / x_fc_job_id
- fusion_plating_jobs/models/fp_quality_hold.py: x_fc_job_id / x_fc_step_id
- fusion_plating_jobs/models/fp_certificate.py: x_fc_job_id
- fusion_plating_jobs/models/fp_thickness_reading.py: x_fc_job_id / x_fc_step_id
- fusion_plating_jobs/models/fp_delivery.py: x_fc_job_id
- fusion_plating_jobs/models/fp_racking_inspection.py: x_fc_job_id
- fusion_plating_jobs/models/account_move.py: invoice → job hook
- fusion_plating_jobs/models/fp_notification_trigger.py: job_confirmed/job_complete events
- fusion_plating_jobs/models/fusion_plating_kpi_value.py: x_fc_source tag
- fusion_plating_jobs/views/res_config_settings_views.xml: settings UI
- fusion_plating_jobs/report/report_fp_job_sticker.xml: sticker
- fusion_plating_jobs/report/report_fp_job_traveller.xml: traveller
- fusion_plating_jobs/controllers/job_scan.py: /fp/job/
- fusion_plating_jobs/controllers/process_tree.py: /fp/jobs/process_tree
- fusion_plating_jobs/scripts/audit_pre_migration.py
- fusion_plating_jobs/scripts/migrate_to_fp_jobs.py
- fusion_plating_jobs/scripts/audit_post_migration.py
- fusion_plating_jobs/scripts/README.md
- fusion_plating_jobs/README.md: Phase 6 deferrals doc
- fusion_plating_jobs/security/ir.model.access.csv: ACL rows
- fusion_plating_jobs/tests/test_fp_job_extensions.py: comprehensive test suite
If anything in this list is missing, fix before Phase 8.
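The presence check for the list above can be scripted. The following is a minimal sketch with a hypothetical helper, `missing_files`, and only a handful of the checklist paths filled in. It confirms the files exist in a checkout, not that they are committed or that their contents match the notes beside each entry.

```python
from pathlib import Path

# A few entries from the checklist; extend with the remaining paths.
REQUIRED = [
    "fusion_plating_jobs/__manifest__.py",
    "fusion_plating_jobs/models/fp_job.py",
    "fusion_plating_jobs/scripts/audit_pre_migration.py",
    "fusion_plating_jobs/scripts/migrate_to_fp_jobs.py",
    "fusion_plating_jobs/scripts/audit_post_migration.py",
    "fusion_plating_jobs/security/ir.model.access.csv",
]


def missing_files(repo_root, required=REQUIRED):
    """Return the required paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).is_file()]
```

Run it against a clean checkout of feat/fp-native-job-model so that untracked local files do not mask a missing commit.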