diff --git a/fusion_plating/docs/superpowers/specs/2026-04-25-fp-native-job-cutover-runbook.md b/fusion_plating/docs/superpowers/specs/2026-04-25-fp-native-job-cutover-runbook.md
new file mode 100644
index 00000000..95ab7791
--- /dev/null
+++ b/fusion_plating/docs/superpowers/specs/2026-04-25-fp-native-job-cutover-runbook.md
@@ -0,0 +1,385 @@
+# Native Job Model — Cutover Runbook (Phases 8, 9, 10)
+
+**Date:** 2026-04-25
+**Owner:** Nexa Systems
+**Status:** Draft. Verify each step on entech-clone before live cutover.
+**Predecessor:** Phases 1–7 complete (commits up to current HEAD on
+`feat/fp-native-job-model`). Spec:
+`docs/superpowers/specs/2026-04-25-fp-native-job-model-design.md`. Plan:
+`docs/superpowers/plans/2026-04-25-fp-native-job-model.md`.
+
+This runbook covers the operational phases of the migration:
+
+- **Phase 8** — End-to-end testing on a clone of entech (~5 days)
+- **Phase 9** — Live cutover weekend (4 hour window)
+- **Phase 10** — 2-week burn-in with rollback safety net
+
+---
+
+## Phase 8 — E2E testing on entech-clone (5 days)
+
+### 8.1 Prepare the clone
+
+1. **Snapshot live entech:** `pct snapshot 111 pre_fp_jobs_clone` on pve-worker5.
+2. **Spin up a sibling LXC** (e.g. `entech-clone` at LXC 511 / pve-worker5).
+   - Restore from the snapshot
+   - Configure a new IP: 10.200.1.27 (so it doesn't conflict with live entech at 10.200.1.26)
+   - Update `odoo.conf` to use a separate database name, e.g. `admin_clone`
+3. **Update Tailscale:** add `entech-clone` to your Tailscale ACL so SSH works.
+4. **Verify clone independence:** any DB writes on entech-clone must NOT bleed
+   to live entech. Different DB name, different IP.
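The independence check in step 4 of §8.1 could be expressed as a small pre-flight script. This is only a sketch: `OdooTarget` and `independence_problems` are hypothetical names, and the values are the IPs and database names this runbook mentions (`admin` is the live database name used in §9.3). It compares configuration facts only; it does not prove that the clone's PostgreSQL is physically separate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OdooTarget:
    """Connection facts for one Odoo instance (hypothetical helper type)."""
    name: str
    ip: str
    db_name: str


def independence_problems(live: OdooTarget, clone: OdooTarget) -> list:
    """Return human-readable reasons the clone could collide with live."""
    problems = []
    if live.ip == clone.ip:
        problems.append(f"{clone.name} shares IP {clone.ip} with {live.name}")
    if live.db_name == clone.db_name:
        problems.append(
            f"{clone.name} uses the live database name {live.db_name!r}"
        )
    return problems


# Values taken from this runbook; adjust if the clone is set up differently.
LIVE = OdooTarget("entech", "10.200.1.26", "admin")
CLONE = OdooTarget("entech-clone", "10.200.1.27", "admin_clone")

if __name__ == "__main__":
    issues = independence_problems(LIVE, CLONE)
    print("clone looks isolated" if not issues else "\n".join(issues))
```

An empty list means both the IP and the database name differ, which is the minimum the step asks for.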
+
+### 8.2 Pre-migration audit
+
+Run on entech-clone:
+
+```bash
+ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/audit_pre_migration.py"
+```
+
+Expected output: counts of MOs, WOs, dependent records, data quality flags.
+
+**Capture the baseline numbers** in `phase8_baseline.txt` for diffing later.
+
+### 8.3 Run migration
+
+```bash
+ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/migrate_to_fp_jobs.py"
+```
+
+Watch for errors in the output. The audit log is written to `/tmp/fp_jobs_migration.log`.
+
+### 8.4 Post-migration audit
+
+```bash
+ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/audit_post_migration.py"
+```
+
+Verify:
+- `fp.job` count == `mrp.production` count (every MO has a mirror)
+- `fp.job.step` count == `mrp.workorder` count
+- Dependent x_fc_*_id counts match production_id / workorder_id counts
+
+If any count mismatches, check the audit log for errors.
+
+### 8.5 Smoke test the new flow
+
+Run manually on the clone via the browser:
+
+1. Toggle `x_fc_use_native_jobs=True` in Settings → Fusion Plating Jobs.
+2. Create a new SO with a plating line.
+3. Confirm the SO. Verify a `WH/JOB/...` record appears in **Plating Jobs (new)** menu.
+4. Verify the recipe steps generated correctly.
+5. Open a step, click Start, then Finish. Verify that the timelog row,
+   duration_actual, and cost_total all populate.
+6. Print the new Job Sticker (6×4"). Verify QR scans to `/fp/job/` and
+   redirects to the form.
+7. Print the Job Traveller. Verify all steps listed.
+8. Click **Mark Done** on the job. Verify state=done, draft delivery created,
+   draft cert created (best-effort).
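The count checks listed under "Verify" in §8.4 can be sketched as a pure function. The helper name `count_mismatches` and the sample numbers are invented for illustration; the real audit scripts may report counts in a different shape.

```python
# Model pairs from the "Verify" list in 8.4: (native model, legacy model).
MIRROR_PAIRS = [
    ("fp.job", "mrp.production"),
    ("fp.job.step", "mrp.workorder"),
]


def count_mismatches(counts: dict) -> list:
    """Return one message per pair whose record counts differ."""
    messages = []
    for native, legacy in MIRROR_PAIRS:
        if counts[native] != counts[legacy]:
            messages.append(
                f"{native}={counts[native]} but {legacy}={counts[legacy]}"
            )
    return messages


# Made-up numbers for illustration only. In an odoo shell session the values
# could instead come from env[model].search_count([]).
sample = {
    "fp.job": 120, "mrp.production": 120,
    "fp.job.step": 940, "mrp.workorder": 941,
}
print(count_mismatches(sample))
```

An empty result means every legacy record has a native mirror by count; it does not confirm that field values were carried over correctly.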
+
+### 8.6 Replay 30 days of activity
+
+Identify the last 30 days of MO activity on entech (pre-clone) and replay
+those operator actions through the new flow on the clone. Look for:
+- Operations that succeeded on the legacy flow but fail on native
+- Reports that render differently
+- Cost / margin numbers that differ between legacy and native
+
+Diff certificates byte-for-byte: render 100 random CoC PDFs on legacy and on
+the migrated native jobs. Each pair should be identical. Any differences are
+audit-grade red flags (Nadcap / aerospace).
+
+### 8.7 Performance baseline
+
+Measure on the clone:
+- Plant Overview load time with N active steps (grouped by work_centre)
+- Job form open time with 50-step recipe
+- Job traveller PDF render time
+- Job sticker PDF render time
+- Migration script runtime (target: < 30 min on entech-scale data)
+
+If anything is more than 20% slower than the legacy MO/WO flow (the §8.9
+threshold), investigate indexes (M2M tables, related stores) before cutover.
+
+### 8.8 Rollback test
+
+On the clone, simulate a rollback:
+1. Restore the pre-cutover snapshot.
+2. Verify legacy MO/WO data is intact.
+3. Verify the `fusion_plating_jobs` module is still installed but inert
+   (flag is False).
+4. Verify nothing in bridge_mrp / fusion_plating_reports / shopfloor /
+   notifications regressed.
+
+Rollback safety is the most important thing to prove before live cutover.
+
+### 8.9 Sign-off criteria
+
+Before scheduling Phase 9:
+- [ ] All Phase 1+2 tests pass (50+ tests)
+- [ ] Migration script runs cleanly on clone with 0 errors in audit log
+- [ ] Pre/post audit counts match
+- [ ] 100 sample CoCs byte-identical
+- [ ] All performance baselines within 20% of legacy
+- [ ] Rollback test successful
+
+If any item fails, identify the gap, fix in `feat/fp-native-job-model`, and
+re-run §§ 8.2–8.8.
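Two of the sign-off items above lend themselves to automation: the byte-identical CoC comparison (§8.6) and the 20% performance threshold (§8.9). The sketch below assumes the legacy and native PDFs were exported into two directories under matching file names, which is a hypothetical layout. It also reads "within 20%" as "no more than 20% slower", which is an interpretation rather than something the runbook states.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of the file's full contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def differing_certs(legacy_dir: Path, native_dir: Path) -> list:
    """Names of PDFs missing on one side or differing in any byte."""
    legacy = {p.name: p for p in legacy_dir.glob("*.pdf")}
    native = {p.name: p for p in native_dir.glob("*.pdf")}
    bad = set(legacy) ^ set(native)  # present on only one side
    for name in set(legacy) & set(native):
        if sha256_of(legacy[name]) != sha256_of(native[name]):
            bad.add(name)
    return sorted(bad)


def within_baseline(legacy_seconds: float, native_seconds: float,
                    tolerance: float = 0.20) -> bool:
    """True when native is at most `tolerance` slower than legacy."""
    return native_seconds <= legacy_seconds * (1 + tolerance)
```

PDF renderers may embed creation timestamps or other metadata, so a byte-level mismatch is a prompt for a visual comparison, not necessarily proof of a content difference.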
+
+---
+
+## Phase 9 — Cutover weekend (1 calendar day, ~4 hours active work)
+
+### 9.1 Pre-cutover communication (T-7 days)
+
+- Email entech operators: "Saturday MM/DD evening: ~4 hours offline for
+  system upgrade. Sunday morning normal."
+- Brief 2-3 plating managers on the new menu and the demo path.
+- Confirm Saturday on-site presence: 1 manager + 1 tech (you).
+
+### 9.2 Friday 6pm — stop new work
+
+- Operators wrap up active jobs. No new SO confirms. No new WOs started.
+- Verify no in_progress WOs are left running. Pause any timers.
+
+### 9.3 Friday 8pm — backup
+
+```bash
+# Full DB dump
+ssh pve-worker5 "pct exec 111 -- bash -c 'su - postgres -c \"pg_dump admin\" > /var/backups/admin_pre_fp_jobs_$(date +%Y%m%d).sql'"
+
+# Filesystem snapshot
+ssh pve-worker5 "pct snapshot 111 pre_fp_jobs_cutover"
+```
+
+Tag the current commit:
+
+```bash
+cd /Users/gurpreet/Github/Odoo-Modules
+git tag -a pre-cutover-$(date +%Y%m%d) -m "Pre-cutover backup point"
+git push origin pre-cutover-$(date +%Y%m%d)
+```
+
+### 9.4 Friday 9pm — deploy + migrate
+
+1. Deploy the latest `fusion_plating_jobs` to entech (it should already be
+   installed from Phase 7 development; just refresh).
+
+```bash
+# Sync feat/fp-native-job-model branch state to entech if not already
+# (skip if entech is already on this branch)
+```
+
+2. Update the module:
+
+```bash
+ssh pve-worker5 "pct exec 111 -- bash -c 'systemctl stop odoo && su - odoo -s /bin/bash -c \"/usr/bin/odoo -c /etc/odoo/odoo.conf -d admin -u fusion_plating_jobs --stop-after-init\" && systemctl start odoo'"
+```
+
+3. Run the migration:
+
+```bash
+ssh pve-worker5 "pct exec 111 -- bash -c 'systemctl stop odoo && su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin\"' < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/migrate_to_fp_jobs.py"
+```
+
+4. Verify with the post-audit script (§8.4, run against the `admin` database).
+
+5. Toggle the cutover flag:
+
+```python
+# Via odoo shell:
+env['ir.config_parameter'].sudo().set_param('fusion_plating_jobs.use_native_jobs', 'True')
+env.cr.commit()
+```
+
+6. Restart Odoo (step 3 left the service stopped).
+
+### 9.5 Friday 10pm — smoke test
+
+Same as §8.5 but on live entech. If anything fails, restore backup
+(§9.7) and abort.
+
+### 9.6 Saturday/Sunday — buffer
+
+The shop is offline on weekends. Use the time to:
+- Fix anything that surfaced during smoke test
+- Run additional spot checks on historical jobs
+- Verify that print menus default to the new reports for new jobs
+- Test sticker scans on a phone
+
+### 9.7 Rollback procedure (if needed by Sunday evening)
+
+If unrecoverable issues surface:
+
+```bash
+# Stop Odoo
+ssh pve-worker5 "pct exec 111 -- systemctl stop odoo"
+
+# Option A: restore the DB from the SQL dump
+ssh pve-worker5 "pct exec 111 -- bash -c 'su - postgres -c \"dropdb admin && createdb admin && psql admin < /var/backups/admin_pre_fp_jobs_.sql\"'"
+
+# Option B (instead of A): restore the container snapshot (faster, but loses any post-snapshot DB writes)
+ssh pve-worker5 "pct rollback 111 pre_fp_jobs_cutover"
+
+# Start Odoo
+ssh pve-worker5 "pct exec 111 -- systemctl start odoo"
+
+# Communicate to operators that we're back on the legacy flow
+```
+
+After day 7, rollback becomes "forward fix only" — too much new shop activity
+to restore.
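Before running the DB restore in §9.7 it helps to confirm which dump file will be used and that it is not empty. The sketch below assumes dumps follow the `admin_pre_fp_jobs_YYYYMMDD.sql` naming produced by the `$(date +%Y%m%d)` suffix in §9.3. `latest_backup` is a hypothetical helper, not part of the migration scripts.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the dump naming from 9.3: admin_pre_fp_jobs_<8 digits>.sql
BACKUP_RE = re.compile(r"^admin_pre_fp_jobs_(\d{8})\.sql$")


def latest_backup(backup_dir: Path) -> Optional[Path]:
    """Return the newest non-empty dump by the date in its name, or None."""
    candidates = []
    for path in backup_dir.iterdir():
        match = BACKUP_RE.match(path.name)
        # A zero-byte file usually means the dump failed part-way.
        if match and path.stat().st_size > 0:
            candidates.append((match.group(1), path))
    if not candidates:
        return None
    # YYYYMMDD strings sort chronologically, so max() picks the latest.
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_backup(Path("/var/backups")))
```

A `None` result is a reason to stop and use the container snapshot path, because there is nothing valid for `psql` to restore.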
+
+### 9.8 Monday 7am — operators back on
+
+- 1 manager + 1 tech on site for the first 2 hours
+- Walk operators through the new menu (Plating Jobs (new) → Jobs)
+- Watch for confusion or errors
+- Field tickets as they come in
+
+---
+
+## Phase 10 — Burn-in (2 weeks calendar, ~1 day active work)
+
+### 10.1 Daily monitoring (Days 1–14)
+
+Check daily:
+- Odoo error log: `tail -f /var/log/odoo/odoo-server.log | grep -i error`
+- Job creation rate: `SELECT COUNT(*) FROM fp_job WHERE create_date > now() - interval '1 day'`
+- Step creation rate: `SELECT COUNT(*) FROM fp_job_step WHERE create_date > now() - interval '1 day'`
+- Failed lifecycle hooks: `grep -c "failed to" /var/log/odoo/odoo-server.log`
+- Operator support tickets
+
+Run `audit_post_migration.py` weekly to catch any drift.
+
+### 10.2 Forward-fix
+
+Anything that surfaces during burn-in goes through the standard PR/review
+workflow on `feat/fp-native-job-model` (or a new follow-up branch). The
+underlying data layer is locked — fixes are mostly UI/report polish.
+
+### 10.3 Day 14 — drop legacy snapshots
+
+After 14 days of stable operation:
+
+```bash
+# Drop the pre-cutover snapshot
+ssh pve-worker5 "pct delsnapshot 111 pre_fp_jobs_cutover"
+
+# Optional: archive the SQL backup off-site
+mv /var/backups/admin_pre_fp_jobs_*.sql /off-site/long-term-archive/
+```
+
+### 10.4 Bridge_mrp deprecation
+
+`fusion_plating_bridge_mrp` is still installed and inert (the SO confirm
+hook only fires when `x_fc_use_native_jobs=False`, which it never is
+post-cutover). Options for full deprecation:
+
+A) Leave it installed forever. Zero impact.
+B) Archive (set `installable=False` in its manifest, so a future re-install
+   wouldn't activate it).
+C) Uninstall (write an uninstall hook that drops the bridge tables but
+   preserves the data already migrated to fp.job).
+
+Recommendation: (A) for the first 6 months, then revisit.
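The two `grep` checks from §10.1 can be folded into one daily summary. This is a minimal sketch: `summarize_log` is a hypothetical helper and the sample log lines are invented, so the real Odoo log format may differ.

```python
def summarize_log(lines):
    """Count the lines the two grep commands in 10.1 would match."""
    summary = {"errors": 0, "failed_hooks": 0}
    for line in lines:
        if "error" in line.lower():  # same match as: grep -i error
            summary["errors"] += 1
        if "failed to" in line:      # same match as: grep -c "failed to"
            summary["failed_hooks"] += 1
    return summary


# Invented log lines, only to show the shape of the result.
sample_lines = [
    "2026-05-04 07:12:01 INFO admin odoo.modules.loading: Modules loaded.",
    "2026-05-04 07:15:44 ERROR admin odoo.http: Exception during request",
    "2026-05-04 07:16:02 WARNING admin fp.job: failed to create draft delivery",
]
print(summarize_log(sample_lines))
```

To run it against the real log, pass an open file handle for `/var/log/odoo/odoo-server.log` instead of `sample_lines`. A file object yields its lines one by one, so large logs are not loaded into memory.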
+
+### 10.5 Phase-end polish
+
+The list of deferred Minor items from Phase 1-7 reviews:
+
+- `currency_id required=True` on fp.work.centre and fp.job (and ondelete
+  policies on M2Os uniformly across both core and jobs)
+- `tracking=True` on fp.job.manager_id, facility_id
+- `digits='Product Unit of Measure'` on qty
+- `_('New')` translation safety in create
+- Field labels: "Reference Product" → cleaner string
+- Recipe boolean tests on fp.job.step
+- `index=True` on M2Os queried frequently (recipe_id, partner_id)
+- Author/website/maintainer block in fusion_plating_jobs manifest
+- i18n wrapping (`_()`) on user-visible strings
+- `_compute_state_ready` for fp.job.step pending → ready transition (Task 1.5
+  TODO)
+- `button_pause` / `button_skip` / `button_cancel` real implementations
+- Operator UI rewrite (Plant Overview, Tablet Station, Manager Dashboard,
+  Process Tree OWL component) — Phase 6 deferral
+
+These can be batched into one polish PR after burn-in completes (Day 14+).
+
+---
+
+## Appendix A — Communication templates
+
+### Email to operators (T-7)
+
+> Subject: System maintenance Saturday — ~4 hours
+>
+> Team — we're upgrading the Fusion Plating Jobs system the weekend of Saturday MM/DD,
+> from 9pm Friday through Saturday morning. The shop will be offline during
+> that window. By Monday 7am everything will be normal except you'll see a
+> new "Plating Jobs (new)" menu in addition to the existing menus. Same data,
+> better workflow. Manager + tech will be on site Monday morning to help.
+>
+> No action needed from you. Just don't start any new jobs after 6pm Friday.
+>
+> Questions? Reply or ping the manager.
+
+### Manager briefing (T-3)
+
+Walk through:
+1. The new menu structure
+2. The settings flag and how to toggle it
+3. The migration script and rollback procedure
+4. What to do if an operator reports a bug Monday morning
+
+---
+
+## Appendix B — Open decisions for the user before Phase 9
+
+Schedule the cutover weekend with at least 4 weeks' notice. Confirm:
+
+1. Date of cutover weekend
+2. Which manager will be on-site Monday morning
+3. Whether to keep the legacy menus visible after cutover (recommend: yes,
+   for the first 14 days, then hide via group permission)
+4. Whether to send the operator email template above as-is or customize
+5. Acceptance criteria for "burn-in complete" (recommend: 14 days zero
+   critical errors, zero operator support tickets that map to migration
+   issues)
+
+---
+
+## Appendix C — File checklist before Phase 8 starts
+
+Verify these are present (committed to `feat/fp-native-job-model`):
+
+- [x] `fusion_plating_jobs/__manifest__.py` — version >= 19.0.2.0.0, depends on 9 modules
+- [x] `fusion_plating_jobs/models/fp_job.py` — _inherit with all extension fields, hooks, helpers, legacy_id
+- [x] `fusion_plating_jobs/models/fp_job_node_override.py` — override model
+- [x] `fusion_plating_jobs/models/sale_order.py` — SO confirm hook
+- [x] `fusion_plating_jobs/models/res_config_settings.py` — flag
+- [x] `fusion_plating_jobs/models/fp_portal_job.py` — x_fc_job_id link
+- [x] `fusion_plating_jobs/models/fp_batch.py` — x_fc_step_id / x_fc_job_id
+- [x] `fusion_plating_jobs/models/fp_quality_hold.py` — x_fc_job_id / x_fc_step_id
+- [x] `fusion_plating_jobs/models/fp_certificate.py` — x_fc_job_id
+- [x] `fusion_plating_jobs/models/fp_thickness_reading.py` — x_fc_job_id / x_fc_step_id
+- [x] `fusion_plating_jobs/models/fp_delivery.py` — x_fc_job_id
+- [x] `fusion_plating_jobs/models/fp_racking_inspection.py` — x_fc_job_id
+- [x] `fusion_plating_jobs/models/account_move.py` — invoice → job hook
+- [x] `fusion_plating_jobs/models/fp_notification_trigger.py` — job_confirmed/job_complete events
+- [x] `fusion_plating_jobs/models/fusion_plating_kpi_value.py` — x_fc_source tag
+- [x] `fusion_plating_jobs/views/res_config_settings_views.xml` — settings UI
+- [x] `fusion_plating_jobs/report/report_fp_job_sticker.xml` — sticker
+- [x] `fusion_plating_jobs/report/report_fp_job_traveller.xml` — traveller
+- [x] `fusion_plating_jobs/controllers/job_scan.py` — /fp/job/
+- [x] `fusion_plating_jobs/controllers/process_tree.py` — /fp/jobs/process_tree
+- [x] `fusion_plating_jobs/scripts/audit_pre_migration.py`
+- [x] `fusion_plating_jobs/scripts/migrate_to_fp_jobs.py`
+- [x] `fusion_plating_jobs/scripts/audit_post_migration.py`
+- [x] `fusion_plating_jobs/scripts/README.md`
+- [x] `fusion_plating_jobs/README.md` — Phase 6 deferrals doc
+- [x] `fusion_plating_jobs/security/ir.model.access.csv` — ACL rows
+- [x] `fusion_plating_jobs/tests/test_fp_job_extensions.py` — comprehensive test suite
+
+If anything in this list is missing, fix before Phase 8.
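The Appendix C checklist can be verified mechanically instead of by eye. The sketch below is hedged: `missing_files` is a hypothetical helper, `CHECKLIST` is abbreviated to a few entries from the list above, and `repo_root` is assumed to be whichever directory contains `fusion_plating_jobs/`.

```python
import sys
from pathlib import Path

# Abbreviated: extend with the remaining entries from Appendix C.
CHECKLIST = [
    "fusion_plating_jobs/__manifest__.py",
    "fusion_plating_jobs/models/fp_job.py",
    "fusion_plating_jobs/scripts/audit_pre_migration.py",
    "fusion_plating_jobs/scripts/migrate_to_fp_jobs.py",
    "fusion_plating_jobs/scripts/audit_post_migration.py",
    "fusion_plating_jobs/security/ir.model.access.csv",
]


def missing_files(repo_root: Path, relative_paths=CHECKLIST) -> list:
    """Return checklist entries that are not regular files under repo_root."""
    return [rel for rel in relative_paths if not (repo_root / rel).is_file()]


if __name__ == "__main__":
    # Usage: python check_files.py /path/to/checkout (defaults to the cwd).
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path.cwd()
    for rel in missing_files(root):
        print(f"MISSING: {rel}")
```

This only confirms that the files exist in the working tree. Whether they are committed on `feat/fp-native-job-model` still needs a `git status` check.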