docs(jobs): Phase 8/9/10 cutover runbook
Documents:
- Phase 8: 5-day E2E test plan on entech-clone (snapshot, migration, audits, smoke tests, rollback test, sign-off criteria)
- Phase 9: Cutover weekend runbook (Friday 6pm stop → Sunday buffer → Monday 7am operators back). 4 hours active work.
- Phase 10: 2-week burn-in monitoring + rollback safety net + Day 14 snapshot drop. Bridge_mrp deprecation options.
- Phase-end polish task list (deferred Minor items from Phase 1-7 reviews + the Phase 6 operator UI rewrite).
- Communication templates (operator email, manager briefing).
- Open decisions for user before Phase 9 starts.
- File checklist confirming all Phase 1-7 deliverables present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,385 @@
# Native Job Model — Cutover Runbook (Phases 8, 9, 10)

**Date:** 2026-04-25
**Owner:** Nexa Systems
**Status:** Draft. Verify each step on entech-clone before live cutover.
**Predecessor:** Phases 1–7 complete (commits up to current HEAD on
`feat/fp-native-job-model`). Spec:
`docs/superpowers/specs/2026-04-25-fp-native-job-model-design.md`. Plan:
`docs/superpowers/plans/2026-04-25-fp-native-job-model.md`.

This runbook covers the operational phases of the migration:

- **Phase 8** — End-to-end testing on a clone of entech (~5 days)
- **Phase 9** — Live cutover weekend (4 hour window)
- **Phase 10** — 2-week burn-in with rollback safety net

---

## Phase 8 — E2E testing on entech-clone (5 days)

### 8.1 Prepare the clone

1. **Snapshot live entech:** `pct snapshot 111 pre_fp_jobs_clone` on pve-worker5.
2. **Spin up a sibling LXC** (e.g. `entech-clone` at LXC 511 / pve-worker5).
   - Restore from the snapshot
   - Configure new IP: 10.200.1.27 (so it doesn't compete with live entech 10.200.1.26)
   - Update `odoo.conf` to a separate database name e.g. `admin_clone`
3. **Update Tailscale:** add `entech-clone` to your Tailscale ACL so SSH works.
4. **Verify clone independence:** any DB writes on entech-clone must NOT bleed
   to live entech. Different DB name, different IP.

### 8.2 Pre-migration audit

Run on entech-clone:

```bash
ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/audit_pre_migration.py\"'"
```

The redirect sits inside the innermost quotes so the script path resolves
inside the container, not on the machine running `ssh`.

Expected output: counts of MOs, WOs, dependent records, data quality flags.

**Capture the baseline numbers** in `phase8_baseline.txt` for diffing later.

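Diffing the baseline against a later audit run can be sketched as below. This is a minimal sketch that assumes both audit outputs are plain text; the `diff_audit_outputs` helper and the `current_audit.txt` label are hypothetical, and the real audit script output may need normalising first.

```python
import difflib
from typing import List


def diff_audit_outputs(baseline: str, current: str) -> List[str]:
    """Return unified-diff lines between two audit outputs (empty if identical)."""
    return list(
        difflib.unified_diff(
            baseline.splitlines(),
            current.splitlines(),
            fromfile="phase8_baseline.txt",   # name used in this runbook
            tofile="current_audit.txt",       # hypothetical label for the later run
            lineterm="",
        )
    )
```

An empty list means the two runs reported the same numbers; any `-` / `+` line points at a count that moved.
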
### 8.3 Run migration

```bash
ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/migrate_to_fp_jobs.py\"'"
```

Watch for errors in the output. The audit log is written to `/tmp/fp_jobs_migration.log`.

### 8.4 Post-migration audit

```bash
ssh pve-worker5 "pct exec 511 -- bash -c 'su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin_clone < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/audit_post_migration.py\"'"
```

Verify:
- `fp.job` count == `mrp.production` count (every MO has a mirror)
- `fp.job.step` count == `mrp.workorder` count
- Dependent x_fc_*_id counts match production_id / workorder_id counts

If any count mismatches, check the audit log for errors.

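The mirror-count checks above could be automated along these lines. This is a sketch only: `find_count_mismatches` and the `counts` dictionary are hypothetical, and how the counts are collected (for example from the audit script output) is left open.

```python
from typing import Dict, List, Tuple

# (native model, legacy model) pairs whose record counts must be equal.
MIRROR_PAIRS: List[Tuple[str, str]] = [
    ("fp.job", "mrp.production"),
    ("fp.job.step", "mrp.workorder"),
]


def find_count_mismatches(counts: Dict[str, int]) -> List[str]:
    """Return one message per mirror pair whose counts differ or are missing."""
    problems = []
    for native, legacy in MIRROR_PAIRS:
        native_count = counts.get(native)
        legacy_count = counts.get(legacy)
        if native_count is None or legacy_count is None:
            problems.append(f"{native} / {legacy}: count missing")
        elif native_count != legacy_count:
            problems.append(f"{native}={native_count} but {legacy}={legacy_count}")
    return problems
```

An empty result means every MO and WO has a mirror; anything else is a reason to open the audit log before continuing.
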
### 8.5 Smoke test the new flow

Manual on the clone via browser:

1. Toggle `x_fc_use_native_jobs=True` in Settings → Fusion Plating Jobs.
2. Create a new SO with a plating line.
3. Confirm the SO. Verify a `WH/JOB/...` record appears in **Plating Jobs (new)** menu.
4. Verify the recipe steps generated correctly.
5. Open a step, click Start, then Finish. Verify timelog row, duration_actual,
   cost_total all populate.
6. Print the new Job Sticker (6×4"). Verify QR scans to `/fp/job/<id>` and
   redirects to the form.
7. Print the Job Traveller. Verify all steps listed.
8. Click **Mark Done** on the job. Verify state=done, draft delivery created,
   draft cert created (best-effort).

### 8.6 Replay 30 days of activity

Identify the last 30 days of MO activity on entech (pre-clone) and replay
those operator actions through the new flow on the clone. Look for:
- Operations that succeeded on the legacy flow but error on native
- Reports that render differently
- Cost / margin numbers that differ between legacy and native

Diff certificates byte-for-byte: render 100 random CoC PDFs on legacy and on
the migrated native job. They should be byte-identical (the §8.9 sign-off bar)
and, at minimum, visually identical: a byte difference that turns out to be
confined to embedded metadata such as a creation timestamp still needs a
visual check. Any difference in rendered content is an audit-grade red flag
(Nadcap / aerospace).

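The byte-for-byte certificate comparison might look like the sketch below. It assumes the legacy and native renders have been saved to two directories under matching file names; that layout and the `differing_pdfs` helper are assumptions, not part of the migration scripts.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def differing_pdfs(legacy_dir: Path, native_dir: Path) -> List[str]:
    """Names of PDFs in legacy_dir that are missing from, or differ in, native_dir."""
    different = []
    for legacy_pdf in sorted(legacy_dir.glob("*.pdf")):
        native_pdf = native_dir / legacy_pdf.name
        if not native_pdf.is_file() or sha256_of(legacy_pdf) != sha256_of(native_pdf):
            different.append(legacy_pdf.name)
    return different
```

Every name it returns is a certificate to open side by side and inspect by eye.
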
### 8.7 Performance baseline

Measure on the clone:
- Plant Overview load time with N active steps (grouped by work_centre)
- Job form open time with 50-step recipe
- Job traveller PDF render time
- Job sticker PDF render time
- Migration script runtime (target: < 30 min on entech-scale data)

If anything is more than 20% slower than the legacy MO/WO flow (the §8.9
threshold), investigate indexes (M2M tables, related stores) before cutover.

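The 20% margin from the sign-off criteria (§8.9) can be checked mechanically once timings are recorded. A minimal sketch, assuming timings are collected by hand into two dictionaries keyed by metric name; the function and the metric names are hypothetical.

```python
from typing import Dict, List


def slower_than_allowed(
    legacy_seconds: Dict[str, float],
    native_seconds: Dict[str, float],
    tolerance: float = 0.20,
) -> List[str]:
    """Return metrics where the native timing exceeds legacy by more than `tolerance`."""
    failures = []
    for metric, legacy in legacy_seconds.items():
        native = native_seconds[metric]  # KeyError if a metric was not measured
        if native > legacy * (1 + tolerance):
            failures.append(metric)
    return failures
```

For example, `slower_than_allowed({"job_form_open": 10.0}, {"job_form_open": 12.5})` flags `job_form_open`, while a native timing of 11.0 would pass.
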
### 8.8 Rollback test

On the clone, simulate a rollback:
1. Restore the pre-cutover snapshot.
2. Verify legacy MO/WO data is intact.
3. Verify the `fusion_plating_jobs` module is still installed but inert
   (flag is False).
4. Verify nothing in bridge_mrp / fusion_plating_reports / shopfloor /
   notifications regressed.

Rollback safety is the most important thing to prove before live cutover.

### 8.9 Sign-off criteria

Before scheduling Phase 9:
- [ ] All Phase 1+2 tests pass (50+ tests)
- [ ] Migration script runs cleanly on clone with 0 errors in audit log
- [ ] Pre/post audit counts match
- [ ] 100 sample CoCs byte-identical
- [ ] All performance baselines within 20% of legacy
- [ ] Rollback test successful

If any item fails, identify the gap, fix in `feat/fp-native-job-model`, and
re-run §§ 8.2–8.8.

---

## Phase 9 — Cutover weekend (Friday 6pm to Monday 7am, ~4 hours active work)

### 9.1 Pre-cutover communication (T-7 days)

- Email entech operators: "Friday MM/DD from 9pm: ~4 hours offline for
  system upgrade. Back to normal by Monday 7am."
- Brief 2-3 plating managers on the new menu and the demo path.
- Confirm Saturday on-site presence: 1 manager + 1 tech (you).

### 9.2 Friday 6pm — stop new work

- Operators wrap up active jobs. No new SO confirms. No new WOs started.
- Verify no in_progress WOs left running. Pause any timers.

### 9.3 Friday 8pm — backup

```bash
# Full DB dump
ssh pve-worker5 "pct exec 111 -- bash -c 'su - postgres -c \"pg_dump admin\" > /var/backups/admin_pre_fp_jobs_$(date +%Y%m%d).sql'"

# Filesystem snapshot
ssh pve-worker5 "pct snapshot 111 pre_fp_jobs_cutover"
```

Tag the current commit:

```bash
cd /Users/gurpreet/Github/Odoo-Modules
git tag -a pre-cutover-$(date +%Y%m%d) -m "Pre-cutover backup point"
git push origin pre-cutover-$(date +%Y%m%d)
```

### 9.4 Friday 9pm — deploy + migrate

1. Deploy the latest `fusion_plating_jobs` to entech (it should already be
   installed from Phase 7 development; just refresh).

```bash
# Sync feat/fp-native-job-model branch state to entech if not already
# (skip if entech is already on this branch)
```

2. Update the module:

```bash
ssh pve-worker5 "pct exec 111 -- bash -c 'systemctl stop odoo && su - odoo -s /bin/bash -c \"/usr/bin/odoo -c /etc/odoo/odoo.conf -d admin -u fusion_plating_jobs --stop-after-init\" && systemctl start odoo'"
```

3. Run the migration:

```bash
ssh pve-worker5 "pct exec 111 -- bash -c 'systemctl stop odoo && su - odoo -s /bin/bash -c \"/usr/bin/odoo shell -c /etc/odoo/odoo.conf -d admin < /mnt/extra-addons/custom/fusion_plating_jobs/scripts/migrate_to_fp_jobs.py\"'"
```

4. Verify with the post-audit script.

5. Toggle the cutover flag:

```python
# Via odoo shell:
env['ir.config_parameter'].sudo().set_param('fusion_plating_jobs.use_native_jobs', 'True')
env.cr.commit()
```

6. Start Odoo again (step 3 left it stopped).

### 9.5 Friday 10pm — smoke test

Same as §8.5 but on live entech. If anything fails, restore backup
(§9.7) and abort.

### 9.6 Saturday/Sunday — buffer

Shop is offline weekends. Use the time to:
- Fix anything that surfaced during smoke test
- Run additional spot checks on historical jobs
- Verify that print menus default to the new reports for new jobs
- Test sticker scans on a phone

### 9.7 Rollback procedure (if needed by Sunday evening)

If issues are unrecoverable:

```bash
# Stop Odoo
ssh pve-worker5 "pct exec 111 -- systemctl stop odoo"

# Option 1: restore DB
ssh pve-worker5 "pct exec 111 -- bash -c 'su - postgres -c \"dropdb admin && createdb admin && psql admin < /var/backups/admin_pre_fp_jobs_<date>.sql\"'"

# Option 2 (instead of option 1): restore container snapshot
# (faster, but loses any post-snapshot DB writes)
ssh pve-worker5 "pct rollback 111 pre_fp_jobs_cutover"

# Start Odoo
ssh pve-worker5 "pct exec 111 -- systemctl start odoo"

# Communicate to operators that we're back on the legacy flow
```

After day 7, the policy becomes "forward fix only": by then a restore would
discard too much new shop activity. The snapshot itself is kept until Day 14
(§10.3).

### 9.8 Monday 7am — operators back on

- 1 manager + 1 tech on site for the first 2 hours
- Walk operators through the new menu (Plating Jobs (new) → Jobs)
- Watch for confusion or errors
- Field tickets as they come in

---

## Phase 10 — Burn-in (2 weeks calendar, ~1 day active work)

### 10.1 Daily monitoring (Days 1–14)

Check daily:
- Odoo error log: `tail -f /var/log/odoo/odoo-server.log | grep -i error`
- Job creation rate: `SELECT COUNT(*) FROM fp_job WHERE create_date > now() - interval '1 day'`
- Step creation rate: `SELECT COUNT(*) FROM fp_job_step WHERE create_date > now() - interval '1 day'`
- Failed lifecycle hooks: `grep -c "failed to" /var/log/odoo/odoo-server.log`
- Operator support tickets

Run `audit_post_migration.py` weekly to catch any drift.

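The two log checks above could be folded into one daily summary. This is a sketch, not part of the module: `summarize_log` is a hypothetical helper that applies the same patterns as the `grep` commands to any iterable of log lines.

```python
import re
from typing import Dict, Iterable

ERROR_RE = re.compile(r"error", re.IGNORECASE)  # same match as `grep -i error`
HOOK_FAILURE = "failed to"                      # same match as `grep -c "failed to"`


def summarize_log(lines: Iterable[str]) -> Dict[str, int]:
    """Count lines mentioning an error and lines reporting a failed hook."""
    counts = {"error_lines": 0, "hook_failures": 0}
    for line in lines:
        if ERROR_RE.search(line):
            counts["error_lines"] += 1
        if HOOK_FAILURE in line:
            counts["hook_failures"] += 1
    return counts
```

Passing an open file handle for `/var/log/odoo/odoo-server.log` gives the day's totals in one pass; tracking them day over day makes a sudden jump easy to spot.
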
### 10.2 Forward-fix

Anything that surfaces during burn-in goes through the standard PR/review
workflow on `feat/fp-native-job-model` (or a new follow-up branch). The
underlying data layer is locked — fixes are mostly UI/report polish.

### 10.3 Day 14 — drop legacy snapshots

After 14 days of stable operation:

```bash
# Drop the pre-cutover snapshot
ssh pve-worker5 "pct delsnapshot 111 pre_fp_jobs_cutover"

# Optional: archive the SQL backup off-site
mv /var/backups/admin_pre_fp_jobs_*.sql /off-site/long-term-archive/
```

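To put the burn-in milestones on a calendar, a small date helper is enough. The sketch below assumes Day 1 is the first production day after cutover (the Monday operators return); the runbook does not state which day counts as Day 1, so adjust if a different convention is agreed.

```python
from datetime import date, timedelta


def burn_in_day(day_one: date, n: int) -> date:
    """Calendar date of burn-in Day n, counting day_one as Day 1 (assumed convention)."""
    if n < 1:
        raise ValueError("burn-in days are numbered from 1")
    return day_one + timedelta(days=n - 1)
```

`burn_in_day(day_one, 7)` gives the last day a restore is still considered practical (§9.7), and `burn_in_day(day_one, 14)` the day the snapshot can be dropped.
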
### 10.4 Bridge_mrp deprecation

`fusion_plating_bridge_mrp` is still installed and inert (the SO confirm
hook only fires when `x_fc_use_native_jobs=False`, which it never is
post-cutover). Options for full deprecation:

- A) Leave it installed forever. Zero impact.
- B) Archive (set `installable=False` in its manifest, so a future re-install
  wouldn't activate it).
- C) Uninstall (write an uninstall hook that drops the bridge tables but
  preserves the data already migrated to fp.job).

Recommend (A) for the first 6 months, then revisit.

### 10.5 Phase-end polish

The list of deferred Minor items from Phase 1-7 reviews:

- `currency_id required=True` on fp.work.centre and fp.job (and ondelete
  policies on M2Os uniformly across both core and jobs)
- `tracking=True` on fp.job.manager_id, facility_id
- `digits='Product Unit of Measure'` on qty
- `_('New')` translation safety in create
- Field labels: "Reference Product" → cleaner string
- Recipe boolean tests on fp.job.step
- `index=True` on M2Os queried frequently (recipe_id, partner_id)
- Author/website/maintainer block in fusion_plating_jobs manifest
- i18n wrapping (`_()`) on user-visible strings
- `_compute_state_ready` for fp.job.step pending → ready transition (Task 1.5
  TODO)
- `button_pause` / `button_skip` / `button_cancel` real implementations
- Operator UI rewrite (Plant Overview, Tablet Station, Manager Dashboard,
  Process Tree OWL component) — Phase 6 deferral

These can be batched into one polish PR after burn-in completes (Day 14+).

---

## Appendix A — Communication templates

### Email to operators (T-7)

> Subject: System maintenance Friday evening (~4 hours)
>
> Team, we're upgrading the Fusion Plating Jobs system starting 9pm Friday
> MM/DD. The system will be offline for about 4 hours, into early Saturday.
> By Monday 7am everything will be normal except you'll see a
> new "Plating Jobs (new)" menu in addition to the existing menus. Same data,
> better workflow. Manager + tech will be on site Monday morning to help.
>
> No action needed from you. Just don't start any new jobs after 6pm Friday.
>
> Questions? Reply or ping the manager.

### Manager briefing (T-3)

Walk through:
1. The new menu structure
2. The settings flag and how to toggle it
3. The migration script and rollback procedure
4. What to do if an operator reports a bug Monday morning

---

## Appendix B — Open decisions for the user before Phase 9

Schedule the cutover weekend with at least 4 weeks' notice. Confirm:

1. Date of cutover weekend
2. Which manager will be on-site Monday morning
3. Whether to keep the legacy menus visible after cutover (recommend: yes,
   for the first 14 days, then hide via group permission)
4. Whether to send the operator email template above as-is or customize
5. Acceptance criteria for "burn-in complete" (recommend: 14 days zero
   critical errors, zero operator support tickets that map to migration
   issues)

---

## Appendix C — File checklist before Phase 8 starts

Verify these are present (committed to feat/fp-native-job-model):

- [x] `fusion_plating_jobs/__manifest__.py` — version >= 19.0.2.0.0, depends on 9 modules
- [x] `fusion_plating_jobs/models/fp_job.py` — _inherit with all extension fields, hooks, helpers, legacy_id
- [x] `fusion_plating_jobs/models/fp_job_node_override.py` — override model
- [x] `fusion_plating_jobs/models/sale_order.py` — SO confirm hook
- [x] `fusion_plating_jobs/models/res_config_settings.py` — flag
- [x] `fusion_plating_jobs/models/fp_portal_job.py` — x_fc_job_id link
- [x] `fusion_plating_jobs/models/fp_batch.py` — x_fc_step_id / x_fc_job_id
- [x] `fusion_plating_jobs/models/fp_quality_hold.py` — x_fc_job_id / x_fc_step_id
- [x] `fusion_plating_jobs/models/fp_certificate.py` — x_fc_job_id
- [x] `fusion_plating_jobs/models/fp_thickness_reading.py` — x_fc_job_id / x_fc_step_id
- [x] `fusion_plating_jobs/models/fp_delivery.py` — x_fc_job_id
- [x] `fusion_plating_jobs/models/fp_racking_inspection.py` — x_fc_job_id
- [x] `fusion_plating_jobs/models/account_move.py` — invoice → job hook
- [x] `fusion_plating_jobs/models/fp_notification_trigger.py` — job_confirmed/job_complete events
- [x] `fusion_plating_jobs/models/fusion_plating_kpi_value.py` — x_fc_source tag
- [x] `fusion_plating_jobs/views/res_config_settings_views.xml` — settings UI
- [x] `fusion_plating_jobs/report/report_fp_job_sticker.xml` — sticker
- [x] `fusion_plating_jobs/report/report_fp_job_traveller.xml` — traveller
- [x] `fusion_plating_jobs/controllers/job_scan.py` — /fp/job/<id>
- [x] `fusion_plating_jobs/controllers/process_tree.py` — /fp/jobs/process_tree
- [x] `fusion_plating_jobs/scripts/audit_pre_migration.py`
- [x] `fusion_plating_jobs/scripts/migrate_to_fp_jobs.py`
- [x] `fusion_plating_jobs/scripts/audit_post_migration.py`
- [x] `fusion_plating_jobs/scripts/README.md`
- [x] `fusion_plating_jobs/README.md` — Phase 6 deferrals doc
- [x] `fusion_plating_jobs/security/ir.model.access.csv` — ACL rows
- [x] `fusion_plating_jobs/tests/test_fp_job_extensions.py` — comprehensive test suite

If anything in this list is missing, fix before Phase 8.