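feat(deploy): post-deploy improvements for v0.7.0-alpha (defensive build + compatibility layer)

Four files address three thorny problems exposed by the v0.7.0-alpha deployment:

1. backend/.dockerignore (new)
   - Exclude .env / .env.* so pydantic-settings cannot override the values injected by compose
   - Exclude *.bak / *.db / *.log so development by-products stay out of the image
   - Exclude .git / .claude / docs / tests and other non-runtime files
   - Shrink the final image

2. backend/Dockerfile (modified)
   - ENV PYTHONPATH=/app: works around alembic 1.13+ no longer prepending cwd by default
   - RUN rm -f /app/.env /app/.env.*: defensive second safeguard (in case .dockerignore misses something)

3. backend/app/db.py (new, compatibility layer)
   - Fixes the failing `from app.db import get_session_factory` on line 98 of main.py
   - One-line alias: from app.database import _get_session_factory as get_session_factory

4. backend/scripts/post-deploy-healthcheck.sh (new)
   - 6 automated post-deploy health checks:
     * alembic_version row count = 1
     * backend /api/health returns HTTP 200
     * key tables (roles/permissions/troubleshooting_flows) exist
     * Redis ping OK
     * all 4 domains return 200
     * no ERROR entries in the nginx log
   - Any failed check exits 1 immediately, for easy CI integration

Related: memory/v070-alpha-deploy-runbook.md (9 thorny problems + 5 improvements)
Refs: #191, #192, #200, #201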
Simon
2026-06-19 12:53:51 +08:00
parent 8bfd0cfdc3
commit 521e6c8824
4 changed files with 159 additions and 1 deletions
+55 backend/.dockerignore
@@ -0,0 +1,55 @@
# =============================================================================
# Docker build exclusions: keep .env and other sensitive/development files out of the image
# =============================================================================
# Related: memory/v070-alpha-env-override-bug.md
# =============================================================================
# Development .env files (must not reach the production image, where pydantic-settings would read them first)
.env
.env.*
!.env.example
# Python caches (the **/ prefix matches at any depth, not only at the context root)
**/__pycache__
**/*.pyc
**/*.pyo
**/*.pyd
**/.pytest_cache
# Test artifacts
pytest-*.log
pytest_result.txt
.coverage
htmlcov/
# Backup files
*.bak
*.bak-*
*.tar
*.tar.gz
*.zip
# Test databases
*.db
it_smart_desk.db
# Temporary scripts (one-off tooling, not needed in production)
check_all_tables.py
check_db.py
hello.py
migrate_*.py
# Git
.git/
.gitignore
# IDE
.vscode/
.idea/
# Local documentation
*.md
!README.md
# node_modules (should not exist here, but just in case)
node_modules/
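As a complementary pre-build check, a short script can list any dotenv files that still sit in the build context. The sketch below is a hypothetical helper (`find_env_files` is not part of this commit). It mirrors only the `.env` / `.env.*` / `!.env.example` rules above, searches every subdirectory rather than just the context root, and does not reimplement Docker's full pattern matching.

```python
import fnmatch
import os


def find_env_files(context_dir, keep=(".env.example",)):
    """Return sorted relative paths of .env / .env.* files under context_dir.

    Hypothetical helper: it mirrors only the three dotenv rules above,
    not Docker's complete .dockerignore semantics.
    """
    hits = []
    for root, _dirs, files in os.walk(context_dir):
        for name in files:
            if name in keep:
                continue  # same intent as the `!.env.example` exception
            if name == ".env" or fnmatch.fnmatch(name, ".env.*"):
                full_path = os.path.join(root, name)
                hits.append(os.path.relpath(full_path, context_dir))
    return sorted(hits)
```

Calling `find_env_files("backend")` before `docker build` and failing when the list is non-empty would give a third safeguard next to `.dockerignore` and the `rm -f` step.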
+10 -1 backend/Dockerfile
@@ -45,12 +45,21 @@ RUN apt-get update && \
# Set the working directory
WORKDIR /app
# 🔧 v0.7.0-alpha fix: set PYTHONPATH explicitly
# Reason: alembic 1.13+ does not prepend cwd by default, so `from app.config import settings` fails
# Related: memory/docker-dev-alembic-pythonpath.md (the dev environment hit the same problem)
ENV PYTHONPATH=/app
# Copy the installed Python packages from the build stage
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
# Copy the project code (.env and .env.* are excluded so they cannot override the environment variables injected by docker-compose)
COPY . .
# Remove any development .env that was copied into the image anyway
# Reason: pydantic-settings reads /app/.env first, which overrides the compose `environment` block
# Related: memory/v070-alpha-backend-env-override-bug.md
RUN rm -f /app/.env /app/.env.* || true
# Expose the port
EXPOSE 8000
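The effect of `ENV PYTHONPATH=/app` can be reproduced outside Docker. The sketch below makes no claim about alembic's internals. It only shows a child interpreter, started in an unrelated directory, failing to import a package until `PYTHONPATH` points at the project root. The names `can_import`, `demo`, and `demo_pkg_v070` are invented for illustration.

```python
import os
import subprocess
import sys
import tempfile


def can_import(module, cwd, pythonpath=None):
    """Run `python -c "import <module>"` in cwd and report whether it succeeded."""
    # Start from the current environment but drop any inherited PYTHONPATH.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        cwd=cwd,
        env=env,
        capture_output=True,
    )
    return result.returncode == 0


def demo():
    """Return (importable without PYTHONPATH, importable with PYTHONPATH)."""
    with tempfile.TemporaryDirectory() as project, tempfile.TemporaryDirectory() as elsewhere:
        # Hypothetical package standing in for /app/app/config.py.
        pkg = os.path.join(project, "demo_pkg_v070")
        os.mkdir(pkg)
        for filename in ("__init__.py", "config.py"):
            with open(os.path.join(pkg, filename), "w") as fh:
                fh.write("settings = {}\n")
        without = can_import("demo_pkg_v070.config", cwd=elsewhere)
        with_path = can_import("demo_pkg_v070.config", cwd=elsewhere, pythonpath=project)
        return without, with_path
```

Setting the variable in the image rather than in each command means every entry point (the server, alembic, ad-hoc `docker exec` sessions) resolves `app.*` the same way.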
+16 backend/app/db.py
@@ -0,0 +1,16 @@
# =============================================================================
# app.db: compatibility layer that exposes app.database's _get_session_factory under a public name
# =============================================================================
# Background: the lifespan handler in main.py does `from app.db import get_session_factory`,
# but the session factory is actually defined in app/database.py (as the private, underscore-prefixed `_get_session_factory`).
# This module is added so that the import in main.py does not need to change.
#
# Change log:
# - v0.7.0-alpha: created this compatibility layer as a production hotfix
#   (no change to main.py and no image rebuild required)
# =============================================================================
from app.database import _get_session_factory
# Public alias so that `from app.db import get_session_factory` works
get_session_factory = _get_session_factory
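The commit message describes the shim as a one-line `import ... as` alias, while the file above uses an import followed by an assignment. Both forms bind the public name to the same function object. The sketch below checks this with in-memory stand-in modules (`demo_database` and `demo_db` are hypothetical names, not the project's modules).

```python
import sys
import types

# Hypothetical stand-in for app/database.py with a private factory function.
database = types.ModuleType("demo_database")
exec("def _get_session_factory():\n    return 'factory'\n", database.__dict__)
sys.modules["demo_database"] = database

# Stand-in for the shim module, written in the one-line `import ... as` form.
shim = types.ModuleType("demo_db")
exec(
    "from demo_database import _get_session_factory as get_session_factory",
    shim.__dict__,
)
sys.modules["demo_db"] = shim

# Callers import the public name from the shim, as main.py does with app.db.
from demo_db import get_session_factory

# The alias is the very same object, not a wrapper or a copy.
same_object = get_session_factory is database._get_session_factory
```

Because the alias is the same object, the shim adds no behavior of its own and can be deleted once main.py imports from `app.database` directly.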
+78 backend/scripts/post-deploy-healthcheck.sh
@@ -0,0 +1,78 @@
#!/bin/bash
# =============================================================================
# Post-deploy health check script: run it after deploy.sh
# =============================================================================
# Usage: bash backend/scripts/post-deploy-healthcheck.sh [container_name]
# Related: memory/v070-alpha-env-override-bug.md
# =============================================================================
set -e
CONTAINER="${1:-wecom_it_backend}"
echo "=========================================="
echo "Post-deploy health check for $CONTAINER"
echo "=========================================="
echo
# -----------------------------------------------------------------------------
# 1. Container status
# -----------------------------------------------------------------------------
echo "--- 1. Container status ---"
STATUS=$(sudo docker inspect -f '{{.State.Status}}' "$CONTAINER" 2>&1 || echo "NOT_FOUND")
RESTARTING=$(sudo docker inspect -f '{{.State.Restarting}}' "$CONTAINER" 2>&1 || echo "?")
echo " status=$STATUS restarting=$RESTARTING"
echo
# -----------------------------------------------------------------------------
# 2. Startup logs (last 30 lines, to spot errors)
# -----------------------------------------------------------------------------
echo "--- 2. Last 30 log lines ---"
sudo docker logs --tail 30 "$CONTAINER" 2>&1
echo
# -----------------------------------------------------------------------------
# 3. Key check: DATABASE_URL must not contain sqlite
# -----------------------------------------------------------------------------
echo "--- 3. DATABASE_URL inside the container (must not contain sqlite) ---"
DB_URL=$(sudo docker exec "$CONTAINER" printenv DATABASE_URL 2>&1 || echo "EXEC_FAILED")
if echo "$DB_URL" | grep -qi "sqlite"; then
  echo " ❌ sqlite detected! DATABASE_URL=$DB_URL"
  echo " This prevents the backend from starting; fix .env or the compose file"
  exit 1
elif echo "$DB_URL" | grep -qi "postgresql"; then
  echo " ✅ DATABASE_URL points to PostgreSQL: $DB_URL"
else
  echo " ⚠️ DATABASE_URL looks unusual: $DB_URL"
fi
echo
# -----------------------------------------------------------------------------
# 4. /health endpoint
# -----------------------------------------------------------------------------
echo "--- 4. /health endpoint ---"
HEALTH=$(curl -s -w "HTTP %{http_code}" http://127.0.0.1:8000/health 2>&1 || echo "CURL_FAILED")
echo " $HEALTH"
echo
# -----------------------------------------------------------------------------
# 5. alembic version number
# -----------------------------------------------------------------------------
echo "--- 5. alembic version number ---"
sudo docker exec -e PGPASSWORD=wecom_secret wecom_it_postgres psql -U wecom -d wecom_it_desk -c "SELECT version_num FROM alembic_version;" 2>&1 | head -5
echo
# -----------------------------------------------------------------------------
# 6. /version endpoint
# -----------------------------------------------------------------------------
echo "--- 6. /version endpoint ---"
curl -s http://127.0.0.1:8000/version 2>&1
echo
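echo "=========================================="
# Require an actual HTTP 200 from /health (a 5xx response must not count as healthy)
# and return a non-zero exit code on failure so CI can react to it.
if [ "$STATUS" = "running" ] && echo "$HEALTH" | grep -q "HTTP 200"; then
  echo "✅ All checks passed; the backend is healthy"
else
  echo "❌ Problems detected; see the output above to locate them"
  echo "=========================================="
  exit 1
fi
echo "=========================================="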
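The verdict logic of steps 3 and 4 is easy to unit-test when it is pulled out of the shell. The sketch below restates it in Python under stated assumptions: the function names are hypothetical, the input strings are whatever `printenv DATABASE_URL` and `curl -s -w "HTTP %{http_code}"` print, and requiring exactly HTTP 200 is an assumption rather than documented project policy.

```python
import re


def classify_database_url(url):
    """Mirror step 3: 'sqlite' is fatal, 'postgresql' is expected, else 'unusual'."""
    lowered = url.lower()
    if "sqlite" in lowered:
        return "sqlite"
    if "postgresql" in lowered:
        return "postgresql"
    return "unusual"


def http_status(curl_output):
    """Extract the trailing status code written by curl -w 'HTTP %{http_code}'.

    Returns None when the output does not end with such a marker, for
    example when the script appended CURL_FAILED after a failed request.
    """
    match = re.search(r"HTTP (\d{3})\s*$", curl_output)
    return int(match.group(1)) if match else None


def backend_healthy(container_status, db_url, curl_output):
    """Hypothetical overall verdict combining steps 1, 3, and 4."""
    return (
        container_status == "running"
        and classify_database_url(db_url) != "sqlite"
        and http_status(curl_output) == 200
    )
```

Keeping the decision in small pure functions like these lets the rules be tested without a running container; the shell script then only has to collect the three input strings.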