VM can ping the host but SSH/SCP fails? Check the host's ufw

I was inside a Windows Server 2025 VM (running on a Linux host via libvirt/Vagrant) trying to scp a file from the host. The host answered ping fine, but every ssh/scp to it just hung and failed. The obvious suspect — the Windows firewall on the VM — was a red herring. The real culprit was ufw on the Linux host, which MX Linux configures to default-deny incoming and only allow the LAN.

The giveaway: ICMP works, TCP doesn't. ufw permits ping by default but drops unlisted TCP ports, so connectivity "looks" fine while SSH silently dies. If you can ping but not connect to a port, suspect a host firewall, not a routing problem.

Check what the host actually allows:

sudo ufw status verbose

In my case port 22 was only open to the physical LAN (192.168.68.0/24), while the VM lived on the libvirt NAT subnet 192.168.121.0/24 — a different network the rules never mentioned. So the VM's packets hit the default deny (incoming) and vanished.
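
The mismatch can be checked mechanically. Below is a minimal pure-shell sketch of a subnet membership test, assuming IPv4 only. The function names ip_to_int and ip_in_cidr and the guest address 192.168.121.50 are made up for illustration; none of this is part of ufw.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    # Unquoted on purpose: split the address on dots into $1..$4.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 falls inside CIDR block $2 (for example 192.168.68.0/24).
ip_in_cidr() {
    local ip net bits mask
    ip=$1
    net=${2%/*}
    bits=${2#*/}
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    fi
    [ $(( $(ip_to_int "$ip") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# 192.168.121.50 is a hypothetical guest address on the libvirt NAT subnet.
if ip_in_cidr 192.168.121.50 192.168.68.0/24; then
    echo "covered by the LAN rule"
else
    echo "not covered: falls through to default deny"
fi
```

Running the same check against 192.168.121.0/24 succeeds, which is the subnet the fix has to name.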

Fix: explicitly allow the VM subnet to reach the port you need.

sudo ufw allow from 192.168.121.0/24 to any port 22 proto tcp comment 'libvirt guests -> host ssh'

After that the VM connects straight away, and the rule survives reboots.

One trap worth naming: if the host is also on a Tailscale/VPN network, its public name may resolve to the Tailscale IP (e.g. 100.x.x.x) — but from a NATed guest that's still the same host, reached through the same deny-by-default firewall. Don't let the fancy hostname distract you; the fix is the same ufw rule on the libvirt subnet.

If you'd rather not open the firewall at all, push the file the other way — from the host into the guest. With Vagrant that's just vagrant upload <src> '<C:\dest>', no inbound rule needed.

gitbash test fixtures: anchor TMPDIR at the real Windows temp root

git -C /tmp/tmp.XXX/repo fails on gitbash with cannot change to C:\Users\<u>\AppData\Local\Temp\2\tmp\tmp.XXX\repo — note the doubled tmp segment. git's Cygwin path resolver treats /tmp/foo as <Temp2>/tmp/foo, so a path the shell resolves fine ends up looking for a directory the disk doesn't have. Anything that pipes mktemp output into git -C hits it; mine broke ~20 of 60+ tests in one go.

The fix is to set TMPDIR to the real Windows temp root before any mktemp -d call. mktemp then returns C:\Users\<u>\AppData\Local\Temp\2\tmp.XXX (no leading /tmp/) and git's resolver is happy. cygpath -w /tmp gives the right value:

if [[ -n "${MSYSTEM:-}" ]] && command -v cygpath >/dev/null 2>&1; then
    export TMPDIR="$(cygpath -w /tmp)"
fi

A few related landmines I tripped over while getting to a green suite, bundled here since you will hit them too:

ln -sfn and readlink round-trip through Cygwin form. Symlink the Windows form C:\Users\...\tmp.X\auth.json and readlink returns /tmp/tmp.X/auth.json. A test that does assert_eq "${primary_home}/auth.json" "$(readlink …)" fails on gitbash even when both point at the same file. A cygpath -u on each side makes them comparable; I added an assert_path_eq helper for this. Don't use cd && pwd -P as the canonicalisation — it fails for target paths that don't yet exist as real directories (the symlink is dangling at assert time).
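
For reference, here is a rough sketch of what such a helper could look like. The name assert_path_eq comes from the note above, but this body is a guess rather than the actual helper, and to_posix_path is a hypothetical name. Where cygpath is missing (plain Linux or macOS) it falls back to a straight string compare.

```shell
# Normalise a path to POSIX form where cygpath exists; pass it through elsewhere.
to_posix_path() {
    if command -v cygpath >/dev/null 2>&1; then
        cygpath -u "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Compare two paths after normalising both sides. Works for dangling
# symlink targets too, because nothing here touches the filesystem.
assert_path_eq() {
    local expected actual
    expected=$(to_posix_path "$1")
    actual=$(to_posix_path "$2")
    if [ "$expected" != "$actual" ]; then
        printf 'assert_path_eq: expected %s, got %s\n' "$expected" "$actual" >&2
        return 1
    fi
}
```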

Windows jq emits CRLF. A pipeline ending in paste -sd, keeps the \r in each token, so expected "3,5,10" got "3\r,5\r,10" is a real failure mode. tr -d '\r' after the jq call fixes it; doing it inside the test's jq wrapper covers most cases.
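
A minimal sketch of that wrapper, assuming the tests can call a differently named function instead of jq directly. The names strip_cr and jq_lf are hypothetical.

```shell
# Drop carriage returns so CRLF output compares equal to LF output.
strip_cr() {
    tr -d '\r'
}

# Hypothetical test-side wrapper: run the real jq, then normalise line endings.
# The subshell body keeps pipefail local, so a jq failure still fails the call.
jq_lf() (
    set -o pipefail
    command jq "$@" | strip_cr
)

# Simulated CRLF output, joined the same way the failing test did.
printf '3\r\n5\r\n10\r\n' | strip_cr | paste -sd, -   # prints 3,5,10
```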

Windows has no POSIX file mode bits. chmod +x is a no-op, chmod 700 is a no-op, and [[ -x ${path} ]] is always false for a no-extension file. The launch scripts the runtime writes are invoked with bash <path>, so missing exec bit is harmless in production — skip those assertions on MSYSTEM.
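
One way to skip those assertions is to route them through a helper that treats MSYSTEM as a skip signal. This is only a sketch, and assert_executable is a hypothetical name.

```shell
# Hypothetical helper: check the exec bit, except under Git Bash/MSYS,
# where MSYSTEM is set and mode bits carry no meaning.
assert_executable() {
    if [ -n "${MSYSTEM:-}" ]; then
        return 0    # skipped: nothing meaningful to assert on Windows
    fi
    [ -x "$1" ]
}
```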

bash -x leaks inherited env to the transcript. If the parent shell sources ~/.bashrc.secret (mine does), set -x spews every export OAUTH_TOKEN=… into the output. Diagnostic prints are fine, but never set -x the real test — and redact the log if you did.
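
If a transcript has already leaked, a first-pass redaction can be done with sed. The sketch below is best effort, not a guarantee: the function name redact_xtrace and the list of name fragments (TOKEN, SECRET, PASSWORD, KEY) are assumptions, so adjust them to whatever your secrets file actually exports.

```shell
# Best-effort redaction for an xtrace transcript: mask the value of any
# NAME=value pair whose name contains TOKEN, SECRET, PASSWORD or KEY.
redact_xtrace() {
    sed -E 's/([A-Za-z0-9_]*(TOKEN|SECRET|PASSWORD|KEY)[A-Za-z0-9_]*=)[^[:space:]]+/\1[REDACTED]/g'
}

printf '%s\n' '+ export OAUTH_TOKEN=abc123' | redact_xtrace
# prints: + export OAUTH_TOKEN=[REDACTED]
```

Values that contain spaces are only masked up to the first space, so read the result before sharing it.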

These four are the high-leverage fixes; everything else (tmux e2e, setsid/python missing, MINGW64 prompt prefix in pane titles) is case-by-case, one test at a time, and worth a separate pass.

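Can tg-relay drive a Claude Code session that is not running in tmux?

tg-relay's inbound routing depends on the tmux pane ID (%NN): when a Telegram message arrives, it calls tmux send-keys to inject /tmp/tg-*.md into the matching pane. mux.driver local runs a foreground process with no tmux pane, so the relay has no delivery target and fails outright.

Outbound is fine: notify_shuke does not depend on tmux, so under the local driver it still sends to TG as usual, and the reply index records identity with MUX_DRIVER_SLUG in place of %NN. The problem is inbound only.

A workable path: named FIFO + Stop hook exit 2

At startup, each local session creates a named FIFO from its slug: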

mkfifo /tmp/mux-tg-${MUX_DRIVER_SLUG}.fifo

When tg-relay sees that the pane ID in the reply index is a slug rather than %NN, it writes to that FIFO instead of calling tmux:

echo "$message" > /tmp/mux-tg-${slug}.fifo

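Claude Code's Stop hook checks this FIFO at the end of every turn:

# Stop hook
fifo="/tmp/mux-tg-${MUX_DRIVER_SLUG}.fifo"
# Open read-write (<>): a plain < would block in open() until a writer
# shows up, before read -t could time out. The -p guard stops <> from
# creating a regular file when the FIFO is missing.
if [[ -p "$fifo" ]] && read -r -t 0.1 msg <> "$fifo"; then
    echo "$msg" >&2   # with exit 2, stderr is what gets fed back to Claude
    exit 2            # block the stop and inject the message
fi
exit 0

What exit 2 means: Claude does not stop, and the hook's stderr is fed back to Claude as system feedback (the additionalContext of this note) within the same turn, so Claude keeps working. Technically this is not a new user message but same-turn system context; in effect, though, Claude reads the TG message and responds to it.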

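Limitations

  • The hook fires only at turn boundaries. A message in the FIFO waits until the current turn ends before it is picked up. And if the FIFO is empty when the turn ends, the turn ends normally and nothing ever picks up what arrives afterwards. This is a fatal flaw, and it makes the whole approach unworkable.
  • additionalContext ≠ user message: in the conversation history this is not a user message, so edge-case behavior may differ from normal TG routing.
  • FIFO blocking: echo > fifo blocks when the write end has no reader, so the relay needs O_NONBLOCK or a timeout guard.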
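
For the blocking problem in the last item, a timeout guard on the relay side could look roughly like this. It is a sketch under assumptions: send_to_fifo is a hypothetical name, and timeout(1) comes from GNU coreutils, so it may be missing on a stock macOS.

```shell
# Hypothetical relay-side helper: deliver one message to a session FIFO,
# giving up after a timeout instead of blocking forever when no reader exists.
# $1 = FIFO path, $2 = message, $3 = seconds to wait (default 2).
send_to_fifo() {
    local fifo message wait_secs
    fifo=$1
    message=$2
    wait_secs=${3:-2}
    [ -p "$fifo" ] || return 1    # not a FIFO: nothing to deliver to
    # open() on a FIFO blocks until a reader shows up, so the redirect runs
    # in a child process that timeout(1) can kill.
    timeout "$wait_secs" bash -c 'printf "%s\n" "$1" > "$2"' _ "$message" "$fifo"
}
```

A return status of 124 means the wait expired with no reader. The message has not been delivered in that case, so the caller would have to queue it and retry.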

Imperfect, but architecturally workable, and it needs no changes to Claude Code itself.

Git for Windows nagging "Unlink failed. Should I try again? (y/n)"? One env var kills it

On a corporate Windows box, git pull/git fetch keeps stopping to ask:

Unlink of file '.git/objects/pack/pack-305a05....idx' failed. Should I try again? (y/n)

The cause is a security agent (SentinelOne, Zscaler, Defender) holding an open handle on the old .idx files while Git tries to repack. Git for Windows wraps unlink/rename failures in a retry prompt, and you end up babysitting every pull, mashing n.

yes n | git pull works but you have to remember to prefix it every time. The permanent fix is one line in ~/.bashrc (the Git Bash one):

export GIT_ASK_YESNO=false

Git runs the value of GIT_ASK_YESNO as a command to decide whether to retry — a non-zero exit is treated as "n". false always exits non-zero, so every prompt is silently answered "no". It's cleaner than </dev/null redirection (works regardless of whether stdin is a tty) and doesn't touch the other interactive bits — commit-message editor, credential prompts — which go through different machinery.
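
The exit-status mapping is easy to see in isolation. The sketch below only mimics the decision described above and does not call Git. The function name answer_for is made up, and passing the question text as an argument is an assumption of the sketch.

```shell
# Mimic (not call) the retry decision: run the helper and turn its exit
# status into an answer. Exit 0 means "y", anything else means "n".
answer_for() {
    local helper question
    helper=$1
    question=$2
    if "$helper" "$question" >/dev/null 2>&1; then
        echo y
    else
        echo n
    fi
}

answer_for false 'Unlink failed. Should I try again?'   # prints n
answer_for true  'Unlink failed. Should I try again?'   # prints y
```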

Answering "n" just means Git leaves the locked old pack file on disk; the new pack is already live, so the repo is fine. Once the security agent lets go, a git gc sweeps up the leftovers.

If the prompts are frequent, limiting pack threads cuts down how often they fire, because sometimes the lock is held by Git's own multi-threaded pack-objects rather than the AV:

git config --global pack.threads 1

This is the silence-it companion to the heavier "it's an open handle, not a permission" diagnosis — same root cause (a process holding a handle), but here you just want the nagging to stop, not to hunt the locker down.

claude install says success on Windows but --version is still old

You run claude install, it prints "successfully installed! Version: 2.1.185", but claude --version keeps reporting the old 2.1.183. Reinstalling doesn't help.

The cause: on Windows you cannot overwrite a running .exe. The native installer downloads the new build into ~/.local/share/claude/versions/<ver> fine, but the final step — copying it onto the launcher at ~/.local/bin/claude.exe — fails silently because a live claude.exe process holds that file locked. The installer reports success on the download, not on the swap.

Confirm it's this by comparing checksums of the launcher against the version store:

sha256sum ~/.local/bin/claude.exe ~/.local/share/claude/versions/*   # which stored version does the launcher match?
ls -la ~/.local/share/claude/versions/   # newest version is there, but bin/claude.exe is stale
tasklist //FI "IMAGENAME eq claude.exe"  # the processes holding the lock

If the launcher hash matches an older version in the store, the swap never happened.
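
The comparison can also be scripted. A minimal sketch, assuming the version store holds one plain file per version as the paths above suggest; matching_version is a hypothetical name.

```shell
# Print the name of the version-store entry whose SHA-256 matches the launcher.
# $1 = launcher path, $2 = version store directory.
# Returns 1 when nothing matches, 2 when the launcher cannot be hashed.
matching_version() {
    local launcher store want have f
    launcher=$1
    store=$2
    want=$(sha256sum "$launcher") || return 2
    want=${want%% *}               # keep the hash, drop the file name
    for f in "$store"/*; do
        [ -f "$f" ] || continue
        have=$(sha256sum "$f") || continue
        if [ "${have%% *}" = "$want" ]; then
            basename "$f"
            return 0
        fi
    done
    return 1
}
```

Called as matching_version ~/.local/bin/claude.exe ~/.local/share/claude/versions, it prints the version the launcher currently is, or nothing if the launcher matches no stored build.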

The clean fix is to exit every claude session and re-run claude install. But if you can't (e.g. you're driving from inside a claude session), use the fact that Windows lets you rename a locked file even though it won't let you overwrite it — the running process keeps its open handle, and a fresh file lands in the path:

cd ~/.local/bin
mv claude.exe claude.exe.old
cp ~/.local/share/claude/versions/2.1.185 claude.exe
chmod +x claude.exe
claude --version   # 2.1.185

New shells immediately pick up the new binary; running sessions keep using the old one until restarted. Delete claude.exe.old once everything has been restarted.
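
If you expect to repeat this, the rename-then-copy steps can be wrapped in a small function. This is a sketch, not part of the installer: swap_launcher is a hypothetical name, and it adds a rollback the manual steps do not have.

```shell
# Hypothetical wrapper around the rename-then-copy steps.
# $1 = launcher path, $2 = new build from the version store.
swap_launcher() {
    local launcher new_build
    launcher=$1
    new_build=$2
    if [ ! -f "$new_build" ]; then
        echo "swap_launcher: no such build: $new_build" >&2
        return 1
    fi
    # Rename first: this is allowed even while the old binary is running.
    if [ -e "$launcher" ]; then
        mv -f "$launcher" "$launcher.old" || return 1
    fi
    # Copy the new build into the now-free path; roll back if the copy fails.
    if ! cp "$new_build" "$launcher"; then
        [ -e "$launcher.old" ] && mv -f "$launcher.old" "$launcher"
        return 1
    fi
}
```

A leftover claude.exe.old from an earlier swap that is still held by a running process may make the rename step fail, which is another reason to delete it once everything has been restarted.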

Note that Git Bash's bare claude (no extension) is just shell resolution of claude.exe — there's only one physical file to replace, not two. This whole trap is more likely if you have autoUpdates: false in ~/.claude.json and update manually, since background auto-update would normally retry on the next launch when nothing is locked.