This is how I fix a bug with Opus 4.6

I got the first bug report for my recently announced MTP Rust library. Let me show how I fixed it.

First, a little background.

When I announced the library a few days ago, I got a somewhat harsh response, mostly on /r/rust but also in the discussion for nusb (a library that mine builds on heavily), where I had posted a “thanks” message. Both responses basically said “AI slop”.

Now, I’ve thought a lot about this in the past few days, and I think devs who use AI in their work have a hard time today. People seem to be fed up with AI slop (which I absolutely get), and it seems that many of us overindex on the fact that AI was involved in creating a product.

My argument is that not all products created with AI are slop.

I intend to write a lot more about this with examples of how I build Cmdr with a lot of personal care and AI, but for now, I want to share a case study on this bug fix.

So, below is the transcript of how I got the code fixed with Claude Code (Opus 4.6), using no special magic, just guidance to Claude to investigate and fix the problem in a solid and elegant way.

Looks long, but it’s a very quick read. I’ve manually edited it for brevity wherever possible without changing its vibe, and added tl;drs so you don’t even need to read through its messages.

Then see the other half of the article below the transcript.

Transcript

Could you help me look at this with gh? https://github.com/vdavid/mtp-rs/issues/1 WDYT might be the cause and solution?

⏺ Bash(gh issue view 1 --repo vdavid/mtp-rs)
⎿ title: 11th gen Kindle not detected as a MTP device
…(91 lines removed for brevity – the LLM got the content of the issue)…

⏺ Explore(Explore MTP device detection logic)
⎿ Done (8 tool uses · 31.4k tokens · 26s)
…(omitted for brevity)…

⏺ Here’s the issue. The Kindle reports its interface as:

class=ff subclass=ff protocol=00

But the current is_mtp_class check (nusb.rs:118-123) requires:

class = 0x06 (Image) or 0xFF (Vendor) — matches
subclass = 0x01 — fails (Kindle has 0xFF)
protocol = 0x01 — fails (Kindle has 0x00)

So the Kindle passes the class check but fails on subclass and protocol because it uses fully vendor-specific values instead of the standard MTP identifiers.

Likely cause

Amazon’s Kindle firmware uses a vendor-specific USB interface (ff/ff/00) rather than the standard MTP/PTP descriptors (06/01/01). This is common — many devices (especially e-readers and some older Android devices) rely on Microsoft OS Descriptors or the endpoint layout to signal MTP support rather than standard class codes.

Recommended solution

When class == 0xFF (vendor-specific), relax the subclass/protocol requirement and instead use an endpoint-layout heuristic: a valid MTP interface has exactly 1 Bulk IN, 1 Bulk OUT, and 1 Interrupt IN endpoint. The Kindle matches this perfectly:

ep addr=0x81 dir=In  type=Bulk
ep addr=0x01 dir=Out type=Bulk
ep addr=0x82 dir=In  type=Interrupt

This avoids hardcoding VID/PIDs and would catch other vendor-specific MTP devices too. The change would be in is_mtp_device — when class == 0xFF and the standard subclass/protocol don’t match, fall through to an endpoint shape check.

Want me to implement this?

Claude suggested what I’ll call solution #1
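To make the failing check concrete, here is a minimal sketch of the strict test Claude describes. It is reconstructed from the description in the transcript, not copied from nusb.rs, so the real function's signature and body may differ.

```rust
/// Strict check as described in the transcript: the class must be Image (0x06)
/// or vendor-specific (0xFF), and subclass and protocol must both be 0x01.
/// Reconstructed for illustration; the library's actual code may differ.
fn is_mtp_class(class: u8, subclass: u8, protocol: u8) -> bool {
    let class_ok = class == 0x06 || class == 0xFF;
    let subclass_ok = subclass == 0x01;
    let protocol_ok = protocol == 0x01;
    class_ok && subclass_ok && protocol_ok
}
```

With this check, a standard 06/01/01 interface is accepted, while the Kindle's ff/ff/00 passes the class test and is then rejected on subclass and protocol.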

Do you think this change might break the lib for other devices?

⏺ Good question. Let me think about this.

The risk is false positives — a vendor-specific interface (ff/xx/xx) with the same endpoint layout (bulk in + bulk out + interrupt in) that is not MTP. This pattern isn’t unique to MTP; HID-like vendor protocols, some printer interfaces, and proprietary sync protocols could share it.

However, detecting a device as MTP doesn’t cause harm on its own — the library would still need to successfully open a session and send MTP commands. A false positive would just mean list_devices() returns a non-MTP device that would then fail when you try to talk to it.

To minimize risk, we could layer the heuristic:

1. Tight gate: Only apply the endpoint heuristic when class == 0xFF and the standard check already failed, so standard MTP devices are unaffected.
2. Probe: After the endpoint shape matches, actually attempt a lightweight MTP GetDeviceInfo operation to confirm the device speaks MTP before including it in the list.

Option 2 is safer but more invasive — it means list_devices() would open and talk to suspected devices, which is a behavior change and could be slow or cause side effects on non-MTP devices.

The pragmatic middle ground: add the endpoint heuristic but don’t auto-probe. Document that vendor-specific devices may appear in list_devices() even if they aren’t actually MTP. Users already have to open() the device and handle errors anyway. This matches how libmtp and other implementations work — they cast a wider net at enumeration time and let the protocol handshake be the real validation.

So: low risk of breaking existing users (standard MTP detection path is unchanged), small risk of extra entries in list_devices() for edge cases, and that’s an acceptable tradeoff. What do you think?

Claude came up with solution #2
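For a sense of what the "probe" option would put on the wire, here is a rough sketch of a GetDeviceInfo command block in the generic PTP/MTP container layout (length, type, code, transaction ID, all little-endian). The function and constant names are hypothetical and this is not how mtp-rs builds its commands. Using transaction ID 0 outside a session is my reading of the PTP convention, so treat it as an assumption.

```rust
/// PTP/MTP operation code for GetDeviceInfo.
const OP_GET_DEVICE_INFO: u16 = 0x1001;
/// Container type value for a command block.
const CONTAINER_TYPE_COMMAND: u16 = 0x0001;

/// Builds a parameterless command container (hypothetical helper):
/// length (u32) | type (u16) | code (u16) | transaction id (u32), little-endian.
fn build_command(code: u16, transaction_id: u32) -> Vec<u8> {
    const HEADER_LEN: u32 = 12;
    let mut buf = Vec::with_capacity(HEADER_LEN as usize);
    buf.extend_from_slice(&HEADER_LEN.to_le_bytes());
    buf.extend_from_slice(&CONTAINER_TYPE_COMMAND.to_le_bytes());
    buf.extend_from_slice(&code.to_le_bytes());
    buf.extend_from_slice(&transaction_id.to_le_bytes());
    buf
}
```

A probe would write these 12 bytes to the bulk OUT endpoint and wait for a data and response phase on bulk IN. That means talking to a device that may not be MTP at all, which is the side effect Claude flags.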

Sounds good. Do you think this is an actually solid and elegant solution to the problem?

⏺ Honestly, not really. It’s workable but a bit hand-wavy. Bulk IN + Bulk OUT + Interrupt IN is a common USB layout — mass storage adapters, CDC devices, and various proprietary protocols use it too. We’d be guessing based on shape.

The proper mechanism for this is Microsoft OS Descriptors. This is exactly what they were designed for — when a device uses vendor-specific class codes (ff/xx/xx), it can advertise a “compatible ID” of "MTP" via a special control transfer (reading string descriptor at index 0xEE, then querying the Extended Compat ID descriptor). This is how Windows actually detects Kindles and similar devices as MTP.

The approach would be:

1. Standard class/subclass/protocol check (unchanged, covers well-behaved devices)
2. For class == 0xFF devices, open the device and query the MS OS descriptor
3. If the compatible ID says "MTP", it’s MTP, definitively, no guessing

Pros:

No false positives — the device explicitly declares itself as MTP
Covers all vendor-specific MTP devices, not just ones that happen to have the right endpoint shape
This is the industry-standard mechanism

Cons:

Requires opening the device and sending a control transfer during enumeration
Slightly more complex to implement
Some devices might not have MS OS descriptors either (though Kindles almost certainly do, since they work on Windows)

We could also combine both — try MS OS descriptors first, fall back to the endpoint heuristic as a last resort.

Want me to look into whether nusb exposes control transfers for this, and prototype the MS OS descriptor approach?

Claude said #1–#2 were hand-wavy and proposed solution #3

Hmm, sounds like without the MS OS descriptors it’s better UX because it doesn’t need the user to open the device? But then you said that MS OS descriptors are the proper way to go, so I guess we should do that. Please clarify this for me, and also check nusb

⏺ To clarify: the “opening” happens internally inside list_devices(), so the library user’s code stays the same either way. In fact, the current code already opens devices as a fallback in is_mtp_device. Let me show you.

⏺ Read(~/projects-git/vdavid/mtp-rs/src/transport/nusb.rs · lines 85-119)

⏺ Explore(Check nusb control transfer API)
⎿ Done (16 tool uses · 46.7k tokens · 51s) (ctrl+o to expand)

⏺ So to clarify your question: there’s no UX difference for the library user. The current code already opens devices during list_devices() (line 102). The MS OS descriptor query would just be another control transfer on the already-opened device. The user’s code stays exactly the same: they just call MtpDevice::list_devices() and get back results.

And nusb fully supports control transfers via device.control_in() — exactly what we need.

The MS OS descriptor flow would be:

1. Open the device (already happening)
2. Read string descriptor at index 0xEE → get the vendor request code (bMS_VendorCode)
3. Use that vendor code to query the Extended Compat ID descriptor
4. Check if any function’s compatible ID is "MTP"

This slots naturally into the existing is_mtp_device fallback path. The only real cost is one or two extra control transfers for vendor-specific devices, which takes a few milliseconds — negligible during enumeration.

Want me to implement it?

Claude said solution #3 seemed good.
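To show how much parsing solution #3 involves, here is a sketch of the two parsing steps, assuming the Microsoft OS 1.0 descriptor layouts: an 18-byte OS string descriptor carrying the "MSFT100" signature and the vendor code, and an Extended Compat ID descriptor with a 16-byte header followed by 24-byte function sections. The function names are hypothetical, and the USB control transfers themselves are left out. The code only works on byte buffers that would come back from them.

```rust
/// Parses the OS string descriptor read from string index 0xEE.
/// Returns bMS_VendorCode if the "MSFT100" signature is present.
fn parse_ms_os_string_descriptor(buf: &[u8]) -> Option<u8> {
    // Layout: bLength (0x12), bDescriptorType (0x03),
    // "MSFT100" as UTF-16LE (14 bytes), bMS_VendorCode, bPad.
    if buf.len() < 18 || buf[0] != 0x12 || buf[1] != 0x03 {
        return None;
    }
    let signature: Vec<u8> = "MSFT100"
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    (buf[2..16] == signature[..]).then(|| buf[16])
}

/// Returns each function's compatible ID from an Extended Compat ID
/// descriptor, with trailing NUL padding trimmed.
fn parse_compat_ids(buf: &[u8]) -> Vec<String> {
    const HEADER_LEN: usize = 16;
    const SECTION_LEN: usize = 24;
    if buf.len() < HEADER_LEN {
        return Vec::new();
    }
    let count = buf[8] as usize; // bCount: number of function sections
    buf[HEADER_LEN..]
        .chunks_exact(SECTION_LEN)
        .take(count)
        .map(|section| {
            // compatibleID occupies bytes 2..10 of each function section.
            let id = &section[2..10];
            let end = id.iter().position(|&b| b == 0).unwrap_or(id.len());
            String::from_utf8_lossy(&id[..end]).into_owned()
        })
        .collect()
}

/// True if any function in the descriptor declares itself as "MTP".
fn advertises_mtp(compat_descriptor: &[u8]) -> bool {
    parse_compat_ids(compat_descriptor).iter().any(|id| id == "MTP")
}
```

Even without the I/O, this is two descriptor formats and two requests per candidate device, which gives a feel for the "USB plumbing" Claude objects to in the next round.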

Does this plan lead to an actually solid AND elegant solution?

⏺ Honestly… not quite. Let me reconsider.

The MS OS descriptor approach is correct in principle, but it’s a multi-step control transfer dance (read string 0xEE, parse vendor code, send another vendor request, parse the compat ID structure). That’s a fair amount of USB plumbing for what is essentially a device enumeration heuristic. And it only works for devices that implement MS OS descriptors — which is common but not universal.

Stepping back, the real question is: what are we actually trying to do?

We’re trying to enumerate devices that might speak MTP. Not prove they do — that happens when you open a session. list_devices() is a filter, not a guarantee. The current code already accepts class == 0xFF (vendor-specific) as a valid MTP class, it just demands exact subclass/protocol. That’s the overly strict part.

The actually elegant fix might be simpler than either approach:

When class is vendor-specific (0xFF), don’t require specific subclass/protocol values. Just check the endpoint layout (bulk in + bulk out + interrupt in) as a basic sanity check. The rationale:

For class == 0x06 (Image/PTP): subclass and protocol are standardized, so checking them is meaningful
For class == 0xFF (Vendor): subclass and protocol are vendor-defined by definition — there’s no standard to check against, so requiring 0x01/0x01 is arbitrary

This is a one-line logic change, it’s correct by the USB spec’s own semantics, and the endpoint check prevents obvious false positives. It’s not a heuristic hack — it’s acknowledging that vendor-specific means vendor-specific.

The MS OS descriptor approach is the “proper” way if we want certainty, but certainty isn’t the job of list_devices(). Simplicity + correctness at the right abstraction level is more elegant than over-engineering the detection.

What do you think?

Claude said #3 was overengineered, proposed solution #4.

Hmm, would you say that this is the most resilient, correct, and elegant way of fixing this from the solutions that have come up above?

⏺ Yes, I think so. Let me lay out why it wins over the alternatives:

Hardcoded VID/PID — Brittle, doesn’t scale, needs updating for every new device. Clearly the worst.
MS OS descriptors — Technically “proper” but over-engineered for a filtering function. Adds complex USB plumbing, only works for devices that implement it, and gives us certainty we don’t actually need at this stage.
Endpoint heuristic alone — Too loose. Lots of non-MTP vendor devices have bulk in + bulk out + interrupt in.
Relax subclass/protocol for vendor-specific class + endpoint heuristic — This is the one. It’s:
- Correct: vendor-specific class means subclass/protocol are vendor-defined, so demanding specific values is a spec misunderstanding
- Resilient: catches any vendor-specific MTP device regardless of what subclass/protocol they chose, without needing a device database
- Safe: the endpoint layout check (bulk in + bulk out + interrupt in) filters out obviously wrong vendor devices
- Minimal: small, localized change to existing code
- Honest: list_devices() is a filter, not a proof — actual MTP validation happens at session open

The only scenario it misses is a non-MTP vendor device with class=0xFF and the exact right endpoint layout. That device would show up in list_devices() but fail at session open — which is a fine failure mode.

Want me to implement it?

Claude compared all solutions that came up, said #4 is best
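The layering of solution #4 can be sketched as follows. The Interface and Endpoint types are hypothetical stand-ins for nusb's descriptor types, and the function bodies are reconstructed from the transcript rather than taken from the library. In particular, this sketch accepts "at least one" endpoint of each kind, while the first proposal spoke of "exactly one", so the real check may be stricter.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Direction { In, Out }

// Control and isochronous endpoints are omitted for brevity.
#[derive(Clone, Copy, PartialEq)]
enum TransferType { Bulk, Interrupt }

struct Endpoint { direction: Direction, transfer_type: TransferType }

struct Interface { class: u8, subclass: u8, protocol: u8, endpoints: Vec<Endpoint> }

/// Standard path: Image (0x06) or vendor (0xFF) class with subclass 0x01, protocol 0x01.
fn is_mtp_class(class: u8, subclass: u8, protocol: u8) -> bool {
    (class == 0x06 || class == 0xFF) && subclass == 0x01 && protocol == 0x01
}

/// Sanity check for the MTP endpoint shape: bulk IN + bulk OUT + interrupt IN.
fn has_mtp_endpoint_layout(endpoints: &[Endpoint]) -> bool {
    let has = |direction: Direction, transfer_type: TransferType| {
        endpoints
            .iter()
            .any(|e| e.direction == direction && e.transfer_type == transfer_type)
    };
    has(Direction::In, TransferType::Bulk)
        && has(Direction::Out, TransferType::Bulk)
        && has(Direction::In, TransferType::Interrupt)
}

/// The standard check runs first; the endpoint check only matters for
/// vendor-specific interfaces that the standard check rejected.
fn is_mtp_interface(iface: &Interface) -> bool {
    is_mtp_class(iface.class, iface.subclass, iface.protocol)
        || (iface.class == 0xFF && has_mtp_endpoint_layout(&iface.endpoints))
}
```

Because the standard check is the first operand of the ||, anything accepted before the change is still accepted, which is the "purely additive" property Claude relies on when asked about regressions.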

Do you think this change might break the lib for other devices?

⏺ No. The change is purely additive. Devices that currently match is_mtp_class (including class=0xFF, subclass=0x01, protocol=0x01) still match — that check runs first. The new endpoint-layout path only activates for devices that already fail detection today. No existing behavior changes.

Let me implement it.

So, you’d say this plan leads to an actually solid AND elegant solution?

⏺ Yes.

Nice, let’s do it then! Minimal changes pls, but do update the docs, etc. to make this actually complete

⏺ Let me check the nusb descriptor types for the helper function signature.
…(removed 11 lines for brevity)…

⏺ The alt setting type is InterfaceDescriptor. Now let me implement the changes.
…(removed many lines for brevity)…

⏺ Now let’s build and run the tests.

⏺ Bash(cargo build 2>&1)
…(41 lines removed for brevity)…
Compiling mtp-rs v0.4.0 (~/projects-git/vdavid/mtp-rs)
Finished dev profile [unoptimized + debuginfo] target(s) in 6.89s

⏺ Bash(cargo test 2>&1)
…(336 lines removed for brevity)…
test result: ok. 2 passed; 0 failed; 15 ignored; 0 measured; 0 filtered out; finished in 0.78s

⏺ All 243 tests pass. Let me verify the final state of the changed file.

⏺ Read(~/projects-git/vdavid/mtp-rs/src/transport/nusb.rs · lines 85-159)

⏺ Everything looks clean. Here’s a summary of the changes:

src/transport/nusb.rs:

Added InterfaceDescriptor import
is_mtp_device: Now also inspects devices with class == 0xFF at device level (previously only class == 0). Uses the new is_mtp_interface check in the fallback path instead of just is_mtp_class
is_mtp_interface: New helper — returns true for standard MTP class/subclass/protocol or vendor-specific class with MTP endpoint layout
has_mtp_endpoint_layout: New helper — checks for bulk IN + bulk OUT + interrupt IN
open_with_timeout: Uses is_mtp_interface instead of is_mtp_class so it can find and open vendor-specific MTP interfaces
Added a test case for ff/ff/00 (Kindle’s class/subclass/protocol)

CHANGELOG.md: Added [Unreleased] entry documenting the fix with a link to #1.

Claude implemented the fix, updated the tests, and the changelog. All tests pass.

Are you satisfied with this result? Is this as elegant as expected? Are you confident that it’ll work well with that Kindle and other devices?

⏺ I’m satisfied with the elegance — the helper layering (is_mtp_class → is_mtp_interface → has_mtp_endpoint_layout) is clean and each function has a single clear job.

For confidence it’ll work, let me trace the Kindle’s path through the code:

Detection (is_mtp_device):

is_mtp_class(dev.class(), ...) — fails (wrong subclass/protocol)
dev.class() != 0 && dev.class() != 0xFF — Kindle’s device class is almost certainly 0x00 or 0xFF, so we continue
Interface-level quick check: is_mtp_class(0xFF, 0xFF, 0x00) — fails
Opens device, calls is_mtp_interface on the alt setting:
- is_mtp_class — fails
- class == 0xFF && has_mtp_endpoint_layout — passes (bulk in + bulk out + interrupt in)

Opening (open_with_timeout):

is_mtp_interface(&alt_setting) — same logic, passes
Extracts endpoints 0x81 (bulk in), 0x01 (bulk out), 0x82 (interrupt in) — all found

So yes, both detection and opening will work for the Kindle.

The one thing I can’t verify without the physical device is the Kindle’s device-level class. If it were something unexpected like 0x02, the early return on line 94 would skip it. But a device with a single ff/ff/00 interface will have device class 0x00 or 0xFF — anything else would be very unusual. The issue reporter could easily confirm this if needed.

✻ Churned for 32s

Claude double-checked it, said it’ll work and is elegant, with one caveat around the device-level class.
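The caveat concerns a device-level gate that runs before any interface is inspected. A minimal sketch of that gate as Claude describes it, with a hypothetical function name:

```rust
/// Device-level gate (hypothetical name): only look inside devices whose
/// device class is 0x00 (class defined per interface) or 0xFF (vendor-specific).
/// Any other device class is skipped before its interfaces are examined.
fn passes_device_class_gate(device_class: u8) -> bool {
    matches!(device_class, 0x00 | 0xFF)
}
```

A Kindle reporting 0x00 or 0xFF at the device level gets through to the interface checks. One reporting something like 0x02 would be skipped, exactly as it is before the fix.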

If the device-level class turns out to be something unexpected, might that lead to a regression, damage to the device, or anything of that sort which might be a release blocker?

⏺ No. If the device-level class is something unexpected, the device simply won’t be detected — same as today. It’s a no-op, not a regression. No data is written, no state is changed, nothing happens to the device.

The worst case is the fix doesn’t help for that particular device, and we’d need to widen the gate on line 94. Easy follow-up, zero risk.

Nice. Pls help me prep a release

⏺ Read(~/projects-git/vdavid/mtp-rs/docs/releasing.md)
⏺ Read(~/projects-git/vdavid/mtp-rs/Cargo.toml)

⏺ This is a backwards-compatible bug fix, so it should be v0.4.1.

⏺ Update(~/projects-git/vdavid/mtp-rs/Cargo.toml) (version bump)
⏺ Update(~/projects-git/vdavid/mtp-rs/CHANGELOG.md)
⏺ Bash(just check-all 2>&1) → All checks passed!

⏺ Ready for commit, tag, and dry run. Want me to proceed?

Claude prepared the release.

/commit-draft

⏺ Bash(git diff)

⏺ All changes are related. Here’s the commit message:

⏺ Bash(printf '%s' 'Detect vendor-specific MTP devices (e.g. Kindle)...(removed bullets for brevity)...' | pbcopy)

⏺ Commit message is in your clipboard. 4 files: nusb.rs, Cargo.toml, Cargo.lock, CHANGELOG.md.

Claude prepared the commit draft.

Commit pls

⏺ Bash(git add src/transport/nusb.rs Cargo.toml Cargo.lock CHANGELOG.md)
⏺ Bash(git commit -m "...(removed for brevity)...")
4 files changed, 67 insertions(+), 25 deletions(-)

⏺ Committed as d565bcf. Want me to proceed with the tag and dry run?

Yes pls

⏺ Bash(git tag v0.4.1)
⏺ Bash(cargo publish --dry-run 2>&1) → Dry run passed.

⏺ Ready for the real publish. Want me to go ahead?

Ship it!

⏺ Bash(cargo publish 2>&1) → Published mtp-rs v0.4.1
⏺ Bash(git push && git push --tags) → * [new tag] v0.4.1 -> v0.4.1

⏺ v0.4.1 is live on crates.io and pushed to GitHub.

Evaluation

Did you see how I gave Claude helpful questions to avoid inelegant and overengineered solutions three times before we got to a solution I believe to be the right one?

I pushed back four times when Claude was already eager to jump to implementation. (For larger changes, I even include a planning + 3-round review process, but in the case of small changes like this, I’ve found this level of planning absolutely enough.)

Put this in contrast with vibe coding, where I’d have accepted the first solution.

Well, the peculiarity of this current example is that solutions #1 and #4 are the same. 😅 I wish it was an example where the two were actually different. Three things to note, though:

This is usually not the case. I very often end up with a more elegant solution after similar repeated questioning.
While this was not my purpose, this actually shows how Opus often has pretty good instincts. This will only get better with future LLMs.
What we did achieve between #1 and #4 is for Claude to have a much deeper understanding of the problem space, the USB spec, and the library’s architecture, and for me to get a strong conviction that we are good to go and implement it. (It also resulted in slightly better reasoning in the docs that we wouldn’t have had otherwise.)

Did I code this fix? → Depends on your definition. I oversaw Claude, and I ensured quality.
Can Claude still make mistakes, even if I oversee it? → Yes.
Can I make mistakes when I handcraft code? → OMG yes. If anything, I find Claude vastly less prone to making mistakes than myself.

I think one day, tools like Claude will be able to code libraries and fixes like this on their own. For now, like with any tool, you need to use it with experience, humility, and care, and then you can create great stuff!

Btw, kudos to @jannikac, the issue reporter, for giving me high-quality input. ❤️
