NEC uPD720102 USB Controller: Weird-Ass Shit

PS/2-adaptor-and-USB card based on NEC uPD720102

The NEC uPD720102 USB controller chip is capable of some REALLY WEIRD SHIT.

Recently my PS/2 mouse port stopped working for no reason (the keyboard port was still OK) so I bought a PS/2 PCI card so I could still use my trackball. (The trackball in question is a "Trust Ami Track Dual Scroll" which I have converted from mechanical to optical operation. I am a fussy fucker and this is the only way I can get an optical trackball with the right layout of buttons and scroll wheels.) When the card arrived it turned out to consist of a Chesen CSC0101A PS/2 to USB converter chip (which is based around a 65C02 core - how cool is that?) hard-wired to one of the USB ports on an NEC uPD720102 USB controller chip. And it didn't fucking work.

My BIOS is one of the ones which displays a list of all the PCI devices for a few seconds before it loads the bootloader. The card did not appear on that screen, and nor did it appear in the output of lspci once Linux had booted. So I began to fuck about.

The first thing I tried was to ring the changes on the various BIOS setup options for the on-board USB controllers - disabling them altogether, disabling "legacy mode", disabling EHCI handoff, in various combinations - in case they were causing the BIOS to get confused and fuck things up. This produced no consistent or repeatable result. I did at one point get it to show as a USB controller card on the abovementioned BIOS screen, but when Linux booted it spammed the log with "OHCI Unrecoverable Error, scheduling NEC chip restart" several times a second for a couple of minutes and then gave up and died entirely. This was not reproducible (though it did lead me to an LKML post which indicated that weird-ass problems with this chip are a known phenomenon). The next thing it did was to show as an "audio device", which persisted for several reboots no matter what I did, and then for no reason changed to "Unknown device" (BIOS) and "Intelligent controller" (lspci), which misidentification then persisted with equal obstinacy.

What I did seem to be able to ascertain was that these errors were the result of the BIOS fucking things up - in a manner which according to the documentation ought to be impossible. Somehow or other it was writing incorrect values to the device class field in the chip's PCI config space. According to the uPD720102 datasheet this field is read-only, under all conditions - some fields are read-only in normal operation but writable during initialisation, but the device class isn't one of them - and indeed attempting to reset it using setpci had no effect, so fuck knows how it was doing it, but doing it it was. The device class for a USB controller ought to be 0x0c, but the bastard BIOS had overwritten that first with 0x04 (Audio device) and later with 0x0e (Intelligent controller), and fuck nothing I did would induce it to change back.
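For anyone wanting to check the same fields on their own card, here is a minimal sketch of pulling the vendor ID, device ID and class bytes out of the first 16 bytes of a PCI config header (the same bytes that lspci -xx dumps). The offsets are the standard PCI header layout; the struct and function names are made up for this illustration and are not from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Identification fields from the start of a PCI configuration header.
 * (Hypothetical struct, for illustration only.) */
struct pci_ident {
    uint16_t vendor;      /* offset 0x00, little-endian */
    uint16_t device;      /* offset 0x02, little-endian */
    uint8_t  prog_if;     /* offset 0x09 */
    uint8_t  subclass;    /* offset 0x0a */
    uint8_t  base_class;  /* offset 0x0b: the "device class" byte */
};

/* Decode the identification fields from at least 16 bytes of config space. */
struct pci_ident pci_decode(const uint8_t *cfg)
{
    struct pci_ident id;

    assert(cfg != NULL);
    id.vendor     = (uint16_t)(cfg[0x00] | (cfg[0x01] << 8));
    id.device     = (uint16_t)(cfg[0x02] | (cfg[0x03] << 8));
    id.prog_if    = cfg[0x09];
    id.subclass   = cfg[0x0a];
    id.base_class = cfg[0x0b];
    return id;
}

/* Name only the three base classes that turned up on this card. */
const char *pci_base_class_name(uint8_t base_class)
{
    switch (base_class) {
    case 0x04: return "Multimedia controller";
    case 0x0c: return "Serial bus controller";
    case 0x0e: return "Intelligent controller";
    default:   return "other";
    }
}
```

On Linux the raw bytes can be read from the device's sysfs config file (for this card, /sys/bus/pci/devices/0000:04:05.0/config) and handed to pci_decode(). A healthy OHCI function should come back as vendor 0x1033, device 0x0035, base class 0x0c.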

Even more bizarrely, it then started fucking about with the device ID field as well. Again, this is documented as a read-only field and nothing should be able to change it. But some bastard piece of cunt code was somehow managing to set bit 9 of the device ID (0x0200), fuck knows how and fuck knows why. So instead of the correct device IDs of 0x0035 (OHCI) and 0x00e0 (EHCI), the chip was now identifying itself with 0x0235 and 0x02e0, which don't mean anything at all in conjunction with the NEC vendor ID of 0x1033.
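To pin down exactly which bit is being flipped, XOR the expected ID with the observed one. The sketch below does that for the IDs quoted above; the function name is hypothetical. Counting from bit 0 of the 16-bit device ID field, both 0x0035 vs 0x0235 and 0x00e0 vs 0x02e0 differ only in bit 9 (0x0200).

```c
#include <stdint.h>

/* Return the index (0 = least significant) of the single bit that differs
 * between a and b, or -1 if they are equal or differ in more than one bit. */
int single_flipped_bit(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a ^ b);
    int bit = 0;

    /* diff & (diff - 1) clears the lowest set bit; non-zero means >1 bit set */
    if (diff == 0 || (diff & (diff - 1)) != 0)
        return -1;

    while ((diff >> bit) != 1)
        bit++;
    return bit;
}
```

Because the device ID occupies the upper half of the 32-bit word at config offset 0x00, bit 9 of the device ID is bit 25 of that word.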

Fortunately, the Linux PCI driver code incorporates a "quirks" mechanism specifically designed to implement horrible bodges to sort out spastic-arsed interface chips that don't do what they're supposed to. (There are a lot of these, and a correspondingly large number of bodges.) So I hacked together my own horrible bodge which replaces the incorrect values read from the PCI config space with the correct values before the Linux PCI subsystem gets its hands on them.

Well, that sort of half worked. With my bodge in place, Linux did correctly identify the uPD720102 as a USB controller and attempted to initialise it with the USB drivers. However, the initialisation failed. On boot, I was getting this sort of shit:

Sep 6 11:56:34 box kernel: [ 33.103896] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.107886] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.110884] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.113882] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.288051] usb 8-2: device descriptor read/64, error 33554450
Sep 6 11:56:34 box kernel: [ 33.556053] usb 8-2: new full-speed USB device number 3 using ohci-pci
Sep 6 11:56:34 box kernel: [ 33.563604] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.567594] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.570588] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.573588] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.748049] usb 8-2: device descriptor read/64, error 33554450
Sep 6 11:56:34 box kernel: [ 33.859411] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.863407] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.866400] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:34 box kernel: [ 33.869396] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.044047] usb 8-2: device descriptor read/64, error 33554450
Sep 6 11:56:35 box kernel: [ 34.324049] usb 8-2: new full-speed USB device number 4 using ohci-pci
Sep 6 11:56:35 box kernel: [ 34.332105] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.336101] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.357090] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.357177] usb 8-2: Invalid ep0 maxpacket: 66
Sep 6 11:56:35 box kernel: [ 34.536043] usb 8-2: new full-speed USB device number 5 using ohci-pci
Sep 6 11:56:35 box kernel: [ 34.543968] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.547959] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.569951] ohci-pci 0000:04:05.0: bad entry 2000000
Sep 6 11:56:35 box kernel: [ 34.570040] usb 8-2: Invalid ep0 maxpacket: 66
Sep 6 11:56:35 box kernel: [ 34.572026] hub 8-0:1.0: unable to enumerate USB device on port 2

I tried pissing about removing and reinserting the OHCI and EHCI kernel modules, and got some more of the above and also stuff like this:

Sep 6 12:10:19 box kernel: [ 858.312188] ohci-pci 0000:04:05.0: OHCI PCI host controller
Sep 6 12:10:19 box kernel: [ 858.312264] ohci-pci 0000:04:05.0: new USB bus registered, assigned bus number 8
Sep 6 12:10:19 box kernel: [ 858.312365] ohci-pci 0000:04:05.0: irq 20, io mem 0xfebff000
Sep 6 12:10:19 box kernel: [ 858.437541] usb usb8: New USB device found, idVendor=1d6b, idProduct=0001
Sep 6 12:10:19 box kernel: [ 858.437612] usb usb8: New USB device strings: Mfr=3, Product=2, SerialNumber=1
Sep 6 12:10:19 box kernel: [ 858.437687] usb usb8: Product: OHCI PCI host controller
Sep 6 12:10:19 box kernel: [ 858.437751] usb usb8: Manufacturer: Linux 3.14.15 ohci_hcd
Sep 6 12:10:19 box kernel: [ 858.437815] usb usb8: SerialNumber: 0000:04:05.0
Sep 6 12:10:19 box kernel: [ 858.438037] hub 8-0:1.0: USB hub found
Sep 6 12:10:19 box kernel: [ 858.438115] hub 8-0:1.0: config failed, hub has too many ports! (err -19)
Sep 6 12:10:19 box kernel: [ 858.488019] ohci-pci 0000:04:05.0: controller won't resume
Sep 6 12:10:19 box kernel: [ 858.488087] ohci-pci 0000:04:05.0: HC died; cleaning up

Sep 6 12:12:13 box kernel: [ 972.075570] irq 20: nobody cared (try booting with the "irqpoll" option)
Sep 6 12:12:13 box kernel: [ 972.075642] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.14.15 #3
Sep 6 12:12:13 box kernel: [ 972.075706] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./JW-RS780G-UVD+, BIOS 080014 05/22/2008
Sep 6 12:12:13 box kernel: [ 972.075784] ffff8800cb9740c4 ffffffff814d69ec ffff8800cb974000 ffffffff810b6b18
Sep 6 12:12:13 box kernel: [ 972.076055] ffff8800cb974000 0000000000000000 0000000000000014 ffffffff810b6ebf
Sep 6 12:12:13 box kernel: [ 972.076325] 0000000000000000 0000000000000014 0000000000000000 0000000000000000
Sep 6 12:12:13 box kernel: [ 972.076595] Call Trace:
Sep 6 12:12:13 box kernel: [ 972.076656] <IRQ> [<ffffffff814d69ec>] ? dump_stack+0x41/0x51
Sep 6 12:12:13 box kernel: [ 972.076825] [<ffffffff810b6b18>] ? __report_bad_irq+0x28/0xc0
Sep 6 12:12:13 box kernel: [ 972.076890] [<ffffffff810b6ebf>] ? note_interrupt+0x25f/0x2b0
Sep 6 12:12:13 box kernel: [ 972.076955] [<ffffffff810b48c1>] ? handle_irq_event_percpu+0x151/0x1c0
Sep 6 12:12:13 box kernel: [ 972.077021] [<ffffffff810b4963>] ? handle_irq_event+0x33/0x50
Sep 6 12:12:13 box kernel: [ 972.077086] [<ffffffff810b7908>] ? handle_fasteoi_irq+0x58/0x110
Sep 6 12:12:13 box kernel: [ 972.077153] [<ffffffff81015c78>] ? handle_irq+0x18/0x30
Sep 6 12:12:13 box kernel: [ 972.077218] [<ffffffff81015593>] ? do_IRQ+0x43/0xe0
Sep 6 12:12:13 box kernel: [ 972.077283] [<ffffffff814dc26d>] ? common_interrupt+0x6d/0x6d
Sep 6 12:12:13 box kernel: [ 972.077346] <EOI> [<ffffffff8108637e>] ? __hrtimer_start_range_ns+0x1be/0x440
Sep 6 12:12:13 box kernel: [ 972.078811] [<ffffffff8104e112>] ? native_safe_halt+0x2/0x10
Sep 6 12:12:13 box kernel: [ 972.078877] [<ffffffff8101c794>] ? default_idle+0x14/0xb0
Sep 6 12:12:13 box kernel: [ 972.078942] [<ffffffff8101c867>] ? amd_e400_idle+0x37/0x100
Sep 6 12:12:13 box kernel: [ 972.079006] [<ffffffff810b3cad>] ? cpu_startup_entry+0xbd/0x260
Sep 6 12:12:13 box kernel: [ 972.079073] [<ffffffff81041c61>] ? start_secondary+0x1d1/0x280
Sep 6 12:12:13 box kernel: [ 972.079137] handlers:
Sep 6 12:12:13 box kernel: [ 972.079214] [<ffffffffa0024a40>] usb_hcd_irq [usbcore]
Sep 6 12:12:13 box kernel: [ 972.079327] Disabling IRQ #20

Now, up until this point I had had the PS/2 trackball plugged into the card's PS/2 port, and a USB wireless adaptor plugged into one of its USB ports just to give it something to identify. For no particular reason, after yet again removing and reinserting the OHCI and EHCI modules and getting another run of "bad entry 2000000" errors, I decided to unplug the wireless adaptor. Well, fuck me ragged! Immediately I did this, the card "woke up" and the PS/2 side of things started working!

Sep 6 12:56:30 box kernel: [ 978.664035] usb 8-3: new low-speed USB device number 29 using ohci-pci
Sep 6 12:56:30 box kernel: [ 978.883011] usb 8-3: New USB device found, idVendor=0a81, idProduct=0205
Sep 6 12:56:30 box kernel: [ 978.883081] usb 8-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Sep 6 12:56:30 box kernel: [ 978.883145] usb 8-3: Product: PS2 to USB Converter
Sep 6 12:56:30 box kernel: [ 978.883208] usb 8-3: Manufacturer: CHESEN
Sep 6 12:56:30 box kernel: [ 978.900521] input: CHESEN PS2 to USB Converter as /devices/pci0000:00/0000:00:14.4/0000:04:05.0/usb8/8-3/8-3:1.0/0003:0A81:0205.0005/input/input16
Sep 6 12:56:30 box kernel: [ 978.900789] hid-generic 0003:0A81:0205.0005: input,hidraw1: USB HID v1.10 Keyboard [CHESEN PS2 to USB Converter] on usb-0000:04:05.0-3/input0
Sep 6 12:56:30 box kernel: [ 978.911767] input: CHESEN PS2 to USB Converter as /devices/pci0000:00/0000:00:14.4/0000:04:05.0/usb8/8-3/8-3:1.1/0003:0A81:0205.0006/input/input17
Sep 6 12:56:30 box kernel: [ 978.912067] hid-generic 0003:0A81:0205.0006: input,hidraw2: USB HID v1.10 Mouse [CHESEN PS2 to USB Converter] on usb-0000:04:05.0-3/input1
Sep 6 12:56:30 box mtp-probe: checking bus 8, device 29: "/sys/devices/pci0000:00/0000:00:14.4/0000:04:05.0/usb8/8-3"
Sep 6 12:56:30 box mtp-probe: bus: 8, device: 29 was not an MTP device

So naturally I tore all my clothes off and jumped around the room wanking like a maniac.

Well, actually, no, I didn't. But I did go and make a pot of tea. And it has been working fine ever since - though be it noted that I have not yet dared to reboot; since I never switch the machine off anyway, after all this fucking trouble I am very much inclined to let sleeping dogs lie until they are woken by some external influence like a power cut or something.

Somehow, also, something has managed to write the correct values into the PCI config space now. lspci -xx shows that the device ID field has changed back to 0x0035/0x00e0, and the device class to 0x0c, as they should be.

So, basically, it is working now, but I haven't got a fucking clue why it wasn't working to begin with or why incorrect values were being written to supposedly read-only locations. I suppose my point in writing this page is to convey the message to anyone else who is being fucked around by an NEC uPD720102 acting like a total fuck-arsed spastic piece of shit: Don't give up. Keep kicking the bastard thing's arse and with a bit of luck it might sort itself out eventually, even if it does take 16 hours of pissing around to do it.

At least this is what I thought until I shut the machine down to install some more memory and could NOT get the cunting thing to work again afterwards. No amount of removing/reinserting either modules or things plugged into it made any difference. What's more, the corruption of supposedly read-only registers had gone up another level in weirdness.

The post-POST screen persistently showed the card as "Unknown device" because the device class was persistently wrong. The arse-felching shitebag of a BIOS had again corrupted the device class to 0x0e and the device/subsystem IDs to 0x0235/0x02e0. Trying to reset the class with setpci did nothing at all, but trying to reset the device and subsystem IDs did WEIRD SHIT. Sometimes nothing happened. Sometimes the register values would toggle back and forth between having the spurious 2 in the device ID and having it in the subsystem ID with every attempted write operation. Sometimes writing to the registers of the OHCI section of the card would leave them unchanged but would update the EHCI section instead, or vice versa. I haven't got the faintest fucking clue what the fuck was going on here and am driven to conclude that this chip is a piece of shit and the best thing to do is just get rid of the cunt.

Which, effectively, is what I ended up doing. Needing a functioning PS/2 port, and running out of patience frigging about with this shitey chip while still not understanding what its fucking problem is, I decided to pigeon the hardware instead of the software. I severed the PCB traces between the Chesen and NEC chips, then cut the tail off a dead mouse and soldered the signal conductors directly to the relevant pins on the Chesen chip. I then plugged the other end of the tail into a standard USB port (ie. one not provided by the NECunt chip). Bingo! One functioning PS/2 interface.

But just to compound the weirdness, when I rebooted after installing the pigeoned hardware, the post-POST screen, for the first time in this session, reported the card as a USB controller, and once Linux had come up I could see that the device class had been reset to 0x0c. Waaarrrgghh!!! I am now driven to wonder if the shittiness of the NEC chip is triggered simply by having USB devices connected to it while it is powering up. I could, of course, add a DPDT switch to my hardware bodge so that I could switch it between standard and pigeoned configurations in order to test this possibility. Right now, however, I cannot be arsed.

Oh, and this is my horrible kernel bodge, in case anyone's interested:

diff -ur linux-source-3.14.orig/drivers/pci/quirks.c linux-source-3.14/drivers/pci/quirks.c
--- linux-source-3.14.orig/drivers/pci/quirks.c	2014-09-07 14:41:46.492000880 +0100
+++ linux-source-3.14/drivers/pci/quirks.c	2014-09-07 14:42:56.828858008 +0100
@@ -28,6 +28,38 @@
 #include "pci.h"
 
 /*
+ * NEC uPD720102 keeps getting the class set wrong by the BIOS.
+ * _Sometimes_ it gets the correct class 0x0c, but more often
+ * it gets 0x04 (Audio device) or 0x0e (intelligent controller).
+ * The result is that Linux does not recognise the device.
+ * And how the fuck is the BIOS writing to a read-only register
+ * anyway? There are no clues in the datasheet AFAICT.
+ */
+
+static void quirk_nec_720102_wrong_class(struct pci_dev *dev) {
+	if (dev->device & 0xff00) {
+		dev->device &= 0xff;
+		dev_info(&dev->dev, "NEC 720102: forced device &= 0xff\n");
+	}
+	if (dev->subsystem_device & 0xff00) {
+		dev->subsystem_device &= 0xff;
+		dev_info(&dev->dev, "NEC 720102: forced subsystem_device &= 0xff\n");
+	}
+	if ((dev->class & 0xffff) == 0x0310) {
+		dev->class = PCI_CLASS_SERIAL_USB_OHCI;
+		dev_info(&dev->dev, "NEC 720102: forced PCI_CLASS_SERIAL_USB_OHCI\n");
+	}
+	if ((dev->class & 0xffff) == 0x0320) {
+		dev->class = PCI_CLASS_SERIAL_USB_EHCI;
+		dev_info(&dev->dev, "NEC 720102: forced PCI_CLASS_SERIAL_USB_EHCI\n");
+	}
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB, quirk_nec_720102_wrong_class);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB_2_0, quirk_nec_720102_wrong_class);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB | 0x200, quirk_nec_720102_wrong_class);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB_2_0 | 0x200, quirk_nec_720102_wrong_class);
+
+/*
  * Decoding should be disabled for a PCI device during BAR sizing to avoid
  * conflict. But doing so may cause problems on host bridge and perhaps other
  * key system devices. For devices that need to have mmio decoding always-on,
diff -ur linux-source-3.14.orig/include/linux/pci_ids.h linux-source-3.14/include/linux/pci_ids.h
--- linux-source-3.14.orig/include/linux/pci_ids.h	2014-09-07 14:41:06.823517200 +0100
+++ linux-source-3.14/include/linux/pci_ids.h	2014-09-07 14:42:45.164715907 +0100
@@ -652,6 +652,7 @@
 #define PCI_DEVICE_ID_NEC_VRC5476 0x009b
 #define PCI_DEVICE_ID_NEC_VRC4173 0x00a5
 #define PCI_DEVICE_ID_NEC_VRC5477_AC97 0x00a6
+#define PCI_DEVICE_ID_NEC_USB_2_0 0x00e0 /* PCI-USB 2.0 Host */
 #define PCI_DEVICE_ID_NEC_PC9821CS01 0x800c /* PC-9821-CS01 */
 #define PCI_DEVICE_ID_NEC_PC9821NRB06 0x800d /* PC-9821NR-B06 */
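If you want to see what the bodge does without building a kernel, the masking logic can be tried out in userspace. This is only a sketch: struct fake_pci_dev and fake_quirk_nec_720102() are made-up stand-ins for the kernel's struct pci_dev and the quirk function, and the two class constants are written out by hand with the values the kernel uses for PCI_CLASS_SERIAL_USB_OHCI and PCI_CLASS_SERIAL_USB_EHCI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 24-bit class codes: base class << 16 | subclass << 8 | prog-if. */
#define CLASS_USB_OHCI 0x0c0310u
#define CLASS_USB_EHCI 0x0c0320u

/* Hypothetical stand-in holding only the fields the quirk touches. */
struct fake_pci_dev {
    uint16_t device;
    uint16_t subsystem_device;
    uint32_t class_code;
};

/* Same logic as the kernel quirk, minus the dev_info() logging. */
void fake_quirk_nec_720102(struct fake_pci_dev *dev)
{
    assert(dev != NULL);

    /* Strip any spurious high byte (e.g. the stray 0x0200) from the IDs. */
    if (dev->device & 0xff00)
        dev->device &= 0xff;
    if (dev->subsystem_device & 0xff00)
        dev->subsystem_device &= 0xff;

    /* Subclass and prog-if survive the corruption, so use them to work
     * out what the base class should have been. */
    if ((dev->class_code & 0xffff) == 0x0310)
        dev->class_code = CLASS_USB_OHCI;
    if ((dev->class_code & 0xffff) == 0x0320)
        dev->class_code = CLASS_USB_EHCI;
}
```

Feeding it a device with ID 0x0235 and class 0x0e0310 gives back 0x0035 and 0x0c0310. Note that the bodge relies on the subclass and prog-if bytes being intact; if those were corrupted too, the class would be left alone.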



