October 8, 2018
This article was contributed by Lukas Wunner
PCI Express hotplug has been supported in Linux for fourteen years. The code, which is aging, is currently undergoing a transformation to fit the needs of contemporary applications such as hot-swappable flash drives in data centers and power-manageable Thunderbolt controllers in laptops. Time for a roundup.
The initial PCI specification from 1992 had no provisions for the addition or removal of cards at runtime. In the late 1990s and early 2000s, various proprietary hotplug controllers, as well as the vendor-neutral standard hotplug controller, were conceived and became supported by Linux through drivers living in drivers/pci/hotplug. PCI Express (PCIe), by contrast, supported hotplug from the get-go in 2002, but its embodiments have changed over time. Originally intended to hot-swap PCIe cards in servers or ExpressCards in laptops, today it is commonly used in data centers (where NVMe flash drives need to be swapped at runtime) and by Thunderbolt (which tunnels PCIe through a hotpluggable chain of converged I/O switches, together with other protocols such as DisplayPort).
Linux’s PCIe hotplug driver, called pciehp, was introduced in 2004 by Dely Sy. The first major cleanup and rework was carried out by Kenji Kaneshige, who stopped doing this work in 2011. After that, contributions were largely confined to duct-taping over the driver’s inherent weaknesses, in particular its event handling.
Threaded interrupts are the predominant interrupt-handling pattern in the kernel and a cornerstone of realtime Linux but, unfortunately, they were introduced after Kaneshige’s rework had concluded. pciehp’s hard interrupt handler therefore identified which event(s) had occurred, such as “link-up” or “link-down”, and queued a work item for each of them. The problem with this approach was that, by the time the work item was executed, the link status may well have changed again. Moreover, if the link flipped more quickly than the hard interrupt handler ran, unbalanced link-up and link-down events would be detected. Finally, the possibility of multiple in-flight work items and how they interact made it difficult to reason about the event-handling code’s correctness. Recently, Bjorn Helgaas (who is the PCI maintainer) referred to pciehp’s event handling as “baroque”. A point was reached where duct-taping was no longer an option and a fundamental rethink of the driver became unavoidable.
For Linux 4.19, I converted pciehp to threaded interrupt handling; events are now collected by the hard interrupt handler for later consumption by the interrupt thread. The decision whether a link change is a link-up or a link-down is deferred until the interrupt thread handles the event, which avoids acting on stale events. The new approach can cope with a quick succession of events (such as a link-down rapidly followed by a link-up) and tolerates link flaps during bring-up of the slot, which may be caused by sloppy board layout or electromagnetic interference. The patch set also included a fair number of bug fixes and cleanups so, overall, robustness and reliability should improve noticeably. Follow-up patches queued for 4.20 shave off close to 500 lines from pciehp and other hotplug drivers, resulting in further simplification and rationalization.
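The division of labor between the hard interrupt handler and the interrupt thread can be illustrated with a toy model. What follows is a minimal Python sketch, not kernel code: the `HotplugSlot` class, its methods, and the event bit values are hypothetical names invented for this illustration, and the real driver's state machine is more involved. The handler merely accumulates event bits; the thread consumes them later and samples the link state at that moment.

```python
import threading

# Hypothetical event bits for this sketch (not the kernel's definitions).
LINK_CHANGED = 0x1
PRESENCE_CHANGED = 0x2


class HotplugSlot:
    """Toy model of a hotplug slot with deferred event handling."""

    def __init__(self):
        self.pending = 0              # events collected by the hard handler
        self.lock = threading.Lock()  # stands in for atomic bit operations
        self.link_active = False      # current hardware link status
        self.enabled = False          # whether the slot has been brought up
        self.log = []

    def hard_irq(self, events):
        """'Hard interrupt' context: only record what happened."""
        with self.lock:
            self.pending |= events

    def irq_thread(self):
        """Runs later: consume events and act on the *current* link state."""
        with self.lock:
            events, self.pending = self.pending, 0
        if events & LINK_CHANGED:
            # Decide between link-up and link-down now, not when the IRQ fired.
            if self.link_active and not self.enabled:
                self.enabled = True
                self.log.append("bring up")
            elif not self.link_active and self.enabled:
                self.enabled = False
                self.log.append("bring down")
        return events


slot = HotplugSlot()
# The link goes up, down, and up again before the thread gets to run.
for state in (True, False, True):
    slot.link_active = state
    slot.hard_irq(LINK_CHANGED)
slot.irq_thread()
print(slot.log)
```

Three link changes collapse into a single pending bit, and the thread brings the slot up once because the link is active when it finally runs. In the older work-item scheme, each of the three changes would have produced its own queued item.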
Linux 4.19 will also add the ability to runtime suspend PCIe hotplug ports. This is necessary to power down Thunderbolt controllers, which show up in the operating system as a PCIe upstream port and multiple PCIe downstream ports (with hotplugged devices appearing beneath the latter once a tunnel has been established). Before a controller can power down, all its PCIe ports need to runtime suspend. Linux has been able to runtime suspend the upstream port since 4.8, but could not runtime suspend the downstream ports before 4.19.
Runtime suspending a Thunderbolt PCIe port does not itself result in any saved power: the port will encapsulate and decapsulate PCIe packets for transport over the converged I/O switch and consume energy as long as that switch is powered. However, Linux’s power-management model requires that all child devices suspend before their parent can. By runtime suspending all of the Thunderbolt controller’s ports, its parent, a root port, is allowed to suspend which, in turn, triggers power-down of the controller through ACPI platform methods. Powering down the controller does result in a significant power saving of about 1.5W.
Put another way, runtime suspending Thunderbolt PCIe ports is done to satisfy the needs of Linux’s hierarchical power-management model. A single Thunderbolt PCIe port consumes the same amount of energy regardless of whether its PCI power state is D0 (full power) or D3hot (suspended), but when all ports are runtime suspended, the controller as a whole can be powered down. (Powering down Thunderbolt controllers on Macs will need further patches that may appear in the 4.21 time frame.)
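The hierarchical rule (a parent may suspend only after all of its children have) can be sketched as a small tree walk. This is a simplified Python model under stated assumptions: the `Device` class and its methods are hypothetical, and the kernel's actual runtime power-management core works with usage counts and autosuspend delays that are ignored here.

```python
class Device:
    """Toy device in a power-management hierarchy (names are illustrative)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.suspended = False
        if parent is not None:
            parent.children.append(self)

    def can_suspend(self):
        # A parent may only suspend once every child has suspended.
        return all(child.suspended for child in self.children)

    def try_suspend(self):
        if self.suspended or not self.can_suspend():
            return False
        self.suspended = True
        # Suspending this device may in turn allow its parent to suspend.
        if self.parent is not None:
            self.parent.try_suspend()
        return True


root_port = Device("root port")
upstream = Device("upstream port", root_port)
downstream = [Device(f"downstream port {i}", upstream) for i in range(3)]

downstream[0].try_suspend()
downstream[1].try_suspend()
one_port_active = root_port.suspended   # False: a downstream port is still active

downstream[2].try_suspend()
all_ports_idle = root_port.suspended    # True: the suspend propagated upward
print(one_port_active, all_ports_idle)
```

In this model, as in the article's description, suspending an individual downstream port changes nothing by itself; only the last one to suspend unblocks the upstream port and, through it, the root port.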
An interesting detail is the handling of hotplug events that occur while a PCIe hotplug port is runtime suspended: if its parents are runtime suspended as well, the port is inaccessible. So it cannot signal the interrupt in-band, and the kernel can’t react to it or even determine the type of event until the parents are runtime resumed. There are two known ways for hardware to deal with this.
The first is in accordance with the PCIe specification: the hotplug port signals a power-management event (PME), which may happen out-of-band through a means provided by the platform, such as a general-purpose I/O pin (a WAKE# signal in PCIe terminology). The PME causes wakeup of the hierarchy below the Thunderbolt host controller, whereupon the hotplug port becomes accessible. This method is known to be used on Lenovo and Dell laptops with Thunderbolt 3 and allows controllers to power down even if devices are attached. Mika Westerberg has submitted patches for 4.20 to support it.
The second method is nonstandard: the Thunderbolt hardware knows which tunnels are currently established and can therefore convert a hotplug event occurring at the converged I/O layer to a hotplug event occurring at the overlaid PCIe layer. Thus, when a device is attached or removed, an interrupt is magically received from the affected PCIe port regardless of whether it and its parents are in D3hot. This method is known to be used on Apple Macs with Thunderbolt 1 and requires that the Thunderbolt host controller remain powered up as long as devices are attached. Support for it was added in 4.19.
Runtime power management is currently not enabled for non-Thunderbolt hotplug ports as they are known to cause issues such as non-maskable interrupts when put into D3hot. Vendors may pass “pcie_port_pm=force” on the command line to validate their hotplug ports for runtime suspend support and perhaps the feature can be enabled by default at a later point.
The original PCIe specification defined a standard usage model that included a manually operated retention latch to hold a card in place and an attention button to request bring-down of a slot from the operating system. But contemporary implementations often omit those elements and rely solely on surprise removal of devices.
When a device is yanked out, pciehp asks its driver to unbind, then brings down the slot. But, until that happens, read requests to the device will time out after (typically) 17ms, and return a fabricated “all ones” response. The timeouts slow down the requesting task and, if the fabricated response is mistaken for real data, the task may crash or get stuck in an infinite loop. Drivers therefore need to validate data read from a device and, in particular, check for all ones in cases when that is not a valid response. A common idiom is to call pci_device_is_present(), which reads the vendor ID register and checks if it is all ones. However, that is not a panacea; if a PCIe uncorrectable error occurs, the device may also respond with all ones, but revert to valid responses if the error can be recovered. Moreover, all ones is returned for unsupported requests or read requests that are inside a bridge’s address window but outside any of the target device’s base address registers (BARs). The only entity that can identify removal authoritatively and unambiguously is pciehp.
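To make the danger of an unvalidated read concrete, here is a minimal Python sketch of a polling loop. Everything in it is an assumption made for illustration: `FakeDevice`, `wait_until_ready()`, the register offset, and the ready bit are hypothetical and do not correspond to any real driver.

```python
ALL_ONES_32 = 0xFFFFFFFF


class FakeDevice:
    """Hypothetical stand-in for a memory-mapped device."""

    def __init__(self, registers):
        self.registers = dict(registers)
        self.present = True

    def read32(self, offset):
        # A read to a removed device completes with a fabricated all-ones value.
        if not self.present:
            return ALL_ONES_32
        return self.registers.get(offset, 0)


def wait_until_ready(dev, status_offset, ready_bit, max_polls=1000):
    """Poll a status register, bailing out on all ones or after max_polls."""
    for _ in range(max_polls):
        status = dev.read32(status_offset)
        if status == ALL_ONES_32:
            # Not a valid status value for this register: assume the device is gone.
            return "gone"
        if status & ready_bit:
            return "ready"
    return "timeout"


dev = FakeDevice({0x10: 0x1})
before_removal = wait_until_ready(dev, 0x10, ready_bit=0x1)
dev.present = False
after_removal = wait_until_ready(dev, 0x10, ready_bit=0x1)
print(before_removal, after_removal)
```

Without the all-ones check, the removed device would be reported as ready, since every bit of the fabricated value is set, including the ready bit. The bounded loop guards against the opposite failure, where a bit that never changes would otherwise keep the task spinning forever.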
Many drivers, and even the PCI core, do not check every read for an all-ones response. Engineers working on Facebook’s “Lightning” storage architecture had to learn this the hard way [YouTube]. Surprise-removing an entire array of NVMe drives took many seconds and occasionally caused machine-check exceptions. It was so slow that the driver would find itself talking with a new device plugged into that slot before the processing of the previous removal had completed. One of the outcomes was a patch set by Keith Busch in Linux 4.12 to have pciehp set a flag on surprise-removed devices and skip access to them in a few strategic places in the PCI core. This was sufficient to speed up removal to microseconds. In particular, pci_device_is_present() is short-circuited to return false if the flag is set. Before, if the device was quickly swapped with another one, the function incorrectly returned true for the removed device once the vendor ID of the new device became readable.
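The effect of the flag on the presence check can be shown with a toy model of the quick-swap scenario. This Python sketch only loosely mirrors the behavior described above; the classes, the `disconnected` attribute, and the two functions are hypothetical names, and the vendor IDs are arbitrary example values.

```python
VENDOR_ID_ALL_ONES = 0xFFFF


class PhysicalSlot:
    """What is physically plugged into the slot right now (toy model)."""

    def __init__(self, vendor_id=None):
        self.vendor_id = vendor_id  # None means the slot is empty

    def read_vendor_id(self):
        return VENDOR_ID_ALL_ONES if self.vendor_id is None else self.vendor_id


class PciDevice:
    """Software representation of one particular device instance."""

    def __init__(self, slot):
        self.slot = slot
        self.disconnected = False  # set by the hotplug driver on removal


def is_present_vendor_id_only(dev):
    """Older behavior in this model: trust the vendor ID register alone."""
    return dev.slot.read_vendor_id() != VENDOR_ID_ALL_ONES


def is_present_with_flag(dev):
    """Flag-aware behavior: a flagged device is gone, no access needed."""
    if dev.disconnected:
        return False
    return dev.slot.read_vendor_id() != VENDOR_ID_ALL_ONES


slot = PhysicalSlot(vendor_id=0x8086)   # example value
old_dev = PciDevice(slot)

slot.vendor_id = None                   # device is yanked out
old_dev.disconnected = True             # hotplug driver notices and flags it
slot.vendor_id = 0x144D                 # a different device is plugged in

print(is_present_vendor_id_only(old_dev))  # True: fooled by the new device
print(is_present_with_flag(old_dev))       # False: the flag settles it
```

The vendor ID register describes whatever currently occupies the slot, whereas the flag is attached to the software object for the device that was removed, which is why only the latter gives the right answer here.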
At Benjamin Herrenschmidt’s behest, another patch by Busch is now queued for 4.20 to unify the flag with the error state of a PCI device. The error state makes it possible to distinguish whether the device is temporarily inaccessible after an uncorrectable error but has a chance to come back, or whether it has failed permanently. Drivers can either check the error_state member in struct pci_dev directly or call pci_channel_offline() to determine the accessibility of a device.
However, Helgaas has voiced misgivings about the flag in general. For one, the flag is set asynchronously, so there is a latency between the device being removed and the flag being set. Driver authors need to be wary that, even if the device seems present per the flag, it may no longer be there. Conversely, if set, the flag does provide a definitive indication that any further device access is futile and can be skipped. The flag therefore does not relieve driver authors from validating responses from the device but, once set, it serves as a cache and avoids ambiguous vendor ID checks. In short, the problem is mitigated but not solved perfectly. A perfect solution seems nearly impossible though; we cannot acquire a mutex on the user to prevent them from yanking a device and we cannot check for a presence change after every device access for performance reasons. Austin Bolen pointed out that a new PCIe extension called “root port programmed I/O” allows for synchronous exception handling of failed device accesses and thus for a seemingly perfect solution, but “this feature won’t be available in products for some time and is optional”.
A second concern Helgaas has with the flag is that it may obscure bugs that occur upon surprise removal but which become less visible when the flag is set, complicating their discovery and resolution. For example, a search for the advanced error reporting (AER) capability on device removal caused numerous configuration-space accesses and, before introduction of the flag, was noticeable through a significant slowdown upon surprise removal. But the proper solution was to cache the position of the AER capability, rather than paper over the issue by skipping the configuration accesses using the flag.
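The difference between searching for a capability on every use and caching its position can be sketched as follows. This is an illustrative Python model rather than kernel code: `ConfigSpace`, `ext_cap_header()`, and `find_ext_capability()` are hypothetical helpers, and the capability IDs other than AER's are arbitrary. The header layout used (capability ID in bits 15:0, version in bits 19:16, next offset in bits 31:20, list starting at offset 0x100) follows the PCIe extended capability format.

```python
PCI_CFG_SPACE_SIZE = 0x100   # extended capabilities start at this offset
PCI_EXT_CAP_ID_ERR = 0x0001  # Advanced Error Reporting


class ConfigSpace:
    """Toy configuration space that counts how often it is read."""

    def __init__(self, dwords):
        self.dwords = dwords
        self.reads = 0

    def read32(self, offset):
        self.reads += 1
        return self.dwords.get(offset, 0)


def ext_cap_header(cap_id, next_offset, version=1):
    """Build an extended capability header for the toy config space."""
    return (next_offset << 20) | (version << 16) | cap_id


def find_ext_capability(cfg, cap_id, max_caps=480):
    """Walk the extended capability list; return the offset or 0."""
    offset = PCI_CFG_SPACE_SIZE
    for _ in range(max_caps):      # bound the walk in case of a broken list
        if offset < PCI_CFG_SPACE_SIZE:
            break
        header = cfg.read32(offset)
        if header in (0, 0xFFFFFFFF):  # empty list, or device gone
            break
        if header & 0xFFFF == cap_id:
            return offset
        offset = (header >> 20) & 0xFFC
    return 0


cfg = ConfigSpace({
    0x100: ext_cap_header(0x0003, 0x140),   # arbitrary other capability
    0x140: ext_cap_header(0x0019, 0x180),   # arbitrary other capability
    0x180: ext_cap_header(PCI_EXT_CAP_ID_ERR, 0),
})

aer_pos = find_ext_capability(cfg, PCI_EXT_CAP_ID_ERR)  # done once, up front
reads_for_search = cfg.reads
print(hex(aer_pos), reads_for_search)
```

Once `aer_pos` is stored alongside the device, later code paths (including the removal path) can use it without touching configuration space at all, so there is nothing left to time out after the device has been pulled.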
The move to threaded interrupts also eases integrating pciehp with the handling of PCIe uncorrectable errors: when such an error occurs at or below a hotplug port, it may cause a link-down event as a side effect. But sometimes the error can be recovered through software, by performing a secondary bus reset, for example. In this case, it is undesirable for pciehp to react to the link-down event by unbinding the attached devices from their drivers and bringing down the slot. Rather, it should wait to see whether the error can be recovered and ignore the link event if so. To this end, Busch and Sinan Kaya are currently working on patches to tie in pciehp with the AER and downstream port containment service drivers.
A PCIe device is allocated memory ranges for memory-mapped I/O that are configured in the device’s BARs. The memory ranges are usually predefined by the BIOS, but Linux may move them around on enumeration. Bridges upstream of a PCI device have their address windows configured in such a way that transactions targeting a device’s BAR are routed correctly.
When devices are hot-added, their memory requirements may not fit into the windows of their upstream bridges, necessitating a reshuffling of resources: neighboring BARs need to be moved and bridge windows adjusted. macOS gained this ability in 2013 for improved Thunderbolt support and calls it “PCIe pause”. Drivers are told to pause I/O to affected devices; on unpause, the BARs may have changed and drivers are required to reconfigure their devices and update internal data structures as necessary.
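Why a hot-added device may fail to fit can be seen from a simple first-fit search inside a bridge window. The following Python sketch rests on assumptions: `find_free_range()` and the addresses are made up for illustration, BAR sizes are taken to be powers of two with natural alignment, and the kernel's real resource allocator is considerably more elaborate.

```python
def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def find_free_range(window_start, window_end, allocated, size):
    """First-fit search for a naturally aligned BAR of `size` bytes.

    `allocated` is a list of non-overlapping (start, length) tuples inside
    the bridge window [window_start, window_end). Returns the start address
    of a free range, or None if the BAR does not fit.
    """
    candidate = align_up(window_start, size)
    for start, length in sorted(allocated):
        if candidate + size <= start:
            return candidate            # gap before this allocation is big enough
        candidate = max(candidate, align_up(start + length, size))
    if candidate + size <= window_end:
        return candidate
    return None


MiB = 1 << 20
KiB = 1 << 10

# A 2 MiB bridge window that already holds a 1 MiB BAR and a 16 KiB BAR.
window = (0xE0000000, 0xE0000000 + 2 * MiB)
in_use = [(0xE0000000, 1 * MiB), (0xE0100000, 16 * KiB)]

small = find_free_range(*window, in_use, 16 * KiB)
large = find_free_range(*window, in_use, 1 * MiB)
print(hex(small) if small is not None else None, large)
```

In this example, nearly 1 MiB of the window is still free, yet a hot-added 1 MiB BAR cannot be placed because the 16 KiB BAR sits at the start of the only suitably aligned slot. Moving that small BAR out of the way, or growing the bridge window, is the kind of reshuffling the paragraph above describes.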
Sergey Miroshnichenko recently submitted initial patches to bring moving of BARs to Linux, to positive reactions. The patches use existing callbacks to pause access to a device before a PCI reset and restart access afterward. Drivers will have to opt into BAR movement. macOS supports reallocation of PCI bus numbers and message-signaled interrupts in addition to BARs; Miroshnichenko is looking into adding that in a future revision of the patch set.
The SD Card 7.0 specification announced in June is based on PCIe and NVMe and may greatly extend the usage of PCIe hotplug in consumer electronics devices. The ongoing activities in the kernel therefore seem to come at the right time and promise to yield an up-to-par PCIe hotplug infrastructure over the coming releases.