In this blog post series I will show how we can use an ARM single board computer (SBC) as a PCIe card (PCIe endpoint).

At REDS, when developing PCIe based devices, we usually rely on FPGAs, for example to develop FPGA PCIe accelerators. These are often based on existing PCIe cards from AMD (Xilinx) and Intel (Altera), for example the Xilinx Alveo series or Zynq based development boards.

Alveo PCIe card – Image courtesy of Xilinx

Such cards are fine for developing accelerators, and the development boards allow for prototyping. However, they are often expensive and sometimes hard to develop for. We were searching for an alternative for rapid prototyping of new PCIe devices. For several projects we relied on emulation with QEMU for prototyping, then went straight to an FPGA based platform to implement the hardware. It would be nice to have a hardware platform that allows us to define the PCIe endpoint function in software.

Note : In PCIe terminology an “endpoint” is a device, and a “root complex” is the component that connects the CPU, RAM and PCIe. We can think of the root complex as the host computer.

A Linux based PCIe endpoint ?

Tux the penguin – Source : https://en.wikipedia.org/wiki/Linux

The Linux kernel provides a PCIe endpoint framework that allows PCIe endpoint functions to be developed independently of the underlying hardware. This means a developer can write a PCIe endpoint function (e.g., a GPU, an NVMe drive, a network card, etc.) once, and it can then in theory run on any hardware that has a Linux driver for its PCIe controller in endpoint mode. This is wonderful !

The Linux kernel comes with two example PCIe endpoint functions :

  1. A PCIe test function
  2. A PCIe non transparent bridge (NTB)

The test function implements all the basic functionality we would expect of a PCIe device: it provides some memory space through base address registers (BARs), it can send interrupts, and it can move data to and from the host over PCIe. It can even use the Linux DMA subsystem.

The non transparent bridge (NTB) allows two hosts (root complexes) to communicate over PCIe. (Note that a PCIe topology can only have one root complex.)
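To give an idea of what "defining an endpoint function in software" looks like, here is a minimal sketch of the configfs steps used to bind the test function to an endpoint controller. It only builds and prints the plan, it does not touch the system. The configfs layout and attribute names (`vendorid`, `deviceid`, `msi_interrupts`, `start`) follow the kernel's endpoint test how-to as far as I recall it. The controller name `fd000000.pcie-ep`, the function name `func1` and the ID values are placeholders chosen for illustration, not values taken from our board.

```python
from pathlib import PurePosixPath

# Root of the PCIe endpoint configfs tree (assumes configfs is mounted
# at /sys/kernel/config, which is the usual location).
CONFIGFS_ROOT = PurePosixPath("/sys/kernel/config/pci_ep")


def epf_test_plan(controller, func_name="func1",
                  vendor_id=0x104C, device_id=0xB500, msi_interrupts=16):
    """Return the ordered configfs steps as (action, path, value) tuples.

    controller, func_name and the ID values are illustrative placeholders.
    """
    func = CONFIGFS_ROOT / "functions" / "pci_epf_test" / func_name
    ctrl = CONFIGFS_ROOT / "controllers" / controller
    return [
        # Creating the directory instantiates one test function device.
        ("mkdir", str(func), None),
        # IDs the host will see when it enumerates the device.
        ("write", str(func / "vendorid"), f"{vendor_id:#06x}"),
        ("write", str(func / "deviceid"), f"{device_id:#06x}"),
        ("write", str(func / "msi_interrupts"), str(msi_interrupts)),
        # Linking the function into the controller binds the two together.
        ("symlink", str(ctrl / func_name), str(func)),
        # Writing 1 to "start" asks the controller to start the link.
        ("write", str(ctrl / "start"), "1"),
    ]


if __name__ == "__main__":
    # "fd000000.pcie-ep" is a hypothetical controller name.
    for action, path, value in epf_test_plan("fd000000.pcie-ep"):
        print(action, path, "" if value is None else value)
```

On a real board the controller name is whatever shows up under `controllers/` once the endpoint controller driver is loaded.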

Wow ! we can create PCIe devices with Linux !

Searching for compatible hardware

While the PCIe endpoint functions in Linux would work with any hardware that has drivers for its PCIe controller as endpoint, we still need to find such hardware. A quick web search yields some results :

Searching in the Linux kernel source code we can see :

There are some drivers for PCIe endpoint controllers from different providers, for example the “cadence” ones are for TI’s J721E SoCs. There are also drivers for Qualcomm, Rockchip based chips etc.
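For readers who want to repeat this search on their own kernel tree, here is a rough sketch. It assumes that endpoint controller drivers live under `drivers/pci/controller` and register themselves through a call whose name contains `pci_epc_create` (for example `devm_pci_epc_create`). The function name `find_endpoint_drivers` and the path in the usage part are made up for this example.

```python
import os


def find_endpoint_drivers(kernel_root, marker="pci_epc_create"):
    """List C files under drivers/pci/controller that mention `marker`.

    The marker is an assumption about how endpoint controller drivers
    register themselves; adjust it if your kernel version differs.
    """
    controller_dir = os.path.join(kernel_root, "drivers", "pci", "controller")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(controller_dir):
        for name in filenames:
            if not name.endswith(".c"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as src:
                if marker in src.read():
                    # Report paths relative to the kernel tree root.
                    hits.append(os.path.relpath(path, kernel_root))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical location of a kernel source checkout.
    for driver in find_endpoint_drivers("/usr/src/linux"):
        print(driver)
```

If the directory does not exist the function simply returns an empty list, so it is safe to point it at the wrong path while experimenting.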

The RK3399 based FriendlyElec NanoPC-T4 SBC

REDS being an embedded lab, we had a Rockchip based single board computer (SBC) available: a FriendlyElec NanoPC-T4. This is quite a powerful SBC. It has a hexa-core RK3399 ARM CPU and can be faster than a Raspberry Pi 4, at a very reasonable price. The RockPro64 board would also be a good candidate because it comes with DDR4 instead of DDR3 (NanoPC-T4) and is cheaper at $59.99 (2GB) or $79.99 (4GB).

FriendlyElec NanoPC-T4 – Image courtesy of FriendlyElec

Nevertheless, we had this board available, and according to its technical reference manual it should be able to operate as a PCIe endpoint. There also seemed to be a Linux driver for it. Someone did try to use a Rockchip RK3399 based board (RockPro64) as a PCIe endpoint (see post1, post2), however there was no information on whether the person succeeded…

FriendlyElec NanoPC-T4 back side with PCIe M.2 female connector – Image courtesy of FriendlyElec

The FriendlyElec NanoPC-T4 comes with a PCIe M.2 female connector that supports up to 4 PCIe lanes (PCIe 2.0 x4). This is often used to mount an NVMe SSD but can also be used to mount other cards.

PCIe card mounted on the M.2 slot with adapter – Source : https://gadgetrip.jp/2019/01/review_friendlyelec_nanopc_t4/

Such adapters can be found online. We chose an M.2 to PCIe x16 adapter from Delock.

M.2 to PCIe x16 female (electrically connected as PCIe x4 only) – Source : https://www.delock.com/produkt/64133/merkmale.html?f=s

A gender fluid single board computer ?

The SBC comes with a female PCIe port as it is most often used as a host (root complex), however, PCIe endpoint devices should come with a male PCIe connector. We cannot plug this board in a normal computer as is… Usually when developing a hardware board, either FPGA based or SoC based, the PCIe PCB routing will be done either to a male or female connector based on the application. Here the PCB is already laid out with a female connector, can we still use it as a PCIe endpoint device ?

To connect this board to our computer we will need some adapters. The PCIe standard connects Tx (transmitter) differential pairs to Rx (receiver) differential pairs: the root complex (host) Tx pairs are connected to the endpoint Rx pairs and vice versa.

PCIe pinout – Source : https://en.wikipedia.org/wiki/PCI_Express#Pinout

So we cannot simply use a male-to-male connector, because it would connect Tx (transmitters) to Tx (transmitters) and probably cause some magic smoke. Therefore in order to connect a PCIe endpoint device with a female connector one needs to use a male-to-male cable that swaps the Tx/Rx pairs. These specialty cables exist and can be ordered from here or here.

PCIe x4 male-to-male with Tx/Rx signal swap (SS) – Source : http://www.adtlink.cn/en/product/R22SS.html
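The wiring rule above can be sketched as a tiny model. This is a simplification that uses only two pin groups, named `PET` and `PER` after the slot-side transmit and receive pairs, and ignores lane count, polarity and all sideband signals. The names `SLOT_WIRED`, `CARD_WIRED`, `STRAIGHT`, `CROSSED` and `link_ok` are invented for this example.

```python
# Pin groups are named from the slot's point of view:
# "PET" pins are driven by the slot side, "PER" pins are listened on by it.
SLOT_WIRED = {"PET": "tx", "PER": "rx"}  # female connector routed as a host slot
CARD_WIRED = {"PET": "rx", "PER": "tx"}  # male edge connector of a normal card

# A cable maps each pin group on one end to a pin group on the other end.
STRAIGHT = {"PET": "PET", "PER": "PER"}  # plain pin-to-pin connection
CROSSED = {"PET": "PER", "PER": "PET"}   # cable with the Tx/Rx pairs swapped


def link_ok(side_a, cable, side_b):
    """True if every transmitter faces a receiver across the cable."""
    return all(
        {side_a[pin_a], side_b[pin_b]} == {"tx", "rx"}
        for pin_a, pin_b in cable.items()
    )


if __name__ == "__main__":
    # Normal case: a card plugged into a host slot.
    print("slot <-> card, straight:", link_ok(SLOT_WIRED, STRAIGHT, CARD_WIRED))
    # Our case: the SBC connector is routed like a host slot.
    print("slot <-> slot, straight:", link_ok(SLOT_WIRED, STRAIGHT, SLOT_WIRED))
    print("slot <-> slot, crossed: ", link_ok(SLOT_WIRED, CROSSED, SLOT_WIRED))
```

In this model two slot-wired connectors only form a valid link through the crossed cable, which is the situation of the host PC and the SBC.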

A cheaper alternative is to buy a pair of PCIe risers of the kind typically used in GPU coin mining rigs. These risers use a male-to-male USB3 cable to transport the differential pairs. The USB3 cable has enough wires to carry a PCIe x1 link. Luckily, when such a cable is connected in more unusual ways, such as between two male PCIe connectors or two female PCIe connectors, the Tx/Rx pairs get swapped. (Yes, I verified this with a continuity check.)

Male-to-male and Female-to-Female PCIe adapters made from (cheap) PCIe risers. These cross the Tx/Rx pairs, all other signals remain the same.

These cheap PCIe risers allowed us to easily connect two female PCIe connectors through a PCIe x1 link. However, to use all four lanes available we bought the R22SS adapter cable shown above. Both methods work.
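To put the x1 versus x4 difference in numbers, here is a quick back-of-the-envelope calculation. It assumes the link runs at PCIe 2.0 speed (5 GT/s per lane with 8b/10b encoding) and gives the theoretical rate per direction, before any protocol overhead. The function name is made up for this example.

```python
def pcie_gen2_bandwidth_mb_s(lanes):
    """Theoretical PCIe 2.0 bandwidth per direction, in MB/s."""
    raw_bits_per_s = 5_000_000_000               # 5 GT/s per lane
    payload_bits_per_s = raw_bits_per_s * 8 // 10  # 8b/10b: 8 data bits per 10 sent
    bytes_per_s = payload_bits_per_s // 8
    return bytes_per_s // 1_000_000 * lanes


if __name__ == "__main__":
    for lanes in (1, 4):
        print(f"x{lanes}: {pcie_gen2_bandwidth_mb_s(lanes)} MB/s per direction")
```

So the riser setup tops out around 500 MB/s per direction, while the x4 cable allows up to about 2000 MB/s. Real throughput will be lower.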

A computer “inside” another computer

The concept of having a computer inside another computer is not new. Examples that come to mind are “Perhaps the ultimate Raspberry PI case : Your PC” and the “Intel NUC Compute elements”, but those don’t act as PCIe endpoint devices. They only use the slot as a form factor or to draw power. Another example is the Mustang-200 compute card, which is composed of Intel laptop CPUs on a PCIe board with a Gigabit PCIe network card. The host PC sees the network card, and on the PCB the laptop CPUs are connected to that network, so there is still no direct PCIe connection. Nevertheless, ARM based PCIe cards exist, for example the Nvidia Bluefield DPUs (data processing units, previously from Mellanox). These cards offer a direct PCIe connection but are very expensive and hard to source.

We have now created our own ARM based PCIe endpoint device with a cheap single board computer and a few adapter cables.

A hacky but functional PCIe x4 connection between host PC and SBC

This board could be easily fitted inside a PC, a custom PCB could be made to connect everything more cleanly, but for the moment this is enough to experiment and try out the Linux PCIe endpoint framework.

Stay tuned for part 2

As one can imagine, things are never that easy in reality… Connecting everything without causing a fire was the easy part ! Stay tuned for part 2, where we will find out whether the Linux drivers for the RK3399 in PCIe endpoint mode work properly (spoiler alert, they won’t), fix them, and have the SBC run as a PCIe endpoint device !

Can you guess what PCIe function we will implement ?

Making sure that everything is right…

References