Monthly Archives: June 2015

OSp: Bare-metal OSv

In this post, I’d like to summarize what I did for bare-metal OSv, or OSp.

Fortunately, if you give up the functions that rely on Virtio devices (e.g., networking), OSv can already run on bare metal with only two small fixes.

First, you need to skip AcpiReallocateRootTable() in drivers/acpi.cc. Something in this function causes a triple fault and prevents OSv from booting.

Second, you should insert a sleep() call to wait for the AHCI port link to come up. Without it, OSv often wrongly concludes that the AHCI ports will never become ready.
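As a rough illustration of the second fix, here is a minimal sketch of a bounded wait. It is a polling variant of the idea, not the fixed sleep() described above and not code from OSv. The helper name wait_for_linkup and the link_is_up callback are hypothetical; in a real AHCI driver the callback would presumably check the port status register (for example, whether the DET field of PxSSTS reports an established link).

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of OSv): polls `link_is_up` until it returns
// true or `timeout` elapses, sleeping `interval` between polls.
// Returns true if the link came up in time, false otherwise.
inline bool wait_for_linkup(const std::function<bool()>& link_is_up,
                            std::chrono::milliseconds timeout,
                            std::chrono::milliseconds interval)
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        if (link_is_up()) {
            return true;
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(interval);
    }
}
```

Compared with a single fixed sleep, a loop like this returns as soon as the port is ready and still gives slow hardware a generous upper bound. The timeout and interval values are left to the caller because suitable numbers depend on the disk and controller.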

The remaining tasks for bare-metal OSv are straightforward.

First, build the OSv disk image.

This produces the image build/release.x64/usr.img, which is in QCOW2 format. Next, convert it to a raw image (e.g., named bare.img).

Then, boot your favorite OS on the target physical machine (the one on which bare-metal OSv is supposed to run) without mounting the primary local disk, for example via USB or network booting, and simply dd the bare.img you’ve just created onto the primary local disk.

Finally, restart the target physical machine from the local disk. OSp will show up :)

ONE MORE THING…

Since I got tired of dd-ing OSv’s kernel image to the local disk every time I modified the OSv source code, I made the OSv kernel network-bootable (as shown in the figure).

[Figure: FlyingOSp.002]

First, apply the patch. Then run git submodule update, because the patch adds iPXE as a submodule.

Next, make the network-bootable image of OSv.

Then, use the script, scripts/embed, to specify the command parameters you want to pass to OSv.

Now you have two important files: build/release.x64/loader.bin (the network-bootable OSv kernel image) and build/release.x64/loader.ipxe (the network-bootable iPXE image). The remaining steps enable PXE/iPXE chain-loading with these two files.

Set up a TFTP server and put the two files (loader.bin and loader.ipxe) in the TFTP directory. Also set up a DHCP server so that the target client machine is properly instructed during PXE booting. In the DHCP configuration file (typically dhcpd.conf), specify loader.ipxe as the boot file name for the client machine.

(How to set up TFTP/DHCP servers is well described in this article, for example.)

Here is an example of the DHCP/TFTP configuration:

Finally, start the client machine with PXE booting enabled. Flying OSp will show up :)

DEMO

In this movie, OSv boots over the network and lands on a physical machine (my HP server). I also applied the patch that adds the physical 10GbE driver to OSv, so OSv can use networking as well.

(Firmware initialization of my HP server takes more than a minute in the movie. Skip it if you are not interested. Sorry…)

ENVIRONMENT

Because OSp runs on physical machines, its behavior can easily vary with the underlying hardware. So here I’ll note the specification of the server on which I confirmed that OSp runs. My HP server is a ProLiant ML310e Gen8 v2 (Intel Xeon E3-1241 v3 @ 3.50GHz, PC3-12800E-11 4GB * 4, SAMSUNG HD103UJ 7200RPM 1TB). For reference, here is the output of lspci, /proc/cpuinfo and hdparm -I on the server.

Also note that OSv has the AHCI driver but does not have drivers for other physical block devices such as SCSI or RAID controllers.

SEE ALSO

Practicing C++: Hello World

For various reasons, I need to get used to C++, including C++11 and possibly even C++14.

I’ve read some books on C++ and learnt the basics, such as classes, templates, inheritance, encapsulation, polymorphism, virtual functions, and collections, along with some new features introduced in C++11, such as type inference, initializer lists, lambda expressions, smart pointers, and rvalue references. Now I’m reading and modifying some open source programs written in C++ for practice.

Through this practice, I’ve added a simple piece of code to OSv: an Intel 10GbE driver. With this code, OSv can directly handle a physical 10GbE NIC if the host allows NIC pass-through. I’ve confirmed that DHCP and ping succeed. (It is a kind of “Hello World” example program in C++, in which OSv can directly say “Hello” to the physical LAN “World”.)

I put my dirty patch here.
(Revised ones: Intel 10GbE Driver and Pass-through Script)

USAGE

(Assuming that you have already compiled the patched OSv on KVM)

First, make sure that the IOMMU is enabled in both your BIOS and your host OS.

Second, use the shell script (bindctrl), which is added by the patch, in order to remove the target 10GbE NIC from your host OS and make it ready for OSv. If your 10GbE NIC is located at Function #0 of Device #00 in Bus #07 with vendor ID 8086h and device ID 1528h, then the command will be:

Finally, run the Python script with the special option (-a 07:00.0), which is also added by the patch, to start OSv with PCI pass-through.

The output will be:

TODO

  • Use interrupts
  • Use offloading
  • Use header write back
  • Use advanced descriptors

COMPONENTS

  • pf class describes a physical function of a PCI device, which consists of the ioreg, phyreg, tx_queue, and rx_queue classes.
  • ioreg class describes the full set of I/O registers of the pf.
  • phyreg class describes the full set of PHY registers of the pf, which can be accessed through the MSCA/MSRWD registers.
  • tx_desc_layout struct is exactly the data structure of a TX descriptor.
  • rx_desc_layout struct is exactly the data structure of an RX descriptor.
  • desc template abstracts a descriptor and is parameterized by the descriptor layout type (tx_desc_layout or rx_desc_layout).
  • tx_desc class abstracts a TX descriptor and is derived from desc with the TX descriptor layout type, tx_desc_layout.
  • rx_desc class abstracts an RX descriptor and is derived from desc with the RX descriptor layout type, rx_desc_layout.
  • queue template describes a queue whose descriptors depend on the descriptor type (tx_desc or rx_desc).
  • tx_queue class behaves as a queue of TX descriptors but also supports transmission operations that manipulate the queue.
  • rx_queue class behaves as a queue of RX descriptors but also supports reception operations that manipulate the queue.
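The relationship between the descriptor and queue components listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the field names and 16-byte layouts are made up to resemble a legacy-style descriptor, the ring size template parameter N, the member functions (prepare, enqueue, reclaim, refill, receive) and the "bit 0 of status means done" convention are all hypothetical, and the pf, ioreg, and phyreg classes are left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed 16-byte layouts; field names are illustrative, not from the patch.
struct tx_desc_layout {
    std::uint64_t buffer_addr;  // address of the packet buffer
    std::uint16_t length;       // bytes to send
    std::uint8_t  cso;          // checksum offset
    std::uint8_t  cmd;          // command bits
    std::uint8_t  status;       // bit 0 = descriptor done (assumption)
    std::uint8_t  css;          // checksum start
    std::uint16_t vlan;
};
struct rx_desc_layout {
    std::uint64_t buffer_addr;  // address of the receive buffer
    std::uint16_t length;       // bytes received
    std::uint16_t checksum;
    std::uint8_t  status;       // bit 0 = descriptor done (assumption)
    std::uint8_t  errors;
    std::uint16_t vlan;
};
static_assert(sizeof(tx_desc_layout) == 16, "TX descriptor must be 16 bytes");
static_assert(sizeof(rx_desc_layout) == 16, "RX descriptor must be 16 bytes");

// desc: behavior shared by both descriptor types.
template <typename Layout>
class desc {
public:
    bool done() const { return (_raw.status & 0x1) != 0; }
    Layout& raw() { return _raw; }
protected:
    Layout _raw{};
};

class tx_desc : public desc<tx_desc_layout> {
public:
    void prepare(std::uint64_t addr, std::uint16_t len) {
        _raw = tx_desc_layout{};  // clear stale status bits
        _raw.buffer_addr = addr;
        _raw.length = len;
    }
};

class rx_desc : public desc<rx_desc_layout> {
public:
    void prepare(std::uint64_t addr) {
        _raw = rx_desc_layout{};  // clear stale status bits
        _raw.buffer_addr = addr;
    }
};

// queue: a fixed-size ring; one slot stays empty to tell full from empty.
template <typename Desc, std::size_t N>
class queue {
public:
    Desc& at(std::size_t i) { return _ring[i % N]; }
    bool empty() const { return _head == _tail; }
    bool full() const { return (_tail + 1) % N == _head; }
protected:
    void push() { _tail = (_tail + 1) % N; }
    void pop() { _head = (_head + 1) % N; }
    std::array<Desc, N> _ring{};
    std::size_t _head = 0;  // oldest descriptor still owned by hardware
    std::size_t _tail = 0;  // next free descriptor
};

class tx_queue : public queue<tx_desc, 8> {
public:
    // Queue one buffer for transmission; returns false if the ring is full.
    bool enqueue(std::uint64_t addr, std::uint16_t len) {
        if (full()) return false;
        _ring[_tail].prepare(addr, len);
        push();
        return true;
    }
    // Release descriptors marked done; returns how many were released.
    std::size_t reclaim() {
        std::size_t n = 0;
        while (!empty() && _ring[_head].done()) { pop(); ++n; }
        return n;
    }
};

class rx_queue : public queue<rx_desc, 8> {
public:
    // Hand an empty buffer to the hardware; returns false if the ring is full.
    bool refill(std::uint64_t addr) {
        if (full()) return false;
        _ring[_tail].prepare(addr);
        push();
        return true;
    }
    // Length of the oldest completed frame, or -1 if none is ready.
    int receive() {
        if (empty() || !_ring[_head].done()) return -1;
        int len = _ring[_head].raw().length;
        pop();
        return len;
    }
};
```

In this sketch nothing touches real hardware: a driver would additionally place the ring in DMA-capable memory and notify the NIC through its registers after each enqueue or refill. The point is only the shape of the design, where the layout structs fix the memory format, desc adds the shared behavior once, and the queue template is reused for both directions.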