Sunday, June 28, 2015

GSoC: Some notes of hacking on drm drivers

At the very beginning, I converted DRM_BOCHS in a single patch. It did too many things at once, and I lost my way in kernel land. Daniel suggested that I split the patch into smaller steps, which really helped a lot.

Then, while testing each small step, I got stuck in register_framebuffer(): the machine locked up while holding console_lock. Daniel explained some of the background:

All kernel log output to the console is protected by one big lock, the console_lock. The same lock is also needed to register a new console and to register a legacy fbdev device, which drm does so that the fb console can be used.

The problem is that the first thing fbcon does is set up a mode while we still hold console_lock, which means it calls all the bochs modeset functions. If the driver crashes in there, or anything else goes wrong, not a single line will reach dmesg, since console_lock is still held.

Messages logged between console_lock() and console_unlock() are not displayed on the console immediately; they show up only after the lock has been released.

At the end of register_framebuffer() the console_lock is released, and everything that was logged with DRM_DEBUG or printk appears in the logs in one go, so as long as there is no bug it looks as if printing works. But if the driver crashes, nothing logged after the console_lock() call will ever show up anywhere.

Solution:
We need to get rid of fbcon for debugging.
This means the screen will be dark until X comes up, and there won't be a kernel console any more.

This needs some preparation:
1. make sure X comes up automatically
2. make sure we can log into the VM using ssh (since no other console will work)
3. change the kernel config and set CONFIG_FRAMEBUFFER_CONSOLE=n (make menuconfig)
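To double-check step 3 after rebuilding, the kernel config can be inspected from a shell. This is only a minimal sketch: the helper name fbcon_state is made up for illustration, and the path /boot/config-$(uname -r) is an assumption that holds on many distros but not everywhere.

```shell
#!/bin/sh
# Print the state of CONFIG_FRAMEBUFFER_CONSOLE in a kernel config file:
# "y" (built in), "m" (module) or "n" (disabled or not mentioned at all).
# fbcon_state is a hypothetical helper, not an existing tool.
fbcon_state() {
    config_file=$1
    # Enabled options appear as CONFIG_FOO=y or CONFIG_FOO=m.
    value=$(sed -n 's/^CONFIG_FRAMEBUFFER_CONSOLE=\([ym]\)$/\1/p' "$config_file")
    # Disabled options appear as "# CONFIG_FOO is not set" or are absent.
    echo "${value:-n}"
}

# Assumed location of the running kernel's config (a distro convention).
config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
    echo "fbcon: $(fbcon_state "$config")"
fi
```

For a kernel built from source, the same function can be pointed at the .config file in the build tree instead.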

There is an even easier way to do this:
Turn off fbdev support in bochs; setting bochs.enable_fbdev = false will work.

I tested it with a working distro kernel: with enable_fbdev = false the drm log messages are no longer printed like before, and X can finally start up.

Now that the background and the solution are clear, we should be able to log into the machine with ssh and look at dmesg to figure out what's wrong.
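Once ssh works, the interesting part of dmesg can be picked out with a small filter. The sketch below is just one way to do it: the function name filter_drm_log and the keyword list are my own choices, and GUEST is a placeholder for the VM's address.

```shell
#!/bin/sh
# Keep only kernel log lines that mention drm, bochs or fbcon
# (case-insensitive). filter_drm_log is a hypothetical helper name.
filter_drm_log() {
    grep -i -E 'drm|bochs|fbcon'
}

# Hypothetical usage from the host; GUEST is a placeholder:
#   ssh root@"$GUEST" dmesg | filter_drm_log
```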

The native Linux host and the QEMU VM are not in the same network segment: ssh from the VM to the host works, but not the other way around, so I had to configure QEMU networking.

Qemu networking:
There are two parts to networking within QEMU:

  • the virtual network device that is provided to the guest (e.g. a PCI network card).
  • the network backend that interacts with the emulated NIC (e.g. puts packets onto the host's network).
There is a range of options for each part. By default QEMU will create a SLiRP user network backend and an appropriate virtual network device for the guest (e.g. an E1000 PCI card for most x86 PC guests).

Note: if you are using the (default) SLiRP user networking, ping (ICMP) will not work, though TCP and UDP will.

In most cases, if you don't have any specific networking requirements other than being able to access a web page from your guest, user networking (SLiRP) is a good choice. However, if you are looking to run any kind of network service or have your guest participate in a network in any meaningful way, tap is usually the best choice.

Tap
The tap networking backend makes use of a tap networking device in the host. It offers very good performance and can be configured to create virtually any type of network topology. Unfortunately, it requires configuration of that network topology in the host, which tends to be different depending on the operating system you are using. Generally speaking, it also requires that you have root privileges.
  -netdev tap,id=mynet0

Below is what I did to configure the network.

We need two tools to configure the network.
  1. #apt-get install bridge-utils
  2. #apt-get install uml-utilities
Then configure the bridge:
  1. #ifconfig eth0 down
  2. #brctl addbr br0
  3. #brctl addif br0 eth0
  4. #brctl stp br0 off
  5. #brctl setfd br0 1
  6. #brctl sethello br0 1
  7. #ifconfig br0 0.0.0.0 promisc up
  8. #ifconfig eth0 0.0.0.0 promisc up
  9. #dhclient br0
Then configure TAP device:
  1. #tunctl -t tap0 -u root
  2. #brctl addif br0 tap0
  3. #ifconfig tap0 0.0.0.0 promisc up
Now add some options when starting qemu:
qemu-system-x86_64 -m 1024 -smp 2 -hda ./linux-bochs.img -no-acpi -vga std -k en-us -serial stdio -net nic -net tap,ifname=tap0,script=no,downscript=no

It works: now I can use ping and ssh to communicate from host to guest.

Note:
Make sure that eth0 has the IP 0.0.0.0; if eth0 has the same IP as br0, it will not work.
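The bridge and tap steps above can be collected into one script. This is a sketch and not a polished tool: the run wrapper and the DRY_RUN switch are my own additions so that the commands can be reviewed before anything is changed, and the interface names are the ones used in this post. With DRY_RUN=1 (the default here) it only prints the commands; with DRY_RUN=0 it executes them and needs root.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 runs them as root.
DRY_RUN=${DRY_RUN:-1}
ETH=${ETH:-eth0}   # physical interface
BR=${BR:-br0}      # bridge name
TAP=${TAP:-tap0}   # tap device handed to qemu

# Hypothetical wrapper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_bridge() {
    run ifconfig "$ETH" down
    run brctl addbr "$BR"
    run brctl addif "$BR" "$ETH"
    run brctl stp "$BR" off
    run brctl setfd "$BR" 1
    run brctl sethello "$BR" 1
    run ifconfig "$BR" 0.0.0.0 promisc up
    # The physical interface keeps 0.0.0.0; only the bridge gets an address.
    run ifconfig "$ETH" 0.0.0.0 promisc up
    run dhclient "$BR"
}

setup_tap() {
    run tunctl -t "$TAP" -u root
    run brctl addif "$BR" "$TAP"
    run ifconfig "$TAP" 0.0.0.0 promisc up
}

setup_bridge
setup_tap
```

Keeping the two halves in separate functions makes it easy to add a second tap device later without touching the bridge.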

modetest, a basic kms test program
  ./modetest -h
  ./modetest -M bochs-drm
  ./modetest -p
  ./modetest -s 21@19:1024x768
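As far as I understand modetest's usage text, the argument to -s has the form connector_id@crtc_id:mode. The small sketch below only assembles that string; modeset_arg is a made-up helper, the IDs 21 and 19 are the ones from my VM and will differ on other machines, and combining -M with -s in one call is my assumption.

```shell
#!/bin/sh
# Build the argument for "modetest -s" from a connector id, a CRTC id
# and a mode name. modeset_arg is a hypothetical helper.
modeset_arg() {
    connector=$1
    crtc=$2
    mode=$3
    echo "${connector}@${crtc}:${mode}"
}

# Print (not run) the resulting command line for the IDs used above.
echo "./modetest -M bochs-drm -s $(modeset_arg 21 19 1024x768)"
```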

