severe Xorg performance penalty due to [u]vesafb creating wrong PAT entries

Bug #574733 reported by Thomas Schlichter
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

Hi,

When using uvesafb or vesafb, these drivers create uncached-minus PAT entries for the framebuffer memory because they use ioremap(). When the framebuffer memory overlaps the video RAM used by Xorg, the complete video RAM is mapped uncached-minus, which results in a severe performance penalty.

Here are the correct MTRR entries created by uvesafb:
schlicht@netbook:~$ cat /proc/mtrr
reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
reg01: base=0x06ff00000 ( 1791MB), size= 1MB, count=1: uncachable
reg02: base=0x070000000 ( 1792MB), size= 256MB, count=1: uncachable
reg03: base=0x0d0000000 ( 3328MB), size= 16MB, count=1: write-combining
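
To make the MTRR listing easier to compare against the PAT ranges, the following Python sketch turns /proc/mtrr-style lines into address ranges. It is only an illustration: the helper name `parse_mtrr` and the regular expression are assumptions of mine, and the pattern only handles sizes given in MB, as in the output above.

```python
import re

# Sample lines copied from the /proc/mtrr output in this report.
MTRR_TEXT = """\
reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
reg01: base=0x06ff00000 ( 1791MB), size= 1MB, count=1: uncachable
reg02: base=0x070000000 ( 1792MB), size= 256MB, count=1: uncachable
reg03: base=0x0d0000000 ( 3328MB), size= 16MB, count=1: write-combining
"""

# Hypothetical pattern: assumes every size is expressed in MB, as above.
LINE_RE = re.compile(
    r"reg\d+: base=(0x[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)MB, count=\d+: (\S+)"
)


def parse_mtrr(text):
    """Return a list of (start, end, memtype) tuples, end exclusive."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            start = int(match.group(1), 16)
            size = int(match.group(2)) * 1024 * 1024
            entries.append((start, start + size, match.group(3)))
    return entries


entries = parse_mtrr(MTRR_TEXT)
for start, end, memtype in entries:
    print(f"{start:#x}-{end:#x} {memtype}")
```

With the sample above, reg03 becomes the write-combining range starting at 0xd0000000 and covering 16 MB.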

And here are the problematic PAT entries:
schlicht@netbook:~$ sudo cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
write-back @ 0x0-0x1000
uncached-minus @ 0x6fedd000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0xd0000000-0xe0000000 <-- created by xserver-xorg
uncached-minus @ 0xd0000000-0xd1194000 <-- created by uvesafb
uncached-minus @ 0xf4000000-0xf4009000
uncached-minus @ 0xf4200000-0xf4400000
uncached-minus @ 0xf5000000-0xf5010000
uncached-minus @ 0xf5100000-0xf5104000
uncached-minus @ 0xf5400000-0xf5404000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xfed00000-0xfed01000
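
The overlap between the two 0xd0000000 entries can also be found mechanically. The sketch below parses a few lines in the pat_memtype_list format shown above and reports overlapping ranges. The function names are hypothetical, and the sample is a hand-picked subset of the listing, not the full file.

```python
import re

# A subset of the entries listed above, including the two annotated ones.
PAT_LINES = [
    "uncached-minus @ 0xd0000000-0xe0000000 <-- created by xserver-xorg",
    "uncached-minus @ 0xd0000000-0xd1194000 <-- created by uvesafb",
    "uncached-minus @ 0xf4000000-0xf4009000",
]

# Assumed line format: "<type> @ 0x<start>-0x<end>", trailing notes ignored.
ENTRY_RE = re.compile(r"(\S+) @ (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)")


def parse_entry(line):
    """Parse one line into (memtype, start, end), end exclusive."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return match.group(1), int(match.group(2), 16), int(match.group(3), 16)


def overlaps(a, b):
    """True if the half-open ranges of entries a and b intersect."""
    return a[1] < b[2] and b[1] < a[2]


def overlapping_pairs(lines):
    """Return every pair of parsed entries whose ranges intersect."""
    entries = [parse_entry(line) for line in lines]
    return [
        (a, b)
        for i, a in enumerate(entries)
        for b in entries[i + 1:]
        if overlaps(a, b)
    ]


pairs = overlapping_pairs(PAT_LINES)
for a, b in pairs:
    print(f"{a[0]} {a[1]:#x}-{a[2]:#x} overlaps {b[0]} {b[1]:#x}-{b[2]:#x}")
```

Only the Xorg and uvesafb entries overlap in this subset, which is the pair that matters for this bug.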

Therefore I created the attached patch for uvesafb which uses ioremap_wc() to create the correct PAT entries, as shown below:
schlicht@netbook:~$ sudo cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
write-back @ 0x0-0x1000
uncached-minus @ 0x6fedd000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee2000-0x6fee3000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
uncached-minus @ 0x6fee3000-0x6fee4000
write-combining @ 0xd0000000-0xe0000000
write-combining @ 0xd0000000-0xd1194000
uncached-minus @ 0xf4000000-0xf4009000
uncached-minus @ 0xf4200000-0xf4400000
uncached-minus @ 0xf5000000-0xf5010000
uncached-minus @ 0xf5100000-0xf5104000
uncached-minus @ 0xf5400000-0xf5404000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xf5404000-0xf5405000
uncached-minus @ 0xfed00000-0xfed01000

This results in a performance gain, objectively measurable with e.g. x11perf -comppixwin10 -comppixwin100 -comppixwin500:
1: x11perf_xaa_lucid.log
2: x11perf_xaa_lucid_patched.log

       1           2           Operation
--------   -----------------   -----------------
300000.0   296000.0 (  0.99)   Composite 10x10 from window to window
 38400.0    38500.0 (  1.00)   Composite 100x100 from window to window
  1760.0     1760.0 (  1.00)   Composite 500x500 from window to window
124000.0   202000.0 (  1.63)   Composite 10x10 from pixmap to window
  3340.0    24400.0 (  7.31)   Composite 100x100 from pixmap to window
   131.0     1150.0 (  8.78)   Composite 500x500 from pixmap to window
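
The ratios in parentheses are the patched rate divided by the unpatched rate. As a quick sanity check, this small Python sketch recomputes them from the two columns; the `speedup` helper is just an illustrative name, and the numbers are copied from the table above.

```python
# (unpatched rate, patched rate, operation) copied from the x11perf table.
RESULTS = [
    (300000.0, 296000.0, "Composite 10x10 from window to window"),
    (38400.0, 38500.0, "Composite 100x100 from window to window"),
    (1760.0, 1760.0, "Composite 500x500 from window to window"),
    (124000.0, 202000.0, "Composite 10x10 from pixmap to window"),
    (3340.0, 24400.0, "Composite 100x100 from pixmap to window"),
    (131.0, 1150.0, "Composite 500x500 from pixmap to window"),
]


def speedup(baseline, patched):
    """Ratio of patched to unpatched rate; values above 1 mean faster."""
    return patched / baseline


ratios = [round(speedup(base, new), 2) for base, new, _ in RESULTS]
for ratio, (_, _, operation) in zip(ratios, RESULTS):
    print(f"{ratio:5.2f}  {operation}")
```

The window-to-window cases stay at roughly 1.0, while the pixmap-to-window cases grow with the pixmap size.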

You can see the substantial performance gain when compositing larger pixmaps to a window.
Please consider applying/pushing the attached patch. I'll also attach a very similar patch for vesafb.

Kind regards,
  Thomas
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.22.1.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: VT82xx [HDA VIA VT82xx], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: VT82xx [HDA VIA VT82xx], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'VT82xx'/'HDA VIA VT82xx at 0xf5400000 irq 51'
   Mixer name : 'Realtek ALC272'
   Components : 'HDA:10ec0272,144dc04e,00100001'
   Controls : 14
   Simple ctrls : 8
DistroRelease: Ubuntu 10.10
LiveMediaBuild: Ubuntu 10.10 "Maverick Meerkat" - Alpha i386 (20100602.2)
MachineType: SAMSUNG ELECTRONICS CO., LTD. NC20/NB20
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/casper/vmlinuz noprompt cdrom-detect/try-usb=true file=/cdrom/preseed/hostname.seed boot=casper initrd=/casper/initrd.lz video=vesafb:mtrr=3 vga=792 quiet splash -- priority=low debian-installer/language=de console-setup/layoutcode?=de
ProcEnviron:
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.34-5.13-generic 2.6.34
Regression: No
RelatedPackageVersions: linux-firmware 1.35
Reproducible: Yes
Tags: maverick kconfig needs-upstream-testing
Uname: Linux 2.6.34-5-generic i686
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
dmi.bios.date: 11/25/2009
dmi.bios.vendor: Phoenix Technologies Ltd.
dmi.bios.version: 10MQ
dmi.board.name: NC20/NB20
dmi.board.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLtd.:bvr10MQ:bd11/25/2009:svnSAMSUNGELECTRONICSCO.,LTD.:pnNC20/NB20:pvr04MQ:rvnSAMSUNGELECTRONICSCO.,LTD.:rnNC20/NB20:rvr:cvnSAMSUNGELECTRONICSCO.,LTD.:ct10:cvrN/A:
dmi.product.name: NC20/NB20
dmi.product.version: 04MQ
dmi.sys.vendor: SAMSUNG ELECTRONICS CO., LTD.

Thomas Schlichter (bigboss77) wrote :

Similar patch for vesafb to use the correct ioremap_* calls and create suitable PAT entries.

tags: added: kj-triage
Jeremy Foshee (jeremyfoshee) wrote :

Hi Thomas,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 574733

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: New → Incomplete
Thomas Schlichter (bigboss77) wrote :

Hi Jeremy,

today I tested the maverick alpha 1 and observed the same behavior as with lucid:
1. No extra kernel parameters:
  - vga16fb is loaded and not [u]vesafb
  - thus the framebuffer terminal does not work on my NC20 (maybe I should open a bug for this, too?)
  - no PAT entries are created by vga16fb
  - correct (write combining) PAT entries for "openchrome"
2. "video=vesafb:mtrr=3 vga=792" kernel parameter
  - vesafb is loaded
  - now the framebuffer terminal works correctly
  - "uncached-minus" PAT entries are created by vesafb
  - incorrect (uncached-minus) PAT entries for "openchrome"

Unfortunately the maverick alpha 1 ISO has no uvesafb in the initramfs, so I couldn't test this.

I will upload the apport files collected while configuration 2 above is active.

Kind regards,
  Thomas

tags: added: apport-collected
description: updated
Thomas Schlichter (bigboss77) wrote : AlsaDevices.txt

apport information

Thomas Schlichter (bigboss77) wrote : AudioDevicesInUse.txt

apport information

Thomas Schlichter (bigboss77) wrote : BootDmesg.txt

apport information

Thomas Schlichter (bigboss77) wrote : Card0.Amixer.values.txt

apport information

Thomas Schlichter (bigboss77) wrote : Card0.Codecs.codec.0.txt

apport information

Thomas Schlichter (bigboss77) wrote : CurrentDmesg.txt

apport information

Thomas Schlichter (bigboss77) wrote : IwConfig.txt

apport information

Thomas Schlichter (bigboss77) wrote : Lspci.txt

apport information

Thomas Schlichter (bigboss77) wrote : Lsusb.txt

apport information

Thomas Schlichter (bigboss77) wrote : PciMultimedia.txt

apport information

Thomas Schlichter (bigboss77) wrote : ProcCpuinfo.txt

apport information

Thomas Schlichter (bigboss77) wrote : ProcInterrupts.txt

apport information

Thomas Schlichter (bigboss77) wrote : ProcModules.txt

apport information

Thomas Schlichter (bigboss77) wrote : RfKill.txt

apport information

Thomas Schlichter (bigboss77) wrote : UdevDb.txt

apport information

Thomas Schlichter (bigboss77) wrote : UdevLog.txt

apport information

Thomas Schlichter (bigboss77) wrote : WifiSyslog.txt

apport information

Thomas Schlichter (bigboss77) wrote :

Hi,

I tested this kernel linked from https://wiki.ubuntu.com/KernelMainlineBuilds with lucid:
   linux-image-2.6.35-999-generic_2.6.35-999.201006021335_i386.deb

It does not contain the vesafb module, only uvesafb. So I tested uvesafb, and it behaves just like the lucid kernel module: even though uvesafb adds write-combining MTRR entries, it adds uncached-minus PAT entries. Therefore the openchrome driver cannot add overlapping write-combining PAT entries and ends up with uncached-minus PAT entries as well. This results in the severe performance penalty exactly as described in my first post.
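
The effect described above can be illustrated with a toy model in Python. This is a deliberately simplified sketch of the behavior observed in this report, not the kernel's actual PAT implementation: the class `MemtypeRegistry`, its `reserve` method, and the rule "a conflicting overlapping request gets the existing type" are assumptions made for illustration only.

```python
class MemtypeRegistry:
    """Toy model: the first mapping of a range decides its memory type."""

    def __init__(self):
        # Each entry is (start, end, memtype, owner), end exclusive.
        self.entries = []

    def reserve(self, start, end, requested, owner):
        """Record a mapping and return the memory type actually granted."""
        granted = requested
        for e_start, e_end, e_type, _ in self.entries:
            ranges_overlap = start < e_end and e_start < end
            if ranges_overlap and e_type != requested:
                # Assumed rule: fall back to the type already in place.
                granted = e_type
                break
        self.entries.append((start, end, granted, owner))
        return granted


def simulate(framebuffer_type):
    """Map the framebuffer first, then let Xorg ask for write-combining."""
    registry = MemtypeRegistry()
    registry.reserve(0xd0000000, 0xd1194000, framebuffer_type, "uvesafb")
    return registry.reserve(0xd0000000, 0xe0000000, "write-combining", "xorg")


print("unpatched:", simulate("uncached-minus"))
print("patched:  ", simulate("write-combining"))
```

In this model, the unpatched case leaves Xorg with uncached-minus for the whole video RAM, while the patched case lets it keep write-combining, matching the two pat_memtype_list dumps in the bug description.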

So there is no difference between the lucid kernel and the mainline kernel. I'll remove the 'needs-upstream-testing' tag.

Best regards,
  Thomas

tags: removed: needs-kernel-logs needs-upstream-testing
tags: added: cherry-pick kernel-graphics kernel-needs-review
Changed in linux (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium
Steve Conklin (sconklin) wrote :

Thomas, have you sent your patch upstream?

Thanks,

Steve

tags: added: kernel-reviewed
removed: kernel-needs-review
tags: added: patch
Thomas Schlichter (bigboss77) wrote :

Hi Steve,

No, I have not sent my patch upstream yet. Do you want me to?

Kind regards,
  Thomas

Andy Whitcroft (apw) wrote :

@Thomas -- this patch still looks applicable to mainline v2.6.38-rc2, so yes, I think it is worth pushing it upstream.

Thomas Schlichter (bigboss77) wrote :

@Andy -- I already took the first step with slightly improved patches:

  http://marc.info/?m=129086507505163

Unfortunately the ioremap_{wc,cache,nocache} functions do not seem to be supported on all architectures where [u]vesafb builds. Paul Mundt noticed this, and I am now checking whether I can provide these functions for all architectures, or at least make sure that the fall-back functions are used correctly.

As I'm quite busy now with my daily work, this may have to wait a bit...

Thomas Schlichter (bigboss77) wrote :

By now the patch has been merged upstream.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=803a4e14a7dceaf01dbc0e02c0fdea2eba4c81b3

So I'd be happy if you could also take it into the ubuntu tree.

Brad Figg (brad-figg) wrote :

@Thomas -- This bug was filed against a pre-release of Maverick. This patch is in the Oneiric kernel. Is this still an issue for you?

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Thomas Schlichter (bigboss77) wrote :

I was able to verify that the problem is fixed in Oneiric. So for me this is not an issue anymore.
(but it's a pity that it wasn't fixed in Natty...)

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: Incomplete → Fix Released