On Thursday 19 September 2013 13:35:42 Timothy Pearson wrote:
> I have to ask: If the XRender system worked so well in Qt3, even over the
> network, why is the XRender system working so badly in Qt4 that Nokia now
> needs to fall back on a software rasterizer? What changed?
Your assumption is wrong. XRender is slow, software rendering is faster. This
is no surprise. Rendering on the CPU is performed in one go by one process.
With XRender the rendering is split across two processes (the application and
the X server), which by definition results in non-deterministic behavior. With
XRender you need round trips to the X server during rendering, e.g. to fetch
the geometry of the pixmap you are rendering to. Thus the number of context
switches is rather high. It gets better when using XCB, but neither Qt 3 nor
Qt 4 has been using XCB, so that is a rather irrelevant point to make.
Using X for rendering makes everything more difficult, such as timing the
frame so that you get flicker-free rendering. X itself doesn't have a concept
of double buffering, so this has to be done client side. Otherwise you run
into the situation that X performs a scan-out while you are still rendering to
the window, and thus you get tearing. The solution is to render to an
off-screen pixmap and then swap it. Now if you already render to an off-screen
pixmap, why would you want to use X if you have an alternative renderer
available? (Note: the link you provided shows that Qt 3 required applications
to carry their own double buffering code, while Qt 4 is double buffered by
default. This might explain the differences you notice if you compare a single
buffered application to a double buffered one.)
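To illustrate the double buffering idea in isolation, here is a minimal sketch in plain Python. It is not Qt or X code: the class name `DoubleBuffer` and its methods are made up for this example, and the "pixmaps" are just flat lists of pixel values. The point is only that drawing never touches the buffer that is being scanned out.

```python
class DoubleBuffer:
    """Two pixel buffers: one shown, one being drawn into (hypothetical helper)."""

    def __init__(self, width, height, fill=0):
        self.width = width
        self.height = height
        self._front = [fill] * (width * height)  # what the "screen" scans out
        self._back = [fill] * (width * height)   # what the renderer draws into

    def draw_pixel(self, x, y, value):
        # All drawing goes to the back buffer only.
        self._back[y * self.width + x] = value

    def scan_out(self):
        # The display only ever reads the front buffer.
        return list(self._front)

    def swap(self):
        # Present the finished frame, then copy it into the new back buffer
        # so the next frame starts from the current contents.
        self._front, self._back = self._back, self._front
        self._back[:] = self._front
```

A scan-out that happens between `draw_pixel` calls sees the previous complete frame, never a half-drawn one; the new frame only becomes visible at `swap()`.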
As KWin has an XRender compositing backend I am rather familiar with its
quirks and problems. XRender performance is extremely dependent on the
implementation inside the X drivers. This is completely outside the control of
Qt (or KWin), and we have seen that the performance can become very bad
depending on the driver. The CPU is way more reliable in that regard and can
be timed better. If it doesn't render the frame in 16 msec, you reduce the
complexity. This is not possible with X - one cannot predict the duration of
one rendered frame from the previous one. One round trip can completely screw
up your timing.
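The "reduce the complexity if the frame took too long" rule can be sketched as follows. This is only an illustration under my own assumptions: the function name, the integer detail levels and the rule for raising the level again are hypothetical and not taken from Qt or KWin.

```python
FRAME_BUDGET_MS = 16.0  # roughly one frame at 60 Hz, as in the text above


def next_detail_level(current_level, last_frame_ms, min_level=0, max_level=3):
    """Pick the detail level for the next frame from the last frame's duration."""
    if last_frame_ms > FRAME_BUDGET_MS:
        # Over budget: render less next time.
        return max(min_level, current_level - 1)
    if last_frame_ms < FRAME_BUDGET_MS / 2:
        # Plenty of headroom (assumed threshold): allow more detail again.
        return min(max_level, current_level + 1)
    return current_level
```

This only works when the duration of the previous frame says something about the next one, which is the property claimed for CPU rendering above.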
Last but not least: rendering on the CPU allows you to render in a thread,
which is not possible with the native graphics system. If you render in a
thread you can keep the application responsive even if the frame takes quite
some time.
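A rough sketch of that pattern, using only the Python standard library: a worker thread renders frames while the main loop keeps running. `render_frame`, `render_worker` and `run_demo` are hypothetical names for this example, and the "rasterizer" is a stand-in that fills a tiny pixel list. In CPython the interpreter lock limits real parallelism for CPU-bound work, so take this as a picture of the structure rather than of the speed.

```python
import queue
import threading


def render_frame(frame_no, width=4, height=4):
    # Stand-in for an expensive software rasterizer: fills a small pixel buffer.
    return [(frame_no + i) % 256 for i in range(width * height)]


def render_worker(requests, results):
    # Runs in the render thread: take frame requests until told to stop.
    while True:
        frame_no = requests.get()
        if frame_no is None:
            break
        results.put((frame_no, render_frame(frame_no)))


def run_demo(frames=3):
    requests, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=render_worker, args=(requests, results))
    worker.start()
    for n in range(frames):
        requests.put(n)
    requests.put(None)  # sentinel: no more frames

    handled_events = 0
    finished = []
    while len(finished) < frames:
        handled_events += 1  # stands in for processing input events
        try:
            finished.append(results.get(timeout=0.01))
        except queue.Empty:
            pass  # frame not ready yet; the loop stays responsive
    worker.join()
    return finished, handled_events
```

The main loop never blocks for longer than the short timeout, no matter how long a single frame takes to render.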
So I'm sorry to say: your assumptions are wrong.
> Even over a fast RDP connection (yes, RDP, not X11 native, though X11
> native shows the same effect) I can tell the difference between Qt3 and
> Qt4 apps just from their redraw speed alone. Essentially, *as far as I
> can tell*, if your remote desktop is not blasting entire screens over the
> network on each update (e.g. some old versions of VNC), you will see a
> definite performance difference between Qt3 and Qt4 apps. My best *guess*
> is that the graphics server does not know what changed when a Qt4 app
> updates its visible contents, therefore it sends the entire contents of
> the window over the remote desktop connection, even if 99% of the pixels
> are the same.
I assume that when you talk about Qt 4 you mean the Oxygen widget style, which
is extremely heavy compared to any style used in Qt 3? Please don't mix
things. The performance of Qt doesn't depend on Oxygen ;-)
> I probably got some of the Qt4 stuff above wrong as usual, but I don't
> really have the time or desire to keep up with the latest Qt4 information
> as I don't use Qt4 very often. :-) All I know is that there *is* a
> performance drop on many systems (unquantified, though this seems most
> severe on non-VNC remote desktops) that so far has eluded the teams at
> Nokia, the various KDE SC developers, the TDE developers, and various
> application developers such as Xilinx.
And that is something I doubt - are you really the one who sees that the
emperor is not wearing clothes, and are the hundreds of experienced Qt
developers wrong, while just you found the ultimate truth? I rather think that
you have an incorrect setup[1], combined with seeing what you want to see.
Yes, if one wants to see Qt 4 being slower, one will see that it is slower -
whenever I do an optimization I'm sure that it's faster. Our eyes are not good
enough to notice such differences - this needs benchmarks. We try to render at
60 frames per second, thus each frame should be there after 16.6 msec - I
cannot see such differences and neither can you. As you probably know I have
to work on an application where every frame counts and we try to do animations
at 60 Hz. I know what can be seen and what not. A missing frame, for example,
is something I do not spot.
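What I mean by "this needs benchmarks" is roughly the following kind of measurement. It is a generic sketch, not an existing Qt or KWin tool: `measure_frames` and `summarize` are hypothetical helpers, and `render` is whatever callable draws one frame in the setup under test.

```python
import time

FRAME_BUDGET_MS = 1000.0 / 60.0  # about 16.7 ms per frame at 60 Hz


def measure_frames(render, frames):
    """Return the duration of each render() call in milliseconds."""
    durations = []
    for _ in range(frames):
        start = time.perf_counter()
        render()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations_ms, budget_ms=FRAME_BUDGET_MS):
    # Count the frames that did not make the 60 Hz deadline.
    missed = sum(1 for d in durations_ms if d > budget_ms)
    return {
        "frames": len(durations_ms),
        "mean_ms": sum(durations_ms) / len(durations_ms),
        "worst_ms": max(durations_ms),
        "missed": missed,
    }
```

Running the same `render` callable under both setups and comparing the summaries gives numbers to argue about instead of impressions.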
Cheers
Martin
[1] One of the reasons why we introduced support information inside KWin was
that users claimed it to be slow, and by changing very few settings we were
able to get it fast. In most cases it was because the users had "optimized" by
changing away from the defaults or by using something they thought was
"faster".