optimised jpeg lib

optimised jpeg lib

Pierre Ossman-3
Hi everyone,

Based on the ideas in TurboVNC, I've looked at optimising TightVNC's
JPEG implementation. So I've taken the code here:

http://cetus.sakura.ne.jp/softlab/jpeg-x86simd/jpegsimd.html

and cleaned it up. The result can be found here:

http://git.infradead.org/users/drzeus/libjpeg.git

To get an idea of the performance boost, I've used TurboJPEG (to get
access to IPP for comparison) and encoded/decoded an 8192x8192
image. These are the results:

Orig. libjpeg encoding: 3.8 s
IPP encoding: 1.1 s
Optimised libjpeg encoding: 1.6 s

Orig. libjpeg decoding: 5.5 s
IPP decoding: 1.6 s
Optimised libjpeg decoding: 2.9 s (1.7 s with an optimised entropy
decoder that I haven't had time to clean up yet)
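
Read as speedups over the original libjpeg, these timings come out at
roughly 2.4x for encoding and 1.9x for decoding (3.2x with the optimised
entropy decoder), against about 3.5x and 3.4x for IPP. A minimal sketch
of that arithmetic follows; the struct and function names are made up
for illustration, and only the timings themselves come from the
measurements above.

```c
#include <stddef.h>

/* One measurement, paired with the original libjpeg time for the same task. */
struct timing {
    const char *label;
    double baseline_s;  /* original libjpeg, seconds */
    double measured_s;  /* implementation under comparison, seconds */
};

/* Timings copied from the results above (8192x8192 image). */
static const struct timing timings[] = {
    { "IPP encoding",                          3.8, 1.1 },
    { "Optimised libjpeg encoding",            3.8, 1.6 },
    { "IPP decoding",                          5.5, 1.6 },
    { "Optimised libjpeg decoding",            5.5, 2.9 },
    { "Optimised decoding + entropy decoder",  5.5, 1.7 },
};

static const size_t timing_count = sizeof timings / sizeof timings[0];

/* Speedup factor relative to the original libjpeg (higher is better). */
static double speedup(const struct timing *t)
{
    return t->baseline_s / t->measured_s;
}
```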

Any objections to merging this to trunk?

Rgds
--
Pierre Ossman            OpenSource-based Thin Client Technology
System Developer         Telephone: +46-13-21 46 00
Cendio AB                Web: http://www.cendio.com

------------------------------------------------------------------------------
Create and Deploy Rich Internet Apps outside the browser with Adobe(R)AIR(TM)
software. With Adobe AIR, Ajax developers can use existing skills and code to
build responsive, highly engaging applications that combine the power of local
resources and data with the reach of the web. Download the Adobe AIR SDK and
Ajax docs to start building applications today-http://p.sf.net/sfu/adobe-com
_______________________________________________
VNC-Tight-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vnc-tight-devel


Re: optimised jpeg lib

Adam Tkac
On Wed, Feb 04, 2009 at 11:34:57AM +0100, Pierre Ossman wrote:
> Hi everyone,

Hi Pierre,

First, the ideas in that code look fine to me and I think it will be
accepted in the future.

>
> Based on the ideas in TurboVNC, I've looked at optimised TightVNC's
> JPEG implementation. So I've taken the code here:
>
> http://cetus.sakura.ne.jp/softlab/jpeg-x86simd/jpegsimd.html
>
> and cleaned it up. The result can be found here:
>
> http://git.infradead.org/users/drzeus/libjpeg.git
>
> To get an idea of the performance boost, I've used TurboJPEG (to get
> access to IPP for comparison) and encoded/decoded an 8192x8192
> image. These are the results:
>
> Orig. libjpeg encoding: 3.8 s
> IPP encoding: 1.1 s
> Optimised libjpeg encoding: 1.6 s
>
> Orig. libjpeg decoding: 5.5 s
> IPP decoding: 1.6 s
> Optimised libjpeg decoding: 2.9 s (1.7 s with an optimised entropy
> decoder that I haven't had time to clean up yet)
>
> Any objections to merging this to trunk?
>

It's quite hard to verify that code because I don't have enough time
to read such a huge number of lines of asm. I do have some questions,
though.

I compiled vncviewer with your libjpeg. Then I ran a standard Xvnc
(i.e. without the improved libjpeg) and ran mplayer in the Xvnc session
to generate a large amount of transferred data. Unfortunately vncviewer
always gets a SIGSEGV and aborts.

The first problem is somewhere in the SSE* code. vncviewer always
segfaults at jdsamss2.asm, line 102. When I disabled the SSE code and
used only the MMX code, the viewer still often segfaulted, usually due
to a double free(). Are you able to reproduce those problems?

The next question is about the djpeg and cjpeg programs. Why have you
included them in the TightVNC jpeg? I don't see any reason for it.

Regards, Adam

--
Adam Tkac, Red Hat, Inc.


Re: optimised jpeg lib

Pierre Ossman-3
On Wed, 4 Feb 2009 16:06:32 +0100
Adam Tkac <[hidden email]> wrote:

>
> It's quite hard to verify that code because I don't have enough time
> to read such huge number of lines in asm. Although I have some
> questions.
>

Can't say that I've verified them myself. They produce correct output
so I've assumed they are okay enough. :)

> I compiled vncviewer with your libjpeg. Then I run standard Xvnc (=
> without improved libjpeg) and run mplayer in Xvnc session to get large
> amount of transferred data. Unfortunately vncviewer always gets
> sigsegv and was aborted.
>
> The first problem is somewhere in SSE* code. vncviewer always gets
> segfault on jdsamss2.asm, line 102. When I disabled SSE code and used
> only MMX code viewer still often gets segfaults - usually due double
> free(). Are you able to reproduce those problems?

Ouch. No, I haven't seen any segfaults with the current code, but I
haven't done any proper testing within TightVNC yet. Crashes in
SSE code might be caused by alignment (most SSE instructions _require_
16-byte aligned memory operands, unlike other x86 instructions, which
are merely more efficient with alignment).
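
To make the alignment point concrete: aligned SSE loads and stores such
as movaps and movdqa fault when the address is not a multiple of 16.
The following is only a sketch of how a caller could check or enforce
that before handing a buffer to SIMD code. The helper names are
hypothetical and this is not taken from the jpeg-x86simd sources, nor
is it known to be the cause of the crashes reported here.

```c
#include <stdint.h>

/* Alignment assumed for aligned SSE memory operands (movaps, movdqa, ...). */
#define SIMD_ALIGN ((uintptr_t)16)

/* Returns non-zero if p is safe to use with an aligned SSE load/store. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & (SIMD_ALIGN - 1)) == 0;
}

/*
 * Rounds p up to the next 16-byte boundary (returns p itself if it is
 * already aligned). The caller must have over-allocated the underlying
 * buffer by at least 15 bytes so the result still points inside it.
 */
static void *align_up16(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + (SIMD_ALIGN - 1)) & ~(SIMD_ALIGN - 1));
}
```

A debug build could assert is_aligned16() on every row pointer passed to
the SIMD routines, which turns a hard-to-read crash inside the asm into
an immediate failure at the call site.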

What processor do you have in your system? I've only tested on a P4 and
an Athlon 64 so far.

>
> Next question is about djpeg and cjpeg programs. Why have you included
> them into tightvnc jpeg? I don't see any reason.
>

They were just for testing. I needed some lightweight programs to
profile. That commit will not be merged. :)

Rgds
--
Pierre Ossman            OpenSource-based Thin Client Technology
System Developer         Telephone: +46-13-21 46 00
Cendio AB                Web: http://www.cendio.com


Re: optimised jpeg lib

Bob Friesenhahn
On Wed, 4 Feb 2009, Pierre Ossman wrote:
>
> Ouch. I haven't seen any segfaults with the current code no, but I
> haven't gotten to any proper testing within TightVNC yet. Crashes in
> SSE code might be because of alignment (SSE instructions _require_
> proper alignment, unlike all other x86 instructions who just are more
> efficient with alignment).

It took quite a few iterations before such code seemed to work
reliably in libpng. Shortly thereafter it was determined that there
was not enough performance benefit to libpng, so the code was removed.
It seems that the JPEG enhancements are worth this portability effort.

Bob
======================================
Bob Friesenhahn
[hidden email], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/



Re: optimised jpeg lib

Pierre Ossman-3
On Wed, 4 Feb 2009 16:50:00 +0100
Pierre Ossman <[hidden email]> wrote:

> On Wed, 4 Feb 2009 16:06:32 +0100
> Adam Tkac <[hidden email]> wrote:
>
> >
> > The first problem is somewhere in SSE* code. vncviewer always gets
> > segfault on jdsamss2.asm, line 102. When I disabled SSE code and used
> > only MMX code viewer still often gets segfaults - usually due double
> > free(). Are you able to reproduce those problems?
>
> Ouch. I haven't seen any segfaults with the current code no, but I
> haven't gotten to any proper testing within TightVNC yet. Crashes in
> SSE code might be because of alignment (SSE instructions _require_
> proper alignment, unlike all other x86 instructions who just are more
> efficient with alignment).
>
Ok, I'm seeing these here too. I'm investigating...

Rgds
--
Pierre Ossman            OpenSource-based Thin Client Technology
System Developer         Telephone: +46-13-21 46 00
Cendio AB                Web: http://www.cendio.com


Re: optimised jpeg lib

Pierre Ossman-3
On Thu, 5 Feb 2009 10:16:40 +0100
Pierre Ossman <[hidden email]> wrote:

> On Wed, 4 Feb 2009 16:50:00 +0100
> Pierre Ossman <[hidden email]> wrote:
>
> > On Wed, 4 Feb 2009 16:06:32 +0100
> > Adam Tkac <[hidden email]> wrote:
> >
> > >
> > > The first problem is somewhere in SSE* code. vncviewer always gets
> > > segfault on jdsamss2.asm, line 102. When I disabled SSE code and used
> > > only MMX code viewer still often gets segfaults - usually due double
> > > free(). Are you able to reproduce those problems?
> >
> > Ouch. I haven't seen any segfaults with the current code no, but I
> > haven't gotten to any proper testing within TightVNC yet. Crashes in
> > SSE code might be because of alignment (SSE instructions _require_
> > proper alignment, unlike all other x86 instructions who just are more
> > efficient with alignment).
> >
>
> Ok, I'm seeing these here too. I'm investigating...
>
Problems found and resolved. The server hosting my public repo is down
right now, but as soon as it is back I'll push the updated version.

Rgds
--
Pierre Ossman            OpenSource-based Thin Client Technology
System Developer         Telephone: +46-13-21 46 00
Cendio AB                Web: http://www.cendio.com


Re: optimised jpeg lib

Adam Tkac
On Fri, Feb 06, 2009 at 02:49:15PM +0100, Pierre Ossman wrote:

> On Thu, 5 Feb 2009 10:16:40 +0100
> Pierre Ossman <[hidden email]> wrote:
>
> >
> > Ok, I'm seeing these here too. I'm investigating...
> >
>
> Problems found and resolved. The server hosting my public repo is down
> right now though, but as soon as it is back I'll push the updated
> version.
>

The improved version seems fine to me; I don't see any crashes now.
However, have you measured how much CPU time is saved when you use the
MMX/SSE instructions? It would be nice to have some comparative tests.
There has to be a justification for such a large amount of asm code.

Regards, Adam

--
Adam Tkac, Red Hat, Inc.


Re: optimised jpeg lib

Pierre Ossman-3
On Thu, 12 Feb 2009 12:04:36 +0100
Adam Tkac <[hidden email]> wrote:

>
> Improved version seems fine for me, I don't see any crash now. However
> have you measured how much cpu time is saved when you use mmx/sse
> instructions? It would be nice to have some comparative tests.
> Arguments for such huge asm code have to exist.
>

Nothing more than the previously stated numbers. I'm hunting down some
other bottlenecks in the rfb jpeg code right now, but I can see if I
can produce some numbers after that.
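
For the kind of comparative test asked about here, one simple approach
is to time the same workload with and without the SIMD paths using the
standard C clock() function, which reports processor time rather than
wall-clock time. This is a sketch under assumptions: the function names
are invented, and dummy_workload merely stands in for a real
encode/decode call, which is not shown.

```c
#include <stddef.h>
#include <time.h>

typedef void (*workload_fn)(void *arg);

/*
 * Runs fn(arg) `iterations` times and returns the average CPU time per
 * run in seconds, or -1.0 if the arguments are invalid or the
 * processor time is unavailable.
 */
static double cpu_seconds_per_run(workload_fn fn, void *arg, int iterations)
{
    clock_t start, end;
    int i;

    if (fn == NULL || iterations <= 0)
        return -1.0;

    start = clock();
    if (start == (clock_t)-1)
        return -1.0;

    for (i = 0; i < iterations; i++)
        fn(arg);

    end = clock();
    if (end == (clock_t)-1)
        return -1.0;

    return (double)(end - start) / CLOCKS_PER_SEC / iterations;
}

/* Placeholder workload; a real test would compress or decompress an image. */
static void dummy_workload(void *arg)
{
    volatile unsigned long *sink = arg;
    unsigned long i;

    for (i = 0; i < 100000UL; i++)
        *sink += i;
}
```

Running the same measurement against a build with the asm disabled and
one with it enabled would give the per-call CPU saving directly.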

Rgds
--
Pierre Ossman            OpenSource-based Thin Client Technology
System Developer         Telephone: +46-13-21 46 00
Cendio AB                Web: http://www.cendio.com
