
compress html output with php?

Hi, I know about
ob_start( 'ob_gzhandler' );
but I'm looking for something that removes all line breaks and extra
whitespace in the HTML before sending it to the visitor's browser. Is
this possible?

Cheers,
Ciarán

Mar 23 '07 #1
Hi,

This comment has an example of stripping whitespace:

http://ru.php.net/manual/en/function.ob-start.php#71953

On Mar 22, 9:48 pm, "Ciaran" <cronok...@hotmail.com> wrote:
Hi, I know about
ob_start( 'ob_gzhandler' );
but I'm looking for something that removes all line breaks and extra
whitespace in the HTML before sending it to the visitor's browser. Is
this possible?

Cheers,
Ciarán

Mar 23 '07 #2
Hmm, yeah, looks good. Thanks a lot, Peter.

Does anyone know if it's worth it? I mean, is the time spent running
the function less than the time it would take to download the longer
HTML file? Any thoughts on this?

Cheers,
Ciarán

Mar 23 '07 #3
OK, I ran a few tests on my slowest page... Here are my results:

WITHOUT COMPRESSION FUNCTION:::::::::::
Page Size: 523.97 KB
Load Time: 0.9995 seconds
Load Time: 0.8 seconds
Load Time: 0.8095 seconds
Load Time: 0.7091 seconds
Load Time: 0.7223 seconds

WITH COMPRESSION FUNCTION:::::::::::
Page Size: 494.77 KB
Load Time: 0.8448 seconds
Load Time: 0.8307 seconds
Load Time: 0.8307 seconds
Load Time: 0.8444 seconds
Load Time: 0.9014 seconds

AVERAGE LOAD TIME WITHOUT COMPRESSION: 0.80808 seconds
AVERAGE LOAD TIME WITH COMPRESSION: 0.8504 seconds
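
(For reference, the server-side part of figures like these is often gathered
with a simple microtime() wrapper; the sketch below is illustrative only --
the variable names and the logging call are my own, not from the page being
tested.)

<?php
// Record the start time at the very top of the script.
$page_start = microtime(true);

ob_start('ob_gzhandler'); // or whichever output handler is being tested

// ... build the page as normal ...

ob_end_flush();

// Log how long the server spent putting the page together.
$elapsed = microtime(true) - $page_start;
error_log(sprintf('Page generated in %.4f seconds', $elapsed));
?>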

Hope that helps someone!
Cheers,
Ciarán

Mar 23 '07 #4
Dammit, I'm confused! I'm not sure how accurate this info is! When I
downloaded the uncompressed version it was 613 KB, while the compressed
version of the same page was a tiny 48 KB! Surely the download speed of
the page has to be considered?
Anyone?

Mar 23 '07 #5
Ciaran wrote:

Does anyone know if it's worth it? [compressing HTML] I mean, is the time
spent running the function less than the time it would take to download
the longer HTML file? Any thoughts on this?

This comes up over at alt.html all the time. I have seen arguments put
forward where compressing a 50K file to 30K is a good thing *but* that 50K
file links to 500K of images. It's the images that cause the trouble.

--
Richard.
Mar 23 '07 #6
Richard Formby wrote:
Does anyone know if it's worth it? [compressing HTML] I mean, is the time
spent running the function less than the time it would take to download
the longer HTML file? Any thoughts on this?
Not the way Ciaran's attempting to do it. Gzipping HTML as you send it
will lead to a dramatic reduction in file size. (The resultant file will
probably be half the size of the original, or even smaller.) What's more,
the compression is done in well-optimised C code, so it takes very little
time. You can enable it with particular settings in php.ini or
Apache, so it doesn't require any modification to your PHP code base,
making it very easy to toggle on or off as required.
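
(As a rough illustration of the kind of settings meant here -- the exact
directives depend on the PHP and Apache versions in use, so treat this as a
sketch rather than a drop-in configuration:)

; php.ini -- let PHP gzip all output transparently
zlib.output_compression = On
zlib.output_compression_level = 6

# Apache (assuming mod_deflate is loaded) -- compress HTML in transit instead
AddOutputFilterByType DEFLATE text/html text/plain text/css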

Stripping out redundant whitespace in a file leads to a small reduction in
file size. Depending on how much whitespace there is in the first place,
you might shave off 10% or so from the file size. The stripping is
typically done using PHP and regular expressions, which is slower than the
method above. It generally requires you to make some modifications to your
PHP code. What's more, it's error-prone. Whitespace is significant in some
places (e.g. within PRE, TEXTAREA and SCRIPT elements). Most whitespace
stripping scripts get this wrong in certain places -- getting it right
requires even more careful effort parsing the HTML, and slows the script
down even more.

Zipping HTML content in transit can save significant bandwidth on
mainly textual websites without using much extra CPU time.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
Mar 23 '07 #7
Hi again fellas, thanks for the replies on this. I'm using a
combination of ob_gzhandler and this PHP function to compress my
pages. Depending on the page, I'm getting a small increase in
server-side time and a huge reduction in the size of the output HTML
page (and bandwidth!). The problem is, as Toby mentioned, the
compression function messes up some things. One thing I've noticed is
that some of my JavaScript functions are breaking because of it, so
I'm only adding it on select pages. I love the result, so is there any
way to stop it breaking things, or is there a better way to get the
same effect?

Here's the function:

function compress($buffer){
    // Strip newlines/tabs immediately after '>' and immediately before '<',
    // then collapse any remaining run of whitespace to a single character.
    $search = array('/\>[^\S ]+/s', '/[^\S ]+\</s', '/(\s)+/s');
    $replace = array('>', '<', '\\1');
    $buffer = preg_replace($search, $replace, $buffer);
    return $buffer;
}
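
(A minimal sketch of how I'd expect the two to be combined -- the nesting
below is an assumption on my part rather than something spelled out in the
thread: the inner buffer strips whitespace and hands its output to the outer
buffer, which gzips it.)

<?php
// Outer buffer: gzip whatever the inner buffer produces (started first, flushed last).
ob_start('ob_gzhandler');

// Inner buffer: strip whitespace first, then pass the result outwards.
ob_start('compress');

// ... generate the page as normal ...
?>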

Cheers,
Ciarán

Mar 23 '07 #8
Ciaran wrote:
Dammit, I'm confused! I'm not sure how accurate this info is! When I
downloaded the uncompressed version it was 613 KB, while the compressed
version of the same page was a tiny 48 KB! Surely the download speed of
the page has to be considered?
Anyone?
Sorry Ciaran, but your metrics are meaningless unless you can tell us what
you were measuring (hardware at each end, intervening network hardware,
bandwidths, RTT, network latency, request latency...).

I find it very hard to believe that the gz handler would only reduce a 524 KB
HTML or text file to 494 KB. I think your methodology is flawed.

C.
Mar 23 '07 #9
Ciaran wrote:
The problem is, as Toby mentioned, the compression function messes up
some things.
I can share a little code with you I suppose... I happen to do exactly the
opposite of what you're describing -- add *more* whitespace to some HTML,
in order to pretty-print it. Obviously, this screws up when you get inside
a PRE or TEXTAREA element, so I made my function smart enough to know when
it's inside one of those.

http://svn.sourceforge.net/viewvc/de...15&view=markup

It's the indent_html() function you're looking for. Obviously, you'll need
to work at it a bit to get it to do what you want, but you should see that
it fairly reliably knows at each point whether or not it's within a "safe
tag".

That said, I'd still advise against your plan. Gzipping your files will be
far more effective, more reliable and easier.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
Mar 24 '07 #10
Sorry Ciaran, but you're metrics are meaningless unless you can tell us what
you were measuring (hardware at each end, intervening network hardware,
bandwidths, RTT, network latency, request latency...).
I don't see why that matters. All I'm measuring is the relative change in
speed; the absolute speed itself is not the issue.

I find it very hard to believe that the gz handler would only reduce a 524 KB
HTML or text file to 494 KB. I think your methodology is flawed.
Sorry, I don't think I made this clear - I'm already using the gz
handler. The stats I posted are from the 'homemade' compression
function I posted that removes whitespace.
Mar 24 '07 #11
Thanks for the info Toby - I'll check it out when I get a chance.
That said, I'd still advise against your plan. Gzipping your files will be
far more effective, more reliable and easier.
I was actually planning on doing both. I've always been using gzip
compression - I started this thread in the hope I could squeeze a bit
more compression in there.

THE BOTTOM LINE:::::::::::::::::::::::::::::
Using the (temperamental) compression function posted earlier I'm
getting a reduction of 4.7% in file size, but my server is 5% slower at
throwing the pages together. I guess that means using the function
will save a small amount of bandwidth at the expense of a tiny
increase in page access time. You can make up your own minds whether
that's worth it! ;)

Ciarán

Mar 24 '07 #12
