compress html output with php?

Hi, I know about
ob_start( 'ob_gzhandler' );
But I'm looking for something that removes all line breaks and extra
whitespace in the html before sending it to the visitor's browser. Is
this possible?

Cheers,
Ciarán

Mar 23 '07 #1
Hi,

This comment has an example of stripping whitespace:

http://ru.php.net/manual/en/function.ob-start.php#71953

On Mar 22, 9:48 pm, "Ciaran" <cronok...@hotmail.com> wrote:
> Hi, I know about
> ob_start( 'ob_gzhandler' );
> But I'm looking for something that removes all line breaks and extra
> whitespace in the html before sending it to the visitor's browser. Is
> this possible?
>
> Cheers,
> Ciarán

Mar 23 '07 #2
Hmm, yeah looks good. Thanks a lot Peter.

Does anyone know if it's worth it? I mean, is the time spent running
the function less than the time it would take to download the longer
html file? Any thoughts on this?
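
One way to put a number on the server-side half of that question is to time the stripping step and compare it with the bytes it saves. The following is only a minimal sketch: the sample page and the stand-in stripper are assumptions made up for illustration, and `hrtime()` needs PHP 7.3 or later (the arrow function needs 7.4).

```php
<?php
// Hypothetical sample page: 2,000 indented list items.
$html = "<ul>\n" . str_repeat("    <li>item</li>\n", 2000) . "</ul>\n";

// Stand-in stripper (an assumption): collapse every whitespace run to one space.
$strip = fn(string $s): string => preg_replace('/\s+/', ' ', $s);

$start = hrtime(true);                        // nanoseconds, monotonic clock
$small = $strip($html);
$elapsedMs = (hrtime(true) - $start) / 1e6;   // convert to milliseconds

$savedBytes = strlen($html) - strlen($small);
printf("stripping took %.3f ms and saved %d bytes\n", $elapsedMs, $savedBytes);
```

The time figure only covers the server; whether the saved bytes make up for it depends on the visitor's connection.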

Cheers,
Ciarán

Mar 23 '07 #3
OK, I ran a few tests on my slowest page... Here are my results:

WITHOUT COMPRESSION FUNCTION:::::::::::
Page Size: 523.97 KB
Load Time: 0.9995 seconds
Load Time: 0.8 seconds
Load Time: 0.8095 seconds
Load Time: 0.7091 seconds
Load Time: 0.7223 seconds

WITH COMPRESSION FUNCTION:::::::::::
Page Size: 494.77 KB
Load Time: 0.8448 seconds
Load Time: 0.8307 seconds
Load Time: 0.8307 seconds
Load Time: 0.8444 seconds
Load Time: 0.9014 seconds

AVERAGE LOAD TIME WITHOUT COMPRESSION: 0.80808 seconds
AVERAGE LOAD TIME WITH COMPRESSION: 0.8504 seconds
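
For anyone who wants to re-check the averages, here is a minimal sketch that uses only the numbers listed above. The helper names `mean()` and `percent_change()` are made up for illustration.

```php
<?php
// Load times (seconds) copied from the two runs listed above.
$without = [0.9995, 0.8, 0.8095, 0.7091, 0.7223];
$with    = [0.8448, 0.8307, 0.8307, 0.8444, 0.9014];

// Arithmetic mean of a list of numbers (hypothetical helper).
function mean(array $xs): float
{
    return array_sum($xs) / count($xs);
}

// Relative change from $from to $to, in percent (hypothetical helper).
function percent_change(float $from, float $to): float
{
    return ($to - $from) / $from * 100.0;
}

$avgWithout = mean($without);
$avgWith    = mean($with);

printf("average without: %.5f s\n", $avgWithout);
printf("average with:    %.5f s\n", $avgWith);
printf("time change:     %+.1f%%\n", percent_change($avgWithout, $avgWith));
printf("size change:     %+.1f%%\n", percent_change(523.97, 494.77));
```

Five runs per variant is a small sample, so the difference between the two averages should be read with some caution.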

Hope that helps someone!
Cheers,
Ciarán

Mar 23 '07 #4
Dammit I'm confused! I'm not sure how accurate this info is! When I
downloaded the uncompressed version it was 613 KB while the compressed
version of the same page was a tiny 48 KB! Surely the download speed of
the page has to be considered?
Anyone?
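
To get a feel for how much the download matters, a rough transfer-time estimate can be sketched as size divided by link speed. This is only an illustration: the 1 Mbit/s link is an assumed figure, `transfer_seconds()` is a made-up helper, and latency and protocol overhead are ignored.

```php
<?php
/**
 * Rough transfer time in seconds, ignoring latency and protocol overhead.
 * Assumes 1 KB = 1024 bytes; $bitsPerSecond is the link speed.
 */
function transfer_seconds(float $sizeKB, float $bitsPerSecond): float
{
    return ($sizeKB * 1024 * 8) / $bitsPerSecond;
}

$link = 1000000; // assumed 1 Mbit/s connection, purely for illustration

printf("613 KB page: %.2f s\n", transfer_seconds(613, $link));
printf(" 48 KB page: %.2f s\n", transfer_seconds(48, $link));
```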

Mar 23 '07 #5
Ciaran wrote:

> Does anyone know if it's worth it? [compressing HTML] I mean is the time
> spent running the function less than the time it would take to download
> the longer html file? Any thoughts on this?

This comes up over at alt.html all the time. I have seen arguments put
forward where compressing a 50K file to 30K is a good thing *but* that 50K
file links to 500K of images. It's the images that cause the trouble.

--
Richard.
Mar 23 '07 #6
Richard Formby wrote:
> Does anyone know if it's worth it? [compressing HTML] I mean is the time
> spent running the function less than the time it would take to download
> the longer html file? Any thoughts on this?

Not the way Ciaran's attempting to do it. Gzipping HTML as you send it
leads to a dramatic reduction in file size. (The resulting file will
probably be half the size of the original, or even smaller.) What's more,
the compression is done in well-optimised C code, so it takes very little
time. You can enable it using particular settings in php.ini or
Apache, so it doesn't require any modification to your PHP code base,
making it very easy to toggle on or off as required.
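
The size claim is easy to try out on a sample string. The sketch below assumes the zlib extension is loaded and uses a made-up, repetitive HTML table as input, so the ratio it prints is not representative of any particular site. On a live server the same compression would normally be switched on through configuration (for example the php.ini directive `zlib.output_compression`, or Apache's mod_deflate) rather than by calling `gzencode()` by hand.

```php
<?php
// Assumes the zlib extension is available (it provides gzencode/gzdecode).
// Build a repetitive, hypothetical HTML table as sample input.
$rows = '';
for ($i = 0; $i < 200; $i++) {
    $rows .= "    <tr>\n"
           . "        <td class=\"name\">Item $i</td>\n"
           . "        <td class=\"price\">$i.00</td>\n"
           . "    </tr>\n";
}
$html = "<html><body><table>\n$rows</table></body></html>";

// Compress at level 6, a common middle-of-the-road setting.
$gz = gzencode($html, 6);

printf(
    "original: %d bytes, gzipped: %d bytes (%.0f%% of original)\n",
    strlen($html),
    strlen($gz),
    strlen($gz) / strlen($html) * 100
);
```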

Stripping out redundant whitespace in a file leads to a small reduction in
file size. Depending on how much whitespace there is in the first place,
you might shave 10% or so off the file size. The stripping is
typically done using PHP and regular expressions, which is slower than the
method above. It generally requires you to make some modifications to your
PHP code. What's more, it's error-prone. Whitespace is significant in some
places (e.g. within PRE, TEXTAREA and SCRIPT elements). Most whitespace
stripping scripts get this wrong in certain places -- getting it right
requires even more careful effort parsing the HTML, and slows the script
down even more.
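
A small illustration of the failure mode described above, assuming a deliberately naive stripper (`naive_strip()` is a made-up name, not a function from any library):

```php
<?php
// Hypothetical naive stripper: collapse every run of whitespace to one space.
function naive_strip(string $html): string
{
    return preg_replace('/\s+/', ' ', $html);
}

// The PRE element relies on its line break and indentation.
$html = "<p>Hello   world</p>\n<pre>line 1\n    indented line 2</pre>";

echo naive_strip($html), "\n";
```

The paragraph survives unharmed, but the PRE block comes out on a single line with its indentation gone, which a browser will render differently from the original.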

Zipping HTML content in transit can save significant bandwidth on
mainly textual websites without using much extra CPU time.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
Mar 23 '07 #7
Hi again fellas, thanks for the replies on this. I'm using a
combination of ob_gzhandler and this PHP function to compress my
pages. Depending on the page, I'm getting a small increase in server-side
time and a huge reduction in the size of the HTML output
(and bandwidth!). The problem is, as Toby mentioned, the compression
function messes up some things. One thing I've noticed is that some of my
JavaScript functions break because of it, so I'm only adding it
on select pages. I love the result, so is there any way to stop it
breaking things, or is there a better way to get the same effect?

Here's the function:

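function compress($buffer){
    $search = array(
        '/\>[^\S ]+/s',  // whitespace other than spaces right after a tag
        '/[^\S ]+\</s',  // whitespace other than spaces right before a tag
        '/(\s)+/s'       // any run of whitespace
    );
    // Drop the first two; shrink each run to its last whitespace character.
    $replace = array('>', '<', '\\1');
    return preg_replace($search, $replace, $buffer);
}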
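
For reference, a function like this is usually hooked in as an output-buffer callback via `ob_start()`. The sketch below is an assumption about how that wiring might look, not a description of the actual site: it repeats the same patterns, and uses an extra outer buffer only so the result can be captured and inspected. On a real page the outer buffer would more likely be `ob_start('ob_gzhandler')`, so that the stripped output is gzipped afterwards.

```php
<?php
// Same patterns as the compress() function posted above.
function compress(string $buffer): string
{
    $search  = ['/\>[^\S ]+/s', '/[^\S ]+\</s', '/(\s)+/s'];
    $replace = ['>', '<', '\\1'];
    // Fall back to the untouched buffer if the regex engine reports an error.
    return preg_replace($search, $replace, $buffer) ?? $buffer;
}

ob_start();             // outer buffer, used here only to capture the result
ob_start('compress');   // inner buffer: compress() runs when it is flushed

echo "<ul>\n\t<li>one</li>\n\t<li>two</li>\n</ul>\n";

ob_end_flush();         // runs compress() and hands the result to the outer buffer
$result = ob_get_clean();

echo $result, "\n";
```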

Cheers,
Ciarán

Mar 23 '07 #8
Ciaran wrote:
> Dammit I'm confused! I'm not sure how accurate this info is! When I
> downloaded the uncompressed version it was 613 KB while the compressed
> version of the same page was a tiny 48 KB! Surely the download speed of
> the page has to be considered?
> Anyone?

Sorry Ciaran, but your metrics are meaningless unless you can tell us what
you were measuring (hardware at each end, intervening network hardware,
bandwidths, RTT, network latency, request latency...).

I find it very hard to believe that the gz handler would only reduce a 524 KB
HTML or text file to 494 KB. I think your methodology is flawed.

C.
Mar 23 '07 #9
Ciaran wrote:
> The problem is, as Toby mentioned, the compression function messes up
> some things.

I can share a little code with you I suppose... I happen to do exactly the
opposite of what you're describing -- add *more* whitespace to some HTML,
in order to pretty-print it. Obviously, this screws up when you get inside
a PRE or TEXTAREA element, so I made my function smart enough to know when
it's inside one of those.

http://svn.sourceforge.net/viewvc/de...15&view=markup

It's the indent_html() function you're looking for. Obviously, you'll need
to work at it a bit to get it to do what you want, but you should see that
it fairly reliably knows at each point whether or not it's within a "safe
tag".
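
The same "know when you are inside a protected element" idea can be sketched for whitespace stripping. This is not the indent_html() function mentioned above, just a minimal regex-based illustration with a made-up function name. It assumes reasonably well-formed markup and will be fooled by things like a literal `</pre>` inside a script string, so a real HTML parser would be the more robust choice.

```php
<?php
/**
 * Collapse whitespace runs to a single space everywhere except inside
 * PRE, TEXTAREA and SCRIPT elements. Hypothetical helper for illustration.
 */
function strip_whitespace_safely(string $html): string
{
    // Split the page so that each protected element becomes its own chunk.
    // With PREG_SPLIT_DELIM_CAPTURE and a single capture group, the
    // protected elements land at the odd indexes of the result.
    $parts = preg_split(
        '#(<pre\b.*?</pre\s*>|<textarea\b.*?</textarea\s*>|<script\b.*?</script\s*>)#is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return $html; // regex error: leave the page untouched
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Even index: ordinary markup, safe to collapse.
            $parts[$i] = preg_replace('/\s+/', ' ', $part);
        }
        // Odd index: protected element, passed through byte for byte.
    }

    return implode('', $parts);
}

$in = "<div>\n    <p>Hello   world</p>\n</div>\n"
    . "<pre>line 1\n    line 2</pre>\n<p>bye</p>";

echo strip_whitespace_safely($in), "\n";
```

The PRE block keeps its line break and indentation while the markup around it is collapsed. Other whitespace-sensitive cases (STYLE elements, CSS `white-space: pre`) are not handled here.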

That said, I'd still advise against your plan. Gzipping your files will be
far more effective, more reliable and easier.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
Mar 24 '07 #10
> Sorry Ciaran, but your metrics are meaningless unless you can tell us what
> you were measuring (hardware at each end, intervening network hardware,
> bandwidths, RTT, network latency, request latency...).

I don't see why that matters. All I'm measuring is the increase in
speed. The speed itself is not an issue.

> I find it very hard to believe that the gz handler would only reduce a 524 KB
> HTML or text file to 494 KB. I think your methodology is flawed.

Sorry, I don't think I made this clear - I'm already using the gz
handler. The stats I posted are for the 'homemade' compression
function I posted that removes whitespace.
Mar 24 '07 #11
Thanks for the info Toby - I'll check it out when I get a chance.

> That said, I'd still advise against your plan. Gzipping your files will be
> far more effective, more reliable and easier.

I was actually planning on doing both. I've always been using gzip
compression - I started this thread in the hope I could squeeze a bit
more compression in there.

THE BOTTOM LINE:::::::::::::::::::::::::::::
Using the (temperamental) compression function posted earlier I'm
getting a reduction of 4.7% in file size, but my server is 5% slower at
putting the pages together. I guess that means using the function
will save a small amount of bandwidth at the expense of a tiny
increase in page access time. You can make up your own minds whether
that's worth it! ;)

Ciarán

Mar 24 '07 #12
