Knowledgebase

Article number: 301431

Calculating the Size of an HTML E-mail for a Mailshot

There are discrepancies in the reported size of the MHT files used in mailouts. This article explains why they occur.

The size is measured by the EmailSizeCalculator web service:

http://bob.beta.sitekit.net/admin/ws/EmailSizeCalculator.asmx

The output is an XML document containing (a) the size of the resulting email in bytes and (b) the resulting HTML email, as an XML-encoded MIME message.
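As a rough illustration, the GET form of the request used in the tests listed below can be built as follows. This is a minimal sketch: the helper name `build_getsize_url` is hypothetical, and only the endpoint path and the `URL` parameter are taken from this article. The structure of the XML response is not documented here, so the sketch does not attempt to fetch or parse it.

```python
from urllib.parse import urlencode

# Endpoint as given in this article.
SERVICE = "http://bob.beta.sitekit.net/admin/ws/EmailSizeCalculator.asmx"


def build_getsize_url(page_url: str, service: str = SERVICE) -> str:
    """Return the GET URL that asks the service to measure `page_url`.

    Hypothetical helper: only the path and parameter name come from the article.
    """
    # urlencode percent-escapes the page URL so it travels as one parameter value.
    return f"{service}/GetSize?{urlencode({'URL': page_url})}"


print(build_getsize_url("http://bob.sitekit.net/"))
```

Percent-escaping the page URL is a precaution for pages whose own address contains `?` or `&`; the tests in this article passed the address unescaped.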

  1. Loaded http://bob.beta.sitekit.net/admin/ws/EmailSizeCalculator.asmx/GetSize?URL=http://bob.sitekit.net/ in IE - passed, returned a size of 92351.
  2. Loaded http://bob.beta.sitekit.net/admin/ws/EmailSizeCalculator.asmx?op=GetSize in IE and entered a URL of http://bob.sitekit.net/ - passed, returned a size of 92351.
  3. Used IE to save http://bob.beta.sitekit.net/ as a complete web page - passed. The size of all files is 97264 bytes which, with the overhead of a MIME email, works out at roughly the size given by this function.
  4. Repeated the above tests with http://www.hie.co.uk and http://northlinknew.beta.sitekit.net/. The sizes given by the function seem a lot larger than the sizes of the files - FAIL.
  5. Sent a mailing list email with the content of bob.sitekit.net. The file size of the email is 73087 - FAIL. I can't work out why the email is so much smaller than the predicted size. Is there some compression that this function isn't allowing for?

Over to you, Ian.

This seems a rather complex issue and I may have raised more questions than I have answered. It appears that Robert tested by comparing the size of the MHT, as measured by the EmailSizeCalculator web service, with the size of the web page as saved from Internet Explorer using the "Save As .." -> "Web Page Complete" option.

I have improved the code used by the EmailSizeCalculator so that it measures the length of the actual MHT instead of the length of the XML version of the MHT. I have also compared the Sitekit CMS MHT measurement with the size of the web page saved by IE using the "save as MHT" option, which is a more relevant comparison.

The results of the set of comparisons I did (see attached .xls) show that there is a difference in size between the Sitekit CMS MHT and the IE MHT, with the ratio ranging roughly between 85% and 110%. (There is no significant difference between LIVE Sitekit CMS pages and BETA Sitekit CMS pages.)
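To illustrate the MIME overhead mentioned in the tests above, the following sketch builds a small multipart/related message (the container format that MHT uses) and compares its length with the total size of the raw files. It uses Python's standard `email` package, not Chilkat.Mht or the EmailSizeCalculator code, and the HTML and image data are stand-ins invented for the example. Base64 encoding alone adds roughly a third to binary parts, so figures measured on disk and figures measured on the assembled message should not be expected to match. It does not explain why the received email was smaller than predicted.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(html: str, images: dict[str, bytes]) -> bytes:
    """Assemble a multipart/related message from HTML plus image data."""
    msg = MIMEMultipart("related")
    msg.attach(MIMEText(html, "html", "utf-8"))
    for location, data in images.items():
        # MIMEImage base64-encodes its payload by default.
        part = MIMEImage(data, _subtype="jpeg")
        part.add_header("Content-Location", location)
        msg.attach(part)
    return msg.as_bytes()


# Stand-in content: not a real page or a real JPEG.
html = "<html><body><img src='logo.jpg'></body></html>"
images = {"logo.jpg": bytes(range(256)) * 40}  # 10240 bytes

raw_size = len(html.encode("utf-8")) + sum(len(d) for d in images.values())
mht_size = len(build_mht(html, images))

print("raw files:", raw_size, "bytes")
print("MIME message:", mht_size, "bytes")
print("ratio:", round(mht_size / raw_size, 2))
```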

Upon further investigation I discovered the following:

  1. IE and Sitekit CMS generate quite different forms of MHT. Sitekit CMS uses the Chilkat.Mht class, which appears to use a completely different algorithm to create an MHT version of a web page.
  2. One major difference is that the IE MHT retains the link element referencing the CSS stylesheet, whilst the Chilkat MHT retrieves the full CSS stylesheet and replaces the link with the entire stylesheet code.
  3. Another difference is the format of the image attachments. In the IE MHT, image attachments have a "Content-Location" attribute which contains the full, absolute URL of the image in the original web page. In the Chilkat MHT, image attachments have only a "name=value" pair which contains a relative URL of the image. For example:
     - IE: Content-Location: http://www.calmac.co.uk/layout/calmac-logo-bg.jpg
     - Chilkat: name="ck0-calmac-logo-bg.jpg"
  4. I cannot work out why the Chilkat MHT email that is sent out by the broadcast application differs significantly in size from that calculated by the EmailSizeCalculator for http://www.sitekit.net/ but closely matches for http://www.northlinkferries.co.uk. Once the email has been received, Outlook exposes its size only in KB and offers no way to access the original email: it offers only to save it as HTML or in Outlook .msg format, neither of which resembles the original MHT format. (My Yahoo mail client exposes a "content-length" header of the received email, but this is so much smaller than either MHT version that I suspect it may measure only the body and exclude attachments.)
  5. I found that web pages sent as MHT by the Sitekit CMS broadcast application cannot be reconstituted correctly by the recipient's mail client, whether Outlook or Yahoo Mail. (It has also come to my attention that our clients are encountering the same issue.) Dave has explained to me that this is a known issue and that the current workaround is to use only web pages that have been specifically created for use as an email broadcast, which requires creating a layout template and stylesheet specifically for the page that is to be sent. I strongly suggest that this is not acceptable, as it means the HTML email broadcast functionality is not available to the vast majority of Sitekit CMS users and will appear broken to most of them.

In conclusion:

  - The variance in size between that reported by Sitekit CMS (via EmailSizeCalculator) and the IE MHT can be explained by the differences in MHT format, in particular the inclusion of the entire CSS stylesheet within the Chilkat MHT.
  - I propose that the HTML broadcast of web page functionality be modified to send the actual HTML of the web page (instead of the MHT version of it) in order to:
      - reduce the size of each email, as images and other objects (e.g. Flash) would be referenced from the original URL instead of being inserted into the email
      - ensure that the HTML email message is displayed correctly in the recipient's email client (instead of having each image and other object appear as a separate attachment)
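Sending the page's HTML rather than an MHT means that any relative references in the page must resolve against the original site once the message is opened in a mail client. The sketch below shows one way this could be done, assuming a simple regular-expression pass is good enough for the page in question. The function name `absolutise_links` and the `/css/site.css` path are hypothetical; the image URL is the one quoted in this article. It is not the Sitekit CMS implementation.

```python
import re
from urllib.parse import urljoin

# Matches src="..." or href="..." written with single or double quotes.
_ATTR = re.compile(r"""\b(src|href)\s*=\s*(["'])(.*?)\2""", re.IGNORECASE)


def absolutise_links(html: str, base_url: str) -> str:
    """Rewrite relative src/href values so they resolve against base_url."""

    def repl(match: re.Match) -> str:
        attr, quote, value = match.group(1), match.group(2), match.group(3)
        # urljoin leaves values that are already absolute unchanged.
        return f"{attr}={quote}{urljoin(base_url, value)}{quote}"

    return _ATTR.sub(repl, html)


page = (
    '<link rel="stylesheet" href="/css/site.css">'
    '<img src="layout/calmac-logo-bg.jpg">'
)
print(absolutise_links(page, "http://www.calmac.co.uk/"))
```

A regular expression will miss references inside stylesheets or scripts, so a production version would more likely use a proper HTML parser.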

Additional File ( MhtSizes.xls )
