Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

MS Word - Converting Email to PDF

I have expended a good deal of effort trying to convert emails to PDF.

I am using Delphi 10.4 although that is not necessarily relevant to the question.

I came up with a solution that extracts the body from the email in whatever format it uses (HTML, RTF or TXT). I use Indy for this, or Outlook if the email is in MSG format.

I then save the body to file and open it using MS Word via automation. Then it should be a simple matter of saving the Word document in PDF format.

However, MS Word doesn't seem to read HTML files very well.

From the numerous samples of emails that I have tried, I have come across several issues which were complex to solve.

Examples:

  • HTML tables expanding beyond the document's page width. I solved this by working out the page width, setting the offending table to a fixed width equal to the page width, and finally resizing its columns proportionately to the new width.
  • That worked well until I tried to process an email containing HTML tables with differing numbers of columns/cells per row, which causes a crash. I solved that by handling the exception and iterating through each table row by row, working with its cells rather than its columns.
  • Images within table cells often overflow the cell and the page width. I solved this by iterating through all InlineShapes, checking whether each one is within a table and, if so, setting its width to the cell width.
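
To illustrate the arithmetic behind the fixes listed above, here is a minimal sketch in Python. It is not the Delphi/Word automation code itself. The function names are hypothetical, widths are assumed to be plain numbers in points, and keeping the image's aspect ratio is an assumption (the description above only says the width is set to the cell width).

```python
def fit_columns_to_page(column_widths, page_width):
    """Scale column widths proportionally so their total fits page_width."""
    total = sum(column_widths)
    if total <= 0:
        raise ValueError("column widths must sum to a positive value")
    if total <= page_width:
        return list(column_widths)  # already fits, leave unchanged
    factor = page_width / total
    return [w * factor for w in column_widths]


def fit_rows_to_page(rows, page_width):
    """Row-by-row variant for tables whose rows have differing cell counts.

    Each row is a list of cell widths and is scaled independently, which
    mirrors working with cells per row instead of whole columns.
    """
    return [fit_columns_to_page(row, page_width) for row in rows]


def clamp_image_to_cell(width, height, cell_width):
    """Shrink an image to the cell width, keeping its aspect ratio (assumed)."""
    if width <= cell_width:
        return (width, height)
    factor = cell_width / width
    return (width * factor, height * factor)
```

For example, two 300-point columns on a 450-point page would each be scaled to 225 points, and a ragged table is handled one row at a time so a row with one cell and a row with two cells never need to share a column index.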

There have been other issues, but I now have something that seems to work pretty well on a fairly disparate bunch of emails.

However, I think it very likely that new issues will crop up from time to time, and since this procedure is designed to process batches of emails unsupervised, that is a concern.

So my question is, does anyone know of a better way of dealing with this? For example, is there some simple way of getting Word to "nicely" format the HTML on loading, so that it displays and saves to PDF in a readable fashion similar to how it looks when you open the same email in Outlook?

question from:https://stackoverflow.com/questions/65904761/converting-email-to-pdf


1 Reply


Have you tried using the WordEditor property of the Outlook Inspector object? This returns the Microsoft Word Document Object Model of the message and you can export directly to PDF from that.

Here is a basic example...

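Private Sub Demo()
    ' Word's WdExportFormat value for PDF output
    Const wdExportFormatPDF As Long = 17

    Dim MailItem As MailItem
    Dim FileName As String
    
    FileName = "C:\Users\Sam\Desktop\Email.pdf"
    
    ' Use the first item currently selected in the Outlook explorer window
    Set MailItem = ActiveExplorer.Selection.Item(1)
    
    With MailItem.GetInspector
        ' WordEditor exposes the message body as a Word Document object
        .WordEditor.ExportAsFixedFormat FileName, wdExportFormatPDF
        .Close 0    ' 0 = olSave
    End With
    
    MsgBox "Export complete"
End Sub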
