Bytes | Software Development & Data Engineering Community

Equations become images when converting a Word .docx file to HTML in C#

I have a .NET C# application that extracts data from a Word file and saves it in a database. To preserve the Word content as it is, I first convert the whole Word file to HTML and then go through the HTML paragraph by paragraph.
During the conversion from Word to HTML, the equations in the Word document are converted into images.
  // Requires: using System.IO; using Microsoft.Office.Interop.Word;
  Document doc = Globals.ThisAddIn.Application.ActiveDocument;
  doc.Select();  // selects the whole document (kept from the original; SaveAs does not need it)

  // Remember the original path so the document can be reopened at the end.
  string originalPath = doc.FullName;

  string tempDir = Path.GetTempPath();
  string wordHtmlPath = Path.Combine(tempDir, "temp.html");
  string tidyHtmlPath = Path.Combine(tempDir, "newtemp.html");
  string finalHtmlPath = Path.Combine(tempDir, "finaltemp.html");

  // Save as filtered HTML. This is the step where Word writes each equation as an image.
  doc.SaveEncoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUSASCII;
  File.Delete(wordHtmlPath);  // File.Delete does nothing if the file is missing
  doc.SaveAs(wordHtmlPath, WdSaveFormat.wdFormatFilteredHTML);
  doc.Close(WdSaveOptions.wdDoNotSaveChanges);

  // Round-trip the file through HtmlAgilityPack.
  var mangledHtml = new HtmlAgilityPack.HtmlDocument();
  mangledHtml.Load(wordHtmlPath);
  File.Delete(tidyHtmlPath);
  mangledHtml.Save(tidyHtmlPath);

  // Remove standalone CRLFs: a double CRLF becomes a single CRLF, a lone CRLF becomes a space.
  // "ackThbbtt " is a temporary placeholder that protects the double CRLFs during the second Replace.
  string badHtml = File.ReadAllText(tidyHtmlPath);
  badHtml = badHtml.Replace("\r\n\r\n", "ackThbbtt ");
  badHtml = badHtml.Replace("\r\n", " ");
  badHtml = badHtml.Replace("ackThbbtt ", "\r\n");
  // Replace the Unicode replacement character (U+FFFD) left behind by the encoding conversion.
  badHtml = badHtml.Replace('\uFFFD', ' ');
  File.WriteAllText(finalHtmlPath, badHtml);  // overwrites an existing file

  // Clean up the intermediate files; finaltemp.html holds the finished result.
  File.Delete(wordHtmlPath);
  File.Delete(tidyHtmlPath);

  // Reopen the original document, since SaveAs switched the open document to the HTML copy.
  Document originalDoc = Globals.ThisAddIn.Application.Documents.Open(originalPath);

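The line-break cleanup above relies on a placeholder string that must never occur in the document. As a hedged alternative, the same rule (a double CRLF becomes one CRLF, a lone CRLF becomes a space) can be sketched with a single regular expression. The class and method names below are hypothetical and not part of the original code.

```csharp
using System.Text.RegularExpressions;

public static class HtmlWhitespace
{
    // Hypothetical helper: collapses line breaks in one pass.
    // The alternation tries the double CRLF first, so pairs are matched
    // before single breaks, mirroring the order of the Replace calls.
    public static string CollapseLineBreaks(string html)
    {
        return Regex.Replace(
            html,
            @"\r\n\r\n|\r\n",
            m => m.Value.Length == 4 ? "\r\n" : " ");
    }
}
```

Because no placeholder is written into the text, this version cannot corrupt a document that happens to contain the placeholder string.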
Basically, I want to store each paragraph of the Word document separately in the database, together with all of its properties such as font size, font width, font name, and font style, so that my application can display it exactly as I wrote it in the Word document.
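For reference, here is a minimal sketch of reading per-paragraph font properties directly through the Word interop object model instead of parsing them out of HTML. It assumes a reference to Microsoft.Office.Interop.Word, as in the add-in code above. The ParagraphInfo and ParagraphReader names are hypothetical. When a paragraph mixes several fonts or sizes, Word reports an "undefined" value for the whole range, so a real implementation would probably need to walk smaller ranges.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical container for the values that would go into one database row.
public sealed class ParagraphInfo
{
    public string Text { get; set; }
    public string FontName { get; set; }
    public float FontSize { get; set; }
    public bool? Bold { get; set; }
    public bool? Italic { get; set; }
}

public static class ParagraphReader
{
    // Word exposes Bold and Italic as int: -1 means true, 0 means false.
    // Any other value (for example mixed formatting) is mapped to null here.
    public static bool? ToNullableBool(int wordValue)
    {
        if (wordValue == -1) return true;
        if (wordValue == 0) return false;
        return null;
    }

    // Reads the text and paragraph-level font settings of every paragraph.
    public static List<ParagraphInfo> Read(Word.Document doc)
    {
        var result = new List<ParagraphInfo>();
        foreach (Word.Paragraph paragraph in doc.Paragraphs)
        {
            Word.Range range = paragraph.Range;
            result.Add(new ParagraphInfo
            {
                Text = range.Text,
                FontName = range.Font.Name,
                FontSize = range.Font.Size,
                Bold = ToNullableBool(range.Font.Bold),
                Italic = ToNullableBool(range.Font.Italic)
            });
        }
        return result;
    }
}
```

In the add-in this could be called as ParagraphReader.Read(Globals.ThisAddIn.Application.ActiveDocument) before the document is saved as HTML.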

To reproduce the formatting, I need to convert the document to HTML and then separate the paragraphs so that I can store them in the database. But when a paragraph in my Word document contains equations, the conversion code above turns every equation into an image, and once it is an image I can't display the equation properly in my application.

So I tried to convert these equations to MathML instead, but I couldn't get that to work.
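One approach that may help, offered as a sketch rather than a confirmed fix: a .docx file is a ZIP archive, and its equations are stored as Office Math Markup (m:oMath elements) in word/document.xml. Microsoft Office ships a stylesheet named OMML2MML.XSL that converts this markup to MathML. The code below reads the equations from a saved .docx and runs each one through that stylesheet. The EquationExporter class and its method names are hypothetical, and the location of OMML2MML.XSL depends on the Office installation, so it is passed in by the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

public static class EquationExporter
{
    // Namespace of Office Math Markup (OMML) elements.
    private static readonly XNamespace M =
        "http://schemas.openxmlformats.org/officeDocument/2006/math";

    // Loads word/document.xml from a .docx file (which is a ZIP archive).
    public static XDocument LoadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found.");
            using (Stream stream = entry.Open())
            {
                return XDocument.Load(stream);
            }
        }
    }

    // Returns every m:oMath element in document order.
    public static List<XElement> FindEquations(XDocument documentXml)
    {
        return documentXml.Descendants(M + "oMath").ToList();
    }

    // Transforms one m:oMath element to a MathML string.
    // The transform must already have OMML2MML.XSL loaded.
    public static string ToMathMl(XElement oMath, XslCompiledTransform omml2mml)
    {
        var output = new StringBuilder();
        XmlWriterSettings settings = omml2mml.OutputSettings.Clone();
        settings.OmitXmlDeclaration = true;
        using (XmlReader reader = oMath.CreateReader())
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            omml2mml.Transform(reader, writer);
        }
        return output.ToString();
    }
}
```

A caller would create an XslCompiledTransform, call Load with the path to OMML2MML.XSL (on many installations it sits in the Office program folder, but that path is an assumption to verify), and then pass each element from FindEquations(LoadDocumentXml(path)) to ToMathMl. The resulting MathML strings still have to be placed into the stored HTML in place of the equation images, which this sketch does not attempt.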
2 Weeks Ago #1