Bytes | Software Development & Data Engineering Community

Can't get equations in HTML when converting a Word .docx file to HTML in C#

I have a .NET C# application that extracts data from a Word file and saves it in a database. To preserve all of the Word content as it is, I first convert the whole Word file to HTML and then go through the HTML paragraph by paragraph.
When the Word file is converted to HTML, the equations in the document are converted into images.
    // Assumes: using System.IO; using Microsoft.Office.Interop.Word;
    Document doc = Globals.ThisAddIn.Application.ActiveDocument;

    string result = Path.GetTempPath();
    string tempHtml = Path.Combine(result, "temp.html");
    string newTempHtml = Path.Combine(result, "newtemp.html");
    string finalHtml = Path.Combine(result, "finaltemp.html");

    // Remember the original path so the document can be reopened afterwards.
    string tmpFileName = doc.FullName;

    // Save as filtered HTML. This is the step where Word renders equations as images.
    doc.SaveEncoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUSASCII;
    File.Delete(tempHtml); // File.Delete does not throw if the file is missing
    doc.SaveAs(tempHtml, WdSaveFormat.wdFormatFilteredHTML);
    doc.Close(WdSaveOptions.wdDoNotSaveChanges);

    // Load the exported file and re-save it through HtmlAgilityPack.
    HtmlAgilityPack.HtmlDocument mangledHTML = new HtmlAgilityPack.HtmlDocument();
    mangledHTML.Load(tempHtml);
    File.Delete(newTempHtml);
    mangledHTML.Save(newTempHtml);

    // Remove standalone CRLFs: keep blank-line breaks, join single line breaks.
    string badHTML = File.ReadAllText(newTempHtml);
    badHTML = badHTML.Replace("\r\n\r\n", "ackThbbtt ");
    badHTML = badHTML.Replace("\r\n", " ");
    badHTML = badHTML.Replace("ackThbbtt ", "\r\n");
    badHTML = badHTML.Replace('�', ' ');
    File.WriteAllText(finalHtml, badHTML); // overwrites any existing file

    // Clean up the intermediate files; finaltemp.html holds the result.
    File.Delete(tempHtml);
    File.Delete(newTempHtml);

    // Reopen the original document, which was closed after the HTML export.
    Document orignalDoc = Globals.ThisAddIn.Application.Documents.Open(tmpFileName);

Basically, I want to store each paragraph of the Word document separately in the database, along with all of its properties such as font size, font width, font name, and font style, so that my application can display it exactly as I wrote it in the Word document.
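
For the per-paragraph properties, one possible sketch reads them straight from the Word object model instead of parsing them back out of HTML. It assumes the add-in already references Microsoft.Office.Interop.Word. `ParagraphInfo`, `ParagraphReader`, and their members are hypothetical names made up for this illustration, and only font name, size, bold, and italic are covered.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical container for one paragraph's text and basic font formatting.
public class ParagraphInfo
{
    public string Text;
    public string FontName;  // may be empty when the paragraph mixes fonts
    public float FontSize;   // 9999999 (wdUndefined) when sizes are mixed
    public bool? Bold;       // null when only part of the paragraph is bold
    public bool? Italic;     // null when only part of the paragraph is italic
}

public static class ParagraphReader
{
    // Word reports formatting flags as ints: -1 = on, 0 = off,
    // 9999999 (WdConstants.wdUndefined) = mixed within the range.
    public static bool? ToTriState(int wordFlag)
    {
        if (wordFlag == (int)Word.WdConstants.wdUndefined) return null;
        return wordFlag != 0;
    }

    // Collects text and font properties for every paragraph in the document.
    public static List<ParagraphInfo> Read(Word.Document doc)
    {
        var result = new List<ParagraphInfo>();
        foreach (Word.Paragraph p in doc.Paragraphs)
        {
            Word.Font font = p.Range.Font;
            result.Add(new ParagraphInfo
            {
                Text = p.Range.Text,
                FontName = font.Name,
                FontSize = font.Size,
                Bold = ToTriState(font.Bold),
                Italic = ToTriState(font.Italic)
            });
        }
        return result;
    }
}
```

Calling `ParagraphReader.Read(Globals.ThisAddIn.Application.ActiveDocument)` would return one entry per paragraph. A paragraph with mixed formatting would need to be inspected at a finer level than the whole range, which this sketch does not attempt.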

To reproduce the document faithfully, I convert it to HTML and then split it into paragraphs so that I can store each one in the database. However, when a paragraph in the Word document contains equations, the same conversion code, which begins like this:

    Globals.ThisAddIn.Application.ActiveDocument.Select();
    Microsoft.Office.Interop.Word.Document doc = Globals.ThisAddIn.Application.ActiveDocument;

    string result = Path.GetTempPath();

    string tmpFileName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
    doc.SaveEncoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUSASCII;

turns every equation in the document into an image, and once an equation is an image I can't display it properly in my application.

So I tried to convert these equations to MathML, but I couldn't get it to work.
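
One route to MathML, sketched here under stated assumptions, bypasses the HTML export for equations and reads them from the .docx package directly. A .docx file is a ZIP archive whose word/document.xml stores each equation as an `m:oMath` element (Office Math Markup Language, OMML). Office installations ship a stylesheet named OMML2MML.XSL that transforms OMML into MathML. Its folder depends on the Office version, so the `xslPath` parameter below is an assumption you would have to fill in. `EquationExtractor` and its method names are hypothetical. On .NET Framework, `ZipFile` also needs a reference to System.IO.Compression.FileSystem.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

public static class EquationExtractor
{
    // Namespace of Office Math Markup Language (OMML) elements.
    private static readonly XNamespace M =
        "http://schemas.openxmlformats.org/officeDocument/2006/math";

    // Reads word/document.xml out of a .docx package.
    public static string ReadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found.");
            using (var reader = new StreamReader(entry.Open()))
                return reader.ReadToEnd();
        }
    }

    // Returns every m:oMath element in the markup as a standalone XML string.
    public static List<string> ExtractOmml(string documentXml)
    {
        XDocument xml = XDocument.Parse(documentXml);
        return xml.Descendants(M + "oMath")
                  .Select(e => e.ToString(SaveOptions.DisableFormatting))
                  .ToList();
    }

    // Converts one OMML fragment to MathML using the OMML2MML.XSL stylesheet.
    // xslPath is an assumption: point it at the copy in your Office install.
    public static string OmmlToMathMl(string omml, string xslPath)
    {
        var transform = new XslCompiledTransform();
        transform.Load(xslPath);

        XmlWriterSettings settings = transform.OutputSettings.Clone();
        settings.ConformanceLevel = ConformanceLevel.Fragment;
        settings.OmitXmlDeclaration = true;

        using (XmlReader input = XmlReader.Create(new StringReader(omml)))
        using (var output = new StringWriter())
        {
            using (XmlWriter writer = XmlWriter.Create(output, settings))
                transform.Transform(input, writer);
            return output.ToString();
        }
    }
}
```

With these helpers, the equations could be stored as MathML alongside the paragraph HTML, for example by passing each string from `ExtractOmml(ReadDocumentXml(path))` through `OmmlToMathMl`. Matching each equation back to its position in the exported HTML is left open here.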
1 Week Ago #1