  1. #1
    New Lounger

    Macro to convert Word to XML

    Hi,

    I've only recently started using Word a lot, so I'm unsure how to do this. I have a document like the one below in Word.

    This is heading

    This is sub heading 1

    This is sub heading 2

    This is para

    Here each line has a different format.

    I want to map each format to an XML tag. The output I need looks like this:

    <root>
    <heading>This is heading</heading>
    <subheading level='1'>This is sub heading 1</subheading>
    <subheading level='2'>This is sub heading 2</subheading>
    <para>This is para</para>
    </root>

    One of my friends told me that a macro can be written to do this based on each line's format, but I'm not sure how to write it. Can you please help me with it?

    Thanks,
    Sunny

  2. #2
    Silver Lounger Charles Kenyon
    Why do you want to do this? If you are saving in the .docx format (Word 2007 or later), the document is already stored as XML: a .docx file is a zip package, and the body text lives in its word/document.xml part.

    Here is an excerpt from an actual document formatted using the three heading styles Heading 1, Heading 2, and Heading 3:
    Code:
    <w:p>
    <w:pPr>
    <w:pStyle w:val="Heading1"/>
    </w:pPr>
    <w:r>
    <w:t>
    This is heading
    </w:t>
    </w:r>
    </w:p>
    <w:p w:rsidR="00214475" w:rsidRDefault="00214475" w:rsidP="00214475">
    <w:pPr>
    <w:pStyle w:val="Heading2"/>
    </w:pPr>
    <w:r>
    <w:t>
    This is sub heading 1
    </w:t>
    </w:r>
    </w:p>
    <w:p w:rsidR="00214475" w:rsidRPr="00214475" w:rsidRDefault="00214475" w:rsidP="00214475">
    <w:pPr>
    <w:pStyle w:val="Heading3"/>
    </w:pPr>
    <w:r>
    <w:t>
    This is sub heading 2
    </w:t>
    </w:r>
    </w:p>
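    If you want to look at this XML from inside Word rather than unzipping the file, a minimal sketch along these lines should work. It assumes Word 2007 or later and an active document with at least one paragraph; the macro name is only an illustration.

```vba
Sub ShowFirstParagraphXml()
  ' Assumption: Word 2007 or later, where Range.WordOpenXML is available.
  Dim sXml As String
  ' Read the XML before creating the new document, because Documents.Add
  ' changes which document is active.
  sXml = ActiveDocument.Paragraphs(1).Range.WordOpenXML
  ' The string is a complete XML package, not just the one paragraph,
  ' so it is long. Put it in a new document where it can be read.
  Documents.Add.Range.Text = sXml
End Sub
```

    Search the result for w:pStyle to find the part that matches the excerpt above.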
    Charles Kyle Kenyon
    Madison, Wisconsin

  3. The Following User Says Thank You to Charles Kenyon For This Useful Post:

    sunnykeerthi (2014-09-08)

  4. #3
    Super Moderator
    Charles is completely correct, but the XML that Word stores the document in is so complicated that almost no one would try to use the content that way.

    Simple code can quickly add tags based on the style applied to each paragraph. There are plenty of ways this could fall well short of what you need, but the basic core of the code is as follows.
    Code:
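    Sub TagParas()
      ' Wraps every paragraph of the active document in a tag named after its style.
      ' Note: this edits the document in place, so run it on a copy.
      Dim aPar As Paragraph, sTag As String
      Dim aRng As Range
      For Each aPar In ActiveDocument.Paragraphs
        Set aRng = aPar.Range
        sTag = Replace(aRng.Style, " ", "") 'no spaces allowed in xml tag names
        'exclude the paragraph mark so the closing tag stays on the same line
        aRng.MoveEnd Unit:=wdCharacter, Count:=-1
        aRng.InsertBefore "<" & sTag & ">"
        aRng.InsertAfter "</" & sTag & ">"
      Next aPar
    End Sub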
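    To get closer to the exact output asked for in the first post, the same idea can be extended with a lookup from style name to tag. The following is only a sketch under assumptions: it supposes the lines use the built-in English style names "Heading 1", "Heading 2" and "Heading 3", treats every other style as a para, and writes the result into a new document instead of changing the original. The names XmlEscape, ParaToXml and ExportParasAsXml are made up for this example.

```vba
' Hypothetical helper: escapes the characters that are special in XML text.
Function XmlEscape(ByVal s As String) As String
  ' Escape & first so the entities added afterwards are not escaped again
  s = Replace(s, "&", "&amp;")
  s = Replace(s, "<", "&lt;")
  s = Replace(s, ">", "&gt;")
  XmlEscape = s
End Function

' Hypothetical helper: builds one XML element from a style name and text.
' Assumption: the style names below match the ones used in your document.
Function ParaToXml(ByVal sStyle As String, ByVal sText As String) As String
  Dim sName As String, sAttr As String
  Select Case sStyle
    Case "Heading 1"
      sName = "heading"
    Case "Heading 2"
      sName = "subheading": sAttr = " level='1'"
    Case "Heading 3"
      sName = "subheading": sAttr = " level='2'"
    Case Else
      sName = "para"
  End Select
  ParaToXml = "<" & sName & sAttr & ">" & XmlEscape(sText) & "</" & sName & ">"
End Function

Sub ExportParasAsXml()
  Dim srcDoc As Document, aPar As Paragraph
  Dim sStyle As String, sText As String, sOut As String
  Set srcDoc = ActiveDocument
  sOut = "<root>" & vbCr
  For Each aPar In srcDoc.Paragraphs
    sStyle = aPar.Style
    sText = aPar.Range.Text
    ' Drop the trailing paragraph mark
    If Right$(sText, 1) = vbCr Then sText = Left$(sText, Len(sText) - 1)
    ' Skip empty paragraphs
    If Len(sText) > 0 Then sOut = sOut & ParaToXml(sStyle, sText) & vbCr
  Next aPar
  sOut = sOut & "</root>"
  ' Write the result to a new document; the source is left untouched
  Documents.Add.Range.Text = sOut
End Sub
```

    Things this sketch does not handle: text inside tables, character formatting within a paragraph, and non-English versions of Word where the built-in styles have different names. If your lines are formatted by hand rather than with styles, the Select Case would need to test font properties instead.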
    Andrew Lockton, Chrysalis Design, Melbourne Australia

  5. The Following User Says Thank You to Andrew Lockton For This Useful Post:

    sunnykeerthi (2014-09-08)
