  1. #1
    3 Star Lounger
    Join Date
    Jan 2007
    Location
    Massachusetts, USA
    Posts
    272
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Macro to convert word styles to XSD Schema tags? (MS Word 2003, SP2)

    Hello,

    I have a fixed number of Word styles in an MS Word document that I would like to map to the elements of an attached schema (e.g. test.xsd) in MS Word.

    Question 1: Is it possible to run a Find and Replace in MS Word so that, wherever a style is found, the styled text is wrapped in a corresponding element (e.g. paragraph text style to paragraph text element)?

    Question 2: If a Find and Replace can do this, can a macro be created to convert twenty styles into twenty corresponding elements?

    Question 3: Can the macro be smart enough to automatically wrap a style that is not listed in the schema (XSD) in a generic element (e.g. unknown style to generic element)?

    Question 4: Would the Macro also be able to wrap specific styles like:

    - Table Text Style to Table Text Element
    - Figure Title Style to Figure Title Element
    - Figure Cross Reference to Figure Cross Reference Element
    - Table Cross Reference to Table Cross Reference Element

    and so on...
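
    The lookup asked about in Questions 2 to 4 can be sketched outside Word. This is a minimal illustration in Python, not a Word macro: the style names, the element names, and the `Generic` catch-all are all hypothetical, and the real ones would have to come from the document and from test.xsd.

```python
from xml.sax.saxutils import escape

# Hypothetical style-to-element table; the real names would come from the
# Word document and from the attached schema.
STYLE_TO_ELEMENT = {
    "Paragraph Text": "ParagraphText",
    "Table Text": "TableText",
    "Figure Title": "FigureTitle",
    "Figure Cross Reference": "FigureXRef",
    "Table Cross Reference": "TableXRef",
}

GENERIC_ELEMENT = "Generic"  # assumed catch-all element for unlisted styles


def element_for_style(style_name):
    """Return the element mapped to a style, or the generic fallback."""
    return STYLE_TO_ELEMENT.get(style_name.strip(), GENERIC_ELEMENT)


def wrap(style_name, text):
    """Wrap paragraph text in the element that corresponds to its style."""
    tag = element_for_style(style_name)
    return f"<{tag}>{escape(text)}</{tag}>"
```

    For example, `wrap("Figure Title", "Pump assembly")` yields a `FigureTitle` element, while a style missing from the table ends up in `Generic`.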

    I am not sure if this has been tried before but it will save a lot of time if I can create a process (e.g. macro) to automate the wrapping of styles with corresponding elements.
    If I can get this to work, then it will minimize the amount of cleanup I have to do when I import the tagged (MS WORD) XML file into Adobe FrameMaker 8.

    The only other thing I would want, before saving the MS Word document as XML, is to make sure that none of the MS Word markup goes into the XML file when it is saved. I know that when an MS Word file is saved as an HTML file, a lot of MSO tags get inserted.

    Thanks in advance for any suggestions.

    Regards,

    -J

  2. #2
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Could you give a specific example of what you want, for those Loungers who, like me, have no idea what an XSD schema is?

  3. #3
    3 Star Lounger

    Hello Hans,

    Yes, I am more than happy to give further details. In fact, I have attached a sample FrameMaker DTD file that I converted to an XSD.
    If you follow these steps, you will see where I am at. First, save the XSD file to some location on your computer.

    1. From MS Word (2003 or newer) select: TOOLS>Templates and Add-Ins from the drop down menu.
    2. From the Templates and Add-Ins popup window, select Add Schema.
    3. Navigate to where you saved the Schema and click to open it.
    4. From the Schema Settings popup, enter anything in the URI and ALIAS fields, for example URI: http://www.wopr.com/Namespace and Alias: WOPR Chapter Namespace
    5. Click OK at the Templates and Add-Ins popup window. Your MS Word window will now have a panel on the right side filled with elements. To apply an element to styled text, highlight the text and click the element. Keep in mind that there is an order that has to be followed (parent and children) based on the XSD structure.

    I would like to have a macro that will automatically take an XSD, find a Word style, and wrap it with the corresponding element (e.g. Heading 1 style with Heading 1 element). Right now I can do this manually, but it takes a long time to highlight and click.

    Please note that there are options you can check off in the Templates and Add-Ins window. The only option I have checked off is: "Validate document against attached schemas"

    Any suggestion you can provide is very much appreciated. To open the attached file in MS Word, just change the file name extension to .XSD

    Thanks,

    Jim

  4. #4
    Plutonium Lounger

    I'm using Word 2002 at the moment, so I can't help, sorry. I hope that someone else will have a suggestion.

  5. #5
    Super Moderator
    Join Date
    Jan 2001
    Location
    Melbourne, Victoria, Australia
    Posts
    3,852
    Thanks
    4
    Thanked 259 Times in 239 Posts

    I understand what you are asking but believe that this goes far beyond the realm of a simple macro. It is easy enough to build a macro which puts an XML tag at the beginning and end of each paragraph based on the paragraph style name, but this will only give you a 'well formed' XML file rather than one validated against your schema. The two biggest sticking points will be the nesting of tags (applying hierarchy) to comply with the schema and the likely inconsistencies within the source documents.
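
    To illustrate why nesting is the hard part: tagging paragraph by paragraph produces a flat sequence, and the hierarchy has to be inferred afterwards. The following is a rough Python sketch, assuming the source uses the built-in "Heading 1" to "Heading 9" style names consistently. The `Chapter`, `Section`, `Title`, and `Para` element names are hypothetical stand-ins for whatever the real schema declares.

```python
import re
import xml.etree.ElementTree as ET


def nest_by_heading(paragraphs, root_name="Chapter"):
    """Build a nested tree from a flat list of (style_name, text) pairs.

    A "Heading N" paragraph opens a Section at depth N, closing any open
    section at the same or a deeper level. Every other paragraph becomes a
    Para in the innermost open section. Element names are illustrative only.
    """
    root = ET.Element(root_name)
    stack = [(0, root)]  # (heading level, element) for each open container
    for style, text in paragraphs:
        match = re.fullmatch(r"Heading ([1-9])", style)
        if match:
            level = int(match.group(1))
            # Close sections that cannot contain a heading of this level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "Section")
            ET.SubElement(section, "Title").text = text
            stack.append((level, section))
        else:
            ET.SubElement(stack[-1][1], "Para").text = text
    return root
```

    A document that jumps from Heading 1 straight to Heading 3, or that fakes headings with manual formatting, will still produce a tree here, but not necessarily one the schema accepts. That is where the manual cleanup comes in.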

    The avenue that I would pursue is to save the Word file as XML (this will be valid XML, albeit heavily cluttered with Microsoft artifacts) and then develop a 'transformation' that parses this standardised data into a new XML file which validates against your Schema/DTD and drops the unwanted artifacts. I don't know how to do this myself, but I believe it will give you the most logical workflow and standardise the process. The transformation may be best done using a non-Microsoft, non-Adobe product. The existence of commercial solutions makes me think that it is possible to automate large chunks of this task, but a considerable amount of human intervention is likely to remain necessary for inconsistent source documents.
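
    As a sketch of what such a transformation could look like, here is a small Python example. It assumes the file was saved in Word 2003's "XML Document" format, where paragraphs are `w:p` elements and the paragraph style ID sits in `w:pPr/w:pStyle`. The `Document` root and the `Generic` fallback are made-up names, and the function name is hypothetical. An XSLT stylesheet would be another way to do the same job.

```python
import xml.etree.ElementTree as ET

# Namespace of Word 2003's XML Document format (WordprocessingML 2003).
W = "http://schemas.microsoft.com/office/word/2003/wordml"


def wordml_to_clean_xml(wordml, root_name="Document", default="Generic"):
    """Reduce Word 2003 XML to one plain element per paragraph.

    Each w:p becomes an element named after its paragraph style ID
    (the w:val attribute of w:pStyle); paragraphs without an explicit
    style get `default`. All other Word markup is dropped.
    """
    source = ET.fromstring(wordml)
    target = ET.Element(root_name)
    for para in source.iter(f"{{{W}}}p"):
        style = para.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        name = style.get(f"{{{W}}}val") if style is not None else default
        # Join the text of all runs in the paragraph.
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        ET.SubElement(target, name).text = text
    return ET.tostring(target, encoding="unicode")
```

    The result is flat and only well formed. Mapping style IDs onto the schema's element names and adding the required nesting would be further steps on top of this.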

    However, there are commercial options out there to assist you. I did a search and found a few worth looking at.
    http://www.legacydataconversion.com/ takes your base files and returns them to you with the work done offsite.
    http://www.scriptorium.com/docframe/ appears to approach this from the FrameMaker end.
    http://www.livelinx.com/downloads/XML-Gene...White-Paper.pdf appears to approach from the Word end.

    I am very interested in how you get on with this project. Let us know what direction you follow to solve your problem.
    Andrew Lockton, Chrysalis Design, Melbourne Australia

  6. #6
    3 Star Lounger

    Hello Andrew,

    Thanks for taking the time to look at my posting in detail. I appreciate the links that you have found too.

    Yes, I understand what you are saying about the hierarchy, but that is something I will worry about later. For now, I am just trying to find a way to create a macro that will find each style in the Word document (e.g. 20 styles) and then simply wrap that style in an element tag, based on the schema that I have attached to that Word document. I am sure there has to be a way to do this within MS Word or through some type of form (Visual Basic). How would I go about adding new options to the Find and Replace "Format" or "Special" drop-down options? Currently, under the Format option, you can select Style and then a Find Style popup menu appears. It would be great if I could customize this window to show the XSD elements from any attached schema.

    Then perhaps, I can manually run a Find and Replace for each style (one at a time).

    Better yet, if I am going to create a process, it might as well be a Macro, as it could apply to many styles at once.

    The links you sent me were very informative, but my overall goal is to find a FREE approach to this task.


    One last thing, I liked what you said about just saving the MS Word document as an XML file and then bringing that file into Structured FrameMaker 8.

    But, are there any free tools out there that will remove the Microsoft code that is embedded in the document?

    I will have to see if I can get FrameMaker 8 to ignore (filter out, maybe via a FrameScript) any of the Microsoft code and just bring in the XML structure. Then perhaps I can use some type of conversion table in FrameMaker to migrate the data into the desired structure.

    Once again though, I feel it would be best if I can set up the XML tags in MS Word first. So for now, I will pursue that path.

    Thanks again for the information and I will make sure to keep you posted on my progress.


    Regards,

    Jim

  7. #7
    WS Lounge VIP
    Join Date
    Mar 2006
    Location
    Maryland, USA
    Posts
    690
    Thanks
    17
    Thanked 66 Times in 56 Posts

    I see that you want something that is free. Still, you might want to take a look at RazzmaTag, "a universal tagging utility that finds formatting in Microsoft Word and marks it with typesetting tags for use in such programs as QuarkXPress, Ventura,TeX, or pretty much anything else". At the least, reading the full description may give you some ideas. The site is http://www.editorium.com/

    I use some of the company's other add-ins, but not this one.

    PamC
    Pam Caswell

  8. #8
    Super Moderator

    I see from your other thread that you are still pursuing this path so here is a little kick start for you. As I said earlier, the code to tag an unstructured document is far from trivial but the basic code to tag paragraphs based on stylename is simple enough.
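    <pre>Sub temp1()
        ' Wrap every paragraph in an XML element named after its paragraph style.
        ' The element must already be declared in the schema attached under
        ' the "SimpleSample" namespace.
        Dim aPara As Paragraph
        Dim styleName As String
        For Each aPara In ActiveDocument.Paragraphs
            styleName = aPara.Range.Style   ' name of the paragraph's style
            Debug.Print styleName           ' log it to the Immediate window
            ActiveDocument.XMLNodes.Add Name:=styleName, _
                Namespace:="SimpleSample", Range:=aPara.Range
        Next aPara
    End Sub</pre>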

    The shortcomings of this simplistic approach are numerous. For starters, the XML node name can't contain spaces, since I don't think that would be a valid XML tag. Also, the tag must already exist in the namespace you are using. You won't get any nesting of tags either, although it might be possible to deduce the nesting if your documents were structured in the same way as the schema.
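
    The first two shortcomings (spaces in names, and tags that the schema does not declare) can be checked before any tagging is attempted. Here is a hedged Python sketch, separate from the macro above: it reads the element names declared in an XSD and turns a style name into either a declared element name or a fallback. The `Generic` fallback and both function names are hypothetical, and the name cleanup is an ASCII-only simplification of the real XML naming rules.

```python
import re
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # standard XML Schema namespace


def declared_elements(xsd_text):
    """Collect the names of all elements declared in an XSD."""
    schema = ET.fromstring(xsd_text)
    return {
        el.get("name")
        for el in schema.iter(f"{{{XS}}}element")
        if el.get("name")
    }


def element_name_for(style_name, allowed, fallback="Generic"):
    """Turn a style name into a usable element name.

    Characters other than ASCII letters, digits, '.', '-' and '_' are
    dropped, a name that does not start with a letter or underscore gets
    an underscore prefix, and names the schema does not declare are
    replaced by the fallback.
    """
    name = re.sub(r"[^A-Za-z0-9._-]", "", style_name)
    if not name or not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name if name in allowed else fallback
```

    With a schema that declares `Heading1` and `Generic`, the style "Heading 1" maps to `Heading1`, and a style such as "Body Text" that has no matching declaration maps to `Generic`.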

    Saving the resultant XML without any of the Microsoft Word tags is as simple as looking carefully at the SaveAs dialog. The checkbox which says 'Save Data Only' will give you a clean XML file without any of the Microsoft formatting tags.
    Andrew Lockton, Chrysalis Design, Melbourne Australia
