  1. #1
    Star Lounger
    Join Date
    Aug 2003
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    select foreign characters (word 2003)

    I posted this in the Excel forum as post #379179. The macro for Excel works perfectly; now I want to use the macro in Word. Hans gave me a lot of help.
    Unfortunately, the Word macro cannot be run on a large file.
    Can anyone help with this, please?

  2. #2
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Some additional info for those who haven't followed the Excel thread starting at post 379179:

    Joe has documents with a mixture of Western and Chinese characters. He wants to change the size of only the Chinese characters (or only the Western characters). I adapted an Excel macro by sdckapr for Word; it loops through the document character by character and tests whether the AscW value of each character is inside or outside the range 0..255. It works, but it is excruciatingly slow. Has anyone got a better idea?

  3. #3
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Maybe you could adapt the code in the attached? It's a demo that highlights the "non-ANSI" characters by looping through a byte array of each paragraph. It has a problem with the "smart" apostrophe, which gives an Asc value under 255 but an AscW value greater than 255. I'm not sure what the solution is for that. If I knew more about the Chinese character set, that probably would help a lot.
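
    One possible way around the smart-apostrophe problem, offered only as a sketch: instead of treating everything above 255 as Chinese, test whether the character falls in a Unicode block used for CJK text. The function name IsCjkChar is made up for illustration and is not part of the attached demo. The three blocks chosen (CJK Symbols and Punctuation U+3000 to U+303F, CJK Unified Ideographs U+4E00 to U+9FFF, Halfwidth and Fullwidth Forms U+FF00 to U+FFEF) are an assumption; a real document may need more ranges.

```vba
Option Explicit

' Hypothetical helper (not from the attached demo): returns True when the
' first character of ch lies in one of a few Unicode blocks used for CJK text.
Function IsCjkChar(ByVal ch As String) As Boolean
    Dim code As Long
    If Len(ch) = 0 Then Exit Function
    ' AscW returns a signed Integer; masking with &HFFFF& yields 0..65535
    code = AscW(ch) And &HFFFF&
    Select Case code
        Case &H3000& To &H303F&    ' CJK Symbols and Punctuation
            IsCjkChar = True
        Case &H4E00& To &H9FFF&    ' CJK Unified Ideographs
            IsCjkChar = True
        Case &HFF00& To &HFFEF&    ' Halfwidth and Fullwidth Forms
            IsCjkChar = True
    End Select
End Function
```

    With a test like this, the smart apostrophe (U+2019, AscW value 8217) is left alone because it lies outside all three ranges, while an ideograph such as ChrW(&H4E2D) is picked up.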

  4. #4
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Hi Hans,

    A For...Each will get through the characters much faster. It got through 27 pages of "the quick brown fox ..." in less than 20 seconds; your mileage may vary.

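    <pre>Sub ChangeCharacterSize()
        Dim dSize As Double     ' target font size in points
        Dim char As Range
        Dim code As Integer
        dSize = 22

        ' For Each walks the Characters collection much faster than indexing it
        For Each char In ActiveDocument.Characters
            ' AscW returns a signed Integer, so values above 32767 come back negative
            code = AscW(char.Text)
            If code > 255 Or code < 0 Then
                char.Font.Size = dSize
            End If
        Next char
    End Sub
    </pre>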


  5. #5
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Jefferson and Andrew: thanks.

    Andrew's For Each is indeed spectacularly faster than For x = 1 To ..., and Jefferson's solution is even faster than that. In a test on a document that was 15 pages long before the macro ran, the For x code did not finish within a reasonable time; Andrew's took only 4.5 seconds, and Jefferson's just 0.7 seconds. The result seemed correct.
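
    For anyone wanting to repeat the comparison, timings like these can be taken with VBA's Timer function. This is only a sketch of one way to do it, not necessarily how the figures above were measured. ElapsedSeconds and TimeChangeCharacterSize are hypothetical names, and the second routine assumes the ChangeCharacterSize macro posted earlier in the thread is in the same project.

```vba
Option Explicit

' Hypothetical helper: seconds elapsed since startTime, where startTime was
' read from Timer. Timer counts seconds since midnight, so allow for rollover.
Function ElapsedSeconds(ByVal startTime As Single) As Single
    Dim nowTime As Single
    nowTime = Timer
    If nowTime < startTime Then nowTime = nowTime + 86400!
    ElapsedSeconds = nowTime - startTime
End Function

' Assumes ChangeCharacterSize (posted earlier in the thread) is available.
Sub TimeChangeCharacterSize()
    Dim t As Single
    t = Timer
    ChangeCharacterSize
    Debug.Print "Elapsed: " & Format(ElapsedSeconds(t), "0.0") & " s"
End Sub
```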

    Now it's up to Joe to test.

  6. #6
    Star Lounger
    Join Date
    Aug 2003
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Thank you, Jefferson, Andrew and Hans.
    I tried both on 261 pages: Andrew's took 8'25"
    and Jefferson's took 9'10".

  7. #7
    Silver Lounger
    Join Date
    Jan 2001
    Location
    West Long Branch, New Jersey, USA
    Posts
    1,921
    Thanks
    6
    Thanked 9 Times in 7 Posts

    Hi Jefferson,

    Although I don't think this can be done in VBA or even VB (so what's the point, you ask?), we had another approach that we used when doing some development in IBM's PL/1 (is that still used?). The approach would go something like this:
    - starting with your approach of examining paragraph by paragraph, we would NOT examine each char initially. The first step would have a big payoff with big paras or a lot of paras of Western-only characters. This first step would go as follows:
    ---"overlay" a Char string over the arrBytes (so it becomes another way of looking at the storage of arrBytes but occupies no extra storage)
    ---define a "mask" string of bytes alternating as X'FF' and X'00' with X'FF' being the odd numbered bytes (like how you do your testing of odd bytes)
    ---AND the mask string with the original Char string and store as Result string
    ---compare the Result string to a string of all bytes set to 0

    For big paragraphs, this has a bigger payoff than for small paragraphs.

    When doing the AND, any bits set to 1 in the original string in the odd bytes are preserved but any bits set to 1 in even bytes are forced to 0. Now for setting whatever chars you want to make bigger, you need to look at the result of the comparison and what you're trying to flag:
    - If the comparison to the all 0's string is equal, then the original was all western characters.
    ---If you want to set western characters to be bigger, then you can do this for the entire para at one shot
    ---if you want to set non-western characters to be bigger, then skip this para since there are none
    - If the comparison to the all 0's string is not equal, then there were some (or all) non-western characters in the original
    ---what you do here depends on what you want to make bigger (western or non-western). For example, if you want to make western chars bigger, you could use InStr searching for the null X'00' string (assume no even bytes have this for any characters)
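
    For what it's worth, a rough VBA approximation of the pre-screening step might look like the sketch below. VBA has no single operation that ANDs a whole byte array against a mask, so this walks the high bytes in a loop and exits at the first non-zero one; it is not the PL/1 technique itself. HasNonWesternChar and EnlargeNonWestern are made-up names, the size 22 is borrowed from the macro posted earlier, and the sketch assumes that assigning a String to a Byte array gives two bytes per character with the low byte first, so the high bytes sit at the odd zero-based indexes.

```vba
Option Explicit

' Hypothetical pre-screen: True if any character in s has a non-zero high byte
' (that is, an AscW value outside 0..255).
Function HasNonWesternChar(ByVal s As String) As Boolean
    Dim arrBytes() As Byte
    Dim i As Long
    If Len(s) = 0 Then Exit Function
    arrBytes = s                      ' two bytes per character, low byte first
    For i = 1 To UBound(arrBytes) Step 2
        If arrBytes(i) <> 0 Then      ' high byte set: not in 0..255
            HasNonWesternChar = True
            Exit Function
        End If
    Next i
End Function

' Example use: skip paragraphs that contain only Western characters,
' and go character by character only in the paragraphs that need it.
Sub EnlargeNonWestern()
    Dim para As Paragraph
    Dim char As Range
    For Each para In ActiveDocument.Paragraphs
        If HasNonWesternChar(para.Range.Text) Then
            For Each char In para.Range.Characters
                If HasNonWesternChar(char.Text) Then char.Font.Size = 22
            Next char
        End If
    Next para
End Sub
```

    The payoff should be largest in documents with many Western-only paragraphs, since those are dismissed after one pass over their bytes. Note that this test still flags "smart" punctuation such as U+2019, because its high byte is non-zero.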

    In general, we found these string operations to be much faster when starting with "big" strings.

    But I don't think this can be done with VB or VBA so it's somewhat academic here.

    But it was a useful exercise for us since CPU time was a scarce resource.

    Fred

  8. #8
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Fred, I agree with the idea of "pre-screening" ranges of text, and in the past when we had threads about finding, for example, some kind of errant formatting, I think we had code that did that. I'm not sure it's feasible in this case, but I haven't really given it a lot of thought.

    > CPU time was a scarce resource

    Not to mention the time waiting for the results back from school district headquarters while they ran your punch cards. (Back in my Fortran days.)

  9. #9
    Silver Lounger
    Join Date
    Jan 2001
    Location
    West Long Branch, New Jersey, USA
    Posts
    1,921
    Thanks
    6
    Thanked 9 Times in 7 Posts

    Jefferson,

    I was looking in my VB/VBA ref book and the closest I got to being able to do this was the Filter BIF, but that's only in VB, not VBA. Not sure it's doable in VBA. But if Filter (or the equivalent of the PL/1 AND) is available, then I think it's doable. I did note that VBA allows two numbers to be ANDed or ORed together, the result being the bit-wise AND or OR. So 6 AND 2 results in 2. But to make it superfast, you want to do an AND over the entire array at once and not iterate through a do-loop.

    Of course, Fortran had no native string manipulation functions at all, so you're "forgiven". My very first professional assignment was to do a big computer system for tracking problems in telephone central office equipment based on collected data. Everyone wanted to do it in Fortran because that's what they had used on the previous project. I said let's do PL/1 and we did. But we used punch cards too. (They got back at me because the next project was done in Fortran, but we used a lot of text-manipulation libraries.)

    Fred
