Results 1 to 13 of 13
  • Thread Tools
  1. Lounger
    Join Date
    Jan 2008
    Posts
    37
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I am looking for the easiest way to import the attached file into Excel so that analysis can be performed. The file was created by another application and has all the pertinent data. What I am struggling with is how to get the two lines of data onto one line in order to easily import it into Excel using the Text to Data feature. If you look at the copy/paste below, you'll notice the header and the subsequent data is coming across on two lines. I cannot do a clean import into Excel and this is quite a big file. All suggestions are welcome.

    CHECK
    BANK CHECK # CHECK AMT CHECK DATE VENDOR # ADDRS # VENDOR NAME STATUS
    WO # AMOUNT G/L ACCT # DESCRIPTION INVOICE # INV VEND
    ------------------------------------------------------------------------------------------------------------------------------------

    1 1 $38742.79 05/21/09 INT01 0 INTERNAL REVENUE SERVICE OUTSTANDING
    38742.79 01-00000-20600-00000-000 PAYROLL TAXES FOR PE 4/24/09 PPE 4/24/09

    1 2 $552.60 05/21/09 CON06 0 CONEXIS OUTSTANDING
    552.60 90-00000-23327-00000-000 FLEX 125-DEPENDENT CARE REIMBS 051509

    Thanks.
    Attached Files Attached Files

  2. WS Lounge VIP sdckapr's Avatar
    Join Date
    Jul 2002
    Location
    Pittsburgh, Pennsylvania, USA
    Posts
    10,167
    Thanks
    8
    Thanked 159 Times in 154 Posts
    I think the easiest thing to do would be to have the program exporting this text file, export it in a more appropriate way. Some programs can export into an excel worksheet instead of using text.

    If it is only this file, I would manually work on it, trying to determine what "type" the line of text is. If you are only interested in the data and not the header, I would add first add a column numbering the items (to always be able to sort to the original list) then a column with a 0, 1, 2, 3 (0 for header, 1 for line1, 2 for line2, 3 for blank) then you can sort by type and then original number, delete the rows with 0 and 3, and then do a text to column with type 1 and type 2 and then move type 2 columns into the columns in the same rows as the type 1


    If you have to do this a lot a macro could be written to do it, but how complex will depend on how much it may change from time to time

    Steve

  3. Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    3,309
    Thanks
    0
    Thanked 94 Times in 92 Posts
    Hi aapke,

    Before loading the file into Excel, load it into Word and run the following macro against it, then re-save the file and load into Excel. From there you can use the Excel text-to-columns function, with fixed-with splits, to parse the data.

    Code:
    Sub StatementReformat()
    Application.ScreenUpdating = False
    With ActiveDocument.Range.Find
      .ClearFormatting
      .Replacement.ClearFormatting
      .Forward = True
      .Wrap = wdFindContinue
      .Format = False
      .MatchCase = False
      .MatchWholeWord = False
      .MatchAllWordForms = False
      .MatchSoundsLike = False
      .MatchWildcards = True
      'First Page
      .Text = "REPORT*EXTRACT FILE : [A-Z0-9]{1,}^13^12"
      .Replacement.Text = ""
      .Execute Replace:=wdReplaceOne
      'Column Headers (except New First Page)
      .Text = "^12*VEND^13*[-]{1,}^13"
      .Execute Replace:=wdReplaceAll
      'Header Underline on First Page
      .Text = "[-]{1,}^13"
      .Execute Replace:=wdReplaceAll
      'Last Page
      .Text = "          TOTAL #*^12REPORT*{1,255}PAGE*{1,255}FUND TOTALS*[=]{1,}*^13*^13"
      .Execute Replace:=wdReplaceAll
      'Unwanted Header Lines - First Page
      .Text = "REPORT*CHECK^13"
      .Execute Replace:=wdReplaceAll
      'Empty Paragraphs
      .Text = "[ ]{1,}^13"
      .Execute Replace:=wdReplaceAll
      'Header Line Wrap - First Page
      .Text = "(STATUS)^13"
      .Replacement.Text = "\1"
      'Record Line Wraps
      .Execute Replace:=wdReplaceAll
      .Text = "(OUTSTANDING)^13"
      .Execute Replace:=wdReplaceAll
      .Text = "(CLEARED)^13"
      .Replacement.Text = "\1    "
      .Execute Replace:=wdReplaceAll
      .Text = "(VOIDED)^13"
      .Replacement.Text = "\1     "
      .Execute Replace:=wdReplaceAll
      'Pad Records with Multiple Rows
      .Text = "(^13)([!^13]{1,120}^13)"
      .Replacement.Text = "\1                                                                                                                               \2"
      .Execute Replace:=wdReplaceAll
      .Execute Replace:=wdReplaceAll
    End With
    Application.ScreenUpdating = True
    End Sub
    Update:
    I noticed that negative values in the data have the '-' sign trailing the values concerned. Excel doesn't handle those too well. You can convert them to values with a preceding '-' sign by adding the following four lines immediately before the 'End With' statement in the macro:
    Code:
      'Reformat -ve values
      .Text = "( )([0-9]{1,}.[0-9]{2})(-)"
      .Replacement.Text = "\3\2\1"
      .Execute Replace:=wdReplaceAll
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  4. Lounger
    Join Date
    Jan 2008
    Posts
    37
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Okay, that was pretty cool to just watch. Thank YOU! Do I need to run it more than once to fix the issue with the multiple lines for one record?

    1 3000022 $697.49 04/16/09 COA05 0 COASTAL BLUE, INC OUTSTANDING 24.84 50-00000-62250-07203-000 FILM/MAPS/BLUEPRINTS 423225

    25.91 25-00000-63608-80028-000 FILMS/MAPS/BLUEPRINTS 423272
    12.34 50-00000-62250-09207-000 FILMS/MAPS/BLUEPRINTS 423458

    20.10 25-00000-63608-80058-000 FILMS/MAPS/BLUEPRINTS 423527
    19.22 50-00000-62250-09207-000 FILMS/MAPS/BLUEPRINTS 423950

    59.86 25-00000-63608-80068-000 FILMS/MAPS/BLUEPRINTS 423980
    51.56 25-00000-63608-80064-000 FILMS/MAPS/BLUEPRINTS 424065

    77.85 01-62000-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424076
    30.44 50-00000-62250-07104-000 FILMS/MAPS/BLUEPRINTS 424187

    77.58 50-00000-62250-07107-000 FILMS/MAPS/BLUEPRINTS 424191
    37.34 50-00000-62250-07101-000 FILMS/MAPS/BLUEPRINTS 424320

    47.02 01-62000-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424439
    20.10 60-94030-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424479

    29.56 60-94030-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424480
    53.71 60-94030-63608-00000-000 FILMS/MAPS/BLUEPRINTS 424620

    34.53 50-00000-62250-07104-000 FILMS/MAPS/BLUEPRINTS 424695
    75.53 25-00000-63608-80065-000 FILMS/MAPS/BLUEPRINTS 425067

    Thanks.
    Attached Files Attached Files

  5. Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    3,309
    Thanks
    0
    Thanked 94 Times in 92 Posts
    Hi aapke,

    You shouldn't need to. The two consecutive '.Execute Replace:=wdReplaceAll' statements near the end of the first code module I posted should take care of that - it did in my testing.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  6. Lounger
    Join Date
    Jan 2008
    Posts
    37
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I just knew I did something incorrectly. Basic steps I used are: Generate the report again in the finance system. Saved the generated file as .TXT. Opened it up in Word. Inserted your totally cool code into a module. Saved the file as a .docm. Ran the macro, sat back, and watched it work it's magic. Saved the file as a .TXT again. Imported it into Excel using the Text to Data feature. And yes, I did incorporate the additional 4 lines.

    Did I miss a step along the way? Either way, the new file I have to work with is so much cleaner than the old one, it won't take more than a few hours to clean it up manually if needed! Thank you so much!

    aapke

  7. Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    3,309
    Thanks
    0
    Thanked 94 Times in 92 Posts
    Hi aapke,

    I suspect what you're seeing is line-wrapping effects. Attached is a Word file containing the full macro, with the prcoessed data and the page format set up so that each record takes a single line in the document. As you can see, the multi-entry records are fully padded.
    Attached Files Attached Files
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  8. Lounger
    Join Date
    Jan 2008
    Posts
    37
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi macropod,

    I've copied the steps exactly (I believe) and am still getting the line wrapping. Can you tell from a quick glance what I am doing incorrectly?

    Thanks again,
    aapke
    Attached Files Attached Files

  9. Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    3,309
    Thanks
    0
    Thanked 94 Times in 92 Posts
    Hi aapke,

    I don't understand why your system isn't behaving the same as mine. Is your copy of Word of Word 2007 fully updated? There were some problems with the vba implementation in the earlier releases. I've run the code with both Word 2000 SP3 and Word 2007 SP2 and it works fine. Nevertheless, it seems what you're ending up with is not so much a line wrapping issue as unwanted line breaks in every 2nd padded line. You should be able to fix that by adding the following extra lines to your code:
    Code:
      'Fix Broken Lines
      .Text = "( )(^13)( )"
      .Replacement.Text = "\1\3"
      .Execute Replace:=wdReplaceAll
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  10. Lounger
    Join Date
    Jan 2008
    Posts
    37
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Ah. That may just be it. My copy of Word 2007 shows as "MS Word 2007 (12.0.6514.5000) SP2 MSO (12.0.6521.5000). Hopefully that'll fix everything!

    Thank you again for ALL the assistance.
    aapke

  11. Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    3,309
    Thanks
    0
    Thanked 94 Times in 92 Posts
    Hi aapke,
    My copy of Word 2007 shows as "MS Word 2007 (12.0.6514.5000) SP2 MSO (12.0.6521.5000).
    In that case, you do indeed have Word 2007's SP2 update installed. Strange - perhaps it's just some little glitch lurking somewhere else in your system.
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  12. Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    3,309
    Thanks
    0
    Thanked 94 Times in 92 Posts
    Hi aapke,

    Instead of adding the extra lines of code for repairing the broken data lines that I suggested earlier, try changing:
    Code:
      'Pad Records with Multiple Rows
      .Text = "(^13)([!^13]{1,120}^13)"
      .Replacement.Text = "\1                                                                                                                               \2"
      .Execute Replace:=wdReplaceAll
      .Execute Replace:=wdReplaceAll
    to
    Code:
      'Pad Records with Multiple Rows
      .Text = "(^13)([ ]{20})([ ]{1,8}[0-9]{1,8}.[0-9]{2})"
      .Replacement.Text = "\1\2\2\2\2\2\2\2       \3"
      .Execute Replace:=wdReplaceAll
    You'll notice that the new block has only one '.Execute Replace:=wdReplaceAll' statement. That's all I found necessary and it should make the overall execution just that little bit quicker - as should not needing those other extra lines of code for repairing the broken data lines .
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  13. Lounger
    Join Date
    Jan 2008
    Posts
    37
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Will do!

    Thank you again.
    aapke

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •