Results 1 to 15 of 27
  1. #1
    Super Moderator BATcher's Avatar
    Join Date
    Feb 2008
    Location
    A cultural area in SW England
    Posts
    2,832
    Thanks
    19
    Thanked 110 Times in 104 Posts

    Want command-line utility to search single file for multiple strings

    I have a single log-type ASCII text file whose lines come in a significant number of possible line formats (perhaps 50?).
    I want to test each line to see if it contains one of a number of unique strings, and write the line to an output file if any of these required strings is found.
    If the line contains none of the required strings, or contains one of a number of NON-required strings, the line is to be ignored.

    The strings would usually contain more than one word separated by blanks, and the line will usually contain at least one email address wrapped in angle brackets (e.g. <gladys@domain.com>)

    The built-in commands FIND and FINDSTR really only handle a single string, without errors, so up to 50 runs against the same input file would be somewhat inefficient!

    Does anyone know of a command-line utility which would do this? (If anyone has a UK IBM mainframe background, what I'm really asking for is a free version of the SELCOPY utility!)
    Last edited by BATcher; 2013-04-02 at 08:58.
    BATcher

    Dear Diary, today the Hundred Years War started ...

  2. #2
    4 Star Lounger access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    527
    Thanks
    50
    Thanked 40 Times in 37 Posts
    This would be a cinch in Perl using regular expressions! However, a quick Google for 'DOS Regular expressions' comes up with some suggestions, e.g.
    http://www.computerhope.com/findstr.htm
    http://www.2150.com/regexfilter/Docu...xpressions.asp

  3. #3
    Administrator
    Join Date
    Jun 2010
    Location
    Portugal
    Posts
    10,306
    Thanks
    130
    Thanked 1,158 Times in 1,066 Posts
    Probably wouldn't be too hard to write a small C# app to do that, using regular expressions.
    Rui
    -------
    R4

  4. #4
    Super Moderator BATcher's Avatar
    Join Date
    Feb 2008
    Location
    A cultural area in SW England
    Posts
    2,832
    Thanks
    19
    Thanked 110 Times in 104 Posts
    I don't think regular expressions would assist greatly - it's simply a multiple string-matching problem.
    if line contains "string 1" then write it to output-file
    if line contains "string 2" then write it to output-file
    ...
    if line contains "string z" then ignore line
    ...
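
The rule list above can be sketched with plain substring tests, no regular expressions needed. This is only a minimal illustration in Python, not SELCOPY and not any poster's actual tool: the names `keep_line`, `wanted` and `unwanted` and the sample log lines are made up, and it assumes that a NON-required string always wins over a required one, as the first post describes.

```python
def keep_line(line, wanted, unwanted):
    """Return True if line has a wanted string and no unwanted string."""
    # Exclusions win: any unwanted string rejects the line outright.
    if any(s in line for s in unwanted):
        return False
    # Otherwise keep the line only if at least one wanted string occurs.
    return any(s in line for s in wanted)


# Hypothetical sample data, not taken from the real log file.
wanted = ["message accepted for", "delivery completed to"]
unwanted = ["spam score exceeded"]

lines = [
    "09:01 message accepted for <gladys@domain.com>",
    "09:02 connection opened",
    "09:03 delivery completed to <gladys@domain.com> spam score exceeded",
]
selected = [ln for ln in lines if keep_line(ln, wanted, unwanted)]
print(selected)
```

Each line is examined once, however many strings are in the two lists, which avoids the repeated passes over the input file that the first post was worried about.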

    And it would be extremely hard to write any application other than BATch for me - the last computer language programming I did was in IBM Assembler-360 and Rexx. There ain't much of that around no more.

  5. #5
    Administrator
    Join Date
    Jun 2010
    Location
    Portugal
    Posts
    10,306
    Thanks
    130
    Thanked 1,158 Times in 1,066 Posts
    Well, even for whole string matching, using a regular expression can make things easy, although it seems that wouldn't be needed here.
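
To illustrate that point: one way a regular expression can help with whole-string matching is to join all the literal strings into a single alternation, so each line is scanned once. The following is a hedged Python sketch, not something posted in this thread; `build_matcher` and the search strings are hypothetical.

```python
import re


def build_matcher(strings):
    """Compile one pattern that matches any of the literal strings."""
    if not strings:
        raise ValueError("at least one search string is required")
    # re.escape keeps characters such as '.', '<' and '>' literal.
    # Longer strings go first so that, at a given position, the
    # alternation tries the longest candidate before its prefixes.
    ordered = sorted(strings, key=len, reverse=True)
    return re.compile("|".join(re.escape(s) for s in ordered))


# Hypothetical search strings for illustration only.
matcher = build_matcher(
    ["accepted for", "<gladys@domain.com>", "relay=mail.example"]
)

hit = matcher.search("09:01 message accepted for <gladys@domain.com>")
miss = matcher.search("09:02 connection opened")
print(hit.group(0) if hit else None)
print(miss)
```

Escaping matters here because the search strings are meant literally: without it, the dots in an email address would match any character.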
    Rui
    -------
    R4

  6. #6
    Administrator
    Join Date
    Mar 2001
    Location
    St Louis, Missouri, USA
    Posts
    20,554
    Thanks
    2
    Thanked 614 Times in 550 Posts
    Couldn't you use PowerShell? See some of the links at "powershell search text file for string" for a starting point. In particular, see "Hey, Scripting Guy! How can I use Windows PowerShell to Search a Text File for multiple strings?".

    Joe

  7. #7
    Super Moderator RetiredGeek's Avatar
    Join Date
    Mar 2004
    Location
    Manning, South Carolina
    Posts
    6,227
    Thanks
    202
    Thanked 794 Times in 728 Posts
    BATcher,

    Here's a PowerShell script that will do the trick.
    Code:
    param (
      [string]$ExistingFile = "WindowsUpdate.log",
      [string]$NewFile = "PSProcessed.txt",
      [string]$DriveDirPath  = "G:\BEKDocs\Scripts"
    )
    # ******* Setup Section *********
     $LinesProcessed = 0
     $LinesMatched = 0
     $MatchStrings = @("*Process:*", "*AUSearcher Search*")
     $MatchCnt = $MatchStrings.Count
     remove-item "$DriveDirPath\$NewFile" -ErrorVariable Errs 2>$null
    # *******End of Setup Section ********
    
    ForEach( $Line in get-content "$DriveDirPath\$ExistingFile" ) {
       $LinesProcessed++
       For( $Cnt = 0 ; $Cnt -lt $MatchCnt; $Cnt++) {
    
         if($Line -like $MatchStrings[$Cnt] ) {
           add-content "$DriveDirPath\$NewFile" $Line
           $LinesMatched++
           break
         }  #End If
    
       }  #End For 
    
    }   #End ForEach
    
    Write-Host "$LinesProcessed lines were tested. `n"  `
               "$LinesMatched lines were matched. `n"  `
               $Errs.count " errors encountered."
    Notes:
    1. You need to change the $MatchStrings = @(.....) array. Just replace the ..... with a list of the phrases you want to search for, and add a wildcard * on either side of each one. See the example. I've only got 2 items in the array, but you can add as many as you need.

    2. All the parameters have defaults (lines 2-4 of the script). When you call the program you can override them:
    .\ProccessTextFile.ps1 -ExistingFile filespec -Newfile filespec -DriveDirPath d:path\path\...
    As written, the source and output files have to be in the same directory; of course, this could be changed. You only need to include the parameters you want to change, and they can be in any order.

    3. Type powershell in the search box to get a PowerShell prompt to get started.

    Here is the sample output from 2 successive runs:
    PowerShellRun.JPG
    Note the second one shows an error. That's because I had deleted the destination file, so when the script tried to remove it, it wasn't there. If you get more than 1 error, something's rotten in Mudville!

    HTH
    Last edited by RetiredGeek; 2013-04-04 at 15:26.
    May the Forces of good computing be with you!

    RG

    VBA Rules!

    My Systems: Desktop Specs
    Laptop Specs



  9. #8
    New Lounger
    Join Date
    Dec 2009
    Location
    St Helens, Merseyside, England
    Posts
    23
    Thanks
    2
    Thanked 4 Times in 1 Post

    Give Textreme a try

    Hi there, I had a need to do this to extract Argos (an eVisions reporting tool) actual report runtime events from a system logfile with loads of other stuff in it ages ago, and consequently it occurred to me that if I made the thing more versatile, it would save me from trying to resurrect my oft-rusting Perl/RegExp skills when I needed to do this in future with other text files.

    So, I wrote a small utility using AutoIT and dubbed it Textreme, and you can download it from my website to see if it might come some way towards meeting your needs.

    It's available at www.jollybean.co.uk under the Textreme menu item on the right.

    If you DO give it a try I would be interested to know if it was of use to you!

    Best regards,

    Jim.

    [Edit] I just tried it and it appears that the Include function does not 'cascade' properly so that you would have to make several passes -- sorry about that but thanks for highlighting a 'feature' that I need to work on!

    [Edit2] Okay, I think I have got both the Include and Exclude functions cascading correctly now. If you can wait until tomorrow, I should be able to upload the latest version to my site this evening (UK time). I had been meaning to do this anyway because I recently added another function ('Move') as a result of an article about editors written by Verity Stob on TheRegister website.
    Last edited by jimollerhead; 2013-04-04 at 06:29. Reason: extra information

  10. #9
    Star Lounger
    Join Date
    Feb 2010
    Location
    near Ottawa, Ontario, Canada
    Posts
    57
    Thanks
    65
    Thanked 12 Times in 11 Posts
    Hi Batcher,

    At work I have had to do that a bunch. On the Linux side it's easy with the command "grep". In order to be portable, I have installed Cygwin on my WinXP laptop so it acts like Linux. Then using Perl I have written several different grep-like programs for various specific functions. For your purpose that's likely overkill.

    How about a windows version of grep?
    see http://www.wingrep.com/index.htm

    I have never used it, but under "Features" it does claim to have a command-line interface.
    If nothing else it may give you another term to google for....

    Good Luck!
    brino

  11. #10
    Super Moderator BATcher's Avatar
    Join Date
    Feb 2008
    Location
    A cultural area in SW England
    Posts
    2,832
    Thanks
    19
    Thanked 110 Times in 104 Posts
    RetiredGeek - it looks as if my reply to you yesterday has disappeared! I have never used PowerShell (at least knowingly), but will see if I can get your program to work in my circumstances.

    Jim - your mention of Verity Stob (one of my literary/literate heroines, like Lucy Kellaway) makes it compulsory for me to try your utility! (But tomorrow...)

    Brino - I wasn't aware that grep handled such a large number of search strings, but I will look into it.

  12. #11
    Lounger
    Join Date
    May 2011
    Posts
    33
    Thanks
    0
    Thanked 1 Time in 1 Post
    BATcher,

    Another candidate from the Unix/Linux camp may be awk (or gawk). I believe it is also available in the Cygwin package that brino mentioned (http://cygwin.com/packages/). Another implementation is at http://gnuwin32.sourceforge.net/packages/gawk.htm.

    mo.eu

  13. #12
    New Lounger
    Join Date
    Jan 2011
    Posts
    8
    Thanks
    0
    Thanked 4 Times in 1 Post
    BATcher, I have a program that I believe does what you are looking for. See the program titled String Search Counter (which is the very first program listed) on my page, http://www.billanddot.com/downloads.htm .

    Cheers,

    Bill P.

  14. #13
    Super Moderator RetiredGeek's Avatar
    Join Date
    Mar 2004
    Location
    Manning, South Carolina
    Posts
    6,227
    Thanks
    202
    Thanked 794 Times in 728 Posts
    BATcher,
    I just noticed this
    or contains one of a number of NON-required strings
    The PowerShell code I posted does not currently do this part. Sorry I missed it on 1st reading. I'll try to get it updated.

  15. #14
    Super Moderator RetiredGeek's Avatar
    Join Date
    Mar 2004
    Location
    Manning, South Carolina
    Posts
    6,227
    Thanks
    202
    Thanked 794 Times in 728 Posts
    BATcher,

    Here's version 2 that adds the capability to exclude matched records with exclusion strings!

    Code:
    param (
      [string]$ExistingFile = "WindowsUpdate.log",
      [string]$NewFile = "PSProcessed.txt",
      [string]$DriveDirPath  = "G:\BEKDocs\Scripts"
    )
    
    Function ExcludeText([String]$LineToCheck,$ExcludeList) {
       $ListCnt = $ExcludeList.count
       $WriteRecord = "YES"
       For($ExclCnt = 0 ; $ExclCnt -lt $ListCnt ; $ExclCnt++) {
    
          If($LinetoCheck -like $ExcludeList[$ExclCnt]) {
            $WriteRecord = "NO"
            Break
          } #End If
    
       } #End For
    
       $WriteRecord
    
    }  #End ExcludeText
    
    # ******* Setup Section *********
     $LinesProcessed = 0
     $LinesMatched = 0
    
     #  *** Array for strings to MATCH/Select Records ***
     $MatchStrings = @("*Process:*", "*AUSearcher Search*")
    
     #  *** Array for strings to Exclude matched strings!  ***
     $ExcludeStrings=@("*ReScan = FALSE*","*WinAudit.exe*")
     $MatchCnt = $MatchStrings.Count
     remove-item "$DriveDirPath\$NewFile" -ErrorVariable Errs 2>$null
    # *******End of Setup Section ********
    
    ForEach( $Line in get-content "$DriveDirPath\$ExistingFile" ) {
       $LinesProcessed++
       For( $Cnt = 0 ; $Cnt -lt $MatchCnt; $Cnt++) {
    
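         If($Line -like $MatchStrings[$Cnt] ) {
    
           # The exclusion test looks at the whole line, so run it once and
           # stop scanning the remaining match strings whatever the result.
           $Result = ExcludeText -LineToCheck $Line -ExcludeList $ExcludeStrings
           If($Result -eq "YES"){ 
             add-content "$DriveDirPath\$NewFile" $Line
             $LinesMatched++
           }
           Break
         }  #End If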
    
       }  #End For 
    
    }   #End ForEach
    
    Write-Host "$LinesProcessed lines were tested. `n"  `
               "$LinesMatched lines were matched. `n"  `
               $Errs.count " errors encountered."
    BTW: I used the WindowsUpdate.log file as my test file; I just copied it to the same directory as the script.
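
For comparison, the same select-then-exclude pass can be sketched in Python as a single read of the input file. This is an illustration under assumptions rather than a port of the script: `filter_file` and the sample log lines are hypothetical, the search strings are the ones from the script above without the `*` wildcards, and plain substring tests stand in for `-like`. Unlike `-like`, which ignores case by default, these tests are case-sensitive.

```python
import os
import tempfile


def filter_file(src_path, dst_path, wanted, unwanted):
    """Copy matching lines from src_path to dst_path; return (tested, matched)."""
    tested = matched = 0
    # Mode "w" replaces any earlier output, so no separate delete step is needed.
    with open(src_path, encoding="ascii", errors="replace") as src, \
         open(dst_path, "w", encoding="ascii", errors="replace") as dst:
        for line in src:
            tested += 1
            if any(s in line for s in unwanted):
                continue  # an exclusion string rejects the line
            if any(s in line for s in wanted):
                dst.write(line)
                matched += 1
    return tested, matched


# Demo on a throwaway file; the log lines here are invented sample data.
with tempfile.TemporaryDirectory() as folder:
    src_file = os.path.join(folder, "sample.log")
    dst_file = os.path.join(folder, "processed.txt")
    with open(src_file, "w", encoding="ascii") as f:
        f.write("Process: one\nAUSearcher Search ReScan = FALSE\nother\n")
    counts = filter_file(
        src_file, dst_file,
        ["Process:", "AUSearcher Search"], ["ReScan = FALSE"],
    )
    with open(dst_file, encoding="ascii") as f:
        output = f.read()

print(counts, repr(output))
```

Opening the output once and writing as lines are read avoids reopening the file for every matched line, which may matter on large logs.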

    HTH

  16. #15
    New Lounger
    Join Date
    Dec 2009
    Location
    St Helens, Merseyside, England
    Posts
    23
    Thanks
    2
    Thanked 4 Times in 1 Post
    OK BATcher the 'latest & greatest' Textreme is now uploaded to www.jollybean.co.uk and ready to try. Make sure you keep the 3 files (exe, ini and chm) local to one another after extraction.

    The pop-up screen shot shows v2.3 (grr, only just noticed that!) but v2.4 is in the zip file. Because of the website builder I have used (WebPlus X6) I have to recompile the whole page to fix that but I can upload the new zip file no problem. Please let me know how you get on with it!

    Best regards,

    Jim.
