  1. #1
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    DIR: Is it really a Filename? (WordXP)

    I'd like to be able to distinguish between real names and patterns. My code is fed data such as "*.dot" and "C:GreavesClientsXSTAND97xp.DOT".

    I test for the presence of a colon to distinguish between patterns and real names. Has anyone a better idea (simple, fast) for detection? (Apart from going to what seems like unnecessary expense with FSO, Shell and other methods).

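    <pre>If InStr(1, strEntry, ":") > 0 Then
        ' A colon suggests a drive letter; Dir then confirms the file exists
        If Len(Dir(strEntry)) > 0 Then
            ' Store the entry in the last slot, then grow the array by one
            strAr(UBound(strAr)) = strEntry
            ReDim Preserve strAr(UBound(strAr) + 1)
        Else
        End If
    Else
    End If</pre>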

    I should add that the strings will be well-behaved; they arrive from the registry via "software\microsoft\office\10.0\common\open find\Microsoft Word\Settings"; see for example the "Open", the "Insert File" and the very strange "Save As".

  2. #2
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353

    How about testing for the presence of a wildcard ? or *

  3. #3
    Plutonium Lounger
    Join Date
    Dec 2000
    Location
    Sacramento, California, USA
    Posts
    16,775

    The colon as a test will fail if you get a UNC path, which may not be anything you have to deal with. In VBA I usually test for a backslash, but have you considered using regular expressions to determine if any of the possible wildcards or strings like "*.dot" are in the string?
    Charlotte

  4. #4
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    Hans and Charlotte, thank you both for your replies.

    In this particular case ("... common\open find\Microsoft Word\Settings") I receive by return a mish-mash of strings. I'm interested in deducing those strings which represent proper and existing files on the system. Right now I can ignore UNC, although that may change.

    I usually tackle this kind of problem by asking myself how my own brain *knows* that this is a Fullpath filename and that that isn't. As I eyeball the list, I see "C:", surprise surprise.
    Testing for the presence of a wildcard will tell me I'm looking at "set*.bat", but won't help me with "temp". (I'm not complaining; I should have posted the data below and the attached code snippet on my original post).

    This seems to me to be a more general problem: Given a string, how do you know that it represents a valid file name, suitable as a candidate for a file operation, such as Kill, FileCopy, or even Documents.Open? On Error isn't a solution, as I could kill a slew with *.bat when I didn't ought to.

    At first I thought I'd ask dear old Dir to tell me, but sadly, DIR returns a non-empty string for "set*.bat". Of course, I could loop to see if any real files were returned, or if I looped more than once etc., but that seems to be placing calls to the system and I am curious about raw string analysis.

    I would consider "C:template.dot" to be a valid filename, even without any back-slash, but "template.dot" seems to me to be NOT a filename.

    Having typed that, I think I'm wrong. I think that the criteria for a FullName must be the presence of exactly one colon and at least one backslash.
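
    A minimal sketch of the raw-string criteria just stated (exactly one colon, at least one backslash), assuming wildcards should also be rejected as suggested earlier in the thread. The function name LooksLikeFullName is hypothetical, and the check makes no call to the file system, so it says nothing about whether the file exists.

    ```vba
    ' Hypothetical helper: pure string analysis, no Dir, FSO or API calls.
    Function LooksLikeFullName(ByVal strEntry As String) As Boolean
        Dim lngColons As Long
        ' Count colons by comparing the length before and after removing them.
        lngColons = Len(strEntry) - Len(Replace(strEntry, ":", ""))
        ' Exactly one colon, at least one backslash, and no wildcard characters.
        LooksLikeFullName = (lngColons = 1) _
            And (InStr(1, strEntry, "\") > 0) _
            And (InStr(1, strEntry, "*") = 0) _
            And (InStr(1, strEntry, "?") = 0)
    End Function
    ```

    Under these criteria "*.dot", "temp" and "C:template.dot" are all rejected; the last one fails only because it has no backslash.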

    Here's the list I receive when I run the attached code:<pre>*.dot
    *.doc
    dri*.bat
    P*.*
    98*.bat
    set*.bat
    temp
    C:GreavesClientsCCharonXSTAND97xp.DOT
    C:GreavesPcfGreenMonster08.dot
    C:GreavesPcfGreenMonster09.dot
    C:GreavesPcfgenerate fax, email and mail objects.dot
    C:GreavesTrainingWordProcessingWord97VbaLibrariesM ainApplication.dot
    C:GreavesProductsUSERMRUListMRUse113.dot
    C:GreavesProductsUSERDocleAnalYsisHardC012.dot
    C:GreavesProductsUSERDocleAnalYsisDocAn.dot
    C:GreavesProductsDEVELOPERdBasedBase021.dot
    RegFn001
    Water-Phase 2
    20050516Analayis OfExistingFilename
    20050514
    Ozlip Australian word game
    Utils
    UT
    PClip201.dot
    1
    3</pre>


    Note that I may not be the one interrogating the registry, or whatever the source. I am just the Word-bunny that is given a set of strings and asked to do something with them. In theory, who knows whence they arrived? "Just act on those you can deal with".

  5. #5
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    > The colon as a test will fail if you get a UNC path,

    True, and since then (see above) I seem to be moving away from colon and towards slash anyway.

    I have utility routines that parse strings to obtain pointers to drive, path, name and extent; I could use those routines and check that all pointers were >=0, but that's a heck of a lot of analysis for a simple yes/no answer. I don't need to know the drive or path, just that there is one.

    >you considered using regular expressions t
    Ever since I read Andrew's book I've been trying to avoid RegExp; they are just too fascinating for words and he makes me spend the rest of my day creating brilliant code that doesn't earn me any money (grin!).

    RegExp would do, but I seem then still to be setting up some basic criteria. For example, the RegExp, crudely stated, would be asking "Is there exactly one colon, at least one slash, are the other characters valid in a file name; there can be any number of periods, .....". By the time I've worked out how to define the RegExp, I'd have to know the basic criteria anyway. I ought to be able to simplify that to "colon, slash, and passes the DIR test".

    Perhaps I'm going about this the wrong way. My current task is to harvest useful/useable filenames from the registry. For a general problem, different criteria will apply. For example, this exercise might just want filenames, regardless of whether they still exist on the system; no need to run DIR. The next exercise might require existing files, so DIR is required. That is, my testing criteria will change from job to job.
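
    One way to let the criteria change from job to job is an optional flag that switches the DIR test on or off. This is only a sketch: the name IsUsableFileName and its parameter are hypothetical, and the "must contain a backslash" rule is an assumption taken from the discussion above.

    ```vba
    ' Hypothetical helper: string checks first, then an optional existence check.
    Function IsUsableFileName(ByVal strEntry As String, _
            Optional ByVal blnMustExist As Boolean = True) As Boolean
        ' Reject wildcards first, so Dir is never asked to expand a pattern.
        If InStr(1, strEntry, "*") > 0 Or InStr(1, strEntry, "?") > 0 Then Exit Function
        ' Require some path information (assumption: at least one backslash).
        If InStr(1, strEntry, "\") = 0 Then Exit Function
        ' A trailing backslash names a folder; Dir would return a file inside it.
        If Right$(strEntry, 1) = "\" Then Exit Function
        If blnMustExist Then
            ' Dir can raise a run-time error for some strings (for example an
            ' unavailable drive), so a caller may want its own error handler.
            IsUsableFileName = (Len(Dir(strEntry)) > 0)
        Else
            IsUsableFileName = True
        End If
    End Function
    ```

    A job that only harvests names would call IsUsableFileName(strEntry, False); a job that needs existing files would leave the flag at its default.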

  6. #6
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389

    Well, how can I help from chiming in on this one...

    If you're looking for a concise solution to this one, I think you'll have to deal with a RegExp. I started tinkering, and here's what I came up with:
    <pre>Sub IsDirName()
        ' Needs a reference to Microsoft VBScript Regular Expressions
        Dim re As New RegExp
        re.Pattern = "^(\w:)?\\([^:])+?$" ' --> bold tag on the first paren to stop the smilie
        MsgBox re.Test("\foobar") ' True
        MsgBox re.Test("c:\foobar") ' True
        MsgBox re.Test("c:foo:bar") ' False
        MsgBox re.Test("\foo\bar") ' True
        MsgBox re.Test("foobar") ' False
        MsgBox re.Test("c:") ' False
    End Sub
    </pre>


  7. #7
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    > bold tag on the first paren to stop the smilie
    Thanks Andrew. I could've been Platinum a long time ago if I'd worked out how to overcome this (grin!). Now my pasta is going cold and dry, it's the first time I've made creamy clam sauce and that has curdled, but I'm into RegExp again (gloom!)

    I went back and read Regular Expression Syntax as you suggested, and could comprehend almost all of your timely and welcomed response. Since you have ruined my lunch, can we see how well I did?

    ^ - up against the left-hand end of the string
    (\w:[/b]) - a pattern, any word character followed by a single colon character ' --> improved on Andrew's technique by using a Bold-OFF switch in isolation thereby obviating any bolding at all!
    ? - zero or one of that pattern. i.e. Either there is a drive-letter-colon, or there is not.
    \\ - a slash, using the slash escape character to escape from itself.
    ([^:]) - a pattern, basically any character, but NOT a colon
    + - one or more of this pattern
    ? - zero or one of those patterns - I don't understand this at all
    $ - up against the RHS of the string.

    I can see that this ferrets out a leading drive letter, if any, followed by a non-optional back-slash. I think I understand "a set of characters as long as there's no colon", but I can't see how I get to repeat the backslash-name combination. Is that the plus-sign?
    And if so, why do I need the question mark?


    Thanks for the extra lesson in regExp. I think I have some bread and cheese in the 'fridge.

  8. #8
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    > considered using regular expressions

    I think I'm arguing from a lack of knowledge when I say "I don't want the overhead of ...." without knowing what that overhead is. I've been thinking.

    My drive/path/name/extent analysis code is interpreted VBA; that has to be slower than molasses in Minneapolis.
    DIR may or may not be fast. I assume nowadays that WinXP loads the folder structure into RAM and keeps it there.
    Regular Expressions may or may not be fast.
    The distinction between execution time in RAM and that on disk is blurred with mega-RAM cache available.

    I know that development time is the killer. I've spent (and involved other people's) more time agonising over this than if I'd just got on with it.

    Regular expressions are neat, concise, and they permit extension to a better library of file name component manipulation than I've developed.

    You could be right........

  9. #9
    3 Star Lounger
    Join Date
    Apr 2004
    Location
    Boston, Massachusetts, USA
    Posts
    389

    Hi Chris,

    You're spot on in your analysis of the RegExp. Well done.

    The "?" after the "+" in this case isn't really necessary, it's just something I usually do out of habit. Here's why:

    By default, a regular expression match is "greedy" -- it'll match the largest possible string that satisfies the pattern. For example, suppose we had the following string:
    <pre>str = "One fish, two fish,"
    </pre>

    And the following RegExp:
    <pre>re.Pattern = "One.+,"
    </pre>

    At first glance, you'd think the pattern would match the first part of the phrase, "One fish,". But again, the RegExp is "greedy" and matches the largest possible string. So the actual match here would be "One fish, two fish,".

    Adding the "?" after certain patterns ("*" and "+" are the most common) makes the match "non-greedy" -- it matches the smallest possible match. I often wish Word's wildcard feature had this option -- those are always pretty darned greedy.

    Again, with the RegExp I posted, it's not actually necessary -- I think it was left over from an earlier permutation of the match as I was refining it. But it certainly doesn't hurt in this case.
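
    The greedy and non-greedy behaviour described above can be tried side by side. This is a small sketch using late binding to the VBScript RegExp object, so no project reference is needed; the Sub name GreedyDemo is made up for illustration.

    ```vba
    Sub GreedyDemo()
        Dim re As Object
        ' Late binding: assumes the VBScript regular expression library is installed.
        Set re = CreateObject("VBScript.RegExp")
        Const strText As String = "One fish, two fish,"

        ' Greedy: .+ grabs as much as it can before the final comma.
        re.Pattern = "One.+,"
        Debug.Print re.Execute(strText)(0).Value    ' One fish, two fish,

        ' Non-greedy: .+? stops at the first comma that completes the match.
        re.Pattern = "One.+?,"
        Debug.Print re.Execute(strText)(0).Value    ' One fish,
    End Sub
    ```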

  10. #10
    5 Star Lounger st3333ve
    Join Date
    May 2003
    Location
    Los Angeles, California, USA
    Posts
    705

    Speaking of "unnecessary expense," a side note: According to VB/VBA in a Nutshell, ReDim Preserve is a costly maneuver (VB creates a new array and copies the contents of the old array into it). So if you've got a loop that's adding items to an array and you don't know in advance how big the array will be, the recommended practice is to initially dimension the array as big as it could possibly need to be, then do the loop, and then ReDim Preserve just once (after you know the number of items), to shrink the array to the right size.

    Alternatively, you can test within the loop to make sure the item count hasn't reached the UBound and do an interim ReDim Preserve that adds, say 50 or 100 to the array's capacity if it has.
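
    A sketch of the second approach, growing the array in chunks and trimming it once at the end. The function name CollectNonEmpty, the chunk size of 50 and the "skip empty strings" filter are all assumptions for illustration; any other test could stand in for the filter.

    ```vba
    ' Hypothetical helper: copies non-empty items from varSource into strAr,
    ' returning the number of items kept.
    Function CollectNonEmpty(varSource As Variant, strAr() As String) As Long
        Const CHUNK As Long = 50
        Dim lngCount As Long
        Dim i As Long

        ReDim strAr(0 To CHUNK - 1)
        For i = LBound(varSource) To UBound(varSource)
            If Len(varSource(i)) > 0 Then
                ' Grow only when the array is full, CHUNK slots at a time.
                If lngCount > UBound(strAr) Then
                    ReDim Preserve strAr(0 To UBound(strAr) + CHUNK)
                End If
                strAr(lngCount) = varSource(i)
                lngCount = lngCount + 1
            End If
        Next i

        ' One final ReDim Preserve shrinks the array to the items actually used.
        If lngCount > 0 Then ReDim Preserve strAr(0 To lngCount - 1)
        CollectNonEmpty = lngCount
    End Function
    ```

    The caller declares a dynamic array (Dim strOut() As String) and should check the returned count before reading it, because the array keeps its initial size when nothing was collected.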

  11. #11
    Star Lounger
    Join Date
    Jan 2001
    Posts
    71

    If I'm correct, you want to ascertain that:
    a) the file spec does NOT contain wildcards
    b) the file spec DOES contain path information

    How about this as an approach:

    <pre>Private Declare Function StrCSpnI Lib "Shlwapi" _
        Alias "StrCSpnW" (ByVal lpStr As Long, _
        ByVal lpSet As Long) As Long

    Private Declare Function PathIsFileSpec Lib "Shlwapi" _
        Alias "PathIsFileSpecW" _
        (ByVal lpszPath As Long) As Boolean

    Function IsValidPathName(ByVal sPath As String) As Boolean
        IsValidPathName = ContainsNoWildCards(sPath) And ContainsPathInfo(sPath)
    End Function

    Function ContainsNoWildCards(ByVal sPath As String) As Boolean
        Const sWildcard As String = "*?"
        Dim nRet As Long
        nRet = StrCSpnI(StrPtr(sPath), StrPtr(sWildcard))
        ContainsNoWildCards = (nRet = 0 Or nRet = Len(sPath))
    End Function

    Function ContainsPathInfo(ByVal sPath As String) As Boolean
        ContainsPathInfo = Not PathIsFileSpec(StrPtr(sPath))
    End Function</pre>

  12. #12
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    > If I'm correct, you want to ascertain that: a) the file spec does NOT contain wildcards b) the file spec DOES contain path information

    If I'm going to be fair, I have to confess that now I don't really know what I want. I should dig out the specification for "valid unique filename" from MS.

    The original task is to harvest FullNames from certain branches of the registry. That got me into devising registry functions, including assembly of arrays from MULTI-SZ items. My original code included tests for null strings, as well as tests for "valid filenames". I am now clearing my head.

    I've not met "Shlwapi" before, so I'll study it.

    I've attached a module.txt with your code and a TEST macro at the foot. Early tests are de-activated, but the first active line fails the test. I'm not sure why.

  13. #13
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453

    > ReDim Preserve is a costly maneuver

    I agree with this. I know what happens inside interpreters with data items; there's an awful lot of shuffling, and on my slow old beast (233MHz!) I've observed applications gumming up as the hours slip by.

    > adds, say 50 or 100 to the array's capacity
    And I remember seeing this in a manifestation of QSort; the programmer Redim'd every 10.

    Perhaps I'm no different from others, in that I tend to narrow-focus on execution efficiency while being disgustingly inefficient elsewhere. On top of that I maintain that writing/maintenance efficiency almost always outweighs execution efficiency.

    In my defense I offer that I grew up in the days when core memory was in short supply and hard drives were slow and un-buffered.

  14. #14
    Star Lounger
    Join Date
    Jan 2001
    Posts
    71

    > but the first active line fails the test. I'm not sure why.

    'MsgBox blnIsValid("\foobar") ' True
    'MsgBox blnIsValid("\temp") ' True

    I get a True for both, which is correct. A pathname starting with a backslash points to the root of the current drive.
    As far as I can see the ShlWapi functions work properly.

  15. #15
    WS Lounge VIP rory
    Join Date
    Dec 2000
    Location
    Burwash, East Sussex, United Kingdom
    Posts
    6,280

    Don,
    As far as the StrCSpnI function goes, it will return 0 if the first character of string 1 is included in the list of search characters, because it returns the length of the substring of string1 before it reaches one of the specified characters (hence it returns len(string1) if the characters are not found). Therefore the last line of ContainsNoWildcards should read:
    <pre>ContainsNoWildCards = (nRet = Len(sPath))
    </pre>

    I think Chris was saying that the function was returning true for the input "*.dot" when it should have returned false.
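
    For comparison, the same wildcard test can be sketched without any API call by using VBA's Like operator, where * and ? are treated as literal characters when they appear inside square brackets. This is a swapped-in technique rather than the Shlwapi approach above, and the function name HasNoWildcards is hypothetical.

    ```vba
    ' Hypothetical alternative to ContainsNoWildCards, using Like instead of StrCSpn.
    Function HasNoWildcards(ByVal sPath As String) As Boolean
        ' "*[*?]*" reads: anything, then a literal * or ?, then anything.
        HasNoWildcards = Not (sPath Like "*[*?]*")
    End Function
    ```

    With this version "*.dot" returns False whether the wildcard is the first character or not.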
    Regards,
    Rory

    Microsoft MVP - Excel
