Thread: Text Search

  1. #1
    Star Lounger
    Join Date
    Jun 2006
    Posts
    88
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Text Search

    This is not meant as a fun game puzzle; however, I do have a puzzle that I could use some ideas to help solve.

    I am working on a report that looks through hundreds of thousands of text strings. Each string has a position locator associated with it. The text field is free form and is meant for making notes. However, we have employees who are typing credit card numbers into this field. This is prohibited by company policy, but we do not have a proper means of catching it, since the field is free form. The credit card numbers could be anywhere in the text field (beginning, middle, or end) and may be formatted as ####-####-####-####, #### #### #### ####, or ################. Expiration dates and security numbers are included as well, but since employees can type anything they want, there is no exact format to search for. I am somewhat limited in search methods by the existing software, but if need be the data could be exported to a very large Excel file.

    Any thoughts on how to search these text strings for credit card numbers would be much appreciated.

    Some things I have tried are looking for character strings like EXP (expiration) and SEC (security). I tried searching for forward slashes, but there are many dates included for other note purposes, so that is not really ideal. I have looked for numbers >1000000000, but this does not work because the field is a text string rather than a number.

    Any thoughts??

  2. #2
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Text Search

    How and where are the strings stored, and what methods for searching are available to you?
    For example, if they are stored in text form on a Unix machine, you can use grep.
    If you can use VB / VBA / VBScript, you can use the RegExp object from the Microsoft VBScript Regular Expressions 1.0 (or 5.5) library.

  3. #3
    Star Lounger
    Join Date
    Jun 2006
    Posts
    88
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text Search

    Our software does have something very similar to RegExp built into it. I am searching for approximate matches (not exact), and I know I may still be missing some and also getting false matches. However, the report is showing me sufficient trends to catch the employees who are doing this on a regular basis. I appreciate the help as always!!

    ck

  4. #4
    5 Star Lounger
    Join Date
    Apr 2003
    Location
    Hampshire, United Kingdom
    Posts
    602
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text Search

    I have a vague recollection (from when I worked in a department store some years ago) that the first 4 digits of a credit card number tell you what type of card it is (Visa / American Express / Mastercard etc.). Presumably there's therefore a limited number of beginnings for a credit card number - perhaps searching for some of those might help.

    The following web site may also help:
    Anatomy of Credit Card Numbers
    Waggers
    If at first you do succeed, you've probably missed something.

  5. #5
    Star Lounger
    Join Date
    Jun 2006
    Posts
    88
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text Search

    Thanks for the idea. That was a great thought, but unfortunately nowadays credit cards are distinguished by only the first digit of the card number: 3 (Amex), 4 (Visa), 5 (MasterCard), 6 (Discover), and so on. I feel searching for just one digit would return too much "false" data. Thanks though. I've got a pretty good search written now. It's not perfect, but it will catch a majority of the CC numbers.

    Thanks!!

  6. #6
    4 Star Lounger
    Join Date
    Feb 2004
    Location
    Saint Charles, Missouri, USA
    Posts
    565
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text Search

    You are correct that credit cards are distinguished by the first digit.

    However, the "Anatomy of Credit Card Numbers" page that Waggers referenced in post 591349 offers some good information.

    Why not create a simple lookup table for KNOWN institutions (digits 1-6 identify the institution)? So, for example, if you see 4ABC-DEXX-XXXX-XXXX, you know it is "My Bank Visa Institution". Add the six digits 4ABCDE to the lookup table of prefixes to search for. You can add prefixes to the lookup table as you identify valid six-digit institution numbers.
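
    A minimal Python sketch of that lookup-table idea, offered only as an illustration. The prefixes and institution names in `KNOWN_PREFIXES` are made-up placeholders, not real issuer assignments, and the function name `match_known_issuers` is hypothetical. The 16-digit pattern assumes only the three formats listed in the original question.

```python
import re

# Hypothetical lookup table: six-digit prefix -> institution name.
# These entries are placeholders, not real issuer assignments.
KNOWN_PREFIXES = {
    "411111": "My Bank Visa Institution",
    "510000": "Example MasterCard Issuer",
}

# 16 digits in groups of four, optionally separated by a space or hyphen,
# and not embedded in a longer run of digits.
DIGIT_RUN = re.compile(r"(?<!\d)\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}(?!\d)")


def match_known_issuers(text):
    """Return (candidate, institution) pairs whose first six digits are in the table."""
    hits = []
    for candidate in DIGIT_RUN.findall(text):
        digits = re.sub(r"[ -]", "", candidate)  # strip separators
        institution = KNOWN_PREFIXES.get(digits[:6])
        if institution is not None:
            hits.append((candidate, institution))
    return hits


print(match_known_issuers("card 4111-1111-1111-1111 exp 05/09"))
```

    Candidates whose prefix is not in the table are simply skipped, so the table can start small and grow as new prefixes are confirmed.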

    Just a thought.
    Scott

  7. #7
    Star Lounger
    Join Date
    Jul 2006
    Location
    Colorado, USA
    Posts
    55
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Text Search

    I think Hans is right about Regular Expressions. I have just started using them with Perl and I am amazed at how quickly and efficiently they can search through text files. An expression like /\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}/ would be just what is needed to find those types of numbers.
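
    For anyone who wants to try the expression outside Perl, here is a rough Python sketch of the same pattern. The sample notes and the name `find_card_like` are invented for illustration. The lookarounds at either end are an addition to the expression above, meant to keep it from matching inside a longer run of digits.

```python
import re

# Four groups of four digits, each optionally followed by a space or hyphen.
# (?<!\d) and (?!\d) are additions: they reject matches inside longer digit runs.
CARD_PATTERN = re.compile(r"(?<!\d)\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}(?!\d)")


def find_card_like(text):
    """Return every substring of text that looks like a 16-digit card number."""
    return CARD_PATTERN.findall(text)


# Made-up sample notes covering the three formats from the original question.
notes = [
    "Customer called, card 4111-1111-1111-1111 exp 05/09",
    "Paid with 4111 1111 1111 1111, sec 123",
    "cc 4111111111111111",
    "Follow up on 06/12/2006 re: order 12345",
]

for note in notes:
    print(find_card_like(note))
```

    This only flags strings that look like card numbers; it will still report any other 16-digit value typed in one of those formats, so the results need a manual review.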
