Results 1 to 5 of 5
  1. #1
    WS Lounge VIP
    Join Date
    Mar 2002
    Location
    Newcazzle, UK
    Posts
    2,826
    Thanks
    136
    Thanked 482 Times in 459 Posts

    MATCH formula performance (Excel2000)

    I have 2 lists of about 35,000 6-digit account numbers.
    I want to use an efficient formula to cross-check these lists, i.e. in ListA, I want to check whether any entry there is NOT in ListB (i.e. 'inactive accounts'), and in ListB, I want to check whether any entry there is NOT in ListA (i.e. 'new accounts').
    I will then use advanced datafilter to extract lists of 'inactive accounts' and'new accounts'.
    I could use a simple MATCH function to look for exact matches.
    I seem to recall seeing somewhere a method of using dynamic ranges to improve the speed.
    For example, if the lists are sorted, it is not necessary to begin the match search range at the top for an entry say, halfway or more down the list. Does this make any sense?

    I would be grateful for any insights.

    zeddy

  2. #2
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts

    Re: MATCH formula performance (Excel2000)

    Hi Zeddy,

    Say your ListA data is in Column A and your ListB data is in Column B. If you name each column, ListA and ListB, respectively, and put the following formulae in C1 and D1, respectively, then copy them down for the full length of each list, they'll show which values in ListB don't appear in ListA, and vice-versa:
    =COUNTIF(ListB,A1)
    =COUNTIF(ListA,B1)
    You can then sort on Columns C and D to group the missing (and any duplicated!) values together. in each of Columns C & D, 0 = 0 matches, 1 = 1 match, 2 = 2 matches, etc.

    Alternatively, you could combine both of the above formuale for a single sort list, ie:
    =COUNTIF(ListB,A1)*COUNTIF(ListA,B1)
    where a 0 result means that the value in one of the Lists is not found in the other.

    All three formulae could be used, to initiate a sort with the last formula (in, say, Column E) as the primary sort key and the others as the secondary keys.

    Cheers
    Cheers,

    Paul Edstein
    [MS MVP - Word]

  3. #3
    WS Lounge VIP
    Join Date
    Mar 2002
    Location
    Newcazzle, UK
    Posts
    2,826
    Thanks
    136
    Thanked 482 Times in 459 Posts

    Re: MATCH formula performance (Excel2000)

    I like your method but would only use this if I only had a few hundred items to check.
    I used FastExcel (from DecisionModels.com) to do some timing tests:
    Worksheet calc time:
    Using the COUNTIF formula on ListA only: 1342.9 secs (22mins!)
    Using Match with type 0 (exact match): 234.6 secs (~4mins)
    Using Match with type 1 (find nearest) : 0.2 sec

    As you can see, with ~35,000 entries to check, timing can be important.
    What I was hoping for was a formula combination that makes use of the last item to restrict the range searched for the exact match, i.e. so that I could reduce the overall time to somewhere between the 0.2sec and 234.6 secs.

    zeddy

  4. #4
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Weert, Limburg, Netherlands
    Posts
    4,812
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: MATCH formula performance (Excel2000)

    Why not do it like this:

    =INDEX(Sheet1!$A$1:$A$3000,MATCH(Sheet2!A1,Sheet1! $A$1:$A$3000,1))=Sheet2!A1

    Returns TRUE on exact matches. For a range of 6000 cells, it took 21 ms.
    Jan Karel Pieterse
    Microsoft Excel MVP, WMVP
    www.jkp-ads.com
    Professional Office Developers Association

  5. #5
    WS Lounge VIP
    Join Date
    Mar 2002
    Location
    Newcazzle, UK
    Posts
    2,826
    Thanks
    136
    Thanked 482 Times in 459 Posts

    Re: MATCH formula performance (Excel2000)

    Brilliant. I struggle for a day, find the answer and then come back to post it and there you have the solution!
    I used the formula
    =if(a2=index(ListB,match(a2,ListB),1),"yes","no")

    Copying the formula to 35000 rows produced the results in 0.33 seconds!
    Compare this with using =match(a2,ListB,0) which took 234.62 seconds.

    Fantastic result.

    Many thanks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •