Results 1 to 10 of 10
  1. #1
    edavis24601
    Guest

    text frequency distribution (2000)

    Hi all,

    I have a column of text entries, many of which are repeating. some fields are blank. I want to us arrays identify the unique fields in one column, and populate the number of times each is used the next column. For example, range "animals" contains:
    dog
    cat
    dog
    dog
    <blank>
    cat

    My results would populate:
    dog 3
    cat 2
    blank field 1

    I can see a few ways to do this manually, in a step by step process, such as filtering out unique fields "unique_animals", then countif(animals, unique_animals), although that wouldn't include blank cells, which I also want to count. Given the number of columns I would have to do this for, wondering if there is an all-in-one array formula so I could just populate the output without extracting unique fields first, or have to make a separate formula to count the blanks.

    I think this is a common array, but don't recall how to do it...

    thanks,

    -Eric

  2. #2
    3 Star Lounger
    Join Date
    Jan 2001
    Location
    Flims, Switzerland
    Posts
    271
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: text frequency distribution (2000)

    Eric,

    The easiest way is probably to use a PivotTable.

    1. Give your column of data a heading, e.g. 'animal'.

    2. Create a PivotTable with your column set as the source data.

    3. in the PivotTable Layout, drag the animal column into both the ROW and DATA areas.

    Your PivotTable should now show the unique animals and a count of their instances as you require.

    (PivotTables do have certain limitations, e.g. size, but they do cater nicely for a huge subset of data consolidation or analysis requirements in a very quick and easy fashion)

    Regards,

  3. #3
    Uranium Lounger
    Join Date
    Jan 2001
    Location
    South Carolina, USA
    Posts
    7,295
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: text frequency distribution (2000)

    That does not count the blank cells for me. It shows (blank), but there is no count. Is there any way to get it to count blanks?
    Legare Coleman

  4. #4
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Salt Lake City, Utah, USA
    Posts
    9,508
    Thanks
    0
    Thanked 6 Times in 6 Posts

    Re: text frequency distribution (2000)

    I can't do it with array formulas or pivot tables. A formula approach from Chip Person is in columns B & C, but it's unsorted and the blanks count shows in every column C cell which is a duplicate.

    FWIW I can kind of do it with a WorkBook_Sheetchange event and Filter, Advanced, Unique, Another Location, per the attached example; see the Workbook Object. Not sure the code is terribly robust, had to mess about to ensure blank cells are counted. Also, I tried the Worksheet_SelectionChange event, see Sheet object, but it runs much slower on my machine than WorkBook_Sheetchange.
    Attached Files Attached Files
    -John ... I float in liquid gardens
    UTC -7ąDS

  5. #5
    4 Star Lounger
    Join Date
    Jan 2001
    Location
    Kortrijk, Belgium
    Posts
    571
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: text frequency distribution (2000)

    I think I am close, but not yet there, anyway, here it is:

    =(IF(COUNTIF($A$1:A1;A1)=1;COUNTIF($A$1:$A$10;$A$1 :$A$10);IF(ISBLANK(A1);COUNTBLANK($A$1:$A$10);"")) )

    where I assume the data is in A1:A10. Put this formula in B1 and drag down and if you then sort - ascending, (if you want) you are close to what you want. One unique entry should only give you one result in column B, except for the blanks, where you get the same count for each blank. If you sort them afterwards, this is not really a problem.
    Hope this helps.

  6. #6
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Salt Lake City, Utah, USA
    Posts
    9,508
    Thanks
    0
    Thanked 6 Times in 6 Posts

    Re: text frequency distribution (2000)

    Replacing your ";" argument separator with US convention ",", I think you have it (where "Animals" is the range name of the list, and starting with cell A2 as the first data cell instead of A1):

    =(IF(COUNTIF($A$2:A2,A2)=1,COUNTIF(Animals,Animals ),IF(AND(ISBLANK(A2),COUNTBLANK($A$2:A2)<2),COUNTB LANK(Animals),"")))

    I must remember =COUNTBLANK()!
    -John ... I float in liquid gardens
    UTC -7ąDS

  7. #7
    4 Star Lounger
    Join Date
    Jan 2001
    Location
    Kortrijk, Belgium
    Posts
    571
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: text frequency distribution (2000)

    Sorry for the ; John, I should know it by now...

  8. #8
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Salt Lake City, Utah, USA
    Posts
    9,508
    Thanks
    0
    Thanked 6 Times in 6 Posts

    Re: text frequency distribution (2000)

    Consider it a lesson to me in cross-cultural adapability! <img src=/S/laugh.gif border=0 alt=laugh width=15 height=15> It did throw me for a while, asking myself, what the heck are all these semicolons for?
    -John ... I float in liquid gardens
    UTC -7ąDS

  9. #9
    Lounger
    Join Date
    Oct 2001
    Location
    Isleworth, Gtr London, England
    Posts
    28
    Thanks
    0
    Thanked 2 Times in 2 Posts

    Re: text frequency distribution (2000)

    You can do this without sorting if you have 3 intermediate columns.

    This example is using a list that extends to row 25.
    Assuming that row 1 has text titles, and the data starts in row 2, then in column B from row 2 onwards have this formula
    =MATCH(A2,$A$1:$A$25,0)
    which calculates the row position of the first occurence of any animal.

    And in column C from row 2 onwards have this
    =IF(NOT(ISNA(INDEX($A$1:$A$25,MATCH(ROW(),$B$1:$B$ 25,0)))),"here","not")
    which singles out the first occurence of the results in column B.

    And in column D have the number 1 in cell D1, and then also have this formula
    =MATCH("here",OFFSET($C$1,D1,0,500,1),0)+D1
    from row 2 onwards. Which collects the individual animal names row positions.

    Then in columns E and F have the results, using these formulae. In column E have this formula
    =IF(ISNA(D2),IF(COUNTIF($E$1:E1,"=<blank>")>0,""," <blank>"),INDEX($A$1:$A$25,D2))
    from row 2 onwards. Which creates the names of the animals in the cells at the top of the column,
    together with a special entry for blanks. And then in column F have this formula
    =IF(E2="<blank>",COUNTBLANK($A$1:$A$25),IF(ISNA(D2 ),"",COUNTIF($A$1:$A$25,"="&E2)))
    which gives the count for each animal.

    You can then group the columns B thru' D, or hide them, as they are only necessary as intermediate steps.

    Hope this is useful.
    Glenn B
    Attached Files Attached Files

  10. #10
    3 Star Lounger
    Join Date
    Jan 2001
    Location
    Flims, Switzerland
    Posts
    271
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: text frequency distribution (2000)

    you are right! I completely forgot about the blanks <img src=/S/doh.gif border=0 alt=doh width=15 height=15>

    the only way I think is to replace them with a blank identifier e.g. '-'

    if this is feasible then I still think a PivotTable is by far the easiest way, far simpler then adding columns with long formulae

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •