1. ## grouping (excel2003)

hi all,
sample of the data i have is:
name id X Y group
KALM 1 3039 26813 1
KALM 2 3053 26866
KALM 3 3038 26759
KALM 7 2942 26842
KALM 8 3095 27100
KALM 9 2995 27196
KALM 10 3037 26846

I need a mean to group the data into say 5( desired is any number) by conducting internal evaluation based on X&Y value and a spefied delta x and delta y value,say plus or minus 10 and output the group # result into the group column.

regards

2. ## Re: grouping (excel2003)

I don't understand what you are asking. Could you elaborate?

3. ## Re: grouping (excel2003)

hi steve,
i plotted the data in the attached file to simplify the explination. what i need is a mean( formula,code,add-ins) to group the ID. the criterias could be the X&Y value and/or a pecified ∆X & ∆Y.
regards.

4. ## Re: grouping (excel2003)

Perhaps someone has a solution lying around, or is willing to create one for you. But frankly, I think this question is beyond the scope of the Lounge. Analyzing and grouping spatial data is a specialized subject, for which highly sophisticated algorithms and software exist.

5. ## Re: grouping (excel2003)

Thanks HansV,

to add to my explination, consider a grid designed based on the minimum and the maximum values of x&y using a defined increment. Then,for similicity say the final grid design my consist of 20 blocks(groups). output the IDs in each block(group).

Regards

6. ## Re: grouping (excel2003)

<P ID="edit" class=small>(Edited by sdckapr on 12-Mar-06 10:47. Added PS)</P>By waht basis are you determining that they are "different"? How do you know that spread is not just some type of random variation?

If the data were someone firing a gun at a target, there is no need to separate them, they are from the same poplulation.

If the data was some type of XY "correlation" (Y dependent on X) then the groups may also be meaningless or there could be multiple possibilities of groups.

The min group would be 1, presuming they are from the same poplulation (The target analogy).

I can think of a a couple 2 group examples:
7 and 9 forming 1 relationship "trend" and the others forming another
or
8 and 9 forming 1 relationship "trend" and the others forming another

There are several 3 groupings:
8 and 9 separate from all trend in all the rest
7 and 9 separate from the trend in all the rest
Etc, etc.

If you are putting things into groups there should be a logical scheme to do it based on something other than the results. It would be better to decide on some reason first.

If you want to proceed a criteria of "distance may be possible. How far apart would the points be to be considered "different"?

Steve

ps in addition how do you want to balance the number of groups vs the distrance. One approach may be to minimize the deviations within a group, but doing this would give you a group for each point. The desire to balance the deviations and the number of groups must me made. How would you like to balance this?

7. ## Re: grouping (excel2003)

Steve,

Consider the id as well number in a field with x & y as the location co-ordinates for that well. I am trying to divide the field into blocks and identify wells by a block name which could be a number or a letter. What I need is something dynamic that allows changing the distance to increase or decrease the number of groups which will affect the number of wells (ID) in a block.

In the posted example the minimum and the maximum values for the X are 2942 & 3095, a length of x of 153. At increment or a distance value of 15.3 it will give 10 blocks along the x axis. For the Y the minimum & maximum are 26759 & 27196, a length of y of 437. At increment or a distance value of 43.7 it will give 10 blocks. As such there will be 100 blocks. The id will be distributed in these blocks (1-100).

In summary, let

8. ## Re: grouping (excel2003)

But doesn't that make more than the 4 groups? It makes them all separate groups except for 1 and 10.

If you divide into 4 squares you get the 4 groups you indicate.

Are you after trying to try different number of squares (1,4, 9, 16, 25, 36, 49, 64, etc) and see how many groups there are for each? How would you want to optimize it to minimize the number of groups? The Minimum will always be 1, unless you have some penalty or benefit to calculate with the number of squares vs the number of groups.

I had not thought a grid (I was thinking more in lines of calculating circles surrounding the groups with a given radius. In youyr scheme you get arbitrary separation of the close points just due to the grid you select.

Since I am not sure what you really are after, I don't know the best approach.

Steve

9. ## Re: grouping (excel2003)

Steve,

The data I included under the group column in the attached file is based on how the data looks like on the plot, it's not based on grids or squares. You are right a grid of 4 squares will give four groups. yes I am after trying different number of squares ,say if the grid has 100 squares and each square represent a group and they are numbered 1,2,3,4,5........100. What will be the square number (group) for each ID.? Remember the number of squares (groups) is determined by the increment values of x &Y.

A criterion of how many ids per square or group is helpful to add which I am going around it by the flexibility of having 100 squares or groups.

10. ## Re: grouping (excel2003)

If you put the number of grids in a named range (NumGrids) you can use the formula in G4:

=(SQRT(NumGrids)-INT((F4-MIN(\$F\$4:\$F\$10))/((MAX(\$F\$4:\$F\$10)+0.1-MIN(\$F\$4:\$F\$10))/SQRT(NumGrids)))-1)*SQRT(NumGrids)+INT((E4-MIN(\$E\$4:\$E\$10))/((MAX(\$E\$4:\$E\$10)+0.1-MIN(\$E\$4:\$E\$10))/SQRT(NumGrids)))+1

You can copy it down the columns. [The 0.1 is so the max Y or MaxX value is not indicated in the next grid since if it were on the line it would be the next one. ]

Note I counted the grids from the upper left and then across the columns, then moved down a row. If you number them a different way the calc will have to be changed.

Steve

11. ## Re: grouping (excel2003)

Thanks Steve,
Sorry for the late reply. I will use the formula and post the findings shortly. For my clarity NumGrids is a single integer value.

12. ## Re: grouping (excel2003)

It does not have to be.

It makes the most sense if the number is a square: 1,4,9,16,25,36,49, etc

so that the you have the X/Y divided evenly. If you example your "NumGrids" is 2 you will still have 4 grids, they just won't all be the same size the last column and last row will be smaller.

You could replace sqrt(NumGrids) with an integer value of how much to divide x and Y by: this will be an integer value(1,2,3,4,5,6,7, etc) and would give you the number of grids as 1,4,9,16,25,36,49, etc

It all depends on how you want to think about it. I used NumGrids since that is what you used (you mention 100 squares, not 10x10)
Steve

13. ## Re: grouping (excel2003)

Steve,
Steve,
The formula does the group assignment based on square grids perfectly. After examining it with my data, I think it will be more helpful if it can accept a variable grid dimensions, that is to say, I specify the ∆X&Y values for the group #. Is it a possible modification?
Regards

14. ## Re: grouping (excel2003)

NumX is the number of x segments, NumY is the number of y segments:
=(NumY-INT((F4-MIN(\$F\$4:\$F\$10))/((MAX(\$F\$4:\$F\$10)+0.1-MIN(\$F\$4:\$F\$10))/NumY))-1)*NumX+INT((E4-MIN(\$E\$4:\$E\$10))/((MAX(\$E\$4:\$E\$10)+0.1-MIN(\$E\$4:\$E\$10))/NumX))+1

Steve

15. ## Re: grouping (excel2003)

many thanks Steve.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•