Results 1 to 6 of 6

20140730, 12:07 #1
 Join Date
 Jun 2003
 Location
 Utah, USA
 Posts
 433
 Thanks
 45
 Thanked 5 Times in 5 Posts
Conditionally Void a Cumulative Sum
I have a large data set (800K rows), so I'm looking for 1) any solution to my problem, but 2) an efficient one would be better.
See the attached file ...
I have columns A and B filled with binary data (ones and zeros). The ones and zeros come in bunches. The two columns are not correlated.
I want to fill column C from column A. If the cell in A is zero, set the cell in C to be zero (no problem). If the cell in A is a 1, start a cumulative sum in column C (so 0,1,1,1,0) in A converts to (0,1,2,3,0) in C. Again, not a problem.
Here's the problem: I want to mask some of column C and put the result in column D. If there's any cell in column B that's a 1, and there's a cumulative sum next to it in column C, then I want the entire set of cells in the cumulative sum to be reset to 0 in column D, including the rows before and after the one where there was a 1 in column B. In the example above, a (1,0,0,0,0) or (0,0,0,0,1) in B does nothing to C, but a (0,1,0,0,0), or a (0,0,1,0,0) or a ((0,0,0,1,0) would convert D to a (0,0,0,0,0). So do other combinations with more than one 1 in any of those three spots.
For me, the roadblock is that I never know how many cells in D — before the 1, or after the 1, shows up in B — need to be reset to 0.

20140730, 13:31 #2
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,225
 Thanks
 14
 Thanked 342 Times in 335 Posts
How about this in D2:
Code:=IF(OR(A2=0,B2=1,AND(A1>0,D1=0)),0,IF(SUM(OFFSET(A2,0,1,MATCH(0,OFFSET(A2,0,0,COUNT(A:A),1),0)1,1))=0,1+D1,0))
=IF(A2=0,0,1+C1)
Steve

The Following User Says Thank You to sdckapr For This Useful Post:
boobounder (20140801)

20140730, 16:33 #3
 Join Date
 Jun 2003
 Location
 Utah, USA
 Posts
 433
 Thanks
 45
 Thanked 5 Times in 5 Posts
Promising (seems OK at the top). But ... it returned a #REF! from row 262,600 downward.
I think I know where the problem is, but I'm not sure how to address it.
Because there are 786,435 rows in my worksheet, the count(A:A) portion of the formula will return a large height =offset(), which may carry that range beyond the number of rows allocated by Excel.
***********************
Also, I have 2 rows of labels at the top, and 1 row of labels at the bottom which will make the count(A:A) slightly off. But, I'm not quite sure how this is working yet, so I'm not sure how to adjust for those. What's the best way to make that modification?

20140730, 21:04 #4
 Join Date
 Jun 2003
 Location
 Utah, USA
 Posts
 433
 Thanks
 45
 Thanked 5 Times in 5 Posts
Need confirmation ...
I believe I have solved this problem by changing the count(A:A) bit to count(A2:A$786434) for an entry in row 2.
Am I missing anything that anyone can see?

20140731, 04:46 #5
 Join Date
 Jul 2002
 Location
 Pittsburgh, Pennsylvania, USA
 Posts
 11,225
 Thanks
 14
 Thanked 342 Times in 335 Posts
You could use:
=IF(OR(A2=0,B2=1,AND(A1>0,D1=0)),0,IF(SUM(OFFSET(A 2,0,1,MATCH(0,OFFSET(A2,0,0,COUNT(A:A)COUNT($A1:A1),1),0)1,1))=0,1+D1,0))
I reduce the count by the number of entries before. I just used count since I needed to ensure that I had the whole range of rows in a "bunch". If you know the bunch will have a max number, you could just use that number (10, 100, etc). The exact number is not critical, it just has to be large enough to see the transition in the "bunch" it currently is on.
Steve

The Following User Says Thank You to sdckapr For This Useful Post:
boobounder (20140801)

20140801, 01:52 #6
 Join Date
 Jun 2003
 Location
 Utah, USA
 Posts
 433
 Thanks
 45
 Thanked 5 Times in 5 Posts
Thanks a bunch.
I'm actually parsing through a matrix that's written out as a vector. And I can replace the count() with a large integer, since what I'm counting up has to all fall within a single row that's about 1000 elements long. I'm sure that will save time.
EOM