Help to tidy up information, standardise it & spot the clashes

ellison

Active Member
Joined
Aug 1, 2012
Messages
343
Office Version
  1. 365
Platform
  1. Windows
Hi, we are trying to tidy up some information where we have:
- Column A in the inputs
- Column B is the status which has been added, but needs tidying up
At the moment we use a combination of sorting information, vlookups and pot luck to try to come up with the tidied up version of the Status info, but we are wondering if there may be a better method?!
Hopefully this shows some detail about what we are trying to get at...
But if you have any questions, please just ask.

Please note we don't need the notes column automating, it's just there to help to show what we are trying to achieve in the Status-V2 column :)


RowInputStatus-OrigNotesStatus-V2
1AlphaAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:pending
2AlphapendingAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:pending
3AlphaAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:pending
4BetacompletedBeta: 0 blank entries and 1 x "completed", so "completed" would be in Status V2:completed
5CharliecompletedCharlie: 1 blank entries and 1 x "completed", so "completed" would be in Status V2:completed
6CharlieCharlie: 1 blank entries and 1 x "completed", so "completed" would be in Status V2:completed
7DeltacompletedDelta: 0 blank entries and 1 x "completed" AND 1 x "WIP", so "_clash" would be in Status V2:_clash
8DeltaWIPDelta: 0 blank entries and 1 x "completed" AND 1 x "WIP", so "_clash" would be in Status V2:_clash
9EchoWIPEcho: 0 blank entries and 1 x "WIP", so "WIP" would be in Status V2:WIP
10FoxtrotFoxtrot: 1 blank entries Status V2 would be blank:
 

Excel Facts

Move date out one month or year
Use =EDATE(A2,1) for one month later. Use EDATE(A2,12) for one year later.
Try

Book2
ABCDEF
1RowInputStatus-OrigNotesStatus-V2
21AlphaAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:Pending
32AlphapendingAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:Pending
43AlphaAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:Pending
54BetacompletedBeta: 0 blank entries and 1 x "completed", so "completed" would be in Status V2:Completed
65CharliecompletedCharlie: 1 blank entries and 1 x "completed", so "completed" would be in Status V2:Completed
76CharlieCharlie: 1 blank entries and 1 x "completed", so "completed" would be in Status V2:Completed
87DeltacompletedDelta: 0 blank entries and 1 x "completed" AND 1 x "WIP", so "_clash" would be in Status V2:_Clash
98DeltaWIPDelta: 0 blank entries and 1 x "completed" AND 1 x "WIP", so "_clash" would be in Status V2:_Clash
109EchoWIPEcho: 0 blank entries and 1 x "WIP", so "WIP" would be in Status V2:WIP
1110FoxtrotFoxtrot: 1 blank entries Status V2 would be blank: 
12
Sheet3
Cell Formulas
RangeFormula
E2:E11E2=IF(AND(ISNUMBER(SEARCH("blank",D2)),ISNUMBER(SEARCH("Completed",D2)),ISNUMBER(SEARCH("WIP",D2))),"_Clash",IF(AND(ISNUMBER(SEARCH("blank",D2)),ISNUMBER(SEARCH("Completed",D2))),"Completed",IF(AND(ISNUMBER(SEARCH("blank",D2)),ISNUMBER(SEARCH("Pending",D2))),"Pending",IF(AND(ISNUMBER(SEARCH("blank",D2)),ISNUMBER(SEARCH("WIP",D2))),"WIP",""))))
 
Last edited:
Upvote 0
Two other options

hulp key.xlsx
ABCDEF
1RowInputStatus-OrigNotesFormula 1Formula 2
21AlphaAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:PendingPending
32AlphapendingAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:PendingPending
43AlphaAlpha: 2 blank entries and 1 x "pending", so "pending" woud be in Status V2:PendingPending
54BetacompletedBeta: 0 blank entries and 1 x "completed", so "completed" would be in Status V2:CompletedCompleted
65CharliecompletedCharlie: 1 blank entries and 1 x "completed", so "completed" would be in Status V2:CompletedCompleted
76CharlieCharlie: 1 blank entries and 1 x "completed", so "completed" would be in Status V2:CompletedCompleted
87DeltacompletedDelta: 0 blank entries and 1 x "completed" AND 1 x "WIP", so "_clash" would be in Status V2:_Clash_Clash
98DeltaWIPDelta: 0 blank entries and 1 x "completed" AND 1 x "WIP", so "_clash" would be in Status V2:_Clash_Clash
109EchoWIPEcho: 0 blank entries and 1 x "WIP", so "WIP" would be in Status V2:WIPWIP
1110FoxtrotFoxtrot: 1 blank entries Status V2 would be blank:
Sheet1
Cell Formulas
RangeFormula
E2:E11E2=LET(ar,{"Pending","Completed","WIP"},BYROW(ISNUMBER(SEARCH(ar,D2:D11)),LAMBDA(a,SUBSTITUTE(TEXTJOIN(,,FILTER(ar,a,"")),"CompletedWIP","_Clash"))))
F2:F11F2=IFERROR(CHOOSE(MMULT(ISNUMBER(SEARCH({"Pending","Completed","WIP"},D2:D11))*{1,2,3},{1;1;1}),"Pending","Completed","WIP",,"_Clash"),"")
Dynamic array formulas.
 
Upvote 0
Hi @JEC

Can you please explain you first formula, its beyond my capability.
 
Upvote 0
It starts with this

=ISNUMBER(SEARCH({"Pending","Completed","WIP"},D2:D11) --> gives array of true and falses

If you put this in the BYROW, it iterates from top to bottom.
Here some detail. The next step would be the substitute

Book1
GHIJK
1Starting pointByrow
2
3TRUEFALSEFALSEPending
4TRUEFALSEFALSEPending
5TRUEFALSEFALSEPending
6FALSETRUEFALSECompleted
7FALSETRUEFALSECompleted
8FALSETRUEFALSECompleted
9FALSETRUETRUECompletedWIP
10FALSETRUETRUECompletedWIP
11FALSEFALSETRUEWIP
12FALSEFALSEFALSE
Sheet2
Cell Formulas
RangeFormula
G3:I12G3=ISNUMBER(SEARCH({"Pending","Completed","WIP"},D2:D11))
K3:K12K3=BYROW(G3#,LAMBDA(a,TEXTJOIN(,,FILTER({"Pending","Completed","WIP"},a,""))))
Dynamic array formulas.
 
Upvote 0
hi, huge thanks for your suggestions, which will work great. But (why is there so often a but!) I should have a been a little bit more clear…

The reports I try to tidy up generally are about 10k rows, with around 500-ish different entries in the status column. (But those 500 aren’t static, new ones are added all the time)

So I’n not I could enter each of the unique pieces of text into the formula, well not without giving myself brain freeze :)

Sort of wondering another way of me explaining slightly better the possible steps would be:

-For each set of statuses that is attached to each entry
-count the number of ‘different’ statuses
-but do ‘not’ count ‘blanks’
-if that number for each status is more than 1 (not counting blanks), print”_ clash” in the status-v2 field
-if the number is 1, replicate whatever the entry is in the status-v2 field
- if the number is 0, leave the status-v2 field blank

Sorry - I really should have said at the outset that the length of the files

Happy to use helper columns / macro’s if that helps.

Thanks again
 
Upvote 0
Ok, try this instead

Excel Formula:
=IFERROR(DROP(REDUCE("",D2:D11,LAMBDA(a,b,VSTACK(a,LET(z,UNIQUE(TEXTBEFORE(TEXTAFTER(b,"""",{1,3}),""""),1),IF(COUNTA(z)=1,z,"_Clash"))))),1),"")
 
Upvote 0
Hi Jec, that is an astonishing formula, wow - kudos!
I'm really sorry - but the "notes" column in Column D isn't ever actually on the info that we receive.
(I only included the info in Column D to try to illustrate the end result of what we are trying to acheive in Status-V2. I don't think we'd actually be able to keep up with making a notes column for each report that comes in... :( )

For what it's worth, the formula works beautifully on the notes info - BUT the only info that we actually receive is the info info in columns A (which a row ID), B (which is the "Input") and C( which is the "Status-Orig").

Hope that helps.
 
Upvote 0
The reports I try to tidy up generally are about 10k rows, with around 500-ish different entries in the status column.
Hi, I don't know how well this will perform speed wise with that many rows, but here is another option you could try.

Book2
ABCDE
1RowInputStatus-OrigStatus-V2
21Alphapending
32Alphapendingpending
43Alphapending
54Betacompletedcompleted
65Charliecompletedcompleted
76Charliecompleted
87Deltacompleted_Clash
98DeltaWIP_Clash
109EchoWIPWIP
1110Foxtrot 
Sheet1
Cell Formulas
RangeFormula
E2:E11E2=LET(u,UNIQUE(FILTER($C$2:$C$11,($B$2:$B$11=B2)*($C$2:$C$11<>""),"")),IF(COUNTA(u)>1,"_Clash",u))
 
Upvote 0
Solution
Wow, that has worked beautifully - just tried on a dataset of 10K rows...

Many thanks to you and to JEC for persevering with that solution!

All the best
 
Upvote 0

Forum statistics

Threads
1,215,343
Messages
6,124,398
Members
449,155
Latest member
ravioli44

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top