********************************** COMMENTS Welcome to the fourth issue of the Excel Experts E-letter (or EEE), by David Hager. EEE is a semi-monthly publication. Feel free to distribute copies of EEE to your friends and colleagues. Again, I have selected a theme for this issue. This time it is dealing with unique items in a dataset. Remember that EEE is devoted to sharing ideas across the Excel developer community, so if you have some great Excel technique that is not well- known, send it to me and I will include it in a future issue. For back issues of EEE, check out: http://www.j-walk.com/ss/excel/eee/index.htm ********************************** TOP EXCEL WEB SITES There is a wealth of information at the following site. http://www.baarns.com/IE4/index_devonly.asp However, new material has not been added in quite some time. ********************************** WORKSHEET FORMULA TIPS Created by David Hager This array formula returns the number of unique items in a worksheet range. =SUM(1/COUNTIF(Rng,Rng)) However, if Rng contains blank cells, this formula returns an error. In this case, use this modified version of the formula. =SUM(COUNTIF(Rng,Rng)/IF(NOT(COUNTIF(Rng,Rng)),1,COUNTIF(Rng,Rng))^2) Created by David Hager This array formula returns the Nth largest unique value in a column range. =LARGE(IF(MATCH(Rng,Rng,0)=ROW(Rng)-MIN(ROW(Rng))+1,Rng,""),N) Created by David Hager To apply data validation to a column which allows only unique items to be entered, highlight that column and select (in Excel 97 and later versions) Data, Validation from the menu. Choose the custom option and enter the following formula (for column A): =COUNTIF($A$1:A1,A1)=1 Created by Laurent Longre This formula counts the number of unique items a column range, only if the cells in the lookup range contain the specified string. =SUM(N(FREQUENCY(IF(lookupRange="specifStr",MATCH(colRange,colRange,0)), MATCH(colRange,colRange,0))>0)) ********************************** POWER FORMULA TECHNIQUE Created by David Hager, Bob Umlas and Laurent Longre The problem - to create an array containing only the unique items from an expanding column list. In other words, if items are typed down column A, what is the formula that will return the unique items? The following example further illustrates the problem. ColA a b 1 b 3 In this case, the array should be {"a";"b";1;3}. Then, if additional values are added: a b 1 b 3 c 1 d a b the array should be {"a";"b";1;3;"c";"d"}. The answer to this problem has eluded me for years, but with recent input from Bob and Laurent, I have successfully constructed a solution to this problem. The formula is somewhat long, so it is necessary to define parts of the formula to simplify the final form. Define TheList as: =OFFSET(Sheet1!$A$1,,,COUNTA(Sheet1!$A:$A),) This formula creates the expanding range for the items as they are entered into column A. Define sArray as: =SMALL(IF(MATCH(TheList,TheList,0)=ROW(TheList),ROW(TheList),""), ROW(INDIRECT("1:"&SUM(N(MATCH(TheList,TheList,0)=ROW(TheList))))))-1 This formula contains several important elements that require explanation. The formula IF(MATCH(TheList,TheList,0)=ROW(TheList),ROW(TheList),"") returns an array of positions for the unique items that is the same size as the TheList array, where the duplicates items are now represented by empty strings. The formula ROW(INDIRECT("1:"&SUM(N(MATCH(TheList,TheList,0)=ROW( TheList))))) returns an array of numbers from 1 to n, where n is the number of unique items in the list, as calculated by the formula SUM(N(MATCH(TheList ,TheList,0)=ROW(TheList))). What is desired is an array that contains the unique positions with no empty strings. This is accomplished by the use of the SMALL function which, along with the LARGE function, is unique among Excel functions in its ability to create different sized arrays than the array used in the 1st argument if the 2nd argument is also an array. The -1 is used to adjust the item positions for use in the formula shown below. Define TheUniqueArray as: =IF(T(OFFSET(TheList,sArray,,1))="",N(OFFSET(TheList,sArray,,1)), T(OFFSET(TheList,sArray,,1))) The formula OFFSET(TheList,sArray,,1) is an array of single element arrays, as explained in detail in the 1st issue of EEE. It can be converted into a normal array by using the N or T functions. Both N and T are used here since TheList can contain either text or numeric items. WARNING: ALTHOUGH THIS TECHNIQUE WORKS, IT MUST NOT BE APPLIED TO SITUATIONS WHERE THERE ARE LARGE ARRAYS (>1000 ITEMS). EXCEL'S CALCULATION ENGINE RUNS MUCH TOO SLOWLY ON THIS TYPE OF FORMULA. FOR EXAMPLE, IT CALCULATES ~100 TIMES SLOWER THAN THE PROGRAMMING TECHNIQUES FOR COUNTING UNIQUE ITEMS THAT ARE SHOWN BELOW FOR AN ARRAY OF 5000 ITEMS. ********************************** VBA CODE EXAMPLES Here are four examples of counting unique values in a list. Each of these examples creates an array of the unique items, so they can be modified to to those arrays for a purpose other than just counting the unique items. Created by David Hager Sub cMethodAdvFilter() CountUniqueByAdvFilter Selection.Address End Sub Sub CountUniqueByAdvFilter(mRange As String) Dim TheRange As String Application.ScreenUpdating = False TheRange = "'[" & ActiveWorkbook.Name & _ "]" & ActiveSheet.Name & "'!" & mRange Workbooks.Add Range(TheRange).AdvancedFilter Action:=xlFilterCopy, CopyToRange _ :=Range("A1"), Unique:=True MsgBox Application.WorksheetFunction.CountA(Range("A:A")) ActiveWorkbook.Close False Application.ScreenUpdating = True End Sub Created by Keyuan Jiang Sub cMethodDAO() Dim strDBFullName As String Dim dbData As Database, rstWork As Recordset, strSQL As String strDBFullName = ThisWorkbook.Path & "\" & ThisWorkbook.Name strSQL = "select distinct [your_field] from dataarea" 'Appropriate driver needed for this statement Set dbData = OpenDatabase(strDBFullName, False, True, _ "Excel8.0;HDR=YES;") Set rstWork = dbData.OpenRecordset(strSQL) rstWork.MoveLast MsgBox rstWork.RecordCount Set rstWork = Nothing Set dbData = Nothing End Sub where [your_field] is the header of the column you are interested in and the dataarea is a named area that contains all data in question (could be the single column you are interested in). By David Hager Sub CountUniqueByPivotTable() On Error GoTo uOut Application.ScreenUpdating = False Application.DisplayAlerts = False TheHeader = ActiveCell.Value ActiveSheet.PivotTableWizard SourceType:=xlDatabase, _ SourceData:=ActiveSheet.Name & "!" & _ Selection.Address, TableDestination:="", TableName:="uPivotTable" ActiveSheet.PivotTables("uPivotTable").AddFields RowFields:=TheHeader ActiveSheet.PivotTables("uPivotTable").PivotFields(TheHeader). _ Orientation = xlDataField MsgBox Application.WorksheetFunction.CountA(Range("a:a")) - 3 ActiveSheet.Delete Application.ScreenUpdating = True Application.DisplayAlerts = True Exit Sub uOut: Application.ScreenUpdating = True Application.DisplayAlerts = True End Sub By John Walkenbach Sub cMethodByCollection() CountUniqueByCollection Selection.Address End Sub Sub CountUniqueByCollection(AllCells As String) Dim NoDupes As New Collection On Error Resume Next For Each Cell In Range(AllCells) NoDupes.Add Cell.Value, CStr(Cell.Value) 'Note: the 2nd argument (key) for the Add method must be a string Next Cell On Error GoTo 0 End Sub Although not tested extensively, it appears that the procedure that uses the Collection object produces the fastest result. ********************************** EXCEL 2000 TIP Under Tools, Options, View there is a checkbox entitled "Windows in Taskbar". When it is checked, every file/window that is open in Excel has its own Taskbar button. Uncheck that box if you do not want this feature. Under View, Toolbars, Customize, Options there is a checkbox entitled "Menus show recently used commands first". When it is checked, menu items change their position based how often they are used. Uncheck that box if you do not want this feature. ********************************** DID YOU KNOW?... that the Office Spreadsheet Component can have up to 676 columns. ********************************** Issue No.4 OF EEE (PUBLISHED 30Apr1999) Next issue scheduled for 16MAY1999. BY David Hager dchager@compuserve.com