Flash Fill: Data Extraction in MS Excel

In its most recent iterations, Microsoft Excel has changed only minimally, at least as far as the average user is concerned.  Microsoft have only added a few ‘bells and whistles’ to the familiar interface, which have not changed the user’s experience in any fundamental way.  However, one particularly useful enhancement was added to the 2013 version to make the challenge of data extraction both simpler and more intuitive.

Imagine that you are called upon to work with and manage a dataset in an Excel spreadsheet that contains a series of individual names, be they employees, clients, stakeholders or what have you.  The names are contained within a single column, with each consecutive cell containing both the first and last name of the individual concerned, as in the example below:

The problem here lies in the difficulty involved in trying to manage and analyse data in this configuration.  For example, if you want to sort these records in order of the last name, there is no simple way to accomplish this.  With the first and last names combined in the same column in this way, life is much more difficult when it comes to the manipulation and meaningful interpretation of these records.  You really need to separate the first and last names into separate distinct columns to provide you with the flexibility needed in order to facilitate the greater manipulation you desire.

There exist two obvious ways to achieve this.

  • There has long been a feature in Excel called Text to Columns, found a little more than ½ way along the Data ribbon, which exists for the sole purpose of dividing single columns of text into multiple columns of text. This can do the job well, but it requires the user to navigate their way through a series of screens in a wizard to achieve the desired outcome.
  • You can undertake an exhaustive (and quite possibly exhausting) cut and paste operation to create the separate columns as required.  The obvious problem with this is that you may have a significant list of, for example, 100 (or more) names; going the cut-and-paste route could be tedious and long-winded in the extreme.

There must be a better way to achieve this kind of data extraction in a simpler and more intuitive way. Happily, from the 2013 version onwards, there is.

If you go to a blank column adjacent to the list of names and enter a copy of the first name (insert a new blank column if necessary), you can then use the Fill Handle i.e. the little green square in the lower right corner of the cell to copy this name down to the rest of the cells in the column below.  The use of the fill handle in this way will simply copy the contents of the original cell, which is obviously not what you want.

However, if you immediately activate the Autofill Options menu that will appear in the lower right corner of your destination range, you will see that the last choice at the foot of this menu is called Flash Fill.  If you choose this command, Excel will then produce a list of names from the same relative location in the source column, as in the example below:

Flash Fill can also be found in an alternative location on the immediate right of Text to Columns, in the Data Tools group on the Data tab.  The only possible complication with this method of data extraction is the circumstance that the data in the source column does not all conform to the exact same pattern.  The example that I’ve used so far assumes that each cell in the source contains the first name at the start followed by the last name with a space separating the names.  If one or more of the names doesn’t follow the same pattern, Flash Fill is not clever enough to cope with that situation and will not produce a result for these names as intended; you may find that you will be left with a blank row in this location.

However, by and large Flash Fill is a useful addition to the Excel repertoire, and can provide a much simpler and more efficient means of extracting data across a range of different scenarios.  For example, it is often used to extract the individual elements from an address field and hence create separate columns for street, suburb, state, postcode et al.  This example described here merely skims the surface of the applications of Flash Fill in Excel: it can be a most versatile and flexible element in your spreadsheet toolkit.