Name Formatting

Remove dates and Unwanted Characters from Name Fields

One item that came up in this example was that some items had birth and death ranges that we needed to remove. While you can find and replace specific punctuation in Google Sheets, random sets of yyyy date ranges is a little trickier. Using OpenRefine:

  • Select the column with the author names
  • Select the dropdown menu > Edit Cells > Transform
  • Enter the following GREL expression:
value.replace(/,\s?\d{4}-?/, "")
Gif of the OpenRefine Interface Removing Dates and Unwanted Characters from Name Fields
Removing Dates and Unwanted Characters from Name Fields

Explanation of the Formula

  • value: Refers to the content of each cell in the column.
  • .replace(...): Applies a regular expression to find and replace the birth date portion.
  • /,\s?\d{4}-?/: This regular expression matches:
    • ,: A comma
    • \s?: An optional space
    • \d{4}: Exactly four digits (representing the year)
    • -?: An optional hyphen (to account for cases where the year might or might not be followed by a dash)

This formula will remove any pattern like , 1957- or ,1957 from the names, leaving just the author’s name without the birth date.


Standardizing Author Names to “Last Name, First Name” Format

While the names in the example metadata we are using were already formatted correctly, the following steps will show you how to transform names from “First Last” to “Last, First” format, using GREL to split and reorder the names.

a. Select Dropdown of Author Column

b. “Edit cells” > “Transform…”

c. Enter the Transformation Expression:

 ```
if(value.split(" ").length() == 3,    value.split(" ")[2] + " " + value.split(" ")[1] + ", " + value.split(" ")[0],    value.split(" ")[1] + ", " + value.split(" ")[0] )
 ```
  • Click “OK” to apply the transformation.
Gif of the OpenRefine Interface Converting Author Names from First Name Last Name to Last Name, First Name
Converting Author Names from First Name Last Name to Last Name, First Name

Explanation of the Formula

  1. value.split(" ").length() == 3: Checks if there are three parts in the name.
  2. For Three-Part Names: If the name has three parts (e.g., "first name middle initial last name"), it reorders to "last name middle initial, first name".
  3. For Two-Part Names: If there are only two parts (e.g., "first name last name"), it reorders to "last name, first name".