Correct Name Capitalization in PHP

One annoying scenario is when you let users enter their own names and later need to output them nicely, for example in a newsletter. Some users type their names entirely in upper case or entirely in lower case, but you obviously can’t address them that way. PHP’s ucfirst() and ucwords() functions, on the other hand, are too naive for proper capitalization.

Let’s consider a few use cases:

Original                 | ucwords(strtolower($str)) | Proper
-------------------------+---------------------------+-------------------------
michael o’carrol         | Michael O’carrol          | Michael O’Carrol
lucas l’amour            | Lucas L’amour             | Lucas l’Amour
george d’onofrio         | George D’onofrio          | George d’Onofrio
william stanley iii      | William Stanley Iii       | William Stanley III
UNITED STATES OF AMERICA | United States Of America  | United States of America
t. von lieres und wilkau | T. Von Lieres Und Wilkau  | T. von Lieres und Wilkau
paul van der knaap       | Paul Van Der Knaap        | Paul van der Knaap
jean-luc picard          | Jean-luc Picard           | Jean-Luc Picard
JOHN MCLAREN             | John Mclaren              | John McLaren
hENRIC vIII              | Henric Viii               | Henric VIII
VAsco da GAma            | Vasco Da Gama             | Vasco da Gama

You get the picture.
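
As a quick check, the ucwords(strtolower($str)) results above can be reproduced with a few lines. This is only a minimal sketch: the sample names use straight apostrophes, and the last call relies on the optional delimiters argument of ucwords() found in current PHP versions, which handles the hyphen but still knows nothing about particles or numerals.

```php
<?php
// Naive approach: lower-case everything, then capitalize after whitespace.
$samples = array("michael o'carrol", 'william stanley iii', 'jean-luc picard', 'JOHN MCLAREN');
$naive = array();
foreach ($samples as $name) {
    $naive[$name] = ucwords(strtolower($name));
}
print_r($naive);

// Passing extra delimiters fixes the hyphen, but nothing else.
$hyphenated = ucwords(strtolower('jean-luc picard'), ' -');
echo $hyphenated, "\n"; // Jean-Luc Picard
```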

To make this work we need to observe three things:

  • Words are separated not just by spaces, but also by hyphens and apostrophes.
  • Some words (especially “of” variations in different languages) must always be lower case.
  • Conversely, some words, such as Roman numerals, must always be upper case.

This is what I came up with:

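function titleCase($string)
{
	// Delimiters that start a new "word". Order matters: the space pass runs
	// first and capitalizes each word, which is what lets the mixed-case
	// prefixes ("O'", "Mc", ...) match in the later passes.
	$word_splitters = array(' ', '-', "O'", "L'", "D'", 'St.', 'Mc');
	// Words and prefixes that stay lower case.
	$lowercase_exceptions = array('the', 'van', 'den', 'von', 'und', 'der', 'de', 'da', 'of', 'and', "l'", "d'");
	// Roman numerals that stay upper case ('II' added; it was missing from the list).
	$uppercase_exceptions = array('II', 'III', 'IV', 'VI', 'VII', 'VIII', 'IX');

	$string = strtolower($string);
	foreach ($word_splitters as $delimiter)
	{
		$words = explode($delimiter, $string);
		$newwords = array();
		foreach ($words as $word)
		{
			if (in_array(strtoupper($word), $uppercase_exceptions))
				$word = strtoupper($word);
			elseif (!in_array($word, $lowercase_exceptions))
				$word = ucfirst($word);

			$newwords[] = $word;
		}

		// Prefixes such as "L'" and "D'" are themselves written in lower case.
		if (in_array(strtolower($delimiter), $lowercase_exceptions))
			$delimiter = strtolower($delimiter);

		$string = implode($delimiter, $newwords);
	}
	return $string;
}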

This should work for most cases. I have not tested it with non-Latin alphabets.
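
For names outside plain ASCII, one possible adaptation swaps the byte-oriented string functions for their mbstring counterparts. The following is only a sketch and not part of the original function: titleCaseMb() and mbUcfirst() are hypothetical names, the input is assumed to be UTF-8, the mbstring extension is assumed to be loaded, and the exception lists mirror the ones above (with 'II' included).

```php
<?php
// Hypothetical helper: upper-case only the first character of a UTF-8 string.
function mbUcfirst(string $word): string
{
    return mb_strtoupper(mb_substr($word, 0, 1, 'UTF-8'), 'UTF-8')
         . mb_substr($word, 1, null, 'UTF-8');
}

// Hypothetical multibyte-aware variant of titleCase(); assumes UTF-8 input.
function titleCaseMb(string $string): string
{
    $word_splitters = array(' ', '-', "O'", "L'", "D'", 'St.', 'Mc');
    $lowercase_exceptions = array('the', 'van', 'den', 'von', 'und', 'der', 'de', 'da', 'of', 'and', "l'", "d'");
    $uppercase_exceptions = array('II', 'III', 'IV', 'VI', 'VII', 'VIII', 'IX');

    $string = mb_strtolower($string, 'UTF-8');
    foreach ($word_splitters as $delimiter) {
        // The delimiters are pure ASCII, so explode() is safe on UTF-8 text.
        $words = explode($delimiter, $string);
        foreach ($words as $i => $word) {
            $upper = mb_strtoupper($word, 'UTF-8');
            if (in_array($upper, $uppercase_exceptions, true)) {
                $words[$i] = $upper;
            } elseif (!in_array($word, $lowercase_exceptions, true)) {
                $words[$i] = mbUcfirst($word);
            }
        }
        // Prefixes such as "L'" and "D'" are written in lower case.
        if (in_array(strtolower($delimiter), $lowercase_exceptions, true)) {
            $delimiter = strtolower($delimiter);
        }
        $string = implode($delimiter, $words);
    }
    return $string;
}

echo titleCaseMb("\u{e9}mile zola"), "\n";
echo titleCaseMb('JOHN MCLAREN'), "\n";
```

The structure is deliberately the same as above; only the case conversions change, so the ASCII examples from the table should behave as before.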

2 replies
  1. Noah M says:

    Wow. I just had a client meeting where she complained about people entering all caps and making her pretty website look ugly. I can’t wait to give this a try! Thanks!
