
Diacritical marks and spelling
Herewith, an explanatory note about diacritical marks and spelling in non-English names.
It identifies some printing and display problems with
diacritical marks that do not survive the conversion from the richer PC/Mac character sets
into the simpler world of HTML. This is evident, for example, when
one examines the entries for
Guner:1986
and following (normally spelled with a u-umlaut, which is present in this text as a character from the
unsupported extended character set (decimal 237) and is often rendered by web browsers as
a capital Y)
or
Carbo
et al:1990 and following (normally spelled with an o-acute, which is often rendered by web
browsers as a hyphen). In these cases, the
diacritical mark does not translate properly into HTML, obscuring the proper spelling
of the author's name.
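One common way to keep such characters intact in HTML is to write them as numeric character references, which use only plain ASCII. The following is a minimal sketch, assuming Python and names that are already available as Unicode text; the helper name `to_html_ascii` is illustrative only and is not part of how this document was produced.

```python
import html


def to_html_ascii(name: str) -> str:
    """Return an ASCII-only HTML rendering of a name.

    Markup characters (&, <, >) are escaped first, then every non-ASCII
    character is replaced by a decimal numeric character reference.
    """
    escaped = html.escape(name, quote=False)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Illustrative names taken from the examples discussed above.
print(to_html_ascii("Güner"))  # G&#252;ner
print(to_html_ascii("Carbó"))  # Carb&#243;
```

A browser that reads `G&#252;ner` displays the u-umlaut regardless of the character set in which the page itself was saved, which avoids the substitutions described above.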
Most vowel diacriticals, and a few consonant diacriticals, of the kind found in common German, French or Spanish are supported in the MS Word document in their native format (e.g., Guner, Ariens, Laszlo, Bohm, Zoltan, Carbo, Veronique, Rose, etc.) but, as seen above, are NOT supported in the default HTML formats. Russian and Ukrainian names are transliterated in canonical form, following the citation practice of ACS and CAS (as found in the CAS files on STN and DIALOG). However, for Hungarian, Czech and several other Eastern European name forms, the typefaces supplied with MS Word and by standard Windows/Mac font suppliers seldom allow the construction of such proper forms as the "hacek" (c with inverted-^) or acute-c. In the former case, as with authors Novic(inv-^) and Kvasnic(inv-^)ka, the diacritical is explained in parentheses following the consonant. In the latter case, as with authors Randic and Trinajstic, the diacritical mark stands alone and precedes the consonant. With apologies, many other diacritical marks are simply ignored. Chinese, Japanese and Korean name forms have been reordered, as necessary, to follow the American and European convention of placing the family name in primary sort position.
Where spelling or diacritical marks were found to differ between database records, the variants have been resolved by reference to the original papers.
For primary sort order, fused diacritical marks have been ignored.
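The sort rule above can be sketched in code. This is a minimal illustration, assuming Python and Unicode-encoded names; the function name `sort_key` is hypothetical, and the approach (Unicode decomposition followed by removal of combining marks) is one possible way to ignore fused diacriticals, not necessarily the procedure used to order this bibliography.

```python
import unicodedata


def sort_key(name: str) -> str:
    """Build a sort key that ignores fused diacritical marks.

    NFD normalization splits a character such as u-umlaut into a base
    letter plus a combining mark; the combining marks are then dropped.
    """
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Illustrative names only; "Guner" is included to show that the marked
# and unmarked spellings receive the same key.
names = ["Zoltán", "Güner", "Carbó", "Guner", "Böhm"]
print(sorted(names, key=sort_key))
# ['Böhm', 'Carbó', 'Güner', 'Guner', 'Zoltán']
```

Letters that are distinct characters rather than a base letter plus a mark (for example the Polish l with stroke) have no such decomposition and would need an explicit mapping.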