Two popular products of LIVAC are the annual New Chinese Buzzwords Roster and the annual Newsmakers Roster. They reflect on the life and times of the Chinese communities in the previous year. The 2020 LIVAC Pan-Chinese Newsmaker Roster has been released.
Shakespeare reminds us that the world is like a stage and that transient players on it come and go. This week the LiVaC 2020 Pan-Chinese newsmaker rosters from Beijing, Hong Kong and Taipei were released [available HERE], and they bear this out, especially when compared with those of past years. Many names on the rosters belong to politicians, who appear because of the praiseworthy or deplorable roles they played in society the year before; other individuals appear because of their perpetration of bad deeds (e.g. corruption).
These people can be KOLs (Key Opinion Leaders) because what they say is well noted and influences others (e.g. Trump and the riots on Capitol Hill in Washington DC), or they can be objects of attention of KOLs (usually for morally or physically depraved deeds, e.g. Bin Laden).
The names of newsmakers belong to the class of proper nouns. The other members of this class are place names and organization names, and together they are the most numerous among the 2.5 million words we have collected over 22 years in our LiVaC corpus (please see Here). The identification of proper names is very important in NLP (Natural Language Processing) and in information retrieval and filtering. Some words in English are ambiguous as to which kind of proper noun they belong to. For example, FUJI may be a personal name [Mr. Fuji], a place name [Mount Fuji] or an organization name [Fuji Film]. Making the proper distinction is important for information retrieval in English, and the same problem is shared by Chinese.
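To make the FUJI example concrete, here is a minimal sketch of how neighbouring words can hint at the type of an ambiguous proper noun. The cue-word lists and the function name classify_proper_noun are assumptions made for illustration only; they are not how LiVaC works, and a practical system would rely on much richer context.

```python
# Hypothetical cue words, chosen only to cover the FUJI example.
PERSON_CUES = {"mr.", "mrs.", "ms.", "dr."}
PLACE_CUES = {"mount", "mt.", "lake", "river"}
ORG_CUES = {"film", "bank", "corp.", "inc.", "ltd."}


def classify_proper_noun(tokens, index):
    """Guess the type of the proper noun at tokens[index] from its neighbours."""
    before = tokens[index - 1].lower() if index > 0 else ""
    after = tokens[index + 1].lower() if index + 1 < len(tokens) else ""
    if before in PERSON_CUES:
        return "PERSON"
    if before in PLACE_CUES:
        return "PLACE"
    if after in ORG_CUES:
        return "ORGANIZATION"
    return "UNKNOWN"  # no cue word next to the name


for phrase in ("Mr. Fuji", "Mount Fuji", "Fuji Film"):
    tokens = phrase.split()
    print(phrase, "->", classify_proper_noun(tokens, tokens.index("Fuji")))
```

When no cue word is present the sketch returns UNKNOWN rather than guessing, which is exactly where the ambiguity bites.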
However, in contrast to Western languages, Chinese presents an unusual challenge for the computer. This is because Chinese has no capital letters to distinguish proper nouns from other kinds of words and so Chinese NLP has to overcome an additional critical hurdle. The solution has to be based on accumulated knowledge of existing proper nouns. But then, as noted in our earlier blog, new words always crop up, as do new names of players on the World Stage.
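The knowledge-based approach can be illustrated with a small sketch: with no capital letters to rely on, the program scans the text and looks up the longest match in a list of already known proper nouns. The tiny GAZETTEER below and the function find_known_names are hypothetical and stand in for the far larger accumulated resources a real system would need; this is not a description of the LiVaC implementation.

```python
# Hypothetical list of known proper nouns and their types.
GAZETTEER = {
    "北京": "PLACE",
    "香港": "PLACE",
    "台北": "PLACE",
    "富士山": "PLACE",
    "特朗普": "PERSON",
}
MAX_LEN = max(len(name) for name in GAZETTEER)


def find_known_names(text):
    """Scan left to right, taking the longest gazetteer entry at each position."""
    found = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for length in range(min(MAX_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in GAZETTEER:
                found.append((candidate, GAZETTEER[candidate]))
                i += length
                break
        else:
            i += 1  # no known name starts at this character
    return found


print(find_known_names("特朗普访问北京和香港"))
```

A name that is missing from the list, such as that of a new player on the world stage, is simply not found, which is why the list has to keep growing as new names appear.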