CN115310458A - Name translation method, system, equipment and computer readable storage medium - Google Patents

Info

Publication number
CN115310458A
Authority
CN
China
Prior art keywords
name
vietnamese
letters
vietnam
names
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210949527.3A
Other languages
Chinese (zh)
Inventor
苑聪虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glabal Tone Communication Technology Co ltd
Original Assignee
Glabal Tone Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Glabal Tone Communication Technology Co ltd filed Critical Glabal Tone Communication Technology Co ltd
Priority to CN202210949527.3A priority Critical patent/CN115310458A/en
Publication of CN115310458A publication Critical patent/CN115310458A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/45 Example-based machine translation; Alignment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a name translation method, system, equipment and computer readable storage medium. The method comprises: obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames, and converting the Vietnamese letters of Vietnamese name words into English-letter form; converting the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form; training a gender recognition classifier with the collected male and female name comparison data; and finding, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text. The invention can correctly convert Vietnamese names into Chinese names, greatly improving the accuracy of Vietnamese name translation and raising the level of Vietnamese machine translation.

Description

Name translation method, system, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence, in particular to a technique for translating Vietnamese personal names into Chinese, and more particularly to a name translation method, system, device, and computer-readable storage medium.
Background
Translation engines available online often fail to translate Vietnamese personal names into Chinese. An engine cannot translate a name that did not appear in its training data, and names cannot be covered exhaustively by data, so untranslated names remain in the output and impair the overall readability of the Chinese text.
The present invention has been made in view of this situation.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art and provide a name translation method, system, device and computer-readable storage medium that can correctly convert Vietnamese names into Chinese names, greatly improving the accuracy of Vietnamese name translation and raising the level of Vietnamese machine translation.
In order to solve the above technical problem, the first aspect of the present invention adopts the following basic concept:
A name translation method, comprising the following steps:
Step 1: obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames, and converting the Vietnamese letters of Vietnamese name words into English-letter form;
Step 2: converting the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form;
Step 3: training a gender recognition classifier with the collected male name comparison data and female name comparison data;
Step 4: finding, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text.
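Step 1 can be illustrated with a minimal sketch. It assumes that "converting into English-letter form" means removing diacritics and mapping đ/Đ to d/D; the source does not state the exact mapping, and the function name `to_english_letters` and the diacritic spelling of the sample name are illustrative assumptions.

```python
import unicodedata


def to_english_letters(text: str) -> str:
    """Map Vietnamese letters to plain English letters (assumed mapping)."""
    # đ/Đ have no Unicode decomposition, so they are mapped explicitly.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Dropping the combining marks leaves only the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Assumed sample spelling, used only for illustration.
print(to_english_letters("Vũ Hồng Thanh"))  # Vu Hong Thanh
```

Using Unicode decomposition avoids maintaining a hand-written table for every vowel and tone combination, at the cost of treating every combining mark the same way.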
In a preferred embodiment of any of the foregoing solutions, before obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames and converting the Vietnamese letters of Vietnamese name words into English-letter form, the method further comprises:
Step 5: collecting Vietnamese surname data;
Step 6: obtaining the Vietnamese letters of the Vietnamese surnames from the Vietnamese surname data.
In a preferred embodiment of any of the foregoing solutions, collecting the Vietnamese surname data further comprises:
Step 51: collecting the names that have fixed translations among Vietnamese names.
In a preferred embodiment of any of the foregoing solutions, converting the Vietnamese surnames and given names into male name comparison data and female name comparison data in English-letter form further comprises:
looking up the name found in step 2 in fixed_name to determine whether it has a fixed translation, and if so, returning the search result and ending;
querying whether the name from step 2 exists in the Vietnamese text, and if not, ending.
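The two checks above can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: `fixed_name` is modelled as a dictionary keyed by the English-letter form of the name, the table entry and the sample sentences are made up for the example, and the helper names `fold` and `check_name` are hypothetical.

```python
import unicodedata
from typing import Optional

# Hypothetical fixed-translation table, keyed by English-letter form.
FIXED_NAME = {"Ho Chi Minh": "胡志明"}


def fold(text: str) -> str:
    """Strip diacritics so a Vietnamese line can be compared with an English-letter name."""
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def check_name(name: str, line_vi: str,
               fixed_name: dict[str, str]) -> tuple[str, Optional[str]]:
    """Return ("fixed", translation), ("absent", None) or ("continue", None)."""
    if name in fixed_name:
        # A fixed translation exists: return it and stop.
        return ("fixed", fixed_name[name])
    if name not in fold(line_vi):
        # The name does not occur in the Vietnamese source line: stop.
        return ("absent", None)
    # Otherwise the name goes on to gender classification and conversion.
    return ("continue", None)


print(check_name("Ho Chi Minh", "Chủ tịch Hồ Chí Minh", FIXED_NAME))
```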
In a preferred embodiment of any of the above solutions, training a gender recognition classifier with the collected male name comparison data and female name comparison data comprises:
classifying the Vietnamese name queried in step 2 with the gender recognition classifier, converting the name with fixed_name, and converting the name with last_man_name or last_women_name; if all Vietnamese words are converted into Chinese characters, returning the converted Chinese characters, and otherwise ending.
In a second aspect, a person name translation system includes:
a first acquisition module, configured to obtain the English letters corresponding to the Vietnamese letters of Vietnamese surnames and convert the Vietnamese letters of Vietnamese name words into English-letter form;
a conversion module, configured to convert the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form;
a training module, configured to train a gender recognition classifier with the collected male name comparison data and female name comparison data;
and a calculation module, configured to find, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text.
In a preferred embodiment of any of the above solutions, the person name translation system further includes:
a collection module, configured to collect Vietnamese surname data;
and a second acquisition module, configured to obtain the Vietnamese letters of the Vietnamese surnames from the Vietnamese surname data.
In a preferred embodiment of any of the above solutions, the person name translation system further includes:
a classification module, configured to classify the queried Vietnamese personal name with the gender recognition classifier, convert the name with fixed_name, and convert the name with last_man_name or last_women_name; if all Vietnamese words are converted into Chinese characters, the converted Chinese characters are returned, and otherwise the process ends.
In a third aspect, a person name translation apparatus includes:
one or more processors;
a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the person name translation method.
In a fourth aspect, a computer-readable storage medium stores a program which, when executed by a processor, implements the person name translation method.
Compared with the prior art, the name translation method provided by the embodiments of the application trains a gender recognition classifier with the collected male name comparison data and female name comparison data, so that Vietnamese names can be correctly converted into Chinese names, greatly improving the accuracy of Vietnamese name translation and raising the level of Vietnamese machine translation.
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. Some specific embodiments of the present application will be described in detail hereinafter by way of illustration and not limitation with reference to the accompanying drawings. The same reference numbers will be used throughout the drawings to refer to the same or like parts or portions, and it will be understood by those skilled in the art that the drawings are not necessarily drawn to scale, in which:
Fig. 1 is a schematic flow chart of a name translation method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a name translation system according to an embodiment of the present application.
Fig. 3 is a schematic diagram of a name translation device according to an embodiment of the present application.
It should be noted that the drawings and the description are not intended to limit the scope of the inventive concept in any way, but to illustrate it for a person skilled in the art with reference to specific embodiments.
Detailed Description
In order that those skilled in the art may better understand the technical solutions, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are merely some, and not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
The following examples of the present application illustrate the scheme of the present application in detail by taking the name translation method as an example, but the scope of the present application is not limited by the examples.
Examples
As shown in Fig. 1, the present invention provides a name translation method, which comprises the following steps:
Step 1: obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames, and converting the Vietnamese letters of Vietnamese name words into English-letter form;
Step 2: converting the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form;
Step 3: training a gender recognition classifier with the collected male name comparison data and female name comparison data;
Step 4: finding, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text.
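Step 4 can be sketched with Python's `re` module. The pattern below is an assumption, since the source does not give the actual regular expression, and the sample Chinese sentence is illustrative. The minimum word count is a parameter because the wording "more than two" might also be read as "two or more"; the default of 2 is a guess.

```python
import re

# One capitalised word in English letters, e.g. "Vu" or "Thanh".
_WORD = r"[A-Z][a-z]+"


def find_names(line_zh: str, min_words: int = 2) -> list[str]:
    """Find runs of at least `min_words` capitalised words left in the Chinese text."""
    if min_words < 2:
        raise ValueError("min_words must be at least 2")
    # Lookarounds are used instead of \b because Chinese characters count as
    # word characters in Python, so \b would not fire between 会 and V.
    pattern = re.compile(
        rf"(?<![A-Za-z]){_WORD}(?: {_WORD}){{{min_words - 1},}}(?![A-Za-z])"
    )
    return pattern.findall(line_zh)


print(find_names("经济委员会Vu Hong Thanh表示"))  # ['Vu Hong Thanh']
```

Passing `min_words=3` gives the stricter literal reading of "more than two consecutive words".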
In the name translation method provided by this embodiment of the invention, a gender recognition classifier is trained with the collected male name comparison data and female name comparison data, so that Vietnamese names can be correctly converted into Chinese names, greatly improving the accuracy of Vietnamese name translation and raising the level of Vietnamese machine translation.
Vietnamese letters are derived from the Latin alphabet. The Vietnamese alphabet has 9 diacritical marks, of which 4 are used to form additional vowels and the other 5 mark tones (the level tone, i.e. the first tone, carries no mark). A Vietnamese vowel often carries two diacritical marks, which is one of the most obvious features of Vietnamese text.
In addition, there are 10 digraphs (CH, GH, GI, KH, NG, NH, PH, QU, TH, TR) and one trigraph (NGH). These digraphs and the trigraph were formerly treated as separate letters and had their own entries in older dictionaries. They are no longer ordered as separate letters; for example, "CH" is now placed between "CA" and "CO" in dictionaries. Vietnamese itself does not use "F", "J", "W" or "Z", but they appear in loanwords. "W" sometimes replaces "Ư" in abbreviations. In informal writing, "W", "F" and "J" may also be used in place of "QU", "PH" and "GI".
The correspondence between Vietnamese spelling and pronunciation is sometimes complicated. In some cases the same letter represents several sounds, and the same sound may be written with more than one letter. The letters y and i are interchangeable in many cases, and no rule indicates when to use which letter. Although the standardized spelling does not apply to diphthongs or triphthongs, and proper names are exempt, some people still change a surname from [image BDA0003788901520000061] to [image BDA0003788901520000062], and Thúy (a common female name, meaning "emerald") becomes Thúi ("odor"). At present, the spelling with i appears only in scientific publications and textbooks; most people and media still use the older spelling.
The letters are mostly derived from the Portuguese alphabet; the digraphs "gh" and "gi" come from Italian (e.g. ghetto, Giuseppe), and "c", "k" and "qu" come from Greek and Latin. Under the influence of the Chinese writing system, each syllable that originally corresponded to one Chinese character is written separately, with spaces between syllables. In the past, the syllables of a polysyllabic word were joined with hyphens, but this is no longer done; hyphens now appear only in foreign words.
A syllable consists of at most 3 parts, from left to right: an optional initial consonant; a mandatory vowel nucleus, with a tone mark above or below it when needed; and an optional final consonant, which is one of c, ch, m, n, ng, nh, p, t.
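The three-part syllable structure described above can be checked with a short sketch. It works on lowercase, diacritic-free syllables. The list of finals is taken from the text; the list of initial consonants is an assumption based on general knowledge of Vietnamese orthography and is not given in the source. The name `split_syllable` is hypothetical.

```python
import re
from typing import Optional

# Assumed inventory of initials (digraphs and the trigraph are tried first).
_INITIAL = r"ngh|ng|nh|ch|gh|gi|kh|ph|qu|th|tr|[bcdghklmnprstvx]"
# Finals as listed in the text: c, ch, m, n, ng, nh, p, t.
_FINAL = r"ch|ng|nh|[cmnpt]"
_SYLLABLE = re.compile(rf"^({_INITIAL})?([aeiouy]+)({_FINAL})?$")


def split_syllable(syllable: str) -> Optional[tuple[str, str, str]]:
    """Split a diacritic-free syllable into (initial, nucleus, final), or None."""
    match = _SYLLABLE.match(syllable.lower())
    if match is None:
        return None
    # Missing optional parts are reported as empty strings.
    initial, nucleus, final = match.groups(default="")
    return initial, nucleus, final


print(split_syllable("Thanh"))  # ('th', 'a', 'nh')
```

Such a check could serve as a cheap filter for deciding whether a capitalised word plausibly belongs to a Vietnamese name; the patent itself does not describe this step.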
In this embodiment of the present invention, before obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames and converting the Vietnamese letters of Vietnamese name words into English-letter form, the method further comprises:
Step 5: collecting Vietnamese surname data, and collecting the names that have fixed translations among Vietnamese names;
Step 6: obtaining the Vietnamese letters of the Vietnamese surnames from the Vietnamese surname data.
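Steps 5 and 6 can be sketched as building a letter table from collected surnames. The surname list below is a made-up sample, and the helper names `base_letter`, `build_letter_table` and `to_english` are hypothetical; the source does not describe the data format.

```python
import unicodedata


def base_letter(ch: str) -> str:
    """Return the English letter underlying one Vietnamese letter."""
    if ch == "đ":
        return "d"
    if ch == "Đ":
        return "D"
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def build_letter_table(surnames: list[str]) -> dict[str, str]:
    """Collect every non-English letter seen in the surnames and map it to English."""
    table: dict[str, str] = {}
    for surname in surnames:
        # NFC makes each accented letter a single code point before scanning.
        for ch in unicodedata.normalize("NFC", surname):
            if ch.isalpha() and not ch.isascii():
                table[ch] = base_letter(ch)
    return table


def to_english(name: str, table: dict[str, str]) -> str:
    """Apply the collected letter table to a name."""
    return unicodedata.normalize("NFC", name).translate(str.maketrans(table))


# Illustrative sample data, not taken from the patent.
TABLE = build_letter_table(["Nguyễn", "Trần", "Đỗ"])
print(to_english("Đỗ", TABLE))  # Do
```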
In this embodiment of the present invention, converting the Vietnamese surnames and given names into male name comparison data and female name comparison data in English-letter form further comprises:
looking up the name found in step 2 in fixed_name to determine whether it has a fixed translation, and if so, returning the search result and ending;
querying whether the name from step 2 exists in the Vietnamese text, and if not, ending.
In an embodiment of the present invention, training a gender recognition classifier with the collected male name comparison data and female name comparison data comprises:
classifying the Vietnamese name queried in step 2 with the gender recognition classifier, converting the name with fixed_name, and converting the name with last_man_name or last_women_name; if all Vietnamese words are converted into Chinese characters, returning the converted Chinese characters, and otherwise ending.
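The source does not say which kind of classifier is trained. As one possible illustration, the sketch below uses a tiny multinomial naive Bayes model over the name tokens after the surname. The class `GenderClassifier`, the training names, and the choice of naive Bayes are all assumptions made for the example.

```python
import math
from collections import Counter


class GenderClassifier:
    """Toy naive Bayes over middle and given name tokens (illustrative only)."""

    def __init__(self) -> None:
        self.counts = {"male": Counter(), "female": Counter()}
        self.docs = Counter()

    def train(self, names: list[str], label: str) -> None:
        for name in names:
            self.docs[label] += 1
            # The surname (first token) is skipped; it is shared by both genders.
            self.counts[label].update(name.split()[1:])

    def predict(self, name: str) -> str:
        # Assumes both labels have been trained with at least one name.
        vocab = set(self.counts["male"]) | set(self.counts["female"])
        total_docs = sum(self.docs.values())
        scores = {}
        for label, counter in self.counts.items():
            score = math.log(self.docs[label] / total_docs)
            total = sum(counter.values())
            for token in name.split()[1:]:
                # Laplace smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counter[token] + 1) / (total + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical comparison data in English-letter form.
clf = GenderClassifier()
clf.train(["Nguyen Van Hung", "Tran Van Nam", "Le Duc Minh"], "male")
clf.train(["Nguyen Thi Lan", "Tran Thi Hoa", "Le Thi Mai"], "female")
print(clf.predict("Pham Thi Thu"))  # female
```

The predicted label would then select between last_man_name and last_women_name for the character conversion.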
For example, take line_vi: [image BDA0003788901520000071] ban Kinh [image BDA0003788901520000072] Thanh
and line_zh: economic Committee Vu Hong Thanh.
(1) Find "Vu Hong Thanh" using the Find_name method;
(2) Look up the Vietnamese words corresponding to Vu, Hong and Thanh using last_man_name and last_women_name: [image BDA0003788901520000073] [image BDA0003788901520000074];
(3) Using the Vietnamese words obtained in (2), search for the matching words in line_vi: [image BDA0003788901520000075] [image BDA0003788901520000076]; the name [image BDA0003788901520000077] can then be found;
(4) Look up [image BDA0003788901520000078] with fixed_name. If it is a name with a fixed translation, return the search result; if not, classify the gender of the original-text name obtained in (3) using gender-classification;
(5) According to the classification result of (4), use f_name to obtain the Chinese surname "Wu" for [image BDA0003788901520000079], then look up [image BDA00037889015200000710] and Thanh in last_man_name, which correspond to "Hong" and "Qing" respectively;
(6) Replace "Vu Hong Thanh" from (1) with the Chinese name from (5) to obtain the final result, "economic Committee Wu Hong Qing", and return it.
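Steps (5) and (6) of the example can be sketched as a table lookup followed by a string replacement. Everything concrete here is assumed for illustration: the tables are keyed by English-letter form rather than by the Vietnamese spelling, the Chinese characters chosen for "Wu", "Hong" and "Qing" are guesses (the source gives only romanized forms), and the function names are hypothetical.

```python
from typing import Optional

# Hypothetical lookup tables; the characters are illustrative choices.
LAST_MAN_NAME = {"Vu": "武", "Hong": "鸿", "Thanh": "清"}
LAST_WOMEN_NAME = {"Vu": "武", "Hong": "红", "Thanh": "青"}


def convert_name(name: str, gender: str) -> Optional[str]:
    """Convert every word of the name, or return None if any word is unknown."""
    table = LAST_MAN_NAME if gender == "male" else LAST_WOMEN_NAME
    chars = [table.get(token) for token in name.split()]
    if any(ch is None for ch in chars):
        # Not all words could be converted, so the process ends without a result.
        return None
    return "".join(chars)


def replace_name(line_zh: str, name: str, gender: str) -> str:
    """Replace the untranslated name in the Chinese line when conversion succeeds."""
    converted = convert_name(name, gender)
    return line_zh if converted is None else line_zh.replace(name, converted)


print(replace_name("经济委员会Vu Hong Thanh", "Vu Hong Thanh", "male"))
# 经济委员会武鸿清
```

Returning the line unchanged when any word is missing mirrors the rule that converted characters are returned only if all Vietnamese words are converted.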
As shown in Fig. 2, a person name translation system includes:
a first acquisition module, configured to obtain the English letters corresponding to the Vietnamese letters of Vietnamese surnames and convert the Vietnamese letters of Vietnamese name words into English-letter form;
a conversion module, configured to convert the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form;
a training module, configured to train a gender recognition classifier with the collected male name comparison data and female name comparison data;
and a calculation module, configured to find, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text.
In a preferred embodiment of any of the foregoing solutions, the name translation system further includes:
a collection module, configured to collect Vietnamese surname data;
and a second acquisition module, configured to obtain the Vietnamese letters of the Vietnamese surnames from the Vietnamese surname data.
In a preferred embodiment of any of the foregoing solutions, the name translation system further includes:
a classification module, configured to classify the queried Vietnamese personal name with the gender recognition classifier, convert the name with fixed_name, and convert the name with last_man_name or last_women_name; if all Vietnamese words are converted into Chinese characters, the converted Chinese characters are returned, and otherwise the process ends.
Fig. 3 is a schematic structural diagram of a name translation device according to an embodiment of the present invention. FIG. 3 illustrates a block diagram of an exemplary name translation device suitable for use in implementing embodiments of the present invention. The person name translation apparatus shown in fig. 3 is only an example, and should not impose any limitation on the function and scope of use of the embodiment of the present invention.
As shown in fig. 3, the name translation device is in the form of a general purpose computing device. Components of the name translation device may include, but are not limited to: one or more processors or processing units, memory, and a bus connecting the various system components (including the memory and processing units).
A bus represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The name translation device typically includes a variety of computer system readable media. Such media may be any available media that is accessible by the name translation device and includes both volatile and nonvolatile media, removable and non-removable media.
The memory may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) and/or cache memory. The name translation device may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, the storage system may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in Fig. 3, often referred to as a "hard drive"). Although not shown in Fig. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus by one or more data media interfaces. The memory may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility having a set (at least one) of program modules may be stored, for example, in the memory, such program modules including but not limited to an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination may comprise an implementation of a network environment. The program modules generally perform the functions and/or methodologies of the described embodiments of the invention.
The name translation device may also communicate with one or more external devices (e.g., keyboard, pointing device, display, etc.), one or more devices that enable a user to interact with the name translation device, and/or any devices (e.g., network card, modem, etc.) that enable the name translation device to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface. Also, the name translation device may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via a network adapter. As shown, the network adapter communicates with the other modules of the name translation device via a bus. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the name translation device, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, to name a few.
The processing unit executes programs stored in the memory to perform various functional applications and data processing, such as implementing the name translation method provided by any embodiment of the present invention, namely: obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames, and converting the Vietnamese letters of Vietnamese name words into English-letter form; converting the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form; training a gender recognition classifier with the collected male name comparison data and female name comparison data; and finding, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text.
An embodiment of the present invention further provides a computer-readable storage medium in which a program is stored; when the program is executed by a processor, the name translation method according to any embodiment of the present invention is implemented, the method including:
obtaining the English letters corresponding to the Vietnamese letters of Vietnamese surnames, and converting the Vietnamese letters of Vietnamese name words into English-letter form;
converting the collected Vietnamese surnames and given names, according to their corresponding English letters, into male name comparison data and female name comparison data in English-letter form;
training a gender recognition classifier with the collected male name comparison data and female name comparison data;
finding, by means of a regular expression, names of more than two consecutive words beginning with a capital letter in the Chinese text.
Computer storage media for embodiments of the present invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of name translation, the method comprising the steps of:
step 1: acquiring corresponding English letters according to Vietnam letters of Vietnam surnames, and converting the Vietnam letters of Vietnam name words into English letters;
and 2, step: converting the Vietnamese surnames and the Vietnamese names into male name comparison data in the form of English letters and female name comparison data in the form of English letters according to the collected English letters corresponding to the Vietnamese surnames and the Vietnamese human names;
and 3, step 3: training a gender identification classifier by using the collected male name comparison data and female name comparison data;
and 4, step 4: and finding names of more than two continuous words beginning with capital letters in Chinese through the regular expression.
2. The person name translation method according to claim 1, wherein before the acquiring of the English letters corresponding to the Vietnamese letters of Vietnamese surnames and the converting of the Vietnamese letters of Vietnamese name words into English letters, the method further comprises:
step 5: collecting Vietnamese surname data;
step 6: acquiring the Vietnamese letters of the Vietnamese surnames according to the Vietnamese surname data.
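As a hedged illustration of steps 5 and 6, the sketch below derives a table of Vietnamese letters and their English counterparts from a list of collected surnames. The function name `build_letter_map` and the sample surnames are assumptions made for illustration only; the patent does not specify how the letter table is stored or built.

```python
import unicodedata


def build_letter_map(surnames):
    """Collect every non-English letter in the surnames with its English base letter."""
    letter_map = {}
    for surname in surnames:
        # Normalize to NFC so that each accented letter is a single character.
        for ch in unicodedata.normalize("NFC", surname):
            if ch.isascii():
                continue  # already an English letter
            if ch in ("đ", "Đ"):
                # đ/Đ do not decompose, so they are mapped by hand.
                base = "d" if ch == "đ" else "D"
            else:
                decomposed = unicodedata.normalize("NFD", ch)
                base = "".join(c for c in decomposed if not unicodedata.combining(c))
            letter_map[ch] = base
    return letter_map
```

Calling `build_letter_map(["Nguyễn", "Đặng", "Lê"])` produces a table with four entries, mapping ễ and ê to `e`, ặ to `a`, and Đ to `D`.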
3. The person name translation method according to claim 2, characterized in that the collecting of Vietnamese surname data further comprises:
step 51: collecting the names that have fixed translations among the Vietnamese names.
4. The person name translation method according to claim 3, characterized in that the converting of the Vietnamese surnames and Vietnamese personal names into male name comparison data in English-letter form and female name comparison data in English-letter form further comprises:
looking up the name found in step 2 in fixed_name to determine whether it has a fixed translation, and if so, returning the lookup result and ending;
checking whether the name found in step 2 exists in Vietnamese, and if not, ending.
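The two checks of claim 4 can be sketched as follows. Only the identifier `fixed_name` comes from the claim; the table contents, the `vietnamese_surnames` set, the function names, and the rule that a name "exists in Vietnamese" when its first word is a known surname are all assumptions made for this illustration.

```python
from typing import Optional, Tuple

# Hypothetical sample data; real tables would come from the collected corpora.
fixed_name = {"Ho Chi Minh": "胡志明"}
vietnamese_surnames = {"Nguyen", "Tran", "Le", "Pham", "Ho"}


def looks_vietnamese(name: str) -> bool:
    """Assumed test: the first word (the family name) is a known surname."""
    words = name.split()
    return bool(words) and words[0] in vietnamese_surnames


def resolve(name: str) -> Tuple[str, Optional[str]]:
    """Return a (status, translation) pair for a candidate name."""
    fixed = fixed_name.get(name)
    if fixed is not None:
        return ("fixed", fixed)  # fixed translation found: return it and end
    if not looks_vietnamese(name):
        return ("not_vietnamese", None)  # not a Vietnamese name: end
    return ("needs_conversion", None)  # continue with word-by-word conversion
```

With the sample data, `resolve("Ho Chi Minh")` returns the fixed translation, `resolve("John Smith")` is rejected as not Vietnamese, and `resolve("Nguyen Van A")` is passed on for further conversion.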
5. The person name translation method according to claim 4, characterized in that the training of a gender identification classifier using the collected male name comparison data and female name comparison data comprises:
classifying the Vietnamese personal name checked in step 2 using the gender identification classifier, converting the name using fixed_name, and performing name conversion with last_man_name or last_name; if all Vietnamese words are converted into Chinese characters, returning the converted Chinese characters; otherwise, ending.
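A minimal sketch of the classification and conversion in claim 5 is given below. The claim does not name the classifier model, so a small naive Bayes model over name words is assumed here. The roles of `last_man_name` (assumed to serve male names) and `last_name` (assumed to serve all other names) are likewise guesses, and the table entries are illustrative samples only.

```python
import math
from collections import Counter
from typing import List, Optional


class GenderClassifier:
    """Tiny naive Bayes classifier over name words (an assumed model choice)."""

    def __init__(self) -> None:
        self.word_counts = {"male": Counter(), "female": Counter()}
        self.name_counts = Counter()

    def train(self, names: List[str], label: str) -> None:
        for name in names:
            self.name_counts[label] += 1
            # Skip the first word (the surname); it carries no gender signal.
            self.word_counts[label].update(name.split()[1:])

    def predict(self, name: str) -> str:
        vocab = set(self.word_counts["male"]) | set(self.word_counts["female"])
        total_names = sum(self.name_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total = sum(counts.values())
            # Log prior and log likelihoods with add-one smoothing.
            score = math.log((self.name_counts[label] + 1) / (total_names + 2))
            for word in name.split()[1:]:
                score += math.log((counts[word] + 1) / (total + len(vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical word tables; only the identifiers appear in the claim.
last_man_name = {"Nguyen": "阮", "Van": "文", "Hung": "雄"}
last_name = {"Nguyen": "阮", "Thi": "氏", "Lan": "兰"}


def convert(name: str, gender: str) -> Optional[str]:
    """Convert word by word; return None unless every word is converted."""
    table = last_man_name if gender == "male" else last_name
    chars = [table.get(word) for word in name.split()]
    if any(ch is None for ch in chars):
        return None
    return "".join(chars)
```

After training on a few male and female names, `convert(name, clf.predict(name))` returns the Chinese characters only when every word of the name is found in the table selected for the predicted gender, which mirrors the "otherwise, ending" branch of the claim.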
6. A person name translation system, characterized by comprising:
a first acquisition module for acquiring the English letters corresponding to the Vietnamese letters of Vietnamese surnames and converting the Vietnamese letters of Vietnamese name words into English letters;
a conversion module for converting the Vietnamese surnames and Vietnamese personal names into male name comparison data in English-letter form and female name comparison data in English-letter form, according to the collected English letters corresponding to the Vietnamese surnames and Vietnamese personal names;
a training module for training a gender identification classifier using the collected male name comparison data and female name comparison data;
and a calculation module for finding, by means of a regular expression, names consisting of more than two consecutive words beginning with a capital letter in the Chinese text.
7. The person name translation system according to claim 6, characterized by further comprising:
a collection module for collecting Vietnamese surname data;
and a second acquisition module for acquiring the Vietnamese letters of the Vietnamese surnames according to the Vietnamese surname data.
8. The person name translation system according to claim 7, characterized in that the training module comprises:
a classification module for classifying the looked-up Vietnamese personal names using the gender identification classifier, converting the names using fixed_name, and performing name conversion with last_man_name or last_name; if all Vietnamese words are converted into Chinese characters, returning the converted Chinese characters; otherwise, ending.
9. A person name translation apparatus, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the person name translation method as claimed in any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a program which, when executed by a processor, implements the person name translation method according to any one of claims 1 to 7.
CN202210949527.3A 2022-08-09 2022-08-09 Name translation method, system, equipment and computer readable storage medium Pending CN115310458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210949527.3A CN115310458A (en) 2022-08-09 2022-08-09 Name translation method, system, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115310458A (en) 2022-11-08

Family

ID=83860333

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination