pinyin4j's Introduction

pinyin4j

A copy of http://sourceforge.net/projects/pinyin4j, deployed to the Maven Central repository.

Download

Download the jar via Maven:

<dependency>
    <groupId>com.belerweb</groupId>
    <artifactId>pinyin4j</artifactId>
    <version>2.5.0</version>
</dependency>

QQ Group

QQ group: 479540440

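Polyphone recognition

Polyphone recognition has been added on top of pinyin4j, with a built-in dictionary of nearly ten thousand polyphonic words. That is still far from enough, so users can configure an external dictionary.

External polyphone dictionary

Entries in a user-configured external dictionary override the built-in readings of the same words, so the dictionary can also be used to correct errors.

Configuration is simple: just set the path.

MultiPinyinConfig.multiPinyinPath="/Users/yiboliu/my_multi_pinyin.txt"

The format is the same as the built-in polyphone dictionary, for example: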

吸血鬼日记 (xi1,xue4,gui3,ri4,ji4)
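
To make the entry format concrete, here is a minimal sketch of how one dictionary line could be split into a word and its syllables. It is not the library's own parser: the class name `MultiPinyinLine` and the regular expression are assumptions for illustration, inferred only from the sample line above (a word, whitespace, then comma-separated syllables in parentheses).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of pinyin4j.
class MultiPinyinLine {
    // Assumed layout: word, whitespace, "(syllable,syllable,...)".
    private static final Pattern LINE =
            Pattern.compile("^(\\S+)\\s+\\(([^)]*)\\)$");

    /** Parses one entry; returns null if the line does not match the assumed layout. */
    static Map.Entry<String, List<String>> parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        // Split the parenthesised part on commas, tolerating spaces around them.
        List<String> syllables = Arrays.asList(m.group(2).trim().split("\\s*,\\s*"));
        return new SimpleEntry<>(m.group(1), syllables);
    }
}
```

Calling `MultiPinyinLine.parse("吸血鬼日记 (xi1,xue4,gui3,ri4,ji4)")` yields the key `吸血鬼日记` with five syllables, one per character, which is a quick sanity check to run on a custom dictionary file before pointing `MultiPinyinConfig.multiPinyinPath` at it.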

pinyin4j's People

Contributors

belerweb, obiteaaron

pinyin4j's Issues

Android Studio 3.0 duplicate entry

Error:Execution failed for task ':app:transformClassesWithStackFramesFixerForMockDebug'.

com.android.build.api.transform.TransformException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: com.android.builder.utils.FileCache$FileCreatorException: java.util.zip.ZipException: duplicate entry: META-INF/maven/com.belerweb/pinyin4j/pom.xml

P.S. packagingOptions does not help.

A small problem with the return value

Code:
private static final HanyuPinyinOutputFormat HANYU_PINYIN_OUTPUT_FORMAT = new HanyuPinyinOutputFormat();

static {
    HANYU_PINYIN_OUTPUT_FORMAT.setVCharType(HanyuPinyinVCharType.WITH_U_UNICODE);
    HANYU_PINYIN_OUTPUT_FORMAT.setCaseType(HanyuPinyinCaseType.LOWERCASE);
    HANYU_PINYIN_OUTPUT_FORMAT.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
}

public static String getPinYinChar(String str) {
    try {
        return PinyinHelper.toHanYuPinyinString(str, HANYU_PINYIN_OUTPUT_FORMAT, " ", false);
    } catch (BadHanyuPinyinOutputFormatCombination badHanyuPinyinOutputFormatCombination) {
        throw new GlobalSystemException(badHanyuPinyinOutputFormatCombination.getMessage(), badHanyuPinyinOutputFormatCombination);
    }
}

Test:
@Test
public void getPinYinChar() {
    String s = PinYinUtil.getPinYinChar("大鱼吃小鱼,杨幂是个女的");
    System.out.println(s);
}

Output:
da yu chi xiao yu yang mi shi ge nüde

Bug description:
I pass a single space as the separator, but the pinyin of the final [女的] in the return value is not joined with the separator.
Thanks for open-sourcing this.

Output pinyin is missing a separator

HanyuPinyinOutputFormat outputFormat = new HanyuPinyinOutputFormat();
outputFormat.setToneType(HanyuPinyinToneType.WITH_TONE_NUMBER);
outputFormat.setCaseType(HanyuPinyinCaseType.LOWERCASE);
outputFormat.setVCharType(HanyuPinyinVCharType.WITH_V);
String pinyin = PinyinHelper.toHanYuPinyinString("一日千里", outputFormat, " ", true);
System.out.println(pinyin);

Output: yi1 ri4 qian1li3
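
In both separator reports above, the separator is missing only before the final syllable. One possible workaround, sketched here under assumptions, is to convert character by character and join the syllables yourself. The `lookup` function is a hypothetical stand-in for whatever per-character conversion you use (for example the first element returned by `PinyinHelper.toHanyuPinyinStringArray`); the class name `SeparatedPinyin` is also made up. Note that a per-character approach gives up the word-level polyphone recognition this fork adds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of pinyin4j.
class SeparatedPinyin {
    /**
     * Joins the syllable of each character with the given separator.
     * Characters for which lookup returns null or "" are dropped.
     */
    static String join(String text, Function<Character, String> lookup, String separator) {
        List<String> parts = new ArrayList<>();
        for (char c : text.toCharArray()) {
            String syllable = lookup.apply(c);
            if (syllable != null && !syllable.isEmpty()) {
                parts.add(syllable);
            }
        }
        // String.join places the separator between every pair of syllables.
        return String.join(separator, parts);
    }
}
```

With a lookup that maps 一, 日, 千, 里 to `yi1`, `ri4`, `qian1`, `li3`, `SeparatedPinyin.join("一日千里", lookup, " ")` returns `yi1 ri4 qian1 li3`.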

Converting 张 to pinyin: am I using it wrong?

    @Test
    void test() {
        HanyuPinyinOutputFormat fmt = new HanyuPinyinOutputFormat();
        fmt.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
//        fmt.setCaseType(HanyuPinyinCaseType.LOWERCASE);
        String[] pinyin_z = PinyinHelper.toHanyuPinyinStringArray('张');
        String[] pinyin_l = PinyinHelper.toHanyuPinyinStringArray('李');
        String[] z = PinyinHelper.toTongyongPinyinStringArray('张');
        String[] l = PinyinHelper.toTongyongPinyinStringArray('李');
        System.out.println(Stream.of(z).collect(Collectors.toList()).toString());
        System.out.println(Stream.of(l).collect(Collectors.toList()).toString());
        System.out.println(Stream.of(pinyin_z).collect(Collectors.toList()).toString());
        System.out.println(Stream.of(pinyin_l).collect(Collectors.toList()).toString());

    }

output:
[jhang1] [li3] [zhang1] [li3]

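The traditional character “葉” is converted incorrectly

I called PinyinHelper.toWadeGilesPinyinStringArray('葉');
The result is [hsieh2, yeh4, she4]. The first entry is the reading of 謝, and nobody would pronounce 謝 with the rising second tone anyway; that would turn it into 爺, which is not 葉 either. Please remove [hsieh2]. Thanks.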

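Polyphone dictionary configuration

The polyphone dictionary configuration described on the project home page applies to versions after 2.5.0.
I could only find the MultiPinyinConfig class starting with version 2.5.1.
I suggest updating the README on the home page so that others are not misled in the same way.
It would also be good if the external dictionary could be managed through Spring Boot configuration.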

Why do three identical pinyin appear when converting a single character?

    /**
     * Convert Chinese to pinyin
     */
    @Test
    void test() throws BadHanyuPinyinOutputFormatCombination {
        HanyuPinyinOutputFormat format = new HanyuPinyinOutputFormat();
        format.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
//        format.setCaseType(HanyuPinyinCaseType.UPPERCASE);
//        format.setVCharType(HanyuPinyinVCharType.WITH_V);
        String[] strings = PinyinHelper.toHanyuPinyinStringArray('华', format);
        System.out.println(String.join("", strings));
    }

output:

huahuahua
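
A likely explanation, offered as an assumption rather than confirmed library behavior: the array holds one entry per reading of the character, and 华 has several readings that differ only in tone, so with WITHOUT_TONE they all collapse to the same string. If only distinct spellings are wanted, the array can be de-duplicated on the caller's side. The class name `DistinctReadings` below is hypothetical, and the input array stands in for the value returned by `PinyinHelper.toHanyuPinyinStringArray`.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of pinyin4j.
class DistinctReadings {
    /** Removes duplicate readings while keeping first-seen order. */
    static String[] distinct(String[] readings) {
        if (readings == null) {
            // Treat "no readings" as an empty result instead of failing.
            return new String[0];
        }
        // LinkedHashSet drops duplicates and preserves insertion order.
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(readings));
        return unique.toArray(new String[0]);
    }
}
```

Passing `{"hua", "hua", "hua"}` returns a one-element array containing `hua`, so `String.join("", ...)` on the result prints the spelling once.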

Packaging fails!

I want to use pinyin4j in my Solr 6.1.0 environment. My environment is as follows:
pinyin4j]# mvn --version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: /usr/local/maven3.3.9
Java version: 1.8.0_92, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_92/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-642.el6.x86_64", arch: "amd64", family: "unix"

The build fails with the following:

T E S T S

Running net.sourceforge.pinyin4j.test.PinyinHelperTest
Tests run: 7, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 0.338 sec <<< FAILURE!
testToGwoyeuRomatzyhStringArray(net.sourceforge.pinyin4j.test.PinyinHelperTest) Time elapsed: 0.231 sec <<< FAILURE!
junit.framework.AssertionFailedError: Expected: <null> but was: [Ljava.lang.String;@7a4f0f29
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.assertTrue(Assert.java:22)
at junit.framework.Assert.assertNull(Assert.java:277)
at junit.framework.Assert.assertNull(Assert.java:268)
at junit.framework.TestCase.assertNull(TestCase.java:438)
at net.sourceforge.pinyin4j.test.PinyinHelperTest.testToGwoyeuRomatzyhStringArray(PinyinHelperTest.java:279)
at sun.reflect.NativeMethodAccessorImpl.invoke0
......... (middle part omitted)
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project pinyin4j: There are test failures.
[ERROR]
[ERROR] Please refer to /data/solr-6.1.0/pinyin4j/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Is this because pinyin4j does not currently support Java 8?

Error on Korean input

2020-06-30 16:08:42.521 15008-15008/cn.flyaudio.onlinemusic E/AndroidRuntime: FATAL EXCEPTION: main
Process: cn.flyaudio.onlinemusic, PID: 15008
java.lang.ArrayIndexOutOfBoundsException: length=0; index=0
    at cn.flyaudio.onlinemusic.utils.PinyinUtils.getFirstSpell(PinyinUtils.java:57)
    at cn.flyaudio.onlinemusic.music.hotsinger.HotSingerDetailView$attViewToRoot$1.onChanged(HotSingerDetailView.kt:93)
    at cn.flyaudio.onlinemusic.music.hotsinger.HotSingerDetailView$attViewToRoot$1.onChanged(HotSingerDetailView.kt:27)
    at androidx.lifecycle.LiveData.considerNotify(LiveData.java:131)
    at androidx.lifecycle.LiveData.dispatchingValue(LiveData.java:149)
    at androidx.lifecycle.LiveData.setValue(LiveData.java:307)
    at androidx.lifecycle.MutableLiveData.setValue(MutableLiveData.java:50)
    at androidx.lifecycle.LiveData$1.run(LiveData.java:91)
    at android.os.Handler.handleCallback(Handler.java:873)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loop(Looper.java:201)
    at android.app.ActivityThread.main(ActivityThread.java:6810)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:547)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:873)

/**
 * Converts Chinese characters to the first letter of their Hanyu Pinyin; ASCII characters are kept as-is.
 * @param chinese
 * @return
 */
public static String getFirstSpell(String chinese) {
    StringBuffer pybf = new StringBuffer();
    char[] arr = chinese.toCharArray();
    HanyuPinyinOutputFormat defaultFormat = new HanyuPinyinOutputFormat();
    defaultFormat.setCaseType(HanyuPinyinCaseType.LOWERCASE);
    defaultFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
    for (char curchar : arr) {
        if (curchar > 128) {
            try {
                String[] temp = PinyinHelper.toHanyuPinyinStringArray(curchar, defaultFormat);
                if (temp != null) {
                    pybf.append(temp[0].charAt(0)); // TODO: this is the line that fails
                }
            } catch (BadHanyuPinyinOutputFormatCombination e) {
                e.printStackTrace();
            }
        } else {
            pybf.append(curchar);
        }
    }
    return pybf.toString().replaceAll("\\W", "").trim();
}
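
The stack trace reports `length=0; index=0`, which means the returned array was empty rather than null, so the `temp != null` check alone does not protect `temp[0]`. A minimal sketch of a guarded variant follows. To keep it self-contained, the pinyin call is abstracted behind a `lookup` function, a hypothetical stand-in for `PinyinHelper.toHanyuPinyinStringArray`; the class name `FirstSpell` is also an assumption.

```java
import java.util.function.Function;

// Hypothetical helper, not part of pinyin4j.
class FirstSpell {
    /**
     * Builds a string of first letters. Non-ASCII characters with no
     * usable reading (null, empty array, or empty string) are skipped.
     */
    static String firstLetters(String text, Function<Character, String[]> lookup) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 128) {
                String[] readings = lookup.apply(c);
                // Check the length and the first element as well as null.
                if (readings != null && readings.length > 0
                        && readings[0] != null && !readings[0].isEmpty()) {
                    sb.append(readings[0].charAt(0));
                }
            } else {
                sb.append(c);
            }
        }
        // Same post-processing as the reported code: drop non-word characters.
        return sb.toString().replaceAll("\\W", "").trim();
    }
}
```

Skipping characters without a reading is one design choice; appending the original character instead would keep Korean or other non-Chinese text visible in the result.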

Build fails

In our util library I use compile ('com.belerweb:pinyin4j:2.5.0') and added packagingOptions { exclude 'META-INF/maven/com.belerweb/pinyin4j/pom.xml' exclude 'META-INF/maven/com.belerweb/pinyin4j/pom.properties' }. Building a project that depends on the util library fails with:

Execution failed for task ':app:transformClassesWithStackFramesFixerForProductDebug'.

com.android.build.api.transform.TransformException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: com.android.builder.utils.FileCache$FileCreatorException: java.util.zip.ZipException: duplicate entry: META-INF/maven/com.belerweb/pinyin4j/pom.xml

Adding the same exclude rules to the project that depends on the util library does not help either. What should I do?
