Search Results
libcharguess-src-1.0b.tar
- A program that determines which character set a string belongs to, e.g. whether it is UTF-8 or GB2312.
segment
- segment, a simple Chinese word segmentation program. Command line: java -jar segmenter.jar [-b|-g|-8|-s|-t] inputfile.txt, where -b is Big5, -g is GB2312, -8 is UTF-8, -s is simplified characters, and -t is traditional characters. The segmented text is saved to inputfile.txt.seg.
ntf_code
- C code for converting between UTF-8 and Unicode (UTF-8 <-> Unicode converter).
GB2312_TO_UTF8
- Converts GB2312-encoded characters to UTF-8-encoded characters.
convertz802
- Converts text files between UTF-8 and GBK in both directions. Also supports Traditional Chinese conversion. Used in many website development projects.
utf8
- VB6 source code that implements UTF-8 encoding and decoding. The most common use of UTF-8 here is URL-encoding search-engine query links.
ChCodeSet
- Prints the full character sets of the Chinese character encodings, including GBK, Big5, UTF, and Unicode.
CodeConverter
- Given the path to a text file, converts the file's character encoding, including conversions among GBK, Unicode, and UTF-8.
UTF-8andGB2312
- A method for converting web page encodings. Very practical; I hope we can all learn from it together.
ICTCLAS2012
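- The latest version of ICTCLAS, 2012. 1. Added some CICTCLAS functions. 2. Fixed some bugs in computing position offsets under UTF-8. 3. ICTCLAS is fully compatible with the open-source search engine Sphinx; see the official Sphinx website for details. 4. For user convenience, starting with this version the DLL is always named ICTCLAS2011.dll and will no longer change. Most users only need to replace the DLL and the corresponding .user license file to pick up the new version of the segmenter, without recompiling their own programs.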
UTF-8toANSI
- Converts UTF-8 to ANSI. I needed to convert UTF-8 to ANSI and, after a lot of research, found nothing conclusive, so I worked it out from the RFC. I am posting the code here in the hope that it will one day be useful to someone. The source code is free.
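
Several entries in this list deal with transcoding between GB2312/GBK, Big5, and UTF-8, or with guessing which encoding a byte string uses. As a rough illustration of both ideas (not the code of any package listed here), the Python sketch below relies only on the standard-library codecs. The function names transcode and guess_encoding and the order of the candidate encodings are assumptions made for this example, as is treating "ANSI" as Windows code page 1252.

```python
def transcode(data: bytes, src: str, dst: str, errors: str = "strict") -> bytes:
    """Decode bytes from the src encoding and re-encode them in dst.

    `errors` applies to the encoding step only, so characters that dst
    cannot represent may be replaced instead of raising an exception.
    """
    return data.decode(src).encode(dst, errors)


def guess_encoding(data: bytes, candidates=("utf-8", "gb2312", "big5")):
    """Return the first candidate that decodes data without error, else None.

    This is a crude heuristic: a successful decode does not prove the
    guess is right, and the candidate order decides ambiguous cases.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    gb_bytes = "中文".encode("gb2312")
    utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
    print(guess_encoding(gb_bytes), guess_encoding(utf8_bytes))
    # Hypothetical "UTF-8 to ANSI" step, assuming ANSI means cp1252.
    print(transcode("café".encode("utf-8"), "utf-8", "cp1252", errors="replace"))
```

Trying UTF-8 first is a deliberate choice in this sketch: UTF-8 has strict byte-sequence rules, so text in a legacy double-byte encoding rarely passes as valid UTF-8, whereas the reverse mistake is more likely.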