Converting between Chinese text, Unicode, UTF-8, and GB2312 in Perl

2023-12-15 19:41 • python • 1069 views

Reference: http://daimajishu.iteye.com/blog/959239

According to my testing, however, it also contains errors.

The original text is as follows:

# author: jiangyujie
use utf8; ## for the last example, `use utf8;` must not be present here
use Encode;
use URI::Escape;
$\ = "\n";

# Get the UTF-8 encoding from a Unicode code point
$str = '%u6536';
$str =~ s/\%u([0-9a-fA-F]{4})/pack("U",hex($1))/eg;
$str = encode( "utf8", $str );
print uc unpack( "H*", $str );

# Get the GB2312 encoding from a Unicode code point
$str = '%u6536';
$str =~ s/\%u([0-9a-fA-F]{4})/pack("U",hex($1))/eg;
$str = encode( "gb2312", $str );
print uc unpack( "H*", $str );

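# Get the percent-encoded UTF-8 form from Chinese text
$str = "收";
# Under `use utf8` the literal is a character string, and uri_escape() croaks on
# characters above 255, so use uri_escape_utf8() here.
print uri_escape_utf8($str);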

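# Get Chinese text from the percent-encoded UTF-8 form
$utf8_str = uri_escape_utf8("收");
print uri_unescape($utf8_str); # unescape the escaped string, not $str; the result is UTF-8 bytes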

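# Get Perl's internal Unicode code points from Chinese text
$str = "收";
utf8::decode($str); # needed only for a byte string; under `use utf8` the literal is already characters and is left unchanged
@chars = split //, $str;
foreach (@chars) {
    printf "%x ", ord($_);
}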

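# Get standard \u escapes from Chinese text
$text = "汉语"; # already a character string under `use utf8`; calling decode( "utf8", ... ) on it would die on wide characters
print join "", map { sprintf "\\u%04x", $_ } unpack( "U*", $text );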

# Get Chinese text from a standard Unicode escape
$str = '%u6536';
$str =~ s/\%u([0-9a-fA-F]{4})/pack("U",hex($1))/eg;
$str = encode( "utf8", $str );
print $str;

# Get Chinese text from Perl Unicode escapes
my $unicode = "\x{505c}\x{8f66}";
print encode( "utf8", $unicode ); ## According to my testing, there is an error here! It should be written as: utf8::encode($unicode); print $unicode;
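
The two forms in the last example can be compared directly. The following is a minimal sketch, not part of the original listing, and it assumes `Encode` is loaded with `use Encode`. It checks whether `encode("utf8", ...)` and `utf8::encode(...)` give the same bytes for this string, and also round-trips the same characters through GB2312. The variable names `$via_encode`, `$via_builtin`, `$gb_bytes`, and `$back` are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Character string given as code points, as in the listing above.
my $unicode = "\x{505c}\x{8f66}";

# Form 1: Encode::encode returns a new byte string.
my $via_encode = encode("utf8", $unicode);

# Form 2: utf8::encode converts its argument in place, so work on a copy.
my $via_builtin = $unicode;
utf8::encode($via_builtin);

print "same bytes\n" if $via_encode eq $via_builtin;
print uc(unpack("H*", $via_encode)), "\n";    # E5819CE8BDA6

# Round trip through GB2312: characters -> GB2312 bytes -> characters.
my $gb_bytes = encode("gb2312", $unicode);
my $back     = decode("gb2312", $gb_bytes);
print "GB2312 round trip ok\n" if $back eq $unicode;
```

If both forms print the same bytes on your system, an error seen with `encode` may have another cause, for example a script that does not load `Encode`.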

====================== My tests follow

1) Encoding Chinese text

[root@tts177:/tmp]$more uuu.pl
#!/usr/bin/perl
use warnings;
use Data::Dumper;
use URI::Escape;
$utf8_str = uri_escape("收");
print $utf8_str;
[root@tts177:/tmp]$
[root@tts177:/tmp]$./uuu.pl
%E6%94%B6[root@tts177:/tmp]$
[root@tts177:/tmp]$

2) Decoding a URL

[root@tts177:/tmp]$more uuu.pl
#!/usr/bin/perl
use warnings;
use Data::Dumper;
use URI::Escape;
$str = "%E6%94%B6";
print uri_unescape($str);
[root@tts177:/tmp]$
[root@tts177:/tmp]$./uuu.pl
收[root@tts177:/tmp]$
[root@tts177:/tmp]$
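
The two tests above work on raw bytes, because the scripts do not use `use utf8`. For comparison, here is a minimal sketch of the same round trip with character strings. It assumes `use utf8` is in effect and that the `URI::Escape` module from CPAN is installed. The names `$text`, `$escaped`, `$bytes`, and `$chars` are made up for illustration.

```perl
use strict;
use warnings;
use utf8;                                  # the literal below is a character string
use Encode qw(decode);
use URI::Escape qw(uri_escape_utf8 uri_unescape);

my $text = "收";

# uri_escape_utf8 encodes the characters as UTF-8 before percent-escaping.
my $escaped = uri_escape_utf8($text);

# uri_unescape returns raw bytes; decode them to get characters back.
my $bytes = uri_unescape($escaped);
my $chars = decode("UTF-8", $bytes);

binmode STDOUT, ":encoding(UTF-8)";        # encode characters on output
print "$escaped\n";                        # %E6%94%B6
printf "U+%04X\n", ord $chars;             # U+6536
print "round trip ok\n" if $chars eq $text;
```

Keeping text as characters inside the program and converting only at the input and output boundaries avoids having to remember which variables hold bytes and which hold characters.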
