mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/enc
naruse c11e648799 Regexp supports Unicode 9.0.0's \X
* The metacharacter \X matches Unicode 9.0.0 characters, with some
  workarounds for the emoji ZWJ sequences of UTR #51 Unicode Emoji,
  Version 4.0.
  [Feature #12831] [ruby-core:77586]

The term "character" can mean many things: bytes, code points, combined
characters, and so on. "Grapheme cluster" is the highest-level of these
terms and means a user-perceived character.
Unicode Standard Annex #29, UNICODE TEXT SEGMENTATION, specifies how to
handle grapheme clusters (extended grapheme clusters).
However, some of the specifications have not kept up with the current
situation, because Unicode Emoji is being extended rapidly without being
well defined.
This breaks the precondition stated in UAX #29, "Grapheme cluster
boundaries can be easily tested by looking at immediately adjacent
characters" (the sentence will be removed in the next version).
Some of the details are described in Unicode Technical Report #51,
UNICODE EMOJI, but they have not been merged into UAX #29 yet.

http://unicode.org/reports/tr29/
http://unicode.org/reports/tr51/
http://unicode.org/Public/emoji/4.0/
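
The different meanings of "character" described above can be seen from
Ruby itself. The following is a minimal sketch, not part of the commit:
the sample strings and variable names are chosen here for illustration,
and the result for the emoji ZWJ sequence is left unstated because it
depends on the Unicode version that a given Ruby build implements.

```ruby
# "e" followed by U+0301 COMBINING ACUTE ACCENT is displayed as one
# user-perceived character, but it is two code points and three bytes.
s = "e\u0301"

byte_count      = s.bytesize         # UTF-8 bytes
codepoint_count = s.codepoints.size  # Unicode code points
clusters        = s.scan(/\X/)       # grapheme clusters matched by \X

puts byte_count       # 3
puts codepoint_count  # 2
puts clusters.size    # 1

# UAX #29 also treats CR LF as a single grapheme cluster.
crlf_clusters = "a\r\nb".scan(/\X/)
puts crlf_clusters.size  # 3

# An emoji ZWJ sequence: MAN, ZERO WIDTH JOINER, WOMAN, ZERO WIDTH JOINER, GIRL.
# The number of clusters reported here depends on the Ruby version.
family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"
family_clusters = family.scan(/\X/)
puts family_clusters.size
```

Whatever the version, joining the clusters returned by scan(/\X/) gives
back the original string, since \X only decides where the boundaries fall.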

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-30 17:29:19 +00:00
..
jis Makefile.in: suppress warnings 2014-05-22 15:09:11 +00:00
trans Update windows-1255 table 2016-10-28 15:14:32 +00:00
unicode Regexp supports Unicode 9.0.0's \X 2016-11-30 17:29:19 +00:00
ascii.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
big5.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
cp949.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
depend enc/depend: downcase 2016-10-28 14:25:57 +00:00
ebcdic.h enc/ebcdic.h, enc/trans/ebcdic.trans, 2015-12-15 08:57:58 +00:00
emacs_mule.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
encdb.c * enc/encdb.c: Include internal.h. 2014-11-18 15:24:41 +00:00
encinit.c.erb load.c: defer static linked init 2014-12-03 07:47:37 +00:00
euc_jp.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
euc_kr.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
euc_tw.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
gb2312.c
gb18030.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
gbk.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_2022_jp.h * enc/iso_2022_jp.h: fix typos. 2015-12-14 02:50:21 +00:00
iso_8859.h iso_8859.h: SHARP_s 2016-06-11 02:24:38 +00:00
iso_8859_1.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_2.c iso_8859_2.c: dedent [ci skip] 2016-07-30 10:32:06 +00:00
iso_8859_3.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_4.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_5.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_6.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_7.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_8.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_9.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_10.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_11.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_13.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_14.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_15.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
iso_8859_16.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
koi8_r.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
koi8_u.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
make_encmake.rb make_encmake.rb: expand srcdir 2015-09-05 15:32:20 +00:00
Makefile.in common.mk: directory timestamps 2016-07-15 21:26:02 +00:00
mktable.c
prelude.rb enc/prelude.rb: no encdb and transdb 2014-12-03 07:47:11 +00:00
shift_jis.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
unicode.c fix uppercasing for U+A64B, CYRILLIC SMALL LETTER MONOGRAPH UK 2016-11-30 08:25:46 +00:00
us_ascii.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
utf_7.h * enc/iso_2022_jp.h: fix typos. 2015-12-14 02:50:21 +00:00
utf_8.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
utf_16_32.h * enc/iso_2022_jp.h: fix typos. 2015-12-14 02:50:21 +00:00
utf_16be.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
utf_16le.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
utf_32be.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
utf_32le.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
windows_31j.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
windows_1250.c * enc/windows_1250.c, test/ruby/enc/test_case_comprehensive.rb: 2016-07-26 07:19:43 +00:00
windows_1251.c * enc/windows_1251.c, test/ruby/enc/test_case_comprehensive.rb: 2016-07-26 06:30:39 +00:00
windows_1252.c * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, 2016-07-24 07:33:15 +00:00
windows_1253.c * enc/windows_1253.c: Remove dead code found by Coverity Scan. 2016-07-27 01:33:01 +00:00
windows_1254.c * enc/windows_1254.c: Fix typo. Reported by k-takata at 2016-10-29 21:39:37 +00:00
windows_1257.c * enc/windows_1257.c, test/ruby/enc/test_case_comprehensive.rb: 2016-07-26 07:33:18 +00:00
x_emoji.h * enc/x_emoji.h: fix dead-link. 2015-12-27 11:00:36 +00:00