1
0
Fork 0
mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
Commit graph

7 commits

Author SHA1 Message Date
nobu
8083a359d0 enc-unicode.rb: uniname2ctype_offset
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-23 07:59:56 +00:00
nobu
12b8058661 update name2ctype.h
* enc/unicode/9.0.0/name2ctype.h: update due to merger of Onigmo
  6.0.0.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58064 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-23 07:53:35 +00:00
duerst
31fb4e3ec3 Reorder codepoints in some entries of CaseUnfold_11_Table
* enc/unicode/case-folding.rb: Reorder codepoints so that the upper-case
  mapping comes first.
* enc/unicode/9.0.0/casefold.h: Codepoints reordered, upper-case mapping
  flag added.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56975 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-04 01:17:34 +00:00
nobu
671c929f0a Use offsetof macro and shrink table size
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-01 00:34:42 +00:00
nobu
4f7c3d3583 constify CaseMappingSpecials
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56951 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-01 00:34:41 +00:00
naruse
c11e648799 Regexp supports Unicoe 9.0.0's \X
* meta character \X matches Unicode 9.0.0 characters with some workarounds
  for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences.
  [Feature #12831] [ruby-core:77586]

The term "character" can have many meanings bytes, codepoints, combined
characters, and so on. "grapheme cluster" is highest one of such words,
which means user-perceived characters.
Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to
handle grapheme clusters (extended grapheme cluster).
But some specs aren't updated to current situation because Unicode Emoji
is rapidly extended without well definition.
It breaks the precondition of UTR#29 "Grapheme cluster boundaries can be
easily tested by looking at immediately adjacent characters". (the
sentence will be removed in the next version)
Though some of its detail are described in Unicode Technical Report #51
UNICODE EMOJI but it is not merged into UTR#29 yet.

http://unicode.org/reports/tr29/
http://unicode.org/reports/tr51/
http://unicode.org/Public/emoji/4.0/

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-30 17:29:19 +00:00
duerst
d25e478e91 * common.mk: Updated Unicode version to 9.0.0 [Feature #12513]
* unicode/9.0.0/casefold.h, name2ctype.h, unicode/data/9.0.0:
  new directories/files for Unicode version 9.0.0


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-09-07 08:13:08 +00:00