mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
Commit graph

38 commits

Author SHA1 Message Date
nobu
7aaf5b2878 Embed the Emoji version
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66023 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-27 06:44:02 +00:00
nobu
34cc6fef83 Make some internal functions static
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-16 06:52:00 +00:00
nobu
04a353fe02 tool/enc-unicode.rb: rewrote without flip-flop
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-22 20:39:35 +00:00
nobu
36bc8c0b28 tool: removed unused variables
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63459 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-05-18 00:38:00 +00:00
nobu
a4804fbdf5 support gperf 3.1
* tool/gperf.sed: extracted sed commands to a script.  ANSI-C code
  produced by gperf 3.1 declares length arguments as `size_t`, which
  conflicts with existing declarations and needs casts for a local
  variable and return statements.
  [Feature ]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61076 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-08 05:51:19 +00:00
nobu
01830719f6 fix for emoji-data.txt
* common.mk: download emoji-data.txt.  As emoji data files are
  located in a separate directory on the Unicode.org site, rearranged
  the Unicode data file directories to match the site.

* tool/enc-unicode.rb (get_file): search emoji data files in the
  second argument path.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60977 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-02 03:12:51 +00:00
nobu
8b180dd74e enc-unicode.rb: for gperf 3.1
* tool/enc-unicode.rb: support for gperf 3.1, which defines length
  arguments as `size_t` but a local variable as `unsigned int`.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-02 03:12:50 +00:00
naruse
31796f17d3 Update to Onigmo 6.1.3-669ac9997619954c298da971fcfacccf36909d05.
[Bug ]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60966 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-01 13:50:13 +00:00
naruse
9cf7985893 Merge Onigmo 6.1.2
1364ae3488

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58768 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-17 05:38:37 +00:00
nobu
e1e5857c08 enc-unicode.rb: fix version matching
* tool/enc-unicode.rb (data_foreach): version comments do not
  include sub directory names.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58070 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-23 15:55:00 +00:00
nobu
81ab413288 fix GraphemeBreakProperty.txt
* tool/downloader.rb: download to the file given in ARGV.

* tool/enc-unicode.rb (parse_GraphemeBreakProperty): fix data file
  path as $(UNICODE_PROPERTY_FILES) in common.mk.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58069 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-23 15:49:10 +00:00
nobu
d77214e8a3 enc-unicode.rb: ifdef blocks
* tool/enc-unicode.rb (Unifdef#ifdef): enclose conditional blocks
  in blocks.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-23 07:59:57 +00:00
nobu
8083a359d0 enc-unicode.rb: uniname2ctype_offset
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-23 07:59:56 +00:00
naruse
2873edeafb Merge Onigmo 6.0.0
* https://github.com/k-takata/Onigmo/blob/Onigmo-6.0.0/HISTORY
* fix for ruby 2.4: https://github.com/k-takata/Onigmo/pull/78
* suppress warning: https://github.com/k-takata/Onigmo/pull/79
* include/ruby/oniguruma.h: include onigmo.h.
* template/encdb.h.tmpl: ignore duplicated definition of EUC-CN in
  enc/euc_kr.c. It is defined in enc/gb2312.c with CRuby macro.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-10 17:47:04 +00:00
nobu
671c929f0a Use offsetof macro and shrink table size
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-01 00:34:42 +00:00
naruse
c11e648799 Regexp supports Unicode 9.0.0's \X
* meta character \X matches Unicode 9.0.0 characters with some workarounds
  for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences.
  [Feature ] [ruby-core:77586]

The term "character" can have many meanings: bytes, codepoints, combined
characters, and so on. "Grapheme cluster" is the highest-level of these
terms; it means a user-perceived character.
Unicode Standard Annex #29, UNICODE TEXT SEGMENTATION, specifies how to
handle grapheme clusters (extended grapheme clusters).
But some specs have not been updated to the current situation, because
Unicode Emoji is being extended rapidly without a precise definition.
This breaks the precondition of UTR#29: "Grapheme cluster boundaries can be
easily tested by looking at immediately adjacent characters". (The
sentence will be removed in the next version.)
Some of the details are described in Unicode Technical Report #51,
UNICODE EMOJI, but they have not been merged into UTR#29 yet.

http://unicode.org/reports/tr29/
http://unicode.org/reports/tr51/
http://unicode.org/Public/emoji/4.0/

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-30 17:29:19 +00:00
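To illustrate the distinction between codepoints and grapheme clusters drawn in the commit message above, here is a minimal Ruby sketch. The sample string and variable names are assumptions made for illustration; they do not come from the commit.

```ruby
# "e" followed by U+0301 COMBINING ACUTE ACCENT is a single user-perceived
# character built from two codepoints; a plain "x" follows it.
text = "e\u0301x"

codepoints = text.scan(/./m)  # one element per codepoint
clusters   = text.scan(/\X/)  # one element per extended grapheme cluster

puts "codepoints: #{codepoints.length}, grapheme clusters: #{clusters.length}"
```

The example deliberately avoids emoji ZWJ sequences, since how those segment depends on the Unicode and emoji data a given Ruby build was generated from.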
nobu
2ae5e54e62 open Unicode data in binary mode
* tool/enc-unicode.rb (data_foreach): open in binary mode because
  Unicode 9.0.0 data files contain non-ASCII characters.

* template/unicode_norm_gen.tmpl: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55945 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-08-16 13:01:30 +00:00
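A small sketch of what opening a data file in binary mode means in Ruby. The temporary file and its contents are assumptions for illustration and are not the actual Unicode data; the point is only that lines read with mode "rb" are tagged ASCII-8BIT, so non-ASCII bytes are not interpreted in the default external encoding.

```ruby
require 'tempfile'

line = nil
Tempfile.create('ucd-sample') do |f|
  f.binmode
  # A comment line containing a non-ASCII character (UTF-8 bytes for "é").
  f.write("# caf\xC3\xA9\n".b)
  f.flush

  # Read it back in binary mode: no encoding interpretation is applied.
  line = File.open(f.path, 'rb') { |io| io.gets }
end

puts line.encoding   # the binary (ASCII-8BIT) encoding
puts line.bytesize   # byte count, including the two bytes of the accented letter
```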
nobu
e827c334c3 enc/unicode: check Unicode versions
* enc/unicode/case-folding.rb, tool/enc-unicode.rb: check if
  Unicode versions are consistent with each other.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55687 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-15 00:53:50 +00:00
nobu
0fd7666d57 enc-unicode.rb: check Unicode version
* tool/enc-unicode.rb (data_foreach): check Unicode version in
  data files, and yield each line.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55685 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-14 16:32:13 +00:00
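One way such a version check could look, as a sketch only. It assumes that each Unicode data file begins with a comment naming the file and its version (for example "# Scripts-9.0.0.txt"); the helper name header_version is hypothetical and is not taken from tool/enc-unicode.rb.

```ruby
# Hypothetical helper: extract "9.0.0" from a header such as
# "# Scripts-9.0.0.txt". Returns nil when the line does not match.
def header_version(first_line, basename)
  m = /\A# #{Regexp.escape(basename)}-(\d+\.\d+\.\d+)\.txt/.match(first_line)
  m && m[1]
end

expected = "9.0.0"
found = header_version("# Scripts-9.0.0.txt\n", "Scripts")
raise "Unicode version mismatch: #{found.inspect}" unless found == expected

# A header from a different release is reported as a different version.
other = header_version("# Scripts-8.0.0.txt\n", "Scripts")
```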
normal
f8d0bdedf1 tool: add descriptions and fix typos
* tool/asm_parse.rb: add description
* tool/change_maker.rb: ditto
* tool/downloader.rb: ditto
* tool/eval.rb: ditto
* tool/expand-config.rb: ditto
* tool/extlibs.rb: ditto
* tool/fake.rb: ditto
* tool/file2lastrev.rb: ditto
* tool/gem-unpack.rb: ditto
* tool/gen_dummy_probes.rb: ditto
* tool/gen_ruby_tapset.rb: ditto
* tool/generic_erb.rb: ditto
* tool/id2token.rb: ditto
* tool/ifchange: ditto
* tool/insns2vm.rb: ditto
* tool/instruction.rb: ditto
* tool/jisx0208.rb: ditto
* tool/merger.rb: ditto
* tool/mkrunnable.rb: ditto
* tool/node_name.rb: ditto
* tool/parse.rb: ditto
* tool/rbinstall.rb: ditto
* tool/rbuninstall.rb: ditto
* tool/rmdirs: ditto
* tool/runruby.rb: ditto
* tool/strip-rdoc.rb: ditto
* tool/vcs.rb: ditto
* tool/vtlh.rb: ditto
* tool/ytab.sed: ditto
* tool/enc-unicode.rb: fix typo
* tool/mk_call_iseq_optimized.rb: ditto
* tool/update-deps: ditto
  [ruby-core:76215] [Bug ]
  by Noah Gibbs <the.codefolio.guy@gmail.com>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-02 21:01:04 +00:00
nobu
2fac41a63e enc-unicode.rb: --header
* tool/enc-unicode.rb: add --header option to emit name2ctype.h
  directly.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52681 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-11-20 04:03:10 +00:00
naruse
64c81e40d4 * regcomp.c: Merge Onigmo 5.14.1 25a8a69fc05ae3b56a09.
this includes Support for Unicode 7.0 [Bug ].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-07-16 03:27:25 +00:00
naruse
407bcb4bc6 * Merge Onigmo d4bad41e16e3eccd97ccce6f1f96712e557c4518.
fix lookbehind assertion fails with /m mode enabled. [Bug ]
  fix \Z matches where it shouldn't. [Bug ]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@39718 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-03-11 03:46:55 +00:00
naruse
78dbaa1648 * Merge Onigmo 0fe387da2fee089254f6b04990541c731a26757f
v5.13.3 [Bug#7972] [Bug#7974]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@39547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-03-01 16:36:37 +00:00
naruse
06d483006c * Makefile.in: don't remove macros. now name2ctype uses macros.
* tool/enc-unicode.rb: add comment why it uses Hash#index.

* enc/unicode/{name2ctype.kwd,name2ctype.src,name2ctype.h.blt}:
  update to follow the current name2ctype.h.
  FYI current Unicode version is 6.1.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@36070 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-06-13 17:54:14 +00:00
naruse
6227690604 * tool/enc-unicode.rb: don't use 1.9 feature on tools.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34671 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-02-18 02:47:41 +00:00
naruse
0424e152c6 * Merge Onigmo-5.13.1. [ruby-dev:45057] [Feature ]
https://github.com/k-takata/Onigmo
  cp reg{comp,enc,error,exec,parse,syntax}.c reg{enc,int,parse}.h
  cp oniguruma.h
  cp tool/enc-unicode.rb
  cp -r enc/

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34663 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-02-17 07:42:23 +00:00
naruse
375fd3152f * tool/transcode-tblgen.rb (import_ucm): don't use \h because the
script should work with ruby 1.8.

* tool/enc-unicode.rb: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34650 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-02-17 00:53:13 +00:00
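For context on the commit above: in Ruby 1.9 and later, \h in a regexp matches a hexadecimal digit, so a script that must also run on Ruby 1.8 can spell the class out instead. The pattern names and sample strings below are assumptions for illustration, not code from the tools.

```ruby
PORTABLE_HEX  = /\A[0-9A-Fa-f]+\z/  # explicit class, no \h needed
SHORTHAND_HEX = /\A\h+\z/           # relies on \h (Ruby 1.9+)

samples = %w[0041 1F600 ffFF 00G1 xyz]

portable  = samples.select { |s| s =~ PORTABLE_HEX }
shorthand = samples.select { |s| s =~ SHORTHAND_HEX }

puts portable.inspect
puts shorthand.inspect
```

On a Ruby that supports \h, both selections accept the same strings.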
naruse
a0265b0662 * tool/enc-unicode.rb,
enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  Add Age property to regexp. [ruby-core:33019]
  patched by Ammar Ali, tested by Run Paint Run Run

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-11-08 05:32:45 +00:00
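A brief sketch of the Age property in a Ruby regexp. It uses U+20B9 INDIAN RUPEE SIGN, which was added in Unicode 6.0; the variable names are assumptions for illustration.

```ruby
rupee = "\u20B9"  # INDIAN RUPEE SIGN, introduced in Unicode 6.0

in_6_0 = !(rupee =~ /\p{Age=6.0}/).nil?
in_5_2 = !(rupee =~ /\p{Age=5.2}/).nil?

puts "Age=6.0: #{in_6_0}, Age=5.2: #{in_5_2}"
```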
naruse
f85b841a01 * tool/enc-unicode.rb,
enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  Add 'Unknown' Script.
  patched by Run Paint Run Run. [ruby-core:32937] 

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29626 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-10-29 01:03:21 +00:00
naruse
fc9176ac0e * tool/enc-unicode.rb,
enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  Update Oniguruma for Unicode 6.
  patched by Run Paint Run Run. [ruby-core:32923] 

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-10-28 11:14:05 +00:00
nobu
b238a3f3fd * tool/enc-unicode.rb: get rid of lots of warnings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29489 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-10-13 14:16:49 +00:00
naruse
d5537936ab * tool/enc-unicode.rb,
enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  use UTS#18 for POSIX character class.
  http://rubyspec.org/issues/show/161

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25338 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-14 16:51:52 +00:00
naruse
181eb7d5c1 Add derived core and binary property and aliases.
* tool/enc-unicode.rb,
  enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  Add DerivedCoreProperties, PropList (Binary Property),
  PropertyAlias and PropertyValueAlias.
  Now users of tool/enc-unicode.rb should specify
  the directory of UCD files.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-13 12:27:00 +00:00
naruse
5a4ce608e2 * tool/enc-unicode.rb: optimized.
* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  U+100000-U+10FFFD is assigned, not Cn.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-08 18:07:08 +00:00
naruse
866c79e2de * tool/enc-unicode.rb: parse range notation of UnicodeData.txt.
* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  follow above change. [ruby-dev:39444]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25260 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-10-08 02:49:11 +00:00
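The range notation mentioned above refers to pairs of UnicodeData.txt entries whose names end in ", First>" and ", Last>" and which together stand for a whole block of codepoints. The following is a sketch of how such pairs could be expanded; the method name each_category_range is hypothetical, and this is not the parsing code used in tool/enc-unicode.rb.

```ruby
# Hypothetical helper: yield (codepoint range, general category) for each
# entry, merging "<..., First>" / "<..., Last>" pairs into one range.
def each_category_range(lines)
  first = nil
  lines.each do |line|
    fields = line.chomp.split(';')
    cp, name, category = fields[0].hex, fields[1], fields[2]
    if name.end_with?(', First>')
      first = cp                      # remember the start of the range
    elsif name.end_with?(', Last>')
      yield first..cp, category       # emit the whole range at once
      first = nil
    else
      yield cp..cp, category          # ordinary single-codepoint entry
    end
  end
end

sample = [
  "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
  "3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;",
  "4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;",
]

ranges = []
each_category_range(sample) { |range, cat| ranges << [range, cat] }
```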
naruse
ee4b59a419 * unicode.c (onigenc_unicode_property_name_to_ctype):
ignore case of properties.

* tool/enc-unicode.rb: downcase properties list.

* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt,
  enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src:
  follow above.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24836 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-10 22:54:01 +00:00
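The effect of ignoring case in property names can be seen from Ruby itself. A minimal sketch; the list of spellings is an assumption chosen for illustration.

```ruby
spellings = %w[Alpha alpha ALPHA]

# Build /\p{...}/ for each spelling and match it against the same letter.
results = spellings.map { |name| Regexp.new("\\p{#{name}}") =~ "a" }

puts results.inspect
```

Each spelling compiles to a working pattern and matches at the same position.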
naruse
f1eff95745 Update Oniguruma's UnicodeData to 5.1.
* tool/enc-unicode.rb: added to generate name2ctype.kwd.
  contributed by Run Paint Run Run [ruby-core:24775]
  use as follows:
    ruby19 tool/enc-unicode.rb enc/unicode/UnicodeData.txt \
      enc/unicode/Scripts.txt > enc/unicode/name2ctype.kwd

* enc/unicode.c (CodeRanges): move definitions to name2ctype.h.

* enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd,
  enc/unicode/name2ctype.src: updated to v5.1.

* enc/unicode/UnicodeData.txt, enc/unicode/Scripts.txt: added v5.1.

* Makefile.in: add rule to generate name2ctype.kwd from
  UnicodeData.txt and Scripts.txt.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-25 16:15:38 +00:00