1
0
Fork 0
mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
Commit graph

82 commits

Author SHA1 Message Date
naruse
2873edeafb Merge Onigmo 6.0.0
* https://github.com/k-takata/Onigmo/blob/Onigmo-6.0.0/HISTORY
* fix for ruby 2.4: https://github.com/k-takata/Onigmo/pull/78
* suppress warning: https://github.com/k-takata/Onigmo/pull/79
* include/ruby/oniguruma.h: include onigmo.h.
* template/encdb.h.tmpl: ignore duplicated definition of EUC-CN in
  enc/euc_kr.c. It is defined in enc/gb2313.c with CRuby macro.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-10 17:47:04 +00:00
duerst
8baa73be48 remove special processing for U+03B9/U+03BC/U+A64B
* enc/unicode.c: Remove special processing for U+03B9/U+03BC/U+A64B
  (GREEK SMALL LETTERs IOTA/MU, CYRILLIC SMALL LETTER MONOGRAPH UK)
  from onigenc_unicode_case_map and simplify code.

* enc/unicode/case-folding.rb: Remove check for U+03B9/U+03BC/U+A64B.

This and the previous few related commits make sure that we won't hit
the equivalent of bug  anymore for future updates of Unicode versions.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-04 01:58:54 +00:00
nobu
4f7c3d3583 constify CaseMappingSpecials
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56951 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-01 00:34:41 +00:00
duerst
87b937bdfd fix uppercasing for U+A64B, CYRILLIC SMALL LETTER MONOGRAPH UK
* enc/unicode.c: Add U+A64B to the special cases 03B9 and 03BC
  at the end of onigenc_unicode_case_map (Bug ).

* enc/unicode/case-folding.rb: Add U+A64B to the special cases
  03B9 and 03BC. Add a comment pointing to enc/unicode.c.
  Change warnings to exceptions for unpredicted cases,
  because this would have been more easily noticed
  (the warning was not noticed when upgrading to Unicode 9.0.0).

* test/ruby/enc/test_case_comprehensive.rb: Remove temporary
  exclusion of U+A64B from testing.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-30 08:25:46 +00:00
duerst
6ed393ad89 * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c,
emacs_mule.c, euc_jp.c, euc_kr.c, euc_tw.c, gb18030.c, gbk.c,
  iso_8859_1|2|3|4|5|6|7|8|9|10|11|13|14|15|16.c, koi8_r.c, koi8_u.c,
  shift_jis.c, unicode.c, us_ascii.c, utf_16|32be|le.c, utf_8.c,
  windows_1250|51|52|53|54|57.c, windows_31j.c, unicode.c:
  Remove conditional compilation macro ONIG_CASE_MAPPING. [Feature ].


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-24 07:33:15 +00:00
nobu
af2d3c9866 Move generated headers to unicode data directory
* common.mk, enc/depend (casefold.h, name2ctype.h): move to
  unicode data directory per version.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-17 11:59:26 +00:00
duerst
3dd98b2446 * string.c: Raise ArgumentError when invalid string is detected in
case mapping methods.
* enc/unicode.c: Check for invalid string and signal with negative
  length value.
* test/ruby/enc/test_case_mapping.rb: Add tests for above.
* test/ruby/test_m17n_comb.rb: Add a message to clarify test failure.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-02 01:24:52 +00:00
duerst
46647ac8df * enc/unicode.c: Handle DOTLESS_i by hand because it isn't involved in folding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-25 10:07:19 +00:00
duerst
ef6405f71c * enc/unicode.c: Fix flag error for switch from titlecase to lowercase.
* test/ruby/enc/test_case_mapping.rb: Tests for above error.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55153 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-24 23:01:39 +00:00
duerst
84cd51919b * enc/unicode.h: Additional uses of ONIG_CASE_MAPPING compilation switch
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55020 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16 11:00:29 +00:00
svn
3ab0ea8027 * append newline at EOF.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55019 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16 10:46:33 +00:00
duerst
65db16de9f * include/ruby/oniguruma.h: Introducing ONIG_CASE_MAPPING compilation
switch
* include/ruby/oniguruma.h, enc/unicode.h: Using ONIG_CASE_MAPPING
  compilation switch


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16 10:46:32 +00:00
duerst
5e9d33ad49 * enc/unicode/case-folding.rb, casefold.h: Data generation to implement
swapcase functionality for titlecase characters. Swapcase isn't defined
  by Unicode, because the purpose/usage of swapcase is unclear anyway.
  The implementation follows a proposal from Nobu, swaping the case of
  each component of a titlecase character individually.
  This means that the titlecase characters have to be decomposed.
* enc/unicode.c: Code using the above data.
* test/ruby/enc/test_case_mapping.rb: Tests for the above.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-04-01 11:58:47 +00:00
kazu
4a58f51a95 fix a typo [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54400 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 12:34:37 +00:00
duerst
78f540019a * enc/unicode/case-folding.rb, casefold.h: Tweaked handling of 6
special cases in CaseUnfold_11_Table.
* enc/unicode.c: Adjustments for above.
* test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in
  test_titlecase activated; test_greek added. A test in test_cherokee fixed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 07:53:43 +00:00
duerst
49f25a1299 * enc/unicode.c: Cleaned up some comments.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54349 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 04:30:58 +00:00
duerst
0e6f8b166d * enc/unicode/case-folding.rb, casefold.h: Removing data for idempotent
titlecasing.
* enc/unicode.c: Adjust code to data removal.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54347 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 04:24:55 +00:00
duerst
2d20a27fb4 * enc/unicode.c: Refactoring in preparation for data reduction for
titlecase.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 05:54:47 +00:00
duerst
890ce36b79 * enc/unicode.c: Minor refactoring for I WITH DOT ABOVE.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54312 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 05:36:35 +00:00
duerst
1582093c77 * enc/unicode.c: Removed code now covered by data from table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 05:26:23 +00:00
duerst
663fb4dd44 * enc/unicode.c: Adding comments. [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28 02:49:20 +00:00
duerst
2f455ceca4 * include/ruby/oniguruma.h: Additional flag for characters that are titlecase.
* enc/unicode/case-folding.rb, casefold.h: Using above flag in data.
* enc/unicode.c: Marking capitalized character as unmodified if it is
  already titlecase.
* test/ruby/enc/test_case_mapping.rb: Tests for above functionality.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-22 12:08:30 +00:00
duerst
fdbb82967f * enc/unicode.c: Fixed two macro definitions.
* test/ruby/enc/test_case_mapping.rb: Test cases that detected
  the above bugs.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-17 03:09:00 +00:00
duerst
e89232eb15 * enc/unicode.c: Eliminating common code.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54118 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 07:29:51 +00:00
duerst
8679f113e9 * enc/unicode.c: Expansion of some code repetition in preparation for
elimination of common code pieces.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54117 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 07:17:09 +00:00
svn
10fa31a603 * remove trailing spaces.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 04:49:25 +00:00
duerst
00cc59a054 * enc/unicode.c: Additional macros and code to use mapping data in
CaseMappingSpecials array.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15 04:49:24 +00:00
duerst
4b15b54d68 * include/ruby/oniguruma.h, enc/unicode.c: Adjusting flag assignments
and macros to work with unified CaseMappingSpecials array.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-14 09:39:54 +00:00
nobu
8b4448e2e1 unicode.c: off-by-one error
* enc/unicode.c (CodePointListValidP): fix off-by-one error.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-12 01:35:54 +00:00
nobu
d48f923648 unicode.c: boundary check
* enc/unicode.c (CodePointListValidP): add pathological boundary
  check, for gcc 4.9.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-12 01:15:31 +00:00
duerst
59766643db * enc/unicode/case-folding.rb, casefold.h: Streamlining approach to
case mapping data not available from case folding by unifying all
  three cases (special title, special upper, special lower).
* enc/unicode.c: Adjust macro names for above (macros are currently inactive).
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-11 07:11:27 +00:00
duerst
f1f48e6103 * include/ruby/oniguruma.h: Rearranging flag assignments and making
space for titlecase indices; adding additional macros to add or
  extract titlecase index; adding comments for better documentation.
* enc/unicode.c: Moving some macros to include/ruby/oniguruma.h;
  activating use of titlecase indices.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-24 13:32:01 +00:00
duerst
1cc579cb00 * enc/unicode/case-folding.rb, casefold.h: Outputting actual titlecase
data (new table, with indices from other tables).
* enc/unicode.c: Ignoring titlecase data indices for the moment.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-23 12:53:10 +00:00
duerst
6286ff6301 * enc/unicode.c: Activated use of case mapping data in CaseUnfold_11 array.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53870 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-19 03:45:32 +00:00
duerst
2ca7569c6d * string.c, enc/unicode.c: Disassociating ONIGENC_CASE_FOLD flag from
ONIGENC_CASE_DOWNCASE.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 11:44:12 +00:00
nobu
584f9e51d6 unicode.c: magic numbers
* enc/unicode.c (I_WITH_DOT_ABOVE, DOTLESS_i, DOT_ABOVE): name
  magic numbers.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53776 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 05:01:00 +00:00
duerst
8f10a72d90 * enc/unicode.c: Shortened macros for enc/unicode/casefold.h to
single-letter; use flags in casefold.h for logic.
* enc/unicode/case-folding.rb: Added flag for case folding.
  Changed parameter passing.
* enc/unicode/casefold.h: New flags added.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08 04:00:31 +00:00
duerst
49ca434bcf * common.mk: Added two more precondition files for enc/unicode/casefold.h
* enc/unicode.c: Added shortening macros for enc/unicode/casefold.h
* enc/unicode/case-folding.rb: Fixed file encoding for CaseFolding.txt
  to ASCII-8BIT (should fix some ci errors). Clarified usage. Created
  class MapItem. Partially implemented class CaseMapping.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53767 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-07 13:10:20 +00:00
duerst
e8dde46b60 * test/ruby/enc/test_regex_casefold.rb: Added data-based testing for
String#downcase :fold.
* enc/unicode.c: Fixed a range error (lowest non-ASCII character affected
  by case operations is U+00B5, MICRO SIGN)
* test/ruby/enc/test_case_mapping.rb: Explicit test for case folding of
  MICRO SIGN to Greek mu.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-06 06:18:38 +00:00
duerst
81515b2381 * enc/unicode.c, test/ruby/enc/test_case_mapping.rb: Implemented :fold
option for String#downcase by using case folding data from
  regular expression engine, and added a few simple tests.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-06 05:37:29 +00:00
duerst
b658249cef * enc/unicode.c: Activated :ascii flag for ASCII-only case conversion
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-04 12:05:23 +00:00
duerst
a7c987968d * enc/unicode.c: Fixed bit mask in macro OnigCodePointCount
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53670 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-27 09:54:38 +00:00
duerst
415949faba * enc/unicode.c: Protect code point count by macro, in order to
be able to use the remaining bits for flags.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-27 08:55:40 +00:00
duerst
f307d1fe21 * enc/unicode.c: Fixed a logical error and some comments.
* test/ruby/enc/test_case_mapping.rb: Made tests more general.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-17 11:10:45 +00:00
nobu
39f44f0113 get rid of non-ascii chars
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53563 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-17 09:03:11 +00:00
duerst
959bbb6f72 * enc/unicode.c: Removed artificial expansion for Turkic,
added hand-coded support for Turkic, fixed logic for swapcase.
* string.c: Made use of new case mapping code possible from upcase,
  capitalize, and swapcase (with :lithuanian as a guard).
* test/ruby/enc/test_case_mapping.rb: Adjusted for above.
  (with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53562 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-17 08:42:16 +00:00
duerst
c12af76763 * enc/unicode.c: Artificial mapping to test buffer expansion code.
* string.c: Fixed buffer expansion logic.
* test/ruby/enc/test_case_mapping.rb: Tests for above.
(with Kimihito Matsui)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53554 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-16 08:24:58 +00:00
hsbt
219467abde * enc/unicode.c: fix implicit conversion error with clang. fixup r53548.
* string.c: ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53552 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-16 01:51:58 +00:00
duerst
be897c2507 * string.c, enc/unicode.c: New code path as a preparation for Unicode-wide
case mapping. The code path is currently guarded by the :lithuanian
  option to avoid accidental problems in daily use.
* test/ruby/enc/test_case_mapping.rb: Test for above.
* string.c: function 'check_case_options': fixed logical errors

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-16 01:24:03 +00:00
naruse
9ed1d63f41 * regcomp.c, regenc.c, regexec.c, regint.h, enc/unicode.c:
Merge Onigmo 58fa099ed1a34367de67fb3d06dd48d076839692
  + https://github.com/k-takata/Onigmo/pull/52

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-11-26 08:31:27 +00:00