ruby--ruby

mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00

Author	SHA1	Message	Date
akr	862b6b68a4	update tests for String#inspect replacing \xHH instead of \OOO. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14166 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 23:18:25 +00:00
akr	5a1c2b2677	* re.c (append_utf8): check unicode range. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 03:50:11 +00:00
nobu	45fa4a4b63	* string.c (tr_find): returns true if no characters to be removed is specified. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14151 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 03:12:25 +00:00
nobu	54c146d2c3	* string.c (tr_trans): get rid of segfaults when has mulitbytes but source sets have no mulitbytes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-09 02:29:24 +00:00
akr	b12bb50149	* re.c (rb_reg_check_preprocess): new function for validating regexp fragment. * parse.y (regexp): invoke reg_fragment_check. (reg_fragment_check): defined. (reg_fragment_check_gen): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-08 07:21:05 +00:00
akr	9667f7953e	add test for UTF-8 bit pattern. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14132 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-08 04:31:26 +00:00
akr	f1b7e60cb9	* encoding.c (rb_enc_mbclen): make it never fail. (rb_enc_nth): don't check the return value of rb_enc_mbclen. (rb_enc_strlen): ditto. (rb_enc_precise_mbclen): return needmore(1) if e <= p. (rb_enc_get_ascii): new function for extracting ASCII character. * include/ruby/encoding.h (rb_enc_get_ascii): declared. * include/ruby/regex.h (ismbchar): removed. * re.c (rb_reg_expr_str): use rb_enc_get_ascii. (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine the termination of escaped non-ASCII character. (unescape_nonascii): use rb_enc_precise_mbclen. (rb_reg_quote): use rb_enc_get_ascii. (rb_reg_regsub): use rb_enc_get_ascii. * string.c (rb_str_reverse) don't check the return value of rb_enc_mbclen. (rb_str_split_m): don't call rb_enc_mbclen with e <= p. * parse.y (is_identchar): use ISASCII. (parser_ismbchar): removed. (parser_precise_mbclen): new macro. (parser_isascii): new macro. (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid character precisely. (parser_tokadd_string): use parser_isascii. (parser_yylex): ditto. (is_special_global_name): don't call is_identchar with e <= p. (rb_enc_symname_p): ditto. [ruby-dev:32455] * ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie because the encoding is not UTF-8. [ruby-dev:32475] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-08 02:50:43 +00:00
akr	69406aad50	* encoding.c (rb_enc_precise_mbclen): new function for mbclen with validation. * include/ruby/encoding.h (rb_enc_precise_mbclen): declared. (MBCLEN_CHARFOUND): new macro. (MBCLEN_INVALID): new macro. (MBCLEN_NEEDMORE): new macro. * include/ruby/oniguruma.h (OnigEncodingTypeST): replace mbc_enc_len by precise_mbc_enc_len. (ONIGENC_PRECISE_MBC_ENC_LEN): new macro. (ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND): new macro. (ONIGENC_CONSTRUCT_MBCLEN_INVALID): new macro. (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE): new macro. (ONIGENC_MBCLEN_CHARFOUND): new macro. (ONIGENC_MBCLEN_INVALID): new macro. (ONIGENC_MBCLEN_NEEDMORE): new macro. (ONIGENC_MBC_ENC_LEN): use ONIGENC_PRECISE_MBC_ENC_LEN. * enc/euc_jp.c: validation implemented. * enc/sjis.c: ditto. * enc/utf8.c: ditto. * string.c (rb_str_inspect): use rb_enc_precise_mbclen for invalid encoding. (rb_str_valid_encoding_p): new method String#valid_encoding?. * io.c (rb_io_getc): use rb_enc_precise_mbclen. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14119 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-06 09:28:26 +00:00
akr	9bd11f24b3	s/unicode/Unicode/ in error messages. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-02 02:53:46 +00:00
akr	21729a1c77	* parse.y (regexp): fix /#{}\xa1\xa2/e to be EUC-JP. (reg_fragment_setenc_gen): extracted from reg_compile_gen. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14075 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-02 00:16:24 +00:00
akr	7ff702406a	* include/ruby/intern.h (rb_uv_to_utf8): declared. * re.c (rb_reg_preprocess): new function for dynamic regexp with \u{} such as Regexp.new("\\u{6666}"). (rb_reg_prepare_re): preprocess regexp for recompiling. (read_escaped_byte): new function. (unescape_escaped_nonascii): new function. (append_utf8): new function. (unescape_unicode_list): new function. (unescape_unicode_bmp): new function. (unescape_nonascii): new function. (rb_reg_initialize): preprocess regexp. * pack.c (rb_uv_to_utf8): renamed from uv_to_utf8. * parse.y (STR_NEW3): take func instead of has8 and hasmb. (parser_str_new): use default coderange mechanism except for regexp. (parser_tokadd_utf8): copy regexp source as-is. (parser_read_escape): UTF-8 stuff removed. (parser_tokadd_escape): has8bit and hasmb removed. (parser_tokadd_string): fix 8-bit single byte character with \u. (parser_parse_string): has8bit and hasmb removed. (parser_here_document): has8bit and hasmb removed. (parser_yylex): call parser_tokadd_utf8 instead of read_escape for UTF-8 character. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-12-01 16:56:19 +00:00
akr	bfc502fe51	more tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14023 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-26 13:05:08 +00:00
akr	b2e60b2ce7	* include/ruby/encoding.h (rb_enc_str_asciionly_p): declared. (rb_enc_str_asciicompat_p): defined. * re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p. (rb_reg_quote): return ascii-8bit string if the argument is ascii-only to generate encoding generic regexp if possible. (rb_reg_s_union): fix encoding handling. [ruby-dev:32094] * string.c (rb_enc_str_asciionly_p): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-25 13:25:34 +00:00
akr	2109a52503	* re.c (REG_CASESTATE): unused macro removed. (rb_reg_prepare_re): check encoding difference. (rb_reg_initialize): check 8bit byte. * parse.y (parser_tokadd_escape): fix has8bit. [ruby-dev:32113] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2007-11-23 06:30:26 +00:00

14 commits