mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
Commit graph

33 commits

Author SHA1 Message Date
duerst
c5e46ef397 * test/ruby/test_transcode.rb: added test_euc_jp
(contributed by Yoshihiro Kambayashi)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-26 04:26:00 +00:00
akr
8f9ed3c464 * include/ruby/encoding.h (rb_econv_open_exc): declared.
* transcode.c (rb_eNoConverter): new exception.
  (rb_econv_open_exc): new function.
  (transcode_loop): use rb_econv_open_exc.

* io.c (make_writeconv): use rb_econv_open_exc.
  (make_readconv): ditto.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18803 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-24 02:42:37 +00:00
duerst
5dd5311fdf * test/ruby/test_transcode.rb: test_shift_jis:
fixed comment strings (see r18291)



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18772 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-22 05:12:33 +00:00
akr
5ade93542f * transcode_data.h (TRANSCODE_ERROR): removed.
* tool/transcode-tblgen.rb: 8bit byte of ASCII-8BIT is a valid
  (but unique to ASCII-8BIT) character.

* transcode.c (rb_eConversionUndefined): new error.
  (rb_eInvalidByteSequence): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18524 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-12 07:20:10 +00:00
akr
61f512a357 add a test.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-11 22:49:15 +00:00
akr
94ca2d94de * transcode_data.h (rb_transcoder): add resetstate_func field for
resetting a state of stateful encoding.

* enc/trans/iso2022.trans (rb_EUC_JP_to_ISO_2022_JP): specify
  finish_eucjp_to_iso2022jp for resetstate_func.

* tool/transcode-tblgen.rb: specify NULL for resetstate_func.

* transcode.c (output_replacement_character): call resetstate_func
  before appending the replacement character.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18503 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-11 22:44:23 +00:00
akr
425098de96 * transcode.c (rb_trans_conv): find second last error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18500 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-11 21:49:38 +00:00
akr
a2901a7c75 * transcode_data.h (rb_trans_result_t): new type.
(rb_trans_elem_t): new type.
  (rb_trans_t): new type.

* transcode.c (transcode_dispatch_cb): removed.
  (transcode_dispatch): removed.
  (rb_transcoding_result_t): moved to rb_trans_result_t in
  transcode_data.h.
  (transcode_restartable0): goto follow_info when FUNsi.
  (rb_transcoding_open): use get_transcoder_entry.
  (rb_trans_open): new function.
  (rb_trans_conv): ditto.
  (rb_trans_close): ditto.
  (trans_open_i): ditto.
  (trans_sweep): ditto.
  (more_output_buffer): take rb_trans_t instead of rb_transcoding as
  an argument.
  (transcode_loop): take from_encoding and to_encoding instead of tr
  as arguments.  use rb_trans_open/rb_trans_conv/rb_trans_close.
  (str_transcode): don't use transcode_dispatch.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18498 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-11 15:50:42 +00:00
akr
94342f89f9 add several tests for UTF-32LE.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 16:12:03 +00:00
akr
2833d9f95d * transcode_data.h (rb_transcoder): from_unit_length field added.
from_utf8 field removed.

* tool/transcode-tblgen.rb: generate offsets range.
  follow rb_transcoder change.

* transcode.c (transcode_loop): don't use from_utf8.
  make invalid region from_unit_length wise.

* enc/trans/iso2022.erb.c: follow rb_transcoder and 
  transcode_generate_node change.

* enc/trans/utf_16_32.erb.c: follow rb_transcoder and
  transcode_generate_node change.
  explicit :invalid map removed.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18445 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-08 15:48:17 +00:00
akr
1504652373 Thu Aug 7 23:43:11 2008 Tanaka Akira <akr@fsij.org>
* transcode_data.h (rb_transcoding): new field "stateful".
  (rb_transcoder): preprocessor and postprocessor field removed.
  change arguments of func_ii, func_si, func_io and func_so.
  new field "finish_func".

* tool/transcode-tblgen.rb: make FUNii, FUNsi and FUNio
  generatable.

* transcode.c (transcoder_lib_table): removed.
  (transcoder_table): change structure.
  (transcoder_key): removed because the above structure change.
  (make_transcoder_entry): new function.
  (get_transcoder_entry): ditto.
  (rb_register_transcoder): follow the structure change.
  (declare_transcoder): ditto.
  (transcode_search_path): new function for breadth first search to
  find a list of converters.
  (transcode_search_path_i): new function.
  (transcode_dispatch_cb): ditto.
  (transcode_dispatch): use transcode_search_path.
  (transcode_loop): follow the argument change.
  (str_transcode): preprocessor and postprocessor stuff removed.

* enc/trans/iso2022.erb.c: new file.  ISO-2022-JP conversion
  re-implemented.

* enc/trans/japanese.erb.c: ISO-2022-JP stuff removed.

* enc/trans/utf_16_32.erb.c: follow argument change of FUNso.

[ruby-dev:35798]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18419 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-07 14:53:30 +00:00
akr
45428f2257 add tests for [ruby-dev:35726] and [ruby-dev:35709].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18388 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-08-06 11:43:37 +00:00
naruse
00aef398d0 * transcode.c (output_replacement_character):
rename from _get_replacement_character.

* transcode.c (output_replacement_character):
  fix replacement on UTF-32{BE,LE}. [ruby-dev:35705]

* transcode.c (transcode_loop): ditto.

* test/ruby/test_transcode.rb (test_invalid_replace):
  add for above.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-31 20:35:35 +00:00
naruse
a4077a65de * test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
fix option1 and 3.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-31 11:08:04 +00:00
naruse
931ba3f3b7 * transcode.c (get_replacement_character): use U+FFFD as replacement
character when convert to Unicode.

* test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
  rename from test_public_review_issue_121.

* test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
  enable option2.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18294 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-31 10:59:39 +00:00
duerst
0469c8d95b test/ruby/test_transcode.rb: added test_shift_jis
(contributed by Yoshihiro Kambayashi) and
  test_public_review_issue_121
  (see http://www.unicode.org/review/pr-121.html)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18291 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-31 06:12:13 +00:00
duerst
3e53486295 * test/ruby/test_transcode.rb: refactoring/cleanup of
test_iso_2022_jp(_1)



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18203 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-25 01:01:38 +00:00
duerst
ba3fe885d5 * test/ruby/test_transcode.rb: added two comments
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@18160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-07-22 10:39:30 +00:00
mame
d6ada9f14b * test/ruby/test_transcode.rb: add tests for iso-2022-jp.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16821 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-04 16:48:07 +00:00
duerst
2e7815dd80 Sun Mar 16 18:07:07 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: bug fix (some invalid UTF-8 sequences
	  were legal)

	* test/ruby/test_transcode.rb: test for above bug



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15786 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-16 09:09:53 +00:00
duerst
08631278ad Wed Mar 5 17:43:43 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c (transcode_loop): Adjusted detection of invalid
	  (ill-formed) UTF-8 sequences. Fixing potential security issue, see
	  http://www.unicode.org/versions/Unicode5.1.0/#Notable_Changes.

	* test/ruby/test_transcode.rb: Added two tests for above fix.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-03-05 08:45:51 +00:00
duerst
6d5ef97a32 Thu Feb 21 17:15:15 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c: Added basic support for passing options to String#encode
	  via a hash. Currently only one option, with one value, is supported:
	  invalid: :ignore (dropping invalid byte sequences instead of
	  producing an error). Option naming is not yet stable!

	* test/ruby/test_transcode.rb: Added a single test for invalid: :ignore
	  option. Not more tests because most data does not yet distinguish
	  between INVALID and UNKNOWN.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15565 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-21 08:42:10 +00:00
naruse
e22ff0c9b6 * enc/trans/korean.c: add support for CP949 by Park Ji-In. [ruby-dev:33626]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15393 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-02-07 06:05:32 +00:00
duerst
38321fc0eb Mon Jan 21 19:42:42 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb:
	  added UTF-32BE and UTF-32LE conversions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-21 10:41:59 +00:00
duerst
a9b15a4e0c Sun Jan 20 20:00:20 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb:
	  added UTF-16LE conversions.

	* fixed changelog for last commit



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15144 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 11:00:24 +00:00
duerst
3d0c7bea4d Sun Jan 20 15:08:08 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/trans/utf_16_32.c: new file, currently implementing
	  UTF-16BE conversions only.

	* test/ruby/test_transcode.rb: Added tests for UTF-16BE;
	  made check_both_ways() use force_encoding differently.

	* transcode_data.h, transcode.c: Support for more conversion
	  functions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-20 06:12:48 +00:00
duerst
793e9423cd Fri Dec 28 01:55:04 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
* transcode.c (transcode_dispatch): reverted some of the changes
          in r14746.

	* transcode.c, enc/trans/single_byte.c: Added conversions to/from
	  US-ASCII and ASCII-8BIT (using data tables).

	* enc/trans/single_byte.c: Some spacing/ordering changes due to
	  automatic data file generation.

	* transcode_data.h, transcode.c: Preliminary code for using
	  micro-conversion functions.

	* test/ruby/test_transcode.rb: Added some tests for US-ASCII and
	  ASCII-8BIT conversions.



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-28 09:26:55 +00:00
duerst
a95ae9619f Sat Dec 22 15:54:54 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
* test/ruby/test_transcode.rb: Added simple tests for
	  EUC-JP and Shift_JIS and tests for ASCII-only range



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14486 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-22 09:52:00 +00:00
matz
5c4cf9bfdf for undefined conversions.
* transcode_data_iso_8859.c: Changed from character constants
  ('\xC2') to integer constants (0xC2) for shorter files and
  better readability; eliminated duplicated tables; changed
  from -1 offset to actual UNDEF entry (not yet distinguishing
  UNDEF and ILLEGAL correctly).

* test/ruby/test_transcode.rb: added a test for UNDEF conversion.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 01:28:26 +00:00
matz
f2b0dba1cf * transcode.c (str_transcode, transcode_dispatch): added two-step
* transcode.c: some minor formatting fixes

* transcode_data.h, transcode_data_iso_8859.c: Shortened
  extremely frequently used macros to shorten file length.

* test/ruby/test_transcode.rb: Fixed name of test class;
  added setup method to ensure all necessary encodings exist;
  split tests into more test methods; added tests; fixed ordering
  of arguments in assert_equal to have expected result first.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14236 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-15 05:42:25 +00:00
nobu
3609358aac * test/ruby/test_transcode.rb: added tests from Martin Duerst <duerst
AT it.aoyama.ac.jp>.  [ruby-dev:32532]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14192 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-11 05:27:52 +00:00
nobu
38b92f838f * transcode*.[ch], test/ruby/test_transcode.rb: set properties.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14175 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 08:25:01 +00:00
matz
7ded13f54b * transcode.c: new file to provide encoding conversion features.
code contributed by Martin Duerst.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14172 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 05:01:47 +00:00