1
0
Fork 0
mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/sample/trick2015/ksk_2/remarks.markdown
Nobuyoshi Nakada db16629008
Fixed misspellings
Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
2019-12-20 09:32:42 +09:00

7.2 KiB
Raw Blame History

Remarks

The program is run with a data file from the standard input, e.g.,

  ruby entry.rb < data

where < can be omitted. The data file must be in the DIMACS CNF format (see Description for detail). It has been confirmed to be run on

  ruby 1.9.3p385 (2013-02-06 revision 39114) [x86_64-darwin11.4.2]
  ruby 2.0.0p481 (2014-05-08 revision 45883) [universal.x86_64-darwin13]
  ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-linux]

For particular inputs, the program works differently on these environments (see Limitation).

Description

The program is a very small SAT solver with 194 bytes making use of a powerful feature of Regexp matching in Ruby. It receives a data file from the standard input in the DIMACS CNF that is a standard format for inputs of SAT solvers. For example, the text in the DIMACS CNF format,

c
c This is a sample input file.
c
p cnf 3 5
 1 -2  3 0
-1  2 0
-2 -3 0
 1  2 -3 0
 1  3 0

corresponds to a propositional formula in conjunctive normal form,

(L1 ¬L2 L3) ∧ (¬L1 L2) ∧ (¬L2 ¬L3) ∧ (L1 L2 ¬L3) ∧ (L1 L3).

In the DIMACS CNF format, the lines starting with c are comments that are allowed only before the line p cnf .... The line p cnf 3 5 represents that the problem is given in conjunctive normal form with 3 variables (L1,L2,and L3) and 5 clauses. A clause is given by a sequence of the indices of positive literals and the negative indices of negative literals. Each clause is terminated by 0. For the input above, the program outputs

s SATISFIABLE
v 1 2 -3

because the formula is satisfiable by L1=true, L2=true, and L3=false. If an unsatisfiable formula is given, the program should output

s UNSATISFIABLE

This specification is common in most exiting SAT solvers and required for entries of SAT competition.

The program is very small with no other external libraries thanks to the wealth of string manipulations in Ruby. It is much smaller than existing small SAT solvers like minisat and picosat!

Internals

The basic idea of the program is a translation from DIMACS CNF format into Ruby. For example, the data file above is translated into a Regexp matching expression equivalent to

 '---=-' =~ 
 /(-?)(-?)(-?)-*=(?=\1$|-\2$|\3$|$)(?=-\1$|\2$|$)(?=-\2$|-\3$|$)(?=\1$|\2$|-\3$|$)(?=\1$|\3$|$)(?=)/

that returns MatchData if the formula is satisfiable and otherwise returns nil. The beginning of regular expression (-?)(-?)(-?)-*= matches a string "---=" so that each capturing pattern (-?) matches either "-" or "", which corresponds to an assignment of true or false, respectively, for a propositional variable. Each clause is translated into positive lookahead assertion like (?=\1$|-\2$|\3$|$) that matches "-" only when \1 holds "-", \2 holds "", or \3 holds "-". This exactly corresponds to the condition for L1¬L2L3 to be true. The last case |$ never matches "-" but it is required for making the translation simple. The last meaningless positive lookahead assertion (?=) is added for a similar reason. This translation is based on Abigail's idea where a 3SAT formula is translated into a similar Perl regular expression. The differences are the submitted Ruby program translates directly from the DIMACS CNF format and tries to make the code shorter by using lookahead assertion which can also make matching more faster.

Thanks to the x option for regular expression, the input above is simply translated into

  ?-*3+'=-'=~/#{'(-?)'*3}-*=(?=
   \1$| -\2$|  \3$| $)(?=
  -\1$|  \2$| $)(?=
  -\2$| -\3$| $)(?=
   \1$|  \2$| -\3$| $)(?=
   \1$|  \3$| $)(?=
  )/x

which has a structure similar to the DIMACS CNF format.

The part of formatting outputs in the program is obfuscated as an inevitable result of 'golfing' the original program

   if ...the matching expression above... then
     puts 's SATISFIABLE'
     puts 'v '+$~[1..-1].map.with_index{|x,i|
       if x == '-' then
         i+1
       else
         ['-',i+1].join
       end
     }.join(' ')
   else
     puts 's UNSATISFIABLE'
   end

In the satisfiable case, the MatchData $~ obtained by the regular expression has the form of

  #<MatchData "---=" 1:"-" 2:"-" 3:"">

which should be translated into a string 1 2 -3. The golfed code simply does it by eval(x+?1)*i-=1 where x is matched string "x" or "" and i be a negated index.

Data files

The submission includes some input files in the DIMACS CNF format for testing the program.

Limitation

The program may not work when the number of variables exceeds 99 because \nnn in regular expression with number nnn does not always represent backreference but octal notation of characters. For example,

  /#{"(x)"*999}:\502/=~"x"*999+":x"
  /#{"(x)"*999}:\661/=~"x"*999+":x"
  /#{"(x)"*999}:\775/=~"x"*999+":x"

fail due to the syntax error (invalid escape), while

  /#{"(x)"*999}:\508/=~"x"*999+":x"
  /#{"(x)"*999}:\691/=~"x"*999+":x"
  /#{"(x)"*999}:\785/=~"x"*999+":x"

succeed (to return 0) because 508, 691, and 785 are not in octal notation. Since Ruby 1.9.3 incorrectly returns nil instead of terminating with the error for

  /#{"(x)"*999}:\201/=~"x"*999+":x"
  /#{"(x)"*999}:\325/=~"x"*999+":x"

the present SAT solver may unexpectedly return "UNSATISFIABLE" even for satisfiable inputs. This happens when the number is in octal notation starting with either 2 or 3.

In the case of the number starting with 1, the code like the above does work on all versions of Ruby I tried. For example,

  /#{"(x)"*999}:\101/=~"x"*999+":x"
  /#{"(x)"*999}:\177/=~"x"*999+":x"

succeed (to return 0). Interestingly,

  /#{"(x)"*999}:\101/=~"x"*999+":\101"
  /#{"(x)"*999}:\177/=~"x"*999+":\177"

return nil, while

  /:\101/=~":\101"
  /:\177/=~":\177"

succeed to return 0. The meaning of \1nn in regular expression seems to depend on the existence of capturing expressions.

In spite of these Ruby's behaviors, we have a good news! The present SAT solver does not suffer from the issues because the program cannot return solutions in practical time for inputs with variables more than 40.