I would like to improve the lambda code that OCaml 3.12.1 generates for the assert construct. Here is an example:

let f x =
    assert (x = 4);
    assert (2 + x = 6);
    assert (x - x = 0);
    exit x

The file above, longfilename.ml, is representative of a large OCaml module for which I would like better lambda-code generation. Compiling it gives the following:

$ ocamlopt -S longfilename.ml
$ cat longfilename.s
...
    .data
    .quad   3072
_camlLongfilename__2:
    .quad   L100007
    .quad   9
    .quad   9
    .quad   2300
L100007: .L100007:
    .ascii  "longfilename.ml"
    .byte   0
    .data
    .quad   3072
_camlLongfilename__3:
    .quad   L100006
    .quad   7
    .quad   9
    .quad   2300
L100006: .L100006:
    .ascii  "longfilename.ml"
    .byte   0
    .data
    .quad   3072
_camlLongfilename__4:
    .quad   L100005
    .quad   5
    .quad   9
    .quad   2300
L100005: .L100005:
    .ascii  "longfilename.ml"
    .byte   0
...

The above is terribly redundant: the name of the source file is duplicated for each assertion. The culprit appears to be in bytecomp/translcore.ml:

let assert_failed loc =
  (* [Location.get_pos_info] is too expensive *)
  let fname = match loc.Location.loc_start.Lexing.pos_fname with
              | "" -> !Location.input_name
              | x -> x
  in
  let pos = loc.Location.loc_start in
  let line = pos.Lexing.pos_lnum in
  let char = pos.Lexing.pos_cnum - pos.Lexing.pos_bol in
  Lprim(Praise, [Lprim(Pmakeblock(0, Immutable),
          [transl_path Predef.path_assert_failure;
           Lconst(Const_block(0,
              [Const_base(Const_string fname);
               Const_base(Const_int line);
               Const_base(Const_int char)]))])])
;;

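At first sight, it looks as if it would be enough to give Const_base(Const_string fname) a name, store it in a compile-time hash table, and reuse it. As an intra-module optimization, the change might be manageable (as long as the hash table is reset at each compilation unit).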

I am a bit out of my depth here, especially for the "resetting at each compilation unit" part. Any hints?
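The hash-table idea can be sketched outside the compiler as follows. This is only a toy model: fname_labels, next_label, reset_unit and label_of_fname are hypothetical names, not functions from the OCaml sources, and where a real patch would hook the reset into the compiler driver is left open.

```ocaml
(* Toy model of sharing file-name constants within one compilation unit.
   Every name here is hypothetical; none of this is compiler code. *)

(* file name -> label of the string constant already allocated for it *)
let fname_labels : (string, int) Hashtbl.t = Hashtbl.create 7

(* counter used to hand out fresh labels *)
let next_label = ref 0

(* To be called at the start of every compilation unit, so that no label
   leaks from one unit into the next. *)
let reset_unit () =
  Hashtbl.clear fname_labels;
  next_label := 0

(* Return the label for [fname], allocating a new one only on first use. *)
let label_of_fname fname =
  match Hashtbl.find_opt fname_labels fname with
  | Some lbl -> lbl
  | None ->
      let lbl = !next_label in
      incr next_label;
      Hashtbl.add fname_labels fname lbl;
      lbl
```

With this shape, three assertions in longfilename.ml would all receive the same label, and calling reset_unit before the next file restores an empty table.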

1 Answer

There is already a mechanism in the OCaml compiler to share some constants: see asmcomp/compilenv.ml and its use, in particular of the structured_constants value, in asmcomp/cmmgen.ml. I am not familiar with this code, so I am not sure why your particular use case is not shared, but the lambda code seems to distinguish between Const_base (Const_string foo) and Const_immstring foo; the latter are shared, and maybe the former are not.

I don't know what the intended semantics of immstring is. It seems to be used internally by the compiler to compile method labels (bytecomp/translclass.ml), but it is not exposed to the input language.

(I suspect the distinction exists because strings are mutable, so sharing user-visible strings would be observable and would change program behavior. But string constants are already lambda-lifted, so users can already observe semantically inconsistent sharing. Increasing the sharing of user-visible strings would probably still be rejected as a compatibility break.)
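To illustrate why sharing a mutable constant is observable, here is a small model. It uses Bytes as a stand-in for the mutable strings of OCaml 3.12, since string literals cannot be mutated in current OCaml; the names literal and f are made up for the example.

```ocaml
(* Model of a lambda-lifted string literal: the constant is allocated once
   and every call to [f] returns that same block. [Bytes] stands in for
   the mutable strings of OCaml 3.12. *)
let literal = Bytes.of_string "abc"

let f () = literal

let () =
  let s = f () in
  (* mutate the value that [f] returned *)
  Bytes.set s 0 'X';
  (* a later call sees the mutation, because the block is shared *)
  print_endline (Bytes.to_string (f ()))   (* prints "Xbc" *)
```

If two unrelated literals with the same contents were merged by the compiler, a mutation through one would show up in the other in the same way.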

Looking at the way those immediate strings are handled by the constant emitting code (asmcomp/cmmgen.ml:emit_constant), they are represented like the usual strings, so maybe you could just patch the compiler to use an immstring in assert_failed and things would work.
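As a rough sketch of what that change amounts to, the model below uses stand-in types whose constructor names mirror the ones quoted from translcore.ml; they are not the compiler's definitions. The function count_string_blocks encodes an assumed emission policy (Const_immstring shared by content, Const_base (Const_string _) never shared), which is only the guess made above and not a description of cmmgen.ml.

```ocaml
(* Stand-in types: same constructor names as in the question, but this is
   a toy model, not the compiler's own definition. *)
type constant = Const_int of int | Const_string of string

type structured_constant =
  | Const_base of constant
  | Const_immstring of string
  | Const_block of int * structured_constant list

(* Location payload built with a mutable-string constant. *)
let loc_before fname line char =
  Const_block (0, [Const_base (Const_string fname);
                   Const_base (Const_int line);
                   Const_base (Const_int char)])

(* The same payload with the one-constructor change suggested above. *)
let loc_after fname line char =
  Const_block (0, [Const_immstring fname;
                   Const_base (Const_int line);
                   Const_base (Const_int char)])

(* ASSUMED policy: count the string blocks an emitter would produce if it
   shared Const_immstring by content and never shared Const_string. *)
let count_string_blocks consts =
  let seen = Hashtbl.create 7 in
  let rec go n = function
    | Const_base (Const_string _) -> n + 1
    | Const_base (Const_int _) -> n
    | Const_immstring s ->
        if Hashtbl.mem seen s then n
        else (Hashtbl.add seen s (); n + 1)
    | Const_block (_, fields) -> List.fold_left go n fields
  in
  List.fold_left go 0 consts

let () =
  (* the three assertions of the example: lines 2, 3 and 4, column 4 *)
  let positions = [ (2, 4); (3, 4); (4, 4) ] in
  let build mk = List.map (fun (l, c) -> mk "longfilename.ml" l c) positions in
  Printf.printf "%d %d\n"
    (count_string_blocks (build loc_before))
    (count_string_blocks (build loc_after))   (* prints "3 1" *)
```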

[EDIT BY OP]

Changing Const_base (Const_string fname) into Const_immstring fname, while slightly incompatible, allows OCaml to compile itself and to compile Frama-C, and the resulting Frama-C passes its regression tests. On the original example, the effect is as follows, which is exactly the desired result:

$ cat longfilename.s 
...
    .data
    .quad   3072
_camlLongfilename__2:
    .quad   L100005
    .quad   9
    .quad   9
    .data
    .quad   3072
_camlLongfilename__3:
    .quad   L100005
    .quad   7
    .quad   9
    .data
    .quad   3072
_camlLongfilename__4:
    .quad   L100005
    .quad   5
    .quad   9
    .quad   2300
L100005: .L100005:
    .ascii  "longfilename.ml"
    .byte   0
answered 2012-04-04T12:27:00.027