Emacs 24.3.1, Windows 2003

I found that the 'byte-to-position' function behaves a little strangely.

According to the documentation:

 -- Function: byte-to-position byte-position
     Return the buffer position, in character units, corresponding to
     given BYTE-POSITION in the current buffer.  If BYTE-POSITION is
     out of range, the value is `nil'.  **In a multibyte buffer, an
     arbitrary value of BYTE-POSITION can be not at character boundary,
     but inside a multibyte sequence representing a single character;
     in this case, this function returns the buffer position of the
     character whose multibyte sequence includes BYTE-POSITION.**  In
     other words, the value does not change for all byte positions that
     belong to the same character.

We can make a simple experiment:

Create a buffer and evaluate this expression in it: (insert "a" (- (max-char) 128) "b")

Since a character takes at most 5 bytes in Emacs' internal coding system, the character between 'a' and 'b' occupies 5 bytes. (Note that the last 128 characters are used for 8-bit raw bytes and take only 2 bytes each, which is why I subtract 128 from (max-char).)
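To double-check those byte sizes, here is a minimal sketch that measures each character with 'position-bytes' in a temporary buffer. The helper name my-char-byte-length-at is hypothetical (it is not part of Emacs), and I append (max-char) only to show the 2-byte raw-byte case.

```elisp
;; Hypothetical helper (not part of Emacs): byte length of the character at
;; character position POS, measured as the distance between the byte
;; positions of POS and POS+1.
(defun my-char-byte-length-at (pos)
  (- (position-bytes (1+ pos)) (position-bytes pos)))

(with-temp-buffer
  ;; Same text as in the experiment, plus (max-char) as a raw 8-bit byte.
  (insert "a" (- (max-char) 128) "b" (max-char))
  (list (my-char-byte-length-at 1)    ; "a"
        (my-char-byte-length-at 2)    ; (- (max-char) 128)
        (my-char-byte-length-at 3)    ; "b"
        (my-char-byte-length-at 4)))  ; (max-char), a raw 8-bit byte
```

If the sizes are as described above, this should evaluate to (1 5 1 2).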

Then define and eval this test function:

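(require 'cl-lib)  ; provides `cl-loop'; the bare `loop' needs the old `cl' package

(defun test ()
  "Show the character position `byte-to-position' returns for each byte."
  (interactive)
  ;; Byte position of the last byte in the buffer.
  (let ((max-bytes (1- (position-bytes (point-max)))))
    (message "%s"
             (cl-loop for i from 1 to max-bytes collect (byte-to-position i)))))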

What I get is "(1 2 3 2 2 2 3)".

Each number in the list is a character position in the buffer. Because the buffer contains one 5-byte character, there should be five '2's between the '1' and the final '3', i.e. "(1 2 2 2 2 2 3)". How can the stray '3' among the '2's be explained?
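For comparison, here is a sketch of the mapping I would expect from the documentation, computed only with 'position-bytes' (the inverse direction). The function my-expected-byte-to-position is a hypothetical helper written for this question, not an Emacs function, and it scans linearly, so it is only meant for small test buffers.

```elisp
(require 'cl-lib)

;; Hypothetical reference helper, not part of Emacs: return the character
;; position whose byte range [start, end) contains BYTE, or nil if BYTE is
;; out of range.  It scans the buffer using `position-bytes'.
(defun my-expected-byte-to-position (byte)
  (cl-loop for pos from (point-min) below (point-max)
           when (and (<= (position-bytes pos) byte)
                     (< byte (position-bytes (1+ pos))))
           return pos))

(with-temp-buffer
  (insert "a" (- (max-char) 128) "b")
  ;; Collect the expected character position for every byte in the buffer.
  (cl-loop for i from 1 below (position-bytes (point-max))
           collect (my-expected-byte-to-position i)))
;; => (1 2 2 2 2 2 3)
```

Replacing my-expected-byte-to-position with byte-to-position in the last loop gives a direct side-by-side check of the two results.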
