java - 文字列の圧縮と解凍

Question

score 1 · Accepted Answer

To do an in-place encoding, the encoded string must never be longer than the original string. Suppose we assume the following encoding rules:

No numeric digits in the original string (so no count delimiter characters are needed)
A run length of 1 is never explicitly coded (so abc remains abc)

I believe that with these assumptions, a run-length encoding is not ambiguous and will never be longer than the string itself. Then the following algorithm (pseudocode) should do the job of encoding in place:

currentChar ← string[0]
nextOutputPos ← 1
nextReadPos ← 1
count ← 1
while (nextReadPos < length of string) {
    nextChar ← string[nextReadPos++];
    if (nextChar == currentChar) {
        count++;
    } else {
        if (count > 1) {
            write (count as a string) to string at position nextOutputPos
            nextOutputPos ← nextOutputPos + (length of count as a string)
        }
        string[nextOutputPos++] ← currentChar ← nextChar;
    }
}

At the end, the encoded string is contained in the half-open range [0, nextOutputPos) of string.

score 1 · Accepted Answer

A simple reverse scan does provide a (seemingly) good solution to the encoding part at least. I am doing one scan from right to left and overwriting the portions of the string with the occurrence count.

char * enc(char * ip)
{
    int r,op;
    int l=strlen(ip);
    r=l-1;
    char curr;
    op=r;
    int curr_count=1,mod_curr_count;
    while(r>=0)
    {
        curr=ip[r];

        while(ip[--r]==curr)
        {

            curr_count++;
        }
        if(curr_count!=1)
        {
            while(curr_count)
            {
            mod_curr_count=curr_count%10;
            ip[op--]=(char)(mod_curr_count+48);
            curr_count/=10;
            }
            ip[op--]=curr;
            curr_count=1;

        }
        else
        {
        ip[op--]=curr;
        }
    }

    ip=ip+op+1;
    return ip;
}

Input : aaaaaaaaaaaabbbfffffffffffffffqqqqqqqqqqqqqqqqqqccccpoii

Output: a12b3f15q18c4poi2

score 0 · Accepted Answer

正規表現を利用したJavaの可能性は次のとおりです。

String str = "aaabbbc";  // string to be encoded

StringBuilder sb = new StringBuilder();  // to hold encoded string

for (String s : str.split("(?<=(.))(?!\\1)")) {
    sb.append(s.charAt(0));
    if (s.length() > 1) // only append length if it's > 1
        sb.append(s.length());
}

System.out.println(sb.toString());

a3b3c

java - 文字列の圧縮と解凍

3 に答える 3

Related

Reference