I'm working on a servlet that accepts a zip file, unzips it, and displays the contents of the CSV files inside.

So far I'm able to display a few records. However, as shown in the image below, one of the records is displaying question marks / unrecognised characters.

I checked the CSV file and it looks perfectly fine. I also tried deleting the affected text and typing something else in its place, but the problem persists.

Image of the problem:

https://dl.dropbox.com/u/11910420/Screen%20Shot%202012-09-07%20at%203.18.46%20PM.png

public class AdminBootStrap extends HttpServlet {

public void doPost(HttpServletRequest req, HttpServletResponse resp)
        throws IOException {
    resp.setContentType("text/plain");

    PrintWriter out = resp.getWriter();

    try {
        ServletFileUpload upload = new ServletFileUpload();

        FileItemIterator iterator = upload.getItemIterator(req);

        while (iterator.hasNext()) {
            FileItemStream item = iterator.next();
            InputStream in = item.openStream();

            if (item.isFormField()) {
                out.println("Got a form field: "
                        + item.getFieldName());
            } else {
                out.println("Got an uploaded file: "
                        + item.getFieldName() + ", name = "
                        + item.getName());

                ZipInputStream zis = new ZipInputStream(
                        new BufferedInputStream(in));

                ZipEntry entry;

                // Read each entry from the ZipInputStream until no
                // more entry found indicated by a null return value
                // of the getNextEntry() method.
                //

                byte[] buf = new byte[5000];
                int len;
                String s = null;

                while ((entry = zis.getNextEntry()) != null) {

                    out.println("Unzipping: " + entry.getName());

                    if (entry.getName().equalsIgnoreCase("booking.csv")) {

                        while ((len = zis.read(buf)) > 0) {
                            s = new String(buf);

                            String[] arrStr = s.split("\\n");
                            for (String a : arrStr) {

                                out.println(a);

                            }// end for

                        }

                    }

Any ideas?

1 Answer

The cause is s = new String(buf), which decodes the bytes into a string using the platform's default encoding. Unfortunately, the default encoding on GAE is US-ASCII.
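To see the effect in isolation, here is a minimal sketch. The class name CharsetDemo and the sample text are made up for illustration, and it assumes Java 7+ for StandardCharsets. It decodes the same UTF-8 bytes once as US-ASCII and once as UTF-8; any byte outside the ASCII range cannot be decoded as US-ASCII, so it comes out as a replacement character, which typically shows up as a question mark.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the servlet in the question.
final class CharsetDemo {

    // Decode the bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "café" encoded as UTF-8: the 'é' takes two bytes, both above 0x7F.
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // What new String(bytes) would use on this machine.
        System.out.println("Platform default: " + Charset.defaultCharset());

        // Wrong charset: the two non-ASCII bytes become replacement characters.
        System.out.println("As US-ASCII: " + decode(utf8Bytes, StandardCharsets.US_ASCII));

        // Matching charset: the text round-trips correctly.
        System.out.println("As UTF-8:    " + decode(utf8Bytes, StandardCharsets.UTF_8));
    }
}
```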

You need to know the encoding of your CSV and specify it when decoding. For example, if it is UTF-8:

s = new String(buf, "UTF-8");
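Two related things to watch in the loop from the question: new String(buf, ...) decodes the whole 5000-byte buffer rather than just the len bytes that were read, and a multi-byte UTF-8 character can be split across two reads. One way around both is to let a Reader do the decoding. The following is only a sketch under some assumptions: the class and method names (ZipCsvReader, readLines) are hypothetical, the CSV is assumed to be UTF-8, and Java 7+ is assumed for StandardCharsets.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the original servlet.
final class ZipCsvReader {

    /**
     * Returns the lines of the first entry whose name matches entryName
     * (case-insensitive), decoded as UTF-8. Returns an empty list if no
     * such entry exists. The caller remains responsible for closing the
     * underlying stream.
     */
    static List<String> readLines(InputStream zipped, String entryName) throws IOException {
        List<String> lines = new ArrayList<String>();
        ZipInputStream zis = new ZipInputStream(zipped);
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            if (!entry.getName().equalsIgnoreCase(entryName)) {
                continue;
            }
            // The reader decodes with an explicit charset and handles
            // characters that straddle internal buffer boundaries.
            // ZipInputStream reports end-of-stream at the end of the
            // current entry, so readLine() stops there.
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(zis, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
            break; // only the first matching entry is read
        }
        return lines;
    }
}
```

In the servlet this could be called as ZipCsvReader.readLines(in, "booking.csv"), printing each returned line with out.println. For non-ASCII characters to reach the browser intact, the response encoding probably needs to be set as well, for example resp.setCharacterEncoding("UTF-8") before calling resp.getWriter().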
answered 2012-09-07T08:06:27.793