Determining the encoding of a CSV file

The Problem

When uploading or importing a CSV file, PANDA raises an error related to the encoding of the file.

Encodings are a complex subject, which we won't try to explain here. (Joel Spolsky has written an excellent primer if you want the technical details.) The important thing to know is that PANDA cannot infer from a CSV file what encoding it is in. If the file was saved in an encoding other than the default (UTF-8) and contains characters outside the basic ASCII range, such as accented letters, PANDA may be unable to read it and will generate errors.

The default is right in the large majority of cases, but when it isn't, figuring out the correct encoding can be very tricky.
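
To make the failure concrete, here is a minimal Python sketch of what goes wrong when a file saved in another encoding is read as UTF-8. The sample data and the helper name decodes_as are made up for illustration; this is not PANDA's own import code.

```python
# Hypothetical sample data: a two-line CSV saved as Latin-1, not UTF-8.
raw = "name,city\nJosé,Montréal\n".encode("latin-1")


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes can be decoded with the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as(raw, "utf-8"))    # False: the accented bytes are not valid UTF-8
print(decodes_as(raw, "latin-1"))  # True
```

A file containing only plain ASCII characters decodes identically under UTF-8 and Latin-1, which is one reason the default works so often.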

The Solution

Always ask your source what the encoding of the file is. Input the correct encoding after you select the CSV file to upload.

If you have no way of finding out the correct encoding of the file, then try the following encodings, in this order:

If none of these work, the likelihood that you will determine the encoding without additional information from the source is very low. In theory you may be able to guess the encoding based on the language of the author; however, this is not a recommended practice.
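
If you would rather test a file before uploading it, the try-each-encoding-in-order approach described above can be sketched in Python roughly as follows. The function name first_decodable and the default candidate list are assumptions chosen for illustration, not PANDA's recommended list; substitute the encodings you actually want to try.

```python
from typing import Iterable, Optional

# Hypothetical candidates, for illustration only. Replace with the
# encodings you want to try, in your preferred order.
CANDIDATES = ("utf-8", "cp1252")


def first_decodable(data: bytes,
                    candidates: Iterable[str] = CANDIDATES) -> Optional[str]:
    """Return the first encoding that decodes the data without error.

    Returns None if every candidate fails.
    """
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue  # Not valid in this encoding; try the next one.
        return encoding
    return None


# Usage with a file on disk (the path is a placeholder):
# with open("data.csv", "rb") as f:
#     print(first_decodable(f.read()))
```

Keep in mind that a successful decode does not prove the encoding is right. Some encodings, Latin-1 for example, accept any sequence of bytes, so inspect the decoded text for garbled characters before trusting the result.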