In CSV download processes, Pacific Islander language characters (e.g., Māori, Samoan, Tongan, Fijian, Hawaiian) are usually handled through character encoding, and the key factor is whether the CSV is encoded in UTF-8.
If the encoding is handled correctly, characters like macrons or diacritics will export and import properly. If not, they often become garbled (mojibake).
Examples include macrons and special letters:
Language | Example | Character |
Māori | Tāmaki | ā |
Hawaiian | Hawaiʻi | ʻ (okina) |
Samoan | fa‘a Samoa | glottal mark |
Tongan | fakaʻapaʻapa | ʻ |
Cook Islands Māori | Kūki ʻĀirani | macrons (ū, Ā) |
These characters are Unicode characters, not basic ASCII.
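To make the "not basic ASCII" point concrete, here is a small Python sketch that reports the Unicode code point and the UTF-8 bytes of a few characters from the table. The helper name `describe` is made up for this illustration.

```python
def describe(ch):
    """Return (code point, UTF-8 bytes as hex) for a single character."""
    code_point = f"U+{ord(ch):04X}"
    utf8_hex = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    return code_point, utf8_hex

# "a" is plain ASCII (one byte); the others need two or three bytes in UTF-8.
samples = ["a", "ā", "ʻ", "‘"]
table = {ch: describe(ch) for ch in samples}
```

Any system that assumes one byte per character will therefore split these letters apart.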
Best practice for CSV exports today is UTF-8 encoding.
Example CSV row:
Name,City
Tāmaki Makaurau,Auckland
Hawaiʻi,Honolulu
UTF-8 correctly preserves:
macrons (ā ē ī ō ū)
glottal stops (ʻ)
accented characters
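A minimal sketch of writing and re-reading the example rows above with Python's standard `csv` module, assuming an explicit `encoding="utf-8"` on both sides. The function names and the temporary file path are illustrative only.

```python
import csv
import os
import tempfile

rows = [
    ["Name", "City"],
    ["Tāmaki Makaurau", "Auckland"],
    ["Hawaiʻi", "Honolulu"],
]

def write_csv_utf8(path, rows):
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)

def read_csv_utf8(path):
    with open(path, "r", encoding="utf-8", newline="") as f:
        return list(csv.reader(f))

path = os.path.join(tempfile.mkdtemp(), "places.csv")
write_csv_utf8(path, rows)
round_tripped = read_csv_utf8(path)  # identical to rows, macrons included
```

Stating the encoding explicitly matters because `open()` otherwise falls back to a platform-dependent default.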
Most modern systems support UTF-8:
web apps
AWS pipelines
Postgres
Excel (recent versions)
Google Sheets
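Issues typically appear when CSV files are opened in older Excel workflows.
When a CSV without a BOM is opened directly, Excel has historically assumed the system's legacy code page (Windows-1252 on most Western-locale installations) instead of UTF-8.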
Result:
Intended | Garbled |
Tāmaki | TÄmaki |
Hawaiʻi | HawaiÊ»i |
fa‘a | faâ€˜a |
This is not data corruption, just incorrect decoding.
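The mis-decoding can be reproduced, and reversed, in a few lines of Python. This is a sketch, assuming the reader applied Windows-1252 (`cp1252` in Python) to UTF-8 bytes; the function names are hypothetical.

```python
def misdecode(text):
    """Encode as UTF-8, then decode as Windows-1252, as a legacy reader would."""
    return text.encode("utf-8").decode("cp1252")

def repair(garbled):
    """Undo the mistake: recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")

garbled = misdecode("Hawaiʻi")   # 'HawaiÊ»i'
restored = repair(garbled)       # 'Hawaiʻi'

# Note: "ā" encodes to C4 81, and byte 0x81 is unassigned in Windows-1252,
# so Python's strict cp1252 codec raises UnicodeDecodeError for "Tāmaki".
```

Because the round trip restores the original text, the bytes in the file were never damaged; only the interpretation was wrong.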
Good export pipelines do one of the following:
UTF-8 + BOM
This helps Excel detect the encoding.
Example first bytes:
EF BB BF
Many SaaS apps add this automatically.
UTF-8 without BOM
The standard for APIs and data pipelines.
Works fine in:
Google Sheets
modern Excel import
databases
programming languages
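The difference between UTF-8 with a BOM and plain UTF-8 is only the three leading bytes. A minimal Python sketch, assuming the standard `utf-8-sig` codec is used to add the BOM (the `csv_bytes` helper is illustrative):

```python
import codecs
import csv
import io

def csv_bytes(rows, with_bom):
    """Serialize rows to CSV bytes, optionally prefixed with a UTF-8 BOM."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends EF BB BF when encoding; "utf-8" does not.
    encoding = "utf-8-sig" if with_bom else "utf-8"
    return buffer.getvalue().encode(encoding)

rows = [["Name", "City"], ["Tāmaki Makaurau", "Auckland"]]
for_excel = csv_bytes(rows, with_bom=True)      # starts with EF BB BF
for_pipeline = csv_bytes(rows, with_bom=False)  # starts with "Name"

# Apart from the BOM, the two payloads are byte-for-byte identical.
assert for_excel == codecs.BOM_UTF8 + for_pipeline
```

On the reading side, decoding with `utf-8-sig` accepts both variants and drops the BOM if present, which keeps it out of the first column name.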
Typical SaaS CSV generation:
Database (UTF-8)
↓
application layer
↓
CSV serializer
↓
UTF-8 encoded file
↓
download
Characters remain intact as long as every stage keeps the text in UTF-8.
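The serializer and download steps of that flow can be sketched in framework-neutral Python. This is an illustration under stated assumptions, not any particular product's implementation: `build_csv_download` and the default filename are hypothetical, and the rows are assumed to arrive from the database as already-decoded strings.

```python
import csv
import io

def build_csv_download(rows, filename="export.csv"):
    """Return (headers, body) for an HTTP response carrying a UTF-8 CSV."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # Encode exactly once, at the boundary where text becomes bytes.
    body = buffer.getvalue().encode("utf-8")
    headers = {
        # Declaring the charset tells clients how to decode the body.
        "Content-Type": "text/csv; charset=utf-8",
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(len(body)),
    }
    return headers, body

headers, body = build_csv_download([["Name"], ["Hawaiʻi"]])
```

Note that `Content-Length` is computed from the encoded bytes, not from the character count, since multi-byte characters make the two differ.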
Google Sheets always exports CSV as UTF-8. There is no option to choose the encoding, which is why you don’t see a “Save as UTF-8” setting.
When you download:
File → Download → Comma-separated values (.csv)
Google automatically produces a UTF-8 encoded CSV, so characters like:
ā ē ī ō ū
ʻ (okina)
é ñ ü
are preserved correctly in the file.
The issue normally appears after download, when the CSV is opened in Excel.
Older Excel behaviour:
CSV opened directly → assumes Windows-1252 encoding
Result:
Correct | Excel shows |
Tāmaki | TÄmaki |
Hawaiʻi | HawaiÊ»i |
The CSV itself is fine — Excel is interpreting it incorrectly.
Instead of double-clicking the file:
Open Excel
Go to the Data tab
Choose From Text/CSV
Select the file
Set File Origin → 65001: Unicode (UTF-8)
Then characters display correctly.
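Where changing the way people open the file is not practical, another option is to add a BOM to the downloaded file so that a direct open in Excel detects UTF-8. A minimal Python sketch, assuming the file is valid UTF-8 to begin with; `add_utf8_bom` is a hypothetical helper name.

```python
import codecs
from pathlib import Path

def add_utf8_bom(path):
    """Prepend a UTF-8 BOM if the file lacks one. Returns True if it changed."""
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False  # already marked; leave the file untouched
    # Refuse to label a file as UTF-8 unless it really decodes as UTF-8.
    data.decode("utf-8")  # raises UnicodeDecodeError otherwise
    p.write_bytes(codecs.BOM_UTF8 + data)
    return True
```

The function works on raw bytes, so the CSV content itself is never re-encoded or re-parsed.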