CSV files are everywhere. When you export data from a database, pull a report from analytics software, or download a dataset from a government portal, you're almost always getting a CSV. It's the lowest common denominator of data formats — and that's exactly why it's still so useful.
But CSV is also one of those formats that looks simple until it bites you. Encoding issues in Excel on Mac, fields with commas that break parsing, inconsistent line endings — there's more nuance here than the format's simplicity suggests. This guide covers what CSV actually is, how to work with it reliably, and when you'd want to convert it to JSON.
What is a CSV File?
CSV stands for Comma-Separated Values. It's a plain text format where each line represents a row of data, and fields within each row are separated by commas. That's really all there is to the format at its core.
name,age,city
Alice,30,New York
Bob,25,Chicago
Carol,28,Austin
The first row is typically a header row with column names. Everything after that is data. Unlike Excel's .xlsx format — which is a compressed collection of XML files storing formulas, formatting, multiple sheets, and metadata — a CSV is just text. Open it in Notepad, TextEdit, or cat it in the terminal, and you'll see exactly what's there.
This simplicity is the point. CSV has no proprietary lock-in. Any language, any OS, any tool can open it as plain text, although parsing it correctly still takes some care, as the next section shows.
CSV Structure: The Details That Actually Matter
Rows and columns
A line break (\n or \r\n) ends each row, except where it appears inside a quoted field. Commas separate columns within each row. Simple enough, until your data contains commas.
Quoting fields that contain commas or line breaks
If a field value contains a comma or a line break, it must be wrapped in double quotes. If it contains double quotes, the field must be quoted as well, and each quote inside it is escaped by doubling it.
name,address,note
Alice,"123 Main St, Apt 4B","Said ""hello"" at checkin"
Bob,456 Oak Ave,No notes
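To see why these quoting rules matter, here is a minimal Python sketch using only the standard library. It parses the sample above twice, once with a naive split and once with the csv module, and then writes a row back out. The variable names are illustrative, not part of any tool mentioned in this guide.

```python
import csv
import io

# The sample from above, as it would appear in a file.
sample = (
    'name,address,note\n'
    'Alice,"123 Main St, Apt 4B","Said ""hello"" at checkin"\n'
    'Bob,456 Oak Ave,No notes\n'
)

# Naive approach: splitting on commas cuts the quoted address in two.
naive_row = sample.splitlines()[1].split(",")

# csv.reader understands quoted fields and doubled quotes.
rows = list(csv.reader(io.StringIO(sample, newline="")))

# csv.writer applies the same rules in reverse: it quotes only the
# fields that need it and doubles any embedded quotes.
out = io.StringIO(newline="")
csv.writer(out).writerow(rows[1])
written = out.getvalue()

print(len(naive_row))  # the naive split yields 4 pieces, not 3
print(rows[1])
print(written)
```

The parsed row holds the clean values (no surrounding quotes, single quotes around hello), while the written line restores the quoting that the file format needs.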
Delimiter variants
Despite the "comma" in the name, CSV files sometimes use tabs (\t), semicolons (;), or pipes (|) as delimiters. Tab-separated files are often called TSV. European locales that use commas as decimal separators commonly use semicolons as the field delimiter to avoid ambiguity.
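As a rough illustration, Python's csv module accepts the delimiter as a parameter, so the same reader handles semicolon- and tab-separated files. The sample data below is made up for this sketch, and the decimal-comma conversion is one simple approach that assumes no thousands separators.

```python
import csv
import io

# Hypothetical semicolon-delimited export with decimal commas.
semicolon_data = "name;price;city\nAlice;3,50;Berlin\nBob;12,00;Paris\n"

# Hypothetical tab-separated (TSV) data.
tab_data = "name\tnote\nAlice\tlikes tea, not coffee\n"

# Pass the delimiter explicitly rather than relying on detection.
semi_rows = list(
    csv.DictReader(io.StringIO(semicolon_data, newline=""), delimiter=";")
)
tab_rows = list(
    csv.DictReader(io.StringIO(tab_data, newline=""), delimiter="\t")
)

# The parser leaves "3,50" as text; turning it into a number is a
# separate step (this simple replace assumes no thousands separators).
prices = [float(row["price"].replace(",", ".")) for row in semi_rows]

print(prices)
print(tab_rows[0]["note"])
```

Note that the comma inside the TSV note needs no quoting, because the comma is no longer the delimiter.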
The UTF-8 issue in Excel on Mac
This trips up a lot of people. Excel on macOS doesn't always detect UTF-8 encoding automatically. When you open a UTF-8 CSV with non-ASCII characters (accented letters, Japanese, emoji), you can end up with garbled text. The fix: use Excel's data import wizard and explicitly specify UTF-8 encoding, or save the CSV with a BOM (byte order mark) at the start of the file. On Windows, Excel expects UTF-8 with BOM to correctly auto-detect the encoding.
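One way to add the BOM, sketched here in Python, is to open the output file with the utf-8-sig codec. The file name and sample rows are placeholders; the sketch writes to a temporary directory so it can be run safely.

```python
import csv
import os
import tempfile

rows = [["name", "city"], ["Zoë", "Kraków"], ["健太", "東京"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "export.csv")  # placeholder file name

    # "utf-8-sig" writes the BOM bytes EF BB BF, then ordinary UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        csv.writer(f).writerows(rows)

    # Check the raw bytes: the file should start with the BOM.
    with open(path, "rb") as f:
        has_bom = f.read().startswith(b"\xef\xbb\xbf")

    # Reading with "utf-8-sig" strips the BOM again.
    with open(path, newline="", encoding="utf-8-sig") as f:
        round_trip = list(csv.reader(f))

    # Reading with plain "utf-8" leaves the BOM glued to the first header.
    with open(path, newline="", encoding="utf-8") as f:
        first_cell_plain = next(csv.reader(f))[0]

print(has_bom)
print(round_trip == rows)
print(repr(first_cell_plain))
```

The last read shows the flip side of the fix: a consumer that does not expect a BOM sees an invisible extra character in the first column name, so it helps to know who will read the file before adding one.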
Practical Uses: Parsing and Converting CSV
Parsing CSV in JavaScript
Don't split on commas manually — you'll break on quoted fields. Use a proper parser like PapaParse:
import Papa from 'papaparse';
const result = Papa.parse(csvString, { header: true });
console.log(result.data);
// [{ name: 'Alice', age: '30', city: 'New York' }, ...]
Reading CSV in Python
import csv
with open('data.csv', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['name'], row['city'])
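One detail worth knowing: csv.DictReader returns every value as a string, so numeric columns need an explicit conversion. The sketch below uses the sample data from the start of this guide, read from an in-memory string instead of a file. The read_people helper is a hypothetical name for this example.

```python
import csv
import io

sample = "name,age,city\nAlice,30,New York\nBob,25,Chicago\nCarol,28,Austin\n"

def read_people(f):
    """Yield one dict per row, with the age column converted to int."""
    for row in csv.DictReader(f):
        row["age"] = int(row["age"])  # raises ValueError on bad input
        yield row

people = list(read_people(io.StringIO(sample, newline="")))
average_age = sum(p["age"] for p in people) / len(people)

print(people[0])
print(round(average_age, 2))
```

Letting int() raise on a malformed value is a deliberate choice here; for messy real-world files you may prefer to catch the error and log or skip the row.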
Converting CSV to JSON
When you're feeding data into a web app or API, JSON is usually a better fit than CSV. Each CSV row maps cleanly to a JSON object, and the resulting array of objects is easy to iterate in any frontend framework. Use the CSV to JSON converter to do this without writing any code. To go the other direction, the JSON to CSV converter handles nested objects and outputs a flat table. Once you have JSON, the JSON Formatter helps you inspect and validate the structure.
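If you would rather script the conversion, the row-to-object mapping can be sketched in a few lines of Python. This is an illustrative sketch, not how the converters mentioned above work internally. The function names and the dotted-key convention for nested objects are assumptions made for this example.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Turn CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    return json.dumps(list(reader), indent=2, ensure_ascii=False)

def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

def json_to_csv(json_text):
    """Turn a JSON array of objects into CSV text (nested keys dotted)."""
    records = [flatten(record) for record in json.loads(json_text)]
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)  # missing keys become empty fields
    return out.getvalue()

sample = "name,age,city\nAlice,30,New York\nBob,25,Chicago\n"
json_text = csv_to_json(sample)
nested = '[{"name": "Alice", "address": {"city": "Austin", "zip": "78701"}}]'
csv_text = json_to_csv(nested)

print(json_text)
print(csv_text)
```

All values in the JSON output stay strings (for example "30"), because CSV carries no type information; convert them before serializing if the consumer expects numbers.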
Frequently Asked Questions
- What's the difference between CSV and TSV?
- TSV (Tab-Separated Values) uses a tab character instead of a comma as the field delimiter. It's useful when your data frequently contains commas — like addresses or prose text — since tabs are less common in typical data. The tradeoff is that tabs are invisible in most text editors, making TSV files harder to read at a glance.
- How do I handle commas inside field values?
- Wrap the field in double quotes. This is the convention documented in RFC 4180. Most parsers handle this automatically, but if you're writing a parser from scratch, you need to handle the quoted-field case explicitly. Also remember that a double quote inside a quoted field is represented as two consecutive double quotes ("").
- Why does Excel on Mac garble my CSV file?
- Excel on Mac historically defaults to a legacy encoding rather than UTF-8 when opening CSV files directly. To avoid this, use the Data → Import from Text/CSV feature and set the encoding to UTF-8 explicitly. Alternatively, save your CSV with a UTF-8 BOM. In Python, for example, you can open the output file with the utf-8-sig encoding and write to it with the csv module.
Summary
- CSV is plain text: rows separated by line breaks, fields separated by commas (or other delimiters)
- Fields containing commas, line breaks, or double quotes must be wrapped in double quotes per RFC 4180, with embedded quotes doubled
- Excel's UTF-8 handling on Mac is unreliable — use the import wizard or add a BOM
- For web apps and APIs, converting CSV to JSON is usually the right move
- Always use a proper parser (PapaParse, Python's csv module) and don't split on commas manually
Ready to work with your CSV data?
- CSV to JSON — convert CSV rows to JSON objects instantly
- JSON to CSV — flatten JSON arrays back to CSV format
- JSON Formatter — pretty-print and validate the resulting JSON