Binaries and Charlists

You have already seen Elixir strings written between double quotes:

iex> "Hello, world!"
"Hello, world!"

That is a binary: a sequence of bytes, encoded in UTF-8. Elixir also has a second representation for text, the charlist, written with the ~c sigil:

iex> ~c"Hello, world!" (1)
~c"Hello, world!"
1 The ~c sigil builds a charlist, a list of Unicode codepoints.

Most of the time you only need the double-quoted binary form. The charlist form shows up when you talk to Erlang libraries, so it is worth knowing what it is.

Older Elixir code uses plain 'Hello' single quotes for charlists. The ~c"…​" sigil is the modern spelling and the one you will see in Elixir 1.20 code.

The Shape of a Binary

A binary is built out of bytes. You can see this by looking at the first byte of a simple ASCII string:

iex> "hello"
"hello"
iex> byte_size("hello")
5
iex> <<first, _rest::binary>> = "hello" (1)
"hello"
iex> first
104 (2)
1 The <<…​>> syntax is the bitstring form. Here we peel off the first byte.
2 104 is the byte value of the letter h.
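The same <<…>> form can peel off more than one byte at a time, and it can also build binaries. The following is a minimal sketch using ASCII-only data; the variable names are illustrative only.

```elixir
# Split a binary into a fixed-size prefix and the remaining bytes.
<<prefix::binary-size(2), rest::binary>> = "hello"
IO.inspect(prefix) # prints "he"
IO.inspect(rest)   # prints "llo"

# Match three single bytes in a row; each variable is bound to an integer.
<<a, b, c>> = "abc"
IO.inspect([a, b, c], charlists: :as_lists) # prints [97, 98, 99]

# The same syntax builds a binary from byte values.
IO.inspect(<<97, 98, 99>>) # prints "abc"
```

A segment without a type, such as a or first, matches exactly one byte. The ::binary type matches a run of whole bytes, either a fixed number of them with size(n) or everything that is left.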

For ASCII-only text, one character equals one byte. For non-ASCII characters, UTF-8 uses two to four bytes per character, so byte_size/1 and String.length/1 can differ:

iex> byte_size("über")
5
iex> String.length("über")
4

Prefer String.length/1 when you want the number of characters a user sees (Elixir counts graphemes), and byte_size/1 when you care about storage or network size.
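To make the difference concrete, the sketch below counts bytes, codepoints, and graphemes for two spellings of the same word. The strings use \u escapes so the exact codepoints are unambiguous; the variable names are illustrative only.

```elixir
precomposed = "\u00FCber" # "ü" stored as the single codepoint U+00FC
decomposed = "u\u0308ber" # "u" followed by the combining diaeresis U+0308

# Each result is {bytes, codepoints, graphemes}.
counts =
  for text <- [precomposed, decomposed] do
    {byte_size(text), length(String.codepoints(text)), String.length(text)}
  end

IO.inspect(counts) # prints [{5, 4, 4}, {6, 5, 4}]

# Matching a plain byte and matching a UTF-8 codepoint give different results.
<<first_byte, _::binary>> = precomposed
<<first_codepoint::utf8, tail::binary>> = precomposed
IO.inspect(first_byte)      # prints 195
IO.inspect(first_codepoint) # prints 252
IO.inspect(tail)            # prints "ber"
```

Both strings look like the same four characters on screen, which is why String.length/1 returns 4 for each even though their byte sizes differ.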

The Shape of a Charlist

A charlist is simply a list where each element is a codepoint:

iex> ~c"hi"
~c"hi"
iex> [104, 105] == ~c"hi"
true
iex> hd(~c"hello")
104

IEx prints a list of integers as ~c"…" when every element is a printable ASCII character. That is only a display convenience; the data is still a list of integers:

iex> [104, 105, 99]
~c"hic"
iex> [104, 105, 1]
[104, 105, 1]
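If the charlist rendering hides what you want to see, you can ask inspect/2 for the raw integers. The sketch below also uses List.ascii_printable?/1, which is roughly the check that decides between the two renderings; the variable names are illustrative only.

```elixir
printable = [104, 105, 99]
mixed = [104, 105, 1]

# Both values are ordinary lists, however they are displayed.
IO.inspect(is_list(printable)) # prints true

# Only the first list consists entirely of printable ASCII characters.
IO.inspect(List.ascii_printable?(printable)) # prints true
IO.inspect(List.ascii_printable?(mixed))     # prints false

# Force the plain list rendering with the :charlists option.
IO.puts(inspect(printable, charlists: :as_lists)) # prints [104, 105, 99]
```

The same option works with IO.inspect/2, which is handy when you are debugging a list of small integers that happens to look like text.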

Converting Between the Two

Two helper functions bridge the forms:

iex> to_charlist("hello")
~c"hello"
iex> to_string(~c"hello")
"hello"

String.to_charlist/1 and List.to_string/1 do the same job but are stricter about their input: the first accepts only a binary, the second only a list.
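The difference in strictness is easiest to see with a value that is neither a binary nor a list. In the sketch below, ConversionDemo and strict_charlist/1 are hypothetical names introduced only for this illustration.

```elixir
defmodule ConversionDemo do
  # Hypothetical helper: wraps String.to_charlist/1 so that a
  # non-binary argument yields :error instead of raising.
  def strict_charlist(value) do
    {:ok, String.to_charlist(value)}
  rescue
    FunctionClauseError -> :error
  end
end

# The generic helpers convert other types too.
IO.inspect(to_string(123))                    # prints "123"
IO.inspect(to_charlist(123) == [49, 50, 51])  # prints true

# The strict function rejects anything that is not a binary.
IO.inspect(ConversionDemo.strict_charlist(123)) # prints :error

# Non-ASCII text survives a round trip: the charlist holds codepoints,
# and converting back re-encodes them as UTF-8.
codepoints = to_charlist("\u00FCber")
IO.inspect(codepoints == [252, 98, 101, 114])      # prints true
IO.inspect(to_string(codepoints) == "\u00FCber")   # prints true
```

The generic to_string/1 and to_charlist/1 are convenient at the edges of your program. The stricter functions are a reasonable choice when an unexpected type should fail loudly.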

Why It Matters

Elixir runs on the Erlang VM, and many older Erlang libraries accept and return charlists, not binaries. When a function from Erlang land returns something that IEx prints as ~c"…​", convert it to a string with to_string/1 before handing it back to the rest of your Elixir code:

iex> :inet.gethostname()
{:ok, ~c"my-laptop"}
iex> {:ok, hostname_charlist} = :inet.gethostname()
{:ok, ~c"my-laptop"}
iex> to_string(hostname_charlist)
"my-laptop"
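One way to keep charlists at the boundary is to wrap the Erlang call in a small function that returns a binary. The sketch below assumes a hypothetical HostInfo module; only :inet.gethostname/0 and :erlang.list_to_integer/1 come from Erlang itself.

```elixir
defmodule HostInfo do
  # Hypothetical wrapper: calls the Erlang function and converts
  # the charlist result to an Elixir string before returning it.
  def hostname do
    {:ok, name} = :inet.gethostname()
    to_string(name)
  end
end

# Callers only ever see a binary.
IO.inspect(is_binary(HostInfo.hostname())) # prints true

# The conversion also runs the other way: :erlang.list_to_integer/1
# expects a charlist, so convert the string before calling it.
IO.inspect(:erlang.list_to_integer(to_charlist("42"))) # prints 42
```

With the conversion in one place, the rest of the code can treat the hostname like any other string.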

Bitstrings in One Sentence

A bitstring is the general case: a sequence of bits. A binary is a bitstring whose length in bits is a multiple of 8 (so it divides evenly into bytes). Unless you are writing a network protocol or parsing a file format by hand, you only ever deal with binaries, and you call them strings.
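The distinction shows up as soon as a value is not a whole number of bytes. The following is a minimal sketch; the header layout in the second half (two 4-bit fields followed by a payload) is made up for illustration and does not follow any particular protocol.

```elixir
# Three bits: a bitstring, but not a binary.
three_bits = <<5::size(3)>>
IO.inspect(is_bitstring(three_bits)) # prints true
IO.inspect(is_binary(three_bits))    # prints false
IO.inspect(bit_size(three_bits))     # prints 3

# An ordinary string is both: 2 bytes, 16 bits.
IO.inspect(is_binary("hi")) # prints true
IO.inspect(bit_size("hi"))  # prints 16

# Hypothetical header: split the first byte into two 4-bit fields,
# then keep the remaining bytes as a binary payload.
<<version::size(4), header_len::size(4), payload::binary>> = <<0x45, 0x00, 0x1C>>
IO.inspect(version)            # prints 4
IO.inspect(header_len)         # prints 5
IO.inspect(byte_size(payload)) # prints 2
```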