Multilingual typesetting on Overleaf using babel and fontspec
Introduction
This article is a follow-up to our earlier piece titled Multilingual typesetting on Overleaf using polyglossia and fontspec. Here, we show how to use the babel package, via its \babelprovide and \babelfont commands, to reproduce examples contained in the previous article, which focussed on polyglossia.
This article is not meant to be a comprehensive introduction to the babel package, which is feature-rich and highly customizable, with numerous options for typesetting different languages. In addition, babel has features that are not present in polyglossia; for further details, please refer to babel’s documentation.
Multiple languages/scripts in the same document with babel and \babelfont
To reproduce the first example in our article on polyglossia, where the document’s primary language is French but contains text in English, Russian and Thai, you can load babel with options for English, Russian and French, and use the \babelprovide command to load support for Thai. We use the \babelfont command to set the document’s fonts: FreeSerif, FreeSans and FreeMono, which provide sufficient support for the Latin, Cyrillic and Thai scripts.
\documentclass[12pt]{article}
\usepackage{geometry} % to use a small page size
\geometry{margin=4cm,b5paper}
\usepackage[english,russian,french]{babel}
\babelprovide[import]{thai}
\babelfont{rm}{FreeSerif}
\babelfont{sf}{FreeSans}
\babelfont{tt}{FreeMono}
\begin{document}
\begin{abstract}
Le Lorem Ipsum est simplement du faux texte employé dans la composition et la mise en page avant impression.
\end{abstract}
Merci. \foreignlanguage{english}{Thank you.} \foreignlanguage{thai}{ขอบคุณ} \foreignlanguage{russian}{Спасибо.} Et plus de
texte en français!
Le Lorem Ipsum est le faux texte standard de l'imprimerie depuis les années 1500, quand un imprimeur anonyme assembla ensemble des morceaux de texte pour réaliser un livre spécimen de polices de texte.
\begin{otherlanguage}{english}
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry’s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. \end{otherlanguage}
\begin{otherlanguage}{russian}
Lorem Ipsum - это текст-`\textsf{рыба}', часто используемый в \texttt{печати} и вэб-дизайне. Lorem Ipsum является стандартной ``рыбой'' для текстов на латинице с начала XVI века. В то время некий безымянный печатник создал большую коллекцию размеров и форм шрифтов, используя Lorem Ipsum для распечатки образцов. Lorem Ipsum не только успешно пережил без заметных изменений пять веков, но и перешагнул в электронный дизайн. \end{otherlanguage}
\begin{otherlanguage}{thai}
\foreignlanguage{english}{Lorem Ipsum} คือ เนื้อหาจำลองแบบเรียบๆ ที่ใช้กันในธุรกิจงานพิมพ์หรืองานเรียงพิมพ์
\end{otherlanguage}
\end{document}
Open this XeLaTeX example in Overleaf.
This example produces the following output:
By default, the last language name passed as an option to babel becomes the document’s main language; in the above example, it is French. Because the document’s main language is French, the document uses French typesetting conventions by default: French hyphenation patterns, punctuation and some automatic keywords, such as Résumé instead of Abstract. Short text snippets in a different language are inserted with \foreignlanguage{language}{...}. Longer paragraphs of foreign-language text are inserted with
\begin{otherlanguage}{language} ... \end{otherlanguage}
Within a list of language options, the main key can be used to select any one of them as the document’s main language, not just the last one in the list. For example, this would set English as the main language instead of French:
\usepackage[main=english,russian,french]{babel}
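As a minimal sketch of the effect, the following hypothetical document (not one of the original examples) loads only English and French so that it needs no special fonts. With main=english, the abstract heading should read Abstract rather than Résumé, while the French snippet is still typeset using French conventions:

```latex
\documentclass{article}
% french is listed last, but main=english overrides the "last one wins" rule
\usepackage[main=english,french]{babel}
\begin{document}
\begin{abstract}
The heading above is produced using the English keyword.
\end{abstract}
Thank you. \foreignlanguage{french}{Merci beaucoup.}
\end{document}
```

Removing main= from the option list makes French the main language again, because it is the last language in the list.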
Notes on fontspec package warnings
With some fonts, you may get compile warnings such as these:
These messages highlight issues that are “usually fine” or note font-related default settings that are “not always wrong”. If you carefully check your document for font-related typesetting problems and all seems well, you may be able to ignore those messages. If your document is problem-free but you’d prefer to prevent the warnings, you can try using different fonts.
Other options to prevent these warnings include:
- adding the fontspec options Language=Default and Script=Default:
\babelfont{rm}[Language=Default,Script=Default]{FreeSerif}
\babelfont{sf}[Language=Default,Script=Default]{FreeSans}
\babelfont{tt}[Language=Default,Script=Default]{FreeMono}
Note: Although this removes the warning messages, these settings are not ideal because they can potentially break font features specific to some languages. See this discussion for further information.
- adding \PassOptionsToPackage{silent}{fontspec} to your document preamble:
\documentclass{article}
\usepackage[english,russian,french]{babel}
\PassOptionsToPackage{silent}{fontspec}
\babelprovide[import]{thai}
\babelfont{rm}{FreeSerif}
\babelfont{sf}{FreeSans}
\babelfont{tt}{FreeMono}
\begin{document}
...
\end{document}
This silences the fontspec package, but it is a risky option because fontspec may later need to warn you of other, more serious, problems.
For more information, see discussions on tex.stackexchange, in particular this answer.
Mixing right-to-left (RTL) and left-to-right (LTR) languages
To typeset a document with Arabic as the main language we can use
\babelprovide[import,main]{arabic}
A document whose main language is Arabic but contains text fragments in LTR languages, such as English, has the following typesetting requirements:
- it has to be typeset from right-to-left (by default);
- arbitrary mixtures of RTL and LTR text fragments should be typeset correctly—usually via an implementation of the Unicode Bidirectional Algorithm.
Both objectives are achieved by providing babel with a suitable bidi option, whose value depends on the LaTeX compiler you are using:
- for LuaLaTeX: use bidi=basic;
- for pdfLaTeX and XeLaTeX: use bidi=default.
- Note: For further detail, please consult babel’s documentation, which lists additional values for the bidi option and provides a range of helpful examples.
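If the same source may be compiled with more than one engine, the choice of bidi value can be made automatically. The following is a sketch only: it assumes the iftex package (which is not used elsewhere in this article) and its \ifLuaTeX test, and it covers just LuaLaTeX and XeLaTeX because the fonts are set with \babelfont.

```latex
\documentclass{article}
\usepackage{iftex} % assumption: iftex supplies the \ifLuaTeX engine test
\ifLuaTeX
  \usepackage[english,bidi=basic]{babel}   % LuaLaTeX
\else
  \usepackage[english,bidi=default]{babel} % XeLaTeX
\fi
\babelprovide[import,main]{arabic}
\babelfont[arabic]{rm}{Amiri}
\begin{document}
ما هو \foreignlanguage{english}{differentiation}
\end{document}
```

Keeping the engine test in the preamble means the document body stays identical whichever compiler is selected.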
Here is a small XeLaTeX example using bidi=default:
\documentclass{article}
\usepackage[english,bidi=default]{babel}
\babelprovide[import,main]{arabic}
\babelfont[arabic]{rm}{Amiri}
\begin{document}
ما هو \foreignlanguage{english}{differentiation}
\end{document}
Open this XeLaTeX example in Overleaf.
This example produces the following output:
See this example if you’d like to use the polyglossia package instead.
If you need to typeset RTL languages and/or bidirectional text, the LuaLaTeX compiler is strongly recommended because it provides the most comprehensive support for complex-script languages and bidirectional typesetting.
- Tip: A comprehensive example of typesetting Arabic, using LuaLaTeX and babel, can be found in babel’s GitHub repository.
The following example uses an extract of text taken from babel’s GitHub repository.
% Arabic text in this example is from
% https://github.com/latex3/babel/blob/main/samples/lua-arabic.tex
\documentclass{article}
\usepackage[english,bidi=basic]{babel}
\babelprovide[import,main]{arabic}
\babelfont{rm}{FreeSerif}
\babelfont{sf}{FreeSans}
\babelfont{tt}{FreeMono}
\begin{document}
الكهرمان اسمه باليونانية الإيلقطرون[3] (معرب ἤλεκτρον إيلكترون أي ذو
البريق، ومنه الإلكترون عند الفيزيائيين، وعليه تسمية الكهرباء في
الفارسية برق)، واشتق منه اسم فاعليتيه فسمي إلكترسمس (ηλεκτρισμός)
للدلالة على الكهرباء. أما باللاتينية فالكلمة للكهرباء هي إيلكترستاس
(ēlectricitās)، وهي مشتقة من إيلكتركس (ēlectricus) أي شبيه الكهرمان.
\end{document}
Open this LuaLaTeX example in Overleaf.
This example produces the following output:
Specifying fonts for specific languages
You can specify the fonts used for different languages by adding the language name, or the script name preceded by an *, as an option to \babelfont. In our French-English-Russian-Thai example, you can add
\babelfont[english]{rm}{Chancery Uralic}
\babelfont[*cyrillic]{rm}{Charis SIL}
\babelfont[thai]{rm}{Garuda}
to use Chancery Uralic to typeset English text, Charis SIL to typeset all Cyrillic scripts (including Russian), and Garuda to typeset Thai text.
- Please note: We are not suggesting that the next example provides a typographically harmonious combination of type styles; it merely shows how to configure and use fonts of your choice.
\documentclass[12pt]{article}
\usepackage{geometry} % to use a small page size
\geometry{margin=4cm,b5paper}
\usepackage[english,russian,main=french]{babel}
\babelprovide[import]{thai}
%% When all the \babelfont lines are uncommented, we can show that e.g. \babelfont[english], [*cyrillic] etc really do override the default \babelfont{rm} for English, Cyrillic.
\babelfont{rm}[Language=Default]{FreeSerif}
\babelfont{sf}[Language=Default]{FreeSans}
\babelfont{tt}[Language=Default]{FreeMono}
\babelfont[english]{rm}{Chancery Uralic}
\babelfont[*cyrillic]{rm}{Charis SIL}
\babelfont[thai]{rm}{Garuda}
\begin{document}
\begin{abstract}
Le Lorem Ipsum est simplement du faux texte employé dans la composition et la mise en page avant impression.
\end{abstract}
Merci. \foreignlanguage{english}{Thank you.} \foreignlanguage{thai}{ขอบคุณ} \foreignlanguage{russian}{Спасибо.} Et plus de
texte en français!
Le Lorem Ipsum est le faux texte standard de l'imprimerie depuis les années 1500, quand un imprimeur anonyme assembla ensemble des morceaux de texte pour réaliser un livre spécimen de polices de texte.
\begin{otherlanguage}{english}
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. \end{otherlanguage}
\begin{otherlanguage}{russian}
Lorem Ipsum - это текст-`\textsf{рыба}', часто используемый в \texttt{печати} и вэб-дизайне. Lorem Ipsum является стандартной ``рыбой'' для текстов на латинице с начала XVI века. В то время некий безымянный печатник создал большую коллекцию размеров и форм шрифтов, используя Lorem Ipsum для распечатки образцов. Lorem Ipsum не только успешно пережил без заметных изменений пять веков, но и перешагнул в электронный дизайн. \end{otherlanguage}
\begin{otherlanguage}{thai}
\foreignlanguage{english}{Lorem Ipsum} คือ เนื้อหาจำลองแบบเรียบๆ ที่ใช้กันในธุรกิจงานพิมพ์หรืองานเรียงพิมพ์
\end{otherlanguage}
\end{document}
Open this XeLaTeX example in Overleaf.
This example produces the following output:
Here is another example of specifying a font for a particular script—Devanagari, which is used for Hindi and Sanskrit:
\documentclass[12pt]{article}
\usepackage[english]{babel}
%% Each \babelprovide can only be used for one language
\babelprovide[import]{hindi}
\babelprovide[import]{sanskrit}
\babelfont[*devanagari]{rm}{Lohit Devanagari}
\begin{document}
Hindi: \foreignlanguage{hindi}{हिन्दी}
Sanskrit: \foreignlanguage{sanskrit}{संस्कृतम्}
\end{document}
Open this LuaLaTeX example in Overleaf.
This example produces the following output:
See this example if you’d like to do this with the polyglossia package instead.
Defining other font families
Recall that you can set typefaces for different languages and families. In the following example we explicitly set a sans serif font for Hebrew because we would like to use sans serif for section headers:
\documentclass[12pt]{article}
\usepackage{geometry} % to use a small page size
\geometry{margin=4cm,b5paper}
\usepackage[bidi=basic]{babel}
\babelprovide[main,import]{hebrew}
\babelfont[hebrew]{rm}{Hadasim CLM}
\babelfont[hebrew]{sf}{Miriam CLM}
\usepackage{titlesec}
\titleformat{\section}{\Large\sffamily\bfseries}{\thesection}{1em}{}
\begin{document}
\section{מבוא}
זוהי עובדה מבוססת שדעתו של הקורא תהיה מוסחת עלידי טקטס קריא כאשר הוא יביט בפריסתו.
\end{document}
Open this LuaLaTeX example in Overleaf.
This example produces the following output:
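Beyond rm, sf and tt, babel’s documentation describes passing a family name of your own to \babelfont in order to define a new font family. The sketch below assumes that behaviour. The family name heading is hypothetical, and the \headingfamily switch follows the naming pattern given in the babel manual, so please check it against the babel version you are using.

```latex
\documentclass{article}
\usepackage[english]{babel}
\babelfont{rm}{FreeSerif}
% "heading" is an illustrative family name, not one predefined by babel
\babelfont{heading}{FreeSans}
\begin{document}
{\headingfamily A line set in the new family.}

Body text in the default roman family.
\end{document}
```

A dedicated family like this may be useful when sf is already in use for another purpose and a separate typeface is wanted for headings.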