Constructed Languages for Fiction: From Quenya to AI-Assisted Xenolinguistics
A hundred years ago Tolkien invented a language by hand. Today you can build a passable one in an afternoon. Here's when to let AI help and when to do the slow work yourself.
The oldest conlang notebook in the world belongs to a nine-year-old girl.
Sort of. The specific honor probably goes to some anonymous child whose secret language was never written down. But the oldest preserved effort we know about is Hildegard of Bingen's Lingua Ignota, a twelfth-century Benedictine nun's private vocabulary of about a thousand words for plants, body parts, ecclesiastical objects, and — because of course — "church pew." Hildegard called it an "unknown language." She used the Latin alphabet for it and never explained the grammar, if it had one. Scholars have been arguing ever since about whether it was a real language, a coded prayer system, or a game a very bored mystic played with herself between visions.
I like starting here because it frames the whole subject honestly. Conlanging is ancient. It is usually private. It is usually unfinished. And the moment it becomes famous, the moment it intersects with fiction, it changes shape entirely — from a personal notebook into a public artifact that has to sound real to strangers.
Nine hundred years later, that same tension is the central problem in using AI to help build fictional languages. A model can generate a plausible-looking word list the size of Hildegard's entire vocabulary in eleven seconds. It can invent a grammar. It can translate Hamlet. What it cannot do, by itself, is give the language the one quality that makes a reader believe in it. We'll come back to what that quality is. First, a brief history, because the shape of the problem only makes sense in light of how we got here.
From Hildegard to Tolkien
For most of its history, conlanging was either mystical or diplomatic. Hildegard's Lingua Ignota belongs to the mystical tradition — languages made for private spiritual use, or to name things the real language could not. John Dee's Enochian, supposedly dictated by angels through a scryer, belongs there too. So do a surprising number of alchemical and kabbalistic cipher-tongues. These languages were not usually meant to be spoken. They were meant to be possessed.
The diplomatic tradition runs parallel and produces the "international auxiliary language" — a project Europeans became obsessed with in the nineteenth century when they noticed, belatedly, that the rest of the world existed. Volapük came first, in 1879. It was complicated and quarrelsome and largely collapsed within a decade. Esperanto arrived in 1887 and survived, mainly because its creator, L. L. Zamenhof, understood something the Volapük people didn't: a language for strangers has to be easy. Esperanto's grammar has sixteen rules and no exceptions. A motivated adult can reach conversational fluency in a few months. About two million people have, at various times, done exactly that.
Neither tradition produced languages for fiction. That's a twentieth-century invention, and it is almost entirely one man's fault.
J. R. R. Tolkien did not set out to write The Lord of the Rings. He set out, as a young philologist, to build languages for pleasure — first some rough sketches in his teens, then Quenya and Sindarin, two elaborate Elvish tongues whose grammar he would refine for the rest of his life. The novels came later, and partly as scaffolding: Tolkien needed a world for his languages to be spoken in. The hobbits, the quest, the ring, the shire — all of it grew up around Quenya the way a city grows up around a river.
That origin story matters because it established the standard every serious literary conlang has been measured against since. A Tolkien-grade language isn't just a word list. It has a history — dialects, sound changes, old forms and new forms, borrowings from imagined neighbors. Quenya's role relative to Sindarin is roughly the role of Latin relative to a medieval vernacular: one is ancient and ceremonial, the other is what people actually speak, and both carry the fingerprints of a common ancestor the way European languages still carry Latin bones.
This is the bar. When conlangers say a language "feels real," they mean it has history. And history is precisely the thing that's hard to fake.
The modern era: Klingon, Dothraki, and the professionalization of the craft
Between Tolkien and now there's a middle period that's easy to miss but worth naming. It starts with Marc Okrand, a linguist hired in 1984 to build Klingon for Star Trek III because the film's producers were tired of actors making up gibberish that didn't match shot to shot. Okrand's solution was brilliantly pragmatic: give Klingon a phonology that humans find harsh (lots of uvular stops, a lateral affricate that sounds like clearing your throat), a small core vocabulary, and a word order (object-verb-subject) rare enough on Earth to feel alien. He wrote a dictionary. Actors could learn their lines. A subculture of people who wanted to speak Klingon at Renaissance fairs was accidentally born.
Then came David J. Peterson and Game of Thrones. HBO hired him to build Dothraki from the fragments George R. R. Martin had sketched across his novels. Peterson took Martin's handful of words, reverse-engineered a plausible grammar, invented the morphology and several thousand vocabulary items, then did it again for High Valyrian and a family of derived dialects. The result is a language people now actually study in linguistics classes: a full, working system with its own podcast.
Peterson's methods are worth naming because they're the opposite of what most hobbyist conlangers do. He starts from phonology (what sounds the language has, and which are forbidden), moves to morphology (how words are built), then syntax (how sentences are built), and only then does vocabulary. Most amateurs start with the fun part — a word list — and try to grow a grammar around it afterward. That's why most amateur conlangs feel like secret codes instead of languages. A code has a one-to-one mapping to English. A language has internal logic that doesn't map to anything.
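The code-versus-language distinction is easy to make concrete. A minimal sketch (every word and affix here is invented for illustration): a substitution cipher maps English one-to-one, word for word, while even a toy morphology builds its words by internal rule, so there is no word-for-word mapping back to English.

```python
# A "secret code": one-to-one relexification of English.
cipher = {"I": "zu", "see": "mak", "you": "tel"}

def encode(sentence):
    return " ".join(cipher[w] for w in sentence.split())

# A toy language: one inflected verb carries subject, root, and object,
# built by rule rather than glossed word by word.
roots = {"see": "mak"}
subject_prefix = {"1sg": "zu-", "2sg": "te-"}
object_suffix = {"1sg": "-un", "2sg": "-el"}

def verb(root, subj, obj):
    # "I see you" comes out as a single word, not three glosses
    return subject_prefix[subj] + roots[root] + object_suffix[obj]

print(encode("I see you"))        # zu mak tel
print(verb("see", "1sg", "2sg"))  # zu-mak-el
```

The cipher is trivially reversible; the verb form is only interpretable by someone who knows the grammar. That gap is what "internal logic" means in practice.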
Where AI actually helps
Now we get to the part of the essay where a lesser draft would say something like "and this is where AI comes in to revolutionize conlanging." It doesn't. AI does not revolutionize conlanging. What it does, in the hands of a writer who already knows what they're trying to do, is eliminate specific, boring bottlenecks that used to make building a language an impossibly long project.
Three specific bottlenecks, as it happens.
The first is word generation at scale. Once you have a phonology — a set of allowed sounds, a set of allowed syllable shapes, a stress pattern — you can generate ten thousand plausible words that obey the rules. This is work. It is also work a model can do in a minute. 🗺️Alien Language Builder exists mostly to do this one job well: you give it a phoneme inventory and a syllable structure, and it produces vocabulary that respects those constraints rigorously. Without this kind of tool, most writers give up around word five hundred. With it, a writer can produce a ten-thousand-word dictionary in an afternoon. Whether the words mean anything interesting is a separate question the tool does not answer.
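The mechanics are simple enough to sketch in a few lines. Here is a minimal, hypothetical version of what such a generator does (the phoneme inventory, coda restriction, and syllable shapes are invented for the example, not taken from any real tool):

```python
import random

# Hypothetical phonology: phoneme classes and allowed syllable shapes.
consonants = list("ptkmnls")
vowels = list("aiu")
codas = list("nls")                 # only these may close a syllable
syllable_shapes = ["CV", "CVC", "V"]

def syllable(rng):
    shape = rng.choice(syllable_shapes)
    out = [rng.choice(consonants) if slot == "C" else rng.choice(vowels)
           for slot in shape]
    if shape == "CVC":
        out[-1] = rng.choice(codas)  # enforce the coda restriction
    return "".join(out)

def word(rng, min_syll=1, max_syll=3):
    return "".join(syllable(rng) for _ in range(rng.randint(min_syll, max_syll)))

rng = random.Random(42)
lexicon = {word(rng) for _ in range(20)}
print(sorted(lexicon))
```

Every output obeys the phonotactics by construction, which is exactly the property that makes the result feel like one language instead of random noise. What none of the words mean is still up to you.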
The second bottleneck is translation drills. Once you have a grammar, you need to use it. Actually speaking a conlang — writing sentences in it, reading them back, noticing where the grammar creaks — is how you find the bugs. 👽Alien Language Tutor is built to be a practice partner: you feed it the rules of your language and it drills you on them the way a human tutor would, with specific translation exercises and gentle corrections when you misuse a particle. This is the part of conlanging that used to require a patient linguist friend. Now it doesn't.
The third is first-contact dialogue — the specific scene where a speaker of your language meets someone who doesn't speak it, and the writer has to render both the misunderstanding and the repair. This is brutally hard to write well because it requires the writer to hold two grammars in mind at once and think about what each speaker plausibly does and doesn't understand. 💬First Contact Dialogue Writer is a specific craft tool for exactly that scene. It doesn't generate the language; it helps you stage an exchange in a language you've already built. There's a difference, and the difference is the whole game.
Where AI does not help
Here's the part I think most of the AI-conlang discourse gets wrong. A language model, however good, is not a linguist. It is a pattern-matcher trained on human languages and on the famous fictional ones. Ask it to invent a "truly alien" grammar and it will produce something that sounds alien and is, structurally, a reshuffled version of Turkish or Japanese with some extra diacritics. It has no mechanism for thinking about why a language should have the features it has.
And this is the quality I promised to come back to. The thing that makes a conlang feel real isn't cleverness. It's motivation. Real languages are shaped by the bodies, histories, and environments of their speakers. English has almost no inflection because it spent five hundred years being the second language of everyone who ever invaded Britain, and second-language speakers simplify. Japanese has rigid politeness markers because its social structure required them for a millennium. Inuit languages have the specific morphology they have because building compound words on the fly is more efficient than maintaining a vocabulary for every possible snow condition. Languages are the fingerprints of the lives that shaped them.
An AI-generated alien language, by default, has no such fingerprints. It has the general texture of language without any of the specific scars. The writer has to add the scars. The writer has to decide: what does this species have that we don't, and what does it lack? If they have three mouths, what does that do to phonology? If they communicate over kilometers through bioluminescent skin patches, is "spoken" even the right frame? If their lifespan is four hundred years, what does verb tense even mean to them — do they have a separate tense for "within the speaker's own memory" versus "older than the speaker"?
These are questions a model will answer plausibly if you ask, but will not raise on its own. The writer has to raise them. The writer has to know to raise them. And once raised, the answers produce a language whose every feature is because of something — which is the whole difference between Quenya and the Esperanto-flavored word list most amateur conlangs end up as.
The soul in the catalog who gets this
One item in the a-gnt catalog takes this idea further than any tool I've seen elsewhere. 🗣️Speaker to Whales and Stars is a soul persona, not a conlang generator, but it's the single most useful thing in the catalog for thinking about xenolinguistics. The premise is that you're in conversation with a linguist whose career has been spent trying to communicate with minds whose bodies are almost nothing like ours — cetacean intelligences, and hypothetical alien ones. They don't generate vocabulary for you. They ask you questions about your species. What's their sense of time? What's their sense of self? What do they think of silence? And then they help you work out what the language should be able to distinguish, and what it shouldn't.
Talking to this soul for thirty minutes before you open 🗺️Alien Language Builder will do more for your language than any word-generation run. The soul gives you the motivations. The tool respects them.
When to let AI help, and when to do the slow work
Here is the practical rule I've come to trust. Let AI help when the work is mechanical and constraint-checked — generating vocabulary that obeys a phonology you wrote, running you through translation drills, staging dialogue in a language you've already built, checking whether your grammar contradicts itself. These are tasks where the constraints are explicit and the model is fast.
Do the slow work yourself when the work is about meaning — deciding what your speakers find worth distinguishing, what their politeness looks like, where their metaphors come from, what counts as a curse word, how they say "I love you" when they don't have the cultural frame for love the way humans have it. These are the decisions that make the language feel lived-in. A model cannot make them for you because it does not know anything about your species and its world. It can only echo what it's seen in human languages. If you outsource the meaning-work, you will get a language that feels exactly like a thousand other languages, because it is one.
The combination — slow handmade meaning-work plus fast AI-assisted mechanical work — is roughly what professional modern conlangers actually do now. Peterson uses computational tools for vocabulary generation and conjugation tables. He does not let those tools tell him what the words should mean. That judgment stays with him because that judgment is the whole craft.
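The mechanical half of that division of labor is easy to picture. A minimal sketch (the paradigm, stems, and affixes are invented for illustration, not Peterson's actual system) of the kind of conjugation-table generation a conlanger might script:

```python
# Hypothetical agglutinative paradigm: stem + tense suffix + person suffix.
tense = {"past": "-ur", "present": "-a", "future": "-esh"}
person = {"1sg": "-n", "2sg": "-t", "3sg": "-k"}

def conjugate(stem):
    """Return the full {tense: {person: form}} grid for one verb stem."""
    return {
        t: {p: stem + ts.strip("-") + ps.strip("-") for p, ps in person.items()}
        for t, ts in tense.items()
    }

table = conjugate("mak")
for t, row in table.items():
    print(t, row)
# The script fills in the grid; deciding what "mak" means, and whether
# the language marks tense at all, stays with the author.
```

Nine cells from one stem, hundreds of stems in a second. The table is the mechanical part; the paradigm itself — which distinctions exist at all — is the meaning-work.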
A last thing, in honor of Hildegard
Hildegard's Lingua Ignota was never finished. It wasn't a complete language. It was a thousand nouns and no grammar and maybe never intended to be a language at all — perhaps just a private glossary for a private devotion. And yet it has survived nine hundred years while most of the "serious" philosophical languages of the Enlightenment have rotted into footnotes. Whatever she was doing, it meant enough to keep.
A fictional language will survive if it means enough to keep. That meaning doesn't come from the size of the vocabulary or the completeness of the grammar. It comes from the clear sense that somebody cared about a specific distinction, an untranslatable word, a rhythm that sits differently in the mouth. One good word with a real story behind it is worth a thousand generated ones.
The AI tools in the catalog can make the thousand. Whether any of them become the one is up to you.
Open 🗺️Alien Language Builder when you're ready to draft the mechanics. Talk to 🗣️Speaker to Whales and Stars first. Then go outside and try to say something in your language that you couldn't have said in English, and notice how it feels in your mouth. That feeling is the whole craft, and no model can produce it for you. It only grows where a person has been standing.
Tools in this post
Alien Language Tutor
Teaches you an invented alien tongue lesson by lesson
Alien Language Builder
Constructs a conlang with grammar, phonology, and sample sentences
First Contact Dialogue Writer
Writes diplomatic first-contact dialogue that actually sounds alien
Speaker to Whales and Stars
The translator who was humanity's voice at first contact