Lost in Translation


Before I start talking about poetry, a quick break to talk about mathematics.

This is a really, really crap exposition on the most basic object in homological algebra. I don’t actually understand it very well. I want to since I’ve seen them in problems before, but I would strongly state that the next section is mostly for those who “enjoy blog post math” and bad mathematical fanfiction.

That’s a warning. If you work in this field, I deeply apologize.

I think that’s the first time I’ve heard the kernel of a translation, you know, the like, linguistic one.

Short, Exact, Stuffs

In life in general we care about the transformations of things. Oftentimes we have a lot of consecutive transformations, from group to group to group to group. This is composition.

But we want to figure out what happens when these look “nice.”

So, time to generalize. Say we have a collection of objects \(A, B, C\). Each one of these objects will have a 0 element (so you can imagine these are groups, or vector spaces, or whatever will make this easiest).

We then have some transformations (morphisms) between these, specifically \begin{equation}f : A \to B\end{equation} and \begin{equation}g : B \to C.\end{equation}

Right now this is a short sequence, since we are only dealing with a few collections of objects.

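But for it to be exact, we first want the output of \begin{equation}g(f(A)) = \{0\}.\end{equation} You can think of this as \begin{equation}g \circ f = 0.\end{equation} On its own that only says \begin{equation}\mathrm{Im}(f) \subseteq \mathrm{Ker}(g)\end{equation}; exactness asks for the stronger \begin{equation}\mathrm{Im}(f) = \mathrm{Ker}(g)\end{equation}, meaning everything that \(g\) sends to 0 actually comes from \(A\). This is best understood as a picture with cones: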

a not short but exact sequence

The above image is technically not short, but hopefully the cone image makes sense now. I like thinking about the exactness as it looks like those cups from an old water dispenser from the old offices back in the 2000s (they’re probably still there!).

Now for more theory.

So, a short exact sequence, more formally written looks like:

\begin{equation}0 \to A \xrightarrow{f} B \xrightarrow{g} C \to 0\end{equation}

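Say the sequence is exact. Exactness at \begin{equation}A\end{equation} makes \(f\) injective, so \begin{equation}A\end{equation} embeds into \begin{equation}B\end{equation}. Exactness at \begin{equation}C\end{equation} makes \(g\) surjective, and exactness at \begin{equation}B\end{equation} says the kernel of \(g\) is exactly the image of \(A\). So we can actually say that \begin{equation}C \cong B/f(A)\end{equation}, usually written \begin{equation}C \cong B/A\end{equation}. I use the first isomorphism theorem every time I go buy groceries!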

This makes proving theorems about rank/rank-nullity really convenient, but the simplest example is probably carrying addition. Exercise left to the reader to motivate this.
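For the impatient reader, here is one possible way to set up the carrying exercise, as a minimal sketch. The specific groups (\(\mathbb{Z}/10\) and \(\mathbb{Z}/100\)) and the names `f`, `g`, `s`, and `carry` are assumptions picked for illustration; the text above does not pin down which sequence it has in mind. The sketch checks exactness by brute force and then shows that the "carry" is what you get when you try to undo `g` digit by digit.

```python
from itertools import product

# Assumed concrete instance: 0 -> Z/10 -> Z/100 -> Z/10 -> 0
A = range(10)    # think: the tens digit
B = range(100)   # think: two-digit numbers
C = range(10)    # think: the ones digit

def f(a):
    """Embed a digit as a multiple of ten: a -> 10a in Z/100."""
    return (10 * a) % 100

def g(b):
    """Forget the tens digit: b -> b mod 10."""
    return b % 10

image_f = {f(a) for a in A}
kernel_g = {b for b in B if g(b) == 0}

f_is_injective = len(image_f) == len(A)
g_is_surjective = {g(b) for b in B} == set(C)
exact_in_the_middle = image_f == kernel_g

def s(c):
    """A set-level right inverse of g: read a ones digit as a number in Z/100."""
    return c

def carry(c1, c2):
    """How far s is from respecting addition, measured as an element of A."""
    defect = (s(c1) + s(c2) - s((c1 + c2) % 10)) % 100  # always lands in image_f
    return defect // 10

# Every pair of digits produces a defect that lies in the image of f.
all_defects_in_image = all(
    (s(c1) + s(c2) - s((c1 + c2) % 10)) % 100 in image_f
    for c1, c2 in product(C, C)
)

print(f_is_injective, g_is_surjective, exact_in_the_middle)
print(carry(7, 5), carry(2, 3))
```

With these choices, `carry(7, 5)` is the 1 you write above the tens column when adding 7 + 5, and `carry(2, 3)` is 0 because nothing overflows.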

More on short exact sequences can be found on wikipedia, stackexchange 1, overflow 1, and stackexchange 2.

One of the cleanest short exact sequences is probably Tao’s example of:

\begin{equation}0 \to \mathbb{R} \to C^{\infty}(\mathbb{R}; \mathbb{R}) \xrightarrow{\frac{\text{d}}{\text{d}x}} C^{\infty}(\mathbb{R}; \mathbb{R}) \to 0\end{equation}

This generalizes to higher dimensional derivatives (gradients), but I like this one because it’s the fundamental theorem of calculus, and it’s cute because of how the derivative operator treats constants: they are exactly its kernel (also, you can swap out the infinities!).
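To poke at this sequence by hand, here is a small sketch that shrinks it down to polynomials with rational coefficients. That restriction is an assumption made so everything fits in a few lines (polynomials are smooth, but most smooth functions are not polynomials), and the helper names `include_constant`, `derivative`, and `antiderivative` are hypothetical. It checks the three things exactness asks for: constants go to zero, only constants go to zero, and every polynomial is the derivative of something.

```python
from fractions import Fraction

# A polynomial is a coefficient list: [c0, c1, c2] means c0 + c1*x + c2*x^2.
# The zero polynomial is the empty list.

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def include_constant(c):
    """First map: a real number becomes the constant function with that value."""
    return trim([Fraction(c)])

def derivative(p):
    """Second map: d/dx, applied coefficient by coefficient."""
    return trim([k * Fraction(c) for k, c in enumerate(p)][1:])

def antiderivative(p):
    """One preimage under d/dx, with the constant of integration set to 0."""
    return trim([Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(p)])

def is_constant(p):
    return len(trim(p)) <= 1

p = [Fraction(3), Fraction(0), Fraction(5)]  # 3 + 5x^2

# Image of the first map lands in the kernel of the second.
constants_die = derivative(include_constant(7)) == []

# Kernel of d/dx contains nothing but constants (checked on a few samples).
samples = [[], [2], [0, 1], p, [1, 1, 1, 1]]
kernel_is_constants = all(
    (derivative(q) == []) == is_constant(q) for q in samples
)

# d/dx is onto: differentiating the antiderivative gives p back.
derivative_is_onto_here = derivative(antiderivative(p)) == p

print(constants_die, kernel_is_constants, derivative_is_onto_here)
```

The last check is the fundamental theorem of calculus in miniature: integrate, then differentiate, and you are back where you started.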

Anyway.

What the

So, translation is one of the most commonly used transformations (thought: how many times are services like Google Translate used a day?).

Oftentimes, we think in the language we most commonly express in, e.g. I think in (American) English, while my parents used to think in Hindi before now mostly thinking in English.

One physicist I know is the opposite of me, where he thinks in Chinese and expresses in English. But when he does physics, he says he thinks in physics.

Maybe mathematics is a language.

For this, I think about the language of poetry: when someone expresses poetry, is it possible to do it without needing a language?

I’ve been trying to write gushi (古詩, meaning Old Poetry), which is a type of Chinese poetry with a few rules. Each line has a uniform number of characters (either 5 or 7), which is terrifying to me, as I can barely think in Chinese.

Wait why

Someone I know of more or less randomly decided to pick up Chinese within a year. I wonder if he can think in it yet. I’m kind of more motivated to try and do Chinese because of this, since I feel like I’m around his level even after not technically teaching myself for the past few years:

I’d say I’m comfortable with Chinese. I can comfortably travel in any Mandarin-speaking place. I can comfortably hold long conversations. I can comfortably watch most content. I can comfortably build relationships entirely in Mandarin.

我愛台灣。 I am capable of surviving if I run around that island, that beautiful place. Vitalik describes it as:

Paul Graham has written about how every city sends a message: in New York, “you should make more money”. In Boston, “You really should get around to reading all those books”. In Silicon Valley, “you should be more powerful”. When I visit Taipei, the message that comes to my mind is “you should rediscover your inner high school student”.

I visited when I was in high school, and for me it felt freeing. I visited on my own with some friends:

the wild

wander

If you want to go, consider looking at stuff like this.

Also, I think the pursuit of understanding a language is awesome in its own right. The book(s) on the programming language Rust are all sufficiently long to make me dizzy with excitement: there is so much to deeply understand?!

The previous paragraph is a bad pun, but it holds true even for natural language. For me, that means learning about history (a lot of people care about “Why didn’t China have a Scientific Revolution? Why didn’t they conquer more?” and my answer is “Zheng He got nerfed”), learning about the food, understanding the culture, etc.

And it’s fun for me to see people’s faces light up when they realize I can speak Mandarin. Sometimes I feel like a Singaporean cosplayer.

Poetry???

Ok, so the actual content.

I tried to write gushi on my own, and I found that the difficulty for me was actually coming up with ideas that work in Chinese.

For example, I wrote a poem titled 玫瑰 (Roses), but it becomes very clear (at least to the few native Chinese people I have shown it to) that I am an English Thinker.

A more in-depth topic on this type of thinking can be found in a thousand ways of seeing a forest. In it, there is a poem from Wang Wei on forests. It is perhaps the most famous example of translation from gushi:

空山不见人
但闻人语响
返景入深林
复照青苔上

鹿柴

There’s a pretty popular book surrounding this poem called Nineteen Ways of Looking at Wang Wei, but I can’t claim to even properly approximate one (remember, I can survive with the language! no one said I had to be lyrical).

So, to improve my own ability, and to explore the nuances of translation, we can model my writing as translating an English idea into gushi poetry. To practice this skill, I just took English poems/statements I liked and tried to gushi-ify them.

Now we’re getting to the short exact sequence bit. I can imagine my current skill set for Chinese poetry writing as:

\begin{equation}0 \to \text{English Poetry} \xrightarrow{\text{translation}} \text{Chinese Poetry} \xrightarrow{\text{understanding}} \text{Meaning} \to 0\end{equation}

(It should now become clear why I was super vague when I described a short exact sequence, as this is clearly not a well defined notion, and it is cheating by saying the empty string is the zero element, etc.)

I claim this is a short exact sequence for me :(

Or, more specifically, whenever I translate English ideas/poetry into Chinese, their meaning goes to zero.

Robert Frost has a quote about this:

Poetry is what gets lost in translation.

One might be misguided and hope that the two functions in our short exact sequence were actually inverses (making it short and not exact). But if they were inverses, then the two languages must have been the exact same to begin with, as the meaning perfectly translates over.

And this cannot be the case, since you have to exchange information both times, where you don’t necessarily just lose but you also gain certain bits of information (total information content should hopefully not change).

Another two functions that I would say make poetry writing an exact sequence and more accurately describe my method of translation seem to be semantic translation (e.g. trying to send over the actual theme) and then direct translation (e.g. “what would a 5th grader think when reading this?”).

So now it is hopefully clear why I care about short-exact-sequences, since I want to get out of this cone image, those water cups from the old hospital waiting rooms.

Stop Yapping Where Example

Ok, so let’s actually do this:

My students want certainty. They want it
so badly. They respect science and have memorized
complex formulas. I don’t know
how to tell my students their parents
are still just as scared. The bullies get bigger
and vaguer and you cannot punch a cloud.
I have eulogies for all my loved ones prepared,
but cannot include this fact in my lesson plans.
The best teacher I ever had told me to meet him
at the basketball court. We played pick-up for hours.
By the end, I lay panting on the hardwood
and couldn’t so much as stand.
He told me to describe the pain in my chest.
I tried. I couldn’t find the words. Not exactly.
Listen, he said, that’s where language ends.

This is “Statement of Teaching Philosophy” by Keith Leonard.

First, I split it up into the relevant scenes. First, it’s about the students, with “My students want certainty. They want it so badly.”

Then, it’s about the teaching, the students: “They respect science and have memorized complex formulas. I don’t know how to tell my students their parents are still just as scared. The bullies get bigger and vaguer and you cannot punch a cloud. I have eulogies for all my loved ones prepared, but cannot include this fact in my lesson plans.” This one last line is unique though, as it doesn’t truly fit in, and I’ll come back to it later.

Third, it’s about his relationship with his own teacher, where they play basketball: “The best teacher I ever had told me to meet him at the basketball court. We played pick-up for hours. By the end, I lay panting on the hardwood and couldn’t so much as stand.”

And then the ending:

He told me to describe the pain in my chest.
I tried.
I couldn’t find the words.
Not exactly.
Listen, he said, that’s where language ends.

So, I tried to maintain each bit (almost a little too much) when gushi-ing it.

青年求解惑
公式背得透
父母暗惊怕
欺凌如云愁
想象朋友死
良师邀球场
汗滴倒木地
胸痛难言表
语尽真方显

Ok. So, statistically, you have ~17% chance of being able to read this poem. To sound like a (bayesian) nerd, conditioning on reading this blog post… I have no idea. It probably goes up.

So that’s the first map. Here’s the second map:

youngsters look to dispel misunderstanding
so formulas are memorized
parents are (also) confused, frightened
these bullies are like worrying clouds
imagining my friends dead
a good teacher took me to basketball
sweat, blood, falling onto the wooden (floor)
chest hurting, and it’s impossible to describe aloud
language’s extent is revealed now, here. (revise)

Clearly you can see me struggling with the “imagining my friends dead,” as the limit of 5 characters makes it… really hard?! Like, wow these old Chinese gushi poets were MASTERS of their craft, since even though a word can mean so much on its own, it’s just… so hard?!

More thoughts

There was a “talk” from Neel Nanda I was at during PAIR ‘23 and someone asked the question about how a transformer works on “nonstandard languages,” or languages that are “read in reverse.” This is kind of a bad question, since the string that’s passed into an LLM in Arabic would be passed in the order that we would read it (if you’ve ever tried to highlight the text on a page, it should hopefully become clear what I mean by this).

But it got me thinking in general about how language is stored (embeddings) in a model. There’s a rough experiment I’ve had in my draft notebook for a while now, since LLMs are getting significantly better at becoming general translators, and might even be somewhat language agnostic.

The experiment is the following: take the top 100 most common words in the English language (or more, whatever). Use some API to translate each word into the top 10 languages in the world. Embed each word using different methods, just using Hugging Face.

For each word and its translations, look at the volume of the n-dimensional polytope spanned by their embeddings. One would expect a general upward trend, barring certain words. For example, the word “a” doesn’t have as direct a translation in other languages. You can say “一个,” but I can’t honestly say I use it a lot. I also don’t speak in Chinese a lot, so I’d happily be wrong.

Another metric of interest would be the pairwise distances between a word’s embeddings across languages, which could be sorted to rank how “far apart” languages are.
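The two measurements above can be sketched in plain Python. Everything here is an assumption layered on top of the rough idea: the "volume" is taken to be the volume of the simplex whose corners are the embeddings (computed from a Gram determinant), the distance is cosine distance, and the three-dimensional vectors in `toy_embeddings` are made-up placeholders, not output from any real model. Swapping in real embeddings would only change that dictionary.

```python
import math
from itertools import combinations

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [[float(x) for x in row] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular: the points are degenerate
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def simplex_volume(points):
    """Volume of the simplex with these vertices, via the Gram determinant."""
    base = points[0]
    edges = [[x - b for x, b in zip(p, base)] for p in points[1:]]
    gram = [[sum(a * b for a, b in zip(e1, e2)) for e2 in edges] for e1 in edges]
    k = len(edges)
    return math.sqrt(max(determinant(gram), 0.0)) / math.factorial(k)

# PLACEHOLDER vectors for one word in three languages (not real embeddings).
toy_embeddings = {
    "en": [1.0, 0.0, 0.0],
    "zh": [0.8, 0.6, 0.0],
    "hi": [0.8, 0.0, 0.6],
}

spread = simplex_volume(list(toy_embeddings.values()))

# Rank language pairs from closest to farthest for this one word.
ranked_pairs = sorted(
    combinations(toy_embeddings, 2),
    key=lambda pair: cosine_distance(toy_embeddings[pair[0]], toy_embeddings[pair[1]]),
)

print(spread)
print(ranked_pairs)
```

With 11 points per word (English plus ten translations) the simplex is 10-dimensional, so the volumes will be tiny numbers; comparing their logarithms across words is probably more readable than comparing the raw values.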

Ultimately the question I have is if there’s a vector that I can add to each embedding to make it more X-language. I’m heavily going to say “yes, this is very likely,” and it’s even more likely that someone else has done it. The hypothesis is really natural, which is just saying if I take some open ball around “chair,” I would hope I find each language’s way of representing the chair.
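One simple way to go looking for such a vector, offered as a sketch and not as the way it has to be done: take the mean embedding of the words in language X, subtract the mean embedding of the same words in English, and add that difference to any English word. The difference-of-means recipe, the helper names, and the tiny hand-built vectors below are all assumptions. The toy data is rigged so the shift works perfectly, which real embeddings will not do.

```python
import math

def mean_vector(vectors):
    """Coordinate-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def language_direction(source_vectors, target_vectors):
    """Target-language centroid minus source-language centroid."""
    source_mean = mean_vector(source_vectors)
    target_mean = mean_vector(target_vectors)
    return [t - s for s, t in zip(source_mean, target_mean)]

def nearest(query, candidates):
    """Key of the candidate vector closest to query in Euclidean distance."""
    return min(candidates, key=lambda key: math.dist(query, candidates[key]))

# PLACEHOLDER embeddings: the third coordinate is a fake "Chinese-ness" axis.
english = {"chair": [1.0, 0.0, 0.0], "water": [0.0, 1.0, 0.0]}
chinese = {"椅子": [1.0, 0.0, 1.0], "水": [0.0, 1.0, 1.0]}

direction = language_direction(list(english.values()), list(chinese.values()))

# Push "chair" along the direction and see which Chinese word it lands near.
shifted_chair = [x + d for x, d in zip(english["chair"], direction)]
guess = nearest(shifted_chair, chinese)

print(direction)
print(guess)
```

On real embeddings the interesting number is how often `nearest` picks the right translation over the whole word list, since that measures how much of translation a single vector can account for.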

And if there is this vector, that would be really whack? It would give us a really natural way to bypass the “struggle” of translation, and this must be a good thing in the long run, i.e. we can ‘read’ even older, gushi poems with even less effort. But we lose out on all that self-interpretation? I dunno. My brain says that localization for small indie games strongly benefits from the ability for them to tell their stories wherever, whenever. I know people that have applied to programs by writing essays in their main language before translating them to English. A friend writes blog posts with help from GPT to clean up the English. So many students use Claude to write their college essays. It’s an ordeal.

I don’t particularly care for the philosophy of languages or aesthetics to the point of making it a career, which is probably bad to say given the university I currently attend. I just like language and culture (and food). Meanings shift under translation and if they didn’t they must have originally been the same language, just different fonts. This is a whole thing, see 19.

Each one is its own, and each one is just as valid. To me, language is more than just a probability distribution, more than a vector embedding. But god it’s effective.

There are some interesting things related to language. I have no idea what people think about futurlang and for some reason it made me chuckle, as if this is what they wanted all along. They being the NLP-ists, or whoever.

Christmas

I had Christmas dinner at Oxford, and we sat next to Janet Pierrehumbert, probably one of the coolest professors in my mind. I didn’t get to ask her then but I’ve always wanted to ask her about poetry.

Maybe in the future.

Maybe next Christmas.