A baseline rating system for translations

Previously I’ve written about translation quality in the visual novel translation community, and the fact that peer review is an important part of the process of building something like a web of trust dealing with quality and reputation. However, we all sort of talk about ‘quality’ in vague terms on personal scales.

So, taking some ideas out of talks with zalas at encubed in #denpa, I’ve sketched out a possible way of at least talking about this topic on a common scale, a “Z-scale” so to speak, that sets some minimal criteria for talking about translation quality.

  • A – Translation has no material differences with the original work, though a limited number of immaterial issues may exist
  • B – Translation has numerous immaterial differences and/or, very rarely, material differences with the original work, which may lead the reader to misunderstand some points in the work
  • C – Translation has some material differences with the original work, to the point where a reader may misunderstand portions of the work
  • D – Translation has numerous material differences with the original work, to the point where a reader will greatly misunderstand the whole work
  • F – Translation has little to no relationship to the original work

Credit goes to zalas for sketching out where the upper bound of A should be, and we all knew where F would be. The specific definitions along the way are my own arbitrary definitions, but the general spirit of our discussions is there.

Upper bound

Note that what most people would probably consider acceptable translations would fall somewhere near B and A. By the time you’re at C, things would be pretty shaky. On the flip side, there can still be big differences between two “A”s. Besides the obvious style differences, there are much subtler ones that this scale just doesn’t touch.


Note that the scale here makes no mention of “quality” in terms of aesthetics. There’s nothing about whether a work is unreadable, or sounds like Shakespeare reborn. Aesthetics is practically orthogonal to this scale. We just want to answer the question “if I read this translation, how close will I get to the same informational experience as if I were reading the original?” We’re setting the bar as low as we can in this way, because while there are a million reasons to dislike a given translation, the only thing we’re likely to agree on is when a translation fails to translate.

You can have two translations that carry the same information content, where one reads like “Spot-subject, ‘ball’-direct_object, caught-verb” while the other reads “Spot caught the ball”. Notionally, both of these could be “A”, even though the first is something of a gloss that might be useful in linguistics but isn’t exactly bedtime story material.

If you want to talk about aesthetics, you could tack a number onto the rating, say 1 to 5, with 5 being “extremely pleasing” and 1 being “nearly incomprehensible”. So in the previous example, you’d have an “A1” rating versus an “A3” (or something, depending on your opinion). Just keep in mind that the aesthetics judgments are way more subjective than the A-F ratings, so one person’s A5 might be another’s A1.
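To make the notation concrete, here is a minimal sketch of how a rating like “A3” could be split into its two parts. The names `Rating`, `parse_rating` and `same_fidelity` are made up for illustration, and treating the number as optional is my own assumption; nothing here is an agreed-upon format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rating:
    grade: str                        # fidelity grade: one of A, B, C, D, F
    aesthetics: Optional[int] = None  # optional 1-5 score; None if left off


def parse_rating(text: str) -> Rating:
    """Parse strings such as 'A', 'B2' or 'a5' into a Rating."""
    match = re.fullmatch(r"([ABCDF])([1-5])?", text.strip().upper())
    if match is None:
        raise ValueError(f"not a valid rating: {text!r}")
    grade, score = match.groups()
    return Rating(grade, int(score) if score else None)


def same_fidelity(a: Rating, b: Rating) -> bool:
    """Two ratings agree on fidelity even if the aesthetics scores differ."""
    return a.grade == b.grade


gloss = parse_rating("A1")     # the word-by-word gloss
readable = parse_rating("A3")  # "Spot caught the ball"
print(same_fidelity(gloss, readable))
```

Keeping the letter and the number as separate fields reflects the idea that the two judgments are close to independent: you can compare grades across reviewers while taking the numbers with a grain of salt.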

Material differences

Not all errors carry the same weight: putting too much sugar in your coffee is not nearly as bad as putting too much salt in your coffee.

I’ll roughly define a material difference as “a difference that makes a significant difference in your understanding of a work.” There is room for difference of opinion here, but the general idea is that if the reader understands that “Anne needs to pay $1000 a month to her landlord” but the original text was “Anne gets paid $1000 a month because she’s the landlord”, we’ve got problems.

Now, you would think that the general context would give away that someone screwed up somewhere, but it can and does happen every so often. Usually though, it’s more common that some editor somewhere takes a big shoehorn and smooths things over the best they can, so it’s not blatantly obvious. Still, if the reader is led astray in such a way that their whole view of what happens in a text is skewed, that’d be a material difference.

This is in contrast to immaterial differences, where a translation error did occur, but it’s not that important. Does it matter too much if a character drank Darjeeling tea instead of Assam in one scene? There are cases where it does matter, for characterization, plot, etc., but there are also cases where it doesn’t. If a character drank a cup of tea in passing, and no one even remembers afterward that it happened, I’d be willing to accept that the error is immaterial. It’d be nice if there wasn’t a mistake, but I’m not going to lose sleep over it.

And in the future…

Well, having a scale is nice and all, but it’s pointless unless it gets some kind of currency, and that’s certainly not up to me…

However, one project I’d like to see sometime soon is something like a registry of projects. Inside, you’d have people who register as the translator of X, Y, Z projects. They can rate their own work, and they can also post their ratings of other people’s work. Ideally, there’d be a way to at least let translators verify that they’re who they say they are, and let them stake their reputations on their ratings. I can’t think of how to do that sort of registry without having a person handing out that privileged status by hand, and even then…

The general public could rate things too of course, but I’d at least like to see the people who are putting their reputations on the line get preferential weighting somehow or other.
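As one possible reading of “preferential weighting”, here is a rough sketch that combines ratings with a weighted median over the A-F order. The function name `weighted_median_grade`, the 3-to-1 weights, and the choice of a median are all assumptions of mine for the sake of the example, not a worked-out proposal.

```python
GRADES = ["A", "B", "C", "D", "F"]  # ordered from best to worst


def weighted_median_grade(ratings, translator_weight=3.0, public_weight=1.0):
    """Combine (grade, is_verified_translator) pairs into a single grade.

    Each rating adds its weight to its grade's bucket; the result is the
    first grade, walking from A to F, at which the running total reaches
    half of all the weight cast.
    """
    if not ratings:
        raise ValueError("no ratings to combine")
    totals = {grade: 0.0 for grade in GRADES}
    for grade, is_translator in ratings:
        totals[grade] += translator_weight if is_translator else public_weight
    half = sum(totals.values()) / 2
    running = 0.0
    for grade in GRADES:
        running += totals[grade]
        if running >= half:
            return grade


# Two anonymous readers say A; one verified translator says C.
votes = [("A", False), ("A", False), ("C", True)]
print(weighted_median_grade(votes))
print(weighted_median_grade(votes, translator_weight=1.0))
```

A median is used rather than an average because the grades are an ordered scale rather than numbers, so “halfway between A and C” isn’t obviously meaningful.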


