6 results
Found in Translation: How Language Shapes Our Lives and Transforms the World by Nataly Kelly, Jost Zetzsche
airport security, Berlin Wall, Celtic Tiger, crowdsourcing, Donald Trump, glass ceiling, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, randomized controlled trial, Ray Kurzweil, Skype, speech recognition, Steve Jobs, the market place
One famous example of machine translation gone awry is actually an urban legend. As the story goes, the sentence “The spirit is willing, but the flesh is weak” was plugged into a machine translation system to be rendered into Russian. Allegedly, the computer produced “The vodka is strong, but the meat is rotten” in Russian. This tale has never been substantiated, but it’s not completely inconceivable. The story probably serves a good purpose as a warning that generic machine translation cannot and should not be blindly trusted. Parlez-Vous C++? Anyone who’s taken a language course in school knows how hard it is to learn a foreign language. And, depending on what language you speak natively, some languages are significantly harder than others. For example, it takes an estimated ten years to train an Arabic–English translator to reach full competence, a hard lesson the U.S. government learned after the events of 9/11.
The sound of every word and every syllable is placed and chosen perfectly. But can prose this pitch perfectly balanced be translated? Here’s one rendering: We can see its similarity, but it’s difficult to evaluate its success without being able to read Russian—or at least the Cyrillic alphabet. But our Russian friends tell us that they feel the same kind of shivers and tingle of excitement when they read it.2 In other words, even such intimate language is translatable. The passage still sounds erotic, even when translated. (Vladimir Nabokov originally wrote Lolita in English. He had the privilege of growing up trilingually in an aristocratic Russian family, and he also personally translated Lolita into Russian.) Could the language itself have something to do with the text’s ability to tantalize its readers?
Husband-and-wife team Mark Herman and Ronnie Apter work together to translate these libretti into English for operas written in Italian, Russian, French, German, or Czech.5 It’s a daunting task according to Apter, a retired professor of literature from Central Michigan University. She describes libretto translation as poetry translation with several major differences: The words must be perfectly fitted to the music, the rhythm of the language and the cadence must be matched to the melody, and the translator must consider the diction level that the particular character would use and the physical limitations of what a singer can actually sing. At the same time, the translation must use modern concepts of translation while reflecting the historical flavor of the original. Herman and Apter have both studied voice. After translating a libretto literally, they work on the lyrical translation of the libretto by singing it to each other—Herman, the low voices; Apter, the high ones—to find out whether their translations fit with the music.
Overcomplicated: Technology at the Limits of Comprehension by Samuel Arbesman
3D printing, algorithmic trading, Anton Chekhov, Apple II, Benoit Mandelbrot, citation needed, combinatorial explosion, Danny Hillis, David Brooks, discovery of the americas, en.wikipedia.org, Erik Brynjolfsson, Flash crash, friendly AI, game design, Google X / Alphabet X, Googley, HyperCard, Inbox Zero, Isaac Newton, iterative process, Kevin Kelly, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mandelbrot fractal, Minecraft, Netflix Prize, Nicholas Carr, Parkinson's law, Ray Kurzweil, recommendation engine, Richard Feynman, Richard Feynman: Challenger O-ring, Second Machine Age, self-driving car, software studies, statistical model, Steve Jobs, Steve Wozniak, Steven Pinker, Stewart Brand, superintelligent machines, Therac-25, Tyler Cowen: Great Stagnation, urban planning, Watson beat the top human players on Jeopardy!, Whole Earth Catalog, Y2K
For a clear example of the necessary complexity of a machine model for language, we need only look at how computers are used to translate one language into another. Take this great, though apocryphal, story: During the Cold War, scientists began working on a computational method for translating between English and Russian. When they were ready to test their system, they chose a rather nuanced sentence as their test case: “The spirit is willing, but the flesh is weak.” They converted it into Russian, and then ran the resulting Russian translation back again through the machine into English. The result was something like “The whiskey is strong, but the meat is terrible.” Machine translation, as this computational task is more formally known, is not easy. Google Translate’s results can be imprecise, though interesting in their own way. But scientists have made great strides.
The rules will cower in fear before such regionalisms. Using grammatical models to process language for translation simply doesn’t work that well. Language is too complex and quirky for these elegant rules to work when translating a text. There are too many edge cases. Into this gap have stepped numerous statistical approaches from the world of machine learning, in which computers ingest huge amounts of translated texts and then translate new ones based on a set of algorithms, without ever actually trying to understand or parse what the sentences mean. For example, instead of a rule saying that placing the suffix “-s” onto a word makes it plural, the machine might know that “-s” creates a plural word, say, 99.9 percent of the time, whereas 0.1 percent of the time it doesn’t, as with words like “sheep” and “deer” that are their own plurals, or irregular plurals such as “men” or “feet” or even “kine.”
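The contrast drawn in the excerpt above, between a hard-coded plural rule and a learned probability, can be sketched in a few lines of Python. This is only an illustrative toy, not how any production translation system works: the word list `PAIRS` and the helper names are hypothetical, and the resulting proportions reflect this tiny sample rather than the 99.9 percent figure quoted in the text.

```python
from collections import Counter

# Hypothetical toy data: (singular, plural) pairs standing in for the
# "huge amounts of text" a real statistical system would ingest.
PAIRS = [
    ("cat", "cats"), ("dog", "dogs"), ("book", "books"),
    ("car", "cars"), ("tree", "trees"),
    ("sheep", "sheep"), ("deer", "deer"),            # their own plurals
    ("man", "men"), ("foot", "feet"), ("cow", "kine"),  # irregular plurals
]


def plural_pattern(singular, plural):
    """Label how a plural was formed from its singular."""
    if plural == singular + "s":
        return "add -s"
    if plural == singular:
        return "unchanged"
    return "irregular"


def estimate_pattern_probabilities(pairs):
    """Estimate how often each pattern occurs, purely by counting."""
    counts = Counter(plural_pattern(s, p) for s, p in pairs)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


probs = estimate_pattern_probabilities(PAIRS)
for pattern, probability in sorted(probs.items()):
    print(f"{pattern}: {probability:.0%}")
```

Nothing here encodes the rule that "-s" makes a plural. The program only observes how often that pattern holds in the data it was given, which is the sense in which the statistical approach avoids parsing or understanding the language.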
But scientists have made great strides. What techniques are used by experts in machine translation? One early approach was to use the structured grammatical scaffolding of language I mentioned above. Linguists hard-coded the linguistic properties of language into a piece of software in order to translate from one language to another. But it’s one thing to deal with relatively straightforward sentences, and another to assume that such grammars can handle the diversity of language in the wild. For instance, imagine you create a rule that handles straightforward infinitives, but then doesn’t account for split ones, such as “To boldly go where no one has gone before.” And what about regional phrases, like the Pittsburghese utterance “The car needs washed” (skipping over “to be”)? The rules will cower in fear before such regionalisms.
Nerds on Wall Street: Math, Machines and Wired Markets by David J. Leinweber
AI winter, algorithmic trading, asset allocation, banking crisis, barriers to entry, Big bang: deregulation of the City of London, butterfly effect, buttonwood tree, buy low sell high, capital asset pricing model, citizen journalism, collateralized debt obligation, corporate governance, Craig Reynolds: boids flock, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, Danny Hillis, demand response, disintermediation, distributed generation, diversification, diversified portfolio, Emanuel Derman, en.wikipedia.org, experimental economics, financial innovation, Gordon Gekko, implied volatility, index arbitrage, index fund, information retrieval, Internet Archive, John Nash: game theory, Khan Academy, load shedding, Long Term Capital Management, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, market fragmentation, market microstructure, Mars Rover, moral hazard, mutually assured destruction, natural language processing, Network effects, optical character recognition, paper trading, passive investing, pez dispenser, phenotype, prediction markets, quantitative hedge fund, quantitative trading / quantitative finance, QWERTY keyboard, RAND corporation, random walk, Ray Kurzweil, Renaissance Technologies, Richard Stallman, risk tolerance, risk-adjusted returns, risk/return, Ronald Reagan, semantic web, Sharpe ratio, short selling, Silicon Valley, Small Order Execution System, smart grid, smart meter, social web, South Sea Bubble, statistical arbitrage, statistical model, Steve Jobs, Steven Levy, Tacoma Narrows Bridge, the scientific method, The Wisdom of Crowds, time value of money, too big to fail, transaction costs, Turing machine, Upton Sinclair, value at risk, Vernor Vinge, yield curve, Yogi Berra
For the technically ambitious reader, Lucene (http://lucene.apache.org/), Lingpipe (http://alias-i.com/lingpipe/), and Lemur (www.lemurproject.org/) are popular open source language and information retrieval tools. 29. Anthony Oettinger, a pioneer in machine translation at Harvard going back to the 1950s, told a story of an early English-Russian-English system sponsored by U.S. intelligence agencies. The English “The spirit is willing but the flesh is weak” went in, was translated to Russian, which was then sent in again to be translated back into English. The result: “The vodka is ready but the meat is rotten.” Tony got out of the machine translation business. 30. This modern translator is found at www.systransoft.com. I tried Oettinger’s example again, 50 years later. The retranslation of the Russian back to English this time was “The spirit is of willing of but of the flesh is of weak.” 31. The CIA In-Q-Tel venture capitalists are found here: www.inqtel.org/.
Language Technology: Your Tax Dollars at Work The U.S. government has been busy spending your money on technologies to do this kind of content extraction and analysis for years. When the research first started, the language researchers were most interested in was Russian. Harvard’s Tony Oettinger, who led the research, tells of inputting the English sentence “The spirit is willing but the flesh is weak” into an English-to-Russian translation program. The computer’s Russian translation was then fed back into a Russian-to-English translation program. The resulting retranslation was “The vodka is ready but the meat is rotten.” Language technology has improved markedly, but still has a long way to go. The Defense Advanced Research Projects Agency (DARPA) paid for the development of the Internet, which was called the ARPANET in the early days.
A wide range of promising technologies are just being brought into play in this area. So far, English is the language for almost all of these systems. Machine translation, in general, has been difficult,29 but for literal, as opposed to artistic, content, as is found in most business and financial stories, it can do a passable job. Systran offers a translation system that you can experiment with online.30 The “as the world turns” time zone effect means that many stories will appear first in international sources in languages other than English. The proliferation of cross-listed or economically equivalent securities means there are often trading opportunities in countries that will be learning the news, via translation, later on in the news cycle. We have seen how disintermediation (eliminating the middlemen) cut the ranks of sell-side traders as their clients turned to direct market access and algorithmic trading.
The Last Lingua Franca: English Until the Return of Babel by Nicholas Ostler
barriers to entry, BRICs, British Empire, call centre, en.wikipedia.org, European colonialism, Internet Archive, invention of writing, Isaac Newton, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, open economy, Republic of Letters, Scramble for Africa, statistical model, trade route, upwardly mobile
Without receiving preprogrammed guidance on the geometry of the various worlds of modern westernized homes, schools, and offices, a system could not easily select the correct sense for pen (‘writing instrument’, as against ‘enclosure’) in “The pen is in the box” as against “The box is in the pen.” Nor was there any general means of selecting appropriate equivalents when language was used metaphorically. If it is ridiculous—because irrelevant to the context—to see references to good vodka and meat of dubious quality when encountering “The spirit is willing, but the flesh is weak” in an essay on government policy, how come in an environmental work “the rape of the countryside” does not refer to oilseed rape, a crop ubiquitous in the fields of modern England? Human reason, and even more human rhetoric, is inclined to be inscrutable. In practice, it seemed to be impossible to divorce the syntactic part of language processing from modeling the meaning of particular texts.
Nowadays, it is quite difficult to find a part of the world with no historical connection with English-speaking powers, but a good candidate would be Mongolia, 1.5 million square kilometers sandwiched between Russian Siberia and Chinese “Inner Mongolia.” This has never been part of any UK or U.S. sphere of influence, at any time in its history. Nevertheless, in the 1990s the attempt was being made to retrain half of its Russian-language teachers to be competent in English; and in 2004 its prime minister, Tsakhia Elbegdorj (admittedly, a Harvard-educated man), announced that English would be substituted for Russian as the first foreign language in Mongolian schools. The explicit motivation of the government is economic, with aspirations to make Ulan Bator, the capital, some kind of center for outsourcing and call-center services, to be modeled on Singapore or India’s Bangalore.12 It remains to be seen if the aspirations can be translated into reality, but the idea is now rife that English is the natural choice for those seeking access to the world’s wealth.
As the community has grown, the use of French too has become disputed. Although the recently acceded countries, mostly from central and eastern Europe, might have been expected to be more familiar with the use of German, or indeed Russian, in practice they seem to prefer to use English; and with more and more target languages requiring translation into them, there is a premium on reducing the number of languages from which the translation is done. The result has been increasing pressure to use English, and English only, as a common medium. Meanwhile, the other established lingua-francas of the Continent’s regions (including French and German, but also Spanish, Italian, and Russian) have enjoyed no special status as such, although it is recognized that they are the languages that—besides English—are most widely understood.* German is the native language of 18 percent of the EU population, but has been acquired as a lingua-franca by another 14 percent.
Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, backtesting, Black Swan, book scanning, bounce rate, business intelligence, business process, call centre, computer age, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data is the new oil, en.wikipedia.org, Erik Brynjolfsson, experimental subject, Google Glasses, happiness index / gross national happiness, job satisfaction, Johann Wolfgang von Goethe, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, Norbert Wiener, personalized medicine, placebo effect, prediction markets, Ray Kurzweil, recommendation engine, risk-adjusted returns, Ronald Coase, Search for Extraterrestrial Intelligence, self-driving car, sentiment analysis, software as a service, speech recognition, statistical model, Steven Levy, text mining, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Turing test, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra
It is a game of question answering, despite its stylistic convention of phrasing each contestant response in the form of a question beginning “What is” or “Who is.” 2. We face yet another “Mission Impossible” trying to get the computer to write instead of read. Generating human language trips up the naïve machine. I once received a voice-synthesized call from Blockbuster reminding me of my rented movie’s due date. “This is a message for Eric the Fifth Siegel,” it said. My middle initial is V. Translation between languages also faces hazards. An often-cited example is that “The spirit is willing, but the flesh is weak,” if translated into Russian and back, could end up as “The vodka is good, but the meat is rotten.” 3. Watson’s avatar, its visual depiction shown on Jeopardy!, consists of 42 glowing, crisscrossing threads as an inside joke and homage that references the significance this number holds in Adams’s infamous Hitchhiker’s Guide. 4. Watson was not named after this fictional detective—it was named after IBM founder Thomas J.
Quote about Google’s book scanning project: George Dyson, Turing’s Cathedral: The Origins of the Digital Universe (Pantheon Books, 2012). Natural language processing: Dursun Delen, Andrew Fast, Thomas Hill, Robert Nisbit, John Elder, and Gary Miner, Practical Text Mining and Statistical Analysis for Non-Structured Text Data Applications (Academic Press, 2012). James Allen, Natural Language Understanding, 2nd ed. (Addison-Wesley, 1994). Regarding the translation of “The spirit is willing, but the flesh is weak”: John Hutchins, “The Whisky Was Invisible or Persistent Myths of MT,” MT News International 11 (June 1995), 17–18. www.hutchinsweb.me.uk/MTNI-11–1995.pdf. Ruminations on Apple’s Siri versus Watson from WolframAlpha’s creator: Stephen Wolfram, “Jeopardy, IBM, and WolframAlpha,” Stephen Wolfram blog, January 26, 2011. http://blog.stephenwolfram.com/2011/01/jeopardy-ibm-and-wolframalpha/.
The machine learning process is designed to accomplish this task, to mechanically develop new capabilities from data. This automation is the means by which PA builds its predictive power. The hunter returns back to the tribe, proudly displaying his kill. So, too, a data scientist posts her model on the bulletin board near the company ping-pong table. The hunter hands over the kill to the cook, and the data scientist cooks up her model, translates it to a standard computer language, and e-mails it to an engineer for integration. A well-fed tribe shows the love; a psyched executive issues a bonus. The tribe munches and the scientist crunches. To Act Is to Decide Knowing is not enough; we must act. —Johann Wolfgang von Goethe Potatoes or rice? What to do with my life? I can’t decide. —From the song “I Suck at Deciding” by Muffin1 (1996) Once you develop a model, don’t pat yourself on the back just yet.
The Complete Novels Of George Orwell by George Orwell
Let’s have a look at you. He seemed to be lying on the bed. He could not see very well. Her youthful, rapacious face, with blackened eyebrows, leaned over him as he sprawled there. ‘How about my present?’ she demanded, half wheedling, half menacing. Never mind that now. To work! Come here. Not a bad mouth. Come here. Come closer. Ah! No. No use. Impossible. The will but not the way. The spirit is willing but the flesh is weak. Try again. No. The booze, it must be. See Macbeth. One last try. No, no use. Not this evening, I’m afraid. All right, Dora, don’t you worry. You’ll get your two quid all right. We aren’t paying by results. He made a clumsy gesture. ‘Here, give us that bottle. That bottle off the dressing–table.’ Dora brought it. Ah, that’s better. That at least doesn’t fail. With hands that had swollen to monstrous size he up-ended the Chianti bottle.
The nearest one could come to doing so would be to swallow the whole passage up in the single word crimethink. A full translation could only be an ideological translation, whereby Jefferson’s words would be changed into a panegyric on absolute government. A good deal of the literature of the past was, indeed, already being transformed in this way. Considerations of prestige made it desirable to preserve the memory of certain historical figures, while at the same time bringing their achievements into line with the philosophy of Ingsoc. Various writers, such as Shakespeare, Milton, Swift, Byron, Dickens, and some others were therefore in process of translation: when the task had been completed, their original writings, with all else that survived of the literature of the past, would be destroyed. These translations were a slow and difficult business, and it was not expected that they would be finished before the first or second decade of the twenty-first century.
Because the Inquisition killed its enemies in the open, and killed them while they were still unrepentant: in fact, it killed them because they were unrepentant. Men were dying because they would not abandon their true beliefs. Naturally all the glory belonged to the victim and all the shame to the Inquisitor who burned him. Later, in the twentieth century, there were the totalitarians, as they were called. There were the German Nazis and the Russian Communists. The Russians persecuted heresy more cruelly than the Inquisition had done. And they imagined that they had learned from the mistakes of the past; they knew, at any rate, that one must not make martyrs. Before they exposed their victims to public trial, they deliberately set themselves to destroy their dignity. They wore them down by torture and solitude until they were despicable, cringing wretches, confessing whatever was put into their mouths, covering themselves with abuse, accusing and sheltering behind one another, whimpering for mercy.