Confidentiality and Google Translate

Ethical principles, rules and conventions distinguish socially acceptable behaviour from that which is considered socially unacceptable. However, in social science research a few workers consider their work beyond scrutiny, presumably guided by a disinterested virtue which justifies any means to attain hoped for ends.

Ethical problems can relate to both the subject matter of the research as well as to its methods and procedures, and can go well beyond courtesy or etiquette regarding appropriate treatment of persons in a free society. Social scientists have often been criticized for lack of concern over the welfare of their subjects. The researcher often misinforms subjects about the nature of the investigation, and-or exposes them to embarrassing or emotionally painful experiences. […] It was found in a survey by the British Psychological Society that the two major areas of dilemma for members were confidentiality and research. Issues reported in this latter area included unethical procedures, informed consent, harm to participants, deception, and deliberate falsification of results.

The above is from a textbook I used in my applied linguistics studies, “Introduction to Research Methods” by Robert B. Burns. It is a very useful manual for those who wish to conduct research in education and in the social sciences. When I had to interview some subjects for my research, this book was my bible.

I was recently talking to a fellow member of IAPTI about confidentiality issues in translation, and how disconcerting it is that many translators, for the most part inexperienced ones, use online automatic translation tools such as Google Translate without knowing that they are breaching confidentiality between themselves and their clients. The concept of confidentiality between a translator and his client made me think of confidentiality between a researcher and his subject. I went back to my old textbook. Please humor me and reread the passage quoted above, replacing “social science” with “translation”, “social scientist” and “researcher” with “translator”, “investigation” with “translation method”, and “subject” with “client”. It would go something like this:

Ethical principles, rules and conventions distinguish socially acceptable behaviour from that which is considered socially unacceptable. However, in translation a few translators consider their work beyond scrutiny, presumably guided by a disinterested virtue which justifies any means to attain hoped for ends.

Ethical problems can relate to both the subject matter of the text as well as to the methods and procedures used to translate it, and can go well beyond courtesy or etiquette regarding appropriate treatment of persons in a free society. Translators have often been criticized for lack of concern over the welfare of their clients. The translator often misinforms clients about the nature of the translation method, and-or exposes them to embarrassing or emotionally painful experiences. […] 

Let’s look at that last sentence for a minute: Does the translator misinform clients about the nature of the translation method? One might argue that the translator doesn’t even discuss his translation method; he just agrees to do the translation. Well, we can play with words and use lawyers’ tricks, but if we really want to be honest with ourselves, the truth is that when we say “I will do the translation” we are telling the client that “I will do the translation”; I, the translator. And before arguing that it’s not necessarily what we mean, let’s put ourselves in the client’s shoes. What does the client understand when we say that, and what does he expect?

The sentence mentions embarrassing or emotionally painful experiences. Does this apply to us? Let me give just a couple of examples from personal experience:

Recently I translated some academic transcripts from Greek to English for a direct client; let’s call him “Yannis”. Along with the transcripts, I had to translate a long list of engineering course descriptions and a couple of cover letters. I had to rely on my own knowledge (I had taken many of those courses some years ago), on university websites, reliable engineering dictionaries, and my old textbooks. (Who would have thought that my 50,000-lb thermodynamics book, also used as a very effective doorstopper, would come in handy after all these years?)

What would have happened if I had used an online translation application, say Google Translate? If you think that a lousy translation is the only thing I would have gotten, think again. (And it would be lousy indeed! It turns out that before hiring me, Yannis had tried to do the translation himself, using Google Translate. I guess he didn’t get very far, so he decided to hire a professional. When I sent him my translation he took a quick look and immediately wrote back to thank me and tell me that now he understood why professional translators are so indispensable. I wanted to give Yannis a virtual hug.)

Anyway, let’s say I had considered using Google Translate to do this job. First of all, I would have no right to put Yannis’ transcripts on a public platform. If Yannis wanted to do so, that would be his right; those were his grades. I would have no right to share Yannis’ grades with anyone, nor would I have the right to share his personal cover letter with people who are not the intended recipients. Maybe I could remove information that could be used to identify him… His name, address, title, affiliation, all the grades (I’m sure I’d miss something); maybe I should remove the name of the university as well, and the department, and the year of graduation, and the title of the degree. What’s left? Right, the list of courses. But then, would Google really be able to give me a good translation of the description of that specialized course on the dynamics of Diesel engines or the one on welding and soldering techniques? What else would be left? The main body of the cover letters. Again, I have no right to share a letter written by someone other than me with people to whom it is not addressed. Plus, if Yannis wanted the letters to be translated by Google, he could have done that himself.

If Yannis ever decided to do an online search for some terms or sentences appearing in those cover letters, he might find the entire text online. Talk about an embarrassing and emotionally painful experience! And of course he’d feel cheated. And if he then mentioned it to me, the embarrassing experience would be all mine.

Now is that the kind of relationship we want to build with our clients? Does the use of an online automatic translation tool reflect the respect and confidentiality that they deserve and consider a given when they hire us? Is that how we make sure they are satisfied and would hire us again or recommend us to others?

Now if a simple document such as an academic transcript is confidential, think about medical records. Or press releases. Or private-meeting minutes. Or advertising campaigns. Or private correspondence. And yet there are translators who use Google Translate, oblivious to the fact that Google is not Mother Teresa, doing your translation for you and asking for nothing in return, out of the goodness of its little silicon heart. “I’m doing this for the common good,” you might say; “if other translators ever need that information, they can find it easily online thanks to me”. Well, the problem with this concept is that the data you are sharing is NOT yours to share!

This brings us to a fundamental difference between the researcher-subject scenario (case A) and the translator-client scenario (case B): In case A, the study is conducted by the researcher and is his own work from beginning to end: he chooses the topic, he designs the study, he collects and analyzes the data, and he is the one to present the work, for his own benefit (and in the long term for the benefit of the scientific community or perhaps society in general). In case B, the case that concerns us translators, we are given temporary access to work that is not ours. The topic of the document we are to translate, the content, the layout and the presentation all belong to the client, not to the translator, and they are to be used for the client’s benefit. So if confidentiality is such an important concern in case A, think how important it is in case B, i.e. in translation.

To the embarrassing or emotionally painful experiences, as mentioned by Burns, add “professionally detrimental” ones. Here’s an example: I am often asked to translate research articles to be published in American scientific journals. Again, this is research, to be published. Sometimes these papers describe many years’ worth of research. The authors have chosen specific journals through which to make their work known to the scientific community. They have not chosen Google’s database, they have not chosen forums of online translation portals (where translators ask for term advice, and for context they give entire paragraphs that often include highly sensitive and confidential information), they have not chosen anything other than those journals, and it is those journals that will have copyright. Imagine how professionally detrimental it can be to the author of such a paper if that paper, whether in its entirety or in part, appears online before the author even has the chance to submit it for publication.

In the same chapter about ethics, privacy, and confidentiality, Burns goes on to say:

The right to privacy is an important right enshrined now in international (UN Declaration of Human Rights) and national legislation. […] Individuals should decide what aspects of their personal lives, attitudes, habits, eccentricities, fears and guilt are to be communicated to others. […] This does not mean that personal and private behavior cannot be observed ethically; it can, provided that the subjects volunteer to participate with full knowledge of the purposes and procedures involved.

The above applies to us as well. Our clients are the only ones who have the right to decide what aspects of their life or work are to be communicated to others, and they must have full knowledge of the procedures involved in the translation. If you plan to outsource the work or if you plan to use an online automatic translation tool or use any other method that might compromise privacy and confidentiality, you should tell your client and obtain his permission. If you are not telling your client because you think it doesn’t concern him, based on the above you’re wrong. If you’re not telling him because you might think he won’t hire you if you do, that means you are knowingly doing something wrong, i.e. you are aware that you are compromising privacy and confidentiality and still choose to proceed. You proceed until a client finds out and complains, or until a client takes legal action against you, or until the translators’ association you belong to tells you that you have violated its code of ethics, or until you simply realize that professionalism in our field of work goes well beyond delivering a good translation.

Ref: Burns, R.B. (2000). Introduction to Research Methods, 4th edition, Pearson Education Australia.

14 thoughts on “Confidentiality and Google Translate”

  1. Great article!

    I know many researchers who choose to hire translators to improve the quality of their publications and are not aware of the confidentiality risks. I hope both researchers and translators learn not to trivialize this problem.

  2. Thanks for sharing your thoughts.

    Do you know if Google Translate actually stores the texts you enter into it to translate (I know they probably do at least for internal use) but more importantly: do they publish those texts anywhere?

    As far as I know, the databases and parallel corpora behind Google Translate are not available to the general public.

  3. Your article clearly shows that our profession demands virtues such as integrity and trustworthiness that are not to be taken lightly, either by our clients when looking for the right man for the job or by ourselves when trying to be that man. I really appreciate this kind reminder of yours.

  4. Thank you, all, for reading my post.
    Tisho, it does store them in order to use them for future lookups/translations. As far as I know they do not publish our entire texts anywhere (yet); they only display segments of them if they match, completely or partially, the source segments you enter for translation. I say "yet" because this may very well change; they warn us about it in their own terms of service. According to those terms of service (which I find very scary, and which I think everyone should read before using this tool):

    "By using Google Translator Toolkit (the 'Service'), you agree to be bound by our Google Terms of Services located at http://www.google.com/accounts/TOS as well as these additional terms. Google may change these terms from time to time […]
    By submitting or creating your content through the Service, you grant Google permission to use your content to improve or make available the Services pursuant to these additional terms, provided that Google will not disclose the subject matter of your content or make your content available on a standalone basis to any third parties without your consent. If Google displays your content to an end user, it will do so only according to the sharing rules below, and only on a translation unit basis. […] The term 'translation unit' has the meaning assigned to it in the XLIFF standard, and 'displaying on a translation unit basis' means that a translated segment will be displayed only in response to fuzzy match search of the source segment."

  5. Thank you, Maria. This is valuable information.

    I am still not sure how they can use my texts to improve their service if I get their (usually faulty) translations and edit them in my own tool (say Word or Notepad, OmegaT or Trados…). Statistical machine translation is based on parallel corpora of texts, which have been translated by people, i.e. good translations. I do not submit the final version of my translations to Google in any way, so they cannot use them for future lookups or translations. Maybe they can use the source texts a user enters to collect a corpus to create frequency lists or for linguistic analysis…

    Anyway, there are other MT tools which I know do not store your texts, Apertium.org for example.

    I totally agree with you on the confidentiality issues you mention. However, I still find that MT is very useful because it definitely saves time (for typing or even looking up in dictionaries or thesauri). As long as there is no confidentiality issue, I believe MT is a very good thing.

  6. Hello, Maria. I find your post a useful reflection about the ethical implications of our work, something that is not discussed enough in the translation sector.

    Most of the time we have sensitive information in our hands, and it is up to the client, not us, to choose when to disclose (or take the risk of disclosing) that information to others. Sometimes, the mere fact that such information is being translated is in itself confidential. I hope the profession in general will become used to not overlooking these "details".

  7. «However, I still find that MT is very useful because it definitely saves time (for typing or even looking up in dictionaries or thesauri). As long as there is no confidentiality issue, I believe MT is a very good thing.» Sure, Tisho. MT is ok if you use your own software (either open source or paid). It's just another tool for us which, as you rightly said, helps us to optimize our time (which can be used to fine-tune the document further, to do business development or just to... relax and enjoy more free time!). However, two models are not good for us: 1) online MT, because it infringes our confidentiality obligation; 2) MT used by agencies who later send us a document to post-edit. In this case, even if we are paid the best possible rate ever, when we return the document (the combo is always MT/TM), we are teaching the agency's machine to replace us. Are we… crazy? So a huge YES to machine translation offline and on our own computers. As we say YES to voice recognition software, faster PCs, etc. A great 2012 for all of us!

  8. Loved your writing. You clearly have a scientific/inquisitive mind and your info helped me solve the confidentiality debate I was having with myself.
    Will share your blog with other colleagues whose clients have asked them how confidentiality is preserved.
    Best wishes from Mexico City.
    Luz McClellan

  9. I find your article way too long and I do not intend to read it.

    Could you summarize your main points, in a nutshell?

    Out of respect for your colleagues' time.

    Thank you very much.

  10. Isabelle, if you do not intend to read it, that is fine. I am happy that other colleagues read it in its entirety, out of respect for my time or because they found it useful. And it is precisely out of respect for my colleagues that I write in such detail in the best way I can, trying to include useful information. As you can see, I only write a couple of articles a year, only when I have something to say. I don't write for the sake of writing or bombard colleagues with meaningless posts every few days. Honestly I would encourage people to read not only the article but the comments as well.

  11. Maria, thank you for your thorough research. I'm copying my comments from IAPTI's LinkedIn group with updates.

    I think focusing on Cloud-based MT services (GT, KantanMT, MS Bing, etc.) overlooks these important confidentiality issues with regard to other service providers such as Cloud CATs (services like SmartCAT, MateCat, Memsource, XTM, etc.). These services offer features that require you to upload not only your current work, but also TMs, term bases, etc.

    First, don't let their security technobabble distract you. Cloud services can (and mostly do) offer encrypted connections (HTTPS) that protect information from prying 3rd parties during its transit across the Internet. They can (and mostly do) encrypt the content when it's at rest in storage (on their hard drives). However, the services decrypt content at will when you use their systems. When you entrust your work, TMs, term bases, etc. to these services, they are secure only inasmuch as you can trust the service provider to honor and enforce their own security policies. A disgruntled employee (à la Edward Snowden?) or a change in company ownership can easily see your entrusted content move to untrustworthy hands. Check your client confidentiality agreements carefully because they might prohibit using Cloud-based tools regardless of the service provider's security policies.

    Furthermore, discussions of confidentiality almost always overlook the translator's confidential information. Separate from the original & translated content, your personal information includes your personal TMs & term bases. I believe these things are your trade secrets. When you upload these resources, you're sharing them with others who could benefit from misusing them. This is why uploading to Cloud services like Dropbox, which have no intention of offering services that benefit from the shared content, is a different matter. We're careful not to create conflicts of interest in other areas of our business, but for some reason conflicts of interest are overlooked with Cloud tools all too easily.

    Finally, your speed & productivity (efficiency) are also your trade secrets that deserve confidentiality. Imagine investing a few hundred dollars in a tool (any tool) that helps you deliver the same quality in less time regardless of content confidentiality. Cloud-based tools have the capacity to track your speed and productivity. Some already implement these features and some of those even offer this information to their clients (agencies and translation customers). With this information, they can calculate your effective hourly rate. As you work faster, how long before your customers offer lower per-word rates to keep your hourly rate in a range they like? Why reveal your personal confidential information, i.e. your trade secrets, to those who would benefit from exploiting it?
