Using Cloud Solutions for Translation: Yes or No?

We read about the benefits of using the cloud for work: cloud applications and storage, for example. What we don’t see are warnings about the risks. You have to look for these specifically; the information doesn’t come to you the way the claimed benefits do. This morning I got yet another e-mail from my web hosting service, encouraging me to use their new cloud storage, and I am tired of receiving iCloud notifications on my phone when I specifically chose not to use that service. So I’d like to share some things I learned in a graduate course on cloud computing at Boston University a couple of years ago, including some essential information from “Cloud Computing: A Practical Approach” by Anthony T. Velte, Toby J. Velte, and Robert Elsenpeter, to explain why using the cloud may not be such a good idea, at least for our work.
  
  
First of all, we should understand that cloud computing is not for everyone and it is not for everything. Just because it’s there and offers some benefits doesn’t mean we should use it. According to the authors, whether or not we should use cloud computing depends on a number of factors, including whether our data is regulated. Is our data (i.e. the original texts we translate and our translations) regulated? Well, the original text is not even our data; it is the client’s. And our translations are different-language equivalents of someone else’s data. So even if the client’s data is not regulated, it’s still not our data to share.
  
Now I could just stop the blog here. It is not our data; we simply do not have permission to store it on third-party equipment or manipulate it with third-party applications. End of story. But for the curious, I’ll give some more information.
  
What does “using the cloud” really mean? What would we use exactly? What is the cloud?
  
Based on conversations I’ve had with colleagues, many see the cloud as something obscure, something abstract, pretty much like a real cloud without a specific shape or form, something up there, hard to conceive, something shared by many or by all. Actually it is something very specific and definitely not abstract.
  
These are the three major implementations of cloud computing:
1.  Compute Clouds
2.  Cloud Storage
3.  Cloud Applications
  
Compute clouds allow us to access applications and on-demand computing resources maintained on a provider’s equipment; examples are Amazon’s EC2 and Google App Engine. (“On-demand resources” means that you don’t have to keep the infrastructure on your own equipment and run the code there; the resources are on someone else’s equipment and you use and pay for them only when you need them.)
  
Cloud storage is the most popular implementation. It allows us to maintain our files on a cloud-storage vendor’s equipment. (This is what my website hosting service keeps bothering me about. I am not interested, thank you very much.)
  
Cloud applications are similar to compute clouds in that they allow us to use applications maintained on a provider’s equipment; the difference from compute clouds is that cloud applications use software that relies on cloud infrastructure, i.e. software that depends on the infrastructure of the Internet itself. Examples are Skype (peer-to-peer computing), MySpace or YouTube (web applications, delivered to users via a browser), and Google Apps (Software as a Service, or SaaS).
  
Let’s consider the translation of a medical record. I won’t even go into the habit some translators have of asking terminology questions on sites like AmateurZ (aka PrAdZ, SuckZ, etc.) and including the patient’s name, because that is simply beyond me. It is inconsiderate, unacceptable, inconceivable! But that’s another story. Let’s focus on the cloud. So let’s say that you want to store the translation in some backup directory you have on the cloud, i.e. on someone else’s equipment, or translate a few sentences with an online translation tool or a CAT tool that uses a shared memory stored on a cloud server (requiring you to also save your translation in the shared memory). What is the problem with that? From the horse’s mouth (the horse is Velte et al.):
  

“If you want to use cloud computing and post data covered by Health Insurance Portability and Accountability Act (HIPAA) on it, you are out of luck. Well, let’s rephrase that: if you want to put HIPAA data on a cloud, you shouldn’t. That’s sensitive healthcare information and the fact that HIPAA data could commingle on a server with another organization’s data will likely get the attention of an observant HIPAA auditor.”

No matter how much cloud giants like Google and Microsoft try to reassure us that the data placed on a cloud are safe, all it takes is one tiny breach to let sensitive data loose. And of course this raises another question: if the data is let loose, who is liable?
  
According to the authors: “If you have data that is regulated—like HIPAA or Sarbanes-Oxley—you are well advised to be very careful in your plans to place data on a cloud. After all, if you have posted a customer’s financial data and there’s a breach, will they go after the cloud provider or you? […] It is probably best to avoid a painful fine, flesh-eating lawyers, and possible jail time.” Note that jail time can be 1 to 10 years for HIPAA and up to 20 years for Sarbanes-Oxley data. I won’t mention the financial penalties because I’d like to spare you the heart attack, but if you’d like to know about them, I refer you to this book, page 26.
  
Even if the customer considers going after the cloud provider too, chances are the cloud provider has already foreseen this possibility and has made sure to absolve itself of any responsibility in its agreement with you. If you want to know Google’s attitude towards confidentiality, I refer you to a couple of old blog posts of mine, “Confidentiality and Gmail” and “Confidentiality and Google Translate”, where you’ll see that by accepting Google Translate’s terms of service we grant Google permission to use our content to improve its services.
  
I’m not saying that cloud providers like Google are after you. And not all applications are like Google Translate, which wants to gain something from your translations. In fact the big vendors have strict security measures. What I am saying is that you should not count on the cloud provider to protect or respect the confidentiality of your data or your client’s data. Whatever the provider’s security measures, you are the one responsible for keeping that data secure.
  
So the cloud provider is not after you. But you know what? Someone else is. Take a guess. Going once, going twice….
  
Hackers! Yes, hackers can cause a lot of damage if they get access to your data or your client’s data. They can get access to the company trade secrets you translated and sell them to the company’s competitor. They can get access to a company’s proprietary information and threaten to disclose it if they don’t receive a very generous sum. There are too many scenarios to list. Use your imagination and know that these things do happen. And on a not-so-funny note, when I took a “certified hacker” course (wait, let me explain: I worked as a software quality engineer for a while, where testing the quality of software products also meant testing security, and to test security you need to know how to break the software, hence the course, paid for by the company), I was shocked to learn that some hackers do it for… fun! Just because they can and just because they want to test themselves. This too happens. You don’t want them knocking on your door and telling you, “You either pay me 20,000 dollars and get all those financial records back, or you pay 100,000 dollars in fines for a confidentiality breach.” It sounds far-fetched and maybe it doesn’t happen often, but it can happen. Hackers are mostly after larger corporations, not individual translators; on the other hand, when they hack into data stored on a cloud, they care more about the data than about who put it there.
  
What does all this mean? Should we never store data on a provider’s cloud or use cloud services?
  
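Not necessarily. If you want to store data on a provider’s cloud, one thing you can do is encrypt it before you upload it. Look for programs like TrueCrypt (www.truecrypt.org) to do this; TrueCrypt itself was discontinued in 2014, and VeraCrypt is a maintained successor. That way, if someone gets access to your data, they won’t be able to read it without your key.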
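For the curious, here is roughly what “encrypt it first” looks like in practice. This is only a minimal sketch in Python, not a replacement for a proper disk-encryption program. It assumes the third-party `cryptography` package is installed, and the function names (`encrypt_bytes`, `decrypt_bytes`), the salt length, and the iteration count are my own illustrative choices.

```python
import base64
import hashlib
import os

from cryptography.fernet import Fernet  # third-party: pip install cryptography

SALT_LEN = 16             # bytes of random salt stored in front of the ciphertext
KDF_ITERATIONS = 200_000  # illustrative work factor (an assumption, tune as needed)


def _key_from_passphrase(passphrase: str, salt: bytes) -> bytes:
    """Stretch a passphrase into the 32-byte, base64-encoded key Fernet expects."""
    raw = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, KDF_ITERATIONS, dklen=32
    )
    return base64.urlsafe_b64encode(raw)


def encrypt_bytes(data: bytes, passphrase: str) -> bytes:
    """Encrypt data locally; the result (salt + token) is what you would upload."""
    salt = os.urandom(SALT_LEN)
    token = Fernet(_key_from_passphrase(passphrase, salt)).encrypt(data)
    return salt + token


def decrypt_bytes(blob: bytes, passphrase: str) -> bytes:
    """Reverse encrypt_bytes; raises an exception if the passphrase is wrong."""
    salt, token = blob[:SALT_LEN], blob[SALT_LEN:]
    return Fernet(_key_from_passphrase(passphrase, salt)).decrypt(token)
```

You would read a file’s bytes, pass them through `encrypt_bytes`, and store only the result on the cloud. The passphrase never leaves your machine, so the provider (or anyone who breaks into the provider) only ever sees unreadable bytes.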
  
Another important thing you should consider is to look for paid services instead of services funded by advertising. When it comes to free cloud services, Velte et al. point out that they “are most likely to rummage through your data looking to assemble user profiles that can be used for marketing or other purposes. No company can provide you with free tangible goods or services and stay in business for long. They have to make money somehow, right?”
  
Last but not least, always, ALWAYS carefully read your agreement with any cloud service provider. Make sure you understand the privacy and security implications of using said service and that you understand and agree with the terms of service.
  
Now, what if you are working in a translation team and need to exchange terminology databases or translation memories? It may be convenient to use a cloud service, but is it safe? And what if you don’t have a say in this: what if your client does not provide an in-house server but wants you to use a cloud-based service or application? In that case using a cloud service may actually be a good idea and make your team’s work easier. But what about confidentiality? Well, if your client is the one who asked you to use that service and is coordinating the workflow, then you are not liable if the cloud provider’s security measures are breached (though it’s a good idea to double-check with the client anyway). If you are the one choosing a cloud solution for a direct client’s project, then follow the above advice: look for a paid service and read the user agreement very carefully. Tell your client that you are using a cloud service and make sure you get permission in writing. Most end clients don’t care about the details of your process; they don’t care if you’re using such and such CAT tool or terminology-management tool. But when using a cloud service it is advisable (read: advantageous to you, in terms of liability) to have your client’s permission.
  
So to the original question, “Using cloud solutions: yes or no?”, my answer is this:
  
– If you don’t need them, don’t use them.
  
– If they increase your efficiency or generally improve your work process, use them but make sure your client knows and make sure you agree with the terms of service. It is safer to use a paid service.
  
– For storage, if it makes sense for you to store sensitive data (your own data, not a client’s) on a cloud, encrypt it first.
  
And keep in mind that you don’t have to follow the crowd; just because many people use a certain cloud service doesn’t make it any safer. Consider your own needs and the sensitivity of your data; that is, your clients’ data.


Confidentiality and Google Translate

Ethical principles, rules and conventions distinguish socially acceptable behaviour from that which is considered socially unacceptable. However, in social science research a few workers consider their work beyond scrutiny, presumably guided by a disinterested virtue which justifies any means to attain hoped for ends.

Ethical problems can relate to both the subject matter of the research as well as to its methods and procedures, and can go well beyond courtesy or etiquette regarding appropriate treatment of persons in a free society. Social scientists have often been criticized for lack of concern over the welfare of their subjects. The researcher often misinforms subjects about the nature of the investigation, and-or exposes them to embarrassing or emotionally painful experiences. […] It was found in a survey by the British Psychological Society that the two major areas of dilemma for members were confidentiality and research. Issues reported in this latter area included unethical procedures, informed consent, harm to participants, deception, and deliberate falsification of results.

The above is from a textbook I used in my applied linguistics studies, “Introduction to Research Methods” by Robert B. Burns. It is a very useful manual for those who wish to conduct research in education and in the social sciences. When I had to interview some subjects for my research, this book was my bible.

I was recently talking to a fellow member of IAPTI about confidentiality issues in translation, and how disconcerting it is that many (for the most part inexperienced) translators use online automatic translation tools such as Google Translate without knowing that they are breaching confidentiality between themselves and their clients. The concept of confidentiality between a translator and his client made me think of confidentiality between a researcher and his subject. I went back to my old textbook. Please humor me and reread the passage quoted above, replacing “social science” with “translation”, “social scientist” and “researcher” with “translator”, “investigation” with “translation method”, and “subject” with “client”. It would go something like this:

Ethical principles, rules and conventions distinguish socially acceptable behaviour from that which is considered socially unacceptable. However, in translation a few translators consider their work beyond scrutiny, presumably guided by a disinterested virtue which justifies any means to attain hoped for ends.

Ethical problems can relate to both the subject matter of the text as well as to the methods and procedures used to translate it, and can go well beyond courtesy or etiquette regarding appropriate treatment of persons in a free society. Translators have often been criticized for lack of concern over the welfare of their clients. The translator often misinforms clients about the nature of the translation method, and-or exposes them to embarrassing or emotionally painful experiences. […] 

Let’s look at that last sentence for a minute: Does the translator misinform clients about the nature of the translation method? One might argue that the translator doesn’t even discuss his translation method, he just agrees to do the translation. Well, we can play with words and use lawyers’ tricks but if we really want to be honest with ourselves, the truth is that when we say “I will do the translation” we are telling the client that “I will do the translation”; I, the translator. And before arguing that it’s not necessarily what we mean, let’s put ourselves in the client’s shoes. What does the client understand when we say that, and what does he expect?

The sentence mentions embarrassing or emotionally painful experiences. Does this apply to us? Let me give just a couple of examples from personal experience:

Recently I translated some academic transcripts from Greek to English for a direct client; let’s call him “Yannis”. Along with the transcripts, I had to translate a long list of engineering course descriptions and a couple of cover letters. I had to rely on my own knowledge (I had taken many of those courses some years ago), on university websites, reliable engineering dictionaries, and my old textbooks. (Who would have thought that my 50,000-lb thermodynamics book, also used as a very effective doorstopper, would come in handy after all these years?)

What would have happened if I had used an online translation application, say Google Translate? If you think that a lousy translation is the only thing I would have gotten, think again. (And it would be lousy indeed! It turns out that before hiring me, Yannis had tried to do the translation himself, using Google Translate. I guess he didn’t get very far, so he decided to hire a professional. When I sent him my translation he took a quick look and immediately wrote back to thank me and tell me that now he understood why professional translators are so indispensable. I wanted to give Yannis a virtual hug.)

Anyway, let’s say I had considered using Google Translate to do this job. First of all, I would have no right to put Yannis’ transcripts on a public platform. If Yannis wanted to do so, that would be his right; those were his grades. I would have no right to share Yannis’ grades with anyone, nor would I have the right to share his personal cover letter with people who are not the intended recipients. Maybe I could remove information that could be used to identify him… His name, address, title, affiliation, all the grades (I’m sure I’d miss something); maybe I should remove the name of the university as well, and the department, and the year of graduation, and the title of the degree. What’s left? Right, the list of courses. But then, would Google really be able to give me a good translation of the description of that specialized course on the dynamics of Diesel engines or the one on welding and soldering techniques? What else would be left? The main body of the cover letters. Again, I have no right to share a letter written by someone other than me with people to whom it is not addressed. Plus, if Yannis wanted the letters to be translated by Google, he could have done that himself.

If Yannis ever decided to do an online search for some terms or sentences appearing in those cover letters, he might find the entire text online. Talk about an embarrassing and emotionally painful experience! And of course he’d feel cheated. And if he then mentioned it to me, the embarrassing experience would be all mine. Now is that the kind of relationship we want to build with our clients? Does the use of an online automatic translation tool reflect the respect and confidentiality that they deserve and consider a given when they hire us? Is that how we make sure they are satisfied and would hire us again or recommend us to others?

Now if a simple document such as an academic transcript is confidential, think about medical records. Or press releases. Or private-meeting minutes. Or advertising campaigns. Or private correspondence. And yet there are translators who use Google Translate, oblivious to the fact that Google is not Mother Teresa, doing your translation for you and asking for nothing in return, out of the goodness of its little silicon heart. “I’m doing this for the common good,” you might say; “if other translators ever need that information, they can find it easily online thanks to me”. Well, the problem with this concept is that the data you are sharing is NOT yours to share!

This brings us to a fundamental difference between the researcher-subject scenario (case A) and the translator-client scenario (case B): In case A, the study is conducted by the researcher, it is his own work from beginning to end; he chooses the topic, he designs the study, he collects and analyzes the data, and he is the one to present the work, for his own benefit (and in the long term for the benefit of the scientific community or perhaps society in general). In case B, the case that concerns us translators, we are given temporary access to work that is not ours. The topic of the document we are to translate, the content, the layout and the presentation all belong to the client, not to the translator, and they are to be used for the client’s benefit. So if confidentiality is such an important concern in case A, think how important it is in case B, i.e. in translation.

To the embarrassing or emotionally painful experiences, as mentioned by Burns, add “professionally detrimental” ones. Here’s an example: I am often asked to translate research articles to be published in American scientific journals. Again, this is research, to be published. Sometimes these papers describe many years’ worth of research. The authors have chosen specific journals through which to make their work known to the scientific community. They have not chosen Google’s database, they have not chosen forums of online translation portals (where translators ask for term advice, and for context they give entire paragraphs that often include highly sensitive and confidential information), they have not chosen anything other than those journals, and it is those journals that will have copyright. Imagine how professionally detrimental it can be to the author of such a paper if that paper, whether in its entirety or in part, appears online before the author even has the chance to submit it for publication.

In the same chapter about ethics, privacy, and confidentiality, Burns goes on to say:

The right to privacy is an important right enshrined now in international (UN Declaration of Human Rights) and national legislation.[…] Individuals should decide what aspects of their personal lives, attitudes, habits, eccentricities, fears and guilt are to be communicated to others. […] This does not mean that personal and private behavior cannot be observed ethically; it can, provided that the subjects volunteer to participate with full knowledge of the purposes and procedures involved.

 
The above applies to us as well. Our clients are the only ones who have the right to decide what aspects of their life or work are to be communicated to others, and they must have full knowledge of the procedures involved in the translation. If you plan to outsource the work or if you plan to use an online automatic translation tool or use any other method that might compromise privacy and confidentiality, you should tell your client and obtain his permission. If you are not telling your client because you think it doesn’t concern him, based on the above you’re wrong. If you’re not telling him because you might think he won’t hire you if you do, that means you are knowingly doing something wrong, i.e. you are aware that you are compromising privacy and confidentiality and still choose to proceed. You proceed until a client finds out and complains, or until a client takes legal action against you, or until the translators’ association you belong to tells you that you have violated its code of ethics, or until you simply realize that professionalism in our field of work goes well beyond delivering a good translation.

Ref: Burns, R.B. (2000). Introduction to Research Methods, 4th edition, Pearson Education Australia.