Theory and Practice of Distributed Collaborative Editing


2nd year Masters’ Student Project and Internship

Advisors : Marc Shapiro, Senior Researcher, Olivier Pérès, Post-doc

Background and objectives

In collaborative editing (CE), different users can access a shared document remotely over the Internet. Each user edits his own local replica ; every user observes the merged updates of all users. CE may be online (users see each others’ changes immediately, e.g., Google Wave) ; or offline (a user can work in isolation and synchronise occasionally, e.g., CVS).

Despite its apparent simplicity, CE is hard to get right. CE currently lacks a sound theoretical basis, leading to major flaws in published algorithms. Previous work has used Operational Transformation (remote edits are transformed to make sense on the local replica), or CRDTs (see below).

A commutative replicated data type (CRDT) is one that is designed so that its concurrent operations commute. Concurrent CRDT operations can execute in any order, and no synchronisation is necessary. One example is Treedoc, a CE buffer organised as a tree. Every element in the tree has an invariant identifier, allowing independent insert and delete operations to commute. A major challenge of CRDTs is to garbage-collect without affecting commutativity.

Objectives and pre-requisites

The internship has two goals :

- To achieve better theoretical understanding of CE. A general and unambiguous specification is necessary, enabling formal proper proofs of correctness in the style of distributed computing.

- To improve Treedoc and CRDTs in practice. Our current implementation should be extended with distributed garbage collection, integration with a real editor, and extensive benchmarks.

The applicant should have excellent knowledge of distributed computing. He or she should be familiar with concepts like logical time, models of concurrency, and consensus. He or she should also be able to write distributed programs, preferably in Java. The ability to read and write English is mandatory, French is optional. Contact and other information Send your applications to Marc Shapiro and Olivier Pérès, with the following information :

- A resume or curriculum vitæ

- The list of the courses taken during the last two years and the corresponding marks

- The name and contact information of two persons who can recommend you : teachers, internship advisers, or employers. We will contact them ourselves.

- A short essay (1—2 pages) on the proposed topic. It is free form, but here are some suggestions. Using the bibliography and your technical knowledge, motivate your interest in the topic and your qualifications for it ; discuss scientific and practical issues, problems and difficulties, possible solutions, relate the issues with your own experience, etc.

- Optionally : any software, research article, report or thesis (in English or French), that you wrote during the last year.

The internship will take place in the Regal group (INRIA & LIP6) at LIP6, 104 avenue du Président Kennedy, Paris, France. Duration : 3 to 6 months, depending on the requirements of the school.

Bibliography

- A commutative replicated data type for cooperative editing. Preguiça, Marquès, Shapiro, Letia. Int. Conf. on Distributed Computing Systems, Montréal (ICDCS), Canada, June 2009. PDF

- Consistency without concurrency control in large, dynamic systems. Letia, Preguiça, Shapiro. SOSP Workshop on Large Scale Distributed Systems and Middleware (LADIS), Big Sky, MT, USA, Oct. 2009. PDF

- Telex : A Semantic Platform for Cooperative Application Development. Benmouffok et al. Conférence Française de Systèmes d’Exploitation (CFSE), Toulouse, September 2009. PDF