Getting Started with Distributed Computing

January 12, 2021

In a distributed system, multiple components are spread across different machines, which coordinate so that the whole system works as one. While these systems are challenging to deploy and maintain, a properly implemented distributed system serves as a backbone for modern computing at scale. From telecommunication networks to mobile banking to the Internet, these systems are built to tolerate the failure of individual machines, ensuring that the services we rely on continue with little to no disruption.

This week's Selects collects several materials that can serve as a starting point for understanding distributed computing. As always, we invite you to share your feedback and suggestions. For more resources in computing, we encourage you to explore the ACM Digital Library and Learning Center.

Decentralized Computing

First published in ACM Queue, Vol. 18, No. 5, October 2020.

Terence Kelly discusses the role that decentralized methods, which rely on local communication and computation, can play in distributed computing. He presents a decentralized protocol for self-organizing wireless networks and social networking problems, and provides example code for experimenting with the protocol and a centralized solver.

[Read more]

Distributed Systems in One Lesson

Published through O'Reilly. Video lecture available to ACM members. Please refer to the following FAQ for any issues accessing the O'Reilly learning platform.

Simple tasks like running a program or storing and retrieving data become much more complicated when you do them on a collection of computers. In this 2015 O’Reilly video presentation, Tim Berglund (Senior Director of Developer Advocacy at Confluent) discusses five key areas in distributed systems people need to know to get started. We recommend this video session as an introductory deep dive to the topic.

[Read more]

Distributed information processing in biological and computational systems

First published in Communications of the ACM, Vol. 58, No. 1, December 2014.

In this Communications of the ACM article, Saket Navlakha and Ziv Bar-Joseph compare how biological and computational systems solve distributed information processing problems. The authors also discuss constraints, goals and strategies used in both domains, as well as the opportunities for bidirectional research to improve both fields. We believe this article is a good comparative perspective on the applicability of distributed systems.

[Read more]

The verification of a distributed system

First published in Communications of the ACM, Vol. 59, No. 2, January 2016.

Validating that a distributed system is doing the right thing can be challenging. A failure in one computer can be hard to track because of the complexity and scale of these systems. In her 2016 Communications of the ACM article, Caitie McCaffrey (Architect and Developer Manager, Azure Sphere Security Services) explains the various aspects of a good verification strategy for distributed systems. We recommend this article as an entry point to sound software engineering processes that can improve both your own and your clients' confidence in system correctness.

[Read more]

There is no getting around it: you are building a distributed system

First published in Communications of the ACM, Vol. 56, No. 6, June 2013.

In this Communications of the ACM article, Mark Cavage highlights the challenges of building a distributed system and offers practical tips for designing one. He works through commonplace use cases such as scaling a multitenant enterprise web application or migrating an existing application to a cloud service provider. The author also explains key decision points when architecting distributed systems, including geographies, data segregation, service level agreements, security, usage tracking, and deployment.

[Read more]

There's More

Recommended Selects

Getting Started Series

Getting Started with Internet of Things: IoT Applications

This Selects concludes with an example application domain, the Industrial Internet of Things (IIoT), and a source for delving into state-of-the-art IoT research trends.
Getting Started Series

Getting Started with Internet of Things: Computing and Communication

The selection includes easy-to-read articles that describe and motivate the IoT, followed by deep dives into its major aspects, such as communication protocols, the edge-to-cloud continuum, AI and data analytics, and security and privacy.
Computing in Practice Series

Trustworthy AI in Healthcare #02

AI needs to be trustworthy. Trustworthiness means that healthcare organizations, doctors, and patients should be able to rely on the AI solution as being lawful, ethical, and robust.

Help guide ACM Selects!

Let us know how we can improve your ACM Selects experiences, what topics you would like us to cover in the future, whether you would like to contribute and/or subscribe to our newsletter by emailing
