Getting Started Series

Getting Started with Distributed Computing

Published January 12, 2021

In a distributed system, multiple components are located on different machines, which coordinate to ensure that the whole system works as one. While these systems are challenging to deploy and maintain, a properly implemented distributed system serves as a backbone for modern computing at scale. From telecommunication networks to mobile banking to the Internet, these systems are built to tolerate the failure of individual machines, ensuring that the services we rely on continue with little to no disruption.

This week's Selects collects several materials that can serve as a starting point for understanding distributed computing. As always, we invite you to share your feedback and suggestions at selects-feedback@acm.org. For more computing resources, we encourage you to explore the ACM Digital Library and Learning Center.

Decentralized Computing

First published in ACM Queue, Vol. 18, No. 5, October 2020.

Terence Kelly discusses the role that decentralized methods, which rely on local communication and computation, can play in distributed computing. He describes a decentralized protocol for self-organizing wireless networks and social networking problems and provides example code for experimenting with the protocol and a centralized solver.

Distributed Systems in One Lesson

Published through O'Reilly. Video lecture available to ACM members. Please refer to the following FAQ for any issues accessing the O'Reilly learning platform.

Simple tasks like running a program or storing and retrieving data become much more complicated when you do them on a collection of computers. In this 2015 O’Reilly video presentation, Tim Berglund (Senior Director of Developer Advocacy at Confluent) discusses five key areas of distributed systems that people need to know to get started. We recommend this video session as an introductory deep dive into the topic.

Distributed information processing in biological and computational systems

First published in Communications of the ACM, Vol. 58, No. 1, December 2014.

In this Communications of the ACM article, Saket Navlakha and Ziv Bar-Joseph compare how biological and computational systems solve distributed information processing problems. The authors also discuss the constraints, goals, and strategies found in both domains, as well as the opportunities for bidirectional research to improve both fields. We believe this article offers a good comparative perspective on the applicability of distributed systems.

The verification of a distributed system

First published in Communications of the ACM, Vol. 59, No. 2, January 2016.

Validating that a distributed system is doing the right thing can be challenging. A failure in one computer can be hard to track because of the complexity and scale of these systems. In her 2016 Communications of the ACM article, Caitie McCaffrey (Architect and Developer Manager, Azure Sphere Security Services) explains the various aspects of a good verification strategy for distributed systems. We recommend this article as an entry point to good software engineering processes that can help improve your own and your client’s confidence in system correctness.

There is no getting around it: you are building a distributed system

First published in Communications of the ACM, Vol. 56, No. 6, June 2013.

In this Communications of the ACM article, Mark Cavage highlights the challenges of building a distributed system and offers useful tips for building one, drawing on commonplace use cases such as scaling a multitenant enterprise web application or migrating an existing application to a cloud service provider. The author explains the decision points in architecting distributed systems, such as geographies, data segregation, service level agreements, security, usage tracking, and deployment.

Dominic Holt

Dominic is a CTO with Fortune 500, small business, and start-up company experience. He works with many companies, private equity firms, and venture capital firms on a global scale as a Fractional CTO at Valerian Technology. He serves as the CEO of harpoon Corp and developed an enterprise software application that enables anyone to visually generate software infrastructure and deploy it to the cloud without writing any code. Previously he worked for Lockheed Martin, where he created and became the division head of the Shark Tank® Organization (a world-class engineering team with a focus on emerging technologies), which from 2012 to 2015 was responsible for delivering $2.6 billion in revenue. Dominic has also founded several startups and has been internationally recognized with an award for video game development.

Juan Miguel de Joya

Juan de Joya is a Software Development Engineer for Autodesk Maya and Arnold, focused on advising and addressing priority issues in computer graphics, visualization and interactive techniques. Prior to this role, he worked at Oculus Meta, Google, DigitalFish, Pixar Animation Studios, the Walt Disney Animation Studios, and was the Project Officer responsible for digital strategy, research and assessment, and technical communications for AI for Good at the International Telecommunication Union, the United Nations agency for information and communications technologies. Juan was a researcher in computer graphics and physics at the Visual Computing Lab at the University of California, Berkeley. He serves on the ACM Practitioner's Board, Professional Development Committee, Future of Computing Academy, and is the Chair of the Practitioner Development Committee for ACM SIGGRAPH.

Jesmin Jahan Tithi

Dr. Jesmin Jahan Tithi is an AI Research Scientist at Intel focusing on high-performance computing and software-hardware codesign of next-generation processors targeting large-scale machine learning and graph applications. At Intel, Jesmin contributed to DOE's OCR, ECP Pathforward, and CORAL2 projects, and DARPA's HIVE and SDH projects. She received her Ph.D. from Stony Brook University, New York (SUNYSB), and worked as an intern at Google, Intel, and PNNL during her Ph.D. After finishing her B.Sc in Computer Science and Engineering at the Bangladesh University of Engineering and Technology, she also worked as a Lecturer in the same prestigious department. Jesmin is a founding member of Z-inspection, an assessment process for Trustworthy & Ethical AI. Jesmin was a member of the ACM Future of Computing Academy (Dec 2019-June 2021), is a Heidelberg Nobel Laureate Forum alumna (2019), and is a current member of the ACM Code of Professional Ethics Board and ACM Selects. Jesmin is a regular reviewer for ACM and IEEE conferences and journals. Jesmin holds six issued patents and has over twenty-two peer-reviewed publications.
