Getting Started with Data Science

Published November 3, 2020

There is a large gap between exploratory data science and building an intelligent application that continually learns from the data it encounters to provide business value. In this ACM Select, we highlight content to ease the transition from research to production and illuminate the hurdles you may come across in your journey.

Overview

Data science: challenges and directions

First published in Communications of the ACM, Vol. 60, No. 8, July 2017.

In this overview article, Prof. Longbing Cao describes the processes of data science, its overlap with other disciplines, and the challenges present in data-driven decision making.

[Read more]

Data Validation

Your machine learning model can break, degrade, and exhibit unwanted behaviour in numerous ways. The primary cause is problems and irregularities in your data; data cleaning and validation help to minimize them.

Putting Machine Learning into Production Systems

First published in ACM Queue, Vol. 17, Issue 4, October 7, 2019.

Adrian Colyer gives an overview of two papers concerned with data validation techniques and provides insight into data skew and drift, where the data you trained the model on is no longer representative of the data your system is seeing in real-world operation.

[Read more]
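
As a rough illustration of the drift described above, the sketch below compares a feature's training-time values with the values seen in production using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distributions). This is not the method of the papers discussed; the function names, the sample data, and the 0.2 alert threshold are assumptions chosen for illustration.

```python
from bisect import bisect_right


def ks_statistic(reference, live):
    """Return the largest gap between the empirical CDFs of two samples."""
    ref, cur = sorted(reference), sorted(live)
    gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def has_drifted(reference, live, threshold=0.2):
    """Flag drift when the gap exceeds a threshold (0.2 is an assumed value)."""
    return ks_statistic(reference, live) > threshold


# Hypothetical feature values: what the model was trained on vs. what it now sees.
training_ages = [22, 25, 31, 38, 41, 47, 52, 60]
production_ages = [41, 47, 52, 58, 63, 66, 70, 74]
print(has_drifted(training_ages, production_ages))
```

In practice a check like this would run per feature on a schedule, and a sensible threshold depends on sample size and on how costly a missed shift is.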

Data Cleaning for Accurate, Fair, and Robust Models: A Big Data - AI Integration Approach

First presented at DEEM'19: Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, June 2019. 

Training your model on biased data results in a biased model. This paper describes methods for ensuring that your training data is accurate and free from bias. 

[Read more]


Model Interpretability

Explanations of why a model arrived at its result help you understand whether the model relied on true evidence or on the bias that widely exists in training data. Model interpretability is the ability to interpret a model's results in this way.

Techniques for interpretable machine learning

First published in Communications of the ACM, Vol. 63, No. 1, December 2019.

Interpretability can be classified as intrinsic or post-hoc, and each of these can be further broken down into global and local. This article describes these classifications and discusses the larger goal of democratizing model explanations for end-users rather than only for research intuitions.

[Read more]
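
To make the post-hoc, global category concrete, here is a minimal sketch of permutation importance, one widely used technique of that kind: shuffle a single feature across the dataset and measure how much the model's accuracy drops. The toy model, data, and function names are hypothetical and are not taken from the article.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, seed=0):
    """Accuracy lost when one feature column is shuffled across all rows."""
    rng = random.Random(seed)
    column = [row[feature] for row in rows]
    rng.shuffle(column)
    shuffled = [
        row[:feature] + [value] + row[feature + 1:]
        for row, value in zip(rows, column)
    ]
    return accuracy(model, rows, labels) - accuracy(model, shuffled, labels)


def toy_model(row):
    """Hypothetical black box: uses feature 0 and ignores feature 1."""
    return 1 if row[0] > 0.5 else 0


rows = [[0.1, 7], [0.9, 3], [0.2, 5], [0.8, 1], [0.3, 9], [0.7, 2]]
labels = [toy_model(row) for row in rows]

for feature in (0, 1):
    print(feature, permutation_importance(toy_model, rows, labels, feature))
```

Shuffling the ignored feature leaves accuracy unchanged, so its importance is zero; a feature the model relies on will usually show a drop. The explanation is global because it summarizes behaviour over the whole dataset rather than for a single prediction.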

Bias

Algorithms increasingly help organize all aspects of our personal and professional lives, but you must take care that pre-existing societal bias does not seep into your models as they make real-world decisions.

Algorithms, Platforms, and Ethnic Bias

First published in Communications of the ACM, Vol. 62, No. 11, November 2019.

In this article, Martin Kenney, a Distinguished Professor at UC Davis, describes types of bias, how they arise from training data, how to choose and interpret models to minimize bias, and the fine line between accuracy and fairness that a data scientist must walk.

[Read more]
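
The tension between accuracy and fairness can be made concrete by reporting both numbers for the same predictions. The sketch below uses the demographic parity gap (the difference in positive-prediction rates between groups) as one possible fairness measure; this choice, the group labels, and the data are assumptions for illustration and are not drawn from the article.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def selection_rate(predictions, groups, group):
    """Share of one group's members that receive a positive prediction."""
    picked = [p for p, g in zip(predictions, groups) if g == group]
    return sum(picked) / len(picked)


def demographic_parity_gap(predictions, groups):
    """Largest difference in selection rate between any two groups."""
    rates = [selection_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


# Hypothetical outcomes for two groups, "a" and "b".
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
predictions = [1, 1, 0, 0, 1, 0, 0, 0]

print(accuracy(predictions, labels), demographic_parity_gap(predictions, groups))
```

A model can score well on accuracy while still selecting one group less often than another, which is why the two numbers are worth tracking side by side.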


Putting It All Together: A Case Study of AI Bots

A Decade of Social Bot Detection

First published in Communications of the ACM, Vol. 63, No. 10, October 2020.

To generate business value, your model will need to be operationalised as part of a broader system. However, such systems aren’t always used for good. In this article, social media researcher Stefano Cresci looks at the influx of AI ‘bots’, how they impact people’s online interactions, and approaches to combat them. 

[Read more]

There's More

Recommended Selects

December 1, 2020
Getting Started Series

Getting Started with People Management

For those considering a long-term career in technology leadership, understanding and managing relationships is as important as one’s ability to tackle technology trends and code.
November 24, 2020
Getting Started Series

Getting Started with Cybersecurity

Whether you are building a simple app, accessing the WiFi in your local café, or listening to a podcast, every technology user, engineer, and service provider needs to consider how secure the app or service they are using or providing is.
November 17, 2020
Getting Started Series

Getting Started with Computer Vision

Today, Computer Vision is important for everything from enabling autonomous vehicles to understand the world they are navigating, to using mobile phones to identify skin cancer or read a menu in a foreign language, to identifying defective products on a manufacturing line.

Help guide ACM Selects!

Email selects-feedback@acm.org to let us know how we can improve your ACM Selects experience, what topics you would like us to cover in the future, and whether you would like to contribute or subscribe to our newsletter.
