Minimize the carbon footprint of data analytics, maximize data center sustainability

Getting to insights faster is also more environmentally friendly.

Leaders are more eager than ever to reduce their environmental impact. This is especially true for data centers due to their contribution to global warming. If all the data centers in the world were a country, they would be ranked the fifth largest energy consumer in the world. In 2020, data centers consumed around 1% of global electricity demand and contributed 0.3% of all CO2 emissions.

Today, companies are required to be transparent about their carbon footprint, and the race is on for data centers to improve their efficiency rankings. There is a list of data centers around the world ranked by PUE (power usage effectiveness, the ratio of a facility's total energy use to the energy used by its IT equipment), and Greenpeace has created a ranking of clean technology industry centers based on their carbon footprint.

The need for greener code

Many data center sustainability initiatives focus on using renewable energy for cooling or on optimizing cooling systems to reduce energy consumption. However, beyond the energy needed to maintain the environmental controls around data analysis, the software itself also has a significant effect on the amount of electricity consumed. How much? A lot.

According to current research, a large machine learning (ML) model such as Meena consumes as much energy as a passenger vehicle driven 242,231 miles. Researchers at the University of Massachusetts Amherst estimated that training a large deep learning model produces 626,000 pounds of CO2, equivalent to the lifetime emissions of five cars.

As a result, there is increased interest and devotion to creating more efficient code. The Green Software Foundation (GSF), with members such as VMware, Microsoft, Accenture, and GitHub, is dedicated to designing, architecting, and coding software that consumes less power.

Tips for Sustainable Machine Learning

There are several academic papers on how to write greener algorithms for AI/ML models, but here are some basic tips.

One way to reduce computing resources is to minimize the number of training experiments. Hundreds of pre-trained ML models and blueprints are available, so developers only need to bring their own data to add AI capabilities to applications, which dramatically reduces the time needed to develop and train models.

It is also important to have visibility into the carbon footprint of the algorithm in order to make decisions on how best to optimize performance. Researchers from several universities have created tools for this purpose. For example, Green Algorithms calculates the carbon footprint of your cloud computing. Another example is CodeCarbon, which is a software package that integrates into the Python codebase and estimates the amount of CO2 produced by the computing resources used to run the code.
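To make the idea of footprint visibility concrete, here is a minimal back-of-envelope sketch in Python. It is not how Green Algorithms or CodeCarbon compute their results; it only illustrates the general shape of such an estimate (energy drawn by the hardware, scaled by the facility's PUE, multiplied by the grid's carbon intensity). The function name and every number in it are illustrative assumptions, not measurements.

```python
# Back-of-envelope carbon estimate for a single compute job.
# Hypothetical helper: all inputs below are assumed values for illustration.

def estimate_emissions_kg(runtime_hours, avg_power_watts, pue, grid_kg_co2_per_kwh):
    """Return the estimated kg of CO2-equivalent emitted by one job."""
    # Energy drawn by the IT hardware itself, in kWh.
    it_energy_kwh = runtime_hours * avg_power_watts / 1000.0
    # Scale by PUE to include facility overhead such as cooling.
    facility_energy_kwh = it_energy_kwh * pue
    # Convert energy to emissions using the grid's carbon intensity.
    return facility_energy_kwh * grid_kg_co2_per_kwh


# Assumed scenario: a 300 W server, PUE of 1.5, grid at 0.4 kg CO2e per kWh.
baseline = estimate_emissions_kg(10.0, 300.0, 1.5, 0.4)   # 10-hour job
optimized = estimate_emissions_kg(5.0, 300.0, 1.5, 0.4)   # same job in 5 hours

print(f"baseline:  {baseline:.2f} kg CO2e")
print(f"optimized: {optimized:.2f} kg CO2e")
```

Under these assumptions, halving the run time halves the estimated emissions, which is why run time is a useful first metric to watch. A measurement tool gives better numbers because it observes actual power draw rather than assuming an average.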

Automation can also be used to reduce training execution time. It is possible to minimize the number of experiments and/or the amount of data analyzed, while maintaining accuracy. More efficient data sampling by itself can speed up model run time by a factor of 5.8.
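One simple way to analyze less data while keeping an eye on accuracy is progressive sampling: start with a small random sample, grow it, and stop once the result no longer changes much. The sketch below is a hypothetical illustration, not a method from the research cited above. It uses the mean of a column as a stand-in for a model metric, and the function name, starting size, growth factor, and tolerance are all assumptions.

```python
import random
import statistics


def progressive_sample_mean(values, start=100, growth=2, tolerance=0.01, seed=0):
    """Grow a random sample until the estimate stabilizes.

    Returns (estimate, sample_size_used). Expects a non-empty list.
    """
    rng = random.Random(seed)
    shuffled = values[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)         # a prefix of a shuffled list is a random sample

    size = min(start, len(shuffled))
    previous = statistics.fmean(shuffled[:size])

    while size < len(shuffled):
        size = min(size * growth, len(shuffled))
        current = statistics.fmean(shuffled[:size])
        # Stop when the relative change falls within the tolerance.
        if abs(current - previous) <= tolerance * abs(previous):
            return current, size
        previous = current

    # The whole dataset was used without meeting the tolerance earlier.
    return previous, size


# Usage on synthetic data (assumed distribution, for illustration only).
data_rng = random.Random(42)
data = [data_rng.gauss(50.0, 10.0) for _ in range(100_000)]
estimate, used = progressive_sample_mean(data)
print(f"estimate={estimate:.2f} using {used} of {len(data)} rows")
```

With a real model, the mean would be replaced by a validation metric computed after training on each sample, and the tolerance would be chosen based on how much accuracy loss the application can accept.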

The software used to perform the calculations can also help reduce the number of computing resources required. There are databases specifically designed to handle massive amounts of data that can optimize memory and storage usage to reduce power consumption. These databases also have the advantage that there is no need to limit the amount of data analyzed, reducing the risk of model accuracy being compromised when trying to speed up the run time.

Reducing model execution time, in addition to increasing energy efficiency, shortens the total time to insight for critical applications such as fraud detection, cybersecurity, and quality control. More efficient code is not only better for the environment, it’s also good for business.

More potential customers want transparency about a company’s commitment to its green strategies and having a standard “green” code could be an important first step. Employees want to work for an environmentally sensitive company that makes responsible decisions about the environment. In the future, cloud providers may require visibility into a workload’s carbon footprint, with fines for processing deemed excessive or unnecessary.

Given the sheer volume of computation required to extract the insights behind better business decisions, being socially responsible is no longer just a benefit; it has become a necessity.

Ohad Shalev is a strategic analyst at SQream.

