Zhang: IGSB Software Impact Grant

ECE Associate Professor Zheng Zhang leads one of four potentially high-impact projects seeking to solve critical energy-efficiency challenges that have been awarded more than $240,000 in cumulative funding through UCSB's Institute for Energy Efficiency (IEE)

Headshot of Zheng Zhang

Excerpt from the COE News – "Investing in Social Impact"

IEE, the College of Engineering’s interdisciplinary research center, is dedicated to cutting-edge science and technologies that support an energy-efficient and sustainable future. Each project aligns with at least one of the institute’s key interdisciplinary thrusts: smart societal infrastructure, computing and communications, and the food-energy-water nexus.

Zheng Zhang's work received one of two project awards from the Investment Group of Santa Barbara (IGSB) Software Impact Grants, which support high-impact research on energy-efficient software that is likely to lead to commercialization and positively impact society. The selection committee also awarded $50,000 grants to two projects through the IEE Research Seed Grant Program. Seed grants are intended to help researchers produce preliminary results that can be used to apply for major external funding to expand their projects.

“Nurturing early-stage concepts with modest yet meaningful financial support not only jumpstarts scientific success but also cultivates and continues the culture of collaboration and discovery that thrives within the IEE and the university,” said IEE Director John Bowers, a distinguished professor of electrical and computer engineering, and materials.

“Whether they are being pursued by junior faculty or highly esteemed researchers, these four projects share key characteristics: they are strong and innovative proposals with significant potential to impact society,” added Mark Abel, the executive director of the IEE.

The four projects involve a total of five UCSB faculty members from the Departments of Chemical Engineering, Materials, Electrical and Computer Engineering, Computer Science, and Chemistry & Biochemistry.

Zheng Zhang – "Optimizing for Impact"

Large Language Models (LLMs), like ChatGPT, are massive AI foundation models designed to understand and generate human language. Based on neural-network architectures, they are trained on large amounts of text data, which allows them to learn linguistic patterns and perform language-related tasks, including text generation, translation, summarization, and question answering. Despite their increasingly impressive performance, large AI foundation models suffer from a major limitation: the extremely high cost of the foundational step of pre-training a model on a vast and diverse dataset. For instance, pre-training Meta's LLaMA model used 2,048 Nvidia A100 graphics processing units (GPUs) over 21 days. GPUs are used to train LLMs because they can break large datasets into pieces and process them in parallel, which accelerates training.

“Pre-training an LLM requires a huge dataset and massive computing resources to maximize its accuracy, but that is extremely expensive,” said Zheng Zhang, an associate professor of electrical and computer engineering. “Each training run costs a few million dollars, and in practice, pre-training can take multiple training runs to complete.”

The sky-high costs mean that pre-training is affordable only for giant tech companies like Google and Amazon, and not for academics. The carbon emissions associated with pre-training have also raised red flags regarding its environmental impact. Zhang has received an IGSB Software Impact Grant of more than $90,000 to address these shortcomings by developing a novel pre-training framework that requires fewer computing resources, less energy, and less money, and produces dramatically fewer carbon emissions.

“Our plan focuses on perfecting the algorithm,” explains Zhang, who has previously received a CAREER Award from the National Science Foundation, multiple best-paper awards, and the Ernest S. Kuh Early Career Award from IEEE’s Council on Electronic Design Automation. “Because pre-training costs are measured by GPU hours, we are working to reduce the amount of time it takes to complete the process.”

Zhang will optimize the algorithm by implementing a low-rank tensor compression model, which reduces the number of parameters needed to represent a neural network. The model breaks the neural network down into a set of smaller tensors or matrices, a process known as factorization, which can lead to computational and storage savings. The grant allows Zhang to test his theory and expand on promising work recently completed in his lab. His research group developed CoMERA, which stands for Computing- and Memory-Efficient training via Rank-Adaptive tensor optimization. Preliminary results showed that CoMERA reduced the number of training variables by 50 to 200 times on medium-sized transformers, greatly decreasing the memory cost of training.
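To make the parameter savings concrete, the minimal PyTorch sketch below illustrates the general principle of low-rank factorization: a single dense weight matrix is replaced by the product of two much smaller factors. This is only an illustration of the idea, not CoMERA itself; the layer size and rank are hypothetical, and CoMERA works with tensor factorizations rather than this simple two-matrix example.

```python
import torch
import torch.nn as nn

# Hypothetical layer size and rank, chosen only for illustration.
m, n, r = 4096, 4096, 64

# A standard dense layer stores an m x n weight matrix.
dense = nn.Linear(n, m, bias=False)

# Low-rank replacement: approximate the m x n weight as the product of an
# (m x r) factor and an (r x n) factor, applied as two smaller linear maps.
low_rank = nn.Sequential(
    nn.Linear(n, r, bias=False),  # project the input down to rank r
    nn.Linear(r, m, bias=False),  # expand back up to the output size
)

def param_count(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

print(param_count(dense))                          # 16777216
print(param_count(low_rank))                       # 524288
print(param_count(dense) / param_count(low_rank))  # 32.0, i.e. 32x fewer parameters
```

Tensor-factorization methods extend this idea to higher-order factors, and, as its name suggests, CoMERA's rank-adaptive optimization chooses the ranks during training rather than fixing them in advance.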

“We have already demonstrated CoMERA’s effectiveness on a single GPU and smaller model,” explained Zhang. “Now, we want to demonstrate that this algorithm could be scalable to large language models on massive GPUs.”

Preliminary results also indicated areas that need improvement, and the grant will allow Zhang’s team to do that work. They aim to achieve at least 10x compression of an LLM during pre-training and a 2.4x training speedup on a single GPU. Once the targeted performance is achieved, Zhang will be able to test the framework at a larger scale by leveraging a previously awarded grant of hundreds of GPU hours from the Department of Energy.

“We are extremely grateful for the opportunity to do some unrestricted research to show the technical and commercial impact of our work,” said Zhang. “We are also thankful to the Department of Energy for providing us with the GPU hours and computing time, which very few academic researchers are granted.”

Zhang believes that decreasing the hours to complete a pre-training run would have significant environmental and economic impacts.

“Greatly reducing pre-training costs would also lower the barrier for small businesses and academic groups to develop large language models,” said Zhang. “There is also a significant environmental impact, because if you can reduce the cost and GPU hours by three times, that also means you’re consuming approximately three times less electricity and creating three times fewer carbon emissions.”

COE News – "Investing in Social Impact" (full article)