Like most large banks, Citi is evaluating hundreds of use cases for generative AI, assessing the business impact and risks of each. But it is moving forward quickly on a few.
For instance, the bank will have a roadmap for rolling out GitHub Copilot to all developers – about 40,000 employees – by mid-April, according to Shadman Zafar, CIO of personal banking and wealth management at Citi and lead for its generative AI work. This should save a lot of time, especially where reusable code can be found within the bank’s own repository. Citi is also using generative AI to modernize legacy systems and do first drafts of compliance assessments, among other things.
“I do believe it’s a technology that will, in a sustainable way, have a long-term impact on how we do work for a couple of decades to come,” Zafar said in an interview.
Citi is not the only bank moving quickly on generative AI for developers.
“The use of generative AI in IT is much more pervasive than any other functional areas in financial organizations,” said Indranil Bandyopadhyay, principal analyst at Forrester.
“So it makes sense to use solutions like GitHub Copilot – it has potential to make the coders more efficient.”
Citi started with a pilot with 250 developers, then expanded it to 700 and then to 2,000 programmers, Zafar said. About 500 to 1,000 people per day are adopting GitHub Copilot.
Zafar chose GitHub’s Copilot Enterprise because it allows for better control of inputs and outputs, he said.
One of the challenges of letting developers use generative AI is the fear of plagiarism: an algorithm trained on code pulled from outside sources could potentially suggest the use of code that is someone else’s intellectual property.
“This has been a very active conversation in our developer committee,” Zafar said. GitHub Copilot “has guardrails around that, which is why we like it, and why a lot of the developers also complain about it.”
They complain because generative AI on their home systems could write 40 lines of code for them, whereas GitHub Copilot working inside the Citi environment might complete only a line and a half. Compliance constraints bar Citi developers from taking their work home.
“That’s the difference between being a college student doing your homework and being a significant bank – you have to have a certain level of control,” Zafar said.
Citi also uses retrieval-augmented generation, drawing on its own code repository. The technique retrieves data and documents relevant to a question or task and provides them as context for a large language model.
“When it pulls larger segments of code, it picks out of the approved Citi code repository, which is obviously not as big as the internet, but it still has millions and millions of lines of code,” Zafar said. This way, it is using tested code that meets Citi’s standards.
Retrieval-augmented generation is a well-known technique for keeping LLM hallucination in check, Bandyopadhyay said.
“When implemented properly, it can look at an organization’s own or other verified knowledge base,” he said. “This process ensures that the generated content is anchored to factual information found in the retrieved knowledge base, thereby reducing the likelihood of producing content that is entirely fabricated or unrelated to verifiable sources.”
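The mechanics Bandyopadhyay describes can be sketched in a few lines. This is purely illustrative: the snippet store, the keyword-overlap scorer and the prompt wording below are hypothetical stand-ins, not Citi's actual pipeline, which has not been disclosed. The idea is only that generation is grounded in an approved, in-house knowledge base rather than the open internet.

```python
# Minimal sketch of retrieval-augmented generation (RAG) over an
# in-house snippet store. All names and the toy scorer are
# illustrative assumptions; real systems use vector search.

APPROVED_SNIPPETS = {
    "validate_iban": "def validate_iban(iban): ...  # approved, tested",
    "mask_account_number": "def mask_account_number(acct): ...  # approved, tested",
}


def retrieve(query: str, store: dict, top_k: int = 1) -> list[str]:
    """Rank stored snippets by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        store.items(),
        key=lambda kv: len(q_terms & set(kv[0].replace("_", " ").split())),
        reverse=True,
    )
    return [code for _, code in scored[:top_k]]


def build_prompt(task: str, store: dict) -> str:
    """Prepend retrieved, approved code as context for the LLM call,
    anchoring the model's output to the verified knowledge base."""
    context = "\n".join(retrieve(task, store))
    return f"Approved internal code:\n{context}\n\nTask: {task}"


prompt = build_prompt("validate an iban field", APPROVED_SNIPPETS)
```

Because the model only sees snippets pulled from the approved store, anything it assembles from that context is traceable back to tested code, which is the grounding effect Bandyopadhyay describes.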
Other use cases for gen AI
In another use case, Citi’s risk and compliance group uses generative AI to summarize a project, determine which regulations might apply to it and summarize those regulations.
“It handles the first four hours of prep needed for a compliance review,” Zafar said. This will tell compliance staff which areas they need to focus on. Then humans will read through the regulations and do a detailed assessment.
Citi has also begun modernizing legacy systems with the help of generative AI, having it translate old code to Java.
“This is one of the key areas of generative AI application that is least risky,” Zafar said. “It’s a very straightforward, mundane translation problem, but it is expensive to do this manually. We have many old legacy platforms that we have been wanting to translate for a long time to modern languages because we can’t even find developers for some of these old programming languages anymore.”
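The "mundane translation problem" Zafar describes can be framed as a straightforward prompt around the legacy source. The COBOL fragment, function name and prompt wording below are hypothetical, a sketch of the general approach rather than Citi's tooling; the point is that the task reduces to feeding old code and a target language to a model, then having humans review the output.

```python
# Illustrative sketch: wrapping a legacy-code translation task as an
# LLM prompt. The COBOL fragment and prompt text are hypothetical
# assumptions, not Citi's actual modernization tooling.

LEGACY_COBOL = """\
IF ACCT-BALANCE < 0
    MOVE 'OVERDRAWN' TO ACCT-STATUS
END-IF."""


def translation_prompt(source: str, from_lang: str = "COBOL",
                       to_lang: str = "Java") -> str:
    """Build a prompt asking an LLM to translate source code while
    preserving identifiers, so reviewers can diff result vs. original."""
    return (
        f"Translate the following {from_lang} into idiomatic {to_lang}.\n"
        f"Preserve identifiers and behavior exactly.\n\n{source}"
    )


prompt = translation_prompt(LEGACY_COBOL)
```

Keeping identifiers intact in the prompt matters for exactly the reason Abbott raises next: the translated code should remain traceable to the decades-old requirements embedded in the original.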
This is a use case with potential at many banks, according to Michael Abbott, global banking lead at Accenture.
“This is perhaps one of the most transformative ideas that I’ve seen early in the generative AI lifecycle, the ability to reverse engineer 30 or 40 years of legacy COBOL code into close to its original requirements,” Abbott said in January.
How Citi vets gen AI ideas
Citi is evaluating hundreds of other generative AI use cases in areas like operations automation, customer service, fraud detection and office productivity.
Last year, the bank formed an AI lab where all generative AI ideas are submitted to a group with two task forces. One conducts a technology feasibility assessment, while the other does a business case assessment.
The use cases that make it through this vetting process are put through a risk framework to determine whether they have the right level of explainability, observability and transparency.
“For all banks coming into a large language model world, transparency and explainability to our regulators is one of the tougher areas for us to address,” Zafar said. “We formed a focused task force around the risk framework to one by one curate those hundreds of use cases.”
Citi has also been training employees on how to use generative AI, so that once use cases have cleared these gatekeepers, staff will be ready to put the technology to work.
Zafar recognizes generative AI is overhyped.
“We are right smack in the middle of the hype cycle,” he said. “That means some people will get disappointed, probably depressed, in the coming months, if they don’t get what they are expecting now, which is probably too much. On the other hand, some people will get really disappointed because they think that it’s just a fad.”
He sees a lot of potential in the use of generative AI, but he also sees more innovations in AI coming.
“In the coming two years, new models will come up that are not large language models, but that are more productive and more efficient,” Zafar said.
Rather than hanging back or overreacting to the hype, “we are trying to keep ourselves grounded,” he said.