AI in Java: Building a ChatGPT Clone With Spring Boot and LangChain
Learn how to build a ChatGPT clone with Spring Boot, LangChain, and Hilla in Java, covering both simple synchronous chat completions and more advanced streaming completions.
Many libraries for AI app development are primarily written in Python or JavaScript. The good news is that several of these libraries have Java APIs as well. In this tutorial, I'll show you how to build a ChatGPT clone using Spring Boot, LangChain, and Hilla.
The tutorial covers simple synchronous chat completions and more advanced streaming completions for a better user experience.
Completed Source Code
You can find the source code for the example in my GitHub repository.
Requirements
- Java 17+
- Node 18+
- An OpenAI API key in an OPENAI_API_KEY environment variable
Create a Spring Boot and React Project and Add LangChain
First, create a new Hilla project using the Hilla CLI. This will create a Spring Boot project with a React frontend.
npx @hilla/cli init ai-assistant
Open the generated project in your IDE. Then, add the LangChain4j dependency to the pom.xml file:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>0.22.0</version> <!-- TODO: use latest version -->
</dependency>
Simple OpenAI Chat Completions With Memory Using LangChain
We'll begin exploring LangChain4j with a simple synchronous chat completion. In this case, we want to call the OpenAI chat completion API and get a single response. We also want to keep track of up to 1,000 tokens of the chat history.
In the com.example.application.service package, create a ChatService.java class with the following content:
@BrowserCallable
@AnonymousAllowed
public class ChatService {

    @Value("${openai.api.key}")
    private String OPENAI_API_KEY;

    private Assistant assistant;

    interface Assistant {
        String chat(String message);
    }

    @PostConstruct
    public void init() {
        var memory = TokenWindowChatMemory.withMaxTokens(1000, new OpenAiTokenizer("gpt-3.5-turbo"));
        assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(OpenAiChatModel.withApiKey(OPENAI_API_KEY))
                .chatMemory(memory)
                .build();
    }

    public String chat(String message) {
        return assistant.chat(message);
    }
}
- @BrowserCallable makes the class available to the front end.
- @AnonymousAllowed allows anonymous users to call the methods.
- @Value injects the OpenAI API key from the OPENAI_API_KEY environment variable.
- Assistant is the interface that we will use to call the chat API.
- init() initializes the assistant with a 1,000-token memory and the gpt-3.5-turbo model.
- chat() is the method that we will call from the front end.
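To see the token-window memory at work, you can call the service twice and refer back to the first message. This is a minimal, hypothetical sketch (assuming a ChatService instance named chatService, called for example from a test or a CommandLineRunner), not part of the tutorial's code:

// Hypothetical usage: the assistant remembers earlier messages
// as long as they still fit in the 1,000-token window.
String first = chatService.chat("Hi, my name is Anna.");
String followUp = chatService.chat("What is my name?");
// followUp should mention "Anna", because the first message is still in the chat memory.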
Start the application by running Application.java in your IDE, or with the default Maven goal:
mvn
This will generate TypeScript types and service methods for the front end.
Next, open App.tsx in the frontend folder and update it with the following content:
// Import paths may differ slightly depending on your Hilla version.
import { useState } from "react";
import { MessageInput } from "@hilla/react-components/MessageInput.js";
import { MessageList, MessageListItem } from "@hilla/react-components/MessageList.js";
import { ChatService } from "Frontend/generated/endpoints";

export default function App() {
  const [messages, setMessages] = useState<MessageListItem[]>([]);

  async function sendMessage(message: string) {
    setMessages((messages) => [
      ...messages,
      {
        text: message,
        userName: "You",
      },
    ]);

    const response = await ChatService.chat(message);
    setMessages((messages) => [
      ...messages,
      {
        text: response,
        userName: "Assistant",
      },
    ]);
  }

  return (
    <div className="p-m flex flex-col h-full box-border">
      <MessageList items={messages} className="flex-grow" />
      <MessageInput onSubmit={(e) => sendMessage(e.detail.value)} />
    </div>
  );
}
- We use the MessageList and MessageInput components from the Hilla UI component library.
- sendMessage() adds the message to the list of messages and calls the chat() method on the ChatService class. When the response is received, it is added to the list of messages.
You now have a working chat application that uses the OpenAI chat API and keeps track of the chat history. It works great for short messages, but it is slow for long answers. To improve the user experience, we can use a streaming completion instead, displaying the response as it is received.
Streaming OpenAI Chat Completions With Memory Using LangChain
Let's update the ChatService class to use a streaming completion instead:
@BrowserCallable
@AnonymousAllowed
public class ChatService {

    @Value("${openai.api.key}")
    private String OPENAI_API_KEY;

    private Assistant assistant;

    interface Assistant {
        TokenStream chat(String message);
    }

    @PostConstruct
    public void init() {
        var memory = TokenWindowChatMemory.withMaxTokens(1000, new OpenAiTokenizer("gpt-3.5-turbo"));
        assistant = AiServices.builder(Assistant.class)
                .streamingChatLanguageModel(OpenAiStreamingChatModel.withApiKey(OPENAI_API_KEY))
                .chatMemory(memory)
                .build();
    }

    public Flux<String> chatStream(String message) {
        // Bridge the LangChain4j TokenStream callbacks to a Reactor Flux
        // so Hilla can stream the tokens to the browser.
        Sinks.Many<String> sink = Sinks.many().unicast().onBackpressureBuffer();

        assistant.chat(message)
                .onNext(sink::tryEmitNext)
                .onComplete(sink::tryEmitComplete)
                .onError(sink::tryEmitError)
                .start();

        return sink.asFlux();
    }
}
The code is mostly the same as before, with some important differences:
- Assistant now returns a TokenStream instead of a String.
- init() uses streamingChatLanguageModel() instead of chatLanguageModel().
- chatStream() returns a Flux<String> instead of a String.
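If you want to sanity-check the streaming endpoint on the server before touching the UI, you can subscribe to the returned Flux directly. This is a minimal, hypothetical sketch (for example, in a test), not part of the application code:

// Hypothetical check: print tokens to the console as they are emitted.
chatService.chatStream("Tell me a joke about Java")
        .doOnNext(System.out::print)
        .blockLast(); // blocking is fine in a quick test, but not in production code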
Update App.tsx with the following content:
// Import paths may differ slightly depending on your Hilla version.
import { useState } from "react";
import { MessageInput } from "@hilla/react-components/MessageInput.js";
import { MessageList, MessageListItem } from "@hilla/react-components/MessageList.js";
import { ChatService } from "Frontend/generated/endpoints";

export default function App() {
  const [messages, setMessages] = useState<MessageListItem[]>([]);

  function addMessage(message: MessageListItem) {
    setMessages((messages) => [...messages, message]);
  }

  function appendToLastMessage(chunk: string) {
    setMessages((messages) => {
      // Replace the last message with an updated copy instead of mutating state in place.
      const lastMessage = messages[messages.length - 1];
      return [...messages.slice(0, -1), { ...lastMessage, text: (lastMessage.text ?? "") + chunk }];
    });
  }

  async function sendMessage(message: string) {
    addMessage({
      text: message,
      userName: "You",
    });

    let first = true;
    ChatService.chatStream(message).onNext((chunk) => {
      if (first && chunk) {
        addMessage({
          text: chunk,
          userName: "Assistant",
        });
        first = false;
      } else {
        appendToLastMessage(chunk);
      }
    });
  }

  return (
    <div className="p-m flex flex-col h-full box-border">
      <MessageList items={messages} className="flex-grow" />
      <MessageInput onSubmit={(e) => sendMessage(e.detail.value)} />
    </div>
  );
}
The template is the same as before, but the way we handle the response is different. Instead of waiting for the response to be received, we start listening for chunks of the response. When the first chunk is received, we add it as a new message. When subsequent chunks are received, we append them to the last message.
Re-run the application, and you should see that the response is displayed as it is received.
Conclusion
As you can see, LangChain makes it easy to build LLM-powered AI applications in Java and Spring Boot.
With the basic setup in place, you can extend the functionality by chaining operations, adding external tools, and more, following the examples on the LangChain4j GitHub page linked earlier in this article. Learn more about Hilla in the Hilla documentation.
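As a small taste of extending the assistant with tools, LangChain4j can expose plain Java methods as functions the model may call. Here is a minimal sketch; the exact annotations and builder methods may differ between LangChain4j versions, so treat it as an illustration rather than a drop-in snippet:

// Hypothetical example: register a Java method as a tool for the assistant.
class Calculator {
    @Tool("Adds two numbers")
    double add(double a, double b) {
        return a + b;
    }
}

assistant = AiServices.builder(Assistant.class)
        .chatLanguageModel(OpenAiChatModel.withApiKey(OPENAI_API_KEY))
        .chatMemory(memory)
        .tools(new Calculator())
        .build();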