Graphic by Jung in-sung

As big global tech companies race to develop ‘small Large Language Models (sLLM)’, S. Korean companies have joined the competition as well. A large language model (LLM) is a type of artificial intelligence (AI) program capable of recognizing and generating text, among other tasks.

Meta’s official blog recently featured a case study on ‘MATH GPT’, developed by the S. Korean AI startups Upstage and Mathpresso. It is unusual for Meta’s official blog, which primarily introduces the company’s own products and research achievements, to feature technology from a S. Korean startup.

MATH GPT has roughly 13 billion parameters, far fewer than GPT-4’s reported 1 trillion, yet its mathematical capability is considered among the best in the world. On the MATH benchmark, a set of 12,500 challenging math competition problems, it scored 0.488 out of a perfect score of 1, surpassing OpenAI’s GPT-4, which scored 0.425.

An official from Upstage explained, “While ChatGPT has been trained on a relatively small volume of mathematics data, MATH GPT is an AI model specialized in mathematics, based on advanced math data held by Mathpresso.”

Several major tech companies, including Google, Microsoft, and Meta, have launched sLLMs this year, and S. Korean companies are following suit. sLLMs are gaining attention as a cost-effective alternative to LLMs, which require significant sums of money for both training and operation.

For instance, Google has introduced its sLLM ‘Gemini Nano’, Microsoft has launched ‘Phi-3 Mini’, and Meta has unveiled ‘Llama 3’. An AI model’s parameter count is a rough measure of its capacity to learn complex patterns, and for sLLMs, parameter counts range from hundreds of millions to tens of billions.

Gemini Nano, Phi-3 Mini, and Llama 3 have 1.8 billion, 3.8 billion, and 8 billion parameters, respectively. Large language models, by contrast, have more than 100 billion parameters. Sam Altman, CEO of OpenAI, previously said that the cost of running the LLM-based ChatGPT is “tearfully expensive.”

sLLMs have been introduced to improve on the response speed and operating efficiency of LLMs.

These models are lightweight and can be embedded directly into devices, such as smartphones and laptops, making them suitable for ‘on-device AI’.

sLLMs are also more cost-effective than LLMs, making them better suited to the business-to-business market, where the focus is on achieving maximum performance at minimum cost.

Following the trend, S. Korean companies are also rapidly releasing sLLMs. Naver recently showcased ‘HCX-Dash,’ a new lightweight model of HyperClovaX, which can be used at a fifth of the cost of previous models. It is suitable for simple tasks like generating sentences or summaries, as well as creating reports or custom chatbots.

Upstage released ‘Solar Mini’ as open source on Amazon SageMaker JumpStart and AWS Marketplace last March, allowing clients to fine-tune it to create their own custom generative AI services. An sLLM for the edutech industry has also been launched.

Fasoo, previously focused on data and application security, launched its sLLM ‘Ellm.’ Ellm is an on-premises sLLM designed for professional and industrial domains such as coding, law, taxation, and finance. It can be further trained on small datasets specific to particular tasks or domains, enabling use within individual departments or organizations of a company.