Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America

Latam-GPT is a new large language model being developed in and for Latin America. The project, led by the nonprofit Chilean National Center for Artificial Intelligence (CENIA), aims to help the region achieve technological independence by developing an open source AI model trained on Latin American languages and contexts.

“This work cannot be undertaken by just one group or one country in Latin America: It is a challenge that requires everyone’s participation,” says Álvaro Soto, director of CENIA, in an interview with WIRED en Español. “Latam-GPT is a project that seeks to create an open, free, and, above all, collaborative AI model. We’ve been working for two years with a very bottom-up process, bringing together citizens from different countries who want to collaborate. Recently, it has also seen some more top-down initiatives, with governments taking an interest and beginning to participate in the project.”

The project stands out for its collaborative spirit. “We’re not looking to compete with OpenAI, DeepSeek, or Google. We want a model specific to Latin America and the Caribbean, aware of the cultural requirements and challenges that this entails, such as understanding different dialects, the region’s history, and unique cultural aspects,” explains Soto.

Thanks to 33 strategic partnerships with institutions in Latin America and the Caribbean, the project has gathered a corpus of data exceeding eight terabytes of text, the equivalent of millions of books. This information base has enabled the development of a language model with 50 billion parameters, a scale that makes it comparable to GPT-3.5 and gives it a medium to high capacity to perform complex tasks such as reasoning, translation, and associations.

Latam-GPT is being trained on a regional database that compiles information from 20 Latin American countries and Spain, for a total of 2,645,500 documents. The data is concentrated in the region's largest countries: Brazil leads with 685,000 documents, followed by Mexico with 385,000, Spain with 325,000, Colombia with 220,000, and Argentina with 210,000. The numbers reflect the size of these markets, their digital development, and the availability of structured content.

“Initially, we’ll launch a language model. We expect its performance in general tasks to be close to that of large commercial models, but with superior performance in topics specific to Latin America. The idea is that, if we ask it about topics relevant to our region, its knowledge will be much deeper,” Soto explains.

The first model is the starting point for developing a family of more advanced technologies in the future, including ones with image and video, and for scaling up to larger models. “As this is an open project, we want other institutions to be able to use it. A group in Colombia could adapt it for the school education system or one in Brazil could adapt it for the health sector. The idea is to open the door for different organizations to generate specific models for particular areas like agriculture, culture, and others,” explains the CENIA director.


