NLPgap Research

Abstract: Research in progress ·· Article in development

NLPgap ·· Latin American Cultural Theory ·· Culturally Situated NLP

Anglocentrism in the development and training of large generative language models has only recently begun to be studied within computer science, specifically within natural language processing (NLP), as one of the focal points in the reproduction, propagation, and perpetuation of stereotypes and cultural biases originating in Anglophone culture.

This research reviews the most relevant emerging literature produced by NLP experts, in particular the June 2025 paper Culture is Not Trivia by Zhou et al., and offers an interdisciplinary perspective from the humanities, namely Latin American cultural theory, in order to begin what could be called a practical articulation of their findings. Such an articulation would make it possible to create language models that are not only technically expert in a culture, but culturally situated: language models trained on localized data and fine-tuned from the center of a culture rather than as spectators of it.

As part of this task, the project will outline a theoretical framework specific to the Latin American region. This framework will provide a solid platform for understanding the particularities of our culture, not only in historical terms, but also so that those characteristics are not reduced to mere "oddities" of our different dialectal varieties of Spanish and are understood instead as their living pulse.

To this end, the research proposes a deeper examination of the categories of "noise" and cultural proxy mentioned in Zhou et al.'s work, as an indispensable step toward the goal of creating culturally competent systems. It also proposes a methodological review of how both categories are treated within natural language processing pipelines, arguing that in certain contexts this material is not mere residue, but carries situated cultural logic. In some cases, this requires a partial resignification of fragments of noise as epistemic proxy, or the creation of an intermediate category that would allow their integration without loss of meaning.

Finally, in resonance and coherence with the challenge this field faces, the present research was originally conceived and written in its target language: Latin American Spanish. Its subsequent translation into English will constitute a gesture that stages the final challenge NLP faces in matters of cultural competence when developing culturally attentive language models.

Heritage, Identity, and History

Their interaction with the institution in audience formation

Undergraduate thesis ·· Universidad de Chile ·· 2006 ·· Cultural institutions ·· Cultural access ·· NLPgap foundations

Thesis submitted for the degree of Licenciada en Teoría e Historia del Arte ·· Universidad de Chile, 2006 ·· Co-authored with Isabel Ibáñez Figueroa ·· Thesis advisor: María Eugenia Brito Astrosa

What follows is the introduction to the thesis, published here as a writing sample and as a foundational document for the line of research that now continues in NLPgap. The structural question remains the same: how do institutions reproduce unequal access to cultural goods, and what is required for real democratization?

Introduction

When the word culture is spoken, what is generally understood by it? The concept of culture that most of us Chileans use in everyday vocabulary is a very general definition, similar to a dictionary explanation.

Throughout history, the term culture has taken on various nuances, and more and more specifications have been added for each of the applications given to it. Thus, when we try to approach a precise definition of culture, we face a difficulty: how to grasp the term conceptually and etymologically. There are so many definitions, from the simplest and most general to the most specific and current, that it becomes practically impossible to find a description capable of gathering each and every one of the existing ideas about it.

We have decided to locate a definition of culture derived from research in history, semiology, sociology, and anthropology; a definition that contains common distinctive traits of culture. The studies of Néstor García Canclini and Gabriel Salazar, among others, have been fundamental in determining the precise conception of culture on which we will work.

As a central point, it must be taken into consideration that culture cannot be understood as a general whole. What we mean is that this concept will always be subject to an inevitable contextualization. Cultures will vary qualitatively depending on where they originate, and in this sense, culture will be subordinated to societies. That is, it will be the product of a society. Taking only this first point into consideration, we are obliged to assume that this is a terrain that presents itself as intrinsically diffuse, and that it cannot be seen as a defined concept capable of being transferred and fitted from one context into another.

In this regard, Jurij Lotman produced, in 1979, a definition of culture based on its co-dependent relationship with language, in which both shape a complex totality. He designates at least two basic characteristics of culture: first, that it "cannot be considered a universal set, but only a subset with a certain organization"; and second, that culture will be defined through a binary opposition: culture exists as such insofar as it verifies the existence of a "non-culture," so that, in this interaction, culture reveals itself as "a system of signs."

For the Tartu School, the functionality of culture would consist in "structurally organizing the world that surrounds man." This structurality would be generated through a "stereotyping device," located at the center of culture and acting as a "vigorous spring of structurality"; this device would reside in language. Lotman also affirms that culture would arise as a social phenomenon, understood as the "non-hereditary memory of the collectivity, expressed in a given system of prohibitions and prescriptions," which, for semiology, would constitute a set of conventionalities that regulate the world, including language.

On the other hand, in the context of anthropology, Néstor García Canclini, in the studies for Culturas populares en el capitalismo, states that culture would constitute "a particular type of production whose purpose is to understand, reproduce, and transform the social structure, and to struggle for hegemony."

In this way, we can observe that culture has the invariable characteristic of being a particular conglomerate born from societies, produced by human actions and by their interaction with the environment. Therefore, as a basic premise regarding culture, we must think of it as a structured system of behavior in which every individual is immersed, a system that incorporates conventionalities, norms, and commonly accepted laws through which the environment surrounding man is understood, organized, and determined.

Consequently, culture can be understood as a structure composed of the diverse domains in which the individual develops within society. In this sense, culture comprises a broader spectrum than artistic expressions alone. These would be one aspect of this larger structure, even though most of the time, when the word culture is mentioned, it is inevitably taken to refer only to that meaning.

The cultures of culture

We must delimit our field of work to artistic culture, considering it as the product of a system of expression that develops and uses its own language. Throughout history, art has had various definitions and functions, among which we may mention serving as a medium of catharsis, educating, subverting established orders, and creating consciousness. But without question, the functional aspect that remains and becomes constant is the one related to expression.

In this way, macro and micro universes must be considered within the structure of culture. From this point of view, the culture of human expression, understood as art, would be defined as a micro-universe within a macro-culture. This micro-universe is a conglomerate that is not alien to the various conflicts affecting society. In fact, the possession or lack of certain cultural goods is one of the indicators that define the individual's position within it.

Culture, beyond all the ways in which it has been defined, contains within itself a factor of great importance, related to its instrumental character. This character becomes visible insofar as we contextualize the concept of culture within society, taking into account other factors, such as the political and economic sphere in which individuals are immersed.

This accumulation of objects, knowledge, expressions, and thoughts that culture comprises has a value for society that is not measurable solely through market values, but also through an invisible force that remains in constant tension. On one hand, we find a value linked to the mercantile conception of culture; this characteristic comes from a Marxist vision of the world and arises from the basic notions Marx applies to the commodity. From the moment the character of commodity is applied to culture, culture becomes an object with added value, to which one may have greater or lesser access depending on one's position in society.

On the other hand, there exists an inherent value of culture, related to the concept of culture as instrument. Néstor García Canclini calls this cultural power, which would have the capacity to "naturalize the arbitrariness of prevailing orders." That is, in social terms, within a reality circumscribed by dynamics of functioning imposed by a dominant class, culture can soften that imposition or simply make it appear to mean the opposite, appealing to a positive common cause. In this sense, culture can come to legitimize the institutions that represent a hegemony.

Thus, we verify the importance socially assigned to culture as a double and paradoxical path: on one hand, as the liberation of man from the alienating processes inherent to industrialized societies and the effects of modernity, liberating because of its capacity to channel expression; and on the other, as an instrumental mechanism for the reproduction of social models. This exposes not only an age-old problem of politics and capitalist expansion, but also the existence of a terrain of dispute, a confrontation of powers and interests over this valuable instrument for the consolidation of social supremacy.

The unequal distribution of cultural goods

According to Pierre Bourdieu: "cultural goods, accumulated in the history of each society, do not really belong to everyone, even though they are formally offered to everyone, but rather to those who possess the means to appropriate them."

These means are not only purchasing or monetary power, but also intellectual means: "the educational system of each society denies and provides according to socioeconomic position, as resources for appropriating cultural capital; therefore, ultimately, it also reproduces a previous structure of distribution of cultural capital among classes."

There is a problem, on one side, of power and domination, and on the other, of contents and of the intellectualization of each social group's expressions within the cultural sphere.

This unequal distribution of cultural goods in a society designates and determines levels through which artistic objects will circulate, tracing circuits of production, circulation, and consumption of these goods.

Our object of study is proposed as a way to make visible the problem of the unequal distribution of cultural goods, which carries with it a categorical division in the circulation and consumption of those goods, establishing monolithic and static circuits.

Within this theoretical context, which takes the cultural sphere as its broad field of action, directs its gaze toward the problem of the unequal distribution of cultural goods, and seeks to excavate certain aspects of this situation in our country, we have defined the thematic criteria on which we will work: I) heritage; II) identity; III) history; IV) institutions.