Realworld

R069 - Entrepreneurship with Social Impact, with Ivana Feldfeber

Podcast 49 min

Follow Realworld!

Listen to the episode on Spotify, Apple Podcasts, or Google Podcasts.
Subscribe to our YouTube channel so you don't miss an episode.

Entrepreneurship is often associated with business and risk. Most successful businesses have thrived by building on an identified opportunity to capture value. But there are other ways to undertake entrepreneurship. In social entrepreneurship, the problem at hand is not solved with a consumer product; instead, innovative solutions are aimed at creating social impact.

This is the field in which Ivana Feldfeber operates. She is co-founder and executive director of DataGénero, the first gender-perspective data observatory in Latin America, and a researcher analyzing public AI policy in Latin America at the Center for AI and Digital Policy. She holds a postgraduate degree in Data Science, Machine Learning, and its Applications from the University of Córdoba, Argentina.

What is the real world to you?

I always talk about the real world and the virtual world as if they were two different, separate worlds. Something happens when two people talk and look each other in the eyes. For me, the real world is when we meet another person and step outside the logic social media has accustomed us to, with its different tempo and the ability to think carefully about what we are going to say before saying it. The real world is more genuine.


AI today

What is your view on the current state of artificial intelligence?

Smoke on the water.

There is a lot of smoke, nothing behind it, a lot of empty shells, and many people wanting to do AI for the sake of AI, when what they actually need is to work with data, not AI or statistics. I find it interesting to think about technology more broadly than just AI. There is a lot of super interesting and powerful technology that is not solutionist but can do many interesting things. Many of the inquiries we receive about AI are ultimately data inquiries. Without good data, there is no good AI. At the same time, there is something a bit dangerous about how AI is being thought of: predicting behaviors, predicting crime, predicting who can get a bank loan or who can get parole. It is very complex for an algorithm to decide about people's lives. I find it problematic, to say the least.

And there is no one reviewing that. People talk about "human in the loop," a human being involved in the decision-making system. But it should be the other way around: people should be at the center, with AI as just another part around us that can help with some things. I find it very problematic to think that it will save us or take over our entire lives.


Intersectional, feminist, and anti-solutionist stance

Argentina is a deeply racist country that claims it is not racist because it has no black people, which is a lie. Argentinians are simply very good at making the country's brown, black, and indigenous people invisible. They do not appear in movies or advertisements, they are not in positions of power, and that is racist. So, from this intersectional perspective, we also try to make visible the many oppressions and power inequalities within society. We are not only interested in gender issues; we see gender as a good vehicle to make visible many things that are wrong in this world. In particular, we have worked with Eugenia Figueroa, from Identidad Marrón, who has an Instagram account called Soy Mujer Colla, where she works extensively on these issues of racism in Argentina. Few of the people who speak about this are made visible by social media, or even by the media.

And there is also another guy, David, from Identidad Marrón, who has been denouncing the pervasive racism and invisibilization in Argentina. I believe something in society prejudges a lot, because there are many white people in Argentina, especially in Buenos Aires, who believe they are European. When you go to other cities or rural areas across the country, other things happen. Everything happens in Buenos Aires, so there are many white people there, and many brown and black people too, but those who are seen, who govern, who are everywhere, are white. We also think about disability issues; we are interested in using data to make visible what is not working and the people who remain invisible.

There is something dangerous about how AI is being thought of, about predicting behaviors, predicting crime. And that there is no one reviewing that.

And regarding anti-solutionism, we are interested in understanding that technology will not solve social problems. For example, an algorithm will not solve poverty or hunger or violence. It does not work like that, and I believe many people have that illusion, that optimism, that hope that technology will save us and solve our problems. We hold the completely opposite stance: social problems must be solved with society, with people behind them, and with money.

Big companies have this AI fetish. I think these are distractions. The solutions do not lie here.

There is a lot of funding for researching artificial intelligence applied to social issues. Okay, but if we want to solve gender violence, millions and millions of whatever currency are needed to work in the field, to run trainings, to accompany victims, to have shelters for women fleeing their homes, to run community-based programs, and for that there is never money. And there is a problem of wealth distribution.

Big companies have this AI fetish, and some fund gives 5 million dollars to the 10 projects with initiatives to reduce hunger with AI. Those companies, which have billions and billions of dollars, could do much more by really putting the money where it is needed. I think these are distractions. The solutions do not lie here. This is interesting because we can grant that artificial intelligence can serve some purposes, yes, great, but we are not solving any problem. I want that to be clear. Are they solving gender violence with artificial intelligence? No. In no way do we want to be associated with that, because that discourse is super dangerous.

How has the change of government in Argentina affected all these issues? There has been one decree after another prohibiting gender-related policies.

What happened is that the previous government, instead of solving basic problems, also took up these more progressive agenda items, such as feminism, and created inclusive-language programs and gender policies, perhaps neglecting other areas. A large part of society is tired of feminism, and there is also a reactionary part of Argentine society. I am going to say something very controversial: I do not care so much that they prohibit inclusive language or the gender perspective in the Government. Yes, it affects us directly at the observatory, but I am more concerned about the things that are harder to undo. If they sell a glacier to a foreigner, if Elon Musk comes for all the lithium in Argentina, that is much more difficult to reverse than reinstating inclusive language or the gender perspective in the Government. There are different levels of disaster. On March 8, the Milei Government closed a room in the Casa Rosada, the Women's Hall of the Bicentennial, and renamed it the Hall of Heroes: symbolic to the core.

Instead of worrying about the economic disaster that is happening, they chose to put the emphasis on this room. That is symbolically very strong, but it keeps us talking about that instead of everything else. Removing Menem's poster and leaving Juana Azurduy's is something done in a day. But recovering the lithium, I do not know when we will be able to. So, I do not want the debate about feminism to divert attention from what really matters: that there are people who do not have enough to eat, people who are having a very bad time, falling into poverty and selling all their belongings to survive, while they keep us arguing about the gender perspective. I am concerned about many other things. That they prohibit the gender perspective in the Government cuts short our project of accompanying governments toward better data and better public gender policies, yes. But well, we will see where we go. When people do not have enough to eat and social unrest begins, a social outbreak, something ends up breaking, and that is not easy to fix.

What have you been able to learn so far from all the information you have collected? Do you have evidence?

It is very difficult. In Argentina, we have very little. That is the big question we are asked, because all the funders, going back to those fundings, like you to show your impact and what you have learned. And if there is no data, we have to start by collecting it and building a substantial base before drawing conclusions. We do not want to draw conclusions from a database of a thousand rows, because that would be a bit irresponsible on our part.

For example, we created the database of the last elections, on parity in the candidate lists and in Congress. We learned, for instance, that there is never parity. It is something we intuited, but without the data we could not actually know it.

We have some databases, but we also have many rejected requests for access to information that would have completed some of the puzzles. We have the equal marriage law, under which same-sex couples have been able to marry since 2010, and we do not know how many people actually got married, because many provinces do not have that information or do not want to give it. Others say they are not collecting it because they assume all couples are heterosexual, so they do not record the sex of the spouses; that is what they tell us. So, out of Argentina's 24 provinces, we obtained data from eight that was good enough to draw conclusions. We cannot talk about a country; we can talk about eight provinces.

We can see some trends. For example, in the first year the law was in force, many people over 50 got married, and then it balanced out. Many people had been waiting for the law to come out so they could marry, manage their assets better, and also declare their love; there is a pragmatic side of marriage these people had not been able to have. We can see that, but not for the whole country, and we cannot know what happens, for example, in the more conservative, more Catholic provinces. We do not have that data.
So, it is very difficult to say: "We have this impact because the data we collected changed the lives of so many people." We are at a much earlier stage, and it will take many years, even more with a government that dismisses all this and is erasing data.

What we did with a group of activists was scrape all the official data that was available before the new administration began, and we have 2 TB or more of data that had been published and that, little by little, is being taken down, no longer appears, or is no longer updated. So, we are safeguarding the information. There will be four-year gaps whenever a government is not interested in open data. But well, at least we are saving some of it.

What does DataGénero look like today?

We are four people handling ongoing operations, not full-time, because we all have other jobs, and then we have teams of analysts who work project by project: a team of 12 people, all cis women, working on different projects depending on the area, legislative, judicial, executive... In training, we have another team as well.

One of them is AymurAI

AymurAI is like our little gem. We had been talking a lot about the risks of AI, about biases and power inequalities in society, and how AI solutions reflected those inequalities and had problems. But in 2021, Paola Ricaurte, a wonderful Ecuadorian researcher who works extensively on AI, feminisms, activism, and colonialism, wrote to me and said: there is an open call to build artificial intelligence for feminisms, what they call feminist artificial intelligence.

Right now, you are working with court number 10 and with 14. To what extent do you think this can expand to the rest of the courts and what frictions do you see?

I think we have a limited audience that will voluntarily adopt these tools, because more transparency means being observed and not being able to do just anything. So, we will have a very interesting set of courts with a lot of willingness, and many courts that will want nothing to do with any of this. Our next step is to go to the legislature so that this becomes mandatory. There is no other way. There should be a system, perhaps not depending on us, maybe run by the Supreme Court of Justice, that centralizes the data from all the courts, with every court obliged to generate this data, anonymize its sentences, and upload them so that everyone can see them. Not all sentences: some in particular are not even published because they are very easy to trace back to individuals, and certain precautions must be taken regarding data and the traceability of individuals. But most sentences are uploaded in the courts that decide to do so.

The ideal would be for it to be mandatory, as reporting data already is for many other systems in our country, which also works only more or less. We want to push for an open-data law for the judicial powers. It will be very difficult. We feel a bit like Don Quixote tilting at windmills, because the judiciary is one of the strongest and heaviest powers in Argentina, and also opaque, not very transparent.

There is no better recipe for inaction than despair. While it is true that denialism of all kinds, mass manipulation, and technologies in the service of propaganda and misinformation are observable trends of our times, it is also true that there are reasons for hope. I think it is important to always remember the famous quote attributed to Margaret Mead: "Never doubt that a small group of thoughtful, committed citizens can change the world; indeed, it's the only thing that ever has."

What long-term strategy do you see?

One we are trying to carry out is to take AymurAI and this project to other countries, because in Argentina at this moment we cannot. And with government support, and maybe some sponsors too, something we do not rule out, to implement some of these programs on the ground and obtain data that is a little more representative, a slightly better population sample. Not every court everywhere needs to participate, but we want a good sample to look at. It is difficult; we are thinking about it.

What are the difficulties of being a social entrepreneur?

It is super important to disconnect. Put down the cell phone, read books that are not on this topic, because I also have a pile of work-related books that make it impossible to disconnect. I think it is key, and I have talked about it with many social entrepreneurs who reach burnout. Anyone can reach burnout, in a company too, because of oppressive and very precarious work systems. But being a social entrepreneur has that side of: "I give my life for this."

AI and gender violence

It is the AI that identifies which case files correspond to possible cases of gender violence.

The AI reads all the case files: drug trafficking, a vehicle accident, someone who entered a football stadium without a ticket, anything you can imagine. The AI reads each one and can detect which article was infringed, what the conduct was, and so on. It also detects whether it is gender violence or not. If it is, we have a different pipeline with more specific fields about gender violence: what type it is, whether physical, sexual, economic, patrimonial, or psychological. There are many types of gender violence. In general, the judge writes in the sentence what type it is, and our tool recognizes it. Then there is a lot of additional information: for example, whether the parties had children in common, which is also a source of extortion in many cases, and whether it happened in a work environment, an educational environment, on public transport, or at home. So we have much more granular information about these gender violence cases, which we may not have for a drug trafficking case, though we could.
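A minimal sketch of this two-stage idea, first decide whether a file involves gender violence, then run a richer pipeline for its subtypes and context. The keyword lists and field names here are hypothetical stand-ins; AymurAI's real pipeline is trained on labeled court data, not hand-written rules.

```python
# Hypothetical keyword rules; illustrative only, not AymurAI's actual model.
GV_TYPES = {
    "physical": ["hit", "struck", "beat"],
    "sexual": ["sexual abuse", "rape"],
    "economic": ["withheld money", "financial control"],
    "psychological": ["threat", "harassment", "humiliation"],
}

def classify_file(text: str) -> dict:
    """Stage 1: gender violence yes/no. Stage 2: more granular fields."""
    lowered = text.lower()
    types = [t for t, kws in GV_TYPES.items()
             if any(kw in lowered for kw in kws)]
    record = {"gender_violence": bool(types)}
    if record["gender_violence"]:
        # Richer pipeline: subtype plus illustrative context fields.
        record["types"] = types
        record["context"] = {
            "children_in_common": "children in common" in lowered,
            "workplace": "workplace" in lowered,
        }
    return record

print(classify_file("The accused made repeated threats at her workplace."))
```

A non-gender-violence file (say, the stadium-ticket case) would simply come back with `gender_violence: False` and skip the specific fields.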

AI for administrative issues is something very interesting to implement, for example, in the judicial powers.

We can open windows for each of the cases to have much more information. But it is endless, and we do not have the money for that mega development. The funding that supported this development was specifically to understand issues of feminisms and gender. So we focused on gender violence before it reaches femicide. Not much is known about that, because much of gender violence stays behind closed doors, silenced and unspoken, and people do not want to go to the police to report it. There are many problems, and for those cases we have no data, or very little, or activist data from the people who accompany the victims. But we do have a lot of data on the victims who decide to report and enter the judicial system. What happens there? Do judges grant requests, for example, for restraining orders or preventive detention, or not? The judge we have been working with is very gender-sensitive, so our set of sentences has a bias: most grant the protection requests of women and trans and non-binary people.

But that is not the general case in Argentina or Latin America. We know we have an imbalance there, but it is not so important, because the tool does not decide what the judge will resolve; it simply records it. So we can upload plenty of sentences from judges who do not grant requests, and they will appear in the database too.

Data feminism

I was surprised to read "data feminism" on your site, which is something I had never come across before. I found it an interesting concept.

It is a very interesting concept coined by Catherine D'Ignazio and Lauren F. Klein, from MIT. They wrote a great book called Data Feminism, and we translated it into Spanish. It was a collective translation by many people with this transfeminist, intersectional perspective, on how to translate from English to Spanish, which has a completely different logic. And we wrote a chapter on all the decisions we made for that data feminism translation.

Going back to feminist AI: artificial intelligence itself cannot be feminist, just like this cup cannot be feminist. It is what people put into it and do with the tool, what actions can be taken, and so on, that makes it AI for feminisms, or AI with feminist methodologies.

But well, Paola called me and said: "Apply to this call to create artificial intelligence." I had already trained several algorithms; I had programmed during my postgraduate diploma and in different spaces, but this was a somewhat bigger task. And Yasmín, one of the co-founders, works in court number 10 of the City of Buenos Aires, a very modern court, very innovative in its transparency and data policies.

We are not going to solve gender violence with AI, but we can solve that a court that wants to load its data semi-automatically can do so.

They have been collecting data manually for many years and have a huge database of all the judge's sentences; this is an initiative of Judge Pablo Casas. They wanted to automate that data-loading process, because it takes a lot of time, people need extensive training to do it, and manual data entry has many inconsistencies. And clearly, that is a problem that can be solved with AI. We are not going to solve gender violence with AI, but we can make it possible for a court that wants to load its data semi-automatically to do so. So AI for administrative matters seems to us very interesting to implement in the judicial powers, for example. Our AI is a very simple named-entity recognition system that reads the judicial sentence, detects the relevant information, and carries it into a form, and that form creates a row of the dataset. Our AI does not predict anything, in the sense of predicting behavior or predicting someone's violence. It takes the labels and puts them in the database.
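The flow just described, extract labeled spans from a sentence, pre-fill a form, and turn the confirmed form into one dataset row, might be sketched like this. The regexes stand in for the trained NER model, and all patterns, field names, and the `dataset.csv` path are illustrative assumptions:

```python
import csv
import re

# Illustrative patterns standing in for a trained NER model.
PATTERNS = {
    "case_number": re.compile(r"Case\s+No\.\s*(\d+)"),
    "date": re.compile(r"\b(\d{2}/\d{2}/\d{4})\b"),
    "article": re.compile(r"Article\s+(\d+)"),
}

def extract_row(sentence_text: str) -> dict:
    """Detect relevant spans and pre-fill one form/row; a human reviews it."""
    row = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(sentence_text)
        row[field] = match.group(1) if match else ""
    return row

def append_row(row: dict, path: str = "dataset.csv") -> None:
    """The confirmed form becomes one row of the growing dataset."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(PATTERNS))
        if f.tell() == 0:  # write the header only for a fresh file
            writer.writeheader()
        writer.writerow(row)

row = extract_row("Case No. 482, decided 03/05/2023 under Article 149.")
print(row)
```

In use, `extract_row` would feed the review form shown to court staff, and only after confirmation would `append_row` persist it, matching the semi-automatic, human-reviewed loading the interview describes.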

Without good data, there is no good AI.

For that, it had to be trained, and the court staff did the labeling and all the data preprocessing, for which we obviously paid them well, because labeling data is very tedious work that is always invisible and very precarious, and we tried not to repeat that this time. The court staff worked inside the court with that data to first generate fake data, so that the fake data could leave the court without putting anyone's life at risk, because we are talking about very sensitive data. So all names, addresses, and IDs were anonymized, but with fake names, so the text could still be understood. What we want is precisely to detect, for example, the data of the aggressor and the data of the victim, and to know whether the victim is a cis woman, a trans woman, a man... For that, the names and the information needed to remain in place.

It is very tedious work, but they did it so it could serve the training. We could not do it with real data, for privacy and security reasons. But we have a synthetic database that works very well, and that synthetic database went to another group of labelers, who labeled what we did want to detect: all the fields of the dataset.
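The "fake but still readable" step might look roughly like this: each real name is consistently swapped for the same invented one, so the document stays coherent for labelers while no real identity leaves the court. The name pool and helper are hypothetical, a tiny sketch of the idea rather than the court's actual process:

```python
import itertools

# Invented replacement names; a real workflow would need a much larger pool.
FAKE_NAMES = ["Ana Pérez", "Luis Gómez", "Marta Díaz", "Juan Soto"]

def pseudonymize(text: str, real_names: list[str]) -> str:
    """Swap every real name for the same fake name throughout the text,
    keeping the document readable while removing real identities."""
    fakes = itertools.cycle(FAKE_NAMES)
    mapping = {real: fake for real, fake in zip(real_names, fakes)}
    for real, fake in mapping.items():
        text = text.replace(real, fake)
    return text

doc = "Maria Lopez reported the incident. Maria Lopez lives nearby."
print(pseudonymize(doc, ["Maria Lopez"]))
```

Consistency matters here: because the same fake name appears everywhere the real one did, labelers can still mark "victim" or "aggressor" spans exactly as they would on the original.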

So, it was a complex task with many steps that would have been easier if we had had the raw data. But we are talking about judicial sentences, among the most sensitive data in society, like medical records. That was the first iteration of the project. So we have this feature of creating databases from judicial sentences, and it works very well.

An algorithm will not solve poverty or hunger or violence. It does not work like that, and I believe there are many people who have that illusion, that hope that technology will save us.

And then we have another new feature: anonymizing documents, something courts ask for a lot, because they are obliged to upload anonymized judicial sentences to the internet but have to black them out with a marker, since they lack the technology to do otherwise. So we built this anonymization feature: you upload the sentence, all the sensitive information is detected, and a label is placed over each item. Instead of the actual date, it says "date"; instead of names, "person" or "reported person"; instead of the number, "ID". None of the sensitive information appears, which makes everything much simpler. The output is an ODT file, an open format, and that is our little AI. It does quite a bit for court problems: generating databases and generating anonymized documents that can be published.
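A stripped-down sketch of that anonymization step, detect sensitive spans and replace each with a placeholder label rather than a marker blackout. The patterns below are illustrative assumptions on plain text; the real feature works on ODT documents with a trained detector:

```python
import re

# Placeholder label per kind of sensitive span (illustrative patterns only).
REDACTIONS = [
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "<date>"),
    (re.compile(r"\bDNI\s*\d{7,8}\b"), "<ID>"),
    (re.compile(r"\b(?:Mr\.|Ms\.)\s+[A-Z][a-z]+\b"), "<person>"),
]

def anonymize(text: str) -> str:
    """Replace each sensitive span with its label, keeping the text legible."""
    for pattern, label in REDACTIONS:
        text = pattern.sub(label, text)
    return text

print(anonymize("On 12/04/2023, Ms. Lopez (DNI 30123456) testified."))
```

Unlike a marker blackout, the labels preserve what *kind* of information was removed, so the published sentence remains readable and auditable.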

A scientific-social approach to the product

These types of projects survive thanks to funding. In the startup world, we talk about problem-solution fit and product-market fit. In your case, you are addressing a social problem regardless of the revenue model, and I find this fascinating because it is unknown to me.

We have to learn a little from both worlds. Maybe life would be a little easier if we had this view of having a good product: how can we make it reach more people while remaining self-sustaining? We do not want to grow into a company with 200 employees; that is not the goal. Nor do we want in-house programmers at this time, because it is super expensive. But we do want a business model, which is something we do not have at the moment. We would love for it to happen, to think of it as a point of sustainability. We do not want to be millionaires, that is not where this is going, but we want it to reach many places without costing us our lives. Because there is something about being a social entrepreneur, this "I give my whole life for the venture," that I have already learned to stop doing. I burned out last year, and the truth is that you have to balance life and work, and work has to pay well enough to live well without constantly thinking about how to make ends meet.

You have to think about these things and also be able to prioritize family, relationships, exercise, culture, so that not everything is work, because it is very easy to get stuck in that. It is very easy to live traveling, too. I am traveling now, stacking up meetings and continuing to work virtually and in person at the same time. You have to set limits. And those limits also involve taking care of what you built, so that it is sustainable over time and the teams do not burn out or leave for other places. It is difficult, because we work with artificial intelligence and data science, and industry salaries are very competitive. Even though the market is a bit deflated now and things are harder, it is still more attractive to go somewhere else and switch off your brain after work. So, it would be good to think a bit more about a business model at some point, so that this is sustainable over time.

I understand that part of your strategy is open source, in that sense appealing to the community's contributions for help. What has been the impact of this?

Not much; there has not been much feedback yet. If anyone feels like it, come take a look at the code at https://github.com/aymurai, it is all there: the frontend and the backend. The idea of making it open source had to do with the funding that enabled this tool's development; it was one of the requirements. But I have always been a Linux user, I have always been into free software, and I think it is super aligned with our spirit within DataGénero.

Where does DataGénero come from?

I had been thinking about what I wanted to do and said: an observatory, because DataGénero is a data observatory with a gender perspective. But to get to that point, I started in the programming world, teaching programming to children and teenagers in schools. I learned to program in order to teach the fundamentals of programming; I started to like it and wanted to learn more and more. Then I took a data science course sponsored by Facebook and started looking for the intersection with social and gender issues, because I could not find it. Everything was for marketing and making a lot of money: all the metrics, predicting churn, everything to do with customers.

And that did not interest me much. I studied Education Sciences, which is why I got into teaching programming, and for many years I worked on gender violence in very poor neighborhoods of Buenos Aires, running workshops about gender violence, how to prevent it, how to recognize being in a violent relationship. This was before it was such a hot topic; I started on those subjects in 2011 or 2012, and with data science in 2017. I started going to conferences, looking and reading for this intersection between data and gender, and I found something in English, but in Spanish there was not much. There were some very small initiatives, but nothing concrete that mixed these two worlds. In 2019, I went to a Latin American artificial intelligence conference in Montevideo looking for these applications and found nothing; ethics was not even discussed at the time. Ethics clearly began to emerge with more force around 2022, with a peak of papers and research on ethics in machine learning, but there was nothing back then.

Observing we are also influencing and modifying government actions.

And that is where I started to think: What do I create? And when this observatory idea came up, it seemed interesting because observing we are also influencing and modifying. It is not the same for governments, for example, not to be observed in these issues, as it is for there to be people watching. And the gaze also modifies some of the logics that are inside and other things start to happen.

I had just been invited to a data-and-gender datacamp organized by Mailén García, another of the co-founders, and everything started to align, because I arrived with this idea and was inviting people I knew were feminists who worked in data science, though not on gender. I met Mailén, I met Yasmín, another of the founders, and that is where we started to shape this space. We had one face-to-face meeting, and then came the pandemic and the lockdown. The Government was doing what it could with a health crisis, and we did not want to file requests for access to public information on gender issues, for example, because all resources were allocated elsewhere. When things calmed down, we did start to push a little for the data we wanted, and we realized it was not there either.

We were ten steps behind where we wanted to be. We thought we would have plenty of data to analyze and could generate reports and visualizations, but that data did not exist. So, we had to start training government staff to collect it or, where something similar existed, cross-reference it into a combined database. It turned out to be more detective work and training.

In Latin America, trust in the judicial system stood at 25%, and in Argentina at 16%. This can help improve confidence in the judicial system, basically through observability itself: by observing a system, you modify it, giving it transparency and thereby discouraging bad practices.

It is a small part, because there is also everything involving police reports and police stations, which is another realm, the repressive apparatus of the State, about which we also have very little information, and it is very complex to obtain because there is a lot of impunity.

A few years ago, there was the case of a girl who was murdered by her ex-boyfriend after going to report him 16 times. She went through many police stations, and no one paid attention to her. It is very complex. So, there is a part we can observe, what judges in the judiciary decide, because beyond that we cannot get data on the cases: while a case is not closed, it cannot be published. Cases are published only once there is a final sentence. Everything that happens before that sentence is reached, we do not have either. We have one piece of the puzzle, unfortunately.

Apr 2, 2024

Carlos Iglesias

CEO at Runroom | Academic Director at Esade | Co-founder at Stooa | Podcaster at Realworld
