The Detection of Xenophobic Language and Misinformation in Media Content project was a 2024 collaboration among UNICC, UNESCO, IOM, and New York University SPS Capstone participants. The rise of xenophobic language and misinformation in media narratives, particularly those involving migrants, refugees, and displaced communities, has prompted the need for tools that promote balanced, fact-based journalism that respects the rights and dignity of vulnerable populations. Negligent content amplifies harmful stereotypes and false narratives, yet manually screening for such content is costly, slow, and prone to error. In this context, UNICC collaborated with the NYU School of Professional Studies (students and faculty) to develop a comprehensive data labeling approach aimed at categorizing information based on its tone and intent. Together, we established the following classification criteria:

- toxic: Content containing generally harmful or offensive language intended to provoke or hurt.
- severe_toxic: Highly aggressive or extreme language with intense hostility or a derogatory tone.
- obscene: Language that includes vulgar or sexually explicit content inappropriate for public discourse.
- threat: Statements expressing intentions to cause harm or incite violence against individuals or groups.
- insult: Content that demeans or ridicules someone based on personal characteristics or affiliations.
- identity_hate: Hate speech targeting individuals or groups based on identity markers such as race, ethnicity, religion, or nationality.

The project aimed to build an AI-based media analysis tool that identifies and mitigates xenophobic language, misinformation, and harmful narratives in media coverage, addressing the ethical challenges of reporting on human mobility by fostering informed and unbiased journalism. The primary goal is a robust AI tool that leverages advanced language models to detect harmful content, ensure ethical reporting, and support media outlets in providing balanced narratives about vulnerable communities.
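Because the six labels are not mutually exclusive (a comment can be both an insult and a threat), this taxonomy lends itself to a single multi-label classifier with six sigmoid outputs rather than six separate models. The sketch below illustrates that pattern; it is our assumption, not the project's actual codebase, and the checkpoint name is a publicly available toxicity model used here only as a stand-in:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Stand-in checkpoint; any model fine-tuned with these six heads would slot in.
MODEL_NAME = "unitary/toxic-bert"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def classify(text: str) -> dict[str, float]:
    """Return per-label probabilities; multi-label, so sigmoid, not softmax."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze(0)
    probs = torch.sigmoid(logits)
    return {label: round(p.item(), 3) for label, p in zip(LABELS, probs)}

print(classify("Example sentence to screen before publication."))
```

Using independent sigmoid scores per label lets an editor set a separate alert threshold for each category, which matters when some labels carry more risk than others.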
Nuance Matters: An interesting observation was how context-sensitive xenophobia detection is. Many terms must be interpreted in context rather than through keyword matching alone.

Data Labeling Challenges: The uneven distribution of content types posed a significant challenge. Common labels like "toxic" were well-represented, while rare but important labels like "threat" had few examples. This imbalance made it harder for the model to learn and accurately detect less frequent but critical content types, requiring special attention during training and evaluation to ensure balanced performance; one common mitigation is sketched below.

Future Plans for Expansion: Moving forward, we are focused on enhancing the model's capabilities by expanding its architecture. This ongoing evolution reflects our commitment to continuous improvement and to maximizing the impact of our project.
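A standard way to counter this kind of label imbalance, offered here as a general technique rather than a description of the project's exact training loop, is to weight each label's loss by the inverse of its positive frequency, so rare labels like "threat" contribute proportionally more to the gradient:

```python
import torch
from torch import nn

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Illustrative positive counts per label out of 100,000 comments; real
# numbers would come from the labeled corpus itself.
pos_counts = torch.tensor([9500.0, 1000.0, 5300.0, 300.0, 4900.0, 900.0])
total = 100_000.0

# pos_weight > 1 makes positives of rare labels count more, so the model
# cannot score well by simply predicting "absent" for them.
pos_weight = (total - pos_counts) / pos_counts
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Example: a batch of 4 comments, raw model logits vs. gold labels.
logits = torch.randn(4, len(LABELS))
targets = torch.randint(0, 2, (4, len(LABELS))).float()
print(loss_fn(logits, targets))
```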
The Development of an Open AI-enabled BOT Platform for the United Nations Economic Commission for Europe (UNECE) project was a 2024 collaboration between UNICC and Cornell University Break Through Tech Capstone participants. The project aims to develop an AI-driven BOT platform, powered by Large Language Models (LLMs), that helps policymakers, research students, and other stakeholders efficiently query and extract relevant knowledge and insights from the extensive UNECE PDF documentation and library. The platform will help stakeholders quickly access the critical information needed to make informed decisions, particularly in the areas of sustainable energy and climate change.
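Querying a large PDF library with an LLM is typically implemented as retrieval-augmented generation: documents are chunked and embedded once, and each question retrieves the most similar chunks to ground the model's answer. The project description does not spell out its stack, so the following is a minimal sketch under assumed libraries (pypdf, sentence-transformers) and a hypothetical file name:

```python
import numpy as np
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

# Hypothetical document; any UNECE PDF would be processed the same way.
PDF_PATH = "unece_sustainable_energy_report.pdf"

def chunk_pdf(path: str, chunk_chars: int = 1000) -> list[str]:
    """Split a PDF's extracted text into fixed-size character chunks."""
    text = " ".join(page.extract_text() or "" for page in PdfReader(path).pages)
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = chunk_pdf(PDF_PATH)
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def retrieve(question: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question."""
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q_vec  # normalized vectors: dot product = cosine
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved chunks would then be passed to the LLM as grounding context.
context = "\n---\n".join(retrieve("What are UNECE's renewable energy targets?"))
```

Grounding answers in retrieved passages also lets the bot cite the source document, which is important when policymakers need to verify a claim.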
Activity Type: Research/Reports/Assessments
AI Tools/Solutions: UNHCR is integrating an internal Virtual Legal Assistant (VLA) powered by AI into its Rights Mapping and Analysis Platform (RiMAP). The VLA will process and analyze legal documents to facilitate legal data collection and policy analysis, and its functionality will be expanded into a chatbot that provides user-friendly legal information.
Activity Type: Policy/Regulatory Guidance
AI Tools/Solutions:
The Gender Bias Overview Tool (GenBOT) project was a 2024 partnership among UNICC, UNFPA, and Columbia University Capstone participants. Gender bias in datasets, particularly in machine learning and AI, has been a long-standing issue. Despite efforts toward inclusivity and diversity, women remain underrepresented at various stages of tech development, leading to inherent biases in datasets. These biases propagate into machine learning models and algorithms, producing skewed or inaccurate outputs that reinforce existing gender disparities. To address this issue, UNICC collaborated with UNFPA and Columbia University (students and faculty) to develop GenBOT, an adaptable data-auditing solution that gives users insight into gender bias in their datasets through automated AI analytics. The tool aims to guide stakeholders in creating equitable, gender-inclusive technologies and services.
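At its simplest, a data audit of this kind measures how gender is represented in a dataset and whether favorable outcomes differ across gender groups. GenBOT's internals are not published here, so the sketch below is a hypothetical illustration using pandas, with invented column names (gender, hired):

```python
import pandas as pd

# Hypothetical hiring dataset; the column names and values are illustrative.
df = pd.DataFrame({
    "gender": ["F", "M", "M", "F", "M", "M", "F", "M"],
    "hired":  [0,   1,   1,   0,   1,   0,   1,   1],
})

# Representation: how balanced is the dataset itself?
representation = df["gender"].value_counts(normalize=True)
print("Representation:\n", representation)

# Outcome rates: does the favorable outcome differ by group?
outcome_rates = df.groupby("gender")["hired"].mean()
print("Hire rate by gender:\n", outcome_rates)

# Disparate impact ratio: lowest group rate over the highest group rate.
ratio = outcome_rates.min() / outcome_rates.max()
print(f"Disparate impact ratio: {ratio:.2f}")
```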
Context Matters: A key lesson learned was the importance of context in evaluating gender bias. The definition and scope of bias need to be agreed with stakeholders up front to ensure the tool's effectiveness.

Thresholds and Metrics: Determining appropriate thresholds for gender bias detection was a significant challenge. The team drew on established research and studies to refine the metrics used in the tool, ensuring that they aligned with best practices (one such established threshold is sketched below).

Stakeholder Collaboration: Continuous communication with stakeholders such as UNFPA and UNICC was crucial for refining the bias thresholds and understanding their specific needs for gender bias detection.
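One widely cited threshold from the fairness literature, offered as an example of such established metrics rather than as GenBOT's actual rule, is the four-fifths (80%) rule: a group whose selection rate falls below 80% of the highest group's rate is flagged as showing potential adverse impact:

```python
def four_fifths_check(rates: dict[str, float], threshold: float = 0.8) -> dict[str, bool]:
    """Flag groups whose selection rate falls below `threshold` times the
    highest group's rate (the classic four-fifths / 80% rule)."""
    best = max(rates.values())
    return {group: (rate / best) < threshold for group, rate in rates.items()}

# Using the outcome rates from the audit sketch above (illustrative numbers).
rates = {"F": 0.33, "M": 0.80}
print(four_fifths_check(rates))  # {'F': True, 'M': False} -> F is flagged
```

Making the threshold a parameter reflects the lesson above: what counts as "biased" must be negotiated with stakeholders rather than hard-coded.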