Govt. launches AI Kosha repository of data to build models and tools

  • AI Kosha Initiative: A government-backed platform for non-personal datasets aimed at fostering AI model and tool development.
  • Initial Dataset Count: Launched with 316 datasets, mainly supporting Indian language translation tools.
  • IndiaAI Mission Alignment: AI Kosha is part of the ₹10,370 crore IndiaAI Mission, focusing on AI advancement.

Relevance: GS 3 (Science, Technology)

Compute Capacity & Infrastructure

  • GPU Access Expansion:
    • 14,000 GPUs commissioned for shared access, an increase from 10,000 earlier this year.
    • More GPUs to be added quarterly to support AI model training.

Government’s AI Development Strategy

  • Homegrown AI Model:
    • Government accelerating efforts to develop an indigenous foundational AI model.
    • Inspired by China’s DeepSeek, which built competitive models at lower cost than U.S. firms such as OpenAI and Google.
    • High interest from startups in leveraging India-specific AI solutions.

Dataset Categories in AI Kosha

  • Translation & Linguistic Tools: Majority of datasets aimed at improving Indian language AI models.
  • Other Data Sources:
    • Telangana Open Data Initiative (health-related data).
    • 2011 Census Data.
    • Satellite Imagery from Indian satellites.
    • Meteorological and Pollution Data.

Past Government Data Initiatives

  • Open Governance Data Platform:
    • 12,000+ datasets hosted by data.gov.in from multiple government agencies.
    • Ministries and departments have designated Chief Data Officers to facilitate dataset contributions.
  • 2018 Non-Personal Data Committee:
    • Explored making private sector data (e.g., ride-sharing traffic data) accessible for startups & policy use.
    • Faced pushback from tech industry over data-sharing concerns.
    • The non-personal data debate preceded the boom in Large Language Models (LLMs) such as ChatGPT.

Significance & Challenges

  • Significance:
    • Encourages AI innovation using publicly available data.
    • Supports startups, academia, and government in developing AI tools.
    • Strengthens AI ecosystem with better compute power and data access.
  • Challenges:
    • Private sector resistance to data sharing remains unresolved.
    • Data quality and availability across diverse domains need continuous enhancement.
    • Evaluation frameworks for foundational AI models still evolving.
