Google Summer of Code 2025

Google Summer of Code 2025

The Kubeflow Community plans to participate in Google Summer of Code 2025. This page aims to help you participate in GSoC 2025 with Kubeflow.

What is GSoC?

Google Summer of Code (GSoC) is a global program that offers students stipends for working on open-source projects during the summer.

For more information, see the GSoC FAQ and watch the video below:

How can I participate?

Thank you for your interest in participating in GSoC with Kubeflow!

Please carefully read the following information to learn how to participate in GSoC with Kubeflow.

Key Dates

Here are the key dates for GSoC 2025, the full timeline is available on the GSoC website:

EventDate
Applications OpenMarch 24 @ 18:00 UTC
Applications DeadlineApril 8 @ 18:00 UTC
Accepted Proposals AnnouncedMay 8
Community BondingMay 8 - June 1
Coding BeginsJune 2
Midterm EvaluationsJuly 14 - 18
Coding EndsSeptember 1
Final EvaluationsSeptember 1 - 8

Eligibility

To participate in GSoC with Kubeflow, you must meet the GSoC eligibility requirements:

  • Be at least 18 years old at time of registration.
  • Be a student or an open source beginner.
  • Be eligible to work in their country of residence during duration of program.
  • Be a resident of a country not currently embargoed by the United States.

Steps

  1. Sign up as a student on the GSoC website.
  2. Join the Kubeflow Slack:
    • NOTE: please do not reach out privately to mentors, instead, start a thread in the #kubeflow-contributors channel so others can see the response.
  3. Learn about Kubeflow:
  4. Review the project ideas to decide which ones you are interested in:
    • You may wish to attend the next community meeting for the group that is leading your chosen project.
    • NOTE: while we recommend you submit a proposal based on the project ideas, you can also submit a proposal with your own idea.
  5. Submit a proposal through the GSoC website between March 24th and April 8th.
  6. Wait for the results to be announced on May 8th.

Project Ideas

Project 1: Kubeflow Platform Enhancements

Components: Kubeflow Manifests, Kubeflow Dashboard, Kubeflow Notebooks, Kubeflow Pipelines

Possible Mentors: @juliusvonkohout, @thesuperzapper

Difficulty: Hard

Size: 350 hours

Possible Projects:

  • Pipelines: productionize the SeaweedFS PoC as secure minio replacement
  • Pipelines: isolate artifacts per namespace/profile/user using only one bucket (kubeflow/pipelines#4649)
  • Notebooks/Dashboard: migrate code to kubeflow/dashboard and kubeflow/notebooks (kubeflow/kubeflow#7549)
  • Dashboard: work on the Central Dashboard angular rewrite (kubeflow/dashboard#38)
  • Dashboard: support using groups for auth (kubeflow/manifests#2910)
  • Manifests: improve scripts and CI/CD in kubeflow/manifests, including matrix calls to test multiple Kubernetes versions simultaneously

Skills Required/Preferred:

  • GitHub and GitHub Actions
  • containers and Kubernetes knowledge
  • Experience with Python, Go and JavaScript frameworks

Project 2: Kserve Models Web App

Components: KServe

Possible Mentors: @juliusvonkohout, @varodrig, @Griffin-Sullivan

Difficulty: Medium

Size: 175 hours

Goals:

  • Reviving and updating the kserve/kserve-models-web application.
    • Clean up and merge the open issues and PRs
    • Implement a better CI/CD pipeline.
    • Potentially migrate the application to kubeflow/kserve-model-ui
    • Add features for editing, regression testing, and monitoring/metrics support.
    • Synchronize with kserve 0.14+ changes.

Skills Required/Preferred:

  • GitHub Actions
  • containers and Kubernetes knowledge
  • JavaScript frameworks


Project 3: Istio CNI and Ambient Mesh

Components: Kubeflow Manifests

Possible Mentors: @juliusvonkohout, @kimwnasptd

Difficulty: Medium

Size: 175 hours

Goals:

Skills Required/Preferred:

  • GitHub and GitHub Actions
  • Kubernetes and networking
  • Istio, Kustomize


Project 4: Deploying Kubeflow with Helm

Components: Kubeflow Manifests, Kubeflow Pipelines, Kubeflow Trainer, Kubeflow Katib, Kubeflow Spark Operator, Kubeflow Model Registry

Possible Mentors: @chasecadet, @varodrig, @juliusvonkohout

Difficulty: Medium

Size: 350 hours

Goals:

  • To extend our userbase and satisfy the requirement for a helm chart that many users and companies have voiced, a community-driven Helm chart is being developed for Kubeflow v1.10.x.
  • Work with Kubeflow components maintainers and kubeflow/manifests to support the creation of Helm charts for a full Kubeflow deployment with similar functionality as the current kustomize manifests for the Kubeflow 1.10.x release.
  • Investigate possible systems to automatically generate or maintain charts based on the existing kustomize manifests, such that we have a single source of truth.

Skills Required/Preferred:

  • Container and Kubernetes knowledge
  • Helm (especially templating and chart creation)
  • Kustomize (not strictly required, but a plus)


Project 5: JupyterLab Plugin for Kubeflow

Components: Kubeflow Notebooks, Kubeflow Pipelines

Possible Mentors: @ederign, @StefanoFioravanzo

Difficulty: Medium

Size: 350 hours

Goals:

  • Work with the new IDE Working Group (name pending - kubeflow/community#808) to create a JupyterLab plugin for Kubeflow
  • Modernizing and/or consolidating Elyra, Kale, and Jupyter Scheduler into a single plugin for Kubeflow
  • Eventually, the plugin will likely integrate with:
    • Kubeflow Pipelines (priority)
    • Kubeflow Notebooks
    • Kubeflow Model Registry
    • Kubeflow Training Operator
    • and more

Skills Required/Preferred:

  • Python for backend development and API integration
  • JavaScript/TypeScript for frontend development
  • Modern UI frameworks (e.g., React, Jupyter widgets) is a plus
  • Familiarity with Jupyter Notebook, JupyterLab
  • Jupyter extension development experience is a plus


Project 6: Batch Processing Gateway Integration

Components: Kubeflow Spark Operator

Possible Mentors: @Shekharrajak, @lresende, @yuchaoran2011, @andreyvelich

Difficulty: Hard

Size: 350 hours

Goals:

  • Integrating the Batch Processing Gateway (BPG) with Kubeflow for submitting, monitoring, and managing Spark applications across multiple clusters (kubeflow/spark-operator#2422)
  • Analyse, Design, Plan, and Execute Spark Job Execution Strategies:
    • Evaluate the trade-offs between running a Spark kernel directly within a Kubeflow Notebook versus leveraging the Batch Processing Gateway for job submission.
    • Assess the cloud-native design of Kubeflow SDK and Notebook environments to determine the optimal approach for Spark integration that maximizes efficiency, scalability, and usability.
    • Make a well-informed decision on whether to support Spark kernels within notebooks, use BPG, or implement a hybrid approach for an enhanced user experience.
  • Automated Job Routing and Scalable Execution:
    • Implement dynamic workload routing using BPG to automatically distribute Spark jobs based on cluster load, resource availability, and workload priority.
    • Integrate with the Spark Operator to optimize resource allocation, minimize execution delays, and ensure efficient scaling for petabyte-scale machine learning and data processing workloads.
  • Enhanced User API and Notebook Integration:
    • Develop a Python SDK for Kubeflow notebooks, enabling users to submit, manage, and monitor Spark jobs via BPG REST APIs for a lightweight, scalable solution.
    • Ensure a seamless user experience by providing intuitive APIs that abstract complex job management operations, making it easier for data scientists and ML engineers to experiment and iterate on workflows within
  • Comprehensive Debugging and Performance Monitoring:
    • Enable full debugging capabilities by integrating Spark UI, logging, and monitoring tools into Kubeflow, allowing users to visualize Spark DAGs, tasks, and execution stages.
    • Implement centralized logging and Prometheus-based monitoring to provide real-time insights into Spark job performance across clusters.
    • Ensure users can efficiently analyze job execution, detect bottlenecks, and optimize data processing and ML workflows within Kubeflow.
    • Note: Most of the logging APIs must be leveraged out of the box from either BPG or Spark - but we need to document, showcase examples to user.
  • Comprehensive documentation and user guides to assist users in leveraging the new features effectively.

Skills Required/Preferred:

  • Proficiency in Python, Java and familiarity with developing SDKs.
  • Experience with Kubernetes and managing containerized applications.
  • Understanding of Apache Spark and its deployment on Kubernetes clusters.
  • Familiarity with RESTful API development and integration.
  • Experience with monitoring tools and logging frameworks is a plus.


Project 7: GPU Testing for LLM Blueprints

Components: Kubeflow Trainer (Training Operator)

Possible Mentors: @andreyvelich, @varodrig

Difficulty: Medium

Size: 350 hours

Goals:

Skills Required/Preferred:

  • GitHub Actions
  • Kubernetes
  • PyTorch
  • Python


Project 8: Support JAX and TensorFlow Training Runtimes

Components: Kubeflow Trainer (Training Operator)

Possible Mentors: @Electronic-Waste, @XshubhamX, @andreyvelich

Difficulty: Hard

Size: 350 hours

Goals:

Skills Required/Preferred:

  • Go
  • Kubernetes
  • JAX
  • TensorFlow


Project 9: Export Kubeflow Trainer Models to Kubeflow Model Registry

Components: Kubeflow Trainer (Training Operator), Kubeflow Model Registry

Possible Mentors: @tarilabs, @franciscojavierarceo

Difficulty: Hard

Size: 350 hours

Goals:

  • Integrate Kubeflow Trainer with Kubeflow Model Registry (kubeflow/trainer#2438)
    • Trainer has implemented initializers for model and dataset, and will support model exporter in the future.
    • By supporting the model registry as one of the destinations of the exporter, Trainer will integrate with Kubeflow ecosystem more deeply.

Skills Required/Preferred:

  • Kubernetes
  • Go
  • YAML
  • Python


Project 10: Support Volcano Scheduler in Kubeflow Trainer

Components: Kubeflow Trainer (Training Operator)

Possible Mentors: @Electronic-Waste, @rudeigerc

Difficulty: Hard

Size: 350 hours

Goals:

  • Integrate Volcano Scheduler with Kubeflow Trainer (kubeflow/trainer#2437)
    • Currently, Trainer does not support Volcano for scheduling.
    • Since Volcano is a widely adopted scheduler for AI workloads, it could provide Trainer with more AI-specific scheduling capabilities if we integrate Volcano into Trainer

Skills Required/Preferred:

  • Kubernetes
  • Go
  • Volcano


Project 11: Support Postgres for Kubeflow Pipelines backend

Components: Kubeflow Pipelines

Possible Mentors: @rimolive, @shivaylamba

Difficulty: Medium

Size: 175 hours

Goals:

  • Implement support for PostgreSQL as an alternative to MySQL/MariaDB in Kubeflow Pipelines (kubeflow/pipelines#9813)
    • Kubeflow Pipelines must store information about pipelines, experiments, runs, and artifacts in a database. Currently, the only database it supports is MySQL/MariaDB.
    • We plan to support PostgreSQL as an alternative to MySQL/MariaDB so users will be able to reuse existing databases, and PostgreSQL will be a good use case for supporting multiple databases.

Skills Required/Preferred:

  • Kubernetes
  • Python
  • Go
  • YAML

Project 12: Empowering Kubeflow Documentation with LLMs

Components: Kubeflow Website

Possible Mentors: @franciscojavierarceo, @chasecadet, @shravan-achar, @Shekharrajak, @varodrig

Difficulty: Hard

Size: 350 hours

Goals:

  • Leverage LLMs to improve Kubeflow documentation: (kubeflow/website#4025).
    • Explore how other OSS communities leverage LLMs with the user documentation.
    • Explore possibilities to use LLMs to improve existing Kubeflow documentation or use LLMs to help with user questions.

Skills Required/Preferred:

  • JavaScript
  • Python
  • HTML
  • Netlify
  • Hugo

Feedback

Was this page helpful?