Building a Research Data Platform for Underground Water Storage

🏗️ This project grew out of my Hiwi (student research assistant) role at the Technical University of Munich, where I worked with the Department of Civil Engineering to build a research-grade application for exploring and managing underground water storage data.

Context

Researchers in civil engineering were working with underground water storage data that came from multiple sources in multiple formats. The data was valuable but hard to explore consistently: it first had to be collected, cleaned, organized, and presented in a way that supported research workflows, not just raw storage.

The core challenge was not only visualization. It was building a usable system that could bridge fragmented data inputs, researcher needs, and day-to-day operational workflows.

Problem Statement

The team needed an application that could help researchers:

Explore underground water storage data more intuitively
Manage and organize heterogeneous datasets from different sources
Clean inconsistent records and prepare them for analysis
Reduce the manual effort required to move from raw data to usable research insights

This meant the solution had to combine data engineering, application development, and strong communication with domain experts.

My Role

I was responsible for designing and implementing the application end to end, with a focus on both technical delivery and researcher usability. My work covered:

Designing the data pipeline for multi-source and multi-format data integration
Building backend APIs and frontend interfaces for data exploration and management
Translating stakeholder requirements into concrete technical designs
Adding AI-assisted functionality to support data insight discovery
Setting up deployment and engineering practices needed for a product-level research tool

Technical Approach

1. Data fusion and preprocessing pipeline

One of the main technical challenges was the heterogeneity of the data. Different sources used different structures, conventions, and formats, which made direct exploration difficult.

To address this, I designed a data fusion pipeline that standardized incoming data, organized it into a consistent structure, and supported cleaning workflows before visualization. This pipeline formed the foundation of the application because reliable exploration depends on reliable underlying data.
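As a rough illustration of the idea (not the project's actual code), the sketch below fuses two made-up sources into one schema. The `Reading` schema, the column and field names, the date formats, and the units are all assumptions chosen for the example; the real sources and their conventions are not described here.

```python
import csv
import io
import json
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class Reading:
    """Hypothetical common schema for one water level measurement."""
    site_id: str
    measured_on: date
    level_m: float  # water level in metres


def parse_level(value: object, unit: str) -> Optional[float]:
    """Convert a raw level to metres; return None if it is not numeric."""
    try:
        level = float(str(value).replace(",", "."))  # accept decimal commas
    except ValueError:
        return None
    return level / 100 if unit == "cm" else level


def from_csv(text: str) -> Iterator[Reading]:
    """Assumed source A: CSV with site, date (DD.MM.YYYY), level_cm."""
    for row in csv.DictReader(io.StringIO(text)):
        day, month, year = row["date"].strip().split(".")
        level = parse_level(row["level_cm"], "cm")
        if level is not None:  # drop rows with unusable levels
            yield Reading(row["site"].strip().upper(),
                          date(int(year), int(month), int(day)), level)


def from_json(text: str) -> Iterator[Reading]:
    """Assumed source B: JSON list with station, ISO timestamp, waterLevel (m)."""
    for item in json.loads(text):
        level = parse_level(item["waterLevel"], "m")
        if level is not None:
            yield Reading(item["station"].strip().upper(),
                          date.fromisoformat(item["timestamp"][:10]), level)


def fuse(*sources: Iterable[Reading]) -> List[Reading]:
    """Merge sources, keeping the first reading per (site, day)."""
    seen = set()
    merged = []
    for source in sources:
        for reading in source:
            key = (reading.site_id, reading.measured_on)
            if key not in seen:
                seen.add(key)
                merged.append(reading)
    return sorted(merged, key=lambda r: (r.site_id, r.measured_on))


CSV_TEXT = "site,date,level_cm\nw1,03.02.2023,1250\nw1,04.02.2023,n/a\n"
JSON_TEXT = json.dumps([
    {"station": "W1", "timestamp": "2023-02-03T08:00:00", "waterLevel": 12.5},
    {"station": "W2", "timestamp": "2023-02-05T08:00:00", "waterLevel": "7,4"},
])

readings = fuse(from_csv(CSV_TEXT), from_json(JSON_TEXT))
for r in readings:
    print(r.site_id, r.measured_on.isoformat(), r.level_m)
```

Each source gets its own small adapter, and everything downstream only sees `Reading` objects. Adding a new format then means writing one more adapter rather than touching the cleaning or visualization code.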

2. Full-stack application development

I developed the system using FastAPI for the backend and React for the frontend.

The backend was responsible for API design, data access, and integration logic. The frontend focused on making complex spatial and research data easier to inspect and manage through a practical user interface. Together, these components created a workflow where researchers could move from raw data handling to exploration inside a single application.

3. Requirements engineering across domains

A major part of the project was working with non-technical stakeholders and researchers in a specialized civil engineering context. Their needs were often expressed in domain language rather than software requirements.

I learned to translate these research and operational needs into technical system design decisions. This became one of the most valuable skills I developed in the role: cross-domain communication between engineering and scientific users.

4. AI-assisted exploration

To make the system more useful for exploration, I integrated the OpenAI API to support data insight discovery. This added an intelligent layer to the application, helping users interact with data in a more exploratory way rather than relying only on static views or manual inspection.
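One way such a layer can be structured is sketched below, under my own assumptions: the data is condensed into a short statistical summary, and only that summary plus the user's question is sent to the model. The function names and the prompt wording are hypothetical, and the model call itself is passed in as a plain callable so the example runs without an API key. In a real setup, that callable would wrap a request to the OpenAI API.

```python
import statistics
from typing import Callable, Dict, Sequence


def summarize_levels(levels_by_site: Dict[str, Sequence[float]]) -> str:
    """Condense raw water levels (metres) into one summary line per site."""
    lines = []
    for site in sorted(levels_by_site):
        levels = levels_by_site[site]
        if not levels:
            continue  # skip sites without data
        lines.append(
            f"{site}: n={len(levels)}, min={min(levels):.2f} m, "
            f"max={max(levels):.2f} m, mean={statistics.fmean(levels):.2f} m"
        )
    return "\n".join(lines)


def build_prompt(question: str, levels_by_site: Dict[str, Sequence[float]]) -> str:
    """Combine instructions, the data summary, and the user's question."""
    return (
        "You are assisting groundwater researchers. Answer using only the "
        "summary below, and say so if it is insufficient.\n\n"
        f"Summary of water levels per site:\n{summarize_levels(levels_by_site)}\n\n"
        f"Question: {question}"
    )


def ask(question: str,
        levels_by_site: Dict[str, Sequence[float]],
        complete: Callable[[str], str]) -> str:
    """Send the prompt through `complete`, a stand-in for the model call."""
    return complete(build_prompt(question, levels_by_site))


sample = {"W1": [12.5, 12.0], "W2": [7.4]}
prompt = build_prompt("Which site varies most?", sample)
print(prompt)

# A stub replaces the real model call in this sketch.
answer = ask("Which site varies most?", sample, lambda p: "stub answer")
```

Keeping the prompt construction separate from the model call makes the summary easy to inspect and test, and keeps the raw dataset out of the request.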

5. Engineering and deployment practices

Beyond feature development, I also applied practical software engineering workflows: containerization with Docker, CI/CD pipelines, and DevOps- and MLOps-oriented practices. These were important for making the application maintainable, deployable, and robust enough for real research use.

Outcome

The result was a product-level application for research use and spatial data management.

The system improved how the research team interacted with underground water storage data by making data organization, cleaning, and exploration more efficient within a single tool. Instead of spending excessive time on fragmented inputs and manual processing, researchers could work in a more structured and usable environment.

What I Learned

This experience taught me that strong technical solutions in research environments require more than implementation skill alone. They also depend on understanding domain problems, communicating with non-technical stakeholders, and designing systems that fit real user workflows.

From this project, I strengthened my experience in:

Full-stack application development
Data engineering for heterogeneous scientific data
AI integration in domain-specific applications
Requirements engineering and stakeholder communication
Production-minded development practices for research software

Reflection

This Hiwi position was especially meaningful because it showed me how software engineering can create direct value in an academic research setting. It was not just about building a web app, but about enabling researchers to work faster, more clearly, and with better access to the information they needed.

It also reinforced my interest in building technical systems that sit at the intersection of data, AI, and real-world decision support.