AquaAware: A Two-Phase Learning System for Water Conservation

Published: April 01, 2024

AquaAware is a project designed to reduce household water consumption by providing real-time recommendations for everyday tasks such as brushing teeth, washing hands, showering, and washing dishes. The system uses a two-phase machine learning approach: it first trains users on best practices and then further optimizes their personal habits.

Phase 1: Supervised Learning for Best Practices

Initially, the system uses a dataset containing optimal water consumption levels and time durations for common household tasks.

Algorithm: A Decision Tree classifier is trained on this “best practice” data.
Functionality: As the user consumes water, the device records the usage and time. The Decision Tree predicts the ongoing task and provides real-time recommendations to help the user align their habits with water-saving standards.
Goal: The primary objective of this phase is to guide the user towards adopting more sustainable habits and reducing initial water wastage.
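The Phase 1 flow can be sketched in plain Python. This is a minimal illustration, not the project's actual implementation: the sample values in `BEST_PRACTICE`, the two features (litres and seconds), the tiny Gini-based tree builder, and the `recommend` helper are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical best-practice samples: (litres used, seconds elapsed) -> task.
BEST_PRACTICE = [
    ((1.0, 120), "brushing_teeth"), ((1.2, 130), "brushing_teeth"),
    ((2.0, 30), "washing_hands"), ((2.5, 40), "washing_hands"),
    ((40.0, 300), "showering"), ((45.0, 330), "showering"),
    ((15.0, 600), "washing_dishes"), ((18.0, 660), "washing_dishes"),
]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows):
    """Grow a binary tree until every leaf holds a single task."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1:
        return labels[0]  # pure leaf: just the task name
    best = None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            score = (len(left) * gini([l for _, l in left])
                     + len(right) * gini([l for _, l in right])) / len(rows)
            score = round(score, 12)  # rounding keeps tie-breaking deterministic
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # identical features but mixed labels: majority vote
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t, build_tree(left), build_tree(right))

def predict(tree, x):
    """Walk from the root to a leaf using threshold comparisons."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def recommend(tree, litres, seconds):
    """Return the predicted task and the litres used above its target.

    The target is assumed to be the largest best-practice volume for the task.
    """
    task = predict(tree, (litres, seconds))
    target = max(x[0] for x, label in BEST_PRACTICE if label == task)
    return task, max(0.0, litres - target)

tree = build_tree(BEST_PRACTICE)
print(recommend(tree, 60.0, 480))  # ('showering', 15.0)
```

In this sketch the device would turn the second value of the result into a message such as "use 15 litres less next time". A real device would also need a rule for readings taken partway through a task, which this example does not attempt.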

Phase 2: Unsupervised Learning for Personalized Optimization

Once the user has consistently adopted the recommended best practices, the system transitions to a personalized optimization phase.

Data Collection: The system begins to store the user’s new, more efficient water usage patterns in a new dataset, but without task labels.
Unsupervised Learning: Because these new records carry no task labels, the device cannot train a classifier on them directly, so it applies the K-Means clustering algorithm. The number of tasks is known beforehand and is used as the number of clusters; K-Means then groups the unlabeled data into that many distinct clusters, effectively re-identifying the tasks based on the user’s unique behavior.
Retraining: Once the new data is labeled through clustering, a new Decision Tree model is trained on this personalized dataset.
Goal: This second model provides even more refined predictions and recommendations, pushing the user to consume only what is strictly necessary, thereby minimizing waste beyond the standard best practices.
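The clustering and relabeling steps of Phase 2 might look like the sketch below. It is written under stated assumptions: the readings in `PERSONAL` are invented, and the centroids are seeded with hypothetical best-practice averages (`SEEDS`) so that each cluster inherits a task name. The source does not say how clusters are matched to tasks, so that seeding is one possible choice rather than the project's method.

```python
import math

# Hypothetical best-practice averages (litres, seconds) per task. They seed
# the centroids so each cluster inherits a task name (an assumption).
SEEDS = {
    "brushing_teeth": (1.1, 125.0),
    "washing_hands": (2.2, 35.0),
    "showering": (42.0, 315.0),
    "washing_dishes": (16.0, 630.0),
}

# Hypothetical unlabeled readings collected after the user improved.
PERSONAL = [
    (0.8, 110.0), (0.9, 115.0),    # no labels are stored with these rows
    (1.5, 25.0), (1.8, 28.0),
    (35.0, 270.0), (33.0, 260.0),
    (12.0, 540.0), (13.0, 560.0),
]

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, seeds, iterations=20):
    """K-Means with k = len(seeds); returns task labels and final centroids."""
    names = list(seeds)
    centroids = [seeds[n] for n in names]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = mean(members)
    return [names[a] for a in assignment], dict(zip(names, centroids))

labels, centroids = kmeans(PERSONAL, SEEDS)
labelled_rows = list(zip(PERSONAL, labels))  # input for retraining a tree
print(centroids["showering"])  # (34.0, 265.0)
```

The resulting `labelled_rows` have the same shape as a labeled training set, so they can be passed to any Decision Tree trainer for the retraining step. Because litres and seconds are on different scales, a fuller version would probably normalize the features before computing distances.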

Design Choices

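Decision Tree was selected as the supervised learning algorithm due to its high accuracy and significantly faster response times compared to alternatives like K-Nearest Neighbors (KNN). A trained tree classifies a reading with a short sequence of threshold comparisons, whereas KNN must compute distances to the stored training samples at prediction time.
K-Means was chosen for the unsupervised learning task because it takes the number of clusters as an input, and that number (i.e., the number of distinct household tasks) was known in advance. No separate step to estimate it was needed, which made K-Means a simple and efficient choice.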


Domenico Lacavalla
