DataPool is a collaborative project that provides the MIT community with campus sustainability data and visualizations. Interested in providing feedback or contributing to the site? Email us at DataPool@mit.edu or share your thoughts through the form below.
Most visualizations are accessible to all MIT community members. If you find yourself at a page that resembles the image below, simply click the blue button to access the visualization.
Driving sustainability with data
Data science has the potential to accelerate and shape sustainability at the campus, city, and global scales. Data can help us understand past performance, identify current opportunities, and prepare for future challenges. However, at large, complex institutions, disparate data systems and organizational boundaries often introduce analytics challenges of timeliness, access, and rigor. MIT is working to break from this model and cultivate a leading-edge campus sustainability data practice that emphasizes information currency, collaboration, and the application of today's data science technologies.
A thriving campus sustainability data practice supports a variety of activities...
An example of this approach in the context of energy and climate change is Energize_MIT, which advances a commitment made in MIT's October 2015 Plan for Action on Climate Change to develop an open energy data platform to support research and intelligent decision-making. The video below highlights Energize_MIT as part of the Sustainability DataPool.
Data science meets sustainability
Through data centralization, automation, and stakeholder collaboration, MIT is reinventing its relationship to campus sustainability data. MIT is using open-source programs, cloud storage technologies, and industry-leading data science tools to develop an affordable, transferable, and in-house approach to cultivating a near real-time sustainability data practice. By provisioning consistent and timely data, MIT is empowering its research community to use the campus as a test bed for innovation and experimentation.
DataPool is powered by the DataHub, MIT's emerging big data system. IST is developing the DataHub using a combination of storage, processing, and analytics services offered by Amazon Web Services. The DataHub supports big data processing with tools like Apache Spark and Hive, as well as more traditional open-source analytics languages like Python and R. The DataHub is the analytics engine and central data repository behind the DataPool interface.
DataPool is the product of deliberate collaboration among an ever-expanding roster of departmental, research, and technology contributors. This interdisciplinary participation enables MIT to view familiar sustainability challenges through nontraditional lenses.
We're just getting started
DataPool is positioned for growth in two important areas. First, DataPool contributors are working to include a broader spectrum of campus sustainability data. As more data streams are integrated, DataPool will expand in both breadth and depth.
Second, DataPool contributors are collaborating and seeking new partnerships to offer more advanced analytics. By developing a strong foundation in the basics and tapping into the expertise of the MIT community, DataPool is poised to move into higher-order analytics that empower MIT to look beyond historical trends and into the future.