Digging Deeper into ED Open Data: New ED Data Inventory

If you love data, and especially open data, there’s a good chance you also care about quality metadata. We have some exciting news: the Department of Education launched a new ED Data Inventory!

The Inventory is available as a searchable website and as a JSON file. It contains descriptions of the data the Department collected as part of program and grant activities, as well as its statistical data collections.

Richer information about the Department’s data makes it more accessible and understandable to researchers, developers and entrepreneurs. Our hope is that users will be able to put this freely available government data alongside other sources of data to advance new studies, products, services and apps. The tools and advances in knowledge and best practices can help American students, parents and educators and continue to improve America’s schools. Empowered with more relevant, timely information, students and families will be able to make more informed decisions about education and preparation for college and career.

The ED Data Inventory is a work in progress. The Department’s Data Strategy Team sponsored a working group that did the heavy lifting on this project under the leadership of Marilyn Seastrom, Chief Statistician for the National Center for Education Statistics. The inventory so far covers 33 data series with a total of 223 component studies or data collections. For each data collection, the inventory includes information on the specific data elements used and their definitions. The descriptions link to accessible, online copies of the datasets and systems. The inventory work is ongoing – the team is still at work adding descriptions of more data series and studies. 
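As a rough illustration of how a developer might work with a JSON inventory organized into data series, component studies, and data elements, the sketch below searches a small in-memory sample for elements whose definitions mention a keyword. The field names (`series`, `studies`, `elements`, `name`, `definition`) and the sample records are assumptions made for illustration only; the actual ED Data Inventory file may be organized differently.

```python
import json

# Hypothetical sample; the real ED Data Inventory JSON may use a different layout.
SAMPLE_INVENTORY = json.loads("""
{
  "series": [
    {
      "name": "Example Series A",
      "studies": [
        {
          "name": "Example Study A-1",
          "elements": [
            {"name": "enrollment_count", "definition": "Number of students enrolled."},
            {"name": "school_id", "definition": "Identifier assigned to a school."}
          ]
        }
      ]
    }
  ]
}
""")


def find_elements(inventory, keyword):
    """Return (series, study, element) name tuples whose definition mentions keyword."""
    keyword = keyword.lower()
    matches = []
    for series in inventory.get("series", []):
        for study in series.get("studies", []):
            for element in study.get("elements", []):
                if keyword in element.get("definition", "").lower():
                    matches.append((series["name"], study["name"], element["name"]))
    return matches


if __name__ == "__main__":
    for hit in find_elements(SAMPLE_INVENTORY, "students"):
        print(" / ".join(hit))
```

To run the same search against a downloaded copy of the file, the sample could be replaced with `json.load(open(path))`, with the loops adjusted to whatever field names the real file uses.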

The release of the ED Data Inventory is part of the Department’s response to the President’s Executive Order, Making Open and Machine Readable the New Default for Government Information, and the Open Data Policy. The content from the ED Data Inventory’s JSON file will soon feed the Department’s content on Data.gov. We have been working with the OMB Office of Science and Technology Policy to make open government data easier for the public and entrepreneurs to find, understand, and use. Check out the new Next.Data.gov, a design prototype of the next generation of Data.gov, and the education community on Next.Data.gov.  We provide a list of the 35 datasets accessible via API (application programming interface) at ed.gov/developer.

Learn more and connect with us at ed.gov/data. We look forward to your feedback, questions and suggestions.

Jill James is web director at the U.S. Department of Education and a member of the Department’s Data Strategy Team.

 

Cloudy With a Chance of Data

Recently, a lot of people have been talking about cloud computing and asking what it means to store student information in the cloud.  Unfortunately, confusion and misunderstanding can sometimes cloud the issue (pun intended).  In order to understand the potential risks and opportunities, we should take a minute to understand what it actually means to put data “in the cloud”.

Online systems are powered by computers called servers. In the past, servers were generally located in the same physical vicinity as the people using them. Email servers were stored somewhere near the office where the users worked; student information system servers were stored somewhere in the school or district where the students attended. As demand for online tools increased and tolerance for “down time” decreased, the requirements for storing (or hosting) web servers became increasingly complex. Server rooms needed special cooling systems, backup generators, and redundant internet connections. In addition, as more and more data began to be stored digitally, increased physical security was needed to guard against unauthorized access to the server room. Meeting these demands added an enormous burden to district IT budgets, not to mention increased space requirements in buildings that were already overcrowded.

Row of web servers in a large data center.

Fortunately, as network speeds have increased, data can travel faster, and web servers no longer need to sit in close physical proximity to their users for those users to reach the data. This allows the creation of remote hosting centers that can be designed specifically to meet the requirements of storing web servers for schools and districts. Since servers for multiple schools and districts can be stored in the same data center, the cost to each district could be reduced even while adding features (cooling, power, backups, physical security, etc.). The concept of hosting web servers in shared data centers became known as “cloud storage”.

It is important to note that the co-location of servers for multiple schools in a single data center is not the same as commingling the student information into a single database. This may be the most widely misunderstood concept about storing student data in the cloud. Think about how email works. An email account is hosted in a remote “cloud” data center along with thousands of other email accounts. But just because our email accounts “live” in the same data center does not mean that I can read someone else’s email or vice versa. Along the same lines, organizations that provide cloud data solutions for schools would not be able to amass a single database of student data or allow unauthorized individuals to access that data without violating privacy laws and the terms of the contracts with school districts on which they depend.

Whenever student data is being stored, whether on paper, on servers in the back room of a school building, or “in the cloud,” security, privacy, and other legal and operational issues must always be addressed. While specially built data centers can offer additional physical and digital protections for student data, appropriate credentialing requirements, audit trails, and access controls must always be in place. In addition, state or federal laws, such as the Family Educational Rights and Privacy Act (FERPA), may apply. Check out this blog post by our Chief Privacy Officer for answers to common questions about privacy in the cloud.
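For readers who think in code, the difference between sharing a data center and commingling data, along with the access controls and audit trails mentioned above, can be pictured with a toy model. This is only a sketch under simplifying assumptions: the class name, district identifiers, and record fields are hypothetical, and real hosting providers use far more elaborate mechanisms.

```python
class SharedDataCenter:
    """Toy model of one hosting facility that keeps each district's data separate."""

    def __init__(self):
        self._stores = {}     # district_id -> {student_id: record}
        self.audit_log = []   # (user, district_id, student_id, allowed) tuples

    def add_record(self, district_id, student_id, record):
        # Each district writes only into its own store.
        self._stores.setdefault(district_id, {})[student_id] = record

    def read_record(self, user, user_district, district_id, student_id):
        # Access control: a user may read only records from their own district.
        allowed = user_district == district_id
        # Audit trail: every attempt is logged, whether or not it succeeds.
        self.audit_log.append((user, district_id, student_id, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not read data for {district_id}")
        return self._stores[district_id][student_id]


if __name__ == "__main__":
    center = SharedDataCenter()
    center.add_record("district_a", "s1", {"grade": 7})
    center.add_record("district_b", "s1", {"grade": 9})
    print(center.read_record("alice", "district_a", "district_a", "s1"))
    try:
        center.read_record("alice", "district_a", "district_b", "s1")
    except PermissionError as err:
        print("denied:", err)
    print(center.audit_log)
```

Both districts live in the same `SharedDataCenter` object, yet a user from one district cannot read the other's records, and the refused attempt still leaves a trace in the log.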

We encourage parents and students who want more information on how their schools employ cloud computing to contact their schools directly. It’s important for everyone to stay informed about how data is being protected and how student data is being used to improve the learning experience.

Richard Culatta is the Deputy Director of the Office of Educational Technology at the U.S. Department of Education. 

A New, Single Home for ED Data

Starting today, the data sets and content you’re used to seeing on data.ed.gov can be found on education.data.gov.

(Developers: Please note that the 16 available education data APIs were already hosted by data.gov. These URLs did not change and existing applications using these APIs should not be affected.)

Why the move?

In addition to saving the costs associated with hosting and maintaining a separate education data website, merging the information on data.ed.gov into the existing Data.gov Education Community will allow researchers, developers, and interested members of the public to meet all their education data needs in one central location.

Originally, we created the separate data.ed.gov portal because we wanted to provide the public with advanced features and visualization tools that were not yet available on Data.gov. Today, the Data.gov Education Community not only fully supports visualization and mapping technologies, but it benefits from the continual addition of new enhancements, tools, and features. A key new tool is an API “wizard” that will make it faster and easier to create APIs for existing and upcoming open datasets, increasing the ways developers can interact with this data.


Your Feedback Wanted: More Open ED Data

I am part of a team that is looking at ways to enhance the Department’s digital services and respond to the White House’s Digital Government Strategy. We are spearheading a new initiative to make more of the data ED publishes open and developer-friendly via web application programming interfaces (APIs). APIs allow web developers to pull data from one or more API-enabled sources into another website, application, or mobile app. They make sharing information more fluid and current. Check out the 16 ED datasets currently available with APIs on ED.gov.
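To make the idea of pulling data through an API concrete, here is a minimal sketch using only Python's standard library. The endpoint URL and the query parameters (`state`, `limit`) are placeholders invented for illustration and do not describe any actual ED or Data.gov API; consult the documentation for each dataset to find its real address and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint for illustration; not a real ED or Data.gov URL.
BASE_URL = "https://api.example.gov/datasets/schools"


def build_url(base_url, **params):
    """Attach query parameters to the endpoint in a stable (sorted) order."""
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def parse_records(raw_bytes):
    """Decode a JSON response body into Python objects."""
    return json.loads(raw_bytes.decode("utf-8"))


def fetch_records(url):
    """Request the URL over the network and return the decoded JSON."""
    with urlopen(url) as response:
        return parse_records(response.read())


if __name__ == "__main__":
    # Only prints the request URL; call fetch_records(url) against a real endpoint.
    print(build_url(BASE_URL, state="VA", limit=10))
```

A website or mobile app would call something like `fetch_records` on a schedule or on demand, which is how API-backed data stays current without anyone copying files by hand.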

The Department of Education and the White House are reaching out to developers interested in working with education open data. The Data Jam held in June kicked off development of projects and tools to be presented at an Education Datapalooza event to be held at the White House in October 2012. Datapalooza will be an opportunity to highlight tools and services that leverage open educational data sets (education.data.gov), individual electronic student data (MyData), and data about learning content (Learning Registry) to improve student choices around learning. Datapalooza will be streamed live (and posted online afterwards) for anyone who wants to participate. Email the team at edtech@ed.gov for more details about the event plans, or if you are currently working on, or interested in working on, open educational data integrations.

But Datapalooza is only the first step to engage the public. We want to hear from you – developers and all of our customers. Tell us which ED data sets and online tools have data that should be more open. Great ideas come from everywhere. If you have an idea for an app that would help you and the public access certain types of information, let us know. Your input will help us prioritize the suggestions made here and some of the ideas we already have in mind.

To get the conversation started, here are a few datasets that could be enabled through API:

For more ideas, see our datasets on Data.gov/education/ and our lists of ED-funded websites and online tools.

Comments on this blog post will be open through August 20. Our team plans to analyze your feedback and set out a plan for making more of our websites and tools more mobile in the coming months.

Thanks for taking the time to tell us what you think!

Jill James is Web Director at the U.S. Department of Education.

Open Data for College Affordability and Better Student Outcomes

Cross-posted from the White House Blog.

The Obama Administration recently launched the Education Data Initiative to help students and their families benefit from innovation enabled by open data from the US government and other sources.  By working to make education data more available and useful to entrepreneurs and innovators, we’re confident that new products and services will continue to emerge to help American families make informed educational decisions and improve student outcomes.

The Education Data Initiative is part of a series of Open Data Initiatives—other ones include energy, health, and public safety—in which the Administration is working to help catalyze the development of innovative apps and services fueled by open data, while rigorously protecting privacy and confidentiality.

Todd Park speaks at the data jam

US Chief Technology Officer Todd Park speaks at the Education Data Jam

This week, staff from the White House, the U.S. Department of Education, and the George Washington School of Business held an Education “Data Jam” in Washington, DC.  A diverse set of educational technology experts and entrepreneurs gathered to brainstorm new applications, products, services, and product features that could be developed using open educational data to drive increases in student success.

The MyData Initiative, which encourages schools, software vendors, and others who hold student data to make it available to parents and students in electronic, machine-readable formats, was an important focus of the workshop discussion.  Allowing students to download their own data enables them to maintain their personal learning profile, access customized learning experiences, and make informed school selection and financial aid choices.  At the workshop, the Department’s Office of Federal Student Aid unveiled the MyData files it will be launching for student aid application (FAFSA) and disbursement (NSLDS) data downloads. Students will soon be able to retrieve their own student aid data in machine-readable format, which they could then share with online services that can harness the data to provide customized assistance with finding scholarships, choosing schools, or repaying loans.

The Education Data Jam also focused on Federal education data sets now available at education.data.gov.  Publicly available data about education outcomes can help fuel the next generation of customized services and tools for students, teachers, and school districts.

Data from the Learning Registry, a new open-source technical system to help educators and learners use and share digital content, was also a major subject of the brainstorm.  Developers interested in connecting student performance or teacher preparation tools to appropriate content can leverage the information stored in this crowd-sourced platform.

In wrapping up the event, we challenged participants to collaborate on building tools or services using the data demonstrated at the Data Jam. Groups who successfully implement their ideas in the next 90 days may be featured at a follow-on event, an “Education Datapalooza,” that will celebrate private-sector education innovation fueled by open data. The challenge to build innovative education tools and services, for potential demonstration at the Datapalooza, is open to everyone. Information about the data sets presented at the Data Jam is available here. And if you’d like more details about the Education Datapalooza, or if you have an idea or an example of a private-sector innovation (a product, service, website, app, or feature) that uses open education data, please send an email to edtech@ed.gov.

Todd Park is the US Chief Technology Officer, and Jim Shelton is the Assistant Deputy Secretary for Innovation and Improvement at the U.S. Department of Education.