Data is everywhere. In this digital age, we all generate lots of data and we enjoy the benefits of the information derived from massive amounts of data. Think about a search engine like Google. The underlying software manages and sifts through huge numbers of constantly changing web pages to produce the most helpful and relevant results. Another example is the Large Hadron Collider at CERN, a large-scale physics experiment, which produces about 30 petabytes of data annually, some of which may contain knowledge about new particle collisions.
In many jobs and activities, we are part of the process of generating data. Massive amounts of data are gathered and stored every day and everywhere. Some call it the “data avalanche.” This data can be very useful in answering our questions and finding new knowledge. At the same time, managing, processing, and storing large amounts of data, as well as sifting through it to use it in a meaningful way, bring many new challenges.
The availability of huge amounts of data, coupled with computational and statistical methods and techniques, has created the field of Data Science.
- Who generates data?
  - This post discusses data collection and asks students to think critically about specific examples of data that is gathered every day.
- Why is data gathered?
  - This post discusses possible reasons that data is gathered and asks students to think about specific scenarios in which data must be gathered.
- Privacy and Security Considerations
  - Any time data is gathered, the privacy and security of the data must be considered, especially when the data has to do with people.
- Case Study
  - Smart Meters gather data about energy use in people’s homes and have caused a lot of controversy.