Project Goals
Welcome to the University of Washington's Lahar project! Our goal is to develop algorithms and systems that allow for efficient management and querying of correlated, uncertain, ordered data. We call such data streams Markovian streams.
A lahar is a fast-moving, massive slide of mud, dirt, and debris that rushes down a snow-covered volcano when it erupts (as happened in the 1980 eruption of Mount St. Helens). We named our project Lahar because it manages fast-moving, dirty data streams.
Markovian Streams
Informally, Markovian streams are a compact representation of imprecise, ordered data (formally, Markovian streams are the output of probabilistic inference on a temporal graphical model). For example, a Markovian stream representing an individual's trajectory over an afternoon will contain a probability distribution over the individual's location at each timestep during the afternoon, as well as correlations between the individual's location at consecutive timesteps (e.g., a person is more likely to be in his office at 12:01 given that he was in his office at 12:00). Such a Markovian stream can be constructed from any location-sensing technology (RFID, GPS, infrared, cameras, etc.) using standard inference techniques and a simple model of the environment and sensor noise characteristics. For a more detailed definition of Markovian streams and how they are generated, please see our publications.
Markovian streams are a natural model for many kinds of imprecise sequential data. Examples include:
- A person or object's imprecise location over time, inferred from environmental sensors.
- Transcripts of spoken phrases, inferred from recorded audio streams.
- A person's activity over time (walking, brushing teeth, etc.), inferred from environmental sensors.
- An individual's personal health over time, inferred from daily measurements (blood sugar, temperature, etc.).
- Customer satisfaction over time, inferred from purchase/return logs.
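The informal description above (a distribution at each timestep plus correlations between consecutive timesteps) can be illustrated with a small sketch. This is not the Lahar implementation: the class name `MarkovianStream`, its fields, and the two-location example with its probabilities are all hypothetical, chosen only to show why the correlations matter when querying a sequence.

```python
from dataclasses import dataclass


@dataclass
class MarkovianStream:
    """Hypothetical in-memory representation, for illustration only."""
    states: list
    initial: dict       # initial[s] = P(X_0 = s)
    transitions: list   # transitions[t][prev][nxt] = P(X_{t+1} = nxt | X_t = prev)

    def marginals(self):
        """Return the per-timestep distributions P(X_t) for every t."""
        current = dict(self.initial)
        result = [current]
        for cpt in self.transitions:
            nxt = {s: 0.0 for s in self.states}
            for prev, p_prev in current.items():
                for state, p in cpt[prev].items():
                    nxt[state] += p_prev * p
            result.append(nxt)
            current = nxt
        return result

    def path_probability(self, path):
        """Probability of one exact sequence of states, using the correlations."""
        prob = self.initial.get(path[0], 0.0)
        for t in range(1, len(path)):
            prob *= self.transitions[t - 1][path[t - 1]].get(path[t], 0.0)
        return prob


# Made-up numbers: a person's location at 12:00 and 12:01.
stream = MarkovianStream(
    states=["office", "hallway"],
    initial={"office": 0.8, "hallway": 0.2},
    transitions=[{
        "office": {"office": 0.9, "hallway": 0.1},
        "hallway": {"office": 0.3, "hallway": 0.7},
    }],
)

marginals = stream.marginals()
correlated = stream.path_probability(["office", "office"])
# Ignoring correlations would multiply the two marginals instead.
independent = marginals[0]["office"] * marginals[1]["office"]
print(correlated, independent)
```

In this toy example the probability of being in the office at both timesteps is higher when the correlation is used than when the two marginals are simply multiplied, which is the kind of difference that motivates storing correlations in the stream.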
Applications
As a compact representation of correlated, imprecise, sequential data, Markovian streams support myriad applications. These can be either live applications, where the Markovian stream data is queried continuously and in real time, or historical applications, where the Markovian stream data is archived on disk and later queried as part of an analysis task.
- Theft detection systems can leverage live object location streams.
- Keyphrase alert systems can leverage live audio streams to notify users when a word or phrase of interest is spoken.
- Elder care alert systems can leverage live streams of a patient's activities.
- A group's location records can be used to determine who met with whom and when.
- Recorded audio files (e.g. podcasts, newscasts, wiretaps) can be mined to identify and flag speakers or newscasts in which specific topics, people, or phrases appear frequently or disproportionately.
- An individual's location or activity records can be used to automatically update his calendar with missed/rescheduled meetings.
- An elder's activity records can be mined to detect signs of cognitive decline.
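As a rough illustration of a live application, the sketch below scans a stream one timestep at a time and raises an alert whenever the probability of a particular move (for example, an object going from a shelf to an exit) crosses a threshold. The function name `transition_alerts`, the state names, the probabilities, and the threshold are all hypothetical; this is a minimal sketch of the idea, not the query processing used in Lahar.

```python
def transition_alerts(initial, cpts, src, dst, threshold):
    """Yield (timestep, probability) whenever P(X_{t-1}=src, X_t=dst) >= threshold.

    initial: dict mapping state -> P(X_0 = state)
    cpts:    iterable of dicts, cpts[t-1][prev][nxt] = P(X_t = nxt | X_{t-1} = prev)
    """
    current = dict(initial)
    for t, cpt in enumerate(cpts, start=1):
        # Joint probability of being in src at t-1 and in dst at t.
        p_event = current.get(src, 0.0) * cpt.get(src, {}).get(dst, 0.0)
        if p_event >= threshold:
            yield t, p_event
        # Advance the marginal distribution by one timestep.
        nxt = {}
        for prev, p_prev in current.items():
            for state, p in cpt.get(prev, {}).items():
                nxt[state] = nxt.get(state, 0.0) + p_prev * p
        current = nxt


# Made-up object location stream with two timesteps of transitions.
initial = {"shelf": 1.0}
cpts = [
    {"shelf": {"shelf": 0.95, "exit": 0.05}, "exit": {"exit": 1.0}},
    {"shelf": {"shelf": 0.4, "exit": 0.6}, "exit": {"exit": 1.0}},
]

alerts = list(transition_alerts(initial, cpts, "shelf", "exit", threshold=0.5))
print(alerts)
```

Because the function is a generator that consumes one conditional table at a time, the same loop could be run over an archived stream read back from disk, which corresponds to the historical setting described above.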
Support
This project is partially supported by NSF grant IIS-0713123 and NSF CRI grants CNS-0454425 and CNS-0454394, by gifts from Intel Research, and by Magda Balazinska's Microsoft Research New Faculty Fellowship. Julie Letchner is supported by an NSF graduate research fellowship.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.