How can applications store data when using Lab of Things (LoT)? Applications can store files locally on the individual hubs they run on, or they can use online service providers such as Azure storage. We have built Bolt, a system for the efficient
storage, retrieval, and sharing of data across applications running on LoT.
Bolt offers the abstraction of a stream of time-tag-value records, with arbitrary, application-defined tags, and it supports efficient filtering based on tags and temporal range queries. Bolt uses cloud storage as a seamless extension of local storage,
builds an index on the data stream, and organizes the data into chunks of multiple records, enabling efficient compression, encryption, storage, and transfer of data. Applications can specify parameters (policies) that control whether data is stored locally
or remotely on a storage server. For remote storage, Bolt currently supports Azure and Amazon S3, but it is designed to be extensible to support other storage solutions; for example, you can write a plugin for Bolt to store data on an internal
storage server connected over the local network.
A detailed paper describing Bolt is available.
Using the storage APIs
The data storage API enables HomeOS apps to save, retrieve, and share data (temperature readings, images from a camera, etc.). Bolt is a standalone library and can be used independently of HomeOS, but here we show you how to use your Windows Azure
storage account with Bolt in HomeOS.
Configuring an App
Before working with data, you will need to tell your app which Azure account to use.
If you are just testing, use the information for "testdrive":
- Storage account name: testdrive
- Storage account key: zRTT++dVryOWXJyAM7NM0TuQcu0Y23BgCQfkt7xh2f/Mm+r6c8/XtPTY0xxaF6tPSACJiuACsjotDeNIVyXM8Q==
To use your own Windows Azure storage account instead:
- Log in to your Windows Azure account.
- In the left pane, click Storage.
- Select the storage account you want to use.
- At the bottom of the screen, click Manage Access Keys.
- Copy the Storage Account Name and Primary Access Key.
- Paste the Account Name and the Account Key into Settings.xml in the configuration folder (e.g., DemoConfig for testing, Config for deployment). The entries in Settings.xml should look something like this:
<Param Name="DataStoreAccountName" Value="MyAccountName" />
<Param Name="DataStoreAccountKey" Value="YourKeyString" />
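If you write a standalone program that needs the same credentials, the two entries can be read back with LINQ to XML. The following is a minimal sketch, not how HomeOS itself loads its configuration: the SettingsReader class and its GetParam helper are hypothetical names introduced here, and the only assumption about the file is that it contains Param elements with Name and Value attributes, as in the entries above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of HomeOS or Bolt) for reading <Param> entries.
public static class SettingsReader
{
    // Returns the Value attribute of the first <Param> element whose Name
    // attribute matches 'name', or null when no such element exists.
    public static string GetParam(XDocument settings, string name)
    {
        return settings.Descendants("Param")
            .Where(p => (string)p.Attribute("Name") == name)
            .Select(p => (string)p.Attribute("Value"))
            .FirstOrDefault();
    }
}
```

For example, SettingsReader.GetParam(XDocument.Load(path), "DataStoreAccountName") would return the account name, where path points at your own Settings.xml. A null result indicates that the entry is missing from the file.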
Streams in Bolt
All data is stored as an ordered list of data records (time-tag-value tuples), where each tag can have one or more values. Each data record carries a UTC timestamp.
Each stream in Bolt has only a single writer but supports multiple readers, which the writer authorizes. The principals here (readers and writers) are apps.
A tag can be any data type that implements the IKey interface. StrKey and DoubleKey are the implemented types, although programmers can define other tag types. Once a stream has been defined, its data types cannot be changed. For example,
once you define a stream to have tags and values of type <StrKey, StrValue>, you cannot store tags or values of different types (e.g., <DoubleKey, ByteValue>).
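The data model described above can be illustrated with a toy in-memory class. This is only a sketch for building intuition and is not Bolt's implementation: ToyStream and its members are hypothetical, plain generic type parameters stand in for IKey and value types, and there is no indexing, chunking, or remote sync. It does show two of the properties described here: the tag and value types are fixed when the stream is created, and one tag can accumulate several values over time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of a time-tag-value stream. Hypothetical; NOT Bolt's implementation.
public sealed class ToyStream<TTag, TValue>
{
    // Records are kept in append order, mirroring an ordered list of tuples.
    private readonly List<(long UtcTicks, TTag Tag, TValue Value)> records =
        new List<(long UtcTicks, TTag Tag, TValue Value)>();

    // Adds one record; the caller supplies the timestamp as UTC ticks.
    public void Append(TTag tag, TValue value, long utcTicks)
    {
        records.Add((utcTicks, tag, value));
    }

    // All values recorded under one tag, oldest first.
    public IEnumerable<TValue> ValuesFor(TTag tag)
    {
        return records
            .Where(r => EqualityComparer<TTag>.Default.Equals(r.Tag, tag))
            .Select(r => r.Value);
    }

    // The most recently appended value for a tag.
    // Throws InvalidOperationException if the tag has no values.
    public TValue Latest(TTag tag)
    {
        return ValuesFor(tag).Last();
    }
}
```

Because the type parameters are part of the stream's type, a ToyStream<string, string> cannot accept a double tag or a byte-array value; the compiler rejects such a call.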
There are two types of streams:
- ValueDataStream: Used for small values such as temperature. Similar to entries in a log file, Bolt appends all individual values to a single log.
- FileDataStream: Used for large values such as pictures and videos. Similar to a directory with files, Bolt stores individual values as separate files.
Creating a stream
Your app extends ModuleBase, which allows you to create a new stream or open an existing stream. Streams are identified by their fully qualified name <HomeID, AppID, StreamID>. As ModuleBase already knows which app is making the call
and the home in which the app is running, opening a stream from an app is as simple as:
IStream datastream = base.CreateValueDataStream<StrKey, StrValue>("CreativeValueStreamName", true);
and for FileDataStreams:
IStream datastream = base.CreateFileDataStream<StrKey, StrValue>("CreativeFileStreamName", true);
The second parameter in the functions above specifies that the created streams are to be synced remotely to Azure. If the parameter were "false", the stream would be local only. ModuleBase.cs implements these functions to use Azure
storage for remote streams, using the Azure credentials you specified in the configuration file. If you want to use some other remote storage server, you can extend the implementation of ModuleBase by changing the SynchronizerType. Here is the relevant code snippet from ModuleBase:
LocationInfo li = new LocationInfo(GetConfSetting("DataStoreAccountName"), GetConfSetting("DataStoreAccountKey"), SynchronizerType.Azure);
return this.streamFactory.openValueDataStream<KeyType, ValType>(fq_sid, ci, li, ...);
If you were to use Bolt directly, and not in the context of an app, you would have to do what ModuleBase is doing, by creating an instance of Bolt's
StreamFactory and passing the appropriate parameters to
Writing to a stream
StrKey tag = new StrKey("DummyTag");
datastream.Append(tag, new StrValue("DummyVal1"));
datastream.Append(tag, new StrValue("DummyVal2"));
StrKey tag2 = new StrKey("NewDummyTag");
datastream.Append(tag2, new StrValue("DummyVal3"));
Reading from a stream
string latestValue = datastream.Get(tag).ToString();
Streams allow you to retrieve data using the following calls exposed via IStream:
IValue Get(IKey tag);
    Returns the latest value inserted for the given tag.
IEnumerable<IDataItem> GetAll(IKey tag);
    Returns all values for the given tag. Each data item is a tuple of <timestamp in UTC, tag, value>.
IEnumerable<IDataItem> GetAll(IKey tag, long startTimeStamp, long endTimeStamp);
    Temporal range query for a given tag. Time is specified in ticks, in UTC.
IEnumerable<IDataItem> GetAllWithSkip(IKey tag, long startTimeStamp, long endTimeStamp, long skip);
    Sampling query. Time is specified in ticks, in UTC.
HashSet<IKey> GetKeys(IKey startKey, IKey endKey);
    Range query on tags.
Tuple<IKey, IValue> GetLatest();
    Gets the latest data record (tuple) inserted.
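The temporal queries take their start and end times as long tick values in UTC. The sketch below shows one way to build such a pair for a window ending at a given instant. It assumes the ticks are the standard .NET DateTime ticks (100-nanosecond intervals); the text here does not state this explicitly, so verify it against your Bolt version. TickRange and Ending are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical helper (not part of Bolt) for building UTC tick ranges.
public static class TickRange
{
    // Returns the (start, end) tick pair for a window of the given length
    // that ends at endUtc. Rejects values not explicitly marked as UTC so
    // that local times are not mixed in by accident.
    public static (long Start, long End) Ending(DateTime endUtc, TimeSpan length)
    {
        if (endUtc.Kind != DateTimeKind.Utc)
            throw new ArgumentException("endUtc must have DateTimeKind.Utc", nameof(endUtc));
        if (length < TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(length), "length must not be negative");
        return ((endUtc - length).Ticks, endUtc.Ticks);
    }
}
```

With var range = TickRange.Ending(DateTime.UtcNow, TimeSpan.FromHours(24)), the last day of records for a tag could then be requested as datastream.GetAll(tag, range.Start, range.End), using the GetAll overload listed above.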
Using the DataExportUI tool to access data stored using Bolt
For lightweight data exploration, we have built a tool to retrieve data stored using Bolt. You (the researcher) run the tool on your computer and it connects to Azure and downloads the data for a specified date range, home, and application. The
tool also serves as an example for how to write your own data fetching program for custom analyses.
Tool location: Hub\Common\Bolt\DataExportUI
If you have built the Hub solution, DataExportUI.exe will be in Hub\output\binaries\Platform\DataExportUI.
Run the tool, pick your dates, and then fill in the appropriate parameters (Account Name, Account Key, HomeID, App ID, and StreamID). Data will be written to a .csv file with the name given in the Output File field. As an example, for the Sensor application,
the App ID is Sensor and the Stream ID is data.
Currently, the tool assumes you know the names of the data streams your application writes. Future versions will ideally show more information about the data streams each application is collecting and support exporting a subset of tags from the stream.
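If you write your own data-fetching program, each retrieved record has to be turned into one line of output. The sketch below formats one <timestamp, tag, value> record as a CSV row. The column layout is an assumption made for illustration and is not necessarily the format DataExportUI produces; CsvRow, Escape, and Format are hypothetical names, and the timestamp is again assumed to be .NET DateTime ticks in UTC.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of Bolt or DataExportUI) for CSV output.
public static class CsvRow
{
    // Quotes a field when it contains a comma, a double quote, or a line
    // break, doubling any embedded quotes (RFC 4180 style escaping).
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Formats one record as "ISO 8601 UTC timestamp,tag,value".
    public static string Format(long utcTicks, string tag, string value)
    {
        string time = new DateTime(utcTicks, DateTimeKind.Utc)
            .ToString("o", CultureInfo.InvariantCulture);
        return string.Join(",", time, Escape(tag), Escape(value));
    }
}
```

Writing the timestamp in round-trip ("o") format keeps the UTC marker in the file, so downstream analysis tools do not have to guess the time zone.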