Class SparseReader
Reader for data files containing samples in libsvm's sparse format.
Inheritance
System.Object
SparseReader
Implements
System.IDisposable
Inherited Members
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.ToString()
Namespace: Mars.Common.IO
Assembly: Mars.IO.dll
Syntax
public sealed class SparseReader : IDisposable
Examples
The following example shows how to read all sparse samples from a file and retrieve them as an array of dense vectors.
// Suppose we are going to read a sparse sample file containing
// samples which have an actual dimension of 4. Since the samples
// are in a sparse format, each entry in the file will probably
// list far fewer elements.
//
int sampleSize = 4;
// Create a new SparseReader for the file (a path or a Stream),
// passing the dense sample size to the constructor
//
SparseReader reader = new SparseReader(file, Encoding.Default, sampleSize);
// Read all sparse samples and expand them into dense vectors.
// Item1 holds the samples and Item2 their output values
//
Tuple<double[][], double[]> result = reader.ReadDenseToEnd();
double[][] samples = result.Item1;
double[] labels = result.Item2;
// Retrieve the descriptions (or comments) recorded
// for the samples, if present.
//
List<string> descriptions = reader.SampleDescriptions;
It is also possible to read each sample individually and sequentially, using a while loop that runs until the end of the stream is reached.
// Suppose we are going to read a sparse sample file containing
// samples which have an actual dimension of 4. Since the samples
// are in a sparse format, each entry in the file will probably
// list far fewer elements.
//
int sampleSize = 4;
// Create a new SparseReader for the file (a path or a Stream),
// passing the dense sample size to the constructor
//
SparseReader reader = new SparseReader(file, Encoding.Default, sampleSize);
// Read the samples one at a time until the end of the stream.
// Item1 holds the dense sample and Item2 its output value
//
while (!reader.EndOfStream)
{
    Tuple<double[], double> item = reader.ReadDense();
    double[] sample = item.Item1;
    double label = item.Item2;
}
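For readers unfamiliar with the file format, each line of a LIBSVM-style file holds a label followed by `index:value` pairs, and features that are not listed are taken to be zero. The sketch below is not SparseReader's implementation. `SparseLineSketch.ToDense` is a hypothetical helper that shows how one such line expands into a dense vector of a given size, assuming 1-based indices, whitespace-separated pairs, invariant-culture numbers and an optional trailing `#` comment.

```csharp
using System;
using System.Globalization;

public static class SparseLineSketch
{
    // Hypothetical helper: expands one "label index:value ..." line into a
    // dense vector of length sampleSize. Indices are assumed to be 1-based.
    public static double[] ToDense(string line, int sampleSize, out double label)
    {
        // Drop an optional trailing "# comment" (assumed convention).
        int hash = line.IndexOf('#');
        if (hash >= 0)
            line = line.Substring(0, hash);

        string[] fields = line.Split(new[] { ' ', '\t' },
                                     StringSplitOptions.RemoveEmptyEntries);
        label = double.Parse(fields[0], CultureInfo.InvariantCulture);

        // Features that do not appear in the line keep their default value of 0.
        double[] dense = new double[sampleSize];
        for (int i = 1; i < fields.Length; i++)
        {
            string[] pair = fields[i].Split(':');
            int index = int.Parse(pair[0], CultureInfo.InvariantCulture) - 1;
            dense[index] = double.Parse(pair[1], CultureInfo.InvariantCulture);
        }
        return dense;
    }
}
```

With a sample size of 4, the line `+1 1:0.5 3:2 # first sample` expands to the label 1 and the dense vector { 0.5, 0, 2, 0 }.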
Constructors
SparseReader(Stream, Encoding, Int32, CultureInfo)
Initializes a new instance of the SparseReader class.
Declaration
public SparseReader(Stream stream, Encoding encoding = null, int sampleSize = -1, CultureInfo culture = null)
Parameters
Type | Name | Description |
---|---|---|
System.IO.Stream | stream | The file stream to be read. |
System.Text.Encoding | encoding | The character encoding to use. |
System.Int32 | sampleSize | The size of the feature vectors stored in the file. |
System.Globalization.CultureInfo | culture | The culture used to parse values in the file (e.g., de-DE for German formats, where the comma is the decimal separator). |
SparseReader(String, Encoding, Int32, CultureInfo)
Initializes a new instance of the SparseReader class.
Declaration
public SparseReader(string path, Encoding encoding = null, int sampleSize = -1, CultureInfo culture = null)
Parameters
Type | Name | Description |
---|---|---|
System.String | path | The complete file path to be read. |
System.Text.Encoding | encoding | The character encoding to use. |
System.Int32 | sampleSize | The size of the feature vectors stored in the file. |
System.Globalization.CultureInfo | culture | The culture used to parse values in the file, e.g., whether the decimal separator is a point or a comma. |
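To illustrate why the culture parameter matters, the sketch below parses the same value under two cultures using only standard .NET calls (`double.Parse` and `CultureInfo`). It makes no claim about how SparseReader applies the culture internally. `CultureParsingSketch.ParseValue` is a hypothetical helper, and the usage note assumes that de-DE culture data is available on the runtime.

```csharp
using System;
using System.Globalization;

public static class CultureParsingSketch
{
    // Hypothetical helper: parses a numeric token using the decimal
    // separator defined by the given culture.
    public static double ParseValue(string token, CultureInfo culture)
    {
        return double.Parse(token, NumberStyles.Float, culture);
    }
}
```

Under `CultureInfo.GetCultureInfo("de-DE")`, the token `0,5` parses to one half, while under `CultureInfo.InvariantCulture` the same value has to be written as `0.5`.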
Properties
BaseStream
Returns the underlying stream.
Declaration
public Stream BaseStream { get; }
Property Value
Type | Description |
---|---|
System.IO.Stream |
EndOfStream
Gets a value that indicates whether the current
stream position is at the end of the stream.
Declaration
public bool EndOfStream { get; }
Property Value
Type | Description |
---|---|
System.Boolean |
Intercept
Gets or sets the intercept term (bias) value to include
at the beginning of each new sample. Default is null
(don't include anything).
Declaration
public double? Intercept { get; set; }
Property Value
Type | Description |
---|---|
System.Nullable<System.Double> |
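As a rough illustration of what placing an intercept term at the beginning of a sample means, the sketch below prepends a bias value to a dense vector. `InterceptSketch.WithIntercept` is a hypothetical helper and not part of SparseReader, whose exact behavior may differ.

```csharp
using System;

public static class InterceptSketch
{
    // Hypothetical helper: returns the sample unchanged when intercept is
    // null, otherwise a new vector with the intercept value at position 0.
    public static double[] WithIntercept(double[] sample, double? intercept)
    {
        if (!intercept.HasValue)
            return sample;

        double[] extended = new double[sample.Length + 1];
        extended[0] = intercept.Value;
        Array.Copy(sample, 0, extended, 1, sample.Length);
        return extended;
    }
}
```

With an intercept of 1, the sample { 0.5, 0, 2, 0 } becomes { 1, 0.5, 0, 2, 0 }, and with a null intercept it is returned as is.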
NumberOfInputs
Gets the number of features present in this dataset. Please
note that, when using the sparse representation, it is not
strictly necessary to know this value.
Declaration
public int NumberOfInputs { get; }
Property Value
Type | Description |
---|---|
System.Int32 |
SampleDescriptions
Gets the descriptions associated with the most recently read samples.
Declaration
public List<string> SampleDescriptions { get; }
Property Value
Type | Description |
---|---|
System.Collections.Generic.List<System.String> |
Methods
Dispose()
Performs application-defined tasks associated with
freeing, releasing, or resetting unmanaged resources.
Declaration
public void Dispose()
Finalize()
Releases unmanaged resources and performs other cleanup operations before the
SparseReader is reclaimed by garbage collection.
Declaration
protected void Finalize()
Read(Int32, out Sparse<Double>[], out Boolean[])
Reads count samples from the file and returns them as
Sparse<T> sparse vectors, together with
their associated output values.
Declaration
public void Read(int count, out Sparse<double>[] samples, out bool[] outputs)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | count | The number of samples to read. |
Sparse<System.Double>[] | samples | The samples that have been read from the file. |
System.Boolean[] | outputs | The output labels associated with each sample in samples. |
ReadDense()
Reads a sample from the file and returns it as a
dense vector, together with its associated output value.
Declaration
public Tuple<double[], double> ReadDense()
Returns
Type | Description |
---|---|
System.Tuple<System.Double[], System.Double> | A tuple containing the dense vector as the first item and its associated output value as the second item. |
ReadDense(Int32)
Reads count samples from the file and returns them as
dense vectors, together with their associated output values.
Declaration
public Tuple<double[][], double[]> ReadDense(int count)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | count | The number of samples to read. |
Returns
Type | Description |
---|---|
System.Tuple<System.Double[][], System.Double[]> | A tuple containing the dense vectors as the first item and their associated output values as the second item. |
ReadDenseToEnd()
Reads all samples from the file and returns them as
dense vectors, together with their associated output values.
Declaration
public Tuple<double[][], double[]> ReadDenseToEnd()
Returns
Type | Description |
---|---|
System.Tuple<System.Double[][], System.Double[]> | A tuple containing the dense vectors as the first item and their associated output values as the second item. |
ReadSparse()
Reads a sample from the file and returns it as a
Sparse<T> sparse vector, together with
its associated output value.
Declaration
public Tuple<Sparse<double>, double> ReadSparse()
Returns
Type | Description |
---|---|
System.Tuple<Sparse<System.Double>, System.Double> | A tuple containing the sparse vector as the first item and its associated output value as the second item. |
ReadToEnd(out Sparse<Double>[], out Boolean[])
Reads all samples from the file and returns them as
Sparse<T> sparse vectors, together with
their associated output values.
Declaration
public void ReadToEnd(out Sparse<double>[] samples, out bool[] outputs)
Parameters
Type | Name | Description |
---|---|---|
Sparse<System.Double>[] | samples | The samples that have been read from the file. |
System.Boolean[] | outputs | The output labels associated with each sample in samples. |