Streams in Node.js

Streams are an important aspect of how Node.js works, and understanding them can prove challenging. As such, I'm writing this article both to help with my personal understanding of streams and to hopefully help you, the reader, develop a better understanding of streams in Node.js.

I will explore Node.js streams using the streams available in the fs module. Let's dive in!

What are streams

Streams can be described as an incremental flow of data. They are a way to handle reading/writing files, network communications, or any kind of end-to-end information exchange in an efficient way. With streams, you can process large amounts of data piece by piece.

For example, in the traditional way, when you tell the program to read a file, the entire file is read into memory, from start to finish, and then you’re able to process and use the file. However, by using streams you would read the file piece by piece, processing its content without keeping it all in memory.
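To make that concrete, here is a minimal sketch contrasting the two approaches (big-file.txt is a placeholder name):

const fs = require('fs');

// Traditional approach: the entire file is buffered in memory before the callback runs
fs.readFile('big-file.txt', (err, data) => {
  if (err) throw err;
  console.log('Got all ' + data.length + ' bytes at once');
});

// Streaming approach: the file arrives piece by piece
fs.createReadStream('big-file.txt').on('data', (chunk) => {
  console.log('Got a ' + chunk.length + ' byte chunk');
});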

In Node.js, a stream is an abstract interface for working with streaming data. The stream module provides an API for implementing the stream interface. Streams can be readable, writable, or both, and all streams are instances of EventEmitter.
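You can verify this yourself: a stream created by the fs module is an instance of both the stream classes and EventEmitter.

const fs = require('fs');
const stream = require('stream');
const { EventEmitter } = require('events');

const rs = fs.createReadStream(__filename); // read this script file itself

console.log(rs instanceof stream.Readable); // true
console.log(rs instanceof EventEmitter);    // true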

Buffers in Streams

A quick note on Buffers.

A buffer is a temporary location in memory that holds some data. Streams use a buffer to hold data until it is consumed.

In a stream, the buffer size is determined by the highWaterMark option set when the stream is created, which is a number denoting the size of the buffer in bytes.

By default, Node.js streams operate on strings and Buffer objects. However, we can make a stream buffer arbitrary JavaScript objects by setting the objectMode option to true when creating the stream.

For streams operating in object mode, the highWaterMark specifies the total number of objects.
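As a quick sketch of object mode (the stream and object names here are made up for illustration):

const { Readable } = require('stream');

const objectStream = new Readable({
  objectMode: true,  // buffer JavaScript objects instead of strings/Buffers
  highWaterMark: 16, // hold up to 16 objects, not 16 bytes
  read() {}          // no-op; we push data in manually below
});

objectStream.push({ id: 1, name: 'first' });

objectStream.on('data', (obj) => {
  console.log(obj); // { id: 1, name: 'first' }
});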

Benefits of streams

According to the Node.js docs, streams provide two major advantages over traditional data handling techniques. These are:

  • Memory efficiency: you don't need to load large amounts of data into memory before you are able to process it
  • Time efficiency: it takes far less time to start processing data, since you can begin as soon as the first chunk arrives rather than waiting until the whole data payload is available

Node APIs that use Streams

Due to their advantages, many Node.js core modules provide native stream handling capabilities, most notably:

  • process.stdin returns a stream connected to stdin
  • process.stdout returns a stream connected to stdout
  • process.stderr returns a stream connected to stderr
  • fs.createReadStream() creates a readable stream to a file
  • fs.createWriteStream() creates a writable stream to a file
  • net.connect() initiates a stream-based connection
  • http.request() returns an instance of the http.ClientRequest class, which is a writable stream
  • zlib.createGzip() compresses data into a stream using gzip (a compression algorithm); see the sketch after this list
  • zlib.createGunzip() decompresses a gzip stream
  • zlib.createDeflate() compresses data into a stream using deflate (a compression algorithm)
  • zlib.createInflate() decompresses a deflate stream
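For example, here is a small sketch that compresses a file with a gzip stream, using the pipe() method covered later in this article (the file names are placeholders):

const fs = require('fs');
const zlib = require('zlib');

// Read file.txt, compress it chunk by chunk, and write file.txt.gz
fs.createReadStream('file.txt')
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('file.txt.gz'));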

Types of Streams

There are four types of streams:

  • Readable: a stream you can receive data from, but not send/write data into. When you push data into a readable stream, it is buffered until a consumer starts to read it.
  • Writable: a stream you can send data to, but not read/receive data from.
  • Duplex: a stream you can both read/receive data from and send/write data into, basically a combination of a Readable and a Writable stream.
  • Transform: similar to a Duplex stream, but the output is a transform of its input (see the sketch after this list).
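Purely for illustration, here is a minimal Transform stream sketch that upper-cases whatever passes through it:

const { Transform } = require('stream');

const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    // the output is a transform of the input: the same text, upper-cased
    callback(null, chunk.toString().toUpperCase());
  }
});

// Type into the terminal and the transformed text is echoed back
process.stdin.pipe(upperCase).pipe(process.stdout);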

For the purpose of this article, we’ll only focus on Readable and Writable streams.

Writable stream

We will now take a look at the different types of streams, starting with the writable stream. We will use fs.createWriteStream to create our writable stream.

const fs = require('fs');

const writeStream = fs.createWriteStream(__dirname + '/file.txt', {
	highWaterMark: 1024 // sets the internal buffer size to 1 kilobyte
});

for (let i = 0; i < 500000; i++) {
  writeStream.write("Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua" + i + '\n')
}

writeStream.end()

Here we are using the writable stream available in the fs module to write five hundred thousand lines of text to a file.

We set the highWaterMark to 1 kilobyte. So our internal buffer will hold one kilobyte of data at a time before writing to the file.
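One way to observe this buffering: write() returns true while the internal buffer is below the highWaterMark and false once it reaches it. A small sketch, assuming the same 1 kilobyte highWaterMark (demo.txt is a placeholder file name):

const fs = require('fs');

const ws = fs.createWriteStream(__dirname + '/demo.txt', { highWaterMark: 1024 });

console.log(ws.write(Buffer.alloc(512))); // true  - buffer holds 512 bytes, below 1 KB
console.log(ws.write(Buffer.alloc(600))); // false - buffer now exceeds 1 KB
ws.end();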

The entire file will not be created at once. Instead, the lines of text are incrementally appended to the file on each iteration of the loop, until there is no more data (the loop ends). This is possible because we are using a stream to create the file.

After calling the end method on the writeStream we are no longer able to write to the stream.
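If you need to know when all the buffered data has actually been flushed, you can listen for the finish event, which is emitted after end() has been called and every chunk has been handed off. Adding to the snippet above:

writeStream.on('finish', () => {
  console.log('All data has been written to file.txt');
});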

Readable Stream

Readable streams allow you to read data from them, but not write to them. Let's take a look.

const fs = require('fs');

const readStream = fs.createReadStream(__dirname + '/file.txt', {
	highWaterMark: 1024 // sets the internal buffer size to 1 kilobyte
});

readStream.on('data', (chunk) => {
  console.log('Received data ' + chunk)
});

Here we used a readable stream to read the file that was previously created with the writable stream. Each time a chunk of data is available, the data event is emitted. We have a listener that gets called when the data event is emitted, and inside it we can consume the available data.

Readable stream modes

Readable streams have two modes, flowing and paused.

All readable streams start in paused mode. When in paused mode, data is not read automatically; the stream.read() method must be called to read chunks of data from the stream.

However, we can switch a stream to flowing mode via the following:

  • Adding a data event handler
  • Calling stream.resume()
  • Calling stream.pipe() to send the data to a writable stream

In flowing mode, the stream emits a data event each time a chunk of data is available. To read the data in flowing mode, you attach an event listener to the data event. For example:

// The callback will be executed each time the stream emits the 'data' event
readStream.on('data', (chunk) => {
  console.log('Received data ' + chunk)
});
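For comparison, here is a sketch of consuming the same file in paused mode, using the readable event together with stream.read():

const fs = require('fs');

const pausedStream = fs.createReadStream(__dirname + '/file.txt');

// 'readable' is emitted when there is data waiting in the internal buffer
pausedStream.on('readable', () => {
  let chunk;
  // read() returns null once the internal buffer has been drained
  while ((chunk = pausedStream.read()) !== null) {
    console.log('Read ' + chunk.length + ' bytes');
  }
});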

Using readable and writable streams together

You can write the data read from a readable stream directly to a writable stream. For example:

const fs = require('fs');

const writeStream = fs.createWriteStream(__dirname + '/file2.txt', {
	highWaterMark: 1024 // sets the internal buffer size to 1 kilobyte
});

const readStream = fs.createReadStream(__dirname + '/file.txt', {
	highWaterMark: 100000 // sets the internal buffer size to roughly 100 kilobytes
});

// Write to the writable stream each time a chunk of data becomes available
readStream.on('data', (chunk) => {
  writeStream.write(chunk)
});

// Gets called when the read stream has finished reading all data
readStream.on('end', function() {
  console.log("Finished reading all data");
});
});

In the above, we read a file using a readable stream, readStream, then each time some data becomes available (when the data event is emitted) we write it to a file using a writable stream, writeStream.

Pipe

We can actually pass the data to the writable stream without creating a handler for the data event. We do this via a mechanism called “piping”.

Piping is a mechanism where we provide the output of one stream as the input to another stream: data is pulled from the source stream and pushed into the destination stream for us.

Readable streams, including those created with fs.createReadStream, expose a pipe() method for exactly this. For example:

readStream.pipe(writeStream);

With the usage of pipe in the above code, we no longer need to write our own handler to move data from the readable stream to the writable stream. The pipe mechanism handles that for us, including pausing the readable stream when the writable stream's buffer fills up (a concern known as backpressure).
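Putting it all together, the earlier read-then-write example collapses to just a few lines:

const fs = require('fs');

fs.createReadStream(__dirname + '/file.txt')
  .pipe(fs.createWriteStream(__dirname + '/file2.txt'));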

In this article we got an introduction to streams in Node.js. We looked at the advantages of streams, the types of streams, and practical examples of Readable and Writable streams. We also looked at piping, which allows you to easily use the output of one stream as the input of another.

Streams can be a bit tricky to understand, but they are an important and powerful part of Node.js. Hopefully this article helped you in some way. Until next time: think, learn, create, repeat!
