Streams are an important part of how Node.js works, and understanding them can prove challenging. As such, I’m writing this article both to help with my personal understanding of streams and, hopefully, to help you, the reader, develop a better understanding of streams in Node.js.
I will explore Node.js streams by using the streams available in the fs module. Let’s dive in!
What are streams?
Streams can be described as an incremental flow of data. They are a way to handle reading/writing files, network communications, or any kind of end-to-end information exchange in an efficient way. With streams, you can process large amounts of data piece by piece.
For example, in the traditional way, when you tell the program to read a file, the entire file is read into memory, from start to finish, and then you’re able to process and use the file. However, by using streams you would read the file piece by piece, processing its content without keeping it all in memory.
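As a rough sketch of the difference (large-file.txt is just a placeholder name), the two approaches look like this:
const fs = require('fs');

// Traditional approach: the entire file is buffered in memory before use
fs.readFile('large-file.txt', (err, data) => {
  if (err) throw err;
  console.log('Loaded ' + data.length + ' bytes all at once');
});

// Streaming approach: the file arrives in chunks as it is read
fs.createReadStream('large-file.txt')
  .on('data', (chunk) => console.log('Got a chunk of ' + chunk.length + ' bytes'))
  .on('end', () => console.log('Done reading'));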
In Node, a stream is an abstract interface for working with streaming data. The stream module provides an API for implementing the stream interface. Streams can be readable, writable, or both. All streams are instances of EventEmitter.
Buffers in Streams
A quick note on Buffers.
A buffer is a temporary location in memory that holds some data. Streams use a buffer to hold data until it is consumed.
In a stream, the buffer size is decided by the highWaterMark property on the stream instance, which is a number denoting the size of the buffer in bytes.
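As a quick sketch, the configured buffer size can be read back from the stream instance; readable streams expose it as readableHighWaterMark and writable streams as writableHighWaterMark (file.txt here is just a placeholder):
const fs = require('fs');

const stream = fs.createReadStream(__dirname + '/file.txt', {
  highWaterMark: 1024 // 1 kilobyte
});

// The configured buffer size is exposed on the stream instance
console.log(stream.readableHighWaterMark); // 1024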
By default, a stream’s buffer stores strings or Buffer instances. However, we can make the buffer store JavaScript objects by setting the objectMode property on the stream object to true.
For streams operating in object mode, the highWaterMark specifies the total number of objects.
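Here is a minimal sketch of object mode, using a hand-built Readable from the stream module (the object contents are made up for illustration):
const { Readable } = require('stream');

// A readable stream in object mode
const objectStream = new Readable({
  objectMode: true,
  highWaterMark: 16, // buffers up to 16 objects, not 16 bytes
  read() {} // no-op; we push data in manually below
});

objectStream.push({ id: 1, name: 'first' });
objectStream.push({ id: 2, name: 'second' });
objectStream.push(null); // null signals the end of the stream

objectStream.on('data', (obj) => console.log(obj.name));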
Benefits of streams
According to the Node.js docs, streams provide two major advantages over traditional data handling techniques. These are:
- Memory efficiency: you don’t need to load large amounts of data in memory before you are able to process it
- Time efficiency: it takes way less time to start processing data, since you can start processing as soon as you have it, rather than waiting until the whole data payload is available
Node APIs that use Streams
Due to their advantages, many Node.js core modules provide native stream handling capabilities, most notably:
- process.stdin returns a stream connected to stdin
- process.stdout returns a stream connected to stdout
- process.stderr returns a stream connected to stderr
- fs.createReadStream() creates a readable stream to a file
- fs.createWriteStream() creates a writable stream to a file
- net.connect() initiates a stream-based connection
- http.request() returns an instance of the http.ClientRequest class, which is a writable stream
- zlib.createGzip() compresses data using gzip (a compression algorithm) into a stream
- zlib.createGunzip() decompresses a gzip stream
- zlib.createDeflate() compresses data using deflate (a compression algorithm) into a stream
- zlib.createInflate() decompresses a deflate stream
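To give a feel for how these fit together, here is a small sketch that gzips a file (assuming file.txt exists); it uses the pipe() method covered later in this article:
const fs = require('fs');
const zlib = require('zlib');

// Read file.txt, compress it chunk by chunk, and write file.txt.gz
fs.createReadStream('file.txt')
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('file.txt.gz'));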
Types of Streams
There are four types of streams:
- Readable: a stream you can receive data from, but not send/write data into. When you push data into a readable stream, it is buffered until a consumer starts to read the data.
- Writable: a stream you can send data to, but not read/receive data from.
- Duplex: a stream you can both read/receive data from and send/write data into; basically a combination of a Readable and a Writable stream.
- Transform: similar to a Duplex stream, but the output is a transform of its input (a small sketch follows this list).
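To make the Transform idea concrete, here is a minimal sketch that uppercases whatever flows through it (the uppercasing is just an illustrative choice):
const { Transform } = require('stream');

// A Transform stream: whatever is written in comes out uppercased
const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
});

// Pipe stdin through the transform and out to stdout
process.stdin.pipe(upperCase).pipe(process.stdout);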
For the purpose of this article, we’ll only focus on Readable and Writable streams.
Writable Stream
We will now take a look at the different types of streams, starting with the writable stream. We will use fs.createWriteStream to create our writable stream.
const fs = require('fs');

const writeStream = fs.createWriteStream(__dirname + '/file.txt', {
  highWaterMark: 1024 // sets the internal buffer size to 1 kilobyte
});
for (let i = 0; i < 500000; i++) {
  writeStream.write("Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua" + i + '\n')
}
writeStream.end()
Here we are using the writable stream available in the fs module to write five hundred thousand lines of text to a file.
We set the highWaterMark to 1 kilobyte, so our internal buffer will hold one kilobyte of data at a time before writing to the file.
The entire file is not created at once. Instead, lines of text are incrementally appended to the file on each iteration of the loop, until there is no more data (the loop ends). This is possible because we are using a stream to create the file.
After calling the end method on the writeStream, we are no longer able to write to the stream.
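As a sketch of how you might confirm the stream is done: the 'finish' event fires once end() has been called and all buffered data has been flushed, and any write attempted after end() results in an error:
writeStream.on('finish', () => {
  console.log('All data has been flushed to file.txt');
});

// Any write attempted after end() results in an error
writeStream.on('error', (err) => {
  console.error(err.message); // e.g. "write after end"
});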
Readable Stream
A readable stream allows you to read from it, but not write to it. Let’s take a look.
const readStream = fs.createReadStream(__dirname + '/file.txt', {
  highWaterMark: 1024 // sets the internal buffer size to 1 kilobyte
});

readStream.on('data', (chunk) => {
  console.log('Received data ' + chunk)
});
Here we used a readable stream to read in the file that was previously created with the writable stream. Each time a chunk of data is available, the data event is emitted. We have a listener that gets called when the data event is emitted; there we can consume the available data.
Readable stream modes
Readable streams have two modes: flowing and paused.
All readable streams start in paused mode. When in paused mode, data is not read automatically, and the stream.read() method must be called to read chunks of data from the stream.
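As a sketch of paused-mode reading (assuming a fresh readStream with no data handler attached, since attaching one switches the stream to flowing mode), you can listen for the readable event and pull chunks explicitly:
readStream.on('readable', () => {
  let chunk;
  // read() returns null once the internal buffer is empty
  while ((chunk = readStream.read()) !== null) {
    console.log('Read ' + chunk.length + ' bytes');
  }
});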
However, we can switch a stream to flowing mode via the following:
- Adding a data event handler
- Calling stream.resume()
- Calling stream.pipe() to send data to a writable stream
In flowing mode, the stream emits a data event each time a chunk of data is available. To read the data in flowing mode, you attach an event listener to the data event. For example:
// The callback will be executed each time the stream emits the 'data' event
readStream.on('data', (chunk) => {
  console.log('Received data ' + chunk)
});
Using readable and writable streams together
You can write the data read from a readable stream directly to a writable stream. For example:
const writeStream = fs.createWriteStream(__dirname + '/file2.txt', {
  highWaterMark: 1024 // sets the internal buffer size to 1 kilobyte
});

const readStream = fs.createReadStream(__dirname + '/file.txt', {
  highWaterMark: 100000 // sets the internal buffer size to 100 kilobytes
});
// Write to the writable stream each time a chunk of data becomes available
readStream.on('data', (chunk) => {
  writeStream.write(chunk)
});
// Gets called when the read stream has finished reading all data
readStream.on('end', function() {
  console.log("Finished reading all data");
});
In the above, we read a file using a readable stream, readStream; then, each time some data becomes available (when the data event is emitted), we write it to a file using a writable stream, writeStream.
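One caveat with this manual approach, worth sketching (the code above does not do this): write() returns false when the writable stream’s internal buffer is full, and you can pause reading until the drain event says the buffer has emptied:
readStream.on('data', (chunk) => {
  // write() returns false when the internal buffer is full
  if (!writeStream.write(chunk)) {
    readStream.pause(); // stop reading until the buffer drains
  }
});

// 'drain' fires once the writable stream's buffer has emptied
writeStream.on('drain', () => {
  readStream.resume();
});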
Pipe
We can actually pass the data to the writable stream without creating a handler for the data event. We do this via a mechanism called “piping”.
Piping is a mechanism where the output of one stream is provided as the input to another stream: data is taken from the source stream and passed along to the destination stream.
The readable stream returned by fs.createReadStream allows piping via its pipe() method. For example:
readStream.pipe(writeStream);
By using pipe in the above code, we no longer need to write code for moving the data from the readable stream to the writable stream; the pipe mechanism handles that for us.
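As a side note, if you also want error handling, the stream module provides a pipeline() helper (available since Node 10) that wires streams together and reports errors through a single callback; a minimal sketch:
const { pipeline } = require('stream');

pipeline(readStream, writeStream, (err) => {
  if (err) {
    console.error('Pipeline failed', err);
  } else {
    console.log('Pipeline succeeded');
  }
});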
In this article, we got an introduction to streams in Node.js. We looked at the advantages of streams, the types of streams, and practical examples of Readable and Writable streams. We also looked at piping, which allows you to easily use the output of one stream as the input of another.
Streams can be a bit tricky to understand, but they are an important and powerful part of Node.js. Hopefully this article helped you in some way. Until next time: think, learn, create, repeat!