Parsing huge logfiles in Node.js - read in line-by-line

#11
**Reading / writing files** using streams with the native Node.js modules (**fs**, **readline**):


```javascript
const fs = require('fs');
const readline = require('readline');

// Keep an explicit handle on the output stream so we can write to it directly
const output = fs.createWriteStream('output.json');

const rl = readline.createInterface({
  input: fs.createReadStream('input.json'),
  output
});

rl.on('line', function (line) {
  console.log(line);

  // Do any 'line' processing you want and then write to the output file
  output.write(`${line}\n`);
});

rl.on('close', function () {
  console.log(`Created "${output.path}"`);
});
```

#12
You can use the built-in `readline` module (see the Node.js docs). I use the `stream` module to create a new output stream.

```javascript
var fs = require('fs'),
    readline = require('readline'),
    stream = require('stream');

var instream = fs.createReadStream('/path/to/file');
// A PassThrough stream is both readable and writable
var outstream = new stream.PassThrough();

var rl = readline.createInterface({
    input: instream,
    output: outstream,
    terminal: false
});

rl.on('line', function (line) {
    console.log(line);
    // Do your stuff ...
    // Then write to the output stream (writing via rl.write would be
    // treated as input and re-processed by the interface)
    outstream.write(line + '\n');
});
```

Large files will take some time to process. Do tell if it works.




#13
The Node.js documentation offers a very elegant example using the `readline` module.

**Example: Read File Stream Line-by-Line** (from the `readline` docs)

```javascript
const { once } = require('node:events');
const fs = require('node:fs');
const readline = require('node:readline');

// Wrap in an async IIFE so that `await` is valid in a CommonJS module
(async function processLineByLine() {
  const rl = readline.createInterface({
    input: fs.createReadStream('sample.txt'),
    crlfDelay: Infinity
  });

  rl.on('line', (line) => {
    console.log(`Line from file: ${line}`);
  });

  await once(rl, 'close');
})();
```

> Note: we use the crlfDelay option to recognize all instances of CR LF ('\r\n') as a single line break.
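
As a usage note, a `readline` interface is also async-iterable in modern Node.js, so the same file can be consumed with `for await...of`; a minimal sketch, assuming the same local `sample.txt`:

```typescript
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

(async () => {
  const rl = createInterface({
    input: createReadStream('sample.txt'),
    crlfDelay: Infinity, // treat CR LF as a single line break, as above
  });

  // Each iteration yields one complete line from the file
  for await (const line of rl) {
    console.log(`Line from file: ${line}`);
  }
})();
```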



#14
Inspired by @gerard's answer, I want to provide a controlled way of reading chunk by chunk.

I have an Electron app that reads multiple large log files chunk by chunk on the user's request; the next chunk is only requested when the user asks for it.

Here is my `LogReader` class:

```typescript
// A singleton class, used to read logs chunk by chunk
import * as fs from 'fs';
import { logDirPath } from './mainConfig';
import * as path from 'path';

type ICallback = (data: string) => Promise<void> | void;

export default class LogReader {
  filenames: string[];
  readstreams: fs.ReadStream[];
  chunkSize: number;
  lineNumber: number;
  data: string;

  static instance: LogReader;

  private constructor(chunkSize = 10240) {
    this.chunkSize = chunkSize || 10240; // default to 10 kB per chunk
    this.filenames = [];
    // collect all log files and sort from latest to oldest
    fs.readdirSync(logDirPath).forEach((file) => {
      if (file.endsWith('.log')) {
        this.filenames.push(path.join(logDirPath, file));
      }
    });

    this.filenames = this.filenames.sort().reverse();
    this.lineNumber = 0;
  }

  static getInstance() {
    if (!this.instance) {
      this.instance = new LogReader();
    }

    return this.instance;
  }

  // read a chunk from a log file
  read(fileIndex: number, chunkIndex: number, cb: ICallback) {
    // file index out of range, return "end of all files"
    if (fileIndex >= this.filenames.length) {
      cb('EOAF');
      return;
    }

    const chunkSize = this.chunkSize;
    fs.createReadStream(this.filenames[fileIndex], {
      highWaterMark: chunkSize, // read at most one chunk per 'data' event
      start: chunkIndex * chunkSize, // start byte of this chunk
      end: (chunkIndex + 1) * chunkSize - 1, // end byte of this chunk (end is inclusive, so minus 1)
    })
      .on('data', (data) => {
        cb(data.toString());
      })
      .on('error', (e) => {
        console.error('Error while reading file.');
        console.error(e);
        cb('EOF'); // treat a read error as end of this file
      })
      .on('end', () => {
        console.log('Read entire chunk.');
        cb('EOF');
      });
  }
}
```


Then, to read chunk by chunk, the main process just needs to call:

```typescript
const readLogChunk = (fileIndex: number, chunkIndex: number): Promise<string> => {
  console.log(`=== load log chunk ${fileIndex}: ${chunkIndex}====`);
  return new Promise((resolve) => {
    LogReader.getInstance().read(fileIndex, chunkIndex, (data) => resolve(data));
  });
};
```

Keep incrementing `chunkIndex` to read chunk by chunk.

When `EOF` is returned, one file has been read to the end; just increment `fileIndex`.

When `EOAF` is returned, all files have been read; just stop.
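
For illustration, here is a minimal driver loop that follows those rules (a sketch; the `readAllLogs` and `onChunk` names are hypothetical and not part of the original class):

```typescript
// Hypothetical sketch: walk every log file chunk by chunk via readLogChunk
async function readAllLogs(onChunk: (chunk: string) => void): Promise<void> {
  let fileIndex = 0;
  let chunkIndex = 0;

  for (;;) {
    const data = await readLogChunk(fileIndex, chunkIndex);

    if (data === 'EOAF') break; // all files have been read: stop

    if (data === 'EOF') {
      // the current file is finished: move on to the next one
      fileIndex += 1;
      chunkIndex = 0;
      continue;
    }

    onChunk(data);   // hand the chunk to the caller (e.g. send it to the renderer)
    chunkIndex += 1; // request the next chunk of the same file
  }
}
```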

