Multiple instances of Node.js script
Posted on 2014-12-23
I am starting my first Node.js project and want to try building a web scraper. I plan to use a MongoDB instance to store the URLs and filters I want to scrape, and Node.js to process the tasks I send to it. I imagine the process as follows:
1. I manually add URLs to a task queue collection in MongoDB.
2. My script constantly runs in a loop and checks the database for new tasks.
3. If a new task is found, the Node.js script starts an instance of my downloader script, which downloads data from the task URL.
4. While the first task is running, I want the main script to keep checking the database for additional records and start a new downloader instance for each one. Let's assume I want up to 5 instances running at a time.
5. After an instance of the downloader script finishes, it stores the downloaded data for later processing by a different script and marks the task queue item as complete.
As a Node.js beginner I still have much to learn, but is there a particular design pattern I should follow to allow this kind of multi-threaded/asynchronous operation?
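To make the steps above concrete, here is a minimal sketch of the flow I have in mind. It is not a working scraper: the `tasks` array is a hypothetical in-memory stand-in for the MongoDB task queue, `download` is a placeholder that only simulates a delay, and the names `claimNextTask`, `worker`, and `runQueue` are made up for illustration. Since Node.js runs JavaScript on a single thread, the "instances" here are concurrent asynchronous operations rather than threads.

```javascript
// Assumed limit from step 4: at most 5 downloads in flight at once.
const MAX_CONCURRENT = 5;

// Hypothetical stand-in for the MongoDB task queue (step 1).
const tasks = Array.from({ length: 8 }, (_, i) => ({
  url: `http://example.com/page/${i}`,
  status: 'pending',
}));

// Placeholder downloader: a real one would issue an HTTP request.
function download(task) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`data for ${task.url}`), 10);
  });
}

// Pick the next pending task and mark it as running (steps 2 and 3).
// With a real database this claim would need to be a single atomic update.
function claimNextTask(queue) {
  const task = queue.find((t) => t.status === 'pending');
  if (task) task.status = 'running';
  return task;
}

// One worker handles tasks one at a time until none are pending.
async function worker(queue, stats, downloadFn) {
  let task;
  while ((task = claimNextTask(queue)) !== undefined) {
    stats.active += 1;
    stats.peak = Math.max(stats.peak, stats.active);
    try {
      // Step 5: store the result and mark the task complete.
      task.result = await downloadFn(task);
      task.status = 'complete';
    } catch (err) {
      task.status = 'failed';
      task.error = String(err);
    } finally {
      stats.active -= 1;
    }
  }
}

// Start `limit` workers so that at most `limit` downloads run concurrently.
async function runQueue(queue, limit = MAX_CONCURRENT, downloadFn = download) {
  const stats = { active: 0, peak: 0 };
  const workers = Array.from({ length: limit }, () =>
    worker(queue, stats, downloadFn)
  );
  await Promise.all(workers);
  return stats;
}

runQueue(tasks).then((stats) => {
  const done = tasks.filter((t) => t.status === 'complete').length;
  console.log(`completed ${done} tasks, peak concurrency ${stats.peak}`);
});
```

This version drains the queue once and then stops. To match step 2, the main script would presumably call `runQueue` again on a timer (or keep the workers alive and have them wait when no task is pending) so that newly added tasks get picked up.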