Yesterday the number of URLs stored in the database almost reached 4M. But the issue is that each insert is now slower. And not just a bit slower, but WAY slower. I tried to improve things by changing the database structure, and even the database engine, but it’s still the same.
Inserting the results of 500 parsed pages on the server now takes 40 seconds. That’s not acceptable, since it takes almost as long for one client to parse the same number of URLs.
While inserting the load into the database, the CPU sits below 10%, but the hard drive is at 100% for the entire 40 seconds. So there might be some room for improvement on that side.
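A disk pegged at 100% with the CPU idle often means each row is committed individually, forcing a disk sync per row. A common fix is to batch the whole load into one transaction so there is a single commit. This is only a sketch of that idea using sqlite3 and made-up table and column names; the actual database engine and schema here are unknown.

```python
import sqlite3

# Hypothetical sketch: the table name, columns, and engine (sqlite3)
# are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url TEXT PRIMARY KEY, status INTEGER)")

# Simulate the results of 500 parsed pages.
rows = [("http://example.com/page%d" % i, 0) for i in range(500)]

# One transaction for the whole batch: a single commit at the end,
# hence far fewer disk syncs than committing after every row.
with conn:
    conn.executemany("INSERT OR IGNORE INTO urls VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM urls").fetchone()[0]
print(count)  # → 500
```

If the engine in use supports it, a multi-row `INSERT` statement or a bulk-load command would achieve the same effect of amortizing the per-commit disk cost across the batch.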
So I now have a few options:
- I rework the database structure again to find what’s wrong with it.
- I change the database engine for another one (again)
- I completely change the database system I’m using
- I improve the hardware.
In the meantime, I stopped the server to avoid overloading it. I hope to be able to restart it in the next few days. I will try to keep one client running locally.