We had issues with our xPlore indexing server when submitting batches of index requests to the queue. Indexing the documents was taking far too long, and almost all queue items ended up in the warning/error state.
In addition, our file system was growing too fast: 100 GB for only 30,000 documents. It looked as if the temp folders weren't being cleared properly. On the other hand, the queue was reporting that the temp files couldn't be loaded because they had been cleared too early…
I watched the temp folders on the index agent server and noticed that the files in them were at most 10 minutes old, even 30 minutes after I had sent the batch of index requests. From this I deduced that the clearing process was still running, which would explain the index warnings saying the file couldn't be found.
That means the processing of the index requests takes too long while the clearing thread runs anyway… But I noticed that the file system was still growing way too much.
In case you didn't know, the CPS (which parses the downloaded files) has only 1 thread by default. If parsing a file takes too long (50 MB files in my case), that thread stays busy and no other file is indexed in the meantime. The documents keep being downloaded, though, and the clearing process keeps harvesting our beloved files.
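To make the effect concrete, here is a toy model of the situation described above. It is only a sketch and not how xPlore works internally: the function name `simulate`, the FIFO queue, and all the timings are assumptions picked for illustration. Files are downloaded at a fixed interval, each parse takes a fixed time, and a file counts as missed if it is older than the clearing delay when a worker finally picks it up.

```python
import heapq

def simulate(n_files, n_workers, parse_minutes, download_interval, ttl_minutes):
    """Toy model: return (parsed, missed) for a FIFO queue with temp-file expiry.

    All parameters are hypothetical; none of them is an xPlore setting.
    """
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    parsed = missed = 0
    for i in range(n_files):
        downloaded = i * download_interval
        worker_free = heapq.heappop(free_at)
        # A worker cannot start before the file exists on disk.
        start = max(worker_free, downloaded)
        if start - downloaded > ttl_minutes:
            # The temp file was cleared before a worker got to it:
            # the item fails at once and the worker stays available.
            missed += 1
            heapq.heappush(free_at, worker_free)
        else:
            parsed += 1
            heapq.heappush(free_at, start + parse_minutes)
    return parsed, missed

if __name__ == "__main__":
    # Made-up numbers: one file per minute, 5 minutes per parse, 10 minute expiry.
    for workers in (1, 4):
        print(workers, simulate(20, workers, 5, 1, 10))
```

With these made-up numbers, a single worker falls behind and misses most of the files, while four workers keep up with the downloads and miss none.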
The fix is to add more CPS threads so that files are parsed in parallel and no longer cleared before they are processed. You could also increase the delay between two clearing phases, but that is less effective, whereas increasing the number of threads improves overall performance.
To do so, edit the following config file:
Change the following line from 1 to 4:
A restart is needed. You can set the value anywhere from 1 to 6. Note that xPlore uses two other threads for clearing and other processes, and it allows only 8 threads to run at the same time, so 6 is the maximum number of CPS threads. With more than that you'll have issues with the clearing thread and end up with a full file system.
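The thread budget above is simple arithmetic, and it can be worth checking a planned value before editing the configuration. The sketch below only encodes the numbers given in this post (8 threads in total, 2 reserved); the constant and function names are hypothetical and are not part of xPlore.

```python
TOTAL_THREADS = 8      # concurrent thread limit, as stated in this post
RESERVED_THREADS = 2   # clearing and other internal processes, as stated in this post

def max_cps_threads(total=TOTAL_THREADS, reserved=RESERVED_THREADS):
    """Highest CPS thread count that still leaves room for the reserved threads."""
    return total - reserved

def validate_cps_threads(requested):
    """Return the requested value if it fits the budget, otherwise raise ValueError."""
    limit = max_cps_threads()
    if not 1 <= requested <= limit:
        raise ValueError(f"CPS threads must be between 1 and {limit}, got {requested}")
    return requested

if __name__ == "__main__":
    print(validate_cps_threads(4))
```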